Mono3DVLT: Monocular-Video-Based 3D Visual Language Tracking | IEEE Conference Publication | IEEE Xplore