Leveraging Vision-Language Models for Open-Vocabulary Instance Segmentation and Tracking

IEEE Journals & Magazine | IEEE Xplore