
Confusion-Aware Prototypical Contrastive Learning for Open-Vocabulary Object Detection



Abstract:

Pre-trained vision-language models (PVLMs) and pseudo-labeling have proven effective in open-vocabulary object detection (OVD). However, when PVLMs trained on image-text data are adapted to OVD tasks, they often suffer from region-text misalignment, resulting in low-quality pseudo-labels (PLs) and suboptimal detection performance. In this paper, we introduce a Confusion-aware Prototypical Contrastive Learning (CPCL) method to address these issues. Our key observation is that while most PLs generated by PVLMs are correctly classified, they still suffer from low inter-class separability and contain a small amount of noise. Therefore, we first cluster pseudo-labels within each category to obtain category prototypes, reducing the impact of noise during training. These category prototypes are then used in confusion-aware prototypical contrastive learning to effectively decrease inference confusion probabilities, leading to improved detection accuracy for novel categories. Extensive experiments on the COCO and LVIS-v1 benchmarks demonstrate that our method significantly enhances inter-class separability and achieves competitive and even state-of-the-art detection performance.
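The abstract outlines a two-step recipe: aggregate pseudo-label embeddings within each category into prototypes, then pull region embeddings toward their own category prototype and away from the others. The paper's exact clustering and confusion-weighting details are not given in the abstract, so the sketch below is only an illustrative approximation: it uses a simple per-category mean of normalized embeddings as the prototype (a stand-in for the clustering step) and a plain InfoNCE-style loss over prototypes; the function names and the temperature `tau` are assumptions, not the authors' implementation.

```python
import numpy as np

def l2_normalize(x, axis=-1):
    """Normalize vectors to unit length so dot products become cosine similarities."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def category_prototypes(embeddings, labels, num_classes):
    """Per-category prototype: mean of normalized pseudo-label embeddings.

    (Illustrative stand-in for the paper's per-category clustering step,
    which reduces the influence of noisy pseudo-labels.)
    """
    return np.stack([
        l2_normalize(embeddings[labels == c].mean(axis=0))
        for c in range(num_classes)
    ])

def prototypical_contrastive_loss(embeddings, labels, prototypes, tau=0.1):
    """InfoNCE-style loss: each embedding is attracted to its category
    prototype and repelled from all other prototypes."""
    z = l2_normalize(embeddings)
    logits = z @ prototypes.T / tau              # cosine similarity to every prototype
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(labels)), labels].mean()
```

Minimizing this loss increases the similarity of each region embedding to its own prototype relative to the others, which is one way to realize the improved inter-class separability the abstract reports; the confusion-aware weighting described in the paper would additionally reweight the negative prototypes by how often categories are confused at inference.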
Date of Conference: 06-11 April 2025
Date Added to IEEE Xplore: 07 March 2025
Conference Location: Hyderabad, India
