DH-Set: Improving Vision-Language Alignment with Diverse and Hybrid Set-Embeddings Learning (IEEE Conference Publication)