Abstract:
The task of Open-World Compositional Zero-Shot Learning (OW-CZSL) is to recognize novel state-object compositions in images, drawn from the set of all possible compositions, where the novel compositions are unseen during training. The performance of conventional methods degrades significantly due to the large cardinality of possible compositions. Some recent works treat the simple primitives (i.e., states and objects) as independent and predict them separately to reduce this cardinality. However, this ignores the strong dependence among states, objects, and compositions. In this paper, we model the dependence via feasibility and contextuality. Feasibility-dependence refers to the unequal feasibility of compositions, e.g., hairy is more feasible with cat than with building in the real world. Contextuality-dependence represents the contextual variance in images, e.g., a cat shows diverse appearances when it is dry or wet. We design Semantic Attention (SA), driven by the visual similarity between simple primitives, to capture feasibility semantics and alleviate impossible predictions. We also propose a generative Knowledge Disentanglement (KD) to disentangle images into unbiased representations, easing the contextual bias. Moreover, we combine the learned feasibility and contextuality with the independent compositional probability model in a compatible way. In experiments, we demonstrate that our method, SA-and-KD-guided Simple Primitives (SAD-SP), achieves superior or competitive performance on three benchmark datasets.
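As a rough illustration of the combination described above, and not the paper's actual implementation, the sketch below assumes hypothetical primitive embeddings, classifier logits, and training pairs, and shows how a similarity-driven feasibility score could rescale the independent state-object probability product. The function names, tensor shapes, and the exponent alpha are illustrative assumptions only.

import torch
import torch.nn.functional as F

def feasibility_scores(state_emb, object_emb, train_pairs):
    """Hypothetical feasibility estimate: a candidate state-object pair is scored by
    how visually similar its primitives are to primitives that co-occur in training.
    state_emb: (S, d), object_emb: (O, d), train_pairs: iterable of (state_idx, object_idx)."""
    S, O = state_emb.size(0), object_emb.size(0)
    state_sim = F.cosine_similarity(state_emb.unsqueeze(1), state_emb.unsqueeze(0), dim=-1)  # (S, S)
    obj_sim = F.cosine_similarity(object_emb.unsqueeze(1), object_emb.unsqueeze(0), dim=-1)  # (O, O)
    feas = torch.zeros(S, O)
    for s, o in train_pairs:
        # A pair (s', o) is plausible if state s' resembles a state seen with o;
        # likewise (s, o') is plausible if object o' resembles an object seen with s.
        feas[:, o] = torch.maximum(feas[:, o], state_sim[:, s])
        feas[s, :] = torch.maximum(feas[s, :], obj_sim[:, o])
    return feas

def composition_scores(state_logits, object_logits, feas, alpha=1.0):
    """Combine independent primitive probabilities with feasibility:
    score(s, o) = p(s) * p(o) * feas(s, o)^alpha (alpha is an assumed weighting knob)."""
    p_s = state_logits.softmax(dim=-1)       # (B, S)
    p_o = object_logits.softmax(dim=-1)      # (B, O)
    joint = p_s.unsqueeze(2) * p_o.unsqueeze(1)  # (B, S, O) independent composition probability
    return joint * feas.clamp(min=1e-6).pow(alpha).unsqueeze(0)

In this sketch, unseen pairs inherit plausibility from training pairs whose primitives look visually similar, which reflects the abstract's intuition of using primitive similarity to suppress infeasible compositions while keeping the independent prediction model intact.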
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence (Volume: 46, Issue: 1, January 2024)