MEDIAN: Adaptive Intermediate-grained Aggregation Network for Composed Image Retrieval | IEEE Conference Publication | IEEE Xplore

MEDIAN: Adaptive Intermediate-grained Aggregation Network for Composed Image Retrieval


Abstract:

The Composed Image Retrieval (CIR) task aims to retrieve a target image that satisfies a multimodal query consisting of a reference image and modification text. Most existing works align multimodal semantics at the local and global granularities. However, they do not mine semantic correspondences at the intermediate-grained level, which results in sub-optimal model performance. In this paper, we propose an adaptive interMEDiate-graIned Aggregation Network (MEDIAN). Compared with conventional CIR models, MEDIAN generates supervision signals for intermediate-grained feature aggregation and constructs graph attention networks to extract intermediate-grained features. MEDIAN also performs cross-modal semantic correspondence alignment guided by the target image, which in turn enables accurate multi-grained feature composition. Extensive experiments on three benchmark datasets demonstrate the superiority of MEDIAN. Our code is available at https://windlikeo.github.io/MEDIAN.github.io/.
Date of Conference: 06-11 April 2025
Date Added to IEEE Xplore: 07 March 2025
Conference Location: Hyderabad, India
