Feature First: Advancing Image-Text Retrieval Through Improved Visual Features

Abstract:

Current image-text retrieval methods mainly use region features, which provide object-level information to represent images and make retrieval results more accurate and interpretable. However, region features have several issues: they lack rich contextual information, lose object details, and risk detection redundancy. The ideal visual features for image-text retrieval should have three characteristics: they should be object-level, semantically rich, and language-aligned. To this end, we propose a novel visual representation framework to capture more comprehensive and powerful visual features. Specifically, since the weaknesses of region features are the strengths of grid features, we first build a two-step interaction model that explores the complex relationship between the two from the spatial and semantic perspectives and integrates their complementary information, making the fused visual features both object-level and semantically rich. Then, we design a text-integrated visual embedding module that uses textual information as guidance to filter redundant regions, making the visual features language-aligned as well. Finally, we develop a multi-attention pooling module to aggregate these enhanced visual features in a more fine-grained manner. Extensive experiments demonstrate that our proposed model achieves state-of-the-art performance on the benchmark datasets Flickr30K and MS-COCO.
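The abstract does not specify how the multi-attention pooling module is built, so the following is only a minimal sketch of the general idea of pooling a set of local features with several attention heads. It is not the authors' implementation. The function name `multi_attention_pool`, the per-head query vectors, the feature sizes, and the choice to average the heads are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=0):
    # Numerically stable softmax along the given axis.
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def multi_attention_pool(features, queries):
    """Pool local features into one vector using several attention heads.

    features: (n, d) array of local visual features
              (for example, fused region and grid features).
    queries:  (h, d) array of hypothetical query vectors, one per head
              (these would be learned in a real model).
    Returns a (d,) vector: the mean of the h attention-weighted sums.
    """
    d = features.shape[1]
    scores = queries @ features.T / np.sqrt(d)  # (h, n) similarity scores
    weights = softmax(scores, axis=1)           # each head sums to 1 over n
    pooled = weights @ features                 # (h, d), one vector per head
    return pooled.mean(axis=0)                  # assumption: average the heads

rng = np.random.default_rng(0)
regions = rng.normal(size=(36, 8))  # stand-in for region features
grids = rng.normal(size=(49, 8))    # stand-in for grid features
local = np.concatenate([regions, grids], axis=0)
queries = rng.normal(size=(4, 8))   # random stand-ins for learned queries
image_vec = multi_attention_pool(local, queries)
print(image_vec.shape)
```

With all-zero queries every head assigns uniform weights, so the sketch reduces to plain mean pooling; non-zero queries let each head emphasize a different subset of the local features.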

Published in: IEEE Transactions on Multimedia (Volume: 26)

Page(s): 3827 - 3841

Date of Publication: 15 September 2023

DOI: 10.1109/TMM.2023.3316077
