
AnANet: Association and Alignment Network for Modeling Implicit Relevance in Cross-Modal Correlation Classification



Abstract:

With the explosive growth of multimodal data, cross-modal correlation classification has become an important research topic and is in great demand in many cross-modal applications. A variety of classification schemes and predictive models have been built on existing cross-modal correlation categorizations. However, these schemes typically assume that paired cross-modal samples are strictly related, and therefore focus on the fine-grained relevant types of cross-modal correlation while ignoring the large volume of implicitly relevant data, which is often misclassified as irrelevant. Moreover, previous predictive models fail to reflect the essence of cross-modal correlation as it is defined, especially in the design of their network structure. In this paper, by comprehensively investigating current image-text correlation classification research, we define a new classification scheme for cross-modal correlation based on implicit and explicit relevance. To predict the types of image-text correlation under this definition, we further devise the Association and Alignment Network (AnANet) to model implicit and explicit relevance: it captures both the implicit association of global discrepancy and commonality between image and text and the explicit alignment of cross-modal local relevance. Experiments on our newly constructed image-text correlation dataset verify the effectiveness of the proposed model.
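
The abstract only names the two branches of AnANet (global association over discrepancy and commonality, local cross-modal alignment). Below is a minimal, hypothetical PyTorch sketch of how such a two-branch design might be wired together; the layer sizes, the attention-based alignment, the concatenation fusion, and the number of correlation classes are illustrative assumptions, not the authors' actual AnANet implementation.

```python
# Hypothetical sketch of an association-and-alignment classifier.
# All module names and dimensions are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AssociationAlignmentSketch(nn.Module):
    def __init__(self, dim: int = 512, num_classes: int = 4):
        super().__init__()
        # Global association branch: combines discrepancy and commonality features.
        self.assoc_fc = nn.Linear(2 * dim, dim)
        # Local alignment branch: projects region features aligned to text tokens.
        self.align_fc = nn.Linear(dim, dim)
        self.classifier = nn.Linear(2 * dim, num_classes)

    def forward(self, img_global, txt_global, img_regions, txt_tokens):
        # img_global, txt_global: (B, dim); img_regions: (B, R, dim); txt_tokens: (B, T, dim)
        discrepancy = torch.abs(img_global - txt_global)      # global difference
        commonality = img_global * txt_global                  # element-wise agreement
        assoc = F.relu(self.assoc_fc(torch.cat([discrepancy, commonality], dim=-1)))

        # Cross-modal alignment: each image region attends over the text tokens.
        scores = torch.matmul(img_regions, txt_tokens.transpose(1, 2))
        attn = torch.softmax(scores / img_regions.size(-1) ** 0.5, dim=-1)  # (B, R, T)
        aligned = torch.matmul(attn, txt_tokens)               # (B, R, dim)
        align = F.relu(self.align_fc(aligned.mean(dim=1)))     # pool regions -> (B, dim)

        # Fuse both branches and predict the correlation type.
        return self.classifier(torch.cat([assoc, align], dim=-1))


# Usage with random features (batch of 2, 36 regions, 20 tokens):
model = AssociationAlignmentSketch()
logits = model(torch.randn(2, 512), torch.randn(2, 512),
               torch.randn(2, 36, 512), torch.randn(2, 20, 512))
print(logits.shape)  # torch.Size([2, 4])
```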
Published in: IEEE Transactions on Multimedia (Volume: 25)
Page(s): 7867 - 7880
Date of Publication: 16 December 2022
