Although multimedia objects such as images, audio clips, and texts belong to different modalities, they are often strongly correlated semantically. In this paper, we propose a transductive learning method that mines the semantic correlations among media objects of different modalities in order to achieve cross-media retrieval. Cross-media retrieval is a new kind of search technology in which the query examples and the returned results can be of different modalities, e.g., querying images with an audio example. First, based on the features of the media objects and their co-existence information, we construct a uniform cross-media correlation graph in which media objects of different modalities are represented uniformly. To perform cross-media retrieval, a positive score is assigned to the query example; the score spreads along the graph, and the media objects of the target modality, or the multimedia documents (MMDs), with the highest scores are returned. To boost retrieval performance, we also propose long-term and short-term relevance feedback approaches that mine the information contained in positive and negative examples.
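The score-spreading step can be illustrated with a minimal sketch. It assumes a random-walk-with-restart style of propagation, which is one common way to spread a score over a graph and is not necessarily the exact update rule used in this paper. The function name `spread_scores`, the parameters `alpha` and `iterations`, the edge weights, and the toy media objects are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def spread_scores(edges, modality, query, target_modality,
                  alpha=0.5, iterations=50, top_k=3):
    """Rank objects of target_modality by the score spread from the query.

    edges: list of (u, v, weight) tuples; the graph is treated as undirected.
    modality: dict mapping every node name to its modality label.
    alpha: assumed fraction of score passed on to neighbors at each step.
    """
    # Build a symmetric weighted adjacency structure.
    neighbors = defaultdict(dict)
    for u, v, w in edges:
        neighbors[u][v] = w
        neighbors[v][u] = w

    nodes = list(modality)
    degree = {n: sum(neighbors[n].values()) for n in nodes}

    # Only the query example starts with a positive score.
    initial = {n: 1.0 if n == query else 0.0 for n in nodes}
    scores = dict(initial)

    for _ in range(iterations):
        updated = {}
        for n in nodes:
            # Each neighbor m passes on its score in proportion to edge weight.
            incoming = sum(w * scores[m] / degree[m]
                           for m, w in neighbors[n].items())
            updated[n] = alpha * incoming + (1 - alpha) * initial[n]
        scores = updated

    # Return the highest-scoring objects of the requested modality.
    candidates = [n for n in nodes
                  if modality[n] == target_modality and n != query]
    return sorted(candidates, key=lambda n: scores[n], reverse=True)[:top_k]


# Toy graph: an audio query linked to images by assumed correlation weights.
modality = {
    "audio_dog": "audio",
    "img_dog1": "image",
    "img_dog2": "image",
    "img_car": "image",
    "text_car": "text",
}
edges = [
    ("audio_dog", "img_dog1", 1.0),
    ("img_dog1", "img_dog2", 0.8),
    ("img_dog2", "img_car", 0.1),
    ("img_car", "text_car", 1.0),
]

ranked = spread_scores(edges, modality, "audio_dog", "image")
print(ranked)
```

In this toy graph the audio query reaches the images only through weighted edges, so images closer to the query in the graph receive higher scores and are returned first, which mirrors querying images by an audio example.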