With the development of the Internet and Web 2.0, large volumes of multimedia content have become available online. It is highly desirable to provide easy access to such content, i.e., efficient and precise retrieval of images that satisfies users' needs. Toward this goal, content-based image retrieval (CBIR) has been intensively studied in the research community, while text-based search is more widely adopted in industry. Both approaches have inherent disadvantages and limitations. Therefore, unlike the great success of text search, web image search engines remain immature. In this paper, we present iLike, a vertical image search engine that integrates both textual and visual features to improve retrieval performance. We bridge the semantic gap by capturing the meaning of each text term in the visual feature space and reweighting visual features according to their significance to the query terms. We also bridge the user intention gap, because we are able to infer the "visual meanings" behind textual queries. Last but not least, we provide a visual thesaurus, which is generated from the statistical similarity between the visual-space representations of textual terms. Experimental results show that our approach improves both precision and recall compared with content-based or text-based image retrieval techniques. More importantly, search results from iLike are more consistent with users' perception of the query terms.
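The abstract does not specify the exact formulas iLike uses, but the two key ideas — representing a text term in the visual feature space and reweighting visual features by their significance to that term — can be illustrated with a minimal sketch. Here we assume (hypothetically) that a term's visual profile is the mean feature vector of images tagged with it, that a feature's significance is how far the term's distribution deviates from the global distribution, and that the visual thesaurus is built from cosine similarity between term profiles; all function names are illustrative, not the paper's API.

```python
import numpy as np

def term_visual_profile(features, has_term):
    """Mean visual feature vector of images tagged with the term
    (one plausible 'visual meaning' of a text term)."""
    return features[has_term].mean(axis=0)

def feature_weights(features, has_term, eps=1e-8):
    """Weight each visual feature by how strongly the term's images
    deviate from the global distribution along that dimension."""
    mu_all = features.mean(axis=0)
    sigma_all = features.std(axis=0)
    mu_term = term_visual_profile(features, has_term)
    return np.abs(mu_term - mu_all) / (sigma_all + eps)

def weighted_search(query_feat, features, weights, k=5):
    """Rank images by weighted Euclidean distance to the query features."""
    d = np.sqrt((((features - query_feat) * weights) ** 2).sum(axis=1))
    return np.argsort(d)[:k]

def visual_thesaurus(term_profiles):
    """Pairwise cosine similarity between term profiles in visual space;
    high similarity suggests 'visually synonymous' terms."""
    norms = np.linalg.norm(term_profiles, axis=1, keepdims=True)
    unit = term_profiles / np.clip(norms, eps := 1e-8, None)
    return unit @ unit.T
```

For example, if images tagged "sunset" cluster along a color-histogram dimension, `feature_weights` emphasizes that dimension for the query "sunset", and `visual_thesaurus` would score "sunset" and "dusk" as close if their profiles align.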