Interactive Multimodal Visual Search on Mobile Device

5 Author(s)
Houqiang Li (Univ. of Sci. & Technol. of China, Hefei, China); Yang Wang; Tao Mei; Jingdong Wang

This paper describes a novel multimodal interactive image search system for mobile devices. The system, Joint search with ImaGe, Speech, And Word Plus (JIGSAW+), takes full advantage of the multimodal input and natural user interactions of mobile devices. It is designed for users who already have a picture in mind but have no precise description or name for it. By describing the picture using speech and then refining the recognized query by interactively composing a visual query from exemplary images, the user can easily find the desired images through a few natural multimodal interactions with his/her mobile device. Compared with our previous work, JIGSAW, the algorithm has been significantly improved in three aspects: 1) a segmentation-based image representation is adopted to remove the artificial block partitions; 2) relative position checking replaces the fixed position penalty; and 3) an inverted index is constructed instead of brute-force matching. The proposed JIGSAW+ achieves a 5% gain in search performance and is ten times faster.
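
To illustrate the third improvement in general terms, the following minimal Python sketch contrasts an inverted index with brute-force matching. It is not the JIGSAW+ implementation: the toy database, the string "visual word" labels, the overlap-count score, and all function names are assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def build_inverted_index(images):
    """Map each visual word to the set of image IDs that contain it."""
    index = defaultdict(set)
    for image_id, words in images.items():
        for word in words:
            index[word].add(image_id)
    return index


def search_inverted(index, query_words):
    """Score only the images that share at least one word with the query."""
    scores = Counter()
    for word in set(query_words):
        for image_id in index.get(word, ()):
            scores[image_id] += 1
    return dict(scores)


def search_brute_force(images, query_words):
    """Baseline: compare the query against every image in the database."""
    query = set(query_words)
    scores = {}
    for image_id, words in images.items():
        overlap = len(query & set(words))
        if overlap:
            scores[image_id] = overlap
    return scores


# Hypothetical toy database: image ID -> visual words found in the image.
IMAGES = {
    "img_a": ["sky", "tree", "grass"],
    "img_b": ["sky", "building"],
    "img_c": ["car", "road"],
}
INDEX = build_inverted_index(IMAGES)

if __name__ == "__main__":
    print(search_inverted(INDEX, ["sky", "tree"]))
```

Both functions return the same scores, but the inverted index visits only the images listed under the query's words, whereas the brute-force baseline scans the whole database for every query.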

Published in:

IEEE Transactions on Multimedia (Volume: 15, Issue: 3)