We propose a novel image representation called the visual keyword histogram (VKH) for content-based indexing and retrieval. Visual keywords are domain-relevant visual prototypes (e.g. faces, foliage, buildings, etc.) with both perceptual appearance and textual semantics. VKHs are computed over a spatial tessellation to represent the distribution of visual keywords in various parts of an image. To construct a vocabulary of visual keywords, an incremental neural network is deployed to learn visual keywords from examples. This allows us to build domain-specific visual vocabularies rapidly and incrementally. Finally, we propose a new visual query language called Query by Spatial Icons (QBSI) that allows a user to specify a query in terms of "what" and "where". A visual query term constrains whether a visual keyword should be present in a given part of the image, and a query formula chains these terms into a disjunctive normal form via logical operators. We demonstrate our approach on real and complex home photos with very promising results.
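To make the two central ideas concrete, the following is a minimal sketch of how per-block keyword histograms and a "what"/"where" query in disjunctive normal form might fit together. It is an illustration under stated assumptions, not the implementation described above: it assumes each image patch has already been assigned a single keyword label (the learning step is not shown), and the function names, the `(keyword, block, present)` term layout, and the presence threshold of 0.25 are all hypothetical.

```python
from collections import Counter


def visual_keyword_histograms(labels, rows, cols):
    """Build one normalized keyword histogram per block of a rows x cols tessellation.

    `labels` is a 2D list holding one keyword label per image patch
    (assumption: hard labels, assigned by some earlier recognition step).
    """
    height, width = len(labels), len(labels[0])
    hists = {}
    for r in range(rows):
        for c in range(cols):
            y0, y1 = r * height // rows, (r + 1) * height // rows
            x0, x1 = c * width // cols, (c + 1) * width // cols
            counts = Counter(
                labels[y][x] for y in range(y0, y1) for x in range(x0, x1)
            )
            total = sum(counts.values())
            hists[(r, c)] = {k: v / total for k, v in counts.items()} if total else {}
    return hists


def term_holds(hists, term, threshold=0.25):
    """A term is (keyword, block, present): the "what", the "where", and whether
    the keyword should be present. The threshold is a hypothetical choice."""
    keyword, block, present = term
    is_present = hists[block].get(keyword, 0.0) >= threshold
    return is_present == present


def matches(hists, dnf_query, threshold=0.25):
    """Evaluate a query in disjunctive normal form: an OR over AND-ed terms."""
    return any(
        all(term_holds(hists, term, threshold) for term in conjunction)
        for conjunction in dnf_query
    )


# Toy 4x4 patch labelling, split into a 2x2 tessellation.
labels = [
    ["sky", "sky", "sky", "sky"],
    ["sky", "sky", "foliage", "foliage"],
    ["face", "face", "foliage", "foliage"],
    ["face", "face", "building", "building"],
]
hists = visual_keyword_histograms(labels, rows=2, cols=2)

# (face in the bottom-left block AND no building in the top-left block)
# OR (foliage in the top-left block)
query = [
    [("face", (1, 0), True), ("building", (0, 0), False)],
    [("foliage", (0, 0), True)],
]
print(matches(hists, query))
```

Representing the query as a list of conjunctions keeps evaluation to a single `any`/`all` pass. A soft-membership histogram or a graded match score could replace the hard threshold without changing the query structure.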