We describe a novel sparse image representation for fully automated content-based image retrieval using latent semantic indexing (LSI), together with a novel statistical model for efficient dimensionality reduction of sparse data. Although images can be represented sparsely, for instance by their discrete cosine transform (DCT) coefficients, this sparsity is destroyed during LSI-based dimension reduction. In our approach, we keep the decomposed data within a memory limit by using a statistical model of the sparse data. The aim is to find a small but "important" subset of coefficients that represents the semantics of images efficiently. The effectiveness of our approach is demonstrated on the large-scale image similarity task of the NIST TrecVid 2007 benchmark.
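
The loss of sparsity mentioned above can be illustrated with a minimal sketch. It is not the statistical model proposed here; it only shows the problem being addressed, under stated assumptions: random arrays stand in for images, a plain truncated SVD stands in for the LSI step, and the helper names (`dct_matrix`, `sparse_dct_features`, `density`) and the parameters `keep` and `k` are hypothetical choices made for illustration.

```python
import numpy as np


def dct_matrix(n):
    """Return the orthonormal DCT-II basis as an n x n matrix."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] /= np.sqrt(2.0)
    return m


def sparse_dct_features(images, keep):
    """Keep only the `keep` largest-magnitude 2-D DCT coefficients per image."""
    n = images.shape[1]
    d = dct_matrix(n)
    coeffs = d @ images @ d.T                  # 2-D DCT of every image
    flat = coeffs.reshape(len(images), -1)     # one row per image
    # Indices of all but the `keep` largest coefficients in each row.
    small = np.argsort(np.abs(flat), axis=1)[:, :-keep]
    out = flat.copy()
    np.put_along_axis(out, small, 0.0, axis=1)
    return out


def density(a, tol=1e-12):
    """Fraction of entries whose magnitude exceeds `tol`."""
    return np.count_nonzero(np.abs(a) > tol) / a.size


# Assumption: random data as a stand-in for a real image collection.
rng = np.random.default_rng(0)
images = rng.random((40, 16, 16))

features = sparse_dct_features(images, keep=20)   # shape (40, 256), sparse

# LSI-style reduction: truncated SVD of the coefficient-by-image matrix.
u, s, vt = np.linalg.svd(features.T, full_matrices=False)
k = 10
basis = u[:, :k]                 # shape (256, k), dense in general
reduced = basis.T @ features.T   # shape (k, 40), low-dimensional image codes

print(f"density of sparse DCT features: {density(features):.3f}")
print(f"density of LSI basis:           {density(basis):.3f}")
```

In this sketch the input keeps 20 of 256 coefficients per image, while the leading singular vectors are mostly nonzero, so the memory needed for the basis no longer benefits from the sparsity of the input.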