Due to the rapid development of information technology and the continuously increasing volume of available multimedia data, the task of retrieving information based on visual content has become a popular subject of scientific interest. Recent approaches adopt the bag-of-visual-words (BOVW) model to retrieve images in a semantic way. BOVW has shown remarkable performance in content-based image retrieval tasks, exhibiting better retrieval effectiveness than global and local feature (LF) representations. The performance of the BOVW approach depends strongly, however, on predicting the ideal codebook size, a difficult and database-dependent task. The contribution of this paper is threefold. First, it presents a new technique that uses a self-growing and self-organized neural gas network to calculate the most appropriate size of a codebook for a given database. Second, it proposes a new soft-weighting technique, whereby each LF is classified into only one visual word (VW) with a degree of participation. Third, by combining the information derived from the method that automatically detects the number of VWs, the soft-weighting method, and a color information extraction method from the literature, it shapes a new descriptor, called color VWs. Experimental results on two well-known benchmarking databases demonstrate that the proposed descriptor outperforms 15 contemporary descriptors and methods from the literature, in terms of both precision at K and its ability to retrieve the entire ground truth.
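To make the soft-weighting idea and the precision-at-K measure concrete, the following is a minimal sketch and not the method of the paper. The abstract does not give the participation formula, so the weight used here (the distance to the second-nearest visual word divided by the sum of the two nearest distances) is purely an illustrative assumption, as are the function names `soft_weighted_histogram` and `precision_at_k`. The codebook is assumed to be given; the neural-gas step that selects its size is not reproduced.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def soft_weighted_histogram(features, codebook):
    """Build a BOVW histogram in which each local feature votes for
    exactly one visual word, with a weight in [0.5, 1].

    The weight formula is an illustrative assumption, not the paper's.
    """
    if len(codebook) < 2:
        raise ValueError("codebook needs at least two visual words")
    hist = [0.0] * len(codebook)
    for f in features:
        # Rank visual words by distance to this feature.
        dists = sorted((euclidean(f, w), i) for i, w in enumerate(codebook))
        (d1, nearest), (d2, _) = dists[0], dists[1]
        # Assumed participation degree: 1.0 when the feature sits on the
        # nearest word, 0.5 when it is equidistant from the two nearest.
        weight = 1.0 if d1 + d2 == 0 else d2 / (d1 + d2)
        hist[nearest] += weight
    return hist


def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k retrieved items that are relevant."""
    top = ranked_ids[:k]
    return sum(1 for r in top if r in relevant_ids) / k
```

Under this assumed weighting, a feature lying exactly on a visual word contributes a full vote, whereas a feature halfway between two words contributes only half a vote to the word it is assigned to. Ambiguous features therefore influence the descriptor less than they would under hard assignment.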