Automatic semantic classification of image databases is very useful for searching and browsing, but it is also a challenging research problem. In this paper, we develop a hidden semantic concept discovery methodology for effective semantics-intensive image database classification. Each image is segmented into regions, from which a uniform and sparse region-based representation is obtained. Based on this representation, we propose a probabilistic model that rests on statistical hidden-class assumptions about the image database, and we apply the Expectation-Maximization (EM) technique to it to discover the semantic concepts hidden in the database. Building on the automatically learned concept vectors, we propose two methods that use the discovered semantic concepts for unsupervised and supervised image database classification, respectively. Theoretical analysis and experimental evaluation on a database of 10,000 general-purpose images show that the concept vectors are more reliable and robust, and thus more promising, than low-level features.
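To make the idea of applying EM to a hidden-class model more concrete, the following is a minimal sketch, not the model proposed in the paper. It assumes a pLSA-style formulation in which each image d is described by counts of quantized region tokens r, and hidden concepts z explain those counts through P(r|d) = sum over z of P(z|d) P(r|z). The function names, the count-matrix input, and the toy data are all hypothetical and chosen only for illustration; the rows of P(z|d) play the role of per-image concept vectors.

```python
import random


def normalize(values):
    """Scale non-negative values so they sum to 1 (uniform if all are zero)."""
    total = sum(values)
    if total == 0:
        return [1.0 / len(values)] * len(values)
    return [v / total for v in values]


def fit_hidden_concepts(counts, num_concepts, iterations=50, seed=0):
    """Fit a pLSA-style hidden-class model with EM (illustrative assumption).

    counts[d][r] is how often region token r occurs in image d.
    Returns (p_z_given_d, p_r_given_z); p_z_given_d[d] is the concept
    vector of image d.
    """
    rng = random.Random(seed)
    num_images = len(counts)
    num_tokens = len(counts[0])

    # Random positive initialization of both conditional distributions.
    p_z_given_d = [normalize([rng.random() for _ in range(num_concepts)])
                   for _ in range(num_images)]
    p_r_given_z = [normalize([rng.random() for _ in range(num_tokens)])
                   for _ in range(num_concepts)]

    for _ in range(iterations):
        expected_zd = [[0.0] * num_concepts for _ in range(num_images)]
        expected_rz = [[0.0] * num_tokens for _ in range(num_concepts)]

        for d in range(num_images):
            for r in range(num_tokens):
                n = counts[d][r]
                if n == 0:
                    continue
                # E-step: posterior P(z | d, r) for this (image, token) pair.
                posterior = normalize([
                    p_z_given_d[d][z] * p_r_given_z[z][r]
                    for z in range(num_concepts)
                ])
                # Accumulate expected counts used by the M-step.
                for z in range(num_concepts):
                    expected_zd[d][z] += n * posterior[z]
                    expected_rz[z][r] += n * posterior[z]

        # M-step: re-estimate both distributions from the expected counts.
        p_z_given_d = [normalize(row) for row in expected_zd]
        p_r_given_z = [normalize(row) for row in expected_rz]

    return p_z_given_d, p_r_given_z


# Toy data (hypothetical): 4 images described by counts of 4 region tokens.
toy_counts = [
    [5, 4, 0, 0],
    [4, 6, 0, 0],
    [0, 0, 5, 5],
    [0, 0, 6, 3],
]
concept_vectors, token_distributions = fit_hidden_concepts(toy_counts, num_concepts=2)
for image_index, vector in enumerate(concept_vectors):
    print(image_index, [round(p, 3) for p in vector])
```

Under these assumptions, the learned concept vectors could be clustered directly for unsupervised classification or passed to any standard classifier for supervised classification; the specific methods the paper uses for either step are not described in this abstract.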