Skip to Main Content
This work presents a novel sparse ensemble learning scheme for concept detection in videos. The proposed ensemble first exploits a sparse non-negative matrix factorization (NMF) process to represent data instances in parts and partition the data space into localities, and then coordinates the individual classifiers in each locality for final classification. In the sparse NMF, data exemplars are projected to a set of locality bases, in which the non-negative superposition of basis images reconstructs the original exemplars. This additive combination ensures that each locality captures the characteristics of data exemplars in part, thus enabling the local classifiers to hold reasonable diversity in their own regions of expertise. More importantly, the sparse NMF ensures that an exemplar is projected to only a few bases (localities) with non-zero coefficients. The resultant ensemble model is, therefore, sparse, in the way that only a small number of efficient classifiers in the ensemble will fire on a testing sample. Extensive tests on the TRECVid 08 and 09 datasets show that the proposed ensemble learning achieves promising results and outperforms existing approaches. The proposed scheme is feature-independent, and can be applied in many other large scale pattern recognition problems besides visual concept detection.