We propose an incremental classifier learning framework that starts with a small amount of labeled training data to create an initial set of classifiers, and gradually incorporates unlabeled data into the learning process to improve the models. A key to the effectiveness of the proposed framework is to judiciously select a good incremental learning subset from all remaining unlabeled samples by computing a confidence measure and a margin-like discrimination score that measures the potential contribution of the selected set to enhancing the existing models. To further refine this selection set, class prior densities are also exploited. The proposed framework was tested on an automatic image annotation application using a subset of the Corel image set. When all data, including both initially labeled and incrementally learned samples, were used, the final model achieved a significant improvement over the initial set of classifiers in terms of micro-averaged F1, even when only a small number of images were initially labeled. Furthermore, when 30% of the images were initially labeled, the incrementally learned models achieved results comparable to those of models trained with all training data labeled.
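The selection loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the nearest-centroid classifier, the softmax posterior used as the confidence measure, the top-two posterior gap used as the margin-like score, and the thresholds `min_conf` and `min_margin` are all assumptions chosen for brevity, and the refinement with class prior densities is omitted.

```python
import math


def softmax(scores):
    """Turn raw scores into a probability vector."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fit_centroids(X, y):
    """Nearest-centroid model: an assumed stand-in for the paper's classifiers."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in sums[c]] for c in sums}


def posteriors(centroids, x):
    """Pseudo-posteriors from negative squared distances (an assumption)."""
    labels = sorted(centroids)
    scores = [-sum((a - b) ** 2 for a, b in zip(x, centroids[c])) for c in labels]
    return labels, softmax(scores)


def select_subset(centroids, pool, min_conf=0.8, min_margin=0.5):
    """Keep samples whose confidence and top-two margin both pass a threshold."""
    selected = []
    for idx, x in enumerate(pool):
        labels, probs = posteriors(centroids, x)
        ranked = sorted(zip(probs, labels), reverse=True)
        top_p, top_label = ranked[0]
        runner_up = ranked[1][0] if len(ranked) > 1 else 0.0
        if top_p >= min_conf and top_p - runner_up >= min_margin:
            selected.append((idx, top_label))
    return selected


def incremental_learn(X_lab, y_lab, X_unlab, rounds=5):
    """Repeatedly pseudo-label the selected subset and refit the model."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    centroids = fit_centroids(X_lab, y_lab)
    for _ in range(rounds):
        picked = select_subset(centroids, pool)
        if not picked:
            break  # nothing left that the current model trusts
        chosen = {i for i, _ in picked}
        for i, label in picked:
            X_lab.append(pool[i])
            y_lab.append(label)
        pool = [x for i, x in enumerate(pool) if i not in chosen]
        centroids = fit_centroids(X_lab, y_lab)
    return centroids, pool
```

Calling `incremental_learn` with a few labeled feature vectors and a pool of unlabeled ones returns the refitted centroids together with the samples that were never confident enough to be absorbed. In this sketch, ambiguous samples near a class boundary stay in the pool rather than being given a possibly wrong label.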