As large-scale databases of hundreds of thousands of face images collected from the Internet and digital cameras become available, using them to train a well-performing face detector remains a challenging problem. In this paper, we propose a method that resamples a representative training set from a large-scale collected database to train a robust human face detector. First, in the high-dimensional space, we estimate geodesic distances between pairs of face samples in the collected face set by isometric feature mapping (Isomap) and then subsample the face set. Next, we embed the face set into a low-dimensional manifold space and obtain its low-dimensional embedding. In this embedding, we then interweave the face set based on the weights computed by locally linear embedding (LLE). We resample nonfaces by Isomap and LLE in the same way. Using the resulting face and nonface samples, we train an AdaBoost-based face detector and run it on a large database to collect false alarms. We then use these false detections to train a one-class support vector machine (SVM). Combining the AdaBoost-based detector with the one-class SVM, we obtain a stronger detector. Experimental results on the MIT + CMU frontal face test set demonstrate that the proposed method significantly outperforms other state-of-the-art methods.
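The first step, estimating geodesic distances in the Isomap manner and then subsampling, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that geodesic distance is approximated by shortest paths over a k-nearest-neighbor graph (the standard Isomap construction) and, since the abstract does not state the subsampling criterion, it assumes a simple greedy rule: a sample is kept only if it lies at least `min_geodesic` away from every sample already kept. The function names, `k`, `min_geodesic`, and the toy semicircle data are all hypothetical.

```python
import heapq
import math


def knn_graph(points, k):
    """Build an undirected k-nearest-neighbor graph with Euclidean edge weights."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        nearest = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in nearest[:k]:
            graph[i][j] = d
            graph[j][i] = d  # symmetrize so every edge can be walked both ways
    return graph


def geodesic_distances(graph, source):
    """Dijkstra shortest paths from `source`; graph path length approximates
    the geodesic distance along the data manifold."""
    dist = {node: math.inf for node in graph}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def subsample(points, k, min_geodesic):
    """Greedy subsampling (an assumed rule, not taken from the paper): keep a
    sample only if it is at least `min_geodesic` from all samples kept so far."""
    graph = knn_graph(points, k)
    kept, kept_dists = [], []
    for i in range(len(points)):
        if all(d[i] >= min_geodesic for d in kept_dists):
            kept.append(i)
            kept_dists.append(geodesic_distances(graph, i))
    return kept


# Toy stand-in for face samples: 21 points on a unit semicircle, i.e. a
# one-dimensional manifold embedded in a two-dimensional space.
points = [(math.cos(math.pi * t / 20), math.sin(math.pi * t / 20)) for t in range(21)]
graph = knn_graph(points, k=2)
end_to_end = geodesic_distances(graph, 0)[20]
kept = subsample(points, k=2, min_geodesic=0.5)
print(f"geodesic: {end_to_end:.3f}, euclidean: {math.dist(points[0], points[20]):.3f}")
print("kept indices:", kept)
```

On this toy data the graph distance between the two endpoints follows the arc and comes out close to π, whereas the straight-line distance is 2. That gap is the reason for measuring sample similarity along the manifold before deciding which samples are redundant. For real image data, a library implementation of Isomap and LLE would replace these hand-written helpers.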