In this paper, we present an automatic web image and video mining framework whose ultimate goal is a universal human age estimator based on facial information, applicable to all ethnic groups and to images of varying quality. On the one hand, a large (391 k) yet noisy human aging image database is collected from Flickr and Google Image using a set of human age-related text queries. Multiple human face detectors based on distinct techniques are adopted to prune noise during face detection. For each image, the detected faces with high detection confidence constitute a bag of face instances. We further remove outliers via principal component analysis (PCA), which results in a condensed image database of about 175 k face instances. A robust multi-instance regressor learning algorithm is then developed to learn a kernel regression-based human age estimator in the presence of bag label noise. On the other hand, about 10 k video clips are downloaded from YouTube, and tracked face sequences are extracted from them. Although their age labels are unknown, the tracked faces within a sequence naturally share the same age. This age-consistency constraint on face pairs is used as an extra regularizer to enhance the robustness of the age estimator. The derived human age estimator is extensively evaluated on three benchmark human aging databases; without using any images from these benchmarks as training samples, it achieves age estimation accuracy comparable to state-of-the-art results.
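
As an illustration of the PCA-based outlier removal step, the following is a minimal sketch. The abstract does not specify how PCA is used to flag outliers, so this assumes one common choice: discard the samples with the largest PCA reconstruction error. The function name `pca_outlier_mask`, the parameters `n_components` and `keep_fraction`, and the synthetic feature vectors are all hypothetical and stand in for real face descriptors.

```python
import numpy as np

def pca_outlier_mask(X, n_components=5, keep_fraction=0.9):
    """Return a boolean mask that is True for rows of X kept as inliers.

    Assumption (not from the paper): a row is an outlier when its squared
    PCA reconstruction error is above the keep_fraction quantile.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # center the data
    # Principal axes are the right singular vectors of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:n_components]                        # (n_components, n_features)
    recon = Xc @ W.T @ W                         # projection onto the PCA subspace
    err = np.sum((Xc - recon) ** 2, axis=1)      # squared residual per sample
    return err <= np.quantile(err, keep_fraction)

# Synthetic stand-in for face features: 500 samples near a 3-D subspace
# of a 20-D space, followed by 10 unstructured outliers.
rng = np.random.default_rng(0)
basis = rng.normal(size=(3, 20))
inliers = rng.normal(size=(500, 3)) @ basis + 0.01 * rng.normal(size=(500, 20))
outliers = 3.0 * rng.normal(size=(10, 20))
X = np.vstack([inliers, outliers])

mask = pca_outlier_mask(X, n_components=3, keep_fraction=0.9)
X_clean = X[mask]                                # condensed feature set
```

A fixed quantile threshold always discards the same fraction of samples, so some genuine faces are dropped along with the noise; a threshold on the absolute residual would be an alternative under the same assumptions.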