In many computer vision and pattern recognition problems, the data set is very large while labels are expensive to obtain. The challenge is thus to determine which unlabeled samples would be the most informative (i.e., improve the classifier the most) if they were labeled and used as training samples. In particular, we consider the problem of active learning of a regression model in the context of experimental design. Classical optimal experimental design approaches are based on the least squares error over the measured samples only and fail to take the unmeasured samples into account. In this paper, we propose a novel active learning algorithm which operates over graphs. Our algorithm is based on a graph Laplacian regularized regression model which simultaneously minimizes the least squares error on the measured samples and preserves the local geometrical structure of the data space. The geometrical structure is described by the graph Laplacian of a nearest neighbor graph constructed over the data. We discuss how results from the field of optimal experimental design may be used to guide the selection of a subset of data points that provides the most information. Experiments demonstrate the superior performance of the proposed algorithm in comparison with conventional algorithms.
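The pipeline described above (nearest neighbor graph, graph Laplacian, Laplacian regularized regression, design-based sample selection) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the function names, the binary edge weights, the parameters `k`, `lam_graph`, and `lam_ridge`, and the greedy D-optimal-style selection rule. The model is assumed to be linear, with weights minimizing the squared error on the measured rows plus `lam_graph * w^T X^T L X w + lam_ridge * ||w||^2`, where `X` holds all samples (measured and unmeasured) and `L` is the graph Laplacian.

```python
import numpy as np


def knn_graph_laplacian(X, k=5):
    """Unnormalized Laplacian L = D - W of a symmetrized 0/1 k-NN graph."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances between all rows of X.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor
    nbrs = np.argsort(d2, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    W = np.maximum(W, W.T)  # edge if either point lists the other
    return np.diag(W.sum(axis=1)) - W


def greedy_select(X, L, m, lam_graph=0.1, lam_ridge=0.01):
    """Greedily pick m rows of X to label (assumed D-optimal-style rule).

    H = Z^T Z + lam_graph * X^T L X + lam_ridge * I, with Z the chosen rows.
    By the matrix determinant lemma, adding row x multiplies det(H) by
    1 + x^T H^{-1} x, so each step picks the row with the largest gain.
    """
    n, d = X.shape
    H_inv = np.linalg.inv(lam_graph * (X.T @ L @ X) + lam_ridge * np.eye(d))
    available = np.ones(n, dtype=bool)
    selected = []
    for _ in range(m):
        gains = np.einsum("ij,jk,ik->i", X, H_inv, X)
        gains[~available] = -np.inf  # never pick the same sample twice
        i = int(np.argmax(gains))
        selected.append(i)
        available[i] = False
        # Sherman-Morrison rank-one update of H^{-1} after adding x x^T.
        Hx = H_inv @ X[i]
        H_inv -= np.outer(Hx, Hx) / (1.0 + X[i] @ Hx)
    return selected


def fit_laplacian_ridge(X, L, idx, y, lam_graph=0.1, lam_ridge=0.01):
    """Solve the Laplacian regularized least squares problem for w."""
    Z = X[idx]
    H = Z.T @ Z + lam_graph * (X.T @ L @ X) + lam_ridge * np.eye(X.shape[1])
    return np.linalg.solve(H, Z.T @ y)
```

A typical use would be to build `L` from all available samples, call `greedy_select` to choose which samples to send for labeling, and then call `fit_laplacian_ridge` once their labels `y` are known. Note that the selection step never looks at labels, which is what allows it to run before any labeling cost is incurred.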