This paper proposes two new graph-based query strategies for active learning, in a framework that combines readily with semi-supervised learning based on label propagation. The first strategy selects instances independently so as to maximize the change to a maximum entropy model, using the label propagation results in a gradient-length measure of model change. The second strategy is a batch criterion that integrates label uncertainty with diversity and density objectives. Experiments on sentiment classification demonstrate that both methods consistently improve over a standard active learning baseline, and that the batch criterion also gives consistent improvement over semi-supervised learning alone.
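As an illustration of the first strategy, the following is a minimal sketch of one possible reading of a gradient-length criterion weighted by label propagation output. It is not taken from the paper: the function names, the iterative clamped propagation scheme, the row-normalized affinity matrix `W`, and the use of a softmax (multinomial logistic regression) model as the maximum entropy classifier are all assumptions made for the example.

```python
import numpy as np

def label_propagation(W, Y_labeled, labeled_idx, n_iter=200):
    """Hypothetical helper: iterative label propagation with clamped labels.

    W           : (n, n) non-negative affinity matrix, every row sum > 0
    Y_labeled   : (m, k) one-hot labels of the labeled nodes
    labeled_idx : indices of the m labeled nodes
    Returns an (n, k) matrix of soft label distributions.
    """
    n, k = W.shape[0], Y_labeled.shape[1]
    P = W / W.sum(axis=1, keepdims=True)   # row-stochastic transition matrix
    F = np.full((n, k), 1.0 / k)           # uniform start for unlabeled nodes
    F[labeled_idx] = Y_labeled
    for _ in range(n_iter):
        F = P @ F                          # spread label mass along edges
        F[labeled_idx] = Y_labeled         # clamp the known labels
    return F

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)   # subtract row max for stability
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def expected_gradient_length(X, theta, F):
    """Assumed scoring rule: sum_y F[i, y] * ||grad log p(y | x_i; theta)||.

    X     : (n, d) feature matrix
    theta : (d, k) weights of the softmax model
    F     : (n, k) soft labels, here taken from label propagation
    """
    P_model = softmax(X @ theta)
    x_norm = np.linalg.norm(X, axis=1)
    scores = np.zeros(X.shape[0])
    for y in range(P_model.shape[1]):
        resid = -P_model.copy()
        resid[:, y] += 1.0                 # one_hot(y) - p(. | x)
        # The gradient is the outer product of x and resid, so its
        # Frobenius norm factors as ||x|| * ||resid||.
        scores += F[:, y] * x_norm * np.linalg.norm(resid, axis=1)
    return scores

# Toy usage: a 4-node chain graph with the two end nodes labeled.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
F = label_propagation(W, np.eye(2), labeled_idx=[0, 3])
X = np.array([[1.0, 0.0], [0.8, 0.2], [0.2, 0.8], [0.0, 1.0]])
theta = np.zeros((2, 2))                   # untrained model
scores = expected_gradient_length(X, theta, F)
query = int(np.argmax(scores[[1, 2]]))     # pick among unlabeled nodes 1 and 2
```

Under these assumptions, an instance scores highly when the labels that propagation considers likely would each force a large parameter update, which is the sense in which the propagation results enter the measure of model change.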
Published in: IEEE Transactions on Audio, Speech, and Language Processing (Volume: 21, Issue: 2)
Date of Publication: Feb. 2013