We describe active learning methods for reducing the labeling effort in a statistical call classification system. Active learning aims to minimize the number of labeled utterances by automatically selecting for labeling those utterances that are likely to be most informative. The first method, inspired by certainty-based active learning, selects the examples that the classifier is least confident about. The second method, inspired by committee-based active learning, selects the examples on which multiple classifiers disagree. We have evaluated these active learning methods on a call classification system used for AT&T customer care. Our results indicate that it is possible to reduce human labeling effort by at least a factor of two.
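
The two selection criteria can be illustrated with a minimal sketch. This is not the system's actual implementation: the function names, the use of the top class score as the confidence measure, and the use of the majority-vote fraction as the agreement measure are all assumptions made for illustration.

```python
from collections import Counter
from typing import List, Sequence


def select_least_confident(posteriors: Sequence[Sequence[float]], k: int) -> List[int]:
    """Certainty-based selection (illustrative).

    posteriors[i] holds the class scores a single classifier assigns to
    utterance i. Confidence is assumed to be the top class score; the k
    utterances with the lowest confidence are returned for labeling.
    """
    confidence = [max(scores) for scores in posteriors]
    ranked = sorted(range(len(posteriors)), key=lambda i: confidence[i])
    return ranked[:k]


def select_by_disagreement(committee_labels: Sequence[Sequence[str]], k: int) -> List[int]:
    """Committee-based selection (illustrative).

    committee_labels[m][i] is the label classifier m predicts for utterance i.
    Agreement is assumed to be the fraction of classifiers that vote for the
    most common label; the k utterances with the lowest agreement are returned.
    """
    n_members = len(committee_labels)
    n_utterances = len(committee_labels[0])

    def agreement(i: int) -> float:
        votes = Counter(member[i] for member in committee_labels)
        return votes.most_common(1)[0][1] / n_members

    ranked = sorted(range(n_utterances), key=agreement)
    return ranked[:k]
```

In either case, the selected utterances would be sent to human labelers, added to the training set, and the classifier (or committee) retrained before the next selection round.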