By Topic

Active Property Testing

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$33 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

4 Author(s)
Balcan, M.-F. ; Sch. of Comput. Sci., Georgia Inst. of Technol., Atlanta, GA, USA ; Blais, E. ; Blum, A. ; Liu Yang

One motivation for property testing of boolean functions is the idea that testing can provide a fast preprocessing step before learning. However, in most machine learning applications, it is not possible to request for labels of arbitrary examples constructed by an algorithm. Instead, the dominant query paradigm in applied machine learning, called active learning, is one where the algorithm may query for labels, but only on points in a given (polynomial-sized) unlabeled sample, drawn from some underlying distribution D. In this work, we bring this well-studied model to the domain of testing. We develop both general results for this active testing model as well as efficient testing algorithms for several important properties for learning, demonstrating that testing can still yield substantial benefits in this restricted setting. For example, we show that testing unions of d intervals can be done with O(1) label requests in our setting, whereas it is known to √ require Ω(d) labeled examples for learning (and Ω(√d) for passive testing [22] where the algorithm must pay for every example drawn from D). In fact, our results for testing unions of intervals also yield improvements on prior work in both the classic query model (where any point in the domain can be queried) and the passive testing model as well. For the problem of testing linear separators in Rn over the Gaussian distribution, we show that both active and passive testing can be done with O(√n) queries, substantially less than the Ω(n) needed for learning, with near-matching lower bounds. We also present a general combination result in this model for building testable properties out of others, which we then use to provide testers for a number of assumptions used in semi-supervised learning. In addition to the above results, we also develop a general notion of the testing dimension of a given property with respect to a given distribution, that we show characteriz- s (up to constant factors) the intrinsic number of label requests needed to test that property. We develop such notions for both the active and passive testing models. We then use these dimensions to prove a number of lower bounds, including for linear separators and the class of dictator functions.

Published in:

Foundations of Computer Science (FOCS), 2012 IEEE 53rd Annual Symposium on

Date of Conference:

20-23 Oct. 2012