Is it better to design a classifier and estimate its error on the full sample or to design a classifier on a training subset and estimate its error on the hold-out test subset? Full-sample design provides the better classifier; nevertheless, one might choose hold-out with the hope of better error estimation. A conservative criterion to decide the best course is to aim at a classifier whose error is less than a given bound. Then the choice between full-sample and hold-out design depends on which possesses the smaller expected bound. Using this criterion, we examine the choice between hold-out and several full-sample error estimators using covariance models. The relation between the two designs is revealed via a decomposition of the expected bound into the sum of the expected true error and the expected conditional standard deviation of the true error.
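The decomposition mentioned in the last sentence can be sketched in symbols. The notation here is an assumption made for illustration and is not taken from the source: $\varepsilon$ denotes the true error of the designed classifier, $\hat{\varepsilon}$ its estimate, and $B(\hat{\varepsilon})$ a bound formed from the conditional mean plus one conditional standard deviation (the actual bound may scale the standard deviation by a different factor). Under that assumption, the law of total expectation gives:

```latex
% Assumed form of the bound (illustrative notation, not from the source):
%   conditional mean of the true error plus its conditional standard deviation.
\[
  B(\hat{\varepsilon})
    = \mathrm{E}[\varepsilon \mid \hat{\varepsilon}]
    + \sqrt{\operatorname{Var}[\varepsilon \mid \hat{\varepsilon}]}
\]
% Averaging over the estimate and applying the law of total expectation,
%   E[ E[eps | eps_hat] ] = E[eps],
% splits the expected bound into the two terms named in the text.
\[
  \mathrm{E}\bigl[B(\hat{\varepsilon})\bigr]
    = \underbrace{\mathrm{E}[\varepsilon]}_{\text{expected true error}}
    + \underbrace{\mathrm{E}\Bigl[\sqrt{\operatorname{Var}[\varepsilon \mid \hat{\varepsilon}]}\Bigr]}_{\text{expected conditional standard deviation}}
\]
```

Read this way, full-sample design lowers the first term because it yields the better classifier, while hold-out is attractive only if it lowers the second term by enough to compensate.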