Skip to Main Content
The global prediction of a homogeneous ensemble of classifiers generated in independent applications of a randomized learning algorithm on a fixed training set is analyzed within a Bayesian framework. Assuming that majority voting is used, it is possible to estimate with a given confidence level the prediction of the complete ensemble by querying only a subset of classifiers. For a particular instance that needs to be classified, the polling of ensemble classifiers can be halted when the probability that the predicted class will not change when taking into account the remaining votes is above the specified confidence level. Experiments on a collection of benchmark classification problems using representative parallel ensembles, such as bagging and random forests, confirm the validity of the analysis and demonstrate the effectiveness of the instance-based ensemble pruning method proposed.