On the Dual Formulation of Boosting Algorithms



Abstract:

We study boosting algorithms from a new perspective. We show that the Lagrange dual problems of ℓ1-norm-regularized AdaBoost, LogitBoost, and soft-margin LPBoost with generalized hinge loss are all entropy maximization problems. By looking at the dual problems of these boosting algorithms, we show that the success of boosting can be understood in terms of maintaining a better margin distribution: maximizing margins while at the same time controlling the margin variance. We also theoretically prove that, approximately, ℓ1-norm-regularized AdaBoost maximizes the average margin rather than the minimum margin. The duality formulation also enables us to develop column-generation-based optimization algorithms, which are totally corrective. We show that they exhibit almost identical classification results to those of standard stagewise additive boosting algorithms, but with much faster convergence rates. Therefore, fewer weak classifiers are needed to build the ensemble using our proposed optimization technique.
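
To make the abstract's central claim concrete, the following is a rough sketch of the kind of primal-dual pair involved, assuming an ℓ1-norm-constrained exponential-loss primal over base classifiers h_j and sample weights u on the probability simplex Δ_m; the exact formulation and constants are those derived in the paper, not this sketch.

```latex
% A hedged sketch (not the paper's exact statement): an l1-constrained
% exponential-loss primal and the entropy-maximization form of its dual.
\begin{align*}
\text{Primal:}\quad
  & \min_{w \,\ge\, 0}\;
    \log \sum_{i=1}^{m} \exp\!\Big(-y_i \textstyle\sum_j w_j h_j(x_i)\Big)
    \quad \text{s.t.}\;\; \|w\|_1 \le T, \\
\text{Dual:}\quad
  & \max_{u \,\in\, \Delta_m,\; r}\;
    -\sum_{i=1}^{m} u_i \log u_i \;-\; T\,r
    \quad \text{s.t.}\;\; \sum_{i=1}^{m} u_i\, y_i\, h_j(x_i) \le r
    \;\;\text{for all } j.
\end{align*}
```

In this sketch the dual objective is the Shannon entropy of the sample weights u, which is the sense in which the duals are entropy maximization problems; each constraint bounds a weak classifier's edge by r, and column generation adds at each iteration the most violated constraint, i.e., the weak classifier with the largest edge.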
Page(s): 2216 - 2231
Date of Publication: 18 March 2010

PubMed ID: 20975119

1 Introduction

Boosting has attracted much research interest since the first practical boosting algorithm, AdaBoost, was introduced by Freund and Schapire [1]. The machine learning community has spent much effort on understanding how the algorithm works [2], [3], [4]. However, to date there are still questions about the success of boosting that remain unanswered [5]. In boosting, one is given a set of training examples $(x_i, y_i)$, $i = 1, \dots, m$, with binary labels $y_i$ being either $+1$ or $-1$. A boosting algorithm finds a convex linear combination of weak classifiers (a.k.a. base learners, weak hypotheses) that can achieve much better classification accuracy than an individual base classifier. To do so, there are two unknown variables to be optimized. The first is the base classifiers themselves; an oracle is needed to produce them. The second is the positive weight associated with each base classifier.
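
As a concrete illustration of these two unknowns, here is a minimal sketch of stagewise AdaBoost with decision-stump base classifiers; the stump oracle, the dataset, and all names are hypothetical, and this shows only the generic oracle-plus-weight structure, not the totally corrective algorithms developed in this paper.

```python
# Minimal AdaBoost sketch: an "oracle" returns the best base classifier
# (a decision stump) under the current sample weights, and the algorithm
# assigns it a positive weight alpha. Hypothetical example, numpy only.
import numpy as np

def stump_oracle(X, y, u):
    """Oracle: return the stump (feature, threshold, polarity) with the
    smallest weighted error under sample weights u."""
    best = None
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for s in (1, -1):
                pred = s * np.sign(X[:, j] - thr)
                pred[pred == 0] = s
                err = np.sum(u[pred != y])
                if best is None or err < best[0]:
                    best = (err, j, thr, s)
    return best  # (weighted error, feature index, threshold, polarity)

def adaboost(X, y, T=20):
    m = len(y)
    u = np.full(m, 1.0 / m)      # sample weights over training examples
    ensemble = []                # list of (alpha, feature, threshold, polarity)
    for _ in range(T):
        err, j, thr, s = stump_oracle(X, y, u)
        err = np.clip(err, 1e-12, 1 - 1e-12)
        alpha = 0.5 * np.log((1 - err) / err)   # positive weight of the stump
        pred = s * np.sign(X[:, j] - thr)
        pred[pred == 0] = s
        u *= np.exp(-alpha * y * pred)          # upweight misclassified examples
        u /= u.sum()
        ensemble.append((alpha, j, thr, s))
    return ensemble

def predict(ensemble, X):
    F = np.zeros(len(X))
    for alpha, j, thr, s in ensemble:
        pred = s * np.sign(X[:, j] - thr)
        pred[pred == 0] = s
        F += alpha * pred
    return np.sign(F)

# Example usage on a hypothetical toy problem:
# X = np.random.randn(200, 2); y = np.sign(X[:, 0] + X[:, 1])
# ens = adaboost(X, y); acc = np.mean(predict(ens, X) == y)
```

Note that this sketch is stagewise: past weights alpha are never revisited. The column-generation algorithms proposed in the paper are instead totally corrective, re-optimizing all weights after each new base classifier is added, which is what yields the faster convergence reported in the abstract.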
