Applying Permutation Tests for Assessing the Statistical Significance of Wrapper Based Feature Selection


Authors:

Antti Airola (Dept. of Inf. Technol., Univ. of Turku, Turku, Finland); Tapio Pahikkala; Jorma Boberg; Tapio Salakoski

Feature selection is commonly used in bioinformatics applications, such as gene selection from DNA microarray data. Recently, wrapper methods have been proposed as an improvement over the traditionally used filter-based feature selection methods. In wrapper methods, the quality of a feature set is often measured by the cross-validation performance of a machine learning method trained on those features. This can lead to overfitting: the cross-validation performance on the final selected feature set may be high even when the selected features are in fact not informative. Evaluating the statistical significance of the obtained results is therefore a major concern. Non-parametric permutation tests have previously been used as a univariate filter for selecting individual features. In contrast, we propose using such tests to measure the statistical significance of the whole selection process carried out by a wrapper method. We achieve computational efficiency by using a regularized least-squares based wrapper method, which combines a state-of-the-art classifier with matrix calculus based computational shortcuts for greedy forward feature selection. Permutation tests prove to be a practical tool for estimating the significance of the obtained results, as shown in simulations and experiments on two DNA microarray data sets.
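
The idea of permuting the labels and rerunning the entire wrapper can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`loo_predictions`, `greedy_forward_selection`, `permutation_test`), the regularization value, the synthetic data, and the use of leave-one-out accuracy as the selection criterion are all assumptions made for illustration. It also uses only the standard closed-form leave-one-out shortcut for regularized least squares, not the faster matrix calculus shortcuts for greedy selection mentioned in the abstract.

```python
import numpy as np

def loo_predictions(X, y, lam=1.0):
    """Closed-form leave-one-out predictions for regularized least squares
    (no intercept). Illustrative helper, not taken from the paper."""
    d = X.shape[1]
    # Hat matrix H = X (X^T X + lam I)^{-1} X^T
    H = X @ np.linalg.solve(X.T @ X + lam * np.eye(d), X.T)
    h = np.diag(H)
    # Prediction for example i from a model trained without example i
    return (H @ y - h * y) / (1.0 - h)

def loo_accuracy(X, y, lam=1.0):
    """Leave-one-out classification accuracy for labels in {-1, +1}."""
    return float(np.mean(np.sign(loo_predictions(X, y, lam)) == y))

def greedy_forward_selection(X, y, k, lam=1.0):
    """Greedily add the feature that maximizes LOO accuracy, k times.
    Returns the selected indices and the LOO accuracy of the final set."""
    selected, score = [], 0.0
    for _ in range(k):
        candidates = [j for j in range(X.shape[1]) if j not in selected]
        scores = [loo_accuracy(X[:, selected + [j]], y, lam) for j in candidates]
        best = int(np.argmax(scores))
        selected.append(candidates[best])
        score = scores[best]
    return selected, score

def permutation_test(X, y, k, n_perm=50, lam=1.0, seed=0):
    """Permutation p-value for the whole selection process: the wrapper is
    rerun from scratch on each label permutation."""
    rng = np.random.default_rng(seed)
    _, observed = greedy_forward_selection(X, y, k, lam)
    null_scores = np.array([
        greedy_forward_selection(X, rng.permutation(y), k, lam)[1]
        for _ in range(n_perm)
    ])
    # Add-one correction keeps the estimate strictly positive
    p_value = (1 + np.sum(null_scores >= observed)) / (1 + n_perm)
    return observed, float(p_value)

# Synthetic example (assumed data): feature 0 is informative, the rest is noise.
rng = np.random.default_rng(42)
y = np.repeat([-1.0, 1.0], 20)
X = rng.normal(size=(40, 20))
X[:, 0] = 3.0 * y + 0.5 * rng.normal(size=40)

selected, _ = greedy_forward_selection(X, y, k=2)
observed, p_value = permutation_test(X, y, k=2, n_perm=50)
print("selected features:", selected)
print("LOO accuracy:", observed, "permutation p-value:", p_value)
```

Because the selection is repeated inside every permutation, the null distribution reflects how high the cross-validation score can climb through selection alone, which is the overfitting effect the abstract describes. With `n_perm` permutations the smallest attainable p-value is `1 / (n_perm + 1)`, so the number of permutations bounds the resolution of the test.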

Published in:

Machine Learning and Applications (ICMLA), 2010 Ninth International Conference on

Date of Conference:

12-14 Dec. 2010