By Topic

Exploring Software Quality Classification with a Wrapper-Based Feature Ranking Technique

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$31 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

3 Author(s)
Kehan Gao ; Eastern Connecticut State Univ., Willimantic, CT, USA ; Khoshgoftaar, T. ; Napolitano, A.

Feature selection is a process of selecting a subset of relevant features for building learning models. It is an important activity for data preprocessing used in software quality modeling and other data mining problems. Feature selection algorithms can be divided into two categories, feature ranking and feature subset selection. Feature ranking orders the features by a criterion and a user selects some of the features that are appropriate for a given scenario. Feature subset selection techniques search the space of possible feature subsets and evaluate the suitability of each. This paper investigates performance metric based feature ranking techniques by using the multilayer perceptron (MLP) learner with nine different performance metrics. The nine performance metrics include overall accuracy (OA), default F-measure (DFM), default geometric mean (DGM), default arithmetic mean (DAM), area under ROC (AUC), area under PRC (PRC), best F-measure (BFM), best geometric mean (BGM) and best arithmetic mean (BAM). The goal of the paper is to study the effect of the different performance metrics on the feature ranking results, which in turn influences the classification performance. We assessed the performance of the classification models constructed on those selected feature subsets through an empirical case study that was carried out on six data sets of real-world software systems. The results demonstrate that AUC, PRC, BFM, BGM and BAM as performance metrics for feature ranking outperformed the other performance metrics, OA, DFM, DGMand DAM, unanimously across all the data sets and therefore are recommended based on this study. In addition, the performances of the classification models were maintained or even improved when over 85 percent of the features were eliminated from the original data sets.

Published in:

Tools with Artificial Intelligence, 2009. ICTAI '09. 21st International Conference on

Date of Conference:

2-4 Nov. 2009