Microarray Data Mining: A New Algorithm for Gene Selection Using Lorenz Curves & Gini Ratios

1 Author(s)
Quoc-Nam Tran ; Lamar (Texas State) Univ., TX, USA

Gene selection is a challenging task in microarray data mining because a typical microarray dataset has only a small number of records but thousands of attributes. With so many attributes and so few records, there is a high likelihood of finding predictive patterns that are due to chance alone. Finding the most relevant genes is often the key phase in building an accurate classification model, since irrelevant and redundant attributes reduce the accuracy of classification algorithms. In this paper, we present a new method for gene selection that uses techniques from economics. We modify Lorenz curves and Gini coefficients to take into account the order of the classes and the order of each gene's discretized values, and we use them to select relevant genes. We believe that our method is the first attribute selection method that considers both the order of the classes and the order of the attribute's discretized values. We implemented this new method and compared it with SAM, one of the most popular gene selection methods. Experimental results with many different classification algorithms, for the task of classifying lung adenocarcinomas from gene expression, show that (a) our new method differs from SAM in that it finds very different sets of significant genes, and (b) our method selects genes that lead to more accurate classification.
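For readers unfamiliar with the economics tools named in the abstract, the following is a minimal sketch of the classical Lorenz curve and Gini coefficient only. It is not the modified, order-aware version the paper proposes, which the abstract does not specify. The function names `lorenz_curve` and `gini_coefficient` and the sample inputs are assumptions chosen for illustration.

```python
from typing import List, Tuple


def lorenz_curve(values: List[float]) -> List[Tuple[float, float]]:
    """Return classical Lorenz curve points.

    Each point is (cumulative share of records, cumulative share of total value),
    with records sorted from smallest to largest value.
    Hypothetical helper for illustration; not the paper's modified curve.
    """
    if not values or any(v < 0 for v in values):
        raise ValueError("values must be non-empty and non-negative")
    total = sum(values)
    if total == 0:
        raise ValueError("values must have a positive sum")

    ordered = sorted(values)
    n = len(ordered)
    points = [(0.0, 0.0)]
    running = 0.0
    for i, v in enumerate(ordered, start=1):
        running += v
        points.append((i / n, running / total))
    return points


def gini_coefficient(values: List[float]) -> float:
    """Classical Gini coefficient: 1 minus twice the area under the Lorenz curve.

    The area is computed with the trapezoid rule over the Lorenz points.
    """
    points = lorenz_curve(values)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return 1.0 - 2.0 * area


if __name__ == "__main__":
    # Made-up inputs: a perfectly even distribution and a fully concentrated one.
    print(gini_coefficient([1.0, 1.0, 1.0, 1.0]))
    print(gini_coefficient([0.0, 0.0, 0.0, 4.0]))
```

In this classical form, four equal values give a coefficient of 0, and concentrating all value in one of four records gives 0.75, the maximum for a sample of that size. How such a score would be adapted to class order and discretized gene values is described in the paper itself, not here.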

Published in:

Information Technology: New Generations (ITNG), 2010 Seventh International Conference on

Date of Conference:

12-14 April 2010