Abstract:
Feature selection plays an important role in reducing the size of datasets by choosing the most informative features and discarding the rest. The use of feature selection in microarray datasets for detecting cancer has been widely investigated. In this paper, we compare perturbation-based feature selection (PFS) with traditional methods, such as principal component analysis (PCA), correlation-based feature selection (CFS), and least-angle regression (LARS), and with more recent methods, such as Hilbert-Schmidt independence criterion Lasso (HSIC-Lasso), minimum redundancy maximum relevance (mRMR), and feature selection using support vector machines (FS-SVM). We evaluate each method through a series of comparisons on genomic cancer datasets as well as inflammatory bowel disease datasets. The experiments show that PFS and HSIC-Lasso are both scalable to large datasets.
Date of Conference: 05-07 June 2019
Date Added to IEEE Xplore: 05 August 2019