In microarray data analysis, the large number of equally predictive gene sets, and the disparity among them, reveal a gap between the genes necessary for an accurate model and the candidate genes for biomarkers. We propose to bridge this gap with a new learning task, feature cluster selection, which aims to select all relevant features in a data set and group them into coherent clusters. We provide problem definitions and an empirical solution for feature cluster selection. Experiments on microarray data show that the proposed solution selects highly predictive representative gene sets and discovers statistically significant gene clusters.
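
To make the task statement concrete, the following is a minimal sketch of what "select all relevant features and group them into coherent clusters" could look like. It is not the solution proposed in this work. It assumes, purely for illustration, that relevance is measured by the absolute Pearson correlation between a feature and the class label, and that coherence is measured by the absolute Pearson correlation between a feature and the first member of a cluster. The function name `feature_cluster_selection`, the thresholds `relevance_min` and `similarity_min`, and the toy expression values are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def feature_cluster_selection(columns, labels, relevance_min=0.5, similarity_min=0.8):
    """Illustrative sketch only (assumed criteria, not the paper's method).

    columns: dict mapping a feature name to its list of values, one per sample.
    labels:  list of numeric class labels, one per sample.
    Returns a list of clusters; each cluster is a list of feature names whose
    first element acts as the cluster representative.
    """
    # Step 1: keep every feature whose relevance to the label passes the threshold.
    relevance = {name: abs(pearson(vals, labels)) for name, vals in columns.items()}
    relevant = sorted(
        (name for name in relevance if relevance[name] >= relevance_min),
        key=lambda name: -relevance[name],
    )

    # Step 2: greedily group relevant features around representatives.
    clusters = []
    for name in relevant:
        for cluster in clusters:
            representative = cluster[0]
            if abs(pearson(columns[name], columns[representative])) >= similarity_min:
                cluster.append(name)
                break
        else:
            # No existing representative is similar enough: start a new cluster.
            clusters.append([name])
    return clusters


# Toy data: six samples, two classes, four hypothetical genes.
labels = [0, 0, 0, 1, 1, 1]
columns = {
    "g1": [1.0, 1.2, 0.9, 3.0, 3.1, 2.9],  # strongly tracks the label
    "g2": [2.0, 2.4, 1.8, 6.0, 6.2, 5.8],  # scaled copy of g1
    "g3": [5.0, 1.0, 4.0, 2.0, 1.5, 4.5],  # weakly related to the label
    "g4": [1.0, 3.0, 2.0, 2.0, 4.0, 5.0],  # moderately related to the label
}
clusters = feature_cluster_selection(columns, labels)
representatives = [cluster[0] for cluster in clusters]
```

On this toy input, `g1` and `g2` fall into one cluster, `g4` forms a cluster of its own, and `g3` is discarded as irrelevant. Taking one representative per cluster gives a compact predictive gene set, while the full clusters retain the redundant but relevant genes that a conventional feature selector would drop.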