When proposing a new classification scheme, such as a classification rule or feature-selection method, modelers in the bioinformatics literature typically report its performance on data sets of interest, such as gene-expression microarrays. These data sets often contain thousands of features but only a small number of sample points, which increases variability in both feature selection and error estimation and makes the reported performances highly imprecise. Consequently, if only the best results are demonstrated, the reported performance of the proposed scheme will correlate poorly with, and be optimistically biased relative to, its actual performance. This paper confirms this effect by characterizing the joint distributions of the minimum reported estimated errors and the corresponding true errors as functions of the number of samples tested, in a large simulation study using both modeled and real data.
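The selection bias described above can be illustrated with a minimal simulation sketch (not the paper's actual study design; the error distributions, noise level, and number of data sets below are illustrative assumptions): each candidate result has a true error, the estimated error adds small-sample noise, and only the minimum estimate is reported.

```python
import random

random.seed(0)

def min_reported_bias(n_trials=2000, n_datasets=20, est_sd=0.08):
    """Simulate reporting only the best (minimum) estimated error.

    Assumed setup: each of n_datasets has a true error near 0.25, and
    its estimate adds zero-mean Gaussian noise whose spread (est_sd)
    stands in for small-sample estimation variance. Returns the mean
    reported (minimum) estimate and the mean true error of the
    classifier that was selected for reporting.
    """
    rep_sum = true_sum = 0.0
    for _ in range(n_trials):
        true_errs = [random.gauss(0.25, 0.02) for _ in range(n_datasets)]
        est_errs = [t + random.gauss(0.0, est_sd) for t in true_errs]
        # Report the result with the smallest *estimated* error.
        best = min(range(n_datasets), key=lambda i: est_errs[i])
        rep_sum += est_errs[best]
        true_sum += true_errs[best]
    return rep_sum / n_trials, true_sum / n_trials

reported, actual = min_reported_bias()
print(f"mean reported minimum estimate: {reported:.3f}")
print(f"mean true error of selection:   {actual:.3f}")
```

Under these assumptions, the reported minimum estimate falls well below the true error of the selected rule: best-case reporting is optimistically biased, and the gap widens as estimation variance grows.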