There is no general consensus on which classifier performance metrics are preferable to others. While some studies compare a handful of such metrics, the data mining and machine learning community still lacks an evaluation of the specific relationships among a large set of commonly used performance metrics. This study provides insight into the underlying relationships among classifier performance metrics. We do so with a large case study involving 35 datasets from various domains and the C4.5 decision tree algorithm. A common property of the 35 datasets is that they suffer from the class imbalance problem. Our approach applies factor analysis to the classifier performance space, which is characterized by 22 performance metrics. We show that this large number of performance metrics can be grouped into two to four relationship-based groups extracted by factor analysis. This work is a step toward giving the analyst an improved understanding of the relationships and groupings among the performance metrics, thus facilitating the selection of metrics that capture relatively independent aspects of a classifier's performance.
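As a rough illustration of the idea of grouping metrics by factor analysis, the Python sketch below computes a few metrics from confusion-matrix counts and extracts factors from their correlation matrix. It is not the study's procedure. The data are synthetic, the six metrics are an arbitrary subset chosen for illustration rather than the 22 used in the study, and the extraction method (principal-component factoring with the eigenvalue-greater-than-one rule) is an assumption, since the abstract does not state which extraction or retention rule was used. The function names are hypothetical.

```python
import numpy as np

def metrics_from_counts(tp, fn, fp, tn):
    """Compute six illustrative metrics from arrays of confusion-matrix counts."""
    tpr = tp / (tp + fn)                      # recall / true positive rate
    tnr = tn / (tn + fp)                      # specificity / true negative rate
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    f1 = 2 * precision * tpr / (precision + tpr)
    gmean = np.sqrt(tpr * tnr)                # geometric mean of TPR and TNR
    return np.column_stack([tpr, tnr, precision, accuracy, f1, gmean])

def extract_factors(X):
    """Principal-component factoring of the metric correlation matrix.

    Assumption: factors with eigenvalue > 1 are retained (Kaiser criterion).
    """
    R = np.corrcoef(X, rowvar=False)          # metrics are columns
    eigvals, eigvecs = np.linalg.eigh(R)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    k = int(np.sum(eigvals > 1.0))
    loadings = eigvecs[:, :k] * np.sqrt(eigvals[:k])
    return eigvals, loadings

# Synthetic, imbalanced results: one row per hypothetical dataset.
rng = np.random.default_rng(0)
n_runs = 35
n_pos = rng.integers(20, 60, n_runs)          # minority-class sizes
n_neg = rng.integers(400, 1000, n_runs)       # majority-class sizes
# Clip so every run has at least one true positive and one false positive,
# which keeps all metric denominators nonzero.
tp = np.clip(rng.binomial(n_pos, rng.uniform(0.3, 0.9, n_runs)), 1, n_pos)
tn = np.clip(rng.binomial(n_neg, rng.uniform(0.8, 0.99, n_runs)), 0, n_neg - 1)
fn, fp = n_pos - tp, n_neg - tn

X = metrics_from_counts(tp, fn, fp, tn)
eigvals, loadings = extract_factors(X)

names = ["TPR", "TNR", "precision", "accuracy", "F1", "G-mean"]
print("eigenvalues:", np.round(eigvals, 3))
print("factors retained:", loadings.shape[1])
for name, row in zip(names, loadings):
    print(f"{name:>9}", np.round(row, 2))
```

Metrics that load heavily on the same factor move together across datasets, so an analyst looking for relatively independent views of performance might pick one metric per factor. A rotation step (for example varimax) is often applied before interpreting loadings; it is omitted here to keep the sketch short.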