In empirical software engineering, predictive models can be used to classify components as overly complex. Such modules may lead to faults and may therefore need mitigating actions such as refactoring or more exhaustive testing. Source code metrics can be used as input features for a classifier; however, a large number of measures exist that capture different aspects of coupling, cohesion, inheritance, complexity, and size. In a high-dimensional feature space, some of these metrics may be irrelevant or redundant. Feature selection is the process of identifying a subset of the attributes that improves a classifier's discriminatory performance. This paper presents initial results of using a genetic algorithm for feature subset selection to enhance a classifier's ability to discover cognitively complex classes that degrade program understanding.
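The general approach described above can be sketched as follows. This is a minimal illustration, not the paper's actual method: the chromosome encoding, fitness function, classifier (a nearest-centroid stand-in here), genetic operators, and the synthetic data are all assumptions chosen for brevity. A genetic algorithm evolves bit strings, each bit marking whether a metric is included in the feature subset; fitness is the classifier's accuracy on a held-out set.

```python
import random

random.seed(0)

# Hypothetical synthetic data: 8 "metrics", only the first 2 informative.
# The paper's real inputs would be source code metrics (coupling,
# cohesion, inheritance, complexity, size) for each class/module.
N_FEATURES = 8

def make_sample(label):
    informative = [label + random.gauss(0, 0.3) for _ in range(2)]
    noise = [random.gauss(0, 1.0) for _ in range(N_FEATURES - 2)]
    return informative + noise, label

data = [make_sample(random.choice([0, 1])) for _ in range(200)]
train, test = data[:150], data[150:]

def fitness(mask):
    """Accuracy of a nearest-centroid classifier restricted to the
    feature subset encoded by the 0/1 mask (stand-in classifier)."""
    idx = [i for i, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0  # empty subsets are worthless
    centroids = {}
    for label in (0, 1):
        rows = [[x[i] for i in idx] for x, y in train if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    correct = 0
    for x, y in test:
        v = [x[i] for i in idx]
        pred = min((0, 1), key=lambda c: sum(
            (a - b) ** 2 for a, b in zip(v, centroids[c])))
        correct += pred == y
    return correct / len(test)

def evolve(pop_size=20, generations=15, p_mut=0.1):
    """Simple GA: truncation selection, one-point crossover, bit-flip
    mutation over feature-subset bit strings."""
    pop = [[random.randint(0, 1) for _ in range(N_FEATURES)]
           for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, N_FEATURES)       # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (random.random() < p_mut)    # bit-flip mutation
                     for bit in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

best = evolve()
print("selected features:", [i for i, bit in enumerate(best) if bit])
print("held-out accuracy: %.2f" % fitness(best))
```

In a realistic setting the fitness evaluation would wrap whichever classifier the study uses, and cross-validation would replace the single train/test split to reduce overfitting of the subset to one partition.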