Evaluating the effectiveness of software clustering algorithms is a challenging research question. Several approaches that compare clustering results to an authoritative decomposition have been presented in the literature. Existing evaluation methods typically compress the evaluation results into a single number. These methods also often disagree with each other for reasons that are not well understood. In this paper, we introduce a novel set of indicators that evaluate structural discrepancies between software decompositions. The indicators also allow researchers to investigate the differences between existing evaluation approaches in a reduced search space. Several experiments with real software systems showcase the usefulness of the introduced indicators.