Skip to Main Content
Software metrics offer us the promise of distilling useful information from vast amounts of software in order to track development progress, to gain insights into the nature of the software, and to identify potential problems. Unfortunately, however, many software metrics exhibit highly skewed, non-Gaussian distributions. As a consequence, usual ways of interpreting these metrics - for example, in terms of ldquoaveragerdquo values - can be highly misleading. Many metrics, it turns out, are distributed like wealth - with high concentrations of values in selected locations. We propose to analyze software metrics using the Gini coefficient, a higher-order statistic widely used in economics to study the distribution of wealth. Our approach allows us not only to observe changes in software systems efficiently, but also to assess project risks and monitor the development process itself. We apply the Gini coefficient to numerous metrics over a range of software projects, and we show that many metrics not only display remarkably high Gini values, but that these values are remarkably consistent as a project evolves over time.