This paper begins by modeling general software systems using concepts from statistical mechanics, which provides a framework for linking the microscopic and macroscopic features of any complex system. This analysis links two features of particular interest in software systems: first, the microscopic distribution of defects within components and, second, the macroscopic distribution of component sizes in a typical system. The former has been studied extensively, but the latter much less so. This paper shows that, subject to the external constraint that the total number of defects in an equilibrium system is fixed, commonly used defect models for individual components directly imply that the distribution of component sizes in such a system will obey a power-law (Pareto) distribution. The paper continues by analyzing a large number of mature systems of different total sizes, different implementation languages, and very different application areas, and demonstrates that the component sizes do indeed appear to obey the predicted power-law distribution. Some possible implications of this result are explored.
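As an illustration only, and not a method taken from this paper, the following minimal Python sketch shows one common way to check whether a list of component sizes is consistent with a power-law tail: the standard maximum-likelihood estimate of the exponent for a continuous power law above a cutoff. The function names, the cutoff `x_min`, and the synthetic data are all assumptions introduced here; real component sizes (for example, lines of code per component) would replace the generated sample.

```python
import math
import random


def sample_pareto(n, x_min, alpha, rng):
    """Draw n synthetic sizes with density p(x) proportional to x**(-alpha), x >= x_min.

    Uses inverse-transform sampling; requires alpha > 1.
    """
    return [
        x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
        for _ in range(n)
    ]


def estimate_exponent(sizes, x_min):
    """Maximum-likelihood estimate of alpha for sizes at or above x_min.

    alpha_hat = 1 + n / sum(ln(x_i / x_min)), the standard estimator for a
    continuous power law with a known lower cutoff.
    """
    tail = [s for s in sizes if s >= x_min]
    log_sum = sum(math.log(s / x_min) for s in tail)
    if not tail or log_sum == 0.0:
        raise ValueError("need at least one size strictly above x_min")
    return 1.0 + len(tail) / log_sum


def demo():
    # Hypothetical data: 20,000 synthetic "component sizes" with exponent 2.5.
    rng = random.Random(0)
    sizes = sample_pareto(20000, x_min=10.0, alpha=2.5, rng=rng)
    return estimate_exponent(sizes, x_min=10.0)


if __name__ == "__main__":
    print(f"estimated exponent: {demo():.3f}")
```

An estimate close to the exponent used to generate the sample indicates that the estimator recovers the tail behavior. On real data, the choice of `x_min` matters, and a goodness-of-fit test would be needed before claiming power-law behavior.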