Skip to Main Content
The benefits of the analysis of software faults and failures have been widely recognized. However, detailed studies based on empirical data are rare. In this paper, we analyze the fault and failure data from two large, real-world case studies. Specifically, we explore: 1) the localization of faults that lead to individual software failures and 2) the distribution of different types of software faults. Our results show that individual failures are often caused by multiple faults spread throughout the system. This observation is important since it does not support several heuristics and assumptions used in the past. In addition, it clearly indicates that finding and fixing faults that lead to such software failures in large, complex systems are often difficult and challenging tasks despite the advances in software development. Our results also show that requirement faults, coding faults, and data problems are the three most common types of software faults. Furthermore, these results show that contrary to the popular belief, a significant percentage of failures are linked to late life cycle activities. Another important aspect of our work is that we conduct intra- and interproject comparisons, as well as comparisons with the findings from related studies. The consistency of several main trends across software systems in this paper and several related research efforts suggests that these trends are likely to be intrinsic characteristics of software faults and failures rather than project specific.