Skip to Main Content
Defect reports generated for faults found during testing provide a rich source of information regarding problematic phrases used in requirements documents. These reports indicate that faults often derive from instances of ambiguous, incorrect or otherwise deficient language. In this paper, we report on a method combining elements of linguistic theory and information retrieval to guide the discovery of problematic phrases throughout a requirements specification, using defect reports and correction requests generated during testing to seed our detection process. We found that phrases known from these materials to be problematic have occurrence properties in requirements documents that both allow the direction of resources to prioritize their correction, and generate insights characterizing more general locations of difficulty within the requirements. Our findings lead to some recommendations for more efficiently and effectively managing certain natural language issues in the creation and maintenance of requirements specifications.