Skip to Main Content
Modern requirements tracing tools employ information retrieval methods to automatically generate candidate links. Due to the inherent trade-off between recall and precision, such methods cannot achieve a high coverage without also retrieving a great number of false positives, causing a significant drop in result accuracy. In this paper, we propose an approach to improving the quality of candidate link generation for the requirements tracing process. We base our research on the cluster hypothesis which suggests that correct and incorrect links can be grouped in high-quality and low-quality clusters respectively. Result accuracy can thus be enhanced by identifying and filtering out low-quality clusters. We describe our approach by investigating three open-source datasets, and further evaluate our work through an industrial study. The results show that our approach outperforms a baseline pruning strategy and that improvements are still possible.