A group of scientists carried out an experiment to compare tools and techniques for detecting duplicated or near-duplicated source code. The overall comparative results are presented elsewhere. This paper takes a closer look at the results for one tool, Dup, which finds code sections that are either textually identical or identical except for a systematic substitution of parameters such as identifiers and constants. It identifies various factors that influenced the results and assesses their impact by rerunning Dup with changed options and modifications. These changes improve Dup's performance in the experiment and could be incorporated into a postprocessor for use with other tools.
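To make the idea of matching "except for systematic substitution of parameters" concrete, the following is a minimal sketch, not Dup's actual algorithm, which the abstract does not describe. It assumes that a parameter is any identifier or numeric constant, uses a crude regex tokenizer, and relies on a deliberately tiny keyword set; the names `canonical_form`, `parameterized_match`, and `KEYWORDS` are hypothetical and chosen only for illustration. Each parameter is replaced by a placeholder numbered by order of first occurrence, so two fragments compare equal exactly when a consistent one-to-one renaming turns one into the other.

```python
import re

# Assumption: a deliberately tiny keyword set for illustration only.
# A real tool would use the target language's full lexical rules.
KEYWORDS = {"if", "else", "for", "while", "return", "int", "float"}

# Identifiers, numeric constants, or any other single non-space character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+(?:\.\d+)?|\S")


def canonical_form(fragment):
    """Replace each identifier or constant with a placeholder (P0, P1, ...)
    numbered by the order in which it first appears in the fragment."""
    mapping = {}
    out = []
    for tok in TOKEN_RE.findall(fragment):
        first = tok[0]
        is_param = (first.isalpha() or first == "_" or first.isdigit()) \
            and tok not in KEYWORDS
        if is_param:
            if tok not in mapping:
                mapping[tok] = f"P{len(mapping)}"
            out.append(mapping[tok])
        else:
            # Keywords and punctuation must match exactly.
            out.append(tok)
    return out


def parameterized_match(a, b):
    """True if the fragments are identical up to a consistent,
    one-to-one renaming of identifiers and constants."""
    return canonical_form(a) == canonical_form(b)
```

Under these assumptions, `parameterized_match("x = x + y;", "a = a + b;")` holds because `x` maps to `a` and `y` maps to `b` throughout, whereas `"x = x + y;"` and `"a = b + a;"` do not match because the substitution is not consistent.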