Abstract:
An experiment was carried out by a group of scientists to compare different tools and techniques for detecting duplicated or near-duplicated source code. The overall comparative results are presented elsewhere. This paper takes a closer look at the results for one tool, Dup, which finds code sections that are textually identical or identical except for a systematic substitution of parameters such as identifiers and constants. Various factors that influenced the results are identified, and their impact is assessed by rerunning Dup with changed options and modifications. These changes improve Dup's performance on the experiment and could be incorporated into a postprocessor for use with other tools.
Published in: IEEE Transactions on Software Engineering ( Volume: 33, Issue: 9, September 2007)
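To illustrate the kind of parameterized matching the abstract describes, the following is a minimal sketch, not Dup's actual algorithm (Dup operates on token streams with a parameterized suffix tree). It uses the classic encoding in which each identifier or constant is replaced by the distance back to its previous occurrence (0 on first occurrence), so two fragments that differ only by a consistent renaming produce the same encoded sequence. The function names p_encode and p_match, and the simple regex tokenizer, are hypothetical and chosen for illustration only.

    import re

    # Crude tokenizer for illustration: identifiers/constants are the
    # "parameters"; everything else (operators, punctuation) must match
    # literally.
    TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")
    PARAM = re.compile(r"[A-Za-z_]\w*|\d+")

    def p_encode(source):
        """Encode a fragment so consistent renamings become identical."""
        last_seen = {}   # parameter -> index of its most recent occurrence
        encoded = []
        for i, tok in enumerate(TOKEN.findall(source)):
            if PARAM.fullmatch(tok):
                # Distance back to the previous occurrence, 0 if first.
                encoded.append(i - last_seen.get(tok, i))
                last_seen[tok] = i
            else:
                encoded.append(tok)
        return encoded

    def p_match(a, b):
        """True if the fragments match up to systematic renaming."""
        return p_encode(a) == p_encode(b)

    # x->a, y->b is a consistent renaming, so these match:
    print(p_match("x = x + y;", "a = a + b;"))   # True
    # Inconsistent renaming (x maps to both a and b) does not:
    print(p_match("x = x + y;", "a = b + b;"))   # False

The key design point this sketch captures is that matching is positional rather than name-based: the encoding forgets which identifier was used and remembers only the pattern of reuse, which is what allows Dup-style tools to find clones that a purely textual comparison would miss.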