Software clones are considered harmful in software maintenance and evolution. However, despite a decade of active research, there is a marked lack of work on the detection and analysis of near-miss software clones, that is, clones in which minor to extensive modifications have been made to the copied fragments. In this thesis, we advance the state of the art in clone detection and analysis in several ways. First, we develop a hybrid clone detection method. Second, we address the decade of vagueness in clone definition by proposing a metamodel of clone types. Third, we conduct a scenario-based comparison and evaluation of all of the currently available clone detection techniques and tools. Fourth, in order to evaluate and compare the available tools in a realistic setting, we develop a mutation-based framework that automatically and efficiently measures and compares the recall and precision of clone detection tools. Fifth, we conduct a large-scale empirical study of cloning in open source systems.