Abstract:
Similarity measurement determines the degree of similarity between two records. This paper presents a comparative analysis of important similarity measures used to detect duplicate records in databases. The work evaluates their strengths in terms of the efficiency of prevailing algorithms, the time required to process and identify duplicates, and accuracy. The analysis found that, among the most common similarity measures, those based on the Jaro-Winkler algorithm significantly outperformed the others. The paper then presents an enhanced strategy based on the Jaro-Winkler algorithm to improve the detection of similarity among database records. Solving this problem will greatly enhance the quality of data used in decision-making.
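For readers unfamiliar with the measure, the following is a minimal Python sketch of the standard Jaro-Winkler similarity as it is commonly defined. It is not the enhanced strategy proposed in the paper, whose details the abstract does not give. The function names, the prefix scale of 0.1, and the prefix cap of 4 characters are assumptions taken from the usual textbook formulation.

```python
def jaro(s1: str, s2: str) -> float:
    """Standard Jaro similarity in [0, 1] (textbook definition)."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters count as matching only if they are equal and lie
    # within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Transpositions: matched characters that appear in a different
    # order in the two strings, counted in pairs.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2

    return (matches / len1
            + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str,
                 prefix_scale: float = 0.1,  # assumed conventional value
                 max_prefix: int = 4) -> float:  # assumed conventional cap
    """Jaro similarity boosted for strings sharing a common prefix."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return j + prefix * prefix_scale * (1.0 - j)


if __name__ == "__main__":
    # Hypothetical record fields, used only for illustration.
    print(round(jaro_winkler("MARTHA", "MARHTA"), 4))
    print(round(jaro_winkler("DIXON", "DICKSONX"), 4))
```

In a duplicate-detection setting, two records would typically be flagged as candidate duplicates when the score for a chosen field exceeds a threshold; the threshold itself is an application-specific choice that the abstract does not specify.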
Date of Conference: 25-27 November 2017
Date Added to IEEE Xplore: 12 March 2018
ISBN Information:
Electronic ISSN: 2155-6830