Complete Coverage for Approximate String Matching in Record Linkage Using Bit Vectors

1 Author(s)
Schraagen, M. ; Leiden Inst. of Adv. Comput. Sci., Leiden Univ., Leiden, Netherlands

Research in social history is increasingly influenced by the availability of digitized sources. Tools have to be developed to access these sources efficiently. This paper describes a tool that performs family reconstruction using record linkage: linking historical civil certificates based on record similarity. Most current approaches in record linkage apply heuristics to limit the number of similarity computations, at the expense of linking coverage. This paper describes a binary-tree-based indexing approach that provides complete coverage within practical time bounds. The indexing scheme is constructed using a simulated annealing algorithm to optimize indexing efficiency. A comparison with other methods, both heuristic and complete-coverage, is provided. The method is developed for Levenshtein edit distance; however, an extension to other similarity measures is feasible. As an example, an extension to Jaro distance is discussed.
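The abstract does not spell out the paper's indexing scheme, so the following is only a minimal Python sketch of a related, generic idea: using a character-presence bit vector as a lossless pre-filter for Levenshtein matching. It is not the binary tree index or the simulated annealing construction described above. The function names (`signature`, `levenshtein`, `candidates_within`) and the sample names are hypothetical. The filter relies on the fact that one edit operation changes at most two bits of the signature (a substitution can remove one letter and introduce another), so two strings whose signatures differ in more than 2k bits cannot be within edit distance k. No true match is discarded, which is what complete coverage means here.

```python
def signature(name: str) -> int:
    """Bit vector with one bit per letter a-z that occurs in the name."""
    bits = 0
    for ch in name.lower():
        if "a" <= ch <= "z":
            bits |= 1 << (ord(ch) - ord("a"))
    return bits


def levenshtein(a: str, b: str) -> int:
    """Standard dynamic-programming edit distance, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution or match
        prev = cur
    return prev[-1]


def candidates_within(query: str, names: list[str], k: int) -> list[str]:
    """Return all names within edit distance k of query.

    The bit-vector test only skips pairs that provably exceed k,
    so the result equals that of an exhaustive comparison.
    """
    qsig = signature(query)
    result = []
    for name in names:
        differing_bits = bin(qsig ^ signature(name)).count("1")
        if differing_bits > 2 * k:
            continue  # lower bound already exceeds k; skip the full computation
        if levenshtein(query, name) <= k:
            result.append(name)
    return result


if __name__ == "__main__":
    names = ["janssen", "jansen", "pietersen", "janse"]
    print(candidates_within("jansen", names, 1))
```

In this sketch the cheap XOR-and-popcount test replaces the quadratic distance computation for clearly dissimilar pairs, while every surviving pair is still verified exactly. An index such as the one the paper describes would go further by avoiding the linear scan over all names.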

Published in:

Tools with Artificial Intelligence (ICTAI), 2011 23rd IEEE International Conference on

Date of Conference:

7-9 Nov. 2011