Many law enforcement and intelligence tasks require managing identity information effectively and efficiently. However, quality problems in identity information make this task non-trivial. Various heuristic-based systems have been developed to tackle the identity matching problem, but deploying them may require special expertise in system configuration and customization to achieve optimal performance. In this paper, we propose an alternative system called the Arizona IDMatcher. The system relies on a machine learning algorithm to automatically generate a decision model for identity matching, and therefore requires minimal human configuration effort. Experiments show that the Arizona IDMatcher is very efficient in detecting matching identity records. Compared to IBM Identity Resolution (a commercial, heuristic-based system), the Arizona IDMatcher achieves better recall and overall F-measure in identifying matching identities in two large-scale real-world datasets.
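For readers unfamiliar with the evaluation measures named above, the following is a minimal sketch of how recall and the F-measure could be computed for an identity matching task. It assumes that system output and ground truth are each given as a collection of unordered record-ID pairs, and that the balanced F1 variant is used; the function name `pairwise_scores` and the sample record IDs are hypothetical and are not taken from the paper.

```python
def pairwise_scores(predicted, actual):
    """Return (precision, recall, F1) over sets of matched record-ID pairs.

    Assumption: a match is an unordered pair, so (a, b) equals (b, a).
    """
    predicted = {frozenset(pair) for pair in predicted}
    actual = {frozenset(pair) for pair in actual}

    # Pairs the system reported that are also true matches.
    true_positives = len(predicted & actual)

    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0

    # F1 is the harmonic mean of precision and recall.
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: 3 predicted pairs, 4 true pairs, 2 in common.
predicted_pairs = [("r1", "r2"), ("r3", "r4"), ("r5", "r6")]
true_pairs = [("r2", "r1"), ("r3", "r4"), ("r7", "r8"), ("r9", "r10")]
precision, recall, f1 = pairwise_scores(predicted_pairs, true_pairs)
```

Under these assumptions, recall measures the share of true matching pairs that the system finds, which is why a system can score higher on recall and on the overall F-measure while still differing in precision.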