CMID: A New Dataset for Copy-Move Forgeries on ID Documents


Abstract:

Copy-move forgery has been widely studied because it is a very common forgery. It is also the easiest forgery to create, and it poses serious security threats, in particular for remote ID onboarding, where companies ask their customers to send a photo of their ID document. It is then easy for a counterfeiter to alter the information on the document by copying and pasting letters within the photo. However, copy-move detection algorithms are known to perform worse in the presence of similar but genuine objects, which prevents their use in practical situations such as remote ID onboarding. In this article, we propose a novel public copy-move dataset containing forged ID documents and study the performance of current state-of-the-art methods on this dataset to evaluate their potential use in practical situations.
Date of Conference: 19-22 September 2021
Date Added to IEEE Xplore: 23 August 2021
Conference Location: Anchorage, AK, USA

1. Introduction

Current state-of-the-art Copy-Move Forgery Detection (CMFD) algorithms perform extremely well on known public datasets. However, practical use cases must be considered when studying copy-move forgery. In particular, applying CMFD algorithms to ID documents requires the method to detect small duplicated elements in the presence of many Similar but Genuine Objects (SGO). To be practical in such situations, a CMFD algorithm should maintain the lowest possible false positive rate so as to avoid manual verification. Because CMFD methods search for similarities within an image, they are likely to struggle in the presence of SGO.

Even though this issue was acknowledged by the authors of [1] when they proposed the COVERAGE dataset, most research still uses other public datasets, such as [2,3,4,5], for evaluation. On those datasets, it is common to observe near-perfect results. However, they do not represent realistic use cases for copy-move forgeries: they often contain large duplicated elements in images without SGO, so the forgeries are usually obvious and rather easy to detect. Apart from [1], no dataset offers challenging images with SGO.

In this paper we propose the Copy-Move ID (CMID) dataset, a novel dataset for copy-move forgery detection. Our dataset contains 893 forged images of ID documents. We chose ID documents because they are a practical use case for CMFD algorithms and because an ID document contains many SGO, which makes it extremely challenging for CMFD methods. We evaluate state-of-the-art algorithms on this novel dataset to further confirm the issue first presented by [1]. We first review the current state-of-the-art CMFD methods and the commonly used datasets. We then describe how we automatically generated our dataset. Finally, we evaluate current methods on this dataset to provide a baseline result.
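To make the false positive requirement concrete, the following is a minimal sketch of an image-level false positive rate computed over genuine (unforged) documents only. It is an illustration, not the evaluation protocol of this paper: the function name `false_positive_rate`, the per-image detector scores, and the decision threshold of 0.5 are all hypothetical.

```python
from typing import Sequence


def false_positive_rate(genuine_scores: Sequence[float], threshold: float) -> float:
    """Fraction of genuine images that a detector would wrongly flag as forged.

    genuine_scores: hypothetical per-image forgery scores for genuine images.
    threshold: hypothetical decision threshold; a score >= threshold means "forged".
    """
    if not genuine_scores:
        raise ValueError("need at least one genuine image")
    flagged = sum(1 for score in genuine_scores if score >= threshold)
    return flagged / len(genuine_scores)


# Hypothetical scores for five genuine ID photos (made-up values for illustration).
scores = [0.10, 0.72, 0.35, 0.81, 0.05]
fpr = false_positive_rate(scores, threshold=0.5)
print(f"{fpr:.2f}")  # prints 0.40: two of the five genuine images are flagged
```

In a remote onboarding setting, every flagged genuine document would typically have to be checked by a human, which is why this rate matters as much as the detection rate on forged images.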

https://cmiddataset.github.io/
