A Latent Fingerprint in the Wild Database

Latent fingerprints are among the most important and widely used forms of evidence in crime scenes, digital forensics and law enforcement worldwide. Despite the advances reported in recent works, significant open issues such as independent benchmarking and the lack of large-scale evaluation databases for improving the algorithms remain inadequately addressed. The available databases are mostly semi-public and lack acquisition in wild environments as well as post-processing pipelines. Moreover, they do not represent a realistic capture scenario similar to real crime scenes against which the robustness of the algorithms can be benchmarked. Further, existing databases for latent fingerprint recognition do not have a large number of unique subjects/fingerprint instances or do not provide ground truth/reference fingerprint images to conduct a cross-comparison against the latent. In this paper, we introduce a new large-scale latent fingerprint in the wild database that includes six different acquisition scenarios: reference fingerprints from (1) optical and (2) capacitive sensors, (3) smartphone fingerphotos, and latent fingerprints captured from (4) a wall surface, (5) an iPad surface, and (6) an aluminum foil surface. The new database consists of 1,318 unique fingerprint instances captured in all of the above-mentioned settings. A total of 2,636 reference fingerprints from optical and capacitive sensors, 1,318 fingerphotos from smartphones, and 9,224 latent fingerprints from 132 subjects are provided in this work. The dataset is constructed considering various age groups and equal representation of genders and backgrounds. In addition, we provide an extensive analysis of various subset evaluations to highlight open challenges for future directions in latent fingerprint recognition research.


I. INTRODUCTION
The first reported use of latent fingerprints as evidence to convict a suspect dates to 1893 [1]. Over the years, latent fingerprints have been regarded as one of the most commonly and broadly used sources of evidence in crime scenes, digital forensics, law enforcement, etc. [1]. Latent fingerprints can be left on various surfaces when a finger makes contact with an object. The manner in which a finger touches the surface of an object has a significant impact on the latent fingerprint quality (e.g., sharpness, contrast, and visible area). For a long time, latent fingerprint recognition was performed by latent examiners, before the development of the Automated Fingerprint Identification System (AFIS). In recent years, latent AFIS has become one of the most commonly used technologies by law enforcement agencies worldwide [2]. In 2020 alone, more than 300,000 latent fingerprint identification requests were sent to the FBI in the United States [3].
arXiv:2304.00979v1 [cs.CV] 3 Apr 2023
Unlike rolled and slap fingerprints (reference fingerprint images acquired using standard fingerprint capture devices), latent fingerprints are captured under unconstrained and unsupervised conditions. Low quality, partial visibility, and an insufficient number and quality of minutiae points are common issues in latent fingerprint recognition. The National Institute of Standards and Technology (NIST) announced two fingerprint vendor technology evaluations (FpVTE), in 2003 and 2012 [4], to advance the research on latent fingerprint recognition. FpVTE was intended to evaluate whether fingerprint system performance meets the requirements of real-world applications for both reference and latent fingerprints. In the latest FpVTE2012, the lowest False Negative Identification Rate (FNIR) and False Positive Identification Rate (FPIR) reported by the top-performing AFIS for reference fingerprints were 1.9% and 0.1%, respectively [5]. However, the best identification rate was only 67.2% for latent fingerprints during the NIST Evaluation of Latent Fingerprint Technologies: Extended Feature Sets (ELFT-EFS) [6]. The large difference in recognition performance between reference and latent fingerprints is mainly caused by the low quality of the ridge-and-valley structures in latent fingerprints. Further development of robust, high-accuracy latent fingerprint recognition systems is clearly necessary; however, progress is currently limited by sparse access to openly available datasets.
In the past decade, several studies have focused on developing latent fingerprint recognition algorithms [7]. However, the performance evaluation of these methods was conducted on only a few databases, such as the NIST SD27 Database [8], the IIIT-D Latent Fingerprint Database [9], and the Tsinghua Overlapped Latent Fingerprint Database [10]. Despite being valuable, these databases have major shortcomings: 1) a small number of subjects (and therefore of finger instances) and of latent fingerprint samples, 2) a constrained acquisition environment, and 3) limited availability. Moreover, one of the most commonly used latent fingerprint databases, the NIST SD27 Database, has been withdrawn, making the development and performance evaluation of latent fingerprint recognition even more difficult.
In this paper, we first review existing studies on latent fingerprint recognition to provide an overview to the reader. We then provide an extensive analysis of algorithms that are relevant for segmentation, minutiae extraction, and the comparison of latent fingerprints both within and across sensors. From a review of existing works, we note that latent fingerprint recognition algorithms have rarely been tested on large-scale datasets [11]. To the best of our knowledge, there is no large-scale latent fingerprint in the wild database that contains both reference fingerprints (ground truth) and latent fingerprints acquired from different surfaces. Therefore, it is necessary to establish a new large-scale latent fingerprint in the wild database to meet the need for robust latent fingerprint recognition algorithm development and evaluation.
To address the limitations mentioned above, we make the following three major contributions in this paper:
• Noting the non-availability of public datasets, we present a large-scale database of latent fingerprints in the wild, referred to as "Latent Fingerprint In the Wild" (LFIW). The dataset is collected in six different scenarios, constituting a total of 13,180 images of 132 subjects, and is released along with this paper. This dataset covers various age groups and equal representations of genders and backgrounds, making it a unique dataset for the performance evaluation of latent fingerprint recognition algorithms. As can be seen in Fig. 2, the comparison scores decrease significantly from reference vs. reference comparison to latent vs. latent comparison, which also indicates that the LFIW dataset is suitable for the evaluation and development of current and future latent fingerprint recognition techniques. The LFIW dataset is available for academic research purposes¹.
• Unlike other works, we also present a ground truth of fingerprints captured using optical and capacitive sensors to conduct an analysis of contact-based versus latent fingerprint comparison. In addition, owing to recent trends in the use of smartphone-based fingerphotos in biometrics, we also introduce fingerphoto images to benchmark latent-to-fingerphoto recognition.
• A benchmark and independent evaluation of 5 state-of-the-art fingerprint recognition methods and 1 latent fingerprint recognition approach is presented to highlight the performance limitations of existing approaches. For each method, a total of 118,620 mated comparison scores and 173,593,780 non-mated comparison scores were generated for performance evaluation to derive statistically significant conclusions.
The paper is organized as follows. Section II provides an overview of past studies related to our work in latent fingerprint recognition and databases. Section III introduces the large-scale LFIW database and describes the whole dataset in detail. The evaluated fingerprint recognition algorithms are introduced in Section IV, followed by a detailed discussion of benchmarking results in Section V. Finally, Section VI draws the conclusions.
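The comparison counts quoted in the third contribution can be reproduced with a short calculation. The sketch below rests on an assumption that the text does not state explicitly: every one of the 13,180 images (1,318 fingers with 10 images each) is compared against every image in a full score matrix, ordered pairs are counted separately, and only comparisons between two different images of the same finger count as mated. The function name is hypothetical.

```python
def comparison_counts(num_fingers: int, images_per_finger: int) -> tuple[int, int]:
    """Count mated and non-mated entries of a full all-vs-all score matrix.

    Assumed reading (not confirmed by the paper): ordered pairs are counted
    separately, and a comparison is mated only when it joins two different
    images of the same finger. Everything else is counted as non-mated.
    """
    total_images = num_fingers * images_per_finger
    all_entries = total_images * total_images
    mated = total_images * (images_per_finger - 1)
    non_mated = all_entries - mated
    return mated, non_mated


mated, non_mated = comparison_counts(num_fingers=1318, images_per_finger=10)
print(mated, non_mated)  # 118620 173593780
```

Under this reading the two totals match the figures given above; other pairing conventions (for example, unordered pairs) would give different totals.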

II. RELATED WORKS
Latent fingerprint recognition is a complicated process and its accuracy is generally low. As can be seen in Fig. 2, the comparison scores for mated samples drop heavily from capacitive to latent fingerprint comparison. Supervision by a human examiner (a semi-automatic workflow) can increase the accuracy of latent fingerprint recognition (see Fig. 3 for an example of a common latent fingerprint recognition workflow). Such a workflow differs significantly from the common AFIS operation modes (e.g., border control, mobile unlock and payment). With the rapid development of biometric technology, more and more fully automated latent fingerprint recognition algorithms have been proposed. There are three steps in an automated latent fingerprint recognition system: segmentation, minutiae extraction, and comparison. A brief review of existing approaches for these three steps is given below, and a detailed summary is provided in Table II. Before discussing each of the components, we discuss the currently available datasets for latent fingerprint recognition.

A. Latent fingerprint databases
Several latent fingerprint databases are available for performance evaluation, such as the West Virginia University (WVU) database [15], the Multisensor Optical and Latent Fingerprint (MOLF) database [13], and the Tsinghua Overlapped Latent Fingerprint (TOLF) database [10]. A list of commonly used latent fingerprint databases is given in Table I. However, NIST SD27 has been withdrawn and is no longer available.
As can be seen from Table I, existing latent fingerprint databases are small, and simulating real-world latent fingerprints is very challenging. Unfortunately, the only database from real crime scenes, NIST SD27, has been withdrawn. To develop latent fingerprint recognition techniques, a new and challenging database is needed. A new database that meets the following requirements is therefore desirable:
1) A large-scale database including a large number of unique finger instances and not just multiple fingers from a few unique subjects;
2) Real-world scenarios where latent fingerprints vary in terms of quality, resolution and material surfaces;
3) Both reference and latent fingerprints available for each unique fingerprint instance in contactless and contact-based scenarios;
4) Public availability of the dataset for academic research under different latent and cross-sensor (latent-vs-contact-based) protocols.
Compared to other existing latent fingerprint databases, the proposed LFIW dataset not only has the largest total number of fingerprint samples (13,180) from various scenarios but also the largest number of unique fingerprint instances (1,318 from 132 subjects). The database further meets the criteria mentioned above.

B. Latent fingerprint segmentation techniques
Latent fingerprint segmentation can be defined as the separation of the fingerprint region from the rest of the image. Accurate segmentation not only reduces the computational complexity but usually also improves the minutiae extraction performance. There are two types of segmentation tasks: separating non-overlapping and overlapping latent fingerprints. For non-overlapping latent fingerprint segmentation, an extended directional total variation model was developed by Zhang et al. [17] to search for and separate latent fingerprints from the background. Cao et al. [18] presented a dictionary-based approach to segment latent fingerprints as well as improve their quality. Many machine learning and deep learning (ML/DL) based latent fingerprint segmentation approaches have been developed over the past few years. Stojanovic et al. [19] trained a convolutional neural network (CNN) on patches from the region of interest of an image and used it for segmentation. A foreground (latent fingerprint) and background classification method based on random decision forests was developed by Sankaran et al. [20]. Nguyen et al. [21] introduced a CNN-based latent fingerprint segmentation algorithm (SegFinNet) to compensate for the insufficient performance of existing Commercial Off-The-Shelf (COTS) latent fingerprint recognition methods. Compared to non-overlapping latent fingerprint segmentation, separating overlapping fingerprints from each other and from the background is more challenging. Chen et al. [22] applied a local Fourier transform and relaxation labelling to segment overlapped fingerprints. To overcome the shortcomings of relaxation labelling-based methods, Zhao and Jain [23] developed a zero-pole model, Legendre polynomial, 2D Fourier expansion, and monomial basis function for overlapped fingerprint segmentation. An adaptive neuro-fuzzy inference system classifier was used for overlapping fingerprint segmentation by Jeyanthi et al. [24]. Stojanovic et al. [25] combined neural networks and Fourier analysis to separate overlapping fingerprints.

C. Latent fingerprint minutiae extraction techniques
Many fingerprint minutiae extraction methods have been developed in the past; however, the number of minutiae extraction algorithms designed specifically for latent fingerprints is limited. Su and Srihari [26] developed a latent fingerprint minutiae extraction approach using a regression Gaussian process model to estimate the location of finger core points and orientation fields. Sankaran et al. [27] proposed classifying minutia and non-minutia regions in a latent fingerprint by using stacked denoising sparse auto-encoders. Tang et al. [28] used a fully connected CNN to extract minutiae directly from the complicated background, so that latent fingerprint segmentation and quality enhancement are no longer needed in this approach.

D. Latent fingerprint comparison pipeline
Finding a match between an unknown fingerprint and a fingerprint in a large database is not a simple task, and it becomes even more difficult for latent fingerprints. Jain and Feng [29] combined extended fingerprint features and minutiae to perform latent fingerprint comparison. Paulino et al. [15] applied the Descriptor-Based Hough Transform (DBHT) to compare reconstructed orientation fields in two latent fingerprints. Cao and Jain [1] proposed generating two minutiae templates (obtained from CNN-based and dictionary-based ridge flow, respectively) and one texture template (virtual minutiae) for latent fingerprint comparison. In addition to using minutiae for matching, Nguyen and Jain [30] also used pores to increase the accuracy of latent fingerprint comparison.

III. LATENT FINGERPRINT IN THE WILD DATABASE
As noted in previous studies, the performance evaluation of existing latent fingerprint recognition techniques is mainly based on only one or a few databases, which are usually limited in size, diversity of image acquisition devices, image quality, and realism of the capture environment. The best way to evaluate the performance of a latent fingerprint recognition algorithm is to challenge it using different databases, image acquisition settings and testing protocols. In order to overcome these limitations and provide a new database for performance evaluation under real-world scenarios with high image quality, the LFIW database created in this work consists of six subsets: two subsets of traditional fingerprints, one fingerphoto subset, and three latent fingerprint subsets, as listed below:
1) R-opt: Reference fingerprints from optical sensor;
2) R-cap: Reference fingerprints from capacitive sensor;
3) Smt: Smartphone fingerphotos;
4) L-wall: Latent fingerprints captured from wall surface;
5) L-ipad: Latent fingerprints captured from iPad surface;
6) L-alum: Latent fingerprints captured from aluminum foil surface.
Detailed information on the LFIW database is further provided in Table III. All fingerprint images have been cropped and rotated to remove the background in order to avoid unnecessary variables and facilitate the subsequent processing steps (e.g., enhancement, minutiae extraction, etc.). Examples of the R-opt, R-cap, Smt, L-wall, L-ipad, and L-alum images are shown in the red dotted block of Fig. 1.

A. Reference fingerprint images: R-opt and R-cap
For each of the 132 subjects in the LFIW database, two enrolment images per finger were captured with each of two professional fingerprint acquisition sensors: one optical fingerprint sensor and one capacitive sensor. The optical sensor is a ZKTeco Live10R fingerprint capture device and the capacitive sensor is a Bingup FPW-A360 fingerprint capture device. The original size of the fingerprint images is 288 × 375 pixels (106 KB) for the optical sensor and 256 × 360 pixels (91 KB) for the capacitive sensor. All reference fingerprint images have a resolution of 500 ppi. There are a total of (132 subjects × 10 fingers − 2 lost) × 2 sensors × 2 enrolments = 5,272 reference fingerprint images in the LFIW database.

B. Smartphone fingerphoto images: Smt
All smartphone fingerphoto images were taken with a Huawei Honor20 smartphone (48+8+2 megapixel triple camera). All subjects were asked to place each of their ten fingers on a white background under an additional white light source. The acquisition distance to the fingers and the focus were controlled, and the built-in flash was turned off. The original size of the fingerphoto images is 3000 × 4000 pixels (∼2 MB) and the ppi is 96 by default. There are a total of 132 subjects × 10 fingers − 2 lost = 1,318 smartphone fingerphoto images in the LFIW database.
Table II (fragment): Overlapped latent fingerprint segmentation methods
Chen et al. [22] | Fourier transform and relaxation labelling | FVC2002 [31] | Insufficient for singular points
Zhao and Jain [23] | Joint orientation modeling | NIST SD27, FVC2002 [31] | Better manual marking minutiae

C. Wall surface latent fingerprint images: L-wall
In order to simulate latent fingerprints captured on a wall in a real crime scene (an indoor environment such as an office, bank, or school), subjects were required to touch an office desk partition wall with all 10 fingers to leave their fingerprints on the wall. Copper powder was used to make the fingerprints visible, and the wall latent fingerprint images were taken with an iPhone 8 Plus smartphone (12 megapixel dual camera). An additional white light source was used and the acquisition distance to the fingerprints was controlled, while the built-in flash was turned off. The original size of the wall latent fingerprint images is 3024 × 4032 pixels (∼2.5 MB) and the ppi is 72 by default. There are a total of 132 subjects × 10 fingers − 2 lost = 1,318 wall surface latent fingerprint images in the LFIW database.

D. iPad surface latent fingerprint images: L-ipad
In order to simulate latent fingerprints captured on the surface of electronic devices as well as on glass in a real crime scene, subjects were required to touch an iPad screen surface (without protective film) with all 10 fingers to leave their fingerprints on the iPad screen. Copper powder was used to make the fingerprints visible, and the latent fingerprint images were taken with the same iPhone 8 Plus smartphone. The acquisition setup and image properties are the same as for L-wall. Because an additional white light source was used, care was taken to avoid screen reflections as much as possible during the acquisition process. There are a total of 132 subjects × 10 fingers − 2 lost = 1,318 iPad screen surface latent fingerprint images in the LFIW database.

E. Aluminum foil surface latent fingerprint images: L-alum
In order to simulate latent fingerprints captured on a (deformable) metal surface in a crime scene, subjects were required to touch an aluminum foil surface with all 10 fingers to leave their fingerprints on the foil. Copper powder was again used to make the fingerprints visible, and the aluminum foil surface latent fingerprint images were taken with the same iPhone 8 Plus smartphone. The acquisition setup and image properties are the same as for L-wall. Because an additional white light source was used, care was taken to avoid reflections from the aluminum foil as much as possible during the acquisition process. There are a total of 132 subjects × 10 fingers − 2 lost = 1,318 aluminum foil surface latent fingerprint images in the LFIW database.

F. Fingerprint images preprocessing
All original Smt, L-wall, L-ipad, and L-alum images have been cropped and rotated manually for further processing. Moreover, every preprocessed fingerprint image (JPEG format) is provided in four versions: a 500 ppi version, a gray-scale 500 ppi version, a gray-scale 500 ppi version in PNG format, and a gray-scale 500 ppi version in PGM format.
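To illustrate what producing the gray-scale PGM version involves, the following is a minimal sketch that converts RGB pixels to 8-bit gray values and serializes them as a binary (P5) PGM file. The paper does not state which gray-scale conversion was used; the ITU-R BT.601 luma weights below are an assumption, the function names are hypothetical, and resampling to 500 ppi is not shown.

```python
def rgb_to_gray(pixels):
    """Map rows of (R, G, B) tuples to rows of 8-bit gray values.

    Assumption: BT.601 luma weights (0.299, 0.587, 0.114). The weights used
    for the LFIW images are not given in the text.
    """
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
        for row in pixels
    ]


def encode_pgm(gray_rows):
    """Serialize rows of 8-bit gray values as a binary (P5) PGM byte string."""
    height = len(gray_rows)
    width = len(gray_rows[0])
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    body = bytes(value for row in gray_rows for value in row)
    return header + body


# Tiny 2x1 example image: one black pixel and one white pixel.
image = [[(0, 0, 0), (255, 255, 255)]]
pgm_bytes = encode_pgm(rgb_to_gray(image))
print(pgm_bytes)  # b'P5\n2 1\n255\n\x00\xff'
```

Writing `pgm_bytes` to a file opened in binary mode yields a file that standard PGM readers should accept.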

IV. FINGERPRINT RECOGNITION ALGORITHMS
As described in the previous sections, a number of existing state-of-the-art fingerprint recognition algorithms are evaluated on the new LFIW database. The different versions of minutiae/features generated by these algorithms have been stored in the benchmark databases for further evaluation. In this section, we briefly discuss the algorithms that were tested on the LFIW database.
1) NIST Biometric Image Software (NBIS) [33]: NBIS is one of the most well-known fingerprint recognition toolkits that can be freely used and distributed. Two components are used for the performance evaluation: MINDTCT and BOZORTH3. MINDTCT is a minutiae detector that automatically locates and records ridge endings and bifurcations in a fingerprint image. BOZORTH3 is a minutiae-based fingerprint comparison algorithm. It accepts minutiae generated by the MINDTCT algorithm. All minutiae extracted by MINDTCT are stored in the benchmark databases and can be used with other fingerprint comparison algorithms if needed.
2) Minutia Cylinder-Code (MCC) fingerprint recognition SDK [34], [35]: The so-called 'cylinder' is a 3D data structure containing minutiae distances and angles. Minutiae positions and directions in any standardized format (e.g., ISO/IEC 19794-2 [36]) can serve as the mandatory input for establishing the cylinder. Instead of designing complex metrics to calculate local similarities and generate the comparison score, MCC applies a very simple algorithm that takes advantage of the cylinder invariance. MCC uses ISO/IEC minutiae information to generate its own minutiae template for fingerprint comparison [37], [38].
3) VeriFinger fingerprint recognition SDK [39]: VeriFinger is a commercial fingerprint recognition software designed for biometric systems developers and integrators by Neurotechnology [39]. The software can conduct fast fingerprint comparison in 1-to-1 and 1-to-many modes. The VeriFinger algorithm is based on deep neural networks and follows the commonly accepted fingerprint recognition scheme, which uses a set of minutiae along with a number of proprietary algorithmic solutions that enhance system performance and reliability. VeriFinger can produce its own minutiae template for fingerprint comparison. It has also been submitted to the FVC-onGoing [37], [38] framework and has reached NIST MINEX compliance.
4) MinutiaeNet minutiae extractor [40]: MinutiaeNet performs fully automatic latent fingerprint minutiae extraction using two independent deep neural networks. The first network, CoarseNet, estimates the minutiae score map and minutiae orientation based on a CNN and fingerprint domain knowledge (enhanced image, orientation field, and segmentation map). The second network, FineNet, refines the candidate minutiae locations based on the score map. MinutiaeNet has been tested specifically on the NIST SD27 latent fingerprint database, where it performs better than several other state-of-the-art minutiae extraction algorithms. However, MinutiaeNet requires a separate method for minutiae comparison, such as MCC or BOZORTH3.
5) MSU Latent Automatic Fingerprint Identification System [41]: MSU-LAFIS is an end-to-end latent fingerprint search system with five main steps: 1) fingerprint region of interest segmentation, 2) segmented image pre-processing, 3) feature extraction, 4) feature comparison, and 5) comparison results generation. Two independent feature extraction algorithms are used to produce additional feature templates. To cope with an insufficient number of features extracted from latent fingerprints (too small an area or very low image quality), the feature template can be established by combining real extracted features and a group of generated virtual features. Each latent fingerprint feature and its neighbourhood are used to obtain a 96-dimensional descriptor for feature comparison. The descriptor length of the virtual features is further compressed from 96 to 16 by product quantization to increase processing speed.
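The product quantization step mentioned for MSU-LAFIS can be illustrated with a small sketch. It assumes that "from 96 to 16" means the 96-dimensional descriptor is cut into 16 sub-vectors of 6 values, each replaced by the index of its nearest codeword; this reading, the codebook size of 256, and the random codebooks are all assumptions made for illustration (real codebooks would be learned, typically with k-means, and the MSU-LAFIS implementation may differ).

```python
import random

NUM_SUB = 16         # assumed number of sub-quantizers (one code per sub-vector)
SUB_DIM = 6          # 96 / 16 values per sub-vector
NUM_CODEWORDS = 256  # assumed codebook size, so each code fits in one byte


def split(vec, num_sub):
    """Cut a vector into num_sub equal-length sub-vectors."""
    step = len(vec) // num_sub
    return [vec[i * step:(i + 1) * step] for i in range(num_sub)]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pq_encode(vec, codebooks):
    """Replace each sub-vector by the index of its nearest codeword."""
    subs = split(vec, len(codebooks))
    return [
        min(range(len(cb)), key=lambda k: sq_dist(sub, cb[k]))
        for sub, cb in zip(subs, codebooks)
    ]


def pq_decode(codes, codebooks):
    """Rebuild an approximate vector by concatenating the chosen codewords."""
    out = []
    for code, cb in zip(codes, codebooks):
        out.extend(cb[code])
    return out


# Hypothetical random codebooks and descriptor, for demonstration only.
rng = random.Random(0)
codebooks = [
    [[rng.gauss(0, 1) for _ in range(SUB_DIM)] for _ in range(NUM_CODEWORDS)]
    for _ in range(NUM_SUB)
]
descriptor = [rng.gauss(0, 1) for _ in range(NUM_SUB * SUB_DIM)]
codes = pq_encode(descriptor, codebooks)
approx = pq_decode(codes, codebooks)
print(len(descriptor), len(codes), len(approx))  # 96 16 96
```

The speed gain comes from comparing short code vectors (or precomputed codeword distances) instead of full 96-dimensional descriptors, at the cost of an approximation error.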

V. PROTOCOLS, RESULTS AND DISCUSSION
With the newly introduced dataset, we also conduct an extensive evaluation by introducing three different protocols. The first protocol establishes the baseline performance for traditional fingerprint capture devices (optical and capacitive sensors). The second protocol evaluates the comparison of latent fingerprints against each other (latent-vs-latent) and against traditional contact-based fingerprints. The third protocol accounts for the comparison of latent fingerprints with contactless fingerprints derived from fingerphotos. With these protocols, we cover all scenarios of relevance in real-world use cases. Given the large scale of the LFIW dataset, we also perform both verification and identification experiments to provide the reader with an understanding of the challenges and thereby suggest directions for future work.

A. Verification results - Overall
Before each of the protocols is considered, we provide an overall evaluation of the dataset by combining all the images of the LFIW dataset. In the overall evaluation experiment, a total of 118,620 mated comparison scores and 173,593,780 non-mated comparison scores are generated. The Detection Error Tradeoff (DET) curves of the overall comparison experiments for the LFIW dataset are presented in Fig. 4, along with detailed results for various metrics in Table IV. Two algorithms, VeriFinger and MCC, perform slightly better than the average; however, their Equal Error Rates (EER) are still 22.82% and 32.27%, respectively. Even MinutiaeNet and MSU-LAFIS, which are specifically designed for latent fingerprint recognition, perform poorly, with EERs of 51.85% (MinutiaeNet-MCC)/45.61% (MinutiaeNet-NBIS) and 47.51%, respectively. We have therefore analyzed the causes of the low performance and observe a high Failure To Enrol Rate (FTER) for most of the selected algorithms. The overall FTER are: [...] especially developed for latent fingerprints, so it corresponds to our expectation that it can handle more than 90% of the latent fingerprints in the LFIW database. The FTER for VeriFinger exceeds 40%, which means that almost half of the latent fingerprints in the LFIW database cannot be used for recognition or identification with the VeriFinger algorithm. On the one hand, the high FTER of the selected non-latent-oriented methods indicates that the latent fingerprints in the LFIW database are much more difficult to process than existing fingerprints and latent fingerprints. Therefore, robust latent fingerprint recognition algorithms are needed. On the other hand, the pre-processing approaches of the selected algorithms could be optimized to handle the latent fingerprints in the LFIW database.
Index Terms-Biometrics, latent fingerprints, fingerprint recognition, database, performance evaluation.
Dechao Sun is with the .

The distributions of the overall comparison scores from the selected algorithms are illustrated in Fig. 5. None of the selected algorithms separates the mated scores from the non-mated scores well: the highest-frequency mated and non-mated scores almost overlap for all the methods. Compared with the other methods, the mated scores of VeriFinger are distributed farther away from the non-mated scores (see Fig. 5 (c)). This is probably due to the high FTER where latent fingerprints have been rejected during the enrollment phase while it could process reference

Two algorithms (VeriFinger and MCC) perform slightly better than the average; however, the overall performance is not promising. The generally low performance of the selected fingerprint recognition algorithms, relative to the accuracy reported in the original publications, could be due to the difficulty of the benchmark database. Even though MinutiaeNet and MSU-LAFIS are designed specifically for latent fingerprint recognition, their performance is not outstanding. During the minutiae (features) extraction process, we observed that the

There are also a number of mated comparison scores separated far from the non-mated scores for MSU-LAFIS in Fig. 5 (f), but the proportion is lower than for VeriFinger. The mated and non-mated scores of MinutiaeNet (both MCC and NBIS) overlap heavily. In addition to the distributions of the comparison scores, we present the most important performance indicators measured on the LFIW database for the overall comparison experiments in Table IV. VeriFinger has the highest Area Under the ROC Curve (AUC) value and the lowest EER. Apart from VeriFinger, MCC outperforms the rest of the algorithms; however, its AUC and EER are still far from those of a robust latent fingerprint recognition system. Nevertheless, the overall experimental results alone do not reveal whether the higher mated comparison scores come from reference comparisons or from latent comparisons. Therefore, in the following parts we investigate the results for reference and latent comparisons separately.
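To make the two summary indicators concrete, the following is a minimal sketch of how an EER and an AUC could be computed from lists of mated and non-mated comparison scores. It is not the evaluation code behind Table IV: the function names, the convention that a higher score means greater similarity, and the toy scores are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def error_rates(mated: Sequence[float], non_mated: Sequence[float],
                threshold: float) -> Tuple[float, float]:
    """Return (FMR, FNMR) when a comparison is accepted if score >= threshold."""
    fmr = sum(1 for s in non_mated if s >= threshold) / len(non_mated)
    fnmr = sum(1 for s in mated if s < threshold) / len(mated)
    return fmr, fnmr


def equal_error_rate(mated: Sequence[float], non_mated: Sequence[float]) -> float:
    """Approximate the EER at the observed threshold where |FMR - FNMR| is smallest."""
    thresholds = sorted(set(mated) | set(non_mated))
    fmr, fnmr = min((error_rates(mated, non_mated, t) for t in thresholds),
                    key=lambda rates: abs(rates[0] - rates[1]))
    return (fmr + fnmr) / 2


def auc(mated: Sequence[float], non_mated: Sequence[float]) -> float:
    """AUC as the fraction of (mated, non-mated) pairs ranked correctly; ties count half."""
    wins = sum((m > n) + 0.5 * (m == n) for m in mated for n in non_mated)
    return wins / (len(mated) * len(non_mated))


# Toy scores (hypothetical): one mated score falls inside the non-mated range.
mated_scores = [0.9, 0.8, 0.7, 0.4]
non_mated_scores = [0.6, 0.3, 0.2, 0.1]
eer = equal_error_rate(mated_scores, non_mated_scores)
area = auc(mated_scores, non_mated_scores)
```

On these toy scores the sketch gives an EER of 0.25 and an AUC of 0.9375. Heavily overlapping distributions, as described for Fig. 5, would push the EER towards 0.5 and the AUC towards 0.5.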

B. Protocol I: Verification results - Traditional Sensors
We illustrate the DET curves of the NBIS, MCC, VeriFinger, MinutiaeNet-MCC, MinutiaeNet-NBIS, and MSU-LAFIS comparison experiments for the LFIW database in Fig. 6, and the most important performance indicators measured for the selected algorithms in Table V. From Fig. 6 and Table V we can observe that the reference comparisons (e.g. R1-opt to R2-opt, R2-opt to R2-cap) perform better than the latent comparisons (e.g. R2-opt to L-wall, L-ipad to L-alum). Except for VeriFinger, the best performance in reference comparisons comes from 'R1-opt to R2-opt' (reference fingerprints from optical sensor session 1 vs. reference fingerprints from optical sensor session 2; see blue dashed lines with square markers in Fig. 6). For VeriFinger, the 'R1-cap to R2-cap' comparisons give the best performance. Although the 'R1-cap to R2-cap' comparisons (red dashed lines with triangle markers) also use a single acquisition device, their performance is lower than that of 'R1-opt to R2-opt' for most of the algorithms. This means that, as measured by NBIS, MCC, MinutiaeNet (both MCC and NBIS), and MSU-LAFIS, the utility of reference fingerprints from the optical sensor is better than that of the capacitive sensor. The performance of the remaining reference comparisons is very similar.

In Table V, the EER of the 'R1-cap to R2-cap' comparison experiment is 0.82% for MCC, which is also higher than that of the 'R1-opt to R2-opt' experiment. However, both of these EERs are smaller (by 0.11% and 0.89%, respectively) than those of NBIS. Moreover, the difference between the 'R1-opt to R2-opt' and 'R1-cap to R2-cap' comparison experiments is also 0.78 smaller for MCC than for NBIS. This means that MCC handles the reference fingerprints in the LFIW database better than NBIS. We can see from Table V that the overall EERs and FMR100s for VeriFinger are lower than those of NBIS and MCC, consistent with the overall results discussed previously. By comparing Fig. 6 (d) and (e), as well as the EER values for MinutiaeNet, we find that NBIS gives slightly better overall system performance than MCC when using the minutiae extracted by MinutiaeNet. However, with neither MCC nor NBIS does MinutiaeNet achieve better system performance than the other fingerprint recognition systems.
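The comparison between MCC and NBIS above can be checked with one line of arithmetic. As an assumption about the intended reading, take 0.11% and 0.89% to be the amounts by which the MCC EER lies below the NBIS EER on 'R1-opt to R2-opt' and 'R1-cap to R2-cap', respectively. The symbols $E$ and $\Delta$ are notation introduced here for illustration only.

```latex
\begin{align*}
\Delta^{A} &= E^{A}_{\text{cap}} - E^{A}_{\text{opt}},
  \qquad A \in \{\text{MCC}, \text{NBIS}\},\\
\Delta^{\text{NBIS}} - \Delta^{\text{MCC}}
  &= \bigl(E^{\text{NBIS}}_{\text{cap}} - E^{\text{MCC}}_{\text{cap}}\bigr)
   - \bigl(E^{\text{NBIS}}_{\text{opt}} - E^{\text{MCC}}_{\text{opt}}\bigr)
   = 0.89 - 0.11 = 0.78 \text{ percentage points}.
\end{align*}
```

Under this reading, the 0.78 figure follows directly from the two per-experiment gaps, so the three numbers quoted in the text are mutually consistent.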
1) Protocol Ia: Verification results - Traditional Cross-Sensors: For completeness of the analysis, we also study cross-sensor recognition by comparing the optical and capacitive sensors. The recognition performance of 'R1-opt to R1-cap' (orange dashed line with triangle markers) and 'R1-cap to R2-opt' (purple dashed line) is lower than that of the same-sensor comparisons for MSU-LAFIS (see Fig. 6 (f)).

C. Protocol II: Verification results - Traditional vs. Latent
We further consider a realistic evaluation scenario where the latents are compared against traditional fingerprints. As noted from Table V, the overall performance for latent vs. traditional fingerprints is low. The EER values are around 50% and the FMR100 values are close to 100% for most of the reference-to-latent and latent-to-latent comparisons from all algorithms. The results suggest that comparing latent fingerprints in the LFIW database is a very complex task for the selected algorithms. An interesting EER value of 18.8% can be observed in the 'L-ipad to L-alum' comparison experiment for NBIS. This EER is much lower than the others obtained from latent fingerprint comparisons. Inspecting the comparison scores, we note a very high FTER in NBIS for the 'L-ipad to L-alum' comparisons, resulting in a misleadingly low EER. The results also indicate that latent fingerprints captured from the iPad surface and from the aluminium foil surface are the most difficult ones for NBIS to extract minutiae from. The EER values for VeriFinger are low for many reference-to-latent (e.g. the EER for R1-opt to L-ipad is 10.2%) and latent-to-latent (e.g. the EER for L-wall to L-ipad is 7.1%) comparison experiments. After investigating the comparison scores from these experiments, we find that the number of scores is quite small. For example, only 29 mated scores are left for the L-wall to L-ipad comparison, and only one mated score is left for the L-wall to L-alum comparison. All of the atypical EER and FMR values discovered above (e.g. low for NBIS and high for VeriFinger) are due to the high FTER discussed previously (noted in Table V). Although MinutiaeNet has been tested on the NIST SD27 latent fingerprint database and MSU-LAFIS was developed especially for latent fingerprints, the EER and FMR100 values in Table V show that they still fail to provide robust latent fingerprint recognition performance on the LFIW database.
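The effect described above can be sketched in a few lines, assuming that FTER denotes the fraction of images for which no usable template could be extracted and that a comparison score exists only when both templates are available. The identifiers and the toy data below are hypothetical and do not come from the LFIW evaluation.

```python
from typing import Dict, List, Optional, Tuple

# A template is whatever the extractor returns; None marks a failed extraction.
Template = Optional[List[Tuple[int, int, float]]]


def failure_rate(templates: Dict[str, Template]) -> float:
    """Fraction of images for which template extraction failed."""
    failed = sum(1 for t in templates.values() if t is None)
    return failed / len(templates)


def surviving_pairs(pairs: List[Tuple[str, str]],
                    probes: Dict[str, Template],
                    references: Dict[str, Template]) -> List[Tuple[str, str]]:
    """Keep only the (probe, reference) pairs where both templates exist."""
    return [(p, r) for p, r in pairs
            if probes.get(p) is not None and references.get(r) is not None]


# Toy data: four latent probes, three of which failed extraction.
latents: Dict[str, Template] = {
    "finger_1": [(10, 12, 0.5)],
    "finger_2": None,
    "finger_3": None,
    "finger_4": None,
}
refs: Dict[str, Template] = {f"finger_{i}": [(10, 12, 0.5)] for i in range(1, 5)}
mated_pairs = [(name, name) for name in latents]

fter = failure_rate(latents)
remaining = surviving_pairs(mated_pairs, latents, refs)
```

Here three of four probes are rejected, so a single mated comparison survives. An EER computed only on surviving comparisons says nothing about the rejected probes, which is why a low EER paired with a high FTER should be read with caution.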

D. Protocol III: Verification results - Fingerphoto Comparisons
We further consider another protocol, following recent trends, and benchmark the performance of fingerphoto to latent fingerprint comparison. Specifically, the protocol aims at using fingerphotos as a replacement for traditional fingerprints captured with contact-based sensors. We therefore use fingerphotos as references and compare them against latent fingerprints. From Fig. 6 and Table V (in the bottom sector),

E. Identification results
In addition to the verification results, we also illustrate the identification performance using Cumulative Match Curves (CMC) of the Rank Identification (RI) rates (up to rank-10) for the NBIS, MCC, VeriFinger, MinutiaeNet-MCC, MinutiaeNet-NBIS, and MSU-LAFIS identification experiments on the LFIW database, as shown in Fig. 7. From Fig. 7 we can observe that the overall RI rates for reference comparisons are higher than those for latent identification. The RI rates for MCC, VeriFinger, MinutiaeNet-MCC, and MSU-LAFIS reach 100% at rank-6 to rank-10, while the RI rates for NBIS and MinutiaeNet-NBIS are lower than 60% at all 10 ranks. Similar to the verification results, the 'R2-opt to R1-opt' comparisons (gray dashed lines in Fig. 7) remain the best performing, but none of the algorithms reaches an RI rate of 1 before rank-4.
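As a rough illustration of how the rank-k identification rates behind a CMC can be obtained, the sketch below assumes a closed-set scenario (every probe has its true identity in the gallery) and similarity scores where higher means more similar. Ties are resolved optimistically. The function names and toy scores are hypothetical and are not taken from the LFIW experiments.

```python
from typing import Dict, List


def true_match_rank(scores: Dict[str, float], true_id: str) -> int:
    """Rank of the true identity: 1 + number of gallery entries scoring strictly higher."""
    target = scores[true_id]
    return 1 + sum(1 for s in scores.values() if s > target)


def cmc_curve(all_scores: Dict[str, Dict[str, float]],
              truth: Dict[str, str], max_rank: int = 10) -> List[float]:
    """Rank-k identification rates for k = 1..max_rank."""
    ranks = [true_match_rank(all_scores[p], truth[p]) for p in all_scores]
    return [sum(1 for r in ranks if r <= k) / len(ranks)
            for k in range(1, max_rank + 1)]


# Toy gallery with identities a, b, c and three probes (hypothetical scores).
gallery_scores = {
    "probe_1": {"a": 0.9, "b": 0.5, "c": 0.1},  # true identity a, ranked 1st
    "probe_2": {"a": 0.6, "b": 0.5, "c": 0.7},  # true identity b, ranked 3rd
    "probe_3": {"a": 0.2, "b": 0.8, "c": 0.9},  # true identity b, ranked 2nd
}
truth = {"probe_1": "a", "probe_2": "b", "probe_3": "b"}
curve = cmc_curve(gallery_scores, truth, max_rank=3)
```

In this toy case the curve rises from one third at rank-1 to 1.0 at rank-3. A curve that stays below 60% through rank-10, as reported for NBIS and MinutiaeNet-NBIS, means the true identity is missing from the top ten candidates for more than 40% of the probes.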
In summary, none of the selected algorithms provides high-accuracy latent fingerprint minutiae extraction and comparison on fingerprints from the LFIW database, in either the verification or the identification scenario. From the experimental results above we can conclude that latent fingerprint recognition with existing techniques is still very challenging and complex, especially for latent fingerprints captured in wild conditions. The results again indicate the need to develop robust minutiae extraction and recognition algorithms for wild latent fingerprints.

F. Directions for future works
As illustrated by the experimental results discussed in the above sections, the system performance of the evaluated latent fingerprint recognition algorithms does not meet operational requirements. Looking at FMR100, we can see from the overall results in Table IV or the reference/latent comparison results in Table V that the value is around 20% only for the reference fingerprint comparison experiments. For the latent and smartphone fingerprint comparison experiments, the FMR100 is higher than 90%, and in many cases close to 100%. From a practical point of view, this behaviour would cause a noticeable number of false matches and, as a result, a larger number of false non-matches during latent fingerprint recognition in crime scene, digital forensic, or law enforcement scenarios. This would not help to find the real suspects. Therefore, the directions for future work could be the following:
• Given that the existing fingerprint recognition systems have very low performance on smartphone and latent fingerprints captured from different materials, more accurate and robust algorithms are needed to overcome the difficulties and challenges of wild latent fingerprint recognition.
• As discussed, the FTER is relatively high for existing fingerprint recognition algorithms, including systems designed particularly for latent fingerprints. Reliable and accurate latent fingerprint pre-processing (e.g. segmentation, quality enhancement, minutiae orientation estimation, etc.) and minutiae extraction approaches need to be developed.
• While there exist quality assessment algorithms like NFIQ [42] for fingerprints, quality assessment algorithms that can predict the recognition performance for latent fingerprints are still missing.
• As an additional direction, the performance of human examiners in latent fingerprint comparison could be investigated in a standardized manner to discover the important factors in recognizing latent fingerprints captured in wild environments.
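The FMR100 values discussed in this section can be illustrated with a short sketch. It assumes the convention used in the FVC competitions, where FMR100 is the lowest FNMR achievable while the FMR stays at or below 1%, and it assumes that higher scores mean greater similarity. The function name and toy scores are hypothetical.

```python
from typing import Sequence


def fnmr_at_fmr(mated: Sequence[float], non_mated: Sequence[float],
                max_fmr: float = 0.01) -> float:
    """Lowest FNMR over all thresholds whose FMR does not exceed max_fmr."""
    # Candidate thresholds: every observed score, plus one above the maximum
    # so that a threshold with FMR = 0 is always available.
    candidates = sorted(set(mated) | set(non_mated))
    candidates.append(candidates[-1] + 1.0)
    best = 1.0
    for t in candidates:
        fmr = sum(1 for s in non_mated if s >= t) / len(non_mated)
        if fmr <= max_fmr:
            fnmr = sum(1 for s in mated if s < t) / len(mated)
            best = min(best, fnmr)
    return best


# Toy scores: 100 non-mated scores from 0 to 99, four mated scores.
non_mated_scores = list(range(100))
mated_scores = [99.5, 150, 50, 20]
fmr100 = fnmr_at_fmr(mated_scores, non_mated_scores, max_fmr=0.01)
```

With these toy scores, the lowest threshold that keeps the FMR at 1% is 99, which rejects two of the four mated comparisons, so the sketch returns 0.5. An FMR100 close to 100%, as reported for the latent comparisons, means that almost no mated pair scores above nearly all non-mated pairs.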
VI. CONCLUSION
Latent fingerprint recognition has always been a complex and challenging task, given the lack of public, large-scale datasets. In this work, we have introduced LFIW, a new database of latent fingerprints in the wild. The database includes six different acquisition scenarios: reference fingerprints from (1) optical and (2) capacitive traditional fingerprint sensors, (3) smartphone fingerphotos, and latent fingerprints captured from (4) wall surface, (5) iPad surface, and (6) aluminium foil surface. The new database consists of 1318 unique fingerprint instances captured in all of the above-mentioned settings. A total of 2636 reference fingerprints from optical and capacitive sensors, 1318 fingerphotos from smartphones, and 9224 latent fingerprints from the 132 subjects are provided in this work. The presented wild latent fingerprint database, with its large number of unique fingerprints, will be made publicly available so that researchers can benchmark their algorithms in a free and sustainable manner and develop robust and accurate latent fingerprint recognition algorithms. Additionally, a benchmark of several existing state-of-the-art fingerprint recognition systems is provided in this paper to highlight the limitations of the existing latent fingerprint recognition methods and to provide some directions for future work in this research field.

Fig. 1: Our contribution of Latent Fingerprint In the Wild (LFIW) dataset compared to previous works.

Fig. 2: Examples of comparison scores of the fingerprints from the proposed LFIW database.

Fig. 4: DET curve of the overall comparison experiments for the LFIW database.

Fig. 5: Distributions of the overall comparison scores in the 'Latent in the Wild' database


TABLE I: Latent fingerprint databases. The proposed database will be publicly available upon the acceptance of this paper.

TABLE II: Latent fingerprint recognition techniques.

TABLE III: Properties of the 'Latent Fingerprint in the Wild' database.
* One of the subjects (a worker) has lost both thumbs.

TABLE IV: Performance indicators measured on the LFIW database for the overall comparison experiments.
67%, and FTER for different protocols are given in Table V in blue color. Such a high FTER for the second and third protocols in Table V indicates the difficulty of extracting minutiae from images in LFIW, making it a challenging dataset. Although some minutiae (features) can be successfully extracted, the number of high-quality minutiae (features) might be insufficient for an eligible comparison. MSU-LAFIS has the lowest FTER; NBIS and MinutiaeNet (both MCC and NBIS) have FTERs lower than 30%. Since MSU-LAFIS is

JOURNAL OF LATEX CLASS FILES, VOL. XX, NO. XX, XXXX 2023

This paper was supported by the National Natural Science Foundation of China (Grant No. 62106228), Zhejiang Natural Science Foundation (Grant No. LQ22F020003), Ningbo Natural Science Foundation (Grant No. 2021J175), and the Ningbo Yongjiang Talent Introduction Programme 2021.
Xinwei Liu, Renfang Wang, Hong Qiu, Hucheng Wu, Qiguang Zheng, Siyou Xiao are with the College of Big Data and Software Engineering, Zhejiang Wanli University (ZWU), 8 Qianhu South Road, Ningbo, Zhejiang, China.
Kiran Raja, Raghavendra Ramachandra, Christoph Busch are with the Department of Computer Science, NTNU, 2815 Gjøvik, Norway.

TABLE V: Performance indicators measured on the LFIW database for the six different algorithms.