3D-FLARE: A Touchless Full-3D Fingerprint Recognition System Based on Laser Sensing

“The friction ridge pattern is a 3D structure which, in its natural state, is not deformed by contact with a surface”. Building upon this rather trivial observation, the present work constitutes a first solid step towards a paradigm shift in fingerprint recognition from its very foundations. We explore and evaluate the feasibility of moving from current technology operating on 2D images of elastically deformed impressions of the ridge pattern, to a new generation of systems based on full-3D models of the natural non-deformed ridge pattern itself. A small number of previous studies have already started scratching the surface of 3D fingerprint recognition and should not be overlooked. However, the vast majority of the few successful approaches published so far are based on the reconstruction of fingerprints from multiple 2D images acquired under different lighting conditions (photometric stereo 3D reconstruction) or from different angles (stereo vision 3D reconstruction). Such reconstruction methods lead in general to 2D fingerprints wrapped over the overall volume of the finger. These volumetric fingerprints have shown some promising performance, but still miss the real depth information of the ridge pattern, which, in the best case scenario, is coarsely estimated during the error-prone reconstruction process. In the present work we take one step further, directly acquiring, for the first time in a consistent and repeatable manner, full-3D fingerprint models stored as point clouds, where each point is defined by its $[x,y,z]$ coordinates. This way, the 3D data is directly measured by the sensor, with no post-processing reconstruction stage required. The complete recognition system developed also represents an alternative to traditional technology based on minutiae detection. It shows that image-based processing algorithms and descriptors can be successfully applied to the new full-3D data, reaching very competitive results and confirming the high distinctiveness of the models.


I. INTRODUCTION
''All natural objects are unique if examined in enough detail'' - Gottfried W. Leibniz

This quote by the famous mathematician and philosopher Gottfried W. Leibniz was used by Simon Cole in his article ''The Myth of Fingerprints'' to question the amount of usable distinctive information available in a fingerprint [1].
While it is difficult to argue against Leibniz's statement, when we speak about automatic computerised recognition of fingerprints, it is Cole who hits the mark [2]. The question is no longer whether natural fingerprints are unique, but rather whether the digital representation used for their recognition is able to capture that uniqueness in an automatic, measurable, consistent and repeatable (i.e., usable) way.
The associate editor coordinating the review of this manuscript and approving it for publication was Aysegul Ucar.
In the wake of the previous discussion, it can be argued that the key to reaching high accuracy in biometrics lies, to a large extent, at the very beginning of the recognition chain: the acquisition process. Following Leibniz's quote, we need to develop acquisition technology that is able to examine (and capture) the natural biometric characteristics in enough detail.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

This need for detailed and reliable digital models of the natural world is summarised in computer science as the well-known GIGO principle: ''Garbage In, Garbage Out''. In other words, the results of a computerised system can only be as accurate and reliable as the information entered into it. In the biometric field in particular, this principle bears a direct relation with the concept of data quality as reflected by the fidelity definition given in the ISO/IEC 29794-1 standard [3]: if the acquired biometric sample (i.e., digital representation of the natural biometric characteristic) does not reliably render its original counterpart (garbage in), only high error rates can be expected (garbage out). The argumentation above leads to a foregone conclusion: the more reliable and detailed the digital representation of the object being automatically recognised, the higher its uniqueness, the easier it is to differentiate from other similar objects, and the better the system accuracy. In the case of fingerprint recognition, two features stand out in order to acquire a high fidelity digital representation of the ridge structure:
• Human fingerprints are 3D anatomical structures. This rather obvious statement is, however, one of the key factors behind the accuracy ceiling that fingerprint-based systems are running into. Present fingerprint recognition technology relies on 2D images of the fingerprints. Such a downgrading from the 3D space to the 2D pixel-based plane implies that some valuable, and potentially distinctive, information is lost.
• In their natural state, fingerprints are not deformed by contact with an object. This second trivial remark is also one of the main factors that explain the level of intrinsic failure of fingerprint recognition. The majority of current fingerprint acquisition sensors require the finger to be in contact with a surface in order to capture the final 2D fingerprint image. Such a touch-based acquisition procedure introduces additional variability among samples as a result of: 1) the elastic deformation that affects the finger when it is pressed against a surface; 2) imprecise imaging due to changes in the skin condition (e.g., dry or moisturised skin). This variability is not present in the fingerprint's natural non-deformed state (or is present to a much lesser degree), but is generated as a direct result of the acquisition process. Furthermore, these changes are very difficult to predict or correct at the time of acquisition as they depend on uncontrollable factors such as the pressure applied on the sensor, the amount of rotation after contact, or the condition of the fingerprint.
Based on the previous two factual observations, it is reasonable to hypothesise that a high quality 3D model acquired in a touchless way should, in principle, be a more reliable representation of a natural fingerprint than a high quality 2D image acquired with touch-based technology, since the 3D model is closer to the actual physical reality of the fingerprint. This way, the 3D model would have the potential to lead to higher recognition accuracy.
As further reinforcement of the previous hypothesis, let's assume that the acquisition of a perfect 3D model of the fingertip were possible. In that case, it would be fairly straightforward to project it onto a perfect, non-deformed 2D rolled fingerprint image. That is, all the information contained in a given 2D image is also present in the perfect 3D model. However, the inverse process, that is, going from a perfect 2D image to a perfect 3D model, would be impossible, given that part of the original spatial information would be lost during the 2D acquisition and would have to be estimated in the reconstruction process.
For the above-mentioned reasons, it seems reasonable to expect that the previous hypothesis holds: on paper, full-3D touchless acquisition has the potential to produce better recognition accuracy results than 2D touch-based sensing. If that is the case, the question that immediately follows is: Why does theory fail to be put in practice? Why is current fingerprint acquisition technology based on touch-sensors that capture 2D images? Why aren't there any known commercial or standard solutions to capture full-3D fingerprint models in a touchless manner? The answer to these questions lies in a fundamental concept ''hidden'' in one of the previous paragraphs: a high quality representation of the fingerprint is required.
It may be true that, assuming both representations of the fingerprint (i.e., full-3D touchless model and 2D touch-based image) of the same quality, the 3D model should lead, in principle, to better recognition accuracy. However, it is also true that a high quality 2D touch-based image is preferable to a medium-to-low quality 3D fingerprint model.
The fact of the matter is that capturing a high quality 3D model of a fingerprint is a very challenging task. To date, in spite of some recent valuable initiatives (see Sect. II), there is not yet any sensor capable of accomplishing the feat with a sufficient degree of detail (as required by Leibniz), in a fast, consistent and repeatable way. As a result, full-3D fingerprint recognition still remains today, to a very large extent, uncharted territory [4]. In the meantime, 2D touch-based sensors continue to produce more reliable fingerprint representations and, in consequence, higher accuracy recognition results.
The current paper builds upon the lessons learned in the preliminary work presented in [5], to become a first robust step towards setting the foundations to consolidate a new biometric mode: full-3D fingerprint recognition. The contributions of the work can be summarised as follows: 1) Development of the first fingerprint scanner based on laser sensing technology able to acquire accurate full-3D fingerprint models in a touchless manner; 2) Development of the first complete recognition system of full-3D fingerprints; 3) Analysis of the discriminative potential of image based descriptors to model the new 3D data, as an alternative to traditional minutiae detection methods; 4) Acquisition of the first multi-resolution database of full-3D fingerprints, comprising 200 different fingers and 3,000 samples; 5) Evaluation of the whole methodology showing the high discriminative power of the novel biometric mode.

A. SHORT NOTE ON TERMINOLOGY
By definition, a fingerprint is the impression left on a surface by the friction ridges of a subject's fingertip, where the fingertip is the part of the finger corresponding, in length, to the last phalange (3rd phalange).
A friction ridge (also referred to as a papillary ridge or epidermal ridge in the specialised anatomy literature) is a raised portion of the epidermis on the digits (fingers, thumbs and toes), the palms of the hands and the soles of the feet.
Each friction ridge of the epidermis (outer skin) is anchored to the dermis (inner skin) by a double row of peglike protuberances, or papillae.
The key concept in the fingerprint definition is that a fingerprint is formed on an external surface, as a result of the contact between the friction ridges of the fingertip and the surface. Traditional fingerprint recognition systems focus on the analysis and recognition of these impressions, either produced with ink on paper, or on the platen of a 2D touch-based live scanner.
Contactless technology, such as that presented in this work, does not operate on the impressions produced by the friction ridges, since there is no contact between these and a surface; instead, it recognises the friction ridges themselves. The 3D sensor described in the present work creates a model of the friction ridges which is later processed to perform the recognition. Therefore, strictly speaking, this is not a ''fingerprint'' recognition system, but rather a ''friction ridge'' recognition system. However, with the generalised use of fingerprints for personal identification since the early years of the 20th century, the term fingerprint has lost part of its original meaning and is now used interchangeably to refer to the impressions on a surface and to the anatomical friction ridges of the human body. Following this widespread trend in the literature, in the current article we will use the term ''fingerprints'' to refer to the ''friction ridges'' found on the fingertips.

II. RELATED WORKS
From the origin of automated fingerprint recognition technology in the early 1960s [6], there has been a huge economic and scientific investment in the development of live-scan 2D touch-based fingerprint sensors. This concentration of effort on 2D over 3D technology, was not so much a conscious decision, but the natural consequence of two determinants: 1) On the one hand, there was a pure technical reason. During the big boom of computer-based biometric recognition in the 1980s and 90s, 3D sensing technology was still in its infancy. It was not mature enough to acquire reliable models of such fine and detailed structures as friction ridges. 2) On the other hand, there was also a practical motivation to foster 2D contact-based sensors. This technology produces images that are compatible with the traditional ink-and-paper acquisition method applied from the beginning of the dactyloscopic science back in the 19th century [7]. Therefore, 2D touch-based technology was the natural way to move forward, without producing an undesirable compatibility gap between the old off-line procedures and the new live-scan technology.
However, in the present decade, massive progress has been achieved in the accuracy of 3D sensing and also in the computational capacity for the digital analysis of 3D models. This new reality has resulted in the development of a large variety of applications, graphical tools and algorithms for 3D data processing. We could rightfully say that we are currently witnessing the real blossoming of the 3D era. As a result, the technical limitations for the emergence of a new generation of 3D fingerprint sensors have been largely surmounted.
As an effect of this technical evolution, in the current state of the art in fingerprint recognition, we can find some research works as well as industrial applications that have advanced, to some extent, in the direction proposed by the present work.
Over the last few years, contactless technology has been rapidly gaining ground over traditional touch-based sensors, supported by two appealing features [8]: 1) it eliminates the variability introduced among fingerprint samples due to differences in applied pressure and skin condition; 2) it also avoids cleanliness issues such as ''ghost-fingerprints'' on the sensor platen, and even potential health problems derived from multiple individuals touching the same surface (a risk which has been made even more evident by the COVID-19 pandemic). As a result, today we can find a fairly wide range of finalised commercial touchless 2D fingerprint scanners from some of the top companies in the biometric industry such as IDEMIA, GEMALTO, NEC or TBS. In fact, already a few years back, the FBI (Federal Bureau of Investigation) certified the first two commercial contactless scanners for their use in the PIV program (Personal Identity Verification of Federal Employees). In addition to industry, there are also a number of ongoing research initiatives to further improve the contactless technology. One of these novel lines, which stands out due to the level of effort dedicated to it, is the development of specific algorithms for a new generation of finger-photo recognition systems designed to be seamlessly integrated in smartphones [9], [10]. Contactless 2D fingerprint recognition is even being considered at the moment for its future deployment at border crossings [11].
All previous efforts have led to the active involvement of the US NIST (National Institute of Standards and Technology) in the development of contactless fingerprint sensors through the CRADA technology transfer program (Cooperative Research and Development Agreements). As a result, NIST has published a series of technical reports addressing: 1) the usability of contactless fingerprint sensors [12]; 2) guidance for the evaluation of contactless readers [13]; and 3) the current level of accuracy difference and interoperability among contactless and contact-based systems [14]. Along similar lines, two reports promoted by the US Department of Justice compared the accuracy reached using samples acquired in a contact-based versus a contactless manner [15], [16]. Following the somewhat suboptimal results reported in these evaluations with regard to the level of compatibility of samples acquired with contactless and contact-based sensors, one of the most active areas of research in 2D contactless fingerprint recognition today is the development of algorithms to improve this interoperability [17], [18].
These initiatives show the growing interest that touchless fingerprint acquisition is attracting in the biometric community. It can be inferred from the strong institutional and private investment shown above that, in the not so distant future, a significant part of fingerprint applications will go touchless.
Some of the on-going touchless projects claim the acquisition of 3D fingerprints. While at some point of the recognition process there is, indeed, a volumetric representation of the fingertip, the vast majority of these works are based on acquiring multiple 2D pixel-based images of the finger, and then generating a 3D reconstruction from them. Very few of these works consider the direct acquisition of a full-3D fingerprint model consisting of vertices defined by their [x, y, z] spatial coordinates. In fact, the existing literature on 3D fingerprint recognition may be classified into three main strategies, according to the method applied to arrive at the 3D representation of the fingertip: 1) fingerprint 3D reconstruction from 2D images of the finger surface; 2) fingerprint 3D reconstruction from 2D images of the finger surface and inner layers; 3) direct fingerprint full-3D acquisition. Each of these strategies is discussed in the next paragraphs.
By far, the most followed strategy to obtain 3D representations of fingerprints is the estimation of the surface depth data from multiple 2D images. These works follow one of two types of acquisition-reconstruction techniques:
• Photometric stereo 3D reconstruction. Such methods estimate the shape of the fingerprint using multiple 2D images taken with variable lighting conditions, from a fixed viewpoint. This reconstruction technique assumes that the object (i.e., fingerprint) is illuminated only directly by the sensor light source [19]-[22].
• Stereo vision 3D reconstruction. This is currently the most widespread approach in 3D fingerprint recognition studies. In this case, multiple 2D images of the fingerprint are acquired simultaneously with two or more cameras. Corresponding points are later detected in the 2D images and used to estimate the 3D depth information, according to the triangulation principle [23]-[31].
Although this formula relying on ''reconstructed-3D data from multiple 2D samples'' has had some success, it only partially addresses the issue of the information loss derived from the initial acquisition of 2D images to generate the final volumetric model. Another challenge created by this type of approach is the reconstruction process itself, which is usually computationally expensive and adds an extra error-prone stage to the recognition chain.
The second strategy that can be found in the literature to obtain 3D fingerprint models follows a somewhat different principle: taking 2D images not only of the fingerprint surface, but also of internal skin layers. This approach was first followed using ultrasonic sensors in two works by different teams that were published simultaneously in 2008 [32], [33], and then, more recently, using the medical imaging technology known as Full-Field Optical Coherence Tomography (FF-OCT) [34]. In both cases, the raw scanner output consists of 2D images of transversal sections of the skin, so that the spatial z dimension may be extracted from the successive finger ''slices''. As in the works mentioned previously, the final fingerprint model is reconstructed from the set of 2D images and not directly acquired by the scanner. The ultrasound approach was restricted to the very preliminary works cited above, where two respective proofs-of-concept were presented [32], [33]. In spite of no further public research activity in this line, a company focused on mobile communications currently advertises the commercialisation of touch-based 3D fingerprint sensors utilising ultrasound technology, which have been integrated in existing smartphones [35]. As far as we know, the FF-OCT approach has had some further research visibility [36], [37], but is still in its experimental phase, without having yet produced a stable and reliable recognition system.
Finally, the third type of methods capable of producing 3D fingerprints includes those based on sensors that directly acquire 3D data. The system developed in the present work belongs to this category. To date, all the works in this group are based on scanners using Structured Light Illumination (SLI). The pioneering work that first suggested the possibility of obtaining a 3D model of a fingerprint following this technique was published in 2005 [38]. The same team further developed the acquisition method in subsequent works [39], [40]. Structured-light 3D scanners such as the one built in the works above project successive light patterns of different frequencies on the target. A fixed camera looks at the shape of each single pattern and computes the distance of every point in the field of view according to its deformation with respect to the original pattern. The three main challenges of SLI technology for 3D fingerprint scanning are: 1) SLI-based scanners very often encounter difficulties handling translucent materials, such as skin and human tissue, because of the phenomenon of subsurface scattering; 2) the spatial resolution achieved by current SLI-based technology is significantly lower than that of laser triangulation-based scanners like the one used in the present work (see Sect. IV); 3) the object is analysed as a whole for each different pattern projected; therefore, a perfectly motionless acquisition is required to avoid misalignments in the data modeled by each pattern. Insufficient spatial accuracy combined with small movements of the finger can result in noisy samples for very detailed structures like the friction ridge. The researchers leading the first works in this field presented initial recognition results in 2010 using their SLI scanner, on a database of 11 users and 441 3D fingerprint samples [41]-[43]. Two other smaller case-studies following the SLI acquisition technique were presented more recently by different research teams [44], [45].
All the valuable initiatives mentioned in this section show the willingness of the biometric community to search for alternatives that can advance traditional 2D touch-based acquisition. However, according to the reports presented so far, although 2D contactless fingerprint technology is rapidly gaining ground, it has not yet reached the reliability and accuracy standards of state of the art 2D touch-based systems [14]-[16]. Furthermore, as already mentioned, most existing methods do not consider the direct acquisition of full-3D fingerprint models, but the reconstruction of volumetric fingerprints from 2D images. The current work represents an ambitious step forward in the field of fingerprint recognition, moving not only to a touchless acquisition scenario, but also to the real non-reconstructed three-dimensional space.
At the end of the article, in Sect. X-B, the reader can find a comparison of the most relevant works in the state of the art dealing with the recognition of 3D fingerprints (see Table 3).

III. 3D-FLARE: FULL-3D FINGERPRINT LASER RECOGNITION SYSTEM
''The primary challenge in a biometric recognition system is to design a suitable sensor, feature representation scheme, and similarity measure to minimise the recognition errors.'' This quote by Anil K. Jain et al., appears in a recent review article where the authors summarised, in their expert view, the ''core research challenges in biometrics'' [46]. No other statement could condense in a more precise manner the task undertaken in the present work: The design of a complete new biometric system almost from scratch.
The 3D fingerprint recognition system described in this article has been triggered by a new ground-breaking prototype sensor capable of acquiring highly accurate full-3D finger models in a touchless, fast, reliable and repeatable fashion.
Finger 3D models acquired by this sensor are intrinsically different in nature from the 2D images used by existing fingerprint recognition technology. Consequently, previous processing and feature extraction methods devised for 2D fingerprint systems are, for the most part, not applicable to the new data. The novel scanner is, therefore, the catalyst in a domino effect which has led to the development of a number of new algorithms to process the 3D data, in order to reliably extract and compare its most salient features.
The full system is depicted in Fig. 1, from the acquisition stage to the computation of the final similarity score. From an overall perspective, the system follows, to a large extent, the phases of classical fingerprint recognition systems. However, the individual methods used in each of the stages substantially differ from those considered in traditional 2D technology. For instance, no minutiae detection is involved in the comparison process. Each of the stages highlighted in Fig. 1 is described over the next sections.
One of the biggest challenges posed by the new 3D data is that, in addition to a possible displacement (translation) of the finger in the three x, y and z axes, the finger can also be rotated according to any of the three angles roll, pitch and yaw, as shown in Fig. 2. This spatial variability of the models needs to be corrected prior to their comparison, so that the final similarity score is produced using aligned samples. The system has been designed to be robust to this variability through: 1) hardware measures integrated in the acquisition sensor that limit the spatial freedom of the subject to place the finger and 2) the development of processing algorithms capable of compensating small differences in the finger positioning (as specified in the system stages shown in Fig. 1).
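The six degrees of freedom mentioned above (three translations and the roll, pitch and yaw angles) can be illustrated with a small Python sketch that applies and then undoes a rigid pose on a point cloud. This is only an illustration of the geometry, not the alignment algorithm of the system: the axis convention (roll about x, pitch about y, yaw about z, composed as Rz·Ry·Rx) and the function names are assumptions, since the convention of Fig. 2 is not stated in the text.

```python
import numpy as np

def rotation_matrix(roll, pitch, yaw):
    """Return R = Rz(yaw) @ Ry(pitch) @ Rx(roll); angles in radians.

    Assumed convention: roll about x, pitch about y, yaw about z.
    """
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    rx = np.array([[1.0, 0.0, 0.0], [0.0, cr, -sr], [0.0, sr, cr]])
    ry = np.array([[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]])
    rz = np.array([[cy, -sy, 0.0], [sy, cy, 0.0], [0.0, 0.0, 1.0]])
    return rz @ ry @ rx

def apply_pose(points, roll, pitch, yaw, translation):
    """Rotate, then translate, an (N, 3) point cloud: p' = R p + t."""
    r = rotation_matrix(roll, pitch, yaw)
    return np.asarray(points, dtype=float) @ r.T + np.asarray(translation, dtype=float)

def undo_pose(points, roll, pitch, yaw, translation):
    """Invert apply_pose: p = R^T (p' - t)."""
    r = rotation_matrix(roll, pitch, yaw)
    return (np.asarray(points, dtype=float) - np.asarray(translation, dtype=float)) @ r
```

In practice the pose of a finger is unknown and has to be estimated from the data; the sketch only shows that, once estimated, the six parameters are sufficient to bring two samples into a common reference frame.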

IV. STAGE 1: TOUCHLESS FULL-3D FINGERPRINT ACQUISITION BASED ON LASER TRIANGULATION
''To overcome some of the constraints currently affecting biometric systems, the design of novel sensors is required''. This quote has been extracted from a 2019 paper by Ross et al., reflecting upon some of the fundamental issues that still need to be addressed in biometrics [47]. The statement reinforces the idea already expressed by Jain et al. in their 2016 paper that reviewed 50 years of automated biometric systems [46]. In that article, the authors gave their vision of the future in biometrics and expressed their conviction that ''the improvement in sensors will mitigate the intra-subject variations caused by sensor limitations to a large extent. [...] The development of novel sensors can be expected to further push the limits on quality, usability, and cost.'' These two claims, by some of the most authoritative voices in biometrics, perfectly summarise the motivation behind this first stage of the system, whose main objective is to develop the first fingerprint sensor capable of directly acquiring high resolution full-3D models of the finger in a touchless manner, thus avoiding any reconstruction process from 2D images.
The advantages of such an unprecedented scanner may be summarised as follows: 1) the resulting 3D finger models do not exhibit the level of elastic deformation present in 2D fingerprint images produced by standard touch-based scanners; 2) they also do not present the level of noise caused by touch-based technology due to changes in the skin condition (moisture/dryness); 3) they are also unaffected by changes in the illumination conditions, contrary to images captured by 2D contactless sensors relying on traditional imaging technology, since the final result is not a pixel matrix representing light intensity levels, but a point cloud of spatial coordinates [x, y, z]; 4) like 2D contactless sensors, the scanner avoids the cleanliness and potential health problems derived from touch-based readers (clearly highlighted by the COVID-19 pandemic), also eliminating the issue of the so-called ''ghost-fingerprints''.
The key factor to be taken into account for the development of such a sensor is that fingerprint ridges are very fine physical structures with a width/depth in the range of 0.1-0.3 millimeters [48]. The acquisition of an accurate full-3D model of this small dimension constitutes a challenging engineering problem. The task becomes even more difficult considering that a finger is a living object that cannot be kept fully still.
Among the different 3D scanning technologies available in the market, contactless 3D active sensors based on laser triangulation present the highest spatial accuracy, down to the range of a few microns, making them a perfect fit for the acquisition of fingerprints. The main drawback of these scanners is their restricted range of operation (i.e., distance between the target and the sensor), limited to only several centimeters if very high resolution is required. However, this is not a significant constraint in the specific case of 3D fingerprint scanning, as the finger can be placed as close to the sensor as needed.
In light of the discussion above, the prototype 3D fingerprint acquisition device assembled for the present work (shown in Fig. 3) is a contactless active scanner based on the triangulation principle. The scanner uses a line projection laser diode to illuminate the target (i.e., finger). A fast CMOS camera captures, from an angle, the light reflected on the finger. This way, using triangulation, the shape of the line imaged on the sensor can be directly related to the shape of the finger along the laser line.
The scanning process is fairly simple. The camera and the laser diode form a triangle (as shown in panel (e) of Fig. 3). The length L between the camera and the laser diode is known. The laser is perpendicular to the ground. The angle of the camera with respect to the ground is also known, in this case 45°. These three pieces of information (i.e., length L, angle of the laser diode and angle of the camera) determine the shape and size of the triangle and give the location of the points in the finger segment illuminated by the laser line.
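The triangle described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the calibration procedure of the prototype: it assumes that the baseline L is perpendicular to the laser beam, that the camera behaves as an ideal pinhole, and that a positive pixel offset tilts the line of sight away from the baseline. The function names and the numeric values for the baseline and pixel pitch are hypothetical placeholders.

```python
import math

def range_along_laser(baseline_mm, sight_angle_deg):
    """Distance from the laser diode to the illuminated point, along the beam.

    Assumes a right angle between the baseline and the laser beam, so that
    tan(sight_angle) = range / baseline, where sight_angle is measured at the
    camera between the baseline and the line of sight to the point.
    """
    return baseline_mm * math.tan(math.radians(sight_angle_deg))

def sight_angle_deg(pixel_offset, pixel_pitch_mm, focal_length_mm,
                    optical_axis_deg=45.0):
    """Line-of-sight angle for a pixel offset from the principal point.

    Pinhole model; the sign convention of pixel_offset is an assumption.
    """
    delta = math.atan2(pixel_offset * pixel_pitch_mm, focal_length_mm)
    return optical_axis_deg + math.degrees(delta)

# Hypothetical values: 100 mm baseline, 0.0055 mm pixel pitch, 12 mm lens.
on_axis = range_along_laser(100.0, sight_angle_deg(0, 0.0055, 12.0))
off_axis = range_along_laser(100.0, sight_angle_deg(10, 0.0055, 12.0))
print(on_axis, off_axis)
```

With the optical axis at 45°, a point imaged at the principal point lies at a range equal to the baseline, and small shifts of the laser line on the sensor translate into small changes in range, which is what makes the height profile of the ridges measurable.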
As shown in the three pictures of the top row in Fig. 3, the laser 3D fingerprint scanning prototype built for this research is composed of the elements described in the following paragraphs.
Laser diode. The optical properties of human skin change with light wavelength. For longer wavelengths in the visible spectrum, i.e. red to deep red colors (650nm and over), the skin presents a non-negligible absorption coefficient. This means that an illuminated spot is observed not only through the light directly reflected by the surface, but also as a glowing area around the specific point. This is caused by the light transmitted into the inner tissue, which is reflected back after being internally refracted in deeper layers (i.e., the subsurface scattering phenomenon). This effect translates into inaccurate readings of the skin surface. For shorter wavelengths, i.e. green-blue (450nm-550nm), the light absorption coefficient of the skin decreases to almost zero. With this in mind, an active light with a shorter wavelength is better suited for the acquisition of the friction ridge, located in the epidermis. As a result, we selected a 10mW StingRay Laser Diode emitting green light at a wavelength of 514nm.
Camera. The fingerprint scanning prototype uses a 3D-enabled camera manufactured by PhotonFocus. The CMOS sensor in the camera presents a resolution of 2048 × 1088 pixels and is connected internally with high-speed electronics which compute the position of the line projected by the laser through the triangulation process described previously. The camera is capable of acquiring up to 1000 frames/sec; however, the higher the acquisition speed, the more reflected light is required in order to obtain a high precision model of the finger. In essence, the higher the acquisition speed, the higher the required power of the illumination source (i.e., laser diode). Given that, for safety reasons, a very low power laser of just 10mW was selected, the camera sampling frequency was set to 600 frames/sec.
Even with the selection of a short wavelength green-light laser, a very restricted amount of subsurface scattering can still be present in the finger readings. To minimise the possible distortion caused by this phenomenon, the sensor only considers the average peak from all the reflected light, which corresponds to a hypothetically perfect line illuminating the finger.
Optics. The optical system coupled to the camera consists of a high quality lens with a focal length of 12mm and a band-pass filter matching the active 514nm wavelength of the laser, in order to optimise the signal-to-noise ratio. The filter eliminates illumination sources (i.e., noise) outside the wavelengths covered by its band of operation, that could potentially affect the reading of the camera.
Fixing support. Both the camera and the laser diode are mounted to a blue plastic arm designed and 3D-printed in our lab (see pictures (a) and (b) in Fig. 3). The function of this fixing support is to ensure that the laser and the camera are held at a constant distance L, forming an angle of 45° between the scanning illumination axis and the optical axis (see pictures (d) and (e) in Fig. 3).
The key parameter of the prototype scanner is its spatial accuracy. In order to optimise it, the complete structure, including the camera and the laser mounted to the fixing support, is calibrated so that each pixel in the CMOS sensor represents an absolute position [x, y, z] along the laser-projected plane.
As a minor note, please bear in mind that, for simplicity of the prototyping process, the fixing support has been produced in plastic. Therefore it can undergo minimal geometrical transformations due to changes in the room temperature causing the expansion/contraction of the material. However, considering that the sensor is kept in a laboratory with a variation of ±5 °C, the potential impact of the thermal expansion coefficient on the final readings is negligible.
Translation stage + motor controller. The translation stage moves the fixing support, holding the laser diode and the camera, at a constant speed in order to scan the full length of the finger. The movement is handled by a motor controller that receives commands from the acquisition software.
The translation stage presents a maximum scanning range of 5cm from the starting to the finishing position (as shown in picture (e) of Fig. 3). This way, in order to acquire the full fingerprint, from the tip to the joint of the 3rd phalange, the finger has to be inserted less than 5cm. Typically, in order to avoid acquisition errors, only the 2nd and 3rd phalanges should be introduced in the scanner through the hole in the protection box.
The stage can vary its speed up to 50mm/sec, covering its complete acquisition range in just one second.
Protection box. The external orange box that can be seen in the bottom row of Fig. 3 was designed and built specifically for the project in our lab and it fulfils two main goals. 1) Firstly, it serves as a safety measure for the laser diode. The box is made of plexiglass panels that absorb (i.e., filter out) the light emitted by the laser (same operation principle as laser protection goggles). It should be noted that the 10mW laser belongs to the very low range of laser class IIIB, which comprises the power range [5mW-499mW]. Lasers below 5mW are considered eye-safe to be used as pointers. For class IIIB, protection goggles are only suggested (neither recommended nor mandatory). 2) Secondly, the box also serves as a guide for the correct positioning of the finger both in height and direction (see pictures (e) and (f) in Fig. 3). In order to obtain an accurate 3D reconstruction, it is critical that the finger is placed approximately at the height where the laser and camera axes intersect, so that it falls roughly in the center of the field of view of the camera (please see the central picture of the bottom row in Fig. 3). To this aim, a finger-guide built above the insertion hole helps the user to correctly place the finger in the scanner so that a high quality model is produced.
The finger-guide built as part of the protection box is essential to restrict the spatial displacement and rotation among finger samples. As explained in Sect. III, this is one of the biggest challenges that has been addressed in the development of the 3D fingerprint recognition system. The finger-guide does not completely eliminate the spatial variability, but it restricts it to a level that can later be compensated by the processing algorithms in the remaining phases of the system. Roughly, all fingers are captured at the same height, facing down, perpendicular to the laser, along the scanning direction of the translation stage.
Acquisition software. A specific software application was developed in order to: 1) select and regulate the scanning speed of the translation stage; 2) minimise acquisition mistakes by automatically controlling the finger sequence followed for the generation of the 3D-FLARE DB (please see Sect. IV-A for further details on the DB); 3) automatically store the captured files with the correct naming convention.
The sensor produces files in PLY format, which is a standard representation of point cloud objects. In these files, the finger is modeled by a N × 3 matrix, where each of the N rows is a point defined by its spatial coordinates [x, y, z]. The points are uniformly distributed over a rectilinear grid in the x and y axes with an almost constant resolution (negligible variations may exist in the size of the grid step due to the scanner precision).
The resolution in the x dimension (transversal to the finger) according to our scanner configuration is approximately 0.03mm. The depth resolution in the z direction is up to 0.008mm. The sensor moves longitudinally to the fingerprint along the y dimension. As such, the spatial resolution in y depends on the scanning speed (i.e., speed of the translation stage) and the sampling rate of the camera (600 frames/sec). Samples were acquired at three different speeds of the translation stage: 10mm/sec, 30mm/sec and 50mm/sec, which in turn resulted in three different resolutions in the y dimension: 0.017mm, 0.05mm and 0.08mm. This will enable us, in the experimental evaluation, to analyse the effect of the acquisition resolution on the accuracy of the new recognition system.
Just as a reference, a typical FBI-certified optical touch-based sensor producing 2D images operates at 500ppi resolution, which translates into a spatial accuracy of 0.05mm in the x and y axes (no information available in the z dimension). This resolution would be similar to that of the 3D sensor operating between 30mm/sec and 50mm/sec. For the slowest scanning speed, that is, 10mm/sec, the 3D sensor would present an equivalent 2D resolution of approximately 1500ppi.
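The relation between scanning speed, camera frame rate and resolution in the y dimension can be checked with a short calculation. The sketch below only uses the figures quoted in the text (600 frames/sec and the three stage speeds); the function names are illustrative.

```python
FRAME_RATE_HZ = 600.0   # camera sampling rate stated in the text
MM_PER_INCH = 25.4


def y_step_mm(speed_mm_s: float, frame_rate_hz: float = FRAME_RATE_HZ) -> float:
    """Distance travelled by the stage between two consecutive laser profiles."""
    return speed_mm_s / frame_rate_hz


def equivalent_ppi(step_mm: float) -> float:
    """Points per inch corresponding to a given sampling step."""
    return MM_PER_INCH / step_mm


for speed in (10.0, 30.0, 50.0):
    step = y_step_mm(speed)
    print(f"{speed:4.0f} mm/s -> {step:.4f} mm per profile, ~{equivalent_ppi(step):.0f} ppi")
```

The steps come out at roughly 0.017mm, 0.05mm and 0.083mm, that is, about 1524ppi, 508ppi and 305ppi, which is consistent with the comparison against a 500ppi touch-based sensor.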
Depending on the acquisition speed and the size of the finger, the resulting raw PLY files weigh approximately 18-24 Mbytes at 10mm/sec acquisition speed, 7-9 Mbytes at 30mm/sec, and 3-5 Mbytes at 50mm/sec.
The previous resolutions in the x, y and z axes are theoretical optimal values that can be obtained when scanning perfectly still objects. However, it is not possible to keep a finger completely motionless. Therefore, in order to cope with this motion factor, the computation of the spatial values of the final point cloud assumes that the finger presents a smooth surface with no discontinuities. Whether or not the sensor is capable of suppressing the effect of the finger movement, in order to produce 3D models that are accurate enough to be used reliably for personal authentication, can only be determined through experimental evaluation. The protocol and results of such assessment are described in Sect. X.
A. 3D FINGERPRINT LASER RECOGNITION DATABASE: 3D-FLARE DB
Using the prototype contactless 3D sensor described in the previous section, we have acquired a new 3D fingerprint database in order to: 1) test the reliability and consistency of the sensor; 2) devise specific processing methods for 3D fingerprints; and 3) evaluate the accuracy of the complete 3D fingerprint recognition system developed in this work.
The 3D Fingerprint LAser Recognition DB (3D-FLARE DB) contains the index and middle fingers of both hands from 50 subjects, that is, 200 different fingers. All subjects are Caucasian adults between 28 and 55 years of age, computer-based workers, comprising 40 men and 10 women. The acquisition was conducted in a standard office-like environment with no specific control over illumination. Volunteers were given the option of sitting on a revolving chair in front of the sensor or standing up, depending on which position felt more comfortable for them.
Each finger was acquired 15 times: five samples at a speed of 50mm/sec (fastest speed allowed by the translation stage), five samples at 30mm/sec and five samples at 10mm/sec. In order to simulate a more realistic acquisition scenario, samples of the same finger were not captured consecutively. The scanning protocol followed by each subject was: left middle, left index, right index and right middle at 50mm/sec, same sequence at 30mm/sec, same sequence at 10mm/sec; repeat all five times. This way, after each single sample acquisition, the subject had to remove the finger and introduce the next one in the sequence. This process was defined to ensure sufficient spatial variability among samples from the same finger. The acquisition of all 60 samples (4 fingers × 5 samples × 3 speeds) of the same subject took around 10 minutes. In order to minimise mistakes in the acquisition protocol, the specific software described in Sect. IV automatically controlled the finger sequence, the scanning speed and the file storage.
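The acquisition order described above can be written down as a short enumeration. This is only an illustrative sketch of the protocol as stated in the text, not the actual acquisition software; all names are hypothetical.

```python
from itertools import product

FINGERS = ("left middle", "left index", "right index", "right middle")
SPEEDS_MM_S = (50, 30, 10)   # fastest to slowest, as in the protocol
REPETITIONS = 5
SUBJECTS = 50


def acquisition_sequence():
    """Ordered list of (repetition, speed, finger) tuples for one subject."""
    # product() varies the last factor fastest: all four fingers at one speed,
    # then the next speed, and the whole block is repeated five times.
    return list(product(range(1, REPETITIONS + 1), SPEEDS_MM_S, FINGERS))


sequence = acquisition_sequence()
print(len(sequence), "samples per subject,", SUBJECTS * len(sequence), "in total")
```

The enumeration gives 60 samples per subject and 3,000 samples for the 50 subjects, and no two consecutive acquisitions involve the same finger.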
As a result of the previous process, the database comprises a total of 3,000 full-3D finger models (i.e., point clouds). The structure of the database is represented in Fig. 4, where four examples of typical raw 3D finger models are shown. All four fingers correspond to the same subject. Different viewpoints have been used to depict each of the samples in order to better illustrate the three-dimensional nature of the models.
Due to data protection legislation, at the time of publication of the present paper, we are not able to release the database to the public for research purposes. The distribution of the database may be achieved in the mid-term future. For the time being, as part of the article submission, the interested reader can access the following additional multimedia material: 1) a video showing the database acquisition protocol, the sensor in operation and some sample 3D fingerprints; 2) the data corresponding to two subjects in the database. These sample data include both the raw samples in PLY format acquired at all three scanning speeds, and the processed data in MATLAB format used for recognition as described in the next sections of the article.

V. STAGE 2: 3D FINGERTIP SEGMENTATION BASED ON THE FINGER CURVATURE
The raw data captured by the 3D sensor corresponds to the whole length of the finger inserted within the scanning range. This raw model typically includes not only the fingertip, but also part of the second phalange, as shown on the left of Fig. 5. Therefore, the objective of this first processing stage is to segment the part of the acquired model corresponding exclusively to the fingertip (i.e., third phalange).
The novel segmentation method takes advantage of the depth information contained in 3D models. In particular, it is based on the curvature of the finger, a feature that cannot be extracted from the flat images of fingerprints used by traditional 2D systems. The curvature has already been considered for recognition purposes [28], showing limited discriminative capacity, with a best Equal Error Rate (EER) of around 15%. However, in the present work it has proven to be a very valuable characteristic for the segmentation task at hand. The 3D finger model captured by the prototype scanner is a smooth continuous surface defined by z over a rectilinear grid in the x and y coordinates. The fingertip is contained in a rectangle limited by $[x_{\min}, x_{\max}]$ and $[y_{\min}, y_{\max}]$ that comprises the z info corresponding only to the 3rd phalange of the finger. The objective of the segmentation process is to find those four limits.
Limits in the y axis (i.e., longitudinal to the finger). The key task of the whole segmentation process is to locate the limit $y_{\max}$. This parameter will be referred to as the ''End-of-Fingerprint'' point (EoF), and is defined by the joint of the 2nd and 3rd phalanges. Once the EoF is determined, the other three limits (i.e., $y_{\min}$, $x_{\min}$ and $x_{\max}$) will be derived from it.
In order to retrieve the EoF, a two-step method has been developed (depicted inside the dashed square in Fig. 5). In Step 1, a longitudinal section of the finger, $z(x_m, y)$, is smoothed using a 40-point moving average filter. The resulting smoothed section is downsampled in order to take only 30 equidistant points, including the first and last. This smoothed, downsampled section will be referred to as $z_{sd}(x_m, y)$. In order to obtain its curvature, the second derivative with respect to the y dimension is computed, that is, $\partial^2 z_{sd}(x_m, y) / \partial y^2$. The EoF point, $z(x_m, y_{EoF})$, plotted in blue in the top right panel of Fig. 5 and also in Fig. 6, coincides with the minimum of this curvature function.
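The EoF search can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the edge padding used by the moving average, the exclusion of the two end points from the search and the function name `find_eof` are choices made here.

```python
import numpy as np


def find_eof(z_section, y, window=40, n_points=30):
    """Return the y position where the curvature of a longitudinal section is minimal."""
    z_section = np.asarray(z_section, dtype=float)
    y = np.asarray(y, dtype=float)

    # Moving average; edge padding (an assumption) keeps the output length unchanged.
    before = window // 2
    after = window - 1 - before
    padded = np.pad(z_section, (before, after), mode="edge")
    smoothed = np.convolve(padded, np.ones(window) / window, mode="valid")

    # Keep n_points equidistant samples, including the first and the last.
    idx = np.linspace(0, len(y) - 1, n_points).round().astype(int)
    y_sd, z_sd = y[idx], smoothed[idx]

    # Curvature approximated by the second derivative with respect to y.
    curvature = np.gradient(np.gradient(z_sd, y_sd), y_sd)

    # The end points rely on one-sided differences, so they are skipped (an assumption).
    k = 1 + int(np.argmin(curvature[1:-1]))
    return float(y_sd[k])


# Synthetic section: flat profile with a bump centred at y = 21 mm.
y = np.linspace(0.0, 30.0, 1000)
z = np.exp(-((y - 21.0) / 2.4) ** 2)
print("estimated EoF at y =", round(find_eof(z, y), 2), "mm")
```

Because only 30 samples are kept, the estimate is quantised to the downsampled grid, which is about 1mm in this synthetic example.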
Once the limit $y_{\max}$, i.e., EoF, has been set, the limit $y_{\min}$ is defined as $y_{\min} = 0.1L$, where L is the length of the fingertip between the first scanned point in the longitudinal axis and EoF. The very first scanned point is not taken as the initial point of the fingertip because, towards the edges, the laser illuminates the finger with an angle that diverges significantly from the perpendicular, losing accuracy.
Limits in the x axis (i.e., transversal to the finger). The $x_{\min}$ and $x_{\max}$ limits will be close to the sides of the fingertip. As before, the absolute maximum and minimum values in x are not taken as limits given that the sensor loses accuracy towards the edges of the finger. For that reason, both limits are set slightly inwards from the sides of the fingertip. The resulting 3D data of this segmentation process is a rectangular section of the fingertip, of dimension [0.9L × 0.95W], where W denotes the width of the fingertip. This section is taken along the central longitudinal axis of the finger and therefore compensates small yaw angles between the finger orientation and the scanning direction, as depicted in Fig. 6.

VI. STAGE 3: DETACHMENT OF THE FINGERPRINT FROM THE FINGERTIP
In this stage, the goal is to make the system robust to: 1) possible variations in the height of the finger with respect to the laser diode during the acquisition process; 2) small rotations according to the pitch and roll angles (see Fig. 2). For this purpose, we separate the friction ridges, comprising the distinctive identity information of the subject, from the rest of the finger volume which has already proven to be ineffective for recognition purposes [28].
As explained in Sect. I-A, the fingerprint (or friction ridges) is the outermost part of the fingertip. It is formed in the epidermis skin layer that could potentially be ''peeled off'' from the rest of the fingertip, keeping the discriminative information contained in the finger. Making an analogy with communication theory, this case would be similar to an AC signal with a DC level that carries the information in its amplitude. By filtering out the DC level (finger volume) and considering only the AC signal (fingerprint), none of that information is lost.
The overall underlying shape of the finger (DC level) is determined for each point in the fingertip using a low-pass mean filter of size 50 × 50. Then, the fingerprint is ''detached'' simply by subtracting this mean value from the original full fingertip. This detachment process is depicted in Fig. 7. The resulting fingerprint is shown on the right from two different perspectives: (top) from an angle that allows the depth information of the 3D model to be distinguished; (bottom) from the zenithal perspective, which helps to discern the ridge pattern the way we are accustomed to in 2D images.
As a result of the previous process, all fingerprints are normalised in the z axis. Also, small rotations of the finger in the pitch and roll angles during acquisition are roughly converted into simple translations in the final detached fingerprint (these translations are later compensated in stage 4 of the system, as will be described in Sect. VII).
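The detachment step amounts to subtracting a local mean from the depth matrix. The following sketch illustrates it with NumPy; the edge padding at the borders and the integral-image implementation of the 50 × 50 mean filter are assumptions made for this example, and `detach_fingerprint` is a hypothetical name.

```python
import numpy as np


def box_mean(z, size=50):
    """Mean of z over a size x size window, computed with an integral image."""
    before = size // 2
    after = size - 1 - before
    zp = np.pad(z, ((before, after), (before, after)), mode="edge")  # border handling is an assumption
    ii = np.zeros((zp.shape[0] + 1, zp.shape[1] + 1))
    ii[1:, 1:] = zp.cumsum(axis=0).cumsum(axis=1)
    h, w = z.shape
    window_sum = (ii[size:size + h, size:size + w] - ii[:h, size:size + w]
                  - ii[size:size + h, :w] + ii[:h, :w])
    return window_sum / (size * size)


def detach_fingerprint(z, size=50):
    """Remove the finger volume (local mean) and keep only the ridge relief."""
    z = np.asarray(z, dtype=float)
    return z - box_mean(z, size)


# Synthetic fingertip: a smooth dome plus fine sinusoidal "ridges".
yy, xx = np.mgrid[0:200, 0:150]
dome = -((xx - 75) ** 2) / 500.0
ridges = 0.05 * np.sin(2 * np.pi * yy / 10.0)
detached = detach_fingerprint(dome + ridges)
print("mean height after detachment:", float(np.round(detached[50:150, 50:100].mean(), 4)))
```

Adding a constant height offset to the input leaves the output unchanged, which is the normalisation in the z axis mentioned in the text.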
All detached fingerprints of one subject in the 3D-FLARE DB, acquired at 10mm/sec, are shown in Fig. 8 from the zenithal viewpoint. The visual similarity among samples of the same fingers is a first positive indication of the consistency of the first three stages of the recognition system described to this point (i.e., acquisition, fingertip segmentation and fingerprint detachment).
The fingerprints produced as output in this stage, constitute the system templates. In a real system, these templates would be stored as reference for future comparison with the probe samples. Their size is around 1000Kb for acquisition speed 10mm/sec, 350Kb for 30mm/sec and 150Kb for 50mm/sec.

VII. STAGE 4: ALIGNMENT AND CROPPING
The inputs to stage 4 are two ''detached'' 3D fingerprints as produced after stage 3, which correspond to: 1) the probe fingerprint 3D model (i.e., probe template); and 2) the reference fingerprint 3D model stored in the database of the recognition system (i.e., reference template). The two models are represented by the vertical solid arrow and the dashed horizontal arrow entering stage 4 in Fig. 1.
Prior to their comparison, these two fingerprints need to be aligned, in order to determine the overlapping surface between the two. This shared surface will be referred to as Region of Interest (ROI). It contains the shared information of both fingerprints, which will be extracted in stage 5 of the system (see Sect. VIII) and will later be utilised to generate the final similarity score in stage 6 (see Sect. IX).
The alignment and cropping method is shown in Fig. 9, with all the 3D models involved in the process depicted from the zenithal perspective.
Given that the sampling resolution of the acquisition system in the x and y coordinates is monotonically increasing and almost constant, the detached fingerprints can be considered as structured 3D data following a rectangular mesh (i.e., they are represented by a depth matrix). As such, they can be treated as 2.5D ''images'' where each equidistant point may be regarded as analogous to a pixel in a 2D image. However, the ''2.5D-pixel'' values are not integer numbers representing an illumination intensity level, but real values giving the depth in the z axis at that specific point. Given that the nature of the acquisition technology is intrinsically different to that of traditional 2D imaging (see Sect. IV), the ''2.5D-pixel'' depth values are robust to external illumination conditions, contrary to what occurs for pixels in standard 2D images.
In light of the previous discussion, it is justified to anticipate that existing methods for image registration can be efficient tools for the alignment of the detached fingerprints [49]. Historically, the so-called area-based methods, sometimes called correlation-like methods or template matching, were the first image registration approaches to be systematically analysed and still remain among the most efficient algorithms for this purpose, provided that certain conditions are met [50].
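The step from an N × 3 point cloud to a depth matrix can be sketched as follows. The example assumes that the x and y coordinates lie exactly on a rectilinear grid (the text notes that negligible variations may exist in practice, which would require rounding or resampling first); `to_depth_matrix` is a hypothetical helper name.

```python
import numpy as np


def to_depth_matrix(points):
    """Arrange an N x 3 array of [x, y, z] points into a 2.5D depth image.

    Rows follow increasing y and columns follow increasing x.
    Grid positions without a measurement are left as NaN.
    """
    points = np.asarray(points, dtype=float)
    x_axis, xi = np.unique(points[:, 0], return_inverse=True)
    y_axis, yi = np.unique(points[:, 1], return_inverse=True)
    depth = np.full((len(y_axis), len(x_axis)), np.nan)
    depth[yi, xi] = points[:, 2]
    return depth, x_axis, y_axis


# Tiny synthetic cloud: 3 profiles (y) of 4 points (x) each, in arbitrary order.
gx, gy = np.meshgrid(np.arange(4) * 0.03, np.arange(3) * 0.05)
cloud = np.column_stack([gx.ravel(), gy.ravel(), (gx + 10 * gy).ravel()])
depth, x_axis, y_axis = to_depth_matrix(cloud[::-1])
print(depth.shape)
```

Each entry of the resulting matrix plays the role of a ''2.5D-pixel'' whose value is a real-valued depth rather than an intensity.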
Area-based methods deal with images without attempting to detect salient objects/features. Windows of predefined size or even entire images are used for the correspondence estimation. These methods have shown very good performance for problems that meet the three conditions described in the next paragraphs.
First, images should be acquired under controlled illumination and sensing conditions. Classical area-based methods like cross-correlation directly exploit image intensities for registration, without any structural analysis or detection of singular local points. Consequently, they are sensitive to intensity changes introduced, for instance, by noise, varying illumination, and/or the use of different sensor types.
Each ''pixel'' value in the detached fingerprints does not represent an illumination intensity, but the real depth in the z axis. Furthermore, all samples have been acquired with the same sensor. Therefore, we can safely consider that this first requirement is met, since there should not be any significant variability among samples due to illumination or sensor changes.
Second, images to be registered should differ solely by a translation. The rectangular window, which is most often used in area-based methods, is not able to cover the same parts of the scene in the reference and probe images if they are noticeably deformed by more complex transformations than simple translations (e.g., elastic deformation of the object). As mentioned before, the system is significantly robust to small rotations in any of the three angles shown in Fig. 2, due to the combination of: 1) the finger placement restriction enforced by the finger guide during the acquisition process; 2) the fingertip segmentation process performed in stage 2 (see Sect. V); and 3) the fingerprint detachment process followed in stage 3 (see Sect. VI). As a result of the joint effect of the three processes, small finger rotations can be safely approximated by translations in the detached fingerprints. It should also be recalled that the touchless acquisition sensor developed in stage 1, avoids the introduction of elastic deformation in the 3D finger models, such as the one that affects 2D images captured using traditional live-scan touchbased technology.
Third, images should not present large areas with low distinctiveness. Correlation-like methods present a non-negligible probability that a window in the probe image containing a smooth area without any prominent details will be matched incorrectly to a different smooth area in the reference image, as a result of their mutual non-saliency.
Given the type of data considered in the present problem, i.e., 3D fingerprint models, it is highly unlikely that such smooth patches can be found in the samples, unless some acquisition error has occurred. By definition, the 3D models represent the ridge structure of the fingerprint, which is a succession of ridges and valleys with no plateaus or flat areas.
In light of the explanations given in the previous paragraphs, the three conditions for the efficient application of area-based image registration methods, are met to a high degree in the fingerprint alignment problem addressed in this stage of the system. Therefore, as depicted in Fig. 9, the optimal registration between the two input models is obtained computing their cross-correlation and taking the alignment that generates the maximum value. Then, the ROI is finally cropped as the corresponding overlapping surface between the two aligned input fingerprints.
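A translation-only alignment by cross-correlation, followed by cropping of the overlapping ROI, can be sketched as below. This is an illustrative brute-force version under assumptions not stated in the text: integer shifts only, a bounded search range `max_shift`, and a zero-mean normalised correlation score so that overlaps of different sizes are comparable.

```python
import numpy as np


def overlap_views(ref, probe, dy, dx):
    """Overlapping parts of ref and probe when probe[i, j] is placed at ref[i + dy, j + dx]."""
    r0, r1 = max(0, dy), min(ref.shape[0], probe.shape[0] + dy)
    c0, c1 = max(0, dx), min(ref.shape[1], probe.shape[1] + dx)
    if r1 <= r0 or c1 <= c0:
        return None
    return ref[r0:r1, c0:c1], probe[r0 - dy:r1 - dy, c0 - dx:c1 - dx]


def align_and_crop(ref, probe, max_shift=10):
    """Find the shift maximising the normalised cross-correlation and crop the ROI."""
    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            views = overlap_views(ref, probe, dy, dx)
            if views is None:
                continue
            a = views[0] - views[0].mean()
            b = views[1] - views[1].mean()
            denom = np.sqrt((a * a).sum() * (b * b).sum())
            if denom == 0:
                continue
            score = (a * b).sum() / denom
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    roi_ref, roi_probe = overlap_views(ref, probe, *best_shift)
    return best_shift, roi_ref, roi_probe


# Two windows cut from the same synthetic surface, displaced by a known translation.
rng = np.random.default_rng(0)
surface = rng.normal(size=(60, 60))
shift, roi_ref, roi_probe = align_and_crop(surface[5:45, 5:45], surface[8:48, 2:42])
print("estimated shift (dy, dx):", shift, "ROI size:", roi_ref.shape)
```

For full-resolution models an FFT-based correlation would be considerably faster; the exhaustive search is used here only because it makes the cropping logic explicit.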
This alignment compensates translations between fingerprint samples, adding robustness to the system with respect to small finger rotations in the pitch and roll angles. Please recall that such rotations had been roughly converted into translations in the previous stage of the system (see Sect. VI).

VIII. STAGE 5: FEATURE EXTRACTION
Current live-scan sensors, following the traditional paper-and-ink acquisition process, generate fingerprint images which are essentially black and white impressions of the ridge pattern. These ''close-to-binary'' images limit the application of many descriptors and algorithms developed for other image processing problems where the input is typically a full-scale grey picture (e.g., face recognition, object detection, or scene interpretation). This is one of the reasons why, still to this date, the vast majority of 2D fingerprint recognition systems are mainly based on the same principles developed in the 19th century for the manual comparison of fingerprints, that is, the detection and pairing of local minutiae points [7]. This minutiae-based trend has carried on to most works considering volumetric 3D fingerprints reconstructed from multi-view 2D images [21], [22], [41].
It is true that different holistic approaches have also been analysed in the literature as possible alternatives to minutiae detection. These global methods compare fingerprint images based on information such as orientation, frequency or ridge texture [51]. However, in general, they have clearly shown lower discriminative capacity, which has resulted in their use being restricted mainly to: 1) complement comparison scores obtained from minutiae-based algorithms; 2) extract information from very low quality images where minutiae detection is difficult or not reliable (e.g., latent fingerprints).
Please recall that the 3D fingerprint ROI extracted in the previous stage of the system, is a rectangular matrix where each point represents the depth information of the ridge pattern. As already mentioned, in a way, this may be regarded as a full grey-level 2D image composed of smooth continuous real values. These samples differ significantly from traditional black-and-white fingerprints, adding substantial new information that can be exploited by image processing algorithms.
In contrast to previous fingerprint comparison approaches, the new information available in 3D fingerprint models, opens up the possibility of successfully applying general descriptors that have already demonstrated their high discriminative power in other problems related to image processing. For the first time, such descriptors may be used on their own, and not as additional features to complement minutiae. This way, the challenging and not always reliable task of minutiae detection and pairing can be avoided, unleashing the full potential of modern image analysis techniques, including deep-based technology.
In the same way that registration methods for 2D images are suitable for the alignment of 3D fingerprints (see Sect. VII), existing algorithms for 2D image analysis may be used to draw the discriminative information from the 3D ROI. In particular, as shown in Fig. 10, the feature extraction process is based on two descriptors widely used in image processing: Histograms of Oriented Gradients (HOG) and Local Binary Patterns (LBP). Both descriptors have been studied in 2D fingerprint recognition to enhance minutiae comparison strategies in order to improve the overall accuracy of the system [52]. As will be explained in the next paragraphs, they exploit two different sources of information from the ridge structure: orientation (HOG) and depth (LBP). Consequently, they are expected to complement each other well, making it possible to exploit their synergy through information fusion strategies, as a means to improve on the accuracy of each individual descriptor.
The principle behind the Histogram of Oriented Gradients (HOG) descriptor is that the appearance and shape of local patterns within an image can be described by the distribution of intensity gradients or edge directions. It was first proposed in 2005 as an effective tool for localizing pedestrians in complex images [53], and has reached increasing popularity due to its good discriminative power for many types of objects, and to its tolerance to different common variability sources in images. The aim of this descriptor is to represent an image by a set of local histograms which count occurrences of gradient orientation in a local cell of the image. The implementation of the HOG descriptor is usually achieved by: 1) computing the gradient of the image; 2) dividing the image into small sub-regions called cells; for each cell, 3) building a histogram of the gradient directions; and finally 4) normalizing histograms within some groups of cells, named blocks, to achieve higher robustness to possible image variability.
In this work we divide each 3D fingerprint ROI into C equal non-overlapping cells of size 50 × 50. For each cell, the gradient orientation and magnitude of each point is calculated. The gradients are discretised over 9 equally sized bins in the [−180°, 180°] range and the resulting 9-bin histogram is calculated weighting each point by the magnitude of its gradient, according to the histogram bin. The entire descriptor is normalised to unit length within each cell, applying an overall normalization in blocks of size 2 × 2 cells. This process results in a feature matrix of size C × 9, where each row represents the values of the 9-bin histogram of one cell. Since cells are of fixed size, the number of rows C in the matrix (i.e., cells) depends on the actual size of the ROI.
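The cell-level part of this computation can be sketched with NumPy as follows. The example is a simplified illustration, not the authors' implementation: gradients are approximated with `numpy.gradient`, each histogram is normalised to unit length per cell, and the additional 2 × 2 block normalisation is deliberately left out because its exact form is not specified in the text.

```python
import numpy as np


def hog_cells(roi, cell=50, n_bins=9):
    """Return a C x n_bins matrix of magnitude-weighted orientation histograms."""
    roi = np.asarray(roi, dtype=float)
    gy, gx = np.gradient(roi)                  # derivatives along rows (y) and columns (x)
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx))     # signed orientation in [-180, 180]
    bins = np.minimum(((angle + 180.0) / (360.0 / n_bins)).astype(int), n_bins - 1)

    rows, cols = roi.shape[0] // cell, roi.shape[1] // cell
    features = np.zeros((rows * cols, n_bins))
    for r in range(rows):
        for c in range(cols):
            window = (slice(r * cell, (r + 1) * cell), slice(c * cell, (c + 1) * cell))
            hist = np.bincount(bins[window].ravel(),
                               weights=magnitude[window].ravel(),
                               minlength=n_bins)
            norm = np.linalg.norm(hist)
            features[r * cols + c] = hist / norm if norm > 0 else hist
    return features


# A 100 x 150 ROI yields C = 2 x 3 = 6 cells, i.e. a 6 x 9 feature matrix.
demo = np.tile(np.arange(150, dtype=float), (100, 1))   # surface rising along x
print(hog_cells(demo).shape)
```

Points outside the last complete cell are discarded in this sketch, which is one possible way of handling ROIs whose size is not a multiple of 50.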
The second descriptor used to exploit the information comprised in 3D fingerprints is the Local Binary Pattern (LBP). LBPs were introduced in 2002 by Ojala et al. [54] and, since then, they have shown to be a very powerful grayscale local texture descriptor with high discrimination capacity and low computational complexity. Over the last two decades, the number of variations of the original LBP algorithm that have been developed is nothing short of overwhelming. Two editions of a specific international workshop were even dedicated exclusively to this descriptor.³ LBP variants have been applied, with significant success, to a vast range of problems in computer vision [55], including the description of 3D surfaces [56]. In biometrics, LBPs have been especially popular in both 2D and 3D face recognition tasks [57], where for some years they set the bar for state of the art accuracy, before the advent of the new generation of systems based on deep learning.
The LBP operator is obtained from a point p and its symmetric neighbour set of P points placed on a circle of radius R. It represents the difference between the intensity value of p and the intensity values of its neighbourhood. Where the value of the center point is greater than the neighbour's value, the LBP takes value 0, otherwise it takes value 1. This operation results in a P-digit binary number (the LBP). An LBP code is defined as ''uniform'' if the number of transitions between 0 and 1 in the sequence is less than or equal to two. Uniform patterns are particularly relevant since they represent basic image structures such as spots and edges.
In the present work we have used the LBP-uniform configuration with R = 5 and P = 8, usually noted in the literature as $LBP^{u2}_{8,5}$. With this parameterisation, each sample is represented by a feature vector (i.e., histogram) of dimension 1 × 59. It measures the occurrence of each of the 58 possible uniform patterns, with the last value of the histogram taking into account all non-uniform patterns.
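The 59-bin uniform LBP histogram can be illustrated with the sketch below. It follows the thresholding rule given in the text (bit 0 when the center is greater than the neighbour, 1 otherwise), but it samples the P = 8 neighbours at the nearest grid point instead of interpolating on the circle, which is a simplifying assumption; the function names are hypothetical.

```python
import math

import numpy as np

P = 8  # number of neighbours


def transitions(code):
    """Number of circular 0/1 transitions in a P-bit code."""
    bits = [(code >> i) & 1 for i in range(P)]
    return sum(bits[i] != bits[(i + 1) % P] for i in range(P))


UNIFORM_CODES = [c for c in range(2 ** P) if transitions(c) <= 2]   # 58 codes for P = 8
BIN_OF_CODE = np.full(2 ** P, len(UNIFORM_CODES))                   # last bin: non-uniform
BIN_OF_CODE[UNIFORM_CODES] = np.arange(len(UNIFORM_CODES))


def lbp_histogram(img, radius=5):
    """Normalised histogram of uniform LBP codes (58 uniform bins + 1 non-uniform bin)."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    centre = img[radius:h - radius, radius:w - radius]
    codes = np.zeros(centre.shape, dtype=np.int64)
    for bit in range(P):
        theta = 2.0 * math.pi * bit / P
        dy = int(round(-radius * math.sin(theta)))   # nearest grid point (assumption)
        dx = int(round(radius * math.cos(theta)))
        neighbour = img[radius + dy:h - radius + dy, radius + dx:w - radius + dx]
        codes |= (neighbour >= centre).astype(np.int64) << bit
    hist = np.bincount(BIN_OF_CODE[codes].ravel(), minlength=len(UNIFORM_CODES) + 1)
    return hist / hist.sum()


rng = np.random.default_rng(0)
print(len(UNIFORM_CODES), lbp_histogram(rng.normal(size=(64, 64))).shape)
```

Enumerating the 8-bit codes confirms the 58 uniform patterns mentioned in the text, which together with the single non-uniform bin give the 1 × 59 feature vector.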
It should be noted that the implementation parameters of the HOG and LBP descriptors have been set on a development pool of users taken from the 3D-FLARE DB, with no overlap with the dataset used for evaluation (see the full experimental protocol in Sect. X). This parameter setting process included, 1) for the HOG descriptor: size of the cells (50 × 50), size of the blocks (2 × 2) and number of bins (9); 2) for the LBP descriptor: number of points P = 8 and size of the radius R = 5.

IX. STAGE 6: COMPARISON
The HOG feature matrix (size C × 9) and the LBP feature vector (size 1 × 59) generated in the previous stage for the reference and probe fingerprint ROIs are compared here according to the chi-square distance. This metric has shown very good performance to quantify the dissimilarity between histograms [58]. In the case of the HOG descriptor, the chi-square distance is computed for each 9-bin histogram corresponding to the C total cells, resulting in C partial scores. Then, the final dissimilarity score $d_{\mathrm{HOG}}$ is obtained as the average of those C partial scores. In the case of the LBP descriptor, the chi-square distance is directly applied to compare the two LBP feature vectors, producing the dissimilarity score $d_{\mathrm{LBP}}$.
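The two dissimilarity scores can be sketched as follows. The text does not state which variant of the chi-square distance is used; the form $\sum_i (a_i - b_i)^2 / (a_i + b_i)$ adopted here is one common definition and should be read as an assumption, as should the small `eps` term that avoids division by zero.

```python
import numpy as np


def chi_square(h1, h2, eps=1e-12):
    """Chi-square distance between two histograms (one common variant, assumed here)."""
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    return float(np.sum((h1 - h2) ** 2 / (h1 + h2 + eps)))


def d_hog(ref_features, probe_features):
    """Average of the C per-cell chi-square distances between two C x 9 matrices."""
    return float(np.mean([chi_square(a, b) for a, b in zip(ref_features, probe_features)]))


def d_lbp(ref_hist, probe_hist):
    """Chi-square distance between two 59-bin LBP histograms."""
    return chi_square(ref_hist, probe_hist)


ref = np.array([[0.6, 0.8, 0.0], [1.0, 0.0, 0.0]])
probe = np.array([[0.6, 0.8, 0.0], [0.0, 1.0, 0.0]])
print("d_HOG for the toy matrices:", round(d_hog(ref, probe), 3))
```

Because both ROIs are cropped to the same overlapping surface, the reference and probe HOG matrices have the same number of cells C and can be compared row by row.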
Please note that d HOG and d LBP are dissimilarity scores, that is, the higher their value the less similar the two compared fingerprints. Additionally, both scores are unbounded in the range [0, ∞]. In order to transform them to bounded 3 https://sites.google.com/site/lbp2014ws/ similarity scores, both values are normalised to the range [0, 1] using the inverse of the logistic function, which has shown high efficiency for score fusion in multimodal biometric systems [59].
Lastly, the two normalised scores sn HOG and sn LBP , are combined in the final similarity score using the weighted sum, which is one of the most effective score level fusion techniques for multiple classifiers [60]. Equal weights are selected for both normalised scores, resulting in: s = 0.5 · sn HOG +0.5·sn LBP . A diagram of the full comparison process is shown in Fig. 11.
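The comparison stage can be sketched in a few lines. This is only a sketch under stated assumptions: the chi-square variant used (sum of squared differences over the sum of the bins), the small `eps` guard, and in particular the mapping in `to_similarity` are illustrative choices, since the exact normalisation formula is not given in the text. All function names are hypothetical.

```python
import numpy as np

def chi_square(h1, h2, eps=1e-12):
    """Chi-square distance between two histograms (one common variant)."""
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    return float(np.sum((h1 - h2) ** 2 / (h1 + h2 + eps)))

def hog_dissimilarity(hog_ref, hog_probe):
    """Average of the C per-cell chi-square distances (inputs of size C x 9)."""
    return float(np.mean([chi_square(a, b) for a, b in zip(hog_ref, hog_probe)]))

def to_similarity(d):
    """Map a dissimilarity in [0, inf) to a similarity in (0, 1].

    ASSUMPTION: a logistic-based mapping chosen for illustration only;
    d = 0 gives 1 and large d tends to 0.
    """
    return float(2.0 / (1.0 + np.exp(d)))

def fused_score(hog_ref, hog_probe, lbp_ref, lbp_probe, w_hog=0.5, w_lbp=0.5):
    """Weighted-sum fusion of the two normalised similarity scores."""
    sn_hog = to_similarity(hog_dissimilarity(hog_ref, hog_probe))
    sn_lbp = to_similarity(chi_square(lbp_ref, lbp_probe))
    return w_hog * sn_hog + w_lbp * sn_lbp
```

With equal weights, comparing a sample to itself yields a fused score of 1, and any difference in either descriptor lowers it.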

X. EVALUATION: EXPERIMENTAL PROTOCOL AND RESULTS
The new full-3D fingerprint recognition system presented from Sect. III to Sect. IX is evaluated on the 3D-FLARE DB described in Sect. IV-A. The main objective of the experiments is to determine whether the hypothesis set forth in the introduction of the present work holds: a high-quality full-3D model of the fingerprint has the potential to lead to higher recognition accuracy than 2D fingerprint images captured with current touch-based state-of-the-art technology.
The fulfilment of the previous general goal will also allow us to assess the performance of the complete system and to answer ancillary questions such as: Is the prototype acquisition sensor sound and able to acquire in a reliable and repeatable way high-resolution full-3D fingerprint models? Are these models usable for fingerprint recognition? Is it possible to perform accurate fingerprint recognition not based on minutiae detection? Do 3D fingerprints contain enough information to be recognised following a holistic approach based on the complete model and not on the analysis of specific local points? What is the minimum spatial resolution required to obtain high accuracy with the new 3D fingerprint models?

A. EVALUATION PROTOCOL
To reach the objective defined above, the 3D-FLARE DB was divided into two separate non-overlapping datasets for development and test. The development set contains all samples of five subjects randomly chosen from the full database and was used to set the parameters of the HOG and LBP descriptors, as specified in Sect. VIII. The test set comprises all samples of the remaining 45 subjects in the database. This data is used to evaluate the system, computing the mated and non-mated sets of comparison scores for each different acquisition speed: 10mm/sec, 30mm/sec and 50mm/sec.
Two different verification scenarios are considered in the experiments:
1) ''1vs1'' scenario, where each different finger in the test dataset is regarded as a different identity, i.e., 45 × 4 = 180 identities. This would be the case of a verification system where each individual is recognised using only one finger.
2) ''4vs4'' scenario, where each subject is an identity, i.e., 45 total identities. This would be the case of a system where each subject is recognised using all four fingers available in the database (i.e., left index/middle, right index/middle).
The mated and non-mated sets of scores in the two scenarios were computed as follows:
• ''1vs1'' scenario. Mated similarity scores for each acquisition speed are obtained comparing all 5 samples of one finger to all other samples of that same finger, without repetition. This leads to 10 mated scores per finger, which totals 180 × 10 = 1,800 mated scores for the complete test set. Non-mated scores are computed comparing one sample of each finger to one sample of each of 15 randomly selected fingers, which leads to 180 × 15 = 2,700 non-mated scores for the complete test set.
• ''4vs4'' scenario. Both mated and non-mated scores are obtained as the average of the four scores corresponding to each of the four fingers of the same subject in the ''1vs1'' scenario. This leads to 45 × 10 = 450 mated scores, and 45 × 15 = 675 non-mated scores.
These sets of scores were computed for: 1) the HOG descriptor ($sn_{HOG}$); 2) the LBP descriptor ($sn_{LBP}$); and 3) the final fused score (s). This way we can compare the discriminative ability of each individual descriptor and their level of complementarity.
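The sizes of the score sets in the protocol above follow directly from the number of subjects, fingers and samples. The short sketch below reproduces that arithmetic; the function and parameter names are illustrative, not part of the original work.

```python
from math import comb

def protocol_score_counts(subjects=45, fingers_per_subject=4,
                          samples_per_finger=5, non_mated_per_finger=15):
    """Number of identities and comparison scores in each scenario."""
    identities_1vs1 = subjects * fingers_per_subject
    # All unordered pairs of samples of the same finger, without repetition
    mated_per_finger = comb(samples_per_finger, 2)
    return {
        "1vs1": {
            "identities": identities_1vs1,
            "mated": identities_1vs1 * mated_per_finger,
            "non_mated": identities_1vs1 * non_mated_per_finger,
        },
        # In "4vs4" the four per-finger scores of a subject are averaged,
        # so the counts are divided by the number of fingers per subject.
        "4vs4": {
            "identities": subjects,
            "mated": subjects * mated_per_finger,
            "non_mated": subjects * non_mated_per_finger,
        },
    }

counts = protocol_score_counts()
print(counts["1vs1"])  # {'identities': 180, 'mated': 1800, 'non_mated': 2700}
print(counts["4vs4"])  # {'identities': 45, 'mated': 450, 'non_mated': 675}
```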

B. RESULTS: VERIFICATION ACCURACY
The Detection-Error Trade-Off (DET) curves combine in one same plot the False Match Rate (FMR) and False Non-Match Rate (FNMR) of the system, using logarithmic axes. These curves are efficient graphical tools to visually compare in just one plot the accuracy of different systems. The lower the curve, the better the system.
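For readers less familiar with these metrics, the following minimal sketch shows how the FMR, the FNMR and an approximate EER can be computed from sets of mated and non-mated similarity scores. The decision convention (accept when the score is greater than or equal to the threshold) and the EER approximation (threshold where the two rates are closest) are assumptions of this sketch, not a description of the evaluation code used in this work.

```python
import numpy as np

def error_rates(mated, non_mated, threshold):
    """Return (FMR, FNMR) for similarity scores at a given decision threshold."""
    mated = np.asarray(mated, dtype=float)
    non_mated = np.asarray(non_mated, dtype=float)
    fmr = float(np.mean(non_mated >= threshold))  # non-mated wrongly accepted
    fnmr = float(np.mean(mated < threshold))      # mated wrongly rejected
    return fmr, fnmr

def equal_error_rate(mated, non_mated):
    """Approximate EER: operating point where FMR and FNMR are closest."""
    thresholds = np.unique(np.concatenate([mated, non_mated, [np.inf]]))
    rates = [error_rates(mated, non_mated, t) for t in thresholds]
    fmr, fnmr = min(rates, key=lambda r: abs(r[0] - r[1]))
    return (fmr + fnmr) / 2.0
```

Sweeping the threshold over all observed scores and plotting FNMR against FMR on logarithmic axes yields the DET curve. Two perfectly separated score sets give an EER of zero.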
In Fig. 12 we show the DET curves of the full 3D recognition system developed in the present work for the ''1vs1'' verification scenario, for all three acquisition speeds: 10mm/sec (left), 30mm/sec (center) and 50mm/sec (right). In each of the charts, the light grey curve corresponds to the score obtained by the LBP descriptor $sn_{LBP}$, the dark grey curve to the HOG descriptor $sn_{HOG}$, and the black curve to the score-level fusion of both, s. As a complement to Fig. 12, Table 1 gives the FNMR values for three different operating points defined according to the FMR: 1%, 0.1% and 0.01%. These are the operating points reported in the Fingerprint Verification Competition OnGoing (FVC-Ongoing).⁴
The main conclusions that can be extracted from both Fig. 12 and Table 1 are:
• FINDING 1. Touchless full-3D fingerprint recognition is feasible and shows high discriminative potential, with an Equal Error Rate (EER) of 1.04% in the ''1vs1'' scenario.
• FINDING 2. Full-3D fingerprint recognition can be achieved with high accuracy based on image processing descriptors, contrary to what has been observed so far in existing 2D and 3D-reconstructed systems, where, for the time being, there is not yet a competitive alternative to traditional minutiae-based algorithms.
• FINDING 3. As could be expected, higher spatial resolution (i.e., slower acquisition speed) leads to lower error rates. A big accuracy drop is observed for the fastest acquisition speed, 50mm/sec, with respect to 30mm/sec and 10mm/sec. Note that 50mm/sec corresponds to a sampling resolution of 0.08mm in the y axis (see the sensor description in Sect. IV). The ridge pattern is known to be a structure in the range of 0.1-0.3mm. This means that, depending on the size of the friction ridge, the sampling rate in the y axis at 50mm/sec may not suffice to comply with the Nyquist theorem, which states that in order not to lose any information comprised in the fingerprint, the sampling rate should be at least double the highest spatial frequency present in the ridge pattern (i.e., a sampling interval of at most 0.05mm for small ridges of size 0.1mm). As a result, 50mm/sec can lead to an inaccurate representation of the fingerprint and, consequently, to more recognition errors.
On the other hand, the recognition accuracy difference between 10mm/sec and 30mm/sec is not significant, as both speeds result in a spatial resolution that satisfies the Nyquist theorem. Depending on the final application, a decision on the acquisition speed should be taken considering other factors in addition to accuracy: 1) time of acquisition, 5 seconds at 10mm/sec with respect to 1.7 seconds at 30mm/sec; 2) size of the acquired models, 20Mb and 8Mb respectively; 3) size of the final templates, 1000Kb and 350Kb depending on the speed; 4) time of processing and comparison, with the smaller models (i.e., acquired at 30mm/sec) speeding up the process close to five times per comparison score computed.
• FINDING 4. The HOG descriptor shows higher discriminative power than the LBP descriptor. The HOG DET curve is always clearly below the LBP DET curve, with an EER which is around half the value for all the scenarios considered. This means that there is less variability in the direction information of the ridge pattern (captured by HOG) among samples of the same finger, than in the information related to the depth of the pattern (captured by LBP). This may be explained by the combination of three different factors: 1) it is plausible that, in its natural state, the direction of the ridge pattern may in fact be more distinctive than its depth; 2) the 3D sensor is more accurate and consistent at acquiring the direction of the ridge pattern, rather than its depth information; 3) the LBP descriptor is more sensitive to small misalignments that may exist between the compared samples.
• FINDING 5. As was predicted, the HOG and LBP descriptors are highly complementary. In the two scenarios that lead to sufficient spatial resolution (i.e., 10mm/sec and 30mm/sec), the DET curve of the fused score s is clearly the lowest curve for all FMR and FNMR values. It presents an improvement in accuracy of around 50% with respect to the best individual descriptor $sn_{HOG}$. Such combination efficiency is likely the result of both descriptors measuring the distinctiveness of 3D samples based on two different sources of information: their direction (HOG) and their depth (LBP).
All previous findings are confirmed and reinforced by the results obtained in the ''4vs4'' verification scenario, presented in Figs. 13-14 and Table 2. Unlike the ''1vs1'' case, Fig. 13 does not show the DET curves. Instead, it depicts the mated (black circles) and non-mated (grey crosses) score distributions, plotted with the HOG score $sn_{HOG}$ in the x axis and the LBP score $sn_{LBP}$ in the y axis. As can be observed, for the 10mm/sec and 30mm/sec acquisition speeds, there is a perfect separation between both distributions (EER = 0%), which means that an eventual DET plot would not show any curve. As complementary information to this figure, we also include: 1) Table 2, which presents the EER for the individual HOG and LBP scores and for the final fused score, for all three acquisition speeds; 2) Fig. 14, which shows the FMR and FNMR curves for the acquisition speed of 30mm/sec. In this last figure, we can appreciate that there is a significantly large range of values of the fused score s where both curves are equal to zero (s ∈ [0.28, 0.38]), that is, the system does not make any verification mistakes.
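Returning to the sampling argument made in FINDING 3, the check against the Nyquist criterion can be written down explicitly. The sketch below rests on one assumption not stated in the text: that the sensor acquires profiles at a constant rate, so that the sampling interval in the y axis scales linearly with the acquisition speed. Only the 0.08mm value at 50mm/sec is taken from the text, and the function names are illustrative.

```python
def y_sampling_interval_mm(speed_mm_s, ref_speed_mm_s=50.0, ref_interval_mm=0.08):
    """Sampling interval in y at a given acquisition speed.

    ASSUMPTION: constant profile rate, so the interval grows linearly with
    speed; anchored at the 0.08mm reported for 50mm/sec.
    """
    return ref_interval_mm * speed_mm_s / ref_speed_mm_s

def satisfies_nyquist(interval_mm, ridge_size_mm=0.1):
    """True if there are at least two samples per ridge of the given size."""
    return interval_mm <= ridge_size_mm / 2.0

for speed in (10, 30, 50):
    interval = y_sampling_interval_mm(speed)
    print(speed, round(interval, 3), satisfies_nyquist(interval))
```

Under this assumption, the two slower speeds stay below the 0.05mm limit for the finest ridges (0.1mm), whereas 50mm/sec only satisfies the criterion for wider ridges.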
The reader may have noticed that the results produced by the ''4vs4'' experiments correspond to a perfect zero-error system. Does this mean biometrics has been solved? Unfortunately not. As is well known in statistics, the fact that in a given evaluation nothing goes wrong, does not necessarily imply that everything is all right [61]. The ''4vs4'' experiments do not prove that full-3D fingerprints lead unequivocally to perfect recognition, but rather, that more data is required in order to assess the system in a finer way. We should not forget that the test dataset in this scenario contains 45 different identities, which have resulted in 450 mated scores and 675 non-mated scores. According to the ''rule of three'', often used in statistical analysis as a reliable rule of thumb [62], for this size of the score sets, there is a 95% confidence that the EER in the ''4vs4'' scenario is lower than 0.5%. Furthermore, given the wide range of s values with zero error, it seems reasonable to predict that the real EER of the system is likely lower than 0.1%. In summary, in order to confirm these estimations, a larger database is required to compute the error rates in a more precise manner.
TABLE 3. Comparison of the most relevant touchless 3D fingerprint recognition methods from the state of the art. Works appear ordered firstly by type of 3D acquisition methodology and secondly by date of publication. By column, the information given is: 1) reference to the work where the system is reported; 2) type of data used in the experiments, namely, reconstructed-3D data, or full-3D data; 3) size of the database with the number of different fingers and the number of total samples; 4) type of features extracted for the comparison of fingerprints; 5) Equal Error Rate (EER) achieved in the given database for the ''1vs1'' verification scenario.
In spite of the limits imposed by their statistical significance, the ''4vs4'' results do strongly support all the findings expressed in the ''1vs1'' scenario, especially in light of the big separation gap observed between the mated and non-mated distributions in Fig. 13 (10mm/sec and 30mm/sec plots). Therefore, in brief, the ''4vs4'' experiments emphasise the potential of full-3D fingerprint recognition as a new biometric mode.
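The ''rule of three'' invoked for the ''4vs4'' scenario can be made concrete with a short calculation. When zero errors are observed in n independent trials, the upper limit of the 95% confidence interval for the true error rate is approximately 3/n. The sketch below computes both this approximation and the exact bound it derives from; the function names are illustrative, and treating the comparison scores as independent trials is a simplifying assumption.

```python
def rule_of_three_bound(n_trials):
    """Approximate 95% upper bound on an error rate after zero errors in n trials."""
    return 3.0 / n_trials

def exact_zero_error_bound(n_trials, confidence=0.95):
    """Exact bound: largest rate p for which (1 - p)**n >= 1 - confidence."""
    return 1.0 - (1.0 - confidence) ** (1.0 / n_trials)

for label, n in (("FNMR (450 mated scores)", 450), ("FMR (675 non-mated scores)", 675)):
    print(label, f"{100 * rule_of_three_bound(n):.2f}%", f"{100 * exact_zero_error_bound(n):.2f}%")
```

For the score sets of the ''4vs4'' scenario this gives upper bounds of roughly 0.67% for the FNMR and 0.44% for the FMR, which bracket the figure of about 0.5% quoted above.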
For reference, in Table 3 we present a comparison between the system developed in this work and other approaches from the state of the art that address the problem of 3D fingerprint recognition. The comparison is made considering the following characteristics: type of data used (reconstructed-3D or full-3D), size of the evaluation database, type of features extracted and EER achieved. Please note that, since each work reports the results in its own evaluation benchmark, the figures given for the EER are not directly comparable and should be understood as a general indication of the discriminative ability of the system.

XI. DISCUSSION AND CONCLUSION
''The point to be made here is that, at the most fundamental technical level, biometrics is evolving very slowly from its origins''. With this statement, James L. Wayman summarised in a 2007 article, the development of biometrics over the last 40 years [63]. Wayman stressed this observation even further by asserting that ''a quick overview of biometric history shows that much of what we consider to be ''new'' in biometrics was really considered decades ago.'' While Wayman's claims can seem somewhat bold at first glance, after a sober analysis we may conclude that they are not so far away from being factual for many biometric characteristics. For instance, although the last decade has brought some novel ground-breaking proposals in speaker recognition, this technology is still today ultimately based on the cepstral coefficients introduced as fundamental features by Luck in a pioneering work from 1969 [64]. Another illustrative example that supports Wayman's assertions is signature recognition, where some of the first works dating back to the early 1960s, already considered dynamic information such as acceleration and pressure [65], which are time functions usually regarded as an innovation brought to the field by modern digitising tablets. Furthermore, the majority of current iris recognition systems are based on the feature extraction method and iriscode template developed and patented by Daugman in one of the first iris-related works back in 1993 [66].
Perhaps one of the few exceptions to this lack of fundamental progress in biometrics, is face recognition. Face recognition has taken advantage of the successive advances in image processing, to evolve from the first studies in the 1960s based on distance measures between anatomical landmarks [67], to the acclaimed eigenfaces study and other algorithms based on hand-crafted features in the 1990s and 2000s [68], to the current generation of deep learning systems [69]. Each of these changes in the core of the technology, has produced not just an incremental improvement, but a big leap in performance.
In terms of historical evolution, almost at the other end of the spectrum with respect to face recognition, we find fingerprint-based systems. Fingerprint recognition can be regarded as the epitome of Wayman's affirmations. Few biometric characteristics have evolved as little regarding their fundamental recognition methods. If we were to summarise the current situation of fingerprint recognition in a catchy and somewhat provocative headline, we could say that: nowadays, fingerprint recognition consists of the application of 21st century computational processing power to 19th century recognition methods.
While it is evident that such a statement would be oversimplistic and, we may even say, a bit sensationalist, there is a basis of reality in it. To this date, fingerprint recognition systems are mostly based on the principles laid in the 19th century for the manual detection and pairing of minutiae. This is not to say that fingerprint recognition has not improved its accuracy. On the contrary, especially over the last decade, fingerprint systems have made rapid strides, considerably enhancing our ability to accurately identify subjects in large scale databases within seconds. However, this swift path of technological evolution can be put down, to a large extent, to the astounding progress of computational capacity, and not to fundamental changes in fingerprint processing methods. It would not be too far from fact to state that, for the most part, fingerprint technology has got a lot better and faster, at doing essentially the same core tasks.
It does seem that, at the moment, the enormous potential of new sensing systems and image processing algorithms is not being fully exploited by fingerprint technology. A point has been reached in which the biometric community has started to speculate that traditional 2D image-based fingerprint systems may be reaching their accuracy ceiling. This ceiling has proven to be certainly high and, likely, can be pushed incrementally higher thanks to the continuous improvement of computing capabilities. However, in order to take a big leap beyond it, just as happened in face recognition with the advent of deep learning-based systems, a profound redesign of the way fingerprint recognition has been traditionally performed may be needed.
The current article constitutes a first step towards the arduous goal of shaking the foundations of fingerprint recognition, anchored in over a century and a half of immutable practices and tradition. As an initial milestone in this quest, we have shown that a paradigm shift from the 2D plane to the 3D space is feasible. In a nutshell, the paper can be regarded as a starting point in the ambitious path of establishing full-3D fingerprint recognition as a new biometric mode, similar to the appearance of 3D face recognition as a plausible alternative (or complement) to traditional authentication based on 2D facial pictures.
Even though this is still a novel on-going research line, the work carried out so far has already reached some relevant achievements presented in this paper:
• It has been shown that it is possible to acquire accurate full-3D fingerprint models in a fast, consistent, repeatable and reliable way based on laser sensing technology.
• Touchless full-3D fingerprint recognition is feasible and shows high discriminative potential, with an EER of 1% in the ''1vs1'' verification scenario and, very likely, below 0.1% in the ''4vs4'' scenario (EER = 0% in the evaluation set of data used in the work).
• Full-3D fingerprint recognition can be achieved with high accuracy based on image processing approaches different from the classical minutiae-based algorithms used in existing 2D technology and in previous works employing reconstructed-3D finger data.
• The ridge orientation information in 3D samples extracted by the HOG descriptor, is more consistent and discriminative than the depth information captured by the LBP descriptor.
• Orientation and depth information of the ridge pattern are highly complementary and their fusion leads to largely improved accuracy with respect to any of the two descriptors individually.
The work also opens a number of research possibilities to the biometric community that will need to be addressed in the future, some of which are specified in the next paragraphs.

A. DATA
There is a well known principle in machine learning that preaches: ''in God we trust, all others must bring data''. Following this principle, more data is needed in order to assess more accurately the system error rates in the ''4vs4'' verification scenario where at the moment an EER = 0% has been reached.

B. QUALITY
As mentioned in the introduction, the key element to a successful and accurate biometric system is to avoid the GIGO principle (''garbage in, garbage out''). To this end, it would be valuable to develop specific quality metrics for 3D fingerprints, capable of estimating the goodness of a given sample for recognition purposes and, therefore, of predicting accuracy.

C. DEEP LEARNING
Since the advent of the deep learning era at the beginning of the 2010s, technology based on Deep Neural Networks (DNNs) has consistently been shown to outperform previous methods in virtually any image-based problem related to machine learning and computer vision. To date, one of the few exceptions to this technological revolution has been fingerprint recognition, where traditional methods based on minutiae detection continue to achieve clearly lower error rates than deep learning-inspired algorithms [70]. The results presented in the current work have shown that full-3D fingerprints contain enough information to use general image descriptors as a real alternative to traditional minutiae-pairing algorithms. This opens the door to applying the full potential of current image processing technology, including deep learning techniques, to 3D fingerprint recognition. Accordingly, the application of DNNs to the problem of 3D fingerprint recognition should be explored to determine whether they can improve, or complement, hand-crafted features such as the ones considered in the present work.

D. AGE EFFECT
Different works have shown that traditional 2D touch-based fingerprint technology suffers from a significant accuracy decrease when dealing with specific age groups such as children and the elderly [71]. For the case of children, this drop in performance has been put down to the size of fingerprints and it has been shown to be largely reduced through the application of growth models [72] or through the use of scanners with significantly higher resolution than the classical 500dpi ones [73]. In the case of the elderly, on the other hand, the hypothesis that has been put forward to explain the increase in error rates is that the problem originates in the traditional touch-based acquisition process itself, which is ill-suited to the drier, less elastic skin typical of a more advanced age. Given that the scanner built in the current work presents a higher spatial resolution (appropriate to acquire smaller fingerprints) and is also contactless (appropriate to acquire fingerprints with sub-optimal skin condition), it can be an effective way to mitigate the loss of accuracy both for children and the elderly.

E. ERGONOMICS/USABILITY
As specified in [47], the human-biometric sensor interface plays a pivotal role in the system accuracy, in its usability and, therefore, in its acceptance by the public. Consequently, enhancing the ergonomics of the acquisition sensor should become one of the priorities moving forward. This design upgrade may be achieved by acquiring all four fingers simultaneously (i.e., ''slap-prints''). This would also allow for a much faster acquisition and processing in the ''4vs4'' recognition scenario, making it a realistic alternative to ''1vs1'' verification in applications where time is a key constraint.

F. 3D-2D COMPATIBILITY
In the field of fingerprint recognition, all existing legacy databases contain 2D images. In some contexts, for instance in law-enforcement or civil applications (e.g., national ID registry, passport registry), it is essential that any change in technology is backward-compatible with existing data. For that purpose, algorithms capable of translating 3D point clouds into traditional black and white ridge-pattern 2D images would eventually have to be developed.

G. PRESENTATION ATTACKS
Presentation attacks (also referred to as spoofing), have emerged as one of the major security concerns in biometrics. It will be necessary to test the vulnerability of the new 3D technology to this type of threat, and also to understand to what extent laser sensing technology may be an efficient presentation attack detection method.
As a wrap-up to this work we can say that, given the current state of development of automatic fingerprint recognition, it does not seem irrational to think that a big leap forward in terms of accuracy may only be attained through a profound change in the technology as we currently know it, from its very foundations. Whether or not full-3D fingerprint recognition represents this breakthrough in the field, only time will tell. However, the first step described in the present article is certainly encouraging and does seem like a sound and solid one in this direction.