A Survey on Synthetic Biometrics: Fingerprint, Face, Iris and Vascular Patterns

Synthetic biometric samples are created with the ultimate goals of getting around privacy concerns, mitigating biases in biometric datasets, and reducing the sample acquisition effort to enable large-scale evaluations. The recent breakthrough in the development of neural generative models has shifted the focus from image synthesis by mathematical modeling of biometric modalities to data-driven image generation. On the one hand, this paradigm shift greatly improves the realism of synthetic biometric samples and therefore enables new use cases; on the other hand, new challenges and concerns arise. Despite their realism, synthetic samples have to be checked for suitability for their intended tasks, which calls for new quality metrics. Focusing on sample images of fingerprint, face, iris and vascular patterns, we highlight the benefits of using synthetic samples, review the use cases, and summarize and categorize the most prominent studies on synthetic biometrics, aiming to show recent progress and the direction of future research.


I. INTRODUCTION
The need for a new survey on synthetic biometrics is motivated by the rapid development and popularization of Deep Convolutional Neural Networks (DCNN), which has moved the focus in image generation from mathematical modeling to data-driven synthesis. This can be seen as a paradigm change because both the techniques used and the requirements on auxiliary data have changed. While early studies addressed the problem of generating synthetic biometric samples by (mathematical) modeling of physiological structures and acquisition procedures [1], modern studies rely on Generative Adversarial Networks (GAN) that transfer the underlying image characteristics from training data to synthetic biometric samples. Moreover, the existing surveys on this topic are quite outdated, having been published in 2004 [2] and 2006 [1].
The most powerful driver for the recent development of synthetic biometrics is the introduction of regulations on protection of private data such as the General Data Protection Regulation (GDPR) in the EU [3]. Biometric data are a special kind of private data protected by Article 9 of the GDPR. Biometric data cannot even be anonymized due to its nature. This fact dramatically hinders usage of real biometric samples not only in industrial but also in academic research. Synthetic biometrics solves this problem by introducing virtual individuals. And biometric samples of virtual individuals can be made public without any legal concerns.
The associate editor coordinating the review of this manuscript and approving it for publication was Vincenzo Conti.
If done properly, replacement of datasets of real biometric samples by synthetic datasets is a step towards open research which requires the reproducibility of experimental results by any independent third party and hence public sharing of both algorithms and datasets used in studies [4].
Besides privacy protection, an important use case of synthetic data is augmentation of training and evaluation datasets. Synthetic samples may compensate for unavailable data aiming at covering all possible natural variations as well as unnatural malicious modifications. Taking control over sample variation and sample attributes as well as ensuring equal distribution of sample attributes (e.g. race, gender, age) in datasets enables fair and unbiased application of machine learning.
The main advantage of neural generative models in comparison to traditional modeling is the realistic appearance of generated samples. In recent studies, GAN models have already been used for rendering photorealistic face, iris and fingerprint images. However, the number of publicly available datasets of synthetic biometric images is very limited, and a commonly agreed evaluation methodology for the appropriateness of synthetic biometric datasets has not yet been presented.
Our main contribution is in reviewing the use cases of synthetic samples as well as summarizing and categorizing the most prominent studies on synthetic biometrics aiming at showing recent progress and the direction of future research.
The remainder of the paper is organized as follows: Section II elaborates on the nature of synthetic biometric samples and categorizes use cases, requirements on synthetic data and generation techniques. Section III outlines traditional modeling studies to create synthetic fingerprints, faces, irises and vascular patterns. Section IV focuses on data-driven image generation of the same modalities. The public synthetic biometric datasets and synthesis tools are listed in Section V. Our summary is drawn in Section VI.

A. SYNTHESIS IN BIOMETRICS
Following the argumentation in [5], we note that ''biometric data'' is an ambiguous term, and so is the synthesis of ''biometric data''. This survey focuses on biometric samples, i.e. ''biometric data'' acquired by biometric sensors and represented in the form of images, which is the case for several of the most established biometric modalities: face, fingerprint, iris, and vascular traits. Synthesis of other types of ''biometric data'', e.g. feature measurements, matching scores and decision data, is beyond the scope of this work.

1) SYNTHESIS AS MODELING
In the encyclopedia of biometrics, Buettner [6] defines the ultimate goal of biometric sample synthesis as the application of computer-aided parametric modeling to creation of a synthetic corpus of biometric samples which is indistinguishable from a corpus of real biometric samples obtained from people. The parametric models are either mathematical equations describing physics of biometric sample acquisition or statistical models derived from empirical analysis of real biometric samples.
Buettner stresses that modeling helps on the one hand to fundamentally understand which factors affect the digitization process of biometric modalities for a specific sensor type and on the other hand to efficiently generate synthetic images that are visually similar to real biometric samples which in turn may improve testing of biometric algorithms. This perfectly fits into an analysis-by-synthesis paradigm in which the models of real-world phenomena are learnt to predict perceptual observations [1].

2) SYNTHESIS AS TEMPLATE INVERSION
In fact, biometric analysis can be seen as a special case of representation learning, a process in which a raw object is translated to an abstract feature vector that is suitable for object recognition. From this simplistic perspective, synthesis of biometric samples has been seen as an inverse task to biometric analysis [1] in which a raw biometric signal is reconstructed from its abstract representation.
If we refer to the result of biometric analysis as a biometric template, then biometric template inversion is a special case of parametric modeling in which the identity of a subject is given by the data stored in a template. In early stages of research it was believed that biometric samples cannot be reconstructed from templates. This belief has since been refuted. In [7] one can find an overview of approaches to invert biometric templates.
For face, the identity is directly associated with a facial image and there is no commonly agreed convention on features to be stored apart from a frontal face image [8]. Hence, for face, the problem of template inversion exists only for deep learning based representations. The de facto standard for iris representation is a 2048-bit binary code called IrisCode [9]. How IrisCode can be turned into an iris image is demonstrated in [10]. For fingerprint, the template comprises a list of minutiae, each represented by its type, x- and y-coordinates, and an orientation angle [11]. The most recent study on fingerprint reconstruction is in [12]. Note that for the template inversion task the visual realism of the resulting pattern is not always a primary objective.
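The minutiae template described above can be sketched as a small data structure. This is only an illustration: the class and field names, the use of radians, and the two minutia types shown (ridge ending and bifurcation) are assumptions made here and do not reproduce the exact layout of the format referenced in [11].

```python
import math
from dataclasses import dataclass
from enum import Enum


class MinutiaType(Enum):
    """Two common minutia types (assumed here for illustration)."""
    RIDGE_ENDING = "ending"
    BIFURCATION = "bifurcation"


@dataclass(frozen=True)
class Minutia:
    kind: MinutiaType
    x: int        # pixel column
    y: int        # pixel row
    angle: float  # orientation in radians, normalized to [0, 2*pi)


def make_minutia(kind, x, y, angle_deg):
    """Build a minutia from an angle given in degrees (wrapped to [0, 360))."""
    return Minutia(kind, x, y, math.radians(angle_deg % 360.0))


# A template is simply a list of minutiae; the values below are made up.
template = [
    make_minutia(MinutiaType.RIDGE_ENDING, 120, 85, 45.0),
    make_minutia(MinutiaType.BIFURCATION, 64, 210, 370.0),  # wraps to 10 degrees
]
```

A reconstruction method in the sense of this section takes such a list as its only input and has to infer everything else (ridge orientation, ridge frequency, texture).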

3) PARADIGM'S CHANGE
Similar to how the focus of biometric research has been changing from the proof of the concept to applicability of biometrics, to security issues in biometric systems e.g. template protection, to intentional attacks mitigation like Presentation Attack Detection (PAD), the focus of synthetic biometrics has been changing from physics-based modeling to statistical modeling and finally to data-driven synthesis.
Physics-based modeling relies on studying physics of an object of interest and its interaction with an environment to predict possible projections onto the sensor space. Statistical modeling relies on statistical analysis of the object's representations to predict the representations of modified objects. Data-driven synthesis relies on collecting enough representations for learning of a deep neural network that is then capable of generating random representations as well as representations of modified or even unseen objects.

B. USE CASES FOR SYNTHETIC SAMPLES
Since synthetic biometric samples can be generated with significantly less effort in terms of time and manpower than collecting real biometric samples, the general motivation of creating synthetic samples is a lack of collected samples of a specific kind [5]. To be more precise, synthesis helps to compensate for biases and simulate natural (e.g. aging, environmental influences) or malicious (e.g. presentation attacks) variations at less cost. Use cases for synthetic data in medical research are listed in [13].
We propose to assign use cases to one of the four categories: (1) Identity-aware synthesis of virtual subjects, (2) Data augmentation for missing variability and attack modeling, (3) Cancelable biometrics, and (4) Inverse biometrics; see Fig. 1. The first two use cases are common for all kinds of synthetic data without special focus on biometric samples.

1) IDENTITY-AWARE SYNTHESIS OF VIRTUAL SUBJECTS
The first use case can be interpreted as replacement of real data by synthetic data which helps getting around data privacy concerns [4]. Introduction of virtual subjects with the corresponding biometric modalities has two major goals: (i) Sharing data for open research and challenges, and (ii) Privacy-preserving development of biometric systems including training and evaluation.

2) DATA AUGMENTATION
Data augmentation is a common approach for bias mitigation. In fact, it is tremendously hard or almost impossible to collect data that cover most of the possible presentation variations of a biometric modality, or even to collect data from people belonging to minority groups. At the same time, missing important variations in training samples leads, on the one hand, to biased models and, on the other hand, to high classification error rates. Moreover, biometric classification models are often subjected to intentional attacks that could be modeled in synthetic samples. All in all, there are two major goals of data augmentation: (i) Training more robust algorithms and (ii) Thorough evaluation of already trained classification models (better testing of biometric algorithms). Data augmentation for advanced model training has received massive attention in recent studies, e.g. in fingerprint classification [14] and face recognition [15]. GANs designed for face representation modeling are capable of building a robust individual face model from a single still image [16], [17]. In a trivial case, data augmentation can be performed by modest image deformations which do not destroy the initial structure of a biometric modality, or by identity-aware synthesis of new images.

3) CANCELABLE BIOMETRICS
Cancelable biometrics addresses the usage of synthetic samples in place of real biometrics in stored templates. Such a replacement makes a biometric authentication system resilient to a compromised reference data storage. If an individual synthetic template has been stolen, it can be easily replaced by another synthetic template. This use case was first mentioned in [1] and elaborated in [18] using the example of fingerprints and in [19] using the example of faces. Although this use case can be seen as an implication of the first use case (replacement of a real subject by a virtual one) with the same requirements on the synthetic samples, we single it out because cancelable biometrics targets security rather than privacy concerns. Note that, similar to the first use case, it is essential here that synthetic samples obfuscate individual characteristics, i.e. do not match real reference samples.

4) INVERSE BIOMETRICS
Inverse biometrics addresses reconstruction of biometric samples from biometric templates. Inverse biometrics itself is not directly a use case of synthetic biometric samples but rather a tool for creating those and a proxy to performing presentation attacks on biometric systems. It is also an important tool for identity-aware synthesis of virtual subjects. Note that a strong generative model applied for inversion of biometric templates is a powerful instrument in the hands of a perpetrator. The capability of reconstructing biometric images from ''irreversible'' templates may lead to identity theft in the case of a compromised biometric reference dataset. Moreover, modeling of multi-identity biometric images such as face morphing [20] by e.g. blending GAN embeddings and a subsequent image reconstruction [21] or direct image blending [22] is a serious threat to identity verification systems. Even simple face swapping in e.g. deepfakes [23] can be used as a means of mass consciousness manipulation.

C. GENERAL REQUIREMENTS ON SYNTHETIC SAMPLES
It is clear that different use cases imply different requirements on synthetic samples (see Fig. 2 for visualization). For the first use case the original identities must be disguised and several impressions of the same biometric entity are required. For the second use case the variability of samples must be increased while preserving the identity. The main objective in the third use case is concealing the original identity and stable generation of biometric substitutes. Nevertheless, general requirements on synthetic biometric samples can be formulated to assure the utility of samples in general and for a specific use case.
Inappropriate synthetic samples, no matter for which purpose they are used, lead to non-meaningful results. To the best of our knowledge, the first attempt to formulate requirements on biometric generative models was made in [5]. The authors propose three criteria: flexibility, parsimony and consistency. Flexibility means that the generative model has enough parameters to model data under study. Parsimony means that the generative model is as simple as the data allows but not simpler. Consistency means that the distribution of synthetic samples is sufficiently close to that of real samples. Out of the three proposed criteria, the first two are rather guidelines for model design, while consistency can be practically validated on synthesized images. The proposed metric is a goodness-of-fit criterion measured by a Chi-square or Kolmogorov-Smirnov test.
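As an illustration of the consistency criterion, the following minimal sketch computes the two-sample Kolmogorov-Smirnov statistic between a scalar feature measured on real samples and the same feature measured on synthetic samples. Which feature is compared (for example a comparison score or an image statistic) is an assumption left open here; the function names are hypothetical, and the critical value uses the standard large-sample approximation rather than any procedure taken from [5].

```python
import bisect
import math


def ks_statistic(real, synthetic):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(real), sorted(synthetic)
    d = 0.0
    for v in a + b:
        cdf_a = bisect.bisect_right(a, v) / len(a)
        cdf_b = bisect.bisect_right(b, v) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def ks_critical_value(n, m, alpha=0.05):
    """Large-sample approximation of the two-sample critical value."""
    c = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c * math.sqrt((n + m) / (n * m))


def is_consistent(real, synthetic, alpha=0.05):
    """True if the test does not reject 'same distribution' at level alpha."""
    d = ks_statistic(real, synthetic)
    return d <= ks_critical_value(len(real), len(synthetic), alpha)
```

A statistic below the critical value only means that the test found no evidence of a difference for that one feature; it does not by itself establish that the synthetic dataset is suitable for a given task.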
However, a commonly agreed methodology on how to practically assess the quality and utility of synthetic biometric samples still does not exist. In our previous study [24], we propose seven requirements on synthetic biometric samples and/or generative models using the example of synthetic fingerprint images:
• (R) Synthetic images should appear realistic which encompasses two aspects: real and synthetic samples cannot be told apart, neither by the naked eye nor by analyzing image statistics.
• (I) Synthetic images should be of sufficiently high resolution.
• (A) Synthetic images should preserve privacy, meaning that the virtual individuals cannot be linked to real individuals.
• (D) Synthetic images assembled into a dataset should be diverse enough to describe the broad variety of representations of a biometric modality caused by physical interactions with an environment.
• (C) The generative model should be capable of controllable generation of samples meaning that not only non-mated but also mated samples can be produced.
• (B) Synthetic images should reflect basic characteristics of the ground truth images.
• (E) In a synthetic dataset, the distributions of subject attributes such as gender, age or ethnicity should be controlled to avoid biases.
Although the requirements are formulated focusing on fingerprint images, we believe that they can be simply generalized for images of other biometric modalities and moreover for any kind of synthetic images generated by a neural generative model. We also believe that linking the proposed requirements to metrics specific for images under consideration can grow into an established evaluation methodology for synthetic data.
An evaluation methodology for synthetic data looking at realistic recognition accuracy values and the similarity of comparison score distributions to real data for the case of fingerprints is suggested in [25] and for the case of person re-identification data in [26].
Medical studies such as [27] have an alternative perspective on utility evaluation of synthetic data.

D. GENERATION METHODS
In general, there are three approaches for obtaining synthetic biometric samples: adaptation, synthesis and reconstruction from a biometric template.
In the case of adaptation, an existing biometric sample is modified to mimic certain acquisition conditions such as sensor characteristics, environmental variability, or physiology-based variations. For fingerprints, there is an adaptation tool called StirTrace [28]. Adaptation is a commonly used technique for data augmentation. An exhaustive review of face data augmentation approaches is presented in [15].
Synthesis aims at creating artificial samples from scratch according to a predefined sensor, physiological and environmental conditions making use of a pre-trained conditional generative model e.g. Glow [29] for faces or SFinGe [30] for fingerprints. Reconstruction from a biometric template aims at interpreting the essential identity-specific information from a biometric template and generating artificial samples by parameterizing a previously learned generic model towards the identity.

III. TRADITIONAL MODELING
The most prominent and recent studies on model-based synthesis of biometric images and reconstruction from biometric templates are listed in Table 1.

A. FINGERPRINTS
Historically, the first conclusive modeling efforts were made for fingerprints back in the first half of the 20th century [58].
A simple yet practical model for ridge pattern orientation based on singular points (cores and deltas) is proposed in [31] and applied in SFinGe [30] for algorithmic synthesis of realistic fingerprints. In [32], it is shown how ridge orientations can be modeled by fitting Legendre polynomials. Modeling of fingerprints in [30] goes far beyond modeling of master fingerprints, i.e. iterative application of Gabor filters tailored to local orientation and frequency. A crucial step is mimicking distortions to simulate traction and torsion forces applied to a finger during its placement on the sensor as well as rendering texture and adding noise. However, the applied texture and noise models are not well justified, so that texture which is statistically representative of real fingerprints cannot be reproduced. In [33], texture and noise are derived from real fingerprint images, enabling realistic textures in synthetic images. Simulation of a fingerprint generation process by a Petri net is proposed in [34] with the focus on mimicking skin diseases. The net depicts synthesis states and possible transitions between them. First, a master fingerprint is created, then environmental influences are added, then user and finger conditions, then fingerprint damages, and finally sensor characteristics. The study in [35] focuses on modeling contactless fingerprints. Synthetic samples are designed to reflect properties of the capturing process, subject characteristics, and environmental influences.
VOLUME 11, 2023
Inversion of fingerprint templates was first addressed by Hill back in 2001 [59]. However, the intensive research started in 2007 with the studies of Cappelli et al. [36] and Ross et al. [37]. In contrast to fingerprint drawing from scratch, the locations of singular points are derived from the given minutiae and further on the orientation map. Ridge frequencies are also adjusted to minutiae. Apart from the zero-pole model [31], ridge patterns can be estimated using the minutiae triplet model [37] or the AM-FM model [38], [39] which makes use of eight neighboring minutiae. Reconstruction of ridges based on patch dictionaries, proposed in [40], allows for idealistic ridge patterns.
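To make the role of singular points concrete, the sketch below evaluates a zero-pole orientation model in its commonly cited form: each core contributes half of the angle from the core to the pixel, and each delta contributes the negative of that. The function names, the coordinate convention and the constant offset `theta0` are assumptions of this sketch; refinements used in practical generators are not included.

```python
import cmath
import math


def zero_pole_orientation(x, y, cores, deltas, theta0=0.0):
    """Ridge orientation in radians, in [0, pi), at pixel (x, y).

    cores and deltas are lists of (x, y) singular point positions.
    """
    z = complex(x, y)
    total = 0.0
    for c in cores:
        total += cmath.phase(z - complex(*c))   # cores act as zeros
    for d in deltas:
        total -= cmath.phase(z - complex(*d))   # deltas act as poles
    return (theta0 + 0.5 * total) % math.pi


def orientation_field(width, height, cores, deltas, theta0=0.0):
    """Orientation map as a list of rows, one value per pixel."""
    return [
        [zero_pole_orientation(x, y, cores, deltas, theta0) for x in range(width)]
        for y in range(height)
    ]


# Hypothetical loop-like layout with one core and one delta.
field = orientation_field(32, 32, cores=[(16, 12)], deltas=[(10, 24)])
```

In template inversion, the singular point positions are not chosen freely as in this example but have to be estimated from the given minutiae.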

B. FACE
Modeling of faces has always been an intensively studied topic of computer vision. Indeed, face recognition is only one domain in a broad variety of applications making use of highly realistic synthetic faces. The first important mathematical concept allowing for modeling face attributes and preserving individual characteristics was proposed by Sirovich and Kirby [41] back in 1987. Later on, Turk and Pentland [42] called this concept ''Eigenfaces'' and demonstrated its application to face recognition. Eigenfaces is a domain-specific name for the eigenvectors that are an integral part of Principal Component Analysis (PCA). The main idea is that a face image is decomposed into a linear combination of Eigenfaces, and the coefficients of the Eigenfaces define the face appearance. Individual characteristics, environmental influences and other appearance variations are modeled by modifying one or several coefficients. The next step was made by Cootes [43], who split the face into texture and a graph of facial landmarks and applied PCA to texture and geometry separately. Image rendering is then done by texture warping. This move allows for better modeling of face poses. The breakthrough in face modeling was achieved by Blanz and Vetter [44] with the concept of a morphable 3D face model. Practically, for geometry representation, facial landmarks are replaced by a 3D head shape and PCA is used, as before, for modeling variations in pose, illumination, expression, etc. This model was actively used in most face modeling studies until the paradigm change in 2014 due to the success of DCNN in classification tasks and the introduction of GAN [60]. A head model suitable for deformation in anthropometrically meaningful ways using the underlying muscle and bone structure is proposed in [45]. With it, head growth from early childhood to adulthood can be simulated.
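The Eigenfaces idea of decomposing a face into a mean face plus a weighted sum of basis images can be illustrated on a toy example. The sketch below assumes that an orthonormal basis has already been obtained by PCA; the 4-pixel "images", the basis vectors and their interpretation are made up for illustration, and the PCA step itself is not shown.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def project(face, mean, eigenfaces):
    """Coefficients of a flattened face image in the Eigenface basis."""
    centered = [p - m for p, m in zip(face, mean)]
    return [dot(centered, e) for e in eigenfaces]


def reconstruct(coeffs, mean, eigenfaces):
    """Mean face plus the weighted sum of Eigenfaces."""
    face = list(mean)
    for c, e in zip(coeffs, eigenfaces):
        face = [p + c * q for p, q in zip(face, e)]
    return face


# Hypothetical orthonormal basis for 2x2 images stored row by row.
mean_face = [0.5, 0.5, 0.5, 0.5]
eigenfaces = [
    [0.5, 0.5, -0.5, -0.5],  # top-versus-bottom brightness
    [0.5, -0.5, 0.5, -0.5],  # left-versus-right brightness
]

face = [0.9, 0.7, 0.3, 0.1]
coeffs = project(face, mean_face, eigenfaces)
restored = reconstruct(coeffs, mean_face, eigenfaces)

# Flipping the sign of one coefficient changes only that appearance factor.
edited = reconstruct([coeffs[0], -coeffs[1]], mean_face, eigenfaces)
```

With real data the basis would have many more components, and reconstruction from a truncated basis would only approximate the input image.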
In fact, complex 3D models can be perfectly combined with facial landmark localization and texture warping to rotate or frontalize faces in 2D images [61] which in turn greatly improves the performance of face recognition.
We intentionally omit ''face reconstruction from a template'' because of an absence of commonly agreed convention on ''face features'' so that the standard face template is a high-quality frontal face image. Hence, face reconstruction makes sense only for deep-face features addressed in Section IV-B.

C. IRIS
Modeling of iris images is a relatively young research field having its roots in the early 2000s. The ocularist's approach to iris synthesis [46] overlays semi-transparent texture layers built from topological and optic models. The layers are designed to mimic stroma, collarette, limbus, pupil and sphincter muscle components. It has been stated that such fake iris patterns printed onto vanity contact lenses pose a threat to iris identification systems. Synthesis of iris images based on PCA and super-resolution is proposed in [47]. In [48], iris is modeled by Markov Random Fields (MRF) applied to stitching of texture primitives representing radial furrows, crypts and limbus. The study in [49] is a further development of [48] in which MRF modeling is applied to background texture, and radial furrows, concentric furrows, collarette and crypt are synthesized as texture patches and embedded into the texture. An anatomy-based method for synthesizing iris images with the focus on compiling a large database of synthetic irises is proposed in [50]. The authors put emphasis on ''realism'' stating that a synthetic iris should not only look like a real iris but also have statistical characteristics of a real iris.
Iris reconstruction from binary IrisCode relies either on a deterministic approach based on Gabor filtering [51] or on a probabilistic approach based on genetic algorithms [52].

D. VASCULAR BIOMETRICS
A review of several synthesis techniques for hand-related vascular biometric samples is provided in [62]; it is not restricted to palm vein data as suggested by its title. The first work in this field is more recent than for other modalities: [53] proposes a method for creating hand vein sample data, employing (randomly positioned) ''key points'' which are interconnected by a random process and dilated to achieve vessel appearance. Texture is generated using a contrast variation model. A similar methodology for finger veins is introduced in [54], based on setting slightly disturbed vein nodes which are interconnected by randomly steered vessel growth; NIR variability is also modeled to obtain a realistic appearance.
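The key-point idea can be sketched in a few lines. The code below is not the method of [53] or [54]: the random process that connects the points is not specified in this survey, so a random tree with straight segments is assumed here, and texture generation is omitted. All function names and parameters are hypothetical.

```python
import random


def draw_line(img, p, q):
    """Rasterize a straight segment between two (row, col) points."""
    (r0, c0), (r1, c1) = p, q
    steps = max(abs(r1 - r0), abs(c1 - c0), 1)
    for i in range(steps + 1):
        r = round(r0 + (r1 - r0) * i / steps)
        c = round(c0 + (c1 - c0) * i / steps)
        img[r][c] = 1


def dilate(img, radius=1):
    """Binary dilation with a square structuring element."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if img[r][c]:
                for rr in range(max(0, r - radius), min(h, r + radius + 1)):
                    for cc in range(max(0, c - radius), min(w, c + radius + 1)):
                        out[rr][cc] = 1
    return out


def synth_vein_mask(height=64, width=64, n_points=8, radius=1, seed=0):
    """Binary vessel mask from randomly placed, randomly connected key points."""
    rng = random.Random(seed)
    points = [(rng.randrange(height), rng.randrange(width)) for _ in range(n_points)]
    img = [[0] * width for _ in range(height)]
    # Connect each key point to a randomly chosen earlier one (a random tree).
    for i in range(1, n_points):
        draw_line(img, points[i], points[rng.randrange(i)])
    return dilate(img, radius)


mask = synth_vein_mask()
```

The fixed seed makes the generated mask reproducible, which is convenient when several impressions of the same virtual vessel pattern are needed.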
Eye-related traits have been considered in the context of creating synthetic data, in particular retina recognition, as fundus imagery is an important diagnosis element in ophthalmology as well. In [55], a patch-based approach is used to create retinal backgrounds and foveae; model-based texture synthesis techniques have also been employed for the generation of realistic optic discs and vessel networks. The approach has been refined in [56], where the Active Shape model and Kalman filtering combined with an extension of the Multiresolution Hermite vascular cross-section model facilitate realistic vessel network modeling.
Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
Finally, sclera data has also been subject to synthesis attempts: [57] proposes to apply an earlier published non-parametric texture synthesis technique which relies on systematically selecting random patches from ''primitive images'' (i.e. seed or source images) to produce similar texture, using a part of the UBIRISv1 dataset as primitive images.

IV. DATA-DRIVEN SYNTHESIS
Neural generative models have been applied in biometrics for many purposes. The most prominent are biometric image completion [95], image quality enhancement [96], image style transfer [97], generation of random realistic biometric samples [78] and conditional generation for identity-aware image reconstruction or retaining/obfuscation of soft biometric attributes [98]. The majority of neural generative models are based on GAN. Since face as biometric modality has always been a pioneering one in computer vision research, the majority of feasibility studies on GAN-based synthesis deal with face images, but the developed GAN architectures can be applied to other biometric images as well. GAN models allow for creating realistic patterns and transferring domain characteristics from an existing dataset. These two aspects are very challenging for model-based generation and are natively supported in data-driven generation. The most prominent and recent studies on data-driven synthesis/reconstruction of biometric images are listed in Table 2.
The key feature of data-driven synthesis is that no knowledge about image semantics is required. However, the prerequisite for training of a generative model is a large set of diverse real biometric samples. At first glance, this looks like a logical contradiction: if real biometric samples are already available, why do we use them for training and not for the initial task, e.g. evaluation of a biometric system? The first reason is privacy protection. Indeed, real biometric data can hardly be used in open research. Second, a synthetic dataset can be made of arbitrary size, enabling training of DCNN from scratch.

A. FINGERPRINTS
In [63], a Wasserstein GAN is applied for generating master fingerprints that match multiple original fingerprints. The Finger-GAN proposed in [64] is a connectivity-imposed fingerprint generation GAN that is applied to simulate images from two fingerprint datasets: FVC2006 and PolyU. The SynFi approach in [65] enables generation of high-resolution realistic fingerprints by combining a GAN with a super-resolution network. In [66], a combination of a convolutional autoencoder and an improved Wasserstein GAN is used for synthesizing 512 × 512 pixel rolled and plain fingerprints. In [67], the feasibility of three established NVIDIA GAN architectures (progressive growing GAN, StyleGAN and StyleGAN2) is validated for realistic fingerprint pattern generation. The Clarkson Fingerprint Generator (CFG) [68] is trained using StyleGAN and a proprietary dataset captured with a Crossmatch Guardian scanner. In [69], a CycleGAN is applied for texture transfer from real fingerprints to Anguli-generated fingerprints with added sweat pores. Then a super-resolution network increases the image size. PrintsGAN is proposed in [70] for fingerprint synthesis from a DeepPrint representation with control over the fingerprint identity.
In fact, there are two ways of fingerprint synthesis with a pre-defined identity. The first combines model-based reconstruction from minutiae with subsequent style transfer (e.g. CycleGAN) to add realism and transfer domain characteristics, as in [69]. The second way is to use a conditional GAN with identity as a condition. If identity is given by a deep representation, as in [70], the identity-aware synthesis is straightforward. If identity is not explicitly presented or is given by minutiae, an additional encoding step is required.
Fingerprint reconstruction from a minutiae template is addressed in [71] using a conditional GAN (pix2pix) originally proposed in [72]. In [73], a convolutional minutiae-to-vector encoder is used in combination with StyleGAN2 for identity-preserving, attribute-aware fingerprint reconstruction from minutiae. In [12], fingerprint reconstructions from deep network embeddings and from minutiae are compared qualitatively and quantitatively regarding the inversion attack performance.

B. FACE
Modeling of a face as a vector in the latent space of a generator network is a current research trend. By moving in the latent space of the generator, both person identity and facial attributes can be modified on demand. Sophisticated GAN models allow for face learning from a single still image and therefore unconstrained face recognition [17]. The most successful face modeling networks are Glow [29] from OpenAI and StarGAN [75] from CLOVA AI Research. However, the best visual quality is achieved by face images randomly produced by NVIDIA GAN models: progressive growing GAN [77], StyleGAN [78], StyleGAN2 [79], StyleGAN2-ADA [80] and StyleGAN3 [81]. The majority of recent face modeling approaches build on the NVIDIA architectures, trying to extend them by a conditional generation mechanism. This can be done either by statistical analysis of the latent space [99], [100], or by semi-supervised disentanglement learning [101], or by detecting ''StyleSpace channels'' for attribute control [102].
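A move in latent space can be illustrated with plain vector arithmetic. The sketch below assumes one simple way of obtaining an attribute direction, namely the normalized difference between the mean latent code of samples with an attribute and the mean latent code of samples without it. This is an illustrative assumption and not necessarily the procedure of [99] or [100]; the generator that maps latent codes to images is not included, and the 3-dimensional codes are toy values.

```python
def mean_vector(vectors):
    """Component-wise mean of a list of equally long vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def attribute_direction(with_attr, without_attr):
    """Unit vector pointing from the 'without' cluster to the 'with' cluster."""
    diff = [a - b for a, b in zip(mean_vector(with_attr), mean_vector(without_attr))]
    norm = sum(d * d for d in diff) ** 0.5
    return [d / norm for d in diff]


def move_latent(w, direction, alpha):
    """Shift a latent code by alpha along a direction."""
    return [wi + alpha * di for wi, di in zip(w, direction)]


# Toy latent codes; real generator latents have hundreds of dimensions.
codes_with = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]]
codes_without = [[-1.0, 0.0, 2.0], [-3.0, 0.0, 2.0]]

direction = attribute_direction(codes_with, codes_without)
edited_code = move_latent([0.5, 0.5, 0.5], direction, alpha=2.0)
```

In practice such a direction is rarely perfectly disentangled, so a large step along it may change the identity as well as the targeted attribute.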
The problem of reconstructing face images from deep embeddings goes far beyond the field of inverse biometrics. The question is whether an image encoded by one deep   [82], [83], and [84]. All approaches mentioned reconstruct face images from FaceNet embeddings.

C. IRIS
There are only a few studies on GAN-based iris synthesis. Iris-GAN is proposed in [85] to generate irises from random variables using a deep convolutional GAN. The synthetic irises lack realism and mostly suffer from unrealistic patterns around the iris boundary. In [86], a conditional GAN (pix2pix) is applied to iris synthesis aiming at data augmentation to improve the accuracy of iris recognition. In [87], the Relativistic Average Standard Generative Adversarial Network (RaSGAN) is applied to synthesize high-quality images that mimic iris presentation attacks and hence to support presentation attack detection (PAD) against unseen attacks. Aiming at compensating for under-represented iris presentation attacks in the training set, the Cyclic Image Translation Generative Adversarial Network (CIT-GAN) is proposed in [88] to enable domain style transfer and improve PAD accuracy.
GAN-based iris reconstruction from three types of iris templates, referred to as RESIST, is introduced in [90]. Apart from a traditional Gabor filter template, two deep templates are addressed: DenseNet embeddings with and without normalization. The RESIST architecture is a modified U-Net (pix2pix) extended by L1 perception loss and SSIM loss.
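A generator objective that combines an adversarial term with L1 and SSIM terms might be written as follows. This is only a sketch of the general form such a combination usually takes; the exact formulation in [90] is not given here. The symbols are assumptions: t is the input template, x the target iris image, G the generator, and the weights \lambda_{1} and \lambda_{\mathrm{SSIM}} are hypothetical hyperparameters. The L1 term is written on pixels, although it may equally be computed on feature maps.

```latex
\mathcal{L}(G) = \mathcal{L}_{\mathrm{adv}}(G)
  + \lambda_{1} \, \lVert x - G(t) \rVert_{1}
  + \lambda_{\mathrm{SSIM}} \, \bigl( 1 - \mathrm{SSIM}\bigl(x, G(t)\bigr) \bigr)
```

Since SSIM equals 1 for identical images, the last term vanishes for a perfect reconstruction and grows as structural similarity decreases.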
For both iris recognition and periocular recognition, GAN-based synthetic eye images that preserve gaze are created using a segmentation mask as input, as described in [103].

D. VASCULAR TRAITS
In [91], GAN-based data augmentation is proposed for enhanced training in CNN-based finger vein recognition, where the suggested FCGAN approach outperforms other GAN architectures. Here, classically augmented finger vein sample images are used as training data. In contrast to this approach, [92] generates the vascular network using a classical approach, while generating the texture for the image using a GAN approach. Similarly, a nature-inspired algorithm is used in [93] to form the vessel pattern, and StyleGAN2 is used for texture generation.
A good overview of GAN-based methods for generating synthetic retina (fundus) images, including an analysis of their weaknesses, is contained in [112]. Among other techniques, [113] uses a multiple-channels-multiple-landmarks (MCML) approach in which the vessel structure is combined with optic disc and optic cup images, which are then fed into pix2pix or CycleGAN architectures to generate realistic texture. A multitude of different CNN architectures is used and compared overall. A related approach is taken in [112], where a VAE is employed to create vessel trees, which are then used by a GAN architecture to create full fundus images. After training, this results in an end-to-end system. In [114], a pix2pixHD GAN is used to create synthetic retinal images that ophthalmologists were not able to distinguish from real data.

V. DATASETS AND CHALLENGES WITH SYNTHETIC DATA

A. DATASETS
Due to recent data protection regulations, many biometric datasets were removed from the public domain. For instance, NIST has taken down the forensic fingerprint datasets NIST SD4, NIST SD14 and NIST SD27, and Microsoft has taken down the MS-Celeb-1M database of 10 million faces scraped from the internet. This tendency explains the growing interest in synthetic biometric datasets that can replace the real ones. There are a number of studies on the compilation of large synthetic datasets for almost all biometric modalities. Approaches for fingerprints are presented in [66], [67], and [68], for faces in [99], [100], and [115], and for irises in [50]. However, the number of publicly available datasets is quite limited. Table 3 summarizes public synthetic datasets.

B. CHALLENGES
After it has been established that synthetic faces can be successfully applied in face recognition and even completely substitute for real faces [100], [107], [116], [117], the next step is organizing large-scale biometric campaigns in which recognition models are trained solely from synthetic samples addressing the whole variety of biometric modalities. To the best of our knowledge, the only such competition related to biometrics and focusing on face morphing attack detection took place within the International Joint Conference on Biometrics 2022. 1

VI. CONCLUSION
The abundance of recent studies on generation of synthetic biometric samples suggests that interest in experimenting with synthetic data is currently very high and still growing. There is a belief that by 2024, 60% of the data used for the development of AI-based solutions will be synthetically generated, 2 and by 2030, synthetic data will completely overshadow real data in AI models. 3 Although there is still controversy on whether synthetic datasets can replace real biometric datasets, the number of published neural generative approaches for biometric images is increasing. While face biometrics still enjoys a sufficient number of publicly available datasets, many fingerprint datasets have been removed from the public domain due to privacy concerns. Synthetic biometric samples have versatile uses ranging from augmentation of training and test data to enabling privacy-friendly evaluation of biometric algorithms and sharing data for open research. Hence, many researchers seek publicly available synthetic biometric datasets. Synthetic biometrics also have a close relation to several security-related domains such as inverse biometrics, cancelable biometrics and privacy-enhancing technologies, implying an improvement of security if synthetic samples are involved. Utility evaluation of synthetic biometric samples is the major challenge now. We believe that linking the requirements proposed