Cross-spectrum Thermal Face Pattern Generator

Conversion of a visible face image into a thermal face image (V2T), or of one thermal face image into another given a different target temperature (T2T), is required in applications such as thermography, human body thermal pattern analysis, and surveillance using cross-spectral imaging. In this work, we propose to use conditional generative adversarial networks (cGAN) with cGAN loss, perceptual loss and temperature loss to solve the conversion tasks. In our experiments, we used the Carl and SpeakingFaces Databases. The Fréchet Inception Distance (FID) is used to evaluate the generated images. In addition, face recognition was applied to assess the performance of our models. For the V2T task, the FID of the generated thermal images reached a quite low value of 57.3. For the T2T task, we achieved a rank-1 face recognition rate of 91.0%, which indicates that the generated thermal images preserve the majority of the identity information.


I. INTRODUCTION
Both visible and thermal spectra provide useful biometric information on human subjects. Most biometric tasks, such as face detection and recognition, focus on the visible spectrum. Thermal cameras, unlike visible-spectrum ones, allow for capturing low-light scenes. However, in most cases, thermal images cannot be used for face recognition, given that legacy face databases contain only visible-spectrum images. In thermography for healthcare applications, the opposite image-to-image translation may be needed. This paper focuses on two tasks. The first is dedicated to answering the question: how to convert a face image taken in the visible spectrum into a face image taken in the thermal spectrum? The second task poses the question: how to convert a thermal face image taken at a certain measured body temperature into a thermal image corresponding to a different temperature?
In this paper, we set up a deep learning model to solve these two image-to-image translation tasks. We use a conditional generative adversarial network (cGAN) [1], conditioned on input visible or thermal images, to generate the output thermal images. Our cGAN consists of a 'U-Net' generator (G) and a PatchGAN discriminator (D). We use the Carl Database [2] and the SpeakingFaces Database [3], containing paired visible and thermal images, to train our model. For converting a visible face image into a thermal face image (V2T), we use the SpeakingFaces Database, which contains aligned visible and thermal face images, and train the model with a cGAN and Mean Absolute Error (MAE) loss. For the second task, converting a thermal face image into another thermal face image with a target temperature (T2T), we modify our model to include temperature information. We use the Carl Database, containing thermal faces at different temperatures, and train the model with cGAN, perceptual [4] and temperature losses. We extend our previous work [5] by applying the Fréchet Inception Distance (FID) [6] and face recognition techniques to measure the performance of the proposed cGAN. Applying face recognition techniques to the resulting generated thermal images shows an acceptable performance.
The main contributions of our work are listed below:
1) We propose to use cGAN to solve two cross-spectrum image-to-image translation tasks, V2T and T2T. To solve these tasks, we modified the structure of cGAN into cGAN_V2T and cGAN_T2T. To the best of our knowledge, it is the first time that the T2T task is approached this way.
2) Two different databases are used to train cGAN_V2T and cGAN_T2T, with different loss function combinations.
3) We evaluate the performance of our two cGAN models using FID and face recognition techniques.
The paper is organized as follows: Section II provides a literature review related to GANs and image-to-image translation tasks. Section III describes the architecture of the proposed cGAN. Section IV provides information on the databases used for training and describes the pre-processing steps applied to the images. Section V summarizes the results of the proposed cGAN application. Section VI concludes the paper.

II. RELATED WORK

A. GENERATIVE ADVERSARIAL NETWORKS
GANs were proposed in 2014 [7], followed by multiple successful applications in various fields [8]. A GAN is comprised of a generator (G) and a discriminator (D). D is trained on real samples from the collected database and fake samples generated by G, and is responsible for classifying samples as real or fake. G aims to generate fake samples that are realistic enough to fool D. The adversarial process is formulated as a minimax game [7]:

min_G max_D V(D, G) = E_{x~p_data}[log D(x)] + E_{z~p_z}[log(1 - D(G(z)))]   (1)

where p_data denotes the distribution of the real samples x, p_z represents the distribution of the noise input z, and E denotes the expectation of the bracketed expression. By training G and D together, they compete with each other and eventually reach an equilibrium where G implicitly learns the distribution of the collected samples.
However, GANs suffer from adversarial training instability and the mode collapse problem [9], [10]. An improved architecture called DCGAN was proposed in [9] to resolve the training instability problem by using CNNs and batch normalization [11]. More recently, Wasserstein GAN [12], [13] and LSGAN [14] were proposed to improve adversarial training stability and alleviate the mode collapse problem.
In our work, we adopt cGAN [1], with G aiming to generate thermal face images, conditioned on input visible or thermal images. Many image-to-image translation tasks, such as converting semantic labels to real city photos or converting architectural labels to real structural images, were approached using the cGAN to generate target images [15]. For the V2T task considered in our paper, the condition is the input visible image, and the proposed cGAN is built to generate a corresponding thermal image. This task is similar to the ones described in [16] and [17]. K. Lai et al. [16] used cycleGAN to convert images between the visible and thermal domains. K. Landry et al. [17] used a GAN to convert images from the thermal spectrum into the visible. In our experiment, we focus on converting images from the visible spectrum into the thermal. For the T2T task, the condition includes both the input thermal image and a target temperature, and the proposed cGAN is constructed to generate a thermal image given a target temperature.

B. FACIAL ATTRIBUTE EDITING
The facial attribute editing task aims to manipulate one or more attributes of a given face image while preserving its identity details [18]. In contrast, our approach to T2T aims to manipulate the temperature attribute of a given thermal face. Many methods have been proposed recently to solve the facial attribute editing problem, and most of them have only been applied to visible face images. Li et al. [19] trained a deep learning-based identity-preserved attribute transfer model to add or remove a single attribute to or from a face. To do so, they incorporated an adversarial attribute loss and an identity-preserved loss. Shen and Liu [20] adopted the dual residual learning method to simultaneously train two models by adding or removing a specific attribute. StarGAN [21] performed the facial attribute editing task based on attribute labels by employing an attribute classification loss [22] and a cycle consistency loss [23], and can perform multi-attribute editing simultaneously with a single model.
Wang et al. [24] proposed IPCGAN, with a cGAN module, an identity-preserved module and an age classification module, to address the face aging problem by editing the age attribute of a given face. Unlike this approach, in our paper the facial attribute editing is applied in the thermal spectrum, i.e., manipulating the temperature attribute of thermal face images. In our T2T task, we use thermal images from the Carl Database [2] to train our cGAN, with a combination of cGAN, perceptual and temperature losses.

III. PROPOSED METHOD
In our work, we use cGAN to solve both image-to-image translation tasks. Our implementation of cGAN consists of a generator (G) and a discriminator (D). The minor differences in the structures of G and D for V2T and T2T will be described in detail in Subsection III-A. For V2T conversion, the input of G_V2T is a visible image, and the expected output is a thermal image of the same identity. The input of D_V2T includes both a visible image and a thermal image, and the output indicates whether the input thermal image is the real paired image of the input visible image. For T2T conversion, the input of G_T2T is a thermal image and a target temperature, and the output is another image with a thermal pattern corresponding to the target temperature while preserving the identity. The input of D_T2T is a thermal image, and the output indicates whether the input thermal image is real or generated. By training each G and D together, they should reach an equilibrium such that G generates a thermal image that is realistic enough to fool D.

FIGURE 2: cGAN network for T2T conversion. G_T2T converts a thermal image into another thermal image with a target temperature; one more channel is added to represent the temperature information in the input unit. The temperature predictor is pre-trained to provide the temperature loss. VGG provides the perceptual loss. D_T2T is a 6-layer PatchGAN that determines whether the input thermal image is real or generated.

A. PROPOSED FRAMEWORKS
The following is an overview of the cGAN used, similar to the cGAN described in [15], with a discussion of the necessary changes. We use 'U-Net' [25] as our generator (G) and a 6-layer PatchGAN as our discriminator (D) for both conversions. Both consist of convolution-BatchNorm-LeakyReLU [9] units. Fig. 1 shows the model details of G_V2T. Similar to our previous design [26], G_V2T has an input unit, 7 encoding units, a bottleneck unit, 7 decoding units, and an output unit. Each encoding unit down-samples the previous unit by a factor of 4 (1/2 of length and 1/2 of width) with strides = 2, and each decoding unit up-samples the previous unit by a factor of 4 (2× of length and 2× of width). The i-th decoding unit is stacked with the i-th-from-last encoding unit in the channel dimension before applying the LeakyReLU function. The filter size is set to 4×4 for all units. For the encoding units, the filter number starts at 64 and doubles for each subsequent unit until it reaches 512; after that, it remains at 512. The filter number of each decoding unit is the same as that of the encoding unit connected to it. For the bottleneck, the filter number is set to 512, and the activation function is ReLU. For the output unit, the filter number is set to 1, and the activation function is Sigmoid. Fig. 2 shows the PatchGAN architecture of D_T2T. It is a 6-layer CNN, with the number of filters set to 64, 128, 256, 512, 512, and 1, respectively. For the first 4 layers, the stride is set to 2, and for the last 2 layers, the stride is set to 1. The output is a 16×16 matrix, with each value mapping a 70×70 receptive field in the input images as ground-truth or generated.
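The encoder filter schedule described above (starting at 64, doubling per unit, saturating at 512, with each decoder unit mirroring its skip-connected encoder unit) can be sketched as:

```python
def unet_filter_schedule(n_units=7, base=64, cap=512):
    """Filter counts for the U-Net encoding units: start at `base`,
    double per unit, and saturate at `cap` (64, 128, 256, 512, 512, ...).
    Decoder units mirror the encoder units they are skip-connected to."""
    enc = [min(base * 2 ** i, cap) for i in range(n_units)]
    dec = list(reversed(enc))
    return enc, dec

enc, dec = unet_filter_schedule()
# enc -> [64, 128, 256, 512, 512, 512, 512]
```

This is an illustrative helper, not code from the paper; it simply makes the doubling-and-saturation rule explicit.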
The input layers of G_T2T and D_T2T are slightly different from those of G_V2T and D_V2T. For G_T2T, we add one more channel to incorporate the target temperature information, so that the input layer has 4 channels, while for G_V2T it has 3 channels. For D_T2T, we only input the thermal images into the network, so that the input unit has 3 channels, while for D_V2T it has 6 channels, with concatenated visible and thermal images as input.

VOLUME -, 2020
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2022.3144308, IEEE Access

B. OBJECTIVE FUNCTION FOR V2T
The objective function of cGAN_V2T is defined as follows:

L_D_V2T = -E_{x,y}[log D_V2T(x, y)] - E_x[log(1 - D_V2T(x, G_V2T(x)))],
L_G_V2T = -E_x[log D_V2T(x, G_V2T(x))]   (2)

where x and y are paired visible-thermal images. G_V2T tries to minimize L_G_V2T, and D_V2T aims to minimize L_D_V2T. Additionally, we appended the Mean Absolute Error (MAE) loss to help the generator converge faster, and also to preserve the identity information of the ground-truth thermal images at the pixel level. The MAE loss is calculated by averaging the absolute differences between pixel values at the same coordinates:

L_MAE = (1/V) Σ_{i=1}^{V} |y_i - G_V2T(x)_i|   (3)

where V stands for the number of pixels in the input or output images. Combining both losses, our final loss function can be expressed as follows:

L_final = L_G_V2T + α L_MAE   (4)

where α indicates the weight of L_MAE with respect to L_G_V2T. Here, we set α = 100, determined by the best visual effect of the generated thermal images on the training set.
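As an illustrative NumPy sketch of the combined V2T generator objective (with `l_cgan` standing in for the adversarial term, which in practice comes from the discriminator):

```python
import numpy as np

def mae_loss(y_true, y_pred):
    """Pixel-wise Mean Absolute Error between paired thermal images."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))

def v2t_generator_loss(l_cgan, y_true, y_pred, alpha=100.0):
    """Combined generator objective: adversarial term plus alpha * MAE,
    with alpha = 100 as chosen in the paper."""
    return l_cgan + alpha * mae_loss(y_true, y_pred)
```

With alpha = 100 the pixel term dominates early training, which is what pushes the generator to converge quickly toward the ground-truth thermal layout.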

C. OBJECTIVE FUNCTION FOR T2T
The objective function of the cGAN_T2T conversion can be written as follows:

L_D_T2T = -E_y[log D_T2T(y)] - E_{x,t}[log(1 - D_T2T(G_T2T(x, t)))],
L_G_T2T = -E_{x,t}[log D_T2T(G_T2T(x, t))]   (5)

where x is the input thermal image and t is the target temperature. Additionally, we use the perceptual loss L_prp as proposed by Johnson et al. [4]. It is formed using the features extracted from selected layers of the pre-trained VGG network [27]:

L_prp = (1/N) Σ_{i=1}^{N} (1/V_i) ||F^(i)(x) - F^(i)(G_T2T(x, t))||^2   (6)

where F^(i) represents the i-th selected layer with V_i activations of the VGG network, and N is the number of selected layers in the VGG model. In our work, we selected 4 layers of the VGG network as F to calculate L_prp. With L_prp, we preserve the similarity of the high-level features of the input and output thermal images.
In addition, we use a temperature prediction loss to enforce that the generated thermal image corresponds to the target temperature. To obtain the thermal pattern corresponding to the target temperature, we pre-train a temperature predictor. We label each thermal image with its temperature value, based on the temperature information provided by [2]. Our temperature predictor has the same architecture as AlexNet [28], except that the last fully-connected layer has only one unit with tanh activation. The temperature loss is defined as follows:

L_temp = σ(TP(G_T2T(x, t)), t)   (7)

where σ(·) corresponds to an MAE loss and TP(·) is our trained temperature predictor. During backpropagation, L_temp guides the parameters of G_T2T to change and generate thermal faces that represent the target temperature. With all losses, our final loss function can be expressed as follows:

L_final = L_G_T2T + α L_prp + β L_temp   (8)

where α and β are set to 100 and 500, determined by the criterion of the best visual effect of the generated thermal images on the training set.
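A minimal sketch of the temperature loss, with a hypothetical stand-in for the trained AlexNet-based predictor (here it simply reads the mean pixel intensity as a proxy temperature, both values assumed scaled to [-1, 1]):

```python
import numpy as np

def temperature_loss(generated, target_temp, predictor):
    """L_temp = MAE between the predictor's temperature estimate for the
    generated image and the target temperature (scalar case)."""
    return float(np.abs(predictor(generated) - target_temp))

# Hypothetical stub predictor: mean pixel intensity as proxy temperature.
# The real predictor is a trained AlexNet variant with a tanh output.
stub_predictor = lambda img: float(np.mean(img))

loss = temperature_loss(np.full((4, 4), 0.3), 0.5, stub_predictor)
# loss ≈ 0.2 (stub predictor reports 0.3, target is 0.5)
```

During training, gradients of this term flow back through the frozen predictor into the generator, steering the generated pattern toward the target temperature.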
The reason we use L_prp rather than L_MAE for the T2T task is that increasing or decreasing the temperature of the input thermal face image inevitably changes its pixel values, causing a large L_MAE. To some extent, L_MAE and L_temp are contradictory in the T2T task. However, one of our goals for the T2T task is to keep the subject identity information, so we use L_prp to preserve the high-level features of the input thermal faces.

IV. EXPERIMENT SETUP

A. DATABASES
In this paper, we use two databases. The first is the Carl Database [2], in which visible and single-channel grayscale thermal images, containing one human face per image, were collected simultaneously using a TESTO 880-3 thermal camera. The database contains 41 subjects: 32 males and 9 females. For each of them, four image recording sessions were performed within two months, each under three different lighting settings (natural, infrared and artificial light), with five sequential image acquisitions for each setting. This yields 41 × 4 × 3 × 5 = 2,460 visible-thermal image pairs. However, the visible-thermal image pairs are not aligned and, thus, cannot be used to train the cGAN_V2T directly. We provide details of our alignment process in Section IV-B. In addition, this database contains the temperature matrix corresponding to the thermal images, which makes it possible to generate a thermal image with a different target temperature. In our work, to keep it consistent with the other database, we used the Testo software [29] to convert the raw data into thermal images using an 'iron' palette. We used these images, instead of the provided grayscale thermal images, to train and test the proposed model.
The second dataset is the SpeakingFaces Database [3], which provides many more well-aligned visible-thermal face image pairs compared to the Carl Database. In the SpeakingFaces Database, subjects were required to read sentences, and their voice, thermal faces and visible faces were recorded simultaneously by microphones, a FLIR T540 thermal camera, and a Logitech C920 Pro HD web-camera. The database contains 142 subjects. In our work, images of subjects 1-21 are not used due to the incompleteness of these samples. For each subject, the image acquisition was conducted in two trials, each involving 8,100 frames. To record the images, the cameras were placed in 9 different positions in order to acquire the face images from 9 different angles, with 900 frames each. In our work, only the face images taken from the frontal angle of the SpeakingFaces Database are used. Due to the high volume of the database and the high similarity between consecutively taken images, we sampled only 5% of the frontal face images. Altogether, we use 122 × 2 × 900 / 20 = 10,980 thermal-visible image pairs. The thermal and visible images are aligned. However, unlike the Carl Database, the temperature information of the thermal images is not available in this database.

B. ALIGNMENT, FACE EXTRACTION AND RESIZING
The following is an overview of our approach to pre-processing the two databases. The pre-processing steps are the same as in our previous work [5]. We use a resolution of 256×256 for the face images used to train our two cGAN models. In the Carl Database, the visible and thermal images were not aligned and, thus, could not be used for training or testing directly. We used the facial landmarks annotated manually by Alperen [30] and the coordinate mapping to extract the faces from the visible and thermal images. We then resized the extracted faces to a resolution of 256×256. Fig. 3 shows an example of an original visible-thermal face image pair, and the same pair after alignment and resizing. Based on the 6 facial-landmark-position pairs (blue points) provided by Alperen, we learn the coordinate mapping between the visible image and the thermal image. Details of the coordinate mapping are described in the next paragraph. Next, we applied the pre-trained face detector tool dlib [31] to extract the face from the visible image and expand it by a factor of 1.3, in order for the whole face (solid green box) to be extracted. The learned coordinate mapping is used to map the solid green box from the visible image into the thermal image; this results in extracting the thermal face (dashed green box). Finally, we resize the two extracted faces to a resolution of 256×256 and obtain the aligned visible-thermal face image pair.
Below we explain the coordinate mapping process. The Cartesian coordinates of the six points in the visible image are denoted {x_vn, y_vn}, 1 ≤ n ≤ 6. The same coordinates in the thermal image are denoted {x_tn, y_tn}, 1 ≤ n ≤ 6. We map the points in the visible images onto the points in the thermal images through the linear transformation described below:

x_tn = a_x × x_vn + b_x,   y_tn = a_y × y_vn + b_y

where a_x, b_x, a_y, b_y are the coefficients controlling the linear mapping. We calculate the coefficients by minimizing the squared error:

a_x, b_x = arg min_{a_x, b_x} Σ_{n=1}^{6} (a_x × x_vn + b_x - x_tn)^2   (9)
a_y, b_y = arg min_{a_y, b_y} Σ_{n=1}^{6} (a_y × y_vn + b_y - y_tn)^2   (10)

Since the visible and thermal images in the SpeakingFaces Database are already aligned, we directly use dlib to extract the face from the visible images and then expand it by a factor. We use the same coordinates to extract the thermal face. We then resize the extracted faces to a resolution of 256×256 and form the training set for the two cGAN models.
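The per-axis least-squares fit above has a standard closed-form solution; a small NumPy sketch, with illustrative landmark coordinates (not the actual annotations), is:

```python
import numpy as np

def fit_axis_mapping(v, t):
    """Fit a, b minimizing sum_n (a * v_n + b - t_n)^2 for one axis
    (x or y) of the landmark pairs, via linear least squares."""
    v = np.asarray(v, dtype=float)
    t = np.asarray(t, dtype=float)
    A = np.stack([v, np.ones_like(v)], axis=1)  # design matrix [v_n, 1]
    (a, b), *_ = np.linalg.lstsq(A, t, rcond=None)
    return float(a), float(b)

# Example: thermal coords are a pure scale + shift of the visible coords.
a_x, b_x = fit_axis_mapping([10, 20, 30, 40, 50, 60],
                            [7, 12, 17, 22, 27, 32])
# a_x ≈ 0.5, b_x ≈ 2.0
```

The same call with the y-coordinates yields a_y and b_y; the two fitted pairs together define the visible-to-thermal coordinate mapping.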

C. TRAINING AND TESTING FOR V2T
To train our V2T model, we use the visible-thermal image pairs of subjects 21-142 from the SpeakingFaces Database. We use images collected from subjects 41-142, altogether 102 × 2 × 45 = 9,180 image pairs, to train the model. Images collected from subjects 21-40 provide 20 × 2 × 45 = 1,800 image pairs to test the model. The FID metric [6] is applied in our case to evaluate the similarity between the generated thermal images and the ground-truth thermal images. The FID is computed by passing the generated and real images through the pre-trained InceptionV3 model [32] and using the activations of the last pooling layer:

FID = ||μ_r - μ_g||^2 + tr(Σ_r + Σ_g - 2(Σ_r Σ_g)^(1/2))   (11)

where μ_r and μ_g represent the feature means for the real (r) and generated (g) images, Σ_r and Σ_g represent the corresponding covariances, and tr is the trace operator. A lower FID score means that the quality of the images produced by the generator is closer to that of the real ones; for example, an FID of 0 indicates that the generated and the ground-truth images are identical. Note that FID can vary between 0 and 600 or more in some cases.
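Given precomputed InceptionV3 feature means and covariances, the FID can be computed with the standard matrix-square-root formulation (a sketch using `scipy.linalg.sqrtm`; `mu_r`, `sigma_r`, `mu_g`, `sigma_g` are assumed to come from the real and generated feature batches):

```python
import numpy as np
from scipy.linalg import sqrtm

def fid(mu_r, sigma_r, mu_g, sigma_g):
    """FID = ||mu_r - mu_g||^2 + tr(Sigma_r + Sigma_g - 2 (Sigma_r Sigma_g)^(1/2)),
    computed from feature means and covariances."""
    mu_r = np.asarray(mu_r, dtype=float)
    mu_g = np.asarray(mu_g, dtype=float)
    sigma_r = np.asarray(sigma_r, dtype=float)
    sigma_g = np.asarray(sigma_g, dtype=float)
    diff = mu_r - mu_g
    covmean = sqrtm(sigma_r @ sigma_g)
    if np.iscomplexobj(covmean):        # discard tiny imaginary noise
        covmean = covmean.real
    return float(diff @ diff + np.trace(sigma_r + sigma_g - 2.0 * covmean))

# Identical feature statistics give an FID of 0.
score = fid([0.0, 0.0], np.eye(2), [0.0, 0.0], np.eye(2))
```

The imaginary-part check is a common numerical precaution, since `sqrtm` can return complex values with negligible imaginary components for near-singular products.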


D. TRAINING AND TESTING FOR T2T
In our experiment, we use images collected from subjects 1-30, altogether 30 × 4 × 3 × 5 = 1,800 images, to train the model. The images collected from subjects 31-41, a total of 660 images, are used to test the model. The temperature at the centre of the subject's forehead on the thermal face image is considered the ground-truth temperature. The temperature distribution of the thermal images over different ranges is given in Table 1, which indicates that this is an unbalanced database. We used 80% of the images to train our temperature predictor, and the remaining 20% to test it. We linearly scale the temperature into [-1, 1] so that it falls within the range of the tanh output. We use MAE to evaluate our temperature predictor; the MAE is 0.27°C on the testing set.
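The linear scaling into [-1, 1] can be sketched as follows; the temperature bounds here are illustrative (taken from the 33.0-34.5°C target range discussed later), not values stated for this preprocessing step:

```python
def scale_temperature(t_celsius, t_min=33.0, t_max=34.5):
    """Linearly map a forehead temperature into [-1, 1], the output range
    of the predictor's tanh activation. Bounds are illustrative."""
    return 2.0 * (t_celsius - t_min) / (t_max - t_min) - 1.0

def unscale_temperature(s, t_min=33.0, t_max=34.5):
    """Inverse mapping, from the tanh range back to degrees Celsius."""
    return t_min + (s + 1.0) * (t_max - t_min) / 2.0
```

Keeping the inverse handy lets predicted tanh outputs be reported back in degrees Celsius, e.g. when computing the 0.27°C test MAE.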

E. FACE RECOGNITION IN THERMAL SPECTRUM
In order to verify the similarity of the original and generated faces in both the T2T and V2T tasks, we apply face recognition in both the thermal and visible spectra. The methodology of [16] was used to evaluate the performance of this face recognition. Face recognition is applied in this work to find the real identity of a test sample among all subjects in the database. We apply transfer learning [33] to three pre-trained models: InceptionV3 [32], Xception [34], and MobileNet [35].
The face recognition in T2T was performed using four-fold cross-validation, with each fold representing a different acquisition session. For V2T conversion, we use two-fold cross-validation. The number of folds is determined by the number of sessions available in each database.
We train each CNN model with the images from three (for T2T) or one (for V2T) session(s), and validate it with the images from the remaining session. After fine-tuning the optimal parameters for the particular validation set, we test the final performance of the CNN with the synthesized thermal images. We compare the performance of the real and generated thermal images. We also perform the face recognition in visible spectrum for comparison.
Our approach employs transfer learning. It involves first loading the pre-trained weights optimized for the ImageNet challenge. The last fully-connected layer and classification layer of each model are removed, and an average pooling layer, two fully-connected layers with 512 units, and a classification layer are added to each model. Two training processes are applied sequentially. Initially, the model is trained with the other layers frozen, allowing only the newly added layers to update, with a higher learning rate for 50 epochs. Next, the entire model is trained with a lower learning rate for 50 epochs for fine-tuning purposes. Due to the high volume of the SpeakingFaces Database, for face recognition in the V2T task we resize the images to a resolution of 128×128. For the T2T task, we keep the resolution of 256×256.

V. RESULTS

A. V2T GENERATED IMAGES
For V2T conversion, Fig. 4 shows examples of input visible images, ground-truth thermal images in the 'iron' palette, generated 'iron'-palette thermal images, and the corresponding FID values. The generated thermal images achieve satisfactory visual quality. For some parts of the generated images, our model does not perform well; for example, the thermal image in the first column appears to have an artifact in the hair region, and for the second subject, the generated thermal image cannot reproduce the curly hair on the forehead. We use FID to measure the similarity between the generated thermal images and the ground-truth thermal images. The FID over all images in the test set reached a value of 57.3, which is reasonably low compared to the maximum FID values that can reach 600 or higher.

1) V2T Face Recognition
The performance of face recognition (using 1:N comparison) is evaluated by the rank performance and the True Acceptance Rate (TAR) at the targeted 1% and 0.1% False Acceptance Rate (FAR). For each subject, we calculate TAR = TP / (TP + FN), where TP and FN represent the number of true positives and false negatives, respectively. The overall TAR is taken as the average of the per-subject TARs. For evaluation, the targeted 1% or 0.1% FAR corresponds to a specific acceptance threshold. The testing set was formed from the thermal images synthesized by the cGAN_V2T, and the validation set was formed from original thermal images. Table 2 illustrates the performance on the validation set (Valid) and test set (Test) for the three different networks. Valid shows higher TAR because the validation images have the same spectrum as the training images. Test produces lower TAR since it is formed from generated images. For testing, the MobileNet network demonstrates the best performance in terms of TAR at the targeted FAR and rank-1, and the Xception model reports the worst result. Fig. 5 shows the Cumulative Matching Characteristic (CMC) for the face recognition rate of the three network models when accepting identities at ranks from 1 to 30. All models perform better on Valid than on Test, which indicates a loss of identity information following the V2T conversion. MobileNet performs the best, while InceptionV3 performs the worst.
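The per-subject TAR averaging described above can be sketched as follows (with hypothetical TP/FN counts, at a threshold assumed fixed to hit the targeted FAR):

```python
def overall_tar(per_subject_counts):
    """Average the per-subject TAR = TP / (TP + FN) over all subjects.
    `per_subject_counts` maps subject id -> (TP, FN) at a fixed
    acceptance threshold chosen for the targeted FAR."""
    tars = [tp / (tp + fn) for tp, fn in per_subject_counts.values()]
    return sum(tars) / len(tars)

# Hypothetical counts for two subjects: TARs of 0.9 and 0.8.
rate = overall_tar({"s1": (9, 1), "s2": (4, 1)})
# rate ≈ 0.85
```

Averaging per subject (rather than pooling all trials) keeps subjects with many test images from dominating the overall score.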
We used cGAN_V2T to convert visible face images into thermal face images, and FID and face recognition techniques to evaluate the generated thermal images. The generated images remain similar to the ground-truth thermal images, and the overall FID reached 57.3. The rank-1 face recognition rate on Test is lower than on Valid (by 14.0% for InceptionV3 and 15.8% for Xception). MobileNet performs better than the other two networks, with regard to TAR and rank-1 accuracy, on all of the sets.

B. T2T GENERATED IMAGES
For T2T conversion, we show four instances of the input and generated thermal images with different target temperatures in Fig. 7. The four input thermal images have four different original temperatures, from lowest to highest. As seen in Fig. 7, from left to right, the thermal images become redder due to their higher facial temperature. We also used the trained temperature predictor to estimate the temperatures of the generated thermal images, and list the MAE in Table 3. Table 4 reports the performance on Valid and Test for the three network models. Valid shows higher performance because the validation images come from the remaining session of the database. Test produces lower performance, due to the fact that the Test images are generated by the cGAN_T2T. For testing, MobileNet and Xception show similarly good performance, while InceptionV3 shows the worst. Fig. 6 illustrates the performance of the different network models when accepting identities from rank 1 to 10. For testing, the CMC curves show that the Xception and MobileNet models are similar in their performance, while the InceptionV3 model performs the worst. All models perform slightly worse on Test than on Valid, which indicates that our cGAN_T2T model preserves most of the identity information. We compared our results with the work of Athira S. et al. [36], which utilizes local binary patterns [37], pyramid histograms of oriented gradients [38], k-nearest neighbors [39], and support vector machines [40] for thermal face recognition with different schemes, with the highest reported rate being 91.0%. Our approach shows a higher recognition rate than most of these approaches. Table 3 reports the performance of face recognition using the MobileNet model at different target temperatures on Test. Generated thermal images with target temperatures of 33.5, 34.0 and 34.5°C have similar performance in terms of TAR and rank-1 face recognition rate.
The generated thermal images with a target temperature of 33.0°C have slightly worse performance because of the insufficient number of low-temperature samples in the database. Table 5 reports the performance of face recognition using the MobileNet model for each subject on both the Test and Valid sets. A small fraction of subjects (6, 7, 8, 9, 11, 12, 20, 39) show an obvious decline in face recognition on Test compared with Valid. Most of the subjects have similar performance in terms of face recognition.

C. T2T FACE RECOGNITION
We use cGAN_T2T to convert thermal images into thermal images with a given target temperature. Our generated thermal images show high structural similarity to the ground-truth thermal images. Similarly, the face recognition performance for each subject and each target temperature on the Test and Valid sets indicates that our approach preserves the structure corresponding to the target temperature with high accuracy. The rank-1 accuracy on Test shows a slight decline compared to Valid: 2.8% lower for Xception, 5.0% lower for MobileNet, and 7.1% lower for InceptionV3. This means that the generated thermal images almost completely retain the identity information.

VI. CONCLUSION
In this paper, we described an approach to solving two image-to-image translation tasks using cGAN: V2T and T2T. To convert visible to thermal images, we trained our cGAN_V2T using cGAN loss and MAE loss. The FID of the generated thermal images reached 57.3. To convert thermal images into thermal images with different target temperatures, we trained our cGAN_T2T using cGAN loss, perceptual loss and temperature loss.
We also used face recognition techniques to evaluate the generated images. For T2T conversion, we reached a high face recognition rate, which means that the generated thermal images preserve the subjects' identity. The main outcome of this paper is a proof of feasibility of the proposed technique for generating thermal images given a target temperature, once either visible or thermal images of the subject are provided.
The proposed solutions address multiple applications in various fields. For example, in healthcare and medical sciences, one may need to model the thermal pattern distribution on subjects' faces given a target temperature. In surveillance, there might be a need to convert a visible-spectrum image stored in a legacy database to the thermal spectrum, in order to compare it with a probe image acquired by thermal surveillance cameras and determine the identity of a person.