Lighting- and Personal-Characteristic-Aware Markov Random Field Model for a Facial Image Relighting System

In this study, we propose an example-based facial image relighting framework to relight an unseen input 2D facial image from a specified source lighting condition to a target lighting condition. Such a relighting framework is highly challenging, since the highlight or shadow areas in the relighted image usually follow the specific facial feature characteristics of the input image. When facial images must be lit in a specified lighting style following only a few provided pairwise lighting examples, the problem becomes even more complex. In contrast to existing learning-based relighting frameworks that directly predict the intensity of the target image, we use a personal and lighting-specific transformation to formulate the appearance correlation between the source and target lighting conditions. We propose a lighting- and personal-characteristic-aware Markov random field model to estimate the transformation parameters of the input subject, integrated with additional classifiers that determine which input facial regions should be enhanced and which maintained. Experimental results show that the proposed kernel facial relighting model can avoid the overfitting problem and generate vivid and recognizable results despite the scarcity of training samples. Furthermore, the relighted results successfully simulate the individual lighting effects produced by the specific personal characteristics of the input image, such as the nose and cheek shadows. Finally, the effectiveness of the proposed framework is demonstrated using a robust face verification test for images taken under side-light conditions.


I. INTRODUCTION
Frontal lighting conditions are an essential requirement for most face-related applications, such as face identification, recognition, and animation. However, in many real-world situations, frontal lighting cannot be guaranteed. Accordingly, the problem of developing systems for the automatic synthesis of facial images under different lighting conditions has attracted growing interest in the fields of computer vision and machine learning in recent years. Furthermore, in some entertainment applications, there is a need for headshot portraits taken under particular lighting conditions to present a certain lighting style. However, developing an automatic facial image relighting system capable of reproducing the unique light and shadow style of a particular lighting condition is particularly challenging.
Histogram matching approaches (such as the methods proposed in [1,17,25]) are typical image relighting methods, whose objective is to convert the statistical intensity distribution of the input image to that of the (target) reference image. Such methods work well in natural photography, but they do not produce images whose lighting effects depend on specific image features or structures. Some well-known image lighting frameworks analyze specific image content or features with the help of machine learning models to determine the relationship between the image appearances under the source and target lighting conditions. For example, the tone mapping method proposed in [14] is integrated with a principal component analysis (PCA) clustering algorithm to consider the semantic lighting factors of image transformation tasks; the high dynamic range image generation method in [12] applies a convolutional neural network (CNN) to map input images to target outputs. However, lighting factors tend to be more complex: for example, the image intensity distributions of different images vary even under the same light source.
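As a concrete baseline, the histogram-matching idea in [1,17,25] can be sketched as follows. This is a generic illustration (the function and variable names are ours, not from the cited works), and it makes the limitation above visible: the learned mapping depends only on the global intensity distribution, never on pixel position or facial structure.

```python
import numpy as np

def match_histogram(source, reference):
    """Map source intensities so their empirical CDF matches the reference CDF.

    Generic histogram matching; position-independent, hence structure-blind.
    """
    src = source.ravel()
    ref = reference.ravel()
    # Sorted unique source values, their positions, and their quantiles.
    s_vals, s_idx, s_counts = np.unique(src, return_inverse=True,
                                        return_counts=True)
    s_quantiles = np.cumsum(s_counts).astype(np.float64) / src.size
    r_vals, r_counts = np.unique(ref, return_counts=True)
    r_quantiles = np.cumsum(r_counts).astype(np.float64) / ref.size
    # For each source quantile, look up the reference value at that quantile.
    mapped = np.interp(s_quantiles, r_quantiles, r_vals)
    return mapped[s_idx].reshape(source.shape)
```

Because every pixel with the same gray value is mapped identically, a shadow edge and a bright cheek at the same intensity receive the same treatment, which is exactly why such methods cannot reproduce feature-dependent lighting effects.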
In this study, we propose a facial relighting framework for relighting a 2D image taken under a given source lighting condition into a target lighting condition whose style is defined by a few pairwise lighting examples from different subjects. In order to generate vivid and recognizable results, the proposed framework meets the following constraints: (i) the personal characteristics of the relighted images generated by the proposed framework should be close to those of the original input; (ii) the results must be visually similar to the target lighting condition, e.g., following the common shapes and positions of the shadows and highlights indicated in the provided examples; (iii) the lighting and shadow effects in the results of different subjects should have different characteristics, depending on their personal characteristics.
Unlike existing relighting frameworks, which predict the target intensity directly, we propose a lighting transformation based on personal and lighting-specific parameters to map source images to the target results. Here, the transformation parameters of each input subject are estimated using a novel lighting- and personal-characteristic-aware MRF model, which introduces two classifiers to determine whether each input facial region must be enhanced or maintained and to select its on-demand references for parameter estimation. The remainder of this paper is organized as follows. Section I-A briefly reviews previous example-based relighting frameworks. Section II presents the formulation of the relighting transformation and the proposed example-based face-relighting framework. Section III introduces the lighting- and personal-characteristic-aware Markov random field models used to predict the personalized transformation parameters. Section IV presents and discusses the experimental results. Section V provides some brief concluding remarks and indicates the major contributions of the present research.

A. RELATED WORK
Several well-known learning-based models have been successfully applied to image transformation tasks (facial image relighting is one such application). In this section, we discuss how different kernel models apply to the face relighting task and their respective abilities to avoid the overfitting problem.

1) KERNEL MODELS FOR RELIGHTING TRANSFORMATION
Broadly speaking, existing example-based facial image relighting frameworks can be classified as either single-reference image-based approaches or multiple-reference image-based approaches. Examples of single-reference approaches are the 3D face model-based frameworks in [5,26] and the 2D image-based frameworks in [2,4,18]. These frameworks calculate the ratio of intensity between the input image and the reference image, and apply it to match the light and shadow distribution of the input to that of the reference image. The critical problem with this approach is that it follows the unique characteristics of a reference image of a different subject, creating unrealistic and unexpected lighting effects.
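The single-reference ratio idea can be illustrated with a short sketch (our own simplification of the quotient-style transfer used in [2,4,18]; all names are illustrative). The per-pixel ratio between the reference subject's target-lit and source-lit appearances is applied directly to the input, which is precisely why the reference subject's unique features leak into the result.

```python
import numpy as np

def ratio_relight(input_src, ref_src, ref_tgt, eps=1e-3):
    """Single-reference ratio relighting: transfer the reference's lighting
    change to the input via the per-pixel intensity ratio.

    input_src: input image under the source lighting (values in [0, 1])
    ref_src/ref_tgt: the SAME reference subject under source/target lighting
    """
    # The ratio encodes how the reference subject's appearance changed;
    # eps guards against division by near-zero (deep shadow) pixels.
    ratio = ref_tgt / np.maximum(ref_src, eps)
    return np.clip(input_src * ratio, 0.0, 1.0)
```

Any shadow cast by the reference subject's own nose or cheek geometry is baked into `ratio`, so it is stamped onto the input subject regardless of whether their facial geometry would produce that shadow.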
On the other hand, the relighting systems proposed in [6,21,28] are based on multiple references, namely training facial image datasets, where the image appearance correlation between the source and target lighting conditions of the provided references is constructed by different kernel models. For example, kernel principal component analysis (kernel PCA) and tensor PCA models are used in the face relighting frameworks of [28] and [6], respectively, where the coordinates of these parametric models capture the significant facial and lighting features of the provided reference samples, which are then used as features for learning the image appearance mapping between different lighting conditions. In contrast, the authors of [21] apply a Markov random field (MRF) model to construct the mapping of image appearance between different lighting conditions in a nonparametric manner; in general, such MRF-based models produce clearer but less variational relighting results.
Recently, neural network-based generative models, such as the generative adversarial network (GAN) and the autoencoder network, have been widely used for image transformation tasks. The authors of [34] and [27] applied a conditional hourglass network and a conditional U-net model, respectively, to relight facial images from the source lighting condition to the target condition. An asymmetric joint GAN model and RelightGAN are proposed in the facial relighting frameworks of [8] and [29], respectively, where, with the help of the discriminator model, the generator model can produce relighting results of higher quality. Generally speaking, the NN-based approaches can generate relighting results with more diverse appearances and more realistic image quality, but they require more training examples to learn the model parameters.

2) SEMANTIC CONSTRAINT FOR IDENTITY PRESERVATION AND LIGHTING CONDITION
In order to make the results preserve the personal characteristics of the input rather than those of the provided reference(s), some existing learning-based relighting frameworks include an identity consistency constraint to avoid the overfitting problem. For example, in the relighting frameworks of [2] and [18], the input facial image is decomposed into multiple image-scale components; only some of these components, which are manually determined, are enhanced, while the others are retained to preserve the personal characteristics of the input subject. The facial relighting systems in [15,16] propose first calculating facial appearance maps or task-aware masks and then using them to guide the relighting process, i.e., to determine which facial regions should be enhanced or preserved according to lighting and identity semantics.
Among the NN-based frameworks, the GAN-based relighting frameworks in [8,29], similar to the well-known CycleGAN [35] and pix2pix [11] approaches, utilize the concept of bidirectional transformation to include additional identity and lighting semantic loss functions that improve the relighting quality. In contrast, the autoencoder-based relighting systems in [30,34] include additional lighting condition labels to improve model stability; the facial relighting framework in [9], similar to the frameworks in [2,18], adopts multiple image scales as features for training the NN model, where each image (scale) component concentrates on specialized facial features.

3) PROPOSED LIGHTING- AND PERSONAL-CHARACTERISTIC-AWARE MRF KERNEL APPROACH
Generally speaking, the performance of existing example-based kernel models for image transformation/synthesis tasks is highly dependent on the collected training reference samples. However, it is difficult to collect pairwise facial images that are properly lit under both the source and the target lighting conditions to train the transformation model in a supervised way. In this study, we propose a lighting- and personal-characteristic-aware MRF model for the lighting transformation task based on a training dataset of limited size. To avoid the overfitting problem, we do not estimate the intensity of the target result directly; instead, we include a simple linear relighting transformation, whose personalized parameters are dynamically estimated for each input individual. This preserves the facial structure of the input because each input gray value is enhanced rather than replaced by a pattern from another subject.
The proposed MRF models the correlation between the input appearance and its corresponding transformation parameters in a patch-based manner, and this correlation is constructed from multiple references in a non-parametric way. Notably, the transformation parameters of the different facial regions are not all derived from the training references; on the contrary, the parameters of some facial regions are calculated directly from the appearance of the input. This on-demand reference selection process for the different facial regions is guided by their individual identity and lighting semantic meanings. Therefore, the relighted face is visually similar to the appearance of a face under the target lighting condition while maintaining the personal characteristics of the input subject. Accordingly, the relighting results provide a useful foundation for downstream face verification/recognition processes.

II. EXAMPLE-BASED FACIAL IMAGE RELIGHTING FRAMEWORK

A. BAYESIAN FORMULATION FOR FACIAL IMAGE RELIGHTING
This study develops an example-based facial image relighting system capable of relighting the appearance of an input 2D facial image, I_X, from a specified source lighting condition X to a target lighting condition Y. We model the mapping function from the source lighting condition X to the target condition Y at pixel position (u, v) using a simple linear transformation:

I_Y(u, v) = a(u, v) · (I_X(u, v) − Ī_X(u, v)) + Ī_Y(u, v).    (1)

In (1), the bias values Ī_X(u, v) and Ī_Y(u, v) are taken as the average component values within patch regions centered at coordinate (u, v) in images I_X and I_Y, respectively; the gain a(u, v) is defined as the ratio between the unbiased source component value and its corresponding unbiased target component value, followed by Gaussian filtering.
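A minimal sketch of the transform in (1), using the symbol names above; the patch size, the smoothing scale, and the use of a uniform filter for the patch means are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np
from scipy.ndimage import uniform_filter, gaussian_filter

def linear_relight(I_x, a, Ibar_y, patch=21):
    """Apply the linear transform of (1): I_y = a*(I_x - Ibar_x) + Ibar_y.

    Ibar_x is the local patch mean of the source image; `a` and `Ibar_y`
    are the per-pixel gain and target bias estimated by the model.
    """
    Ibar_x = uniform_filter(I_x, size=patch)   # patch-mean bias of the source
    return a * (I_x - Ibar_x) + Ibar_y

def gain_from_pair(I_x, I_y, patch=21, sigma=2.0, eps=1e-3):
    """Gain a(u,v) from a training pair: the smoothed ratio of unbiased
    values (oriented so that (1) reproduces I_y on the pair; an assumption
    consistent with the transform direction)."""
    ux = I_x - uniform_filter(I_x, size=patch)
    uy = I_y - uniform_filter(I_y, size=patch)
    ratio = uy / np.where(np.abs(ux) < eps, eps, ux)
    return gaussian_filter(ratio, sigma)
```

Note that a constant input region has zero unbiased value, so the output there is simply the target bias Ī_Y; all structural detail enters through the gain term.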
Instead of predicting the target pixel value I_Y(u, v) directly, our framework predicts personalized transformation parameters for each input, i.e., the enhancement matrix a(u, v) and the bias Ī_Y(u, v) for all pixels; in this way, the facial structure of the input source image I_X can be better preserved. We seek the enhancement matrix a* and bias Ī*_Y that maximize the posterior probability:

(a*, Ī*_Y) = argmax_{a, Ī_Y} P(a, Ī_Y | I_X).    (2)

To model (2) carefully, we follow the multiscale relighting framework in [18] and decompose a facial image into a global component and a local detailed component:

I = G_σ(I) + D,    (3)

where G_σ(·) is a Gaussian blurring function with kernel size σ. G_σ(I) is a smooth version of the original input capturing the overall global face shading effect, e.g., the lighting or shading distribution of the image; the residual between I and G_σ(I) is then taken as the local detailed component D, which captures individual local detailed features such as clear shadow edges or facial feature contours. We express the posterior probability in (2) as an energy function, and the solution strategy naturally divides into the two components:

E(a, Ī_Y | I_X) = E(a_G, Ī_G | G_σ(I_X), M_G) + E(a_D, Ī_D | D_X, M_D),    (4)

where two maps, a global map M_G and a local detailed map M_D, are included to indicate the identity and lighting condition of the different facial regions.
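The decomposition in (3) is straightforward to sketch (σ here is an arbitrary illustrative value):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def decompose(I, sigma=8.0):
    """Split a facial image into a global shading component G(I) and a
    local detail residual D, as in the multiscale scheme of [18]."""
    g = gaussian_filter(I, sigma)   # smooth global lighting distribution
    d = I - g                       # local details: shadow edges, contours
    return g, d
```

Because the residual is defined as the exact difference, the two components always sum back to the original image, so the global and local branches can be relighted independently and recombined without loss.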

B. PROPOSED RELIGHTING FRAMEWORK
As shown in Figure 1, to solve (4), the proposed relighting framework decomposes the unseen input facial image I_X, taken under a particular source lighting condition X, into its global lighting distribution component G_σ(I_X) and its local detailed feature component D_X. Then, two adaptive maps, M_G and M_D, of the input are detected by two classifiers, H_L(·) and H_ID(·), to expose the subject's lighting-aware and personal-characteristic-aware regions when changing the lighting condition from X to Y. We propose lighting- and personal-characteristic-aware Markov random field models, M_G-MRF and M_D-MRF, to predict the transformation parameter matrices in (1).

III. LIGHTING- AND PERSONAL-CHARACTERISTIC-AWARE MARKOV RANDOM FIELD MODELS
The present study uses the YaleB [7] database both for training the classifiers H_L(·) and H_ID(·) and for constructing the correlation between the input appearance and its corresponding personalized parameters in M_G-MRF and M_D-MRF. In particular, N pairwise samples from N different subjects are collected from the dataset for modeling (4), and each subject's global/local component can be transformed by its personalized parameter matrices (a_G/D, Ī_G/D), as formulated in (1).

We train the lighting condition and identity classifiers H_L(·) and H_ID(·) using the AdaBoost algorithm. There are three main training stages: the generation of the feature pool, in which possible facial appearance patterns under the target conditions are collected; the weak classifier construction, which selects important features from the pool for the classification tasks; and the weak classifier ensemble stage, in which the strong classifiers H_L(·) and H_ID(·) are formed from the selected weak classifiers.
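The three stages map onto a standard AdaBoost loop. The sketch below uses simple one-dimensional threshold "stumps" in place of the paper's correlation-based patch features, so it illustrates only the selection-and-ensemble mechanics, not the actual feature pool.

```python
import numpy as np

def adaboost_train(X, y, n_rounds=10):
    """Minimal AdaBoost with threshold stumps.

    X's columns play the role of the feature pool; each round selects the
    weak classifier with lowest weighted error, re-weights the samples,
    and the weighted vote of the rounds forms the strong classifier.
    """
    n, d = X.shape
    w = np.full(n, 1.0 / n)
    ensemble = []  # (feature index, threshold, polarity, vote weight alpha)
    for _ in range(n_rounds):
        best = None
        for j in range(d):                       # scan the feature pool
            for thr in np.unique(X[:, j]):       # candidate thresholds
                for pol in (1, -1):
                    pred = pol * np.where(X[:, j] >= thr, 1, -1)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, thr, pol, pred)
        err, j, thr, pol, pred = best
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)
        w *= np.exp(-alpha * y * pred)           # emphasize mistakes
        w /= w.sum()
        ensemble.append((j, thr, pol, alpha))
    return ensemble

def adaboost_predict(ensemble, X):
    score = sum(alpha * pol * np.where(X[:, j] >= thr, 1, -1)
                for j, thr, pol, alpha in ensemble)
    return np.where(score >= 0, 1, -1)
```

In the paper's setting, each stump's scalar feature would instead be the maximal correlation response of a patch pattern within its neighborhood region, but the boosting machinery is unchanged.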
In contrast to the lighting condition classification framework proposed in [3], in which several lighting-sensitive facial regions and lighting contrast patterns are manually specified, we define the feature pattern pool P in a data-driven manner for H_L(·),
where the weak classifier in (5) for a selected feature m ∈ P is built from three ingredients: f_m(I), the feature response of image I to the feature m, calculated as the maximal correlation obtained by applying m to I within the neighborhood region N(p(m)); p(m), the patch location from which m was extracted, i.e., one of the grid cells of I in Fig. 2(a); and the threshold θ_m, which is learned from the training dataset. The M selected weak classifiers form the ensemble of the lighting classifier H_L(·), and the facial regions of these M selected features, p(m), are located at lighting-aware facial regions.

The boosting process for H_ID(·) is similar to that for H_L(·). Its feature pool P′ contains patterns of local detailed component facial patches, taken from the N training samples at different facial positions, and M′ patch patterns are selected from P′ to classify the identity of a pairwise sample (D_X, D_Y) under the two different lighting conditions. The weak identity classifier h′_m′((D_X, D_Y); m′, θ_m′) of a selected feature m′ ∈ P′, formulated in (6), measures its feature response in two terms: the outer max term evaluates the correlation between the feature pattern m′ and the facial patch cropped from the neighborhood region N(p(m′)), taking the maximum correlation value, and the inner max term evaluates the image patch similarity of the pairwise sample (D_X, D_Y). Note that, similar to (5), the similarity scores in (6) are taken as maxima over the neighborhood regions.

According to (7), the pi-th entry of the global map M_G is determined by H_L(·): if the pi-th facial region is located at one of the M selected lighting discrimination positions, the corresponding weak classifier h(·) is applied to determine the lighting condition label of G_σ(I_X) at that position. Likewise, according to (8), the map M_D of a given pairwise local detailed component (D_X, D_Y) is determined by H_ID(·): if the pi-th facial patch is located at one of the M′ selected identity discrimination positions, the corresponding weak classifier h′(·) is applied to determine the identity consistency of the pairwise input at this region position.
According to (7) and (8), two types of facial regions are defined: (i) fixed-type regions (corresponding to map entries equal to 0), meaning that the corresponding facial region in the input image (or the initial result) is already classified as having the appearance taken under lighting condition Y; the result patch can therefore be calculated directly from the input, preserving the facial feature characteristics of the input subject; and (ii) inference-type regions (map entries equal to 1), whose appearance, as the name indicates, should be enhanced and whose person-specific parameter matrices for the relighting transformation are determined from the training references.
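The two region types translate into a simple initialization rule for the per-region parameters. In this sketch (names are ours), fixed-type entries are resolved immediately from the input, while inference-type entries are left open for the MRF inference described next.

```python
import numpy as np

def init_params(patch_means_x, region_map):
    """Initialize per-region transform parameters from the detected map.

    Fixed-type regions (map entry 0) keep gain 1 and copy the source bias,
    so the input appearance is preserved; inference-type regions (entry 1)
    are marked unresolved, to be filled in from the training references.
    """
    gain = np.ones_like(patch_means_x, dtype=float)
    bias = patch_means_x.astype(float).copy()
    to_infer = region_map == 1            # entries resolved by the MRF
    gain[to_infer] = np.nan               # placeholder until inference
    bias[to_infer] = np.nan
    return gain, bias, to_infer
```

Plugging a fixed-type region's (gain 1, source bias) into the transform of (1) reproduces the input patch unchanged, which is exactly the intended "maintain" behavior.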
-Analysis of M_G: the entries corresponding to map value 0 (in (7)) are marked as white dots, meaning that the facial region of the input at this position is already similar to the facial appearance under the target lighting condition Y, even though the input was taken under lighting condition X. Since the effects of lighting on such facial features are observable, only some regions of G_σ(I_X) are classified as belonging to lighting condition Y (the white dots).

In contrast to conventional patch-based MRF synthesis frameworks [13,22,31,33], where the MRF graph is static for all testing subjects, the recognized maps M_G and M_D change the static MRF graph into a personalized version, as shown in Fig. 2(b). Under the patch-based MRF formulation, the energy function in (4) becomes a sum of data and smoothness terms over the patch nodes, as given in (9), where the detected map is used to determine whether the pi-th facial region should be enhanced. The data term in (10) corresponds to the likelihood probability of the Bayesian formulation (2): it evaluates the correlation between the observed input patch and the hidden parameter matrices (a, Ī_Y) to be estimated. The smoothness term in (11) corresponds to the prior probability in (2): it evaluates the fitness of the hidden parameter matrices of two adjacent patches. In (10), an indicator function separates the two node types, so that the first term evaluates a candidate's fitness score for inference-type nodes and the second term does so for fixed-type nodes; the constant is set to 1, and the normalization parameter β is determined by cross-verification. Φ_x(·) is a feature mapping function that maps a parameter candidate (a, Ī_Y), for example one selected from the k-th training sample, to its corresponding patch under the source lighting condition X. The smoothness term in (11) evaluates whether the parameter matrices of two adjacent nodes agree on their overlap, where the overlap region of the two adjacent patches is used for the comparison.
Note that the normalization terms β and β′ in (10) and (11) play important roles in the prediction result; please refer to the APPENDIX. The proposed prediction process involves two strategies: on-demand candidate selection for the hidden nodes in face modeling, and an optimization process that selects the optimal result for the final enhancement matrix prediction.
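A hedged sketch of the two energy terms: the data term scores a candidate by comparing the observed patch with the candidate's mapped source-lighting patch Φ_x(·), and the smoothness term scores the agreement of adjacent candidates on their overlap strip. The Gaussian-of-MSE form and the parameter values are our illustrative assumptions, not the paper's exact definitions of (10) and (11).

```python
import numpy as np

def data_term(obs_patch, cand_src_patch, beta=1.6):
    """Likelihood-style score: how well the candidate's source-lighting
    patch (the output of a mapping like Phi_x) matches the observed input
    patch. beta mimics the cross-verified normalization exponent."""
    return np.exp(-beta * np.mean((obs_patch - cand_src_patch) ** 2))

def smoothness_term(patch_p, patch_q, overlap=10, beta=1.3):
    """Prior-style score: agreement of two horizontally adjacent candidate
    patches on their overlapping strip."""
    strip_p = patch_p[:, -overlap:]     # right strip of the left patch
    strip_q = patch_q[:, :overlap]      # left strip of the right patch
    return np.exp(-beta * np.mean((strip_p - strip_q) ** 2))
```

Both terms peak at 1 for a perfect match and decay as the discrepancy grows, so maximizing the product over all nodes and edges favors candidates that both explain the observation and tile together seamlessly.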

1) NON-PARAMETRIC FACE MODELING AND INFERENCE PROCESS
For each inference-type region (see the two orange patches in the MRF in Fig. 4), the individual candidate patterns (a, Ī_Y) are obtained from the K training samples most relevant to the input, with the detected map used as the query feature for this selection. For the pi-th inference-type node, the k-th relevant training sample contributes an observation-to-hidden relationship pattern, i.e., a source-lighting patch together with its parameter matrices (a, Ī_Y), taken from the facial region whose source patch appears closest to the currently observed patch. On the other hand, for the fixed-type nodes (the five green patches in Fig. 4), the facial appearance in these regions is already classified as belonging to the target lighting condition Y. Therefore, the hidden gain pattern at a fixed-type node is defined as a matrix of ones, meaning that no scaling is applied to this patch, and the bias is calculated directly from the corresponding patch of the input, so that the resulting intensity does not change.

Once each node has its on-demand candidate(s), the optimal parameter matrices (a, Ī_Y) are determined using the belief propagation (BP) algorithm [22] to maximize (9) under the constraints (10) and (11). The overall prediction algorithm for the lighting enhancement matrix is summarized in Table 1.

Fig. 5 illustrates the detected maps and the relighting results for two randomly selected testing subjects under two specific lighting conditions, c3 and c5 (defined in Fig. 7). For each lighting condition, both the relighting case and the lighting-removal case are provided; the results of both cases are generated using the proposed algorithm, with the definitions of the source and target lighting conditions exchanged between the two cases. A detailed inspection of the various images reveals the following abilities and properties of the detected semantic facial regions:

-Analysis of the reference sources for facial regions with M_G and M_D: For all lighting conditions, the discriminative areas (i.e., white and red dots) of M_G are mainly concentrated in the shadow areas of the face, where the inference-type nodes (red areas) represent the greater percentage of the total number of nodes. That is, most relighted regions of the global component must be derived from the training samples. On the contrary, the discriminative areas of M_D are concentrated mainly around the facial features (e.g., eyes or mouth), with roughly equal shares of fixed-type (white) and inference-type (red) nodes. In other words, certain facial regions of the result are calculated directly from the input to retain the personal characteristics of the input subject.
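The BP optimization step over the candidate labels can be illustrated on a chain of patch nodes. The actual model uses loopy max-product BP on the 2D patch grid [22]; a chain (where BP reduces to the Viterbi recursion) is used here for clarity, and all scores are arbitrary.

```python
import numpy as np

def bp_chain_map(unary, pairwise):
    """Max-product belief propagation on a chain of patch nodes.

    unary[i, s]: data-term score of candidate s at node i;
    pairwise[i][s, t]: smoothness score between candidate s of node i and
    candidate t of node i+1. Works in log-space; returns the MAP candidate
    index per node.
    """
    n, k = unary.shape
    logu = np.log(unary)
    # Backward pass: m[i, s] = best log-score of the chain tail from node i
    # given that node i takes state s.
    m = np.zeros((n, k))
    back = np.zeros((n - 1, k), dtype=int)
    m[-1] = logu[-1]
    for i in range(n - 2, -1, -1):
        cand = logu[i][:, None] + np.log(pairwise[i]) + m[i + 1][None, :]
        back[i] = np.argmax(cand, axis=1)
        m[i] = np.max(cand, axis=1)
    # Forward readout of the MAP assignment.
    states = np.zeros(n, dtype=int)
    states[0] = int(np.argmax(m[0]))
    for i in range(1, n):
        states[i] = back[i - 1][states[i - 1]]
    return states
```

On the full 2D grid the messages are iterated until convergence rather than computed in one backward sweep, but the selected labels play the same role: each node's winning candidate supplies its (a, Ī_Y) parameter matrices.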

2) DISCUSSION
-Creating personal and lighting-specific effects: Generally speaking, human shadows and highlights are very sensitive to individual characteristics. In Fig. 5, M_G and M_D differ depending on the specific lighting condition and the unique characteristics of the input subject, so the results generated by the proposed framework are not only reasonable but also more diverse. Consequently, the relighting results reflect a personalized shadow effect, while the removal results successfully maintain the appearance of the input's facial features (see the second, fourth, and sixth rows). Note that, for each prediction result shown in the last row, the corresponding ground truth is the image shown in the first row of the opposite lighting transformation case. However, some artifacts still exist in the final results when the input image has large shadow areas (e.g., the c5-to-c0 case).

Fig. 6 compares the proposed approach (M1) with the traditional MRF model (M2), where, for a fair comparison, each method processes the global and local components individually and then merges them. For visualization purposes, the first and second columns for each subject show the relighting result and a particular region of that result after increasing the contrast, respectively. Fig. 6(a) shows the estimated global components of three testing subjects; the results produced by the proposed approach (M1) show more significant lighting contrast around the facial features than those produced by the traditional MRF model (M2). Fig. 6(b) shows the corresponding estimated local detail components, where the facial features estimated by M1 are closer to the characteristics of the original input image than those estimated by M2. The final results, which integrate the global and local components, are shown in Fig. 6(c). The proposed method generates a better and clearer mouth shape for the first two subjects than the typical M2 method.
Furthermore, some artifacts remain in the synthesized result of the right eye region produced by the M2 method.

IV. EXPERIMENTAL RESULTS

The present study uses the YaleB database for training and testing purposes. The database contains 38 individuals, 32 of whom are randomly selected to collect the training sample images, while the remaining 6 subjects are used for testing purposes. For each image, an affine transformation was performed to geometrically align the image to a 192×168 image size using three reference points. This study considers 9 different lighting conditions, namely the normal (natural) lighting condition (referred to as c0) and eight other lighting conditions (referred to as c1~c8), as shown in Fig. 7. In the configuration of the MRF model and the classifiers, each image is decomposed into overlapping patch grids, with an image patch size of 20×20 pixels and an overlap of 10 pixels between adjacent patches. In particular, β and β′ in (10) and (11) play an important role in the MRF relighting modeling and are determined by cross-verification (see APPENDIX); in this study, β and β′ are set to 1.6 and 1.3 for estimating a_G, 1.6 and 1.2 for Ī_G, 2.7 and 2.3 for a_D, and 2.7 and 1.4 for Ī_D. In addition, the search window for the MRF modeling and classifier learning processes is defined as the border area of 5 pixels expanded from the center of the image patch. Finally, for the inference-type MRF nodes, the number of state candidates for the MRF optimization was set to 5. Overall, the result images in Fig. 8 support the following observations:

1) VISUALIZATION ANALYSIS OF THE RELIGHTING RESULT UNDER DIFFERENT LIGHTING CONDITIONS
-Results for different subjects have personalized, not monotonous, lighting effects: each row of Fig. 8(a) shows that the shadows of each synthesized result vary greatly depending on the facial characteristics and contours of the subject and the position of the particular light source. For example, under c3, the shape of the nose shadow differs significantly from subject to subject. In Fig. 8(b), each column shows that each subject's restored images from the different source lighting conditions successfully preserve the input subject's particular facial characteristics. However, under source lighting condition c8, since the input image has large areas of facial shadow, the light removal results contain some artificial ghosting effects; e.g., the eyes of the third subject lose their geometric symmetry.
-The distributions of real and relighted images under different lighting conditions are consistent: Fig. 9 analyzes the image relighting results obtained using the proposed framework by projecting the relighted images onto the manifold space constructed from the real training images of all 9 lighting conditions. In Fig. 9, the relighting and removal results of the testing subjects shown in Fig. 8(a) and (b) are plotted with different symbols; they lie spatially close to the real image distributions, which are marked with different numbers.
Overall, the results presented in Fig. 9 and Fig. 10 confirm that the proposed framework ensures that each individual's lighting and shadow areas are visually different from one another. Specifically, the relighted images under the same lighting condition in Fig. 9 differ from one another, as do the distributions of each symbol in Fig. 10. At the same time, the distribution of each symbol (the results for a particular lighting condition) is closer to the distribution of its corresponding ground-truth lighting condition than to those of the other lighting conditions. Figs. 10, 11, and 12 compare the relighting performance of the proposed method with that of 5 existing example-based relighting models, namely PCA+KCCA (the kernel approach in [10]), PCA+DDCM (the kernel approach in [23]), 2DDCM (the kernel approach in [24]), CVAE [20] (the deep learning-based kernel approach in the relighting framework of [8]), and the MRF approach (the simplified kernel approach in [21]). All of these methods are trained using the same training dataset as the proposed approach. We also include two relighting models, histogram matching (the kernel approach in [25]) and headshot [18], which require a user-specified reference rather than a training dataset; for these, we manually select the most suitable reference image for each testing subject.

2) COMPARISON WITH OTHER RELIGHTING MODELS
-PSNR/SSIM quality assessment of the relighting results: As shown in Fig. 10, the proposed M_G/D-MRF, MRF, 2DDCM, and CVAE models outperform the other models in terms of SSIM/PSNR values; their visual comparison under lighting condition c3 is shown in Fig. 11 (a visual comparison with the MRF model is shown in Fig. 6). Although the 2DDCM model achieves the highest SSIM/PSNR scores in Fig. 10, our results show more detailed and clearer contours than those of the 2DDCM/CVAE models.
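For reference, the PSNR used in comparisons like Fig. 10 is computed as follows (SSIM is more involved; library implementations such as scikit-image's `structural_similarity` are typically used for it):

```python
import numpy as np

def psnr(result, ground_truth, peak=255.0):
    """Peak signal-to-noise ratio in dB; higher means the relighted image
    is closer to the ground-truth image under the target lighting."""
    diff = result.astype(np.float64) - ground_truth.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)
```

Because PSNR is driven by the mean squared error alone, it rewards globally accurate intensities; the sharper contours noted above are better captured by SSIM or by visual inspection.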

-Light removal results for robust face recognition:
The performance of the various relighting methods was further evaluated by the top-10 ranking performance of the face retrieval process in Fig. 12. The different relighting models are used to transfer images from the side lighting conditions (c1~c8) to the natural lighting condition (c0), and the transferred images are used as queries against the face identification database. Note that the retrieval process was performed in a PCA subspace, and the similarity between the query images and the database images was evaluated by the Euclidean distance. The face recognition experiment was conducted using two databases:

Face database: contains all 38 subjects from the training and testing databases under the normal lighting condition c0; each image has its own annotated identity.

Query database: contains 48 query images of the 6 testing subjects under the eight side lighting conditions; the different relighting models are applied to generate their light-removal images for comparison with the database images.

-Discussion of model performance: The results presented in Fig. 12 show that the proposed relighting method yields better retrieval performance than the other methods. In particular, the retrieval results are consistent with the PSNR results presented in Fig. 10(b). Fig. 12 confirms that the proposed approach retains the individual characteristics of the input images, since the semantically notable regions of the testing subjects are automatically and adaptively detected and integrated into the proposed lighting model. The results produced by the 2DDCM/CVAE methods also achieve good retrieval performance, since they preserve the complete 2D facial geometry of the testing image. In contrast, the headshot and histogram matching methods consider only a single reference image, and hence their relighting results tend to resemble the reference image; as a result, their retrieval performance is degraded.
Finally, the PCA-based CCA and DCM methods both consider multiple learning samples for the relighting process. However, they suffer from an overfitting problem, which destroys the geometric characteristics of the facial image and accordingly degrades the retrieval performance. To further investigate this overfitting issue, two additional testing procedures were performed on the proposed approach.
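The PCA-subspace retrieval protocol described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the gallery here is random data standing in for the 38-subject c0 gallery, and the query simulates a relighted image by perturbing one gallery entry.

```python
import numpy as np

def pca_fit(gallery, n_components=20):
    """Fit a PCA subspace on the gallery images (rows = flattened images)."""
    mean = gallery.mean(axis=0)
    centered = gallery - mean
    # Eigen-decomposition via SVD; rows of vt are the principal axes.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]

def retrieve_top_k(query, gallery, labels, mean, axes, k=10):
    """Project the query and gallery into the PCA subspace and rank
    gallery identities by Euclidean distance (smallest first)."""
    q = (query - mean) @ axes.T
    g = (gallery - mean) @ axes.T
    dists = np.linalg.norm(g - q, axis=1)
    order = np.argsort(dists)[:k]
    return [labels[i] for i in order]

# Toy example: a 38-"subject" gallery of random 32x32 images.
rng = np.random.default_rng(0)
gallery = rng.standard_normal((38, 32 * 32))
labels = list(range(38))
mean, axes = pca_fit(gallery)
# A well-relighted query close to subject 5 should rank subject 5 first.
query = gallery[5] + 0.01 * rng.standard_normal(32 * 32)
top10 = retrieve_top_k(query, gallery, labels, mean, axes, k=10)
```

A relighting method that preserves the subject's facial geometry keeps the query close to its own gallery entry in this subspace, which is why the top-10 ranking is a reasonable proxy for how well individual characteristics survive the light-removal step.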

3) ROBUSTNESS OF THE PROPOSED APPROACH
-Cross-dataset testing: The training dataset was taken from the YaleB database, while the testing samples were taken from the CMU database [19], where the same geometry normalization operation was applied to the images in both datasets. The corresponding relighting results are presented in Fig. 13. It is seen that even when the testing images are taken from a different database than that used for training, the proposed framework still produces relighting results with realistic shadow patterns and facial features that resemble those of the input image. However, some obvious artificial noise is still evident in some of the predicted results (marked by red circles) due to inconsistencies between the images in the two databases.
-Facial appearance-inconsistency testing: In Fig. 14, the training and testing datasets were both compiled from the YaleB database. However, the testing images were rendered geometrically inconsistent with the training images by introducing resolution, geometry, out-of-plane rotation, and facial expression variations. It is seen that the proposed method achieves a reliable prediction performance even when geometric inconsistencies exist between the testing images and the training images. We infer that the proposed MRF graph model adapts accordingly, and hence an acceptable relighting result is achieved. A detailed inspection of the global and local adaptive maps confirms that the extracted maps change in response to these image inconsistencies.

V. CONCLUSION
This study has proposed a novel lighting- and personal characteristic-aware Markov random field (MRF) framework for relighting facial images. The experimental results have shown that the proposed framework has three main advantages over existing example-based face relighting schemes. (i) The adaptive MRF graph model ensures that the results are not only similar to the training samples under the target lighting conditions, but also preserve the individual personal characteristics of the system input. (ii) The proposed framework produces a relighted image based on the input facial characteristics, so the results are not only plausible but also more diverse, i.e., the distributions of the produced shadows and highlights are not monotonous. (iii) The proposed model provides acceptable results for unseen test subjects, despite the limited number of training samples. This is because the proposed framework detects the areas that need to be enhanced based on the facial characteristics of the system input and then uses these areas to dynamically adjust the MRF graph so as to select the on-demand reference for each patch. Overall, the proposed model can be applied to image transformation tasks, especially in supervised learning situations where only a limited number of pairwise training samples is available. In a future study, the capabilities of the proposed framework will be extended to a wide variety of face synthesis/transformation applications, including nonphotorealistic rendering, super-resolution, and facial expression synthesis.

ACKNOWLEDGMENT
This work was supported in part by the Ministry of Science and Technology of Taiwan under Grant MOST 109-2221-E-005-056-MY2.

APPENDIX:
For most MRF models [22,31,32], the data term and the smoothness term play important roles in determining the optimal hidden node state. To define suitable and discriminative values of the two weighting parameters in (10) and (11) for selecting discriminative candidates, a cross-verification process is applied. In practice, to determine the appropriate value of each parameter, 30 candidate values (ranging from 0 to 3) are considered. For each candidate value, an N×N covariance matrix is constructed (N: the number of training samples), where entry (i, j) is given by the average similarity value of the patches of the i'th and j'th training subjects, and the patch similarity value is defined using the formulations in (10) and (11) with the specified candidate value. Finally, the candidate value whose correlation matrix has the highest entropy is taken as the optimum. Fig. 15 shows the entropy curves of the data (likelihood) and smoothness (prior) terms for the different candidate values, where the gain and base matrices are evaluated separately. Note that, in each case, the horizontal axis represents the candidate parameter value, while the vertical axis shows the corresponding entropy value. In general, the higher the entropy, the more discriminative the selection of training samples for model construction (refer to the three covariance matrices in the first subfigure, where the optimum is marked by red circles).
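The cross-verification sweep above can be sketched as follows. This is a hedged illustration only: the Gaussian kernel stands in for the patch similarity of (10)/(11), and the random feature vectors stand in for the per-subject patch descriptors, since the paper's exact similarity formulation involves terms not reproduced here.

```python
import numpy as np

def pairwise_similarity(features, sigma):
    """N x N matrix whose (i, j) entry plays the role of the average patch
    similarity between training subjects i and j for one candidate parameter
    value. A Gaussian kernel is used here as a stand-in for (10)/(11)."""
    diff = features[:, None, :] - features[None, :, :]
    return np.exp(-np.sum(diff ** 2, axis=-1) / (2.0 * sigma ** 2))

def matrix_entropy(m):
    """Shannon entropy of the matrix entries, normalized to a distribution."""
    p = m / m.sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

# Sweep 30 candidate values in (0, 3] and keep the most discriminative one,
# i.e. the value whose similarity matrix has the highest entropy.
rng = np.random.default_rng(1)
features = rng.standard_normal((38, 16))   # N = 38 training subjects (toy data)
candidates = np.linspace(0.1, 3.0, 30)
entropies = [matrix_entropy(pairwise_similarity(features, s)) for s in candidates]
best_value = candidates[int(np.argmax(entropies))]
```

The same loop is run once per parameter (and once per matrix type, gain or base), each sweep producing one of the entropy curves shown in Fig. 15.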