Vein Recognition Algorithm Based on Transfer Nonnegative Matrix Factorization

Most existing vein recognition algorithms are effective only on the datasets for which they were designed: once the vein image acquisition device is replaced, i.e., the properties of the collected vein images change, their performance degrades greatly. Therefore, a vein recognition algorithm based on transfer Nonnegative Matrix Factorization (NMF) is proposed, which makes vein features more universal. Its contributions are mainly reflected in the following two aspects: 1) An orthogonal constraint is imposed on the model to reduce the redundancy between feature bases and increase the difference between the features of different veins; 2) The differences between the vein features in different datasets are reduced by a Maximum Mean Discrepancy (MMD) constraint, i.e., the knowledge of the source dataset is transferred to the target dataset, and the universality of vein features is improved. Experimental results show that the proposed algorithm outperforms state-of-the-art methods on two dorsal hand vein datasets and two finger vein datasets.


I. INTRODUCTION
Vein recognition simultaneously offers liveness detection, internal features, non-contact acquisition and a special light source, so it has become an important biometric recognition technology [1]. The common way to build a vein recognition system is as follows: first, a vein image acquisition device, denoted Device A, is selected; then, a large number of vein images are collected with the device and annotated manually; finally, a suitable feature extractor and classifier are obtained through training. To build another vein recognition system, a new acquisition device, denoted Device B, is needed, and the other steps are the same as above. Collecting and annotating samples for a new system is very expensive, so if the feature extractor and classifier based on Device A could be transferred to the system based on Device B, it would be easy to make the new vein recognition system work well. However, there is currently no unified image acquisition device for vein recognition, and the properties of the images collected by different devices may differ; therefore, the feature extractor and classifier based on Device A are unlikely to suit the system using Device B. To solve this problem, we propose a novel vein recognition algorithm with feature transferability, which transfers knowledge from the source recognition system to the target recognition system using a small number of samples collected by the target system.
Its innovations mainly include the following two aspects: 1) An orthogonal constraint is imposed on the NMF model to reduce the redundancy between feature bases and to increase the feature difference between veins; 2) The universality of vein features is further improved by reducing the feature distribution differences between datasets.
The associate editor coordinating the review of this manuscript and approving it for publication was Byung-Gyu Kim.
The remainder of this paper is organized as follows. In Section II, related work is reviewed. A novel NMF model for vein recognition is proposed in Section III. Section IV uses a projected gradient algorithm to solve the proposed NMF objective function. The iteration is proved to be convergent in Section V. The effectiveness of the proposed algorithm is demonstrated through experiments in Section VI. Finally, conclusions are drawn in Section VII.

II. RELATED WORK
Existing research on vein recognition can be roughly divided into the following two categories: 1) Vein recognition based on gray features. It is very common to extract effective gray features by analyzing the high-frequency information of vein images, where the high-frequency information can be acquired with multiscale analysis tools including the traditional wavelet transform [2], [3], Bandelet transform [4], Gabor transform [5]-[10], Curvelet transform [11], Contourlet transform [12], [13], Histogram of Oriented Gradients (HOG) operator [14], spatial curve filtering [15], ridgelet transform [16], Scale-Invariant Feature Transform (SIFT) [17] and some improved Gabor transform methods [18]-[21]. In recent years, deep learning methods such as deep neural networks [22] and convolutional neural networks [23], [24] have also gradually been used in vein recognition. In addition, methods based on gray statistical distributions have been verified to be effective, such as intensity distribution [25], hierarchical hyper-sphere model [26], sparse representation [27], gradient distribution [28], etc. 2) Vein recognition based on points and curves. Effective vein features can also be extracted from the key points and vein curves in the images, where these features include the location relationship between intersections and endpoints [29]-[32], image corners [33], the shape of vein curves [34]-[37], and multiple coding modes of vein curves [38]-[43], etc.
Experimental results show that these algorithms achieve good recognition performance on the specified datasets; however, whether they can be applied to other datasets, i.e., whether they are transferable, is rarely discussed. Therefore, it is of great significance to propose an effective vein recognition algorithm that is universal across multiple datasets.

III. TRANSFER NMF MODEL WITH ORTHOGONAL CONSTRAINT
The vein images collected by different devices differ in size, so to keep the dimension of the vein features consistent, the collected images must first be normalized in size. In our algorithm, the regions of interest of all vein images are extracted using the method in [13] and normalized to M × N pixels using bilinear interpolation. Then, the original feature $f_{M \times N}$ is obtained by reshaping the image matrix into a vector.
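The size normalization and reshaping step can be illustrated with a minimal NumPy sketch. The function names `resize_bilinear` and `image_to_feature` are hypothetical, the region-of-interest extraction of [13] is not reproduced, and the corner-aligned sampling grid and row-major flattening are assumptions rather than choices stated in the paper.

```python
import numpy as np

def resize_bilinear(img, out_h, out_w):
    """Resize a 2-D grayscale image to (out_h, out_w) by bilinear interpolation."""
    img = np.asarray(img, dtype=float)
    in_h, in_w = img.shape
    # Corner-aligned sampling grid expressed in input pixel coordinates.
    ys = np.linspace(0, in_h - 1, out_h)
    xs = np.linspace(0, in_w - 1, out_w)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, in_h - 1)
    x1 = np.minimum(x0 + 1, in_w - 1)
    wy = (ys - y0)[:, None]  # vertical interpolation weights
    wx = (xs - x0)[None, :]  # horizontal interpolation weights
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bottom = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

def image_to_feature(roi, M, N):
    """Normalize an ROI to M x N pixels and flatten it (row-major) into a vector."""
    return resize_bilinear(roi, M, N).reshape(-1)
```

Stacking the resulting vectors as columns gives the feature matrix F that the NMF model factorizes.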
For many recognition problems, it is critical to establish a set of effective feature bases through dimensionality reduction. The decomposed elements obtained by classical dimensionality reduction methods such as principal component analysis (PCA) and linear discriminant analysis (LDA) [44] can be positive or negative. Although negative elements are acceptable from a mathematical perspective, they are difficult to interpret in some cases; for example, the pixels in a basis image cannot be negative. Therefore, it is appropriate to use the NMF model for feature dimensionality reduction. The principle of NMF is shown in (1), where the columns of F, U and V represent the original feature vectors, basis vectors and coefficient vectors respectively.
To improve the accuracy and universality of the recognition algorithm, it is not enough to impose only the nonnegative constraint on the decomposition, so in addition to the nonnegative constraint we impose orthogonal and transferable constraints on the decomposition, for the following reasons:

A. TRANSFERABLE CONSTRAINT
For a specified dataset, some existing NMF models can obtain effective feature bases and feature vectors [14], but once the dataset is changed, the performance of these algorithms degrades significantly. The reason is that the marginal distributions, i.e., the feature distributions, of the vein images collected by different devices are different, and if the relationship between different datasets is not taken into account in the model, the vein features obtained from the specified dataset will lack universality. Therefore, imposing a transferable constraint on the model improves the universality of the features.
Let the dataset with labeled vein images be the source domain $D_s$, in which the original feature of the $i$th vein image is $f_s^i$, and let the dataset established by another device be the target domain $D_t$, in which the original feature of the $j$th vein image is $f_t^j$. From the above analysis, the marginal distributions of the vein images in the source and target domains, i.e., the feature distributions in two datasets built with different acquisition devices, are different: $P(f_s) \neq P(f_t)$. To share the extracted features across datasets, we want the vein feature distributions in the two domains to be as similar as possible after decomposition, i.e., $P(v_s) \approx P(v_t)$, where $v_s$ and $v_t$ are the coefficient vectors, i.e., the new feature vectors in the source and target domains. Therefore, in the improved NMF model we use MMD to measure the similarity of the feature distributions as shown in (2), where $n_1$ and $n_2$ are the numbers of samples selected from the source and target domains. The objective function is then improved to (3), where $\alpha$ is a balance factor.
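As an illustration of the MMD term, the sketch below computes the squared distance between the mean coefficient vectors of the two domains, which is the linear-kernel form of MMD. Equation (2) is not reproduced in this text, so treating it as this linear form is an assumption, and the function name `linear_mmd` is hypothetical.

```python
import numpy as np

def linear_mmd(V_s, V_t):
    """Squared distance between the mean coefficient vectors of two domains.

    V_s has shape (r, n1) and V_t has shape (r, n2); each column is the
    coefficient vector of one vein image (assumed layout).
    """
    diff = V_s.mean(axis=1) - V_t.mean(axis=1)
    return float(diff @ diff)
```

The value is zero when the two domains have the same mean feature and grows as the means drift apart, which is the behavior the balance factor α is meant to penalize.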

B. ORTHOGONAL CONSTRAINT
The unique features of each vein are very helpful for accurate recognition, and we want the proposed method to reduce the redundancy between feature bases while increasing the difference between different vein features after decomposition. Therefore, in the model we impose orthogonal constraints on the decomposed basis vectors and coefficient vectors, where the orthogonal constraint on the basis vectors reduces the redundancy between feature bases, and the orthogonal constraint on the coefficient vectors increases the difference between different vein features. The objective function can be further improved as shown in (4), where $\beta$ is another balance factor. To solve the objective function conveniently, (2) can be transformed as follows: $A_s = [a_{n_1} \; 0_{n_2}]^T$, $A_t = [0_{n_1} \; a_{n_2}]^T$, where $0_n = [0 \; 0 \; \cdots \; 0]_{1 \times n}$ and $a_n = [1/n \; 1/n \; \cdots \; 1/n]_{1 \times n}$. Then (5) and (6) can be derived, from which it follows that (2) is equivalent to (7), and the objective function can be rewritten as (8).
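The role of the vectors $A_s$ and $A_t$ can be checked numerically. The sketch below assumes that V stacks the source coefficient vectors followed by the target ones as columns, so that $V A_s$ and $V A_t$ are the two domain means; the trace form with $L = (A_s - A_t)(A_s - A_t)^T$ is one plausible reading of (7), not a transcription of it. The function names are hypothetical.

```python
import numpy as np

def selector_vectors(n1, n2):
    """Build A_s = [a_n1, 0_n2]^T and A_t = [0_n1, a_n2]^T as 1-D arrays."""
    A_s = np.concatenate([np.full(n1, 1.0 / n1), np.zeros(n2)])
    A_t = np.concatenate([np.zeros(n1), np.full(n2, 1.0 / n2)])
    return A_s, A_t

def mmd_matrix_form(V, n1, n2):
    """Evaluate tr(V L V^T) with L = (A_s - A_t)(A_s - A_t)^T."""
    A_s, A_t = selector_vectors(n1, n2)
    d = A_s - A_t
    L = np.outer(d, d)
    return float(np.trace(V @ L @ V.T))
```

Writing the term as a trace makes its gradient with respect to V simply $2VL$, which is convenient when deriving update rules.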

IV. OBJECTIVE FUNCTION SOLUTION AND VEIN RECOGNITION
A. OBJECTIVE FUNCTION SOLUTION BASED ON GRADIENT DESCENT METHOD
In our method, the gradient descent method is used to solve the objective function. First, we compute the partial derivatives of the function J(U, V) with respect to the variables U and V as shown in (9) and (10). Then, U and V can be optimized iteratively as shown in (11) and (12) according to [14].
After obtaining the update rules for U and V, the optimal parameters $U^*$ and $V^*$ can be solved as shown in Algorithm 1.
Algorithm 1 Parameters optimization process
1. Input: the original feature matrix F, the model parameters r, α and β, the error threshold ξ, and the maximum number of iterations $N_{max}$.
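To make the optimization loop concrete, here is a hedged sketch that takes the same inputs as Algorithm 1 (F, r, α, β, ξ, N_max). It is not the update rule of (11) and (12), which is not reproduced in this text: it uses plain projected gradient descent with a backtracking step size, and it assumes the objective $J = \|F - UV\|_F^2 + \alpha\,\mathrm{tr}(VLV^T) + \beta(\|U^TU - I\|_F^2 + \|VV^T - I\|_F^2)$. The exact form of the orthogonality penalty and all function names are assumptions.

```python
import numpy as np

def objective(F, U, V, L, alpha, beta):
    """Assumed objective: reconstruction + MMD + orthogonality penalties."""
    I = np.eye(U.shape[1])
    recon = np.linalg.norm(F - U @ V) ** 2
    mmd = np.trace(V @ L @ V.T)
    ortho = np.linalg.norm(U.T @ U - I) ** 2 + np.linalg.norm(V @ V.T - I) ** 2
    return float(recon + alpha * mmd + beta * ortho)

def gradients(F, U, V, L, alpha, beta):
    """Partial derivatives of the assumed objective with respect to U and V."""
    I = np.eye(U.shape[1])
    R = U @ V - F
    gU = 2 * R @ V.T + 4 * beta * U @ (U.T @ U - I)
    gV = 2 * U.T @ R + 2 * alpha * V @ L + 4 * beta * (V @ V.T - I) @ V
    return gU, gV

def transfer_nmf(F, n1, n2, r, alpha, beta, xi=1e-6, n_max=200, seed=0):
    """Projected gradient descent; the first n1 columns of F are source samples."""
    rng = np.random.default_rng(seed)
    U = rng.random((F.shape[0], r))
    V = rng.random((r, F.shape[1]))
    d = np.concatenate([np.full(n1, 1.0 / n1), np.full(n2, -1.0 / n2)])
    L = np.outer(d, d)  # equals (A_s - A_t)(A_s - A_t)^T
    history = [objective(F, U, V, L, alpha, beta)]
    step = 1e-2
    for _ in range(n_max):
        gU, gV = gradients(F, U, V, L, alpha, beta)
        # Backtracking: halve the step until the projected update does not increase J.
        for _ in range(30):
            U_new = np.maximum(U - step * gU, 0.0)  # projection keeps U nonnegative
            V_new = np.maximum(V - step * gV, 0.0)  # projection keeps V nonnegative
            J_new = objective(F, U_new, V_new, L, alpha, beta)
            if J_new <= history[-1]:
                break
            step *= 0.5
        else:
            break  # no acceptable step was found, so stop
        U, V = U_new, V_new
        history.append(J_new)
        step *= 2.0  # try a larger step in the next iteration
        if history[-2] - history[-1] < xi:  # change below the error threshold
            break
    return U, V, history
```

Because a step is only accepted when it does not increase the objective, the recorded history is non-increasing by construction; this mirrors, but does not replace, the convergence argument given for the paper's own update rules.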

B. VEIN RECOGNITION BASED ON COSINE DISTANCE
Since the samples of each vein are limited, commonly used classifiers such as support vector machines or neural networks may fall into local optima during parameter optimization. In addition, if the input image does not belong to any vein in the dataset, the classification result output by these classifiers will be wrong. For these reasons, we output the recognition result using the nearest neighbor method based on cosine distance, where the recognition process is shown in Algorithm 2:
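Algorithm 2 is not reproduced in this text, so the following is only a sketch of nearest neighbor matching with cosine similarity. The rejection threshold, which lets the matcher return no identity for an unenrolled vein, and the names `recognize`, `gallery` and `labels` are assumptions made for illustration.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def recognize(v, gallery, labels, threshold=0.9):
    """Return the label of the most similar gallery column, or None.

    gallery has shape (r, n); column k is the enrolled coefficient vector
    whose identity is labels[k]. None means no enrolled vein is similar enough.
    """
    sims = [cosine_similarity(v, g) for g in gallery.T]
    best = int(np.argmax(sims))
    if sims[best] < threshold:
        return None
    return labels[best]
```

Unlike a closed-set classifier, this matcher can decline to answer, which addresses the case where the input image does not belong to any vein in the dataset.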

V. PROOF OF CONVERGENCE
To prove the convergence of (11) and (12), it is necessary to introduce some auxiliary functions, which need to meet (14). Proof: According to (13), (14) can be deduced easily.
Now, the proof of Lemma 1 is complete. From Lemma 1, we know that (11) and (12) can be proved to be convergent through defining appropriate auxiliary functions.
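For orientation, the auxiliary-function argument that is standard in the NMF literature can be written as follows. This is a sketch of the general technique with generic notation ($u^t$ denotes the value at iteration $t$), not a transcription of (13) and (14), which are not reproduced in this text.

```latex
% G is an auxiliary function for J if it majorizes J and touches it at u':
G(u, u') \ge J(u), \qquad G(u, u) = J(u).
% Updating by minimizing the auxiliary function,
u^{t+1} = \arg\min_{u} G(u, u^{t}),
% gives a non-increasing objective:
J(u^{t+1}) \le G(u^{t+1}, u^{t}) \le G(u^{t}, u^{t}) = J(u^{t}).
```

Since the objective is bounded below by zero, a non-increasing sequence of objective values converges, which is why it suffices to exhibit suitable auxiliary functions for U and V.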
Lemma 2: Suppose that U is regarded as a separate parameter in the objective function, (15) can be defined as the auxiliary function.
Proof: The partial derivative of the function J(U, V) with respect to the variable U can be solved as (16). Then, (17) can be obtained according to the Taylor series expansion, and (18) holds; therefore, $J(u) < G(u, u_{ij})$, and it is concluded that Lemma 2 is correct.
Lemma 3: Suppose that V is regarded as a separate parameter in the objective function, (19) can be defined as the auxiliary function.
Proof: The partial derivative of the function J(U, V) with respect to the variable V can be solved as (20). Then, (21) can be obtained according to the Taylor series expansion, and (22), (23) and (24) hold; therefore, it is concluded that Lemma 3 is correct.

VI. EXPERIMENT RESULTS AND ANALYSIS
A. DATASETS
In the experiments we use four vein datasets, two dorsal hand vein datasets and two finger vein datasets (Datasets 1 to 4); see Fig.1.

B. PARAMETER SETTINGS
To achieve the best recognition performance, it is critical to choose appropriate model parameters. In the proposed method, some parameters are set based on experience as shown in Table 1, and the others, namely r, α and β, are determined experimentally: the parameter values under which the best recognition performance is achieved are taken as the optimal parameters. Since we want the proposed method to achieve both high recognition accuracy and good transferability, the objective function that represents recognition performance should measure both capabilities, as shown in (25), where $R_o$ and $R_t$ represent accuracy and transferability respectively and are acquired through the following two experiments: Experiment 1: To measure the recognition accuracy, the training and test data are from the same dataset; Experiment 2: To measure the transferability, the training and test data are from different datasets. In the experiments, two images of each vein are chosen, one as the input image and the other as the image to be matched, and $R_o$ and $R_t$ are the average recognition accuracies in Experiment 1 and Experiment 2 as shown in Table 2 and Table 3, where the recognition result is obtained using Algorithm 2.
From Table 2 and Table 3, we can see that when r/(M × N ) = 0.3, α = 1, and β = 1, the best recognition performance can be achieved.

C. COMPARISON AND ANALYSIS OF ALGORITHMS
After obtaining the model parameters, we conduct comparative experiments from the following three aspects.
1) First, to demonstrate the importance of the constraint term $H(v_s, v_t)$, which represents feature transferability in the objective function, we temporarily remove $H(v_s, v_t)$ and obtain a new objective function as shown in (26). The new feature vectors in the datasets are acquired based on (26), where all parameters of the new model are the same as in the original model except the parameter α. Since t-SNE is an effective method for visualizing the similarity of feature distributions, the difference of vein features between the source and target domains is visualized using t-SNE in Fig.2.
Without the constraint term $H(v_s, v_t)$, the marginal distributions of the vein images in the two datasets are significantly different, as shown in Fig.2(a) and Fig.2(c); however, the difference almost disappears when $H(v_s, v_t)$ is imposed on the model, as shown in Fig.2(b) and Fig.2(d), where the coordinates of each point represent the features after dimensionality reduction. Therefore, the transferable constraint term effectively reduces the feature difference between different datasets, i.e., the features extracted by the proposed method have good universality across multiple datasets.
2) Then, we show that the proposed algorithm performs well by comparing the experimental results of different methods, where performance is measured by both recognition accuracy and transferability. Since each recognition algorithm has its own matching rule, comparing their recognition results requires a unified matching rule, and the following rule is adopted in the experiment. First, image pairs that represent the same vein and different veins are selected as the positive samples and negative samples respectively; then, the classification results, i.e., the probability distributions that a testing sample is recognized as each vein, are obtained with the different recognition algorithms; finally, each pair of testing samples is judged to match or not by calculating the cosine distance between their classification results.
To compare the recognition accuracies of different algorithms, the training and testing samples should come from the same dataset. Therefore, we first use the different algorithms for training and testing on Dataset 1 and on Dataset 2 separately, and measure the recognition performance by the FAR (False Accept Rate)-GAR (Genuine Accept Rate) curves. The experimental results are shown in Fig.3, where GAR_Aver is the average of the genuine accept rates obtained on Dataset 1 and Dataset 2. It can be seen from Fig.3 that the recognition performance curves of the different algorithms are relatively similar, which indicates that when training on a single dataset, most recognition algorithms can achieve good recognition results.
After comparing the recognition accuracies, we analyze the transferability of these algorithms, i.e., whether the knowledge obtained from one dataset is suitable for another dataset. In this experiment, we select the images of $N_s$ veins in Dataset 1 and $0.1N_s$ veins in Dataset 2 as the training samples, and the images of the other veins in Dataset 2 are used as testing samples. This experiment is denoted $D_1 \to D_2$, where $D_1$ and $D_2$ represent the source and target datasets respectively. Besides $D_1 \to D_2$, we also carried out $D_2 \to D_1$, $D_3 \to D_4$ and $D_4 \to D_3$, and the experimental results are shown in Fig.4.
It can be seen from Fig.4 that when the knowledge obtained from the source dataset is transferred to the target dataset, the proposed algorithm still achieves a high recognition accuracy, whereas the performance of the other algorithms degrades to varying degrees. It is concluded that for a recognition system that uses a new vein image acquisition device, we do not need to re-collect and label a large number of samples, but can transfer the recognition algorithm of another system into this system using only a small number of samples.
To further compare the transferability of different algorithms, we vary the number of samples selected from the target dataset and record how the transferability of each algorithm changes, as shown in Table 4, where $N_t$ is the number of target-dataset samples used in training, and the entries are the genuine accept rates when the false accept rate is fixed at 0.1.
It can be seen from Table 4 that the existing algorithms cannot achieve good recognition performance on the testing images until the ratio of the number of training samples in the target dataset to that in the source dataset is close to 50%. In contrast, the proposed algorithm needs only a small number of samples from the target dataset to achieve good recognition performance.
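The FAR-GAR measurements used in this section can be sketched as follows. The code assumes each compared pair has a similarity score where larger means more alike (for example a cosine similarity), that a pair is accepted when its score reaches the threshold, and that the operating point is chosen among the observed scores; the function names are hypothetical and the paper's exact evaluation protocol may differ.

```python
import numpy as np

def far_gar_curve(genuine, impostor, thresholds):
    """FAR and GAR at each threshold for similarity scores.

    genuine: scores of pairs from the same vein (positive samples).
    impostor: scores of pairs from different veins (negative samples).
    """
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    far = np.array([(impostor >= t).mean() for t in thresholds])
    gar = np.array([(genuine >= t).mean() for t in thresholds])
    return far, gar

def gar_at_far(genuine, impostor, target_far):
    """Highest GAR among thresholds whose FAR does not exceed target_far."""
    thresholds = np.unique(np.concatenate([genuine, impostor]))
    far, gar = far_gar_curve(genuine, impostor, thresholds)
    ok = far <= target_far
    return float(gar[ok].max()) if ok.any() else 0.0
```

Calling `gar_at_far(genuine, impostor, 0.1)` corresponds to the kind of entry reported in Table 4, a genuine accept rate at a false accept rate of 0.1.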

VII. CONCLUSION
To improve the universality of vein recognition across multiple vein datasets, i.e., vein image acquisition devices, we propose a novel vein recognition algorithm with good transferability. Its most significant contribution is that, for a vein recognition system with a new image acquisition device, effective features can be obtained using only a small number of images collected by the device, without establishing a large-scale dataset. Experimental evidence shows that the proposed approach outperforms state-of-the-art methods on standard vein datasets in terms of transferability. However, although good results have been achieved, some problems remain to be solved. For example, to further improve the universality of the algorithm, it is necessary to increase the sizes of the datasets in future work.
He is currently an Associate Professor with the College of Information Science and Engineering, Northeastern University. His research lies at the intersection of machine learning and image processing. His current research interest is to develop deep learning algorithms for medical image processing and industrial intelligent systems.
VOLUME 8, 2020