PalmHashNet: Palmprint Hashing Network for Indexing Large Databases to Boost Identification

Palmprint identification aims to establish the identity of a given query sample by comparing it with all the templates in the database and locating the most similar one. It becomes computationally expensive as the database grows because the number of comparisons is proportional to the number of stored templates. The process needs to be accelerated to get a response in real time, especially for large databases. This paper proposes a palmprint database indexing approach called PalmHashNet that generates highly discriminative embeddings and creates a fixed-size candidate list for comparison, making identification a constant-time operation. Acquired palmprint images are fed to the feature extraction network, which is pre-trained using softmax loss. A margin is added to the softmax loss to minimize the intra-class distance between samples belonging to the same class. This ensures that the features have high intra-class and low inter-class similarity. k-means and locality sensitive hashing (LSH) are investigated for index table creation. In this setting, cluster centers for k-means and hash values for LSH serve as indices. Features are extracted for a given query palmprint and compared with the index values, and the candidates lying in the most similar bin are retrieved for identification. The advantage of the proposed approach is that the query palmprint is compared with a small percentage of the database instead of the whole. The proposed approach offers probabilistic guarantees for query identification in the selected bin. Experiments are conducted on four widely used palmprint databases, viz. CASIA, IITD-Touchless, Tongji-Contactless, and Hong Kong Polytechnic University Palmprint II (PolyU II). Penetration rates of 0.022%, 1.032%, 4.555%, and 0.39% at 100% hit rate are achieved on these databases, respectively, making the identification process approximately 4500, 96, 21, and 256 times faster.


I. INTRODUCTION
Many real-world applications require access control and security. In contrast to traditional modes of authentication such as PINs, passwords, and tokens, which can be stolen or forged, biometrics provide a mechanism to authenticate an individual by analyzing their physical (fingerprint, palmprint, iris, face, etc.) or behavioral (gait, voice, signature, etc.) traits. Recently, palmprint-based biometrics have gained wide popularity due to their non-intrusive nature, easy acquisition, and robust textural features. A palmprint is an impression acquired from the inner part of the hand lying between the wrist and the fingers. A region of interest (RoI) of the palmprint is extracted from the acquired hand images of the individuals. It consists of complex and unique patterns that are utilized as features for human authentication. The features are classified as high resolution and low resolution based on the quality of the acquired image. The low-resolution features include wrinkles, texture, and principal lines, which can be further sub-classified as heart, head, and life lines. These are visible to the naked eye as well. On the other hand, ridges, singular points, and minutiae points can only be extracted from high-resolution images [1]. These features are shown in FIGURE 1 (b) and (c), respectively.
A biometric authentication system involves two stages, viz. enrolment and recognition. Enrolment, also known as registration, is the phase in which palmprint samples are acquired from the population and stored in a database with a unique identifier. Palmprint recognition refers to the process of authenticating a person by comparing their palmprint sample with the palmprint template(s) stored in the database. It is further classified into verification and identification. During palmprint verification, a palmprint sample is given to the system along with its claimed identity. The input image is compared with the template stored in the database under the claimed identity, and the system outputs either "matched" or "not matched". During identification, on the other hand, the identity of the sample is unknown and needs to be established. To do so, the given sample is compared with all the templates in the database and the most similar one is retrieved as its true match. To sum up, verification involves only one comparison, whereas identification involves N comparisons, where N is the number of templates in the database.
The associate editor coordinating the review of this manuscript and approving it for publication was Weizhi Meng.
With digitization and the increasing deployment of biometric authentication systems such as Aadhaar (India) [2] and MyKad (Malaysia) [3], the size of biometric databases has grown. With more enrolments, identification becomes less efficient and computationally expensive because the number of required comparisons is proportional to the size of the database. The problem can be addressed if the search space can be narrowed down while still returning the true match of the given sample [4]. This paper addresses the problem of accelerating human identification in large palmprint databases. Research suggests the use of a classification-based approach that splits a biometric database into different categories. When a query sample is provided to the identification system, its category is determined first. Then all the samples belonging to that category are retrieved as a candidate set for comparison with the query. An approach discussed in [5] utilizes gender to narrow down the search space of a face database. However, such a division of the database is generally skewed and non-uniform across databases. Therefore, it will not always ensure an efficient reduction in the search space. Hence, there is a need to devise a technique to generate a suitable list of candidates for comparison with the query palmprint. The process of finding the top-t candidates for comparison with the probe image to accelerate identification is termed indexing of biometric databases. Indexing eliminates the need for a linear search of the database to identify biometric query samples. Instead, the index table returns only a subset of the database that tends to be the most similar to the query image. The advantage of indexing a biometric database in the process of identification is depicted in FIGURE 2. There are three components of indexing, viz. feature extraction, index table generation, and retrieval.
In the first stage, features are learned for each palmprint image using the proposed network. The learned feature vectors have salient and highly discriminative characteristics of the image. In the second stage, the extracted features that are similar to each other are grouped and are associated with an index. Since there exists high intra-class and low inter-class similarity among the features, the similar feature vectors end up going in the same bin of the index table [6]. The fundamental idea is to find a suitable feature vector and generate an index that could remarkably reduce the number of comparisons. The next stage, retrieval, aims to extract the feature vector for the probe image and find out the most similar index/bin. The candidates lying in the selected bin are fetched out for comparison with the probe sample.
However, certain challenges need to be accounted for while designing an indexing technique for a biometric database [4].
1) Variable number of features: The features extracted from a palmprint image at two different time stamps are not necessarily the same. Some features may be missing, while some false features may also appear. Hence, the features may vary in number.
2) No order: Unlike some structural data, there is no pre-defined order among biometric features. They may therefore appear in a different order, which becomes a challenge.
3) Occlusion and illumination: The presence of occlusion and illumination changes, as well as rotation and translation, may also affect the feature extraction process. The features may vary in number because of these variations.
4) Transformation: The acquired image may be rotated, zoomed in, zoomed out, etc., thus transforming the extracted features.
Any indexing technique is therefore expected to address all of the aforementioned challenges.
The performance of any biometric indexing technique is determined by the quality of the extracted features. The features are expected to have low inter-class and high intra-class similarity. After traditional handcrafted approaches, many CNN-based methods have been proposed that automatically learn indexing-suitable features for palmprint databases. The feature extraction process is carried out by training the network for a classification problem using softmax cross-entropy loss [7]. The features are extracted from the layer connected just before the fully connected layer. It is believed that the learned feature vectors are good if they are able to classify the input image correctly. However, the learned features may not be optimal if no explicit constraint is applied on the feature vector distribution, which may lead to a wide spread of the learned feature vectors. To address this, metric-based methods have been introduced that use a distance-based criterion to separate feature embeddings from different classes and bring embeddings of the same class closer. This paper proposes a novel metric-based palmprint feature extraction network that uses a function called 'additive margin loss' [8] to supervise the training process. The softmax loss is capable of maximizing the inter-class distance among samples of different classes but is unable to minimize the intra-class dissimilarity among samples of the same class. Therefore, a margin is added to the loss function to handle the intra-class variation better. The extracted features are indexed using k-means clustering and Locality Sensitive Hashing to boost the identification process. A block diagram of the proposed approach is shown in FIGURE 3. Key contributions of the paper are listed below.
Contributions:
1) This paper proposes PalmHashNet, a novel indexing technique that learns compact feature vectors for palmprint images to facilitate faster identification. The proposed approach learns features suitable for indexing that ensure fast and accurate retrieval during identification.
2) Softmax loss with additive margin has been used to train the model for palmprint database indexing and to learn the feature vector embeddings simultaneously. This loss function ensures that the learned feature embeddings have low inter-class along with high intra-class similarity. It also ensures that the index space distribution is regularized to be similar to the uniform distribution. The network outputs a distinct, compact 512-dimensional feature embedding for every palmprint sample, which is associated with an index in the index table.
3) Two different techniques, viz. k-means clustering and locality sensitive hashing (LSH) with k-nearest neighbor search, have been explored for index table creation. The generated index is used for the retrieval of top-t matches for identification. The indexing performance of both techniques has been compared.
4) The proposed approach has been evaluated on four publicly available databases, viz. the CASIA [9], IIT Delhi Touchless [10], Tongji Contactless [11], and PolyU II [12] palmprint databases. All the databases contain palmprint images collected in an unconstrained environment. The results show the efficiency of the learned features in the identification process.
The rest of the paper is organised as follows. The next section discusses related work in the field of palmprint recognition and palmprint database indexing. Section III describes the proposed approach, i.e., feature extraction, indexing, and retrieval, in detail. The experimental setting and results are analyzed in Section IV, followed by the conclusion in Section V.

II. RELATED WORK
This section discusses the related work that has been done in the area of palmprint recognition and palmprint database indexing. The quality of the learned feature vectors plays a major role in the effectiveness of the system. This paper compares the recognition performance on the palmprint databases with other state-of-the-art approaches in TABLE 4 to justify the goodness of the learned features through PalmHashNet. Therefore, this section gives an overview of the palmprint recognition and indexing approaches proposed in the literature.

A. PALMPRINT RECOGNITION
Palmprint recognition started with Boles and Chu [13], who proposed that palm features such as shape and lines could be used for human authentication. Lu et al. [14] claimed that matching palmprint samples using principal lines alone is not reliable, as some people may have similar patterns of principal lines. Therefore, they utilized a circular Gabor filter to extract features from low-resolution palmprint images. Kong and Zhang [15] proposed a palmprint verification system based on the orientation information in palmprint lines. They proposed the competitive code, which extracts orientation information using 2-D Gabor filters, and used angular matching to compare the generated codes. An improved version of the competitive code, called robust line orientation code (RLOC), was proposed by Jia et al. [16]. Feature extraction was performed using a modified version of the Radon transform, and the features were matched using pixel-to-area comparison. Zuo et al. [17] extended the competitive code and proposed a multi-scale orientation palmprint feature extraction method called sparse multi-scale competitive code (SMCC), which is robust to illumination and scale variation. A palmprint verification method combining dominant orientation code and side code (DRCC) was proposed in [18]. It extracts both codes by applying weights on the Gabor filter responses to improve the results further. A local binary pattern-based feature descriptor (LLDP) to extract local features from palmprint images was proposed in [19]. Li and Kim [20] proposed a local feature descriptor that takes into account direction and thickness information, making it robust to translation and rotation. Zhong et al. [21] proposed a siamese network utilizing two weight-sharing VGG-16 networks to learn discriminative features for palmprint images. A histogram feature descriptor was proposed in [22]. It uses the apparent direction and the latent direction computed from the energy map of the apparent direction, and combines them to form a single feature descriptor for palmprint images. Zhao and Zhang [23] proposed a CNN-based local feature extraction network in which a palmprint image is divided into five parts, and these parts, along with the complete palmprint image, are given to the network. Zhong and Zhu [24] proposed a palmprint recognition system combining large margin cosine loss and center loss.

B. PALMPRINT INDEXING
The first technique for palmprint identification was proposed by You et al. [25], where a hierarchy of four-level features was used. The paper uses different matching strategies at different stages while reducing the search space. Li et al. [26] proposed a palmprint retrieval technique using texture features. The search was performed in two stages: first, global features were used to find a small candidate list, and then local features were used to output the final result from the selected candidates. Paliwal et al. [27] made use of a Vector Approximation (VA+) file database to generate a score-based indexing scheme. A ridge feature-based indexing technique was proposed by Yang et al. [28]. The paper first aligns all the palmprint images in a unified coordinate system and then uses ridge density and orientation information for indexing. Chen et al. [29] proposed a technique that outputs a binary feature vector and applied spectral hashing to index the feature embeddings. Yue et al. [30] proposed two techniques that used different features for hash table creation. The first is based on the orientation pattern (OP), which refers to orientation features, while the other uses principal orientation patterns (POP), i.e., orientation patterns that lie in the region of principal lines. An accelerated and improved indexing technique that uses features generated from POP was proposed in [31]. A convolutional neural network (CNN) based feature extractor was proposed in [32]. The network outputs a 128-d feature vector, on which supervised hashing is then applied. A method using the difference of block means was proposed by Almagtuf et al. [33]. In this method, no feature extraction is performed. Rather, simple operations such as adding and subtracting overlapping blocks are used to compute the palmprint code in each direction. Chen et al. [34] proposed a double-orientation feature to account for unstable orientation fields, and further used a window-based feature measurement for faster retrieval. A distillation-based loss function that generates binary feature vectors for palmprint images was proposed in [35]. Zhu et al. [36] proposed an adversarial metric learning technique to make the palmprint embeddings uniformly distributed over a hypersphere. This is done by utilizing distance metrics and confusion terms. The paper also introduced a new palmprint database collected in an unconstrained environment. Zhao and Zhang [37] proposed a deep convolutional neural network-based technique that extracts discriminative features from the whole palmprint image and from its patches separately and combines them into a compact feature. Jia et al. [12] evaluated the performance of various hashing techniques for the retrieval of palmprint images. The paper considered four methods each from supervised, unsupervised, and deep hashing, and used the PolyU II, PolyU M_B, HFUT, TJU, and PolyU 3D databases for evaluation. It was reported that column sampling-based discrete supervised hashing (COSDISH) [38] performs best among the considered hashing techniques. Recently, an end-to-end CNN-based network that learns binary hash values for palmprint images was proposed in [39]. It uses structural and pixel-level features by adding a similarity measurement module after the last fully connected layer.
VOLUME 9, 2021

III. PROPOSED APPROACH
This section provides details of the proposed approach for indexing a palmprint database. There are three components in this approach, viz. feature extraction, indexing of the extracted features, and lastly, retrieval of a suitable candidate list for matching with the probe sample. The image acquired from the palm sensor may contain unnecessary content such as parts of the hand other than the palm or background clutter. Therefore, the region of interest (RoI) is segregated from the acquired images before feature extraction. This region should contain only the palm region, which lies between the ends of the fingers and the wrist, as shown in FIGURE 1(a). Various techniques have been proposed in the literature that aim at extracting the palmprint RoI [40]-[51]. The most popular RoI extraction technique was proposed by Zhang et al. [43]. It starts by binarizing the input image to obtain the boundary of the hand, which separates it from the background. The next step is to form a unified coordinate system. Boundaries are traced along the gap between the little and ring fingers and the gap between the middle and index fingers. The boundary of the gap between the ring and middle fingers is not needed and is therefore not extracted. A tangent is drawn connecting the two boundaries, and two points $(x_1, y_1)$ and $(x_2, y_2)$ are selected that lie on the respective boundaries and the tangent. These two points are joined to get the y-axis of the coordinate system. The origin of the coordinate system is obtained at the intersection of the y-axis and the line perpendicular to the y-axis that passes through the mid-point of the image. A sub-image is extracted from this image for feature extraction.
For feature extraction, we propose PalmHashNet, a metric-based network that learns discriminative feature vectors for input palmprint images. These feature vectors are later indexed using k-means clustering and Locality Sensitive Hashing for index table creation. During retrieval, when a query palmprint needs to be identified, it is passed to the trained PalmHashNet, which extracts a feature vector corresponding to the query image. The query feature vector is then compared with all the indices (cluster centers or hash values) to find the most similar bin, and all the candidates lying in that bin are retrieved for comparison. Hence, instead of matching the query palmprint image with all the samples in the database, it is matched only with the retrieved candidates; with a fixed-size candidate list, the matching step has O(1) time complexity with respect to the database size. All the stages of the proposed approach are described in the following subsections.
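The benefit of matching against only the retrieved bin can be related to the penetration rates quoted in the abstract with simple arithmetic. The sketch below assumes that the speedup is the ratio of comparisons in an exhaustive search to comparisons after indexing, i.e., 100 divided by the penetration rate in percent; it ignores the cost of comparing the query with the index values themselves. The function and variable names are illustrative and not taken from the paper.

```python
# Penetration rates (% of the database searched at 100% hit rate), copied from the abstract.
penetration_rate_percent = {
    "CASIA": 0.022,
    "IITD-Touchless": 1.032,
    "Tongji-Contactless": 4.555,
    "PolyU II": 0.39,
}


def speedup_from_penetration(rate_percent: float) -> float:
    """Ratio of comparisons in an exhaustive search to comparisons after indexing."""
    if not 0.0 < rate_percent <= 100.0:
        raise ValueError("penetration rate must lie in (0, 100]")
    return 100.0 / rate_percent


speedups = {name: speedup_from_penetration(rate)
            for name, rate in penetration_rate_percent.items()}

for name, factor in speedups.items():
    print(f"{name}: roughly {factor:.0f}x fewer comparisons")
```

Under this assumption the computed factors land close to the approximate speedups of 4500, 96, 21, and 256 reported in the abstract.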

A. FEATURE EXTRACTION
The first stage in the identification process is feature extraction, in which salient and discriminative features are extracted from palmprint images. The performance of any biometric indexing technique is determined by the quality of the extracted features. The features are expected to have low inter-class and high intra-class similarity. In other words, features belonging to the same class should be closer to each other in the feature embedding space than features belonging to different classes. This condition is required for nearest-neighbor search to return the correct match, which in turn makes indexing and, further, the identification process efficient.
Generally, the feature extraction process is carried out by training a classification network using softmax loss [7]. Softmax loss is defined as the combination of the softmax function, the cross-entropy loss, and the last layer of a convolutional neural network [52]. Different subjects or individuals are treated as different classes, and the layer connected just before the fully connected layer serves as the feature embedding layer. Mathematically, the softmax loss $L_{SM}$ is defined as:

$$L_{SM} = -\frac{1}{N}\sum_{i=1}^{N} \log \frac{e^{a_{y_i}}}{\sum_{j=1}^{C} e^{a_j}} \tag{1}$$

where $N$ and $C$ are the total number of samples and the number of classes, respectively. The activation of the $j$-th neuron in the last fully connected layer having weight $W_j$ and bias $b_j$ for the $i$-th palmprint sample with feature $f_i$ is given as $a_j = W_j^T f_i + b_j$. There are $C$ activations, one corresponding to each class. Let the ground truth for the $i$-th palmprint sample be the class $y_i$, where $y_i \in \{1, 2, \ldots, C\}$; then the activation of the corresponding neuron is $a_{y_i} = W_{y_i}^T f_i + b_{y_i}$. Using this, Eq. 1 can be written as:

$$L_{SM} = -\frac{1}{N}\sum_{i=1}^{N} \log \frac{e^{W_{y_i}^T f_i + b_{y_i}}}{\sum_{j=1}^{C} e^{W_j^T f_i + b_j}} \tag{2}$$

Considering a binary classifier, the posterior probabilities of a palmprint having the feature vector $f_i$ belonging to the class $C_1$ or $C_2$ can be obtained by using the softmax as shown in Eq. 3 and Eq. 4, respectively:

$$p(C_1) = \frac{e^{W_1^T f_i + b_1}}{e^{W_1^T f_i + b_1} + e^{W_2^T f_i + b_2}} \tag{3}$$

$$p(C_2) = \frac{e^{W_2^T f_i + b_2}}{e^{W_1^T f_i + b_1} + e^{W_2^T f_i + b_2}} \tag{4}$$

where $(W_1, b_1)$ and $(W_2, b_2)$ are the weight and bias corresponding to the classes $C_1$ and $C_2$. The classifier outputs $C_1$ as the class of the query palmprint if $p(C_1) > p(C_2)$, and $C_2$ otherwise. The classification depends solely on the weight and bias terms and uses $W_j^T f_i + b_j$ for the decision. Since $W_j^T f_i$ is the dot product of $W_j$ and $f_i$, the activation can be rewritten as $a_j = \|W_j\| \, \|f_i\| \cos\theta_j + b_j$, where $\theta_j$ is the angle between the vectors $W_j$ and $f_i$. It can be observed that the activation depends on both the angle $\theta_j$ and the weight vector norm $\|W_j\|$. If the norms are set to unity, the classification becomes directly dependent on the angle between $f_i$ and the $W_j$'s. Therefore, the weight vectors and feature vectors are normalized as shown in Eq. 5:

$$\hat{f}_i = \frac{f_i}{\|f_i\|}, \qquad \hat{W}_j = \frac{W_j}{\|W_j\|} \tag{5}$$

Here, $f_i$ and $W_j$ are the original feature and weight vectors, respectively, and Eq. 5 sets the norms of the normalized vectors to unity. After normalization, and with the bias terms omitted, the posterior probabilities given in Eq. 3 and Eq. 4 depend only on $\cos\theta_j$. Therefore, the decision boundary becomes $\cos\theta_1 - \cos\theta_2 = 0$. Normalized features can be plotted on a hypersphere manifold with a fixed radius (say $r$), as shown in FIGURE 4. The softmax loss in Eq. 2 can now be represented as:

$$L_{SM} = -\frac{1}{N}\sum_{i=1}^{N} \log \frac{e^{r\cos\theta_{y_i}}}{e^{r\cos\theta_{y_i}} + \sum_{j=1, j \neq y_i}^{C} e^{r\cos\theta_j}} \tag{6}$$

FIGURE 4. Geometrical representation of (a) softmax loss and (b) additive margin loss. $DB_0$ is the decision boundary created by the softmax loss, whereas $DB_1$ and $DB_2$ are the decision boundaries learned by additive margin loss for classes $C_1$ and $C_2$, respectively. The boundary becomes a regional margin instead of a single vector when the additive margin is applied.

During training, if a sample belongs to class $C_1$, the angle between $f_i$ and $W_1$ is driven to be smaller than the angle between $f_i$ and $W_2$. This kind of decision boundary works for classification, but it does not enforce higher intra-class similarity, as compact localization of the features of the same class is not mandatory. Therefore, features obtained from different samples of the same class, covering the usual variation, are scattered around the feature space. This phenomenon is more visible if the samples contain high intra-class variations such as occlusion, pose, and illumination.
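The effect of normalizing both the features and the weights can be checked numerically. The following is a minimal sketch in plain Python, assuming a scale `r = 30` and a toy two-class, two-dimensional setup; these values and all function names are illustrative and not taken from the paper. It evaluates the normalized softmax loss in the form of Eq. 6 and shows that rescaling a feature vector does not change the loss, because only the angles to the class weight vectors matter.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine_logits(feature, weights):
    """Return cos(theta_j) between the feature and each class weight vector."""
    f = l2_normalize(feature)
    return [sum(a * b for a, b in zip(f, l2_normalize(w))) for w in weights]


def normalized_softmax_loss(features, labels, weights, r=30.0):
    """Average of -log softmax(r * cos(theta))[y_i] over the samples."""
    total = 0.0
    for f, y in zip(features, labels):
        logits = [r * c for c in cosine_logits(f, weights)]
        top = max(logits)  # subtract the maximum for numerical stability
        log_sum = top + math.log(sum(math.exp(z - top) for z in logits))
        total += log_sum - logits[y]
    return total / len(features)


# Toy data: two classes in a 2-D embedding space (hypothetical values).
weights = [[1.0, 0.0], [0.0, 1.0]]
features = [[1.0, 0.8], [0.6, 1.0]]
labels = [0, 1]

loss = normalized_softmax_loss(features, labels, weights)
rescaled = [[10.0 * x for x in f] for f in features]
loss_rescaled = normalized_softmax_loss(rescaled, labels, weights)
loss_wrong_labels = normalized_softmax_loss(features, [1, 0], weights)

print(f"loss = {loss:.6f}, after rescaling the features = {loss_rescaled:.6f}")
print(f"loss with swapped labels = {loss_wrong_labels:.6f}")
```

The loss is small but non-zero for correctly classified samples, which illustrates why plain softmax stops pushing same-class features together once they fall on the right side of the boundary.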
Algorithm 1 Feature Extraction and Indexing
Input: Set of palmprint images $P = \{P_1, P_2, \ldots, P_N\}$
Output: Index table $I$
1: for each palmprint sample $P_i$ do
2:   Extract feature $f_{P_i}$ using the trained PalmHashNet.
3:   Append $f_{P_i}$ to the feature set $f_P$, i.e., $f_P \leftarrow f_P \cup \{f_{P_i}\}$
4: end for
5: Apply k-means clustering or Locality Sensitive Hashing on the feature vector set $f_P$.
6: for each cluster center $cc_i$ (k-means) or hash value $h_i$ (LSH) do
7:   Create an entry in index table $I$ with $cc_i$ or $h_i$.
8:   Put the IDs and feature vectors of all the candidates lying in $cc_i$ or $h_i$ in $I$.
9: end for
return $I$

In order to make the decision boundary more stringent, a margin $m$ is introduced into the decision rule. Consider a sample belonging to class $C_1$. Plain softmax only requires $\cos\theta_1 > \cos\theta_2$. With the additive margin of [8], the condition that needs to be satisfied becomes $\cos\theta_1 - m > \cos\theta_2$. Since $\cos\theta_1$ is larger than $\cos\theta_1 - m$, which in turn must be greater than $\cos\theta_2$, $\cos\theta_1$ has to exceed $\cos\theta_2$ by at least $m$. The decision boundary for class $C_1$ then becomes $\cos\theta_1 - m = \cos\theta_2$. Similarly, to correctly classify a feature belonging to class $C_2$, it is required that $\cos\theta_2 - m > \cos\theta_1$, and the decision boundary becomes $\cos\theta_2 - m = \cos\theta_1$. Due to the margin, the lower bound of $\cos\theta_1$ for a sample of class $C_1$ becomes $\cos\theta_2 + m$, thereby enforcing higher intra-class compactness. The modified softmax loss is written as:

$$L_{AM} = -\frac{1}{N}\sum_{i=1}^{N} \log \frac{e^{r(\cos\theta_{y_i} - m)}}{e^{r(\cos\theta_{y_i} - m)} + \sum_{j=1, j \neq y_i}^{C} e^{r\cos\theta_j}} \tag{7}$$
The geometrical representation of the additive margin loss is shown in FIGURE 4. It shows that the initial decision boundary, i.e., the one created by softmax, is replaced by $DB_1$ and $DB_2$ for classes $C_1$ and $C_2$, respectively. Therefore, we can conclude from the figure that the intra-class difference among the samples of the same class is reduced by introducing this extra margin region between the two boundaries.
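How the margin tightens the decision rule can be seen on a two-class toy example. The sketch below assumes the additive margin formulation in which the margin $m$ is subtracted from the cosine of the target class, as in the additive margin softmax cited as [8]; the values `r = 30` and `m = 0.35` and the function names are illustrative assumptions, not settings reported in the paper.

```python
import math


def am_softmax_loss(cosines, label, r=30.0, m=0.35):
    """-log of the softmax probability of the true class, with the margin m
    subtracted from the target-class cosine before scaling by r."""
    logits = [r * (c - m) if j == label else r * c for j, c in enumerate(cosines)]
    top = max(logits)  # subtract the maximum for numerical stability
    log_sum = top + math.log(sum(math.exp(z - top) for z in logits))
    return log_sum - logits[label]


def satisfies_margin(cosines, label, m=0.35):
    """True when cos(theta_y) - m exceeds the cosine of every other class."""
    others = [c for j, c in enumerate(cosines) if j != label]
    return cosines[label] - m > max(others)


# cos(theta_1) = 0.6 and cos(theta_2) = 0.4: correctly classified by plain
# softmax, but inside the margin region for m = 0.35.
borderline = [0.6, 0.4]
# A compact sample that sits close to its own class weight vector.
compact = [0.95, 0.10]

loss_plain = am_softmax_loss(borderline, 0, m=0.0)
loss_margin = am_softmax_loss(borderline, 0, m=0.35)
loss_compact = am_softmax_loss(compact, 0, m=0.35)

print(f"borderline sample: plain = {loss_plain:.4f}, with margin = {loss_margin:.4f}")
print(f"compact sample with margin = {loss_compact:.6f}")
```

The borderline sample incurs almost no loss without the margin but a large loss with it, so training keeps pulling it toward its class center, which is the intra-class compactness effect described above.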
PalmHashNet: This paper proposes a deep convolutional network named PalmHashNet that extracts features from palmprint images using the modified softmax loss function given in Eq. 7. It has been observed that simply making a network deeper can lead to loss of information: accuracy saturates and eventually degrades. To address this problem, consider a shallow and a deep network, where the deep network consists of the shallow network plus some additional layers that act as an identity function. In the worst case, the deeper network then behaves like its shallower counterpart, while it may also learn better features and reduce the error significantly. Residual networks, which consist of a series of residual units, are widely popular for image classification problems. Residual connections are added to the network to retain information from the previous layers. The proposed approach uses ResNet-18 as the backbone architecture to learn important yet discriminative features from palmprint images for indexing. The feature extraction process starts by training ResNet-18 with the softmax loss. There are 16 convolution layers, two max-pooling layers and a fully connected layer in the ResNet-18 architecture. The filter size of the first convolution layer is 7 × 7, while in the other layers it is 3 × 3. This is followed by a global average pooling layer and a batch normalization layer. The global average pooling (GAP) layer aggregates the input features by averaging each channel over its spatial dimensions. This consolidation reduces the number of parameters required and thus the chances of over-fitting. The output feature also becomes robust to spatial translations of the input images [53]. Activations of the GAP layer are fed to the batch normalization (BN) layer [54], which normalizes the input by subtracting the mini-batch mean and dividing by the mini-batch standard deviation. A mini-batch refers to a subset of the training data that is given to the network in one training iteration. Batch normalization smoothens the landscape of the loss function by bringing all the input dimensions to a similar distribution, resulting in faster training of the model. A dropout layer is introduced to further reduce over-fitting. It is followed by a fully connected layer of 512 neurons. The last fully connected layer serves as the feature embedding. The weights of this layer, along with the feature embeddings, are normalized. This makes the classification process solely dependent on the angle $\theta$ formed between the feature vector and the weight vector. The whole network (PalmHashNet) is then trained in an end-to-end manner using the modified softmax loss given in Eq. 7. The architecture is shown in TABLE 1. PalmHashNet learns feature embeddings that have more intra-class and less inter-class similarity. The learned feature vectors are sent to the indexing module for index table creation, explained in the next sub-section.

Algorithm 2 Retrieval
Input: Query palmprints $q_j$, index table $I$
Output: Candidate list $X$
1: for each query palmprint $q_j$ do
2:   Extract feature $f_{q_j}$ using the trained PalmHashNet.
3:   for each index $i$ in $I$ do
4:     Compute cosine similarity between hash value ($h_i$) and $f_{q_j}$.
5:     Create an entry in table $S$ with $h_i$ and its cosine similarity.
6:   end for
7:   Find the maximum score value.
8:   Retrieve the IDs stored in the most similar bin to get candidate list $X$ for matching.
9: end for
return $X$
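The retrieval steps (score every index value against the query feature, pick the best-scoring bin, return its IDs) can be sketched in a few lines. This is a minimal illustration in plain Python that assumes the query feature has already been extracted by the trained network and that the index table is a list of (index vector, ID list) pairs; the data layout, the IDs, and the function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieve_candidates(query_feature, index_table):
    """Return the IDs stored in the bin whose index vector is most similar
    to the query feature.

    index_table is a list of (index_vector, [palmprint IDs]) pairs.
    """
    scores = [cosine_similarity(query_feature, idx) for idx, _ in index_table]
    best_bin = max(range(len(scores)), key=scores.__getitem__)
    return index_table[best_bin][1]


# Toy index with three bins in a 3-D embedding space (hypothetical values).
index_table = [
    ([1.0, 0.0, 0.0], ["P001", "P002"]),
    ([0.0, 1.0, 0.0], ["P003"]),
    ([0.0, 0.0, 1.0], ["P004", "P005"]),
]

candidates = retrieve_candidates([0.1, 0.9, 0.2], index_table)
print(candidates)
```

Only the candidates returned here would then be matched against the query, so the number of full template comparisons depends on the bin size rather than on the size of the database.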

B. INDEXING
FIGURE 5. Sample palmprint region of interest (RoI) images from the CASIA-Palmprint [9], IIT Delhi Touchless [10], Tongji Contactless [11], and PolyU II Palmprint [12] databases (row-wise, respectively).

Identification aims at finding the closest or most similar sample in the database for a query palmprint sample. The naive approach for identification involves comparing the query sample with all the images in the database and sorting the similarity score list to find the most suitable match. However, this process becomes computationally expensive with an increase in the size of the database. Therefore, there is a need to reduce the number of comparisons by reducing the search space for efficient identification. This is accomplished by indexing the database, i.e., associating the extracted features with an index and generating an index table that can be referred to during retrieval. This paper uses two techniques, namely 1) k-means clustering [55] and 2) Locality Sensitive Hashing [56], to explore clustering and hashing as means for indexing the considered palmprint databases. The algorithm for feature extraction and indexing is given in Algorithm 1.
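For the hashing route, the index-building loop of Algorithm 1 can be sketched as follows. The text does not state which LSH family is used, so this example assumes random-hyperplane (sign random projection) hashing, a standard choice when similarity is measured by the angle between vectors. The number of bits, the seed, the toy features, and all function names are assumptions made for illustration only.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplane normals with Gaussian entries."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hash_key(feature, hyperplanes):
    """One bit per hyperplane: 1 if the feature lies on its non-negative side."""
    return tuple(int(sum(a * b for a, b in zip(feature, h)) >= 0.0)
                 for h in hyperplanes)


def build_lsh_index(features_by_id, hyperplanes):
    """Map each hash key to the list of palmprint IDs that fall into that bin."""
    index = defaultdict(list)
    for palm_id, feature in features_by_id.items():
        index[hash_key(feature, hyperplanes)].append(palm_id)
    return dict(index)


# Toy 4-D features (hypothetical values): a vector, a rescaled copy, its negation.
base = [0.9, 0.1, 0.0, 0.2]
features_by_id = {
    "P001": base,
    "P001_rescaled": [2.0 * x for x in base],
    "P002": [-x for x in base],
}

hyperplanes = make_hyperplanes(num_bits=8, dim=4)
lsh_index = build_lsh_index(features_by_id, hyperplanes)

for key, ids in lsh_index.items():
    print("".join(map(str, key)), ids)
```

Vectors pointing in the same direction receive the same key regardless of their length, which matches the angle-based view of the normalized embeddings, while the opposite vector lands in a different bin.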

1) k-MEANS CLUSTERING
The objective of the k-means clustering algorithm is to partition a set of feature vectors into a specified number of disjoint groups. Let there be a feature vector set $F = \{f_{P_1}, f_{P_2}, \ldots, f_{P_N}\}$, where $P_1, P_2, \ldots, P_N$ represent N palmprint samples. k-means splits F into k disjoint clusters $c_1, c_2, \ldots, c_k$ such that similar feature vectors lie in the same cluster, while those that are different from each other are separated in the feature vector space. Each cluster has a representative point, its center, which is the mean of the feature vectors lying in that cluster. The algorithm starts by initializing k centers using k-means++ initialization. Let the centers of the k clusters be represented by $m_1, m_2, \ldots, m_k$. k-means is a distance-based clustering algorithm that computes the Euclidean distance between a feature vector $f_{P_i}$, where $i = 1, \ldots, N$, and every cluster center. This determines the closest cluster, and $f_{P_i}$ is assigned to that cluster. After every iteration, each cluster center is updated by computing the mean of all the feature vectors assigned to it. The process of assigning feature vectors to a cluster center and updating the cluster centers is repeated until no further change in cluster assignment is observed or the maximum number of iterations has been exhausted. The goodness of clustering, which refers to how well the k clusters approximate the feature vector set, is evaluated by computing the intra-cluster variance. Intra-cluster variance measures the amount of spread observed among the feature vectors lying in a particular cluster. It is computed using,

$$\text{Intra-cluster variance} = \sum_{j=1}^{k} \sum_{f_{P_i} \in c_j} \left\| f_{P_i} - m_j \right\|^2 \tag{8}$$

where j and i index the clusters and the feature vectors, respectively, and $m_j$ is the center of cluster j.
After convergence, the cluster centers are stored as indices in the index table, and the palmprint IDs that lie in a particular cluster, along with their feature vectors, are stored in the bucket corresponding to it.
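The clustering and index-table construction described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions: the initial centers are simply the first k feature vectors rather than the k-means++ initialization used in the paper, and the helper names and the dictionary layout of the index table are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance (same nearest center as the plain distance)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_vector(vectors):
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def kmeans(features, k, max_iter=100):
    """Lloyd's iterations. Initial centers are the first k vectors
    (an assumption for brevity; the paper uses k-means++)."""
    centers = [list(f) for f in features[:k]]
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            min(range(k), key=lambda j: squared_distance(f, centers[j]))
            for f in features
        ]
        if new_assignment == assignment:
            break  # no change in cluster membership: converged
        assignment = new_assignment
        for j in range(k):
            members = [f for f, a in zip(features, assignment) if a == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = mean_vector(members)
    return centers, assignment

def intra_cluster_variance(features, centers, assignment):
    """Sum of squared distances of every vector to its own cluster center."""
    return sum(squared_distance(f, centers[a]) for f, a in zip(features, assignment))

def build_index_table(ids, features, centers, assignment):
    """Each cluster center is an index; its bucket holds (ID, feature) pairs."""
    table = {j: {"center": c, "bucket": []} for j, c in enumerate(centers)}
    for pid, f, a in zip(ids, features, assignment):
        table[a]["bucket"].append((pid, f))
    return table

features = [[0.0, 0.0], [5.0, 5.0], [0.0, 1.0], [5.0, 6.0]]
ids = ["P1", "P2", "P3", "P4"]
centers, assignment = kmeans(features, k=2)
index_table = build_index_table(ids, features, centers, assignment)
```

On this toy input the two nearby pairs end up in separate clusters, and each bucket keeps the IDs together with their feature vectors so that matching can be done directly on the retrieved bucket.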

2) LOCALITY SENSITIVE HASHING
A Locality Sensitive Hashing (LSH) function maps the feature vectors to a lower-dimensional representation in which similar feature vectors fall in the same bucket with high probability. The main objective of LSH is to maximize the probability of collision of similar items, i.e., the probability of two similar feature vectors lying in the same bucket should be high. The hash function for an input feature vector $f_{P_i}$ is computed using two random quantities, $r$ and $x$. Here, $r$ is a d-dimensional vector whose entries are drawn from a Gaussian distribution. The dot product is quantized into a set of hash bins with the objective that all the nearby feature vectors should lie in the same bucket, as shown in Eq. 9,

$$h(f_{P_i}) = \left\lfloor \frac{r \cdot f_{P_i} + x}{w} \right\rfloor \tag{9}$$

In this equation, $w$ is the quantization width, and $x$ is a random variable lying between 0 and $w$. The quantization width determines the number of entries or candidates that would lie in a bucket.

Two conditions must be satisfied to serve the purpose of reducing the number of comparisons for identification of a query palmprint sample. These are,
• The probability of two feature vectors lying in the same bucket of the index table should be high if they are close to each other in the feature embedding space. Let $f_{P_1}$ and $f_{P_2}$ be two feature vectors whose Euclidean distance is at most $d_1$, the threshold distance that determines whether two feature vectors are close to each other in the feature embedding space. In this case, $f_{P_1}$ and $f_{P_2}$ should lie in the same bucket with high probability. This is mathematically represented as,

$$\text{if } \left\| f_{P_1} - f_{P_2} \right\| \le d_1, \text{ then } \Pr\left[ h(f_{P_1}) = h(f_{P_2}) \right] \ge p_1$$

• Contrary to the previous condition, this condition states that the probability of two dissimilar feature vectors, $f_{P_1}$ and $f_{P_3}$, lying in the same bucket should be low. Since $f_{P_1}$ and $f_{P_3}$ are dissimilar, the Euclidean distance between them is at least $d_2 = a \times d_1$, where $a > 1$ is a constant. Therefore, the probability of them lying in the same bucket of the index table should be low. Mathematically, it can be shown as,

$$\text{if } \left\| f_{P_1} - f_{P_3} \right\| \ge d_2, \text{ then } \Pr\left[ h(f_{P_1}) = h(f_{P_3}) \right] \le p_2$$

In these equations, $f_{P_1}$, $f_{P_2}$, and $f_{P_3}$ are feature vectors, $\|\cdot\|$ is the Euclidean distance between two vectors, $d_1$ and $d_2$ are the distance thresholds for the pairs ($f_{P_1}$, $f_{P_2}$) and ($f_{P_1}$, $f_{P_3}$), respectively, with $d_2 > d_1$, and $p_1 > p_2$ are the collision probabilities of the standard LSH definition.

To further increase or reduce the probability given in conditions (1) and (2), respectively, a hash function of t bits can be generated by performing t dot products in parallel using Eq. 9. That is, the t-bit hash value of the feature vector belonging to palmprint sample $P_1$ is computed by concatenating the t values determined using Eq. 9,

$$h(f^{t}_{P_1}) = \left[ h_1(f_{P_1}), h_2(f_{P_1}), \ldots, h_t(f_{P_1}) \right]$$

After implementing LSH on the set of feature vectors generated by the trained model, we get a data structure that consists of the hash values and, for each hash value, the IDs of the candidates in its corresponding bucket.
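The hashing scheme of this sub-section can be sketched as follows. The sketch assumes that Eq. 9 is the usual quantized random projection, floor((r · f + x) / w), with Gaussian entries in r and x drawn uniformly between 0 and w. The function names, the seed, and the values of t and w are illustrative assumptions, not settings reported in the paper.

```python
import math
import random

def make_hash_functions(dim, t, w, seed=0):
    """Draw t pairs (r, x): r has Gaussian entries, x is uniform between 0 and w."""
    rng = random.Random(seed)
    return [([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, w))
            for _ in range(t)]

def lsh_hash(feature, hash_functions, w):
    """Concatenate t quantized projections floor((r . f + x) / w) into one key."""
    return tuple(
        math.floor((sum(ri * fi for ri, fi in zip(r, feature)) + x) / w)
        for r, x in hash_functions
    )

def build_lsh_table(ids, features, hash_functions, w):
    """Map each hash key to the list of palmprint IDs that fall in its bucket."""
    table = {}
    for pid, f in zip(ids, features):
        table.setdefault(lsh_hash(f, hash_functions, w), []).append(pid)
    return table

w = 4.0  # illustrative quantization width
hash_functions = make_hash_functions(dim=3, t=4, w=w)
features = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [-40.0, 25.0, 10.0]]
ids = ["P1", "P2", "P3"]
lsh_table = build_lsh_table(ids, features, hash_functions, w)
```

A larger w puts more candidates in each bucket, while a larger t makes the concatenated key more selective, which is the trade-off between hit rate and penetration rate that the two conditions above describe.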

C. RETRIEVAL
The objective of the retrieval stage is to return a list of candidates that are probable matches for a query palmprint image presented to the identification system. It commences by extracting the feature vector of the query image $q_i$ using the proposed PalmHashNet. In the case of the index table generated through k-means, the cluster centers act as indices. Therefore, the query feature vector is compared with all the cluster centers of the index table. The one with the highest similarity is selected, and the candidates in that cluster are retrieved for identification. On the other hand, in the case of locality sensitive hashing, a t-bit hash value corresponding to the query feature vector is computed using Eq. 9. The hash bucket that is most similar to the calculated hash value is selected from the generated index table. The process is explained in Algorithm 2. This is followed by identification, which aims at finding the true match of the query sample in the retrieved candidate set. The feature vectors of all the retrieved candidate IDs are matched with the query feature vector to obtain their similarity scores. The score list is sorted in decreasing order, and the rank of the true match is recorded to gauge the performance of the indexing approach. This process makes identification a constant-time operation, as the size of the retrieved candidate list is fixed.
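The retrieval and identification steps for the k-means index table can be sketched as below. This is a minimal illustration, assuming cosine similarity as the matching score and a hypothetical index-table layout in which each bin stores its center and a bucket of (ID, feature vector) pairs. It is not the authors' implementation of Algorithm 2.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def retrieve_candidates(query, index_table):
    """Pick the bin whose center is most similar to the query and return its bucket."""
    best_bin = max(
        index_table,
        key=lambda b: cosine_similarity(query, index_table[b]["center"]),
    )
    return index_table[best_bin]["bucket"]

def identify(query, candidates):
    """Rank the retrieved candidates by decreasing similarity to the query."""
    scored = [(cosine_similarity(query, vec), pid) for pid, vec in candidates]
    return [pid for _, pid in sorted(scored, reverse=True)]

def rank_of_true_match(ranked_ids, true_id):
    """1-based rank of the true identity, or None if it was not retrieved."""
    return ranked_ids.index(true_id) + 1 if true_id in ranked_ids else None

index_table = {
    0: {"center": [1.0, 0.0], "bucket": [("A", [0.9, 0.1]), ("B", [1.0, 0.4])]},
    1: {"center": [0.0, 1.0], "bucket": [("C", [0.1, 0.9])]},
}
query = [1.0, 0.1]
candidates = retrieve_candidates(query, index_table)
ranked = identify(query, candidates)
```

Only the candidates of the selected bin are scored, so the cost of a query is the number of bins plus the size of one bucket instead of the size of the whole database.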

IV. EXPERIMENTAL RESULTS
This section gives details about the experimental setting, such as the databases and evaluation parameters that have been considered for evaluating the proposed approach, along with the results obtained.

1) CASIA PALMPRINT IMAGE DATABASE [9]
This database, also referred to as CASIA-Palmprint, consists of 5502 palmprint images collected from both the left and right hands of 312 subjects. Eight images from each subject were collected in a single session, and the individuals were not instructed regarding the positioning of their hands. Therefore, the acquired images have large pose variance. An RoI of size 128 × 128 is segmented from the images.

2) IIT DELHI TOUCHLESS PALMPRINT DATABASE [10]
The IITD palmprint database was collected from students and staff of IIT Delhi in June 2006 and July 2007. A total of 2600 images were collected from 460 palms of 230 subjects. All the images are in bitmap format, and the RoI has a resolution of 150 × 150 pixels.

3) TONGJI CONTACTLESS PALMPRINT DATABASE [11]
This database is larger than the other two. It consists of palmprint images collected from both hands of 300 subjects. Ten images of each palm were acquired in each of two separate sessions, making it a database of 12000 images in total. The images have a resolution of 128 × 128 pixels.

4) HONG KONG POLYTECHNIC UNIVERSITY PALMPRINT II DATABASE (PolyU II) [12]
This database consists of 7752 palmprint samples collected from 193 individuals. The samples have been collected in two different sessions with a gap of 2 months. Ten palmprint samples from each palm of all the individuals were acquired. The extracted RoI has a size of 128 × 128 pixels.

B. EVALUATION PARAMETERS
The evaluation of the proposed approach is done in two ways. As specified earlier, the performance of indexing and identification would depend on the quality of learned feature vectors. Therefore, before proceeding to the identification, we first evaluate the quality of feature vectors using 1) Accuracy, 2) Equal Error Rate (EER), and 3) Discriminative Index (DI) of the recognition system. Each test sample was matched with each sample in the train partition to calculate genuine and imposter matching scores. A genuine match is when matching is done between samples belonging to the same class or palm. It is called an imposter match if both the samples belong to different palms. The parameters mentioned above are used to evaluate the performance of any biometric verification system. The performance of an identification system is evaluated in terms of correct recognition rate (CRR), hit rate (HR), penetration rate (PR), and Cumulative Match Characteristic (CMC) curve. All these are described below.

1) EQUAL ERROR RATE
The false acceptance rate (FAR) is the ratio of imposter matches that are accepted by the system to the total number of imposter matches. The false rejection rate (FRR) is the ratio of genuine matches that are rejected by the system to the total number of genuine matches. The equal error rate (EER) is the operating point at which FAR becomes equal to FRR. It can be written as given in (14), where nFA is the number of samples that got falsely accepted out of the total number of testing samples that were matched against a different class (nIM).

2) DISCRIMINATIVE INDEX
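It refers to the separation between genuine and imposter scores. DI is defined as given in the equation below,

$$DI = \frac{\left| \mu(G) - \mu(I) \right|}{\sqrt{\left( \sigma^2(G) + \sigma^2(I) \right) / 2}} \tag{15}$$

where $\mu(\cdot)$ and $\sigma(\cdot)$ denote the mean and standard deviation of either the genuine (G) or the imposter (I) scores.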
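Both verification metrics can be computed directly from the lists of genuine and imposter scores. The sketch below is illustrative only: it assumes similarity scores (a higher score means a better match), approximates the EER by sweeping the observed scores as thresholds, and uses the common decidability form |μ_G − μ_I| / sqrt((σ_G² + σ_I²) / 2) for DI. The function names and toy scores are hypothetical.

```python
import math

def far_frr(genuine, imposter, threshold):
    """A score >= threshold counts as an accepted match."""
    far = sum(s >= threshold for s in imposter) / len(imposter)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr

def equal_error_rate(genuine, imposter):
    """Sweep every observed score as a threshold and return the average of
    FAR and FRR at the point where their difference is smallest."""
    thresholds = sorted(set(genuine + imposter))
    far, frr = min(
        (far_frr(genuine, imposter, t) for t in thresholds),
        key=lambda pair: abs(pair[0] - pair[1]),
    )
    return (far + frr) / 2

def _mean(xs):
    return sum(xs) / len(xs)

def _variance(xs):
    m = _mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def discriminative_index(genuine, imposter):
    """Separation of the two score distributions in units of pooled spread."""
    pooled = math.sqrt((_variance(genuine) + _variance(imposter)) / 2)
    return abs(_mean(genuine) - _mean(imposter)) / pooled

genuine_scores = [0.9, 0.8, 0.85, 0.4]   # toy same-palm similarity scores
imposter_scores = [0.1, 0.2, 0.3, 0.5]   # toy different-palm similarity scores
eer = equal_error_rate(genuine_scores, imposter_scores)
di = discriminative_index(genuine_scores, imposter_scores)
```

A lower EER and a higher DI both indicate that the genuine and imposter score distributions overlap less.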

3) ACCURACY
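The accuracy of a recognition system is defined as the number of correctly recognized samples out of the total number of queries made to the system. Mathematically, it can be expressed as,

$$\text{Accuracy} = \frac{\text{number of correctly recognized samples}}{\text{total number of queries}} \tag{16}$$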

4) CORRECT RECOGNITION RATE
CRR is defined as the rank-1 recognition rate, i.e., the fraction of queries whose true match is found at rank 1. If nCR is the number of correctly recognized queries at rank 1 out of a total of nQ queries made to the system, then CRR can be defined as given in (17),

$$CRR = \frac{nCR}{nQ} \tag{17}$$

5) HIT RATE
During identification, a query or probe image is referred to as correctly identified if one of the retrieved candidates belongs to the same identity as the probe image. Therefore, Hit Rate (HR) is defined as the ratio of correctly identified queries (nCIQ) to the total number of queries made to the system (nQ), as given in (18),

$$HR = \frac{nCIQ}{nQ} \tag{18}$$

6) PENETRATION RATE
It is the percentage of the database that needs to be retrieved for correct identification of the query image. If nQ queries are made to the system and, for each query i, $C_i$ candidates are retrieved from a total of D templates, where D corresponds to the size of the database, then the penetration rate (PR) can be defined as,

$$PR = \frac{1}{nQ} \sum_{i=1}^{nQ} \frac{C_i}{D} \times 100 \tag{19}$$
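The two indexing metrics can be computed from the candidate lists returned for a set of queries, as in the following sketch. The function names and the toy data are hypothetical, and the candidate lists are assumed to hold identity labels.

```python
def hit_rate(retrieved_ids, true_ids):
    """Fraction of queries whose candidate list contains the true identity."""
    hits = sum(true in cands for cands, true in zip(retrieved_ids, true_ids))
    return hits / len(true_ids)

def penetration_rate(retrieved_ids, database_size):
    """Average percentage of the database retrieved per query."""
    total_retrieved = sum(len(cands) for cands in retrieved_ids)
    return 100.0 * total_retrieved / (len(retrieved_ids) * database_size)

retrieved = [["A", "B"], ["C"], ["A", "D", "E"]]  # candidate identities per query
truth = ["A", "C", "B"]                          # true identity of each query
hr = hit_rate(retrieved, truth)
pr = penetration_rate(retrieved, database_size=10)
```

In this toy case two of the three queries find their identity among the candidates, and on average 20% of the ten-template database is retrieved per query.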

7) CUMULATIVE MATCH CHARACTERISTIC CURVE
The CMC curve is a rank-based metric that shows the relationship between identification probability and rank, that is, the fraction of all queries that are correctly identified up to a particular rank.
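A CMC curve can be obtained from the rank at which each query found its true match, as sketched below. The function name and the toy ranks are hypothetical. The first point of the curve is the rank-1 rate, which corresponds to CRR.

```python
def cmc_curve(true_match_ranks, max_rank):
    """true_match_ranks[i] is the 1-based rank of the true match for query i,
    or None when the true match was not among the retrieved candidates."""
    n = len(true_match_ranks)
    return [
        sum(r is not None and r <= rank for r in true_match_ranks) / n
        for rank in range(1, max_rank + 1)
    ]

ranks = [1, 1, 3, None, 2]       # toy ranks for five queries
curve = cmc_curve(ranks, max_rank=3)
crr = curve[0]                   # rank-1 recognition rate
```

The curve is non-decreasing in the rank, and it saturates below 1 whenever some queries never retrieve their true match.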

C. TRAINING AND TESTING PROTOCOL
The images contained in the considered databases are not uniform, and there is no standard training and testing protocol associated with the databases. Therefore, this paper uses the training and testing partitions that are most commonly followed for the mentioned databases. The training partition refers to the gallery images or the samples stored in the database. On the other hand, the testing partition contains the query images for which identification needs to be performed. The CASIA-Palmprint database is partitioned into an 80%-20% split for training and testing, respectively. In the IIT Delhi palmprint database, each subject has provided either five or six images. Hence, we have used four images for training and the remaining one or two images for testing. For the Tongji contactless palmprint database, the training and testing split is done session-wise, i.e., the ten images from session one are used as gallery images, while the ten images collected in session two are used as query images.

D. EXPERIMENTAL SETTING
The proposed PalmHashNet is implemented using the PyTorch library of the Python programming language. Lately, palmprint acquisition has been performed in unconstrained environments for a better user experience. The acquired images tend to have occlusion, illumination variation, etc. Data augmentation is applied to the training partition of the databases to make PalmHashNet robust to such variations by training it on different variations of the palmprint samples. The Python Augmentor library [57] has been used to augment the training partition by applying image transformation operations such as random zoom, distortion, rotation, and illumination change. With these four operations applied to each image, four new images were created, increasing the training partition five times. Hence, the size of the training partition became 12480, 9205, 19300, and 30000 for the CASIA, IITD-Touchless, PolyU II, and Tongji Contactless databases, respectively. The test partition has not been augmented, as testing needs to be performed on the original images only. A computer with a Xeon(R) processor, 32 GB RAM, and an NVIDIA Tesla K40C GPU with 12 GB of on-card RAM has been used to train the network and evaluate the indexing and retrieval performance. For index table creation, k-means and LSH have been utilized for partitioning the set of learned feature vectors. k-means clusters similar feature vectors for index table creation. The number of clusters, denoted by k, for a particular database is determined using the silhouette coefficient [58], a metric that evaluates the goodness of a clustering. Mathematically, it is defined as (q − p)/max(p, q), where p and q refer to the average intra-cluster and inter-cluster distance, respectively. Intra-cluster distance is the distance between points within the same cluster. On the other hand, inter-cluster distance is computed between data points lying in different clusters. The silhouette coefficient lies between −1 and +1, with +1 indicating well-separated clusters. The silhouette coefficient is computed for various values of k to determine the appropriate value of k. It was experimentally observed that its value is highest at k = 60. This became the starting point for finding a suitable value of k for the indexing approach. Further, k-means was run for all values of k in the interval [5, 100] with a step of 5. It was empirically determined that the indexing performance was best at k = 65 for the considered databases.
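The silhouette coefficient used to pick the starting value of k can be sketched as follows. This is a minimal illustration that assumes the usual per-sample definition, in which p is the mean distance to the other members of a sample's own cluster and q is the mean distance to the members of the nearest other cluster. The function names and the toy data are hypothetical.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def silhouette_coefficient(features, labels):
    """Mean of (q - p) / max(p, q) over all samples. Requires at least two
    clusters; singleton clusters contribute a score of 0 by convention."""
    clusters = {}
    for f, lab in zip(features, labels):
        clusters.setdefault(lab, []).append(f)
    scores = []
    for f, lab in zip(features, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)
            continue
        # The distance of f to itself is 0, so dividing by len(own) - 1
        # gives the mean distance to the other members of its cluster.
        p = sum(euclidean(f, g) for g in own) / (len(own) - 1)
        q = min(
            sum(euclidean(f, g) for g in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        scores.append((q - p) / max(p, q))
    return sum(scores) / len(scores)

features = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
good = silhouette_coefficient(features, [0, 0, 1, 1])  # compact, separated clusters
bad = silhouette_coefficient(features, [0, 1, 0, 1])   # clusters mixed across groups
```

To choose k, the same function would be evaluated on the cluster labels produced by k-means for each candidate value, for example every fifth value between 5 and 100, keeping the value with the highest score as the starting point.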

E. RESULTS
Results are computed for both verification and identification system to validate the performance of the proposed approach. Firstly, the quality of the learned features is computed because their discriminative ability determines the performance of the indexing module. Therefore, the recognition results are listed, followed by indexing performance based on two different techniques, namely k-means clustering and Locality Sensitive Hashing (LSH).

1) RECOGNITION PERFORMANCE
To evaluate the verification performance of the proposed approach, we have computed the Accuracy, EER, and DI of the system. Each sample in the test partition is matched with all the samples in the training partition. The details regarding the number of matchings, comprising genuine and imposter matchings, are given in TABLE 2. In the table, gallery images refer to the training split, as these are the images that are indexed. In contrast, probe samples refer to the testing split, as these images are used for querying the identification system. The proposed approach obtained >99% accuracy for the CASIA, IITD Touchless, and PolyU II databases. However, for Tongji Contactless, the reported accuracy is 97.85%. The ROC curves are shown in Figure 6. Here, the FAR = FRR line is not straight because FRR is plotted on the log scale. The log scaling brings the focus towards the more meaningful lower side of the curve. The area under the ROC curve depicts error, and a system with a smaller area is considered better in general. The ROC curves shown in Figure 6 compare the classification performance of the proposed feature on the four databases, viz. CASIA, IITD-Touchless, Tongji Contactless, and PolyU II. The verification result of the proposed approach is compared with various palmprint recognition techniques proposed in the literature. It can be seen from TABLE 4 that PalmHashNet has the lowest equal error rate on the CASIA, IITD Touchless, and PolyU II databases.

2) INDEXING PERFORMANCE
Indexing performance is evaluated in terms of hit rate and penetration rate. A good indexing technique is expected to achieve a low penetration rate at a high hit rate.

a: TIME ANALYSIS
The time-based performance of the proposed approach is evaluated in terms of speedup. It refers to how much faster the identification process becomes when the proposed indexing approach is used, compared to the naive approach for identification (without indexing). It is calculated by taking into account the total time (in seconds) to find a suitable candidate set from the created index table.

To compare the results with state-of-the-art techniques, we used the rank-1 identification rate, as most papers report it. The first comparison of the proposed approach is made with [12]. [11]. For the IITD Touchless and Tongji Contactless databases, the approach proposed by Zhu et al. [36] had given the best results so far. However, the approach proposed in [33] performed best on the PolyU II database. It is evident from TABLE 8 that our approach achieves the best rank-1 identification rate for the CASIA, IITD Touchless, and PolyU II databases. Therefore, by examining both the comparisons, we can say that the proposed approach outperforms the other techniques proposed in the literature on these databases.

F. ABLATION STUDY
The proposed approach depends solely on the feature extraction process. Thus, it was necessary to find the features that perform best for the recognition and identification process. Therefore, two different ablation studies are conducted in this paper. The first study checks the effect of adding a margin to the softmax loss for training the feature extraction model. To this end, the feature extraction model was trained with and without a margin to observe the effect on the quality of the extracted feature vectors. The rank-1 identification rate was computed for the considered databases on the features extracted using only the softmax loss. TABLE 9 compares the rank-1 identification rate obtained by the proposed approach with that of the model trained with the softmax loss. It was observed that the rank-1 identification rate improved by 10.29%, 14.53%, 4.59%, and 2.75% on the CASIA, IITD-Touchless, Tongji Contactless, and PolyU II databases, respectively, by introducing the additive margin loss in the feature extraction network. The second study analyzes the effect of the dimension (size) of the feature vector to find the size that best represents a palmprint image taken from the considered databases. Three feature-vector dimensions, namely 128, 256, and 512, are analyzed. Indexing is performed on the databases for all three feature dimensions using Algorithm 1. It was empirically determined that the system achieves higher accuracy with a 512-dimensional feature vector, as shown in TABLE 10. A relationship between the hit rate and the penetration rate is established for all the combinations. A candidate set is retrieved for a query palmprint image from the indexed palmprint database. Hit rate determines the confidence with which a query image can find its true match in the retrieved set of candidates. Penetration rate refers to the percentage of the whole database required to find the true match of a query image. The true identity of a query sample is expected to be established by retrieving only a small percentage of the database (low penetration rate) with high confidence (high hit rate). An efficient biometric indexing approach is expected to achieve a high hit rate at a low penetration rate. It can be observed from TABLE 10 that the 512-d feature vector performs best on almost all the considered databases. Hence, we have considered the 512-d feature vector in this study. The graphs showing hit rate vs. penetration rate for all the experiments are shown in FIGURE 8.

V. CONCLUSION
This paper proposes a palmprint database indexing approach called PalmHashNet that generates highly discriminative embeddings to create a fixed-size candidate list for comparison, making identification a constant-time operation. Softmax loss with additive margin is used to train the model for palmprint database indexing and to learn the feature vector embeddings simultaneously. This loss function ensures that the learned feature embeddings have low inter-class and high intra-class similarity. It also ensures that the index space distribution is regularized to be similar to the uniform distribution. The learned embeddings are indexed using k-means Clustering and Locality Sensitive Hashing to create an index table. Whenever a query image is given to the identification system, its features are extracted using PalmHashNet. The generated feature vector is matched with all the indices of the index table, and the candidates lying in the most similar index are retrieved for comparison. Verification and identification experiments are conducted on four popular, publicly available palmprint databases, viz. CASIA, IITD-Touchless, Tongji-Contactless, and PolyU II, to provide a thorough evaluation of the extracted features. The proposed approach achieved >99% accuracy on the CASIA, IITD-Touchless, and PolyU II palmprint databases, while it achieved 97.85% accuracy on the Tongji Contactless database. The reported equal error rate is less than 1% for all the databases. During identification, PalmHashNet achieved a penetration rate of 0.022%, 1.032%, 4.555%, and 0.39% at 100% hit rate for the respective databases. Therefore, it can be concluded that, to find the true match of a query sample with 100% confidence, it is sufficient to search <1% of the CASIA and PolyU II databases and 1.03% and 4.55% of the IITD-Touchless and Tongji Contactless databases, respectively. Hence, by using PalmHashNet, we need to search only a small percentage of the database instead of the whole database for identification, without compromising accuracy. The proposed approach outperforms other state-of-the-art recognition and indexing techniques proposed in the literature, making identification a constant-time operation.