Biometric Information Recognition Using Artificial Intelligence Algorithms: A Performance Comparison

Addressing crime detection, cyber security and multi-modal gaze estimation in biometric information recognition is challenging. Trained artificial intelligence (AI) algorithms such as the support vector machine (SVM) and the adaptive neuro-fuzzy inference system (ANFIS) have therefore been proposed to recognize distinct and discriminant features of biometric information (intrinsic hand features and demographic cues) with good classification accuracy. Unfortunately, due to nonlinearity in these distinct and discriminant features, the accuracy of SVM and ANFIS is reduced. As a result, optimized AI algorithms (ANFIS with subtractive clustering (ANFIS-SC) and SVM with error-correcting output codes (SVM-ECOC)) have been shown to be effective for biometric information recognition. In this paper, we compare the performance of the ANFIS-SC and SVM-ECOC algorithms in learning essential characteristics of intrinsic hand features and demographic cues based on Pearson correlation coefficient (PCC) feature selection. Furthermore, the accuracy of these algorithms is presented, and their recognition performance is evaluated by root mean squared error (RMSE), mean absolute percentage error (MAPE), scatter index (SI), mean absolute deviation (MAD), coefficient of determination (R2), Akaike's Information Criterion (AICc) and the Nash-Sutcliffe model efficiency index (NSE). Evaluation results show that both the SVM-ECOC and ANFIS-SC algorithms are suitable for accurately recognizing soft biometric information on the basis of intrinsic hand measurements and demographic cues. Moreover, the comparison results demonstrate that the ANFIS-SC algorithm provides better recognition accuracy, with RMSE, AICc, MAPE, R2 and NSE values of ≤ 3.85, 2.39E+02, 0.18%, ≥ 0.99 and ≥ 99, respectively.


I. INTRODUCTION
The geometric models for biometric hand features (fingerprint and palm) are easy to hack in contrast with bone-based biometric systems (phalangeal biometric models). This is because the hand geometric model is based on external measurements of hand features (hand patterns; knuckle creases), which are highly exposed. Suspects leave traces every day and everywhere they have used biometric resources. Worse still, when a crime has been committed, third-party presence or interference on the same system may corrupt biometric cue traces when matching a sample to a suspect. Moreover, in a cyberspace context, the use of external hand measurements and/or knuckle creases may allow criminals to exploit the information of legitimate users for unauthorized access or activities. Another limitation of these features is that they do not serve victims of skin diseases whose top skin layers (palm and fingerprints) are affected [1]. Such victims may no longer gain access to any biometric recognition system based on external hand measurements, even after full recovery [1], [2]. This problem hampers crime detection, accident rescue, identification of amnesia victims, missing persons, and unknown deceased, as well as cyberspace authentication. In contrast, the phalangeal biometric model requires intrinsic hand features such as the length/width of the proximal phalanx, distal phalanx, etc., which are not as exposed as their counterparts [2]. Therefore, hand biometric recognition based on intrinsic measurement of human bones stands as a solution for human identification and crime detection. Moreover, an appropriate selection of intrinsic hand features yields excellent results [3], [4]. For these reasons, the evaluation of intrinsic hand features against demographic cues by learning models aids in recognizing biometric hand information.
It is well known that recognition of biometric hand features and demographic cues is a complex non-linear procedure that arises from the interaction of many redundant hand features. Pearson correlation coefficient (PCC) feature selection is widely regarded as a stable feature representation of the complex, non-linear dynamical behaviour of hand biometric information. This analysis minimizes model training complexity and reduces misclassification during recognition. Recognition from intrinsic hand features selected by PCC is an effective and reliable method for critical infrastructure.
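As a minimal illustration of the PCC computation used for feature screening, the following numpy sketch compares a strongly correlated and a weakly correlated feature pair (the feature names here are hypothetical, not taken from the data set):

```python
import numpy as np

def pearson_cc(x, y):
    """Pearson correlation coefficient between two 1-D feature vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

# Toy example: one informative feature and one pure-noise feature.
rng = np.random.default_rng(0)
length = rng.normal(10.0, 1.0, 100)             # e.g. a phalanx length
height = 17.0 * length + rng.normal(0, 2, 100)  # target correlated with it
noise = rng.normal(0.0, 1.0, 100)               # unrelated feature

print(pearson_cc(length, height))  # close to 1 -> candidate for selection
print(pearson_cc(length, noise))   # close to 0 -> candidate for removal
```

A threshold on |PCC| (the paper uses 0.5) then separates features worth keeping from redundant ones.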
Human demographic characteristics derived from biometric hand features have been studied extensively in the literature, chiefly through Artificial Intelligence (AI) learning or descriptive statistical analysis. It is documented that external (geometric) hand features are deployed in biometric recognition; such information involves measurements of the length, circumference, thickness and width of the fingers, palm, wrist, skin texture, etc., to predict human sex, age, and height [5]. Karki and Singh demonstrated a strong relationship between fingerprint patterns and human gender [6]. Meanwhile, [7] demonstrated the relationship of biometric information in cross-domain identity recognition. Among the well-known biometric gesture recognition methods is discriminant analysis [8], followed by descriptive statistical approaches [9]. Likewise, Thakar et al. proposed a method of gender determination using ridge characteristics and ridge density of fingerprints; these features were statistically analyzed, and the results proved the usefulness of biometric information in determining demographic information [10]. Another variation of the descriptive statistical model is the bootstrap estimate, in which a large number of samples of the same size as the actual sample are drawn with replacement, and the desired statistic is determined within each sample. However, the bootstrap fails to approximate different ratios if they belong to similar finger measurements [11]. Linear and curvilinear regression models were built over two hundred and fifty (250) students, using measurements of height, hand length and breadth to predict gender. These models returned low accuracy, however, and the features were not robust for biometric prediction [12]. In [13], Stevenage et al. proposed a method to demonstrate the restrictions of matching a suspect's hand; the analysis was carried out using Analysis of Variance (ANOVA).
As seen from the literature above, these models may not provide good prediction results, which can lead to wrong conclusions. Misclassification in descriptive statistics, such as linear or curvilinear regression, is tied to a cut-off point, and no cut-off value is guaranteed to be optimal. AI algorithms, by contrast, are independent of any such cut-off. In addition, misclassification in classical and analytical statistical approaches arises from the nonlinearity of biometric features and from model parameter settings. To handle the nonlinearity of biometric features, AI algorithms remain a good choice [14], [15]. Multiple AI algorithms have been combined using a fuzzy inference system, where hand geometry is considered as intrinsic biometric cues extracted from near-infrared images of the human hand to locate the interphalangeal joints of the fingers [2]. That work employed pulse response, hand geometry and finger veins with a Convolutional Neural Network fuzzy (CNN-fuzzy) inference system to detect a live human body and improve performance against counterfeit attempts [2]. Apart from the fuzzy recognition algorithms, Alias et al. [16] adopted an SVM for fingerprint classification. Gavrilova et al. [17] used a Kinect sensor to extract human emotion cues in a multimodal context, which were trained with an SVM classifier. Recognition of human hand gestures has been performed using ECOC-SVM [18]; the results indicate the feasibility of using SVM in real-world applications. Hence, in further exploring the capability of AI-based algorithms to learn intrinsic hand features for recognizing human demographic cues, SVM proved superior, as it strikes a better balance between nonlinear feature points when predicting four demographic attributes (sex, height, foot size, and log-weight) [19]. The results obtained with the above-mentioned AI algorithms are quite promising and motivated us to explore other AI-based algorithms for recognizing demographic cues from biometric information.
However, the major drawback of all these algorithms is that they are not robust to nonlinear features and require large parameter computations, which leads to low accuracy. Therefore, to further explore the capabilities of AI-based algorithms and to overcome the drawbacks of parameter estimation, the Error-Correcting Output Code (ECOC) scheme is used with Pearson correlation coefficient (PCC) features to recognize demographic cues from biometric hand features.
Inspired by the above contributions, in this paper we compare the performance of two improved AI algorithms for high-precision recognition of demographic cues from biometric hand features. The first algorithm is composed of SVM and the Error-Correcting Output Code (ECOC), namely SVM-ECOC with PCC. The ECOC algorithm is robust in handling unreliable and noisy features, and it has the benefit of decomposing a single multi-class problem [20]-[22]. The second algorithm is composed of an Adaptive Neuro-Fuzzy Inference System (ANFIS) optimized by subtractive clustering with least-squares approximation (SC), namely ANFIS-SC with PCC. SC has good capability for estimating ANFIS parameters, and it is robust on high-dimensional data with a moderate number of features. These two algorithms are validated using distinct intrinsic hand features selected by PCC from the biometric and physiological hand data set in [19]. In what follows, we give an outline of the contributions made in this paper.
(a) We investigate the impact of intrinsic hand features on biometric information recognition.
(b) We extend the PCC technique to select key hand biometric features and ease the AI recognition/learning algorithms.
(c) Available optimization schemes are employed to avoid redundant parameter estimation in the AI learning algorithms, so that the dimension of the parameter vector is reduced and the computational accuracy is improved.
(d) A computational comparison is carried out between the ANFIS-SC and SVM-ECOC algorithms to illustrate the high efficiency of the ANFIS-SC algorithm.
(e) The increase in recognition accuracy is the major contribution of this article to the field of biometric crime detection and human identification.
(f) Numerical results show that the ANFIS-SC algorithm maximizes recognition performance for soft biometric cues.
The paper is organized as follows. Related state-of-the-art methods, the research gap, motivation, and contributions form Section I. The details of ANFIS and SVM are described in Section II, which also covers the optimization procedures of our adopted algorithms. Section III presents the data description and its characteristics, along with the PCC test, feature extraction and selection, the recognition phase, parameter settings, and performance metrics. Section IV gives the results of the AI recognition algorithms and their performance evaluations, including a performance comparison between our work and some existing models, and the running time. The conclusion, applications and limitations, and future directions form Section V. For precision, the notations used throughout the article are listed in Table 1.

II. MATERIALS AND METHODS
Artificial intelligence (AI) is a superior technique when there is a nonlinearity problem in biometric feature recognition. AI can handle problems with non-linear solutions irrespective of the fitness of the non-linear features. In addition, the AI learning approach is suitable when biometric recognition must be solved within a certain time.
Exploiting the benefits of AI algorithms, we propose to feed features selected by PCC [23] into optimized AI algorithms. PCC-based feature selection minimizes feature dimensionality [24] and AI learning complexity. This section describes the architectures of the adopted AI algorithms, (a) ANFIS-SC and (b) SVM-ECOC; the steps of the methods are described in the flowchart of Fig. 1.

A. ADAPTIVE NEURO-FUZZY INFERENCE SYSTEM (ANFIS)
ANFIS is an AI approach that combines an Artificial Neural Network (ANN) and a fuzzy inference system (FIS) into a single model, offsetting the individual limitations of the ANN and FIS methods [25], [26]. One major advantage of ANFIS is its ability to represent complex, non-linear connections among features well [25]. In addition, its soft computing and rapid convergence have raised interest in using ANFIS for the prediction of complicated relationships [27]-[30]. ANFIS aims to enhance/optimize the FIS parameters by applying a learning protocol to the input-output relationship vector. The optimization is carried out so that the calibration errors between the trained samples and the original samples are minimized [14]. The following equations define and describe ANFIS. Here a two-input combination is used to formulate the ANFIS layers; for two inputs there exist linear and nonlinear parameters, where rules A and B from the 1st- and 2nd-order Sugeno FIS use η_1, η_2, Ω_1, Ω_2, ω_1 and ω_2 as consequent parameters, so that τ of h_1 is tuned in ϵ_1 and Ω_1, and τ of h_2 is tuned in ϵ_2 and Ω_2, respectively. The output of the system is denoted by χ_{ι,d}, where ι and d stand for the order and node-order combination, respectively. The 1st and 2nd fuzzy rules can then be produced from the combination of the two hand input feature sets h_i, where η_i, ω_i, and Ω_i denote the consequent parameters. According to the two fuzzy rules, the feed-forward layers of ANFIS are designed as follows. Layer one: the nodes are adaptive and satisfy the quantifier rule; the node output is expressed in terms of the premise parameters l_d and ϵ_d, which determine the shape of the Gaussian membership function (MF) given in Eq. (2). Layer two: contains fixed nodes; it calculates the firing strength of each rule. Layer three: the nodes are fixed in this layer.
The output of this layer is called the normalized firing strength T_d. Layer four: this layer consists of adaptive nodes; it calculates the consequent parameters of each node as the product of T_d and a first-order polynomial. Layer five: this layer has a single fixed node Σ; it sums the signals from all antecedent layers to give the output Z_{5,i}. From Eq. (6), f_d can be estimated to be nearly equal to the original information with respect to the observed function f_d, so that demographic cues can be recognized from any available intrinsic hand features. One main challenge, however, is how to make the recognized demographic cues nearly equal to the original intrinsic hand gestures; this raises the problem of minimizing a cost function. The ANFIS layers in Eqs. (1)-(7) suffer from thorny parameter estimation, which leads to noise during recognition. Nature-inspired optimizers [31]-[33] have played a significant role in estimating ANFIS parameters. Alternatively, parameter estimation can be achieved through clustering mechanisms. Clustering serves as a robust technique for handling cluster information; empowering ANFIS with clustering is therefore a substitute for nonlinear feature estimation. Other clustering-based ANFIS parameter estimation methods exist, such as fuzzy C-means (FCM), which optimizes the ANFIS parameters by minimizing an FCM-derived objective function; its major limitation is its inability to estimate a large number of parameters [26], [34]. The grid partition (GP) algorithm effectively handles the ANFIS parameters for small sample sizes; its flexibility stems from the inherently large number of rules and parameters it generates. However, its main challenge is the exponential growth of the fuzzy rules as the sample size increases [26].
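Since the display equations for the layers were lost in typesetting, the following is a standard first-order Sugeno ANFIS formulation written in this paper's symbols (l_d, ϵ_d premise parameters; T_d firing strengths; f_d consequents), given here only as a hedged reconstruction of what Eqs. (2)-(6) typically look like:

```latex
\begin{align}
\mu_{A_d}(h) &= \exp\!\left(-\frac{(h - l_d)^2}{2\,\epsilon_d^2}\right)
  && \text{(Layer 1: Gaussian MF)}\\
T_d &= \mu_{A_d}(h_1)\,\mu_{B_d}(h_2)
  && \text{(Layer 2: firing strength)}\\
\bar{T}_d &= \frac{T_d}{\sum_{k} T_k}
  && \text{(Layer 3: normalization)}\\
Z_{4,d} &= \bar{T}_d\, f_d = \bar{T}_d\,(\eta_d h_1 + \omega_d h_2 + \Omega_d)
  && \text{(Layer 4: consequent)}\\
Z_{5,i} &= \sum_{d} \bar{T}_d\, f_d
  && \text{(Layer 5: overall output)}
\end{align}
```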
Subtractive clustering (SC): the generated clusters are used to derive an iterative optimization-based clustering that estimates the ANFIS parameters. This method provides a good representation when a high-dimensional case is used for moderate biometric hand features [26], [33]. Optimizing the ANFIS parameters with SC has achieved the best performance [35]. In SC, every biometric feature is considered a candidate cluster centre. The potential of a candidate over the entire input-output feature pose is computed from its Euclidean distances over all feature poses; poses whose potential exceeds the threshold are taken as cluster centres. The primary concept is to calculate the density index ϑ_i corresponding to the biometric hand feature h_i, with a positive constant b_t that determines the distance between cluster centres. The SC algorithm selects the topmost density index as the first cluster centre, after which the density index is modified. We then choose the feature pose with the highest remaining potential as the next cluster centre. The loop in Eq. (9) is iterated until the stopping criterion is achieved. The cluster centres are used to generate the fuzzy rules and parameters from which the FIS model is designed. The rules are fuzzified using Eq. (2), which adaptively alters the characteristics of the FIS so that the pattern resembles its corresponding antecedent function. The major contribution to the existing ANFIS-SC is made on lines 9-17 of Algorithm 1, where we fine-tune the FIS- and ANFIS-based model at every iteration with the presently labeled recognizer (the calculated cluster).
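The select-then-suppress loop described above can be sketched as follows. This is a minimal Chiu-style subtractive clustering in numpy; the constants (the 4/r² scaling and the 1.5× squash radius) are the conventional defaults and are assumptions, not values from the paper:

```python
import numpy as np

def subtractive_cluster_centres(H, radius=0.3, n_centres=14):
    """Pick points of highest potential (density index), suppressing
    potential around each chosen centre before picking the next."""
    H = np.asarray(H, dtype=float)
    d2 = ((H[:, None, :] - H[None, :, :]) ** 2).sum(-1)  # pairwise sq. distances
    alpha = 4.0 / radius ** 2
    potential = np.exp(-alpha * d2).sum(axis=1)          # density index per pose
    beta = 4.0 / (1.5 * radius) ** 2                     # squash radius = 1.5 * r
    centres = []
    for _ in range(n_centres):
        c = int(np.argmax(potential))                    # topmost density index
        centres.append(c)
        # Subtract the chosen centre's influence from every pose's potential.
        potential = potential - potential[c] * np.exp(-beta * d2[:, c])
    return centres

# Toy usage: two well-separated blobs should yield one centre in each.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.1, (20, 2)), rng.normal(5, 0.1, (20, 2))])
print(subtractive_cluster_centres(X, radius=1.0, n_centres=2))
```

Each returned index is a data point elected as a cluster centre; in the paper's setting, each of the 14 centres seeds one fuzzy rule.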
Therefore, the computed clusters are applied to generate iterative optimization-based clustering for ANFIS model parameter identification [35]. In this paper, the number of cluster centres is set to 14 for the 112 biometric hand poses h_1, h_2, ..., h_D. Every biometric hand pose represents a particular member of a cluster centre; thus the number of fuzzy rules Ω equals the number of cluster centres, each corresponding to a cluster of features. The formulation in Eq. (9) is iterated until a sufficient number of cluster centres has converged. The biometric hand feature vector is fuzzified using a Gaussian MF with sigmoid activation (hybrid MFs), which yields a membership degree per feature (equal to 112×Ω) and thereby raises the dimensionality of the biometric hand feature input vector. The hybrid MFs allow a smooth transition between linear and nonlinear parameters compared with triangular and trapezoidal MFs [36]. In addition, the hybrid MF has fewer parameters than the bell and triangular MFs, which makes it more flexible. The Gaussian MFs are formulated as in Eq. (2). The ith rule can be obtained as follows: if in_1 = τ_{i1} and in_2 = τ_{i2}, ..., in_112 = τ_{i112}, then the output is f_i, where i = 1 : τ. The first-order Sugeno-type fuzzy model for τ_{in} at the ith MF and jth hand intrinsic feature dimension, where n = 1 : 112, is moderated by the consequent parameters [η_{i1}, ..., η_{i112}, ω_i]. The premise parameters are obtained from the τ_{in} Gaussian MF of Eq. (2); tuning the values of these parameters varies the MF and alters the behavior of the FIS. In the FIS union, l_{in} denotes the nth element of cluster centre l_i and ϵ_i denotes the radius of the neighborhood. We estimated the consequent parameters using the least-squares estimation method. Finally, the output of the Sugeno-type fuzzy algorithm is computed with Eq. (6).
Then, defining η_i = [η_{i1}, η_{i2}, ..., η_{i112}] as the row vector of linear parameters and H = [in_1, in_2, ..., in_112]^T as the hand intrinsic gesture space, and substituting for f_i in Eq. (6), we obtain:

B. SUPPORT VECTOR MACHINE (SVM)
SVM is an artificial-intelligence classifier that applies to nonlinear data recognition and classification. SVM has gained superiority among classical machine learning and AI classifiers in daily-life applications due to its flexibility and ease of use for various classification problems [37]. Its benefits include the modeling and conversion of dimensions for high-dimensional pattern recognition problems, empirical risk minimization, efficient learning, and good generalization ability [38]. The SVM model is formulated by letting h_i be the features extracted from the left and right hands, h_l and h_r respectively. We then let i index the set of outputs for sex, height, weight, and foot size, where i = 1 : n samples of measured data. In short, the biometric hand information used for training comprises sample-label pairs, where w is a vector of n row features and b denotes a constant bias. Therefore, w and b can be solved for via the finest separating plane. To solve for b, we need to minimize ||w||, subject to the box constraint c_i on the biometric hand feature vector. The distance between the hyperplanes is 2/||w||; thus, to maximize this distance, we minimize ||w||. To prevent any hand pose from falling within the hyperplane margin, constraints are added so that each hand pose must fall on the correct side of the margin, for i = 1, ..., n. The support vectors are realized from the maximum hyperplane margin, which is completely determined by the h_i closest to the margin; the support vectors are the h_i on the boundary. However, Eq. (17) cannot fit non-linear biometric features. The hinge loss function is very useful for hand poses that are linearly non-separable. Eq. (16) is therefore configured with the hinge loss function and becomes Eq. (18).
where λ represents the parameter that widens the margin so that h_i falls on its rightful side, and λ||w||² represents the regularization term of the loss function.
Therefore, the optimization problem in Eq. (18) becomes Eq. (19) under certain conditions. To trade off against maximizing the distance between the +1 and −1 margins in Eq. (19), we add a box-constraint parameter c, where c = 0.05. Furthermore, to achieve maximum non-linear separation among classes, it is imperative to transform the biometric features h_i, which allows a high-dimensional projection. The available SVM kernel functions, such as the polynomial, radial basis function (RBF), and hyperbolic tangent kernels, are highly robust for high-dimensional projection and for the efficiency of the SVM model, and the RBF kernel has achieved the best performance [39]. The RBF, also known as the Gaussian kernel, is mostly applied to non-linear data such as biometric hand poses. With the general SVM equations formulated, we handle classification of the proposed hand intrinsic features in their transformed shape ϕ(h_i) by configuring the RBF kernel function. Substituting the RBF kernel into Eq. (20) yields Eq. (21), where h_i represents the hand intrinsic features and γ the kernel-adjustment parameter, with γ > 0 and sometimes γ = 1/(2σ²). The values of γ and c are hyper-parameters tuned to attain the optimum SVM model with the RBF kernel. The value of γ defines the influence and reach of the training hand poses: a lower value means 'far', a higher value means 'close', and a higher value of γ results in better accuracy. The parameter c sets the model's tolerance of misclassification: a low value yields low accuracy for the SVM model, while a high value yields good accuracy but may fail to generalize. We set these hyper-parameters to middle values. Finally, the SVM classifier with c_i can be obtained through the Error-Correcting Output Code (ECOC) optimization problem solution.
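The RBF kernel and the role of γ can be made concrete with a short numpy sketch (the feature vectors below are illustrative, not drawn from the data set):

```python
import numpy as np

def rbf_kernel(h_i, h_j, gamma=0.5):
    """RBF (Gaussian) kernel: K(h_i, h_j) = exp(-gamma * ||h_i - h_j||^2)."""
    diff = np.asarray(h_i, dtype=float) - np.asarray(h_j, dtype=float)
    return float(np.exp(-gamma * np.dot(diff, diff)))

# gamma = 1/(2*sigma^2): small gamma -> 'far' influence, large gamma -> 'close'.
a, b = np.array([1.0, 2.0]), np.array([1.5, 2.5])
print(rbf_kernel(a, a))             # identical points give similarity 1
print(rbf_kernel(a, b, gamma=0.5))  # similarity decays with squared distance
print(rbf_kernel(a, b, gamma=5.0))  # larger gamma -> sharper decay
```

In an SVM, this kernel replaces the inner product ϕ(h_i)·ϕ(h_j), so the classifier never computes the high-dimensional projection explicitly.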
The classifier is equivalently given as follows. It is worth noting that the SVM algorithm is designed for binary classification, and therefore needs the ensemble ECOC strategy for multi-class problems [40]. ECOC is adopted to handle samples with unreliable or noisy information [41]. The application of ECOC is extended to the multi-class SVM algorithm to decompose a single multi-class problem. The theoretical background of ECOC is provided in Section II-C. The SVM multi-class classification of non-linear mappings with soft computing is achieved by modifying SVM with the ensemble ECOC design.

C. ERROR CORRECTING OUTPUT CODE (ECOC)
The idea of ECOC lies in the coding/decoding scheme itself. Its input consists of a set of E classes with e binary partitions of the classes, which are considered potential dichotomizers learned against the partitions. A codeword of length e is therefore obtained per class; within the codeword, each bit matches the output of a dichotomizer, usually coded as +1 or −1 according to class membership. Aligning the codewords in matrix shape, the coding matrix is defined as ∈ [−1, +1]^{E×e}. In this case, the matrix is coded with five dichotomizers [20] [υ_1, ..., υ_5] with respect to a four-class problem [s_1, ..., s_4] and codewords [κ_1, ..., κ_4]. The goal of the alignment is to learn the input sets with respect to the labels [(h_1, D_1), ..., (h_n, D_n)] for given input sets and labels D_l. Considering Fig. 2, orange and red entries represent +1 and −1 for a dichotomizer, and null entries appear inside the matrix. Thus, the first classifier learns to distinguish s_3 versus s_1, s_2 and s_4, whereas the second classifier learns to recognize s_2 and s_3 versus s_1 and s_4. Moreover, ECOC can also be encoded in ternary schemes, as shown in Fig. 2b. In symbol-based ternary ECOC, the symbol zero is added, which leads to the generation of seven dichotomizers [υ_1, ..., υ_7] per matrix; thus, the coding matrix and codewords for a four-class problem are obtained as ∈ [−1, 0, +1]^{E×e} and [κ_1, ..., κ_4], respectively. The computation of the dichotomizers can then be formulated accordingly. The principle of decoding is shown in Fig. 2. The code z is realized by applying the e binary classifiers to each input set h_i during verification. z is then compared with the root codewords (κ_i, i ∈ [1, ..., E]) of each class defined in the matrix, and the input sets are allocated to the closest codeword using available schemes such as Hamming decoding, inverse Hamming decoding, and Euclidean decoding.
The second decoding strategy in ECOC is referred to as ternary decoding, which is performed with any of attenuated Euclidean decoding, loss-based decoding, or probabilistic decoding [42]-[44]. For the ternary scheme, the Hamming decoding strategy classifies the validation input sets into class s_1. However, the validation codeword does not include the zero bit, because the response of each dichotomizer is υ_j ∈ [−1, +1]. For details of this scheme, readers are referred to [20]-[22]. Inspired by the above benefits, in this paper we extend the application of ternary ECOC to SVM binary learners. The major contribution to the existing SVM-ECOC is made on lines 12-28 of Algorithm 1, where we fine-tune the SVM-ECOC model every 10 iterations with the presently labeled recognizer. We achieved the finest combination of SVM-ECOC with a suitable RBF kernel and box constraint. The rest of the parameter combinations are reported in Table 10.
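To make the coding/decoding scheme concrete, the following sketch decodes a dichotomizer output vector against a ternary coding matrix with Hamming distance. The matrix here is a hypothetical 4-class example, not the one shown in Fig. 2:

```python
import numpy as np

# Hypothetical ternary coding matrix: rows are class codewords kappa_1..kappa_4,
# columns are dichotomizers; 0 means the class is ignored by that learner.
M = np.array([
    [+1, +1, -1,  0, +1],
    [-1, +1, +1, -1,  0],
    [ 0, -1, +1, +1, -1],
    [-1,  0, -1, +1, +1],
])

def hamming_decode(z, M):
    """Assign output code z to the class whose codeword is closest,
    counting mismatches only at positions where the codeword is nonzero."""
    mask = M != 0
    dist = ((M != z[None, :]) & mask).sum(axis=1)
    return int(np.argmin(dist))

z = np.array([+1, +1, -1, -1, +1])  # dichotomizer outputs for one sample
print(hamming_decode(z, M))         # index of the closest codeword
```

Because the decoder picks the nearest codeword, a few flipped dichotomizer outputs (noisy or unreliable learners) can still be corrected, which is the property the paper relies on.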

III. EXPERIMENT
The reviewed literature highlighted the principal practices for achieving the best performance and efficiency. Based on this review, the biometric and physiological data set was taken for experimental investigation, and a flowchart was designed to achieve the paper's objective.

A. BIOMETRIC HAND FEATURES
We adopted the biometric and physiological data set in [19]. The data were extracted from 112 participants (56 male and 56 female). The participants were all Caucasians aged 18 to 35 years. Demographic information gathered from the participants consisted of sex (male or female), height, weight, and foot size. The demographic descriptive analysis extends the work in [19]. The steps of the proposed methods are explained in the flow chart of Fig. 1.
Figure 2: ECOC code for a four-class argument. (a) ECOC with binary scheme. (b) ECOC with ternary scheme.

B. PEARSON CORRELATION COEFFICIENT (PCC)
We first apply Pearson's correlation coefficient (PCC) to identify the most significant and reliable features for algorithm training. The PCC test successfully gives insight and clues for choosing the best model features in real applications [45], [46]. Another reason for the PCC test is to minimize AI learning complexity. Here, φ denotes the coefficient computed from the covariance between the two independent parameters (h_i, O_i). The results of the test are displayed in Tables 2-9. The tables show that eleven (11) hand intrinsic features are independent of each other, with the exception that some samples have a correlation between features of less than 0.5, which is statistically negligible. Each table shows the PCC between features; the diagonal entries stand for the correlation of a particular feature with itself. Any cell with a value of at least 0.5 is assumed to indicate high correlation. We can deduce that the majority of features are highly correlated, while a few features with lower correlation may be preserved. These correlational patterns were found to be steady across the participants' data. Nevertheless, these features can still be chosen as independent variables. The significant features will be selected; the procedure is explained in the next section.
We averaged the significant features and emphasize feature fusion β, which decreases the training complexity of the AI model. The performance of the fused features, with regard to the recognition rate, is discussed briefly. In this experiment, we design our input feature vector from the following selected inputs:
(a) seven (7) features for height recognition from the right hand, as shown in Table 2;
(b) nine (9) features for height recognition from the left hand, as shown in Table 3;
(c) three (3) features for sex recognition from the right hand, as shown in Table 4;
(d) two (2) features for sex recognition from the left hand, as shown in Table 5;
(e) nine (9) features for weight recognition from the right hand, as shown in Table 6;
(f) thirteen (13) features for weight recognition from the left hand, as shown in Table 7;
(g) four (4) features for foot-size recognition from the right hand, as shown in Table 8;
(h) eleven (11) features for foot-size recognition from the left hand, as shown in Table 9.
To further visualize the selected features, we employed a curve-fitting tool to generate fitting plots; the fitting performance is demonstrated using the metric in Eq. (32). The results show that most of the selected features are statistically significant, with R² of 0.2-1.0. Less significant features, with values below 0.65, are still considered in our model development; although these features are not commensurate with the recognition task, they can still serve as vital information for reducing the complexity of designing the hand camera rig and the AI. The results show that not all features are equally significant in biometric gesture discrimination. This validation is required to establish whether SVM-ECOC and ANFIS-SC are sufficient for recognizing demographic cues from biometric hand features.

1) Recognition using SVM-ECOC
SVM is designed, as explained in Section II-B, to draw a dividing line between two classes and to maximize the margin. To achieve multi-class recognition from binary learners, we ensemble the ECOC code with SVM. We chose a kernel function in the SVM algorithm for nonlinear intrinsic hand feature evaluation, enabling more features to fit the hyperplane. The biometric intrinsic hand features are divided into 70% training and 30% validation, and the features are trained using the SVM-ECOC algorithm for better optimization. γ > 0 is chosen (that is, γ = 1/(2σ²)); the value of γ defines the influence and reach of the training hand poses. The values of γ and c are hyper-parameters tuned to attain optimum SVM-ECOC recognition: c sets the model's tolerance of misclassification, where a low value yields low accuracy and a high value yields good accuracy but may fail to generalize. We set these hyper-parameters to middle values. Finally, we achieved the SVM classifier with c_i, obtained through the coding design shown in Fig. 5, which represents the multi-class labels 1, ..., 4. If the minimum Hamming distance between each pair of codewords is d, then the code can correct up to ⌊(d−1)/2⌋ bit errors: as long as an erroneous code is less than this distance away from the actual codeword, the nearest codeword is classified as the correct one. This code can correct up to 2 errors for four input features, and up to 3 errors for 13 biometric intrinsic hand features. As our number of classes E satisfies 3 < E ≤ 7, each codeword has length 2^{E−1} − 1, and the matrix needs 13 binary classifiers (i.e., 13 columns) in total. Finally, the evaluation metrics are used to analyze the performance of the SVM-ECOC algorithm.
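The two coding-design formulas used above can be checked with a small helper (a sketch assuming an exhaustive binary coding design; the paper's actual coding matrix may differ):

```python
def ecoc_codeword_length(E):
    """An exhaustive binary ECOC design uses 2**(E-1) - 1 dichotomizers
    (codeword length) for E classes."""
    return 2 ** (E - 1) - 1

def correctable_errors(d_min):
    """A code with minimum Hamming distance d_min corrects
    floor((d_min - 1) / 2) bit errors."""
    return (d_min - 1) // 2

print(ecoc_codeword_length(4))  # 7 binary learners for a 4-class exhaustive design
print(correctable_errors(5))    # 2 correctable errors at minimum distance 5
print(correctable_errors(7))    # 3 correctable errors at minimum distance 7
```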

2) Recognition using ANFIS-SC
ANFIS-SC is trained and validated using randomly selected 70% and 30% splits of the biometric intrinsic hand dataset, respectively. ANFIS-SC is adopted here to recognize foot-size, height, weight and sex (O_i). The chosen parameter values of ANFIS-SC are described in Table 10. Accordingly, we achieved the finest combination of the ANFIS model with the maximum number of inputs h_i plus one output O_i, making a 113 input-output vector, with fourteen (14) Gaussian membership functions (MFs) per biometric hand pose. Every feature takes in two parameters, generating one hundred and forty (140) nonlinear parameters. The linear equations take in five parameters over fourteen (14) rules, generating eighty-four (84) linear parameters in total. The finest model is obtained with a cluster radius of 0.3 and a maximum of three hundred and fifty (350) iterations. Furthermore, unlike SVM, ANFIS performs a complex nonlinear projection, which makes it well suited to learning biometric information.
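The subtractive clustering (SC) step that seeds the ANFIS rules can be sketched as below. This is a simplified version of Chiu's algorithm (single acceptance threshold instead of the full accept/reject band), on synthetic data; the squash factor of 1.5 and the acceptance ratio are assumptions for the example.

```python
import numpy as np

def subtractive_clustering(X, radius=0.3, accept_ratio=0.5):
    """Simplified Chiu subtractive clustering: every point is a candidate
    center whose potential is the summed Gaussian influence of all points
    within `radius`. The highest-potential point becomes a center, its
    influence is subtracted, and the loop stops once the remaining peak
    potential falls below accept_ratio * the first peak."""
    alpha = 4.0 / radius ** 2
    beta = 4.0 / (1.5 * radius) ** 2      # squash radius = 1.5 * radius
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    potential = np.exp(-alpha * d2).sum(axis=1)
    centers = []
    p_first = potential.max()
    while True:
        k = int(np.argmax(potential))
        if potential[k] < accept_ratio * p_first:
            break
        centers.append(X[k])
        potential = potential - potential[k] * np.exp(-beta * d2[k])
    return np.array(centers)

# Two well-separated blobs should yield two cluster centers, each of
# which would seed one fuzzy rule (one Gaussian MF set) in ANFIS-SC.
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0.2, 0.03, (50, 2)),
               rng.normal(0.8, 0.03, (50, 2))])
print(len(subtractive_clustering(X, radius=0.3)))
```

The cluster radius is the key tuning knob here: each resulting center becomes one fuzzy rule, so the radius directly controls the rule count (14 in the paper's finest model).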

E. PARAMETER SETTINGS
The parameters in Table 10 are chosen for the design of the two adopted AI algorithms.

F. PERFORMANCE METRICS
The following evaluation metrics are used to assess the performance of the input-feature models and the adopted algorithms. The variables O_i, O′_i, S, θ, σ²_r and O_a denote the observed data, the verified (predicted) features, the number of samples, the number of parameters, the residual variance and the mean of the observed features, respectively.

4) Mean Absolute Deviation (MAD)
Furthermore, we report the ANFIS-SC model's simplicity, flexibility and degree of fitness according to the following two metrics: Akaike's Information Criterion with correction (AICc) and the Nash-Sutcliffe model efficiency index (NSE).

6) Akaike's Information Criterion (AICc)
AIC estimates the degree of information lost by a model [47]. If a small dataset is employed for model development, the AIC index is likely to overfit; the corrected AIC (AICc) is formulated to handle this. The small-sample AICc metric evaluates the quality of a model according to its structural flexibility and its level of deviation from the average value, and is obtained during the model's verification on unseen observations [26], [48]. A lower AICc value therefore indicates a better model. AICc is computed as
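The paper's Equation (33) is not reproduced in this excerpt; for reference, the standard small-sample AICc for a least-squares fit, written with the symbols defined above, takes the form (the paper's exact equation may differ):

```latex
\mathrm{AICc}
= S \,\ln\!\left(\frac{1}{S}\sum_{i=1}^{S}\bigl(O_i - O'_i\bigr)^{2}\right)
+ 2\theta
+ \frac{2\theta(\theta + 1)}{S - \theta - 1}
```

The last term is the small-sample correction; it vanishes as S grows, recovering the ordinary AIC.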

7) Nash-Sutcliffe model efficiency index (NSE)
This metric evaluates the model's fitness and its level of deviation; its index value ranges from −∞ to 1 [26]:
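As a sketch of how the metrics in this section can be computed, the snippet below implements textbook definitions of RMSE, MAPE, SI, MAD, R², NSE and small-sample AICc. Since the paper's exact equations are not reproduced here, minor definitional details (e.g. the MAD convention, R² as the squared correlation) are assumptions.

```python
import numpy as np

def evaluate(obs, pred, theta):
    """Textbook versions of the paper's evaluation metrics for a model
    with `theta` tunable parameters."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    S = obs.size
    err = obs - pred
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    rmse = np.sqrt(ss_res / S)
    return {
        "RMSE": rmse,
        "MAPE": 100.0 * np.mean(np.abs(err / obs)),   # in percent
        "SI":   rmse / obs.mean(),                    # scatter index
        "MAD":  np.mean(np.abs(err)),                 # one common MAD convention
        "R2":   np.corrcoef(obs, pred)[0, 1] ** 2,
        "NSE":  1.0 - ss_res / ss_tot,                # range (-inf, 1]
        # Small-sample AICc for a least-squares fit.
        "AICc": S * np.log(ss_res / S) + 2 * theta
                + 2 * theta * (theta + 1) / (S - theta - 1),
    }

obs  = np.array([10.0, 12.0, 14.0, 16.0, 18.0])
pred = np.array([10.1, 11.8, 14.2, 15.9, 18.1])
print(evaluate(obs, pred, theta=2))
```

An NSE close to 1 and a low RMSE/AICc together indicate the kind of fit the paper reports for ANFIS-SC (NSE ≥ 99%, RMSE ≤ 3.85).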

IV. RESULTS AND ANALYSIS
In this section, the performance results of the two adopted algorithms in generalizing demographic information are presented. The generalization performance of the algorithms is computed using the evaluation metrics defined in Section III-F. Our results demonstrate that the ANFIS model optimized with SC tracks the nonlinear pattern of the intrinsic hand features over most of the data, and that ANFIS-SC outperforms the SVM model optimized with ECOC. The best results were achieved by model 1 (which includes all the features selected by the PCC scheme) and are shown in boldface in Tables 11-18. The running speed of the two algorithms is discussed in Section IV-B. Moreover, the optimized SVM-ECOC model outperforms the conventional SVM of one of the existing models.

A. COMPARISON BETWEEN PERFORMANCE OF OPTIMIZED ANFIS-SC, SVM-ECOC AND SOME EXISTING MODELS
According to Table 19, our adopted algorithms are compared against the work in [19] using the evaluation metrics of Section III-F. Where the conventional method did not report a given metric, the corresponding entry is marked with a dash to indicate that the value is not available.
The comparison is made in terms of recognition accuracy, R², RMSE, AICc, MAD and MAPE. The best result is achieved by ANFIS-SC, although in some instances SVM-ECOC is computationally more efficient than ANFIS-SC and the LOG model. The LOG model recognized majority features efficiently while disregarding minority features as noise, which led to misclassification and low computed accuracy. In addition, a comparison is made against the performance of LOG-based classification, as displayed in Table 20. The results indicate the applicability and superior performance of ANFIS-SC in biometric recognition relative to the LOG-based method.
Moreover, since the ANFIS-SC model has superior performance compared to the other two models, it is chosen for further analysis according to two metrics: the AICc index and the NSE index. AICc is adopted because of our small sample size (S) and number of distinct parameters (θ). The AICc value, obtained from Equation (33), is 1.86; the ANFIS-SC model thus achieved the best AICc result, indicating good model flexibility, since a smaller AICc value indicates a more accurate model. In addition, the final prediction error of the ANFIS-SC model is 2.815 × 10⁻⁶, demonstrating that the model generalizes well to the biometric intrinsic hand cues. An ANFIS-SC model fitness of 99% is achieved through evaluation of the NSE metric over the complete set of proposed features, and the degree of model information lost, evaluated through AICc, gives good results, with interpretation detailed in Table 21. The value of η_{N-S}, obtained through Equation (34), corresponds to 98.99% model accuracy. These results demonstrate the computational flexibility of the ANFIS-SC model on nonlinear datasets. This work focuses mainly on the recognition accuracy of the algorithms rather than their speed; nevertheless, the following section presents the algorithms' running times.

B. RUNNING TIME
Tables 22-23 report the running-time complexity of training and validating the SVM-ECOC and ANFIS-SC algorithms. These complexities approximate the bounds of each algorithm during recognition of the selected biometric information. The running time of SVM-ECOC is obtained from its corresponding output, which comes from the response of the classifier f, a parameter indicating whether the SVM-ECOC classifiers can be steered to the best recognition, and a variable φ that selects the best input features to be used during validation. Training the SVM-ECOC algorithm amounts to the run time given by the corresponding equation, while the running time during validation is T(n), because the best models with good accuracy settle at n features; the overall run-time complexity during validation can therefore be formulated accordingly, where T stands for the time complexity, R_u denotes the maximum number of upper bounds, ℵ the maximum number of iterations, n the number of samples in the dataset, and φ the number of selected features per model combination. The time taken to verify unseen inputs with the SVM-ECOC algorithm grows with the product (φ × n); that is, it depends on the number of selected features and the number of input sets. Furthermore, the ANFIS-SC run time can be obtained by combining the time complexities of SC and ANFIS. In what follows, the SC run time can be formulated as T(ℵ(n + n²)).
The SC time complexity is dominated by computing the distance matrix among all dataset couplets, which needs T(nm²) arithmetic operations. Once the distance matrix is obtained, SC performs a number of iterations equal to the number of clusters, and per cycle the running time of Eq. (9) is T(m). For ANFIS, the run time of the fuzzy rules is T(1), while Eq. (2) has run time T(n) and Eq. (3) has run time T(n²). Thus, the total run time of ANFIS-SC is achieved by summing the time complexities of SC and ANFIS:

T(ℵ(nU + obj.U)) + T(1) + T(n) + T(n²) = T(n²),   (40)

where U and obj. denote the number of solutions and the objective function, respectively. The run times of the compared AI algorithms are short compared to other methods. The major drawback of extending SC with ANFIS is the trial and error required to select the most suitable and stable cluster radius: a small cluster radius may give more centers, which can lead to overfitting, whereas a large cluster radius may give fewer centers, leading to under-fitting and thus decreasing the recognition accuracy of the models.
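To complement the asymptotic bounds above, training and validation wall-clock times can be measured along these lines. This is an illustrative benchmark only: the synthetic data, sample sizes and default SVM-ECOC settings are assumptions, and absolute times depend on the machine.

```python
import time
import numpy as np
from sklearn.multiclass import OutputCodeClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(3)
results = []
for n in (200, 400, 800):                  # growing sample size n
    X = rng.normal(size=(n, 10))
    y = (X[:, 0] > 0).astype(int) + 2 * (X[:, 1] > 0).astype(int)
    clf = OutputCodeClassifier(SVC(kernel="rbf"), random_state=0)

    t0 = time.perf_counter()
    clf.fit(X, y)                          # training time
    t_fit = time.perf_counter() - t0

    t0 = time.perf_counter()
    clf.predict(X)                         # validation/verification time
    t_pred = time.perf_counter() - t0

    results.append((n, t_fit, t_pred))
    print(f"n={n:4d}  fit={t_fit:.4f}s  predict={t_pred:.4f}s")
```

Plotting these timings against n (and against the number of selected features φ) gives an empirical check of the (φ × n) growth claimed for SVM-ECOC verification.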

V. CONCLUSIONS
In this work, we initialized AI algorithms with the ECOC and subtractive clustering optimization schemes as potential enablers for conventional AIs to handle complex parameter estimation. The work aimed to investigate the correlation between intrinsic hand measurements and demographic features. It used AI algorithms, with LOG for comparison, to analyze and recognize sex, height, weight and foot-size from 21 hand features extracted through measurement of the hand bones. The compared models were realized with properly chosen parameters. The evaluation metrics show that the AI recognition performed better than LOG in predicting demographic characteristics. Our results agree with, and improve upon, the previous method. Specifically, ANFIS-SC based learning demonstrates high performance in terms of accuracy and speed when compared to SVM-ECOC learning; these results qualify it for many applications, such as crime detection and soft-biometric feature recognition, and the work has direct application in both the biometric and forensic industries. The major limitations of both methods relate to accuracy and the high computational burden for a limited hand corpus; even on a restricted hand corpus, however, the biometric recognition quality of the AI algorithms is already rather good. Moreover, adding more demographic and physiological cues such as race/ethnicity, life expectancy, and facial marks could further improve the recognition capability by reducing the number of classification errors. In addition, the AI algorithms can suffer from premature or slow convergence.