KPLS Optimization With Nature-Inspired Metaheuristic Algorithms

Kernel partial least squares regression (KPLS) is a technique used in several scientific areas because of its high predictive ability. This article proposes a methodology to simultaneously estimate both the parameters of the kernel function and the number of components of the KPLS regression to maximize its predictive ability. A metaheuristic optimization problem was proposed taking the cumulative cross-validation coefficient as an objective function to be maximized. It was solved using nature-inspired metaheuristic algorithms: the genetic algorithm, particle swarm optimization, grey wolf optimization and the firefly algorithm. To validate the results and have a reference measure of the efficiency of the nature-inspired metaheuristic algorithms, derivative-free optimization algorithms were also applied: Hooke-Jeeves and Nelder-Mead. The metaheuristic algorithms estimated optimal values of both of the kernel function parameters and the number of components in the KPLS regression.


I. INTRODUCTION
Partial least squares (PLS) regression is a linear method that seeks to predict a set of dependent variables (Y) from a set of predictors (X) by extracting orthogonal factors that maximize predictive ability, also called components [1], [2]. PLS is not appropriate for describing data structures when these exhibit nonlinear variations [3], so Rosipal and Trejo [4] proposed the kernel partial least squares regression method (KPLS), which transforms the original datasets into a space of arbitrary dimensionality characteristics, where the generation of a linear model is possible [5]. A recurring difficulty when implementing KPLS regression is determining both the number of components and kernel function parameters that maximize its predictive capacity [6], [7], generating an optimization problem.
In recent decades, so-called nature-inspired metaheuristic algorithms have emerged, which are powerful and effective in addressing various types of optimization problems, do not require assumptions to work well and are flexible and easy to implement [8]. In addition, according to Lin et al. [9], metaheuristic algorithms obtain better search subsets to determine optimal or approximate solutions in the transformed space, so they are suitable for determining both the kernel The associate editor coordinating the review of this manuscript and approving it for publication was Zhenzhou Tang . function parameters and number of components in the KPLS regression [10].
The aim of this work is to propose a methodology to optimize the predictive ability of KPLS regression through nature-inspired metaheuristic algorithms such as the genetic algorithm (GA), particle swarm optimization (PSO), grey wolf optimization (GWO) and the firefly algorithm (FFA). The relative performance of these algorithms is evaluated in terms of accuracy, convergence of estimates and computational speed. To validate the results and have a reference measure of the efficiency of the nature-inspired metaheuristic algorithms, the performance of two derivative-free algorithms is compared: Hooke-Jeeves (HJ) and Nelder-Mead (NM) [11]. This article is an extension of the conference proceeding presented for the International Workshop in Statistical Methods and Artificial Intelligence IWSMAI 20 [12], focused on the use of GA as an optimization agent. The scope of this article is broader in that it incorporates into the analysis a set of both nature-inspired metaheuristic algorithms and deterministic algorithms.
Several studies have chosen to use nature-inspired metaheuristic algorithms to improve the performance of PLS/KPLS regression, with different purposes and schemes than those presented in this article. In the literature review, it was found that the most recurrent metaheuristic algorithms in PLS/KPLS regression problems are GA [10], [13], [14], Algorithm 1 KPLS Algorithm by Rosipal and Trejo [4] 1. Randomly initialize u (any column of Y) Repeat steps 2 -5 until convergence 6. Deflate K,Y matrices: K←−(I-tt T )K(I-tt T ); Y ←− (I-tt T Y) where I is an n-dimensional identity matrix 7. Repeat steps 2 -7 until all the h-score vectors of T and U are found.
PSO [15], [16], and less frequently FFA [17]. This article is organized as follows. Section 2 presents the KPLS regression method, its cross-validation coefficient Q 2 cum and describes the four nature-inspired metaheuristic algorithms to be implemented: GA, PSO, FFA and GWO. Section 3 presents the methodology. Sections 4 and 5 present the results and conclusions, respectively.

A. KERNEL PARTIAL LEAST SQUARES REGRESSION
Denote by an (n × M ) matrix whose i-th row is the vector φ(x i ) in a M -dimensional feature space F, which can be very large [18]. The kernel function k(x i , x j ) = φ(x i ) T φ(x j ) calculates the inner product in the feature space F. We can see that K = T represents the (n × n) kernel matrix of the cross dot products between all mapped input data points {φ(x i )} n (i=1) , i.e., the element of the i-th row and j-th column of K is k ij = k(x i , x j ) [19]. In this article, the Gaussian kernel function k( KPLS regression is essentially a technique based on the decomposition of this kernel matrix K into singular values [20]. Rosipal and Trejo [4] proposed the KPLS algorithm as shown in Algorithm 1. After all the h components are extracted, we can write the matrix of the regression coefficients B in the following form B = T U(T T KU) (−1) T T Y, where the (n × h) matrices T and U represent the input and output scores, respectively.
To make predictions on the training data, we can writê are the testing and training points, respectively. Before applying KPLS, mean centering of the data should be carried out in the feature space [3].
KPLS regression predictive ability can be estimated for each included h-component by means of the cross-validation where PRESS h denotes the predictive residual sum of squares including the component h and RSS h−1 the residual sum of squares with the previous components [21], [22]. The overall predictive performance of KPLS regression can be assessed by the cumulative cross-validation coefficient , which takes values between 0 and 1, and the higher the Q 2 cum value is, the better the predictive performance [23].
Cross-validation is probably one of the oldest resampling techniques [24]. The procedure divides the dataset into blocks of size equal to k and then use k -1 blocks to fit the model and validate it in the remaining block. This is done for all possible combinations of k -1 of the k blocks. The k-blocks are usually called folds in the cross-validation literature [25]. In this article, the cross-validation procedure was applied to randomly split the dataset into k = 10 segments of equal size to obtain the coefficients Q 2 and Q 2 cum [26]. We can list two main problems in KPLS: the selection of the kernel function and its parameters and the selection of the number of components. However, kernel function selection is still an open problem in KPLS [27]. This article proposes a metaheuristic tuning procedure to simultaneously estimate the kernel function parameter and the number of components, where the objective function is the cumulative cross-validation coefficient Q 2 cum (h, θ) for the case of KPLS regression is a function of both parameters. The optimization task can be written as follows.
(h, θ) = arg max The domain of the number of components is the set of positive integers smaller than the number of columns of X and the domain of the kernel function parameter is a predefined subset S of real numbers.
The main differences with respect to [7] are both the optimization task formulation and the cross-validation approach based on grid-search that it proposes.This article does not include a comparison of the accuracy of the estimates and the computational costs associated with the two proposals due to differences in the databases evaluated and the software packages used.
An additional feature of this proposal is that it designs an algorithm already embedded in a general iterative optimizer, e.g., a nature-inspired metaheuristic or derivative-free algorithms. The proposed method is universal for any kernel function and easy to implement in any computer. Algorithm 2 describes the steps of the method to be used with nature-inspired metaheuristic algorithms.
To perform the statistical analysis of the estimates, several runs of this procedure are performed for each optimizer, and the convergence of the estimates with the gradual increase in the optimizer iterations is also evaluated.

B. NATURE-INSPIRED METAHEURISTICS ALGORITHMS
Metaheuristics, in their original definition, are solution methods that orchestrate an interaction between local improvement procedures and higher-level strategies to create VOLUME 8, 2020

Algorithm 2 KPLS Parameter Selection
Given X n×N the input matrix, Y n×L the output matrix, a kernel function k ij and its kernel parameter θ ∈ S, and the number of components h ∈ Z + . 1. The optimizer randomly generates an initial population p of θ (individuals). 2. From X n×N the input matrix, p kernel matrices K n×n are generated, one matrix for each θ.
3. KPLS regression (Algorithm 1) between K n×n and Y n×L is executed with h = N components. 4. Q 2 cum is calculated by 10-fold cross-validation for each KPLS. 5. Identify θ associated with the maximum Q 2 cum and extract h for which this value is reached. 6. g optimizer iterations are performed. The fitness function of the optimizer is max Q 2 cum . 7. The optimizer determines the best values of θ and h and the optimal Q 2 cum .
a process capable of escaping from local optima and performing a robust search of a solution space [28]. Metaheuristic algorithms are appropriate for solving problems with complex formulation because they do not use any specific information of the problem in the exploration of the space of feasible solutions [29]. An important set of nature-inspired metaheuristic algorithms is that they mimic the behavioral characteristics of biological agents [30]. The most popular are evolutionary algorithms and swarm intelligence algorithms, which have demonstrated their potential for solving important engineering decision-making problems [31]. The basic formulations of four nature-inspired metaheuristic algorithms: GA, PSO, FFA, and GWO [32]- [34], are described in this section. The derivative-free optimization algorithms used in this work are also briefly presented [11].

C. GENETIC ALGORITHMS
GA is a population-based metaheuristic algorithm widely used in search and optimization problems [35], [36]. GA works on a population of individuals (chromosomes). Each individual represents a potential solution to the problem to be solved [37]. In the algorithm, an individual (or chromosome) is represented by a binary bit string, an initial population of individuals is created randomly, the objective function for each individual is evaluated, and based on these values, the individuals of the next iteration are produced by selection, crossing and mutation operations. The basic steps of the algorithm can be consulted in [14].
For the implementation of a GA, it is necessary to previously define its parameters such as crossing probability p c , mutation probability p m [38], population size p and number of generations g [32]. The completion criterion can be determined by the number of generations (iterations) g or the convergence of results [39]. In Mello-Román et al. [12], the optimization of the KPLS with GA was deepened, and preliminary tests were carried out to determine the p and g values.

D. PARTICLE SWARM OPTIMIZATION
PSO is a population-based metaheuristic optimization technique inspired by the social behavior of bird flocks or the movement of fish shoals [40]. Each particle in the swarm is a potential solution to the optimization problem, which moves through the search space and updates its velocity and position with a learning mechanism based on its personal best experience and the population best experience [15]. In each iteration, every particle moves in a direction that is based on the previous best individual position pbest and the best overall position of the swarm gbest. For the dimension d of a particle i, the velocity and position can be updated according to the following equations: v where t is the iteration, v id is the velocity, x id is the particle position, ω is an inertia factor, c 1 and c 2 are learning factors, and r 1 and r 2 are random numbers between 0 and 1. More details about the steps followed by PSO can be found in [16].

E. FIREFLY ALGORITHM
FFA is a swarm intelligence technique inspired by imitating the light emissions of fireflies to attract each other [41].
To formulate the algorithm, the following conditions are established: a firefly can be attracted by any other firefly, the attraction is proportional to brightness for any pair of fireflies, the less bright firefly will move towards the brighter firefly, as the fireflies move away the perception of the light decreases, and the intensity of the light emitted by the firefly is determined by the value of the function to be optimized [42]. Each firefly (candidate solution) flashes its lights with some brightness (associated with the objective function) attracting other fireflies within its neighborhood. This attractiveness depends on the distance between the two fireflies and is determined by β(r) = β 0 e (−r 2 ij ) , where r or r ij is the (Euclidean) distance between the i-th and j-th of two fireflies, β 0 is the initial attractiveness at r = 0 and γ is a fixed light absorption coefficient that controls the decrease in the light intensity [30]. The movement of the i-th firefly is attracted to another more attractive firefly j and is given by the following equation [42]: where x i is the particular location of the i-th firefly, α is the randomization parameter, and i is a vector of random numbers drawn from a Gaussian or uniform distribution. The steps of the FFA algorithm can be found in [43].

F. GREY WOLF ALGORITHM
GWO is a recent metaheuristic optimization technique that simulates the hierarchical mechanism and predatory behavior of the gray wolf pack, where under the leadership of the main gray wolf, wolves capture prey through a series of processes: surrounding, hunting and attacking [44]. GWO is an optimization algorithm with a strong global search capability; however, studies continue to be conducted to verify its performance in different types of problems and suggest improvements [45], [46]. According to Mirjalili et al. [44], the process can be described as follows: first, the gray wolves track the prey, chase it and approach it; if the prey runs, then they chase it, surround it and harass it until it stops moving. Finally, the attack begins. Details of the steps followed by the GWO algorithm and its parameters can be found in Gao and Chao et al. [45].

G. DERIVATIVE-FREE OPTIMIZATION ALGORITHMS
The proposal of this work is to optimize the KPLS regression by means of nature-inspired metaheuristic algorithms. However, we intend to check the performance of some deterministic algorithms on the same optimization problem, both to validate the results and to have a reference measure of the efficiency of the set of nature-inspired metaheuristic algorithms evaluated. According to Sun et al. [47], when it is necessary to determine the optimum of a real-value function defined in n-dimensional space, but the derivatives are not available because they are computationally expensive or there is no explicit expression of the partial derivatives of the function, it is possible to resort to optimization methods called derivative-free or direct search methods. These methods use only function values and apply when no computer code can be produced for the derivative of the function. In general, they are robust methods, even for nonconvex or discontinuous functions [48].
Two popular derivative-free optimization algorithms are NM and HJ [11]. The NM algorithm is a direct search method for the unrestricted optimization of multidimensional functions, and it attempts to minimize a nonlinear scalar function of n variables by using only values of the function [49]. The HJ method is also a derivative-free optimization method that searches for directions of descent by exploratory movements through coordinate directions by performing two types of searches: an exploratory search and a pattern search, which are repeated until convergence [50].

III. METHODOLOGY
Taking as reference the good practices proposed by Osaba et al. [29] for the implementation and comparison of metaheuristics, in this work, we present for each algorithm the number of runs performed, the mean and standard deviation of both the objective function Q 2 cum and the KPLS parameters θ and h [7], the convergence behavior, the run times, and an appropriate statistical analysis with the results obtained.
The parameter estimation algorithm for the KPLS regression was programmed in the R programming language. This required the installation of the following packages: plsdepot by Sanchez [51], kernlab by Karatzoglou et al. [52], metaheuristicOpt by Riza and Nugroho [53] and dfoptim [54]. The Gaussian kernel function was selected for this work [7], [26].
Preliminary tests were performed to select the hyperparameters of the nature-inspired metaheuristic algorithms [32], [33]. Several sets of hyperparameters were evaluated while keeping both the population size and number of iterations constant. Considering the overall goal of this work, the sets of hyperparameters for which the algorithms showed higher mean values of Q 2 cum and with lower dispersion were selected [55].
The iterations of the metaheuristic algorithms must be stopped according to some criterion, which can be the convergence of the algorithm or a preset number of iterations [39]. In this article, we evaluated the convergence of estimates for a set of iterations g = {10, 30, 50, 100, 150} keeping the population size constant at the value p = 40 [53].
The parameter estimation procedure for KPLS regression is presented in Algorithm 2. Thirty runs [43] of the algorithm were performed, extracting the maximum value of Q 2 cum and its associated values of both the number of components h and the parameter of the Gaussian kernel function σ for each run. Furthermore, the running time was calculated in minutes.
For the different runs, the maximum Q 2 cum reached, the Gaussian kernel function parameter value σ , the number of components h and the running time were obtained. The number of iterations of each metaheuristic algorithm was gradually increased in g = {10, 30, 50, 100, 150}, and the mean and standard deviation of these estimates were extracted [29]. To validate the results and have a benchmark of the efficiency of the GA, PSO, FFA and GWO algorithms in the optimization task, the performance of the HJ and NM algorithms was also evaluated. The NM and HJ algorithms establish a set of random starting points, taking into account the fact that these procedures are usually stuck in local optima [56]. The parameters of these algorithms were established with reference to recent research [57], [58].
Finally, a statistical analysis was carried out with the results obtained to test the hypothesis that there are no statistically significant differences between the estimates of the different metaheuristic algorithms [29]. One-way ANOVA was performed taking as a null hypothesis that the means of the estimates of the objective function Q 2 cum and the σ and h parameters were the same for the different algorithms evaluated. The analysis was complemented with variance homogeneity tests and multiple post hoc comparisons to identify the best-performing algorithms [67]. The Thamane T2 test was used for post hoc comparisons. This test is indicated when variances and/or sample sizes are unequal [68]. Additionally, the Kolmogorov-Smirnov normality test was applied to contrast that the distribution of σ estimates is normally distributed.

A. DATASET
For the implementation of the Algorithm 2, a dataset reported by Tsanas et al. [60] in the Machine Learning Repository at the University of Carolina at Irvine [61] was used. The dataset VOLUME 8, 2020 consists of biomedical voice measurements from 42 people with early Parkinson's disease recruited for remote monitoring of symptom progression. The predictor variables are subject number, subject age, subject gender, time interval from initial recruitment date, and 16 biomedical voice measurements. The data are used in predictive tasks for scores on the Unified Parkinson's Disease Rating Scale (UPDRS), motor and total scores [60], [62]. This scale is used to track Parkinson's disease [63].
Several research studies have used different techniques to attempt to predict the severity of Parkinson's disease symptoms from speech cues [64]. In particular, on the same dataset, feature selection methods [60], least squares regression techniques, nonparametric classification and regression trees were applied. In [63], artificial neural networks were tested. In both works, [63] and [64], despite the good results, the high correlation between some measures of dysphonia was highlighted, and it was suggested to further explore nonlinear methods.
Additionally, we used another dataset [65] hosted in the same repository [61] to implement Algorithm 2. It contains data on the academic performance of three hundred ninety-five (395) students of two Portuguese secondary schools in mathematics and Portuguese language, measuring forty-six (46) variables: grades of three periods, G1, G2 and G3, demographic, social and other variables related to the school, all compiled through reports and school questionnaires. More details about the variables can be found in [61]. This data file has been used in the past [65] for classification and ordinal regression purposes. More specifically, univariate ordinal regression was performed taking G3 as the response variable, comparing the results when G2 and G1 were also included as predictor variables and when they were excluded. It has been noted that feature selection methods should also be explored since only a small portion of the input variables considered appear to be relevant.

IV. RESULTS
Preliminary tests were first carried out to select the hyperparameters of the nature-inspired metaheuristic algorithms for the two datasets used [60], [65]. The results are presented in Table 1.
Below are the results of thirty runs of the parameter estimation algorithm in KPLS regression proposed in Algorithm 2 for the metaheuristic algorithms GA, FFA, PSO and GWO and the derivative-free algorithms NM and HJ. To generate the kernel matrix K, the Gaussian kernel function was used with the parameter σ > 0.
The population size of each metaheuristic algorithm was set at p = 40, and the iterations were gradually increased by g = {10, 30, 50, 100, 150}. The derivative-free NM and HJ algorithms took several random starting points σ 0 > 0 in given ranges for both datasets. The convergence criterion was the value 1.e-06 [54]. Figure 1 presents the behavior of the estimates of Q 2 cum for both datasets. The metaheuristic algorithms estimated higher  average values of Q 2 cum and with a lower dispersion than the free derivative algorithms. Estimates of Q 2 cum were very similar for all four metaheuristic algorithms, with minimal  variations observed with the increasing numbers of iterations. Despite the above differences, the estimates of the deterministic algorithms do not differ considerably from the values obtained by the metaheuristic algorithms. Details can be seen in Tables 2 and 3. A statistical analysis of Q 2 cum estimates was carried out [29]. First, one-way ANOVA was executed taking as a null hypothesis that the means of Q 2 cum were equal for the different algorithms. The analysis was complemented by a variance homogeneity test and multiple post hoc comparisons to identify the best-performing algorithms [67]. The results presented in Table 4 indicate that there is statistical evidence to reject the null hypothesis that the average estimates of Q 2 cum are the same for the different tested algorithms for both datasets. Similarly, the result given by Levene's statistic indicates the nonhomogeneity of variances.
Estimates of the parameter of the Gaussian kernel function σ were on average similar for the algorithms evaluated in  both datasets. Details can be seen in Tables 2 and 3. NM and HJ showed mean values of σ close to those obtained by the metaheuristic algorithms, but with a much larger standard deviation. Table 4 shows the results of the ANOVA carried out to contrast the null hypothesis of equality of means of σ obtained by the different algorithms implemented. The ANOVA test indicates that there is statistical evidence to reject the equal-means hypothesis at a significance level of α = 0.05. Likewise, Levene's statistic indicates the nonhomogeneity of variances. For post hoc comparisons, the T2 Thamane test was used to indicate when variances and/or sample sizes are unequal [68]. The results for the voice biomedical dataset indicate a significant difference between the Q 2 cum estimates of the HJ algorithm and the other algorithms. In the academic performance dataset, multiple comparisons found no statistical evidence to state that Q 2 cum estimates were different for the algorithms evaluated. In this dataset, pairwise comparisons were not significant, while the overall effect was weakly significant [69]. This last claim could be justified by both the weak overall effect and the conservative characteristics of the T2 Thamane test. The results are presented in Table 5.
The number of h components estimated was equal to 6 for all algorithms evaluated in the biomedical voice measurement dataset. For the academic performance dataset, the estimates were variable; however, the most frequent estimate for all the algorithms was h = 7. The results are presented in Table 6.
A result that is favorable to the estimation in the interval of the σ parameter is the approximately normal distribution of the estimates for certain algorithms. Taking h = 6 for the  biomedical voice measurement dataset and h = 7 for the academic performance dataset, the Kolmogorov -Smirnov normality test was applied. The null hypothesis of the contrast was that the distribution of sigma estimates is normally distributed. At a significance level α = 0.05, the hypothesis for the algorithms for PSO and GWO in both datasets was not rejected. Details are shown in Table 7. The distributions of the distribution of σ parameter estimates by the nature-inspired metaheuristics algorithm are presented in Figure 2 for both datasets.
According to [8], in a complex optimization problem where there is no theory to find or verify the optimal, if several nature-inspired metaheuristic algorithms with tuned hyperparameters converge to a solution, this solution is most likely the optimal solution.

A. CONVERGENCE ANALYSIS
The convergence of estimates of Q 2 cum , σ and h was evaluated with respect to the variation in the number of iterations of each nature-inspired metaheuristic algorithm. Tables 2 and 3 do not show an overall effect in the increase in the iterations in the means and standard deviations obtained for Q 2 cum , σ and h. A statistical test was carried out taking as a null hypothesis the equality of means of Q 2 cum , σ and h for the different levels of iterations. Levene's statistic was also calculated to verify the homogeneity of variances. Table 8 presents the results.
There is no statistical evidence of the effect of increasing iterations in nature-inspired metaheuristic algorithms on estimates of Q 2 cum , σ and h in the two datasets evaluated at an α significance level = 0.05. However, for the academic performance dataset and at a significance level of α = 0.10, the null hypothesis of equality of means and variances in σ estimates with the increase in the number of iterations is rejected.
A relevant point in the comparison of the optimization algorithms is the computational cost associated with each algorithm. Tables 2 and 3 show a direct relationship between the number of iterations and the running time in minutes for the four metaheuristic algorithms. GA and PSO have a higher associated computational cost in both datasets and for any number of iterations. However, the reduced running time of the NM and HJ algorithms is noteworthy.
For the evaluated datasets and this specific optimization task, the results are inconclusive regarding the contribution to the accuracy of the estimates made by the increased iterations and running time of the nature-inspired  metaheuristic algorithms. Figure 3 shows the estimates of Q 2 cum in both datasets with respect to the running time of the nature-inspired metaheuristic algorithms. No direct relationship between the two variables is observed in either cases.
These results are consistent with those of previous publications [8], which suggest that deterministic algorithms are more computationally efficient than nature-inspired metaheuristic algorithms and refer to the slow convergence of metaheuristic algorithms.
Due to differences in objectives and methodologies, it was not possible to compare the results obtained with the biomedical voice measurement dataset with other studies conducted on the same dataset [60], [63]. The results obtained in this article regarding the academic performance dataset first verified the estimates obtained in [12], which exclusively used GA as the optimizer to improve the predictive capacity of the KPLS regression. However, the results in terms of convergence of estimates with the increasing iterations have not been repeated for the voice biomedical measurement dataset or for the other nature-inspired metaheuristic algorithms.

V. CONCLUSION
This article proposes a methodology to simultaneously estimate both the kernel function parameter θ and the number of components h, which maximize the predictive capacity in the KPLS regression. An optimization problem is posed that takes as an objective function the cumulative cross-validation coefficient Q 2 cum . To solve the problem, an algorithm is designed that embeds nature-inspired metaheuristic algorithms: GA, PSO, FFA and GWO. To validate the results and have a reference measure of the efficiency of the set of algorithms, the performance of derivative-free optimization algorithms NM and HJ on the indicated problem is also evaluated.
The results indicate that the nature-inspired metaheuristic algorithms allow the estimation of approximate optimal values of the KPLS regression parameters. In the two datasets, the mean estimates of Q 2 cum , θ and h made by the four metaheuristic algorithms were similar with low dispersion. The derivative-free algorithms obtained a similar performance in the mean of the estimates, validating the results obtained, but the dispersion in the estimates was generally higher. Under the conditions established in this work, it is not possible to draw a conclusion about the convergence of the estimates with the increase in the iterations in the metaheuristic algorithms. Additionally, the derivative-free algorithms have shown a much lower computational time than the metaheuristic algorithms.
The conclusions reached are limited to the optimization problem posed and the datasets used. Future studies can explore other approaches to optimize the predictive ability of the KPLS regression, reduce the computational cost and dispersion in the estimates, include more kernel functions in the analysis, and test with more datasets. ADOLFO HERNÁNDEZ received the Ph.D. degree in mathematical sciences (statistics) from the Carlos III University of Madrid. He is currently an Associate Professor with the Complutense University of Madrid, where he is also the Head of the Research Group "Statistical Methods and Big Data Applied to the Economy, Tourism, and other Social Sciences." He was a Former Lecturer and a Senior Lecturer with the Carlos III University of Madrid, the Autonoma University of Madrid, the University of Salamanca, and the University of Exeter, U.K.