Robust Multi-Linear Fuzzy SVR Designed With the Aid of Fuzzy C-Means Clustering Based on Insensitive Data Information

Multiple SVR based on ensemble learning can improve modeling performance, but the performance of such models depends closely on the initial conditions of the partitioning method, and they are easily affected by noise and outliers. In this study, a multi-linear fuzzy support vector regression (MFSVR) robust to noise is proposed with the aid of a composite kernel function and <inline-formula> <tex-math notation="LaTeX">$\varepsilon $ </tex-math></inline-formula>-fuzzy c-means (FCM) clustering based on insensitive data information. Here, insensitive data information refers to the data lying within the interval defined by “<inline-formula> <tex-math notation="LaTeX">$\varepsilon $ </tex-math></inline-formula>”, the insensitive loss parameter of the <inline-formula> <tex-math notation="LaTeX">$\varepsilon $ </tex-math></inline-formula>-insensitive loss function. The objective of this study is to reduce the effect of noise and to alleviate the overfitting problem through the synergy of the following methods. First, <inline-formula> <tex-math notation="LaTeX">$\varepsilon $ </tex-math></inline-formula>-FCM clustering based on insensitive data information gives more weight to data that influence the decision boundary while reducing the effect of noise. Second, a composite kernel based on a multiple linear kernel expression is proposed to implement a multi-linear decision boundary that alleviates the overfitting problem. In more detail, each training data point is assigned corresponding membership degrees by the <inline-formula> <tex-math notation="LaTeX">$\varepsilon $ </tex-math></inline-formula>-FCM clustering; data points that are potentially noise or outliers receive lower membership degrees and therefore contribute less (compensation) to the composite kernel function.
Then, the composite kernel function for the multiple local SVRs is constructed according to the distribution characteristics of the <inline-formula> <tex-math notation="LaTeX">$\varepsilon $ </tex-math></inline-formula>-FCM clustering. The proposed MFSVR is tested on both synthetic and UCI data sets to verify its effectiveness and its performance improvement. Experimental results demonstrate that the proposed method shows better performance than several methods studied so far.


I. INTRODUCTION
During the past two decades, the identification of nonlinear models (NM) has received much attention. One can refer here to neural networks (NN), radial basis function networks (RBFNs), wavelet networks (WNs), and neural fuzzy networks (NFNs) [1]-[4]. Support vector regression (SVR) has attracted interest due to its high generalization ability and robustness in various applications. It is based on Vapnik's ε-insensitive loss function and structural risk minimization [5], and was subsequently proposed as the support vector machine (SVM) implementation for regression and function approximation [6]. SVR has a variety of attractive characteristics and has been studied and applied successfully to identification problems. Among these characteristics, an important common point is the use of kernel techniques, which perform a nonlinear mapping to a high-dimensional feature space implicitly by replacing the inner product with a positive definite kernel function. It is well known that the performance of SVM depends on the choice of kernel function. In the SVM literature, there exist the linear kernel, the sigmoid kernel, and the radial basis function (RBF) kernel. Composite kernel functions, on the other hand, produce more flexible regression mappings by combining different types of kernel functions. Many real-world systems exhibit complicated nonlinear features that are difficult to capture with purely linear models, since there is no exact multicollinearity among the input variables.
In these tasks, nonlinear kernel SVRs such as RBF and sigmoid are also confronted with a potential over-fitting issue, especially for datasets that are linearly inseparable in the high-dimensional feature space. As a result, this can lead to severe over-fitting in the conventional nonlinear SVR. To address this problem, the ''divide-and-conquer'' algorithm is an important strategy and is widely used as an available tool for solving conceptually difficult problems [7]. A ''divide-and-conquer'' algorithm works by breaking a difficult problem into two or more sub-problems. The solutions to the sub-problems are then combined to give a solution to the original problem [8]. In [9], an SVR with a quasi-linear kernel is proposed for nonlinear system identification; the quasi-linear kernel treats the nonlinear system as a multi-linear model.
There is also a problem with the ''divide-and-conquer'' strategy: each subset is not capable of capturing the distribution information of the whole input data, and if the number of subsets is too large, an over-fitting phenomenon becomes more likely.
When multiple SVR models are applied, the original training data are partitioned by some unsupervised learning approach, such as the hard clustering method (HCM). One shortcoming of these methods for multiple SVR is that the regression results are easily affected by noise, because each local SVR is trained on a small number of data located in its training subspace [24]. Also, the partitions depend closely on the initial conditions, and the performance of the regression results depends on whether the partition can capture the linear characteristics of the training dataset. Due to this kind of partition, the multiple SVRs for the local subsets can easily become imbalanced. Vapnik-Chervonenkis (VC) theory is introduced based on a limited number of data. Structural risk minimization (SRM) is the most important research area in VC theory. The SRM principle provides a tradeoff between the quality of an approximation and the complexity of the approximating function [26]. The way to control the approximation and complexity is to adjust the insensitivity parameter ε present in the loss function. The decision boundary based on the ε-insensitive margin is considered as the regression model. For the regression problem, more compensation on the regression model helps to alleviate the overfitting problem on complex regression tasks. Here, an FCM clustering based on ε-insensitive data information (ε-FCM) is proposed for providing more compensation on the interval data in the ε-insensitive margin.
In this study, a multiple fuzzy SVR (MFSVR) based on ε-FCM and the composite kernel function is proposed for alleviating the overfitting problem. The proposed MFSVR approximates a nonlinear separating boundary by a number of multi-local linear boundaries with interpolation, in which the local linear property of the nonlinear boundary is used to construct a composite kernel. In the composite kernel function, the membership degrees obtained by ε-FCM are considered to provide the regression model with more compensation. The contribution of irrelevant data which are far away from regression model will be limited. Then, the information of local subsets is merged for estimating a composite kernel, and then the composite kernel is utilized to realize the multi-linear fuzzy SVR model and implement the SRM in the same way as a single standard SVR.
The key issues and the contributions of the proposed MFSVR can be briefly outlined as follows: (a) Fuzzy subsets obtained by ε-FCM based on the ε-insensitive margin are used to limit the effect of irrelevant data. When the training dataset is divided into several subsets, each with less training data, the SVR decision boundary becomes more susceptible to uncorrelated data. ε-FCM alleviates the effect of uncorrelated data by assigning more compensation (through the corresponding membership degrees) to the decision boundary (regression model). In ε-FCM, data that fall within the ε-insensitive margin are assigned high membership degrees so that the regression model is fitted more accurately.
(b) The proposed MFSVR based on the composite kernel function is implemented as a multi-linear system to fit complicated systems more properly. The purpose of the composite kernel function is to alleviate the overfitting problem with multiple local linear decision boundaries: the composite kernel treats the nonlinear decision boundary as a combination of several local linear decision boundaries. The local linear decision boundaries obtained from the local clusters can be adjusted through the number of clusters to alleviate the overfitting that occurs in small clusters. The proposed MFSVR can thus alleviate the over-fitting problem based on this composite kernel function.
(c) In order to carry out a fair and comprehensive comparison, statistical analysis is completed in experimental analysis.
The experiments include a comparison of the proposed model with a multi-linear SVR and a comparison of the proposed model with other general models. To clearly display the comparison results, the Bonferroni-Dunn test is used to compare models run on the same datasets and to evaluate whether the measured average ranks are significantly different from the mean rank.
In the sequel, the main contributions of our work can be summarized concisely as follows. First, the local linear decision boundaries obtained from the local clusters can be adjusted through the fuzzy clusters to alleviate the overfitting problem. Second, ε-FCM can effectively decrease the effect of uncorrelated data by assigning more compensation (through the corresponding membership degrees) to the decision boundary (regression model). Third, the composite kernel based on a multiple linear kernel expression is used to form a multi-linear decision boundary for better performance.
The paper is organized as follows. Section 2 provides a survey of SVR models. Section 3 presents the ε-FCM method. Section 4 gives the formulation of MFSVR while section 5 outlines the procedure of MFSVR. Section 6 shows the experimental results and analysis. Finally, conclusions are covered in section 7.

II. SURVEY OF SVR
Regression aims to find the intrinsic relationship within a set of data [21]. The traditional regression method treats a prediction as correct only if the regression output f(x) is exactly equal to y, and otherwise computes a loss. Support vector regression (SVR) instead regards a prediction as correct, with no loss, as long as the deviation between f(x) and y is not too large. Specifically, a threshold ε is set and the loss is computed only for data points with |y − f(x)| > ε. As shown in Fig. 1, values inside the dotted lines are considered correct predictions; only values outside the dotted lines incur a loss. The SVR is formalized as

min_{w, b, ξ, ξ*} (1/2)‖w‖² + τ Σ_{k=1}^{N} (ξ_k + ξ_k*)
s.t. y_k − w^T φ(x_k) − b ≤ ε + ξ_k,  w^T φ(x_k) + b − y_k ≤ ε + ξ_k*,  ξ_k, ξ_k* ≥ 0,

where τ is a regularization constant and the ε-insensitive loss function is L_ε(y, f(x)) = max(0, |y − f(x)| − ε).
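As a concrete illustration, the ε-insensitive loss described above can be written in a few lines. This is a minimal sketch; the function and parameter names are ours, not the paper's.

```python
import numpy as np

def eps_insensitive_loss(y_true, y_pred, eps=0.1):
    """Vapnik's eps-insensitive loss: zero inside the eps-tube around
    the prediction, linear outside of it."""
    return np.maximum(0.0, np.abs(y_true - y_pred) - eps)

# The first point lies inside the tube (no loss); the second lies
# 0.4 outside of it.
loss = eps_insensitive_loss(np.array([1.0, 2.0]), np.array([1.05, 2.5]), eps=0.1)
```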
Table 1 lists commonly used kernel functions: RBF, linear, and polynomial. As shown in Fig. 2, the model fits best when using the RBF kernel function. However, the complexity of the SVR with the RBF kernel function is relatively high, which can easily lead to overfitting [20]. As a consequence, a multi-local linear SVR is proposed to solve the over-fitting problem of the SVR with the RBF kernel function.

III. ε-FCM BASED ON INSENSITIVE MARGIN
The nonlinear SVR model is too complex, implying overfitting, while the linear SVR model is too simple to be accurate enough. A ''divide-and-conquer'' algorithm works by breaking a difficult problem into two or more sub-problems. The solutions to the sub-problems are then combined to produce a solution to the original problem.
As shown in Fig. 3, the proposed solution can be described through the following steps:
Step 1) Divide the training dataset into several fuzzy spaces (subsets) using ε-insensitive loss parameter based fuzzy c-means (ε-FCM) clustering.
Step 2) Estimate each subset by a nonlinear mapping, and carry out regression on each subset using a local linear SVR.
Step 3) Introduce the composite kernel function with fuzzy inference to implement the multiple local linear model by incorporating the distribution information combined from each subset.
This section describes the ε-FCM method, which alleviates the overfitting problem by assigning more compensation to the decision boundary. Since the decision boundary (regression model) is defined based on the training labels, ε-FCM can be considered a type of supervised clustering algorithm. The motivation and procedure of ε-FCM are described as follows.
The proposed ε-FCM is a data pre-processing algorithm built on an existing SVR, which limits the effect of data points far away from the decision boundary [23]. When using the existing FCM, the cluster centers of the clusters (subsets) are easily influenced by noisy data.
As shown in Fig. 4, the cluster centers are affected by the outliers and noisy data (green circles), which degrades the result of the clustering algorithm. To alleviate the problem illustrated in Fig. 4, ε-FCM is proposed. The procedure of the proposed ε-FCM is described in Algorithm 1, and Fig. 5 shows its detailed computational steps.
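The procedure just described can be sketched in Python. This is an illustration under assumptions, not the paper's exact algorithm: a preliminary scikit-learn RBF SVR supplies the ε-tube, a plain FCM iteration runs on the insensitive points only, and the function name `eps_fcm` and all parameter defaults are ours.

```python
import numpy as np
from sklearn.svm import SVR

def eps_fcm(X, y, n_clusters=3, eps=0.2, m=2.0, n_iter=50, seed=0):
    """Sketch of eps-FCM: cluster centers are estimated only from points
    inside the eps-insensitive tube of a preliminary RBF SVR; membership
    degrees are then computed for ALL points w.r.t. those robust centers."""
    f = SVR(kernel="rbf", epsilon=eps).fit(X, y).predict(X)
    inside = np.abs(y - f) <= eps            # insensitive-margin points
    Xi = X[inside]
    rng = np.random.default_rng(seed)
    V = Xi[rng.choice(len(Xi), n_clusters, replace=False)]  # init centers
    for _ in range(n_iter):                  # standard FCM updates on Xi
        d = np.linalg.norm(Xi[:, None, :] - V[None, :, :], axis=2) + 1e-12
        U = 1.0 / (d ** (2.0 / (m - 1.0)))
        U /= U.sum(axis=1, keepdims=True)
        V = (U.T ** m @ Xi) / (U.T ** m).sum(axis=1, keepdims=True)
    # membership degrees for all training points
    d_all = np.linalg.norm(X[:, None, :] - V[None, :, :], axis=2) + 1e-12
    U_all = 1.0 / (d_all ** (2.0 / (m - 1.0)))
    return U_all / U_all.sum(axis=1, keepdims=True), V
```

Because the centers are fitted only on data inside the ε-tube, points far from the regression model (potential noise or outliers) cannot pull the centers toward themselves.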

IV. ARCHITECTURE OF PROPOSED SVR
Multiple SVR models can obtain better results than a conventional single SVR based on the ''divide-and-conquer'' strategy. However, the complexity of the model increases because the SVR algorithm is run multiple times [22]. In this section, a composite kernel function is formulated to alleviate the overfitting problem and to refine the model in a single-SVR fashion, reducing model complexity [25]. According to the membership degrees u in (10), the new composite kernel function is constructed.
In [16], Pedrycz used FCM clustering to find the membership functions of the antecedent variables and then identified a relational fuzzy model. In ε-FCM, each cluster corresponds to a fuzzy IF-THEN rule in the Takagi-Sugeno-Kang form [17], [18]:

R_i: IF x is A_i THEN y_i(x) = w_i^T x + w_{i0},

where w_i is the vector of consequent parameters of the i-th rule and w_{i0} denotes the bias of the i-th local model. The antecedent fuzzy set of the i-th rule, A_i, has a membership degree u_i(x) ∈ [0, 1]. In [19], the FCM-related fuzzy model is described as a weighted averaging aggregation of the individual rules,

y = Σ_i u_i(x) y_i(x),

where u_i(x) is the normalized firing strength of the i-th rule for the input x; in compact form, the overall output of the FCM-related model can be written as y = u w_g.
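The weighted-averaging aggregation of TSK rules can be illustrated with a small sketch. The helper names are hypothetical, and the firing strengths u_i(x) are assumed to be already computed (e.g., by FCM).

```python
import numpy as np

def tsk_output(x, firing, W, b):
    """Weighted-averaging aggregation of TSK rules:
    y = sum_i u_i(x) * (w_i^T x + b_i), with the firing strengths
    u_i(x) normalized so that they sum to one."""
    u = firing / firing.sum()      # normalized firing strengths
    local = W @ x + b              # local linear consequents y_i(x)
    return float(u @ local)

# Two rules with equal membership simply average their local models:
# 0.5 * (1*2 + 0) + 0.5 * (3*2 + 0) = 4.0
W = np.array([[1.0], [3.0]])
b = np.array([0.0, 0.0])
y = tsk_output(np.array([2.0]), np.array([0.5, 0.5]), W, b)
```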
Then, we give the formulation of the composite kernel function with respect to u.

Algorithm 1: ε-FCM Clustering Based on the Insensitive Margin
Input: all training data
Output: membership degrees of the input data
1. Train a single SVR with the RBF kernel on the whole training set X and calculate the regression function f(x)
2. for each x_k in X do
3.   if |y_k − f(x_k)| ≤ ε then
4.     add x_k to the insensitive data set
5.   end
6. end
7. Run the FCM iteration on the insensitive data set to calculate the cluster centers
8. Compute the membership degrees of all training input variables with respect to the obtained centers

Consider a single-input-single-output (SISO) nonlinear time-invariant system whose input-output dynamics are described as

y(t) = g(ϕ(t)) + e(t), (13)

where u(t) ∈ R, y(t) ∈ R, and e(t) ∈ R are the system input, the system output, and a zero-mean stochastic noise at time t, n_u and n_y are the unknown maximum delays of the input and output, respectively, and ϕ(t) ∈ R^n is the regression vector composed of delayed input-output data; the number of input variables n equals the sum of n_u and n_y. With t replaced by u, (13) can be rewritten as y = g(ϕ(u)) + e (14). As an example, the model is obtained by performing a Taylor expansion of the unknown nonlinear function g(ϕ(u)) around ϕ(u) = 0. Since g(·) is assumed to be differentiable, the derivatives g^(i)(0) (i = 1, 2, ...) exist. Ignoring g(0) for simplicity, a regression form of the system described in (14) is given with a multiple structure whose coefficients a_{i,t} = a_i(ϕ(u)) and b_{i,t} = b_i(ϕ(u)) are nonlinear functions of ϕ(u) and can thus be represented by nonlinear nonparametric models (NNM), where p_j is a vector of parameters of the j-th basis function of the NNM, such as the center (µ) and width (σ) parameters in an RBFN, and M denotes the number of basis functions contained in the NNM; p_j and M are design parameters. The vector [w_{1j}, ..., w_{nj}]^T connects the input variables to the associated basis functions. According to (12) and (16), a compact representation of the model is given in which the NNM N_j(u) is written, for simplicity, in terms of ϕ^T(u), with u_0 = 1. By introducing Φ(u) and Θ, (17) can be rewritten as the linear regression ŷ = Φ^T(u)Θ. Therefore, the nonlinear system identification is reduced to a linear regression problem with respect to Φ(u), and the parameter vectors Θ are estimated using the SVR approach.
We introduce structural risk minimization (SRM) as

min_{Θ, ξ, ξ*} (1/2)‖Θ‖² + C Σ_{t=1}^{N} (ξ_t + ξ_t*)

with the constraint conditions

y_t − Φ^T(u_t)Θ ≤ ε + ξ_t,  Φ^T(u_t)Θ − y_t ≤ ε + ξ_t*,  ξ_t, ξ_t* ≥ 0,

where N is the number of samples and ξ_t*, ξ_t are slack variables. The parameter C controls the tradeoff between the complexity of the regression model and the amount up to which errors are tolerated.
The solution can be obtained by finding a saddle point of the associated Lagrange function.
Then the saddle point can be acquired by minimizing L with respect to Θ, ξ_t*, and ξ_t.
Thus, one can convert the primal problem (21) into an equivalent dual problem

max_{α, α*} −(1/2) Σ_{t=1}^{N} Σ_{s=1}^{N} (α_t − α_t*)(α_s − α_s*) K(u_t, u_s) − ε Σ_{t=1}^{N} (α_t + α_t*) + Σ_{t=1}^{N} y_t (α_t − α_t*) (23)

subject to Σ_{t=1}^{N} (α_t − α_t*) = 0 and 0 ≤ α_t, α_t* ≤ C. Only the multipliers α_t and α_t* remain in (23); once obtained, they are substituted into (18) to recover the parameter vector Θ.
With the multipliers α_t and α_t* obtained, the model can be identified as a multiple SVR with the membership degrees u.
Each training subset is represented by R_j(u), corresponding to an RBF, which can be expressed as

R_j(u) = exp(−‖u − v_j‖² / (λ σ_j²)),

where the j-th RBF R_j(u) is specified by the center v_j and the width σ_j², set as the center and radius of the j-th cluster, and λ is a scaling constant.
In this way, the nonlinear input data become linear in a high-dimensional space formed in terms of the composite kernel function (26). In other words, the composite kernel function is learnt to reproduce an appropriate decision boundary according to data distribution.
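One plausible reading of such a composite kernel — a fuzzy-weighted combination of local linear kernels, K(u, u′) = Σ_j R_j(u) R_j(u′)(u^T u′ + 1) — can be sketched and plugged into a single SVR via a precomputed Gram matrix. The exact form of (26) is not reproduced here; the function names and this particular combination are illustrative assumptions.

```python
import numpy as np
from sklearn.svm import SVR

def rbf_weight(X, centers, widths, lam=1.0):
    """R_j(x) = exp(-||x - v_j||^2 / (lam * sigma_j^2))."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (lam * widths[None, :] ** 2))

def composite_kernel(Xa, Xb, centers, widths, lam=1.0):
    """K(x, x') = sum_j R_j(x) R_j(x') (x^T x' + 1): a fuzzy-weighted
    combination of local linear kernels. It is positive semidefinite,
    being a Schur product of two PSD kernels."""
    Ra = rbf_weight(Xa, centers, widths, lam)
    Rb = rbf_weight(Xb, centers, widths, lam)
    return (Ra @ Rb.T) * (Xa @ Xb.T + 1.0)

# A single SVR trained on the precomputed composite kernel.
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(80, 1))
y = np.sin(X[:, 0])
centers, widths = np.array([[-1.0], [1.0]]), np.array([1.0, 1.0])
K = composite_kernel(X, X, centers, widths)
model = SVR(kernel="precomputed", epsilon=0.05).fit(K, y)
pred = model.predict(K)
```

Because the solver only sees one kernel matrix, the multi-linear model is trained in the same single-SVR fashion as a standard SVR, which is the complexity-reduction argument made in the text.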
The name multi-RBF kernel reflects two aspects. First, it is derived from the quasi-ARX modeling method. Second, the nonlinearity of the kernel can fill the gap between linear and nonlinear kernel functions by adjusting the value of C. The architecture of the proposed MFSVR is shown in Fig. 6.

V. DESIGN PROCEDURE OF THE MFSVR
Overall, the proposed MFSVR design framework includes the following steps; refer to Fig. 7.

Step 1 (Construct Training Data and Testing Data, and Set the Parameters for the MFSVR Structure and Learn the Model):
The original dataset is divided into two parts: training data and testing data. The training data are used to design the MFSVR, and the testing data are used to verify the quality of the constructed MFSVR. The parameters of the MFSVR structure include: fuzzification coefficient, number of clusters, insensitive loss parameter ε, and penalty coefficient C.
Step 2: Carry out ε-FCM for input space.
Substep 2-1: Obtain the data points in the ε-insensitive margin. Select the data points within the ε-insensitive margin (|y − f(x)| ≤ ε) by implementing an SVR model with the RBF kernel function.
Substep 2-2: Calculate and update the cluster centers from the insensitive-area data points (v_ip). Carry out FCM to estimate the cluster centers v_ip for the insensitive-area data points.
Substep 2-3: Calculate the membership degrees of all input variables (u_is). (10) is used to compute the membership degrees of all input variables.
Step 3 (Construct the Composite Kernel and Carry Out MFSVR for Modeling): The function K(u, u′) is constructed according to (26). Then, the proposed MFSVR model (24) is designed with the aid of the composite kernel function.
Step 4 (Compute Performance Index): The root mean square error (RMSE) on the training and testing datasets is computed by the MFSVR designed via steps 2-3. The above process is run five times using five-fold cross-validation, and the average of the performance index and its standard deviation are reported as the output of the designed model.
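Step 4 can be sketched as a generic cross-validated RMSE loop. This is illustrative only: a scikit-learn SVR stands in for the full MFSVR of steps 2-3, and the function name is ours.

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.svm import SVR

def cv_rmse(X, y, make_model, n_splits=5, seed=0):
    """Step 4 sketch: mean and standard deviation of the RMSE over
    five-fold cross-validation; `make_model` stands in for the MFSVR
    construction of steps 2-3."""
    rmses = []
    for tr, te in KFold(n_splits, shuffle=True, random_state=seed).split(X):
        model = make_model().fit(X[tr], y[tr])
        err = y[te] - model.predict(X[te])
        rmses.append(np.sqrt(np.mean(err ** 2)))
    return float(np.mean(rmses)), float(np.std(rmses))

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 1))
y = np.sin(3 * X[:, 0])
mean_rmse, std_rmse = cv_rmse(X, y, lambda: SVR(kernel="rbf"))
```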

VI. EXPERIMENTAL STUDIES
In this section, the conventional SVR with three types of kernel functions (linear, polynomial, and RBF), the proposed MFSVR based on FCM, and the proposed MFSVR based on ε-FCM are compared in terms of performance. Here, ε-FCM clustering means FCM clustering carried out based on the insensitivity margin of the parameter ''ε''.
The most important difference between the three conventional SVRs and the proposed MFSVR is the kernel function. The proposed MFSVR performs fuzzy partitioning of the nonlinear regression task by ε-FCM, locally fits each partition with several linear kernel functions, and then integrates these partitions through the composite kernel function to alleviate the overfitting problem.
We use both synthetic and machine-learning datasets to compare the performance of the existing models and the proposed model. The robustness of the proposed model to noise and outliers is then tested at 5 dB, 10 dB, and 15 dB noise levels and 10%, 20%, and 30% outlier rates (a 5 dB noise rate means 95% of the data are additively impacted by white noise). Noise is added as white Gaussian noise whose power is set relative to the signal power according to the specified signal-to-noise ratio, and outliers are introduced by randomly changing output values to the mean value [19].
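The noise- and outlier-injection protocol described above can be sketched as follows. This is a hedged illustration: the SNR-based noise-power rule mirrors how white-Gaussian-noise generators work, and the outlier rule replaces randomly chosen outputs with the mean output value, as stated in the text; the function names are ours.

```python
import numpy as np

def add_awgn(y, snr_db, seed=0):
    """Add white Gaussian noise at a given signal-to-noise ratio (dB):
    noise power = signal power / 10^(snr_db / 10)."""
    rng = np.random.default_rng(seed)
    p_signal = np.mean(y ** 2)
    p_noise = p_signal / (10 ** (snr_db / 10))
    return y + rng.normal(0.0, np.sqrt(p_noise), size=y.shape)

def add_outliers(y, rate=0.1, seed=0):
    """Replace a random fraction of the outputs with the mean output."""
    rng = np.random.default_rng(seed)
    y = y.copy()
    idx = rng.choice(len(y), int(rate * len(y)), replace=False)
    y[idx] = y.mean()
    return y
```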
All datasets are randomly divided into an 80% training dataset and a 20% testing dataset, and 5-fold cross-validation is used throughout. Table 2 summarizes the parameter values of the proposed MFSVR.

A. TWO-DIMENSIONAL SYNTHETIC DATASET
The MFSVR is applied to modeling the well-known Matlab logo, a display of one of the eigenmodes of an ''L''-shaped membrane, as shown in Fig. 8. From the solution set of Fig. 8(a), 500 input-output patterns are randomly selected, and their distribution is visualized in Fig. 8(b). These 500 pairs of input-output patterns are used to design the models and compare their performance. Table 3 shows the performance of the proposed MFSVR with different numbers of clusters when using the existing FCM and ε-FCM; the performance of the two proposed models is very similar. Fig. 9 shows the membership functions and local input spaces of the two proposed models with 5 clusters: Fig. 9(a) depicts the partition function of each cluster and the local input spaces formed by the existing FCM, and Fig. 9(b) those formed by ε-FCM. With the general FCM, the data are divided evenly into 5 clusters (Fig. 9(a)), while with ε-FCM all five cluster centers are distributed within the ε-insensitive margin (Fig. 9(b)). Table 4 compares the proposed model with the existing SVR models. The approximation and generalization abilities of the proposed model are largely improved in comparison with those of the existing SVR. The experiments are implemented on a computer with an Intel Core i5-4690K 3.50 GHz CPU. The computing time of the proposed model is significantly shorter than that of the existing SVR model, and the computing time of the proposed MFSVR with ε-FCM is nearly the same as that of the MFSVR with the existing FCM; the 0.003-second difference arises from the different numbers of clusters. Table 5 shows the performance index of the proposed MFSVR with different levels of added white noise when using the existing FCM and ε-FCM.
After adding white noise to this simple synthetic dataset, the performance of the MFSVR with ε-FCM is better than that with the general FCM. Table 6 compares the proposed model with the existing SVR models after adding white noise; the approximation and generalization abilities of the MFSVR with ε-FCM are largely improved in comparison with those of the existing SVR under the same white noise. Table 7 compares the performance of the proposed model and the existing SVR models with different levels of outliers. Fig. 10 shows the outlier points among the insensitive data points selected by ε-FCM. Fig. 10(a) shows the 500 data points after adding 10% outliers, consisting of 450 original data points ('•') and 50 outliers ('*'). Fig. 10(b) shows the insensitive data points selected by ε-FCM; the blue dots ('•') are the 140 insensitive data points selected from the 500 data points with outliers in Fig. 10(a). Among the 140 selected insensitive data points, only three are outliers; the other 47 outliers have been filtered out.

B. ABALONE DATASET
We consider the abalone dataset (http://archive.ics.uci.edu/ml/datasets/Abalone), in which the age of abalone is predicted from physical measurements. The age of an abalone is determined by cutting the shell through the cone, staining it, and counting the number of rings under a microscope. This dataset includes 4177 input-output pairs with 7 input variables; the output is the age of the abalone. Table 8 shows the performance of the proposed MFSVR with different numbers of clusters when using the existing FCM and ε-FCM; the performance of the two proposed models is similar on the abalone dataset. Table 9 compares the proposed models with the existing SVR models; the approximation and generalization abilities of the two proposed models are largely improved in comparison with those of the existing SVR. Table 10 shows the performance of the proposed MFSVR with added white noise when using the existing FCM and ε-FCM. After adding white noise, the performance of the MFSVR with ε-FCM is better than that with the general FCM. Table 11 compares the proposed models with the existing SVR models after adding white noise; the approximation and generalization abilities of the MFSVR with ε-FCM are largely improved in comparison with those of the existing SVR.

C. AIR QUALITY DATASET
The dataset contains 9358 instances of hourly averaged responses from an array of 5 metal-oxide chemical sensors embedded in an Air Quality Chemical Multisensor Device. Data were recorded from March 2004 to February 2005 (one year), representing the longest freely available recordings of the responses of air quality chemical sensor devices deployed in the field. Ground-truth hourly averaged concentrations for CO, non-methanic hydrocarbons, benzene, total nitrogen oxides (NOx), and nitrogen dioxide (NO2) were provided by a co-located certified reference analyzer. Table 13 shows the performance of the proposed MFSVR with different numbers of clusters when using the existing FCM and ε-FCM. The performance of the MFSVR using the existing FCM is better than that using ε-FCM: because the air quality dataset is time-series data and the amount of data is relatively large, the existing FCM, which uses all the data to calculate the cluster centers, obtains more comprehensive data information than ε-FCM, which uses only the insensitive data to compute the centers. Table 15 shows the performance index of the proposed MFSVR with different levels of added white noise when using the existing FCM and ε-FCM; after adding white noise to the air quality dataset, the performance of the two proposed models is similar. Tables 16 and 17 show the performance index on the testing dataset including Gaussian white noise and outliers, with boldface entries indicating better performance than the existing SVR. Consequently, the proposed MFSVR with ε-FCM outperforms the proposed MFSVR with FCM, and this tendency becomes much clearer as the intensity of the noise and outliers increases.

D. OTHER DATASET
To evaluate the performance of the proposed models and the effect of the composite kernel function, different algorithms are compared on 14 well-known benchmark datasets. To further analyze whether the proposed model is statistically significantly better than the other comparative models, we use the Bonferroni-Dunn test in Table 20, which fits situations where all models are compared only to the control model and not to each other. The performance of any two models is significantly different if the corresponding average ranks differ by at least the critical difference (CD). At p = 0.10 (significance level), the CD value is 1.70. Table 20 covers the differences in average rank between the five comparative models (weka) and the two proposed models, as well as the comparison with the CD. Since the difference between the average rank of each comparative model and that of the proposed MFSVR with ε-FCM is greater than the CD (4.11 − 1.18 = 2.93 > 1.70), we can conclude that the prediction accuracy of MFSVR is statistically superior to the five comparative models. Furthermore, the proposed MFSVR with ε-FCM is slightly better than the MFSVR with the existing FCM. Table 19 compares the proposed models with the existing SVRs of weka on the 14 datasets. Compared with the three kinds of SVR models, the proposed models achieve a great improvement in prediction accuracy; on 9 of the 14 datasets, the MFSVR based on ε-FCM obtained the best performance.
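The critical-difference computation behind this test follows the standard Bonferroni-Dunn formula CD = q_α · sqrt(k(k + 1)/(6N)), where k is the number of models and N the number of datasets. The sketch below uses an illustrative q value from Demšar's published tables; the exact q_α and k underlying the paper's CD of 1.70 are assumptions, not taken from the text.

```python
import math

def bonferroni_dunn_cd(q_alpha, k, n_datasets):
    """Critical difference for the Bonferroni-Dunn test:
    CD = q_alpha * sqrt(k * (k + 1) / (6 * N)).
    Two models differ significantly if their average ranks
    differ by at least CD."""
    return q_alpha * math.sqrt(k * (k + 1) / (6 * n_datasets))

# e.g. k = 7 models over N = 14 datasets with q_0.10 ≈ 2.394
cd = bonferroni_dunn_cd(2.394, 7, 14)
```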
According to Tables 21 and 22, the experiments are carried out under noise and in the presence of outliers. Consequently, the proposed MFSVR with ε-FCM outperforms the MFSVR with FCM, and this tendency becomes much clearer as the noise and outliers increase; that is, the effect of ε-FCM grows with the level of noise and outliers.

VII. CONCLUSION
In this study, a multiple fuzzy support vector regression (MFSVR) is introduced with the aid of a composite kernel function and ε-FCM. First, ε-FCM is used to partition the training dataset into several subsets as the preprocessing step of the proposed modeling. Second, the composite kernel based on a multiple linear kernel expression is considered to avoid the overfitting problem. In more detail, each training data point is assigned corresponding membership degrees during preprocessing; data points far away from the decision boundary receive lower membership degrees and thus contribute less to the composite kernel function. Then, the composite kernel function for the multiple local SVRs is constructed according to the distribution structure of the training data.
The performance of the two proposed models is very similar on the original data, while under noise the approximation and generalization abilities of the MFSVR with ε-FCM are largely improved in comparison with those of the MFSVR with FCM. Consequently, the proposed MFSVR with ε-FCM outperforms the MFSVR with the existing FCM as the noise and outlier levels increase; the benefit of ε-FCM grows with the amount of noise and outliers. In future work, type-2 fuzzy sets and their impact on the robustness of the model could be investigated.