Optimized ANFIS Model Using Hybrid Metaheuristic Algorithms for Parkinson’s Disease Prediction in IoT Environment

In recent years, telemonitoring and telediagnostic devices for evaluating and tracking Parkinson’s disease (PD) have become increasingly important. Early detection of PD improves the consistency of patient treatment and helps an experienced clinician reach a rapid diagnostic decision. This paper proposes a fog-based ANFIS+PSOGWO model for Parkinson’s disease prediction. The proposed model exploits the advantages of grey wolf optimization (GWO) and particle swarm optimization (PSO) to adjust the parameters of the adaptive neuro-fuzzy inference system (ANFIS), with a chaotic tent map used for initialization. Fog processing is used to gather and analyze the data at the edge gateways and to notify the local community instantly. The model was evaluated on five standard datasets from the UCI machine learning repository using several metrics: the root mean square error (RMSE), the mean square error (MSE), the standard deviation (SD), and the accuracy. The results demonstrate the superiority of the proposed model over grey wolf optimization (GWO), particle swarm optimization (PSO), differential evolution (DE), the genetic algorithm (GA), ant colony optimization (ACO), and the standard ANFIS model. Moreover, the proposed ANFIS+PSOGWO was applied to Parkinson’s disease prediction and achieved an accuracy of 87.5%. It produced better outcomes than PSO, GWO, GA, ACO, DE, and some recent work in the literature on Parkinson’s disease prediction, outperforming its closest competitor among all algorithms by 7.3% in accuracy.


I. INTRODUCTION
The Internet of Things, or IoT, is a world that is full of sensors and actuators, robots, and computers that are connected and able to communicate with larger networks. IoT can transform the routine, regular devices into intelligent devices. These digitally connected devices in the medical and healthcare sector are receiving a lot of traction [1]. Integrating IoT with medical applications has increased the efficiency of remote health monitoring systems for the elderly or chronically affected patients in need of long-term personal service [2].
Medical sensors and wearable devices in IoT health systems continually produce a vast amount of data. Because IoT sensors generate data at high speed, IoT-based health monitoring systems also produce very large volumes of data. This massive amount of IoT-generated data can be analyzed using machine learning techniques, which support prediction, anomaly detection, and data classification. Classification is the process of making a decision on a disease, that is, identifying the class to which a particular disease belongs [3].
The associate editor coordinating the review of this manuscript and approving it for publication was Huaqing Li.
IoT-cloud-based systems have encountered several problems, such as latency constraints, data processing costs, and a lack of location awareness [4]. Cisco introduced a data processing paradigm named fog computing [5]. Fog computing overcomes these barriers by processing data at the edge of the network and providing immediate feedback to the local community [6]. Effective cooperation between fog computing and IoT-enabled technology offers several advantages, such as improved quality of service (QoS) in terms of reduced data traffic, low response time, scalability, location awareness, better user experience, and lower bandwidth requirements [7].
Parkinson's disease (PD) is the second most prevalent neurodegenerative disorder after Alzheimer's disease, affecting about 3 percent of the population over 65 years old [8]. It is a crippling neurological disorder whose symptoms (tremors, stiffness, and difficulty in walking) deteriorate over time [9]. Early detection of PD improves the quality of the patient's health and makes it easier for a knowledgeable practitioner to reach a quick diagnostic judgement.
Robust and accurate clinical informatics systems are required to identify PD patients so that they receive an early diagnosis and timely treatment that improves the course of the disease. Such systems also aim to reduce the clinical workload [10]. The mechanisms for detecting PD depend on determining the nature of the symptoms using various device-based techniques. Vocal problems are among the most common symptoms, since most patients experience vocal abnormalities in the early stages of the disease. Consequently, voice-based health systems have a dominant position in recent PD detection research [11]. Results from PD telehealth analysis indicate that feature extraction and learning algorithms have a significant impact on the consistency and efficiency of the proposed system [12], [13].
Machine learning (ML) is one of the promising techniques that play a significant role in prediction. ML, especially when combined with data mining techniques, is concerned with developing algorithms that learn patterns from known data to form a model and then apply the model to unseen data to predict the result [14]. Machine learning and deep learning techniques have therefore been widely applied to Parkinson's disease prediction to improve predictive performance, as in [15]-[18]. These studies used techniques such as the multi-layer perceptron (MLP), support vector machine (SVM), artificial neural networks (ANN), and random forest (RF) for Parkinson's disease prediction and diagnosis.
Although these individual non-linear ML techniques perform better than classic models, they suffer from overfitting and parameter optimization problems. Therefore, hybrid models have been introduced to increase predictive accuracy and overcome the weaknesses of the individual models [19]-[21].
The adaptive neuro-fuzzy inference system (ANFIS) is a soft computing approach that combines the strengths of fuzzy inference mechanisms and artificial neural networks (ANN). ANFIS offers strong generalization capacity with a quick and accurate learning process [22], [23]. We therefore use ANFIS to address Parkinson's disease prediction. However, training the ANFIS parameters is a critical problem in real-world implementations. The main concern of researchers designing an ANFIS model is to update its parameters so that improved precision is achieved efficiently. Several methods have been developed for training these parameters; they are generally classified as deterministic or probabilistic.
Deterministic techniques, including gradient descent (GD) and the least square estimator (LSE), are slow and fail to converge in some cases. In contrast, metaheuristic algorithms are population-based and capable of global search; each individual in the population represents a potential solution. Moreover, standard ANFIS training relies on gradient descent, which computes the gradient at each step through the chain rule and therefore tends to become trapped in one of many local optima.
To address these issues, optimization algorithms [24] have been applied, including differential evolution (DE), particle swarm optimization (PSO), the genetic algorithm (GA), and grey wolf optimization (GWO) [25]-[27]. In this paper, we propose a model for optimizing the ANFIS parameters using a hybrid of grey wolf optimization (GWO) and particle swarm optimization (PSO). The proposed optimization model combines the exploration capabilities of GWO with the exploitation capabilities of PSO.
The contributions of this study comprise three parts. First, fog computing is used to reduce latency, as required in many real-time applications, especially healthcare systems. The fog provides several benefits in the proposed framework. It takes the main role in extracting features from IoT sensor data and provides the principal functions needed for data pre-processing. Moreover, the fog provides an expert system based on the model retrieved from the cloud. Once a new model is available in the cloud, a copy is sent to the fog. Accordingly, the fog keeps an up-to-date model for providing users with advice or alerts on abnormal situations.
Second, an ANFIS+PSOGWO model is proposed. This model combines the GWO algorithm with the PSO concept to adjust the parameters of the adaptive neuro-fuzzy inference system (ANFIS). The PSOGWO algorithm is used in the ANFIS training procedure as a parameter adaptation technique. The adaptive parameters are located in the fuzzification layer (premise parameters) and the defuzzification layer (consequent parameters). PSOGWO starts with an initial solution generated using a chaotic tent map rather than random initialization. PSO contributes its exploitation abilities, and GWO contributes its exploration capabilities. To evaluate the efficiency of the proposed PSOGWO model for ANFIS parameter optimization, five datasets were used with evaluation criteria including the root mean square error (RMSE), the mean square error (MSE), the accuracy, and the standard deviation (SD). The proposed model (ANFIS+PSOGWO) outperformed all compared methods: differential evolution (DE), grey wolf optimization (GWO), the genetic algorithm (GA), particle swarm optimization (PSO), and ant colony optimization (ACO).
VOLUME 8, 2020
Finally, the proposed model is applied to Parkinson's disease prediction. Based on the results, it achieved the lowest error rate in training and testing the ANFIS model compared with the other optimization methods. The main contributions of this paper are summarized in the following points:
• A framework for Parkinson's disease monitoring and prediction.
• An improved model for ANFIS's parameters optimization using a hybrid of PSO and GWO (PSOGWO).
• Utilizing the chaotic tent map for the initial population in PSOGWO.
• Comparing the proposed ANFIS+PSOGWO with some recent related work for Parkinson's disease prediction.
The remainder of this paper is organized as follows. Section 2 summarizes related research on fog-cloud infrastructures, the ANFIS model, and Parkinson's disease. Section 3 outlines the basic concepts of the adaptive neuro-fuzzy inference system (ANFIS), standard particle swarm optimization (PSO), and the grey wolf optimizer (GWO) algorithm. Section 4 describes in detail the structure and integration of the proposed algorithm. The analyses and the corresponding results are presented in Section 5. Finally, Section 6 presents further proposals and future work.

II. LITERATURE REVIEW
A. ANFIS OPTIMIZATION
ANFIS provides all the benefits of neural networks, as well as fuzzy systems. Training of ANFIS parameters is, however, one of the main problems when applied to real-world applications. Many previous studies involved strategies addressing the issue of ANFIS training based on different algorithms like genetic algorithm (GA), particle swarm optimization (PSO), and grey wolf optimization (GWO).
Based on PSO, Lin et al. [28] proposed an approach for training ANFIS parameters that uses quantum-behaved particle swarm optimization (QPSO) to tune the ANFIS parameter settings. The consequent parameters are determined by the least square estimate (LSE), while the premise parameters are adjusted by the QPSO algorithm. Hasanipanah et al. [29] proposed a rock fragmentation forecasting technique that uses the ANFIS learning structure in conjunction with PSO as the parameter optimization method. Their model proved its efficiency compared with support vector machines (SVM), ANFIS, and non-linear multiple regression (MR) models.
Aghelpour et al. [30] developed an adaptive neuro-fuzzy inference system (ANFIS) model combined with bio-inspired optimization algorithms for agricultural drought monitoring using the fewest variables at rain-gauge-only sites. They used ANFIS-PSO (ANFIS combined with particle swarm optimization), ANFIS-GA (ANFIS combined with the genetic algorithm), and ANFIS-ACO. The ACO and GA algorithms provided the highest performance for optimizing ANFIS.
On the other hand, many studies have focused on the effect of the genetic algorithm on ANFIS parameter adaptation. Ghose et al. [31] developed non-linear multiple regression (NLMR) and adaptive neuro-fuzzy inference system (ANFIS) models for forecasting runoff from precipitation on river catchments. Both models were used as learning models to predict the output. The genetic algorithm (GA) was then coupled with the NLMR learning model to find the hydrological parameter conditions under which the runoff is maximal, that is, the optimal control factor values that maximize the objective function. Sarkheyli et al. [32] established a modified genetic algorithm (MGA) that uses a different population form to refine the parameters of the ANFIS fuzzy rules and membership functions. For the tunneling process, Elbaz et al. [33] established a multi-objective optimization model for predicting earth pressure balance (EPB) shield performance. The GA is combined with ANFIS in this model and improves the precision of ANFIS by adapting the corresponding parameters through a multi-objective fitness function. The datasets were preprocessed before modelling, and the essential operational parameters were identified through principal component analysis. Moayedi et al. [25] presented a model with two parameter optimization algorithms, particle swarm optimization (PSO) and the genetic algorithm (GA), primarily for calculating the friction strength ratio (α) in driven shafted sections.
GWO has proved its efficiency because of its exploration capabilities, so many previous studies have used GWO to adapt ANFIS parameters. Dehghani et al. [34] developed a model for predicting and modelling the short-term to long-term influential flow rate. ANFIS and GWO are merged to predict the rapid, short-term, and long-term flow rate, and all of the ANFIS parameters are optimized and adjusted by GWO. The authors of [35] developed a model that combines the grey wolf optimization (GWO) algorithm with the adaptive neuro-fuzzy inference system (ANFIS). The model achieved better performance than the neural network, support vector regression (SVR), and standalone ANFIS models. Golafshani et al. [36] proposed a framework for compressive strength prediction that saves cost, energy, and time. They used the GWO algorithm to adjust the initial weights and parameters of the artificial neural network and adaptive neuro-fuzzy inference system techniques.
Many other algorithms have also been used for training the ANFIS model. Bui et al. [37] proposed a metaheuristic method based on the whale optimization algorithm (WOA) for evaluating the 28-day compressive strength of concrete (CSC). The WOA is coupled with a neural network (NN) to optimize its computing parameters. Ant colony optimization (ACO) and the dragonfly algorithm (DA) were also considered for benchmarking. Penghui et al. [38] developed a model for predicting soil temperature (ST). The model hybridizes the adaptive neuro-fuzzy inference system with the mutation salp swarm algorithm (SSA) and the grasshopper optimization algorithm (ANFIS-mSG).
The ANFIS model has proved to be an outstanding statistical tool for different computer applications. According to the literature, ANFIS is a robust and intelligent simulation model because of its ability to compensate for data uncertainty. The main challenge with ANFIS, however, is parameter optimization. Although the traditional ANFIS optimization algorithms produce better estimates than mathematical and computational approaches, they search only for a local optimum.

B. IoT BIG DATA ANALYTICS INFRASTRUCTURES
Big data has different sources and strategies for storage and analysis. It has played a significant role in many decision-making and forecasting areas, for example, business analytics, transport, web advertising, recommendations, healthcare and clinical practice, fraud identification, and tourism commercialization. Sood et al. [39] suggested a cyber-physical system (CPS) for dengue fever (DENF) supported by dew-cloud computing. The system detects DENF and tracks the impact of DENF infection on essential body organs. It uses linear discriminant analysis (LDA)+ANFIS for classification in the dew space to detect DENF and quickly track the likelihood of coronary heart disease (CHD) in users affected by DENF. Vidhya and Shanmugalakshmi [40] examined multiple diseases using a modified adaptive neuro-fuzzy inference system (M-ANFIS). Data integration and feature extraction are used for data preprocessing. Their model combines the closed frequent itemset (CFI), M-ANFIS, and k-medoid clustering.
Manocha et al. [41] suggested a paradigm for measuring patient physiological and environmental factors to forecast health adversity induced by generalized anxiety disorder (GAD). They used a hybrid structure consisting of a cloud storage layer and a fog layer. At the fog layer, a weighted-naive Bayes (W-NB) classifier classifies the collected data to forecast abnormal health incidents. The suggested two-phase decision-making approach helps to improve the delivery of the medical resources needed by assessing the risk scale. They also employed an adaptive neuro-fuzzy system with a genetic algorithm in the cloud to calculate health vulnerability indices. Khanna and Sachdeva [42] introduced a framework for forecasting and predicting floods. The structure uses fog computing, mobile edge computing, and cloud computing, together with a sensing network based on IOS. They used the adaptive neuro-fuzzy inference system for flood prediction.

C. PARKINSON'S DISEASE (PD)
Recent studies of Parkinson's disease mechanisms have generated significant insight and, at the same time, challenged traditional conceptual frameworks. Tsanas et al. [43] explained how precise speech signal processing algorithms can be used to discriminate PD subjects from healthy controls. For feature selection, they applied minimum redundancy maximum relevance (MRMR), local learning-based feature selection (LLBFS), RELIEF, and the least absolute shrinkage and selection operator (LASSO) over multiple folds of the data. The reduced data were fed to support vector machine and random forest classifiers. The results showed the superiority of support vector machines with RELIEF.
Parisi et al. [44] designed a system for rapid PD diagnosis using a hybrid artificial intelligence classifier based on the Lagrangian support vector machine (LSVM) and the multi-layer perceptron (MLP). The data were obtained from the University of California-Irvine (UCI) machine learning repository and comprised sixty-eight assessment instances and clinical abstract. Sakar et al. [45] applied a feature extraction technique, the tunable Q-factor wavelet transform (TQWT), to the voice signals of PD patients. The feature subsets were given to different classifiers, and ensemble learning was used. Gunduz [46] proposed two architectures based on convolutional neural networks that use sets of vocal (speech) features for Parkinson's disease (PD) classification.

A. ADAPTIVE NEURO-FUZZY INFERENCE SYSTEM (ANFIS)
ANFIS is a practical AI method developed by Jang [47] that imitates human reasoning to solve imprecise problems. ANFIS is a simple data-learning technique that uses fuzzy logic to transform inputs, through highly interconnected neural network processing elements and information connections, into the desired output. Since ANFIS incorporates both the ANN and the fuzzy inference method, it can handle non-linear and complex problems within a single structure. ANFIS is a general and efficient function approximator in which the relationship between the input and output variables of the problem is expressed as a set of if-then rules. ANFIS usually includes five layers: fuzzification, product, normalization, defuzzification, and summation.
The ANFIS network architecture consists of two node types: fixed and adaptable. Each layer is composed of multiple nodes defined by a node function. The efficiency of the network depends primarily on the adaptable parameters in the nodes. The learning rules adjust these parameters to minimize the error between the actual and the desired output. Fig. 1 shows the ANFIS architecture with two inputs and one output [48]. To explain the structure of ANFIS, two rules of the Takagi-Sugeno fuzzy inference type are considered:

$$\text{Rule 1: IF } x \text{ is } A_1 \text{ AND } y \text{ is } B_1 \text{ THEN } f_1 = p_1 x + q_1 y + r_1 \tag{1}$$

$$\text{Rule 2: IF } x \text{ is } A_2 \text{ AND } y \text{ is } B_2 \text{ THEN } f_2 = p_2 x + q_2 y + r_2 \tag{2}$$

where $A_1$, $A_2$, $B_1$, and $B_2$ are fuzzy sets for the inputs $x$ and $y$, and $p_1$, $p_2$, $q_1$, $q_2$, $r_1$, and $r_2$ are the parameters of the defuzzification layer. The ANFIS model output is $f$. Parameters in the IF part are referred to as antecedent or premise parameters; parameters in the THEN part are known as consequent parameters. The nodes of layer 1 (the premise part) and layer 4 (the consequent part) are adaptable, while the nodes of layer 2 (the product) and layer 3 (the normalization) are fixed. As in Fig. 1, the five-layer ANFIS model with two inputs and one output is described in the following steps [36], [49]-[51]:

Layer 1: the fuzzification layer. Every node $i$ in this layer is an adaptive membership function. Any parameterized function can serve as the membership function of a linguistic label or fuzzy set, for example the generalized bell, trapezoidal, triangular, or Gaussian function. The Gaussian membership function is described by a pair of parameters $(c, \sigma)$:

$$y_i^{(1)} = \mu_{A_i}(x) = \exp\left( -\frac{(x - c_i)^2}{2\sigma_i^2} \right) \tag{3}$$

where the Gaussian membership function is controlled by its center $c$ and width $\sigma$, which are often referred to as premise or antecedent parameters, and $y_i^{(1)}$ is the output of this layer.

Layer 2: the product layer. Each node in this layer corresponds to a single Sugeno-type fuzzy rule. It collects inputs from the respective fuzzification neurons and determines the firing strength of the rule it represents. The output of neuron $i$ in layer 2 is obtained as

$$y_i^{(2)} = \prod_{j} x_{ji}^{(2)} \tag{4}$$

where $x_{ji}^{(2)}$ is the input from neuron $j$ in layer 1 to neuron $i$ in layer 2, and $y_i^{(2)}$ is the output of neuron $i$ in the product layer.

Layer 3: the normalization layer. Each node in this layer receives inputs from all the product layer neurons and computes the normalized firing strength of a given rule. The output of neuron $i$ in this layer is defined as

$$y_i^{(3)} = \frac{x_{ii}^{(3)}}{\sum_{j} x_{ji}^{(3)}} \tag{5}$$

where $x_{ji}^{(3)}$ is the input from neuron $j$ in the product layer to neuron $i$ in the normalization layer, and $y_i^{(3)}$ denotes the output of layer 3.

Layer 4: the defuzzification layer. The defuzzification nodes are adaptable. Each neuron in this layer computes the weighted consequent value of its rule as

$$y_i^{(4)} = x_i^{(4)} \left( k_{i0} + k_{i1} x + k_{i2} y \right) \tag{6}$$

where $x_i^{(4)}$ is the layer 4 input, $y_i^{(4)}$ is the layer 4 output, $x$ and $y$ without superscripts are the model inputs, and $k_{i0}$, $k_{i1}$, $k_{i2}$ are the consequent parameters of rule $i$ (corresponding to $r_i$, $p_i$, and $q_i$ in rules (1) and (2)).

Layer 5: the summation layer. The output of this layer is the overall output of the model, which aggregates all outputs of the previous layer:

$$f = \sum_{i} y_i^{(4)} \tag{7}$$
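The five layers above can be traced in a few lines of code. The following is a minimal sketch, assuming the two-rule Takagi-Sugeno structure with Gaussian membership functions described in this section; the function names `gaussian_mf` and `anfis_forward` and the dictionary layout of the parameters are illustrative choices, not part of the original model description.

```python
import math

def gaussian_mf(value, c, sigma):
    """Layer 1: Gaussian membership degree for center c and width sigma."""
    return math.exp(-((value - c) ** 2) / (2.0 * sigma ** 2))

def anfis_forward(x, y, premise, consequent):
    """Forward pass of a two-input, two-rule first-order Sugeno ANFIS.

    premise:    {"A": [(c, sigma), (c, sigma)], "B": [(c, sigma), (c, sigma)]}
    consequent: [(p1, q1, r1), (p2, q2, r2)]
    Assumes at least one rule fires (the sum of firing strengths is non-zero).
    """
    # Layer 1: fuzzification of both inputs
    mu_a = [gaussian_mf(x, c, s) for c, s in premise["A"]]
    mu_b = [gaussian_mf(y, c, s) for c, s in premise["B"]]
    # Layer 2: firing strength of each rule (product)
    w = [mu_a[i] * mu_b[i] for i in range(2)]
    # Layer 3: normalized firing strengths
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: weighted first-order consequents
    rule_out = [w_bar[i] * (p * x + q * y + r)
                for i, (p, q, r) in enumerate(consequent)]
    # Layer 5: overall output
    return sum(rule_out), w_bar
```

For instance, with both rules firing equally, the output is simply the average of the two rule consequents, which makes the effect of the normalization layer easy to check by hand.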
The ANFIS learning process updates its adaptable parameters using a two-pass learning algorithm with a forward pass and a backward pass. ANFIS is trained with a hybrid of gradient descent (GD) and the least square estimator (LSE) to minimize the error between the actual and desired output, as shown in Table 1. In the forward pass, node outputs propagate from layer 1 to layer 4, and the consequent parameters are determined by least squares. In the backward pass, the error signals propagate from the output layer back to the input layer, and the premise parameters are updated by the GD algorithm. In this way, the network learns parameter values that fit the training data sufficiently well.
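The reason least squares applies in the forward pass is that, once the premise parameters are fixed, the model output is linear in the consequent parameters. The derivation below sketches this for the two-rule case. The symbols $N$ (number of training samples), $d^{(k)}$ (desired output for sample $k$), $\bar{w}_i^{(k)}$ (normalized firing strength from layer 3), $\Phi$, and $\theta$ are notation assumed here for illustration.

```latex
% Output for training sample k, with premise parameters held fixed
\[
f^{(k)} = \sum_{i=1}^{2} \bar{w}_i^{(k)} \left( p_i x^{(k)} + q_i y^{(k)} + r_i \right)
\]
% Stacking all N samples gives a linear system in the consequent parameters
\[
\Phi\,\theta = \mathbf{d},
\qquad
\theta = \begin{bmatrix} p_1 & q_1 & r_1 & p_2 & q_2 & r_2 \end{bmatrix}^{\top},
\qquad
\mathbf{d} = \begin{bmatrix} d^{(1)} & \cdots & d^{(N)} \end{bmatrix}^{\top}
\]
% Row k of the N-by-6 regressor matrix
\[
\Phi_{k,:} = \begin{bmatrix}
\bar{w}_1^{(k)} x^{(k)} & \bar{w}_1^{(k)} y^{(k)} & \bar{w}_1^{(k)} &
\bar{w}_2^{(k)} x^{(k)} & \bar{w}_2^{(k)} y^{(k)} & \bar{w}_2^{(k)}
\end{bmatrix}
\]
% Least-squares estimate of the consequent parameters
\[
\hat{\theta} = \left( \Phi^{\top} \Phi \right)^{-1} \Phi^{\top} \mathbf{d}
\]
```

The premise parameters $(c, \sigma)$ enter the output non-linearly through the membership functions, which is why they are left to gradient descent in the backward pass.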

B. PARTICLE SWARM OPTIMIZATION (PSO)
Kennedy and Eberhart first introduced the PSO algorithm, which is still commonly used because it is a simple and effective approach to non-linear optimization [52], [53]. Rather than using evolutionary operators to manipulate the individuals, as in other evolutionary computation algorithms, each individual in PSO flies through the search space with a velocity that is adjusted dynamically according to the particle's own best position (pbest) and the best position of its companions (gbest). Each particle has a position vector and a velocity vector. The $i$th particle can be represented as a $d$-dimensional vector $x_i = (x_{i1}, x_{i2}, x_{i3}, \ldots, x_{id})$. The particle's velocity and position are updated by the following equations:

$$v_i(t+1) = w\, v_i(t) + c_1\, rand_1 \left( pbest_i - x_i(t) \right) + c_2\, rand_2 \left( gbest - x_i(t) \right) \tag{8}$$

$$x_i(t+1) = x_i(t) + v_i(t+1) \tag{9}$$

where $rand_1$ and $rand_2$ are random values drawn from the range [0, 1], $t$ indicates the current iteration, and $i$ denotes the particle number. The constants $c_1$ and $c_2$ are the cognitive and social components, respectively. The inertia weight $w$ scales the previous velocity of the particle. The pseudocode is given in Algorithm 1.
Algorithm 1: PSO (partial listing)
Output: optimal solution (gbest)
...
Fitness evaluation of the particle swarm
...
Find pbest (the best solution of each particle)
Find gbest (the best for all particles)
...
End for
Update w by some corresponding strategy
...
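The PSO loop can be sketched compactly in code. The version below is an illustrative implementation under stated assumptions, not the authors' code: it minimizes the fitness, keeps the inertia weight `w` constant instead of updating it per iteration, clamps positions to the search bounds, and uses parameter values (`w=0.7`, `c1=c2=1.5`) chosen only for the example.

```python
import random

def pso(fitness, dim, bounds, n_particles=20, n_iter=100,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `fitness` over a box with a basic particle swarm optimizer."""
    rng = random.Random(seed)
    lo, hi = bounds
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in x]                      # best position per particle
    pbest_fit = [fitness(p) for p in x]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]   # best position of the swarm
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + cognitive pull + social pull
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (pbest[i][d] - x[i][d])
                           + c2 * r2 * (gbest[d] - x[i][d]))
                # Position update, clamped to the search bounds (assumption)
                x[i][d] = min(hi, max(lo, x[i][d] + v[i][d]))
            f = fitness(x[i])
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = x[i][:], f
                if f < gbest_fit:
                    gbest, gbest_fit = x[i][:], f
    return gbest, gbest_fit
```

A call such as `pso(lambda p: sum(v * v for v in p), dim=2, bounds=(-5.0, 5.0))` returns the best position found and its fitness; for ANFIS training the fitness would instead be a prediction error over the training set.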

C. GREY WOLF OPTIMIZATION (GWO)
Grey wolves are apex predators at the top of the food chain. They usually live in packs, with an average group size of 5 to 12. Of particular significance is their very rigid social dominance hierarchy, as seen in Fig. 2 [54]. A grey wolf pack is made up of four levels named alpha (α), beta (β), delta (δ), and omega (ω). The alphas lead the pack and are responsible for making decisions (e.g., about hunting, sleeping place, and wake-up time). The betas are subordinate wolves that support the alpha in decision-making and other activities. The omega plays the role of scapegoat: omega wolves must always submit to all the other dominant wolves.
The deltas must submit to the alphas and betas, but they dominate the omegas. Because of the hunting strategy they use to capture prey, grey wolves are regarded as among the most sophisticated hunters, since every member of the pack stays within a rigorously structured group. The GWO algorithm was therefore developed to imitate the behavior of grey wolves, inspired by their innate skills in searching for, encircling, and attacking prey.
In the GWO method, the first, second, and third best solutions are alpha (α), beta (β), and delta (δ), respectively. The remaining candidate solutions are considered omega (ω). The wolves encircle their prey during the hunting process, which is modelled mathematically as follows:

$$\vec{D} = \left| \vec{C} \cdot \vec{X}_P(t) - \vec{X}(t) \right| \tag{10}$$

$$\vec{X}(t+1) = \vec{X}_P(t) - \vec{A} \cdot \vec{D} \tag{11}$$

where $\vec{X}(t)$ and $\vec{X}_P$ represent the positions of the grey wolf and the prey, respectively, at iteration $t$. $\vec{A}$ and $\vec{C}$ are coefficient vectors described by the following equations:

$$\vec{A} = 2\vec{a} \cdot \overrightarrow{rand_1} - \vec{a} \tag{12}$$

$$\vec{C} = 2 \cdot \overrightarrow{rand_2} \tag{13}$$

where $\overrightarrow{rand_1}$ and $\overrightarrow{rand_2}$ are random vectors in [0, 1]. $\vec{a}$ is decreased linearly from 2 to 0 by the following equation:

$$\vec{a} = 2 - t \cdot \frac{2}{T_{total}} \tag{14}$$

where $T_{total}$ is the total number of iterations. The location of the grey wolf $(X, Y)$ can be updated according to the location of the prey $(X^*, Y^*)$. The position of the grey wolf relative to the best solution can be varied by adjusting the $\vec{A}$ and $\vec{C}$ vectors. Fig. 3 displays a 2D map of the grey wolf's possible next positions.
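Alpha (α), beta (β), and delta (δ) are assumed to have the best knowledge of the potential position of the prey, which is used to simulate the hunting behavior. The remaining wolves update their positions according to the three best solutions, alpha, beta, and delta ($\vec{X}_1$, $\vec{X}_2$, and $\vec{X}_3$), as in Fig. 4. The position updating process is represented mathematically by the following equations:

$$\vec{X}(t+1) = \frac{\vec{X}_1 + \vec{X}_2 + \vec{X}_3}{3} \tag{15}$$

$$\vec{X}_1 = \vec{X}_\alpha - \vec{A}_1 \cdot \vec{D}_\alpha \tag{16}$$

$$\vec{X}_2 = \vec{X}_\beta - \vec{A}_2 \cdot \vec{D}_\beta \tag{17}$$

$$\vec{X}_3 = \vec{X}_\delta - \vec{A}_3 \cdot \vec{D}_\delta \tag{18}$$

where

$$\vec{D}_\alpha = \left| \vec{C}_1 \cdot \vec{X}_\alpha - \vec{X} \right| \tag{19}$$

$$\vec{D}_\beta = \left| \vec{C}_2 \cdot \vec{X}_\beta - \vec{X} \right| \tag{20}$$

$$\vec{D}_\delta = \left| \vec{C}_3 \cdot \vec{X}_\delta - \vec{X} \right| \tag{21}$$

When attacking, $|\vec{A}| < 1$, so a wolf's next position lies between its current position and the position of the prey, and the alpha (α), beta (β), and delta (δ) wolves close in on the prey. When searching for prey, $\vec{A}$ takes random values greater than 1 or less than -1, which diverts the wolves away from the prey. All stages are defined in Algorithm 2 [55].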
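The encircling and hunting equations translate directly into a short optimizer. The sketch below is an illustration under assumptions rather than the reference implementation: it minimizes the fitness, keeps the three best solutions found so far as alpha, beta, and delta, and clamps positions to the search bounds. The helper name `top_three` is hypothetical.

```python
import random

def top_three(leaders, wolves, fitness):
    """Return the three best (fitness, position) pairs seen so far."""
    scored = leaders + [(fitness(w), list(w)) for w in wolves]
    scored.sort(key=lambda pair: pair[0])
    return scored[:3]

def gwo(fitness, dim, bounds, n_wolves=20, n_iter=100, seed=0):
    """Minimize `fitness` with a basic grey wolf optimizer (needs n_wolves >= 3)."""
    rng = random.Random(seed)
    lo, hi = bounds
    wolves = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_wolves)]
    leaders = []
    for t in range(n_iter):
        leaders = top_three(leaders, wolves, fitness)  # alpha, beta, delta
        a = 2.0 - t * (2.0 / n_iter)                   # linear decrease from 2 toward 0
        for i in range(n_wolves):
            new_pos = []
            for d in range(dim):
                estimates = []
                for _, leader in leaders:
                    A = 2.0 * a * rng.random() - a     # coefficient A
                    C = 2.0 * rng.random()             # coefficient C
                    D = abs(C * leader[d] - wolves[i][d])
                    estimates.append(leader[d] - A * D)
                # Average of the three leader-guided estimates, clamped to bounds
                x_new = sum(estimates) / 3.0
                new_pos.append(min(hi, max(lo, x_new)))
            wolves[i] = new_pos
    alpha_fit, alpha_pos = top_three(leaders, wolves, fitness)[0]
    return alpha_pos, alpha_fit
```

Early iterations (large `a`) allow `|A| > 1`, which pushes wolves away from the leaders and supports exploration; later iterations shrink `a`, so the pack converges on the best solutions found.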

IV. THE PROPOSED METHODOLOGY
A. THE GENERAL ARCHITECTURE OF IoT-BASED PD PATIENT PREDICTION
Algorithm 2: GWO (partial listing)
...
Modify all the positions of wolves by (15)
End for
End for
Evaluate the value of the fitness function and then update α, β, δ
...
End
End

The general infrastructure for Parkinson's disease processing and prediction is presented in Fig. 5. The framework consists of three layers, namely the IoT layer, the fog layer, and the cloud layer.
1) The IoT layer: the data and service transmission scenario starts at this first layer, where the sensors are set up and the data are collected. The IoT layer is responsible for collecting environmental and behavioral data by integrating multiple wearable smart sensors and for generating real-time data for analysis.
2) The Fog layer: the fog layer is responsible for storing data that arrive at the fog when there is no cloud link. It also holds a local personalized classification model that predicts the patient's health status locally in the event of internet disruption. The fog thus provides an expert system for real-time user queries and issues alerts in emergency conditions. The fog layer can also perform simple pre-processing tasks. Data acquisition involves sampling the sensor data that measure physical conditions in the real world and transforming the samples into digital numerical values that a computer can manipulate. Data preprocessing involves several tasks such as data transformation, data reduction, data integration, and data cleaning [56]. The choice of a suitable preprocessing task depends on the nature of the collected data and the analysis process.
3) The Cloud layer: the cloud layer has many responsibilities because of its capabilities for data storage and processing. Since the fog has limited storage capacity, the cloud layer performs data storage and management functions for long-term processing. ANFIS model training and optimization for Parkinson's disease analysis and prediction are executed in the cloud.
As mentioned above, the fog provides an expert system. According to the proposed architecture, the user of the system may be the patient, a caregiver, a doctor, a family member, or an emergency system that calls an ambulance or requests immediate intervention. The following subsections (B and C) address the suggested particle swarm optimization with grey wolf optimization (PSOGWO) algorithm for ANFIS model optimization.
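The division of work between the fog and the cloud can be illustrated with a small sketch. Everything in it is hypothetical: the class name `FogNode`, the alert threshold, and the representation of the model as a plain callable are assumptions made for the example, not details given in the framework description.

```python
class FogNode:
    """Hypothetical fog gateway: caches the latest cloud model and buffers data offline."""

    def __init__(self, model, alert_threshold=0.5):
        self.model = model                    # callable: features -> score in [0, 1]
        self.alert_threshold = alert_threshold
        self.buffer = []                      # readings waiting for a cloud link

    def update_model(self, new_model):
        """Replace the cached model when the cloud publishes a retrained one."""
        self.model = new_model

    def handle(self, features, cloud_online):
        """Score one reading locally, then forward it or buffer it for the cloud."""
        score = self.model(features)
        uploaded = []
        if cloud_online:
            # Flush everything buffered during the outage plus the new reading
            uploaded = self.buffer + [features]
            self.buffer = []
        else:
            self.buffer.append(features)
        return {"score": score,
                "alert": score >= self.alert_threshold,
                "uploaded": uploaded}
```

The point of the sketch is that prediction and alerting never depend on the cloud link: the fog answers from its cached model, while storage for long-term processing is deferred until the link returns.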

B. THE PSOGWO ALGORITHM
The grey wolf optimization algorithm shows strong convergence properties on several standard test functions. In many practical problems, GWO is simple to operate and implement and works with few parameters. However, the GWO position updating process takes into account only the first, second, and third best solutions and neglects the best position each wolf has found during its own search. Therefore, the concept of the PSO algorithm is used to improve the position updating cycle. In PSO, the current location of a particle is updated according to the best position found by that particle and the best position found by the group. The hybrid thus combines the exploitation ability of particle swarm optimization (PSO) with the exploration ability of the grey wolf optimizer (GWO) [57]. We have previously applied a hybrid model that merges the GWO and PSO algorithms for optimal feature selection [58].
The position updating step is modified to take four solutions into account: the three best solutions according to GWO and the best-experienced solution of each particle according to PSO, as shown in Algorithm 3. The updated position equation is as follows:

$$X_i(t+1) = c_1 r_1 \left( w_1 X_1(t) + w_2 X_2(t) + w_3 X_3(t) \right) + c_2 r_2 \left( X_{ibest} - X_i(t) \right) \qquad (22)$$

where $r_1$ and $r_2$ are set randomly within the interval [0, 1], and $c_1$ and $c_2$ are the social and cognitive learning factors. $X_1$, $X_2$, and $X_3$ are the three best solutions determined through equations (16), (17), and (18). $X_{ibest}$ gives the best position of each grey wolf since the beginning of the search, and $w_1$, $w_2$, and $w_3$ are the inertia weight coefficients. Instead of the linear control parameter of equation (14), a non-linear control parameter $a$ is used, which worked better than the linear one. It is defined in terms of $a_{initial}$ and $a_{final}$, the initial and final values of $a$ within the interval [2, 0], the current iteration $t$, and the total number of iterations $T_{max}$.
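A minimal sketch of the position update in equation (22) for a single wolf is given below. The function name `psogwo_update`, the default values of `c1` and `c2`, and the choice to pass the inertia weights in as an argument are assumptions made for illustration; the weights and the three leader-guided candidates are taken as given by the caller.

```python
import random


def psogwo_update(x_i, x_best_i, leaders, weights, c1=0.5, c2=0.5, rng=random):
    """One PSOGWO position update for a single wolf, following equation (22).

    x_i      : current position X_i(t)
    x_best_i : best position this wolf has visited so far (X_ibest)
    leaders  : the three GWO candidate positions X_1, X_2, X_3
    weights  : inertia weights w_1, w_2, w_3 (supplied by the caller)
    c1, c2   : learning factors (default values here are assumptions)
    """
    r1, r2 = rng.random(), rng.random()  # random factors drawn from [0, 1)
    new_position = []
    for d in range(len(x_i)):
        # Weighted combination of the three leader-guided candidates (GWO part).
        social = sum(w * leader[d] for w, leader in zip(weights, leaders))
        # Pull toward this wolf's own best-experienced position (PSO part).
        cognitive = x_best_i[d] - x_i[d]
        new_position.append(c1 * r1 * social + c2 * r2 * cognitive)
    return new_position
```

When a wolf currently sits at its own best position, the second term vanishes and the update is driven entirely by the weighted leader term.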

C. ANFIS ADAPTATION USING THE PSOGWO ALGORITHM
In this paper, PSOGWO is used to adapt both the antecedent (premise) and the consequent parameters of the ANFIS model. ANFIS is classically trained with a hybrid optimization algorithm (GDLSE) that combines two algorithms, namely the least squares estimator (LSE) and gradient descent (GD). In this conventional hybrid method, LSE adjusts the parameters of the then-part (consequent) in the forward pass, while GD adjusts the membership function parameters in the backward pass by means of backpropagation. Getting stuck in local minima is a significant criticism of GD, which can be avoided by metaheuristic algorithms such as the proposed PSOGWO algorithm. Since PSOGWO is computationally less costly and combines exploration and exploitation capacities, it updates the ANFIS parameters more flexibly and more quickly than gradient-based approaches.
[Algorithm listing, recovered steps: calculate X_ibest, the best-experienced solution for each grey wolf; end of the loops.]
In PSOGWO, the location of each particle represents a full set of parameters for the ANFIS system. Each location consists of two parts. The first part holds the parameters of the membership functions. The second part is reserved for the coefficients of the consequent portion of the fuzzy rules, as seen in Fig. 6. Because of the computing effort required by the adaptation process, the total number of modifiable ANFIS parameters is an essential factor in building the ANFIS network. The membership function type should, therefore, be carefully selected. The Gaussian membership function, which takes only two parameters, namely width and center, is preferable to other membership function types in this respect. The overall cycle of the proposed PSOGWO with the ANFIS model is shown in Fig. 7. At the start of the model, the data are split into training and testing sets, each selected randomly.

There are numerous strategies for random initialization, including those based on distributed samples (DS), chaotic maps, and others. Chaotic number sequences have lately been used in many applications instead of random number sequences. The essential characteristics of chaotic motion are randomness, regularity, and ergodicity, which play an important role in solving function optimization problems. These characteristics help the algorithms avoid falling into local optima by preserving an appropriate diversity of the population and enhancing the global search capacity. There are different forms of chaotic maps, such as the logistic map and the tent map. However, the search properties of the various chaotic maps differ [57].
Although the logistic map is used in most of the literature, it can lead to an inhomogeneous distribution of values because its values concentrate in the ranges [0, 0.1] and [0.9, 1]. The tent map is one of the simplest chaotic functions and has been studied recently [59]. Experiments in [60] reveal that the tent map performs much better than the logistic map in traversal homogeneity. It can improve the speed of the algorithm by generating more uniformly distributed initial values in [0, 1]. The convergence of the proposed algorithm depends on the random number sequence used to run the algorithm with different parameters [61]. The chaotic tent map is used to generate the initial population of the proposed model, where it proved more efficient and effective than the traditional methods. The tent map is given by equation (27), where µ is set to 1 + R and R is randomly generated in the range [0, 1]. Starting from any point x_0 of the interval, each iterate x_m produced by the map remains in [0, 1].
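The chaotic initialization can be sketched as follows. This is a minimal illustration that assumes the commonly used piecewise form of the tent map, x -> µx for x < 0.5 and x -> µ(1 - x) otherwise; the exact form of equation (27) may differ. The function names and the linear scaling of the chaotic values to the search bounds are also assumptions.

```python
import random


def tent_map_sequence(x0, length, mu):
    """Iterate the tent map: x -> mu * x if x < 0.5, else mu * (1 - x)."""
    sequence, x = [], x0
    for _ in range(length):
        x = mu * x if x < 0.5 else mu * (1.0 - x)
        sequence.append(x)
    return sequence


def chaotic_init(n_wolves, dim, lower, upper, rng=random):
    """Build an initial population from tent-map sequences scaled to [lower, upper]."""
    population = []
    for _ in range(n_wolves):
        mu = 1.0 + rng.random()   # mu = 1 + R with R drawn from [0, 1)
        x0 = rng.random()         # arbitrary starting point in [0, 1)
        chaos = tent_map_sequence(x0, dim, mu)
        # Map each chaotic value from [0, 1] onto the search interval.
        population.append([lower + c * (upper - lower) for c in chaos])
    return population
```

With µ between 1 and 2, every iterate stays inside [0, 1], so the scaled positions never leave the search bounds.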
The fitness is defined as the mean squared error (MSE) between the target output and the actual output, and can be expressed as:

$$\mathrm{MSE} = \frac{1}{m} \sum_{k=1}^{m} \left( y_{t,k} - y_{A,k} \right)^2$$

where MSE denotes the mean squared error, $y_{t,k}$ is the target output and $y_{A,k}$ is the actual output for the k-th sample, and m represents the size of the dataset.
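The particle encoding and the MSE fitness can be sketched together. The layout below is an assumption made for illustration: a first-order Sugeno system with one Gaussian membership function per input per rule, premise parameters stored first and consequent coefficients second. All function names are hypothetical, and the actual layout in Fig. 6 may differ.

```python
import math


def decode_particle(position, n_inputs, n_rules):
    """Split a flat position vector into premise and consequent parameters."""
    split = n_rules * n_inputs * 2  # two premise values (center, width) per MF
    premise, consequent = [], []
    for r in range(n_rules):
        base = 2 * r * n_inputs
        premise.append([(position[base + 2 * i], position[base + 2 * i + 1])
                        for i in range(n_inputs)])
        start = split + r * (n_inputs + 1)  # n_inputs coefficients plus a bias
        consequent.append(position[start:start + n_inputs + 1])
    return premise, consequent


def gaussian_mf(x, center, width):
    """Gaussian membership degree; a small floor keeps the width non-zero."""
    sigma = max(abs(width), 1e-6)
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def anfis_output(sample, premise, consequent):
    """Firing-strength-weighted average of the linear rule outputs."""
    strengths, outputs = [], []
    for mfs, coeffs in zip(premise, consequent):
        w = 1.0
        for x, (center, width) in zip(sample, mfs):
            w *= gaussian_mf(x, center, width)  # product T-norm over the inputs
        strengths.append(w)
        outputs.append(sum(a * x for a, x in zip(coeffs, sample)) + coeffs[-1])
    total = sum(strengths) or 1e-12             # guard against all-zero strengths
    return sum(w * f for w, f in zip(strengths, outputs)) / total


def mse_fitness(position, inputs, targets, n_inputs, n_rules):
    """Fitness of one particle: MSE between target and actual ANFIS outputs."""
    premise, consequent = decode_particle(position, n_inputs, n_rules)
    errors = [(t - anfis_output(x, premise, consequent)) ** 2
              for x, t in zip(inputs, targets)]
    return sum(errors) / len(errors)
```

Under this layout a particle has `n_rules * n_inputs * 2 + n_rules * (n_inputs + 1)` dimensions, which shows why a two-parameter membership function keeps the search space comparatively small.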
The step-by-step procedure of the proposed PSOGWO+ANFIS is given in Algorithm 4.

V. RESULTS AND DISCUSSION
Two experiments were carried out in this analysis to check the efficiency of ANFIS-PSOGWO. The first experiment evaluates the efficiency and effectiveness of the proposed approach in achieving a minimal error using five datasets obtained from the UCI (University of California, Irvine) repository [62]. The results of this experiment are reported in Section B. The second experiment predicts Parkinson's disease and is described in Section C.

A. PERFORMANCE EVALUATION OF THE MODELS
Several measurements are used to determine the efficiency of the suggested ANFIS-PSOGWO approach and to check the quality of the solutions, described as follows [63]:
1. Mean square error (MSE):

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \bar{y}_i \right)^2$$

where n is the number of data points, $y_i$ are the observed values, and $\bar{y}_i$ are the predicted values.
2. Root mean square error (RMSE):

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \bar{y}_i \right)^2}$$

[Algorithm listing, recovered steps: for j = 1 to d, produce a chaotic tent sequence according to (27); while t < T_max, update the position according to (22).]

Accuracy:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

where TP, FP, FN, and TN are calculated using Table 3.
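The error and accuracy measures can be computed with a few lines of code. The sketch below uses the standard definitions of MSE, RMSE, and confusion-matrix accuracy; the function names are chosen for illustration.

```python
import math


def mse(observed, predicted):
    """Mean square error between observed and predicted values."""
    return sum((y - p) ** 2 for y, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean square error: the square root of the MSE."""
    return math.sqrt(mse(observed, predicted))


def accuracy(tp, tn, fp, fn):
    """Share of correct decisions among all decisions in the confusion matrix."""
    return (tp + tn) / (tp + tn + fp + fn)
```

Because RMSE is expressed in the same units as the target variable, it is often easier to interpret than MSE when comparing models.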

B. ANFIS WITH PSOGWO EVALUATION
1) EXPERIMENTAL ENVIRONMENT AND TOOLS
Five datasets obtained from the UCI repository [62] are used in this section to determine the efficiency of the proposed system (ANFIS-PSOGWO). The overall properties of these datasets are given in Table 4. In this experiment, ANFIS-PSOGWO is compared with different metaheuristic algorithms, namely the differential evolution algorithm (DE) [64], the ant colony optimization algorithm (ACO), the grey wolf optimization algorithm (GWO), the genetic algorithm (GA), and the particle swarm optimization algorithm (PSO), as well as the standard ANFIS structure.
Each dataset is split into 70% for training and the remainder for testing in all experiments. Table 5 shows the parameters of all algorithms. All experiments were conducted on Windows 10 Pro 64-bit with an Intel(R) Core(TM) i7-8550U CPU @ 1.80 GHz (1.99 GHz) and 16 GB of RAM. MATLAB (2018a) is used for all implementations.
The parameters shared by all optimization algorithms are: population size (n) = 25, maximum number of iterations = 100, lower bound = −10, and upper bound = 10. These common settings keep the comparison between the algorithms fair. The values were chosen because they produced good outcomes in previous research [38], [63].
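The experimental setup can be captured in a short sketch. The original experiments were run in MATLAB; the Python below only illustrates the shared settings and a random 70/30 split, and the names `SETTINGS` and `train_test_split` are hypothetical.

```python
import random

# Settings shared by all compared optimizers, as listed in the text.
SETTINGS = {
    "population_size": 25,
    "max_iterations": 100,
    "lower_bound": -10.0,
    "upper_bound": 10.0,
}


def train_test_split(samples, train_fraction=0.7, seed=None):
    """Randomly split samples into a training part and a testing part."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    rng.shuffle(indices)                      # random selection of both parts
    cut = int(round(train_fraction * len(samples)))
    train = [samples[i] for i in indices[:cut]]
    test = [samples[i] for i in indices[cut:]]
    return train, test
```

Fixing the seed makes a split reproducible, which is helpful when the same partition has to be reused across several optimizers.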

2) EXPERIMENT OUTPUT
In this experiment, four metrics are used to evaluate the proposed ANFIS+PSOGWO model in optimizing the ANFIS parameters, namely the root mean square error (RMSE), the accuracy, the standard deviation (SD), and the mean square error (MSE). The tests were run 15 times, and the averages of all measurements are reported in Tables 6-9 and Figures 8-11, respectively. We can infer from these tables and figures that ANFIS-PSOGWO outperformed the other algorithms on the five datasets, with one exception: on the breast cancer dataset, the proposed model achieved better results in the training stage, but ANFIS+ACO outperformed it in testing. In terms of accuracy, shown in Table 9 and Figure 11, the proposed ANFIS+PSOGWO outperformed all algorithms on all datasets used. This experiment supports the effectiveness of our proposal, which can be relied upon to improve the optimization of the ANFIS parameters.

C. PARKINSON'S DISEASE CLASSIFICATION
Parkinson's disease (PD) is a recurrent motor and non-motor neurodegenerative disease [65]. There is increasing interest in finding reliable and effective telemedicine systems for PD diagnosis and monitoring. These systems can extend the lifetime of PD patients with the aid of surgical or pharmacological treatments. Moreover, they decrease the number of uncomfortable physical trips to hospitals for clinical tests and reduce the workload of clinicians [10]. Telemedicine PD programs use many instruments and methods to assess the symptoms. Around 90% of PD patients in the early phases of the disease are affected by vocal impairments, one of the most significant signs.
In this section, the proposed model for Parkinson's disease prediction is introduced. The sequence of the model is shown in Fig. 12. One of the main functions of the fog layer is data preprocessing. The fog keeps temporary storage for the generated data. Moreover, the fog saves a backup of the classification model developed in the cloud in case of an internet connection failure. The classification model saved in the fog serves as an expert system for patients and can raise an alert in abnormal cases. The cloud is used for training the proposed model and for long-term data storage.
The proposed Parkinson's disease prediction consists of two main processes, namely data preprocessing and data classification.
1. Data Preprocessing: data preprocessing is an essential stage in the design of classification models and takes many forms depending on the given problem. Data normalization and feature selection are used in this model to prepare high-quality data and hence obtain high-quality prediction results. The min-max normalization method [66] is used to bring the features within common boundaries. Moreover, correlation feature selection (CFS) [67] is used to pick the most informative features for the diagnosis of Parkinson's disease. CFS is applied using the Weka software tool. CFS reduced the Parkinson's disease dataset from 754 to 119 features. These features are used for training and testing the proposed ANFIS+PSOGWO model. The same environment settings as in the first experiment were used for the Parkinson's disease dataset. Table 11 shows all output results of the proposed Parkinson's disease classification model. Moreover, the proposed ANFIS+PSOGWO is compared with the ANFIS+PSO, ANFIS+GWO, ANFIS+ACO, ANFIS+DE, ANFIS+GA, and standard ANFIS models, as shown in Table 12.

The proposed model is also compared with some recent studies, namely those of Sakar et al. [45] and Gunduz [46]. These studies use the same experimental setup (i.e., the same dataset, training protocol, and evaluation measures) as our analysis and therefore allow a direct comparison with our results. While Sakar et al. [45] reported 86% and Gunduz [46] reported 86.9%, our proposed model outperformed these studies with an accuracy of 87.5%. Table 13 and Fig. 14 show a comparison between the proposed model and the related work.
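The min-max normalization step used in the preprocessing stage can be sketched as below. This is a minimal illustration that scales every feature column to [0, 1]; the target range, the handling of constant columns, and the function name are assumptions, and the CFS step (performed with Weka in the paper) is not reproduced here.

```python
def min_max_normalize(rows):
    """Scale every feature column to [0, 1]; constant columns map to 0."""
    columns = list(zip(*rows))            # transpose rows into feature columns
    lows = [min(column) for column in columns]
    highs = [max(column) for column in columns]
    return [
        [(value - low) / (high - low) if high > low else 0.0
         for value, low, high in zip(row, lows, highs)]
        for row in rows
    ]
```

In a train/test setting, the minimum and maximum are usually taken from the training portion only and then reused for the test portion, so that no information from the test data leaks into preprocessing.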

D. RESULT DISCUSSION
Two experiments were used to evaluate the proposed ANFIS+PSOGWO model. The goal of the first experiment is to evaluate the efficiency of the proposed model. Five datasets and four evaluation metrics are used in the first experiment. The proposed model gave good results on all datasets used. In comparison with the standard ANFIS model, the proposed model achieved the lowest error rate in the training and testing stages together with the highest accuracy. Moreover, different optimization algorithms were evaluated, namely the genetic algorithm (GA), the particle swarm optimization (PSO), the grey wolf optimization (GWO), the ant colony optimization (ACO) algorithm, and the differential evolution (DE) algorithm. The results reveal that the proposed model outperformed all of them in terms of MSE, accuracy, SD, and RMSE. All tests in the first experiment involved the entire feature set of the original dataset in order to isolate the effect of PSOGWO in optimizing the parameters of the ANFIS model.
In the second experiment, the proposed model is employed for Parkinson's disease classification. In this experiment, CFS is used to select the most informative features for the diagnosis of Parkinson's disease. CFS chose 119 features out of 754. The proposed ANFIS+PSOGWO model is applied to the reduced feature subset. The results of the ANFIS+PSOGWO model outperformed all compared models. In terms of accuracy, the proposed model achieved about 10% more than the standard ANFIS model and 7.4% more than the next-best model. To avoid any confusion between the data in the first and second experiments, note that the datasets of the first experiment were used because studies in the literature have reported approximately 90% accurate schemes for distinguishing healthy subjects from PD patients on such limited datasets. In such datasets, the number of subjects is often relatively limited, and the accuracy that complicated models achieve on these small datasets cannot be reproduced on another dataset with a greater number of subjects.

VI. CONCLUSION AND FUTURE WORK
This paper introduces a novel model for extracting clinically useful information for Parkinson's disease (PD) assessment and for using learning algorithms to build reliable decision support systems. In this study, an ANFIS+PSOGWO model for PD prediction using sets of vocal (speech) features in a fog-cloud environment has been introduced. The fog infrastructure provides real-time data processing and analysis and overcomes the limitations of the cloud. The PSOGWO algorithm merges the exploration and exploitation capabilities of grey wolf optimization (GWO) and particle swarm optimization (PSO), respectively. A chaotic tent map for initialization helps prevent the algorithm from getting trapped in local optima. The proposed model was evaluated on different datasets with different metrics to verify its efficiency. The results show that the suggested model compares well with the other algorithms and that it succeeds in predicting Parkinson's disease with an accuracy of 87.5%. In the future, the proposed model will be extended to a broader range of datasets and to other machine learning algorithms that can be trained in the fog.
Hybridization can also be implemented with contemporary metaheuristic algorithms such as the dragonfly optimization algorithm, the whale optimization algorithm, and the chimp optimization algorithm.
IBRAHIM M. EL-HASNONY received the B.S. and master's degrees in information systems from the Faculty of Computers and Information Sciences, Mansoura University, Egypt. His research interests include cloud computing, big data, data analysis, smart city, the Internet of Things, neural networks, artificial intelligence, web service composition, and evolutionary algorithms.

SHERIF I. BARAKAT is currently a Professor with the Information Systems Department, Faculty of Computers and Information, Mansoura University. His research interests include computer architecture, computer networks, data mining, information theory, discrete mathematics, and numerical analysis operation research. He has published more than 60 papers in international and local conferences, journals, and proceedings in different fields of information technology.