Data-Driven AI-Based Parameters Tuning Using Grid Partition Algorithm for Predicting Climatic Effect on Epidemic Diseases

The Adaptive Neuro-Fuzzy Inference System (ANFIS) remains one of the promising AI techniques for handling data over-fitting and improving generalization. Many ANFIS optimization techniques have been combined and found effective to some extent through trial-and-error procedures. In this work, we tune ANFIS using the Grid partition algorithm to handle unseen data effectively with fast convergence. The model is initialized using a careful selection of parameters that discriminate climate conditions: minimum temperature, maximum temperature, average temperature, wind speed and relative humidity. These parameters are used as inputs to ANFIS, whereas the number of confirmed COVID-19 cases reported by the Department of Disease Control (DDC), Thailand, for two consecutive months and the first ten days of December is chosen as the dependent variable. The proposed ANFIS model predicts confirmed cases of COVID-19 with an $R^{2}$ of 0.99. Furthermore, a trend analysis of the data set compares fluctuations of the daily climatic parameters to support our proposition and illustrates the serious effect of these parameters on the spread of the COVID-19 epidemic.


I. INTRODUCTION
The World Health Organization (WHO) reported over one million six hundred thousand confirmed deaths and more than seventy million active confirmed cases, still counting, from two hundred and twenty-two countries and territories [1]. The deaths are due to the outbreak of the novel coronavirus (COVID-19) disease, which was declared a global pandemic on 12th March 2020 [2]. The first case was reported by the health authorities in Wuhan, Hubei Province, China, on 29th December 2019. On 23rd January 2020 the virus forced Chinese officials to impose a total lockdown on the main city of Wuhan before extending the order to neighboring cities. Stay-at-home orders and travel restrictions were imposed inside and outside China [3], [4]. As the virus continued to spread across the globe, Europe became the epicenter as of 23rd March 2020, followed later by the USA and India, despite measures taken by the authorities that included travel bans, social distancing and stay-at-home orders [3]. The outbreak caused global economic hardship, and major gatherings such as the South East Asia Olympics were cancelled. As reported by Tedros and Antonio, the virus causes depression, anxiety, social disorders, and deliberate migration for many people [1]. The pandemic disrupts governments' sustainable human development projects and may have catastrophic future impacts. Health authorities and government officials believe that reliable estimation of the COVID-19 epidemic is the best way to put preventive measures into practice. These proactive measures are believed to reduce the mortality rate and stop the spread of the virus. Authorities have adopted exit strategies to relax lockdown restrictions [5]. Reliable estimation could help the authorities ease the enforced lockdown while avoiding the risk of a second COVID-19 wave; research on reliable prediction of the epidemic therefore remains open. Several studies have assessed the epidemic risk of COVID-19 using the basic reproduction number [6]-[12], and the reported results offer good chances of stopping the spread of the epidemic. However, environmental conditions have a serious impact on the epidemic spread [13] and make the virus more effective under cold climatic conditions.
The associate editor coordinating the review of this manuscript and approving it for publication was Haris Pervaiz.
Works in [13], [14] indicate that respiratory viruses spread faster during winter (they thrive in cold and dry air) but are largely inactive at temperatures above 30 degrees Celsius (30 °C). These findings are contrary to our case studies, in which the virus survives even above 30 °C. Altamimi et al. confirmed that the MERS-CoV coronavirus gets worse around April and August, due to warm temperature, low wind speed, low relative humidity, and a high ultraviolet index [15].
Recently, artificial intelligence (AI) techniques have been extremely successful in jointly learning time-series features and predicting real-life scenarios [16]. However, AI algorithms suffer from numerical instability and from the complex boundary conditions that exist in ill-defined mappings. Instability arises during computation and is tightly coupled with the failure to meet exact predictions. The Adaptive Neuro-Fuzzy Inference System (ANFIS) tries to overcome these hindrances using trial-and-error learning. ANFIS is a machine learning approach that hybridizes Artificial Neural Networks (ANN) and fuzzy logic [17] into a single model to make up for the shortfalls of ANN and fuzzy inference methods. One benefit of the ANFIS architecture is its reliable representation of sophisticated non-linear relationships in real-life cases [16], [18], [19], which overcomes disadvantages of classical approaches. One major challenge of using ANFIS is the difficulty of parameter estimation, which leads to noise in the predicted output. Nature-inspired optimizers [20]-[22] have been widely applied to ANFIS parameter estimation and have demonstrated significant success in diverse real-life applications such as epidemiological disease prediction; see, for example, [23]. The performance of ANFIS can be enhanced using nature-inspired optimization approaches [24]-[26] to predict confirmed cases of COVID-19 from time-series data. However, those methods focus on forecasting the epidemic spread from the daily number of cases, where each day is correlated with the number of cases observed on that day, although Al-qaness et al. [25] trained ANFIS on air quality index time-series data to predict PM2.5 and related its effects to COVID-19 restriction orders. It is therefore desirable to train ANFIS to predict the impact of climate conditions on respiratory viruses such as COVID-19.
For example, Pirouz et al. used artificial intelligence and multi-linear regression methods to demonstrate the feasibility of classifying confirmed cases of COVID-19 with respect to climate conditions for sustainable development [27], [28]. The results show that correlations exist between fluctuations of environmental conditions, such as wind, humidity and average temperature, and the confirmed cases of COVID-19. Their results also show that around April and August the virus spreads rapidly under warm temperature, low wind speed, low relative humidity, and a high ultraviolet index. However, this approach cannot be applied directly to another town or its satellites, because each environment exhibits different weather conditions (that is, a specific model that fits the data of one region may not be suitable for other regions).
Consequently, conventional methods [29], [30] rely on handcrafted features, which might affect the accuracy on the tested samples. Motivated by this literature, in this paper we propose to extend the application of the Adaptive Neuro-Fuzzy Inference System (ANFIS) as an AI model to automatically predict the impact of climatic conditions on confirmed cases of COVID-19 in cities of China and Thailand, and to construct a correlation between the first phase of the virus spread and the new (second) phase according to climate conditions. Moreover, statistical analysis is employed to map the correlation between demographic information and COVID-19 cases, as demographic data (population density, age, elevation and sex ratio) are likely to increase clusters of cases.
During the winter period in Thailand, people like to travel and spend more time outdoors than indoors. This creates opportunities for human-to-human contact and possible exposure, where infected persons can transmit droplets by coughing, sneezing or direct contact, which may increase the number of COVID-19 cases. However, the number of confirmed cases increases not only because of human exposure but also according to climate conditions and season. Characterizing the relationship between wind speed, relative humidity, temperature and COVID-19 may help to understand the virus index and its possible behavior. Although the COVID-19 virus is not self-sustaining, it may cluster in places with high population density, which tends to occur around the winter period. Therefore, a robust model that predicts the number of confirmed COVID-19 cases under given climate conditions is useful.
The algorithms proposed in [27], [28] have a learning capability that multivariate statistical models lack, and they demonstrate promising results; however, the accuracy of these methods needs to be improved. Moreover, conclusions on the effect of climate conditions on the COVID-19 virus need to be extended to different climate regions. We posit that those conclusions might not be suitable for the tropical savanna, tropical monsoon, and tropical rainforest climate regions of Thailand, because different countries exhibit different climate conditions during the study period. To date, there are not many studies in Thailand on the effect of climate on the survival and spread of the COVID-19 virus. Therefore, our paper aims to contribute to the research domain of the COVID-19 pandemic and to predict the relationship between tropical climate conditions and the epidemic spread of the coronavirus in Thailand's different climate zones. This paper has chosen five climate conditions: (a) Temperature (minimum, maximum, and average temperature): many existing studies report that temperature has a significant effect on the spread and survival of the COVID-19 virus [31]. (b) Wind speed: the COVID-19 virus remains active in air for many hours, a clear indication that wind speed has a serious impact on the virus spread [32]. In addition, wind speed is responsible for spreading pollutants in air and is a key parameter in measuring air pollutants such as biological contaminants [32], e.g. an epidemic virus; we have chosen the wind speed parameter to establish this relationship. (c) Humidity: a shift in humidity can seriously affect the spread and survival of the COVID-19 virus [33]. These climate parameters therefore cannot be ignored when looking for a way to control an epidemic virus. Readers are referred to [29]-[31], [34] for details on why temperature, humidity and wind speed are selected as our input variables.
The major contributions of this paper are as follows: (a) To examine the data on confirmed cases of COVID-19, establish the nature of the relationship between increases or decreases in the number of cases and climate conditions, and scan the virus spread pattern according to clustering differences. (b) To predict the effect of climate on epidemic diseases such as COVID-19. (c) To propose an AI model with a good parameter estimation algorithm capable of overcoming the generalization and over-fitting problems as well as the volatility of conventional ANFIS.

II. MATERIALS AND METHODS
In this paper, we analyze the correlations between the variations in confirmed COVID-19 cases and five critical environmental factors, namely minimum temperature, maximum temperature, average temperature, relative humidity and wind speed. These five factors are fed to the ANFIS inputs, whereas the number of confirmed cases of COVID-19 is used as the ANFIS output, for two consecutive months and the first 10 days of December in Bangkok, Thailand, and for 30 days in Wuhan, Hubei province, China. Furthermore, this paper tracks the variations in confirmed COVID-19 cases in the four Thai provinces with large numbers of cases according to variations in the climate parameters. These four provinces, with greater than or equal to 90.31 percent of the outbreak per capita cases, are Phuket, Nonthaburi, Yala and Samut Prakan. In addition, we try to predict the present situation in Bangkok for the second phase of confirmed COVID-19 cases (new normal) over the last 10 days of December.

A. CONDITIONS OF ANALYSIS
(1) The following demographic and parametric climate conditions are considered for the analysis: population density of province, sex ratio, average age, elevation, maximum, minimum and average temperature, relative humidity, and wind speed.
(2) The number of confirmed cases of COVID-19 ($N_{19}$) is treated as a function of the climatic parameters: maximum temperature ($t_{max}$), minimum temperature ($t_{min}$), average temperature ($t_{avg}$), wind speed ($S_w$) and relative humidity ($H_r$):
$$N_{19} = f(t_{max}, t_{min}, t_{avg}, S_w, H_r) \quad (1)$$

B. RESEARCH ENVIRONMENT
We employed two research environments, China and the Kingdom of Thailand. For China, a 30-day data set from January 28, 2020, to February 26, 2020, for Hubei province is adopted from [27]. For Thailand, data sets for two complete months (March 2020 and April 2020) and the first ten days of December 2020 are collected from the confirmed COVID-19 cases reported by the Department of Disease Control (DDC), Thailand, and from the Thai Meteorological Department (TMD). The data comprise a total of 101 days of confirmed cases. These data sets are collected during the winter period of the two countries. An example of the complete data sets is given in Table 1 to validate that the number of confirmed COVID-19 cases is a function of environmental conditions. This analysis is conducted by briefly following the procedures in [27], [28].

C. ADAPTIVE NEURO-FUZZY INFERENCE SYSTEM (ANFIS)
Our method adopts an artificial intelligence (AI) approach that hybridizes Artificial Neural Networks (ANN) and fuzzy logic [17] into one Adaptive Neuro-Fuzzy Inference System (ANFIS) model, to make up for the shortfalls of the ANN and fuzzy inference methods. One benefit of this architecture is its reliable representation of sophisticated non-linear relationships. Interest in using ANFIS to predict complex relationships has risen because of its good representation, fast convergence and soft computing [35]-[38]. ANFIS can be described as follows. It is assumed that there exist linear and nonlinear parameters $h_1$, $h_2$, $c_1$, $c_2$, $j_1$ and $j_2$, where $Q$ of $m$ is tuned in $K_1$ and $Q$ of $n$ in $L_1$ for the first rule, and $Q$ of $m$ is tuned in $K_2$ and $Q$ of $n$ in $L_2$ for the second rule. The system output is denoted by $Z_{a,i}$, where $a$ is the order of the layer and $i$ is the order of the node. For simplicity, the first and second fuzzy rules are generated from the two inputs $m$ and $n$ in first-order Sugeno IF-THEN form, with consequents $f_1$ and $f_2$. The five layers are initialized using climatic parameters carefully selected according to the Pearson correlation coefficient, which are later modified according to the Grid partition algorithm and fed to the layers.
First Layer: The nodes in this layer are adaptive. Each node outputs the membership degree of its input, with $u_i$ and $k_i$ as the premise parameters of the sigmoid membership function in (3).
Second Layer: The nodes here are fixed. Each node computes the firing strength of its rule by multiplying the incoming signals and outputs the result.
Third Layer: The nodes here are fixed. This layer computes the ratio of the $i$-th rule's firing strength from the previous layer to the sum of the firing strengths of all rules.
Fourth Layer: The nodes in this layer are adaptive. Each node computes its consequent output as the product of the normalized firing strength $y_i$ and a first-order polynomial; that is, the output of the third layer is multiplied by the Sugeno rule function.
Fifth Layer: This layer has a single node that aggregates all incoming signals from the previous layer to give the overall output $Z_{5,i}$ in (7).
The function in (7) gives an estimate $\hat{f}_i$, in place of the observed $f_i$, that is used to predict a new output for any available input. The major challenge is to make the predicted data as close as possible to the observed data, which leads to the minimization of the objective function in (8). Our ANFIS formulation is therefore updated with a Gaussian membership function combined with a sigmoid activation function. A membership function (MF) is a curve that maps each input to a membership value strictly bounded between 0 and 1. The shapes of MFs play a crucial role because they influence the FIS and its computational efficiency. The input vectors of the data set are fuzzified with the Gaussian membership function with sigmoid activation, which accounts for the membership degree of each input and, as a result, increases the dimensionality of the input vector. These hybrid MFs enable a smoother transition between members and non-members than triangular and trapezoidal MFs.
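The five layers described above can be sketched as a single forward pass. This is a minimal illustration rather than the authors' implementation: it assumes two inputs $m$ and $n$, two rules, a plain sigmoid membership function with slope $u$ and crossover $k$, and consequents of the assumed form $f_i = h_i m + c_i n + j_i$. The function names and the parameter values used below are hypothetical.

```python
import math


def sigmoid_mf(x, u, k):
    """Sigmoid membership function with slope u and crossover point k (assumed form)."""
    return 1.0 / (1.0 + math.exp(-u * (x - k)))


def anfis_forward(m, n, premise, consequent):
    """Forward pass of a two-input first-order Sugeno ANFIS.

    premise:    one ((u_K, k_K), (u_L, k_L)) pair of MF parameters per rule.
    consequent: one (h, c, j) triple per rule, so that f_i = h*m + c*n + j.
    Returns the network output and the normalised firing strengths y_i.
    """
    # Layers 1 and 2: membership degrees, then firing strength as their product.
    strengths = [sigmoid_mf(m, *pk) * sigmoid_mf(n, *pl) for pk, pl in premise]
    # Layer 3: normalise each firing strength by the sum over all rules.
    total = sum(strengths)
    y = [w / total for w in strengths]
    # Layer 4: normalised strength times the first-order rule polynomial.
    weighted = [yi * (h * m + c * n + j) for yi, (h, c, j) in zip(y, consequent)]
    # Layer 5: a single node sums the weighted rule outputs.
    return sum(weighted), y


# Two symmetric rules evaluated at the crossover point: each rule gets weight 0.5.
output, y = anfis_forward(
    0.0, 0.0,
    premise=[((1.0, 0.0), (1.0, 0.0)), ((-1.0, 0.0), (-1.0, 0.0))],
    consequent=[(1.0, 1.0, 0.0), (0.0, 0.0, 2.0)],
)
```

In this toy call both rules fire equally, so the output is the plain average of the two rule consequents.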
Moreover, this MF has fewer parameters than the bell MF, so the model becomes more flexible. The MFs available in fuzzy logic are formed from the following basic functions:
(i) Gaussian distribution function
(ii) Piecewise linear functions
(iii) Triangular MF: simple, yet prone to non-smoothness and overlap
(iv) Sigmoid activation curve
Gaussian bell, triangular and trapezoidal curves perform close to the best in function approximation, although values in the lesser lobes that become negative are viewed as low degrees of set membership [39]. Triangular and Gaussian bell MFs do not always cover the vast function space of IF-THEN fuzzy sets, so they are not the best choices for approximating nonlinear systems with IF-THEN rules. To achieve accurate optimization with GP, we are motivated to use an alternative Gaussian MF with a sigmoid activation function, formulated in (9)-(10), and we extend its application to the ANFIS learning approach.
For a given symmetric logistic set function [39] centered at $k_c$ with thickness $b_c > 0$, the sigmoid function can be reformulated as in (9)-(10). Note that $u_c(m)$ and $\lambda_c(m)$ define the joint set and the multi-valued set, respectively. The factor 2 gives $\max_{m \in \mathbb{R}} u_c(m) = 1$ [40].
Furthermore, conventional ANFIS lacks good generalization capability and fast learning speed. Addressing these issues involves enlarging the input data set, increasing the number and types of input-output membership functions, and increasing the number of rules. Tuning these parameters yields a complex fuzzy inference system (FIS), and initializing such a system is troublesome. To design this FIS, a data-driven technique is a good choice for learning rules and tuning the FIS parameters [41]. During network training, the optimization algorithm generates candidate Sugeno FIS parameter sets. A Sugeno FIS normally employs a sum aggregation method; it is updated with each parameter set and then evaluated on the training input data. However, this technique suffers from data overfitting: an FIS tuned on the learnt data gives outstanding output for the calibrated samples but fails badly during the verification/testing phase. Early stopping of the tuning, based on objective verification with an unseen sample, could address this issue. To do that, we need an effective way to handle overfitting and avoid a complex FIS tuning design. For this goal, we tune a Sugeno FIS because it has few output membership function parameters and faster defuzzification. Moreover, an FIS that accommodates a large data set generally converges faster as a Sugeno FIS than as a Mamdani FIS [41]. Fewer membership functions and rules reduce the number of tuning parameters and yield faster tuning. From previous experience, too many rules lead to overfitting [42].
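As a hedged illustration of the symmetric logistic set function mentioned at the start of this passage, the sketch below uses the commonly cited form $u_c(m) = 2 / (1 + \exp(((m - k_c)/b_c)^2))$. That closed form is an assumption, since equations (9)-(10) are not reproduced in this text; it is chosen because it is centered at $k_c$, has thickness $b_c > 0$, and the factor 2 makes its maximum equal to 1. The function name is hypothetical.

```python
import math


def logistic_set(m, k_c, b_c):
    """Symmetric logistic set function centred at k_c with thickness b_c > 0.

    Assumed form: u_c(m) = 2 / (1 + exp(((m - k_c) / b_c) ** 2)).
    """
    if b_c <= 0:
        raise ValueError("thickness b_c must be positive")
    z2 = ((m - k_c) / b_c) ** 2
    if z2 > 700.0:
        # exp() would overflow; the membership degree is effectively zero here.
        return 0.0
    return 2.0 / (1.0 + math.exp(z2))


peak = logistic_set(30.0, 30.0, 2.5)   # at the centre the exponent is 0, so the value is 2 / 2
left = logistic_set(27.5, 30.0, 2.5)   # one thickness below the centre
right = logistic_set(32.5, 30.0, 2.5)  # one thickness above the centre
```

The curve is symmetric about $k_c$ and decays toward 0 away from the center, so its values stay in the interval (0, 1].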
Therefore, we propose to adopt a Grid partitioning algorithm, which uses three membership functions per input and yields 125 fuzzy rules for learning. We formulate the ANFIS learning capability according to (9)-(10). Partitions of the FIS input space characterize the prior fuzzy model and define the fuzzy membership rules, as illustrated in Figure 3. We obtained the partition by applying hybrid learning procedures. The proposed model is validated using five input variables (maximum temperature, minimum temperature, average temperature, wind speed and relative humidity) and a single linear output (number of confirmed COVID-19 cases). The basic idea of grid partition is detailed in Section E.
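The way a grid partition generates rules can be sketched by enumerating every combination of one membership function per input. This is a generic illustration with hypothetical names, not the authors' code. Under this full-grid assumption the rule count is (MFs per input) raised to the number of inputs, so five inputs with three MFs each give $3^5 = 243$ combinations, while 125 corresponds to $5^3$; the sketch only shows the counting and does not settle which configuration was actually trained.

```python
from itertools import product


def grid_rules(n_inputs, mfs_per_input):
    """Enumerate the antecedents of a full grid partition.

    Each rule is a tuple holding the index of the membership function
    chosen for every input, so the list has mfs_per_input ** n_inputs entries.
    """
    return list(product(range(mfs_per_input), repeat=n_inputs))


rules_5_inputs_3_mfs = grid_rules(5, 3)
rules_3_inputs_5_mfs = grid_rules(3, 5)
```

The exponential growth in the number of inputs is why grid partitioning is usually reserved for models with few inputs.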

D. DATA PREPROCESSING
As explained earlier, we briefly adopt the procedures in [27], in which Pearson's correlation coefficient is used to establish the independence of the chosen data set variables. Before training, the data underwent a correlation test to add weight to their reliability; the resulting Pearson's correlation coefficients are shown in Tables 3 and 4. For more details of Pearson's test see [43]. Pearson's correlation is defined by
$$\phi_{v,w} = \frac{\mathrm{COV}(v, w)}{\sigma_v \sigma_w}$$
where $\phi$ is the correlation coefficient of the two parameters $(v, w)$, COV is the covariance between the two variables, and $\sigma_v$ and $\sigma_w$ are their respective standard deviations. The results in Tables 3 and 4 show that the five input parameters are independent of each other, except in two instances where the independence of $t_{avg}$ and $t_{max}$ is statistically insignificant. However, these parameters can still be adopted as independent variables. The chosen climatic factors are therefore well suited for building the ANFIS model and for demonstrating how Grid-tuned parameters increase the accuracy of the AI model in forecasting ill-defined and nonlinear mappings.
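The correlation test can be sketched directly from the definition of Pearson's coefficient as covariance divided by the product of the standard deviations. This is a minimal stdlib illustration; the function name and the sample values are hypothetical and are not taken from Tables 3 and 4.

```python
import math


def pearson(v, w):
    """Pearson correlation: covariance of v and w over the product of their standard deviations."""
    n = len(v)
    if n != len(w) or n < 2:
        raise ValueError("need two equally long series with at least two values")
    mean_v = sum(v) / n
    mean_w = sum(w) / n
    # The 1/n factors of the covariance and the standard deviations cancel out.
    cov = sum((a - mean_v) * (b - mean_w) for a, b in zip(v, w))
    spread_v = math.sqrt(sum((a - mean_v) ** 2 for a in v))
    spread_w = math.sqrt(sum((b - mean_w) ** 2 for b in w))
    if spread_v == 0.0 or spread_w == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_v * spread_w)


# Hypothetical daily readings, for illustration only.
t_max = [33.1, 34.0, 35.2, 34.6, 36.0]
humidity = [71.0, 69.5, 66.0, 67.2, 63.8]
phi = pearson(t_max, humidity)
```

Values of the coefficient close to 0 support treating two inputs as independent, while values close to 1 or -1 indicate that one input is largely redundant.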

III. ANFIS MODELING
The construction of the ANFIS architecture is detailed in equations (2)-(7). Selecting the best model is usually carried out by experts using a trial-and-error approach. In this work, we instead select the best model by tuning the parameters of Gaussian-shaped FIS membership functions with hybrid learning algorithms initialized from the available clustering algorithms. Clustering is one of the best methods for gaining insight into the groupings embedded in data. ANFIS clustering is therefore used to group the patterns of confirmed COVID-19 cases into clusters, to ease scalability and to learn relationships from the number of confirmed cases. The ANFIS parameters are optimized using the following five algorithms.

A. FUZZY C-MEANS CLUSTERING (FCM)
Fuzzy clustering is an unsupervised approach that assigns each data point in a multidimensional space to $X_n$ different clusters [44]. The basic aim is to search for clusters such that similar patterns falling in different clusters are minimized [45]. The ANFIS parameters can be optimized by minimizing the objective function of the FCM:
$$J = \sum_{c=1}^{N_o} \sum_{k=1}^{X_n} m_{ck}^{f} \, \lVert x_c - h_k \rVert^{2}$$
where $m_{ck}$ represents the membership degree of $x_c$ in the $k$th cluster and is updated as
$$m_{ck} = \frac{1}{\sum_{j=1}^{X_n} \left( \lVert x_c - h_k \rVert / \lVert x_c - h_j \rVert \right)^{2/(f-1)}}$$
and $h_k$ represents the cluster centers, computed as
$$h_k = \frac{\sum_{c=1}^{N_o} m_{ck}^{f} \, x_c}{\sum_{c=1}^{N_o} m_{ck}^{f}} \quad (14)$$
where $N_o$, $X_n$, $f$, and $x_c$ represent the data size, the number of clusters, the fuzzy partition matrix exponent, and the $c$th data point, respectively. FCM is among the promising clustering approaches because each data point is associated with at least two clusters. However, FCM is an unsupervised algorithm that minimizes the initial objective function before computing new fuzzy clusters [16]. The method is also burdened by a large number of parameters.
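The fuzzy c-means iteration can be sketched with the standard alternating updates: memberships are recomputed from the distances to the current centers, and each center is then recomputed as a membership-weighted mean. This is a minimal stdlib sketch under stated assumptions (Euclidean distance, a fixed iteration count instead of a convergence test, caller-supplied initial centers); the function names and the toy points are hypothetical.

```python
import math


def memberships(points, centers, f):
    """Membership degree of every point in every cluster for exponent f > 1."""
    power = 2.0 / (f - 1.0)
    table = []
    for x in points:
        # Guard against a zero distance when a point sits exactly on a centre.
        d = [max(math.dist(x, h), 1e-12) for h in centers]
        table.append([1.0 / sum((dk / dj) ** power for dj in d) for dk in d])
    return table


def fcm(points, init_centers, f=2.0, iterations=50):
    """Alternate membership and centre updates for a fixed number of iterations."""
    centers = [list(h) for h in init_centers]
    dim = len(points[0])
    for _ in range(iterations):
        m = memberships(points, centers, f)
        for k in range(len(centers)):
            weights = [row[k] ** f for row in m]
            total = sum(weights)
            # New centre: mean of the points weighted by membership ** f.
            centers[k] = [
                sum(w * x[i] for w, x in zip(weights, points)) / total
                for i in range(dim)
            ]
    return centers, memberships(points, centers, f)


# Two well separated toy groups, for illustration only.
toy = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, m = fcm(toy, init_centers=[(0.0, 0.0), (10.0, 10.0)])
```

Each row of the returned membership table sums to 1, which is the sense in which every point belongs to several clusters at once.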
Accordingly, the best model combination is obtained with 5 inputs, 10 Gaussian membership functions per data point, where each data point accommodated 2 parameters, thus yielded 100 premise parameters. Whereas linear equations accommodated 5 parameters with 10 rules, which yields 60 consequent parameters in total.
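The parameter counts quoted above can be checked with a few lines of arithmetic. The sketch assumes one Gaussian membership function per input per rule (two parameters each) and a first-order consequent per rule with one coefficient per input plus a constant term; the function name is hypothetical.

```python
def anfis_param_counts(n_inputs, n_rules, params_per_mf=2):
    """Count premise and consequent parameters of a clustering-based ANFIS.

    Premise:    one MF per input per rule, each with params_per_mf parameters.
    Consequent: one linear equation per rule with n_inputs coefficients plus a constant.
    """
    premise = n_inputs * n_rules * params_per_mf
    consequent = n_rules * (n_inputs + 1)
    return premise, consequent


premise, consequent = anfis_param_counts(n_inputs=5, n_rules=10)
```

With 5 inputs and 10 rules this reproduces the 100 premise and 60 consequent parameters stated for the FCM-based model.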

B. SUBTRACTIVE CLUSTERING (SC)
In subtractive clustering, each data pattern is considered a potential cluster centre. The basic idea is to compute the density index $\rho_c$ of pattern $m_c$ with a positive constant $b_n$, which gives the ANFIS-SC formulation [46].
The SC algorithm chooses the pattern with the highest density index as the first cluster centre. The cluster estimates obtained can be used to initialize iterative optimization-based clustering methods and model identification methods such as ANFIS [46]. In this study, 15 cluster centres were determined for the given 101 data values $m_1, m_2, \ldots, m_{N_o}$. Each data value is a member of a cluster centre. The number of fuzzy rules equals the number of cluster centres, each representing the features of a cluster. The procedure in (16) is repeated until an adequate number of cluster centres has been obtained.
The SC approach cannot handle a small number of data points effectively, but it performs well on high-dimensional problems with a moderate number of data points. Accordingly, the best model combination is obtained with 5 inputs and 14 Gaussian membership functions per data point, where each data point accommodates 2 parameters, yielding 140 premise parameters. The linear equations accommodate 5 parameters with 14 rules, which yields 14 consequent parameters in total. We achieved the best model with a cluster radius of 0.3 and a maximum of 500 iterations.
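The first step of subtractive clustering, computing a density index for every pattern and taking the densest pattern as the first centre, can be sketched as follows. The density formula used here, $\rho_c = \sum_j \exp(-\lVert m_c - m_j \rVert^2 / (b_n/2)^2)$, is the commonly used form and is an assumption, as is treating $b_n$ as the cluster radius; the function names and toy points are hypothetical.

```python
import math


def density_indices(points, b_n):
    """Density index of each pattern: a sum of Gaussian-like contributions from all patterns."""
    scale = (b_n / 2.0) ** 2
    return [
        sum(math.exp(-math.dist(p, q) ** 2 / scale) for q in points)
        for p in points
    ]


def first_center(points, b_n):
    """Return the pattern with the highest density index together with that index."""
    rho = density_indices(points, b_n)
    best = max(range(len(points)), key=rho.__getitem__)
    return points[best], rho[best]


# Four nearby patterns and one outlier, for illustration only.
toy = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.05, 0.05), (1.0, 1.0)]
center, rho_max = first_center(toy, b_n=0.3)
```

In the full algorithm the density around each chosen centre is then reduced and the selection is repeated, which is the part the text attributes to the procedure in (16).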

C. GENETIC ALGORITHM (GA)
GA mimics natural selection. For ANFIS parameter estimation, it is useful to encode the ANFIS model in binary form, with concatenated parameters. This makes GA promising for optimizing the antecedent MFs of the fuzzy rules as well as the linear coefficients of the rule consequents. The fitness of each binary string is computed from the objective function, where $\eta$ and $\sigma$ denote the fitness and the objective function, respectively, and $\sigma$ can be obtained from Eq. (8). Minimizing the objective function is achieved by maximizing the fitness. GA does this by randomly generating an initial population of binary strings, where each candidate solution takes a portion of the fuzzy rules within the partition. Then, using the standard genetic operations with Roulette wheel selection and the parameters in Table 5, it gradually optimizes the population of binary strings and the linear equations of the output layer for the Gaussian membership function rules of each chromosome associated with the fuzzy partitioning of the premise layer. The parameters of ANFIS initialized from the COVID-19 data sets are thus obtained by the Genetic algorithm.
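Roulette wheel selection, the operator named above, can be sketched as drawing an individual with probability proportional to its fitness. The mapping from objective value to fitness used here, $\eta = 1 / (1 + \sigma)$, is an assumption chosen only because it makes a lower objective give a higher fitness; the fitness equation of the paper is not reproduced in this text. Function names and the toy population are hypothetical.

```python
import random


def fitness(sigma):
    """Assumed fitness mapping: smaller objective values give larger fitness."""
    return 1.0 / (1.0 + sigma)


def roulette_select(population, objective_values, rng):
    """Pick one individual with probability proportional to its fitness."""
    fits = [fitness(s) for s in objective_values]
    pick = rng.uniform(0.0, sum(fits))
    running = 0.0
    for individual, fit in zip(population, fits):
        running += fit
        if pick <= running:
            return individual
    # Floating point round-off can leave pick marginally above the final sum.
    return population[-1]


rng = random.Random(0)
population = ["0101", "1100", "0011"]   # binary-coded candidate parameter sets
objectives = [0.1, 5.0, 50.0]           # hypothetical objective values, lower is better
counts = {s: 0 for s in population}
for _ in range(2000):
    counts[roulette_select(population, objectives, rng)] += 1
```

Over many draws the string with the lowest objective value is selected most often, while weaker strings still get an occasional chance, which keeps diversity in the population.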

D. PARTICLE SWARM OPTIMIZATION (PSO) ALGORITHM
In the particle swarm optimization formulation, each particle is regarded as a potential solution in an M-dimensional problem space. PSO is adopted to optimize the parameters of the ANFIS model. The ANFIS-PSO model consists of five input variables and one output with three fuzzy rules, and a total of nine weighting coefficients are generated for each parameter.

E. GRID PARTITION ALGORITHM (GP)
In ANFIS parameter estimation, GP is an FIS tuning algorithm with notably competitive results, despite its major challenge: the number of fuzzy rules grows exponentially as the number of inputs increases. The algorithm nevertheless performs very well for small input sizes. Another advantage of GP is its flexibility, due to the high number of rules and parameters [46]. GP also handles the internal parameters of ANFIS effectively and improves prediction performance over the ANFIS-FCM, ANFIS-SC, ANFIS-GA and ANFIS-PSO algorithms in modelling the effects of climate variability on epidemic diseases. Moreover, its application to the potential impact of climate on the COVID-19 epidemic has been limited. The GP algorithm partitions the data space into subspaces using axis-parallel partitions for a pre-defined number and type of membership functions per dimension; we briefly adopt the procedures in [46], [47]. Figure 4 shows the partitioned subspace of the function in equation (1), which is numerically integrated according to the trapezoidal rule. We denote equation (8) as the trapezoidal area $A_t$ with thickness $b$ over the space interval. The next step is to partition the function into $N$ equal intervals, which is better explained using $N+1$ grid points. Then $b$ and $A_t$ are obtained as $1/N$ and equation (9), respectively. Since the data are known, the procedure can use equation (10).
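The trapezoidal rule on $N$ equal intervals with $N+1$ grid points, as described above, can be sketched as follows. The unit interval and the test functions are assumptions made only for illustration, so that the thickness is $b = 1/N$; the function name is hypothetical.

```python
def trapezoid_area(values, b):
    """Composite trapezoidal rule for samples on an equally spaced grid with thickness b."""
    if len(values) < 2:
        raise ValueError("need at least two grid points")
    # End points count once with weight 1/2, interior points with weight 1.
    return b * (0.5 * values[0] + sum(values[1:-1]) + 0.5 * values[-1])


N = 100                 # number of equal intervals on [0, 1]
b = 1.0 / N             # thickness of each trapezoid
grid = [i * b for i in range(N + 1)]   # N + 1 grid points
area = trapezoid_area([x * x for x in grid], b)
```

For the quadratic used here the rule approaches the exact area of 1/3 as $N$ grows, with an error that shrinks in proportion to $b^2$.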
In the Grid partition procedure for mapping the input and output spaces, we adopt a Sugeno-FIS with linear equation parameters (first-order Sugeno-FIS). Combining Sugeno-FIS with grid partition improves intelligent learning based on prior knowledge. The number of partitions is independent of the number of iterations. The proposed model is a Sugeno-FIS with adaptive grid partition and five layers, initialized with the weights of maximum temperature, minimum temperature, average temperature, wind speed and relative humidity as the inputs $m_1, \ldots, m_5$, whereas the number of confirmed cases of COVID-19 is used as the network output $Z$.
The cluster centres from procedures (8)-(11) are used to activate the first-order Sugeno FIS, using randomly selected data from three months: March, April and December. The proposed FIS is initialized with the three months of data partitioned into approximately 80% (80 samples) for the calibration phase and 20% (21 samples) for the verification phase. The number of FIS rules equals the number of cluster centres, each denoting the characteristics of a cluster. The FIS is optimized using a hybrid approach with equation (11), where back-propagation tunes the input membership functions and an iterative approach tunes the output membership functions.
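The random partition of the 101 daily samples into 80 calibration and 21 verification samples can be sketched as a seeded shuffle. This is an illustration only: the function name, the seed and the placeholder sample identifiers are hypothetical.

```python
import random


def split_calibration_verification(samples, n_calibration=80, seed=0):
    """Randomly split the samples into a calibration set and a verification set."""
    indices = list(range(len(samples)))
    # A seeded generator keeps the partition reproducible between runs.
    random.Random(seed).shuffle(indices)
    calibration = [samples[i] for i in indices[:n_calibration]]
    verification = [samples[i] for i in indices[n_calibration:]]
    return calibration, verification


days = list(range(101))   # placeholder identifiers for the 101 daily records
calibration, verification = split_calibration_verification(days)
```

Keeping the verification samples out of every tuning step is what allows the later comparison to say something about unseen data.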
Therefore, to avoid complex geometries, equation (10) is solved using the procedures in [48], which adopt an iterative search based on gradient descent. This completes the grid partition procedure for the known data-driven approach. It is assumed that equation (11) partitions the data into a number of problem sizes according to the generated FIS rules. The optimum solution without serious overfitting converges at a prescribed tolerance. To better explore this data space and to evaluate the optimal ANFIS parameters, this paper develops metrics from equations (13)-(16) to demonstrate the applicability of our algorithm. The number of fuzzy rules, the initial training and testing step values, the maximum number of epochs, the goal errors of the parameters, and the increase and decrease of the training rate after epoch 6 are shown in Table 5. Different values of $R^{2}$ for the model's fitness are achieved according to these values and the condition in (11).

G. EVALUATION METRICS
The following metrics are used to quantify the performance of the proposed algorithms. The variables $O_c$, $O^{f}_c$, $N_o$, $\sigma_r^{2}$, $p$ and $O^{avg}_c$ denote the measured data, forecasted data, data size, standard deviation of the residuals, total number of parameters, and average of the measured data, respectively.

5) AKAIKE's INFORMATION CRITERION (AICc)
Small sample-size corrected Akaike's Information Criterion index measures model's quality in terms of structural flexibility and level of deviation from the mean value, determined during the evaluation of the structure, where the model is verified according to unseen observations [18], [49], [50]. According to this index, the best model must have the lowest AICc. AICc is computed as

6) NASH-SUTCLIFFE MODEL EFFICIENCY INDEX (η N−S )
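To quantify model fitness and the level of deviation, the Nash-Sutcliffe model efficiency is defined as follows, with index values ranging from $-\infty$ to 1:
$$\eta_{N-S} = 1 - \frac{\sum_{c=1}^{N_o} \left( O_c - O^{f}_c \right)^{2}}{\sum_{c=1}^{N_o} \left( O_c - O^{avg}_c \right)^{2}}$$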
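The two indices in this subsection can be sketched as follows. The Nash-Sutcliffe function follows the usual definition (one minus the ratio of the squared forecast errors to the squared deviations from the mean of the measured data). The AICc function uses the common least-squares form $N_o \ln(\sigma_r^2) + 2p + 2p(p+1)/(N_o - p - 1)$, with $\sigma_r^2$ taken as the mean squared residual; that exact form is an assumption, since the paper's AICc equation is not reproduced in this text. Function names and the sample series are hypothetical.

```python
import math


def nash_sutcliffe(observed, forecast):
    """Nash-Sutcliffe efficiency: 1 for a perfect fit, 0 when no better than the mean."""
    mean_obs = sum(observed) / len(observed)
    err = sum((o - f) ** 2 for o, f in zip(observed, forecast))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - err / spread


def aicc(observed, forecast, p):
    """Small-sample corrected AIC in its least-squares form (assumed variant)."""
    n = len(observed)
    if n - p - 1 <= 0:
        raise ValueError("AICc needs more samples than parameters plus one")
    # Mean squared residual; a perfect fit makes the logarithm undefined.
    var_r = sum((o - f) ** 2 for o, f in zip(observed, forecast)) / n
    return n * math.log(var_r) + 2 * p + 2 * p * (p + 1) / (n - p - 1)


# Hypothetical measured series with a close and a poor forecast, for illustration only.
measured = [float(i) for i in range(1, 11)]
close = [o + (0.1 if i % 2 else -0.1) for i, o in enumerate(measured)]
poor = [o + (1.0 if i % 2 else -1.0) for i, o in enumerate(measured)]
```

A lower AICc indicates the preferred model, and for the same fit the index grows with the number of parameters $p$, which is how it penalizes structural complexity.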

IV. RESULTS AND ANALYSIS
In this section, an adaptive neuro-fuzzy inference system is initialized using Sugeno-FIS with the grid partition algorithm to evaluate the data set overfitting of conventional ANFIS and to demonstrate its applicability in predicting climatic impacts on epidemic diseases such as COVID-19. The model is developed on the Matlab R2020a platform. The climatic factors were carefully chosen using Pearson's correlation coefficient test: five independent input sets were chosen as the model inputs, whereas the numbers of confirmed COVID-19 cases from China and the Kingdom of Thailand are used independently as the model's output. Sugeno-FIS with the linear parameter of the Gaussian membership function is a good fit for our proposed algorithm. Our algorithm is numerically validated on the COVID-19 data set, which indicates that nonlinear data with five or more input variables are stably predicted by the Sugeno-FIS + Grid partition algorithm from a linear parameter of Gaussian membership functions. The generalization of the Gaussian membership function with GP depends strongly on its learning parameters: $C$ represents the control parameters that determine the shape of the MF, and $\sigma$ represents the standard deviation of the differences. Our approach is divided into calibration and verification phases.
VOLUME 9, 2021

A. ANFIS CALIBRATION PHASE
Figures 7a-7d demonstrate the performance of the models during data set calibration: the predicted values are nearly the same as the measured values. To establish the statistical significance of the results, equations (21)-(26) are used to evaluate the generalization capability of our algorithm on known data; details are given in Table 6. The ANFIS-FCM algorithm performs well, with a high average coefficient of determination of 99.7%, and achieved a good AICc with lower errors than the ANFIS-GA and ANFIS-PSO algorithms for the five-input combinations. Moreover, it outperformed ANFIS-SC during BKK April data set calibration. The ANFIS-SC algorithm also performs well, with a high average coefficient of determination of 99.8%, and achieved a good AICc with tolerable errors compared with the ANFIS-FCM, ANFIS-GA and ANFIS-PSO algorithms, except for the BKK April data calibration, where FCM is superior. ANFIS-PSO achieved a moderate coefficient of determination and good AICc values. However, ANFIS-GA produced higher errors than all other evaluated methods, with a low coefficient of determination of 73% and a low AICc. Averaging the AICc and coefficient of determination over the data sets of the considered cities, we note that ANFIS-GP achieved the best performance, at 99.83%. The correlations of the ANFIS-FCM, ANFIS-SC and ANFIS-GP algorithms are almost the same. However, ANFIS-SC achieved a better AICc than the other algorithms. Quantitative and qualitative evaluations of the considered algorithms are presented in Table 6 and Figure 7. For brevity, Figure 5 displays only the performance of ANFIS-GP during data set calibration.

B. ANFIS VERIFICATION PHASE
Table 7 provides quantitative evaluations of the proposed algorithms in predicting unseen data. The obtained values represent the observed data well. For this phase, 20% of the data (21 data points) is randomly reserved for verification. As indicated in Figure 21, the algorithms stably predicted the measured data during verification. It can therefore be observed that the five algorithms have the generalization capability to estimate ANFIS parameters, which allows them to handle nonlinear and ill-mapping unions. Averaging the AICc and coefficient of determination over the data sets of the considered cities, we note that ANFIS-GP achieved the best performance, with $R^{2}$ = 99.7%. The correlations of the ANFIS-FCM, ANFIS-SC and ANFIS-GP algorithms are almost the same. However, ANFIS-SC achieved a better AICc than all other algorithms, while ANFIS-GA produced higher errors than all other evaluated methods, with a low coefficient of determination of 73% and a low AICc. For brevity, Figure 8 plots only the results of the best method. As shown quantitatively in Table 7, the proposed ANFIS Sugeno-FIS + Grid partition algorithm generalizes to the nonlinear and ill-mapping unions of the COVID-19 data set under climate variations, and achieved the best accuracy on five out of eight metrics over the four considered cities/locations, which is consistent with our targets.
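The random 20% hold-out described above can be sketched as follows. The total of 105 samples is only inferred from the statement that 21 points make up 20% of the data, and the seed and function name are illustrative assumptions.

```python
import numpy as np

def holdout_split(n_samples, holdout_fraction=0.2, seed=0):
    """Return (calibration_indices, verification_indices) from a random shuffle."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_holdout = int(round(holdout_fraction * n_samples))
    # First n_holdout shuffled indices are reserved for verification.
    return order[n_holdout:], order[:n_holdout]

# 105 samples is an inference from "20% (21 data points)", not a stated figure.
calib_idx, verif_idx = holdout_split(105)
```

Fixing the seed keeps the split reproducible, so that all five parameter estimation algorithms can be verified on the same unseen points.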

C. PERFORMANCE COMPARISON OF DIFFERENT ANFIS OPTIMIZATION ALGORITHMS
In this section, we present experimental results of our proposed algorithm and compare the proposed model quantitatively and qualitatively with other state-of-the-art algorithms. In Figure 8(a), the Wuhan data set is validated with the five ANFIS parameter estimation algorithms; Figures 8(b), 8(c) and 8(d) show the same validation for the Bangkok data sets for March, April, and the first ten days of December, respectively. In each case, the better generalization of GP can be observed. In Figures 8(a)-(d), the generalization of GP is more representative of the measured data than that of FCM, PSO, SC, and GA, with minimal overfitting, so GP remains superior for the considered data sets. Furthermore, SC and FCM are flexible during data set generalization; see Section D for details.

D. SENSITIVITY ANALYSIS
Sensitivity analysis identifies the effect of each combination of input variables on the model output variable. The aim is to find the climate parameters capable of changing the model output (the number of confirmed cases of COVID-19) to the extent that the spread of COVID-19 cluster cases departs from its actual cluster. Figure 20 illustrates that ANFIS with the GP parameter estimation algorithm is the superior technique, so it is chosen for the sensitivity analysis. For this, each input climate parameter was included once per combination, and the variations in the number of confirmed cases were noted according to the ANFIS-GP model. Input combinations of three, four and five parameters were used to observe correlational variability and its influence on the number of confirmed cases, as depicted in Figure 6. The resulting predicted confirmed cases of COVID-19 were then compared to the cases reported by the DDC office. Figure 6 shows the model combinations and the effect of varying each input parameter on the number of confirmed cases of COVID-19, with respect to the five monitored climate factors. It is clear that the maximum temperature, minimum temperature and wind speed are the most effective parameters, and that relative humidity and average temperature are the least effective parameters in predicting the number of confirmed cases of COVID-19. In predicting the number of confirmed cases of COVID-19 from five climate factors using the ANFIS-GP algorithm, the influence of all parameters stated in equation (1) is investigated. Figure 7 shows that removing the relative humidity ($H_r$) and wind speed ($W_s$) (model 1) decreases the prediction accuracy compared to model 15, so that the AICc error increases (AICc = 1%, 38%, 39%).
This shows that relative humidity and wind speed have a significant effect on the clustering of COVID-19 cases: as the spreading rate increases, so does the number of infected persons. In model 2, the absence of the average temperature and wind speed of the cities under consideration slightly reduced the accuracy of the model compared to model 1, but produced serious deviations compared to model 15. In model 3, average temperature and relative humidity are removed, which improves model accuracy by around 30%, especially for the March and April data sets. However, in Bangkok, the model accuracy decreased by over 30% because of the serious effect of relative humidity on the virus spread. In model 4, we considered maximum temperature, average temperature and relative humidity. This shows that in Wuhan wind speed and relative humidity have the same influence on the virus spread, as the lack of wind speed did not change the model behaviour. In the remaining cities during March, April and December, however, the absence of wind speed improves the model accuracy by about 1% and 30%, respectively, which indicates that wind speed in these cities during this period has an insignificant effect on the virus spread, and therefore the number of cases may not rise. In models 6 to 14, the absence of one parameter does not significantly improve or decrease the accuracy of the models for Wuhan, though the model suffers from a partially complex structure. Furthermore, Table 8 shows that when models 6-14 are verified using the Bangkok data sets for March, April and December, the absence of the minimum and average temperature increases the model error by about 1-5%. This signifies that minimum temperature has a significant influence on the virus spread; as a result, the COVID-19 virus can survive even at about 22-29 °C.
In model 7, introducing the minimum temperature variable increased the model accuracy by about 2%; as a result, the model predicted 120 confirmed cases of COVID-19 at 29 °C and 7 cases at 27 °C. The absence of the maximum temperature increased the model's error by about 5% and the absolute deviations by 5%, owing to its significant influence on the rising number of COVID-19 cases. This shows that the COVID-19 virus can survive at maximum temperatures of about 38 °C, as the number of confirmed cases on those days increased to 188 from 89 on the preceding day at about 37 °C. In model 15, when all five input parameters were included, model performance improved both qualitatively and quantitatively. These results are given in Tables 8 and 9. Furthermore, the five-parameter model combination is the best, as each parameter showed an influence in terms of accuracy, flexibility, and generalization, as Figure 20 displays.

E. CORRELATIONS BETWEEN TRENDS OF CONFIRMED CASES OF COVID-19 WITH CLIMATIC FACTORS
In this section, we give the trend behaviour of the considered input parameters with respect to the output parameters. Figures 9-19 show the actual relationship between the number of confirmed cases of COVID-19 and the climatic factors (parameters) in other Thai cities that are not included in our analysis. We assume a uniform climate across Bangkok, Samut Prakan and Nonthaburi, as the three cities lie in the same climate zone. This assumption is also made because weather information is unavailable. The relationship between the parameters and the output values fluctuates, as can be seen from the waveform spikes in the data trends. A negative relationship exists because of the strict measures enforced by the two countries to avoid human morbidity; however, increasingly cold weather remains a vector for spreading the virus in most of the localities, as it raises the number of confirmed cases.
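The sign and strength of such a relationship can be summarized with Pearson's correlation coefficient, the same statistic used in Section IV to select the climatic inputs. The sketch below is illustrative only: the function name is an assumption and the sample values are hypothetical, not taken from the data set.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson's correlation coefficient between two equally long series."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float(np.sum(xc * yc) / np.sqrt(np.sum(xc ** 2) * np.sum(yc ** 2)))

# Hypothetical daily values: cases fall as temperature rises (negative relationship).
temperature = [30.0, 31.0, 32.0, 33.0]
cases = [40, 30, 20, 10]
r = pearson_r(temperature, cases)
```

A value near -1 indicates a negative relationship of the kind described above, while a value near +1 indicates that cases rise together with the climatic factor.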

V. DISCUSSION
In this section, ablation studies were conducted to justify our choice of estimation algorithm, hybrid optimization method, data sets and tunable parameter options in the grid partitioning algorithm. Fifteen different model sets were established from the input parameter combinations. Different input parameters were verified for each ANFIS-GP model, and optimized parameters with good MAD, MSE, RMSE, MAE, $R^{2}$, AICc, NSE, and SI were obtained. Tables 8 and 9 present the ablation study performance, where M15 denotes model 15, with (349, 3156), (0, 188), (13, 120), and (10, 21) as the optimal values of the control parameter that determines the shape of the MFs ($c$) and of the standard deviation of differences ($\sigma$), respectively. Across the 15 model sets, we evaluated combinations of the five input parameters (maximum temperature, minimum temperature, average temperature, relative humidity, and wind speed), as shown in Figure 6 for the four respective locations. The verification results of the best ANFIS-GP models are obtained for the model 15 combination, whereas the remaining ANFIS-GP models 1 to 14 give different predictions for different data set combinations. According to the average performance of the models, models that include either one of the temperatures, and the input combinations 4-5, perform better than the other models. The input combinations of model 1 seem to be slightly worse than those of models 2-15. It is clear that the ANFIS-GP models give the worst results for the model 1 and model 2 data set combinations. The reason may be that wind speed and humidity were absent in model 1, and wind speed is responsible for the spread of airborne pollutants and contaminants such as the COVID-19 virus. Since the COVID-19 virus remains active in the air for many hours, this is a clear indication that wind conditions have a serious impact on the virus spread.
Humidity variation also affects the survival of epidemic viruses, and some models achieved only average performance because they lacked a combination of two temperature parameters, as indicated in Table 9. The best ANFIS-GP model was obtained for M15, with the five-input combination. Tables 6 and 7 present the calibration and verification results of the data set using four different state-of-the-art methods for ANFIS parameter estimation. We set different parameter values as indicated in Table 5.
To investigate the benefit of using clustering algorithms to estimate ANFIS parameters, we performed calibration and verification with multiple input combinations using Subtractive clustering and Fuzzy C-means. We observed the benefit of tuning ANFIS using data set clustering, as it achieved good performance compared to conventional ANFIS. FCM minimizes the objective function before calculating new fuzzy clusters, and each data point is served by a minimum of two clusters, which enhances prediction accuracy. However, it involves a large number of parameters that burden the network training. Likewise, in SC each data point is a candidate cluster centre, which makes it possible to establish an equal number of fuzzy rules and thus improves prediction accuracy, as indicated in Tables 6 and 7. However, this method requires guesswork in selecting a suitable value of the radius used to estimate the number of clusters. Furthermore, ANFIS parameters were also estimated using nature-inspired optimization algorithms. Here we tuned the parameters using GA and PSO, and observed that adopting PSO with a large number of swarm particles improves the model accuracy and makes it resistant to local minimum problems. GA searches the space of potential solutions to find the one that best fits the problem. Both approaches are computationally costly. The performance of the mentioned algorithms is demonstrated in Tables 6-7 and Figure 7. Figure 7 displays qualitative prediction results for the best input combinations of the different ANFIS parameter estimation methods. We note that the grid partition algorithm increases the number of fuzzy rules exponentially; nevertheless, combining axis-parallel partitioning with the gradient descent hybrid optimization algorithm handles the ANFIS internal parameters well. A large number of generated rules and parameters, up to 125, improved the prediction accuracy for our small data set; see Table 5 for details.
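The exponential growth of the rule base under grid partitioning follows from the fact that one rule is generated for every combination of MFs across the inputs. As a hedged sketch, assuming Gaussian MFs (two premise parameters each) and first-order Sugeno consequents (one coefficient per input plus a bias), the sizes can be counted as follows. The MF counts in the example are hypothetical and are not the settings of Table 5.

```python
from math import prod

def grid_partition_size(mfs_per_input):
    """Count rules and parameters of a grid-partitioned first-order Sugeno FIS.

    mfs_per_input: number of Gaussian MFs assigned to each input.
    Returns (n_rules, n_premise_params, n_consequent_params).
    """
    n_inputs = len(mfs_per_input)
    n_rules = prod(mfs_per_input)            # one rule per MF combination
    premise = 2 * sum(mfs_per_input)         # centre and spread per Gaussian MF
    consequent = n_rules * (n_inputs + 1)    # linear coefficients plus bias per rule
    return n_rules, premise, consequent

# Hypothetical settings for five inputs.
two_mfs = grid_partition_size([2, 2, 2, 2, 2])    # (32, 20, 192)
three_mfs = grid_partition_size([3, 3, 3, 3, 3])  # (243, 30, 1458)
```

Moving from two to three MFs per input raises the rule count from 32 to 243, which illustrates why the number of MFs per input has to stay small when only a few data points are available for calibration.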
Our proposed algorithm outperforms four state-of-the-art methods. ANFIS-GP models generally perform better than the ANFIS-FCM, ANFIS-SC, ANFIS-PSO and ANFIS-GA models in predicting the number of confirmed cases of COVID-19 from climate factors. Figure 8 plots the evaluation results of the best GP, FCM, SC, GA and PSO models for the two countries. According to the average performance of the models in Tables 6-7, the ANFIS-GP models have accuracy similar to SC and FCM for some input combinations.
A comparison of Tables 6-7 clearly shows that the ANFIS-GP, ANFIS-SC and ANFIS-FCM models are more successful for the Thai cities than for Wuhan.
The error plot shows that the calibration error settles at about the 51st epoch. However, the plot in Figure 20 shows that the smallest verification error occurs at the 6th epoch. After this point, the verification error increases slightly, even as ANFIS continues to minimize the error against the calibration data, up to the 24th epoch. It then drops until the 27th epoch and continues smoothly until the 66th epoch, where it rises rapidly to a maximum error value of 2763, before dropping again at the 68th epoch and remaining low through the 300th epoch. The error therefore settles at the specified error tolerance, and the plot also indicates the ability of the model to generalize to the test data.
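A common way to use such a curve is to retain the model parameters from the epoch with the lowest verification error. The sketch below shows only that selection step; the error values are hypothetical and are not read from Figure 20, and the function name is an assumption.

```python
import numpy as np

def best_epoch(verification_errors):
    """Return (epoch, error) of the minimum verification error, epochs 1-based."""
    errors = np.asarray(verification_errors, dtype=float)
    idx = int(np.argmin(errors))
    return idx + 1, float(errors[idx])

# Hypothetical verification error curve whose minimum falls at the 6th epoch.
curve = [5.0, 4.0, 3.0, 2.0, 1.5, 1.0, 1.2, 1.3]
epoch, error = best_epoch(curve)
```

Selecting the parameters at this epoch, rather than at the final one, guards against the overfitting that appears once the verification error starts to rise while the calibration error keeps falling.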
The ANFIS model remains a good prediction tool for nonlinear and complex data; moreover, optimization techniques that tune the FIS parameters make it a robust tool for predicting complex relationships between epidemic diseases and climatic factors. It is demonstrated that the algorithm needs to be tuned according to the environment or parameters of deployment. This is in conformance with our proposition and suggestion in [13]. The results confirm the generalization capability of grid partition in FIS procedures. Low wind speed, humidity and temperature increase the number of cases in most of the cities, but a low case count is observed in some instances, which we believe is due to lockdown restrictions. Our findings are in conformance with previous works. However, we believe this method is faster and more accurate, since no handcrafted features were used, and the accuracy of our evaluations is better than that of previous works. The existing work [27] achieved an overall classification accuracy of 95.7% and 80% for the trained and tested data, respectively; in contrast, our proposed method achieved an overall accuracy of 99% and 96% for calibration and verification, respectively.

VI. CONCLUSION
The number of asymptomatic and symptomatic COVID-19 patients is on the rise due to the effects of the environmental climate. As a result, authorities are struggling to find suitable ways to handle the situation, since climatic factors are natural phenomena that are very difficult for humans to control. AI techniques are robust in predicting the climatic effect on epidemic diseases. Our results show that relative humidity, average wind speed and average daily temperature affect the number of confirmed cases of COVID-19. Low relative humidity and wind speed around the study areas negatively impacted the epidemic spread, while showing a positive relationship at some time points, and high values of average daily temperature negatively impacted the confirmed cases. These findings demonstrate that relationships between climate conditions and epidemic diseases are catastrophically biased (normalcy bias).
This model can serve as a proactive tool for authorities to know when to tighten or relax restriction policies. Future research will incorporate the authorities' policies and seek more data to design a real-time AI model for Thailand.
SUNUSI BALA ABDULLAHI (Member, IEEE) received the B.Sc. and M.Sc. degrees in electronics from Bayero University at Kano, Nigeria. He is currently pursuing the Ph.D. degree in electrical and computer engineering with the King Mongkut's University of Technology Thonburi, Bangkok, Thailand. His research interests include computer vision, artificial intelligence, natural language processing, optimization, data analysis, mobile communications, and signal processing.
KANIKAR MUANGCHOO received the bachelor's and master's degrees in mathematics education from Naresuan University (NU), Phitsanulok, Thailand, and the Ph.D. degree in applied mathematics from the King Mongkut's University of Technology Thonburi (KMUTT), Thailand. She is currently an Assistant Professor with the Department of Mathematics and Statistics, Faculty of Science and Technology, Rajamangala University of Technology Phra Nakhon (RMUTP), Thailand. Her research interests include fixed point algorithm, convex analysis, and implementable optimization algorithms.
AUWAL BALA ABUBAKAR received the master's degree in mathematics and the Ph.D. degree in applied mathematics from the King Mongkut's University of Technology Thonburi, Thailand, in 2015. He is currently a Lecturer II with the Department of Mathematical Sciences, Faculty of Physical Sciences, Bayero University at Kano, Nigeria. He is the author of more than 30 research articles. His main research interest includes methods for solving nonlinear monotone equations with application in signal recovery.
ABDULKARIM HASSAN IBRAHIM was born in Sokoto, Nigeria. He received the B.Sc. degree in mathematics from Usmanu Danfodiyo University Sokoto, Nigeria, and the Ph.D. degree in applied mathematics from the King Mongkut's University of Technology Thonburi, Bangkok, Thailand. He has authored and coauthored several research articles indexed in either Scopus or Web of Science. His current research interests include numerical optimization and image processing. In August 2018, he was awarded the Petchra Pra Jom Klao Scholarship to pursue the Ph.D. degree.
KAZEEM OLALEKAN AREMU received the B.Sc. degree in mathematics from Usmanu Danfodiyo University Sokoto, Nigeria, the M.Sc. degree from the University of Ibadan, and the Ph.D. degree from the University of KwaZulu-Natal, South Africa. He is currently affiliated with Usmanu Danfodiyo University Sokoto and Sefako Makgatho Health Sciences University, Pretoria, South Africa. His research interests include nonlinear analysis, optimization theory, and graph theory. His current research interests include the applications of optimization theory in machine and deep learning.