Intelligent Machine Learning With Evolutionary Algorithm Based Short Term Load Forecasting in Power Systems

Electricity demand forecasting remains a challenging issue for power system scheduling at various stages of the energy sector. Short-term load forecasting (STLF) plays a vital part in regulated power systems and electricity markets, where it is commonly employed to predict the outcomes of power failures. This paper presents an intelligent machine learning with evolutionary algorithm based STLF model, called IMLEA-STLF, for power systems. The model involves several stages of operation: data decomposition, data preprocessing, feature selection, prediction, and parameter tuning. Wavelet transform (WT) is used to decompose the time series, and an oppositional artificial fish swarm optimization algorithm (OAFSA) based feature selection technique is used to select an optimal set of features. To improve the convergence rate of AFSA, the oppositional based learning (OBL) concept is integrated into it. Then, the water wave optimization (WWO) with Elman neural network (ENN) model is employed for the predictive process. Finally, inverse WT is applied to obtain the hourly load forecasting data. To validate the predictive performance of the IMLEA-STLF model, an extensive set of simulations is carried out on benchmark datasets. The results show that the IMLEA-STLF model outperforms the other compared methods.


I. INTRODUCTION
Electric power infrastructure is a major support for every nation and an essential factor that directly influences its economic status. Classical electric power grids have not improved substantially with respect to reliability and controllability [1]. The present century is shifting towards smart grid power systems, which combine advanced sensing, security, data transmission, and control technologies that make the grid highly effective and reliable [2]. To satisfy increasing demand profiles and minimize power loss in power systems, electric load prediction becomes essential for utilities and power system operators. Several operational decisions, such as economic dispatch of power plants, power network development, and network security, are mainly based on load prediction. Electric load forecasting comprises 4 main kinds, namely very short term, short term, medium term, and long term. Short-term load forecasting (STLF) is commonly employed to predict load from hours to weeks ahead. Recent advancements are employed to monitor the demand response profiles and the combination of production sources for power systems. Conventionally, engineering approaches were applied to forecast the forthcoming demand manually using tables and charts. They mainly considered weather and calendar effects [3]. Presently, the development of statistical tools, artificial intelligence (AI), machine learning (ML), and evolutionary algorithms (EA) has resulted in the design of accurate and efficient STLF models [4], [5]. These technologies utilize intelligent and adaptive components, which call for recent techniques for precise generation and demand prediction in an optimal way.
The associate editor coordinating the review of this manuscript and approving it for publication was Ruisheng Diao.
VOLUME 9, 2021. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
STLF is a major problem for the proper functioning and dispatch of the power system, since it helps to avoid the severe consequences of power failure. It is required for the commercial operation of the power system and is the foundation of dispatching and of creating startup and shutdown plans, which play a major part in the automated control of the power system [6]. Precise STLF allows the user to select a proper energy utilization policy and reduces electricity expenses. It reduces the production cost and enhances the economic benefits, with the aim of energy saving and emission minimization. As power systems become more complex and the degree of electricity marketization increases, rapidly and precisely predicting the short-term load has become a hot research topic in the domain of energy load forecasting [7].
Several STLF approaches exist in the literature [8]. Earlier techniques include time series models, the Box-Jenkins model, exponential smoothing, state-space models, Kalman filtering, and regression. In addition, AI based models such as pattern recognition, expert systems, fuzzy expert systems, fuzzy time series, neural networks (NN), and fuzzy NN have been developed for STLF. In [9], it is discussed that predictive models have undergone an evolution driven by the rising complexity of the influencing factors. Therefore, the ever-increasing significance and complexity of STLF (particularly in the electricity market) require the design of accurate STLF models. This paper designs an intelligent machine learning with evolutionary algorithm based STLF model, called IMLEA-STLF, for power systems. The presented IMLEA-STLF model first applies wavelet transform (WT) to decompose the time series into components. In addition, the IMLEA-STLF model utilizes an oppositional artificial fish swarm optimization algorithm (OAFSA) based feature selection technique. The OAFSA technique is derived so that the convergence rate of the classical AFSA is increased by the oppositional based learning (OBL) concept. Moreover, the water wave optimization (WWO) with Elman neural network (ENN) model is employed for the predictive process, and the use of the WWO algorithm helps to significantly improve the predictive outcomes. Lastly, inverse WT is applied to obtain the hourly load forecasting data [34]. To examine the improved predictive results of the IMLEA-STLF technique, a comprehensive simulation analysis is performed on benchmark datasets. The key contributions of the study are listed below.
• An intelligent hybrid STLF model consisting of WT, feature selection, ENN, and parameter optimization is presented. To the best of our knowledge, the IMLEA-STLF model has not previously appeared in the literature.
• A novel OAFSA based feature selection technique is introduced by incorporating the concepts of OBL and AFSA. The inputs to the ENN model are generally chosen in a discretionary manner, whereas the OAFSA technique considers correlation and linear independence to select the input features.
• The parameter optimization of the ENN model using the WWO algorithm with cross-validation helps to boost the predictive outcome of the IMLEA-STLF model on unseen data.
The rest of the paper is organized as follows. Section II reviews the recent state of the art STLF models. Section III presents the IMLEA-STLF model, and Section IV explains the numerical outcomes. Finally, Section V highlights the key findings and possible future extensions.

II. PRIOR WORKS ON STLF MODELS
This section reviews the recent state of the art in STLF models for power systems. El-Hendawi and Wang [10] proposed a full wavelet neural network approach for STLF that incorporates the full wavelet packet transform and NNs. The decomposed features are given to the trained NN, and the output of the NN is taken as the predicted load. The presented method is employed for STLF in the electricity market of Ontario, Canada. Yin et al. [11] proposed a deep forest regression (DFR) model for STLF in power systems. This DFR model consists of 2 processes, namely cascade forest and multi-grained scanning. They are efficiently trained by 2 complete random forests (RF) with the default configuration. Later, the DFR is employed for the STLF of the power system. The forecasting efficiency of the DFR model is compared with various intelligent techniques and traditional regression methods using previous data of 7, 21, and 40 days. The designed model is composed of 2 models and provides an average level of precision in the power system prediction process.
Tayab et al. [12] designed a hybrid STLF model integrating the stationary wavelet packet transform and the Harris hawk optimization (HHO) algorithm with feed-forward neural networks (FFNN). The HHO algorithm is employed as an alternative training method for the FFNN to optimize the biases and weights of the neurons. The presented method is employed for predicting load demand in Queensland. In Niu and Dai [13], a new STLF model was proposed that utilizes a de-noising technique integrating Grey Relational Analysis (GRA) and Empirical Mode Decomposition (EMD). This model processes the actual load sequence and forecasts the processed subsequences using Modified Particle Swarm Optimization (PSO) and the least-squares Support Vector Machine (LSSVM). Later, the final predictive outcomes are attained by reconstructing the forecast sequence. The model performs the prediction using the empirical mode decomposition process, which is considerably complex.
In Tian and Hao [14], a new non-linear integrated forecasting method comprising 3 modules (pre-processing, evaluation, and forecasting) is established for STLF. In contrast with the simple data preprocessing of recent research, the enhanced data preprocessing in this scheme depends upon longitudinal data selection. It increases the efficiency of data preprocessing and, in turn, improves the final prediction efficiency. Moreover, a modified SVM is used to integrate different forecasters and attain the final predictions. Raza et al. [15] presented a new STLF method based on feed-forward artificial neural networks (ANNs) to forecast hourly load demand over many years. Here, a global best PSO (GPSO) method is employed as a novel training method to boost the ANN prediction performance. The major drawback of PSO algorithms is that they can fall into local optima in high dimensional spaces and converge slowly over multiple iterations.
Liang et al. [16] proposed a hybrid method that integrates the general regression neural network (GRNN), minimal redundancy maximal relevance (mRMR), and empirical mode decomposition (EMD) with the fruit fly optimization algorithm (FOA), called EMD-mRMR-FOA-GRNN. Initially, the original load sequence is decomposed into several intrinsic mode functions (IMFs) and a residue with distinct frequencies, to weaken the volatility of the sequence caused by complex features. Later, mRMR is utilized to attain the optimum feature set by analyzing the relation between each IMF and features such as temperature, day type, and meteorological conditions. At last, FOA is used to optimize the smoothing factor in GRNN. The fruit fly optimization algorithm has the major drawback of providing poor solutions when solving complex objective functions and non-linear optimization problems.
To simplify data processing and assist real-world application of STLF, [17] utilizes previous load data as features and considers the time sequence features of the load data concurrently. A multi-temporal spatial scale technique is employed to process the load, decreasing the noise error and enhancing the time sequence features. Later, a new STLF method called multi-temporal spatial scaling based temporal convolutional network is employed to forecast the load. The presented method can learn the non-linear features and the time sequence features of load data concurrently. In Zainab et al. [18], several smart meter energy datasets are examined for performing STLF. The work uses multi-processing to reduce the overall runtime of the forecasting modules by assigning a concurrent task to every available processor. It evaluates the efficiency of the presented technique in terms of the selected machine learning (ML) methods, scalability, and runtime. Munkhammar et al. [19] exploit the Markov chain mixture (MCM) technique for STLF of residential electricity consumption. This method is utilized for forecasting one step ahead, at half-hour resolution, on residential electricity consumption data from Australia. The outcomes are compared with Persistence Ensemble (PeEn) and Quantile Regression (QR), which serve as benchmark models. Massaoudi et al. [20] proposed an efficient computing architecture for STLF. The presented method handles the random variation of the load demand by utilizing a stacked generalization technique. The proposed ML based STLF model solves the complex optimization problem and provides an optimal solution for the given objective functions.

III. THE PROPOSED STLF MODEL
The proposed STLF model involves a set of different processes, as illustrated in Fig. 1. First, the load time series data is decomposed using WT and, in parallel, every sub-series is forecasted using a WWO-ENN model. A major problem in the design of STLF models [30], [36] is the appropriate choice of input parameters. In the case of STLF, the collection of candidate input parameters comprises lagged values of the load at various intervals (the autoregressive part) and exogenous parameters such as weather related variables (e.g., humidity, rainfall, temperature, and wind speed), time indicators (e.g., hourly and daily indicators), cost data in the electricity market, and specific expert details such as load patterns. The presented model uses the OAFSA based feature selection technique to identify the optimal candidates and thereby prevent overfitting. The cross-validation (CV) technique is applied, in which the training process is guided by the validation error rather than the training error. Next, the WWO optimized ENN model is applied for the prediction process. Lastly, inverse WT is applied to obtain the hourly load forecasting [35] data. The detailed processes involved in every component are given in the subsequent sections. The major advantage of the proposed IMLEA-STLF is its high level of precision in estimating the energy requirements for the upcoming weeks, which is made possible by the incorporation of ML algorithms.
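The autoregressive part of the candidate input set described above can be pictured with a small sketch. The function name `lagged_candidates` and the lag values are hypothetical choices for illustration; the paper does not state which lags form the candidate set, and exogenous variables would be appended as further columns.

```python
def lagged_candidates(load, lags):
    """Build candidate inputs from lagged load values (autoregressive part).

    Row t holds load[t - lag] for every lag; the target is load[t].
    The lags themselves are an assumption of this sketch.
    """
    max_lag = max(lags)
    rows, targets = [], []
    for t in range(max_lag, len(load)):
        rows.append([load[t - lag] for lag in lags])
        targets.append(load[t])
    return rows, targets


# Toy hourly series; a real study would use measured load data.
load = [float(v) for v in range(10)]
rows, targets = lagged_candidates(load, lags=[1, 2, 4])
```

A feature selection method such as OAFSA would then switch individual columns of `rows` on or off through a binary mask.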

A. WT BASED DECOMPOSITION
Electric load series comprise numerous non-stationary features such as trend, changes in level and slope, and seasonality. These are essential parts of the load signal that need to be considered when dealing with non-stationarity. The multi-resolution analysis by WT separates the load series into one low frequency sub-series and a few high frequency sub-series in the wavelet domain. These sub-series generally exhibit more regular behavior than the actual load series, and thus they can be forecasted more precisely [21]. Fig. 2 illustrates the process of WT decomposition. The WT is mainly divided into two classes, namely the continuous wavelet transform (CWT) and the discrete wavelet transform (DWT). The CWT $W(a,b)$ of a signal $f(x)$ based on the wavelet $\psi(x)$ can be represented as
$$W(a,b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{+\infty} f(x)\, \psi\!\left(\frac{x-b}{a}\right) dx$$
where the scaling variable $a$ controls the spread of the wavelet and the translation factor $b$ determines its central position. $\psi(x)$ is otherwise known as the mother wavelet. The CWT is represented as ''A'' components while the DWT is represented as ''D'' components in Fig. 2.
As the CWT is accomplished by continuously scaling and translating the mother wavelet, considerable redundant detail is produced. So, the mother wavelet is instead scaled and translated using particular scales and positions, generally based on powers of 2. This scheme is as effective as the CWT and is called the DWT, as given below:
$$W(m,n) = 2^{-m/2} \sum_{t=0}^{T-1} f(t)\, \psi\!\left(\frac{t - n \cdot 2^{m}}{2^{m}}\right)$$
where $T$ denotes the signal length, and the scaling and translation variables are functions of the integers $m$ and $n$ ($a = 2^{m}$, $b = n \cdot 2^{m}$). Here, a fast DWT model using filters is employed. The wavelet decomposition has the advantage of localizing samples simultaneously in the frequency and time domains. It also enables rapid decomposition, leading to faster computation.
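To make the filter-based decomposition and the later inverse WT concrete, the following is a minimal sketch using the Haar wavelet. The choice of Haar, the two decomposition levels, and all function names are assumptions made for illustration; the paper does not state which mother wavelet or depth is used.

```python
import math


def haar_step(signal):
    """One DWT level with Haar filters: returns (approximation, detail)."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse_step(approx, detail):
    """Invert one Haar level, interleaving the recovered sample pairs."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


def decompose(signal, levels):
    """Return [A_levels, D_levels, ..., D_1] (low frequency part first)."""
    details = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return [approx] + details[::-1]


def reconstruct(coeffs):
    """Inverse transform: rebuild the series from [A_n, D_n, ..., D_1]."""
    approx = coeffs[0]
    for detail in coeffs[1:]:
        approx = haar_inverse_step(approx, detail)
    return approx


# Toy load values (illustrative only).
load = [310.0, 295.0, 280.0, 300.0, 350.0, 420.0, 460.0, 440.0]
coeffs = decompose(load, levels=2)
restored = reconstruct(coeffs)
```

In the pipeline described here, each sub-series in `coeffs` would be forecasted separately, and `reconstruct` would play the role of the inverse WT that combines the forecasts into the hourly load.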

B. DATA PREPROCESSING
During data preprocessing, the input data is passed directly to the data cleaning process, and missing values are filled with the average of the earlier electricity data. At the end of preprocessing, the quality of the data is improved to a certain extent.
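A minimal sketch of this cleaning rule is shown below. It assumes that missing readings are marked as `None` and that "earlier data" means all preceding values in the series, including ones already filled; both are assumptions of this example, not details given in the paper.

```python
def fill_missing(readings):
    """Replace None entries with the mean of all earlier (cleaned) values."""
    cleaned = []
    for value in readings:
        if value is None:
            if not cleaned:
                raise ValueError("no earlier data available to average")
            value = sum(cleaned) / len(cleaned)
        cleaned.append(value)
    return cleaned


hourly_load = [100.0, 110.0, None, 120.0, None]
filled = fill_missing(hourly_load)
```

A windowed average (for example, the same hour on previous days) would be a natural variant if the seasonal pattern should be preserved.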

C. FEATURE SELECTION USING OAFSA TECHNIQUE
After data preprocessing, the feature selection process takes place using the OAFSA technique to increase the predictive performance. The OAFSA technique is derived from the concepts of OBL and AFSA. AFSA is an EA based optimization technique that simulates the behaviors of fish, such as preying, swarming, and following, together with the local search of individual fish, to reach the global optimum [22]. It is a stochastic and parallel search technique with characteristics such as flexibility, fault tolerance, and insensitivity to initial values. Because of these features, AFSA can be employed to solve the feature selection problem. Consider a swarm of $n$ artificial fish (AF) that move in a $D$-dimensional search space. The swarm can be denoted by $\{X_1, X_2, \dots, X_n\}$, and each AF is represented by a binary vector:
$$X = (x_1, x_2, \dots, x_D), \quad x_d \in \{0, 1\}$$
where $X$ is the present state of the AF and $D$ indicates the feature count, with bit values of 0 and 1 representing unselected and selected features, respectively. Let $Y$ denote the food concentration, i.e., the objective function value, and let the visual scope of the AF be given by its visual distance. The behaviors involved in the AFSA are discussed below.
Following Behavior. If the existing state of the AF is $X_i$, it evaluates the food concentration of every neighboring partner. Next, it determines the state $X_j$ in the existing neighborhood that has the maximum food concentration $Y_j$. Let $n_f$ be the number of neighboring fish in the present area and $n$ the total AF count. When $Y_i < Y_j$ and $n_f / n < \delta$, the state $X_j$ has more food and is not crowded, so the AF moves a step in the direction of $X_j$. Else, it carries out the swarming behavior.
Swarming Behavior. If the existing state of the AF is $X_i$, it assembles into a group while moving. Consider $X_c$ as the central position within the visual scope. When $Y_i < Y_c$ and $n_f / n < \delta$, the AF moves a step toward $X_c$; otherwise, it carries out the preying behavior.
Preying Behavior. If the existing state of the AF is $X_i$, it selects a state $X_j$ at random within its visual scope. When $Y_i < Y_j$, it goes forward a step. Else, it randomly chooses another state $X_j$ within its visual distance and checks whether the forward criterion is fulfilled. If the AF chooses to move forward a step, the mutation operator of the genetic algorithm (GA) is employed, and the mutation of the position is used to create the trial points. When the AF moves a step from the state $X_i$ toward the state $X_j$, the number of differing bits $n_b$ is determined. When $n_b > S_m$, then $S_m = 3$; else, $S_m = n_b$. A number $n_r$ denoting the mutation count is generated at random, where $n_r$ lies between 1 and $S_m$. Then, $n_r$ indices of the mutation positions are chosen, and the bits at the chosen positions are flipped from 0 to 1 and vice versa.
Random Behavior. When none of the other behaviors is executed, the AF performs the random behavior, which is a random move in search of a better position. It is equivalent to the preying behavior; however, the mutation position can be any position of the state $X_i$.
Fig. 3 demonstrates the flowchart of AFSA. In OAFSA, the OBL concept is integrated into AFSA to enhance the quality of the initial population by diversifying the solutions.
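The bit-mutation move used in the preying behavior can be sketched as follows. This is one possible reading of the description rather than the authors' implementation: it assumes that the mutated positions are drawn from the bits in which $X_i$ and $X_j$ differ, and that the limit $S_m$ is capped at 3. The function name `mutate_towards` and the parameter `max_mutations` are hypothetical.

```python
import random


def mutate_towards(x_i, x_j, rng, max_mutations=3):
    """Move binary state x_i one step toward x_j by flipping differing bits.

    Assumption of this sketch: S_m = min(n_b, max_mutations).
    """
    differing = [d for d in range(len(x_i)) if x_i[d] != x_j[d]]
    n_b = len(differing)                 # number of dissimilar bits
    if n_b == 0:
        return list(x_i)                 # already identical, nothing to flip
    s_m = min(n_b, max_mutations)        # upper limit on the mutation count
    n_r = rng.randint(1, s_m)            # mutation count, 1 <= n_r <= S_m
    trial = list(x_i)
    for d in rng.sample(differing, n_r):
        trial[d] = 1 - trial[d]          # flip 0 -> 1 or 1 -> 0
    return trial


rng = random.Random(0)
x_i = [1, 0, 0, 1, 0, 1]   # current feature mask
x_j = [0, 1, 0, 1, 1, 0]   # better state found in the visual scope
trial = mutate_towards(x_i, x_j, rng)
```

For the random behavior, the same flip loop would apply with positions sampled from all $D$ indices instead of only the differing ones.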
The OBL mechanism operates by searching in both directions of the search space, covering the original solution and its opposite solution. In the end, the OBL model retains the fittest of the available solutions. Opposite number: let $x$ be a real number in the range $x \in [lb, ub]$. The opposite number of $x$ is denoted by $\bar{x}$ and is computed using Eq. (4):
$$\bar{x} = lb + ub - x \quad (4)$$
The above equation can be generalized for use in a search space with multiple dimensions [23].
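For generalization, the location of a search agent and the corresponding opposite point can be defined as follows:
$$x = [x_1, x_2, \dots, x_D] \quad (5)$$
$$\bar{x} = [\bar{x}_1, \bar{x}_2, \dots, \bar{x}_D] \quad (6)$$
The value of every element in $\bar{x}$ can be computed by Eq. (7):
$$\bar{x}_j = lb_j + ub_j - x_j, \quad j = 1, 2, \dots, D \quad (7)$$
Here, the fitness function (FF) is $f(\cdot)$. So, when the fitness value $f(\bar{x})$ of the opposite solution is superior to $f(x)$ of the actual solution $x$, then $x = \bar{x}$; else, $x$ is retained. The processes involved in the OAFSA technique are summarized as follows.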
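As a small illustration of the opposition step, the sketch below computes the opposite point element by element and keeps whichever of the two candidates has the better fitness. It assumes that a higher fitness value is better; the toy fitness function and the target mask are hypothetical and only serve to make the example runnable.

```python
def opposite(x, lb, ub):
    """Opposite point: each element becomes lb_j + ub_j - x_j."""
    return [l + u - v for v, l, u in zip(x, lb, ub)]


def obl_select(x, lb, ub, fitness):
    """Return the fitter of x and its opposite (higher fitness assumed better)."""
    x_bar = opposite(x, lb, ub)
    return x_bar if fitness(x_bar) > fitness(x) else x


# For binary feature masks, lb = 0 and ub = 1, so the opposite is the complement.
D = 4
lb, ub = [0] * D, [1] * D
target = [0, 0, 1, 0]  # hypothetical "ideal" mask for the toy fitness


def toy_fitness(mask):
    """Negative count of mismatches against the target mask."""
    return -sum(1 for m, t in zip(mask, target) if m != t)


x = [1, 1, 0, 0]
chosen = obl_select(x, lb, ub, toy_fitness)
```

Applying `obl_select` to every member of the initial swarm gives an opposition-based initial population of the same size.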

D. LOAD PREDICTION USING WWO-ENN MODEL
At the load prediction stage, the ENN model optimized by the WWO algorithm is employed to predict the load in power systems. The ENN is a dynamic recurrent network. Compared to the classical models, the ENN model includes a specific layer called the context layer, which gives the network the capability of learning time varying patterns. Therefore, it is highly appropriate for discrete time series problems. Without the context layer, the ENN looks similar to the classical multilayer network. The context layer is generated from the outcomes of the hidden layer [24]. Next, the outcome of the context layer is given as input to the hidden layer together with the subsequent set of external input layer data. In this way, the details of the earlier time step are saved and reused. The network has an $n$-dimensional external input layer, and the external input vector is defined by $x_1(t) = [x_{1,1}(t), x_{1,2}(t), \dots, x_{1,n}(t)]^T$, where $t$ denotes the $t$th input series. For simplicity, the final layer also comprises $n$ neurons, and the outcome vector of this layer is defined by $y(t) = [y_1(t), y_2(t), \dots, y_n(t)]^T$. The neurons of the hidden and context layers correspond one-to-one. Therefore, the neuron count in the context layer is $m$, which equals the neuron count in the hidden layer. The input of the hidden layer from the context layer is denoted by $[c_1(t-1), \dots, c_m(t-1)]^T$. The entire input vector of the network is represented by
$$x(t) = [x_{1,1}(t), \dots, x_{1,n}(t), c_1(t-1), \dots, c_m(t-1)]^T \in \mathbb{R}^{k}$$
where $k = m + n$. The weight matrices between the 3 layers are defined by $W_{hi}(t)$, $W_{hc}(t)$, and $W_{oh}(t)$, respectively. It is important to recognize the matrix sizes. By analyzing the dimensions of all layers, $W_{hi}(t) \in \mathbb{R}^{m \times n}$, $W_{hc}(t) \in \mathbb{R}^{m \times m}$, and $W_{oh}(t) \in \mathbb{R}^{n \times m}$ are obtained.
Here, $y(t)$ is the real outcome of the network and $d(t)$ is the desired output vector. When the activation function $f(\cdot)$ is selected as the sigmoid, $y(t)$ is determined as follows:
$$y(t) = f\big(W_{oh}(t)\, h(t)\big)$$
where $h(t)$ is the outcome of the hidden layer. The input of the hidden layer includes two portions, namely external and context inputs; so, $W_h(t) = [W_{hi}(t) \;\; W_{hc}(t)] \in \mathbb{R}^{m \times k}$. Using the entire input vector $x(t)$ and the sigmoid activation function, the outcome of the hidden layer is defined by
$$h(t) = f\big(W_h(t)\, x(t)\big)$$
The intention of the network is the minimization of the error as given below.
$$E(t) = \frac{1}{2}\, [d(t) - y(t)]^T [d(t) - y(t)]$$
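The forward pass of an Elman network with the layer sizes described above can be sketched in a few lines. This is an illustrative sketch of the standard Elman formulation, not the authors' code: biases are omitted, the output layer also uses the sigmoid, and the weight values are arbitrary. Multiplying $W_{hi}$ by the external input and $W_{hc}$ by the context and summing is equivalent to applying the stacked matrix $[W_{hi} \; W_{hc}]$ to the entire input vector.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def elman_step(x1, context, W_hi, W_hc, W_oh):
    """One time step: returns (output y, hidden state used as next context)."""
    pre = [a + b for a, b in zip(matvec(W_hi, x1), matvec(W_hc, context))]
    hidden = [sigmoid(z) for z in pre]               # m hidden outputs
    y = [sigmoid(z) for z in matvec(W_oh, hidden)]   # n network outputs
    return y, hidden


# n = 2 external inputs and outputs, m = 3 hidden/context neurons.
W_hi = [[0.1, -0.2], [0.0, 0.3], [0.2, 0.1]]                  # m x n
W_hc = [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 0.5]]    # m x m
W_oh = [[0.3, -0.1, 0.2], [0.1, 0.1, 0.1]]                    # n x m

x = [1.0, 0.5]
context = [0.0, 0.0, 0.0]                 # empty memory at the first step
y1, context = elman_step(x, context, W_hi, W_hc, W_oh)
y2, context = elman_step(x, context, W_hi, W_hc, W_oh)
```

Feeding the same input twice yields different outputs `y1` and `y2`, because the second step also sees the stored hidden state; this is the memory effect provided by the context layer.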
To reduce the error, every weight matrix is updated using Eqs. (14)-(17), where $\mu$ is the learning rate. To determine the learning rate of the ENN model, the WWO algorithm is applied, and thereby the predictive results are further improved.
The WWO method is inspired by the shallow water wave (WW) theory [25]. Without loss of generality, consider a maximization problem $F$ with objective function $f$; the practical problem $F$ is mapped onto the shallow WW model. At population initialization, the height $h$ of every wave is set to a constant $h_{max}$, and the wavelength $\lambda$ is commonly set to 0.5. The fitness value of every WW is inversely related to its vertical distance from the seabed: the closer a wave is to the seabed, the greater its fitness value, the larger its height $h$, and the shorter its wavelength. During the optimization procedure, the propagation, refraction, and breaking of WWs take place.
During the propagation stage, each WW is propagated once in every round. Let $x$ represent the actual WW, $x'$ the novel wave generated by the propagation operator, and $D$ the dimension of the maximization problem $F$. The propagation shifts every dimension of the actual WW $x$ as follows:
$$x'(d) = x(d) + rand(-1, 1) \cdot \lambda \cdot L(d)$$
where $d = 1, \dots, D$, $rand(-1, 1)$ is a uniform random number used for controlling the propagation step, and $L(d)$ indicates the length of the $d$th dimension of the search space. When the new position falls outside the feasible range of the $d$th dimension, it is reset to a random position as
$$x'(d) = lb(d) + rand() \cdot \big(ub(d) - lb(d)\big)$$
where $lb(d)$ and $ub(d)$ indicate the minimum and maximum bounds of the $d$th dimension of the search space and $rand()$ represents a random number between zero and one. After propagation, the fitness of $x'$ is calculated; if $f(x') > f(x)$, $x'$ replaces $x$ in the population and the wave height of $x'$ is reset to $h_{max}$; otherwise, $x$ is kept and, to model the energy dissipation of the wave during propagation, its height is decreased by 1. WWO updates the wavelength of every wave after every generation as given by:
$$\lambda = \lambda \cdot \alpha^{-\left(f(x) - f_{min} + \varepsilon\right) / \left(f_{max} - f_{min} + \varepsilon\right)}$$
where $\alpha$ denotes the control variable called the wavelength reduction coefficient, $f_{max}$ and $f_{min}$ represent the highest and lowest fitness values among the present population, respectively, and $\varepsilon$ indicates a small positive constant for avoiding division by 0. In the breaking process, the energy of a WW continually increases, the crest becomes increasingly steep, and the wave breaks into a sequence of solitary waves when the crest velocity exceeds the wave celerity. After propagation, WWO executes breaking on a wave $x$ that becomes the new optimum solution $x^*$, which improves population diversity. Fig. 4 showcases the flowchart of the WWO technique. The complete operation is given as follows.
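The propagation operator, the acceptance rule, and the wavelength update can be sketched as below, following the standard WWO formulation for a maximization problem. The value of `alpha`, the toy objective, and the function names are illustrative assumptions; the paper does not list the settings it used.

```python
import random


def propagate(x, wavelength, lb, ub, rng):
    """Propagation: shift every dimension by rand(-1, 1) * wavelength * L(d)."""
    new_x = []
    for d in range(len(x)):
        length = ub[d] - lb[d]  # L(d), length of the d-th search dimension
        value = x[d] + rng.uniform(-1.0, 1.0) * wavelength * length
        if value < lb[d] or value > ub[d]:
            # Outside the feasible range: reset to a random position in bounds.
            value = lb[d] + rng.random() * length
        new_x.append(value)
    return new_x


def accept(x, height, new_x, fitness, h_max):
    """A fitter wave replaces x with full height; otherwise x loses 1 height."""
    if fitness(new_x) > fitness(x):
        return new_x, h_max
    return x, height - 1


def update_wavelength(wavelength, f_x, f_min, f_max, alpha=1.01, eps=1e-12):
    """Fitter waves get shorter wavelengths and so search smaller regions."""
    exponent = -(f_x - f_min + eps) / (f_max - f_min + eps)
    return wavelength * alpha ** exponent


def toy_fitness(v):
    """Hypothetical objective with its maximum at 0.5 in every dimension."""
    return -sum((vi - 0.5) ** 2 for vi in v)


rng = random.Random(7)
lb, ub = [0.0, 0.0], [1.0, 10.0]
candidate = propagate([0.2, 9.0], 0.5, lb, ub, rng)
wave, height = accept([0.2, 9.0], 6, candidate, toy_fitness, h_max=6)
best_wl = update_wavelength(0.5, f_x=10.0, f_min=2.0, f_max=10.0)
worst_wl = update_wavelength(0.5, f_x=2.0, f_min=2.0, f_max=10.0)
```

When WWO tunes the ENN learning rate, the search space has a single dimension and the fitness would be derived from the cross-validation error of the trained network.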
Initially, $k$ dimensions are chosen at random (where $k$ denotes a random number between one and a predetermined number $k_{max}$), and each chosen dimension $d$ of the actual wave $x$ is used to create a solitary wave $x'$ as given by:
$$x'(d) = x(d) + N(0, 1) \cdot \beta \cdot L(d)$$
where $N(0, 1)$ represents a Gaussian random number and $\beta$ is the breaking coefficient. The refraction process is executed on a wave whose height has reduced to 0 in order to avoid search stagnation; it follows the phenomenon in which the wave ray is not perpendicular to the isobaths. In refraction, every dimension of the novel wave $x'$ is drawn from a Gaussian distribution centered halfway between the actual location and $x^*$, as given by:
$$x'(d) = N\!\left(\frac{x^*(d) + x(d)}{2},\; \frac{|x^*(d) - x(d)|}{2}\right)$$
Following refraction, the wave height of $x'$ is reset to $h_{max}$; simultaneously, its wavelength is updated by:
$$\lambda' = \lambda \cdot \frac{f(x)}{f(x')}$$
The propagation operator makes high fitness waves exploit smaller regions and low fitness waves explore larger regions, the breaking operator intensifies the local search around promising optimum waves, and the refraction process helps to avoid search stagnation and therefore reduces premature convergence. The proposed WWO integrated with ENN assists in achieving accurate prediction of the power requirement, and the experimental validation that follows shows that the predicted and actual data are close to each other with a minimum standard deviation. The high level of accuracy is achieved by the ENN, whose learning parameter is chosen by the WWO algorithm.
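The breaking and refraction operators can be sketched in the same style. This follows the standard WWO formulation and is not taken from the authors' code; in particular, the breaking coefficient `beta` and the standard deviation of the refraction Gaussian (half the distance to the best wave) are assumptions that the text above does not spell out, and solitary waves are not clipped to the bounds here.

```python
import random


def break_wave(x_best, lb, ub, beta, k, rng):
    """Breaking: create k solitary waves, each perturbing one random dimension."""
    dims = rng.sample(range(len(x_best)), k)
    waves = []
    for d in dims:
        solitary = list(x_best)
        # Gaussian step scaled by beta and by the length of dimension d.
        solitary[d] = x_best[d] + rng.gauss(0.0, 1.0) * beta * (ub[d] - lb[d])
        waves.append(solitary)
    return waves


def refract(x, x_best, rng):
    """Refraction: sample each dimension around the midpoint of x and x_best."""
    return [rng.gauss((b + v) / 2.0, abs(b - v) / 2.0) for v, b in zip(x, x_best)]


def refracted_wavelength(wavelength, f_x, f_new):
    """Wavelength after refraction; assumes positive fitness values."""
    return wavelength * f_x / f_new


rng = random.Random(3)
lb, ub = [0.0, 0.0, 0.0], [1.0, 10.0, 5.0]
x_best = [0.4, 6.0, 2.5]
solitary_waves = break_wave(x_best, lb, ub, beta=0.1, k=2, rng=rng)
unchanged = refract([0.4, 6.0, 2.5], x_best, rng)   # zero spread at x_best
new_wl = refracted_wavelength(0.5, f_x=2.0, f_new=4.0)
```

The fittest solitary wave would replace the current best only if it improves on it, which keeps breaking a purely local refinement.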

IV. EXPERIMENTAL VALIDATION

A. IMPLEMENTATION DATA
This section validates the performance of the IMLEA-STLF method on two benchmark datasets [26], [27]. The first, the UK Smart Meter dataset, holds several features such as household_id, plan used (static/dynamic), date, time, meter reading, and acorn group. The second benchmark dataset comprises hourly load and temperature data from a North American electric utility for the period from January 1, 1988, to October 12, 1992. To assess the efficacy of the IMLEA-STLF method, a sequence of simulations was performed, and a brief comparative analysis of the results is given.

B. RESULTS ANALYSIS
An investigation of the feature selection results of the OAFSA against other techniques, namely the cuckoo search algorithm (CSA) [31], social spider optimization (SSO) [32], and whale optimization algorithm (WOA) [33], is given in Table 1. From the table, it is evident that the WOA model has showcased the worst feature selection outcomes, attaining best costs of 0.0398 and 0.0354 on datasets 1 and 2, respectively. Next, the SSO algorithm has shown slightly better results than the WOA, with best costs of 0.0371 and 0.0296 on datasets 1 and 2, respectively. Along the same lines, the CSA model has attained reasonable best costs of 0.0113 and 0.0261, respectively. However, the OAFSA technique has demonstrated superior results, with the lowest best costs of 0.0062 and 0.0165 on datasets 1 and 2, respectively. Table 2 and Fig. 5 examine the MAPE of the proposed IMLEA-STLF model against the existing techniques on dataset 1 [28], [29]. It is apparent that the ARIMA
In Table 2, KNN is the K-Nearest Neighbor algorithm, which forms a prediction from the most similar (nearest) samples in the training data. RPART is Recursive Partitioning, which builds decision trees to determine the electrical features from the datasets. The Random Forest (RF) model is an ML algorithm based on an ensemble of decision trees that typically yields good results with little hyperparameter tuning and involves a simple process. NNET is the neural network based energy forecasting algorithm, which learns from the processed data and is evaluated by the accuracy of the forecasting model. Finally, Support Vector Regression (SVR) is an ML algorithm that can be employed for forecasting and for determining the solutions of the objective functions.
The aforementioned algorithms are considered as benchmark algorithms, and the performance of the proposed model is compared with them to demonstrate its superiority. The predictive results obtained by the presented IMLEA-STLF model and the other methods on the winter and summer month data are provided in Table 3 and Fig. 6. The obtained results show that the ISO algorithm has exhibited the weakest performance among the methods, with MAPE values of 2.86% and 3.55% on the winter and summer month data, respectively. Besides, the single SVM model has shown a slightly better outcome, with MAPE values of 2.38% and 3.03% on the winter and summer month data, respectively. A detailed MAPE analysis of the IMLEA-STLF model on the load forecasting of the December and July periods against Neural Networks for Electricity Prices Forecasting (NN-EPF) and Wavelet Transform and Neuro-Evolutionary Algorithm (WT-NEA) is given in Table 4 and Fig. 7. An average MAPE analysis of the IMLEA-STLF model against other existing methods on the last two years' data is carried out in Table 5 and Fig. 8. After examining the obtained tables and figures, it is evident that the IMLEA-STLF model can be employed as an appropriate tool to predict the load in power systems. The enhanced predictive performance of the IMLEA-STLF model is due to the inclusion of OAFSA based feature selection, OBL based population initialization, and WWO based learning rate tuning of the ENN model.
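Since the comparisons above are reported as MAPE, a short sketch of the usual definition (mean absolute percentage error) may be helpful. It assumes that the paper uses the standard formula, which is not written out in the text; the load values below are made-up numbers for illustration, not results from the paper.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Illustrative values only.
actual_load = [100.0, 200.0, 400.0]
forecast_load = [98.0, 206.0, 400.0]
error = mape(actual_load, forecast_load)
```

Because each term is divided by the actual load, the measure is undefined when an actual value is zero, which is rarely an issue for aggregate load series.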

V. CONCLUSION
This paper has developed a novel IMLEA-STLF model to forecast the load for power systems. The presented IMLEA-STLF model involves different stages of operation, namely WT based data decomposition, data preprocessing, OAFSA based feature selection, ENN based prediction, and WWO based parameter tuning. The OAFSA technique selects the optimal features from the candidate set of input parameters, which comprises lagged values of the load at distinct intervals as the autoregressive part and exogenous parameters. Besides, the use of the WWO algorithm in the design of the ENN model helps to significantly improve the predictive outcomes. To assess the improved predictive results of the IMLEA-STLF model, a comprehensive simulation analysis is performed on benchmark datasets. The results show that the IMLEA-STLF model outperforms the other compared methods. The proposed IMLEA-STLF model predicts the requirement of the power system for the upcoming days and weeks more precisely than the existing methodologies. The high precision results are achieved by performing the multi-level decomposition, which is absent in the existing methodologies. Therefore, the IMLEA-STLF model can be employed as an appropriate tool for load forecasting in power systems. In future, the predictive outcomes of the IMLEA-STLF model could be improved by deep learning and hyperparameter optimization techniques.
ZAINAL SALAM (Senior Member, IEEE) received the B.Sc. degree in electronics engineering from California State University, Chico, CA, USA, in 1985, the M.E.E. degree in electrical engineering from Universiti Teknologi Malaysia (UTM), Kuala Lumpur, in 1989, and the Ph.D. degree in power electronics from the University of Birmingham, U.K., in 1997. He is currently a Professor in power electronics and renewable energy with the Centre of Electrical Energy Systems, School of Electrical Engineering, Universiti Teknologi Malaysia, Johor Bahru, Malaysia. He has been involved in over 30 projects and consulting works on power converters, solar energy, and integration of power electronics in renewable energy systems. He has authored or coauthored over 250 articles in various technical journals and conference proceedings. He also represented the country as an Expert for the International Energy Agency (IEA) PV Power Systems Task 13 Working Group, which focuses on the reliability and performance of PV power systems. He was the Vice-Chair of the IEEE Power Electronics Society, the IEEE Industrial Electronics Society, and the Industry Application Joint Chapter, and the Malaysia Section, from 2011 to 2013. From 2011 to 2013, he served as an Editor of IEEE TRANSACTIONS ON SUSTAINABLE ENERGY.
MD. PAUZI BIN ABDULLAH (Senior Member, IEEE) received the B.Eng. degree in electrical and electronic engineering from Universiti Tenaga Nasional (UNITEN), Malaysia, in 2002, and the M.Sc. degree in electrical power engineering and the Ph.D. degree from the University of Strathclyde, Glasgow, U.K., in 2003 and 2008, respectively. He is currently an Associate Professor with the School of Electrical Engineering, Faculty of Engineering, Universiti Teknologi Malaysia (UTM). He is also the Director with the Centre of Electrical Energy Systems (CEES), Institute of Future Energy (IFE), UTM. His research interests include power systems analysis, systems security, deregulated electricity market, and demand-side management.