Adaptive Protection Scheme for FREEDM Microgrid based on Convolutional Neural Network and Gorilla Troops Optimization Technique

Microgrids (MGs) suffer from unpredictable faults on the feeders for various random reasons. These faults can undermine the stability of MG operation and damage the components. Moreover, many uncertain elements affect the MG’s response to faults, such as fault types, locations, and resistances, MG operation modes, DG penetration levels, load variations, and system topologies. Therefore, fault detection, classification, and location are vital for MGs, as they enable rapid restoration and protect the components. This paper proposes an adaptive protection (AP) scheme for the future renewable electric energy delivery and management (FREEDM) system. The proposed scheme is based on the convolutional neural network (CNN), in which the currents and voltages measured at the buses are processed as multidimensional arrays for image identification and classification. The gorilla troops optimization (GTO) technique is used to improve the CNN by finding its optimal architecture and hyperparameters. The proposed protection scheme detects the system fault, classifies the fault type, and determines the fault location using three CNN-GTO models. A communication channel is established to transfer the data, information, and tripping signals between the different devices in the FREEDM system. The proposed method is tested on a hypothetical FREEDM microgrid system under different fault conditions. The results show that the proposed CNN-GTO models can detect, classify, and locate feeder faults in the FREEDM system with high accuracy. A comparison with existing schemes such as the support vector machine, fuzzy logic, conventional CNN, and wavelet-based CNN is performed. The optimized CNN-GTO models achieve overall accuracies for fault detection, classification, and location of 99.37%, 99%, and 98.2%, respectively.


I. INTRODUCTION
Nowadays, the development of information and communication technology enables power systems to form smart grids. The main smart grid characteristics are self-healing, resilience against physical attacks and cyberattacks, customer accessibility, optimal utilization of the system appliances, and dependability and security of the distributed power. However, the bidirectional transfer of power and information makes the control, operation, and protection of the smart grid a great challenge [1]. The microgrid (MG) is an active system that consists of renewable energy resources (RERs), battery energy storage devices (BESDs), and load demand at the distribution system voltage. The integration of the RERs in the distribution system improves the system performance, supports the primary generation, avoids power disruptions, and enables fast system recovery. Also, the reactive and active power injected into the utility grid through the RERs can enhance grid resiliency, minimize losses, improve power quality by reducing voltage sag, and maximize system reliability and security using an environmentally friendly source. However, the penetration level, type, and location of the RERs may affect microgrid protection because of the bidirectional fault current [2].
The future renewable electric energy delivery and management (FREEDM) system is one of the essential models in the smart distribution network. It contains RERs, solid-state transformers (SSTs), and loads, where the energy management is done locally and individually. This is accomplished by monitoring all the system devices and RERs in the local area; the control signals are then delivered to these devices and RERs. In the FREEDM system, a communication channel is established to transfer data and information among the connected RERs, the loads, and the control center [3]. The FREEDM system has many advantages, including power flow control, reduced size and weight compared with traditional transformers, and plug-and-play capability for the RERs and loads. However, the integration of the RERs in the FREEDM system makes system protection and control a great challenge [4].
Various protection methods can be applied in the FREEDM system. These methods include current-based, voltage-based, impedance-based, traveling wave, time-frequency transform, harmonic content, and adaptive protection methods. The adaptive protection for electrical grids can be classified into conventional methods and artificial intelligence-based methods. The conventional methods include adaptive overcurrent, adaptive directional overcurrent, adaptive differential, adaptive protection based on symmetrical components, adaptive centralized protection, and decentralized adaptive schemes. Most of these methods adapt the relay settings online according to the system operating conditions. The artificial intelligence-based methods include metaheuristic optimization-based methods, fuzzy logic-based methods, multi-agent system-based methods, and artificial neural network-based methods [5][6][7].
In the literature, the coordination of overcurrent relays in an adaptive protection (AP) system was optimized using the linear programming (LP) method and particle swarm optimization (PSO) [8]. The method was tested on the real-time digital simulator (RTDS) and evaluated for different distributed generation (DG) penetrations and locations. Ant colony optimization (ACO) was used to carry out an AP scheme for obtaining the coordination of the overcurrent relays (OCRs) [9]. The ACO-based AP scheme was evaluated by comparing its performance with the genetic algorithm (GA), and the ACO performed better in selectivity, sensitivity, and operating time. The differential evolution (DE) optimization method was used to obtain optimal adaptive relay coordination for different network topologies [10]. It adjusted the coordination settings of the OCRs and zone-2 of the distance relays based on changes in the network topology. Furthermore, the PSO was used in the AP scheme of [11] to react to network topological changes. A modified version of PSO was proposed to solve the OCR coordination problem for modern distribution systems. Moreover, the DE method was used in [12] to mitigate the effects of DGs on the coordination of the directional OCRs using the AP scheme. It performed better than other optimization algorithms such as GA, harmony search (HS), and PSO in solving the OCR coordination problem. An AP scheme was presented to address the protection challenges of distribution systems with connected DGs, such as centralized control, multifunction protection, and optimal protection settings [13]. It was based on finding the optimal relay settings and then performing an online self-adjustment of these settings using two solvers: Baron and Ipopt. The simulation indicated that the method performed efficiently. An AP system for MGs using digital directional OCRs and phasor measurement units (PMUs) was presented in [14]. The current values were measured by the PMUs and used to determine the changes in DG penetration, system topology, and MG operation mode (grid-connected/isolated). The coordination settings of the OCRs were then updated according to the estimated changes. The method was applied to a seven-bus MG, and the coordination problem was solved with the Ipopt solver.
AP schemes were also presented using fuzzy systems to optimize and update the OCR settings based on grid parameters, pre-fault power flow, and circuit breaker status [15]. The grid was simulated with the ETAP program, and different DG capacities and locations were studied. However, only one fault type, the three-phase fault, was studied. A hybrid method combining adaptive fuzzy inference and GA was introduced to obtain the current and time settings of the OCRs [16]. The method was tested on a modified IEEE-14 bus system with DGs at various locations. A decentralized AP scheme based on fuzzy logic, which determines the optimal parameters of the OCRs while considering the DGs, was presented in [17]. In addition, the AP scheme used the voltage magnitudes to decrease the time between the primary and backup OCRs. The method was implemented using ATP-EMTP software and tested with DGs on the IEEE-13 bus system.
A multi-agent system built with the Java agent development framework was used to implement the AP system for MGs [18]. To adapt the protection system to changes in the MG, the relays updated their characteristics in offline mode and then detected the faults in online mode. The PSCAD software was used to simulate the faults and test the multi-agent system. Ref. [19] identified various agents (measurement, breaker, relay, protection, and optimal coordination agents) for the AP of a distribution system connected to DGs. The ETAP software was used to simulate the distribution system with faults at different DG statuses. Ref. [20] introduced an adaptive protection scheme based on the multi-agent system and considered different operation modes of the distribution grid integrated with DG. These operating conditions were the various DG statuses and topology changes. Multi-agent-based adaptive protection of an MG integrated with PV systems was studied in [21]. The authors identified agents for circuit breakers, relays, the point of common coupling, and loads.
The artificial neural network (ANN) was also used for protecting distribution systems. An AP scheme based on the radial basis function neural network (RBFNN) was developed [22]. The method was used to find the faulty line and locate the fault from the source bus. Then the backtracking algorithm was applied to coordinate the main and backup relays for isolating the faulty line. Four faults were implemented in a distribution network connected to DGs to evaluate the AP method. The results were compared with other ANNs presented in [23,24] and showed better performance. An AP system for a distribution system connected to DGs and a fault current limiter was explored in [25]. The overcurrent relay settings in this system were modified using the decision tree and a neural network topology-adjusted method. A hybrid model of the ANN and the support vector machine (ANN-SVM) was used to implement an AP scheme for MGs [26]. The feedforward neural network was first used to detect the fault and identify the faulty lines. Then the SVM was used to estimate the fault location on the line. Based on this identification, the protective settings could be automatically readjusted in the ANN-SVM model to ensure reliable relay operation.
The convolutional neural network (CNN) is a type of deep learning neural network inspired by the visual system of living organisms. It is used in image identification and classification, where the data are processed as multidimensional arrays. The main advantage of the CNN is that it reduces the number of trained connections and hyperparameters compared to traditional neural networks [27]. CNNs were used for many classification problems in power systems [28][29][30][31][32][33][34]. A predictive control system based on a one-dimensional CNN was presented to improve the performance of grid-connected wind farms [31]. It added a static var compensation control block to the convolution control systems. The CNN was applied to predict the power system instability mode and assess transient stability [32,34]. The inputs of the CNN are the voltage phasors measured by the phasor measurement units, while the output is the state of the system (oscillatory, stable, or aperiodic unstable).
The CNN also attracted the attention of power system protection researchers. A fault detection and classification method for lines using a sparse convolution autoencoder was presented in [28]. The convolution sparse autoencoder method used the current and voltage waveforms to learn the features. Ref. [29] developed a method to detect and classify the faults in lines based on a self-attention CNN and the wavelet transform (WT). The discrete WT improved noise immunity by denoising the faulty current and voltage signals. A protection scheme based on the CNN was presented to differentiate between PV inverter faults and distribution line faults [30]. Moreover, the protection scheme could identify the faulty section. Ref. [33] introduced a protection scheme for MGs that identifies the faulty phase and the fault location. The protection scheme was based on the CNN and the discrete WT. The discrete WT processed the measured branch currents, which were then fed to the CNN.
Recently, evolutionary optimization techniques were proposed to obtain the optimal hyperparameters and structure of CNNs [27], [35][36][37][38][39][40][41]. They are favorable because of their ability to find global solutions to complex problems. The PSO method was employed to obtain the optimal values of the CNN hyperparameters [35,36]. A multi-level PSO technique was used to simultaneously find the optimal hyperparameters and architecture of the CNN [37]. Two optimization levels were involved: the first level obtained the optimal CNN architecture by determining the optimal number of convolution, pooling, and fully connected layers, and the second level determined the CNN configuration hyperparameters. Furthermore, other optimization techniques were applied to obtain the optimal values of some CNN hyperparameters, such as GA [38,39], the harmony search algorithm (HS) [40], the microcanonical optimization algorithm (MOA) [41], and the fuzzy gravitational search algorithm (FGSA) [27]. Most of the applied evolutionary optimization techniques imposed many restrictions on the CNN architecture and/or on parameters such as filter size, pooling operation, and activation function. Although these restrictions lower the computational complexity, they reduce the performance. Thus, a strategy to optimize these parameters is yet to be developed.
An AP scheme is proposed for the FREEDM microgrid in the present research. The proposed AP scheme relies on a CNN improved by gorilla troops optimization (GTO). The proposed method can detect, classify, and locate line faults in the MG using a CNN with multiple convolutional and pooling layers. The GTO is used to obtain the optimal architecture and hyperparameters of the proposed CNN. The hyperparameters to be optimized include the parameters of the three layer types of the proposed CNN: convolution, pooling, and fully connected layers. The proposed AP scheme consists of three models: CNN-GTO-I for fault identification and detection in the lines, CNN-GTO-II for fault type classification, and CNN-GTO-III for fault localization. A communication channel is established to transfer the data, information, and tripping signals between the different devices in the FREEDM system. The proposed AP scheme has been verified and tested in the MATLAB/Simulink environment using a hypothetical FREEDM system. Different types of faults, such as three-phase, line-to-ground, line-to-line, and double line-to-ground faults, are applied to demonstrate the effectiveness of the proposed protection scheme. The performance of the proposed CNN is evaluated against variations of the MG parameters such as operation modes, DG penetration levels, load variations, and MG topologies. Finally, the proposed scheme is compared with existing schemes.
The main contributions of the present study are as follows:
-Propose a new rugged technique for MG protection based on the CNN combined with the GTO algorithm.
-Propose various CNNs with multiple convolutional and pooling layers to detect, classify, and locate the faults.
-Design a GTO algorithm to obtain the optimal architecture and hyperparameters of the proposed CNNs.
-Study the performance of the proposed CNNs when the designed GTO optimizes their architecture, compare the accuracy, and examine the evolution effectiveness of the GTO.
-Evaluate the performance of the proposed GTO-CNN against variation in different MG parameters (fault types,
The rest of the paper is organized as follows: Section 2 introduces the proposed FREEDM system architecture, while Section 3 describes the system dataset. The CNN method is presented in Section 4, and the GTO is described in Section 5. The hyperparameter optimization of the proposed CNN by the GTO and the proposed adaptive protection scheme for the FREEDM system are presented in Sections 6 and 7. Section 8 presents the system results and discussion. Finally, Section 9 presents the conclusions of the paper.

II. Proposed FREEDM System Architecture
The structure of the proposed system is shown in Fig. 1. The system relies on FREEDM, which is considered the future technology of distribution systems. The proposed system comprises three connected microgrid sources and two loads with four buses. The microgrid sources are based on RERs such as photovoltaic (PV), wind energy (WE), and BESDs. These sources are connected to the SST, which consists of three stages (AC/DC, DC/DC, and DC/AC) with two DC links called MVDC and LVDC. The SST provides the control system with a high degree of freedom and improves the modulation and hardware design. Each transmission line has two fault isolating devices (FIDs) and a fast solid-state circuit breaker. The main part of the FREEDM system is the distributed grid intelligence (DGI), which acts as the master control center. It is connected to each device in the FREEDM system, such as the SSTs and FIDs. It collects the data and information from all these devices and sends/receives the control signals. The applied communication platform is based on the internet of things (IoT). The IoT platform comprises two layers: the physical layer, which represents the FREEDM system, and the cyber layer, which handles data analysis, processing, and storage. The communication between the physical and cyber layers is based on Wi-Fi.
The system voltages and currents at all system buses are monitored using phasor measurement units (PMUs), and the data are sent to the cyber layer via the communication system. The data are then processed, and when a system fault is present, the proposed AP scheme sends a signal to the FIDs connected to the faulty line to isolate it. The FREEDM system parameters are listed in Table I. As shown in Table I, the voltage and current sampling time is 50 μs, and the FID switching time is 0.1 ms.

III. Datasets Description
Electrical distribution networks suffer from unpredictable faults on feeders for various random reasons. These faults can undermine the stability of MG operation and damage the components. Many uncertain elements affect the MG's response to faults, such as the types and locations of faults. So, it is essential to study and detect these uncertain elements before designing the protection system. In this paper, the uncertain data are represented as follows:
-Different fault types: single phase to ground, double phase to ground, phase to phase, and three-phase faults, so ten types of faults are recorded.
-Different fault locations on one feeder: from 0 to 100% of the feeder length with a step of 10%, so ten cases are recorded.
-Different fault resistances: from 0 to 21 Ω with a step of 3 Ω, so eight cases are recorded.
-Microgrid operation modes: on-grid and off-grid mode, performed by connecting/disconnecting the utility grid breaker, so two cases are recorded.
-Different system topologies: meshed and radial, performed by connecting/disconnecting the FIDs on feeder BB1-BB2, so two cases are recorded.
-Different DG penetrations: change from 0 to 30% of the rated load with a variation of 5%, so six cases are recorded.
-Load variation events: change from 0 to 100% of the full load value with a step of 25%, so five cases are recorded.
The dataset of the above uncertainty parameters is simulated using the MATLAB/Simulink program. The corresponding fault voltages and currents at each busbar of the studied FREEDM system in Fig. 1 are simulated/measured and arranged separately in matrices. This dataset contains about 224288 patterns to be used in training and testing the proposed CNN for adaptive protection of the studied MG (FREEDM system). For each pattern, a short circuit or load flow calculation is performed. The simulated time series of the bus currents and voltages are initially in vector form. Only one cycle of the stored data is used, split as:
$$I=\{I_1, I_2, \ldots, I_L\} \quad (1)$$
$$V=\{V_1, V_2, \ldots, V_L\} \quad (2)$$
where $I_i$ and $V_i$ are the current and voltage at the $i$th sample, respectively, and $L$ is the total number of samples. The current and voltage time series are converted to images to extract the required features. The proposed AP scheme is fed with the encoded three-phase voltages and currents. The time series of voltages and currents are represented as Gramian Angular Fields (GAFs), where the time series is converted from Cartesian coordinates to polar coordinates. In the GAF matrix, each element is the cosine of the sum of two angles. After the rescaled time series vectors of currents and voltages are converted into polar coordinates, the angular perspective can be exploited by considering the trigonometric difference/sum between each pair of points to capture the temporal correlation over various time intervals, which yields the Gramian Difference Angular Field (GADF) and the Gramian Summation Angular Field (GASF) [42]. For example, the GASF and GADF representations of the current and voltage waveforms measured at BB2 (of Fig. 1) for single-phase a-to-ground faults at 10% and 70% of the line are presented in Fig. 2. The figure shows that the current and voltage images change according to the fault conditions. The dataset is divided into three parts to train, validate, and test the proposed CNN. The training dataset is introduced to the training algorithm to determine the optimal architecture and hyperparameters of the CNN using the characteristics of each class. The validation dataset is used during training to evaluate the training quality on new data. In this stage, the accuracy of the CNN is improved by adjusting its parameters. Finally, the proposed CNN is applied to the test dataset to measure its generalization ability. In this paper, the dataset is randomly divided into 60% for training, 20% for validation, and 20% for testing.
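The GASF/GADF encoding described in this section can be sketched as follows. This is a minimal illustration in plain Python, assuming the usual Gramian angular field definitions (min-max rescaling to [-1, 1], angle φ = arccos of the rescaled sample, GASF element cos(φ_i + φ_j), GADF element sin(φ_i − φ_j)); the function names and the 16-sample sine wave are hypothetical and do not come from the simulated FREEDM dataset.

```python
import math

def rescale(series):
    """Min-max rescale a time series to [-1, 1] so that arccos is defined."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0 for _ in series]
    return [(2.0 * x - hi - lo) / (hi - lo) for x in series]

def gramian_angular_fields(series):
    """Return the (GASF, GADF) matrices of a one-dimensional time series."""
    # Clamping guards against tiny floating-point overshoot outside [-1, 1].
    scaled = [max(-1.0, min(1.0, x)) for x in rescale(series)]
    phi = [math.acos(x) for x in scaled]  # polar angle of each sample
    n = len(phi)
    gasf = [[math.cos(phi[i] + phi[j]) for j in range(n)] for i in range(n)]
    gadf = [[math.sin(phi[i] - phi[j]) for j in range(n)] for i in range(n)]
    return gasf, gadf

# One cycle of a waveform sampled at 16 points (illustrative values only).
cycle = [math.sin(2.0 * math.pi * k / 16) for k in range(16)]
gasf, gadf = gramian_angular_fields(cycle)
```

Each L-sample cycle yields an L × L image per signal. The GASF is symmetric and the GADF is antisymmetric, so the two images carry complementary information about the same waveform.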

IV. Convolution Neural Network (CNN)
CNNs have significant advantages in processing large amounts of data with low computational cost. Thus, they are widely applied in solving various classification problems [43][44][45][46]. In this study, a CNN is used to build an effective and reliable protection system for the FREEDM system. As previously mentioned, the protection problem of FREEDM systems involves many operating conditions and fault scenarios that may occur.

A. Overview of CNN architecture
One of the essential advantages of the CNN is that it does not require a large number of parameters compared to traditional neural networks. This reduces the computational complexity and required memory and improves the performance. The CNN comprises three types of layers: convolution layers, max-pooling layers, and fully connected layers [54]. An example of a CNN architecture is illustrated in Fig. 3. The following sections describe in detail the function of each layer used in designing the proposed CNN.

(i) Input Layer:
In this layer, the input data, such as the images obtained from the raw time series of voltages and currents at each bus, are received and stored, then utilized by the input layer function. The input image size is determined by the raw input data [47].

(ii) Convolutional layers
2D filters (convolution kernels) are used in the convolution layers to sample the images; the images are thereby converted to new images represented as two-dimensional arrays. The selected number of convolutional kernel filters depends on the neurons in the same region of the input array. The filter sizes are defined according to the sliding window (width and height) over the entire array [48]. The number of neurons in the feature map and the convolutional layer should be the same. The number of layers can be determined according to the proposed network design. The operation of the convolution layer is mathematically represented by:
$$O(x,y)=\sum_{i=1}^{k}\sum_{j=1}^{p}F(x+i-1,\;y+j-1)\,M(i,j) \quad (3)$$
where $i$ and $j$ are the mask line and column, respectively, and $y$ and $x$ are the column and row of the characteristics matrix. $p$ and $k$ are the column and row of the filter size, respectively. $F$ represents the characteristics matrix, and $M$ represents the mask.
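As an illustration of the convolution operation described above, the following plain-Python sketch slides a mask over a characteristics matrix with stride 1 and no padding (an assumption; the proposed CNN treats stride and padding as hyperparameters). As is customary in CNNs, the mask is not flipped. The function name and the small example matrices are hypothetical.

```python
def conv2d_valid(feature, mask):
    """Slide `mask` over `feature` (stride 1, no padding) and sum the products."""
    k, p = len(mask), len(mask[0])        # filter rows and columns
    rows = len(feature) - k + 1           # output height
    cols = len(feature[0]) - p + 1        # output width
    out = [[0.0] * cols for _ in range(rows)]
    for x in range(rows):
        for y in range(cols):
            out[x][y] = sum(
                feature[x + i][y + j] * mask[i][j]
                for i in range(k)
                for j in range(p)
            )
    return out

# Illustrative 3 x 4 characteristics matrix and 2 x 2 mask.
image = [[1, 2, 3, 0],
         [4, 5, 6, 1],
         [7, 8, 9, 2]]
edge = [[1, 0],
        [0, -1]]
result = conv2d_valid(image, edge)
# result == [[-4, -4, 2], [-4, -4, 4]]
```

With a k × p mask and no padding, an H × W input shrinks to (H − k + 1) × (W − p + 1), which is why padding and stride appear among the convolution hyperparameters.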

(iii) Non-Linearity Layer
After the convolution layer, a rectified linear unit (ReLU) layer applies a thresholding operation to the received inputs, as given in (4):
$$f(x)=\max(0,\,x) \quad (4)$$
The ReLU function is commonly used with CNNs because the network trains faster with it [48].

(iv) Max-pooling layer
The max-pooling layers scale down the data to decrease its dimensions and remove redundant information, which helps avoid overfitting and improves robustness [54]. Max pooling is applied after the convolution and ReLU operations, so it allows the number of filters in the convolutional layers to be increased without complicating the computations. It can be represented by [49]:
$$y^{k}_{i,j}=\max_{(m,n)\in R_{i,j}} x^{k}_{m,n} \quad (5)$$
where $y^{k}_{i,j}$ and $x^{k}_{i,j}$ are the $(i, j)$ elements of the $k$th output and input feature maps, respectively, and $R_{i,j}$ is the pooling window associated with output position $(i, j)$. $q$ and $h$ are the pooling window width and length, respectively.
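The ReLU thresholding and max-pooling steps described above can be sketched in plain Python as follows. The sketch assumes non-overlapping windows by default (stride equal to the window size); the function names and the 4 × 4 example map are hypothetical.

```python
def relu(feature_map):
    """Replace every negative entry with zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]

def max_pool(feature_map, h, q, stride=None):
    """Max pooling with a window of h rows (length) and q columns (width).

    If `stride` is None, the windows do not overlap (stride = window size).
    """
    step_r = stride if stride is not None else h
    step_c = stride if stride is not None else q
    rows = (len(feature_map) - h) // step_r + 1
    cols = (len(feature_map[0]) - q) // step_c + 1
    return [
        [
            max(
                feature_map[i * step_r + m][j * step_c + n]
                for m in range(h)
                for n in range(q)
            )
            for j in range(cols)
        ]
        for i in range(rows)
    ]

# Illustrative 4 x 4 feature map produced by a convolution layer.
fmap = [[-1, 2, -3, 4],
        [5, -6, 7, -8],
        [-9, 10, -11, 12],
        [13, -14, 15, -16]]
pooled = max_pool(relu(fmap), h=2, q=2)
# pooled == [[5, 7], [13, 15]]
```

A 2 × 2 window with stride 2 keeps one value out of every four, which is the dimensionality reduction that the pooling hyperparameters (window size, stride, pooling method) control.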

(v) Fully-connected layer
Fully-connected layers are used after the convolutions to perform pattern recognition. In this layer, each neuron is connected to the neurons of the prior layer. The layer integrates all the features learned by the previous layers, so it is very successful in classifying large patterns. For a classification problem, the number of output neurons equals the number of classes.
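To make the role of the fully-connected layer concrete, the following plain-Python sketch flattens a pooled feature map, applies one dense layer, and converts the outputs to class probabilities with softmax (the classification layer used at the end of the proposed CNN). The weights, biases, and the two-class setup are hypothetical values chosen only for illustration, not trained parameters.

```python
import math

def flatten(feature_map):
    """Turn a 2D feature map into a 1D feature vector."""
    return [v for row in feature_map for v in row]

def dense(inputs, weights, biases):
    """One fully-connected layer: every output neuron sees every input."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]

def softmax(logits):
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

features = flatten([[5, 7], [13, 15]])        # pooled features (illustrative)
weights = [[0.1, 0.0, -0.1, 0.0],             # one row per output class
           [0.0, 0.1, 0.0, 0.1]]
biases = [0.0, 0.0]
probabilities = softmax(dense(features, weights, biases))
predicted_class = probabilities.index(max(probabilities))
```

The output layer has one neuron per class, so a detection model would need two outputs, while the classification and location models would need as many outputs as fault types or line sections.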

B. CNN hyperparameters to be optimized
The CNN is a powerful tool for classification problems; however, it has many hyperparameters that need to be optimized, which can be classified into network configuration hyperparameters and learning hyperparameters. For the CNN optimization, the three main layers (convolution, pooling, and fully connected) have hyperparameters that help the network obtain a better recognition percentage. Choosing the CNN hyperparameters properly enhances its performance. Firstly, the convolutional operation has six main hyperparameters to be optimized: the number of convolution layers, the number of feature maps, the number of convolution filters, the filter size, the stride size, and the padding of the convolutional layer. The convolution filters are used to obtain the characteristics map, and the filter size controls the data extraction used to build the characteristics map.
In the pooling layer, hyperparameters such as the number of pooling layers, stride size, filter size, and pooling method are selected to reduce the feature maps. The stride hyperparameter is common to the pooling and convolution layers; it sets the number of steps taken by the filter as it moves through the input. The pooling method produces the average or maximum value in the corresponding field of the filter. Also, the filters in the pooling layer do not have weights to learn.
The last hyperparameters, associated with the fully connected layers, are the number of layers and the number of neurons. It is worth noting that the selected number of convolution layers is always greater than or equal to the number of fully connected and pooling layers. This paper uses the GTO algorithm to optimize 15 hyperparameters of the proposed CNNs. The hyperparameters common to the three layers, such as dropout rate, learning rate, and weight initializer, are listed in a separate row in Table II. The learning rate controls how much the model changes in response to the error each time the model weights are updated. Selecting the learning rate is a challenge: a large value can cause a suboptimal set of layer weights to be learned too quickly and make the training process unstable, while a small value leads to a long training process that may stall. In this study, the learning rate is selected between 0.001 and 0.05 to ensure a balance in the learning process. Furthermore, the maximum value of the dropout rate is chosen to be 0.5, because it leads to maximum regularization [41].
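A bounded search space of the kind described above can be sketched in plain Python as follows. Only the learning-rate range (0.001 to 0.05), the dropout ceiling (0.5), and the rule that the number of convolution layers is at least the number of pooling and fully connected layers are taken from the text; all other names and ranges are hypothetical placeholders and only a subset of the 15 hyperparameters is shown, since Table II is not reproduced here.

```python
import random

# (lower, upper) bounds. Entries marked "hypothetical" are placeholders.
SEARCH_SPACE = {
    "n_conv_layers": (1, 4),         # hypothetical
    "n_pool_layers": (1, 4),         # hypothetical
    "n_fc_layers": (1, 3),           # hypothetical
    "n_filters": (8, 64),            # hypothetical
    "filter_size": (2, 7),           # hypothetical
    "learning_rate": (0.001, 0.05),  # range stated in the text
    "dropout_rate": (0.0, 0.5),      # maximum stated in the text
}
INTEGER_KEYS = {"n_conv_layers", "n_pool_layers", "n_fc_layers",
                "n_filters", "filter_size"}

def repair(candidate):
    """Enforce: conv layers >= pooling layers and conv layers >= FC layers."""
    fixed = dict(candidate)
    fixed["n_pool_layers"] = min(fixed["n_pool_layers"], fixed["n_conv_layers"])
    fixed["n_fc_layers"] = min(fixed["n_fc_layers"], fixed["n_conv_layers"])
    return fixed

def sample_candidate(rng):
    """Draw one random hyperparameter set inside the bounds."""
    candidate = {}
    for name, (low, high) in SEARCH_SPACE.items():
        if name in INTEGER_KEYS:
            candidate[name] = rng.randint(low, high)
        else:
            candidate[name] = rng.uniform(low, high)
    return repair(candidate)

candidate = sample_candidate(random.Random(1))
```

Keeping the bounds in one table and repairing infeasible candidates after every update is one simple way to let a continuous optimizer such as the GTO work on a mixed integer/real search space.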

V. Gorilla Troops Optimization Algorithm (GTO)
Most engineering optimization problems have recently been solved by nature-inspired metaheuristic methods. These methods offer numerous advantages, such as simplicity and ease of design and implementation, a wide scope of usage in engineering applications, better performance than local search algorithms, and no need for derivative information [50,51]. These algorithms mimic natural physical or biological phenomena such as the behavior of humans, animals, swarms, and plants. In this research, a novel nature-inspired metaheuristic algorithm that simulates the group behavior of gorilla troops, called the "Gorilla Troops Optimizer (GTO)," is used [51].
The GTO is based on five operators that perform the exploration and exploitation operations. Three operators are simulated for the exploration phase: migration to an unexplored site, moving to other gorillas, and migration to a known site. These three operators improve the search capability of the GTO over the search space and achieve a better balance between the exploitation and exploration phases. The other two operators, used for exploitation, are following the silverback and competition for adult females.

A. Exploration stage
According to the lifestyle of the gorilla group, three mechanisms are used in the exploration phase: migration to an unexplored site, moving to other gorillas, and migration to a known site. All gorillas are considered candidate solutions, and at each optimization step, the silverback is the best solution. The mechanism of migration to an unexplored site is selected by a parameter $p$. Three conditions are checked to choose one of the three mechanisms, depending on the value of the random variable rand. The first mechanism, "migration to an unexplored site," is chosen when rand is smaller than $p$. Otherwise, if rand is greater than or equal to 0.5, the second mechanism, "moving to other gorillas," is chosen, and if rand is smaller than 0.5, the third mechanism, "migration to a known site," is chosen. The three mechanisms can be modeled as follows:
$$GX(t+1)=\begin{cases}(UB-LB)\times r_1+LB, & rand<p\\ (r_2-C)\times X_r(t)+L\times H, & rand\ge 0.5\\ X(t)-L\times\left(L\times\left(X(t)-GX_r(t)\right)+r_3\times\left(X(t)-GX_r(t)\right)\right), & rand<0.5\end{cases} \quad (6)$$
where $X(t)$ and $GX(t+1)$ are the candidate solution vectors for the gorilla positions in iterations $t$ and $t+1$, respectively. $X_r$ is one gorilla selected randomly from the group, and $GX_r$ is the corresponding randomly selected candidate position. $r_1$, $r_2$, $r_3$, and rand are random values between 0 and 1 that vary in each iteration. LB and UB are the minimum and maximum limits of the variables. $C$, $L$, and $H$ are parameters determined by the following equations:
$$C=\left(\cos(2\times r_4)+1\right)\times\left(1-\frac{It}{MaxIt}\right)$$
$$L=C\times l$$
$$H=Z\times X(t)$$
where $It$ and $MaxIt$ are the current and maximum iteration numbers, respectively. $r_4$, $l$, and $Z$ are random values in the ranges [0, 1], [-1, 1], and [-C, C], respectively. The simulation of the silverback leadership is represented by (9). The fitness value of all solutions is evaluated at the end of the exploration stage, and $X(t)$ is replaced with $GX(t)$ if the fitness value of $GX(t)$ is smaller than that of $X(t)$. The best solution selected in this stage is denoted as the silverback.
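The exploration stage can be sketched in plain Python as follows, under a few stated assumptions: a single rand is drawn per gorilla to select the mechanism, the randomly selected gorilla in the third mechanism is drawn from the current population, candidates are clipped to [LB, UB], and fitness is minimized as in the replacement rule above. The function names and the sphere test function are hypothetical.

```python
import math
import random

def exploration_step(population, iteration, max_iter, lb, ub, p=0.03, rng=random):
    """One exploration pass: returns a candidate position GX for every gorilla."""
    n, dim = len(population), len(population[0])
    c = (math.cos(2.0 * rng.random()) + 1.0) * (1.0 - iteration / max_iter)
    l_coef = c * rng.uniform(-1.0, 1.0)          # L = C x l, with l in [-1, 1]
    candidates = []
    for x in population:
        r = rng.random()
        if r < p:                                # migrate to an unexplored site
            gx = [lb[d] + (ub[d] - lb[d]) * rng.random() for d in range(dim)]
        elif r >= 0.5:                           # move toward other gorillas
            xr = population[rng.randrange(n)]
            r2 = rng.random()
            gx = [(r2 - c) * xr[d] + l_coef * rng.uniform(-c, c) * x[d]
                  for d in range(dim)]
        else:                                    # migrate to a known site
            gxr = population[rng.randrange(n)]   # assumption: current population
            r3 = rng.random()
            gx = [x[d] - l_coef * (l_coef * (x[d] - gxr[d]) + r3 * (x[d] - gxr[d]))
                  for d in range(dim)]
        candidates.append([min(max(v, lb[d]), ub[d]) for d, v in enumerate(gx)])
    return candidates

def greedy_select(population, candidates, fitness):
    """Keep a candidate only if its fitness is smaller than the current one."""
    return [g if fitness(g) < fitness(x) else x
            for x, g in zip(population, candidates)]

def sphere(x):
    """Hypothetical test function to minimize."""
    return sum(v * v for v in x)

rng = random.Random(0)
lower, upper = [-5.0] * 3, [5.0] * 3
troop = [[rng.uniform(-5.0, 5.0) for _ in range(3)] for _ in range(6)]
proposals = exploration_step(troop, 1, 10, lower, upper, rng=rng)
new_troop = greedy_select(troop, proposals, sphere)
```

Because C shrinks linearly with the iteration count, the step sizes in the second and third mechanisms decay over time, which is how the algorithm shifts from exploration toward exploitation.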

B. Exploitation stage
In the exploitation stage, two mechanisms are applied for optimization: following the silverback and competition for adult females. The parameter $C$ is used in the exploitation stage to select the mechanism. The "following the silverback" mechanism is chosen when the value of $C$ is greater than or equal to a preset parameter $W$. However, if $C$ is smaller than $W$, competition for adult females is selected. In recently formed groups, the silverback is young and strong, and all gorillas in the group follow its orders. This mechanism is chosen when $C \ge W$ and can be represented mathematically by:
$$GX(t+1)=L\times M\times\left(X(t)-X_{silverback}\right)+X(t)$$
$$M=\left(\left|\frac{1}{N}\sum_{i=1}^{N}GX_i(t)\right|^{g}\right)^{1/g},\qquad g=2^{L}$$
where $X_{silverback}$ is the best position of the silverback and $N$ is the number of gorillas.
When young gorillas reach adulthood, they compete with other male gorillas to expand their range in selecting adult females. In this case, $C < W$ and this mechanism can be modeled by

$$GX(t+1) = X_{silverback} - (X_{silverback} \times Q - X(t) \times Q) \times A \qquad (13)$$

$$Q = 2 \times r_5 - 1, \qquad A = \beta \times E$$

where $r_5$ is a random value in the range [0, 1], $\beta$ is a preset parameter, and $E$ is a value that represents the impact of violence on the solutions' dimensions. The fitness value of all solutions is evaluated at the end of the exploitation stage, and $X(t)$ is replaced with $GX(t)$ if the fitness value of $GX(t)$ is smaller than that of $X(t)$. The best solution selected in this stage is denoted as the silverback.
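A compact sketch of the two exploitation mechanisms follows. It is illustrative only: the forms of M, g, Q, and A, and the choice of E as normally distributed random values, are taken from the standard GTO formulation and are assumptions here. The current troop X stands in for the candidate positions when computing M, and the function name `exploit` is hypothetical.

```python
import numpy as np

def exploit(X, silverback, C, L, W=0.8, beta=3.0, rng=None):
    """One exploitation step for a troop X of shape (N, dim). Illustrative only."""
    rng = np.random.default_rng() if rng is None else rng
    n, dim = X.shape
    if C >= W:
        # Follow the silverback: M is a power mean of the troop, with g = 2^L.
        g = 2.0 ** L
        M = (np.abs(X.mean(axis=0)) ** g) ** (1.0 / g)
        return L * M * (X - silverback) + X
    # Rivalry for adult females.
    GX = np.empty_like(X)
    for i in range(n):
        Q = 2.0 * rng.random() - 1.0
        # Assumption: E is a normal random vector or a single normal random scalar.
        E = rng.standard_normal(dim) if rng.random() >= 0.5 else rng.standard_normal()
        A = beta * E
        GX[i] = silverback - (silverback * Q - X[i] * Q) * A
    return GX
```

In both branches a gorilla that already sits at the silverback position stays there, which is a convenient sanity check for an implementation.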

VI. Optimization of the Proposed CNN by GTO
The GTO is applied to optimize the CNN hyperparameters. This section provides the details of the proposed GTO algorithm for the CNN: the algorithm, the detailed architecture of the proposed GTO-CNN, and the flow diagram. The encoding strategy involves initialization, evaluation of the fitness function, and a position-updating mechanism [37]. As illustrated in Fig. 4, GTO develops the CNN architecture and its hyperparameters. Each gorilla represents a possible configuration of the CNN with its hyperparameters. In the proposed CNN, the last layer is a classification layer (SoftMax) that predicts the class of each data sample. The resulting accuracy serves as the fitness value of each gorilla.
In the designed architecture, the initial learning rate of the training process is set to 0.001. The simulation was carried out using MATLAB R2018 on a laptop computer running 64-bit Windows 10 with an Intel Core i7-4510U 2.6 GHz processor and 16.00 GB of RAM. All tests of the GTO performance were carried out using 30 populations and a maximum of 200 iterations. All reported results are averages over 25 independent runs, and the comparisons are based on these averaged results. Moreover, the dataset consists of 224288 patterns obtained by simulating the test system as previously described.
The flow of the proposed GTO algorithm is described in Algorithm 1 in the appendix, and the flowchart of the GTO-CNN is given in Fig. 5. In the first stage, the population Xi is randomly initialized to search for the optimal set of CNN hyperparameters. The constraints of the search space are limited by the available resources, as listed in Table II. The GTO parameters are set as W = 0.8, p = 0.03, and β = 3 to control the movements of the gorilla troop in the search space identified in Table II. This search space can be extended to explore deeper CNN constructions, depending on the available computational resources and time. The proposed GTO has 15 gorillas, and each gorilla is a vector of size 5. Thus, the dimension of the gorilla troop is 15 × 5. The GTO iterates to find the optimum configuration of the proposed CNN and its parameters in the search space listed in Table II. In the proposed hybrid CNN-GTO, a set of hyperparameters (gorilla) that yields higher accuracy is a better solution than the sets (gorillas) with lower accuracy. The fitness value of each set of parameters (gorilla) is calculated by using (15).
The positions of the gorillas are updated by (6), (10), and (13) according to the values of the control parameters.
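To make the encoding concrete, the sketch below maps a size-5 gorilla vector to a CNN configuration and selects the gorilla with the highest accuracy. The hyperparameter names and ranges in `SEARCH_SPACE` are placeholders chosen for illustration and are not the Table II values. The `accuracy_of` callback is a hypothetical stand-in for training and validating a CNN.

```python
import numpy as np

# Hypothetical search space (name, low, high); NOT the Table II values.
SEARCH_SPACE = [
    ("conv_layers", 1, 4),
    ("filters", 16, 512),
    ("kernel_size", 3, 64),
    ("pool_size", 2, 32),
    ("fc_neurons", 64, 1024),
]

def decode(gorilla):
    """Map a vector in [0, 1]^5 to integer hyperparameters within the bounds."""
    g = np.clip(np.asarray(gorilla, dtype=float), 0.0, 1.0)
    return {
        name: int(round(lo + v * (hi - lo)))
        for (name, lo, hi), v in zip(SEARCH_SPACE, g)
    }

def best_gorilla(troop, accuracy_of):
    """Return the gorilla whose decoded configuration gives the highest accuracy."""
    scores = [accuracy_of(decode(g)) for g in troop]
    best = int(np.argmax(scores))
    return troop[best], scores[best]
```

With 15 gorillas of size 5, `troop` is a 15 × 5 array, matching the troop dimension given above.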

VII. CNN-GTO based Adaptive Protection Scheme for FREEDM System
As previously mentioned, this paper aims to develop a CNN-GTO-based AP scheme to protect the proposed FREEDM system. The proposed CNN-GTO is designed and implemented to detect a fault and then classify it and estimate its location in the lines, enhancing the resiliency of the FREEDM system under various operating conditions. The proposed CNN-GTO model structure is illustrated in Fig. 6. First, the three-phase current and voltage signals are generated by simulating the FREEDM system under various normal operating and fault conditions. These signals are encoded into GAF images as described in section 3. The images are then resized to 96×96 pixels. The resized images of the three voltages and currents are fed to the three proposed CNN-GTO models: CNN-GTO-I to detect the fault in the lines, CNN-GTO-II to classify the fault type, and CNN-GTO-III to locate the fault point. One cycle of the three-phase currents and voltages is fed in using a moving window that advances one sample at a time, starting at any arbitrary instant. If the CNN-GTO-I detects a fault event, the CNN-GTO-II and CNN-GTO-III are triggered to classify the fault type and locate it. For practical verification, the proposed protection method can be implemented using a Raspberry Pi, oscilloscopes, voltage and current sensors, loads, a photovoltaic simulator, a wind simulator, battery energy storage, an AC-DC converter, a DC-DC converter, a DC-AC inverter, and fault isolation devices (FIDs). The voltage and current sensors collect data at all system buses and send these data to the DGI, which can be implemented using the Raspberry Pi. The proposed protection scheme runs in the DGI; hence, when a fault occurs, the DGI sends a trip signal to the FIDs of the faulty line to isolate it. The communication between the voltage and current sensors and the DGI, and between the DGI and the line's FIDs, is implemented using the IoT platform with the Wi-Fi protocol.
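The signal-to-image step and the cascade of the three models can be sketched as follows. This is an illustration under stated assumptions rather than the paper's implementation: the summation variant of the Gramian angular field is assumed, nearest-neighbor index sampling stands in for the image resizing, and `detect`, `classify`, and `locate` are hypothetical callables representing CNN-GTO-I, CNN-GTO-II, and CNN-GTO-III.

```python
import numpy as np

def gaf_image(signal, size=96):
    """Encode one cycle of a signal as a Gramian angular (summation) field image."""
    x = np.asarray(signal, dtype=float)
    lo, hi = x.min(), x.max()
    # Rescale to [-1, 1] so that arccos is defined; a flat signal maps to zeros.
    x = np.zeros_like(x) if hi == lo else 2.0 * (x - lo) / (hi - lo) - 1.0
    phi = np.arccos(np.clip(x, -1.0, 1.0))
    gasf = np.cos(phi[:, None] + phi[None, :])
    # Assumption: nearest-neighbor sampling replaces a proper image resize.
    idx = np.linspace(0, len(x) - 1, size).round().astype(int)
    return gasf[np.ix_(idx, idx)]

def protect(images, detect, classify, locate):
    """Run the cascade: classification and location are triggered only on a fault."""
    if not detect(images):
        return None
    return classify(images), locate(images)
```

In use, each of the six measured signals (three voltages, three currents) would be passed through `gaf_image` for every window position before being handed to `protect`.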

VIII. Simulation Results and Discussions
This section evaluates the performance of the proposed CNN-GTO-based fault detection, classification, and location for FREEDM systems under a wide variety of operating conditions and fault parameters. It reports the overall accuracy, dependability, and security of the results to better evaluate the performance of the CNN-GTO models.

A. Fault detection model performance
The optimum architecture of the proposed CNN-GTO-I, generated by the proposed GTO algorithm and trained for 40 iterations (epochs) on the fault detection dataset, consists of three convolutional layers, three max-pooling layers, and three fully connected layers. The first convolutional layer contains 110 filters of size 64×64, while the second and third convolutional layers contain 98 filters of size 32×32 and 56 filters of size 32×32, respectively. The first pooling layer has a size of 32×32, stride [2 2], and padding [0 0 0 0]. The other two pooling layers have sizes of 8×8 and 16×16. The first fully connected layer consists of 956 neurons with a ReLU activation function, and the second fully connected layer has 732 neurons with the TReLU activation. The last fully connected layer has one neuron for detecting the fault condition, with the SoftMax activation function. Table III details the optimum structure of the proposed CNN-GTO-I. All weights and biases of the layers are recorded and saved. The average classification accuracy achieved in each iteration is plotted in Fig. 7. The average detection accuracy is determined from the fault datasets over 25 independent runs. As depicted in Fig. 7, the CNN-GTO-I performance improves as the number of GTO iterations increases, after which the performance of the best CNN-GTO-I architecture generated by the GTO algorithm becomes stable. The GTO algorithm converges within the prescribed maximum number of iterations, and the accuracy of the CNN-GTO-I converges at about iteration 30. The trained CNN-GTO-I classifies the fault detection test dataset with an accuracy of 99.369%. Figure 8 shows the confusion matrix of the CNN-GTO-I results for the fault detection dataset, obtained from the training and test data.
The y-axis of the confusion matrix represents the estimated state, while the x-axis is the true state. The values on the main diagonal indicate the correctly classified states, and the off-diagonal values refer to the misclassified states. The green boxes mark correctly classified states, while the red boxes mark misclassified states.
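The accuracy, dependability, and security figures can be read directly from a two-class confusion matrix laid out as described (rows as estimated state, columns as true state). The sketch below assumes the usual protection-engineering definitions: dependability as the fraction of true faults that are detected, and security as the fraction of normal states that do not cause a trip. The counts in the usage note are illustrative and are not taken from Fig. 8.

```python
import numpy as np

def detection_metrics(cm):
    """cm[i, j]: row i = estimated state, column j = true state; 0 = normal, 1 = fault."""
    cm = np.asarray(cm, dtype=float)
    tn, fn = cm[0, 0], cm[0, 1]   # estimated normal: true normal / true fault
    fp, tp = cm[1, 0], cm[1, 1]   # estimated fault:  true normal / true fault
    return {
        "accuracy": (tp + tn) / cm.sum(),
        "dependability": tp / (tp + fn),
        "security": tn / (tn + fp),
    }
```

For example, `detection_metrics([[90, 2], [10, 98]])` gives an accuracy of 0.94, a dependability of 0.98, and a security of 0.90.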

FIGURE 7. Average CNN-GTO-I detection accuracy for each iteration
The confusion matrix summarizes the results of the trained and tested CNN-GTO-I model for the different fault datasets. In this case, 224288 patterns of normal-operation and faulty voltages and currents are used to test the CNN-GTO-I. As illustrated in Fig. 8, most fault and normal-operation datasets are correctly detected. The misdetected states are observed for faults located near the ends of the line, where heavily loaded conditions are confused with these fault states.

B. Fault classification model performance
The CNN-GTO-II is triggered when the CNN-GTO-I detects a fault, as shown in Fig. 6. In this case, 22400 patterns of each fault type (three-phase voltage and current images) are used to train and test the CNN-GTO-II. As for the CNN-GTO-I, the GTO algorithm is applied to obtain the best architecture of the proposed CNN-GTO-II. It is trained for 40 epochs using the fault classification dataset; the number of iterations required for convergence depends on the dataset. The optimum fit of the CNN-GTO-II parameters is given in Table II. It consists of four convolutional layers, four max-pooling layers, and three fully connected layers. The convolutional layers contain 465 filters of size 64×64 in the first layer, 183 filters of size 64×64 in the second layer, 93 filters of size 32×32 in the third layer, and 87 filters of size 32×32 in the fourth layer. The first pooling layer has a size of 28×28, stride [2 2], and padding [0 0 0 0]. The sizes of the other three pooling layers are 8×8, 28×28, and 16×16. The first fully connected layer consists of 826 neurons with a ReLU activation function, and the second fully connected layer has 493 neurons with the TReLU activation. Finally, the last fully connected layer has ten neurons for classifying the fault type, with the SoftMax activation function. The weights and biases of the CNN-GTO-II are updated using the GTO algorithm and then saved.
The average classification accuracy achieved for the four fault classes (line to ground (LG), line to line (LL), double line to ground (DLG), and three lines (3L)) in each iteration is depicted in Fig. 9. The average classification accuracy is determined from the fault datasets over 25 independent runs. As illustrated in Fig. 9, the solution converges to the best-fitting solution after a number of iterations: the performance of the best-fitted CNN-GTO-II architecture generated by the proposed GTO algorithm becomes stable at the 31st iteration. The average accuracy of the proposed CNN-GTO-II fault classification is shown in Table IV. The accuracy of the CNN-GTO-II varies between 100% for LL faults and 98.62% for DLG faults, so the average accuracy of the CNN-GTO-II model remains high overall. The scheme achieved an overall 99.14% prediction accuracy for unbalanced faults in the FREEDM system, and only 2303 out of 22400 test cases had incorrect phase identification. The confusion matrix gives the classification performance for each fault class, as illustrated in Fig. 10. The overall classification accuracy of the network is then calculated from the confusion chart output to visualize the percentage accuracy of the predictions on the testing data.

C. Fault location model performance
The CNN-GTO-III is triggered when the CNN-GTO-I model detects a fault state in the lines of the FREEDM system. The GTO algorithm is applied to train the proposed CNN-GTO-III model for 40 iterations to obtain the best architecture and hyperparameters. The mean square error between the predicted and actual locations is used to obtain the optimum results. The best-fit CNN-GTO-III model parameters are listed in Table III. This CNN uses 22400 patterns of each fault type to train and test the CNN-GTO-III architecture. The weights and biases of the CNN-GTO-III structure are also updated using the GTO algorithm.
It consists of four convolutional layers, four max-pooling layers, and three fully connected layers. The last fully connected layer has one neuron for estimating the fault location along the line, with the SoftMax activation function.
The error of the proposed CNN-GTO-III model during the training and testing process is illustrated in Fig. 11. The average fault location error is determined from the fault datasets over 25 independent runs. The figure shows that the training process converges within 29 iterations. The average fault location error of the proposed CNN-GTO-III for the different types of faults is shown in Table V. The percentage mean square error of the CNN-GTO-III varies between 1.2% for LL faults and 3.5% for DLG faults, which is accurate enough to locate all types of faults. The maximum error is at most 3.5% of the line length, which is acceptable for short lines with lengths below 20 km, as in FREEDM systems. Moreover, the computational time of the proposed CNN-GTO-III model is recorded in Table V. In conclusion, the scheme under study can be executed in real time, owing to its short execution time (0.35 ms). Even in the sequential calculation scenario, which is considered the worst case, the fault detection time is around 1.3 ms (fault cases).
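As a quick check of what these percentages mean in distance terms, the following worked example assumes that the error is expressed as a fraction of the line length and takes the 20 km bound quoted above as an illustrative line length $\ell$:

```latex
\Delta d = \frac{e_{\%}}{100}\,\ell, \qquad
\Delta d_{\max} = \frac{3.5}{100} \times 20\ \text{km} = 0.7\ \text{km}, \qquad
\Delta d_{\min} = \frac{1.2}{100} \times 20\ \text{km} = 0.24\ \text{km}
```

Under this assumption, the estimated fault point would lie within roughly 0.24 km to 0.7 km of the true location on a 20 km line, and proportionally closer on shorter lines.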

D. Applying the proposed protection scheme to the FREEDM system under different fault conditions
The proposed AP scheme has been applied and tested under different fault conditions in the FREEDM system. The FREEDM architecture used is shown in Fig. 1, and the system parameters are given in Table I.

Scenario#1 Three-phase fault
In this scenario, a three-phase fault is applied in line 24 between bus BB2 and bus BB4 at 1.00 sec in the FREEDM system shown in Fig. 1. After the fault occurrence, the system voltages and currents deviate from their nominal values. By monitoring all the system voltages and currents, the DGI can identify the faulty line and the type of fault. Hence, the adaptive protection scheme sends a trip signal after 0.001008 sec to open the two FIDs installed on the faulty line, isolating it from the system and restoring the steady-state operation of the FREEDM system. The tripping signal is sent from the DGI to the FIDs over Wi-Fi, and the data and information are then saved in the memory of the cyber layer of the IoT platform. Fig. 12 shows the RMS current and voltage at all FREEDM system busbars. As shown in Fig. 12, the currents and voltages at all busbars are disturbed at the instant of the fault occurrence. After the fault is cleared by isolating the faulty line using the two FIDs, the system returns to its steady-state operation. The current and voltage at the system lines are shown in Fig. 13. From Fig. 13(a), the current in line 24 becomes zero after the line is isolated by the two FIDs at its terminals, while all other system lines remain in service. Fig. 14 shows the tripping signal sent to the FIDs. After the fault occurrence at 1.00 sec, the tripping signal is sent to the FIDs of the faulty line after 0.001008 sec, which accounts for the sampling time, switching time, and communication channel delay time.

Scenario#2 Single line to ground fault
In this scenario, a single-line-to-ground fault is applied in the middle of line 24 at 1.00 sec. Fig. 15 shows the current and voltage profiles at all system buses. After the fault occurrence, the current and voltage at each bus are affected until the tripping signal is sent to the FIDs of the faulty line to isolate it and hence restore the steady-state operation. The current and voltage at all system lines are shown in Fig. 16. With a single-line-to-ground fault at line 24, the current increases and the voltage decreases until the faulty line is isolated by the tripping signal sent to the FIDs connected at its terminals. The relay tripping signal is illustrated in Fig. 17; it is sent to the FIDs of the faulty line after 0.00102 sec to clear the fault.

Scenario#3 Line to Line Fault
A line-to-line fault in the middle of line 24 is applied at 1.00 sec for the FREEDM system shown in Fig. 1. The current and voltage at the four system buses are shown in Fig. 18, which illustrates the deviation of the current and voltage signals from their nominal values after the fault occurrence. The steady-state operation is then restored by applying the proposed adaptive protection method. The line currents and voltages are shown in Fig. 19. The faulty line is isolated, and the fault is cleared by the proposed adaptive protection method. The fault is cleared after 0.001015 sec, as shown by the relay tripping signals in Fig. 20.

Scenario#4 Double Line to Ground Fault
In this scenario, a double-line-to-ground fault occurs in the middle of the line connected between bus BB2 and bus BB4. Fig. 21 illustrates the current and voltage waveforms at all system buses. The current and voltage waveforms at all the FREEDM system lines are reported in Fig. 22, and the relay tripping signals are shown in Fig. 23. These results show that the proposed adaptive protection scheme can clear the system fault by sending a tripping signal to the FIDs connected at the terminals of the faulty line. After the fault occurs at 1.00 sec, the currents and voltages at all system buses are affected until the fault is cleared after 0.00103 sec, as shown in Fig. 23.

In general, the proposed adaptive protection scheme can clear the system fault and restore the steady-state operation of the system after the fault occurrence. The proposed AP scheme based on the CNN-GTO can isolate the faulty line by tripping the FIDs at its terminals. The DGI can monitor the system voltage and current data, process them, identify the faulty line, and clear the fault. A Wi-Fi communication channel carries the signals sent and received between the DGI and the FIDs. The IoT platform has been applied to enhance the proposed adaptive protection method. The proposed method can clear the fault about 0.001 sec after its occurrence. Hence, the proposed method is sensitive and reliable in protecting the FREEDM system under different fault conditions.

Effect of noises on proposed protection system
In this section, the effect of noise on the proposed fault detection, classification, and location modules is investigated. Noise in the voltage and current waveforms places higher demands on the proposed CNN-GTO protection schemes, so the proposed protection scheme must be robust and resistant to noise interference. To test the robustness of the proposed CNN-GTO protection scheme, the voltage and current signals are distorted with white Gaussian noise. The test signals have signal-to-noise ratios (SNRs) ranging from 20 dB to 45 dB, similar to the values used in protection research [33], while the training data are undistorted as in the previous cases. The performance of the proposed CNN-GTO protection schemes is summarized in Table VI. As illustrated in Table VI, noise in the voltage and current signals only slightly affects the performance of the proposed protection schemes, even at low SNR. In the worst case at 20 dB, the accuracy of the detection model (CNN-GTO-I) decreases by 0.787%, the accuracy of the classification model (CNN-GTO-II) decreases by 1.09%, and the percentage error of the fault location model (CNN-GTO-III) increases by 0.91%.
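Distorting a test signal at a prescribed SNR can be sketched as below. The noise power is set from the measured signal power using the standard relation SNR(dB) = 10 log10(P_signal / P_noise). The function name `add_awgn` is hypothetical, and the sketch is not the authors' simulation code.

```python
import numpy as np

def add_awgn(signal, snr_db, rng=None):
    """Return the signal plus white Gaussian noise at the requested SNR in dB."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(signal, dtype=float)
    p_signal = np.mean(x ** 2)                       # average signal power
    p_noise = p_signal / (10.0 ** (snr_db / 10.0))   # required noise power
    return x + rng.normal(0.0, np.sqrt(p_noise), x.shape)
```

Applying this to each voltage and current waveform with `snr_db` swept from 20 to 45 would reproduce the kind of test set described above, while the training set is left undistorted.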

Performance of the Proposed protection system in the presence of simultaneous multi-faults
The proposed protection method has been verified and tested in the presence of multi-faults. Fig. 24 shows the results obtained for simultaneous line-to-ground faults in line 24 between bus BB2 and bus BB4 and in line 13 between bus BB1 and bus BB3 at 1.00 sec. The currents and voltages at all system buses and lines and the relay signals are shown in Fig. 24. The relay tripping signals are sent to the FIDs of the faulty line 13 and line 24 after 0.00115 sec and 0.00155 sec, respectively, to clear the faults, as shown in Fig. 24(e).

Performance of the Proposed protection system in the line N-1 contingencies
The proposed protection method has been tested under an N-1 contingency. Fig. 25 shows the busbar and line currents and voltages and the relay tripping signal obtained with line 13 between bus BB1 and bus BB3 out of service. A line-to-ground fault occurs in line 24 between bus BB2 and bus BB4 at 1.00 sec. The proposed protection scheme clears the system fault, and the relay tripping signal is sent to the FIDs of the faulty line after 0.001128 sec.

Performance of the Proposed protection system in different DGs penetration levels.
The proposed protection scheme is investigated under changes in the DG penetration level in the FREEDM system. The uncertainty of DERs is represented by different DG penetration levels (varied from 0 to 30% of the rated load in steps of 5%, so six cases are recorded). Table VII shows the performance of the proposed protection method at the different DG penetration levels. It reports the accuracy of the fault detection and classification models and the error of the fault location model. The proposed method can operate under the different DG penetration levels.

F. Comparative Analysis
The performance of the proposed CNN-GTO-based AP scheme is compared with existing schemes for MG fault detection, classification, and location under the same simulation conditions. SVM [52], CNN [49], fuzzy logic [53], and wavelet-based CNN [33] are selected for this purpose. Table VIII summarizes the performance and accuracy of these methods. The comparison shows that the proposed CNN-GTO-based AP scheme surpasses the existing AP schemes for MG protection.

IX. Conclusion
This paper introduced an AP scheme for the FREEDM system based on a CNN improved by the GTO algorithm. The GTO is used to obtain the optimal hyperparameters of the convolution, pooling, and fully connected layers of the proposed CNN. Three AP models are proposed: CNN-GTO-I for fault detection, CNN-GTO-II for fault classification, and CNN-GTO-III for fault localization. The proposed method has been applied and tested on the FREEDM microgrid system in a MATLAB/Simulink environment. The overall accuracy, dependability, and security of the results are reported to better evaluate the performance of the CNN-GTO models. Different types of faults, such as three-phase, line-to-ground, line-to-line, and double-line-to-ground faults, are applied to prove the effectiveness of the proposed protection scheme. The performance of the proposed CNN is evaluated against variations of the MG parameters such as operation modes, DG penetration levels, load variations, and MG topologies. A comparative analysis against existing schemes is also performed. The proposed CNN-GTO-based AP scheme improves accuracy over the SVM, CNN, fuzzy logic, and WT-based CNN.