Integrated Multistage Self-healing in Smart Distribution Grids Using Decentralized Multiagent

Smart self-healing is regarded as a new way to ensure the reliability and quality of power supply as intelligent communication and control technologies develop. Based on the multistage characteristics of self-healing control, this article proposes an integrated multistage self-healing strategy for smart distribution systems using a multiagent system (MAS), in which the complex self-healing problem is decomposed into phased sub-problems and addressed by a unified control framework composed of stage-specific algorithms. In the proposed control framework, the decision-making agents vary with the fault point and with the transitions between self-healing stages, which makes the technique fully decentralized. By stressing the coordination between the stage algorithms (communication self-adaption, fault tolerance, fault location and isolation, service restoration and state regression), the proposed strategy offers good real-time control performance and relatively complete self-healing functions. Comprehensive simulation studies are carried out on 84-bus and 22-bus distribution systems, using MATLAB and JADE and a self-healing test platform respectively, and the test results show the effectiveness of the proposed strategy.

method [5], apparent impedance-based method [6], three-phase circuit analysis-based method [7], travelling wave-based method [8] and artificial intelligence-based method [9], are the current mainstream and effective ones, especially for radial distribution systems. However, the volatility and uncertainty of DGs pose a new dilemma for these methods; among them, artificial intelligence requires a huge amount of training data and retraining for every new configuration [1]. To overcome these disadvantages, several automatic algorithms [10][11] based on multiagent systems (MAS) have been implemented to locate and isolate faults in distribution systems. With the rapid growth of information technology, intelligent electronic devices (IEDs) and smart metering, intelligent MASs have emerged as a reliable protection technology for distribution systems [12].
Once a primary protection fails because of a device failure (such as a failure of communication, a current transducer (CT) or a CB), the backup protection, as a fault tolerance mechanism, is immediately activated to locate and isolate the fault in place of the failed primary protection. Conventional backup protection approaches are mainly based on time-overcurrent and distance (or adaptive distance) measures. However, these conventional methods, which are widely applied to radial systems, are prone to miscoordination in flexible distribution systems with DGs [13]. Recently, MAS-based backup protections have attracted much interest, including a wide-area backup method based on a centralized multiagent system (CMAS) [14], and backup methods based on a hierarchical multiagent system (HMAS) or a decentralized multiagent system (DMAS) [11,15]. In contrast to DMAS, the CMAS and HMAS methods are difficult to combine into an integrated self-healing framework [11].
After fault clearance, the out-of-service loads are re-energized by automatic service restoration. As another key building block of the self-healing capacity, service restoration is defined as finding appropriate healthy paths or intentional islands powered by DGs to restore the maximum possible out-of-service loads with the minimum number of switching operations within the shortest time [1]. Many methods have been presented to address service restoration problems in a centralized way, including mathematical optimization [16], multi-stage optimization [17][18][19], and heuristics and artificial intelligence [20][21]. The centralized approaches can obtain the best solution. However, they depend solely on a control center, which in turn leads to heavy spending on communication and computation [22]. Recently, some studies have proposed MAS solutions to facilitate the self-healing feature of smart distribution systems. The MAS can solve service restoration problems much faster owing to its extensibility, maintainability and concurrency [23]. An HMAS-based fuzzy clustering algorithm was presented in [9]. An HMAS with a multi-level and multi-region architecture was proposed in [24]. A Q-learning algorithm based on HMAS was presented for the restoration of distribution systems in [25]. Compared with CMAS, the hierarchical service restoration approaches have lower computation and communication costs, an advantage that gradually erodes as the number of hierarchical levels increases [26].
To compensate for the above-mentioned shortcomings, DMAS or fully decentralized multiagent systems (FDMAS) have been proposed for service restoration tasks in recent years. In [27][28], Sharma, et al. proposed two DMAS-based approaches for service restoration, considering DG islanding and an uncertain environment respectively. In [29], Hafez, et al. presented a decentralized technique for active radial distribution networks, and in [30], a decentralized method was developed for distribution network restoration using a convex optimization model. In [31][32][33], some distributed multiagent-based methods were proposed for self-healing of distribution networks. Ref. [34] contributed a solution for self-healing dilemmas between fault correction and fault tolerance. Moreover, a few FDMAS-based service restoration algorithms were recently developed in [35][36][37]. To the best of our knowledge, the DMAS or FDMAS-based service restoration algorithms in [27,29,36,37] are the most representative ones to date. In DMAS or FDMAS, the decision-making agents depend on the fault points, which makes the technique decentralized or fully decentralized [29]. Note that the fault point (FP) in this paper is where an electrical fault occurs unless otherwise specified.
A promising self-healing strategy should at least provide an integrated solution for fault location, isolation and service restoration (FLISR) of the distribution system. However, the aforementioned studies and most of the published work have considered the two stages of self-healing only independently (i.e., one stage for fault location and isolation, and the other stage for service restoration) [1,2]. With this in mind, a few researchers have recently proposed approaches to address this problem. In [38][39][40], a few comprehensive resilience-oriented FLISR methods were presented for service restoration under extreme events. Since the resilience-based FLISR methods rely on centralized mathematical optimization, they have inherent weaknesses regarding single points of failure and self-healing speed. The authors in [41] proposed a communication-assisted protection and FLISR control scheme for distribution systems based on IEC 61850. In [42], a self-healing framework was proposed on the basis of MAS considering the penetration of DGs. In [43], a distributed multiagent-based FLISR algorithm was presented for self-healing grids, but the algorithm is not guaranteed to find an optimal solution [43]. Nevertheless, a complete automatic self-healing system should also furnish the smart distribution system with a fault tolerance function in case of device failure (i.e., backup protection), emergency measures and a regression mechanism for the pre-fault configuration. Further, the complexity and multistage nature of MAS-based self-healing control bring new challenges for unified programming of the control system, especially for FDMAS.

C. CONTRIBUTIONS
In this paper, an integrated multistage FDMAS self-healing strategy is proposed for distribution networks with DGs. It affords a more complete self-healing capacity, including self-perception, device fault tolerance, FLISR and regression to the initial state, and it performs well in terms of the real-time behavior of self-healing decisions and operations. The main contributions of this paper are summarized as follows:
1) This paper presents an integrated multistage strategy based on the multistage characteristics of self-healing control. The strategy decomposes the complex self-healing problem into simpler phased sub-problems that are collaboratively addressed by different algorithms at different stages. This differs from the previous heuristic-rule-based [27,36], expert-rule-based [29], graph-theory-based [37] and multistate-based [17][18][19] FDMAS or centralized service restoration methods, and from the centralized resilience-based [38][39][40] and DMAS-based [41][42][43] FLISR approaches.
2) A unified FDMAS programming control framework is proposed to facilitate the generalization and application of the proposed strategy. In contrast to other FDMAS methods [27,29,36,37], in the presented framework the decision-making agents and the tasks of the other agents change dynamically with the fault points and self-healing stages, and all agents, as independent actuators, collaboratively complete the general self-healing objective, which makes the self-healing control fully decentralized.
3) A service restoration algorithm combining network reconfiguration with intentional islanding is presented for distribution networks, unlike other FDMAS methods [27,29,36,37]. In particular, based on the inverted multi-tree breadth-first search (IMTBFS), the proposed network reconfiguration sub-algorithm can restore the maximum possible out-of-service loads with the minimum number of switching operations and with an excellent service restoration time.
4) Based on Kirchhoff's current differential principle and expert logical rules, an integrated primary and backup protection algorithm is developed in this paper, different from the binary logic rule-based method in [10], where the proposed backup sub-algorithm provides a fault tolerance measure for device failures.
The remainder of this paper is organized as follows. Section II presents the FDMAS control architecture. The methodology composed of integrated self-healing control framework, situation awareness, self-adaption of communication topology, fault location and isolation, service restoration and regression of initial state, is developed in Section III. Illustrative cases are provided in Section IV. Finally, Section V concludes this work briefly.

II. PROPOSED FDMAS CONTROL ARCHITECTURE
In this section, the definition of control agents is given in Subsection II-A, the self-healing stages are illustrated in Subsection II-B, and the communication and coordination among agents at different stages are described in Subsection II-C. The control strategy of the proposed architecture, including the integrated overall control flow/framework and the decision-making functions, is illustrated in the next section.

A. CONTROL AGENTS
In this paper, the control agents are composed of the following four types of agents: general bus agent (BA), feeder bus agent (FBA), tie-switch bus agent (TBA) and DG bus agent (DBA). The four types of BAs collect bus voltages, branch currents and bus loads in real time, and periodically share them with other agents in a request-response manner [37]. In addition, the FBA is responsible for calculating the capacity margin of its feeder and sharing it with the TBAs connected with the feeder, the TBA shares information with the decision-maker of service restoration, and the DBA makes decisions on intentional islanding.

B. SELF-HEALING STAGE DIVISION
Based on situation awareness information, the stages of the self-healing process, as shown in Fig. 1, are divided as follows:
1) Normal operation stage (NOS): There are no faults or abnormal states at NOS. Consequently, each defined agent, as a decision-making agent, periodically repeats the routine work (self-perception and self-diagnosis), including collection and sharing of information and self-checking of communication (or agent), etc.
2) Abnormal communication stage (ACS): When a failure of communication (or of an agent) has been detected, the system goes into ACS from NOS. At this stage, the neighbors of the communication-failed agent, as decision-makers, become neighbors of each other through the self-adaption algorithm of communication topology (described in Section III-C). After that, the system returns to NOS.
3) Electrical fault stage (EFS): When an electrical fault occurs, the system falls into EFS. At EFS, first of all, the agents on the faulty feeder become decision-makers to start the fault location operation. After that, the agents that locate the electrical fault upstream and downstream of the fault by the primary or backup location algorithm (described in Section III-D) are granted decision-making authority to isolate the fault.
4) Service restoration stage (SRS): After the electrical fault is isolated, the system goes into SRS. At this stage, the fault isolation agent downstream of the fault (called FDFIA), or the DBA with droop control (abbreviated as DCDBA) in an intentional island, is granted decision-making authority to implement the network reconfiguration or intentional islanding task.
5) Regression stage (RES): If the service restoration operations are completed and the faulty section is repaired by crews, the system falls into RES. At this stage, FDFIA/DCDBA, as decision-maker, performs the regression operation to return the system to its initial pre-fault configuration (i.e., NOS).
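The stage transitions described above can be sketched as a small state machine. This is a minimal illustration under stated assumptions, not the authors' implementation: the event names (`comm_failure_detected`, `fault_isolated`, and so on) are hypothetical labels for the transition conditions, and only the transitions explicitly named in the stage descriptions are encoded.

```python
from enum import Enum


class Stage(Enum):
    NOS = "normal operation"
    ACS = "abnormal communication"
    EFS = "electrical fault"
    SRS = "service restoration"
    RES = "regression"


# Hypothetical event names; each entry mirrors one transition in the list above.
TRANSITIONS = {
    (Stage.NOS, "comm_failure_detected"): Stage.ACS,
    (Stage.ACS, "topology_updated"): Stage.NOS,
    (Stage.NOS, "electrical_fault"): Stage.EFS,
    (Stage.EFS, "fault_isolated"): Stage.SRS,
    (Stage.SRS, "restored_and_repaired"): Stage.RES,
    (Stage.RES, "regression_done"): Stage.NOS,
}


def next_stage(stage, event):
    """Return the next stage, or stay in the current one if the event does not apply."""
    return TRANSITIONS.get((stage, event), stage)
```

Walking the events `electrical_fault`, `fault_isolated`, `restored_and_repaired` and `regression_done` from NOS returns the system to NOS, which matches the loop drawn through EFS, SRS and RES.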

C. COMMUNICATION AND COORDINATION BETWEEN AGENTS
In the proposed control architecture, the communication links between agents and the information required for decision-making differ across self-healing stages. As depicted in Fig. 2(a), the agents at NOS, ACS and EFS communicate with their neighbor agents and the neighbors of their neighbors. Even so, the contents of the messages exchanged differ among the three stages. At NOS, each agent exchanges load information and communication-checking frames with its neighbors and the neighbors of its neighbors, and TBAs collect the capacity margins of feeders from the FBAs connected with them to prepare for possible service restoration at SRS. At ACS, the neighbors of the communication-failed agent become new neighbors of each other by exchanging communication-confirming frames. At EFS, each agent shares its current information with its neighbors and the neighbors of its neighbors to detect and locate the fault; after that, the agents that have located the fault become decision-makers to isolate it.

[Fig. 2, panels (a)-(c). Legend: electrical line; message between agents; repaired fault section; opened switch; closed tie-switch.]
At SRS (Fig. 2(b)), FDFIA/DCDBA communicates with the other agents in the out-of-service area to acquire their load/DG information, and communicates with the TBAs connected with the out-of-service area to obtain the capacity margins of the reconfigurable lines. At RES (Fig. 2(c)), FDFIA/DCDBA sends the reverse switching sequence instructions to the TBAs closed at SRS, the DBAs in the out-of-service area, the BAs that isolated the fault and the BAs that shed loads, in order to return the system to NOS. In this paper, the IEC 61850 protocol and generic object oriented substation event (GOOSE) messages are applied to the synchronous communication between agents.
From Fig. 2, one can see that in the proposed FDMAS control architecture, the communication links and message propagation between agents differ across self-healing stages, since the decision-makers and their decisions vary with the stage (Fig. 1). The proposed architecture thus lays the foundation for the presented fully decentralized self-healing control. The decision-making functions of the agents related to the proposed architecture are described in detail in Table I of the next section.
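The two-hop communication scope used at NOS, ACS and EFS (neighbors plus neighbors of neighbors) can be illustrated with a short sketch. The adjacency dictionary, the agent names and the function `two_hop_peers` are hypothetical and serve only to show how such a peer set might be derived from a stored topology.

```python
def two_hop_peers(adjacency, agent):
    """Return (neighbors, neighbors_of_neighbors) of an agent.

    adjacency: dict mapping each agent name to an iterable of its direct neighbors.
    """
    first = set(adjacency[agent])
    second = set()
    for neighbor in first:
        second.update(adjacency[neighbor])
    # Exclude the agent itself and its direct neighbors from the second ring.
    second -= first | {agent}
    return first, second


# Hypothetical four-agent feeder: A - B - C - D
feeder = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
neighbors, second_ring = two_hop_peers(feeder, "B")
```

For agent `B` in this chain, the direct neighbors are `A` and `C`, and the only neighbor of a neighbor is `D`.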

III. PROPOSED INTEGRATED MULTISTAGE SELF-HEALING STRATEGY
An integrated multistage control framework is first proposed for the self-healing distribution system in Subsection III-A, and the detailed control algorithms and operations at the different stages are given in the remaining subsections.

A. SELF-HEALING CONTROL FRAMEWORK
The proposed multistage self-healing framework is described in Fig. 3. It is mainly composed of situation awareness, self-adaption of communication, fault detection and isolation (i.e., integrated primary and backup protection), service restoration and regression to the initial state. At NOS, each agent collects electrical information, shares load information with its neighbors and checks its communication channel. When the communication of an agent fails, its neighbors self-adaptively adjust their communication structure to become new neighbors of each other at ACS. Once a fault occurs, the agents on the faulty feeder start the primary (or backup) protection to locate and isolate the fault at EFS. After that, FDFIA/DCDBA is granted decision-making authority to implement the service restoration tasks at SRS. After the faulty section is repaired by crews, the reverse sequence operation instructions are issued by FDFIA/DCDBA so that the distribution system is returned to its normal pre-fault configuration. Table I gives the transitions between stages, the stage transition conditions/flags, and the decision-makers and decisions at every stage of the proposed self-healing control framework. The criteria for transitions between stages and the decision-making operations/algorithms at every stage are explained in detail in the remaining subsections.
From Table I, Fig. 2 and Fig. 3, one can see that in the proposed self-healing strategy, the decision-making agents and their decision-making operations vary dynamically with fault points and self-healing stages. Besides, in the self-healing process, every agent in the proposed framework is an independent actuator: at a given stage, the decision-making agents make decisions and the other agents act as executors, and all of them complete the general self-healing objective collaboratively. Consequently, the proposed control framework can be programmed uniformly for all agents, which facilitates the generalization and application of the proposed self-healing strategy.

B. SITUATION AWARENESS
Situation awareness of the system is an important prerequisite of self-healing control decisions throughout the distribution system, and it can be dynamically perceived through the information collected by agents. After the agents are initialized, the self-healing system goes into NOS. In this stage, each agent reads and stores its communication topology in an adjacency list, divides the primary and backup protection zones in preparation for protection actions, and continuously collects the following information: 1) voltages and currents for electrical fault awareness; 2) communication frames for failure detection of communication links; 3) bus loads for communication self-adaption and service restoration.

1) Self-check mechanism of communication
To check the communication links between agents, each agent periodically sends link-detection frames to its neighbors and the neighbors of its neighbors at NOS. Let the numbers of detection frames sent and acknowledgement frames received in each checking cycle be $N_{Total}$ and $N_{Ack}$, respectively. Then a failure of the communication link between any two agents can be detected by

$$\frac{N_{Ack}}{N_{Total}} < K \qquad (1)$$

where $K$ is the trust threshold and is set to 0.4 in this paper. If all communication links of an agent fail, the communication failure of the agent (or agent failure) can be determined by the agent itself. Moreover, the communication failure of the agent can also be diagnosed by its neighbors if they detect that the failed agent cannot communicate with any of its neighbors [10].
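A minimal sketch of this self-check follows. It assumes that the link criterion compares the acknowledgement ratio N_Ack / N_Total against the trust threshold K, with a ratio below K flagging a failed link; the direction and strictness of the inequality are assumptions, and the function names are hypothetical.

```python
def link_failed(n_ack, n_total, k=0.4):
    """Flag a link as failed when the acknowledgement ratio drops below k.

    Assumption: the criterion is n_ack / n_total < k (strict inequality).
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_ack / n_total < k


def agent_failed(link_stats, k=0.4):
    """An agent is treated as failed when every one of its links has failed.

    link_stats: dict mapping peer name -> (n_ack, n_total) for one checking cycle.
    """
    return bool(link_stats) and all(
        link_failed(n_ack, n_total, k) for n_ack, n_total in link_stats.values()
    )
```

For example, 3 acknowledgements out of 10 detection frames gives a ratio of 0.3, which is below 0.4 and marks the link as failed, whereas 5 out of 10 does not.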

2) Self-adaption of communication topology
Once the communication of an agent (or the agent itself) fails and the communication failure is confirmed by its neighbors (i.e., the condition for the transition from NOS to ACS is met), the self-healing system enters ACS. At this stage, the neighbors of the communication-failed agent, as decision-makers, become new neighbors of each other by self-adaptively updating the communication topology, and they report the communication failure to the substation in real time. Accordingly, the related protection zones (illustrated in the next subsection) are also updated immediately.
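The topology update at ACS can be sketched as a graph operation: the failed agent is removed and its former neighbors are linked to each other. The sketch below assumes the topology is held as a dictionary of neighbor sets; the function name `bypass_failed_agent` is hypothetical, and reporting to the substation is not modeled.

```python
def bypass_failed_agent(adjacency, failed):
    """Return a new topology in which the neighbors of `failed` become mutual neighbors.

    adjacency: dict mapping agent name -> set of direct neighbors.
    """
    orphans = set(adjacency.get(failed, ()))
    # Drop the failed agent from every neighbor set and from the key set.
    updated = {
        agent: set(peers) - {failed}
        for agent, peers in adjacency.items()
        if agent != failed
    }
    # Connect the former neighbors of the failed agent to each other.
    for agent in orphans:
        updated[agent] |= orphans - {agent}
    return updated


# Hypothetical chain A - B - C - D in which agent B loses communication.
before = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
after = bypass_failed_agent(before, "B")
```

In this example `A` and `C` become direct neighbors, so the protection zones that previously relied on `B` can be redrawn around the new `A`-`C` link.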

1) Protection zone division
A primary protection zone (PPZ) is a minimal protection unit including minimal branch or bus protection zone, and a backup protection zone (BPZ) is defined as a minimal extension of PPZ [10]. It is noted that the protection zone is divided at NOS.

2) Fault detection and location
Once a fault occurs, the branch currents or bus voltages (or the zero-sequence voltage) on the faulty feeder rise above or fall below their limits, which satisfies the protective starting conditions (2) and (3). In the fault location criterion (5), $\dot{I}_{bl}$ is the current flowing through branch $bl$ into the PPZ or BPZ, $n$ is the number of branches on the boundary of the PPZ or BPZ, $I_{Thre}$ is the current threshold accounting for measurement errors, and $F$ is a safety factor that is set to 1 for primary protection and is typically in the range of 1.2-1.5 for backup protection [44]. As a consequence, the agents on the boundary of the PPZ and BPZ covering the fault point can locate the fault by using (5). Note that the proposed primary and backup fault location algorithm is based on KCL current differential protection, which is different from the binary logic rule-based method in [10].
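The KCL current differential idea can be illustrated as follows. The sketch assumes that the location criterion flags a fault inside a zone when the magnitude of the phasor sum of all boundary currents flowing into the zone exceeds F times I_Thre; this form is an assumption made for illustration, and the function name and the numerical values are hypothetical.

```python
def fault_in_zone(boundary_currents, i_thre, f=1.0):
    """KCL-style differential check for one protection zone.

    boundary_currents: complex phasors (in A) of the currents flowing INTO the zone
                       through each boundary branch.
    i_thre:            threshold (in A) covering measurement errors.
    f:                 safety factor (1 for primary, e.g. 1.2-1.5 for backup).
    Assumed criterion: |sum of boundary currents| > f * i_thre.
    """
    residual = abs(sum(boundary_currents))
    return residual > f * i_thre


# Healthy zone: what flows in on one side flows out on the other (small mismatch).
healthy = [complex(100.0, -20.0), complex(-99.5, 20.2)]
# Faulted zone: both ends feed the fault, so the currents no longer cancel.
faulted = [complex(400.0, -300.0), complex(50.0, -30.0)]
```

With a threshold of 5 A, the healthy zone leaves a residual of about 0.54 A and is not flagged, while the faulted zone leaves a residual of roughly 558 A and is flagged for both primary (F = 1) and backup (F = 1.5) settings.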

3) Fault isolation
After fault location by primary and backup protection, primary protection has priority over backup protection in isolating the fault, to avoid the expansion of the fault range that the latter would cause. Backup fault isolation operations are not activated unless the related primary protection fails because of failed devices, according to the following rules:
Rule 1: If an agent locates a fault in its BPZ but the fault is not detected in its PPZs by itself and its neighbors due to CT failure, the agent can assert that a CT has failed at the overlap between the PPZs within its BPZ, and it then switches off the CBs on the boundary of the BPZ in which the fault is located.
Rule 2: If an agent receives a redundant backup trip instruction from its neighbor (because of the rejection of the neighbor's CB), the agent outputs a backup trip signal to isolate the fault.
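The two backup rules (CT failure and CB rejection) can be written as a small decision function. This is a hedged sketch: the boolean inputs, the returned action labels and the function name `backup_decision` are hypothetical, and timing aspects such as the predefined delay are not modeled.

```python
def backup_decision(fault_in_bpz, fault_in_any_ppz, backup_trip_requested):
    """Decide the isolation action of one agent.

    fault_in_bpz:          the agent located a fault in its backup protection zone.
    fault_in_any_ppz:      the fault was located in a primary zone by the agent or a neighbor.
    backup_trip_requested: a neighbor asked for a backup trip (its own CB rejected the trip).
    """
    if fault_in_any_ppz and not backup_trip_requested:
        # Primary protection has priority over backup protection.
        return "primary_trip"
    if backup_trip_requested:
        # CB rejection at the neighbor: trip the local CB as backup.
        return "backup_trip_own_cb"
    if fault_in_bpz:
        # CT failure: fault seen in the BPZ only, so trip the BPZ boundary CBs.
        return "backup_trip_bpz_boundary"
    return "no_action"
```

The ordering of the checks reflects the stated priority: backup actions are reached only when primary location did not succeed or a neighbor explicitly requested help.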
It is noted that the backup protection under communication failure is contained within the proposed primary protection after updating communication topology and protection zones.

E. SERVICE RESTORATION
After fault isolation, the loads downstream of the fault are out of service, and accordingly the self-healing system enters SRS. The task at this stage is known as service restoration [1], which includes two modes, i.e., the network reconfiguration mode and the intentional islanding mode. In the former, the out-of-service area can be restored through grid connection via valid tie-switches; in the latter, the out-of-service area is not network-reconfigurable but can be powered steadily by DGs. Note that the DGs in an intentional island must have sufficient backup capacity and include a v-f or droop control model [45]. Fig. 4 gives the processing flow of service restoration (i.e., a sub-flow of SRS in Fig. 3). As shown in Fig. 4, if the out-of-service area is reconfigurable, FDFIA becomes the decision-maker; if the out-of-service area is an intentional island, DCDBA is granted decision-making authority; the other agents, without decision-making authority, complete auxiliary operations for service restoration. The network reconfiguration and intentional islanding algorithms are illustrated at length in the next two subsections.

1) Network reconfiguration
Inverted multi-tree (IMT) model (Fig. 5): 1) Search and obtain the original out-of-service network (OSN) by breadth-first search (BFS); 2) simplify the OSN by merging the branches without tie-switches into their respective branch cross nodes; 3) construct an IMT model $G_{Sim}(V, E)$ for the OSN by taking the tie-switches as end nodes of branches and adding an artificial source node connected with all tie-switch nodes as the source node, where node $v \in V$ represents a bus, a tie-switch or the artificial source node, edge $e \in E$ denotes a branch line, and the capacity margins of the tie-switches represent those of their respective reconfigured feeders.
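The third construction step, adding an artificial source node connected to all tie-switch nodes and then searching breadth-first from it, can be sketched as below. The node names, the edge list and the helper functions are hypothetical; extraction and simplification of the OSN (the first two steps) are assumed to have been done already.

```python
from collections import deque


def build_imt(branches, tie_switches, source="S"):
    """Build an adjacency dict for the simplified OSN plus an artificial source node.

    branches:     iterable of undirected edges (u, v), including tie-switch-to-bus edges.
    tie_switches: names of the tie-switch nodes; each is linked to the artificial source.
    """
    adj = {source: set(tie_switches)}
    for ts in tie_switches:
        adj.setdefault(ts, set()).add(source)
    for u, v in branches:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_levels(adj, source="S"):
    """Breadth-first search from the artificial source; returns node -> depth."""
    level = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in sorted(adj[u]):
            if v not in level:
                level[v] = level[u] + 1
                queue.append(v)
    return level


# Hypothetical OSN: two tie-switches whose search paths meet at bus b2.
imt = build_imt(
    [("TS_a", "b1"), ("b1", "b2"), ("TS_b", "b3"), ("b3", "b2")],
    ["TS_a", "TS_b"],
)
levels = bfs_levels(imt)
```

Because every tie-switch sits at depth 1, the search expands all candidate supply paths in parallel, and a bus such as `b2` that is reachable from two tie-switches is where two search paths intersect.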
Network reconfiguration algorithm: An IMTBFS algorithm is presented for network reconfiguration, which includes the following steps:
Step 1: Search $G_{Sim}(V, E)$ from the artificial source node $v_s$ by BFS.
Step 2: For each search path, when a new node $n$ is added to the current path $l$, the capacity constraints of path $l$ are checked by

$$\sum_{i=1}^{n} \lambda_{l,i} P_{l,i} \le P_{l}^{M}, \qquad \sum_{i=1}^{n} \lambda_{l,i} Q_{l,i} \le Q_{l}^{M} \qquad (6)$$

where $i$ is the index of the traversed nodes, $\lambda_{l,i}$ is the load priority factor of node $i$ in path $l$, $P_{l,i}$ and $Q_{l,i}$ are the active and reactive loads of node $i$, and $P_{l}^{M}$ and $Q_{l}^{M}$ are the active and reactive capacity margins of the tie-switch in path $l$, respectively. If (6) is met for the new node and the new node is not a cross node between two paths, add the new node to the restored load set and repeat Step 2; else if (6) is not met for the new node, shed part of the loads in the currently searched path according to their priority until (6) is met, and repeat Step 2; else go to Step 3.
Step 3: If two searched paths intersect at a new node, disconnect the two paths from each other to meet the radial network constraint, and stop the search of the path with the smaller capacity margin, while the path with the larger margin goes to Step 2.
Step 4: If all nodes have been traversed, and if there exists a pair of paths that were originally connected with each other and the capacity margin of one path is greater than or equal to all of the original loads of the other, restore the connection between the two paths and invalidate the tie-switch of the latter to minimize the switching operations; else go to Step 5.
Step 5: For two paths originally connected with each other, if one path has enough capacity margin (i.e., a remaining margin greater than 0) and the other has shed loads earlier in the search process, restore the connection between them, and the former goes to Step 2 to take on part of the loads of the latter; after that, the latter restores all or part of its loads according to their priority for maximum load restoration until (6) is no longer satisfied.
Step 6: End.
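The capacity check and priority-based shedding of Step 2 can be illustrated for a single search path. This sketch is a simplification, not the full IMTBFS: it ignores path intersections and the re-connection logic of Steps 3 to 5, it treats the priority as a plain ranking number (larger means more important, an assumption), and the node data are hypothetical.

```python
def restore_along_path(nodes, p_margin, q_margin):
    """Walk one search path and keep the served load within the capacity margins.

    nodes:    list of (name, p_kw, q_kvar, priority) in BFS order from the tie-switch;
              a larger priority number means a more important load (assumption).
    p_margin: active capacity margin of the tie-switch on this path (kW).
    q_margin: reactive capacity margin of the tie-switch on this path (kvar).
    Returns (restored_names, shed_names).
    """
    restored, shed = [], []
    p_used = q_used = 0.0
    for node in nodes:
        restored.append(node)
        p_used += node[1]
        q_used += node[2]
        # Shed the least important loads until both constraints hold again.
        while restored and (p_used > p_margin or q_used > q_margin):
            victim = min(restored, key=lambda item: item[3])
            restored.remove(victim)
            shed.append(victim)
            p_used -= victim[1]
            q_used -= victim[2]
    return [n[0] for n in restored], [n[0] for n in shed]


# Hypothetical path with three load buses and a 300 kW / 100 kvar margin.
path = [("b1", 100, 30, 3), ("b2", 200, 60, 1), ("b3", 150, 40, 2)]
served, dropped = restore_along_path(path, 300, 100)
```

Here the first two buses exactly fit the margin; adding `b3` violates it, so the lowest-priority load `b2` is shed and `b1` and `b3` remain served with 250 kW and 70 kvar.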

2) Intentional islanding
If the OSN cannot be reconfigured but there exist DGs with droop or v-f control, then after fault isolation the OSN forms a stable re-energized intentional island by the following steps, where $d$ is the index of a load, and $D_p$ and $T_s$ are the set of picked-up loads and the set of started DGs in the OSN, respectively. During the service restoration operations, if the DGs have low voltage ride-through (LVRT) capability and the total time of the self-healing operations is less than 3 s [45], the grid/OSN connection of the DGs is held; otherwise, the DGs are disconnected.
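A hedged sketch of two ingredients of intentional islanding follows: picking up loads in priority order within the DG capacity, and the LVRT condition for holding the DG connection. The greedy pick-up rule, the function names and the split of loads are assumptions for illustration; the totals of 2236 kVA (DG capacity) and 693 kVA (out-of-service load) are the figures reported for Case 3 in the case studies.

```python
def pick_up_loads(loads, dg_capacity_kva):
    """Greedily pick up loads in descending priority without exceeding DG capacity.

    loads: list of (name, kva, priority); a larger priority number means a more
           important load (assumption).
    Returns (picked_names, total_kva_picked).
    """
    picked, used = [], 0.0
    for name, kva, _priority in sorted(loads, key=lambda item: item[2], reverse=True):
        if used + kva <= dg_capacity_kva:
            picked.append(name)
            used += kva
    return picked, used


def keep_dg_connected(has_lvrt, healing_time_s, limit_s=3.0):
    """Hold the DG connection only with LVRT capability and a short enough self-healing time."""
    return has_lvrt and healing_time_s < limit_s


# Hypothetical split of a 693 kVA out-of-service area into three loads.
island_loads = [("L1", 300, 3), ("L2", 250, 2), ("L3", 143, 1)]
picked, total = pick_up_loads(island_loads, 2236)
```

With a 2236 kVA DG all three loads are picked up, for a total of 693 kVA; with a much smaller DG only the highest-priority loads that fit would be served.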

F. REGRESSION OF INITIAL STATE
If all service restoration operations are implemented without any voltage or frequency anomaly within a certain time limit and the faulty section is completely repaired by crews, the distribution system automatically enters RES. At this stage, FDFIA/DCDBA, as decision-maker, outputs reverse sequence switching instructions to return the distribution system to its pre-fault configuration: 1) switch off the tie-switches closed at SRS for the network reconfiguration mode (or the DGs for the intentional islanding mode); 2) close the switches that were switched off at EFS to isolate the fault upstream and downstream; 3) switch on the branch switches that were opened at SRS to divide the OSN into sub-OSNs for radial configuration; 4) sequentially start the DGs according to their capacity; 5) restore the shed loads according to their priority. After the regression operations are completed, the system goes back to NOS.
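The regression order above can be sketched for the network reconfiguration mode as a mapping from a log of earlier operations to reverse instructions. The log format, the kind labels and the function name are hypothetical; DG handling (the intentional islanding mode and the sequential start of DGs) is deliberately left out of this sketch. The device names are taken from the operations described in Case 2 of the case studies.

```python
# Order in which the recorded operations are undone, following the listed steps
# for the network reconfiguration mode (DG-related steps omitted).
REGRESSION_ORDER = [
    ("tie_switch_closed", "open"),
    ("isolation_switch_opened", "close"),
    ("sectionalizing_switch_opened", "close"),
    ("load_shed", "restore"),
]


def regression_instructions(operation_log):
    """Translate a log of (kind, device) records into ordered reverse instructions."""
    instructions = []
    for kind, action in REGRESSION_ORDER:
        instructions.extend(
            (action, device) for logged_kind, device in operation_log if logged_kind == kind
        )
    return instructions


# Operations recorded during EFS and SRS (device names as in Case 2).
log = [
    ("isolation_switch_opened", "CB 30-31"),
    ("sectionalizing_switch_opened", "SW 32-33"),
    ("load_shed", "bus 32"),
    ("tie_switch_closed", "TS 92"),
    ("tie_switch_closed", "TS 94"),
]
plan = regression_instructions(log)
```

The resulting plan opens the two tie-switches first, then closes the isolation and sectionalizing switches, and finally restores the shed load, regardless of the order in which the operations were logged.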
Since abnormal data from communication or agents may cause conflicting resolutions or even incorrect decisions in the self-healing process, it is essential to provide the related conflict negotiation or fault tolerance for self-healing. Besides the above device fault tolerance for failed primary protection, a general treatment for conflict resolution is provided as follows: if a conflict leads to inappropriate or even wrong decision-making operations, a backtracking operation is implemented to return the distribution system to its previous status; further, the abnormality is reported to the substation in real time. For instance, if the voltage or frequency in the re-energized zone is out of range within a certain time limit after reconfiguration, the tie-switches that have just been switched on are opened again for the electrical safety of the healthy areas.

IV. CASE STUDIES
The 84-bus and 22-bus practical distribution systems, which have been described in [37,46] respectively, are applied to validate the proposed strategy. The former is simulated by MATLAB and JADE [47], while the latter is used to test integration of the proposed strategy through our developed FDMAS.

A. 84-BUS SYSTEM
As shown in Fig. 6, the 84-bus system is composed of 11 feeders and 94 branches, with a rated voltage of 11.4 kV and a maximum feeder transmission capacity of 5,000 kVA. The three-phase loads and the installation buses of the DGs with their associated capacities, as given in [37,46], are assumed to be balanced and constant.

1) Case 1: Fault location and isolation
Primary protection (Scenario 1): When a three-phase short circuit fault (TPSCF) occurs at FP1, the protective starting conditions of F5 (i.e., (2) and (3)) are met, and BA30 and BA31 then locate the fault through the proposed protection method. Subsequently, they trip their CBs at branch line 30-31 to isolate the fault without backup misoperation; Fig. 7 and Table II give the results. Notably, all of the methods (i.e., those in [10,42,48] and the one proposed in this paper) can locate and isolate the fault by primary protection very well. Compared with the methods in [10,42,48], the proposed FDMAS primary protection achieves the same or better performance on fault clearance time.
Backup protection: Furthermore, the backup protections in case of communication, CT and CB failures are investigated in Scenarios 2, 3 and 4 respectively, where a TPSCF occurs at FP3 in Scenario 3, and three-phase grounding faults (TPGFs) occur at FP2 and FP4 in Scenarios 2 and 4 respectively. The test results of backup protection are depicted in Fig. 8 and Table II. From Fig. 8 and Table II, the backup fault clearance time in case of CB failure is close to 137 ms (where the predefined delay is set to 110 ms in this paper). Moreover, the fault clearance range of the proposed backup protection is minimized owing to the minimal extension of the BPZ. Here the methods in [42,48] can each provide only one type of fault tolerance, for communication failure and CT failure respectively. It is therefore evident that, in contrast to the methods in [42,48], the proposed protection algorithm is effective and has advantages in the completeness of the backup protection function and in fault clearance time and range, and that it has comparable or slightly better backup fault clearance time than the method in [10].

2) Case 2: Network reconfiguration
This case is a continuation of Scenario 1 in Case 1. It is evident that the out-of-service area can be restored by network reconfiguration after isolation of the faulty line 30-31, because the reconfigurable feeders F4 and F6 connected with the out-of-service area have ample capacity margins (2214 kVA and 3145 kVA) to carry other loads. Therefore, agent 31 (as an FDFIA) makes a decision on network reconfiguration as follows: 1) switch off the switch 32-33 to divide the out-of-service area into two independent parts; 2) cut off the loads at bus 32; 3) close the tie-switches 92 and 94 to take on the loads in the two divided out-of-service sub-areas respectively. The results are illustrated in Fig. 9 and Table III. All three methods (i.e., the proposed one and those in [29] and [42]) can restore out-of-service loads very well, where the method in [42] brings about the largest lost load and number of switching operations due to its lack of optimization in these two aspects. In comparison with the other two decentralized methods in [29,42], the proposed method outperforms them in terms of maximizing load restoration, switching operations, computation time and restoration time.

3) Case 3: Intentional islanding
Case 3 is a continuation of Scenario 2 in Case 1. Because the out-of-service area cannot be reconfigured but the intentional islanding condition is met, agent 24, as a DCDBA, obtains the decision-making authority for intentional islanding instead of agent 21 (an FDFIA). Because the capacity (2236 kVA) of the DG at bus 24 is far greater than the out-of-service loads (693 kVA), agent 24 makes a decision on intentional islanding: black-start DG3 and pick up the out-of-service loads in order of load priority. The related results are depicted in Fig. 10 and Table III. From Fig. 10, one can see that neither the bus voltages nor the branch power flows in the out-of-service area go beyond their limits. However, the intentional islanding mode is not considered by the other decentralized methods in [29,42].
From the above cases, it can be observed that the proposed strategy has the following advantages: 1) based on a unified programming framework, the proposed multistage self-healing strategy provides a more complete self-healing function, including fault detection and isolation, fault tolerance and service restoration, while jointly considering the network reconfiguration and intentional islanding modes; 2) compared with the methods in [29,42,48], the proposed strategy minimizes the fault clearance time, maximizes the restored load and shows better performance in terms of the number of switching operations, computation time and restoration time.

B. 22-BUS SYSTEM
The complete function and integration of the proposed strategy are investigated in this section. The rated voltage and maximum transmission capacity of the test 22-bus system (Fig. 11) are 10kV and 2000kVA respectively, and its three-phase loads and DG capacities are given in [37]. For safety, the loads, DG capacities and maximum transmission capacity of the system are each scaled down to one-thousandth. Further, the TPGF is simulated by increasing the load current at the fault point, where the overcurrent threshold in (2) is set to 6A. The FDMAS test platform developed by the authors is described in [37].
Case 4 (Dynamic model simulation): At NOS, each agent collects electrical information in real time and communicates periodically with its neighbors and the neighbors of its neighbors to detect communication failures, while a CT is disconnected from K35. When a fault (FP1) occurs at T1=0.089s, as shown in Fig. 11, the primary protections of BA14 and FBA4 fail to detect the fault due to the CT failure, while their backup protection locates the fault promptly. As a result, K34 and K36 switch off to isolate the fault at T2=0.119s, and the fault is finally cleared at T3=0.169s. The overall fault clearance time is about 0.080s, and the corresponding recorded waveforms are depicted in Fig. 12(a, b, c), where K34, K36, K29, TS2 and TS3 denote their related control relays and the trip signals of K34, K36, TS2 and TS3 in Fig. 12(c) are shown at different scales for better experimental observation.
After the fault isolation, the FDFIA (BA14) acquires the decision-making authority for network reconfiguration at SRS. BA14 then communicates with the agents in the out-of-service area and with FBA2 and FBA5 to obtain the out-of-service loads and the capacity margins of the reconfigured feeders respectively. Based on the collected information, BA14 decides on the network reconfiguration and issues the reconfiguration operations sequentially as follows: 1) switch off K29 at T4=0.244s to divide the out-of-service area into two independent subparts; 2) after the tripping operation of K29 is completed successfully, close TS2 and TS3 at T5=0.286s to restore the divided out-of-service loads. Finally, all out-of-service loads are restored at T6=0.324s, and the overall service restoration time is about 0.155s.
After the faulty section is completely repaired by crews (where the repair time of the faulty section is about 113s), BA14, as the decision-maker at RES, implements the reverse sequence of switching operations to return the system to its normal pre-fault configuration, as follows: 1) open TS2 and TS3 at T7=113.549s, as a consequence of which the out-of-service loads are interrupted again at T8=113.587s; 2) after K36, K34 and K29 are switched on sequentially, the loads in the faulty section and the out-of-service loads are gradually picked up at T9=113.587s, T10=113.622s and T11=113.663s respectively. Excluding the manual repair time of the faulty section, the whole self-healing time in Case 4 is about 0.58s. Note that the above tripping operation time may be a little longer at higher voltage levels due to the impact of the high-voltage arc. From Case 4, it can be observed that the proposed strategy can complete the whole multistage, integrated self-healing task for practical distribution systems, and has excellent real-time self-healing performance.
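The stage durations of Case 4 can be cross-checked from the reported timestamps. The sketch below is only an arithmetic check; the variable names and the choice of interval endpoints (clearance from T1 to T3, restoration from T3 to T6, regression from T7 to T11, and a repair time of exactly 113s) are assumptions made for illustration.

```python
# Timestamps T1..T11 in seconds, as reported for Case 4.
T = {1: 0.089, 2: 0.119, 3: 0.169, 4: 0.244, 5: 0.286, 6: 0.324,
     7: 113.549, 8: 113.587, 9: 113.587, 10: 113.622, 11: 113.663}

REPAIR_TIME_S = 113.0  # approximate manual repair time stated in the text


def interval(start, end):
    """Elapsed time in seconds between two timestamps, rounded to 1 ms."""
    return round(T[end] - T[start], 3)


fault_clearance = interval(1, 3)   # fault occurrence to fault cleared
restoration = interval(3, 6)       # fault cleared to all loads restored
regression = interval(7, 11)       # first regression switching to last pickup
# Whole process from fault occurrence to the last pickup, without the repair.
total_excl_repair = round(T[11] - T[1] - REPAIR_TIME_S, 3)

print(f"fault clearance : {fault_clearance:.3f} s")
print(f"restoration     : {restoration:.3f} s")
print(f"state regression: {regression:.3f} s")
print(f"total w/o repair: {total_excl_repair:.3f} s")
```

Under these assumptions the intervals come out at 0.080s, 0.155s, 0.114s and 0.574s respectively.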
Owing to the phased decomposition of the complex self-healing problem and the decentralized multistage solution approach, the proposed multistage FDMAS strategy can accelerate the self-healing process of distribution networks, as demonstrated by the above cases. In contrast to the centralized or CMAS approaches, which have an inherent disadvantage in the real-time performance of decisions and operations [1,3,4,13,26], the proposed FDMAS strategy has an intrinsic deficiency in solution optimality; for example, it does not consider multi-level feeder reconfiguration. Yet it may offer a more reasonable tradeoff between algorithm complexity and real-time self-healing performance. Moreover, compared with other DMAS [28,30-35] or FDMAS [27,29,36,37] methods, the proposed strategy provides more complete and integrated self-healing functions and achieves a shorter FLISR time. Additionally, the fully decentralized architecture proposed in this paper facilitates an integrated programming control framework, which can be embedded seamlessly and conveniently into other systems and thus promotes the practical use of the proposed strategy. In this regard, a likely trend is that future self-healing systems for distribution networks will combine a centralized system with many independent FDMASs that have partial self-healing capabilities.

V. CONCLUSION
In this paper, an integrated multistage strategy based on FDMAS is proposed for self-healing distribution systems with DGs, which covers the entire self-healing process (i.e., fault location and isolation, fault tolerance, service restoration and return to the pre-fault configuration). Based on expert logical rules, the proposed integrated primary and backup protection algorithm is able to detect and clear faults quickly without misoperation, where the proposed backup sub-algorithm provides a fault tolerance measure when the primary protection fails because of device failures. Moreover, the presented service restoration scheme considers both the network reconfiguration mode and the intentional islanding mode, in which the network reconfiguration algorithm based on IMTBFS can maximize the restored out-of-service loads with a minimum number of switching operations. Furthermore, a unified programming framework is presented for the application of the proposed strategy. The efficacy of the proposed FDMAS self-healing strategy is validated through extensive case studies on the 84-bus and 22-bus systems. The test results show that the proposed strategy offers more complete self-healing functions and performs better than other methods with respect to fault clearance time and range and service restoration time.