Distributed architecture of power grid asset management and future research directions power system

With the continuous expansion of power systems, the requirements for power system intelligence keep rising, and power grid fault diagnosis has become a focus of power system research. This paper reviews the field of power grid fault diagnosis and analyzes its data sources, data characteristics and data preprocessing. It then reviews the diagnosis mechanisms, characteristics and deficiencies of the main fault diagnosis methods, together with recent improvements and future research directions, and describes the architecture design of an online diagnosis system. A new power system fault monitoring and networking method is proposed that is more concise and reliable than the traditional scheme. A power system fault diagnosis strategy based on knowledge representation and a fault diagnosis model with a distributed architecture are constructed, which improve diagnostic efficiency. Finally, the problems facing power system fault diagnosis under the background of regulatory big data are discussed, and the development direction of power grid fault diagnosis in this new situation is analyzed.


I. INTRODUCTION
In recent years, the structure, operation modes and uncertainty of the power grid have become increasingly complicated as more electric vehicle charging stations, air conditioning loads, large industrial users, microgrids, user-side energy storage, and interruptible and flexible loads are connected [1,2]. Meanwhile, power demand is increasing year by year and climate change is making extreme weather more frequent, which puts the power system under great pressure and causes increasingly serious security and stability problems. If a large disturbance or failure is not handled in time, it threatens the safe and stable operation of the power system and may lead to a large-area blackout with huge losses to society and the economy. It is therefore necessary to strengthen research on power system fault diagnosis so that failures can be resolved in the shortest possible time and losses minimized. The safe and efficient operation of a large power grid must be guaranteed by security and stability analysis and control based on more flexible, reliable and efficient fault identification and defense control [3,4].
When the power system breaks down, fault diagnosis and the analysis of protection actions are mainly carried out by grid operators according to the alarm information received by dispatching [5][6][7]. After a fault occurs, the data acquisition and monitoring system of the grid uploads a large amount of alarm information to the dispatching center, describing the fault event from various aspects. Operators must then accurately determine the fault location and fault type, evaluate the performance of the protection actions, and prepare the accident analysis report from the alarm information obtained by the control center. All of these steps serve fault handling and power supply restoration. Processing such massive amounts of information and making rapid decisions places high demands on operators; in complex failures that require quick analysis results, it exceeds what manual processing can achieve. A comprehensive analysis of the alarm information that yields preliminary diagnosis results would greatly improve dispatching decisions and reduce manual workload. Therefore, analyzing grid protection operation data is of great significance.
In essence, a diagnosis method applies an intelligent algorithm to the available fault information, reasons backward from it, and identifies the faulted elements. At present, power system fault diagnosis is mainly based on fault symptom information, namely the uploaded fault alarms, switching quantities and electrical quantities. From this information the operator analyzes and judges the fault and evaluates whether the protection and circuit breaker actions were correct. The main methods studied for power system fault diagnosis include expert systems (ES) [8][9][10], artificial neural networks (ANN) [4,11], cluster analysis (CA) [12], Bayesian networks (BN) [13,14], Petri networks (PN) [15][16][17], information fusion and so on.
This paper first summarizes the research field of power grid fault diagnosis, including data sources, data preprocessing, diagnostic algorithms and diagnostic system architecture. The data sources and data characteristics, the methods of data extraction, the fault diagnosis methods and the design of the online diagnosis system are analyzed. Secondly, the diagnostic mechanisms are briefly introduced and recent improvements are reviewed. Future research directions are then elaborated. Finally, the important problems of power grid fault diagnosis under the background of regulatory big data are discussed, and the future development direction is analyzed.

A. Common faults in power system
The power system is the collection of generator sets, power transformers, transmission lines, distribution grids and load terminals involved in the production, transmission and utilization of electric energy. To ensure safe and stable operation, it also includes the relay protection system, the auxiliary decision system, the integrated automation system, the distribution network automation system and the communication system. In addition, there are generator excitation devices, governors and other devices. Several faults are common in power systems.

1) Transmission line faults. In practice, many aging transmission lines are damaged by exposure to sun and wind. In strong wind, conductors may come into contact and cause a short circuit. Such faults can be eliminated by separating the conductors.
2) Transformer faults. The transformer plays an important role in the transmission system; once a transformer fails, it can cause serious damage to the entire power system. The main cause of transformer faults is usually high electric field strength.
3) Bus faults. Bus faults mainly include bus short circuits and maloperation of the bus protection. A bus fault at a core substation causes serious losses.
When a fault occurs, the voltage and current of the faulted element change suddenly. The relay protection device associated with the faulted element then analyzes the measured electrical quantities and decides on the protection action. Finally, the protection system sends a trip signal to the corresponding circuit breaker, which isolates the faulted element from the grid and clears the fault. The relevant information of the fault process is shown in the following figure. As the figure shows, a power system fault can be divided into three stages: electrical quantity change, protection device action and circuit breaker trip, each of which produces a large amount of data reflecting the fault. Because these data have different characteristics, they are collected, stored and accessed by different monitoring systems. Power system fault diagnosis is conducted using the data of these monitoring systems, which can provide comprehensive data services. Together they gather almost all the required data, including steady-state, transient and dynamic grid data, historical and real-time section data, and grid graphics and topology information.
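The three fault stages named above (electrical quantity change, protection device action, circuit breaker trip) can be pictured as a time-ordered event record. The following is a minimal sketch under stated assumptions: the `Stage` and `FaultEvent` names, the millisecond time field and the device identifiers are hypothetical and not taken from any monitoring system described in this paper.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    ELECTRICAL_CHANGE = 1   # sudden change in voltage or current
    PROTECTION_ACTION = 2   # relay protection device operates
    BREAKER_TRIP = 3        # circuit breaker opens


@dataclass(frozen=True)
class FaultEvent:
    time_ms: float   # time stamp of the event (hypothetical unit)
    stage: Stage
    source: str      # hypothetical device identifier


def is_consistent_sequence(events):
    """Return True if, in time order, the events never step back to an earlier stage."""
    ordered = sorted(events, key=lambda e: e.time_ms)
    stages = [e.stage.value for e in ordered]
    return all(a <= b for a, b in zip(stages, stages[1:]))
```

A check of this kind could serve as a simple plausibility filter on uploaded alarms before diagnosis; a breaker trip recorded before any protection action would be flagged for review.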

B. Power system fault diagnosis data
Supervisory control and data acquisition (SCADA) systems were introduced into power systems in the 1970s and provided the data basis for power system fault diagnosis. With the development of computer technology in the 1980s, expert systems began to be applied to power system fault diagnosis using the data provided by SCADA. The real-time data from SCADA were the main data source of early fault diagnosis: the relevant data were analyzed and judged to obtain diagnostic results. The wide application of SCADA promoted the rapid development of power system fault diagnosis technology. However, SCADA cannot collect transient data and only supports analysis of event data such as protection and circuit breaker states, so the credibility of the diagnostic results depends heavily on the accuracy and completeness of the collected data.
The wide area measurement system (WAMS) uses phasor measurement units (PMU) to collect and transmit grid section data. It can accurately record dynamic data during power system faults and thus monitor the dynamic process of the power system. Data collected by WAMS carry precise time stamps and are transmitted with better real-time performance, so they provide more accurate real-time section data for fault diagnosis. Since the 1990s, WAMS data have therefore gradually been applied to power system fault diagnosis. However, because of its high cost and some immature key technologies, WAMS is applied in systems of 220 kV and above. With the gradual popularization of PMUs, practical research on power system fault diagnosis based on WAMS data has broad prospects.
The block diagram of the central virtual database system is shown in Fig. 2. The main service process (DataCentreServer) provides subscription and query services to application functions by accessing real-time and historical data. A local data buffer avoids repeated acquisition of the same data. The data center wraps the relevant communication interfaces and exposes interface functions through which applications call the data. At present, the data used in power system fault diagnosis come mainly from SCADA and WAMS. With the comprehensive advancement of smart grid construction in recent years, a large number of automation devices and systems have been deployed in power systems, so the sources of fault diagnosis data have become more diversified. Smart grid construction thus provides a solid data foundation for fault diagnosis. If these data sources can be used reasonably and fully for accurate and rapid fault diagnosis, they will create favorable conditions for the safe and stable operation of the power grid.
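The role of the local data buffer can be illustrated with a small cache placed in front of a data source. This is only a sketch of the general idea, not the DataCentreServer implementation: the class name `BufferedDataCentre`, the time-to-live policy and the `fetch` callable standing in for SCADA or WAMS access are all assumptions.

```python
import time


class BufferedDataCentre:
    """Serve queries from a local buffer; fetch from the source only on a miss or expiry."""

    def __init__(self, fetch, ttl_seconds=5.0, clock=time.monotonic):
        self._fetch = fetch        # callable(key) -> value; hypothetical source access
        self._ttl = ttl_seconds    # how long a buffered value stays valid (assumed policy)
        self._clock = clock
        self._buffer = {}          # key -> (time stored, value)
        self.fetch_count = 0       # number of real acquisitions, for inspection

    def query(self, key):
        now = self._clock()
        cached = self._buffer.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]       # buffered value is still fresh
        value = self._fetch(key)   # acquire from the underlying source
        self.fetch_count += 1
        self._buffer[key] = (now, value)
        return value
```

Repeated queries for the same key within the validity window hit the buffer, so the underlying source is read once. A real system would also need an invalidation rule for real-time data, which this sketch reduces to a fixed time-to-live.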

III. Power system fault diagnosis methods
Research on power grid fault diagnosis reached the system level in the 1970s. With the rapid development of communication technology, the data obtained during grid operation have become increasingly rich. Applying intelligent algorithms to the collected power big data is a research hotspot in power grid fault diagnosis. A fault diagnosis method uses an intelligent algorithm to process the available fault information and to identify the faulted elements from it.

A. Expert system (ES)
An expert system (ES) uses a computer model to simulate the reasoning of human experts, explaining observations and drawing conclusions as an expert would. The method associates the action logic of protections and circuit breakers with the protected elements, expresses the diagnostic experience of experts as rules, and thereby forms the knowledge base of the expert system. When the power grid fails, the input alarm and action information is logically matched against the rules to identify the faulted element, and the result is explained [10].
The expert system fault diagnosis model is shown in Fig. 3 and consists of five parts. The inference engine uses the health information from the knowledge base and interacts with the dynamic database. The explanation system responds to the user and explains the reasons for a diagnosis decision. The user interface provides data transmission, result acquisition, consultation and related functions.
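The rule-matching step of an expert system can be sketched as a lookup of rules whose conditions are all present in the received alarms. The rule names, alarm identifiers and elements below are hypothetical, and the matching is deliberately simplified to a subset test; it is an illustration under those assumptions, not the system of [10].

```python
# Knowledge base: (rule name, required alarms, diagnosed element). All entries are hypothetical.
RULES = [
    ("R1", {"line_L1_main_protection", "CB1_open", "CB2_open"}, "line L1"),
    ("R2", {"bus_B1_differential", "CB1_open", "CB3_open"}, "bus B1"),
    ("R3", {"transformer_T1_differential", "CB3_open"}, "transformer T1"),
]


def diagnose(alarms, rules=RULES):
    """Return (element, explanation) for every rule whose conditions all appear in the alarms."""
    alarms = set(alarms)
    results = []
    for name, conditions, element in rules:
        if conditions <= alarms:   # inference step: all rule conditions are satisfied
            explanation = f"{name}: {', '.join(sorted(conditions))} => {element}"
            results.append((element, explanation))   # explanation mimics the explanation system
    return results
```

An alarm set that matches no rule returns an empty list, which is the simplest picture of why faults outside the rule base are hard for such a system to diagnose.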

FIGURE 3. Framework of expert system-based fault diagnosis model
Through continuous development, many researchers have optimized expert systems by introducing intelligent algorithms and have achieved fruitful results. However, some challenges remain in the practical application to power grid fault diagnosis.
1) Expert systems are limited to the specific scenarios and devices covered by their rules, and it is difficult to give accurate diagnostic results for faults outside the diagnostic rules.
2) During diagnosis the input information must be logically matched against the knowledge base, which requires a large amount of intermediate reasoning and computation.
3) As power construction proceeds, the grid keeps expanding and its structure becomes increasingly complex, so building the rule knowledge base becomes more complex as well. When the grid architecture changes, the knowledge base is difficult to update and maintain.
4) Accurate diagnosis becomes more difficult when the information is incomplete or overlapping.
These problems hinder the mature application of expert systems in practice, so intelligent algorithms need to be introduced to address them. The fault tolerance, maintainability and scalability of such systems also need further study.

B. Petri network (PN)
In 1962, Carl Adam Petri in the Federal Republic of Germany first proposed using net-structured graphs to represent communication systems, and this model was later called the Petri network. Petri networks are widely used for modeling discrete-event dynamic systems. They offer a visual graphical language and handle discrete information, which makes them a very effective modeling tool.
The power grid bears the heavy responsibility of electric power transmission, and the failure of any link blocks that transmission. The faults in the grid are classified by the faulted component, following the links and important components of the grid, as shown in Table 1. The power system fault diagnosis process requires information from the SCADA system. When the information reaches the control center, the operator analyzes the data and diagnoses the fault, so the accuracy and speed of the diagnosis depend entirely on the operator's experience. However, as the complexity of the power system increases, especially with multiple failures or incorrect operation of protection devices, the amount of information the operator must process may become so large that the fault cannot be diagnosed correctly and in time. Researchers at home and abroad have therefore combined computing with power system fault diagnosis and proposed fault diagnosis models based on big data and Petri network theory for complex power systems. Existing research on Petri-network-based power system fault diagnosis falls into the three directions shown in Fig. 4: algorithm optimization, model structure improvement and time-sequence information processing.
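The token-firing mechanism behind Petri-network diagnosis can be sketched in a few lines. The net below is a hypothetical example, not one of the models in the cited literature: places hold tokens for received protection and breaker signals at the two ends of a line, and a transition fires when all its input places are marked. The sketch assumes arc weight 1 and distinct input places per transition.

```python
def enabled(marking, transition):
    """A transition is enabled when every input place holds at least one token."""
    return all(marking.get(place, 0) >= 1 for place in transition["inputs"])


def fire(marking, transition):
    """Return a new marking: one token consumed per input place, one produced per output place."""
    if not enabled(marking, transition):
        raise ValueError("transition is not enabled")
    new_marking = dict(marking)
    for place in transition["inputs"]:
        new_marking[place] -= 1
    for place in transition["outputs"]:
        new_marking[place] = new_marking.get(place, 0) + 1
    return new_marking


def run(marking, transitions, max_firings=1000):
    """Fire enabled transitions one at a time until none is enabled."""
    marking = dict(marking)
    for _ in range(max_firings):
        ready = [t for t in transitions if enabled(marking, t)]
        if not ready:
            return marking
        marking = fire(marking, ready[0])
    raise RuntimeError("net did not settle within max_firings")


# Hypothetical net for a line L1 protected at ends A and B.
LINE_NET = [
    {"inputs": ["prot_A", "cb_A"], "outputs": ["end_A_isolated"]},
    {"inputs": ["prot_B", "cb_B"], "outputs": ["end_B_isolated"]},
    {"inputs": ["end_A_isolated", "end_B_isolated"], "outputs": ["L1_fault"]},
]
```

Starting from a marking with one token in each of `prot_A`, `cb_A`, `prot_B` and `cb_B`, the net ends with a token in `L1_fault`; if the `cb_B` token is missing, no token reaches that place.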

C. Artificial neural network (ANN)
An artificial neural network is an information processing system that simulates the parallel processing of information in a biological nervous system. It takes alarm information as input and produces diagnostic results as output [26,27], and it can give correct diagnostic results even when the information is uncertain. Neural networks address the knowledge representation problem and offer fast reasoning, strong fault tolerance, good robustness and strong learning ability. However, their diagnostic results lack interpretability, so in practice they struggle to meet the requirements of dispatching operation. A large power grid often contains hundreds of nodes and thousands of circuit breakers, so it is very difficult to obtain a complete sample set for building a diagnostic model. Rapid training cannot be guaranteed either, which seriously affects usability.
To overcome these limitations, various studies have combined ANN with other methods. Guo et al. [28] used neural networks to detect and locate faults in distribution systems. Han et al. [4] proposed an improved convolutional neural network that avoids frequent adjustment of the fault diagnosis model when the power system topology changes; verification on the IEEE 24-bus system shows that the method is robust to noise and generalizes well.
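The mapping from alarm inputs to a diagnostic output can be illustrated with a single artificial neuron trained by the perceptron rule. This is the simplest building block of a neural network and is only a toy sketch; it is not the convolutional network of [4] or the models of [26-28]. The three binary inputs (protection operated, breaker A open, breaker B open) and the labelling rule are assumptions made for the example.

```python
from itertools import product

# Hypothetical training set: the element is labelled faulted only if all three alarms are present.
TRAINING_SAMPLES = [(x, int(all(x))) for x in product((0, 1), repeat=3)]


def predict(weights, bias, x):
    """Step activation: 1 (fault) if the weighted sum of the alarm inputs is positive."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


def train(samples, max_epochs=1000):
    """Perceptron learning rule; stops once a full pass over the samples makes no mistake."""
    n = len(samples[0][0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, target in samples:
            error = target - predict(weights, bias, x)
            if error:
                weights = [w + error * xi for w, xi in zip(weights, x)]
                bias += error
                mistakes += 1
        if mistakes == 0:
            break
    return weights, bias
```

The learned weights reproduce the labels of this small, linearly separable set, but they carry no explanation of why an element is judged faulted, which mirrors the interpretability concern raised above.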

D. Bayesian network
The Bayesian network is a decision analysis tool that emerged with the development of influence diagrams. It provides knowledge representation, reasoning and learning methods under uncertainty and can accomplish tasks such as decision-making, diagnosis, forecasting and classification. Its advantages have gradually become apparent as it has been applied to power system fault diagnosis.
A Bayesian network is a graphical model composed of nodes and directed arcs [29]. The node from which an arc originates is called the parent node, the node to which the arc points is called the child node, and a node without any parent is called a root node [13,29]. A Bayesian network needs effective prior probabilities to ensure accurate diagnosis results, but such priors are hard to obtain for different faults. In recent years, the development direction has been to make full use of timing information: building Bayesian models with timing information improves the credibility of fault diagnosis and will advance the use of Bayesian methods in this field. Online analysis techniques for Bayesian models also need to be improved.
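The dependence on prior probabilities can be made concrete with the smallest possible network: one root node (element faulted or not) with alarm nodes as its children, assumed conditionally independent given the fault state. The function name and all probabilities in the usage note are illustrative assumptions, not values from the cited work.

```python
def posterior_fault(prior, p_alarm_given_fault, p_alarm_given_ok, observed):
    """Posterior P(fault | alarms) for a two-layer network: fault -> independent alarm nodes.

    prior: P(fault) for the element.
    p_alarm_given_fault / p_alarm_given_ok: dicts mapping alarm name to P(alarm | state).
    observed: dict mapping alarm name to True (received) or False (not received).
    """
    like_fault = prior
    like_ok = 1.0 - prior
    for name, seen in observed.items():
        pf = p_alarm_given_fault[name]
        po = p_alarm_given_ok[name]
        like_fault *= pf if seen else (1.0 - pf)
        like_ok *= po if seen else (1.0 - po)
    # Bayes' rule; assumes the observed evidence has nonzero probability.
    return like_fault / (like_fault + like_ok)
```

With an assumed prior of 0.01, a relay alarm with probabilities 0.99 (fault) and 0.01 (no fault), and a breaker alarm with 0.98 and 0.02, receiving both alarms gives a posterior of about 0.98. Changing the prior shifts this result, which is why unreliable priors limit diagnostic accuracy.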

E. Fuzzy set theory
Fuzzy set theory is used in power system fault diagnosis in two ways. In the first, the credibility of the alarm information is assumed to be less than 1; the states of the protections and circuit breakers are derived from the alarm information, and the diagnosis result is then given by an expert system or random set. In reference [8], the fault information is first preprocessed with a fuzzy method, then modeled with an expert system and an artificial neural network, and finally the diagnostic results are output.
In the second, the uploaded alarm information is assumed to be completely correct, and fuzzy membership is used to describe the possibility of association between circuit breaker trips, faults and protection actions. Reference [30] uses this approach: it requires the alarm information of all fault points, which is screened in order to measure the total fuzziness of the fault location for the suspected fault elements [30]. However, applying fuzzy set theory still faces a variety of problems, especially the selection of fuzzy membership functions and fuzzy design for large-scale power grids.
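One common way to combine membership degrees is the max-min composition, in which a chain of evidence is as strong as its weakest link and an element takes the strongest of its chains. The sketch below applies this to hypothetical protection and breaker membership degrees; the operators and values are assumptions chosen for illustration and are not claimed to be the method of [8] or [30].

```python
def fault_possibility(chains):
    """Max-min composition over evidence chains.

    chains: list of (protection_degree, breaker_degree) pairs in [0, 1], one per
    protection/breaker chain associated with the element. Returns 0.0 if there is no evidence.
    """
    return max((min(p, b) for p, b in chains), default=0.0)


def rank_candidates(evidence_by_element):
    """Rank suspected elements by fault possibility, highest first."""
    scores = {el: fault_possibility(ch) for el, ch in evidence_by_element.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

For example, an element with chains (0.9, 0.8) and (0.6, 0.95) receives possibility 0.8. The result depends directly on how the membership degrees are assigned, which is the membership-function selection problem noted above.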

IV. Distributed fault diagnostic structure of power system
As the power system becomes more complex, the requirements for fault information monitoring become higher, and the amount of information to be monitored during a failure is far greater than in the past [31,32]. The safe operation of modern large power grids also depends increasingly on the effective analysis and processing of this information. This section puts forward a new power system fault monitoring and networking method in which fault recording files are uploaded through the existing non-real-time channel of the dispatching data network rather than through the protection information slave station. This method reduces the operating load of the slave station, and analyzing the recordings over an independent network also improves the accuracy of fault analysis [31,33,34].
In addition, the knowledge representation method is introduced to transform the fault information and the power

A. Power system fault monitoring scheme
The traditional power system fault monitoring and networking scheme is shown in Fig. 5: all the recording files generated in the substation are sent to the main station through the protection information slave station. This implementation has some disadvantages. Every recording file to be uploaded must be stored and forwarded by the slave station, which places a very large operating load on it. The additional transfer link for the recording files also increases the stability risk of the system.

FIGURE 5. Traditional power system fault monitoring and networking scheme
Based on the current state of the protection information system and the remote recording transmission system, the structure of both is reorganized according to the principles of simplicity and reliability. We have established the following technical principles for the protection information and remote recording transmission system at the 220 kV level.
1. Reducing the operating load of the protection information slave station. The recording files of the fault recorder are uploaded through the existing non-real-time channel of the dispatching data network and are no longer stored and forwarded by the protection information slave station. This reduces the operating load of the slave station, shortens the transmission path of the recording files and improves the stability of the system. A new recording main station will be built in the control center, taking into account the current hardware and software configuration of the existing recording main station and subsequent application requirements. The recording main station of the control center is independent of the protection station, which can access the recording data of the recording station through an interface.
2. Simplifying the structure of the protection information system. The protection information slave station connects downward (inside the substation) to the protection devices over the network and transmits information upward through the real-time protection information channel. Because recording files are transmitted independently, the non-real-time channel of the protection information slave station is no longer needed. The communication connection mode and secondary security strategy of the substation therefore need to be changed accordingly.

B. The distributed structure of power system fault diagnosis
The intelligent monitoring system uploads the collected fault information to the dispatching center, where the dispatcher diagnoses the fault by summarizing and analyzing the information and then handles the accident [35][36]. However, this traditional transmission mechanism consumes considerable time and human resources and can hardly meet the needs of smart grid development. Polling-based processing, combined with the large volume of data transmitted by the lower-level servers, easily leads to a backlog of communication data, resulting in congestion and even information loss. To address these problems, we adopt a distributed information collection structure in which the upper-level applications and the information acquisition are independent of each other, and the fault information is analyzed independently using the data grid. This solves the network congestion problem and improves diagnostic efficiency.
The distributed fault diagnosis structure of the power system is divided into three layers: the substation layer, the network layer and the dispatch layer. The substation layer provides the basic data for fault diagnosis, and the client software can directly read the protection and switch status. Historical curves occupy a large storage space and are archived on the server after uploading, while the comprehensive data server uploads the other data. The network layer in the middle is responsible for sorting and distributing the fault data. To ensure reliability and security, its communication is usually carried over the dedicated network of the power system.
The upper dispatch layer mainly contains the fault diagnosis and analysis model. The fault diagnosis server collects and analyzes the operating status and protection information of the relevant equipment and performs fault analysis according to the corresponding diagnosis model to identify the faulted equipment and generate the fault analysis report. The hierarchical structure of the diagnostic system is shown in Figure 6. The layered structure can express the connections between components, using several kinds of knowledge to describe the fault information. The protection performance parameters of the main system are inherited more consistently, which is conducive to unified modeling of the system, and the algorithms and diagnosis methods become more rigorous.
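The division of work among the three layers can be sketched as three functions chained together. This is a toy illustration under stated assumptions: the record fields, signal names and the rule that an element is flagged when both a protection operation and a breaker opening are reported are all hypothetical and much simpler than a real diagnosis model.

```python
from collections import defaultdict


def substation_layer(station, raw_signals):
    """Substation layer: tag raw protection and switch statuses with their station of origin."""
    return [{"station": station, "element": el, "signal": sig} for el, sig in raw_signals]


def network_layer(batches):
    """Network layer: sort and distribute fault data by grouping all records per element."""
    grouped = defaultdict(list)
    for batch in batches:
        for record in batch:
            grouped[record["element"]].append(record["signal"])
    return dict(grouped)


def dispatch_layer(grouped, required=("protection_operated", "breaker_open")):
    """Dispatch layer: flag an element when every required signal was reported for it."""
    return sorted(el for el, signals in grouped.items() if set(required) <= set(signals))
```

Because each layer only consumes the output of the layer below, acquisition at the substations stays independent of the diagnosis logic at the dispatch end, which is the separation the distributed structure relies on.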

C. The overall framework of safety assessment and fault diagnosis
The concept of power system big data emerged with the introduction of new energy and the construction of the smart grid. Effective analysis of electric power big data can not only optimize the production, distribution and consumption of electric energy and ensure its economy, but also provide decision support for power grid planning and retrofitting. The grid has many data sources, including user data, power enterprise data and external data. These three types of data interact to form the distribution network database, which also raises problems concerning the accuracy and value of the data. The ever-growing volume of data makes it difficult for traditional data processing technologies to handle it efficiently. It is therefore necessary to establish a dedicated data processing system for the power grid.
Data mining generally refers to the process of extracting hidden information from large amounts of data through algorithms, and it is an important technology for information acquisition. Big data analysis and processing generally adopts a dual-layer architecture that combines an outer and an inner layer. The outer layer mainly handles dynamic data and the fusion of uncertain data; the core of the inner layer is data mining, which computes over the data with the corresponding algorithms and extracts the valuable information. As the information sources to be mined are massive and growing exponentially, traditional centralized serial data mining methods are no longer an appropriate way to obtain information.
With the increasing complexity of the power grid and the integration of wind and solar generation, the failure rate of the power system increases significantly. Although the traditional SCADA/EMS-based data collection and monitoring system can collect data in real time and monitor the system online, its greatest challenge in the face of complex big data is how to quickly and effectively distinguish, screen and integrate data and build the data information database. A big data mining platform is therefore configured on top of the traditional SCADA/EMS system to process the data effectively and present the valuable information to the operator. The overall framework of safety assessment and fault diagnosis is illustrated in Fig. 7 and consists of five parts. Layer 1 (data source layer) establishes the source database by collecting historical and real-time data from internal (client, enterprise) and external sources. Layer 2 (algorithm model layer) builds different algorithm engines as needed; in parallel computation, the data are processed by calling the corresponding algorithm functions. Layer 3 mines the data on the parallel and distributed data mining toolkit platform (PDMiner). Layer 4 builds the criterion library from the data mining results, including fault criteria, cause criteria, risk area criteria, operation evaluation criteria and optimization scheme criteria, and provides the decision-making basis for the business layer. Finally, the business layer uses the indicators, parameters and displayed data for fault diagnosis, fault cause analysis, risk area division and identification, system operation status evaluation and optimization scheme formulation, providing an important reference for the retrofitting and planning of the power grid.

V. Future research directions
With the construction of a new generation of intelligent power systems, China's power system has become the world's largest professional Internet of Things spanning production and operation. China's largest "cloud computing" platform has also been built, laying a foundation for deploying energy resources across multiple dimensions, including space and time. Power grid fault diagnosis is built on the dispatching automation platform. As one of the basic problems to be solved in intelligent grid dispatching decisions, it faces new opportunities and challenges under the background of power big data.
The integration of power grid regulation connects a large number of devices directly to the monitoring system, so all kinds of data collected by the dispatching technical support system are available at the dispatching end. The real-time and non-real-time data collected in this way constitute a reliable source of power big data. However, big data mining and knowledge acquisition suited to power grid fault diagnosis, and the construction of multi-source data aggregation methods and diagnosis models, are still in their infancy. Future research should focus on the following.
1. Research on feature extraction and selection technologies in the big data environment, such as fractal theory and deep learning, to extract the required features (knowledge) from massive data.
2. Research on data-driven analysis methods in the regulatory big data environment, such as random matrix theory, to solve problems such as data diversity and heterogeneity.
3. Research on parallel algorithms in the regulatory big data environment, parallelizing traditional data mining algorithms to improve their scalability and computing efficiency.
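Parallelizing a data mining step typically follows a split, map and merge pattern. The sketch below counts alarm frequencies over chunks of records with a thread pool from the Python standard library; the function names and chunk size are assumptions, and the example is unrelated to the PDMiner platform. Threads are used only to keep the sketch self-contained; CPU-bound mining in Python would usually need processes or a cluster framework to gain real speed-up.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_chunk(records):
    """Map step: count alarm occurrences within one chunk."""
    return Counter(records)


def parallel_count(records, workers=4, chunk_size=1000):
    """Split the records into chunks, count each chunk concurrently, then merge the counts."""
    chunks = [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(count_chunk, chunks):
            total.update(partial)   # merge step: add the partial counts
    return total
```

Because the merge is a sum of partial counts, the result is the same as a serial count regardless of how the records are chunked, which is the property that makes this kind of algorithm easy to scale out.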

VI. Conclusions
Power grid fault diagnosis plays an important role in rapid analysis, fault elimination and rapid power supply restoration after an accident, and research in this area has yielded fruitful results, especially in the environment of the smart grid and big data. Advanced applications such as network self-healing play an increasingly important role. The main conclusions of this paper are summarized as follows:
(1) This paper reviews power grid fault diagnosis as a whole and identifies research priorities and difficulties in the field by analyzing and organizing many existing research results.
(2) A new fault monitoring and networking method for power systems is proposed, which is more concise and reliable than traditional schemes.
(3) To address the problems of the traditional fault diagnosis system structure, a distributed fault diagnosis architecture is introduced that stratifies the diagnosis system and reduces the time consumed by power grid fault diagnosis.

VII. Conflicts of Interest statement
This study received funding from Technology project of State Grid Zhejiang Electric Power Co., LTD (5211WZ2000WY). The funder was not involved in the study design, collection, analysis, interpretation of data, the writing of this article or the decision to submit it for publication. All authors declare no other competing interests.