Research on Fault Detection for ZPW-2000A Jointless Track Circuit Based on Deep Belief Network Optimized by Improved Particle Swarm Optimization Algorithm

With the rapid development of railway traffic, traffic safety has become a focus. The ZPW-2000A jointless track circuit is an important part of train control systems. Currently, the fault detection of the ZPW-2000A jointless track circuit still relies on the experience of maintenance personnel, which introduces problems such as low fault detection efficiency and a large labor requirement. Although some artificial intelligence fault detection algorithms for the ZPW-2000A track circuit have been developed, their detection accuracy is not high enough to meet the needs of large-scale applications, and due to security requirements, actual ZPW-2000A track circuit fault data cannot be directly obtained in large quantities. To solve these problems, an equivalent theoretical model of the Chinese ZPW-2000A jointless track circuit is proposed by using four-terminal network theory. Through this equivalent theoretical model, the original fault data were collected. Considering that the relationship between fault data and fault types of the ZPW-2000A jointless track circuit is not obvious, a deep belief network was designed to detect the fault modes of the ZPW-2000A jointless track circuit. To improve the deep belief network performance, the particle swarm optimization algorithm optimized by the genetic algorithm (GAPSO) was selected to tune the deep belief network. The simulation experiments indicated that the optimized deep belief network achieved a fault detection accuracy of 98.5% and an F1 score of 98.6%, which showed that the model proposed in this paper, a deep belief network optimized by GAPSO (GAPSO-DBN), had high accuracy and robustness. The results show that it had higher accuracy and robustness than other fault detection methods, and it could greatly improve the level of ZPW-2000A track circuit fault detection in the future.


I. INTRODUCTION
In recent years, high-speed railways have developed rapidly and become a popular means of transportation. They satisfy high safety and operating efficiency requirements. The safety of the ZPW-2000A jointless track circuit is becoming more and more important. In this paper, the ZPW-2000A jointless track circuit is studied. The track circuit system is the weak link in the railway signal system because of its complex structure and poor working environment. The probability of failure is high, and the failure phenomena are diverse. At present, on-site maintenance is mainly used to examine the hidden dangers of track circuits, and fault detection is carried out based on the experience of staff, which is a time-consuming and low-efficiency process. In addition, the labor intensity of maintenance personnel increases, which can lead to diagnostic errors, affecting the operation intervals of the trains [1]. Therefore, it is very important to introduce an intelligent detection algorithm to assist field personnel in fault detection.
The associate editor coordinating the review of this manuscript and approving it for publication was Christian Pilato.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
In terms of track circuits, Japan and France carried out the early research, and great achievements were made [2]. France completed the UM71 track circuit based on years of research and practical testing [3]. The UM71 track circuit in France is a jointless track circuit with low frequency modulation [4], which exhibits good anti-jamming performance [5]. China's ZPW-2000A jointless track circuit is a technical transformation and localization product developed based on the UM71 jointless track circuit in France and the actual needs of Chinese railways. It fully retains the technical advantages of the UM71. Compared with the UM71, the ZPW-2000A jointless track circuit has significantly improved transmission security, transmission length, system reliability, and maintainability as well as reduced project costs. Japanese scholars have studied the problem of the poor shunting of track circuits for nearly ten years and have conducted experiments [6]. The causes of shunt resistance under different environmental conditions have been summarized [7], and practical solutions have been given [8]. In 2004, Debiolles et al. proposed a method using the least squares algorithm and a neural network to diagnose compensation capacitor faults. This was applied to French track circuits and vehicle equipment [9]. Two years later, to avoid the least squares approach, Debiolles et al. proposed a fault detection method for compensation capacitors based on a segmented selection output coding strategy, taking the segmented short-circuit current as the characteristic parameter [10]. In 2008, Chen et al. presented laboratory research results of track circuit fault monitoring and detection methods [11].
His team combined a fuzzy system and a neural network to form a fault detection system, which was tested on an audio jointless track circuit test bench. Experiments showed that the fuzzy neural network could effectively identify all kinds of common failure modes of the track circuit. Based on a previous report [9], Oukhellou et al. divided the global detection problem into several local pattern recognition problems and proposed a fault detection algorithm for compensation capacitors based on Dempster-Shafer (D-S) evidence theory and a neural network. This method could achieve good fault recognition and location rates [12]. In recent years, with the rise of machine learning and intelligent algorithms, experts and scholars have conducted considerable research on the Chinese ZPW-2000A jointless track circuit, and many research results have been achieved [1]. De Bruin et al. simulated the track voltage signal by building a mathematical model of the track circuit in 2017, and they determined the correlation between time and space in the fault data using a long short-term memory (LSTM) recurrent neural network to realize the fault identification of a track circuit [13]. Zhu et al. used rough set theory and a combinatorial decision tree to identify faults layer by layer for the common fault modes of jointless track circuits [14]. Huang et al. accurately identified track circuit faults by adding fuzzy theory to a traditional neural network [15]. Dong et al. added attribute reduction and an adaptive genetic algorithm to the fuzzy cognitive map to preprocess the fault data, which could effectively diagnose the faults of a ZPW-2000A jointless track circuit [16]. Xu et al. used the amplitude trend of the received signal of the track circuit reader (TCR) as the feature, and they combined a segmented calculation of a specific value with first-order derivative and second-order discrimination to realize the fast detection of compensation capacitor faults [17]. Mi et al. proposed a method using fuzzy fault detection, a genetic algorithm (GA), and grey system theory to detect the faults of a 25-Hz phase-sensitive track circuit [18].
At present, there are some related intelligent fault detection algorithms for the ZPW-2000A jointless track circuit, such as back-propagation (BP) neural networks, support vector machine (SVM) classifiers, radial basis function (RBF) neural networks, BP networks optimized by genetic algorithm (GA-BP) [19], SVM classifiers optimized by particle swarm optimization (PSO-SVM), fuzzy cognitive maps (FCM), and extreme learning machine (ELM), but the results are only passable. Their fault detection accuracies are basically between 80% and 95%, so they are not widely used [20]. Although a deep belief network (DBN) was mentioned in a previous publication [1], the structure of the DBN was not adjusted accurately, so the accuracy of fault detection was still limited. The relationship between the collected circuit parameters of the ZPW-2000A jointless track circuit and the fault modes of the ZPW-2000A jointless track circuit is not obvious, and the classification results are greatly affected by the structure of the DBN. Thus, the high-level distribution characteristics of the circuit parameters of the ZPW-2000A jointless track circuit must be extracted to improve the detection accuracy, and the structure of the DBN must be accurately adjusted to the optimal configuration. Therefore, after obtaining the circuit parameters of a jointless track circuit through an equivalent theoretical model, a track circuit fault mode detection model based on a DBN was established in this study. This model uses an unsupervised hierarchical learning method of a DBN to extract the high-level distribution characteristics [21] of the key monitoring quantities of the ZPW-2000A jointless track circuit. The parameters of the DBN were fine-tuned through supervised learning to determine the optimal selection of parameters. A BP neural network was used as the classifier to detect the track circuit fault modes, and the structure of the DBN was precisely debugged.
In this paper, a simulation model of the ZPW-2000A jointless track circuit is established for fault data acquisition. Simulation data were used because actual track circuit data are very difficult to obtain, being restricted by the relevant confidentiality regulations. Only a small amount of track circuit data could be obtained, which was too little to train the neural network. Thus, simulation data were used to complete the research on the track circuit fault detection algorithm.
For the DBN model design, the number of hidden layer nodes, the number of hidden layers, and the iteration number of the hidden layers were the three main parameters. A normal experimental plan is used to decide the number of hidden layers and the iteration number of the hidden layers. After these were preliminarily selected, determining the best number of hidden layer nodes for each hidden layer would be a difficult job for a normal experimental plan. To solve this problem, the particle swarm optimization algorithm optimized by the genetic algorithm is used to optimize the number of hidden layer nodes for each hidden layer. Consequently, good experimental results were obtained.
The main contributions of this study were as follows:
1. A simulation circuit that closely fits the ZPW-2000A jointless track circuit was established, and a large quantity of accurate ZPW-2000A track circuit simulation data was obtained. (In China, such data is generally classified as confidential, so the full ZPW-2000A track circuit fault data could not be obtained in bulk from the China Railway Administration.)
2. Fine tuning of the deep belief network (DBN) structure by the particle swarm optimization algorithm optimized by the genetic algorithm (GAPSO) effectively solved the problem that the number of nodes in each hidden layer of the DBN is difficult to select.
3. The DBN optimized by GAPSO (GAPSO-DBN) was applied to the fault detection of the ZPW-2000A jointless track circuit, and highly accurate and robust fault detection was achieved.
4. This paper provides an alternative and efficient scheme for other problems in which the relationship between the original data and the classification results is not obvious and the classification accuracy is sensitive to the classification network structure.
The remainder of this paper is organized as follows. An equivalent theoretical model of the ZPW-2000A jointless track circuit using four-terminal network theory is explained in Section II. Section III provides the methodology, including a detailed explanation of the DBN model and particle swarm optimization (PSO) algorithm optimized by the GA. The simulation process for DBN design is presented and compared with other networks in Section IV. Experimental results are presented in Section V, and the conclusions are presented in Section VI.

II. ANALYSIS OF ZPW-2000A JOINTLESS TRACK CIRCUIT
To explain the ZPW-2000A jointless track circuit in more detail, the structure, equipment function, working principle, simulation model, and simulation data are introduced in this section. According to the ''red light band fault'' of the track circuit, 15 fault modes are summarized. A four-terminal network is used to establish the transmission matrix for each piece of equipment of the jointless track circuit, and the whole four-terminal network model is formed by a modular cascade. The validity of the model is verified by field data, and the format of the original data and its preliminary processing are discussed.

A. ZPW-2000A JOINTLESS TRACK CIRCUIT COMPOSITION
The ZPW-2000A jointless track circuit is used near the track and is connected to the track. The main pieces of equipment are small, but a set of equipment covers a long track section. It can realize the communication from rail to train, which is a low-frequency communication signal. The main function of the ZPW-2000A jointless track circuit is to automatically and continuously detect whether the line is occupied by the rolling stock and also to control the signal or switch device to ensure the running safety of the equipment. In the current practical application, the main method is to check the hidden dangers of the track circuit by regular maintenance. Intelligent methods have not been introduced. A single piece of equipment may fail, but its failure characteristics may not be obvious. Thus, comprehensive analysis is needed to detect the failure. This is usually done by experienced personnel. Thus, it is difficult for a single sensor alarm to play a decisive role. To allow artificial intelligence to replace humans for the fault detection of track circuits, considerable amounts of research have focused on this subject. The main working states of the ZPW-2000A jointless track circuit are the adjustment state, shunt state, and rail breaking state. In the adjustment state, the track circuit is not occupied by the rolling stock. The relay at the receiving end is in an excited state and sends out a message that the track circuit section is idle, regardless of the unfavorable power supply or weather conditions. Since the adjustment state is the general case of the track circuit, the subsequent analysis is based on this state.
The ZPW-2000A jointless track circuit is mainly composed of a transmitter, transmission cable equipment (including matching transformer, service parallel thermoplastic (SPT) transmission cable, and cable analog network), a 29-m tuning area, a rail, and compensation capacitance and receiving equipment. The functions of each type of equipment are shown in Table 1.
The structure of the ZPW-2000A jointless track circuit is shown in Fig. 1. The ZPW-2000A jointless track circuit is mainly composed of a main track section and a small track section, and the small track section has a fixed length of 29 m. Generally, parking is forbidden in the small track section, because it is the specific location at which the electrical insulation mechanism occurs. There are eight kinds of carrier frequency information in the system. The uplink carrier frequencies are 2000-1 Hz, 2000-2 Hz, 2600-1 Hz, and 2600-2 Hz, the downlink carrier frequencies are 1700-1 Hz, 1700-2 Hz, 2300-1 Hz, and 2300-2 Hz, and the carrier frequencies of two adjacent track sections are different. Electrical insulation is achieved using the series-parallel resonance of the tuning area. In Fig. 1, 1G, 3G, and 5G indicate different specific areas of a track section with the ZPW-2000A track circuit installed. Assuming that the carrier frequency of the 3G section in Fig. 1 is F1 and that of the adjacent 5G section is F2, the 3G section presents a pole-impedance state to the carrier frequency F1 and a zero-impedance state to the carrier frequency F2. The zero-impedance state forms a short circuit, which isolates the cross-region transmission of the information in the 3G and 5G sections. In addition to the eight kinds of carrier frequency information, the ZPW-2000A jointless track circuit also generates 18 kinds of low-frequency information. These are distributed between 10.3 and 29 Hz at intervals of 1.1 Hz, and each frequency represents a signal with a specific meaning. The system sends low-frequency signals with different meanings to the main and small track sections through the transmitter at the sending end. One of the low-frequency signals is sent directly to the receiver at the end of the area, the other is sent through the tuning area to the receiver of the adjacent track section ahead in the direction of operation, and the checked small track status conditions (small track (XG) and small track receiving end box (XGH)) are sent to the receiver at the end of the area. The summary result is output after the two-way information is judged to be correct, and the idle/occupied status of the track section is indicated by the track relay being pulled up or falling down.
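As a quick arithmetic check of the low-frequency plan described above, the short sketch below enumerates 18 values starting at 10.3 Hz with a 1.1 Hz step. The constant names are illustrative and not taken from the paper; only the three numbers come from the text.

```python
# Enumerate the 18 low-frequency codes described in the text:
# first code at 10.3 Hz, one code every 1.1 Hz.
START_HZ = 10.3   # lowest low-frequency code (from the text)
STEP_HZ = 1.1     # spacing between adjacent codes (from the text)
COUNT = 18        # number of low-frequency codes (from the text)

# round() removes floating-point noise such as 29.000000000000004
low_freqs = [round(START_HZ + STEP_HZ * k, 1) for k in range(COUNT)]

print(low_freqs[0], low_freqs[-1], len(low_freqs))  # 10.3 29.0 18
```

The last value lands exactly on 29 Hz, which agrees with the stated range of 10.3 to 29 Hz.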

B. ZPW-2000A JOINTLESS TRACK CIRCUIT FAULT MODES
The normal operation of the track circuit must meet two limit requirements. First, in the adjustment state, the transmission voltage is at the lowest value of the normal range and the track length is the limit length. At this time, when the rail impedance is the maximum and the ballast resistance is the minimum, the equipment at the receiving end operates normally, and the track relay is reliably drawn. Second, the transmission voltage has the highest value in the normal range. When the impedance of the rail is the minimum and the ballast resistance is the maximum, a standard shunt resistance of 0.06 Ω or 0.1 Ω is used anywhere on the rail to cause a short circuit, causing the relay to fall down reliably.
In the railway field, there are two kinds of fault characterizations of track circuits [23]. In the first kind, the track circuit is not occupied, but some factors cause the relay to lose excitation, and the console shows that the section is occupied. This kind of fault is called a ''red light band fault.'' In the other kind, a vehicle occupies the track section, but the track relay cannot fall down reliably, and the console does not reliably show that the section is occupied, which is called a ''poor shunting fault.'' This study focused on the fault detection of the jointless track circuit under the ''red light band fault'' condition. By sorting the field materials, 15 common failure modes of the system in the case of a red light band failure are summarized in Table 2. (The fault mode of the actual field data is identified by workers measuring the voltage and current of each circuit using a multimeter or sensors, combined with the circuit schematic diagram and manual experience.)
When a failure of the ZPW-2000A jointless track circuit occurs, a sudden change or fluctuation of the voltage and current appears in the centralized signal monitoring. The monitoring quantities involved in the dynamic monitoring include the transmission voltage, transmission current, rail-in voltage, and rail-out voltage. The change of these monitoring quantities is an important basis for judging the fault mode and finding the fault area.

C. MODELING OF ZPW-2000A JOINTLESS TRACK BASED ON FOUR-TERMINAL NETWORK THEORY
A four-terminal network refers to a multi-terminal network with four terminals, which is a circuit connected with four terminals. The specific expression of the transfer matrix $T_x$ is as follows: where $T_{11}$ and $T_{21}$ are the open circuit parameters, and $T_{12}$ and $T_{22}$ are the short circuit parameters. The corresponding formula is as follows: Based on the four-terminal network relationship established for the input and output voltage and current, if any two are known, the third can be obtained, and the voltage-current characteristics between the input and output can be obtained. Supposing that the port is connected to an output impedance $Z_0$ and an input impedance $Z_1$, the relationship between them is as follows: The structure of the ZPW-2000A jointless track circuit is divided into a receiver module, rail circuit module, and transmitter module [22]. The input and output voltages of each device are marked from the receiver, and the four-terminal network model is built based on the modeling principle. The modular modeling results were sorted, and the whole theoretical model was built based on the structure of the ZPW-2000A jointless track circuit. The transmission characteristic model in the adjusted state is shown in Fig. 3.
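For reference, the four-terminal relations described at the start of this subsection can be written in the conventional transmission-parameter (ABCD) form. This is a sketch under the usual textbook convention, in which subscript 1 denotes the input port, subscript 2 the output port, and $Z_0 = U_2 / I_2$ is the impedance connected at the output; the paper's own equations may use a different port or sign convention.

```latex
% Transfer relation and transfer matrix (textbook convention, assumed)
\begin{bmatrix} U_1 \\ I_1 \end{bmatrix}
  = T_x \begin{bmatrix} U_2 \\ I_2 \end{bmatrix},
\qquad
T_x = \begin{bmatrix} T_{11} & T_{12} \\ T_{21} & T_{22} \end{bmatrix}

% Open circuit parameters (I_2 = 0) and short circuit parameters (U_2 = 0)
T_{11} = \left.\frac{U_1}{U_2}\right|_{I_2 = 0}, \quad
T_{21} = \left.\frac{I_1}{U_2}\right|_{I_2 = 0}, \quad
T_{12} = \left.\frac{U_1}{I_2}\right|_{U_2 = 0}, \quad
T_{22} = \left.\frac{I_1}{I_2}\right|_{U_2 = 0}

% Input impedance seen at port 1 when Z_0 is connected at port 2
Z_1 = \frac{U_1}{I_1} = \frac{T_{11} Z_0 + T_{12}}{T_{21} Z_0 + T_{22}}
```

The last line follows by substituting $U_2 = Z_0 I_2$ into the first relation and dividing the two rows.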
The modeling of the receiving end mainly involved an attenuator, receiving cable (cable simulation network and SPT cable), and matching transformer. The equation relating the input voltage and current of the attenuator is as follows: where $L_v$ is the inductance. The input voltage and current of the attenuator approximately equal the voltage and current at the end of the cable, so the voltage and current at the beginning of the cable at the receiving end are expressed as follows: where $N_{rspt}$ represents the receiving cable transmission matrix, $r_d$ represents the propagation constant of the cable, $l_r$ represents the length of the cable, and $Z_{cd}$ represents the characteristic impedance of the cable. The matching transformer structure is shown in Fig. 3, which can be divided into three parts: the capacitances $C_{t1}$ and $C_{t2}$, the transformer $T_t$, and the inductance $L_{tl}$.
The transfer matrix $N_{rtad}$ of the matching transformer can be expressed as follows: where $n$ is the turn ratio of the transformer.
The transmission characteristic model of the receiving end tuning area is shown in Fig. 3. The impedance of the air core coil (SVA) is denoted as $Z_{SVA}$, $Z_{ca}$ represents the connection impedance between the plug pin and the rail, the impedance of circuit load BA2 is denoted as $Z_{BA2}$, and the length of the half tuning region is $l_{tx}$. The equivalent four-terminal network $N_{rtx}$ in the tuning region of the receiver and the corresponding voltage and current input and output relations can be expressed as follows: The transmission matrix of the main track section is obtained by cascading compensation units, as shown in Fig. 3. Based on the transmission matrix of the compensation capacitance, the four-terminal network of the whole rail can be cascaded as follows: where $N_{lgm}$ is the transmission matrix of the compensation unit, and $N_{gm}$ is the transmission matrix obtained by cascading multiple compensation units. The transmission matrix of the transmitting device is symmetrical to that of the receiving device. The corresponding relationship between the input voltage $U_5$ and current $I_5$ and the output voltage $U_4$ and current $I_4$ in the tuning area of the transmitter is as follows: The cable at the sending end is the same as the cable at the receiving end, which is also composed of a cable simulation network and an SPT cable. Therefore, the expression of the voltage and current at the beginning of the cable at the sending end is as follows: where $l_s$ represents the length of the cable.
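The modular cascade described in this subsection can be illustrated with a small numerical sketch. It assumes the standard textbook transmission matrices for a uniform line and for a shunt element, and it assumes that one compensation unit is a half span of rail, a shunt capacitor, and another half span. All function names and numerical values are hypothetical placeholders rather than the paper's parameters.

```python
import numpy as np

def line_section(gamma, z_c, length):
    """Transmission matrix of a uniform line (cable or rail span)."""
    gl = gamma * length
    return np.array([[np.cosh(gl), z_c * np.sinh(gl)],
                     [np.sinh(gl) / z_c, np.cosh(gl)]], dtype=complex)

def shunt_impedance(z):
    """Transmission matrix of an impedance connected across the line."""
    return np.array([[1.0, 0.0],
                     [1.0 / z, 1.0]], dtype=complex)

def cascade(*stages):
    """Multiply stage matrices in order from input side to output side."""
    total = np.eye(2, dtype=complex)
    for stage in stages:
        total = total @ stage
    return total

def input_impedance(t, z_load):
    """Impedance seen at the input when z_load terminates the output."""
    return (t[0, 0] * z_load + t[0, 1]) / (t[1, 0] * z_load + t[1, 1])

# Placeholder values for illustration only (not from the paper).
freq_hz = 2000.0          # carrier frequency
gamma = 0.001 + 0.002j    # rail propagation constant, 1/m
z_c = 1.5 + 0.5j          # rail characteristic impedance, ohm
spacing_m = 100.0         # distance between compensation capacitors
cap_f = 50e-6             # compensation capacitance, F

z_cap = 1.0 / (2j * np.pi * freq_hz * cap_f)
unit = cascade(line_section(gamma, z_c, spacing_m / 2),
               shunt_impedance(z_cap),
               line_section(gamma, z_c, spacing_m / 2))
rail = np.linalg.matrix_power(unit, 10)   # ten identical units in series

print(input_impedance(rail, z_load=1.0 + 0j))
```

Because every stage here is a reciprocal two-port, the determinant of the cascaded matrix stays at 1, which is a convenient sanity check when new modules are added to the chain.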

D. ZPW-2000A JOINTLESS TRACK CIRCUIT FAULT DATA ACQUISITION
After the real ZPW-2000A jointless track circuit parameters are introduced, through the theoretical model of the jointless track circuit built using the four-terminal network, the voltage and current values at each node can be extracted. In this study, 12 main monitoring quantities were selected as characteristic parameters. These monitoring quantities are shown in Table 3. The corresponding simulation conditions were set based on the relevant parameters of the track circuit. The data recorded in the field test and the simulation results were compared to verify the correctness and effectiveness of the model. The comparison of the results is shown in Table 4. The actual measured value data of the ZPW-2000A jointless track circuit in Table 4 is from the actual measurement results of a certain section of the track circuit of the Beijing Shanghai line of China Railway, which was provided by the National Natural Science Foundation project undertaken by the laboratory. The data is confidential and cannot be disclosed in large quantities. The simulation data in Table 4 was extracted using the simulation model of the ZPW-2000A jointless track circuit established in this paper.
As shown in Table 4, the calculation results of the 12 monitoring parameters of the ZPW-2000A jointless track circuit model built using the four-terminal network were very close to the measurement results, with a maximum error of 9.5%, indicating that the equivalent model of the ZPW-2000A jointless track circuit established in this study was effective. Thus, the data collected by this model can be used as the input of the subsequent detection system.
In the simulation experiments, using the fault data of the theoretical model of the ZPW-2000A jointless track circuit and taking M1-M12 as the characteristic parameters of the detection model, 16,500 fault samples were collected, with F1-F15 as the fault modes and 1100 samples for each fault mode. The samples were divided as follows. For each fault mode, there were 900 samples in the training set, 100 samples in the validation set, and 100 samples in the test set. The training set was used to train the model, the validation set was used to check the model at each training stage, and the test set was used to evaluate the final model.
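The per-mode split can be sketched as follows. The random feature matrix is only a stand-in for the simulated M1-M12 values, the function name is illustrative, and the random shuffling within each fault mode is an assumption, since the text does not say how the samples were assigned to each subset.

```python
import numpy as np

N_MODES, PER_MODE, N_FEATURES = 15, 1100, 12   # F1-F15, 1100 each, M1-M12
N_TRAIN, N_VAL, N_TEST = 900, 100, 100         # per-mode split from the text

rng = np.random.default_rng(0)
# Placeholder data standing in for the simulated monitoring quantities.
X = rng.random((N_MODES * PER_MODE, N_FEATURES))
y = np.repeat(np.arange(N_MODES), PER_MODE)

def split_per_mode(labels, rng):
    """Return train/validation/test indices with fixed counts per fault mode."""
    train_idx, val_idx, test_idx = [], [], []
    for mode in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == mode))
        train_idx.extend(idx[:N_TRAIN])
        val_idx.extend(idx[N_TRAIN:N_TRAIN + N_VAL])
        test_idx.extend(idx[N_TRAIN + N_VAL:N_TRAIN + N_VAL + N_TEST])
    return np.array(train_idx), np.array(val_idx), np.array(test_idx)

train_idx, val_idx, test_idx = split_per_mode(y, rng)
print(len(train_idx), len(val_idx), len(test_idx))  # 13500 1500 1500
```

Splitting inside each fault mode keeps the three subsets balanced, so no fault mode is under-represented during validation or testing.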
The physical meaning and order of magnitude of each value in the data set were different. Thus, a unified normalization process was needed to achieve data standardization. In this study, the data was transformed using the following linear function:
$$y = \frac{x - \mathrm{MinValue}}{\mathrm{MaxValue} - \mathrm{MinValue}}$$
where $x$ and $y$ are the values before conversion and after normalization, respectively, and MaxValue and MinValue are the maximum and minimum values of the corresponding items in the data set, respectively. Using this formula, each data point could be mapped to the range of [0, 1]. Some fault samples after normalization are shown in Table 5.
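A minimal sketch of this min-max normalization, applied column by column, is given below. The function name and the sample values are illustrative, and the guard for constant columns is an added assumption that the text does not discuss.

```python
import numpy as np

def min_max_normalize(data):
    """Map each column of `data` linearly onto [0, 1]."""
    data = np.asarray(data, dtype=float)
    min_value = data.min(axis=0)
    max_value = data.max(axis=0)
    span = max_value - min_value
    # Assumption: a constant column is mapped to 0 instead of dividing by zero.
    span = np.where(span == 0, 1.0, span)
    return (data - min_value) / span

# Two hypothetical monitoring quantities with very different magnitudes.
sample = np.array([[1.0, 200.0],
                   [3.0, 400.0],
                   [2.0, 300.0]])
print(min_max_normalize(sample))
```

In practice the minimum and maximum would normally be computed on the training set and reused for the validation and test sets, so that all three subsets share one scale.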

III. ALGORITHM DISCUSSION
The DBN algorithm and GAPSO algorithm are used to solve the problem in this paper. The DBN algorithm was mainly used for fault detection of the ZPW-2000A jointless track circuit. The GAPSO algorithm was mainly used to adjust the structure of the DBN. Both algorithms are described in detail below.

A. DEEP BELIEF NETWORK
The DBN is composed of simple learning modules. In the following text, the basic composition unit of the DBN, the restricted Boltzmann machine (RBM), is described first, and then the structure of the DBN is analyzed in depth. Finally, the training method of the DBN is described in detail.

1) RESTRICTED BOLTZMANN MACHINE
The restricted Boltzmann machine (RBM) is a generative stochastic neural network [27] proposed by Smolensky et al. This probabilistic graph model is an energy model for unsupervised learning, which has the characteristics of a low network complexity and strong practicability. It is a double-layer structure composed of a visible layer and a hidden layer, so each RBM module has two layers of feature detection units. The visible and hidden layers are connected symmetrically by weights, and the nodes within each layer are not connected to one another. A node has two states: active and inactive. The visible layer is mainly used to input data. The hidden layer is mainly used to extract feature information to reflect the characteristics of the input data, and the value of a hidden layer node must be 0 or 1. The value of a visible layer node can be a real number. The RBM is generally trained using the contrastive divergence-1 (CD-1) algorithm, and its schematic diagram is shown in Fig. 4.

2) DEEP BELIEF NETWORK STRUCTURE
To perform a parameter search in the deep structure space, professor Geoffrey Hinton proposed the DBN method in 2006. The DBN is an unsupervised machine learning model [29] that can extract the high-level distribution characteristics of input data through autonomous learning, and it is suitable for the characteristic parameter data of the ZPW-2000A jointless track circuit. The DBN is composed of N RBMs, and it is trained layer by layer. The training process includes two parts: forward-stacked RBM learning and reverse optimized learning. In the first part, the multi-layer RBM extracts, abstracts, and retains the important feature information from the original feature input data layer by layer. In the last layer, the RBM inputs the extracted feature information to the supervised BP neural network. In the second part, error back propagation of labeled input data through the BP neural network is used to fine tune the whole DBN from top to bottom to obtain the DBN training model that achieves the desired accuracy. The structure model of the DBN is shown in Fig. 5.

3) DEEP BELIEF NETWORK MODEL TRAINING ALGORITHM
The pre-training process initializes the network parameters through unsupervised layer-by-layer training. The initialization parameters are mainly connection weights and offsets of each layer [30]. The training process of the DBN is described as follows. The RBM network parameters $\theta = \{a_i, b_j, w_{ij}\}$ are initialized, and the maximum number of iterations for each layer of the RBM training is set to $N$. The calculations of the hidden layer and visible layer units use the following equation: where $h_1, \ldots, h_m$ are the hidden layer units, $W$ is the connection weight matrix, whose elements $w_{ij}$ connect visible layer node $i$ to hidden layer node $j$, $a$ is the offset vector of the visible layer, and $b$ is the offset vector of the hidden layer.
Taking the preprocessed input vector $x$ as the initial state of the visible layer unit $v^{(0)}$, the hidden layer unit state $h^{(0)}$ is calculated as follows: The reconstruction state of the visible layer unit $v^{(1)}$ is calculated using $h^{(0)}$ and the following equation: In (16) and (17), $a_i$ is the offset of the $i$-th node of the visible layer, $b_j$ is the offset of the $j$-th node of the hidden layer, and $\sigma(x)$ is the activation function of the neural network, which is generally the sigmoid function $1/(1 + e^{-x})$. The activation function can effectively enhance the nonlinearity of the network. The reconstruction layer is used to calculate all the hidden layer units, and this is repeated until the maximum number of iterations $N$ is reached. The weight matrix $W$, the visible layer offset vector $a$, and the hidden layer offset vector $b$ are updated as follows: where $\lambda$ represents the learning rate, which is generally between 0 and 1. After the update, the training is complete. After the training, the network parameters must be further adjusted. The tuning training process uses gradient descent to supervise the labeled data. In supervised tuning training, the forward propagation algorithm must be used to obtain a certain output value from the input, and then the back-propagation algorithm is used to update the network weights and deviations. The pre-trained parameters $W$, $a$, and $b$ are used to determine the opening and closing of the corresponding hidden layer unit, and the excitation value of each hidden layer unit is calculated as follows: where $l$ is the layer index of the neural network. Propagating layer by layer, the excitation value of each hidden layer unit is calculated and normalized with a sigmoid function: Next, the output of the output layer must be calculated: where $f(\cdot)$ is the activation function of the output layer, and $Y$ is the output of the output layer.
The back-propagation algorithm is used to update the weight $W$ and the offset $b$, as follows: where $\alpha$ is the learning rate. The mean square error of the DBN learning $E$ is calculated as follows: where $\hat{Y}_i$ is the actual output, $Y_i$ is the theoretical output, and $W^l$ and $b^l$ represent the weight and offset parameters of the first $l$ layers to be determined. The training process of the corresponding DBN model is shown in Fig. 6. The CD-1 algorithm is used to train the RBM layer by layer until the DBN training is complete. A BP neural network is used to fine tune the DBN. Finally, the connection weights of the DBN model are determined.
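The pre-training step for a single RBM can be sketched with the commonly used form of CD-1. This is a generic sketch and not necessarily the paper's exact update: it assumes sigmoid conditional probabilities for both layers, uses the reconstruction probabilities directly as $v^{(1)}$, and averages the update over a mini-batch. The function names, layer sizes, and learning rate are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, a, b, lr, rng):
    """One CD-1 step on a batch v0 of shape (batch, n_visible).

    W has shape (n_visible, n_hidden), so W[i, j] links visible node i
    to hidden node j; a and b are the visible and hidden offsets.
    """
    # Positive phase: hidden probabilities and sampled binary states.
    p_h0 = sigmoid(v0 @ W + b)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
    # Reconstruction of the visible layer from the sampled hidden states.
    v1 = sigmoid(h0 @ W.T + a)
    # Negative phase: hidden probabilities given the reconstruction.
    p_h1 = sigmoid(v1 @ W + b)
    # Parameter updates, averaged over the batch.
    n = v0.shape[0]
    W = W + lr * (v0.T @ p_h0 - v1.T @ p_h1) / n
    a = a + lr * (v0 - v1).mean(axis=0)
    b = b + lr * (p_h0 - p_h1).mean(axis=0)
    return W, a, b

# Hypothetical sizes: 12 normalized inputs (M1-M12) and 8 hidden nodes.
rng = np.random.default_rng(0)
W = 0.01 * rng.standard_normal((12, 8))
a, b = np.zeros(12), np.zeros(8)
batch = rng.random((32, 12))           # placeholder data in [0, 1]
for _ in range(50):                    # fixed iteration budget per layer
    W, a, b = cd1_update(batch, W, a, b, lr=0.1, rng=rng)
```

In a stacked DBN, the hidden probabilities of a trained RBM would then serve as the visible input of the next RBM, and the supervised fine-tuning stage would start from the resulting weights.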
In addition, if the number of nodes in the input layer is N 3 , the number of nodes in the output layer is M 2 , and the iterative number of hidden layer is T 3 . The computational complexity for the DBN training is O(N 3 * M 2 * T 3 ), and the computational complexity for the DBN during operation is O (N 3  *  M 2 ).

B. PARTICLE SWARM OPTIMIZATION ALGORITHM OPTIMIZED BY GENETIC ALGORITHM
Particle swarm optimization is a heuristic global optimization algorithm [31], [32] that can be used to solve complex optimization problems. The improved particle swarm optimization algorithm based on the genetic algorithm introduces the selection, crossover, and mutation operators of the genetic algorithm. Compared to the ordinary particle swarm optimization method, the improved method makes full use of the best particles, increases the convergence speed, improves the efficiency of the evolution and the search accuracy, increases the diversity of particles through the genetic operators, and searches a wider range of solutions to escape local optima. Thus, this method performs better for parameter optimization. The fitness function for the GAPSO [33] was selected as the ZPW-2000A jointless track circuit fault detection accuracy based on the DBN model to achieve minimum error. For the GAPSO algorithm, the particle index is $i$, the dimension index is $j$, the total number of dimensions of each particle is $D$, the current iteration number is $k$, $c_1$ and $c_2$ are acceleration constants, $r_1$ and $r_2$ are random numbers (their values were set between 0 and 0.2), the position of the particle is $X_i = (x_{i1}, x_{i2}, \ldots, x_{iD})$, the velocity of the particle is $V_i = (v_{i1}, v_{i2}, \ldots, v_{iD})$, the position with the best fitness function for one particle is $P_i = (p_{i1}, p_{i2}, \ldots, p_{iD})$, and the position with the best fitness function for the entire population of particles is $P_g = (p_{g1}, p_{g2}, \ldots, p_{gD})$. The following formulas are defined:

The positions of the particles in this study must be integers. If the calculation results are not integers, they are rounded. The value of the position must be greater than 0 in each dimension.
In each evolution of the particle swarm optimization optimized by the genetic algorithm, the particles are ranked by fitness. The best third of the particles enter the next iteration directly through the selection operation. The middle third of the particles are paired at random and, with a crossover probability, exchange part of their position and velocity data through the crossover operation; the resulting offspring enter the next iteration. The last third of the particles undergo a mutation operation that changes some values in the position and velocity dimensions by random initialization, and they enter the next iteration after mutation. The formulas for the crossover operation are as follows:

where $X$ represents the position vector with dimension $D$, $V$ is the velocity vector with dimension $D$, and $A$ is a $D$-dimensional vector that represents the probability of crossover; it has the same value in each dimension, and its value range is [0, 1]. $X_1(k)$ and $X_2(k)$ are the positions of the two particles selected for crossover, and $V_1(k)$ and $V_2(k)$ are the corresponding velocities. The flowchart of the improved particle swarm optimization algorithm is shown in Fig. 7.

In addition, if the number of particles is $N_1$ and the number of iterations is $T_1$, then the computational complexity of the PSO and GAPSO algorithms is $O(D \cdot N_1 \cdot T_1)$. For the deep belief network optimized by the particle swarm optimization algorithm (PSO-DBN) and the GAPSO-DBN method, the computational complexity of training is $O(D \cdot N_1 \cdot T_1 \cdot N_3 \cdot M_2 \cdot T_3)$, and the computational complexity during operation is $O(N_3 \cdot M_2)$.
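One GAPSO generation, as described above, can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the velocity and position update is the textbook PSO rule, the crossover is assumed to be an arithmetic blend weighted by `A`, and the names `pso_update`, `crossover`, `ga_step`, `x_max`, `v_max`, and `p_mut` are hypothetical. Note that Python's `round` rounds halves to the nearest even integer.

```python
import random


def pso_update(x, v, p_best, g_best, w, c1, c2, rng, r_max=0.2):
    """Textbook PSO update; positions are rounded integers kept above 0."""
    new_x, new_v = [], []
    for j in range(len(x)):
        r1, r2 = rng.uniform(0.0, r_max), rng.uniform(0.0, r_max)
        vj = (w * v[j] + c1 * r1 * (p_best[j] - x[j])
              + c2 * r2 * (g_best[j] - x[j]))
        new_v.append(vj)
        new_x.append(max(1, round(x[j] + vj)))
    return new_x, new_v


def crossover(x1, v1, x2, v2, A=0.5):
    """Assumed arithmetic crossover: children are A-weighted blends."""
    def blend(p, q):
        return [A * pj + (1.0 - A) * qj for pj, qj in zip(p, q)]
    child1 = {"x": [max(1, round(t)) for t in blend(x1, x2)],
              "v": blend(v1, v2)}
    child2 = {"x": [max(1, round(t)) for t in blend(x2, x1)],
              "v": blend(v2, v1)}
    return child1, child2


def ga_step(swarm, fitness, rng, A=0.5, x_max=30, v_max=5.0, p_mut=0.5):
    """Selection, crossover, and mutation applied to fitness-ranked thirds."""
    ranked = sorted(swarm, key=lambda p: fitness(p["x"]), reverse=True)
    third = len(ranked) // 3
    # Best third: copied unchanged into the next iteration (selection).
    elite = [{"x": list(p["x"]), "v": list(p["v"])} for p in ranked[:third]]
    # Middle third: randomly paired, each pair produces two offspring.
    middle = ranked[third:2 * third]
    rng.shuffle(middle)
    children = []
    for p, q in zip(middle[0::2], middle[1::2]):
        children.extend(crossover(p["x"], p["v"], q["x"], q["v"], A))
    if len(middle) % 2:  # an unpaired particle passes through unchanged
        children.append(middle[-1])
    # Last third: some dimensions are randomly re-initialized (mutation).
    mutated = []
    for p in ranked[2 * third:]:
        x = [rng.randint(1, x_max) if rng.random() < p_mut else xj
             for xj in p["x"]]
        v = [rng.uniform(-v_max, v_max) if rng.random() < p_mut else vj
             for vj in p["v"]]
        mutated.append({"x": x, "v": v})
    return elite + children + mutated
```

In the GAPSO-DBN setting, `fitness` would train a DBN whose hidden layer sizes are given by the particle position and return its fault detection accuracy, which makes each fitness evaluation the dominant cost.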

IV. EXPERIMENTAL SIMULATIONS
The DBN structure plays an important role in the fault detection of the ZPW-2000A jointless track circuit. To determine the network structure, the "experience method" and the "trial and error method" were used for preliminary parameter setting in the training process. The best combination of hidden layer nodes was then obtained by the GAPSO algorithm. In addition, several other models were built for comparison with the GAPSO-DBN model.

A. DEEP BELIEF NETWORK MODEL PRELIMINARY DESIGN
The number of model input layer nodes depends on the dimension of the input data, which was a 12-dimensional jointless track circuit monitoring parameter vector. Thus, the number of input layer nodes was 12. As there were 15 fault modes, the number of model output layer nodes was 15, and the BP neural network was used as the classifier. The initial value of the RBM learning rate was set to 0.1, and the initial momentum was set to 0.5. The preliminary number of hidden layer nodes, the number of hidden layers, and the number of hidden layer iterations were determined by enumeration experiments.

1) EFFECT OF NUMBER OF HIDDEN LAYER NODES
To determine the number of hidden layer nodes in the DBN used for high-level feature extraction of the input data, the following empirical equation was used: where $m$ is the number of nodes in the input layer, $n$ is the number of nodes in the output layer, $c$ is a positive integer between 1 and 10, and $S$ is the number of hidden layer nodes. Setting $m = 12$ and $n = 15$ in (32) gave a value range of 5-15 for the number of hidden layer nodes. With 3 hidden layers and 120 iterations, the DBN model with 5-15 nodes in each hidden layer was studied through simulations. The simulation results are shown in Fig. 8.
The simulation results showed that when the number of nodes in each hidden layer was 12, the DBN model's detection accuracy was 90.3%, which was the highest achieved.

2) EFFECT OF HIDDEN LAYER NUMBER
With 12 hidden layer nodes and 120 iterations, the DBN model with 1-6 hidden layers was studied using simulations. When the number of hidden layers was $n$, the DBN had $n + 2$ layers in total, with the structure $12 - 12 - \cdots - 12 - 15$. The simulation results are shown in Fig. 9.
The simulation results showed that when three hidden layers and five total layers were used, the DBN model's detection accuracy was 90.3%, which was the highest value achieved.

3) EFFECT OF ITERATION NUMBER OF HIDDEN LAYER
With 12 hidden layer nodes and 3 hidden layers, the effect of the iteration number of the hidden layers is shown in Fig. 10.
The simulation results showed that when the number of iterations was fewer than 110, the detection accuracy increased rapidly with the number of iterations. When the number of iterations was greater than 130, the detection accuracy decreased slightly with the number of iterations, corresponding to an over-fit state. When the number of iterations was about 120, the DBN model's detection accuracy was 90.3%, which was the highest value achieved.

B. DEEP BELIEF NETWORK MODEL ACCURATE DESIGN
The number of nodes in the input and output layers of the DBN structure was determined by the number of key parameters and fault modes of the ZPW-2000A track circuit, respectively. However, it is generally difficult to determine the number of nodes in each hidden layer. Adjusting the number of nodes in each hidden layer of a DBN is time consuming and requires manual work to input the parameters and determine the final parameters using the classification accuracy as the evaluation index. To determine the number of hidden layer nodes in the DBN more quickly and effectively, the GAPSO algorithm was used to find the best number of nodes in each hidden layer, output the best combination of hidden layer nodes, and meet the requirements of the model fault detection accuracy.
The appropriate selection of parameters of the GAPSO can make the fitness function converge quickly, reduce the number of iterations of the population, select a better number of hidden layer nodes in the DBN, and improve the accuracy of fault detection of the network model. Therefore, the parameters of GAPSO must be further determined with simulation experiments.
A large number of particles in the swarm lengthens the search time, so the number of particles should not be too large. To satisfy the requirements of the genetic algorithm, the number of particles in this study was set to 30. Since there were three hidden layers, the number of dimensions per particle was set to three. The inertia weights were set to $\omega_{start} = 0.8$ and $\omega_{end} = 0.1$ based on experience. The initial value of each dimension of the particle position was determined by the coarse-tuning structure of the DBN, that is, the initial value of each dimension of the particle position was 12. The value of each dimension of the particle velocity was randomly initialized to a larger value in the allowed range, the maximum number of iterations was taken as $t_{max} = 100$, and the value of each dimension of parameter $A$ was set to 0.5.
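The text gives only the start and end inertia weights, not the schedule between them. A common choice, assumed here purely for illustration, is a linear decrease over the iterations; the function name `inertia_weight` is hypothetical.

```python
def inertia_weight(k, t_max=100, w_start=0.8, w_end=0.1):
    """Linearly decreasing inertia weight (assumed schedule, not from the paper)."""
    k = min(max(k, 0), t_max)  # clamp the iteration index to [0, t_max]
    return w_start - (w_start - w_end) * k / t_max
```

A larger weight early on favors global exploration, while a smaller weight late in the run favors local refinement around the best positions found.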
For the selection of the key parameters $c_1$ and $c_2$ of the GAPSO algorithm, special parameter combinations were chosen based on experience (such as $c_1 = 1$, $c_2 = 1$ and $c_1 = 2$, $c_2 = 2$). The empirical formula $c_1 + c_2 \approx 4$ was also used. In the process of parameter optimization, the value of $c_1$ was changed from 1.0 to 3.0 with a step size of 0.05, and the value of $c_2$ was varied between $4 - c_1 - 0.2$ and $4 - c_1 + 0.2$ with a step size of 0.05. The experimental results are shown in Fig. 11. For each value of $c_1$ in Fig. 11, the plotted fault detection accuracy is the one achieved by the optimal choice of $c_2$ in the corresponding value range (in this way, there is only one curve in the plot, which is clearer, as there are no overlapping curves). The corresponding parameter $c_1$ and the number of iterations of the GAPSO algorithm are shown in Fig. 12. Fig. 11 shows that multiple $c_1$ and $c_2$ parameter combinations achieved the best fault detection result (98.6%). Among these, the combination of $c_1$ and $c_2$ that made the GAPSO algorithm stable with the fewest iterations was selected as the final value, that is, the final selected parameter values were $c_1 = 2.8$ and $c_2 = 1.3$, and the corresponding number of iterations was 28.
To better show the results, four typical and representative combinations of learning factors were used: $c_1 = 1$, $c_2 = 1$; $c_1 = 2$, $c_2 = 2$; $c_1 = 2.8$, $c_2 = 1.3$; and $c_1 = 2.2$, $c_2 = 1.8$. The fitness curve is shown in Fig. 13, and the corresponding number of iterations of the hidden layer of the DBN model is shown in Fig. 14.
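The parameter sweep described above can be enumerated as follows. This is an illustrative sketch: the function names are hypothetical, and the values are generated in integer multiples of 0.05 only to avoid floating-point drift. The selection rule mirrors the text: highest accuracy first, then the fewest iterations to stabilize.

```python
def c1_c2_grid():
    """All (c1, c2) pairs with c1 in [1.0, 3.0] and c2 within 0.2 of 4 - c1."""
    pairs = []
    for u1 in range(20, 61):        # c1 = 1.00, 1.05, ..., 3.00
        for d in range(-4, 5):      # offset = -0.20, -0.15, ..., +0.20
            u2 = 80 - u1 + d        # c2 = 4 - c1 + offset, in units of 0.05
            pairs.append((round(u1 * 0.05, 2), round(u2 * 0.05, 2)))
    return pairs


def select_best(results):
    """Pick the pair with the highest accuracy, then the fewest iterations.

    results maps (c1, c2) -> (accuracy, iterations_to_stabilize).
    """
    return min(results, key=lambda pair: (-results[pair][0], results[pair][1]))
```

The grid contains 41 values of $c_1$ and 9 values of $c_2$ per $c_1$, that is, 369 combinations, each requiring a full GAPSO-DBN run. The final choice $c_1 = 2.8$, $c_2 = 1.3$ lies inside this grid, whereas $c_1 = 1$, $c_2 = 1$ is one of the separately chosen special combinations.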

C. DESIGN OF OTHER MODELS FOR COMPARISON
A BP neural network and a deep belief network optimized by the particle swarm optimization algorithm (PSO-DBN) were trained on the same input data for comparison. The comparison was based on whether the classifier result and the labeled result were consistent. The topology of the BP neural network was as follows: the number of input layer nodes was 12, and the number of output layer nodes was 15. One hidden layer was used, the learning rate was set to 0.1, and the minimum training objective error was 0.001. The optimal number of hidden layer nodes was determined by enumeration. The fault detection accuracy with different numbers of hidden layer nodes is shown in Fig. 15. Fig. 15 shows that the optimal number of hidden layer nodes was 12, and the best detection accuracy of the BP neural network [...] layer is $M_1$, and the total number of iterations is $T_2$, then the computational complexity of training is $O(N_2 \cdot M_1 \cdot T_2)$ and the computational complexity during operation is $O(N_2 \cdot M_1)$.
The basic parameters of the PSO algorithm were set with reference to the GAPSO algorithm (the one difference from the analysis above is that the maximum number of iterations was taken as $t_{max} = 150$), and the key parameters $c_1$ and $c_2$ of the PSO algorithm were set using a method similar to that of the GAPSO algorithm. The relationship between the $c_1$ and $c_2$ parameter combination and the fault detection accuracy is shown in Fig. 16, and the corresponding parameter $c_1$ and the number of iterations of the PSO algorithm are shown in Fig. 17. Fig. 16 shows that the best detection accuracy of the PSO-DBN model was 95.1%, and only three $c_1$ and $c_2$ parameter combinations achieved it. The PSO algorithm thus had far fewer optimal parameter combinations than the GAPSO algorithm. Based on the results in Fig. 17, the combination among these three that made the model stable with the smallest number of iterations was selected, which was $c_1 = 1.7$ and $c_2 = 2.3$, and the iteration curve is shown in Fig. 18. The number of iterations required to make the PSO algorithm stable was 86, and the PSO-DBN model structure was determined to be $12 - 12 - 14 - 13 - 15$.
Comparing Figs. 11 and 12 with Figs. 16 and 17, it was found that the number of iterations required by the PSO algorithm was significantly larger than that of the GAPSO algorithm, and the GAPSO algorithm exhibited faster convergence and stronger optimization abilities than the PSO algorithm for the ZPW-2000A jointless track circuit fault detection based on the DBN.

V. EXPERIMENTAL RESULTS
In this experiment, there were 1500 samples in the test set, and 100 samples were available for each fault mode. For the BP neural network model, the detection accuracy of the worst fault mode F7 was only 85%, and that of the best fault mode F9 was only 97%. For each fault type, some samples were identified as normal. The overall performance was mediocre. The results showed that the BP neural network model was significantly worse than the other two methods. Fig. 20 shows the classification confusion matrix of the PSO-DBN model. The detection accuracy of only one fault mode reached 100%. The detection accuracy of most fault modes was about 95%, and the detection accuracy of the worst fault modes F7 and F14 was 91%. Almost every fault type had some samples identified as normal. The overall performance was good, and the results show that the PSO-DBN model was quite good at identifying faults. Fig. 21 shows the classification confusion matrix of the GAPSO-DBN model. There were six fault modes whose detection accuracy reached 100%. The detection accuracy of most fault modes was about 98%, the detection accuracy of the worst fault mode F7 was 95%, and only six fault types had some samples that were identified as normal. The overall performance was excellent. The results show that the GAPSO-DBN model was clearly better than the other two methods.
To better measure the performance of the model, four metrics were selected to evaluate it: accuracy rate, recall rate, precision rate, and F1 Score. The accuracy rate is the ratio of the number of samples correctly classified by the model to the total number of samples in the test set, which evaluates the classification accuracy of the model. The recall rate is the proportion of the samples of the target class that the model identified as that class. The precision rate is the proportion of the samples predicted as positive that were actually positive. The F1 Score combines the recall rate and precision rate and reflects the stability of the network: the larger the value, the more stable the model is. The F1 Score is calculated as follows:

$$F1\ Score = \frac{2 \cdot Recall \cdot Precision}{Recall + Precision}$$

To further evaluate the GAPSO-DBN model proposed in this paper, Figs. 22, 23, and 24 compare the accuracy, precision, and F1 Score of different models for different ZPW-2000A jointless track circuit fault types (for the model in this paper, for a single fault mode, the accuracy and recall were numerically equal, so it was not necessary to draw a graph for the recall). Figs. 22, 23, and 24 show that for nearly all fault types, the classifier evaluation indices of the GAPSO-DBN model were better than those of the other models (the exceptions were the accuracy index of fault type F13, the precision index of fault type F9, and the F1 Score index of fault type F9). This shows that the GAPSO-DBN model proposed in this paper accurately classified the fault samples. The model rarely made wrong classifications, and its performance was stable.
A multi-class classification problem is the focus of this study. Thus, the recall, precision, and F1 Score were all calculated as macro-averaged values. The formulas for the macro-precision, macro-recall, and macro-F1 Score are as follows:

$$F1\ Score_{macro} = \frac{2 \cdot Recall_{macro} \cdot Precision_{macro}}{Recall_{macro} + Precision_{macro}}$$
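The macro-averaged metrics can be computed from a confusion matrix as sketched below. The sketch assumes that macro-precision and macro-recall are the unweighted means of the per-class values (the usual definition of macro-averaging) and uses the macro-F1 formula given above; the function name `macro_metrics` and the matrix layout (rows are true classes, columns are predicted classes) are assumptions for illustration.

```python
def macro_metrics(cm):
    """Return (accuracy, macro_recall, macro_precision, macro_f1).

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(n)) / total
    recalls, precisions = [], []
    for i in range(n):
        tp = cm[i][i]
        actual = sum(cm[i])                          # samples truly in class i
        predicted = sum(cm[r][i] for r in range(n))  # samples predicted as i
        recalls.append(tp / actual if actual else 0.0)
        precisions.append(tp / predicted if predicted else 0.0)
    recall = sum(recalls) / n
    precision = sum(precisions) / n
    denom = recall + precision
    f1 = 2.0 * recall * precision / denom if denom else 0.0
    return accuracy, recall, precision, f1
```

With the same number of test samples in every class, as in the 100-per-mode test set used here, the overall accuracy equals the macro-recall.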
The fault classification results of the BP neural network model, PSO-DBN model, and GAPSO-DBN model are compared in Table 6.
As shown in Table 6, the GAPSO-DBN model had the highest detection accuracy, macro-recall, macro-precision, and macro-F1 Score. Every performance indicator was excellent, and thus the proposed method had high accuracy and robustness, whereas the evaluation indices of the PSO-DBN model and the BP neural network model were slightly worse. Compared with the limited performance of the fault detection methods proposed in the literature [1], [2], [19], [20], the GAPSO-DBN algorithm overcame the bottleneck of ZPW-2000A jointless track circuit fault detection and achieved excellent results.
The GAPSO-DBN model is suitable for ZPW-2000A jointless track circuit fault detection because the DBN is an excellent deep structure space optimization algorithm, which is well suited to fault detection problems in which the relationship between the original data and the classification results is not obvious. After features were extracted from the original data by the DBN model, the high-level distribution characteristics showed better inter-class separation, better intra-class aggregation, and less overlap. At the same time, because the final classification accuracy is greatly affected by the DBN structure, using the GAPSO algorithm to optimize the DBN structure can greatly improve the final fault detection performance. Combining these two advantages, the GAPSO-DBN model exhibited good detection performance, and the accuracy reached 98.5%.
In addition, the GAPSO-DBN model proposed in this paper can effectively search the parameter space of deep structures. It can also be applied to other problems in which a relationship exists between the original data and the classification results but is not obvious, so that the high-level distribution characteristics of the original data must be extracted, and in which the final result is greatly affected by the DBN structure. Of course, if the relationship between the original data and the classification result is obvious, then the GAPSO-DBN method proposed in this paper has no significant advantage.

VI. CONCLUSION AND FUTURE WORK
In this paper, an equivalent theoretical model of the ZPW-2000A jointless track circuit was established using four-terminal network theory and used to obtain the fault data of the ZPW-2000A jointless track circuit. Fault mode detection was performed with a DBN model optimized by the GAPSO algorithm. The experiments showed that the established ZPW-2000A jointless track circuit fault detection method achieved excellent results, with a high detection accuracy of 98.5%. At the same time, the method has strong robustness. Actual jointless track circuit environments are complicated, which increases the difficulty of jointless track circuit fault detection. Improving the method's fault detection capabilities will be the subject of future research. The method will be deployed on an acquisition computer in an actual railway field to carry out ZPW-2000A jointless track circuit fault detection. After the performance of the method is demonstrated, we will launch practical integrated products with corresponding equipment companies.