Fault Diagnosis Scheme Based on Microbial Fuel Cell Model

Around the world, fossil fuels are decreasing and pollution is increasing. As a new energy source, microbial fuel cells (MFCs) have attracted wide attention. However, most previous research has focused on the material selection, configuration design and optimal control of MFCs, and few studies have systematically analyzed MFC failures. To ensure the reliable operation of MFCs, this paper systematically explores the MFC fault diagnosis process, including the acquisition of initial fault data, feature extraction and fault classification. First, to acquire data quickly and effectively, a mathematical model is used to simulate the occurrence of faults, and four types of typical fault voltages are obtained. Then, wavelet analysis is used to extract the voltage characteristics of MFC faults, and the characteristics of each fault are explored in eight frequency bands. Finally, the recognition performance of various classifiers on the fault features is compared. The results show that the fault tree is the most suitable fault diagnosis method for MFCs. The fault data extraction method proposed in this paper and the classification results obtained for the various classifiers provide a reference for further analysis of MFC faults. At the same time, the proposed combination of wavelet analysis and a fault tree diagnosis model provides ideas for fault diagnosis in other fields.


I. INTRODUCTION
Due to the global depletion of traditional energy sources and the environmental damage caused by the development and use of fossil energy, renewable energy has attracted worldwide attention. The microbial fuel cell (MFC) is a potential clean energy source [1]. On the one hand, its products do not cause environmental pollution. On the other hand, the raw materials it uses are domestic sewage or industrial wastewater, providing a continuous energy source for the recycling of water resources [2], [3].
The associate editor coordinating the review of this manuscript and approving it for publication was Yongquan Sun.
The MFC is a new way of generating electricity: electrogenic bacteria decompose organic matter to produce electricity. Because electrogenic bacteria can reproduce themselves, the supply of catalyst for the oxidation of organic matter is inexhaustible [4]. Bacteria can react over a range of temperatures according to their tolerance limits. Most electrogenic bacteria can survive at room temperature, while some thermophilic bacteria can tolerate even higher temperatures. The presence of these electrogenic bacteria enables MFCs to adapt to a variety of environments. These advantages have drawn more and more researchers into the field. However, a great deal of research has focused on the material selection, structural design and control optimization of MFCs. The few studies on MFC fault diagnosis remain mostly at the theoretical level, without systematic experimental demonstration. The stable operation of microbial fuel cells is still a problem to be solved.
In recent years, numerous fault diagnosis methods have been proposed. Reference [5] successfully predicted the Remaining Usable Life (RUL) of a Multifunctional Spoiler System (MFS) using a new hybrid prediction method. Reference [6] optimized the Echo State Network (ESN) structure by combining a competitive swarm optimizer with local search, and subsequently introduced a layer-wise optimization strategy for evolving deep ESNs. Reference [7] proposed a joint-loss convolutional neural network (JL-CNN) architecture, which implements bearing fault recognition and RUL prediction in parallel by sharing the parameters and partial networks while keeping separate output layers for the different tasks. References [8] and [9] address the fault diagnosis of a 3D printer and of a multi-joint industrial intelligent robot, respectively. In fact, most of the signals encountered in fault diagnosis are non-stationary, and wavelet analysis can be used to handle them. In reference [10], wavelet analysis, Generalized Regression Neural Network Ensemble for Single Imputation (GRNN-ESI) and Principal Component Analysis (PCA) are effectively combined and applied to the fault diagnosis of wind turbine blades. In reference [11], Wavelet Transform (WT) Multi-Resolution Analysis (MRA) technology was integrated with an Artificial Neural Network (ANN), focusing on fault detection and classification in a Ship-borne Power System (SPS). In reference [12], wavelet analysis and a back propagation neural network are combined to diagnose open-circuit faults of power components in a power converter. These studies show that wavelet analysis performs well in fault diagnosis.
The MFC is special because it uses microorganisms as the catalyst, which makes its faults difficult to diagnose. Deep learning is widely used in data classification with good results, but it needs a large amount of data from which to learn features automatically. However, MFCs have a complex structure and strong coupling. In addition, because the MFC is driven by microorganisms, most faults manifest themselves only gradually over time after they occur. As a result, it is difficult to accurately collect the large amount of fault information that deep learning would require. Yan et al. proposed an MFC fault feature recognition method based on wavelet analysis [13]. Wavelet packet analysis of the system's output voltage yields the high-frequency and low-frequency parts of the signal, which are used to distinguish different fault types. However, this method does not realize on-line diagnosis of the fault type. Fan et al. proposed an MFC fault diagnosis method based on the fault tree [14] and discussed in detail the feasibility of applying the fault tree method to MFCs, but the method was not validated with experimental data.
Based on the above research, wavelet analysis can effectively describe local features in the time domain and frequency domain. It is therefore well suited to fault diagnosis, especially for nonlinear systems. Combining wavelet analysis with machine learning can enhance the adaptive resolution and fault tolerance of the system and effectively improve the classification accuracy. In this paper, the battery model is used to simulate faults and obtain the original fault data. Wavelet packet analysis is then used to extract multi-frequency characteristic values of the MFC output voltage. Finally, the effects of various classifiers are tested on the obtained feature values. In this way, the fault diagnosis of MFCs is systematically analyzed.
This paper is organized as follows. Section II introduces the microbial fuel cell and analyzes its main faults. Section III illustrates the concrete algorithm of wavelet analysis. In Section IV, various classifiers are introduced. In Section V, simulation experiments are conducted. The last section concludes.

II. MICROBIAL FUEL CELL
MFCs use domestic sewage and industrial wastewater as raw materials. Under the action of electrogenic bacteria, the acetate in the substrate is decomposed into hydrogen ions and electrons in an oxidation reaction. Hydrogen ions pass through the proton exchange membrane to the cathode. Electrons are transferred to the surface of the anode by means of nanowires and dielectric, and then travel through an external wire to take part in the reduction reaction. Figure 1 shows a typical two-compartment MFC.

A. THE STRUCTURE OF MICROBIAL FUEL CELL
The double-chamber MFC is composed of several components, including a reaction device, data processing equipment, an external circuit and various auxiliary devices [15]. On the left, the anode chamber is kept oxygen-free to ensure the activity of the electricity-producing bacteria. At the same time, the mixing equipment at the bottom keeps the activated sludge in suspension to improve the efficiency of electricity production. An aerating apparatus is used in the cathode chamber on the right to increase the concentration of the reactant oxygen. The anode electrons are collected through the external circuit to form a closed loop.

The mathematical model follows [16]. Its state equations describe the substrate concentration change rate, the growth rate of electrogenic bacteria and the concentration change of other substances in the anode, and further expressions give the anode and cathode open-circuit potentials. In these expressions, $E_a$, $E_c$, $E_{lossa}$, $E_{lossc}$, $R$, $T$, $F$, $n_1$ and $n_2$ are constants; their specific meanings and values are shown in Table 1. In the model, $x_1$, $x_2$, $x_3$ and $x_4$ respectively represent the substrate concentration, the microbial growth rate, $HCO_3^-$ and $H^+$. The control quantity $u$ represents the dilution rate of the injected substrate. $\mu_{max}$ represents the maximum growth rate of the microorganisms, whose actual growth rate is obtained from the Monod equation. $q_{max}$ is the maximum substrate consumption rate. $S_0$ represents the concentration of the feed substrate. $K_s$ is the half-saturation constant. $b$ is the loss constant of microorganisms in each generation.
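To make the role of the Monod parameters concrete, the following is a minimal sketch of a generic chemostat-style anode model with Monod kinetics. It is not the model of [16], whose equations are not reproduced here: the two-state form, the interpretation of the second state as biomass concentration, the forward-Euler integration and all numerical values are assumptions made only for illustration.

```python
def monod(s, rate_max, k_s):
    """Monod saturation term: rate_max * s / (k_s + s)."""
    return rate_max * s / (k_s + s)


def simulate(params, x1=1.0, x2=0.1, dt=0.01, steps=5000):
    """Forward-Euler integration of an assumed two-state anode model.

    x1 is the substrate concentration and x2 the biomass concentration
    (an assumption of this sketch, not a statement about the paper's model).
    """
    q_max, mu_max = params["q_max"], params["mu_max"]
    k_s, b = params["K_s"], params["b"]
    s0, u = params["S_0"], params["u"]
    for _ in range(steps):
        # Substrate: consumed by the bacteria, replenished by the feed.
        dx1 = -monod(x1, q_max, k_s) * x2 + u * (s0 - x1)
        # Biomass: Monod growth minus decay and washout.
        dx2 = monod(x1, mu_max, k_s) * x2 - b * x2 - u * x2
        x1 += dt * dx1
        x2 += dt * dx2
    return x1, x2


# Hypothetical parameter values, chosen only so that the sketch runs.
PARAMS = {"q_max": 2.0, "mu_max": 0.5, "K_s": 0.5, "b": 0.02, "S_0": 2.0, "u": 0.1}

if __name__ == "__main__":
    print(simulate(PARAMS))
```

In such a model, lowering `q_max`, `mu_max` or `S_0` changes the steady state that the simulation settles into, which is the kind of parameter sensitivity that model-based fault simulation relies on.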

C. TYPICAL FAILURES OF THE SYSTEM
Because of the strong coupling of the MFC itself [17], MFC faults occur particularly often. They can be divided into the following categories.

1) LOW SUBSTRATE UTILIZATION
Microorganisms in the anode chamber produce electrons by breaking down organic matter. Therefore, ensuring that the microorganisms react with the substrate as fully as possible is an effective way to achieve efficient electricity generation.

2) LOW MICROBIAL ACTIVITY
MFCs are powered by anaerobic electrogenic bacteria at the anode, which place stringent requirements on the anodic environment. The temperature and pH of the anodic solution affect microbial activity [18].

3) OXYGEN DEFICIENCY
Oxygen in the cathode participates directly in the reduction reaction under the action of the catalyst, so its concentration directly affects the cathode reaction rate. A lack of oxygen can seriously reduce the efficiency of MFCs in both electricity generation and sewage treatment [19].

4) THE SUBSTRATE CONCENTRATION DECREASES
A higher substrate concentration means that microorganisms are able to break down organic matter more fully, enhancing the oxidation reaction. The anode solution is constantly replenished from elsewhere, and this process has an important effect on the substrate concentration [20].
In the MFC model, $q_{max}$ and $\mu_{max}$ represent the substrate utilization and the maximum growth rate of microorganisms, respectively. A lower oxygen concentration reduces the cathode reaction rate, which is assumed to result in cathode voltage loss, so $E_{lossc}$ is used to represent the change in oxygen concentration. In addition, $S_0$ in the model represents the feed concentration. The data for several typical MFC failures are then obtained by abnormally perturbing these parameters.
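The mapping from fault type to perturbed parameter described above can be sketched as a small helper. The parameter names mirror the symbols in the text; the fault labels, the function name, the nominal values and the scaling factors are hypothetical and only illustrate the idea of injecting a fault by perturbing one model parameter.

```python
# Which model parameter is perturbed for each fault type (from the text).
FAULT_PARAMETER = {
    "low_substrate_utilization": "q_max",
    "low_microbial_activity": "mu_max",
    "oxygen_deficiency": "E_lossc",
    "substrate_concentration_decrease": "S_0",
}


def inject_fault(params, fault, factor):
    """Return a copy of params with the fault's parameter scaled by factor."""
    if fault not in FAULT_PARAMETER:
        raise ValueError(f"unknown fault type: {fault}")
    faulty = dict(params)  # leave the nominal parameters untouched
    faulty[FAULT_PARAMETER[fault]] *= factor
    return faulty


# Hypothetical nominal values, for illustration only.
NOMINAL = {"q_max": 2.0, "mu_max": 0.5, "E_lossc": 0.5, "S_0": 2.0}

if __name__ == "__main__":
    # A larger cathode loss stands in for a lack of oxygen.
    print(inject_fault(NOMINAL, "oxygen_deficiency", 1.5))
    # A smaller q_max stands in for low substrate utilization.
    print(inject_fault(NOMINAL, "low_substrate_utilization", 0.5))
```

Note that the direction of the perturbation depends on the parameter: a loss term such as `E_lossc` would plausibly be increased, while rates and concentrations would be decreased.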

III. WAVELET PACKET DECOMPOSITION AND FEATURE EXTRACTION
In 1992, Coifman, Meyer et al. proposed wavelet packet analysis. Compared with other feature extraction methods, this method can realize multilevel and wide-field signal decomposition [21].
Wavelet analysis separates signal characteristics into multiple frequency bands and ensures that they are independent of each other. This effectively improves the accuracy of source signal identification [22]. Compared with the short-time Fourier transform (STFT), wavelet analysis can gradually refine the signal at multiple scales through dilation and translation operations, finally achieving fine time resolution at high frequencies and fine frequency resolution at low frequencies. It adapts automatically to the requirements of time-frequency signal analysis and can therefore focus on any detail of the signal.

A. THE PROCESS OF FEATURE EXTRACTION
In the first decomposition, the signal is split into a high-frequency part and a low-frequency part. In the second decomposition, each of the two parts is decomposed in the same way as in the first. Thus, each wavelet packet decomposition step produces two sequences, and the sample characteristics carried in the two sequences are completely independent [23]. A typical wavelet packet feature extraction process is shown in Figure 2, where H denotes low frequency and G denotes high frequency.
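The repeated splitting described above can be sketched with the Haar filter pair, the simplest orthogonal wavelet. The choice of Haar is an assumption of this sketch (the mother wavelet used in the paper is not stated here), and the function names are illustrative. Three levels give the eight bands used later in the paper; the signal length must be divisible by 2 to the power of the number of levels.

```python
import math


def haar_split(signal):
    """One Haar analysis step: return the (low-pass, high-pass) halves."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    scale = 1.0 / math.sqrt(2.0)
    low = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return low, high


def wavelet_packet(signal, levels):
    """Split every node at every level; returns 2**levels coefficient lists.

    Nodes are returned in natural (filter-bank) order, which is not the
    same as strictly increasing frequency order.
    """
    nodes = [list(signal)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            low, high = haar_split(node)
            next_nodes.extend([low, high])
        nodes = next_nodes
    return nodes


if __name__ == "__main__":
    bands = wavelet_packet([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0], levels=3)
    print(len(bands), bands)
```

Unlike the ordinary wavelet transform, which keeps splitting only the low-frequency branch, the packet version splits both branches at every level, so the high-frequency content is resolved as finely as the low-frequency content.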

B. PRINCIPLE ANALYSIS OF FREQUENCY DOUBLING WAVELET
The signal of wavelet decomposition can be expressed by scale function and wavelet function.
A set of recursive functions $H_n(t)$ is defined from the filter coefficients $f(k)$ and $g(k)$, where $g(k) = (-1)^k f(1 - k)$. When $n = 0$, $H_0(t)$ is the scale function and $H_1(t)$ is the wavelet function; $f(k)$ is the coefficient of the high-pass filter, and $g(k)$ is the coefficient of the low-pass filter. In each step of the decomposition, the signal is decomposed into a low-frequency approximation signal and a high-frequency detail signal. For a collected time series $x(t)$, $t \in (1, N)$, where $N$ is the number of sampling points, the decomposition coefficients form a discrete sequence $S(k)$, $k \in N$, and a $j$-layer wavelet packet decomposition divides the feature into $l$ frequency bands.
When the microbial fuel cell system malfunctions, the fault information is contained in the whole spectrum. Compared with normal operation, the signal is enhanced at some frequencies and weakened at others. Therefore, the characteristic values of the spectrum signal can be calculated to diagnose the different faults of the system.
The eigenvalue of the spectrum signal is calculated from the decomposition coefficients, where $j$ is the number of layers in which the signal is decomposed and $N$ is the number of nodes in the last layer.
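For reference, the recursive definition of wavelet packet functions is commonly written in the following two-scale form. The filter symbols $h_0$ (low-pass) and $h_1$ (high-pass) are this sketch's own textbook-style notation and are not meant to settle which role the paper assigns to $f(k)$ and $g(k)$.

```latex
\begin{aligned}
H_{2n}(t)   &= \sqrt{2}\,\sum_{k} h_0(k)\, H_n(2t - k), \\
H_{2n+1}(t) &= \sqrt{2}\,\sum_{k} h_1(k)\, H_n(2t - k),
\qquad h_1(k) = (-1)^k\, h_0(1 - k).
\end{aligned}
```

Setting $n = 0$ gives the two-scale equation of the scale function $H_0(t)$ and the definition of the wavelet function $H_1(t)$ in terms of $H_0(2t - k)$.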
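Since the eigenvalue formula itself is not reproduced in the text, the sketch below uses a common choice for wavelet packet features, the normalized energy of each last-layer node. This choice, and the function name, are assumptions and may differ from the paper's definition.

```python
def band_energy_features(bands):
    """Return each band's share of the total coefficient energy.

    bands is a list of coefficient lists, one per last-layer node.
    """
    energies = [sum(c * c for c in band) for band in bands]
    total = sum(energies)
    if total == 0.0:
        # An all-zero signal carries no energy in any band.
        return [0.0 for _ in energies]
    return [e / total for e in energies]


if __name__ == "__main__":
    # Three toy bands with energies 25, 0 and 25.
    print(band_energy_features([[3.0, 4.0], [0.0, 0.0], [5.0]]))
```

Normalizing by the total energy makes the feature vector insensitive to the overall signal amplitude, so that it reflects how a fault redistributes energy between bands.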

IV. CLASSIFIER
This paper discusses the classification effect of various classifiers based on wavelet analysis. This section briefly introduces various classifiers.

A. PROBABILISTIC NEURAL NETWORK
Probabilistic Neural Network (PNN) is developed from the Bayesian classification method. It is a type of artificial neural network with a simple structure, fast training and wide practical application. When solving classification problems in practice, PNN can transform nonlinear problems into linear problems while retaining the recognition accuracy of nonlinear algorithms. The internal structure of the network is shown in Figure 3.
VOLUME 8, 2020
In the pattern layer, $Q_i$ is the weight vector feeding the layer, and $\gamma$ is a smoothing factor that has a great influence on the classification decision. Layer 3 is the summation layer, which accumulates the probabilities of belonging to a given class. Each summation-layer unit is connected only to the pattern-layer units of its own class and not to other units in the pattern layer. Normalizing the output of this layer gives the probability estimate for each class. When the final decision is made, the posterior probability of each fault is computed and the class with the maximum value is output as the decision result. The units of the output layer correspond to those of the summation layer, and each unit represents a class [24]. The output layer selects the most likely class based on the summation layer.
Feature recognition based on PNN is a classification based on probability statistics [25]. Suppose there are two known feature categories, $\theta_A$ and $\theta_R$, and the fault feature sample to be classified is $X = (x_1, x_2, \ldots, x_n)$. In the decision rule, $h_A$ and $h_R$ are the prior probabilities of categories $\theta_A$ and $\theta_R$ ($h_A = N_A/N$, $h_R = N_R/N$); $N_A$ and $N_R$ are the numbers of training samples of categories $\theta_A$ and $\theta_R$; $N$ is the total number of training samples; $l_A$ is the cost factor of wrongly assigning a fault feature sample $X$ belonging to $\theta_A$ to mode $\theta_R$; $l_R$ is the cost factor of wrongly assigning a fault feature sample $X$ belonging to $\theta_R$ to mode $\theta_A$; and $f_A$ and $f_R$ are the probability density functions (PDFs) of the fault modes $\theta_A$ and $\theta_R$. Generally, the PDF cannot be obtained exactly, and only a statistical estimate can be obtained from the existing feature samples.
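A minimal sketch of this kind of decision, under stated assumptions: the class-conditional densities are estimated with Gaussian Parzen windows of width `gamma`, the priors are taken as the class sample fractions as in the text, and the sample is assigned to the class with the largest product of prior, cost factor and estimated density. The function names, the toy data and the default `gamma = 0.5` are illustrative and not taken from the paper.

```python
import math


def parzen_density(x, samples, gamma):
    """Average Gaussian kernel between x and each training sample.

    The normalizing constant is omitted: it is the same for every class
    when all classes share the same gamma and feature dimension.
    """
    total = 0.0
    for s in samples:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, s))
        total += math.exp(-sq_dist / (2.0 * gamma ** 2))
    return total / len(samples)


def pnn_classify(x, training, gamma=0.5, costs=None):
    """Pick the class maximizing prior * cost * estimated density.

    training maps each class label to its list of feature vectors.
    """
    n_total = sum(len(samples) for samples in training.values())
    best_label, best_score = None, -1.0
    for label, samples in training.items():
        prior = len(samples) / n_total
        cost = 1.0 if costs is None else costs[label]
        score = prior * cost * parzen_density(x, samples, gamma)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    data = {"A": [[0.0, 0.0], [0.1, 0.0]], "R": [[1.0, 1.0], [0.9, 1.0]]}
    print(pnn_classify([0.05, 0.1], data))
```

Adding a training sample only appends one more kernel to the corresponding class, which is why no retraining step appears anywhere in the sketch.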

2) ADVANTAGES OF PNN COMPARED WITH BP NETWORK
The PNN process is simple and converges quickly. The hidden layer is built according to fixed rules and is constructed by the system itself once the samples are input. In addition, given sufficient samples, the optimal solution under the Bayesian criterion can be obtained, and PNN does not fall into a local optimum as a BP neural network can. Moreover, the uncertainty of the internal structure of the BP neural network lengthens its training time, and its recognition process cannot be completed in one step as with PNN.
PNN also has a strong ability to append samples. If new training samples are added during fault diagnosis, or some old training samples need to be deleted, PNN only needs to change the number of pattern-layer units, and the weights from the input layer to the new pattern-layer units are assigned directly from the new samples. For a BP network, training must be repeated after the training set is modified. All connection weights of the network must be reassigned, which is equivalent to rebuilding the entire BP network. If the dimension of the feature vector changes, the previous training results are invalidated, and time must be spent training and constructing the internal network again.

B. OTHER TYPICAL CLASSIFIERS
K-Nearest Neighbor (KNN) classifies a sample by measuring the distance between feature values. The idea is that if most of the K samples most similar to a sample (that is, its nearest neighbors in the feature space) belong to a certain category, the sample also belongs to that category, where K is usually an integer not greater than 20. In the KNN algorithm, the selected neighbors are all correctly classified objects. The method determines the category of the sample to be classified only from the categories of the nearest one or several samples. As shown in Figure 4, when K is 3, the target is recognized as a triangle; when K is 5, the target is recognized as a circle. The Euclidean distance between two points $(x_1, y_1)$ and $(x_2, y_2)$ in the two-dimensional plane is expressed as

$$d = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$$

Support Vector Machine (SVM) is a binary classification model. Its purpose is to find a hyperplane that separates the samples. The separation principle is to maximize the distance between the plane and the samples of the two classes closest to it on either side, which gives good generalization ability in classification problems. For example, in Figure 5, using A as the hyperplane is clearly the optimal choice.

A fault tree is a logical causal diagram composed of events and logic gates. Events describe the failure states of systems, elements and components. Logic gates connect events and represent the logical relationships between them. The fault tree is a top-down graphical deduction method with great flexibility. It supports both qualitative and quantitative analysis of the system, and it can analyze not only system faults caused by a single component but also system faults caused by different failure modes of multiple components. Figure 6 shows a simple fault tree model.
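The KNN vote can be sketched in a few lines. The toy coordinates below are hypothetical and are arranged only to mimic the situation described for Figure 4, where the same target is labeled differently for K = 3 and K = 5; the function name is illustrative.

```python
import math
from collections import Counter


def knn_classify(x, samples, labels, k=3):
    """Label x by majority vote among its k nearest training samples."""
    # Sort training indices by Euclidean distance to x.
    order = sorted(range(len(samples)), key=lambda i: math.dist(x, samples[i]))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    points = [(0.1, 0.0), (0.0, 0.2), (0.3, 0.0), (0.4, 0.0), (0.0, 0.5)]
    kinds = ["triangle", "triangle", "circle", "circle", "circle"]
    target = (0.0, 0.0)
    print(knn_classify(target, points, kinds, k=3))
    print(knn_classify(target, points, kinds, k=5))
```

The example shows why the choice of K matters: the two nearest points are triangles, but once K grows to 5 the three circles outvote them.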

V. EXPERIMENTAL PROCESS AND SIMULATION

A. FAULT DIAGNOSIS PROCESS
In this paper, wavelet analysis and classifiers are combined to identify and classify several battery faults. Voltage data are collected after the MFC has run for a period of time, and wavelet decomposition is then performed to obtain the characteristic value of each frequency band. Figure 7 shows the voltage output of an MFC under normal operation. The collected data are divided into a training set and a test set, and two neural network models are constructed. For the BP neural network, appropriate parameters are selected and the entire model is initialized; after training, the test set is fed in to view the classification results. PNN differs from the BP neural network in that most of its structure is constructed automatically by the system. Finally, the prediction results of the two networks are compared and analyzed. The diagnosis process is shown in Figure 8.
After testing the classification effect of two basic neural networks, based on the results of wavelet analysis, the classification effects of several typical classifiers such as KNN, SVM and fault tree are explored.

B. WAVELET PACKET ANALYSIS
Considering the training speed and recognition accuracy, the signal is decomposed into 8 mutually independent characteristic signals. The frequency range of each characteristic signal is shown in Table 2. After classification, the mean values of fault features of different frequency bands are shown in Table 3.
As can be seen from Table 3, compared with the data of each node under normal conditions, node 1 of the first type of fault is lower by nearly half; the second type of fault is lower at node 0, node 1 and node 2; the third type of fault is significantly reduced at node 0 and node 3; and node 1 of the fourth type of fault is slightly lower.

The data are taken from the voltage stabilization process. Figures 9 to 13 show the wavelet reconstruction results under different fault conditions. Figure 9 shows the results of normal battery operation. Figure 10 shows the wavelet reconstruction results when oxygen is insufficient. Figure 11 shows the associated failures that lead to reduced substrate utilization. Figure 12 shows the results of decreased microbial activity. Figure 13 shows the result of insufficient substrate concentration. It is observed that some nodes differ between situations, and this difference can be used for fault classification. It can also be seen that after a failure, owing to the internal action of the fuel cell, the signal does not continue to fluctuate; after a period of fluctuation, it reaches a new steady state. Externally, this appears as reductions of varying degree in voltage and power.

Wavelet analysis is used to obtain 2500 sets of data for training, including four types of fault data and one type of normal working data. 105 sets of data are chosen to test the classification effect. The final classification result of the BP neural network test is shown in Figure 14, and the PNN classification result is shown in Figure 15.
As can be seen from the figure, the BP neural network did not make any mistakes in identifying the type 1, type 2 and It is worth noting that the BP neural network cannot recognize the normal working state of the system at all, while PNN recognizes it completely. This is particularly important for fault diagnosis of MFCs, because the system itself is not highly stable even in normal operation.

D. OTHER CLASSIFIER EFFECTS
After exploring in detail the two simple classifiers, PNN and BP, we briefly tested the classification effects of KNN, SVM and the fault tree. Their confusion matrices are shown in Figure 16, Figure 17 and Figure 18, respectively. Table 4 summarizes the classification effects of all the classifiers discussed in this paper on the MFC fault data. Except for the fault tree, the other four classifiers do not perform well, whereas the fault tree classifies almost all faults correctly. We attribute this to the fact that wavelet analysis amplifies the fault characteristics in multiple frequency bands, so that the characteristics of different faults stand out in particular frequency bands. The fault tree is itself a top-down, step-by-step classification method, which can effectively use a large difference in the characteristics of a given frequency band to distinguish faults and further improve the fault tolerance of the system. This provides ideas for future fault diagnosis in other areas.
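For readers who want to reproduce this kind of comparison, the sketch below shows how a confusion matrix and the overall accuracy are computed from true and predicted labels. The labels and predictions are a toy example and not results from the paper; the function names are illustrative.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


if __name__ == "__main__":
    classes = ["normal", "fault_1", "fault_2"]
    truth = ["normal", "normal", "fault_1", "fault_2"]
    guess = ["normal", "fault_1", "fault_1", "fault_2"]
    cm = confusion_matrix(truth, guess, classes)
    print(cm, accuracy(cm))
```

A row with no diagonal entries corresponds to a class that the classifier never recognizes, which is the situation reported above for the BP network and the normal working state.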

VI. CONCLUSION
The MFC system itself has strong coupling, so it is difficult to trace the cause of a fault, which seriously affects the efficiency of electricity generation. This paper is the first to systematically explore the fault diagnosis of MFCs. Firstly, the fault