Review of Machine Learning Based Fault Detection for Centrifugal Pump Induction Motors

Centrifugal pumps are an integral part of many industrial processes and are used extensively in water supply, sewage, heating and cooling systems. While there are several review papers on machine learning-based fault diagnosis on induction motors, its application to centrifugal pumps has received relatively little attention. This work attempts to summarize and review recent research and development in machine learning-based pump condition monitoring and fault diagnosis. The paper starts with a brief explanation of pump operation including common pump faults and the main principles of the motor current signature analysis (MCSA) method. This is followed by a detailed explanation of various machine learning-based methods including the types of detected faults, experimental details and reported accuracies. The performances of different approaches are then presented systematically in a unified table. Finally, the authors discuss practical aspects and challenges related to data collection, storage and real-world implementation.


I. INTRODUCTION
Centrifugal pumps (CP) represent 70% of all kinds of pumps [1] and are ubiquitous in the industrial world [2]. Although modern pumps can last for many years, their sudden failure can lead to undesirable disruptions or even catastrophic consequences, e.g. when it affects the water supply of a hospital. This has spurred the rapid development of intelligent condition monitoring techniques that use signal processing and machine learning methods to detect, diagnose and predict faults by monitoring patterns in vibration, pressure or current sensor signals.
Motor current signature analysis (MCSA) is a widely used predictive maintenance method for fault detection and condition monitoring that analyzes the stator current of the induction motor [3], the key component of a centrifugal pump. The method is based on the idea that various faults produce characteristically distinct patterns that can be detected through signal processing and statistical techniques [4].
The associate editor coordinating the review of this manuscript and approving it for publication was R. K. Saket.
Although acoustic emission, vibration and pressure-based systems also perform well at detecting failures and are covered by an extensive literature [5], MCSA is widely used for predictive maintenance, i.e. maintenance performed only when needed, which is more cost-effective than the other maintenance types.
An MCSA monitoring system can be deployed by attaching current clamps, used as transducers, to power supply wires without requiring direct physical access to the pump itself. The ease of deployment, non-invasive installation, and relatively low cost combined with high detection accuracy are the main advantages of MCSA. For some applications, e.g. monitoring submerged sewage pumps, MCSA is often the only feasible and practical method.
When applied to centrifugal pumps, MCSA can detect not only induction motor related faults, but also pump related faults such as cavitation, impeller fault, and other types of faults using appropriate machine learning methods. In other words, MCSA uses the motor as a transducer to monitor pump conditions [6]. For example, [7] proposes fault detection in pumps using deep learning, [1] uses neural networks for multi-objective prediction in multi-stage pumps, and [8] investigates cavitation faults using machine learning. However, despite a large amount of work on MCSA-based methods for induction motors [9], to the best of our knowledge no systematic review of MCSA-based techniques for fault detection in centrifugal pumps is documented in the literature.
In this paper, the authors present a systematic analysis of machine-learning-based MCSA methods for centrifugal pump fault detection. Our aim is to present a comprehensive analysis of recent work in this area, summarising technical challenges and relevant machine learning methods and comparing the key results. The survey also discusses practical challenges related to sensing, data transmission and collection that are specific to MCSA systems. In this respect, when comparing prior work, the authors summarize, where possible, information about the sampling rate, data acquisition equipment and other relevant details to give further practical insights to readers. To the best of our knowledge, this work is the first systematic survey on machine-learning-based methods for fault detection in centrifugal pumps.
The rest of the paper is structured as follows. Sections II and III-A1 introduce the key aspects of centrifugal pump design, including its components and main fault types. Section III discusses the key signal features and the signal processing techniques used to extract salient features of the fault signal. A comparison of MCSA with other fault detection alternatives, such as vibration signal analysis, is also presented in Section III-A. As the induction motor is the main component of centrifugal pumps, the survey also covers relevant MCSA-based methods for induction motors. Finally, the authors present machine learning-based and some non-machine learning-based solutions to help researchers compare their implementations.

II. CENTRIFUGAL PUMP OPERATION
As shown in Fig. 1, a centrifugal pump consists of two main parts: the rotating part, containing a shaft and an impeller, and the stationary part, which is composed of the casing, casing box, bearing and an electrical motor, typically an induction motor [8]. The fluid enters the pump axially through the eye of the casing, engages with the impeller blades and is rotated radially, gaining velocity and pressure before it leaves the impeller into the casing's diffuser.
Table 1 provides a brief summary of 31 different pump failure types and their underlying problems [11]. In this paper, the authors present the commonly investigated faults and compare the detection accuracies reported in the related papers. In the following sections, the authors first introduce the MCSA method, followed by a detailed description of each failure type and whether it is detectable by MCSA.

III. MOTOR CURRENT SIGNATURE ANALYSIS AND FAILURE TYPES
MCSA is based on the idea that certain electrical and mechanical faults introduce harmonics in the electric current, which can be detected through a combination of signal processing and machine learning methods. Healthy motors operate at a 50 Hz fundamental frequency (60 Hz in the US). However, as a fault develops in the machine, harmonics other than 50 Hz start to appear [12]. The pump load affects the fundamental frequency component, whereas load fluctuation causes noise and harmonics [13]. Therefore, detection can be done by checking the lateral bands (sidebands) around the fundamental frequency.
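As a minimal sketch of the sideband idea, the snippet below computes where sidebands would be expected for a broken rotor bar, assuming the commonly quoted relation f = f_s(1 ± 2ks), with supply frequency f_s and slip s. This relation, the helper names and the 2-pole-pair, 1440 rpm operating point are illustrative assumptions and are not taken from the works cited above.

```python
def slip_from_speed(supply_hz, pole_pairs, rotor_rpm):
    """Per-unit slip from the synchronous and measured rotor speeds."""
    sync_rpm = 60.0 * supply_hz / pole_pairs
    return (sync_rpm - rotor_rpm) / sync_rpm


def brb_sidebands(supply_hz, slip, orders=(1, 2)):
    """(lower, upper) sideband pairs, assumed at f_s * (1 -/+ 2*k*s)."""
    return [
        (supply_hz * (1 - 2 * k * slip), supply_hz * (1 + 2 * k * slip))
        for k in orders
    ]


# Hypothetical operating point: 50 Hz supply, 2 pole pairs, 1440 rpm.
slip = slip_from_speed(50.0, 2, 1440.0)
for lower, upper in brb_sidebands(50.0, slip):
    print(f"inspect spectrum near {lower:.1f} Hz and {upper:.1f} Hz")
```

Because the sidebands scale with slip, they move closer to the fundamental as the load decreases, which is one reason lightly loaded machines are harder to diagnose from the spectrum alone.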
The current is measured by attaching current probes to power supply wires, which makes the system relatively convenient and inexpensive to install and maintain. The range of faults detected by MCSA includes stator winding breakdown, broken rotor bar or electric bearing problems [15]. The types of faults detectable by MCSA include:
• Bearing fault: outer race fault [12], inner race fault, ball defect [4], [16].
• Blockage [19]-[21].
Fig. 2 illustrates broken rotor bar fault detection using MCSA. It can be seen that the fault affects the amplitudes at the lower frequencies of the attenuated fundamental component.

A. COMPARISON WITH VIBRATION SIGNATURE ANALYSIS
Vibration signature analysis (VSA) analyses the signals from vibration sensors attached directly to the pump [15]. The analysis is done by monitoring the signal's spectral content, under the assumption that the frequencies at which the vibrations occur point to the part of the machine where the fault occurs [22], [23]. Vibration can be measured by accelerometers, which need to be attached close to the rig of the centrifugal pump [23], [24].
Although VSA has its advantages, MCSA's cost-effectiveness, its ability to detect electrical faults [15] and its sensitivity compared to other techniques [6] make it more widely usable. Regarding motor faults, MCSA is used to detect both mechanical and electrical faults, whereas VSA relies on acceleration measurements of displacement to find the fault.
In terms of fault-detection performance, Corne et al. [15] claim that MCSA cannot distinguish between bearings at the drive end and the non-drive end if they have the same dimensions. However, they suggest that the magnitudes of the frequency components should be evaluated [15] and that an unstable current sample will spread the magnitudes of the components across the spectrum. Zhang et al. [25] also claim that MCSA is easy to implement and economical. However, in line with the previous paper's claim, the varying stator currents under a bearing fault make it harder to establish a universal detection threshold.

1) FAILURE TYPES
In this section, some of the most common motor and pump faults are discussed in detail. As the induction motor is a key component of a centrifugal pump, the section first introduces the induction motor faults detectable by MCSA. Section III-A1.b then presents centrifugal-pump-specific faults and their detectability using various methods.

a: INDUCTION MOTOR-BASED FAULTS
i) BROKEN ROTOR BARS
Broken rotor bars (BRB) are mostly an induction motor mechanical fault with several severity levels: a partial BRB, or one or more BRBs. A BRB starts as a partially cracked bar, which affects the physical magnitudes and makes prediction difficult [26]. The fault can be physically simulated by drilling a hole through the full depth of the bar [27].
BRBs can be detected with several techniques such as MCSA and VSA, or even temperature measurement. An image of a broken rotor bar fault from [28] can be seen in Figure 3.
FIGURE 3. Broken rotor bar faults cause speed oscillations in the rotor, leading to premature wear of bearings and other components [28]. The condition can be detected using MCSA and can be artificially generated by drilling holes in rotor bars.

ii) BEARING
The bearing fault is another crucial mechanical fault in motors. According to [12], bearing faults constitute 44% of induction motor failures. The fault is caused by lack of lubrication, mechanical stresses on the bearing's balls, misalignment, corrosion, a damaged inner/outer race and more [12], which cause load irregularities in the magnetic field and hence change the mutual and self-inductance [29]. An image of bearing damage is presented in Fig. 4. In the presence of a bearing fault, the rolling elements (balls) pass over the defect area periodically, producing impulses with a certain frequency, which can be detected by MCSA [29]. The condition can be artificially generated by drilling holes of various diameters.
A bearing fault can be detected by VSA and MCSA [12] and has at least three types: outer-race fault (ORF) [12], inner-race fault and ball defect [4]. The continuous wavelet transform (2D wavelet scalogram), along with relative wavelet energy, can also be used with MCSA to detect ORF in ball bearings [29].
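To make the "impulses with a certain frequency" concrete, the following sketch computes outer- and inner-race defect frequencies from textbook bearing kinematics and maps them into the stator current spectrum, assuming components at |f_s ± k f_defect|. Both relations, the function names and the bearing geometry are assumptions chosen for illustration; they are not reported in [12] or [29].

```python
import math


def bearing_defect_frequencies(shaft_hz, n_balls, ball_d, pitch_d, contact_deg=0.0):
    """Return (outer_race_hz, inner_race_hz) from textbook bearing kinematics."""
    ratio = (ball_d / pitch_d) * math.cos(math.radians(contact_deg))
    outer = 0.5 * n_balls * shaft_hz * (1.0 - ratio)
    inner = 0.5 * n_balls * shaft_hz * (1.0 + ratio)
    return outer, inner


def current_fault_frequencies(supply_hz, defect_hz, orders=(1, 2)):
    """Stator-current components, assumed at |f_s +/- k * f_defect|."""
    return sorted(
        abs(supply_hz + sign * k * defect_hz) for k in orders for sign in (-1, 1)
    )


# Hypothetical geometry: 9 balls, 7.94 mm ball, 39.04 mm pitch diameter, 24 Hz shaft.
orf_hz, irf_hz = bearing_defect_frequencies(24.0, 9, 7.94, 39.04)
print("candidate ORF components (Hz):", current_fault_frequencies(50.0, orf_hz))
print("candidate IRF components (Hz):", current_fault_frequencies(50.0, irf_hz))
```

In practice the geometry comes from the bearing datasheet, and the computed frequencies only indicate where to look in the spectrum; they do not by themselves confirm a fault.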

iii) STATOR WINDING
Stator winding (SW) fault is an IM-specific mechanical fault that accounts for 38% of IM failures [30]. The main cause of stator faults is degradation of the insulation, which is followed by inter-turn short circuits [31]. More importantly, a developing fault can destroy the motor if it is not fixed in time. The effect of an SW fault on MCSA signals is that the asymmetric SW causes spatial harmonics that vary at a single frequency [30]. This effect can be seen clearly in Figure 5. There are many ways to detect an SW fault, such as using fuzzy logic with motor current signature data to detect specific components [30].

FIGURE 5. Phase currents of a healthy motor (top). A stator winding fault produces unbalance in the motor currents (bottom) [30].
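The unbalance in the phase currents can be quantified with a simple index. The sketch below uses the maximum deviation of the per-phase RMS values from their mean, expressed as a percentage; this particular index, the helper names and the synthetic currents are assumptions for illustration and are not the fuzzy-logic method of [30].

```python
import math


def rms(samples):
    """Root-mean-square value of a list of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def current_unbalance_percent(ia, ib, ic):
    """Maximum deviation of the phase RMS values from their mean, in percent."""
    levels = [rms(ia), rms(ib), rms(ic)]
    mean = sum(levels) / 3.0
    return 100.0 * max(abs(v - mean) for v in levels) / mean


def three_phase(amplitudes, n=200, cycles=2):
    """Synthetic phase currents with the given peak amplitudes (hypothetical data)."""
    shifts = (0.0, -2.0 * math.pi / 3.0, 2.0 * math.pi / 3.0)
    return [
        [a * math.sin(2.0 * math.pi * cycles * k / n + s) for k in range(n)]
        for a, s in zip(amplitudes, shifts)
    ]


healthy = three_phase((10.0, 10.0, 10.0))
faulty = three_phase((10.0, 10.0, 13.0))  # one phase draws more current
print("healthy unbalance %:", round(current_unbalance_percent(*healthy), 2))
print("faulty unbalance %:", round(current_unbalance_percent(*faulty), 2))
```

A threshold on such an index would have to be tuned per installation, since supply voltage asymmetry also produces current unbalance.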

b: CENTRIFUGAL PUMP-BASED FAULTS
Centrifugal pump faults can occur on their own or can cause one another. They can be categorized into two types: mechanical or fluid-flow-induced faults [32].

i) IMPELLER
Impeller fault is a centrifugal-pump-specific fault that occurs at the impeller blades. Tian et al. [17] show that defects on the impeller due to inevitable cavitation and erosion cause changes in both static and dynamic torque, which can be sensed through the current. The fault can be artificially generated by removing a portion of the metal [18] from the impeller. Fig. 7 shows healthy and faulty impellers [17].
Furthermore, there are several sub-faults within the impeller fault domain, such as inlet tip and exit tip faults [17]. These sub-faults can cause a decrease in amplitude at the blade pass frequency in MCSA [33]. The impeller's imbalance can also cause mechanical faults or fluid-flow hydraulic faults [18] on top of the sub-faults. These faults can be detected by popular methods like VSA and MCSA [17] and by less common methods like DQ patterns [19] and discriminant feature extraction [18].

ii) CLOGGED IMPELLER
Another type of fault that occurs in the impeller area and is well detected by MCSA is the clogged impeller fault. It is caused by pump impellers being filled with external matter such as polystyrene (see Figure 6), which decreases the flow rate. This type of fault is also mentioned in Table 1. A clogged impeller reduces the effective (RMS) value of the motor current and the efficiency of the pump: when a clogged impeller fault occurs, efficiency is reduced by 9 to 15% [34]. Its effects in the frequency domain are most prevalent when three and four of the seven channels are clogged (in other words, half-sided clogging). The amplitude at the fault frequencies increases considerably at the 5th harmonic (791.9 Hz and 875.3 Hz) for a 10 kHz sampling rate and a 30-second duration. This phenomenon is more visible at higher speeds such as 1800 rpm and 2500 rpm [34].
The equipment used by [34] includes pressure sensors (IFM PU5413), data acquisition devices (NI USB-6363) and more in each pipe to measure the differential pressure and the pressure sensor signal. An electronic pressure switch (WIKA PSD-30) and a temperature switch (WIKA TSD-30) are used for life-accelerated tests.
FIGURE 6. Clogged impeller reduces the pump efficiency. The condition can be artificially generated by using polystyrene to clog impellers [34].

iii) BLOCKAGE
Blockage fault means a blocked pipe and is one of the main reasons for pump breakdown. The fault is created by a closed or modulated hand valve of the pump [19], [20].
For hydraulic pumps, blockage of the outlet reduces the hydraulic load, so less liquid is pumped and eventually less current is needed [21]. If a pump is blocked and the motor stops, no current is drawn, so no MCSA data can be collected. However, if the blockage is beyond the area of the pump, MCSA signals can still be collected and the fault can be revealed.
Blockage faults can be detected in many ways beyond simple MCSA with deep learning or VSA. Given that a blockage can apply pressure on the pump, pressure-signal-based deep learning techniques can fully detect the fault [20]. Additionally, a fuzzy-logic-based detection system or Park-transformation-based DQ pattern plotting with the help of MCSA can also be used to detect this fault [19], [21].

iv) CAVITATION
Cavitation occurs when the pump's absolute static pressure falls below the saturated vapour pressure of the fluid, causing vaporization [2]. Because of the pressure change involved, papers on blockage-based fault detection overlap with this fault too [20].
The five major causes of cavitation fault are: failure of the pump housing, destruction of the impeller, excessive vibration, higher than necessary power consumption, and decreased flow and pressure. There are five types of cavitation: vaporization, turbulence, vane syndrome, internal recirculation and air aspiration cavitation [8]. According to [35], if the pump runs under cavitation for a long period of time, the cavitation also creates unsteady flow that causes failure of internal surfaces such as the volute, bearing, shaft and seal.
As most centrifugal pumps are driven by induction motors, which reflect all dynamic information in the stator current signal or transient power signal, MCSA can be used to detect this fault [2]. In addition, Luo et al. [13] state that the stator current spectrum is composed of the fundamental frequency, harmonics and noise. Therefore, it can be inferred that these components can be captured when MCSA is used for centrifugal pump fault detection.
Finally, Table 2 summarises the distribution of various faults. It can be seen that bearing faults and stator winding faults are the two crucial faults that occur frequently and affect pump operations.

IV. METHODS
In this section, the MCSA-based literature on fault detection is presented. The authors first briefly explain the selection criteria for papers and the justification for including IM papers. The main selection criterion was a prioritized keyword search, carried out using the keyword ''MCSA'' followed by ''fault detection'', ''ML'' and then ''CP''. Among the papers found on IEEE, ScienceDirect and other sources, the authors tried to select the most relevant and recent (e.g. 2020, 2021 and 2022) papers. During this search the authors observed that vibration-based, non-ML and IM-based papers vastly outnumbered the papers being sought. In order to make the ML survey more exhaustive and to increase the number of faults detectable with MCSA and ML, the authors also included papers on IM faults as long as they satisfy the MCSA and ML must-include criteria. Given that the IM is a component of the CP, a fault in one will eventually affect the other's performance and the overall output.
TABLE 2. Distribution of induction motor faults [36].

A. FAULT DEVELOPMENT MODELS
Before discussing any ML or non-ML based solutions for fault detection, the authors would like to highlight the notable mathematical modelling of centrifugal pump head degradation over time by Ofuchi et al. [37]. They used electric submersible pumps (ESPs) to investigate degradation under highly viscous flows, hypothesizing that ESPs degrade more under such flows because they are designed for water-based operation. Their aim is to propose a model to estimate the head and flow rate degradation of a centrifugal pump operating over a broad range of Reynolds numbers. The reported working conditions of the pump are rotation speeds up to 3500 rpm and kinematic viscosity up to $822 \times 10^{-6}\ \mathrm{m^2/s}$. The data were collected from two mixed-flow type electric submersible pumps and one radial type pump. Ofuchi et al. use polynomial models to estimate the pump head degradation curve under viscous operation. In the end, they compare their method against well-known engineering standards such as that of the Hydraulic Institute (HI). Their results from the three pumps cover different rotating speeds and fluid viscosities. Industrial standards like HI and KSB result in similar curves, but they generally underestimate performance degradation, whereas the model of [37] is better at estimating head versus flow rate curves under moderate to high viscosity.

B. TRANSFORMATIONS
Transformations are crucial components of pump/motor fault detection, as they help extract the relevant features for the main technique. Examples used for feature extraction in MCSA include the Fourier transform, the wavelet transform and the Park transform.

1) FOURIER TRANSFORM
The fast Fourier transform (FFT) is a computational method used to compute the discrete Fourier transform (DFT) of a time series (e.g. signals). The DFT is mainly used in digital spectral analysis, filter simulation and more. The efficiency of the FFT comes from iteratively calculating the coefficients of the DFT [38]. In the signal processing domain, the Fourier transform (FT) decomposes a signal into frequency components. The formula for the Fourier transform is
$$\hat{x}(\zeta) = \int_{-\infty}^{\infty} x(t)\, e^{-j 2\pi \zeta t}\, dt$$
where $x(t) \in \mathbb{C}$, $t$ is time and $\zeta$ is frequency [39].
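The decomposition of a current signal into frequency components can be sketched with a direct DFT sum. The example uses only the Python standard library and a synthetic signal; the 400 Hz sampling rate, the 50 Hz fundamental and the two small sidebands at 46 Hz and 54 Hz are hypothetical values chosen so that each DFT bin corresponds to 1 Hz. An FFT routine would give the same amplitudes far more efficiently.

```python
import cmath
import math


def dft_amplitudes(x):
    """Single-sided amplitude spectrum via the direct DFT sum (O(n^2), small n only)."""
    n = len(x)
    amps = []
    for k in range(n // 2 + 1):
        acc = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        amps.append(2.0 * abs(acc) / n)  # factor 2 holds for bins between DC and Nyquist
    return amps


# Hypothetical 1 s record sampled at 400 Hz, so bin k corresponds to k Hz.
fs, n = 400, 400
signal = [
    math.sin(2 * math.pi * 50 * t / fs)
    + 0.05 * math.sin(2 * math.pi * 46 * t / fs)
    + 0.05 * math.sin(2 * math.pi * 54 * t / fs)
    for t in range(n)
]
spectrum = dft_amplitudes(signal)
for f in (46, 50, 54):
    print(f, "Hz amplitude:", round(spectrum[f], 3))
```

The sidebands are 26 dB below the fundamental here, which illustrates why fault components are usually inspected on a logarithmic scale.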
There are several ways to observe a motor fault with the FFT, such as the short-time Fourier transform (FFT over time) and frequency over samples. Both are ways to visualize a fault developing over time. Although the FFT makes it easy to inspect the frequency domain, in some cases it suffers from spectral leakage due to MCSA's limitations (e.g. a machine operating at low slip) [40].
Many studies, such as [41], utilize the FFT to detect faults in 60 Hz IMs, using it as a spectral estimation method. Romero-Troncoso [41] acknowledges the spectral leakage problem and addresses it by applying multi-rate fractional resampling and a combination of interpolation and decimation to the stator current signal at 8000 and 8192 Hz. The study investigates BRB, which can be accompanied by other mechanical failures such as unbalance and misalignment. The data acquisition devices are a Hall-effect sensor (model L08P050D15) and a 16-bit four-channel serial-output sampling analogue-to-digital converter (ADS8341) to investigate an IM with BRB at different loads, and a PicoScope 4262 to investigate an IM at low loads. Hence, the experiments are conducted on real induction motors. A further benefit of this FFT-based approach is its usability in the analysis of power quality [41].
Another robust and simple FT-based method used for induction motors is the windowed FT. In addition, quadratic time-frequency analysis is also an efficient method alongside the windowed FT owing to its independence from window sizes and types [42].
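A windowed (short-time) FT can be sketched as a DFT applied to successive Hann-windowed frames. The code below is a standard-library illustration under stated assumptions: the signal is synthetic, a hypothetical 70 Hz component appears halfway through the record to mimic a developing fault, and the frame length of 80 samples at 400 Hz gives a 5 Hz bin spacing.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0..n/2 of one frame (direct sum)."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def stft(x, win_len, hop):
    """List of magnitude spectra, one per Hann-windowed frame."""
    w = hann(win_len)
    frames = []
    for start in range(0, len(x) - win_len + 1, hop):
        segment = [x[start + i] * w[i] for i in range(win_len)]
        frames.append(magnitude_spectrum(segment))
    return frames


# Hypothetical record: 50 Hz fundamental, plus a 70 Hz component in the second half.
fs, n = 400, 400
x = [
    math.sin(2 * math.pi * 50 * t / fs)
    + (0.3 * math.sin(2 * math.pi * 70 * t / fs) if t >= n // 2 else 0.0)
    for t in range(n)
]
frames = stft(x, win_len=80, hop=40)
bin_70hz = 70 * 80 // fs  # 5 Hz per bin, so 70 Hz falls in bin 14
print([round(frame[bin_70hz], 2) for frame in frames])
```

The printed sequence grows from about zero to a steady value as the frames move into the second half of the record. Shorter frames localize the onset better in time but widen the frequency bins, which is the usual trade-off of the windowed FT.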

2) WAVELET TRANSFORMS
The discrete wavelet transform (DWT) can quickly filter out the bands of interest [42] and is quick to implement [28]. The signal is decomposed into multiple resolutions by filters with different cut-off frequencies. The process involves selecting the mother wavelet, such as the Daubechies-40, and the number of decomposition levels [28]. Furthermore, it is a time-frequency transformation tool whose window size varies with frequency [43]. The DWT is defined as in [43].

3) HILBERT TRANSFORMATION
The Hilbert transformation is used for signal demodulation and is applied in [27]. When the Hilbert transformation is applied to a fault signature, the instantaneous frequency and amplitude can be extracted. Therefore, the modulation of the signal induced by a faulty component can be shown. The Hilbert transform of a signal $x(t)$ is defined as:
$$\mathcal{H}[x(t)] = \frac{1}{\pi}\,\mathrm{p.v.}\!\int_{-\infty}^{\infty} \frac{x(\tau)}{t - \tau}\, d\tau$$
where p.v. denotes the Cauchy principal value. Essentially, the three-phase motor current signatures are passed to the Hilbert transform to obtain $A_a$, $A_b$, $A_c$, $\varphi_a$, $\varphi_b$, $\varphi_c$, where $A$ is the amplitude modulation and $\varphi$ is the phase modulation for phases a, b and c, respectively.
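The demodulation step can be illustrated with a discrete analytic signal, built by zeroing the negative-frequency half of the DFT. This is a minimal sketch for a single phase, using only the standard library; the 50 Hz carrier, the 5 Hz amplitude modulation with depth 0.2 and the function names are hypothetical, and the procedure in [27] may differ in detail.

```python
import cmath
import math


def dft(x, inverse=False):
    """Direct DFT or inverse DFT (O(n^2), adequate for short records)."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = [
        sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    return [v / n for v in out] if inverse else out


def analytic_signal(x):
    """x + j*Hilbert(x), obtained by suppressing negative frequencies."""
    n = len(x)
    spectrum = dft(x)
    for k in range(n):
        if 0 < k < (n + 1) // 2:
            spectrum[k] *= 2.0   # positive frequencies are doubled
        elif k > n // 2:
            spectrum[k] = 0.0    # negative frequencies are removed
    return dft(spectrum, inverse=True)


# Hypothetical phase current: 50 Hz carrier with a 5 Hz amplitude modulation.
fs, n = 1000, 200
x = [
    (1.0 + 0.2 * math.cos(2 * math.pi * 5 * t / fs)) * math.cos(2 * math.pi * 50 * t / fs)
    for t in range(n)
]
z = analytic_signal(x)
envelope = [abs(v) for v in z]        # amplitude modulation A(t)
phase = [cmath.phase(v) for v in z]   # instantaneous phase (wrapped to [-pi, pi])
print(round(min(envelope), 3), round(max(envelope), 3))
```

The recovered envelope oscillates between 0.8 and 1.2, i.e. the modulation introduced in the synthetic signal. A spectrum of the envelope then shows the modulating frequency directly, without the dominant fundamental.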

4) PARK TRANSFORMATION
The Park transform converts a three-phase system into a two-phase system in order to describe three-phase IM phenomena with Park's vector [27].
The space vector is $\bar{x} = x_d + j x_q$, where $x_d$ and $x_q$ are Park's vector components, which are formed from weighted three-phase components and their differences [27].
In some solutions, the Hilbert and Park transformations are combined and compared against the FT in order to detect a broken rotor bar, unbalanced voltage, air-gap eccentricity and an outer raceway ball bearing defect [27]. The Park transformation is also generally used on its own in the literature to detect stator winding faults. After the Park transformation, the DQ pattern of a healthy motor has a circular shape; when a fault is introduced, the shape becomes elliptic [44].
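The circular versus elliptic DQ pattern can be reproduced numerically. The sketch below assumes the widely used Park's vector form $x_d = \sqrt{2/3}\,x_a - x_b/\sqrt{6} - x_c/\sqrt{6}$ and $x_q = (x_b - x_c)/\sqrt{2}$; these weights, the function names and the synthetic currents are assumptions for illustration and should be checked against [27] and [44] before reuse.

```python
import math


def park_components(ia, ib, ic):
    """Park's vector components (d, q) of one three-phase sample."""
    d = math.sqrt(2.0 / 3.0) * ia - ib / math.sqrt(6.0) - ic / math.sqrt(6.0)
    q = (ib - ic) / math.sqrt(2.0)
    return d, q


def dq_pattern(amplitudes, n=360):
    """DQ points over one electrical cycle for the given phase amplitudes."""
    shifts = (0.0, -2.0 * math.pi / 3.0, 2.0 * math.pi / 3.0)
    points = []
    for k in range(n):
        theta = 2.0 * math.pi * k / n
        ia, ib, ic = (a * math.cos(theta + s) for a, s in zip(amplitudes, shifts))
        points.append(park_components(ia, ib, ic))
    return points


def modulus_spread(points):
    """Difference between the largest and smallest Park's vector modulus."""
    moduli = [math.hypot(d, q) for d, q in points]
    return max(moduli) - min(moduli)


healthy = dq_pattern((1.0, 1.0, 1.0))   # balanced currents: circle
faulty = dq_pattern((1.0, 1.0, 0.7))    # hypothetical unbalance: ellipse
print("healthy spread:", round(modulus_spread(healthy), 6))
print("faulty spread:", round(modulus_spread(faulty), 6))
```

A spread of zero corresponds to a perfect circle; a nonzero spread measures how far the pattern has been flattened into an ellipse, which makes it a convenient scalar feature for a classifier.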
That said, the initial studies on the Park transformation concerned pattern recognition based on the Park's vector form of the current [42].
More recent solutions based on the Park transformation also address several limitations (e.g. load dependency and sensitivity to transients) of converter diagnosis in permanent magnet synchronous motors/generators. The method applies the Park transform to the three phases and then takes the vector modulus. Normalization is done by dividing each phase current by the Park's vector modulus [45].

C. NON-MACHINE LEARNING METHODS
There are several solutions that do not use ML at all, despite utilizing MCSA. One interesting solution, which uses a three-phase, 1.5 hp, 3450 rpm, 60 Hz centrifugal pump, proposes an electrical diagnostic technique for fault analysis without extra sensors [19]. Irfan et al. [19] measure the motor's three-phase line current and voltage and transform the three-phase current into two-phase DQ patterns. 1000 samples at a 4000 Hz sampling rate were collected from three-phase stator current sensors using a PXIe-1082 data acquisition module [19]. After the DQ pattern plot is generated, fault detection is done by pattern classification based on statistical indices. The method relies on the shape of the pattern: whereas a healthy pump has a hexagonal pattern, pumps with an impeller fault or blockage have a distorted (fan-shaped circular) pattern. The authors' earlier papers also cover bearing faults, winding damage and eccentricity [19].
MCSA is also used to detect faults such as impeller clogging in the signal domain. In a paper that focuses on impeller clogging [34], Becker et al. observed that the amplitudes of four particular frequencies increase. Their experimental setup uses MCSA to characterize the healthy motor and different clogging levels. They record current, voltage, flow ($17.9\ \mathrm{m^3/h}$) and head (6.6 m) at a high sampling rate (10 kHz, 30 seconds). The power analyzer is a Yokogawa PX8000 for one-phase input values. The fundamental frequency of their pump is 166.7 Hz [34]. They find that the power consumption decreases and the amplitude increases with increasing clogging and blockage, but the amplitude decreases with an increasing number of clogged channels. They observe that a higher number of clogged channels (e.g. 4) better represents the characteristics. In addition, higher speed levels (e.g. 1800 rpm and above) also bring out the differences between minor faults and the healthy condition. Although the paper did not create an automatic fault detection system, its focus on MCSA confirmed other papers' hypotheses and defined the characteristics of the faults' effects on MCSA [34]. The paper also discussed the limitations of the fault detection frequency, which its authors call the blade pass frequency. They hypothesize that the amplitude of the blade pass frequency is affected by the severity of impeller clogging. However, this frequency is not affected when the pump is used as a circulation pump.
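A short arithmetic check relates the numbers reported for [34]. Assuming, hypothetically, that the 166.7 Hz fundamental corresponds to the 2500 rpm operating point, the two fault frequencies quoted earlier (791.9 Hz and 875.3 Hz) lie close to the 5th harmonic of the fundamental plus or minus the shaft rotation frequency. This pairing and the sideband interpretation are assumptions made here for illustration and are not claims taken from [34].

```python
fundamental_hz = 166.7   # supply fundamental reported for [34]
speed_rpm = 2500.0       # one of the reported speeds; pairing it with 166.7 Hz is an assumption

rotation_hz = speed_rpm / 60.0          # shaft rotation frequency
fifth_harmonic_hz = 5 * fundamental_hz  # 5th harmonic of the fundamental

lower_hz = fifth_harmonic_hz - rotation_hz
upper_hz = fifth_harmonic_hz + rotation_hz

print(f"5th harmonic: {fifth_harmonic_hz:.1f} Hz")
print(f"candidate sidebands: {lower_hz:.1f} Hz and {upper_hz:.1f} Hz")
```

The computed values fall within a fraction of a hertz of the reported ones; the small residual would be consistent with rounding of the fundamental or a slightly different actual speed.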

D. MACHINE LEARNING METHODS
1) NON-DEEP LEARNING
In this section, classic machine learning techniques used in motor fault detection will be explained.

a: SUPPORT VECTOR MACHINES
A support vector machine (SVM) is a supervised learning model used in classification tasks. In the literature, it falls under convex quadratic optimization problems and statistical learning theory [18], [46]. An SVM maps the non-linear training data to a higher-dimensional space (a.k.a. feature space) with a transformation [46]. SVMs are used mainly in detection problems with both vibration and MCSA-based data.
Centrifugal Motor Faults Oriented Solutions: The authors of [32] combine MCSA and VSA in their investigation. They use line-current probes and accelerometers to collect time-domain data and convert it to the power spectrum. They compare candidate features and choose the most suitable ones: mean, standard deviation and 1/standard deviation [32]. They then train and test a multi-support vector machine (MSVM). The authors use a 30 Hz centrifugal pump and consider 33 faults. They find that every fault alters the flow patterns with a unique effect on the signatures. Therefore, with the MSVM solution, they aim to classify isolated and/or combined faults (e.g. interdependence of mechanical and hydraulic CP faults), find faults of various severities (like suction and discharge blockages), compare high and low frequency resolutions, and classify 33 critical centrifugal pump faults [32]. They collect 2000 samples at both 20 kHz and 5 kHz sampling rates and vary the test-to-train ratio. The authors also train with different pump speed pairs (30-40, 40-50, etc.) to test intermediate speeds as an alternative in case no fault data are available for a specific speed. Finally, the test classification accuracy obtained for same-speed training/testing is 83.2%, which gets worse when training and testing speeds differ and improves when a higher resolution is used [32].
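The three features named above are simple to compute once a power spectrum is available. The sketch below is illustrative only: the function name is hypothetical, the input is a made-up list of spectral values, and the use of the population standard deviation is an assumption, since [32] is not quoted here on that detail.

```python
import statistics


def spectrum_features(power_spectrum):
    """Feature vector [mean, standard deviation, 1 / standard deviation]."""
    mean = statistics.fmean(power_spectrum)
    std = statistics.pstdev(power_spectrum)  # population form is an assumption
    inv_std = 1.0 / std if std > 0 else float("inf")
    return [mean, std, inv_std]


# Hypothetical power-spectrum values for one recording.
features = spectrum_features([1.0, 2.0, 3.0, 4.0, 5.0])
print([round(v, 4) for v in features])
```

One such vector per recording (and per sensor, when current and vibration are combined) would form a row of the training matrix passed to the classifier.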
Induction-motor Faults Oriented Solutions: The authors of the IM-fault paper [47] use multi-class SVMs for rotor fault diagnosis of cage induction machines with five conditions: healthy, broken bar, broken end-ring, static eccentricity (EC) and dynamic EC. The test rig is a 1.5 kW, 50 Hz, 220 V, one-pole-pair cage induction machine. The paper first uses the FFT to derive the frequency spectra of the stator current signal and extract other features. The sampling frequency is 8192 Hz, and the tests are repeated 30 times with a 100:50 training-to-test split of the dataset. The authors compared the majority voting method, binary tree decision method, neural network method and hybrid matrix method to decide the best way to combine the results of the sub-classifiers. They found that (with wavelet-transformed data) the neural network synthesizing scheme gives the best result (97.38%) but performs worse with random hidden neurons [47]. Therefore, the mixture matrix synthesizing scheme is favoured, with 97.32% accuracy and a lower time cost. In the end, the mixture matrix/SVM approach needs significantly less training time at the cost of a very slightly lower accuracy [47].
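Of the combination schemes listed, majority voting is the simplest to sketch. The example assumes a one-vs-one arrangement in which each binary sub-classifier names the winner for one pair of conditions; this arrangement, the class identifiers and the made-up decisions are assumptions for illustration, and the exact scheme used in [47] may differ.

```python
from collections import Counter
from itertools import combinations

CLASSES = ["healthy", "broken_bar", "broken_end_ring", "static_ec", "dynamic_ec"]


def majority_vote(pairwise_decisions):
    """Return the class named most often by the pairwise sub-classifiers."""
    votes = Counter(pairwise_decisions.values())
    return votes.most_common(1)[0][0]


def demo_decisions(true_class):
    """Hypothetical outputs: any pair containing true_class picks it, others pick the first."""
    return {
        (a, b): true_class if true_class in (a, b) else a
        for a, b in combinations(CLASSES, 2)
    }


decisions = demo_decisions("broken_bar")
print(len(decisions), "sub-classifiers vote for:", majority_vote(decisions))
```

With five conditions there are ten pairwise sub-classifiers, and the correct class can collect at most four votes, so a few wrong sub-classifiers can already create ties. That sensitivity is one motivation for the learned combination schemes compared in [47].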
In another study, Toma and Kim [16] use MCSA IM data from a university dataset. Their dataset consists of two current signals with a 180-degree phase difference and has 17 different combinations of metadata, including but not limited to bearing and inner-ring damage. The dataset was labeled with 3 main labels: healthy bearing, inner ring failure or outer ring failure. To classify 10 features (e.g. mean, median, variance, skewness), random forest (RF), SVM and k-nearest neighbor (K-NN) algorithms were used. The data was partitioned into training:testing ratios of 70:30 and 80:20. The performance of the RF, SVM and K-NN algorithms is compared using precision and recall metrics. Using the GridSearch method, the reported accuracy was 99% for SVM and K-NN and 98% for RF. In addition, the authors observe that SVM performs slightly better than K-NN, with a higher recall (99% against 98%).
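Precision and recall, the metrics used for this comparison, can be computed per class as follows. The labels and predictions below are made-up examples and not data from [16].

```python
def precision_recall(y_true, y_pred, label):
    """Per-class precision and recall from paired true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical test labels and classifier outputs.
y_true = ["healthy", "inner", "outer", "inner", "outer", "healthy"]
y_pred = ["healthy", "inner", "inner", "inner", "outer", "healthy"]
for label in ("healthy", "inner", "outer"):
    p, r = precision_recall(y_true, y_pred, label)
    print(f"{label}: precision={p:.2f} recall={r:.2f}")
```

Recall is the more safety-relevant of the two for fault detection, since a missed fault (false negative) is usually costlier than a false alarm.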
Other types of transformations can be used with MCSA data as well. With the Hilbert-Park transforms, i.e. the Hilbert modulus current space vector (HMCSV) and the Hilbert phase current space vector (HPCSV), faults like broken rotor bars, supply voltage asymmetry, air-gap eccentricity and outer raceway ball bearing defects can be detected with an SVM [27]. The experimental setup is a three-phase, 50 Hz, four-pole, 1.1 kW induction machine with 28 rotor bars, sampled at 10 kHz. The data acquisition is done in MATLAB. The authors create a pipeline that consists of obtaining the HPCSV spectral components, then training and testing a Gaussian-kernel SVM (one for each fault, arranged in a tree) [27]. A ratio of 90 to 60 samples for the training and test datasets was used to reach 95% accuracy. The authors find that the HPCSV shows harmonics better than the HMCSV [27].

b: MULTI-LAYER PERCEPTRON
The multi-layer perceptron (MLP), a class of artificial neural networks (ANN), is modelled on the way a brain performs a task or a function. The MLP is an interconnection of simple computational cells named ''neurons''. Each neuron has three core elements: input connections with ''weights'', a ''sum'' function to gather the results, and an activation function [50]. The MLP/ANN uses a feed-forward neural network architecture, and its neuron weights are updated (i.e. trained) by the backpropagation algorithm [11]. An MLP is illustrated in Fig. 8.
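The three elements of a neuron (weighted inputs, a sum and an activation) and the feed-forward pass can be sketched in a few lines. The sigmoid activation, the layer sizes and all weight values below are hypothetical choices for illustration; training by backpropagation is not shown.

```python
import math


def sigmoid(z):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias):
    """Weighted sum of the inputs followed by the activation."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def layer(inputs, weight_rows, biases):
    """One fully connected layer: one neuron per row of weights."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]


def mlp_forward(x, hidden_w, hidden_b, out_w, out_b):
    """Feed-forward pass through one hidden layer and one output layer."""
    return layer(layer(x, hidden_w, hidden_b), out_w, out_b)


# Hypothetical network: 2 input features, 3 hidden neurons, 2 outputs.
hidden_w = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]
hidden_b = [0.0, 0.1, -0.1]
out_w = [[0.7, -0.5, 0.2], [-0.6, 0.9, 0.3]]
out_b = [0.05, -0.05]
print(mlp_forward([1.2, 0.4], hidden_w, hidden_b, out_w, out_b))
```

Each output lies between 0 and 1 and can be read as a score for one condition; backpropagation adjusts the weights so that the score of the correct condition moves towards 1.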

i) CENTRIFUGAL MOTOR FAULTS ORIENTED SOLUTIONS
In [11], the authors combine MLP with PCA (developed in MATLAB) for fault diagnosis and achieve 100% detection accuracy after 170 epochs and 6.76 s. However, the authors rely on 600 simulated centrifugal pump samples with a 1:4 ratio of non-faulty to faulty data. The training to testing dataset ratio is 3:1. The MLP without PCA still reaches 99.3% accuracy in 81.81 s, which is still a very good result [11]. They consider 20 faults from the centrifugal pump system, including, but not limited to, shaft wear, wrong impeller, suction leak, selfbearing rotation and oil seal leakage. The paper uses PCA to preprocess the data and extract 11 relevant features, such as voltage, current and speed [11].

ii) INDUCTION-MOTOR FAULTS ORIENTED SOLUTIONS
Unlike the previous MLP papers, the authors of [51] have a real test rig with a three-phase, 1 hp (0.75 kW) AC induction motor. The data is sampled at 100 Hz to capture 4000 samples.
The MLP model has one hidden layer with 46 neurons and three output classes (healthy, inner race fault and outer race fault). The loss is calculated with the MSE and the correlation factor [51]. The paper identifies that the current spectrum increases as the applied load increases, and that the RMS and kurtosis features provide a good indication of the bearing's state. If an ORF is present, the amplitude change is higher. The test dataset consists of unseen data totalling 360 sets, with 120 per condition [51].
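The two quality measures mentioned above can be sketched in a few lines. The exact definition of the correlation factor in [51] is not given here, so the Pearson correlation coefficient is used as an assumption; the function names and the example vectors are hypothetical.

```python
import math


def mean_squared_error(targets, outputs):
    """Average of the squared differences between targets and network outputs."""
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical one-hot target (healthy, inner race, outer race) and MLP output.
target = [1.0, 0.0, 0.0]
prediction = [0.8, 0.1, 0.1]
mse = mean_squared_error(target, prediction)
r = pearson_r(target, prediction)
```

A low MSE together with a correlation close to 1 indicates that the network outputs follow the targets closely.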
c: RANDOM FOREST
Random forest (RF) is an ensemble classifier made of a collection of tree-structured classifiers. RF uses bootstrap sampling to draw k sample sets from the training dataset, creates k decision tree models based on these samples and obtains k classification results. The k classifiers then vote for the final decision [52].
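The bootstrap-and-vote principle can be sketched as follows. This is a deliberately reduced illustration, not a full RF implementation: each tree is a depth-1 tree on a single feature, no random feature selection is performed, and the training values (imagined as an RMS feature) are hypothetical.

```python
import random
from collections import Counter


def majority(labels):
    """Return the most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(samples):
    """Fit a depth-1 tree (one threshold on one feature) to (x, label) pairs."""
    values = sorted({x for x, _ in samples})
    # Candidate thresholds: midpoints between neighbouring feature values.
    thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])] or values
    best = None
    for t in thresholds:
        left = [y for x, y in samples if x <= t]
        right = [y for x, y in samples if x > t]
        left_label = majority(left)
        right_label = majority(right) if right else left_label
        errors = sum(y != left_label for y in left)
        errors += sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, t, left_label, right_label)
    return best[1:]


def fit_forest(samples, k, seed=0):
    """Train k stumps, each on its own bootstrap sample of the training data."""
    rng = random.Random(seed)
    forest = []
    for _ in range(k):
        bootstrap = [rng.choice(samples) for _ in samples]  # with replacement
        forest.append(fit_stump(bootstrap))
    return forest


def predict(forest, x):
    """Let every stump vote and return the majority class."""
    votes = Counter(
        left if x <= threshold else right for threshold, left, right in forest
    )
    return votes.most_common(1)[0][0]


# Hypothetical training data: (feature value, condition).
train = [
    (1.0, "healthy"), (1.1, "healthy"), (1.2, "healthy"), (1.3, "healthy"),
    (2.0, "faulty"), (2.1, "faulty"), (2.2, "faulty"), (2.3, "faulty"),
]
forest = fit_forest(train, k=25, seed=0)
```

Using an odd k avoids tied votes in a two-class problem. Individual stumps trained on unlucky bootstrap samples may be poor, but the vote over all k models is what determines the final class.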

i) INDUCTION-MOTOR FAULTS ORIENTED SOLUTIONS
The remaining papers that use RFs for fault detection utilize vibration data [52].
2) DEEP LEARNING
a: CONVOLUTIONAL NEURAL NETWORK
Convolutional neural networks (CNN) are another type of feed-forward neural network (like MLP) that can automatically extract features with convolutional methods [53]. CNN is a deep learning method that uses signals or images as inputs. The whole network is built from several convolutional layers that compute the dot product between the input image and a set of convolutional filters [26], as in Fig. 9.
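The dot product between an input image and a convolutional filter can be sketched as below. This is a minimal illustration of a single filter without padding, stride or learned weights; the function name, the tiny image and the hand-written edge filter are hypothetical.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image and take the dot product at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        output.append(row)
    return output


# Hypothetical 3x4 image with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-written filter that responds to a dark-to-bright step from left to right.
edge_kernel = [
    [-1, 1],
    [-1, 1],
]
feature_map = conv2d_valid(image, edge_kernel)
```

The feature map is non-zero only where the filter overlaps the edge. In a CNN the filter values are learned during training rather than written by hand, and many filters are applied in parallel.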
CNN is also used to detect broken rotor bars in induction motors. Valtierra-Rodriguez et al. [26] use the image classification ability of CNN to detect faults that appear on the short-time Fourier transform (STFT) based time-frequency plane, and apply MCSA to current signals in the transient state. Four induction motor conditions are used: half-broken rotor bar, one broken rotor bar, two broken rotor bars, and a healthy rotor. The paper achieves 100% accuracy in the detection of all classes. The STFT plane shows the change in magnitude of each frequency with respect to time (s) [26]. The test rig has two poles, 28 bars and a nominal power of 1 hp, and is fed with 220 Vac at 60 Hz to obtain 100 current signals for each condition at a 1500 Hz sampling frequency [26].
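How an STFT plane is obtained from a current signal can be sketched with a naive implementation. The code is not the processing chain of [26]: the Hann window, the frame length of 100 samples, the hop of 50 samples and the steady 60 Hz test tone are assumptions for illustration, whereas [26] analyses transient currents. Only the 1500 Hz sampling frequency and the 60 Hz supply follow the setup described above.

```python
import cmath
import math


def stft_magnitude(signal, frame_len, hop):
    """Return one list of DFT magnitudes (bins 0..frame_len//2) per time frame."""
    # Periodic Hann window to reduce spectral leakage at the frame edges.
    window = [
        0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)
    ]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        segment = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            acc = sum(
                segment[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


fs = 1500.0      # sampling frequency in Hz
frame_len = 100  # assumed frame length; bin spacing is fs / frame_len = 15 Hz
hop = 50         # assumed hop size (50% overlap)
signal = [math.sin(2 * math.pi * 60.0 * n / fs) for n in range(300)]

spectrogram = stft_magnitude(signal, frame_len, hop)
peak_bins = [max(range(len(f)), key=f.__getitem__) for f in spectrogram]
peak_freqs = [b * fs / frame_len for b in peak_bins]
```

Each inner list is one column of the time-frequency plane. Stacking the columns yields the image that a CNN can classify; for a transient current the peak frequencies would change from frame to frame instead of staying at the supply frequency.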

b: RECURRENT NEURAL NETWORKS AND LONG SHORT-TERM MEMORY
Given that motor current signatures change over time, the use of recurrent neural networks (RNN) or long short-term memory (LSTM) networks is not uncommon. RNNs and LSTMs are types of deep neural networks that can work with data series of arbitrary length [54]. LSTMs are modified RNNs designed to prevent vanishing or exploding gradients. LSTMs have memory cells that include a ''forget-cell'' to handle long-term dependency problems, along with hyper-parameters such as the hidden states and time steps [54], as shown in Fig. 10. Given that LSTMs and (1D) CNNs detect faults from different perspectives, Khan et al. [55] investigate their fault detection performance separately. The paper uses healthy and ''inter-turn fault in the stator'' conditions for the dataset (3:1 ratio of generated training to testing data). Healthy data is acquired under balanced and imbalanced voltages. As a result, the chosen optimal four-layer LSTM reached 83% accuracy, which is lower than the 99% accuracy of the chosen optimal seven-layer CNN [55].
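The role of the memory cell and its forget mechanism can be sketched with a single scalar LSTM cell. This is a textbook-style simplification under stated assumptions, not the model of [55]: real layers use weight matrices instead of scalars, and the parameter names and values below are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One time step of a scalar LSTM cell with parameters in dictionary p."""
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate value
    c = f * c_prev + i * g   # keep part of the old memory, add new content
    h = o * math.tanh(c)     # hidden state passed to the next time step
    return h, c


def run_lstm(sequence, p):
    """Process a sequence of any length, starting from zero states."""
    h, c = 0.0, 0.0
    for x in sequence:
        h, c = lstm_step(x, h, c, p)
    return h, c


# Hypothetical parameters: all zeros, so every gate is half open.
names = ["wf", "uf", "bf", "wi", "ui", "bi", "wo", "uo", "bo", "wg", "ug", "bg"]
zero_params = {name: 0.0 for name in names}

# Same cell, but with the forget gate biased open and the input gate biased shut.
keep_params = dict(zero_params, bf=10.0, bi=-10.0)

h_half, c_half = lstm_step(0.0, 0.0, 1.0, zero_params)
h_keep, c_keep = lstm_step(0.0, 0.0, 1.0, keep_params)
```

With the forget gate half open, the stored value 1.0 decays to 0.5 in one step, while the second parameter set keeps it almost unchanged. Learning such gate settings is what lets an LSTM retain information across many time steps of a current signal.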

E. SUMMARY AND PERFORMANCE COMPARISON
In this section, research results are compared for induction motors (IM) and CP. Although the authors conducted an exhaustive search for machine learning papers on CP faults that utilize MCSA data, they observed that there are more research papers on IM faults. Besides, the lack of recent journal publications from the last three years (2020-2022) also made it hard to find MCSA-based CP (or even IM) publications that use ML to detect faults. Another observation was that VSA-based research papers for CP were also abundant. Secondly, the authors saw that the most successful methods for classifying CP/IM faults were CNN, RF and MLP, with success rates over 95%. Therefore, with appropriately chosen features, near-perfect accuracies can be obtained by implementing those methods. A further observation concerns the faults most commonly investigated in the surveyed papers. These were BRB, IRF and ORF, unbalanced power/voltage sources and stator winding faults, most of which are IM faults.

V. CHALLENGES
As attractive as MCSA is, just like VSA, it has its own challenges and shortcomings for successful implementation. This section discusses and analyzes the major challenges related to the practical design and development of machine learning algorithms for pump monitoring.

A. DATASETS
Research on machine learning algorithms for fault detection and condition monitoring invariably requires access to a large amount of accurate current data. Virtually all surveyed research papers rely mostly on simulated data generated from an analytical model or collected from a testbed in controlled conditions.
There is a lack of open datasets of pumps working in real conditions that could be used for the design, development, and comparative analysis of various machine learning approaches. Although Case Western Reserve University does provide a bearing fault dataset collected from a 2 hp Reliance Electric motor and distributed in MATLAB format, it unfortunately contains vibration data instead of MCSA data [56]. Given that motors can show either linear degradation in performance or sudden drops [57], it is crucial to have a reliable and correctly annotated dataset and model. Collecting such a dataset can be extremely challenging, as any monitoring system has to operate long enough to capture the degradation over time and the eventual failure. Given the typical centrifugal pump lifetime of 8-15 years (and 15-20 years when well maintained), this would require data collection over many years or scaling the data collection to a large number of pumps. Any resulting dataset would be imbalanced and dominated by healthy signatures, with a very small proportion of faulty data [25]. Invariably, the datasets from such experiments could be limited, and not all faults could be captured and isolated (especially in real time).

B. PREDICTIVE MONITORING
The estimated lifetime of a pump is uncertain due to many aspects and conditions. Many models take a binary approach by detecting whether a motor is faulty or not, potentially recognising the type of fault under controlled conditions, e.g. after drilling a hole in the bearing or introducing an artificial blockage. Detecting multiple faults, which may occur in real pumps, is more complicated, as different faults can be inter-dependent, which complicates the analysis [32]. What is needed is an approach that evaluates the overall health of the pump and detects early signs of faults and changes in fault severity, enabling operators to make informed decisions on when to replace or service the pump.

C. ROBUST DATA COLLECTION
MCSA is a non-invasive method that is much easier to install and operate than vibration-based or pressure-based methods. The installation usually involves putting current clamps around the cables, with the data fed into a microcontroller or a single-board computer. However, deploying and operating a robust data collection system can still be challenging for multiple reasons. The control panels where the MCSA hardware is attached can be in areas with poor or no wireless connectivity. Even if Internet access is available, missing packets or hardware problems can still disrupt the system. On top of that, the captured data can be noisy, making it hard to extract the desired features. Processing this data requires either a relatively high-performance local system or a reliable wireless connection. Besides, processing and monitoring the data in real time requires a system with a specific software installation, which can further increase the cost.

D. PUMP SPECIFIC RESEARCH
Finally, although IMs are part of centrifugal pumps, their faults may differ from pump faults, yet they still affect the overall process. Therefore, any predictive maintenance solution should consider both sides of the problem. Unfortunately, there is little research in the literature with a focus on centrifugal pump faults, especially research that uses deep learning on MCSA. This makes the literature too limited to find and implement relevant papers. Hence, only a handful of research papers could be found whose results can be compared within the same fault categories.

VI. CONCLUSION AND FUTURE WORK
The presented survey attempts to provide a systematic analysis of machine learning-based fault detection for centrifugal pumps. The main objective was to explain the relevant approaches and critically compare the performance of various methods. In particular, the survey explains the benefits of MCSA and compares it with alternatives like VSA. Having described the relevant machine learning methods, data acquisition techniques and metadata, the authors found that CNN- and MLP-based neural network solutions (when paired with a good training algorithm or transformation) perform better than SVM or other solutions (e.g. LSTM). Therefore, the authors believe that such ML developments have great future potential in terms of both prediction accuracy and resource requirements. The survey also highlights some of the practical challenges related to fault detection in centrifugal pumps. These include the lack of public annotated datasets that could be used to develop and compare the performance of various diagnostic algorithms. The authors hope that this work will be useful to other researchers and engineers in developing non-invasive and low-cost predictive maintenance solutions for centrifugal pumps.

FIGURE 1. Diagram of centrifugal pump with its main parts labeled [10].

FIGURE 2. Diagram of simulated (healthy and faulty) MCSA line currents and their FT versions [14]. The effect of broken bars on the sidebands of the winding harmonics is highlighted on a logarithmic scale.

FIGURE 4. In the presence of a bearing fault, the rolling elements (balls) pass over the defect area periodically, producing impulses with a certain frequency, which can be detected by MCSA [29]. The condition can be artificially generated by drilling holes of various diameters.

TABLE 3. A comparative summary of MCSA-based approaches.