A Survey of Cyber-Physical Attacks and Detection Methods in Smart Water Distribution Systems

Modern technologies empower water distribution systems (WDS) for better services in the processes of water supply, storage, distribution, and recycling. They improve real-time monitoring, automating, and managing. However, the limitations of these technologies introduce cyber-physical attacks to the WDS. The main goals of cyber-physical attacks include disrupting normal operations and tampering the critical data, which have negative impacts on the WDS. Therefore, it is vital to develop and implement solutions to increase the security of the WDS by detecting and mitigating cyber-physical attacks. Since security for WDS is relatively new, there are no surveys on this topic despite its vital importance. Therefore, in this paper, we provide a comprehensive survey for the common cyber-physical attacks and common detection mechanisms for the WDS. We compare the attacks and detection methods with emphasis on ideas, methods, evaluation results, advantages, limitations, etc. We further provide a future research direction. We realize that there are still not many research attempts in this area and we hope that this work can trigger more research activities related to the WDS.


I. INTRODUCTION
Smart cities intend to improve the life quality of people and offer intelligent services by using smart devices. Smart cities do not have a universal definition and many scholars define them as digital or intelligent cities [1], [2]. An intelligent city collects digital data and communicates via different networks to make intelligent decisions [3]. There exist numerous components that constitute a smart or intelligent city. These components include technologies, citizens, buildings, energy systems, infrastructure, transportation systems, medical systems, government, educational systems, water systems, etc. [4], [5]. One of the most crucial components of a smart city is the water system. The sustenance of the city and the interoperability of the different components of the city is dependent on the smart water system. A smart water system involves the use of information and technology to enhance the traditional water system. A smart water system is motivated due to the rising water scarcity and The associate editor coordinating the review of this manuscript and approving it for publication was Yuan Tian . the expensive structure to make water potable and available to the rising population [6]. The advantages of a smart water system over a traditional water system include accurate measurement of water consumption, quality control of water, monitoring and prevention of flooding, prevention of water wastage, etc., [7]- [11].
There are many security challenges in smart cities partially due to Internet of Things (IoT) devices to prevent water theft, water wastage, and water poisoning [14]. Also, there is a need to develop a secure water distribution system that controls the quality of water and controls the main operation water system [12]. The modern water distribution system (WDS) depends on modern technologies to operate and monitor water systems. These technologies have improved the service quality and the reliability of the WDS. However, it is susceptible to cyber-physical attacks (CPA) making them less secure. Attacks on the WDS halt or affect normal operations of these systems or even compromise critical data [13].
Cyber-physical systems and internet of things devices are considering as main components to build water distribution systems (WDS). These devices include sensors, smart meters, VOLUME 9, 2021 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ programmable logic controllers (PLCs), and supervisory control and data acquisition (SCADA) systems. A sensor is responsible to collect the data of smart water, whereas smart meters measure the consumption of water. PLCs are embedded devices connected to sensors and actuators to exchange the data and control the process. While SCADA is a centralized system to manage the operations, store and analyze data [13], [15], these components increase the efficiency of WDS, but they are prone to cyber-physical attacks. An adversary can attack WDS and modify data to damage the water system or disrupt the water services [13]. Cyber-physical attacks have increased in recent years. An example attack targeted the Maroochy Water Service Sewage in Australia in 2000 [16], [17]. The adversary hacked the sewage pumping station by monitoring the radio network and sending a false message to corrupt the SCADA system and to alter the data of the sewage pumping station [16], [17]. As a result, the attack disrupted the communication for a short period between the SCADA system and the sewage pumping station and disables the operation of the sewage pump. Moreover, it prevented the alarm message to send to the SCADA system and caused the spilling out of more than 800,000 litters of raw sewage to different public areas such as parks and hotels. The government spent days cleaning up and cost around one million dollars to upgrade the security of the system [16], [17].
Another recent attack targeted an Oldsmar Water Treatment System, Florida in 2021 [18] and the hacker tried to access the computer system remotely to containment the water treatment system by changing the normal level of sodium hydroxide. WDS can be attacked using different types of attacks such as passive attacks and/or active attacks. For passive attacks, an attacker either listens to communications among components or senses some system's states to acquire some knowledge. For example, an attacker may want to know about a given component used in the system, acquire some readings from the sensors, or listen to communications between the sensors and the actuators. Moreover, for active attacks, the attacker may make some interference to the system or components in the WDS, e.g., a) conducting Distributed Denial of Service (DDoS) attacks, b) disconnecting a component connected to the WDS, c) interfering with the communication process of the WDS, d) data breaching or e) disabling the equipment and disrupting services. Under a DDoS attack to the availability of the WDS or communications between two components in the system, the users may not access the services offered by the system [19]. Under a data breaching attack, an attacker can access data without any authority so a data breach exposes confidential and sensitive information (such as customer payment details). A data breach in the system can cause huge financial losses, reputation, and user loss of the organization [20]. Under attacks of disabling the equipment and disrupting services, an attacker can alter data pressure sensors which lead to break the pipes and loss water or change the reading data of a water-level sensor to shut down a water pump [13]. Some active attacks are complex and difficult to detect especially when the attacker has gained many insights in the system [17].
These attacks show the security flaws inside the critical infrastructure such as water systems and there is a need to reduce the security risks or threats in the WDS. It is essential to investigate the threats and weaknesses, design detection methods, and design suitable mitigation procedures to alleviate the attacks. In recent years, there are some detection approaches to detect the attacks and mitigate the impacts of the attacks in WDS such as statistical detection methods, artificial neural network methods, machine learning algorithms, etc. [21]. Even though the area of security of WDS is relatively new and there are not many papers in this area, it is a very important area, particularly as part of a smart city. Based on our knowledge, there is no literature survey for this topic. Therefore, this motivates us to provide a comprehensive survey based on the limited and available papers. In this paper, we survey papers in detail in terms of attacks and detection methods in the WDS. Our contributions are highlighted as follows.
• To the best of our knowledge, this is the first paper to provide a comprehensive survey on security in WDS. We hope that our paper can trigger more research activities in this area.
• We survey cyber-physical attacks and their impacts on WDS. We provide a new classification of cyber-physical attacks on WDS.
• We analyze the existing detection methods to reduce the negative impacts of cyber-physical attacks on WDS. The rest of the paper is organized as follows. Section II introduces the basic background of the WDS and components. We discuss different types of attacks for the WDS in Section III. In Section IV, we discuss existing attack detection methods for the WDS. In Section V, we present future research directions. Finally, we conclude the paper in Section VI.

II. WATER SYSTEM
A traditional water system uses fewer technologies and adopts physical devices (such as pipes, valves, and pumps) to deliver end-to-end water distribution services. Smart Water is defined as a water system enhanced with technologies such as sensing (via sensors and monitors), real-time communications (such as wireless networks, satellite communications, etc.), controls, and intelligence. A smart water distribution system is shown in Fig. 1. First, water is collected from a water source such as rivers or the sea, and then the water is transported to the water treatment plant for treating and purifying water to meet water quality standards. Furthermore, the water is stored in the water storage system and then is transported to the WDS through pipes (transmission lines), which are used to convey water from the WDS to the end-users (home, industry, etc.). Finally, after the water was used, the wastewater is transported to the water treatment  plant for recycling water [22], [23]. Furthermore, a smart water distribution system may include many smart devices which are physical devices with intelligent cyber features.
The major components or instruments making up a smart WDS include pipes, water tanks, smart water meters, flowmeters, smart pressure meters, energy consumption (pumping) meters, smart water treatment monitors, smart water purity sensors, physical security monitors, smart river height sensors, dam height sensors, levee movement sensors smart valves, smart pumps, smart contaminant sensors, smart flood sensors, etc. We explain some of them in detail as follows [7].
• Smart water meters: They are electrical instruments to measure the water consumption periodically. The water utility management collects the users' water consumption data from the smart water meters over communication networks. Then, the collected data are used for computing the cost of water consumption and managing the bills.
• Pressure meters: They compute the pressure of the water to sense leaking. If there is a pipe burst, flooding, or any form of irregularities, the system stops water transportation.
• Flood sensors: They sense the present and imminent security level for chronological and imminent flood disasters. They detect a pipe break and shut off the water supply to avoid losing water.
• Smart valves: They regulate or stop water streams depending on environmental situations. They allow bi-directional flows of water, disallow pressure reduction to reach the water meter, and decrease over-supply.
• Smart pumps: They are electronic devices to pump water through pipes and into (out from) tanks. They allow bi-directional flows of water, are used to reduce high energy usage, and can prevent a huge waste of water.
• Smart irrigation controllers: They contain thermostats for sprinkler systems to automatically irrigate water base on schedules or certain conditions. They can prevent loss of water and energy that may occur due to the abuse of water usage.
• Smart contaminant sensors: They compute biochemical status to ensure the water quality such as temperature, turbidity, oxidation-reduction potential, pH, and conductivity. They can prevent pipe deterioration, water aging, and contaminant intrusion.
• Pipe: Pipes are used to transferring water.
• Tanks: Tanks are used to store water. Security for the WDS becomes more important. All the above smart devices can potentially be attacked along the systems used to connect them. To ensure WDS security, it is important to design better methods to detect attacks and mitigate security risks.

III. CLASSIFICATION OF CYBER-PHYSICAL ATTACKS ON WATER DISTRIBUTION SYSTEMS
Water is one of the highest critical resources to survive on this life. Security is the crucial subject matter to the success of WDS. The modern WDS depends on modern technology such as IoT devices to perform the functions and manage the water system. Modern technology enhances the water service and increases the quality of WDS [24]. However, the limitations of IoT resources reduce the security of WDS and attract attackers to disrupt normal operations and tamper with the critical data of the water system. For instance, an adversary can attack physical components of the water supply to shut down the power of the water pump which leads to disrupt the service. Also, the attacker can alter the chemical data of treatment the water to make people sick or die [25].
Cyber-physical attacks can be classified into many types based on attacks on physical devices and attacks on the communication/networking/system components of WDS. In this paper, we classify attacks into the following types: • Sensor attacks: This kind of attack is related to attacking sensors such as altering a sensor reading for the critical data.   • Actuator attacks: this kind of attack is related to attacking actuators who make wrong actions such as making a pump more or less water to disrupt services.
• Control system attacks: This kind of attack is related to the control system, particularly the SCADA system which may damage whole the system.
• PLC arracks: this kind of attack is related to attacking PLCs, such as misleading system operations.
• Communication/Networking attacks: this kind of attack is related to attacking communication and networking systems so that the system can be disrupted or damaged. Examples of communications include communications among sensors, actuators, PLCs, SCADA, and other hardware and software components. We provide a comparison among papers that introduced multiple cyber-physical attacks on WDS as shown in Table 1.
Cyber-physical attacks can largely impact the WDS. However, there is a lack of existing analytical and computational tools that characterize the responses of WDS to different kinds of cyber-physical attacks and show the risks of assaults. The authors in [13] aim to build attack models that characterize different types of cyber-physical attacks to identify the components of the cyber-physical system of WDS that respond to the attacks as shown in Fig. 2 and summarized in Table 2. Moreover, the authors in [13] build the EpanetCPA toolbox enables the researchers to design the attacks and customize the attributes for every type of attack. Also, it can simulate the attack results of responding to WDS. The authors in [13] study the characteristics of attacks depending on duration time, e.g., the starting and ending time for a specific attack, the number of attacks that occurred, and the locations of the components that will be attacked. They perform attacks in different scenarios as follows: • In the first scenario, the attack aims to control pump PU1 and pump PU2 at the same time and forces both pumps to run 10 hours without any demand from Tank T1, shown in Fig. 3, known as the C-town network. As result, the tank overflows 35 hours because the attack disallows PU2 to stop when the level of water of T1 is higher than thresholds (4.5 m).
• In the second scenario, the attack targets to alter the water level reading that was collected by PLC3 in T4. Then, it sends the altered information of PLC3 that controls PU6 and PU7 to give a wrong decision to deactivate pumps 6 and 7. Therefore, the water level in T4 is decreased by 1.11 hours.
• In the third scenario, the attack alters the data on the T1 water level sent by PLC2 to PLC1 leading to the overflow in T1.
• In the fourth scenario, the attack aims to hack the communication link between PLC2 and SCADA system, and then to alter the data reading of T1 water level. As result, the SCADA system stores incorrect information.
• In the fifth scenario, the attack targets the communication link that connects SCADA and PLC5 and modifies the thresholds that activate/deactivate PU11. As result, PU11 is activated for 50 hours which increases the water level of T7 and makes a wrong decision for PLC9 to deactivate PU10.
• In the last scenario, 100 random attacks are used to attack PLC3, T1, T2, and T3.
As a result, the C-town water network shown in Fig. 3, is largely impacted during the periods of attacks [13].
Furthermore, the authors in [27] propose Man-inthe-Middle attacks to attack a water treatment testbed, called  SWaT, which contains physical devices such as sensors, actuators, and PLCs. When the sensors collect data and send the data over a Fieldbus communication network to a PLC, an attacker can spoof packets through the Fieldbus communication protocol and inject a false date to modify the sensor reading [27]. As a result, the attack successfully accesses the physical devices and controls the whole SWaT testbed [27].
The authors in [26] present attacks to a water distribution testbed (WADI) which uses to purify the water. WADI is prone to attacks because its components such as sensors, PLCs, and Remote Terminal Units (RTUs) are subject to attacks. WADI also is connected with SWaT to supply filtered water. They design two types of attacks and show that attacks could cascade to the SWaT. The first attack alters the sensor reading to reduce the level of water by changing the percentage of the level water from 75% to 10% to turn on a pump and this makes it overflow. The second attack alters the sensor reading of a tank to shut off the tank valve. The results show that the attacks successfully impact the WADI, cascade the failure to SWaT, and impact the operations. However, the authors in [26] only list two types of attacks and do not provide a detection mechanism to detect these attacks.
An American water utility company was attacked by hacking the passwords of the routers to access and monitor the pumping stations in 2016 [29]. As a result, the monthly bill price was increased from $300 to $15,000 and later the utility company upgraded the security to mitigate attacks [29].
The authors in [1] launched an availability attack and an integrity attack on a water system testbed developed by an EU project called ''FACIES'' which has a fault diagnosis module to detect anomalies. The results show that the security of the system is poor and unable to detect the launched attacks. Therefore, the fault diagnosis module needs to be enhanced with a robust detection method to increase the security of the water system testbed.

IV. DETECTION METHOD CLASSIFICATION OF WATER DISTRIBUTION SYSTEMS
The purpose of a WDS is to deliver better water service to consumers. However, WDS is prone to cyber-physical attacks which can disrupt normal operations and tamper with the critical data of the water system. The attacks expose the main components of the water system. For instance, the advertisers can modify the values of water levels in each tank or the suction pressure in each pump station to VOLUME 9, 2021 TABLE 2. Cyber-physical attacks in water distribution system [13].
turn ON/Off the pump water. Also, the adversary can alter flow sensors values in the valves or pump stations to causes physical damages. It is important to develop an anomaly detection algorithm to identify the anomalies before they harm the system. The development algorithm also should identify the global anomalies from multiple sensors in a multi-dimensional space. Note that in contrast to a global anomaly, a local anomaly indicates a case that from a single sensor. A detection algorithm that only recognizes local anomalies from each sensor separately might miss the attacks due to the potential high dimensionality of the sensor's data. Also, there is a probability occur a single attack that can affect multiple sensors at the same time and mislead the operations of the system [30]. We will survey many detection methods first and then compare them in the following subsections.

A. DETECTION METHODS
In this subsection, we survey most of the detection methods in the literature in detail as follows.
The major contribution of the authors in [13] is that they propose nine types of attacks to the components of the water distribution system such as sensors, PLC units, and SCADA system as shown in Fig. 2. The detection method is simple and trivial. For the limitations, the attacks model is applied on a simple WDS that has one pump, one tank, one valve, and a few actuators; experiments on a complex water distribution system network are needed as well as a well-designed detection method.
The authors in [30] develop a detection algorithm to identify local anomalies that affect each sensor individually and also identify the global anomalies that affect multiple sensors at the same time. The proposed algorithm includes three layers as shown in Fig. 4: 1) a simple statistical detection layer to determine outliers; 2) an Artificial Neural Network Model (ANN) layer to detect contextual anomalies based on data from one sensor, and 3) a Principle Component Analysis (PCA) layer to detect anomaly behaviors among multiple sensors. We explain the detailed methods, attacks, results, and limitations as follows.  . An anomaly detecting algorithm [30].
to the sum of the mean value and the product of the standard deviation and an upper multiplier in the data set, while the lower boundary is equal to the subtraction of the mean value from the product of the standard deviation and a lower multiplier in data set. Therefore, any value that is greater or less than the upper boundary or lower boundary, respectively, will be classified as outliers.
• Layer-2: For the second layer, a trained ANN model includes multiple layers of interconnected artificial neurons that perform nonlinear computations as shown in Fig. 5. It has an input layer, multiple hidden layers, and an output layer. The input layer consists of independent neurons and each neuron is multiplied by a connection weight to the hidden layers. Then, the combination of bias and all weighted inputs is fed to an activation function as shown in Fig. 6. If the result of the summation process is greater than a threshold of the activation function, the neuron output will be able to be fed to the next input layer. To adjust the relationship between the input layer and the output layer, every neuron has a specific weight which can be adjusted by applying a Backpropagation (BP) algorithm to update the weight. The BP algorithm is used to compute a gradient and to reduce errors in the predicted output values of the training data set [33].
• Layer-3: For the third layer, the PCA method is used to reduce the multi-dimensional sensor data and to project it to new axes called Principle Components (PCs). Each PC will classify the dataset as normal for the dataset whose maximum variance is large; otherwise, the PC will classify it as an anomaly for the lower variance dataset.
• Attacks: The authors also adopt the C-Town Distribution System as shown in Fig. 3. The authors adopt two datasets: 1) the sensor data without attacks over one year and 2) the dataset with labeled five kinds of attacks over 6 months. As shown in Fig. 3, the first attack targets to attack PLC5 that connects between PU10 and PU11 and the connection between PLC9 and PLC5. The second attack targets to attack water level data of tank1 which affects on control operation of PU2. The third attack targets to change the pressure value of the valve and pumps 4 and 5. The fourth attack intends to attack PU7 by attacking the water level data of Tank4. The fifth attack targets to attack PU6 and PU7.
• Results: The algorithm is tested on C-Town Distribution System as shown in Fig. 3. The proposed algorithm can identify all the labeled attacks with no delay. Also, it identifies three new attacks that have duration attacks less than the original minimum expected attacks duration in the given compromised set.
• Limitations: The limitations are summarized as follows. First, the third layer in the algorithm that adopts PCA could not recognize the exact component that was compromised during the detected attack 3 because the discovered anomalies are obtained from multiple components at the same time. Also, the authors do not provide a solution for this so that there is a probability to lose a huge amount of water in case that the physical attacks damage assets of the water system without identifying those compromised components. Second, the algorithm issues false alarms which put the system under attack status for a few hours even after there is no attack existed. Third, training the ANN with increasing the number of hidden layers is time-consuming.
The authors in [34] propose an algorithm to identify malicious attacks by checking the integrity of sensors' data and actuators' rules and identifying the anomalies in the data. Then, they adopt an optimization approach to extract low dimensionality of sensor data and separate them from VOLUME 9, 2021 all the SCADA data measurements. We explain the detailed methods, attacks, results, and limitations as follows.
• Part I : An attacker can attack the water level sensors to change the values. The first part checks the integrity of the data system and actuators' rules to protect the SCADA system from compromising by attackers. The first part includes two functions.
--The first function is called ''Actuators Rules Verification'' which is to check the integrity of the actuators' rules. For example, one of the basic rules of actuators is to activate or deactivate a pump based on the measurement of water level sensors. If the water level in a water tank is less than the minimum threshold, the pump should be activated. Otherwise, the pump should be deactivated. If this rule is not followed, there is an attack on the sensor data or actuators' rules. The algorithm also determines the time of the attack. --The second function is called ''Data Verification'' which is to compare the data of all variables with the range of the normal operation. First, the function determines the minimum and maximum water levels values and then checks if any level reading is less or greater than the bounds of the minimum and maximum water level values. Second, the function checks the pump statue with the pump water flow value. For instance, if the pump statue is turned ON, the pump water flow value must not equal to zero; otherwise, the pump water flow must equal to zero. Third, the function also checks if the valve statue works correctly with the flow valve. Finally, the function checks the level of pressure to see if it is below or above normal threshold values of pressure.
• Part II : The second part detects anomalies and isolates them from the actual SCADA system data set based on an optimization algorithm. The measurement values of the SCADA system during the period can be represented as a matrix. The matrix includes the true values of SCADA measurements without interrupting by noise or attacks (L), false values of attacks (A), and noise values (N). The total measurements data is donated by (Y) at a specific time as Y = L+A+N. The authors assume that L is a low-rank matrix and it is nonconvex so that it is difficult to minimize it computationally. To convert the matrix to a convex optimization problem and to extract the low dimensional data from SCADA, they run an optimization framework called CVX. CVX is software that uses Matlab language to model the convex optimization problem. It requires a training data set which is Y measurements of the SCADA system. Also, the software selects tuning parameters to minimize the percentage of errors in calculating the convex problem to extract the anomalies. Moreover, the authors assume that A is a sparse matrix to determine the attacks. If A = 0, there are no attacks on the measurements values (Y). The A matrix should contain zero values in all the rows and columns except one row that can have non-zero values. Then, they determine the threshold value of A and compare the non-zero values with the threshold. If the non-zero values are greater than the threshold, the attacks are presented.
• Attacks: The authors also adopt the C-Town Distribution System as shown in Fig. 3 under eight attacks. The first attack targets to attack the low-level water sensor in Tank5. The second attack targets to attack a high-level water sensor. The third attack targets to attack a water overflow sensor in Tank1. The fourth attack tends to attack PU10 and PU11. The fifth and sixth attacks target actuator rules of controlling PU2, respectively. The seventh attack targets to pressure value of PU7. The eight attack targets actuator rules of controlling PU6 and PU7.
• Results: The authors test the algorithm on the C-Town Distribution System as shown in Fig. 3 under eight attacks. The results show that the data verification algorithm can detect all types of attacks except the seventh and eighth types of attacks, while the actuator rule algorithm identifies all labeled attacks for compromised data set.
• Limitations: The limitations are summarizing as follows. First, the Data Verification algorithm could not identify all the eight labeled attacks. Second, the low dimensional data still appear on the SCADA measurements although the purpose of adopting the optimization algorithm is to extract low dimensionality sensors' data completely from SCADA measurements' data to detect attacks. In other words, the optimization algorithm is not very effective. The authors in [35] propose an intrusion detection method to detect Denial of Service attacks (DDoS) in smart cities. DDoS attacks can disrupt the service by flooding the system with large numbers of requests and make it inaccessible to the users [37]. For instance, DDoS can affect the performance of a smart water plant by sending large numbers of requests to a data controller as shown in Fig. 7. As a result, a data controller sends harmful control signals to the water pumps causing the change of the pumping speed at the pumps and the disruption of the services. A machine learning algorithm is applied to the smart water plant dataset to detect DDoS attacks that disrupt the services by changing the pump speeds. The algorithm clusters DDoS attacks to multiple attack types, build a Deep Restricted Boltzmann Machine model (RBM) and build a Feed-Forward Neural Network model (FFNN) to detect the attacks. We explain the detailed methods, attacks, results, and limitations as follows.   it calculates the distance between the elements and the centroid point and assigns the elements to their categories based on the smallest data centroid. • Step-1 The first step is to apply the RBM model that learns the high-level six features of smart water plant data from sensors and smart meters. These features include water level 1, water level 2, water flow 1, water flow 2, pump speed 1, and pump speed 2. The RBM model contains one input layer and multiple hidden layers and there is no output layer as shown in Fig. 8. Each input layer has many variables multiplied by a specific weight which are then connected to the hidden layers. The summation process of all products with a bias is fed to an activation function to produce the output that determines if the hidden state is activated or not. Then, the reconstruction is calculated in the same way, but in the opposite direction that starts from the hidden layers to the input layers.
• Step-2: The second step is to build FFNN model layers that consist of one input layer with multiple neurons, one or more hidden layers, and one output layer as shown Fig. 9. Each neuron in the input layer has a directed forward connection to the neurons in the hidden layers without any loops or cycles. The FFNN model is trained to classify smart water plant data as normal data or anomalies as DDoS attacks that change the speed of the water pump.  The authors in [38] propose an algorithm to identify anomalies including three modules: control rule and consistency module, pattern recognition module, hydraulic and system relationships module. We explain the detailed methods, attacks, results, and limitations as follows. VOLUME 9, 2021 • Control rule and consistency module: The control rule and consistency module check the consistency of data with specific control rules given in the data set of control rules. Based on the control rules, PLCs control the water system and give control commands based on the collected data from sensors and actuators. An adversary can tamper with the control rules and also modify the obtained data to cause some inconsistency in the WDS dataset.
• Pattern recognition module: The pattern recognition module contains distinct patterns for hydraulic parameters. There is a pattern for every hydraulic parameter and also there is a pattern for a combination of multiple hydraulics parameters. These patterns are developed based on datasets free from cyber-attacks. Then, the authors compare these developed patterns with the current WDS dataset to identify the anomalies. If the WDS dataset does not follow the developed patterns, anomalies occur.
• Hydraulic and system relationships module: The hydraulic and system relationships' modules are developed based on the relationships of WDS components.
For instance, the physical information of the water level in the tank and pump station flow can be calculated to derive the information of demand from another component such as mass balance. The calculated values of these components are compared with the collected data of WDS components to detect the attacks.
• Results: The authors test the algorithm using a collected dataset for 6 months. The algorithm shows the efficiency of the algorithm to identify the anomalies and detect the attacks. The total hours of operations are 4177 hours and the anomalies are observed in 666 hours. The algorithm detects 62 cyber-attacks and calculates the duration time to detect each attack. The longest period that the algorithm spends to detect the attack is 87 hours, while the shortest period to detect the attack is 2 hours.
• Limitations: The limitations are summarizing as follows. First, the algorithm detection time needs to be enhanced to minimize detection time. Second, comparing the water dataset with the given patterns of modules is not enough to detect some attacks. The authors in [44] propose a methodology to detect cyber-physical attacks on WDS by molding the component of WDS as a logic graph, e.g., a graph contains interconnection between sensors and actuators. Then, the algorithm in [44] computes a control function for each subgraph and compares the computed value of the control function with the new value of the control function. When an attacker tries to compromise the components, the new value of the function control graph will be different from the original value of the control function and the attack occurs. The algorithm also recovers the attacks by replacing the comprised value of water components with an estimated correct value to mitigate the impact of attacks. The method was tested on a simple WDS and needs to extend in the future for testing on a complex WDS.
The authors in [39] propose an ensemble methodology to detect cyber-physical attacks by creating four modules. The first module is responsible to check if the operations of pumps and valves follow the right control rule based on the observed water levels in each tank. For example, if the water level in a tank is less than the threshold, the pump should be turned on. The second module contains a statistical method to monitor the sensors and actuators based on the calculated upper and lower boundaries. If any value is above the upper boundary or below the lower boundary, it will be considered as outliers. The third module has an Artificial Neural Network Model (ANN) model to train the dataset and predicts the anomalies. The fourth module contains Principle Component Analysis (PCA) to classify data as normal or anomalies. The results show the ability of the method to detect all the labeled attacks and it identifies the locations of the compromised components for most of the labeled attacks. However, there is a need to develop the algorithm to increase the ability to recognize the compromised components when attacks occur from multiple components at the same time.
The authors in [45] propose a model-based fault detection methodology to detect cyberattacks on WDS. First, the authors create a dataset free from attacks by adopting the EPANET simulator to simulate the WDS. They define the normal errors which are expected in the WDS due to the differences between the EPANET and the actual WDS and then create another dataset with attacks to produce errors in the SCADA readings [45]. The proposed algorithm in [45] compares the normal errors and the errors produced in the presence of attacks for SCADA readings to discover potential cyberattacks. The results show the ability of the algorithm to detect attacks from SCADA readings, but the reliability of the algorithm is decreased since the sensor's data with noise affects the accuracy of the results.
The authors in [40] adopt Long Short Term Memory Recurrent Neural Network (LSTM-RNN) method to train the SWaT testbed dataset and predict the attacks. RNN is one type of deep neural network and it feeds the output layer as the input for the next layer. A propagation algorithm is used to update the weight for the gradient value to increase the ability of the algorithm to predict results. However, RNN has a vanishing gradient problem so that the algorithm cannot update the weight because the gradient value is very small and this leads to a poor result. The LSTM algorithm is used to solve the vanishing gradient problem in RNN and increases the ability of model learning. LSTM includes a memory block that consists of an input gate, a forget gate, and an output gate to increase the ability of the model to learn data and predict results. The results in [40] show that the proposed method detects the labeled attacks with high accuracy. However, the proposed method is trained with a small sensor dataset and it should be extended to train whole the sensor dataset in the SWaT testbed.
The authors in [41] adopts Autoencoder neural networks (AE) algorithm that includes an encoder and a decoder, where the encoder encodes and processes the input data to reduce dimensionality and the decoder reconstructs data from encoded data and obtains a small reconstruction error measured from all data. If the reconstruction error is larger than the small reconstruction error, an anomaly occurs. The proposed algorithm in [41] can detect all the labeled data, but the performance is decreased when all the SCADA system is compromised.
Most of the WDS adopt a single site Event Detection System (SSEDS) to detect contamination based on sensor readings. The authors in [46] develop a Multi-Site Event Detection System (MSEDS) method to detect contamination for all the datasets of WDS and not only for the sensor dataset. The proposed method simulates the quality of water-based on six characteristics: ''Chlorine, Electric Conductivity, pH, Temperature, Total Organic Carbon, and Turbidity''. Then, the method calculates the errors between sensor readings and the predicted values of the simulation model. Moreover, the method classifies these errors into normal or abnormal based on thresholds that determine the six water quality characteristics. The method focuses only on water contamination and improves the detection of contamination, but it does not study other attacks and detection.
The authors in [47] propose a smart fuzzing method to identify network attacks that spoof the sensor readings and command actuators over the network. They adopt two machine learning algorithms, LSTM and Support Vector Regression (SVR), to train data set and classify datasets based on thresholds of fitness functions. If the predicted value is greater than the threshold of a fitness function, an attack occurs. The results show that the proposed method can identify 27 attacks, but the performance is slow. Moreover, the method does not detect attacks other than network attacks.
The authors in [48] propose a method to detect DDoS attacks and test on a dataset of smart water distribution plants. The method adopts the RBM model to reduce the dimensionality of data and uses four different algorithms to classify data into normal and abnormal data. First, they determine 10 types of DDoS attacks based on a k-means clustering algorithm. Then, they apply the one-layer RBM model to four algorithms which are FFNN, FNNN automated, rain forest, and support vector machine to classify data. Then, they apply the two-layer RBM model to the four algorithms and repeat the same process until the RBM model reaching to 5 layers. The results show that the FNNN automated algorithm has the best accuracy to detect attacks. Furthermore, the performance is the best when the RBM model has one layer and with the increase of the number of layers in the RBM model, the complexity is increased, causing the decrease of the performance of detecting attacks.
The authors in [42] design some jamming attacks which block the communication channels to disallow the communications between PLCs and the physical process. The goal of the jamming attacks is to control or damage a SWaT testbed.
The results in [42] show that the SWaT testbed is responded to and had negative impacts on water overflow. Furthermore, the authors in [42] provide a simple detection method to detect attacks by comparing the measurement values with their properties. However, the detection method needs to be improved as future work to enhance the security of both the physical layer and the network layer.
A competition called BATADAL was organized in a conference in California in 2017 to compete for attack detection algorithms to detect cyber-physical attacks in the WDS [21]. The authors in [21] summarize various detection methods of seven team participants at BATADAL and evaluate them based on metrics such as detection time and the ability to identify the compromised components. The first team adopts a method that extracts features based on calculated mean and covariance and then adopts a rain forest algorithm to classify data into normal and abnormal data. The second team adopts district metered areas to reduce the dimensionality of data and then adopts recurrent neural networks to classify data and to predict the attacks. The third team proposes a method that first checks the integrity of rule operations, and then classifies data based on a deep neural algorithm called convolutional variational autoencoder. The fourth team proposes a method that first checks the integrity of SCADA data and actuators' rules, and then adopts an optimization algorithm to minimize the computation time. The fifth team proposes a model that has three layers to detect attacks, including a layer using an outlier detection method, a layer using an artificial neural network algorithm to classify data, and a layer to determine the anomalies based on principle component analysis. The six-team proposes a method with three modules to check control rules, integrity of data, and the relationships among components of WDS. The seventh team proposes a model based on EPENANT to simulate a WDS and compare the data of the actual water system with the simulation model to detect the attacks. The results show that all the teams can detect cyber-physical attacks, but the seventh team won the competition by achieving the best performance overall. However, all the proposed algorithms are trained based on the medium size of a WDS and the large size of WDS should be considered in the future.

B. COMPARISONS
We compare existing detection methods as shown in Table 3. We also provide a general comparison of detection methods based on different selection criteria as shown in Table 4. We compare some of them as follows.
These papers present in-depth researches on the impacts of cyber-physical attacks on water distribution systems (WDS). The objective of these studies is to detect cyber-physical attacks by identifying anomalous behaviors in the WDS. Table 1 shows the different types of cyber-physical attacks among several papers that disrupt the services and compromise the main components of WDS. The paper [13] proposes all the cyber-physical attacks while the papers [30], [34],   [35], and [38] only use some of the attacks. The paper [30] adopts only four types of attacks and the paper [34] adopts 6 types of attacks. Moreover, both the paper [35] and the paper [38] adopt the same types of attacks that aim to compromise the sensors and actuators. Also, both the paper [13] and the paper [34] adopt the same types of attacks that aim to compromise the sensors, actuators, PLCs, communication links between sensor and PLCs, communication links between actuators and PLCs, and Communication links of multiple PLCs. We observe from Table 1 that only the paper [13] defines SCADA attacks and attacks on communication links between PLCs, which can disable completely the WDS. Table 1 also shows that several papers adopt the same types of attacks that target compromise sensors, actuators, communication links between actuators and PLCs, and communication links of multiple PLCs. Table 3 compares several papers and their details. The paper [13] focuses on building an attack model by using a simple method without providing a detection solution. However, the attacks model can be used in the papers [30], [34], [35], and [38] that aim to develop anomaly detection algorithms to detect cyber-physical attacks. Specifically, The paper [13] creates nine classes and identifies the features for each attack including the action, the start time, and the end time. The paper [30] adopts 3 layers to detect attacks by using the statistical method, the ANN model, and the PCA method, sequentially. The paper [30] also shows that the statistical method and the ANN model are used to detect attacks from individual sensors while the PCA method detects attacks from multiple sensors at the same time in a short time. The paper [34] adopts Actuators Rules Verification algorithm, Data verification, and an optimization algorithm. The paper [34] also shows that the optimization algorithm VOLUME 9, 2021 minimizes the detection attack time. We observe from Table 3 that the paper [35] adopts the RBM model and the FFNN model to detect attacks. All of the RBM model, the FFNN model, and the ANN model are artificial neural networks. However, the main difference between the RBM model and the FFNN model is that the RBM model includes only one input layer and one or more hidden layers, while the FFNN model and ANN model include one input layer, one or more hidden layers, and an output layer. The FFNN model is one type of ANN model and both of them can use linear or nonlinear transformation, but the RBM model is always nonlinear transformation. The RBM model must convert real data to binary data which causes loss of information, while the FFNN model can receive binary or real data. The RBM model adopts to learn a large number of features and it has an energy function to reduce the energy. However, it cannot classify the data. The FFNN model can learn features and classify the data, but the accuracy is less than the RBM model. Table 3 also shows that both the PCA method and the RBM model are used to reduce the high dimensional data but the RBM model is better and faster than the PCA method in the dimensionality reduction. The paper [38] adopts a control rule and consistency module, a pattern recognition module, and a hydraulic and system relationships module that works together to identify all attacks. Table 4 shows a comparison among several papers based on different selection criteria. The authors in [13] propose nine types of attacks and show the importance of developing detection algorithms. As we explained earlier, only the paper [13] does not provide a solution to detect attacks, while others papers develop an algorithm to detect cyber-physical attacks. Table 4 also shows that the paper [30] first adopts a simple statistical method that can easily calculate the results. However, there is a probability to detect false outliers. Then, the paper [30] also adopts the ANN model that is appropriate to learn large numbers of features, but it can suffer from the overfitting problem that reduces the reliability of the model to detect attacks. The overfitting problem occurs when the model is trained with a large number of data and learns noise so that the accuracy of the predicted results reduces. To avoid the overfitting problem in The paper [30], the PCA method is adopted to reduce the number of features to the most important features. However, there is less risk of some information that affects the prediction results. The paper [30] has a false alarm for a long time. The paper [34] shows that all the actuators' rules and data verification are simple algorithms and cannot identifies the complex types of attacks. The actuator rules algorithm can identify all the eight label attacks, while the data verification algorithm identifies six label attacks. The paper [34] targets to extract low dimensional data sensors completely from SCADA and to minimize the computational time data by adopting an optimization algorithm. However, the optimization algorithm cannot extract all the low-dimensional data sensors. The paper [35] adopts the RBM model to increase the accuracy to detect attacks. However, the RBM model must convert real data sensors to binary data which causes losing information. Moreover, The paper [35] adopts the FFNN model that is suitable to learn a large number of features, but the overfitting problem still risky for the FFNN model. Also, the FFNN model is memoryless and forgets the learning features after many stages in the training data. The authors in [38] adopt three modules to detect attacks, but it has a large time overhead. The paper [34], the paper [35], and the paper [38] can identify all the compromised components, while the paper [30] identifies some of the compromised components.
In summary, integrating physical water infrastructure with cyber systems exposes the WDS to cyber-physical attacks. As we indicated above, there are some cyber-physical attacks and detection algorithms available. However, many detection algorithms are still not optimal for identifying all the attacks. Furthermore, there are inadequate automated monitoring and reporting that allow appropriate responses in a short time to detect or mitigate attacks.

V. FUTURE RESEARCH DIRECTIONS
In the above sections, we observe the following conclusions. WDS is a critical infrastructure and we cannot continue normal life without water. However, Cyber-physical attacks are big challenges and can cause severe impacts on the security level of WDS. We summarize many cyber-physical attacks and their impacts on the WDS and many detection methods. We observe that the number of attacks is still limited and we believe that in the future, more attacks will emerge. Furthermore, the existing detection methods are not optimal and most of them could not detect all the attacks. In the future, there are many research directions that people (including us) can work on them.
• First, designing accurate, better detection methods is important. Most of the current methods adopt machine learning or artificial neural network algorithms. However, high dimensional data and ensemble training data set have some drawbacks of these approaches. Moreover, ensemble training data set can affect the accuracy of the algorithms to identify the attacks especially in the detection methods that adopt machine learning or artificial neural networks. Enhancing the data training process by reducing the false-negative rate is still a big challenge. Also, hybrid methods using the advantages of different methods deserve better studies. Furthermore, since all the surveyed papers cannot detect all the attacks, designing a comprehensive detection method to detect all attacks is deserved more studies.
• There are many problems such as heterogeneous network authentication, access control, privacy protection, key management, information storage, and security devices that should be studied in the future. Furthermore, Modern WDS not only exchange data over networks but also store and process these data in central systems such as SCADA systems. We observe that many attacks target SCADA systems to control completely the water system [15]. Proper cryptography tools are needed to be deployed in proper locations in the WSD. Where and how to deploy these tools in the WDS need careful studies to prevent attacks.
• Third, Internet of Things (IoT) with other smart devices empower the WDS by allowing transparency to the processes in the water supply chain, ensuring real-time monitoring, and automatic processing. However, the WDS becomes more vulnerable to attackers due to the limitation of resources. Both security and privacy are still major problems especially in IoT devices due to their attributes such as the low battery, small memory, small capability, lack of regulations, etc. The privacy violation reduces the level of acceptance of the water system generally. The major aim of privacy violation is to expose sensitive information to unreliable and undependable parties after achieving the feat of getting access to the data by hacking it through the unguided portion of the IoT system. For example, user information on water consumption can be exposed when a smart meter reads the water consumption by the user several times. Also, the SCADA system can be kept in advanced technology like cloud and fog to process the consumption data and store that personal information, financial information, and produce the bill. However, security and privacy are also still challenging and there is the probability to expose and breach the dataset of water consumers. Furthermore, some of the smart water devices have their geographical location accessible. Thus, an attacker may effortlessly get unrestricted authorization to the device to steal the information. There isn't a universal security standard for IoT technologies. An appropriate level cryptographic algorithm should be designed for the WDS that can be suitable with the limitation of new smart technologies such as low power, small storage, small size, etc.
• Fourth, most of the attention has been paid to quantify the impact of attacks on the WDS. However, a little resilience study is achieved to build WDS with more resilience and recover faults and attacks. Faults can occur due to failure or faults of main components such as sensors, PLCs, SCADA systems, etc. The failure might trigger many issues such as pipe breaks, tank failure, pump outage, and valve locking. The current WDS deploys redundant software or hardware to recover faults, but this is very costly. WDS should be designed to be more resilient and fault-tolerant when faults and attacks happen. Furthermore, WDS is a real-time system and should be operating continuously even after a fault occurs. It is important to develop fault/attack tolerance techniques that can increase resilience, isolate the faulty/attacked components, and recovery from failures/attacks to minimize the bad effects. Deep learning algorithms can be adopted as fault-tolerant solutions and more studies should be conducted to increase the ability to determine exact locations of faults/attacks and to provide mitigation and recovery.
• Finally, current attacks on water systems are limited and are mostly similar among different papers. Designing novel attacks on water systems are also needed to test the security and robustness of the systems. Collaborated attacks, multiple-factor attacks, and anti-detection attacks can be designed on water systems as a future research direction. In summary, the WDS can be enhanced by prevention, detection, mitigation, and recovery mechanisms. Both secure software and hardware need to be considered. Cryptographic algorithms, security network/ communication protocols, cyber-physical system designs, device-level security (such as IoT security), etc., should be considered. There is still very limited research on security for WDS in the literature as indicated in this paper. There are a lot of research topics that can be carried out in the future.

VI. CONCLUSION
A Water Distribution System (WDS) is a critical infrastructure for providing high-quality water services by transferring clean water to consumers and recycling the dirty water back. It operates based on the main components, such as pipes, tanks, sensors, PLCs, SCADA systems, etc. However, WDS is prone to cyber-physical attacks that impact the basic operation and disrupt the service. Therefore, it is necessary to develop and implement solutions to increase the security of the WDS by preventing, detecting, mitigating, and recovering cyber-physical attacks. In this paper, we provide a comprehensive survey that shows the impacts of cyber-physical attacks on the WDS and we survey the common detection methods that mitigate the negative impacts of cyber-physical attacks. We analyze them with details, especially on methods, evaluation results, and limitations. We also compare them based on different evaluation criteria and provide future research directions. We realize that there is a shortage in the security research of WDS and we hope that our paper can trigger more research to increase the security of WDS against these attacks and threats.