A Review on Current Technologies and Future Direction of Water Leakage Detection in Water Distribution Network

Water leakage in the supply system is a silent problem that costs billions of dollars yearly. As these supply pipes are mostly underground, this leakage remains undetected for a long time. In 2019, Liemberger and Wyatt estimated an annual loss of thirty-nine billion dollars due to water leakage in supply pipes. In this systematic review, we have analyzed forty-seven articles about water leakage detection and location research. The aim is to identify new technologies, trends, and possible directions in this research field. We have derived four research questions. The first question was about how the research evolved over time, and we have observed that researchers focus more on experimental data collection, ML algorithms, and IoT technology. The second question was about the sensors researchers are using. The most popular sensors are vibration, acoustic, and flow sensors, as they are cheap and easy to install. We can also see some novel applications of image and optical fiber sensors. The third question was about the trend in algorithms. ML and threshold-based algorithms are dominating the field. The fourth question was about the communication technology trend; Wi-Fi, cellular IoT, and LoRa technology lead the space, capturing 80% of the research.


I. INTRODUCTION
Water leakage in the supply system is a silent problem that costs the world billions of dollars each year. A large portion of water supply pipes is underground, so leaks remain unnoticed and undetected for a long time. Water is a precious natural resource. In 2019, Liemberger and Wyatt estimated the annual water loss at 126 billion cubic meters, conservatively valued at thirty-nine billion dollars [1]. Clean potable water is a fundamental human right, but even today, around half of the world's population, or 3.6 billion people, suffer from water scarcity for at least one month every year, and by the year 2050, this number is expected to be around 4.8 to 5.7 billion people [2].
In 2016, the World Energy Outlook estimated that the water sector accounted for 4% of global electricity consumption in 2014. Of this consumption, 60% was for the extraction and distribution of water. If all countries reduced water supply leakage to less than 6%, the energy savings would equate to 130 TWh, the entire energy needs of Poland. The EU has an average of 24% water leakage, the USA has 12%, and Australia has 10% [3]. This water loss has a chain reaction effect on the economy and public health. Long-term water leakage can cause structural damage and sinkholes, and can even allow disease-causing pathogens to contaminate the water supply system. Due to water loss, revenue decreases; eventually, the water cost goes up and the service quality goes down. Leakage also causes the water quality to deteriorate.
The authorities regularly perform water audits, interventions, and performance evaluations to minimize water loss in the system [4]. Water consumption and use can be categorized into authorized consumption, real losses, apparent losses, and non-revenue water [4]. Authorities try to find the reason for ''real losses'', which are mainly leakage losses in the transmission line and service connections, and overflow from storage tanks. A water audit provides information about the portion of the system that is losing water. Further information is gathered after the water audit to detect and locate leakage. Then the leakage is repaired or the pipe is replaced. Finally, the network's performance is monitored to make sure the major loss points are fixed. The problem with this approach is that it is time-consuming and labor-intensive.
In [5], [6], and [7], researchers have summarized how the water distribution system (WDS) is fundamentally set up, the causes of leakage, and the current technologies used to detect and locate water leakage. Water leakage detection and location work can be divided into three parts: data capture, data processing, and alerting the authority if any issue is detected. There is a range of sensors to capture acoustic, vibration, flow, pressure, and temperature data, and there are many ways to use these sensors for leakage detection work. The simplest method is the listening stick. The operator probes the ground to detect the leakage sound and locates the leak based on the ''loudness'' of the sound [8]. Traditionally, detection work was entirely based on human experience; with the development and improvement of signal processing algorithms and cheap computers, researchers are now developing automatic leakage detection techniques [9]. There are two major ways the signals are processed in this field. The first is threshold-based and the other is machine learning-based. Machine learning (ML) based techniques are gaining traction because of their flexibility and accuracy. One issue with ML-based techniques is that running ML models requires a lot of computational resources. In 2016, Obeid et al. reviewed water leakage technology based on microcontrollers, digital signal processors, FPGAs, and ASICs. The authors suggested that a custom SoC (System on Chip) design is ideal for high-performance and low-power leakage detection nodes [10]. Previously, a ''leakage detection survey'' was the only way authorities could detect, locate, and fix leakage, but it is expensive and wasteful. Researchers have developed noise loggers with custom radios, hardware, and protocols, but with the advent of standardized IoT, it is becoming easier for researchers to integrate IoT systems into their research [11].
This research focuses on sensor-based leakage detection solutions that use IoT, AI, and ML. The remainder of the article is arranged as follows. Section II provides a background and history of leakage detection methods, with their merits and disadvantages. Section III describes the article selection process, the sources, and the data assembly process. In Section IV, the articles are organized and a systematic review is performed to assess article quality. Finally, the data is analyzed based on sensors, signal type, experiment type, experiment outcome, and methodology. Section V addresses the research questions, and Section VI concludes the article.

II. BACKGROUND
An underground leak can be discovered in various ways. Researchers have classified these techniques according to a wide range of criteria. Chan et al. have classified these systems as active or passive [7], Adedeji et al. have classified them as internal or external systems [6], and Ismail et al. have classified them by their historical appearance [12]. In this research, we have categorized these approaches as invasive and non-invasive. The way the sensor is mounted to the pipe determines this categorization. If the sensor must be placed within the pipe, the method is invasive. Examples of invasive procedures include using a hydrophone within a pipe to collect audio data or installing a flow meter between pipes. Vibration sensors, on the other hand, can be mounted on the outside of a pipe, making this a non-invasive procedure. Techniques that use non-invasive methods are less expensive to implement. The purpose of providing background information on leakage detection methods is to give the reader the context in which this literature is written. A review of classical approaches will aid the reader in comprehending their algorithms, the challenges that researchers encounter with each method, and the advantages and disadvantages of each technique.
The acoustic method is the most primitive method. Listening sticks, vibration sensors, and hydrophones belong to the acoustic method, and most of the early leakage detection methods are based on it. Then there is the vibration sensor. A leakage generally produces noise, and vibration sensors can pick up this noise. Accelerometers and piezo transducers are examples of vibration sensors. The most modern non-intrusive technique is ground-penetrating radar (GPR). GPR was developed mainly for understanding geologic materials [13]. With an experienced operator, it is possible to locate leakage using GPR. It is a very accurate leakage detection tool but is very slow. Recently, as thermal camera or infrared (IR) camera attachments for mobile phones are getting cheaper, academics are exploring this field. Some research has been done to detect leakage using the thermal camera, but there is a lot of room for improvement. Flow sensors work based on the conservation of mass principle: the input volume of water and the consumed volume have to be equal, so a loss in the system makes the input and the measured output differ. Pressure sensors are generally installed on every water supply network to make sure the WDS (Water Distribution System) does not exceed a certain pressure; otherwise, the pipe will burst. These pressure sensor data are used in different models to predict leakage location. The optical method is the newest technology, but optical fiber is very expensive. If optical fiber becomes cheaper, we might see water supply pipes embedded with optical fiber in the future. Finally, there is the noise logger, which can accept a variety of sensors. The noise logger listens to the acoustic or vibration noise and alerts the authority if it detects any anomaly. To avoid the influence of environmental noise, these loggers generally operate at a predetermined time at night.
The rest of the section explains all the established technologies in detail, as shown in Figure 1, to give the reader the context of the literature review.

A. LISTENING STICK
The oldest and simplest way of detecting leakage is to use the listening stick. It is like a stethoscope but for the ground. First, the location of the underground pipe must be marked; then the operator probes the surface to find the loudest leakage sound. It is best suited for metallic pipes between 75 mm and 250 mm with pressure above 15 psi. One major benefit of this method is that, irrespective of the pipe material or size, the user can pinpoint the leak from the surface [8]. The drawback of this method is that the accuracy of leakage detection depends on the expertise of the operator. In the case of weak sound, the operator may fail to detect a leak. Background noise can also interfere with leak noise, causing a false alarm. This method can be improved via signal processing, noise filtering, and advanced computational algorithms [9].

B. VIBRATION SENSOR
When water flows down the pipe, most water molecules move along with the flow, and very few collide with the wall of the pipe, causing minor vibration. When there is a leakage in the pipeline, the water pressure tends to equalize with the outer pressure. This creates a water jet and turbulent flow, causing vibration perpendicular to the flow. This vibration can be used to detect and locate leakage in a water pipeline. Micro-electro-mechanical systems (MEMS) sensors are miniature 3D structures fabricated from silicon using deposition and etching techniques. The movement of these microscopic structures enables them to detect motion. They are used as accelerometers, strain gauges, microphones, air mass flow sensors, pressure sensors, gyroscopes, yaw-rate sensors, compasses, and hydrophones [14], [15], [16]. MEMS accelerometers are used extensively in leakage detection. Ismail et al. have focused exclusively on accelerometer sensor-based technologies, their costs, and their accuracy [12]. Ismail has also investigated the impact of considering vibration on all three axes of an accelerometer in water leak detection. Several widely available, low-cost accelerometer ICs were compared across several leak diameters. Using the time-domain graph and the Fast Fourier Transform (FFT), a relationship between leak size and z-axis vibration has been established [17]. According to the researcher, the ADXL345 is suitable for several leak scenarios. Similarly, Yazdekhasti et al. have used an accelerometer to detect and locate leakage [18].
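The frequency-domain comparison mentioned above can be illustrated with a minimal sketch. It assumes that z-axis accelerometer samples are already available as a list of floats. The sample rate, the frequency band, and the synthetic signals are hypothetical values chosen only for illustration; they are not taken from [17], and the naive DFT stands in for an optimized FFT routine.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Return normalized DFT magnitudes up to the Nyquist bin (naive O(n^2) DFT)."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        mags.append(abs(acc) / n)
    return mags


def band_energy(samples, sample_rate_hz, low_hz, high_hz):
    """Sum squared magnitudes of the bins whose frequency lies in [low_hz, high_hz]."""
    n = len(samples)
    energy = 0.0
    for k, mag in enumerate(dft_magnitudes(samples)):
        freq = k * sample_rate_hz / n
        if low_hz <= freq <= high_hz:
            energy += mag * mag
    return energy


# Hypothetical synthetic z-axis data: a quiet baseline, and the same
# baseline with an added 100 Hz component standing in for leak vibration.
RATE_HZ = 800
N = 256
baseline = [0.01 * math.sin(2 * math.pi * 10 * i / RATE_HZ) for i in range(N)]
leaky = [b + 0.05 * math.sin(2 * math.pi * 100 * i / RATE_HZ)
         for i, b in enumerate(baseline)]

baseline_energy = band_energy(baseline, RATE_HZ, 80, 120)
leaky_energy = band_energy(leaky, RATE_HZ, 80, 120)
print("leak band energy exceeds baseline:", leaky_energy > baseline_energy)
```

In practice the band of interest would have to be chosen from measured data for the specific pipe and sensor, since the leak signature depends on leak size and installation.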

C. GROUND PENETRATING RADAR (GPR)
Ground Penetrating Radar (GPR) works by transmission and reflection of radio waves. The wave frequencies range from 10 MHz to 2.6 GHz. GPR uses a directional antenna to introduce radio waves into the ground. When the radio waves pass through different materials with different dielectric properties (the dielectric constant is the ratio of the electric field storage capacity of a material to that of free space), they generate different types of hyperbolae or arc-like reflections. It is one of the non-destructive techniques available for leakage detection work. GPR shows the amplitude change and frequency shift of the radio wave. With this information, an experienced operator can identify the shape and depth of a buried object. As the GPR signal passes through rock, boulders, buried pipes, electric cables, or anything different from the surrounding material, the difference shows up on the screen [19]. There are a few benefits to this method. First, it is a non-invasive method, so nothing needs to be attached to or inserted into the pipe. There are a few types of devices: low-frequency devices have higher depth penetration but low vertical resolution, while high-frequency devices have a low ground penetration range but higher vertical resolution and accuracy [20]. One of the limitations of GPR is that the units are very expensive. It is slow in operation, and the operator needs to know the pipe location; otherwise, it becomes difficult to cover a large area. The accuracy of leak detection depends on the operator's experience. The radio wave interacts with different materials differently. For example, the dielectric signature of dry soil is different from that of wet saturated soil. Similarly, the subsurface condition also impacts the reading of the device: bare ground, pavement, and concrete slabs all have quite different characteristics [20].
Lai, Chang, and Sham have conducted a blind test of the effectiveness of GPR in detecting voids beneath pavement or the ground surface. The test result was not consistent, as there were many false-positive cases. This may be because the effectiveness of GPR is heavily dependent on the operator [21]. Researchers are also developing different algorithms and mathematical models to identify the reasons for the false-positive results and eliminate this phenomenon. Demirci has used the near-field back-projection imaging algorithm and concluded that the homogeneity of the medium plays a vital role in the output. According to Demirci, voids in the soil can sometimes also raise a false alarm [22]. Researchers recognize the limitations of GPR, mainly the interpretation of GPR images to locate leakage. Ocaña et al. have tried to identify and extract features from GPR images and produce a 3D model to visualize them quickly. They have used a variance filter to characterize the anomalies. In a controlled lab experiment, the researchers were able to identify the leak as well as the wet zone surrounding it [23].
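The dependence of GPR readings on the dielectric properties of the ground can be illustrated with the standard relation between wave velocity and relative permittivity, v = c / sqrt(eps_r), and the depth of a reflector, d = v * t / 2 for a two-way travel time t. The sketch below assumes a low-loss, non-magnetic medium. The permittivity values used for dry and wet soil are hypothetical round numbers for illustration and are not taken from the cited studies.

```python
import math

SPEED_OF_LIGHT_M_PER_NS = 0.299792458  # metres per nanosecond in free space


def wave_velocity(relative_permittivity):
    """Radio wave velocity (m/ns) in a low-loss, non-magnetic medium."""
    return SPEED_OF_LIGHT_M_PER_NS / math.sqrt(relative_permittivity)


def reflector_depth(two_way_time_ns, relative_permittivity):
    """Depth (m) of a reflector from the two-way travel time of its echo."""
    return wave_velocity(relative_permittivity) * two_way_time_ns / 2.0


# Hypothetical permittivities: the same 20 ns echo maps to very different
# depths depending on how wet the soil is assumed to be.
depth_dry = reflector_depth(20.0, 4.0)    # assumed dry soil
depth_wet = reflector_depth(20.0, 25.0)   # assumed wet, saturated soil
print(f"dry soil: {depth_dry:.2f} m, wet soil: {depth_wet:.2f} m")
```

This is one reason a wet zone around a leak changes the reflections an operator sees: the same travel time corresponds to a different apparent depth when the permittivity changes.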

D. INFRARED (IR) CAMERA
Infrared (IR) thermography is a promising field, but the application of IR thermography in water leakage detection has not gained much traction. The earliest use of the thermal camera in leak detection can be traced back to 1980. Eidenshink [24], Weil and Graf [25], and Weil [26] discussed leak detection trials in the 1980s. Their studies could detect buried pipes, erosion, and voids surrounding them. Alaa Al Hawari, in his paper titled ''Non-Destructive Visual-Statistical Approach to Detect Leaks in Water Mains'', used IR images and statistical analysis to identify leaks in the pipe on the ground surface. This method used the normal distribution curve to predict the leakage location [27]. For buried pipes, Bach and Kodikara conducted a study to understand the reliability of detecting leaks in buried pipes using infrared thermography [28]. They focused on small-diameter buried pipes (i.e., around 100-400 mm) at depths ranging from 0.8 to 1.2 m. Carreño has tried to detect buried pipes using thermal images and data mining [29]. A simple experiment to showcase the effect of different resolution thermal cameras in leakage detection has been presented in [30]. In 2018, Penteado experimented with digital image processing of thermal images. In particular, the authors used the q-sigmoid function to process the images. Their experiment was lab-based, and in their investigation, they could separate the leakage location from the other areas. The researchers included both sandy and clay soil in the experiment, and at 10 cm pipe depth, this result can be used for garden or backyard leak detection [31].
Fahmy and Moselhi [32] have conducted extensive research in Montreal (Canada) on the factors that affect the applicability and limitations of IR cameras in leak detection. The researchers developed a model that was able to detect a leak in the fall and spring but failed in summer and winter. The model was not generalized; it was specifically designed for pavement. They compared their findings with the acoustic method, and the location error ranged from a minimum of 1.01 m to a maximum of 2.30 m [32].
Atef et al. have conducted more thorough research on leakage detection using IR images and ground-penetrating radar (GPR) [33]. Their research included simulated leaks as well as verification against actual leak data. They used a seeded region-growing algorithm on each IR image to differentiate the leakage and non-leakage areas. The centroid of the leakage area is then calculated using Green's theorem to pinpoint the leakage. They claim their method can achieve accuracy of up to 97%, verified against GPR readings.
From the research discussed above, it can be seen that the researchers have used manual methods to differentiate the leakage. Some researchers have used simple algorithms, but they are not fully automated. The number of scenarios the researchers have considered is also very limited. All the researchers have used a one-dimensional approach, which means using only IR images. Most of the researchers have taken a single shot of the location, apart from Bach and Kodikara [28], who took multiple shots of the same area and considered the temperature gradient. There is potential to use RGB images and AI to differentiate the scenarios and conditions.
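The region-growing and centroid steps described above can be sketched as follows. This is a minimal illustration, not the implementation of [33]: it assumes the IR image is a small grid of temperatures, grows a 4-connected region around a hand-picked seed using a hypothetical temperature tolerance, and computes the centroid as the mean of the pixel coordinates, which is a simpler stand-in for the contour-based Green's theorem calculation.

```python
from collections import deque


def grow_region(image, seed, tolerance):
    """Collect 4-connected pixels whose value is within tolerance of the seed value."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in region:
                if abs(image[nr][nc] - seed_value) <= tolerance:
                    region.add((nr, nc))
                    queue.append((nr, nc))
    return region


def centroid(region):
    """Mean (row, column) position of the pixels in the region."""
    n = len(region)
    return (sum(r for r, _ in region) / n, sum(c for _, c in region) / n)


# Hypothetical thermal image in degrees Celsius: a cooler wet patch
# surrounded by warmer ground.
thermal = [
    [20.0, 20.1, 20.0, 20.2, 20.1],
    [20.1, 17.0, 17.2, 20.0, 20.2],
    [20.0, 17.1, 16.9, 20.1, 20.0],
    [20.2, 20.0, 20.1, 20.0, 20.1],
]
leak_region = grow_region(thermal, seed=(1, 1), tolerance=0.5)
leak_centre = centroid(leak_region)
print(sorted(leak_region), leak_centre)
```

The seed position and tolerance are the parts that need human judgment or a separate detection step, which is one reason such pipelines are not fully automated.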

E. HYDROPHONE
Hydrophones are listening devices designed to be used underwater. These devices are placed into the water at a convenient fitting such as a fire hydrant or another outlet along the pipeline. The device is particularly useful in environments with high background noise. In plastic pipes, the sound energy is absorbed by the elastic pipe material, but the water-borne waves travel a large distance. In this scenario, the hydrophone is a good choice [8]. One of the benefits of using a hydrophone is that it can be connected to a fire hydrant. As there are fire hydrants at regular intervals in a city, it is easy to connect and disconnect a hydrophone for regular leak detection check-ups [34]. Khulief et al. have investigated the feasibility, potential, and limitations of in-pipe hydrophones [35]. They observed that the frequency band of the leak acoustic signature depends on the leak size, that on the downstream side of the port the acoustic energy of the leak signal drops to a lower value, and that the leak signal becomes noticeable for line pressure above one bar.
Sadeghioon et al. have explored the possibility of deploying smart wireless devices to detect leakage. Smart wireless devices are meant to stay connected to the pipe for a long time, and water can corrode the hydrophone sensor. To counter this problem, they have used a relative pressure sensing method based on a force-sensitive resistor (FSR) [36]. One of the major hurdles of a sensor network is the cost of the sensor, in this case the cost of the hydrophone. Due to the emergence of micro-electro-mechanical systems (MEMS) sensors, researchers are investigating the possibility of using cheap sensors in these systems. Xu et al. have investigated the potential of low-cost MEMS hydrophones. They used a custom-fabricated Mo-AlN-Mo piezoelectric stack. The results look promising on the equivalent noise density vs. frequency curve. The device has an acoustic sensitivity of −180 dB and a bandwidth of 10 Hz to 8 kHz [14].

F. NOISE LOGGER
Noise loggers or acoustic loggers are programmable data loggers to which different types of sensors can be attached. Generally, they are deployed in large numbers at underground valves, 200 to 500 meters apart. To minimize background noise, they are programmed to operate between 2 a.m. and 4 a.m. Every night, they start at their specified time and listen for sound. If a logger detects any sound, it transmits an alert to a receiver [34]. Noise loggers can also store data in memory for further development of the algorithms. Noise loggers have their drawbacks too. Zahab et al. have pointed out that the city of Montreal was recording false alarms from the installed noise loggers [37]. To improve the detection accuracy, the researchers incorporated Decision Tree, Naïve Bayes, Deep Learning, and genetic algorithms to develop an augmented approach. The researchers were able to reach one hundred percent accuracy in distinguishing leak noise from non-leak noise [37].
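The basic operating logic of a noise logger described above can be sketched as a night-time listening window combined with a loudness check. This is a minimal sketch under stated assumptions: the RMS threshold and the sample values are hypothetical, and a real logger would apply its vendor's own detection criterion rather than this simple rule.

```python
import math
from datetime import time

# Listening window taken from the description above (2 a.m. to 4 a.m.).
WINDOW_START = time(2, 0)
WINDOW_END = time(4, 0)


def in_listening_window(now):
    """True if the given time of day falls inside the nightly window."""
    return WINDOW_START <= now < WINDOW_END


def rms(samples):
    """Root-mean-square level of a block of sensor samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def should_alert(now, samples, rms_threshold):
    """Alert only when listening is scheduled and the noise level is above the threshold."""
    return in_listening_window(now) and rms(samples) > rms_threshold


# Hypothetical readings and threshold, for illustration only.
noisy_block = [0.5, -0.5, 0.5, -0.5]
quiet_block = [0.01, -0.01, 0.01, -0.01]
print(should_alert(time(3, 0), noisy_block, rms_threshold=0.1))   # inside window, loud
print(should_alert(time(14, 0), noisy_block, rms_threshold=0.1))  # outside window
print(should_alert(time(3, 0), quiet_block, rms_threshold=0.1))   # inside window, quiet
```

A fixed threshold like this is exactly the kind of rule that can produce false alarms, which motivates the classifier-based approach of [37].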

G. FLOW SENSOR
The flow sensor is the simplest way to detect leakage in a WDS. It is based on the conservation of mass principle: the sum of the input volume of water should equal the output volume of water. Rahmat et al. have used the simplest algorithm to implement a leakage detection system based on flow sensors [38]. Because flow meters have measuring tolerances, implementing the system in real life is difficult. In [39], researchers have used the city of Harare's water utility data to train an ANN model to predict water consumption. The deviation between predicted and actual consumption has been used to predict leakage in a section. Similarly, in [40], researchers have used fuzzy logic to detect leakage in a WDS. Locating leakage using flow sensors is challenging, and Narayanan et al. have used network structure and static properties to detect leakage in WSN [41].
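The mass-balance idea, together with the meter tolerance problem mentioned above, can be illustrated with a short sketch. It is not the algorithm of [38]; the function name, the tolerance model (a fractional error applied to every meter, combined in the worst case), and the volumes are assumptions made for illustration.

```python
def leak_suspected(inflow_litres, outflows_litres, meter_tolerance):
    """Compare metered inflow with the sum of metered outflows.

    meter_tolerance is the assumed fractional error of each meter
    (for example 0.02 for 2%). A leak is suspected only when the
    imbalance exceeds the worst-case combined measurement error.
    Returns (suspected, imbalance_litres).
    """
    total_out = sum(outflows_litres)
    imbalance = inflow_litres - total_out
    allowed_error = meter_tolerance * (inflow_litres + total_out)
    return imbalance > allowed_error, imbalance


# Hypothetical district with one inlet meter and three customer meters.
print(leak_suspected(1000, [400, 350, 200], meter_tolerance=0.02))
print(leak_suspected(1000, [400, 380, 200], meter_tolerance=0.02))
```

With 2% meters, the second case shows a 20-litre imbalance that cannot be distinguished from measurement error, which is why small leaks are hard to confirm from flow data alone.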

H. PRESSURE SENSOR
Pressure sensors measure water pressure in a network and are usually installed at intervals to monitor whether the network pressure stays at its optimum level. If the pressure is too high, the pipe network could burst, and low pressure could cause problems for household users. Usually, these sensors are used in conjunction with the pump to maintain a steady flow to the end user. To detect leakage events in water distribution networks (WDN), [41] provides an optimization methodology based on a hybrid information-entropy approach. Optimization-based approaches are commonly used in the literature; nevertheless, they are limited by time-consuming processes. As a result, to reduce the computing cost, researchers exclude sections of the choice space. To explore the whole choice space, this paper presents an information theory-based strategy that uses Value of Information (VOI) and Transinformation Entropy (TE) approaches in combination with an optimization model. In [42], researchers have used the NSGA-II genetic algorithm and simulated data to develop a model that requires a minimum number of sensor nodes and minimum time to detect a leakage. An issue with pressure sensors is that they cannot detect minor leakages, as a small leakage does not cause a considerable pressure change in the system.

I. OPTICAL FIBER
A fiber-optic sensor can be used in several ways to detect a leak in a pipeline. When a leak occurs, the temperature and pressure of the fluid change, and the hoop strain of the pipe wall also changes. Liang and his team have used distributed optical fiber technology for pipeline corrosion and leak monitoring [43], [44]. In their research, they used optical frequency domain reflectometry to measure hoop strain. When a leak occurs, the hoop strain slumps rapidly. Yang and his team have focused on multiple leak detection along the pipeline using optical fiber [45]. Similarly, when a leak occurs, there is a temperature change along the pipeline. By scanning the temperature profile of the entire length of the pipeline, the researchers pinpoint the anomaly location. With this technology, a 55 km long pipeline can be monitored in under ten minutes [46]. These distributed temperature sensors are also used in dikes for leak detection [47]. The benefits of fiber-optic sensors include immunity to electromagnetic interference. As these fibers do not conduct electricity, they can be deployed in volatile gas pipelines, and they can survive a harsh environment [48]. One major weakness of this technology is that, currently, it cannot be used in branched networks, and the price of fiber optic is very high.

J. COMPARISON OF TRADITIONAL TECHNOLOGIES
So far, we have discussed nine common technologies used for leakage detection. After considering all the factors of the traditional methods, we can see that each method has pros and cons. A relative comparison of the described leak detection methods is presented in Table 1.
In this section, we have discussed how researchers use sensors to detect leakage. We can observe that leakage detection research is evolving. IR, acoustic, and vibration-based methods are gaining more traction, and a move toward non-invasive techniques is observed. Initially, the leakage detection methods were completely based on experience. Then we have seen algorithm and threshold-based methods, and currently, a push toward ML-based algorithms can be observed. In ML-based techniques, researchers can use different kinds of water supply data and fuse them together to obtain more accurate leakage detection methods. In particular, researchers consider different feature extraction approaches and diurnal patterns in the leakage detection algorithm. Researchers are also exploring different communication technologies to make leakage response quicker and minimize water loss. After discussing all the previous work, we believe there is still room for improvement. Further research is required to understand the research gaps and develop an efficient leakage detection method. In the next section, we use a systematic review of the papers to explore leakage detection in detail.

III. RESEARCH METHODOLOGY
In this section, studies related to water leakage detection and leakage location are identified. A systematic review was conducted to gain a clear understanding of the research trends and interest in water leakage detection and of where the academic community is moving. It also reveals the research opportunities in this sector. We have limited our systematic review to water leakage detection research and its technologies, and for this, we have developed four research questions. We have searched the literature according to these research questions.

A. RESEARCH QUESTION
The purpose of this investigation is to provide answers to the following questions:
Two reviewers worked on the project, and a search of all databases yielded 3892 results. The databases returned many unrelated results because the academic repositories lack advanced search capabilities. As a result, only the first 200 most relevant results from these databases were included.

C. SEARCH QUERIES
All the repositories returned many unrelated articles, so we went through the title, abstract, and conclusion before downloading each article. The search queries we used when searching the academic databases are listed below:

D. INCLUSION AND EXCLUSION CRITERIA
After the electronic search, carefully specified inclusion and exclusion criteria were used to refine the results. The inclusion and exclusion criteria are listed in Table 3. All the studies that satisfied these criteria were downloaded.
We have recently explored the online repositories. The term ''leakage detection'' covers a broad range of literature, such as industrial leakage detection of fluids, data leakage detection in information technology, and even blood leakage detection in the human body. So, the query was intentionally designed to return results related to water leakage technology, algorithms, and the use of IoT in water leakage detection. Table 2 summarizes the selection process of literature. A total of 9892 papers, including journal and conference papers, are selected. Due to the lack of an advanced search option, many unrelated articles also showed up. As a result, we only included the first 200 most relevant results from these datasets.
A total of 853 articles were included in the dataset. All duplicated articles were removed from the list. Due to access issues, we could not access 29 papers, and these items were removed. Then the titles and abstracts were read, and the articles irrelevant to this research were removed. Journal articles were given more emphasis than conference articles. If the literature aligned with this research, it was added to the literature database; otherwise, the paper was discarded. This brought the number of articles down to 211. Articles with no new ideas, no implementations, or only very basic and generic information were removed. Finally, the full text was analyzed, which reduced the number of articles to 47. The remaining articles were thoroughly studied and used to answer our research questions.

E. SELECTION OF DATA
The data were assembled to answer the research questions. The steps are shown below:
Step 1: In the first step, the selected literature was classified according to its content. Selected papers are listed in Table 5, along with publication year, type of publication, Google Scholar citation number, source, whether the research paper is based on simulated or experimental data, and the journal name.
Step 2: In the second step, the research papers were summarized. Each summary records the sensor used, the type of data collected, the methodology followed, the outcomes produced, and the wireless technology used.
Step 3: Finally, the IoT components like the sensors and the communication technology and protocol are compared in detail.

F. QUALITY ASSESSMENT FOR SYSTEMATIC REVIEW
The quality of the papers is appraised using two ways to guarantee that they are trustworthy and appropriate for this systematic review. The first method is evaluation questions (QE). This questions are adopted from previous SLRs [49,50] and systematic reviews [51]. Each of the question is answered by no (0), partially (1) and yes (2). Each reviewer can provide a score of 0 to 2 to each of the five questions.
The papers are assessed with quality evaluation questions to find suitability for systematic review. Two reviewers scored the papers with five questions. Each question's score is a two-mark cumulative score, with a maximum score of four. Highest grade a publication can get is 20. Both reviewers agreed that a cumulative score of 0 to 10 would be regarded as a failure, requiring the article to be removed from consideration. A pass is a score of 11 to 20, indicating that the work is extremely suitable for inclusion in the review. Table 4 summarizes the quality rating of the publications included in the systematic review.
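The scoring rule described above can be written out as a short sketch. The function names and the example marks are hypothetical; only the 0/1/2 answer scale, the five questions, the two reviewers, and the pass range of 11 to 20 come from the text.

```python
VALID_MARKS = (0, 1, 2)   # no, partially, yes
NUM_QUESTIONS = 5
PASS_MARK = 11            # scores of 11 to 20 pass, 0 to 10 fail


def quality_score(reviewer_a, reviewer_b):
    """Sum both reviewers' marks over the five evaluation questions (0 to 20)."""
    for marks in (reviewer_a, reviewer_b):
        if len(marks) != NUM_QUESTIONS:
            raise ValueError("each reviewer must answer exactly five questions")
        if any(mark not in VALID_MARKS for mark in marks):
            raise ValueError("each mark must be 0, 1, or 2")
    return sum(reviewer_a) + sum(reviewer_b)


def passes_quality_check(score):
    """True if the cumulative score is in the pass range."""
    return score >= PASS_MARK


# Hypothetical marks for two papers, for illustration only.
strong_paper = quality_score([2, 1, 2, 1, 2], [2, 2, 1, 1, 2])
weak_paper = quality_score([1, 1, 1, 0, 1], [1, 1, 2, 1, 1])
print(strong_paper, passes_quality_check(strong_paper))
print(weak_paper, passes_quality_check(weak_paper))
```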
A total of 47 papers passed the threshold mark. Table 4 shows a snapshot of the papers that passed the quality assessment tests. From Figure 3, we can see that the assessment scores are almost equally distributed. Further details of the selected papers are discussed in Section IV.
Table 5 contains the basic information about the publications: reference ID, author's name, year of publication, type of article, number of citations, publisher name, and journal name. All of the articles were published in peer-reviewed journals or conferences. Figure 4 gives a bibliographic overview and shows the number of articles published per year. Overall, the interest of the research community is increasing year over year. Research dipped in 2021, probably because of the worldwide pandemic. In this review, we have primarily tried to include peer-reviewed journal articles, followed by conference publications. Out of 47 publications, 8 were conference proceedings and the rest were journal articles. A more detailed discussion of the selected articles is presented in the next section. Table 6 shows the information on the analyzed publications. This table contains the sensors used, the type of data used, the experiment type, whether the experiment detects and locates a leak or not, the methodology used, and the IoT technology used.
The researchers have used a variety of sensors for capturing data. The most common ones are flow, pressure, acoustic, and accelerometer sensors, which together comprise 60% of the research. Some researchers have used unconventional sensors like optical fiber, vision systems, and TENG, and very few have used simulated sensor data. Out of 47 articles, 30 were only experimental, 7 were only simulated, 2 used historical data, and 8 were both simulated and verified by experiment. We can see a greater emphasis on experimental implementation in leakage detection tasks.

B. ANALYSIS OF GATHERED DATA
Based on how the sensor is attached to the pipe, leakage detection can be divided into two types: invasive and non-invasive. Examples of invasive sensors are inline pressure and flow meters and hydrophones. Installing an invasive sensor is expensive because it requires rework of the pipe network. We can see from Table 6 that there is a shift toward non-invasive techniques, and 67% of the research focuses on non-invasive sensor data. Vibration-induced data like accelerometer data, contact audio, and image-based leakage detection techniques are getting more attention. The most common way of processing audio and vibration data is to apply the short-time Fourier transform (STFT) or to decompose the signal into intrinsic mode functions (IMFs) to obtain the feature space, and then to use thresholding or some form of ML classifier to separate leak from non-leak sound. Details of the methodology are discussed in section 5. A steady increase in leak location research alongside leak detection can be observed: 90% of the research focuses on leak detection, and 55% of the research studies leak location. Compared to leak detection, leak location is difficult as it requires signal measurement and comparison in the microsecond range. This precision was not possible on low-end hardware. As DSP-based MCUs become cheaper, a surge in leak location research is observed.
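The STFT-then-threshold pipeline described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the method of any reviewed paper: the frame length, hop size, frequency band (400 to 600 Hz), threshold (0.5), and the function names are all assumptions chosen for the example.

```python
import cmath
import math


def stft_magnitudes(signal, frame_len, hop):
    """Return one list of DFT magnitudes (bins 0..frame_len//2) per Hann-windowed frame."""
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        seg = [signal[start + n] * window[n] for n in range(frame_len)]
        mags = [
            abs(sum(seg[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n in range(frame_len)))
            for k in range(frame_len // 2 + 1)
        ]
        frames.append(mags)
    return frames


def band_energy_ratio(frames, fs, frame_len, f_lo, f_hi):
    """Feature: share of spectral energy inside [f_lo, f_hi] Hz, averaged over frames."""
    ratios = []
    for mags in frames:
        total = sum(m * m for m in mags)
        band = sum(m * m for k, m in enumerate(mags) if f_lo <= k * fs / frame_len <= f_hi)
        ratios.append(band / total if total > 0 else 0.0)
    return sum(ratios) / len(ratios)


def looks_like_leak(signal, fs, frame_len=64, hop=32, f_lo=400.0, f_hi=600.0, threshold=0.5):
    """Threshold classifier on the band-energy feature (all defaults are illustrative)."""
    frames = stft_magnitudes(signal, frame_len, hop)
    return band_energy_ratio(frames, fs, frame_len, f_lo, f_hi) > threshold


def tone(freq_hz, fs, n_samples=256):
    """Synthetic test signal; real inputs would be accelerometer or audio samples."""
    return [math.sin(2 * math.pi * freq_hz * n / fs) for n in range(n_samples)]


fs = 4000  # Hz, assumed sampling rate
print(looks_like_leak(tone(500, fs), fs), looks_like_leak(tone(100, fs), fs))
```

In practice, the single band-energy feature would be replaced by a richer feature vector, and the fixed threshold by a trained classifier, as the reviewed studies do.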
The Internet of Things (IoT) is an umbrella term for networked data collection devices. Information is the key element of computing and the internet, and the focus of IoT is to automate its collection [88]. Table 6 shows a broad range of IoT devices used in this field of research. IoT devices are designed to serve a broad market. Many low-power communication methods have been suggested to fulfill the requirements of IoT devices. Depending on the transmission and receiving range, there are several kinds of IoT devices on the market. For example, Bluetooth devices are used for short-range communication, Wi-Fi and Zigbee devices are for medium-range communication, and Nb-IoT, GSM, and LoRa devices are suitable for long-range communication. We can see that researchers are combining two different technologies in their research: they use low-power sensor nodes with a central node that handles all long-distance communication. Some articles use optical fiber for sensing and communication, but research in this area is very limited.
Figure 5 shows the types of sensors used and their frequency in the research. A total of 12 types of sensors are used in this research. Some researchers have used multiple sensors to increase reliability and accuracy [11], [61], [63], [85], [90], [95]. It should be noted that our study focuses on leakage detection, location, related sensors, algorithms, and communication technologies. We have therefore excluded water quality-related sensors like turbidity and pH sensors, as they are irrelevant to our research [11], [63], [80], [90]. Five types of sensors dominate the research field: flow, pressure, accelerometer, acoustic, and camera sensors. Vibration sensors and their derivatives, like acoustic sensors and accelerometers, account for 36% of the sensors in the research. The reason for the high uptake of vibration-based sensors is that they are non-invasive and do not require rework of the pipe network. These sensors can be attached to the outside of the pipe. The generated data is generally time-series vibration data and is easy to process with generic signal processing algorithms. Apart from regular sensors, researchers are also exploring the characteristics of new sensors. TENG, visible light and thermal cameras, custom piezoelectric sensors, and optical fiber are some of the new routes the researchers are exploring.

2) DATA TYPE
We can observe two data collection procedures: single-shot and time-bound sampling. Single-shot data in our review means that the information is captured all at once. IR [28], [57], visible light [52], and GPR [87] images are examples of single-shot data. In the case of optical fiber, single-shot data are taken for each section along the length so that the whole pipe can be checked for leakage [59]. The last one is the water sensor data [63], [85]. The water sensor depends on the conductivity of the water: if water is present, the sensor triggers the controller. The second type of data is time-domain sampling, which means that data is captured for a duration of time. Signal processing and algorithms are then applied to the captured signal. We have observed two types of time-domain signals: discrete and continuous. The relationship between the signal types we have observed is presented in Figure 6. In our review, discrete means sampling for a short amount of time, and continuous means sampling all the time. In general, we have observed that researchers take short-time samples of vibration and acoustic signals, but take continuous readings from pressure, flow, and ultrasonic range-based sensors. This is mainly the case when researchers use the diurnal pattern as a reference point [68], [69] or consider conservation of mass [11], [62] for leakage detection. The rest of the acoustic and vibration-based detection relies on discrete-time data. In [67], the researchers converted raw acceleration data to an image, essentially converting time-series data to an image. This blurs the line between the two classes.
Figure 7 shows the experiment type and its classifications. 64% of the literature had an experimental element, 15% of the research had only simulated data, 17% had simulated data and algorithms verified by experimental results, and only 4% of the research work used historical data to develop their algorithm. Overall, we can see an emphasis on experimental data.
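The percentages above follow directly from the article counts reported earlier in this section (30 experimental only, 7 simulated only, 8 simulated and experimentally verified, and 2 based on historical data). A quick check, with dictionary keys chosen only for readability:

```python
# Article counts as reported in the review (47 articles in total).
counts = {
    "experimental only": 30,
    "simulated only": 7,
    "simulated and experimentally verified": 8,
    "historical data": 2,
}

total = sum(counts.values())
shares = {name: round(100 * n / total) for name, n in counts.items()}

for name, pct in shares.items():
    print(f"{name}: {pct}%")
```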

3) EXPERIMENT TYPE
The dominance of experimental data in leakage detection is because of the variety of sensor types and the pipe condition. Researchers have organized their experiments in three ways. The first one is the location of the pipe. Researchers have used a straight or looped pipe above ground to conduct their experiment [53], [54], [69], [74], [77], [79], [85]. Shukla et al. have experimented on underground water supply pipes to comprehensively understand the environmental dynamics [66].
The second way concerns the portion of the pipe network considered. The most common approach is to set up a long straight line or a loop, flow water through the pipe, and capture the data. Li et al. and El-Zahab have used a portion of the pipe to collect data [76], [77]. Shukla et al. and Okosun et al. have used an isolated pipe network to conduct the experiment [66], [74]. On the other hand, [59], [62], [68], [77], [79] have used a small section of the whole system to gather information. Similarly, Zhou et al., Alves et al., and Lai et al. used a scaled-down model of a real case to conduct their experiments [57], [70], [87]. Sohaib et al. used a full-sized vessel to conduct their experiment [75]. The third type is actual data from industrial or real-life setups. Some researchers were able to gain access to industrial sites; Bao et al. used a power plant as their experimental ground [52]. Depending on the access to the network, the experiments varied significantly. Kang [60].
In the case of simulation and experiment, the researchers have used mathematical calculation to establish their model and then experimented to verify its accuracy and suitability. In [55], [56], [58], [67], [73], [80], and [83]
Figure 8 shows the experimental objective of the research. 47% of the research has focused on leak detection, 42% has considered both leakage detection and location, and only 11% focuses on leak location alone. For leakage detection, there are many techniques available. For acoustic and vibration-based sensors, decomposing the signal to obtain the feature space and classifying the features with an ML classifier or a preset threshold is the most common technique to detect leakage [53], [54], [58], [65], [66], [75], [82], [83]. The second approach applies the conservation of mass to a system over a period of time to check for leakage. Systems with flow sensors can capture water consumption patterns and the cumulative volume of water to detect leakage [11], [61], [69], [81]. Leakage detection is comparatively easier than leakage location. For locating leakage, Predescu et al. systematically manipulated the water valves to generate a sensitivity matrix that can be used to locate the leakage [85]. Similarly, Sophocleous et al. have proposed a two-stage calibration for detecting leakage [84]. Vision [52], infrared [28], [52], [57], and GPR [87] based systems have an inherent benefit for leakage location, as these devices need to be pointed at the leakage. The three most prominent leak location techniques described in the literature are based on the arrival time at acoustic or vibration sensors [72], [75], [83], [86], on the difference in sound intensity, and on hydraulic [61], graph-based [82], or ML model [73], [84] calibration.
Figure 9 shows the major methodologies, their components, and their relationships. From Figure 9, we can see four major ways the researchers have processed the data. 8% of the researchers have used image-based methods, 30% have used either acoustic or vibration-based methods, 26% have used flow-based methods, and 22% have used pressure-based methods. Some of the researchers have used multiple sensors to increase accuracy.
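The arrival-time approach to leak location mentioned above can be illustrated with a minimal sketch. It assumes two sensors bracketing the leak at a known spacing D, a known wave speed c in the pipe, and a delay τ (arrival at sensor 1 minus arrival at sensor 2) estimated by cross-correlation; the distance from sensor 1 is then d₁ = (D + cτ)/2. The function names, the sampling rate, the wave speed, and the synthetic signals are assumptions for illustration, not values from the reviewed papers.

```python
import random


def estimate_delay_samples(x1, x2, max_lag):
    """Lag (in samples) by which x1 trails x2, found by brute-force cross-correlation."""
    if len(x1) != len(x2):
        raise ValueError("signals must have the same length")
    n = len(x1)
    best_lag, best_val = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        val = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                val += x1[j] * x2[i]
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag


def leak_distance_from_sensor1(delay_s, spacing_m, wave_speed_m_s):
    """d1 = (D + c * tau) / 2, with tau = t1 - t2 in seconds."""
    return (spacing_m + wave_speed_m_s * delay_s) / 2.0


def delayed(source, k):
    """Copy of source delayed by k samples (zero-padded at the start)."""
    return [0.0] * k + source[: len(source) - k]


# Synthetic leak noise reaching sensor 1 after 30 samples and sensor 2 after 10.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(400)]
s1, s2 = delayed(noise, 30), delayed(noise, 10)

fs = 10_000          # Hz, assumed sampling rate
wave_speed = 1200.0  # m/s, assumed propagation speed in the pipe
spacing = 10.0       # m, assumed sensor spacing

lag = estimate_delay_samples(s1, s2, max_lag=50)
print(lag, leak_distance_from_sensor1(lag / fs, spacing, wave_speed))
```

One sample at this assumed rate corresponds to 0.1 ms, or 6 cm of position error at the assumed wave speed, which shows why fine timing resolution matters for leak location.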

5) METHODOLOGY
The first one is image-based. Only Bao et al. have used image processing [52]; the rest of the researchers have tried to quantify the observed change. Bao et al. have used a combination of IR and visible light cameras [52], and the rest of the research focuses on the spatial [57] and temporal [28] characteristics of the thermal images. Only preliminary thresholding and transformation are observed here. GPR signal-based images are a new addition in this area, and we have not observed any automated process for detecting leakage from them [87].
The second one is the methodology for vibration and acoustic sensor data. These sensor data are processed in three steps. At first, the signals are decomposed, then feature vectors are made and finally compared with preset thresholds or fed to an ML algorithm to classify the data.
Pressure sensors are generally installed at intervals along the WSN to ensure the pipelines are within the expected pressure range. High pressure can rupture pipelines, so the whole system is monitored to balance the water pressure. One issue with using a pressure sensor to detect leakage is that it cannot detect small leakages. To address this issue, Kim et al. have used a cumulative integral to detect small leakages [86]. As leakage changes the pressure distribution of a significant part of the WDN, researchers have used genetic algorithms along with other algorithms to detect and locate leakage [84].
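To illustrate why accumulating small deviations helps with small leaks, the sketch below uses a one-sided cumulative sum (CUSUM) on pressure readings. This is a generic, well-known change detector swapped in for illustration; it is not a reconstruction of the cumulative-integral method in [86]. The baseline, slack, and alarm values are hypothetical.

```python
def first_cusum_alarm(pressures, baseline, slack, alarm_level):
    """Return the index of the first alarm for a sustained pressure drop, or None.

    The statistic accumulates (baseline - reading - slack) and is clamped at zero,
    so isolated noise is forgotten while a small persistent drop keeps adding up.
    """
    s = 0.0
    for i, p in enumerate(pressures):
        s = max(0.0, s + (baseline - p) - slack)
        if s > alarm_level:
            return i
    return None


# Hypothetical readings: steady 50.0 for 20 samples, then a 0.5 drop that a
# simple per-sample threshold of, say, 1.0 would never flag.
readings = [50.0] * 20 + [49.5] * 20
print(first_cusum_alarm(readings, baseline=50.0, slack=0.2, alarm_level=2.0))
```

The slack term sets how large a drop is ignored as noise, and the alarm level trades detection delay against false alarms; both would need tuning on real network data.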
The last way is the application of flow sensor-based data. The primary theory is the conservation of mass. Researchers are considering water consumption patterns in their model to detect leakage [68].
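The conservation-of-mass check can be sketched as a simple volume balance over one accounting period. The function name, the litre units, and the tolerance value are assumptions for illustration; in a real deployment the tolerance would have to cover meter inaccuracy and legitimate unmetered use.

```python
from typing import Sequence, Tuple


def mass_balance_check(inflow_l: float, outflows_l: Sequence[float],
                       tolerance_l: float) -> Tuple[float, bool]:
    """Compare supplied volume with the sum of metered consumption.

    Returns (imbalance, flagged): the unaccounted volume in litres and whether
    it exceeds the tolerance, which would suggest a leak in the monitored zone.
    """
    imbalance = inflow_l - sum(outflows_l)
    return imbalance, imbalance > tolerance_l


# Hypothetical one-hour volumes: one supply meter and three consumer meters.
imbalance, flagged = mass_balance_check(1000.0, [300.0, 250.0, 380.0], tolerance_l=50.0)
print(imbalance, flagged)
```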

6) COMMUNICATION
Of 47 publications, 17 (36%) used IoT technology for communication. This research comprises both homogeneous networks and hybrid networks. Wi-Fi was the first choice at 35%, followed by cellular IoT modems at 25%, LoRa modems at 15%, and other technologies at 20%; the remaining 5% are not defined.
Wi-Fi is the most common and widely available technology. Bao et al. have used Wi-Fi for video transmission and robot control [52]. In the rest of the research, the Wi-Fi modem is only used as a low-power sensor node [11], [15], [68], [85], [91], [95]. The second most popular option is the cellular network: 3G/4G, GSM, and Nb-IoT [63], [69], [70], [73], [81]. These modems use the cellular network and have extended coverage. One issue with GSM and Nb-IoT networks is that users need to pay for the services. LoRa is a network that can cover a large area and does not need any service charge to use. SigFox is a comparable low-power wide-area technology, and Gericke et al. and Pérez-Padillo et al. used it in their research [71], [90], [95]. Shihari et al. have not clearly defined the wireless modem they are using [80]. The choice of solution depends entirely on the power consumption, nodal distance, and sampling frequency.
In the next section, we discuss the research questions in detail.

V. DISCUSSION
The research questions are answered here based on the analyzed data.

A. HOW ARE THE LEAKAGE DETECTION RESEARCH TECHNIQUES EVOLVING OVER TIME?
In the past decade, most leakage detection procedures depended on human experience. With the advent of powerful microprocessors and efficient signal processing techniques, a massive shift toward automation can be observed in leakage detection techniques. For this systematic review, we looked for papers published between 2016 and 2022 that studied communication technologies, new algorithms, and sensor implementations utilized in water leak detection and location research. The observations are listed below:
• We can see an emphasis on experimental research. Thirty articles were based on experimental data, eight were based on simulation and then verified by experiment, seven were based only on simulation, and only two were based on historic data.
• The incorporation of IoT in leakage detection has increased from 2018 onward. Before 2018, five articles incorporated IoT in leakage detection research, and after 2018, it jumped to thirteen articles. IoT in water leakage detection is a very niche field. Out of all research, only four articles gave a complete solution that incorporated algorithm development and IoT integration. The rest of the publications gave the bare minimum execution of the IoT.
• A leakage produces visual, thermal, acoustic, and vibration signatures outside the pipe and pressure and flow differences inside the pipe. Vibration and acoustic-based methods comprise 30% of all research, and flow and pressure-based methods account for 48% of the studies. Image, optical fiber, TENG, and other novel methods make up the rest of the research. We believe vibration and acoustic methods are widely used because these sensors are easy to install on a pipe compared to pressure and flow sensors, which require inline installation.
• Visual and thermal signatures are the latest addition to leakage detection. We saw them first in 2017, and in 2022 two publications used image and IR-based technology. Optical fiber is a promising technology, but it is the most expensive option and cannot be used on a branched network.
• TENG is in the early stage of development and will require substantial research work before it can be implemented in mass numbers.
• We can see a shift from rigid algorithm-based techniques to more ML-based approaches. From 2020 onward, 40% of all research used some form of ML algorithm. In the reviewed studies, ML algorithms generally report higher accuracy than traditional methods.
• In the case of IoT, there is no one to dominate in the leakage detection field. IoT is relatively new in this domain, and the application scenario is extensive.

B. WHAT KIND OF SENSORS ARE BEING USED?
Table 7 summarizes the characteristics of the different sensors used for leakage detection. For each sensor, there are multiple models available, and we have chosen one of them to make the comparison table. Accelerometer and hydrophone sensors capture vibration-induced data like noise and oscillation in the pipe and water. Deformation of a piezoelectric crystal due to vibration is the main working principle of these acoustic and vibration sensors. On the other hand, MEMS sensors work by the change in capacitance between a microstructure and a fixed plate and are widely used in mobile phones, tablets, and other electronic devices.
There are two types of accelerometers used in the research field. The first one is a digital accelerometer, and the second one is a piezoelectric element-based analog sensor. The digital sensor Wilow AX-3D has many advanced features, including wireless data transfer and control and a built-in battery for mobile applications [77]. The analog sensor data is captured using a capture card, and then further processing is done. Wang [53], [54], [66], [73], [79].
A hydrophone is a microphone system that can detect sound waves under water. As it remains inside the water, it is less prone to environmental noise. It is like a stethoscope but underwater. Piezo [65] and MEMS [15] based hydrophones are available.
A water sensor, commonly known as a moisture sensor, detects the presence of water. The main problem with this sensor is that it needs to get wet to detect the presence of water, which limits the cases where it can be reliably used. Che et al. have used it in their IoT-based leakage detection system [63].
When leakage occurs in a water distribution system, the average pressure at that point changes. This pressure change also affects the nearby nodes. This is the principle of leakage detection via pressure sensors. Pérez-Padillo et al. have used a pressure sensor in a field experiment to design their algorithms [90].
The conservation of mass can be used to detect leakage. The sum of inbound water should equal the sum of outbound water, and this is the backbone for using flow and range sensors. Range sensors are used in the literature to measure the water volume in the overhead tank, and flow sensors are used in the links to measure water consumption. Gautam et al. have used a range sensor to measure the depth of the overhead water tank to estimate water volume [68]. The flow sensor is one of the popular choices for water leakage detection. There are two types of flow sensors: non-contact ultrasonic sensors and inline hall sensor-based sensors. Lin et al. have implemented an ultrasonic flow sensor based on propagation time and transmission time difference [69]. The second type includes the commercially available YF-S201, which uses a hall sensor to detect water flow. Ali [11], [63], [80], [85]. Coelho et al. have used a similar sensor but a different model, YF-B2 [70].
On paper, the flow sensor might look easy to integrate into the system and to develop an algorithm for, but due to its accuracy of ±5 to 10%, there is a high chance of false alarms.
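The false alarm risk can be made concrete with a rough worst-case bound. The sketch assumes that the stated accuracy is a percentage of the reading and that the errors of the inflow and outflow meters can add up in the worst case; both are assumptions for illustration, as are the function names and the example flow values.

```python
def worst_case_imbalance(inflow: float, outflow: float, rel_accuracy: float) -> float:
    """Largest apparent imbalance that meter error alone could produce.

    Each reading may be off by rel_accuracy times its value, so the two errors
    can add up to rel_accuracy * (inflow + outflow) in the worst case.
    """
    return rel_accuracy * (abs(inflow) + abs(outflow))


def is_credible_leak(inflow: float, outflow: float, rel_accuracy: float) -> bool:
    """Flag a leak only when the measured imbalance exceeds the error bound."""
    return (inflow - outflow) > worst_case_imbalance(inflow, outflow, rel_accuracy)


# Hypothetical readings in L/min with 10% meters on both sides of a pipe section.
print(worst_case_imbalance(20.0, 20.0, 0.10))
print(is_credible_leak(20.0, 19.0, 0.10), is_credible_leak(20.0, 14.0, 0.10))
```

Under these assumptions, a leak of a few litres per minute is indistinguishable from meter error, which is consistent with the observation that flow-based methods are better suited to larger losses.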
We have seen the use of some novel sensors in the literature, like PVDF piezoelectric sensor [74] and triboelectric nanogenerator [78], but they are not commercially available. The researchers have fabricated these sensors themselves. In the articles, pH [11], [80] and turbidity [11], [63] sensors are also mentioned but are not included in the table because they are used for water quality information and are not related to leakage detection.
The MEMS accelerometer sensors are mass-produced and suitable for deployment in large quantities. From the table, we can see that the MEMS accelerometer is the best choice of sensor, as it consumes the least power, is small in size, has a reasonable cost, and is commercially available in large quantities.

C. WHAT KIND OF ALGORITHMS ARE USED IN DATA PROCESSING?
The algorithms used for leakage detection can be grouped into two groups. The first one is simple signal processing and thresholding-based, and the other is ML-based.
We can see in Figure 11 that researchers have used image-based leakage detection and location methods. Three types of input images can be seen: infrared or thermal images [28], [52], [57], visible light images [52], and GPR images [87]. In [28], [57], and [87], researchers have only characterized the spatial and temporal variation in the image due to temperature change over time and tried to identify the fingerprint of the underground leakage. Bao et al. have used visible light and IR cameras along with Otsu thresholding and black-hat and white-hat (top-hat) transforms to detect and classify the leak location automatically [52]. None of the image-based papers have used machine learning models for classification.
Figure 11 shows the flow of acoustic and vibration-based models. Signal decomposition is done in three major ways: time-series data is transformed to obtain frequency-domain data; EMD, EEMD, and VMD are applied to obtain harmonic components (IMFs); and PCA is applied to obtain the spectral band envelope. The data is then converted to feature space and finally passed through ML models to get classification results.
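As an illustration of the Otsu thresholding step mentioned above for image-based detection, the following standard-library sketch picks the gray level that maximizes the between-class variance of an 8-bit image histogram. It is a generic textbook version, not the pipeline of [52]; the top-hat transforms are not shown, and the pixel values are synthetic.

```python
from typing import Sequence


def otsu_threshold(pixels: Sequence[int]) -> int:
    """Return t such that pixels <= t are background and pixels > t are foreground."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    w_bg, sum_bg = 0, 0.0
    best_t, best_var = 0, -1.0
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue              # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break                 # everything is background from here on
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        between_var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


# Synthetic flattened thermal image: cool background and a warmer patch.
image = [10] * 50 + [200] * 50
t = otsu_threshold(image)
mask = [p > t for p in image]
print(t, sum(mask))
```

The resulting binary mask marks candidate regions; a real system would still need morphological cleanup and some rule or classifier to decide whether a region is an actual leak.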
Flow meters are used on the basis of the conservation of mass: the mass of water entering the system and the mass of water leaving it must be equal. Vhimis et al. and Wang et al. use flow data along with a preset value to detect leakage [61], [62]. Table 8 shows the uptake of ML models in leakage detection. SVM is the first choice, and neural networks (CNN, ANN, and AlexNet) are the second choice, followed by DT and RF. Although it is possible to feed the raw signal to the ML classifier directly, we have not observed it here. Researchers have extracted features from the audio and vibration-based signals and then fed the feature vector into the classifier. The feature vector reduces the dimensionality of the signal and makes it easier for the ML model to classify the data. Xu [54], [66], [70], [77].
ML-based models are easy to train and, in the reviewed studies, perform better than their signal processing and thresholding-based counterparts. ML models are likely to lead the future in this sector.

D. WHICH COMMUNICATION TECHNOLOGIES ARE CURRENTLY BEING USED?
There are many devices with the same technologies. For example, ESP8266 and ESP32 are Wi-Fi SoCs that use similar technology but support different integrated peripherals. In this section, we have chosen one IC from each category to make the comparison. IoT devices are designed to serve a wide market. For example, Wi-Fi was designed to replace high-speed wired Ethernet, Bluetooth was primarily designed to stream audio, Zigbee was for controlling home and office appliances, Nb-IoT and GSM technology were for machine-to-machine communication, and LoRa was for long-range, low-power communication. Table 9 shows a comparative analysis of the available IoT technologies. It is up to the researchers to think about the use cases and use the appropriate technology for their purpose.
In IoT research, Wi-Fi was chosen 35% of the time, followed by cellular IoT modems (25%), LoRa modems (15%), and other technologies (20%). We can see that Wi-Fi is the most popular choice for IoT applications. As Wi-Fi was the most common form of wireless device before the popularity of IoT, the vendors designed their devices to meet the Wi-Fi requirement. About one-third of the research papers have used Wi-Fi for their application.
The chips and modems come in two forms. SIM900L and RFM95W are modems and require an external microcontroller to operate. On the other hand, ESP8266 and CC2530 are SoCs with a built-in controller, which keeps the footprint small.
The operating frequency plays an important role as higher RF frequencies are blocked by walls, trees, and other obstructions, but lower RF frequencies are less susceptible to this kind of issue. That's why Nb-IoT and LoRa have lower operating frequencies than other technologies. Antenna size and frequency have an inverse relationship. So, lower frequency means longer antennas. On the other hand, data transmission speed is another vital factor that gives us information about how long it will take to send data to a base station. LoRa has the lowest data transfer speed, followed by Bluetooth, Zigbee, Nb-IoT, GSM, and Wi-Fi.
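The inverse relationship between antenna size and frequency can be illustrated with the quarter-wavelength rule, L = c / (4f), a common approximation for a simple monopole antenna. The two frequencies below are illustrative examples only (868 MHz is one sub-GHz band used by LoRa devices, and 2.4 GHz is a common Wi-Fi band); they are not taken from Table 9.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0


def quarter_wave_length_m(freq_hz: float) -> float:
    """Approximate length of a quarter-wave monopole antenna in free space."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    return SPEED_OF_LIGHT_M_S / (4.0 * freq_hz)


for label, freq_hz in (("sub-GHz, 868 MHz", 868e6), ("Wi-Fi, 2.4 GHz", 2.4e9)):
    print(f"{label}: {quarter_wave_length_m(freq_hz) * 100:.1f} cm")
```

Real antennas are usually somewhat shorter than this free-space figure because of the conductor's velocity factor and any loading, so the values are only a first estimate.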
If the device is only a sensor node and only requires periodic transmission, then LoRa or Zigbee is a good choice. On the other hand, if the device needs to transmit a bulk amount of data, then Wi-Fi is the best choice. Another two crucial factors are power consumption and the range of the device. LoRa and Nb-IoT have a similar range of 10 km, but the Nb-IoT modem has a higher data rate.
There is no single ideal wireless technology for leakage detection. We will see a mixture of wireless technologies for different use cases. For example, for an apartment building where there is already a Wi-Fi infrastructure in place, we will see Wi-Fi node-based leakage detection sensors. For small homes or farms, LoRa will be an ideal choice, and for monitoring remote locations, Nb-IoT will be used. For the research community, an ideal IoT chip needs to have very low power consumption during operation, ultra-low deep sleep current, and long-distance communication capability.

VI. CONCLUSION
Leakage detection and localization is the first step in reducing water loss in a network. The more quickly a leak is detected, the sooner the maintenance crew can fix it and save this precious resource.
In this paper, the authors have reviewed the existing technologies and the current trend in the domain. There are three main parts in this field, namely data collection via sensors, analyzing the data with an algorithm, and sending the result to the server via a communication link. An MCU is used to manage these three tasks on site.
Flow and pressure sensors have lower sensitivity and are thus an ideal choice for burst detection. This leaves room for background leakage or small leakage detection tasks. Vibration sensors and their derivatives, like acceleration sensors, contact microphones, and hydrophones, are getting more attention as they are highly sensitive and cheap compared to other sensors. We can see an uptake of the acceleration sensor in leakage detection studies as it is the cheapest among all the sensors. It is mass-manufactured, requires very low power, and is small in size.
In this study, we observed three ways in which researchers have implemented their algorithms. The first one is a simple threshold-based algorithm. The second one is a practical algorithm based on feature extraction and thresholding, and the last one is an ML-based algorithm. Out of these three types of algorithms, ML has far superior performance. One catch is that an ML-based algorithm requires preprocessing and feature extraction and generally requires more computation power as well as more memory. Despite this drawback, researchers are leaning toward machine learning models to achieve higher accuracy in leakage detection in a water supply network. ML will play a vital role in this field.
In this review, we can see three trends on the communication side. The first one is the Wi-Fi-based network, the second one is the cellular network, and the third one is the hybrid network. Wi-Fi is the popular choice for IoT modems because it is cheap and does not require monthly subscription costs. Wi-Fi and cellular-based sensor nodes can connect directly to the internet without any intermediary and report if any issue is found. Depending on the user's need, a heterogeneous mixture of short-range and long-range devices will dominate in this field.
We hope to see more sensor node-based implementations of leakage detection technologies in the future. These sensor nodes will have some form of vibration sensor and ML models built into them, thus resulting in higher accuracy. We will also see a move away from rigid algorithm-based techniques and toward over-the-air updates on the MCU. These over-the-air updates will help the nodes update the ML model. IoT will enable close to real-time water leakage detection. These steps will make reliable leakage detection technology possible at large scale.
MOHAMMED REZWANUL ISLAM received the B.Sc. degree in engineering from the Bangladesh University of Engineering and Technology, in 2016, and the Master of Engineering degree from Charles Darwin University, Australia. He is currently a Ph.D. Researcher with the College of Engineering, IT and Environment, Charles Darwin University. His research interests include the Internet of Things, sensor networks, embedded systems, artificial intelligence, and machine learning.
SAMI AZAM is currently a Leading Researcher and a Senior Lecturer with the College of Engineering and IT, Charles Darwin University, Casuarina, NT, Australia. He has several publications in peer-reviewed journals and international conference proceedings. His research interests include computer vision, signal processing, artificial intelligence, and biomedical engineering.
BHARANIDHARAN SHANMUGAM is currently a Research Intensive Senior Lecturer with the College of Engineering and IT, Charles Darwin University, Australia. He is attached to the Energy Resources Institute where his primary role is to apply his research skills in securing the digital assets of the critical infrastructure, including smart grids and Virtual Power Plants. He has a large number of cyber related publications in various journals and conference proceedings. His research interest includes cybersecurity. He is keen to ensure the next generation cyber experts are trained suitably to combat the varying threat landscape.
DEEPIKA MATHUR is currently a Research Fellow with the Northern Institute, Charles Darwin University, Australia, and is based at the Alice Springs Campus. Her research interest includes examining ways regional towns can be made more sustainable and healthier through the built environment. In particular, she has been conducting research on minimizing construction waste generation and ways of recycling and reusing this waste in regional towns, such as Alice Springs.