Information-Based Search for an Atmospheric Release Using a Mobile Robot: Algorithm and Experiments

Finding the location and strength of an unknown hazardous release is of paramount importance in emergency response and environmental monitoring; thus, it has been an active research area for several years known as source term estimation (STE). This paper presents a joint Bayesian estimation and planning algorithm to guide a mobile robot to collect informative measurements, allowing the source parameters to be estimated quickly and accurately. The estimation is performed recursively using Bayes’ theorem, where uncertainties in the meteorological and dispersion parameters are considered and the intermittent readings from a low-cost gas sensor are addressed by a novel likelihood function. The planning strategy is designed to maximize the expected utility function based on the estimated information gain of the source parameters. Subsequently, this paper presents the first experimental result of such a system in turbulent, diffusive conditions, in which a ground robot equipped with the low-cost gas sensor responds to the hazardous source simulated by incense sticks. The experimental results demonstrate the effectiveness of the proposed estimation and search algorithm for STE based on the mobile robot and the low-cost sensor.

already known, or the hazardous material may be detected visibly.It can be more challenging when the location of the hazardous material is unknown and the material cannot be detected using visual methods.In this paper, a joint estimation and search planning algorithm is devised, which will simultaneously search for and estimate important parameters of a release using point observations of concentration on a mobile robot.
Potential responses to a harmful atmospheric release include mapping, boundary tracking, source localization, or source term estimation/reconstruction (STE) [6].Mapping and boundary tracking both aim to provide a spatial approximation of the contaminated area.However, these approaches are limited by the dynamics and size of the phenomena and would either require many sensors or a lot of time to produce a map.Regardless of the mapped area that is likely to have changed by the time, a map is produced.Furthermore, difficulties would be incurred with large amounts of noise, intermittent sensing, or by the splitting up of contaminated regions.Source localization is a more realistic approach [7]- [9], although this provides neither the information about the spread of hazardous material nor the quantity of the emission.Instead, STE methods will estimate the location of the release and the strength of the source.With this information, an atmospheric transport and dispersion (ATD) model can be used to approximate the spread of contamination, and it will be possible to forecast the future and long-term hazard, including the estimates of deposition [10].
Estimation of the source term of an atmospheric release is most popularly dealt with using a large network of static concentration sensors and meteorological stations as reviewed in [6] and [11].On the other hand, the development of smaller sensors and intelligent autonomous robots means that the mobile platforms, such as unmanned ground or aerial vehicles equipped with various sensors, are the modern approach to perform sensing tasks.Applied to environmental monitoring tasks (see [12], [13]), mobile platforms overcome the issues, such as maintenance, powering, networking, positioning, and costs of large static networks of sensors.For source estimation, mobile sensors are preferred, because a single sensor has the potential to solve the entire problem by searching more desirable measurement locations and collecting more informative, spatial-temporal data.
Using mobile robots to search an area for objects of interest has received considerable research in recent years, where This work is licensed under a Creative Commons Attribution 4.0 License.For more information, see http://creativecommons.org/licenses/by/4.0/many problems can be formulated as a probabilistic search to account for uncertainties within the search environment.For example, multiagent search of predefined land patches for static targets has been investigated in [14] using the info-gap approach to deal with the severe uncertainty associated with the priori information.Cooperative search for multiple stationary ground targets using a group of unmanned aerial vehicles has been investigated in [15], where a probability map is constructed and updated based on unmanned aerial vehicle measurements.More recent works have endeavored to combine the probabilistic search and target localization problem in a unified framework.For example, in [16], the multirobot-multiobject exploration and tracking problems are solved jointly with probabilistic guarantees on the tracking performance.For challenging dynamic target search and tracking, an adaptive online cosearch approach has been proposed in [17] using a distributed sampling model in Bayesian estimation.
The STE problem using a mobile robot can be considered as a probabilistic search or a cognitive search according to a pioneering work [18].However, it differs from the above-mentioned work that focuses on target search and localization.The overarching goal is to collect more useful information for STE of a release rather than deploying the robots to locations where the targets are likely to be.In this sense, the philosophy of this paper is more similar to the work of searching unknown transient radio sources using mobile robots [19], where a comprehensive sensing model is required to bridge the sensor readings and the sources.The unique challenges associated with an atmospheric release are the uncertain parameters in the dispersion model and the unreliable readings from the sensor due to turbulence.Therefore, the search planning needs to be considered jointly with STE.
In this paper, we develop the probabilistic search algorithm for an atmospheric release using an information-based approach.Bayes' theorem is used to estimate the source term parameters including the source location, where all the parameters in the dispersion model are given probability distributions to account for their uncertainties.The Bayesian estimation is implemented using a sequential Monte Carlo algorithm.In probabilistic search planning, the Kullback-Leibler divergence between the current and future posterior distributions is used as a measure of the expected information gained by the robot maneuver.This was inspired by the literature on an optimal experiment design [20], [21] and informative path planning [22].The value of the expected information is approximated using importance sampling techniques associated with the STE process.The proposed algorithm is assessed experimentally, using smoke from burning incense sticks as a source, fans to create wind, and a robot equipped with a metal-oxide (MOX) gas sensor.
The contents of this paper can be summarized by several technical contributions that complement and facilitate one larger, more sincere, practical contribution.The latter is the fact that, to the best of our knowledge, we have produced the first online experimental STE results using robot cognitive search in realistic conditions, which paves the way for deploying this algorithm in response to an accidental release or attack of chemical substances in the atmosphere (e.g., see the scenario in [23]).Technical contributions that facilitated this are as follows.
1) Inspired by the literature on STE [6], an informationbased search algorithm is developed to accommodate the uncertainty in all dispersion parameters of the release, with key ones being the wind speed, direction, and the diffusivity.
2) The sensor model used in the algorithm is extended from discrete particle encounter measurements (see [18]) to the continuous space of a low-cost sensor.More importantly, a novel likelihood function is designed to accommodate the intermittent reading of the low-cost sensor.3) A modified dispersion model is also used to cater for an uncalibrated MOX sensor to reflect an expected voltage reading instead of concentration.This is an important step, as the calibration of MOX sensors is difficult and affected by numerous factors [24].4) This is arguably the first experimental study of an information-based search that does not use a thermal source or assume that the strength is of a known quantity, such that the release is turbulent and the meteorological conditions are inconsistent.Finally, the overall experimental setup is simple but effective, using inexpensive sensors and a safe, easily accessible source.We hope that it will benefit other STE researchers enabling quick development and testing of algorithms outside of simulations.
The remainder of this paper is organized as follows.First, more relevant works are reviewed in Section II.A formal description of the problem is given in Section III.In Section IV, the methodology is described, including the conceptual search solution, modeling required to implement the conceptual solution, and the sequential Bayesian implementation.Section V outlines the experimental setup and describes the robot searcher and the sensing environment.An illustrative run and numerical results of the experiments are presented in Section VI.Finally, conclusions and future work are given in Section VII.

II. RELATED WORK
Autonomous search, with the goals of localizing chemical leaks, sources of odor, or further understanding search patterns observed in nature, has been a popular subject of research for some time.Search is a quotidian task for animals during foraging, hunting, or finding a mate.Due to the large amount of applications in nature and the extremely efficient and successful searches observed, many search algorithms have been biologically inspired.In the absence of sensory cues, evidence suggests that the searches in nature are often random, for example, Brownian motion and Lévy walks have been observed in marine predators [25] and wandering albatrosses [26].However, when sensing cues are available, such as odor, the movement of the searcher is adapted to proceed more efficiently toward the source.On a microscopic scale, gradient-based approaches are used, for example, Escherichia coli bacteria [27] use chemotaxis to move toward the greatest supply of energy.On a larger scale, sensing can be more sparse, caused by a weak source, greater distances, or insufficient mixing [28].In these conditions, chemotaxic strategies are abandoned as irregular gradients, and intermittent sensing causes them to lose performance or fail.Animals that search in these conditions adopt alternative methods, for example, the male silkworm moth uses wind and odor information when searching for a female mate releasing sexual pheromone [29].The incredible efficiency and somewhat systematic search path performed by the moth have inspired search algorithms such as the surge and cast approach [30].
Most biologically inspired search strategies can be regarded as reactive, where observations trigger predefined movement sequences to localize a source [31], [32].Alternatively, approaches have been developed based on a fusion of probabilistic and information theoretic principles, otherwise known as cognitive strategies [18].Recent cognitive search strategies make decisions online, formulated as a partially observable Markov decision process (POMDP) [33].The POMDP framework utilizes state, action, and reward.For our problem, the state refers to the current knowledge about the source, the actions are potential future measurement locations, and the reward is a quantity to describe the gain in information supplied by the corresponding action.Infotaxis is a cognitive search strategy proposed to be effective in the sparse sensing conditions where the gradient-based approaches would be unsuitable [34].Assuming that the environmental parameters and the source strength were known, Bayes' rule was applied to update a probabilistic map of source location throughout the search, in response to sparse sensory cues in the form of particle encounters with a sensor [35].Considering only onestep-ahead maneuvers on a square lattice, the most informative actions were selected based on minimizing the expected entropy of the posterior distribution, with an adaptive term to bias the searcher's movements toward the source, as levels of uncertainty were reduced.The strategy showed robustness to significantly sparse conditions and has thus inspired several studies proposing modifications and extensions [36], [37].
A critical extension of the algorithm was its implementation in the sequential Monte Carlo framework, using a particle filter, alleviating its grid-based implementation and allowing the source strength to be included in the parameter space [18].This was essentially estimating the source term of the release.In this paper, the focus was on removing the assumption that the strength was known, so a few details on the performance of the strength estimate were provided.
Other strategies to perform source estimation with a mobile sensor include a genetic algorithm with an expert system for sensor planning [38] and Markov chain Monte Carlo (MCMC) sampling after a predefined sweeping path [39].A pioneering work of using multiple robots for STE has been seen in [40].The information-based probabilistic approaches are preferred in this paper, as they take into account the utility of the next measurement when making maneuver decisions.In simulations and on experimental data sets-based studies, information-based search planning strategies have been shown to outperform conventional approaches such as a uniform sweep [41].However, experimental results of STE performed online using the mobile sensor are yet to be found.Besides the simulated data, the previous work has used the experimental data sets, whereby the artificial searcher could move to neighboring locations to take a new measurement.This was done on the data set collected in a turbulent water channel [41] and for the radiological data set [18].
Note that there have been several source localization experiments, rather than STE, that have been carried out (see [9], [42]).These methods did not estimate important parameters of the release, such as its strength, and the robots were generally initiated downwind of the source within the dispersion.Furthermore, the experiments would typically use a constant and uniform wind flow, generated within a wind tunnel, creating a well-defined plume; conditions which are rare in more realistic scenarios.There have been a few instances where localization of the source has been demonstrated in more turbulent conditions, for example, particle filter-based algorithms have been used in outdoor environments to locate a source of an airborne material using MOX sensors [43]- [45].To date, cognitive-or information-based search experiments are normally based on a thermal source with smooth dispersion, as opposed to turbulent airborne materials, and they have assumed to be known dispersion parameters and source strengths [32], [36], [46].To this end, this paper should mark the first online implementation of a cognitive search for STE using a mobile sensor, where both the location and parameters of the release are unknown.

III. PROBLEM FORMULATION Consider a flat rectangular search area
⊂ R 2 that is expected to contain a hazardous release.A robot equipped with the MOX gas sensor is to navigate within the area to estimate the release parameters, otherwise known as the source term.This will provide the necessary inputs to an ATD model to produce a forecast of the hazard.For simplicity, it is assumed that at time index k, the robot is aware of its own location within the area.In practice, this can be achieved by using a GPS or a simultaneousness localization and mapping system.
The hazardous sensor outputs a continuous reading z ∈ R + that can be related to the concentration of the hazardous material in the air.This information can be used to predict the parameters of the source, i.e., the source term.The source term can include several parameters that depend on the type of release and the models used to forecast the dispersion.In this paper, the source term of the release is parameterized by the following.
1) Cartesian coordinates of the source p s = [x s y s ] T ∈ in meters (m).2) Release rate/strength of the source q s ∈ R + in grams per second (g/s).
3) The wind speed u s ∈ R + in meters per second (m/s) and direction φ s ∈ R in radians (rad).4) Diffusivity of the hazard in air d s ∈ R + in meters squared per second (m 2 /s).5) Lifetime of the emitted material τ s ∈ R + in seconds (s).Hence, the parameter vector of the source term can be defined as The robot is to autonomously search the environment, collecting point observations z 1:k = {z 1 , . . ., z k } from the hazardous sensor at discrete time steps k = 1, . . ., k and at known locations p 1:k = {p 1 , . . ., p k }.At each time step k, the robot updates its estimates of the source parameters k by drawing the inference on the probabilistic distribution p( k |z 1:k ) and then chooses the next location p k+1 to make the next observation with the hazardous sensor by taking an action a k , such that p k+1 = p k + a k .

IV. METHODOLOGY
To solve the formulated problem efficiently, the goal is to navigate the robot to the most informative data collection locations so that the estimation of the source term can be performed more rapidly and accurately.The developed solution in this paper is twofold.First, Bayes' theorem is used to update posterior density estimates of the source parameters and uncertain dispersion variables in response to the new sensor data.Second, an information-based reward is derived to choose the next position to collect the sensor data, which is expected to provide the most information given the current posterior results.In this section, the autonomous search and estimation algorithm is described.The proposed solution is outlined first, which explains further the framework of the approach, followed by the descriptions of the models and assumptions required to implement the solution and then its algorithmic implementation.

A. Proposed Solution
This section describes the autonomous search and estimation algorithm used to guide a robot to localize and reconstruct a source of hazardous material characterized by the unknown source term vector k .The key variables of the source to be identified are its location p s and release rate q s .The remaining parameters include the wind speed u s , wind direction φ s , diffusivity d s , and the average lifetime of the hazardous material τ s .It is assumed that a good prior can be provided for those parameters but they are still included in the state vector to account for uncertainties.The robot, located at p k at time step k and equipped with the gas sensor, is to navigate the environment collecting measurements in the form of voltage readings relative to the hazard (i.e., z k ∈ R 0≤z≤5 in this paper).At each time step, the robot will choose from the admissible set of actions = {↑, ↓, ←, →}, the move a * k ∈ that is expected to yield the most information, derived as an information-based reward.
1) Estimation: A probabilistic framework is used to estimate the source parameters in response to large uncertainties in the observed data in the form of a voltage reading from an uncalibrated sensor.The current state of knowledge regarding the source parameters is represented by a posterior probability distribution p( k |z 1:k ), where z 1:k implies that the measurement data are collected at locations p 1:k , respectively.The posterior distribution is subsequently updated according to Bayes' rule in response to the new sensor data z k+1 , such that where The initial prior distributions π( 0 ) ≡ p( 0 ) of the source parameters are assumed to be given, and these can be provided autonomously through sensory data or by user input.
If information concerning the source term is available prior to the search, it can be exploited through an appropriate distribution to represent the prior knowledge known about the release.However, in the absence of information, the prior can be set to an uninformative distribution.For example, the prior distribution for the location of the source is a uniform distribution that is bounded by the domain .In subsequent iterations, the prior distributions are replaced by the posteriors to reflect the information gained from the previous sequence.
In this Bayesian inference framework, it is also assumed that the source term is constant, i.e., k+1 = k , which implies that p( k+1 2) Sensor Planning: The reward function for sensor planning is inspired by the literature on the optimal experiment design [20], where it is referred to as the utility function ϒ(z k+1 (a k )).This is used to capture the information gain on the estimate of k , given the next sensor data z k+1 after taking the action a k .Different utility functions can be adopted.Since the future measurement z k+1 is generally unknown, it is suggested that the optimal design of an experiment should be the one that maximizes the expected utility of the subsequent measurement E[ϒ(ẑ k+1 (a k ))], where the expectation is calculated with respect to the hypothetical future measurement ẑk+1 .The experimental design problem is adapted to direct a mobile sensor, where the choice of the next experiment is synonymous with the movement of the sensor.The maximization problem can be written as The expected utility of maneuver a k can be further expressed as an integral based on the probability of a future measurement ẑk+1 (a k ) and its corresponding utility ϒ(ẑ k+1 (a k )) where Z is the range of the possible future measurements at the future sampling position.In this paper, the utility function is defined as the Kullback-Leibler divergence between the predicted source term distributions before and after the measurement ẑk+1 (a k ) is taken into account, i.e., between the distributions p( k+1 |z 1:k ) and p( k+1 |z 1:k , ẑk+1 (a k )).Thus, the utility function is defined as Combining ( 5) and ( 6) leads to the following expression for the reward function: The method applied to approximate ( 7) is described in the sequential Bayesian implementation section.
The sensor planning strategy provides the full search algorithm under a single framework, which provides balanced exploration and exploitation by adapting to the state of the posterior density estimates of the source parameters.This is characterized by more explorative behavior when the posterior distributions have a widespread and are uninformative and exploitative behavior, directed toward the source, as the posterior distributions become more informative.The approach naturally moves toward the source location, as the posterior estimate becomes more certain.

B. Modeling
One of the great benefits and influencing factors of using Bayes' theorem is the ability to approach the problem from a probabilistic perspective, where variables and models can be given distributions to represent their level of certainty.In this section, the models used for the gas sensor measurement and estimated observations from a dispersion model are derived and then combined to form the likelihood function used in (2).
1) Dispersion Model: To construct the likelihood function p(z k+1 | k+1 ) used in (2), there must be a method of linking sensor measurements z k with the expected observations.To do this, a model of dispersion is required, which will provide the expected concentration at position p k produced from a hypothesized source with parameters k .Any relevant model can be used; there exist highly complex models using computational fluid dynamics or equations derived from analytical solutions to the advection-diffusion equations such as the Gaussian plume dispersion model.The model is interchangeable without any other changes to the algorithm and should be chosen to reflect the current scenario.In this paper, a particular solution to the advection-diffusion equation is adopted from [34].This is a simplified equation based on atmospheric statistics assuming homogeneous diffusion and a constant mean wind direction and speed.Although other approaches may be more accurate, this model is chosen as it is very fast running and expected to be useful in turbulent short-range conditions.The expected concentration to be read by a sensor at position p k from a source at position p s , releasing gas at a rate of q s with average lifetime τ s in an environment with mean wind speed u s , wind direction φ s , and diffusivity d s , is given by where An example of the modeled plume is given in Fig. 1, where the sensor model to be described in Section IV-B3 has been applied.From (8), the state vector of the unknown source term and meteorological parameters is k = [x s y s q s u s φ s d s τ s ] T where key parameters are the source location and strength.The remaining variables are included as uncertain parameters to increase robustness, as these variables are rarely known with absolute certainty.
2) Sensor Model: The focus of this paper is on validating an STE framework and demonstrating how a low-cost setup can be used for rapid prototyping and source estimation experiments.Therefore, a low-cost MOX gas sensor is adopted.Its output is a voltage reading, which will vary due to a change in resistance of the sensor, caused by contact with atmospheric contaminants [24].
Typically, MOX sensors can be calibrated to a known gas so that meaningful concentration measurements, in physical units such as parts per million, can be found.However, the calibrations are sensitive to uncontrollable atmospheric conditions, such as temperature, humidity, and pressure [24].In many scenarios, the atmospheric conditions can change, the equipment to measure them is not available, or the source of interest may be unknown or has not yet been calibrated to the sensor.To address this problem and make the proposed STE framework more applicable, the sensor used in this paper is not calibrated.For simplicity, it is assumed that the expected voltage reading V from the contamination is directly proportional to the concentration of the substance.Based on the dispersion model defined in (8), the proportional relationship from the expected concentration to the expected voltage follows: where α is the calibration factor.This is a reasonable assumption based on [24].While the substance is unknown or the sensor is not calibrated to the specific material, a scaled release mass A 0 = q s /α is estimated, thus resulting in the new model for the expected voltage reading With a slight abuse of notation, we continue to use k to represent the new source term where q s is replaced by A 0 .Moreover, to account for the unmodeled chemical concentration, an additive measurement noise v is assumed to associate with the expected voltage reading V (p k , k ) due to the sensor noise.
Another challenge in using this low-cost sensor is that the sensor is not specific to a particular material.There exists a positive reading by the sensor in clean air, which in this paper, is modeled as the background noise v.This also implies that the chemical concentration from the source of interest may not be picked up by the sensor when the concentration is relatively low.
3) Measurement Likelihood: The likelihood function p(z k | k ) needs to be constructed to provide the probability of the sensor reading given a source term realization.As described earlier, the observational data z k are determined by a number of factors, including the expected voltage reading V (p k , θ) and different noises.In this paper, it is assumed that both the additive measurement noise vk and the background noise v k follow Gaussian distributions, such that vk ∼ N ( vk ; 0, σk ) and v k ∼ N (v k ; 0, σ k ).The standard deviation of the background noise σ k can be obtained experimentally which is set as a constant.In the contrast, σk is more difficult to quantify.Therefore, we follow common practice in STE to set the errors as a percentage of the modeled concentration reading.
While the noise distributions can be modeled, there still exists a phenomenon to be accounted for in the sensing process due to the complicated nature of chemical dispersion and the low-cost sensor, which is the miss-detection of the sensor.To solve this problem, we define an event D to describe the case where the gas has been picked up by the sensor (D = 1) and the case where the sensor did not respond to the gas (D = 0).The probability of detection is defined as P d = Pr{D = 1}, which is a tuning parameter to be set in the experiments.Therefore, the sensor model used in this paper can be expressed as The corresponding likelihood function can be written as

C. Sequential Bayesian Implementation
The Bayesian estimation of source parameters is implemented in the sequential Monte Carlo framework using a particle filter.The output is an approximation of the posterior distribution p( k |z 1:k ), which represents the current state of knowledge about the source parameters.Given the posterior distribution in the form of a weighted random sample, the integral in ( 7) is approximated so that the expected most informative maneuver can be chosen.
1) Sequential Monte Carlo Estimation: The conceptual solution derived to estimate the source parameters is implemented using a particle filter.The posterior distribution from ( 2) is approximated by a set of weighted random samples , where T is a sample representing a potential source term and w (i)  k is the corresponding normalized weighting, such that N i=1 w (i) k = 1.Given the weighted samples, the posterior distribution is approximated as where δ(•) is the Dirac delta function.The sample weights are updated in a recursive manner by sequential importance sampling [47].At each time step, a set of new samples i=1 can be drawn from a proposal distribution q( (i) k+1 ), which should resemble the distribution p( k+1 |z 1:k+1 ).The corresponding unnormalized weights are then updated according to The proposal distribution is typically used to update the samples to the next time step for estimating dynamic states.By assuming a time-invariant source term (i.e., the source position is fixed and the release rate is constant), the proposal distribution can be assumed to be identical to the posterior at time k.This leads to a simple algorithm where [18].Due to cancellation of terms in (15), the unnormalized particle weights are updated using the likelihood function and the previous weight as follows: The sample weights are then normalized as w(i) k+1 to obtain the new approximation of the posterior.Importance sampling is carried out sequentially at each time step.This can eventually lead to only a few particles with nonnegligible weights known as the degeneracy problem.To avoid sample degeneracy, the number of effective samples is estimated by When the number of effective point estimates N e f f falls below a prespecified threshold η, the sample points are resampled.
This can lead to another problem where highly weighted particles will be multiplied many times, leading to a lack of diversity.This problem is referred to as sample impoverishment.To improve the diversity of the random samples, the resampled estimates are regularized by drawing new samples from a Gaussian kernel.The new samples undergo an MCMC move step [47], where they will be accepted with a probability proportional to their likelihood.
2) Sensor Planning: The reward in ( 7) must be integrated over values of the future measurement z k+1 .This value is unknown until the maneuver has been made.Therefore, the distribution of the hypothetical measurement ẑk+1 needs to be generated based on the dispersion model and the current estimate of the source term through the likelihood function.Based on the current sample set i=1 and using the law of total probability, the likelihood function can be approximated as a mixture model where To generate a set of samples from this distribution, M particles {ẑ ( j,i) k+1 } M j =1 can be drawn from each p(ẑ k+1 (a k )| (i)  k+1 ) based on (12), which yields a total of M N samples.
To reduce the computational load, a small number of samples , where N z N.Moreover, we set M = 1 in this paper, and hence, only one sample of ẑ(l) k+1 will be produced given a particular (l)  k value for l = 1, . . ., N z .Therefore, the distribution p(ẑ k+1 (a k )|z 1:k ) can be approximated by The detailed process of generating {ẑ Given the hypothetical future measurement ẑ(l) k+1 , the utility function ϒ(•) defend in ( 6) can be evaluated.First, based on the set of samples k , the same set of samples can be used to approximate the predicted distribution p( k+1 |z 1:k ).Then, the posterior distribution p( k+1 |z 1:k , ẑ(l) k+1 ) can be approximated by the sample set , where and the corresponding weight is updated based on the same Bayesian law ( 15) and ( 16), such that ŵ(i,l) ŵ(i,l) k+1 = 1.Thus, the utility function can be approximated as

Algorithm 1 Drawing Samples for Hypothetical Measurement ẑk+1
Algorithm 2 Select Optimal Control Action a * k At last, the expected utility function with respect to the hypothetical future measurement ẑk+1 can be expressed as The expected utility is calculated for all the maneuvers in the set ; then, the robot selects the move a * k that has the greatest expected utility.Following the maneuver, the robot takes a new observation z k+1 and the estimation and sensor control cycle are iterated until some stopping criteria are reached.The action selection algorithm is summarized in Algorithm 2.
This concludes Section IV.The estimation and sensor planning implementations describe the entire algorithm required for decision making of the robot to search for and estimate the source term of a hazardous source.All that remains is a system to take the output of the algorithm, a new position coordinate, and the robot maneuver to the desired location to take the following measurement.The robotic system and the experimental setup are described in Section V, and then, the experimental results are presented.

V. EXPERIMENT SETUP
In this section, the experimental setup used to validate the proposed algorithm using a mobile robot is described.Carrying out experiments is very challenging in this area [48].In our experiment, a robot equipped with a gas sensor moves around autonomously to estimate the location and strength of a source releasing hazardous material into the atmosphere.The smoke produced from burning incense sticks is used to simulate a hazardous release, and electric fans are used to create a wind field.The robot navigates the environment to the most informative measurement locations to make sensor observations, which are point measurements of the smoke concentration.The measurements are used to estimate the source term recursively, using the probabilistic algorithm described in Section IV-C1.At each time step, the robot moves to the position dictated by the information-based reward described in Section IV-C2 to take a new measurement.

A. Environment
The smoke produced from burning incense sticks, as shown in Fig. 2(a), is used as the simulated, hazardous material during the experiments.Incense sticks, otherwise known as joss sticks, are a popular item used by the public for aesthetic reasons, therapy, deodorizer, or for meditation.Such an accessible source enabled simple, safe, and easily repeated experiments that could be conducted in an indoor environment.An example of the highly turbulent smoke plume generated by the burning incense during the experiments is shown in Fig. 2(b).Multiple experiments are conducted with a varying number of burning sticks to analyze the response of the algorithm in different sensing conditions and to assess the accuracy of the scaled release rate estimates A 0 .  of the robot (1.8, 1.2) are indicated in Fig. 3(a).A wind field is generated, roughly along the positive x-direction, using fans to the left and to the right of the search area.A photograph of the experimental setup during an experimental run is shown in Fig. 3(b), displaying the position of the robot during a search and the source location at the bottom left.The localization of the robot is provided by an indoor positioning system (Vicon).The image on the floor is produced by a downward-facing projector that is used for visualization during the experiments and the videos provided in [49].The setup was inside a large ventilated building, large enough for there to be little effect caused by trapped smoke or wall reflectance.

B. Search Robot
A "Turtlebot" robot is adapted for gas sensing experiments shown in Fig. 4(a).An MOX gas sensor is used to sense the smoke.There are a range of sensors available, each with different sensitivities toward materials.The MQ135 gas sensor shown in Fig. 4(b) was chosen for the experiments due to its reported sensitivity to smoke.In order to improve the response of the sensor during the experiments, a CPU cooling fan was added to suck air into the sensor as is shown in Fig. 4(c).The sensor information is sent via a serial connection to a laptop on-board the Turtlebot, which sends it to a ground station, together with the associated location stamp.The ground station (Intel core i7 desktop PC) runs the sequential estimation of the source term parameters and outputs the new position command based on the online optimization.

VI. EXPERIMENTAL RESULTS
Multiple experiments are conducted to validate the algorithm and assess its behavior in response to varying source strengths.Illustrative runs are presented to show the characteristics of the cognitive search and source reconstruction with a strong and weak source.Examples of the output are shown in the form of marginal posterior density curves for all the estimated parameters in the source term vector k .

TABLE I ILLUSTRATIVE RUN PARAMETERS AND PRIORS
Table II is provided which summarizes the results of three trials each for experiments with two, four, and six burning incense sticks.Table II indicates the accuracy of the location estimate of the algorithm and the time taken to complete the search.Finally, averaged marginal posterior densities of the release strength are included to demonstrate the performance of the release rate estimate.

A. Illustrative Runs
The illustrative runs and experiments are conducted using the environment and the robot that have been described.The starting position of the robot during the runs is p 0 = (1.8,1.2).The number of random samples used in the particle filter is N = 10 000, and the number of samples used to approximate the expected utility from ( 21) is N z = 100.The probability of detection during the runs was set to P d = 0.7, and the standard deviation of the background noise was fixed at σ k = 0.005.
To initiate the experiments, prior distributions for the source parameters must be input to the algorithm.As discussed briefly in Section IV-A, the prior distributions should reflect information known about the release.To assess the algorithm in realistic conditions, it is assumed that there is a little information known about the release beforehand.The prior distributions used to initiate the illustrative runs and the experimental results were set to the values shown in Table I, where the true values are indicated if they were known.This is followed by a short discussion on the choice of the prior distributions.
1) The prior distributions for the location of the source ( p 0 (x s ), p 0 (y s )) were set to uniform within the domain.This would be equivalent to someone drawing a large rectangle to indicate "we think that the source is within here."2) The scaled release strength prior p 0 (A 0 ) was given a gamma distribution.This was used to indicate to the algorithm that the release is likely to be weak.This prior was fixed for every test, regardless of the real strength or the number of burning incense sticks.
Operationally, these should be set using meteorological sensors and information about the hazardous material.4) The average lifetime prior p 0 (τ s ), which in this case, refers to the average time taken for the smoke particles to cool and fall to the ground (in other cases, it may be a result of chemical reactions), was set to a uniform distribution as this parameter was unknown.In some circumstances, it could be given a more informative distribution based on information known about the hazard.An illustrative run using two burning incense sticks is shown in Fig. 5. Fig. 5(a)-(d) shows the path of the robot and the measurement positions, at various time steps, represented by the blue line and the blue dots.At each time step, the robot stops at the blue dot to sample, updates the estimates of the source term, and then decides where to move next.The current position of the robot is indicated by a larger, green circle, and the true position of the source is at the black circle.The large amount of small pink dots represents the N random samples that are used to approximate the posterior estimates of the source location parameters, as described in Section IV-C1.Posterior density estimates of the location of the source are shown in Fig. 6(a) and (b).The blue curve represents the estimate, the red vertical line is the true location, the tall green dashed vertical line is the mean, and the shorter lines are standard deviations.It is clear how the red line, representing the mean, is close to the peak of the density curve for the estimates in the x-and y-coordinates.The posterior estimate of the source release rate is shown in Fig. 6(c), where the red dashed curve indicates the inverse gamma prior and the blue line is the estimate.The performance of the release rate estimate is analyzed in Section VI-B, where the output is compared for different amounts of sticks.Posterior densities of the remaining parameters are shown in Fig. 7, and these parameters are mainly included to add robustness to the algorithm in the presence of uncertain meteorological conditions.
In Fig. 8, another illustrative run is shown using snapshots, where four burning incense sticks were used as the smoke source.In this run, smoke was detected by the detector much earlier in the search, causing the robot to proceed toward the source earlier on, as more information was available.Posterior densities at the end of the run for the location and strength estimates are given in Fig. 9.The sensor readings throughout the search are given in Fig. 8.The difference in the sensing conditions caused by changing the number of burning incense sticks can be seen by comparing Figs.5(e) and 8(e).

B. Numerical Results
The illustrative runs were repeated three times each for two, four, and six burning incense sticks.The autonomous stopping criteria were created based on the spread of the estimate.At each time step during the experiments, the spread of the posterior distribution was estimated as , where ζ k is the covariance of the source position particles.The results are summarized in Table II, where the mean estimates of the source location and strength are given with details about the search time.
For all runs, the location estimate was very accurate, typically within 10 cm of the true source position.In the STE literature using static networks, it is common for there to be greater error along the downwind x-direction and then the crosswind y-direction.However, in several of the experimental runs, a more significant error was seen in the crosswind direction.This was caused by the slow recovery time of the MOX gas sensor.When the robot moved from an area of very high concentration to very low, it was not reflected by the sensor, as it would still be recovering from its high reading.This is a negative property of MOX sensors reported previously in source localization experiments [9] and is a current topic of research focused on reducing and modeling the response time [50].In response to more sticks, the robot could estimate the source term more quickly.This is due to higher, more informative concentration readings earlier on in the search.The difference in search time between four and six sticks is quite small; this is from the robot still moving crosswind, even though the posterior estimate clearly showed that the source is upwind of the robots position.This behavior can be expected, as the goal of the decision making is not to move directly toward the source but also to follow an informative path that collects information about the source location, strength, and the meteorological parameters.Furthermore, by moving crosswind, the robot could gain an accurate crosswind position estimate of the source earlier on in the search from a standoff position.
To assess the strength estimate of the algorithm, one would usually compare the strength estimate directly with the true value.In these experiments, the true release strength of the incense sticks was unknown, and the sensor did not output a concentration reading.Smoke itself can be a mixture of several materials, so the composition of the material that the sensor was reading was unknown.The sensor was uncalibrated to the smoke, and the output was a voltage not a reading of concentration, meaning that only a scaled release strength A 0 could be estimated as described in Section IV-B.Upon calibration of the sensor, this can easily be adjusted to a true physical value.To assess the strength estimate of the algorithm, the outputs relative to one another using the varying amounts of sticks were compared.
The averaged marginal posterior densities of the release strength are shown in Fig. 10, using all the runs from Table II.The curves show the averaged posteriors for two sticks in   There are a number of reasons for the increased spread of the posterior for the larger release rate estimates: 1) modeled variance was increased with sensed value and the sensed value was larger with higher release rates; 2) a larger release rate leads to the possibility of a stronger source further away causing the increased spread of several posterior parameters; and 3) the final result was further from the prior distribution resulting in more spread.The strength estimate can be dependent on several of the unknown parameters due to coupling.It is beneficial for these parameters to be entered into the algorithm accurately, and however, it has been shown that the algorithm is robust to quite uninformative prior information.

VII. CONCLUSION AND FUTURE WORK
An autonomous search algorithm has been developed to navigate a robot to the expected most informative locations to estimate the source term of an atmospheric release.The output of the system can provide important details about a release or leak of an airborne material into the atmosphere, such as its location or the rate of emission.Such information permits a model to forecast the spread and deposition of the material into the surrounding area.To make the algorithm work in realistic conditions, a new likelihood function was designed to accommodate intermittent readings from the low-cost sensor, where the sporadicity was a consequence of a weak source, insufficient mixing of the airborne material, and the small size and low sensitivity of the sensor.An experimental setup was devised, which is easily repeatable and effective for early testing of STE techniques.Illustrative runs demonstrated the search behavior of the algorithm and its accuracy in location estimation.Numerical results showed the consistency of the algorithm and the effect of a stronger or weaker source.Finally, it was shown how the algorithm is able to predict the relatively scaled release strength of the source.The results mark a first for an STE algorithm running online to guide a mobile sensor.In the future, a similar configuration will be developed for outdoor tests using an aerial vehicle instead of a ground-based robot.

Fig. 1 .
Fig. 1.Example plot of the expected observation z k of the robot in a square area, produced from a source k with parameters: x s = −1.2,y s = −0.2,q s = 0.1, u s = 1, φ s = 90 • , d s = 0.1, and τ s = 2.

Fig. 2 .
Fig. 2. (a) Incense stick used as a smoke source during experiments.(b) Snapshot of the burning incense during an experiment to illustrate the turbulence.

Fig. 3 .
Fig. 3. Experimental setup used for the illustrative runs and experimental results.(a) Diagram of the environment displaying the starting position of the robot, the wind direction, and the location of the incense sticks.The red shaded area indicated the bounds where the robot can move.(b) Photograph of the experimental setup with a down facing projector for data visualization.

Fig. 5 .
Fig. 5. Illustrative run using two burning incense sticks at time steps (a) k = 7, (b) k = 32, (c) k = 58, and (d) k = 65.The green dot represents the current position of the robot that has followed the blue line trajectory and taken observations at positions indicated by the blue dots.The location of the source is indicated by a black dot, and the small pink dots represent the random samples of the estimation algorithm.(e) Voltage reading of the sensor throughout the experiment.(f) Mean and standard deviation of the release rate estimate over time.

Fig. 5
Fig. 5 demonstrates how the robot begins the search by moving in a crosswind direction.In response to a very little or no readings of smoke, the robot moves slightly upwind while proceeding to travel crosswind in the other direction.By time step 32, shown in Fig. 5(b), the pink dots have moved away from the visited locations of the robot where no smoke was seen; however, due to the large amount of uncertainty expected during the search, some dots still remain in this area in case the low or zero reading could have been caused by either sensor noise or atmospheric turbulence.By time step k = 58, shown in Fig. 5(c), the robot has narrowed down the source position, and the pink dots begin to converge into the true source location.At the end of the search, shown in Fig. 5(d), the robot

Fig. 6 .
Fig. 6.Outputs of the STE algorithm after an illustrative run with two sticks.Posterior density estimates of the location in (a) x-and (b) y-coordinates.The blue curve indicates the posterior estimate, and the red vertical line is the truth.The green dashed lines represent the mean and standard deviation of the estimate.(c) Posterior density of the scaled release rate A 0 .The blue curve indicates the posterior estimate, and the red dashed curve represents the prior distribution.

Fig. 7 .
Fig. 7. Remaining source parameter estimates at the end of the experiment.The red dashed line indicates the prior, and the blue curve is the estimate.(a) Wind direction φ s .(b) Wind speed u s .(c) Diffusivity d s .(d) Lifetime τ s .

Fig. 8 .
Fig. 8. Illustrative run using four burning incense sticks.(a)-(d) Snapshots of the experiment at different time steps k.(e) Plot of the path at the end of the search.The path followed by the robot is indicated by the blue line.The location of the source is indicated by a black dot, and the small pink dots represent the random samples of the estimation algorithm.(f) Voltage reading of the sensor throughout the experiment.(g) Mean and standard deviation of the release rate estimate over time.

Fig. 9 .
Fig. 9. Outputs of the STE algorithm after an illustrative run with four sticks.Posterior density estimates of the location in (a) x-and (b) y-coordinates.The blue curve indicates the posterior estimate, and the red vertical line is the truth.The green dashed lines represent the mean and standard deviation of the estimate.(c) Posterior density of the scaled release rate A 0 .The blue curve indicates the posterior estimate, and the red dashed curve represents the prior distribution.

Fig. 10 .
Fig. 10.Averaged marginal estimates of the release rate A 0 of the source at the end of the experiments.The red solid line indicates the prior distribution.The blue curve is the average release rate estimate after three runs with two burning incense sticks.The cyan dotted curve is the average for four sticks, and the green dashed line is for six.

TABLE II RESULTS
FOR THREE TRIALS USING TWO, FOUR, AND SIX INCENSE STICKS