Radio Frequency Signal Strength Based Multitarget Tracking With Robust Path Planning

The proliferation of technologically advanced and mobile devices poses risks to public safety and security. These threats can be mitigated by systems equipped to perform timely identification and tracking of devices and their operators. In pursuit of these capabilities, we propose an automated sensing platform designed specifically for tracking multiple, mobile radio frequency (RF) targets. There are a number of challenges involved with tracking multiple moving RF sources. We formulate the task as an iterative state estimation and path planning process, whereby the sensor platform first estimates the positions of the targets through observation of the RF environment and then plans and executes a movement path. By developing a sensor model informed by RF propagation theory, we construct a particle filter based state estimator with the potential to track multiple targets using only signal strength observations. In addition, we propose a path planning technique rooted in uncertainty minimization and safety based constraints. Finally, we validate the proficiency of the proposed methods with simulated experiments. Through analysis of tracking metrics and localization performance we show the benefits of our proposed active sensing techniques as they apply to tracking multiple RF targets. We demonstrate the robustness of our method to various environmental scenarios by testing with a multitude of realistic and challenging experimental parameters (e.g., speed of the sensor platform, number of targets, speed of targets, level of signal-to-noise ratio (SNR)). The results indicate that our method performs better than other state-of-the-art tracking methods, with significant improvements seen in the most difficult scenarios with higher speed targets. In these and other settings, our method is more often able to localize the targets and with less error and uncertainty in position estimation. 
We also show that our method is computationally efficient and scales well to an increasing number of targets.


I. INTRODUCTION
The subject of target tracking has a rich history with applications spanning a number of domains. In this paper we explore the area of multitarget tracking as it relates to radio frequency (RF) sources. The study of RF source tracking is especially relevant now that the number of devices communicating with RF signals is constantly increasing. Advances in manufacturing have resulted in devices capable of dynamic mobility and advanced computation while also maintaining an inconspicuous, low-profile form factor. In the wrong hands, or even through misguided use, these devices pose a security risk that could be mitigated with the appropriate tracking capabilities. The ability to quickly locate these troublesome devices is of the utmost importance to emergency personnel, first responders, police, military, and security organizations. A motivating example comes from a warning by the National Interagency Fire Center [1] which details the risks of operating Unmanned Aerial Systems (UAS) near wildfires. They explain how the operation of a UAS in the vicinity of a wildfire creates a risk for air-based firefighting operations. A number of incidents across the U.S. have resulted in firefighting operations being suspended until the interfering devices no longer pose a risk to the firefighters. There exist enterprise products which seek to alleviate this problem [2], [3], but they are expensive and rely on specialized hardware systems.
The associate editor coordinating the review of this manuscript and approving it for publication was Pedro Miguel Cabral.
FIGURE 1. Model of the system architecture composed of RF signal sensing, multitarget tracking with particle filter based state estimation and intelligent path planning.
The task of tracking targets in the RF domain presents unique challenges as a result of the properties of electromagnetic wave propagation. While there exist many mathematical models for describing the theoretical behavior of RF signals, real systems are subject to a number of impairments with significant impacts [4] on tracking performance. Furthermore, as opposed to other sensing modalities, RF signals can be extremely noisy, with interference from the environment and other signals.
This work focuses on the task of multitarget tracking. This presents a number of additional challenges compared to the task of single target tracking. When tracking multiple targets the method must take into account the movements of both the sensor and all the targets. It is also difficult to make observations that are informative of all target locations. Due to the nature of RF signals, there is a complicated non-linear relationship between the positions of the antennas and the associated signal power that is measured. A robust method will take into consideration all these factors as well as the relative speeds of the sensor and targets.
As seen from the system architecture model in Fig. 1, the method proposed in this work splits the problem of multitarget tracking into multiple iterative steps. The components include RF signal strength sensing, state estimation and intelligent path planning. In the state estimation step, the sensor platform uses signal observations to update its belief about the target states. The belief state is represented by a multitarget particle filter. The filter utilizes Bayesian inference with a sensor model informed by realistic RF antenna and propagation theory. In the path planning step, the sensor platform uses its current belief of the target states to determine a movement action for tracking all the targets. We propose a heuristic-based robust and efficient path planning algorithm (REPP) that is designed to minimize target uncertainty and obey safety constraints across a variety of challenging tracking scenarios. In comparison to similar RF multitarget tracking methods, we validate our proposed method against a set of more challenging scenarios where targets can travel at higher rates of speed relative to the sensor platform. We also test with other challenging conditions such as low SNR levels and additional targets. Our goal is to show the robustness of our proposed method by proving that it remains performant across a variety of environmental scenarios that could be encountered in a deployment of such a tracking system.
The major contributions of this work are as follows:
• An end-to-end belief based system for tracking multiple, moving targets using only RF emissions.
• A sensor model informed by RF antenna propagation theory which enables multitarget tracking using a single sensor platform.
• A novel sensor path planning method uniquely formulated for belief based multitarget tracking scenarios.
• An analysis of critical trade-offs relating to localization, tracking accuracy, estimation uncertainty and computational efficiency through simulated experiments.

II. RELATED WORKS
This section provides a review of related works within the domains of RF signal tracking and active sensing.

A. TARGET TRACKING
The topic of RF signal tracking has a rich history with a multitude of sensor systems and localization methods. A majority of work focuses on the tracking of a single, stationary target using a network of sensors [5] or multiple sensor measurements [6]. The resulting simultaneous signal observations at multiple locations allow for the direct estimation of a target's location using least-squares regression. Other works have attempted to extend these techniques to the tracking of multiple stationary targets using a sensor network [7]. These approaches are constrained by the assumption that the number of sensors in the network is typically much greater than the number of targets [8]. Performing tracking using a large sensor network is also costly, and requires precise synchronization of the constituent sensors.
VOLUME 11, 2023 43473 Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
Of particular relation to the methods proposed herein are those works which track targets using a single sensor platform [9], [10], [11], [12], [13]. The vast majority of these systems constrain themselves to tracking a single target, which is either stationary or moving with known dynamics. In contrast, the methods proposed here endeavour to perform mobile, multitarget tracking using only a single sensor platform receiving emissions from the target sources. The constraints imposed on this system would enable a low-cost implementation with capabilities comparable to systems of much greater cost. Such a system could be deployed quickly in unknown environments, without prior configuration. RF focused multitarget tracking has also been attempted [14], [15]. Many of these methods test on scenarios where the targets are moving slowly relative to the sensor. To be able to track UAS or other fast moving targets, additional methods and analysis are needed. Target tracking methods based on Random Finite Sets (RFS) have also been formulated for use with RF sources [16], [17]. While these approaches offer excellent tracking performance and comprehensive coverage of target dynamics they are computationally expensive and often intractable without additional assumptions and constraints.

B. RADIO FREQUENCY SIGNALS
When developing methods for RF signal processing, the exact form of the input signal plays an important role. For the task of RF target tracking, the typical signal formats are time of arrival (TOA) [18], time difference of arrival (TDOA) [19], angle of arrival (AOA) [5], [6], and received signal strength (RSS or RSSI) [10], [20]. TOA, TDOA, and AOA based methods require either multiple antennas or specialized antenna arrays [11] to perform well. They also depend on precise synchronization between the antennas in order for the measurements to be accurate. The strict requirements imposed by these methods make them costly to implement and deploy.
Alternatively, RSS is captured by a single RF receiver without the need for any synchronization or additional sensors [21]. In free space, signal strength decreases according to the inverse square of the distance between the transmitting and receiving antennas. The dependence of RSS on the locations of the relevant antennas therefore allows it to be used for RF target tracking. Signal strength is also heavily dependent on the propagation environment. Attenuation caused by fading, multipath, and other RF impairments can inject large amounts of noise into the RSS measurements [22] and influence the SNR of the signal. The path loss of an RF signal describes the attenuation caused by the distance the signal travels in addition to the previously mentioned impairment effects. It is often difficult to model the exact path loss of a signal because of its dependence on unknown characteristics of the environment [23]. The complex relationship between RSS and antenna location can make localization from raw RSS measurements inaccurate. With the proposed state estimation techniques, the uncertainties caused by RSS measurements can be mitigated in order to perform accurate target tracking.

C. PATH PLANNING AND ACTIVE SENSING
The term active sensing has multiple definitions within the domains of machine learning and RF sensing. In the context of this paper, active sensing includes the broad class of methods with a moving sensor, and more specifically sensors moving with intelligent path planning control methods. There have been numerous techniques developed for learning the decision processes responsible for intelligent motion. Active sensing methods applied to target tracking typically involve a reward formulated for localization [24]. These rewards primarily aim to minimize the uncertainty of the belief state [25] or maximize information gain [9], [26]. For certain contexts, such as those focused on security and safety, the unknown emergent behaviors could be considered nonoptimal. In this work we consider alternative optimizations that prioritize minimizing belief uncertainty and maintaining safe distances from the targets.
Within the RF domain, active sensing often refers to systems that generate and output electromagnetic energy [27] which then reflects off of the target and returns information about the target. These radar systems [12] can be used to provide precise location and movement information about the target. Importantly, the methods proposed here do not fall under this category of active RF sensing. Rather, the system relies only on receiving and processing the RF emissions originating from the targets themselves. Systems built on emission observations provide much less information than their active counterparts, but are also much cheaper to implement [28]. When combined with intelligent sensor control and state estimation techniques, the deficiencies of the lower fidelity observations are diminished to a level at which they can be used to perform mobile multitarget tracking.

A. PROBLEM FORMULATION
We formulate the problem of RF multitarget tracking as an iterative decision process combining state estimation and path planning. In this formulation, the proposed sensor platform is the agent, whose actions represent the decision process. Furthermore, the sensor platform cannot directly observe the underlying state of the environment, that is, the location of the targets. Instead, the platform maintains a belief of the underlying state informed by RF signal observations and a sensor model dictating the likelihood of an observation given a possible state. The belief is a probability distribution across all possible states. The belief state informs a heuristic-based path planning algorithm which determines the movement of the sensor platform. In the subsequent sections, we will describe the details of these components.

B. STATES
One of the principal goals of this multitarget tracking method is to ascertain the positions of the targets as they move over time. We consider a 2D environment with a fixed number of mobile target transmitters $N$. The target state is $x \in \mathbb{R}^2$, where $x$ is the position of the target in a 2D Cartesian coordinate system. Accordingly, the environment state captures the positions of all targets, $X = [x_1, \ldots, x_N]$. The sensor state $s \in \mathbb{R}^3$ contains the 2D position of the sensor and its heading. We assume the sensor has perfect knowledge of its own state.

C. SENSOR MODEL
Regarding the sensor observations made by our system, we utilize a track-before-detect (TBD) approach where the signal power measurements are used without any thresholding. In an environment with multiple mobile target transmitters, the signal power measurements at the sensor platform are determined by the log distance path loss model:
$$PL = P_{tx} - P_{rx} = PL(d_{ref}) + 10\gamma \log_{10}\left(\frac{d}{d_{ref}}\right) + X_n \tag{1}$$
where $PL$ is the path loss between the antennas, $d$ is the distance between the antennas, $P_{tx}$ is the power at the transmitter antenna, $P_{rx}$ is the power at the receiver antenna, $d_{ref}$ is a reference distance used for characterizing the power, $\gamma$ is the path loss exponent, and $X_n$ is a zero-mean Gaussian random variable with standard deviation $\sigma_{fading}$, representing the attenuation caused by shadow fading. All powers and losses are expressed in decibel units. Furthermore, the Friis transmission formula [29] details the dependence of the received power on specific variables of the system:
$$P_{rx} = P_{tx} + G_{tx} + G_{rx} + 20\log_{10}\left(\frac{c}{4\pi d f}\right) \tag{2}$$
where $G_{tx}$ is the directivity of a transmitter antenna in the direction of the receiver antenna, $G_{rx}$ is the directivity of the receiver antenna in the direction of the transmitter antenna, $c$ is the speed of light, and $f$ is the signal frequency.
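As a rough illustration of the two propagation relations described above, the following Python sketch evaluates the Friis formula and the log distance path loss model in decibel units. The function names, argument names and the example values are assumptions made for illustration; they are not taken from the authors' implementation.

```python
import math
import random

C = 299_792_458.0  # speed of light in m/s


def friis_rx_power_dbm(p_tx_dbm, g_tx_db, g_rx_db, d_m, f_hz):
    """Free-space received power (dBm) from the Friis formula in dB form."""
    free_space_db = 20.0 * math.log10(C / (4.0 * math.pi * d_m * f_hz))
    return p_tx_dbm + g_tx_db + g_rx_db + free_space_db


def log_distance_path_loss_db(d_m, d_ref_m, pl_ref_db, gamma,
                              sigma_fading_db=0.0, rng=random):
    """Log distance path loss (dB) with optional Gaussian shadow fading.

    pl_ref_db is the (assumed known) path loss at the reference distance.
    """
    fading = rng.gauss(0.0, sigma_fading_db) if sigma_fading_db > 0.0 else 0.0
    return pl_ref_db + 10.0 * gamma * math.log10(d_m / d_ref_m) + fading


if __name__ == "__main__":
    near = friis_rx_power_dbm(20.0, 0.0, 0.0, 100.0, 2.4e9)
    far = friis_rx_power_dbm(20.0, 0.0, 0.0, 200.0, 2.4e9)
    # Doubling the distance costs about 6 dB in free space (inverse-square law).
    print(round(near - far, 2))
```

With a path loss exponent of 2 and no fading, the log distance model reduces to the same inverse-square behaviour as the Friis formula; larger exponents represent more cluttered environments.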
To mitigate the difficulties of tracking multiple RF targets with a single antenna, we explicitly model the directivities $G$ of the antennas with a pre-determined radiation pattern. By modeling the directivities of the antennas our state estimation methods are equipped with prior knowledge dictating the strong effect direction has on the received signal power. Fig. 3 shows the radiation pattern for a Yagi-Uda antenna. This is a common type of directional antenna with a single main lobe and various side lobes. The main lobe indicates the intended direction for receiving signals as it corresponds to the direction with the largest and smoothest gain. The various side lobes have lower gains, which correspond to a decrease in the received signal power. As such, the directivity of an antenna with respect to a second antenna is a function of the angle between them, $G(\theta)$. For simplicity, we discretize the gain values from Fig. 3.
It follows that the likelihood of a received signal strength observation $z$ is characterized by a Gaussian distribution:
$$p(z \mid x) = \mathcal{N}\left(z;\, P_{rx}(x),\, \sigma_{fading}^{2}\right) \tag{3}$$
where $z$ is the sensor's signal strength observation. The combined observation resulting from all targets is $Z = [z_1, \ldots, z_N]$. As indicated by (1) and (2), $P_{rx}$ is a function of the positions of the targets relative to the sensor platform, in addition to antenna and signal characteristics. In practice, some or all of the dependent variables will be unknown, and will require estimation as part of the solution methods. In our case, the antenna and signal characteristics are assumed to be known a priori, leaving the target positions to be estimated.
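The combination of a discretized gain pattern and a Gaussian observation likelihood can be sketched as follows. The gain table, its 45 degree bin width, the transmit power, the frequency and the noise standard deviation are all hypothetical values chosen for illustration; the actual pattern in Fig. 3 and the parameters in Table 1 are not reproduced here.

```python
import math

# Hypothetical discretized gain pattern (dB), one value per 45-degree bin,
# with bin 0 centred on the antenna boresight. Values are illustrative only.
GAIN_BINS_DB = [15.0, 5.0, -5.0, -10.0, -12.0, -10.0, -5.0, 5.0]


def gain_db(theta_rad):
    """Look up the discretized gain for a bearing relative to boresight."""
    width = 2.0 * math.pi / len(GAIN_BINS_DB)
    index = int(((theta_rad + width / 2.0) % (2.0 * math.pi)) // width)
    return GAIN_BINS_DB[index % len(GAIN_BINS_DB)]


def expected_rss_dbm(sensor, target, p_tx_dbm=20.0, f_hz=2.4e9):
    """Expected RSS at a sensor (x, y, heading) from a target at (x, y).

    The target antenna is treated as omnidirectional (0 dB gain).
    """
    sx, sy, heading = sensor
    tx, ty = target
    d = max(math.hypot(tx - sx, ty - sy), 1e-3)  # avoid log of zero
    bearing = math.atan2(ty - sy, tx - sx) - heading
    free_space_db = 20.0 * math.log10(299_792_458.0 / (4.0 * math.pi * d * f_hz))
    return p_tx_dbm + gain_db(bearing) + free_space_db


def rss_log_likelihood(z_dbm, sensor, target, sigma_db=4.0):
    """Gaussian log-likelihood of observation z for a hypothesised target."""
    mu = expected_rss_dbm(sensor, target)
    return (-0.5 * ((z_dbm - mu) / sigma_db) ** 2
            - math.log(sigma_db * math.sqrt(2.0 * math.pi)))
```

Because the gain depends on bearing, two hypotheses at the same range but different bearings produce different expected signal strengths, which is what lets a single directional antenna discriminate between candidate target positions as the sensor turns.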
In an effort to validate the use of simulated signal strength values, we collected real signal strength measurements using a UAV and its corresponding controller. Fig. 4 shows a heatmap of the measured signal strength values versus a corresponding heatmap of the simulated values for the same locations. The simulated values were generated using (2). The similarities across both heatmaps reflect the inverse-square law in RF power levels as it applies to the distance from an emitting source. The discrepancies can be attributed to noise in the environment and multipath propagation effects.

D. STATE ESTIMATION
The belief state distribution represents the estimation of the underlying state of the environment. For the task of tracking multiple RF targets we propose the use of a multitarget particle filter [30]. Each particle is represented by a weight and a hypothesized state, $\{(x^{(i)}, w^{(i)})\}_{i=1}^{M}$, where $M$ is the total number of particles. We used the Sequential Importance Resampling [31] regime for our multitarget particle filter. This process proposes new particles by sampling from a generative model of environment state transitions and then weights the particles according to the observation likelihood function. The new particle distribution is created by resampling $M$ times according to the likelihood weights, thereby eliminating belief states which do not align with observations. Using the signal strength observation likelihood from (3), the weights $w^{(i)} \propto p(z \mid x^{(i)})$ provide a quality assessment of each particle. The likelihood values are first normalized and then used as weights for resampling with replacement of the proposed particles. The general form of the state transition model is given by:
$$x_t \sim p(x_t \mid x_{t-1}, a_{t-1})$$
where details of the state kinematics follow directly from the path planning action $a_{t-1}$, and any other assumptions made during the simulation.
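A single propose, weight and resample cycle of the kind described above can be sketched in a few lines of Python. This is a generic Sequential Importance Resampling step under simplifying assumptions, not the authors' filter: the `transition` and `log_likelihood` callables are hypothetical placeholders for the state transition model and the observation likelihood.

```python
import math
import random


def sir_step(particles, transition, log_likelihood, z, rng=random):
    """One Sequential Importance Resampling update.

    particles:      list of state hypotheses
    transition:     function(state, rng) -> proposed next state
    log_likelihood: function(z, state) -> log p(z | state)
    """
    # Propose new particles from the generative transition model.
    proposed = [transition(p, rng) for p in particles]
    # Weight each proposal by the observation likelihood.
    log_w = [log_likelihood(z, p) for p in proposed]
    max_lw = max(log_w)  # subtract the maximum for numerical stability
    weights = [math.exp(lw - max_lw) for lw in log_w]
    total = sum(weights)
    weights = [w / total for w in weights]
    # Resample M times with replacement according to the normalized weights.
    return rng.choices(proposed, weights=weights, k=len(proposed))


if __name__ == "__main__":
    rng = random.Random(0)
    cloud = [rng.uniform(0.0, 100.0) for _ in range(2000)]  # 1D toy state
    for _ in range(5):
        cloud = sir_step(
            cloud,
            lambda s, r: s + r.gauss(0.0, 0.5),         # random-walk motion
            lambda z, s: -0.5 * ((z - s) / 2.0) ** 2,   # Gaussian likelihood
            30.0,
            rng,
        )
    print(sum(cloud) / len(cloud))  # the mean settles near the observation
```

Working with log-likelihoods and subtracting the maximum before exponentiating avoids underflow when many particles are far from the observation, which is common early in tracking when the prior is broad.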

E. PATH PLANNING
The primary objective of the sensing agent in the RF multitarget tracking task is to continuously localize all the targets by minimizing the uncertainty of their belief distributions. The difficulty in this task is centered around planning a path which enables the sensor to make useful observations in pursuit of this goal. As detailed in section III-C, the signal strength observations are dependent on the positions and orientations of both the sensor and the targets. The noise in the sensor observations and motion of the targets makes the planning problem difficult. Many methods approach this problem by applying information theoretic objective functions [14] as the criterion for an action policy. While these methods offer theoretical guarantees of reducing the belief uncertainty, they are computationally expensive.
In an effort to produce methods capable of running on low-resource edge devices, we propose a robust and efficient path planning (REPP) algorithm with a low computational burden. The path planner uses RF theory, statistics and geometry to account for multiple fast moving targets and noisy observations. By observing the antenna pattern in Fig. 3, it is clear that the area with the largest gain and flattest gain gradient is at the front of the antenna. Accordingly, signal strength observations from a particular source will have the least variance when that source is positioned in the direction of the front of the antenna. It follows that an optimal path for tracking an RF target will orient the front of the antenna towards the target. While this approach is sufficient for accomplishing the primary objective of the tracking task, it fails to consider certain realities that may arise in a real implementation. When tracking multiple mobile targets it is often desired to maintain a minimum distance between the sensor and the targets for reasons including safety and counter-detection. It is also critical to ensure that all targets are able to be tracked. With fast moving targets at unknown positions, it is easy for the belief distributions to rapidly expand while the sensor is centered on another target.
When tracking multiple mobile targets simultaneously, the naive strategy must be augmented to account for the additional difficulties. With only one antenna to track multiple targets, the sensor platform must temporarily focus on a single target, then cycle its focus onto the other targets. To accomplish this, the sensor initializes a set of targets F = {n | 0 ≤ n ≤ N − 1} representing the targets to be focused upon. When a target is focused upon, its index is removed from the set, and when the set is empty it is reinitialized. Through this cyclic process, the sensor rotates its focus through every target and ensures that all targets are continually tracked.
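The cyclic focus set described above can be sketched as follows. The function name and the rule for choosing which remaining target to focus on next (here simply the lowest index) are assumptions for illustration; the text does not specify the selection order.

```python
def next_focus(focus_set, n_targets, choose=min):
    """Pick the next target to focus on and return (target, remaining set).

    When the set is empty it is reinitialized to F = {0, ..., N - 1}.
    """
    if not focus_set:
        focus_set = set(range(n_targets))
    target = choose(focus_set)
    return target, focus_set - {target}


if __name__ == "__main__":
    focus = set()
    order = []
    for _ in range(7):
        target, focus = next_focus(focus, 3)
        order.append(target)
    print(order)  # every target is visited once before any is revisited
```

Whatever the selection rule, removing each focused index and refilling only when the set is empty guarantees that no target goes unobserved for more than one full cycle.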
As mentioned previously, an optimal RF sensing strategy will orient the front of the antenna towards a target, but to maintain safe operating conditions we apply an additional criterion which dictates that a specified distance be maintained between the sensor and all targets. To accomplish this, we create a list of path proposals and then select the proposal which satisfies a threshold distance constraint with a bias towards proposals that move the sensor towards all targets. For each target currently in set F three proposals are created. In formulating the proposals, we use the means of the particle filters as estimates for the target positions.
While using the mean of the particle filter may seem to be a gross oversimplification of the possible state distribution represented by such a filter, through experimentation we observe that the particles tend to quickly converge to approximately unimodal distributions. Furthermore, the mean is cheap to compute and appears to carry sufficient information about such unimodal distributions.
Using the mean of the filter as an estimate of the target position, the first proposal directs the sensor in a path directly towards the target. The second and third proposals are created by forming a circle of radius $r_{min}$ centered at the target position and then finding the path of the sensor towards the resulting two tangents of the circle. From these three, the proposal which results in the sensor having a minimum sum of distances to all targets is selected for the next step. In the final step, all remaining proposals are checked to ensure they satisfy the safety constraint. Specifically, the particle filter is rolled out according to the proposed trajectory, and the distance between the resulting particle states and the sensor is compared to the distance threshold $r_{min}$. If the distance is less than the threshold, the sensor is within the safety buffer zone surrounding a target. The expected value of the inequality is evaluated for each target, and if the maximum of these values is less than or equal to a safety constraint threshold $S$, then the proposal is used for the sensor action $a$:
$$a = p^{*} \quad \text{if} \quad \max_{n}\; \mathbb{E}\left[\mathbb{1}\left(d(x_n, s) < r_{min}\right)\right] \le S$$
where $p^{*}$ is the selected proposal and $d(x_n, s)$ is the distance between the rolled-out state of target $n$ and the sensor.
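The three proposals and the safety check can be sketched geometrically as below. This is a simplified reading of the description above: proposals are treated as waypoints rather than full trajectories, each target is assumed to have its own list of 2D particles, and the safety check evaluates a single sensor position instead of rolling the filter out along the path. All function names are hypothetical.

```python
import math


def proposals(sensor_xy, target_xy, r_min):
    """Candidate waypoints: the target estimate itself and the two tangent
    points of the circle of radius r_min centred on that estimate."""
    sx, sy = sensor_xy
    tx, ty = target_xy
    d = math.hypot(tx - sx, ty - sy)
    points = [(tx, ty)]
    if d > r_min:  # tangents exist only when the sensor is outside the circle
        phi = math.atan2(ty - sy, tx - sx)
        alpha = math.asin(r_min / d)
        length = math.sqrt(d * d - r_min * r_min)
        for sign in (1.0, -1.0):
            a = phi + sign * alpha
            points.append((sx + length * math.cos(a), sy + length * math.sin(a)))
    return points


def select_proposal(points, target_means):
    """Pick the waypoint with the minimum sum of distances to all targets."""
    def cost(p):
        return sum(math.hypot(p[0] - t[0], p[1] - t[1]) for t in target_means)
    return min(points, key=cost)


def is_safe(sensor_xy, particles_per_target, r_min, s_threshold):
    """Maximum over targets of the fraction of particles closer than r_min
    to the sensor, compared against the safety threshold S."""
    worst = 0.0
    for particles in particles_per_target:
        inside = sum(
            math.hypot(px - sensor_xy[0], py - sensor_xy[1]) < r_min
            for px, py in particles
        )
        worst = max(worst, inside / len(particles))
    return worst <= s_threshold
```

The fraction of particles inside the buffer is a Monte Carlo estimate of the expected value of the distance inequality, so the threshold S can be read as the tolerated probability of violating the minimum distance.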

IV. SIMULATION EXPERIMENTS
A thorough analysis evaluating the effectiveness of the proposed methods was conducted through computer simulations. Considering the motivating example from section I, we strive to create a simulated environment that mimics the scenario of UAV tracking. This setup provides a challenging task which showcases the unique strengths of our method. Table 1 lists the values of various parameters used throughout the experiments. These were chosen specifically to match the physical and RF constraints present in UAV tracking. In all settings, each target has an omni-directional antenna attached statically, with negligible directivity between the target and sensor platform. Additionally, the sensor platform contains one directional Yagi-Uda antenna attached statically where the main lobe is oriented to point towards the direction of forward movement of the sensor. For the scope of these experiments, we limited our setting to 2D to lower the computational burden. The methods can be generalized to 3D without modification. All simulations are conducted in a 2D environment with unbounded dimensions where each time step is 1 second. At each time step, the sensor receives a signal strength observation from each target, and then performs a movement action. The targets move at constant speeds with the specific speed depending on the experiment. The targets randomly change their heading ±30° with probability 10% at each time step.
Path planning algorithm listing (partial):
    P ← proposals(n)
    if P ≠ ∅ then
        p* ← argmin …
        if safety_constraint(p*) then
            targetSet.remove(n)
    if targetSet = ∅ then
        …
At the start of each experiment the targets are each placed at random locations with distances to the sensor platform in the range of 50 to 100 meters. The particle filter is initialized with random positions at distances of 1 to 200 meters from the sensor platform. During operation, the particle filter resamples from the random prior with probability 1%.
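The simulation setup described in this section (initial target placement, particle initialization and the target motion model) can be sketched as follows. Several details are assumptions: distances are drawn uniformly in range rather than uniformly in area, a heading change is taken to be exactly +30 or -30 degrees, and each target is given its own particle list. The function names are hypothetical.

```python
import math
import random


def sample_annulus(center_xy, r_lo, r_hi, rng=random):
    """Sample a point whose distance from the center is uniform in [r_lo, r_hi]."""
    r = rng.uniform(r_lo, r_hi)
    a = rng.uniform(0.0, 2.0 * math.pi)
    return center_xy[0] + r * math.cos(a), center_xy[1] + r * math.sin(a)


def step_target(x, y, heading, speed, dt=1.0, turn_deg=30.0, p_turn=0.1,
                rng=random):
    """Advance a constant-speed target by one time step.

    With probability p_turn the heading changes by +turn_deg or -turn_deg.
    """
    if rng.random() < p_turn:
        heading += math.radians(rng.choice((-turn_deg, turn_deg)))
    return (x + speed * dt * math.cos(heading),
            y + speed * dt * math.sin(heading),
            heading)


def init_simulation(n_targets, n_particles, sensor_xy=(0.0, 0.0), rng=random):
    """Place targets 50 to 100 m away and particles 1 to 200 m away."""
    targets = []
    for _ in range(n_targets):
        x, y = sample_annulus(sensor_xy, 50.0, 100.0, rng)
        targets.append((x, y, rng.uniform(0.0, 2.0 * math.pi)))
    particles = [
        [sample_annulus(sensor_xy, 1.0, 200.0, rng) for _ in range(n_particles)]
        for _ in range(n_targets)
    ]
    return targets, particles
```

The occasional re-draw from the random prior mentioned above (with probability 1%) could reuse `sample_annulus` with the same 1 to 200 m range, which helps the filter recover if all particles drift away from a target.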
Each experiment is run until all targets are simultaneously localized or a maximum of 400 time steps has been reached. Targets are considered localized if the standard deviation of each particle filter dimension is below a threshold. For all experiments this threshold was set to 35 meters. Each experimental setting is repeated 100 times and the results averaged. Fig. 5 shows time-lapses of various simulation experiments. The simulations were run using a Python script on a computer with 16 GB of RAM and 10 CPU cores.
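The localization stopping criterion can be sketched as a check on the per-dimension spread of each particle cloud. The sketch assumes one list of 2D particles per target and uses the population standard deviation; whether the authors use the population or sample form is not stated.

```python
import statistics


def is_localized(particles_per_target, threshold_m=35.0):
    """True when every target's particle cloud has a standard deviation
    below threshold_m in both the x and y dimensions."""
    for particles in particles_per_target:
        xs = [p[0] for p in particles]
        ys = [p[1] for p in particles]
        if (statistics.pstdev(xs) >= threshold_m
                or statistics.pstdev(ys) >= threshold_m):
            return False
    return True


if __name__ == "__main__":
    tight = [(0.0, 0.0), (10.0, 5.0), (20.0, 10.0)]
    wide = [(0.0, 0.0), (100.0, 0.0)]
    print(is_localized([tight]))        # spread well under 35 m
    print(is_localized([tight, wide]))  # one target is still too uncertain
```

Because the criterion requires all targets to pass simultaneously, a planner that localizes one target well but lets another diverge never satisfies it, which is why the cycling behaviour matters for this metric.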

V. RESULTS
Here we present the results of our experiments. We demonstrate the robustness of our method by showing how its performance does not decrease when faced with fast moving targets, scaling the number of targets, and decreases in SNR. To illustrate the unique benefits of our proposed REPP method, we compare against LAVAPilot [15], another path planning method designed for RF multitarget tracking. To validate the efficiency of our method we compare against an information theoretic approach, where the Monte Carlo tree search (MCTS) algorithm was run using Shannon entropy to determine node value. The MCTS method determines optimal actions by continually selecting nodes in the search tree, expanding the search tree, simulating playouts, and then updating action values accordingly. We formulate the search tree using the particle filter from section III-D and a discretized action space. To demonstrate the benefits of our method we present a collection of results and analysis. First, we examine the overall performance of our method through a variety of tracking metrics. Next, we discuss the localization performance through plotting of the estimation error and variance. Finally, we consider the efficiency of the methods by documenting the time needed for path planning. Through quantitative analysis of tracking performance and computation time, we show the strengths of our proposed method in relation to other state-of-the-art techniques.
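For the information theoretic baseline mentioned above, one simple way to score a belief is the Shannon entropy of the particle cloud after binning it on a grid. The sketch below is only an illustration of that idea; the grid cell size and the binning approach are assumptions, and the text does not state how the entropy was actually computed in the MCTS baseline.

```python
import math
from collections import Counter


def belief_entropy(particles, cell_m=10.0):
    """Shannon entropy (in nats) of a 2D particle cloud binned on a square grid."""
    counts = Counter(
        (math.floor(x / cell_m), math.floor(y / cell_m)) for x, y in particles
    )
    n = len(particles)
    return -sum((c / n) * math.log(c / n) for c in counts.values())
```

A cloud concentrated in a single cell has zero entropy, while one spread evenly over k cells has entropy log(k), so lower values correspond to more certain beliefs. Evaluating such a score for many simulated futures is what makes tree search approaches costly compared with a heuristic planner.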
From Fig. 5 we see several examples of tracking experiments from start to end. In all cases the particle filter belief state quickly converges to approximately unimodal or bimodal distributions. In cases where the targets move quickly or the experiment is prolonged, it is clear that the filter distribution follows the changing states of the targets. Due to the heuristics contained in our REPP method, the sensor path typically maintains a central location in relation to the targets, and often crosses back over previous path locations. By biasing the path proposals towards the center of the targets and forcing the system to cycle target focus, the sensor path often crosses over itself multiple times. We assert that in the case of multiple, fast moving targets, this type of centralized path is crucial for tracking all targets. Without the bias and cycling the sensor can move too far from a distant target to the extent that the target state distribution diverges and localization becomes impossible.
In Table 2 we have included a collection of metrics that cover key characteristics for determining tracking performance. For these experiments we varied the speed of the sensor as well as the speeds of the targets. This setting aims to highlight the unique challenges posed by tracking targets with speeds similar to the sensor platform. The localization metric shows the probability of the method satisfying the localization stopping criterion before the end of the simulation. The specifics of the stopping criterion are detailed at the end of section IV. Here we see that REPP outperforms LAVAPilot in all but two cases, with the largest improvement being +41%. Next, the localization time shows the average time in seconds for the method to achieve the localization criterion. Again, our method outperforms the alternative in all but one case, with the largest improvement being −64.33 seconds. In the final two columns we report the root mean squared error (RMSE) and standard deviation. These metrics are reported for the final time step of the experiment. In a majority of the experimental scenarios our method has lower RMSE with the largest differences attributed to improvements with our method. Despite the standard deviation column having mixed results, the largest differences are also improvements with our method.
In situations where targets were moving quickly and at large distances apart, our method was better able to manage the overarching task of localizing all the targets. LAVAPilot would spend too much time localizing a single target while the uncertainty of the other targets' estimates increased quickly. We believe that forcing the sensor to cycle its focus between targets is essential to tracking fast moving targets. When targets were slow moving our method was able to localize the targets in fewer time steps compared to LAVAPilot. In many deployment scenarios, there is a critical need for fast localization and even seconds could be significant to the users of the system. Compared to LAVAPilot, REPP consistently localized the targets more often and in less time, with the greatest difference being over a minute.
In Fig. 6 we plot histograms representing the distribution of the duration of each experiment. Experiments are completed when the localization criterion from section IV has been satisfied or manually ended at a maximum of 400 time steps. Fast localization is desired in multitarget tracking, therefore distributions with weight in lower time steps are preferred. Counts in the last bin represent experiments in which the localization criterion was not satisfied and the experiment was manually stopped. These results show that in the majority of cases our REPP method not only satisfies the localization criterion more often but also achieves faster localization. From these results it is clear that across all experimental settings it is more challenging to localize targets with faster speeds. Even in easier tracking scenarios, where the targets are slow moving, our method achieves localization in fewer time steps.
In Fig. 7 we plot the localization error of the path planning methods. We use root mean squared distance error (RMSE) between the mean particle estimates and their respective true target locations as the metric for localization performance. From these plots we can form multiple insights about the relative performance of the two methods. While the localization error increases with target speed, our method is more robust in this case, seeing less of an increase in error compared to LAVAPilot. This represents one of the most challenging tracking scenarios, where the distance between the sensing platform and the targets can often grow. When the sensor speed is significantly faster than the target the
To further support our claims of robustness, we show that our method remains performant when scaling to additional targets and when the SNR of the received signals is decreased. In Fig. 8 we show that doubling the number of targets being tracked has no significant impact on RMSE, even with fast-moving targets. The methods generally perform the same, with REPP showing slight improvements in some cases. In Fig. 9 we show that decreasing the SNR also has no significant impact on the performance of the method. In the setting with greater SNR (i.e., less fading attenuation), our method outperformed the alternative in all cases.
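The RMSE metric described for Fig. 7 can be sketched as follows. This is an illustrative reading of the definition, not the authors' implementation: the function name `localization_rmse`, the array layout (one particle set per target), the use of normalized particle weights to form the mean estimate, and the averaging of squared distance errors across targets are all assumptions.

```python
import numpy as np

def localization_rmse(particles, weights, true_positions):
    """RMSE between mean particle estimates and true target positions.

    particles:      shape (N, P, 2), P position particles for each of N targets
    weights:        shape (N, P), non-negative particle weights
    true_positions: shape (N, 2), ground-truth target positions
    """
    particles = np.asarray(particles, dtype=float)
    weights = np.asarray(weights, dtype=float)
    truth = np.asarray(true_positions, dtype=float)

    # Normalize weights per target, then take the weighted mean position.
    w = weights / weights.sum(axis=1, keepdims=True)
    estimates = np.einsum("np,npd->nd", w, particles)

    # Squared Euclidean distance error per target, averaged over targets.
    sq_dist = np.sum((estimates - truth) ** 2, axis=1)
    return float(np.sqrt(sq_dist.mean()))

# Hypothetical two-target example with two equally weighted particles each.
example_particles = [[[0.0, 0.0], [2.0, 0.0]],   # mean estimate (1, 0)
                     [[1.0, 1.0], [1.0, 1.0]]]   # mean estimate (1, 1)
example_weights = [[0.5, 0.5], [0.5, 0.5]]
example_truth = [[4.0, 4.0], [1.0, 1.0]]         # distance errors 5 and 0
example_rmse = localization_rmse(example_particles, example_weights, example_truth)
```

With uniform weights this reduces to the plain mean of the particles, which matches the "mean particle estimate" wording above.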
In Fig. 10 we display boxplots of the maximum standard deviation from each particle filter dimension averaged across all targets. These plots show how the uncertainty of the particle filter belief state decreases throughout the experiment. An effective tracking method will have the uncertainty decrease quickly and then remain stable. Given the setting and the mobility of the targets, it is reasonable to expect the uncertainty of the filter to plateau rather than continually decrease. From these results we see that when the sensor speed is slower our method clearly outperforms the alternative. With faster sensor speeds our method has similar, if not slightly worse, performance. Considering all metrics for tracking performance, this shortcoming does not appear to be substantial in the overall comparison of the two methods. Furthermore, our method performs better in the more difficult tracking scenarios where target speeds are fast and sensor speeds are slow.
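One plausible way to compute the uncertainty statistic shown in Fig. 10 is sketched below. The function name `max_std_uncertainty`, the array layout, and the use of an unweighted population standard deviation are assumptions for illustration; the source does not specify whether particle weights enter this statistic.

```python
import numpy as np

def max_std_uncertainty(particles):
    """Largest per-dimension particle standard deviation, averaged over targets.

    particles: shape (N, P, D), P particles with D state dimensions
               for each of N targets.
    """
    p = np.asarray(particles, dtype=float)
    per_dim_std = p.std(axis=1)           # shape (N, D): spread per dimension
    per_target_max = per_dim_std.max(axis=1)  # worst dimension for each target
    return float(per_target_max.mean())   # average across targets

# Hypothetical two-target example.
example_cloud = [[[0.0, 0.0], [2.0, 0.0]],   # std (1, 0), max 1
                 [[0.0, 0.0], [0.0, 6.0]]]   # std (0, 3), max 3
example_uncertainty = max_std_uncertainty(example_cloud)
```

Evaluating this value at every time step of a run gives a curve that should fall quickly and then plateau for an effective tracker, as described above.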
In Table 3, Fig. 11, and Fig. 12 we compare REPP with LAVAPilot [15], another heuristic-based path planner, and MCTS, an information-based path planner. We include this comparison to highlight the differences between heuristic-based and information-based path planners. In general, heuristic-based methods are computationally cheaper but do not match the tracking performance of information-based methods, which include theoretical guarantees of optimality. Table 3 highlights the significant difference in computation time between the different methods. We vary N to show how computation time scales with the number of targets being tracked. In both cases, the heuristic-based methods require only a few milliseconds of computation time for planning, which is sufficient for real-time tracking systems. Conversely, the information-based MCTS requires 6.35 and 16.61 seconds, which would make real-time tracking impossible. Furthermore, REPP is computationally efficient when scaling to tracking additional targets. For REPP we observe a 160% increase in planning time, compared to a 203% increase with LAVAPilot [15].
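For reference, the relative increases quoted above follow the usual percentage-change formula. The short sketch below applies it to the two MCTS planning times given in the text, under the assumption that 6.35 s corresponds to the smaller N and 16.61 s to the larger N. The REPP and LAVAPilot percentages depend on Table 3 values that are not repeated in the text, so they are not recomputed here; the helper name `percent_increase` is hypothetical.

```python
def percent_increase(before, after):
    """Relative increase from `before` to `after`, expressed in percent."""
    return 100.0 * (after - before) / before

# MCTS planning times in seconds, as quoted in the text
# (assumed order: smaller N first, larger N second).
mcts_increase = percent_increase(6.35, 16.61)
print(f"MCTS planning time increase: {mcts_increase:.1f}%")
```

Under that assumption, MCTS planning time grows by roughly 162% while remaining on the order of seconds, whereas the heuristic planners stay in the millisecond range.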
From Fig. 11 we see that the MCTS method reaches a minimal RMSE faster than the other methods. Despite this, REPP converges to a similar error within approximately 140 seconds. In the most challenging case, where the sensor speed is slowest, REPP performs similarly to MCTS while LAVAPilot [15] has higher error. In Fig. 12 we make similar observations regarding the methods and how they influence the standard deviation of the particle filter. Again, the MCTS method minimizes the standard deviation faster than the heuristic approaches. Beyond approximately 140 seconds, REPP performs almost identically to MCTS, while LAVAPilot [15] is never able to lower the standard deviation to the same level. Despite the marginal localization improvements with MCTS, the immense planning time required for the information-based method makes it unsuitable for a real-time tracking system with multiple, mobile targets.

VI. CONCLUSION
In this paper, we have proposed a method called REPP, which combines state estimation and a path planning algorithm for the purpose of tracking multiple mobile RF targets. Using a sensor model based on realistic RF propagation conditions, the particle filter maintains a belief distribution of all target positions using only signal strength observations as input. The belief state is used by a path planning component which plans the actions taken by the sensor platform. The path planning algorithm seeks to minimize the uncertainty of the target belief distributions while also maintaining a safe distance from all targets. In comparison to state-of-the-art RF multitarget tracking methods, ours is robust to both fast-moving targets and low SNR conditions, and scales well with additional targets. Experiments demonstrate that our method is up to 41% more likely to achieve the localization criterion and up to 64.33 seconds faster at satisfying the criterion. Furthermore, our method is computationally efficient and requires much less planning time than other comparable methods. In similar conditions our method spent 0.0086 seconds for planning, while an information-based approach was much slower at 16.61 seconds. Adding difficulty to the tracking task by increasing the number of targets and lowering the SNR did not diminish the performance advantage of our method. This level of efficiency and robustness is critical to real-life scenarios where mobile, rogue RF devices need to be tracked in a timely manner in noisy environments.