Linear Complexity Gibbs Sampling for Generalized Labeled Multi-Bernoulli Filtering

Generalized Labeled Multi-Bernoulli (GLMB) densities arise in a host of multi-object system applications analogous to Gaussians in single-object filtering. However, computing the GLMB filtering density requires solving NP-hard problems. To alleviate this computational bottleneck, we develop a linear complexity Gibbs sampling framework for GLMB density computation. Specifically, we propose a tempered Gibbs sampler that exploits the structure of the GLMB filtering density to achieve an <inline-formula><tex-math notation="LaTeX">$\mathcal {O}(T(P+M))$</tex-math></inline-formula> complexity, where <inline-formula><tex-math notation="LaTeX">$T$</tex-math></inline-formula> is the number of iterations of the algorithm, and <inline-formula><tex-math notation="LaTeX">$P$</tex-math></inline-formula> and <inline-formula><tex-math notation="LaTeX">$M$</tex-math></inline-formula> are the numbers of hypothesized objects and measurements, respectively. This innovation enables the GLMB filter implementation to be reduced from an <inline-formula><tex-math notation="LaTeX">$\mathcal {O}(TP^{2}M)$</tex-math></inline-formula> complexity to <inline-formula><tex-math notation="LaTeX">$\mathcal {O}(T(P+M+\log T)+PM)$</tex-math></inline-formula>. Moreover, the proposed framework provides the flexibility for trade-offs between tracking performance and computational load. Convergence of the proposed Gibbs sampler is established, and numerical studies are presented to validate the proposed GLMB filter implementation.


I. INTRODUCTION
THE aim of multi-object tracking (MOT) is to estimate the number of objects and their trajectories from noisy sensor data. The challenges are the unknown and time-varying number of objects, accompanied by false alarms, misdetections, and data association uncertainty, culminating in computational bottlenecks for most real-world applications. Notwithstanding this, numerous solutions have been developed, with multiple hypothesis tracking [1], joint probabilistic data association [2], and random finite set (RFS) [3] being the most widely used approaches. The RFS approach, in particular, is gaining substantial interest due to its versatile multi-object state space models [4], and efficient solutions [5]. Various tractable RFS multi-object filters have been devised, ranging from the probability hypothesis density localization filters [4], [6] to the generalized labeled multi-Bernoulli (GLMB) tracking filter [7].
The GLMB filter [7] is an analytic solution to the RFS multi-object filter that provides tracks and their provisional identities/labels to meet MOT requirements [1]. The unique feature of this formulation is the provision for principled trajectory estimation (even with only single-scan filtering), capable of tracking millions of objects, online [5]. Moreover, the GLMB family of densities is furnished with elegant mathematical properties, offering a versatile set of tools including conjugacy [7], closure under truncation with analytic truncation error [8], analytic approximation of general labeled multi-object densities with minimal Kullback-Leibler divergence [9], and analytic Cauchy-Schwarz divergence and void probabilities [10]. These properties enabled MOT with multiple sensors and scans [8], [11], [12], non-standard models [9], [13]–[15], unknown system parameters [16], multi-object control solutions [10], [17], as well as distributed implementations [18]–[20]. The GLMB filter is also amenable to parallelization that reduces computation times. For instance, in [5], spatial search was used to decompose the filtering density into independent GLMBs that are processed in parallel, while, in [21], a parallel centralized implementation for multiple sensors was developed. The GLMB filter has also found a host of applications from robotics [22], sensor networks [23], [24], cell biology [14], [25] to audio/video processing [15], [26].
The main computational bottleneck in GLMB filtering is the truncation of the multi-object filtering density [7]. Truncation by discarding terms with small weights minimizes the L1-error [8], and can be posed as a ranked assignment problem, solvable by Murty's algorithm [27] with cubic complexity in both the number of measurements and hypothesized objects [28], [29]. A more efficient solution based on Gibbs sampling (GS) was proposed in [30], which incurs an O(TP²M) complexity, where T, the number of iterates of the algorithm, dominates P and M, the numbers of hypothesized objects and measurements. GS is an efficient Markov Chain Monte Carlo (MCMC) technique for sampling from complex probability distributions, popularized by the seminal work of Geman and Geman [31], which opened up applications in many disciplines ranging from statistics and engineering to computer science, and is still an active research topic.
Following the strategy of selecting significant GLMB components by random sampling [30], a number of GLMB truncation techniques have been developed. In [32], an approximate GLMB filter with linear complexity in the number of hypothesized objects was proposed by neglecting the standard data association requirement of at most one measurement per object. While this technique can be modified to accommodate the data association requirement, the complexity reverts to quadratic in the number of hypothesized objects [32]. In [33], a herded Gibbs sampling implementation of the labeled multi-Bernoulli (LMB) filter [34], a one-term approximation of the GLMB filter, was developed. However, this implementation is slower than the systematic-scan Gibbs sampling (SGS) GLMB implementation [30], not to mention that, compared to the (exact) GLMB filter, the LMB filter is more prone to track fragmentation and track switching. In [35] and [36], the cross-entropy method [37] was applied, respectively, to multi-sensor GLMB filtering and its distributed version, but with higher complexity than the SGS implementation in [11].
To alleviate the computational bottleneck in GLMB filtering, this paper proposes a tempered Gibbs sampling (TGS) framework for selecting significant GLMB components with linear complexity, i.e., O(T(P + M)). Similar to the widely known random-scan Gibbs sampling (RGS), TGS randomly selects a coordinate to update, but provides a mechanism to improve mixing and sample diversity [38]. However, generic TGS incurs an O(TP²M) complexity. To address this, we develop an innovative decomposition of the conditionals that enables TGS to be reduced to O(T(P + M)) complexity, with negligible additional memory (compared to generic O(TP²M) TGS). The samples generated by the proposed TGS algorithm converge to a tempered distribution (not necessarily the same as the original stationary distribution). However, since TGS can be regarded as a combination of importance sampling and MCMC, the importance-weighted samples indeed converge to the original stationary distribution [38]. The drastic reduction in computational complexity facilitates the development of multi-object control solutions, which require fast computation of the multi-object filtering density for online operation [39]–[42]. Moreover, our solution is not restricted to GLMB truncation, but is amenable to a wide range of applications of the ranked assignment problem [43].
The proposed TGS framework enables RGS to be implemented with O(T(P + M)) complexity, whereas generic RGS incurs O(TPM). Further, we propose deterministic-scan Gibbs sampling (DGS), an efficient algorithm using deterministic coordinate selection, as opposed to RGS's completely random coordinate selection. DGS's better mixing and sample diversity compared to RGS are validated by numerical studies. In addition, using DGS, we show how an exact implementation of SGS can be accomplished with O(TPM) complexity. We also present numerical studies in MOT to discuss the trade-offs between tracking performance and computational load in the proposed TGS framework. Due to the computation of the cost matrix, the resultant GLMB filter implementation incurs an additional O(PM) complexity. In summary, our main contribution is a TGS framework for GLMB truncation that:
• Incurs a linear complexity of O(T(P + M)), with negligible additional memory, by exploiting the structure of the GLMB filtering density; and
• Admits as special cases RGS, DGS, and SGS implementations, with O(T(P + M)), O(TM), and O(TPM) complexities, respectively.
We validate the proposed approach via a series of comprehensive numerical experiments, with considerations for computational efficiency, tracking accuracy, and sample diversity. The rest of this paper is organized as follows. Section II provides the necessary background on GLMB filtering and SGS-based GLMB truncation. In Section III, we present linear complexity GS for GLMB filtering density truncation. Numerical studies are presented in Section IV, and concluding remarks are given in Section V.
[Table I (notation summary): F(S), class of all finite subsets of a given set S; f_B, birth probability density of x given label ℓ; Γ, space of all (extended) association maps γ; γ, (extended) association map; g(z|x), single-object likelihood; g(Z|X), multi-object likelihood.]

II. PRELIMINARIES

This section presents the necessary background for the development of the main result of this article. We first outline the basics of generalized labeled multi-Bernoulli (GLMB) filtering in Subsection II-A, and then summarize the Gibbs sampling (GS) approach to GLMB truncation in Subsection II-B. Throughout this paper, single-object states and multi-object states are respectively represented by lower case letters (e.g., x) and upper case letters (e.g., X). Further, boldfaced symbols denote labeled states or distributions (e.g., x, X, π). Frequently used notations are summarized in Table I.

A. GLMB Filtering

1) Multi-object State:
In GLMB filtering, the system state to be estimated at each time is the set X of labeled states of the underlying objects, called the multi-object state [7]. Each labeled single-object state x ∈ X is an ordered pair (x, ℓ) in the product space X × L, where X is a (finite dimensional) state space, and L is a discrete space called the label space. Let B_τ denote the label space of objects born at time τ; then the space L of all labels up to time k is given by the disjoint union ⊎_{τ=0}^{k} B_τ. The state x of an object varies with time, while its label or identity ℓ (usually consisting of the object's time of birth and an index to distinguish those born at the same time) is time-invariant. The trajectory of an object is a sequence of consecutive labeled states with a common label.
The cardinality |X| (i.e., number of elements) of the multi-object state X varies with time due to the appearance and disappearance of objects. In addition, a multi-object state X at any time must have distinct labels. More concisely, let L(x) denote the label of x, and for any finite X ⊂ X × L, define the label set L(X) ≜ {L(x) : x ∈ X} and the distinct label indicator ∆(X) ≜ δ_{|X|}[|L(X)|]. Then for X to be a valid multi-object state, we require ∆(X) = 1.
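As a minimal illustration (a hypothetical Python sketch, representing each labeled state as a (state, label) pair), the validity requirement amounts to checking that no two elements of X share a label:

```python
def distinct_label_indicator(X):
    """Delta(X): 1 if all labels in the labeled set X are distinct, 0 otherwise.
    X is a collection of (state, label) pairs with hashable labels."""
    labels = [label for _, label in X]
    return 1 if len(labels) == len(set(labels)) else 0


# Labels here are (birth time, index) pairs, as described in the text.
X_valid = [((0.0, 1.0), (0, 1)), ((5.0, -1.0), (0, 2))]
X_invalid = [((0.0, 1.0), (0, 1)), ((5.0, -1.0), (0, 1))]  # repeated label
```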
In Bayesian estimation, the single-object state and the measurement are modeled as random variables. Hence, for multi-object estimation, the multi-object state and measurement are modeled as Random Finite Sets (RFSs). In the following, we describe the so-called standard multi-object models for the dynamics and observations in a multi-object system.
2) Multi-object Dynamic: For simplicity, we omit the time subscript "k", and use the subscript "+" for the next time k + 1 when there is no ambiguity. Each element x = (x, ℓ) of the current multi-object state X either survives with probability P_S(x) and evolves to state x_+ = (x_+, ℓ_+) at the next time according to the transition density f_{S,+}(x_+|x, ℓ) δ_ℓ[ℓ_+], or dies with probability 1 − P_S(x) [7]. The term δ_ℓ[ℓ_+] in the transition density ensures the object retains the same label. In addition to the surviving objects, new objects can be born. New born objects are usually modeled by an LMB RFS, where an object with state x_+ = (x_+, ℓ_+) is born at the next time with probability P_{B,+}(ℓ_+) and state density f_{B,+}(x_+, ℓ_+). The multi-object state X_+ at the next time is the union of surviving objects and new born objects, described by the multi-object Markov transition density f_+(X_+|X) [7]. It is assumed that, conditional on the current multi-object state, objects survive and move independently of each other, and that new born objects and surviving objects are independent [3].
3) Multi-object Observation: Each element x ∈ X is either detected with probability P_D(x), generating an observation z at the sensor with likelihood g(z|x), or misdetected with probability 1 − P_D(x). In addition to the detections, the sensor also receives clutter, modeled by a Poisson RFS with intensity function κ. The multi-object observation Z is the union of detections and clutter. It is assumed that, conditional on the multi-object state, objects are detected independently of each other and that clutter and detections are independent.
The association of objects with sensor measurements (at time k) is described by a positive 1-1 mapping γ : L → {−1:|Z|}, i.e., a mapping where no two distinct arguments are mapped to the same positive value [7]. The (extended) association map γ specifies that object (with label) ℓ generates measurement z_γ(ℓ) ∈ Z, with γ(ℓ) = 0 if it is undetected, and γ(ℓ) = −1 if it does not exist. The positive 1-1 property ensures each measurement comes from at most one object. Let L(γ) ≜ {ℓ ∈ L : γ(ℓ) ≥ 0} denote the set of live labels of γ, and Γ denote the space of all association maps. Then the multi-object likelihood function can be written as [11]

g(Z|X) ∝ Σ_{γ∈Γ} δ_{L(γ)}[L(X)] Π_{(x,ℓ)∈X} ψ_Z^{(γ(ℓ))}(x, ℓ),

where

ψ_Z^{(j)}(x, ℓ) = δ_0[j] (1 − P_D(x, ℓ)) + 1_{{1:|Z|}}(j) P_D(x, ℓ) g(z_j|x, ℓ)/κ(z_j).

4) The GLMB Filter: All statistical information about the underlying state is contained in the filtering density, i.e., the probability density of the current state conditioned on all measurements up to (and including) the current time. Given the multi-object filtering density π at the current time, the propagation to the next time step is given by [3]

π_+(X_+) ∝ g_+(Z_+|X_+) ∫ f_+(X_+|X) π(X) δX.

Under the standard multi-object system model, the multi-object filtering density takes on the GLMB form [7]:

π(X) = ∆(X) Σ_{(ξ,I)∈Ξ×F(L)} w^{(ξ,I)} δ_I[L(X)] [p^{(ξ)}]^X,    (4)

where Ξ is some finite discrete space, p^{(ξ)}(·, ℓ) is a probability density on X, and w^{(ξ,I)} is a non-negative weight satisfying the condition Σ_{(ξ,I)∈Ξ×F(L)} w^{(ξ,I)} = 1. To obtain a multi-object state estimate from the GLMB density (4), we first find the most probable cardinality n* from the cardinality distribution Pr(|X| = n) = Σ_{(ξ,I): |I|=n} w^{(ξ,I)}, and then the highest-weighted component (ξ*, I*) with cardinality |I*| = n*, see [7]. The state estimate for each object ℓ ∈ I* can be taken as the mean (or mode) of p^{(ξ*)}(·, ℓ).
Alternatively, the entire trajectory of object ℓ ∈ I * can be estimated as described in [8], [17].
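The two-step estimator described above can be sketched in a few lines (a hypothetical Python representation: each GLMB component is reduced to a (weight, label-tuple) pair, and the per-object densities are omitted):

```python
def glmb_estimate(components):
    """MAP-cardinality estimation from a GLMB density given as a list of
    (weight, I) pairs, where I is a tuple of labels: first find the most
    probable cardinality n*, then return the highest-weighted component
    with |I| = n*."""
    cardinality = {}
    for w, I in components:
        cardinality[len(I)] = cardinality.get(len(I), 0.0) + w
    n_star = max(cardinality, key=cardinality.get)
    w_star, I_star = max((c for c in components if len(c[1]) == n_star),
                         key=lambda c: c[0])
    return n_star, I_star
```

For example, with components of cardinalities 0, 1, and 2, the cardinality masses are summed first, so a single heavy component does not necessarily win if several lighter components share another cardinality.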
For compactness, we write a GLMB density in terms of its parameters:

π = {(w^{(ξ,I)}, p^{(ξ)})}_{(ξ,I)∈Ξ×F(L)}.    (6)

Given a current GLMB filtering density of the form (6), its propagation to the next time is again a GLMB [8], [30], whose components are indexed by (ξ, I, γ_+, I_+), with new weights proportional to the current weights w^{(ξ,I)} and the association weights ψ^{(ξ,γ_+)}, and new densities p_+^{(ξ,γ_+)} obtained by prediction and Bayes update of p^{(ξ)}. Note that each component (ξ, I) of the GLMB filtering density at time k generates a (very large) set {(ξ, I, γ_+, I_+) : I_+ ∈ F(L_+), γ_+ ∈ Γ_+} of children components at the next time. Due to the terms δ_{L(γ_+)}[I_+] and 1_{F(I⊎B_+)}(I_+) in (9), we only need to consider components with I_+ ⊆ I ⊎ B_+ and L(γ_+) = I_+. While this is a big reduction, in general the total number of GLMB components with non-zero weights still grows super-exponentially with time. Implementing the GLMB filter requires truncating the GLMB filtering density. Truncation by keeping the most highly weighted components minimizes the L1 truncation error [8].
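The truncation rule above is straightforward to express in code (a minimal sketch with hypothetical names; `payload` stands in for the component parameters (ξ, I, p)):

```python
def truncate_glmb(components, K):
    """Keep the K highest-weighted components and renormalize the weights.
    components: list of (weight, payload). The discarded weight mass bounds
    the L1 truncation error of the resulting approximation."""
    kept = sorted(components, key=lambda c: c[0], reverse=True)[:K]
    total = sum(w for w, _ in kept)
    return [(w / total, payload) for w, payload in kept]
```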
B. Gibbs Sampling for GLMB Truncation

GS is a computationally efficient Markov Chain Monte Carlo (MCMC) technique for sampling from complex probability distributions whose conditionals can be computed/sampled at low cost. In GLMB truncation, we aim to maximize the number of distinct significant samples, rather than focusing on the actual distribution of the samples as per MCMC inference. All distinct samples can be used regardless of their distribution, because each distinct sample constitutes a term in the approximant (the larger the weights, the smaller the approximation error) [30]. Hence, it is not necessary to discard burn-ins and wait for samples from the stationary distribution.
Systematic-scan GS (SGS) is the classical approach that samples from the stationary distribution π by constructing a Markov chain with transition kernel [31], [44]

π(γ'_+|γ_+) = Π_{i=1}^{P} π_i(γ'_+(ℓ_i) | γ'_+(ℓ_{1:i−1}), γ_+(ℓ_{i+1:P})).

This means, for a given γ_+, the next state γ'_+ of the chain is generated one component after another, by sampling each γ'_+(ℓ_i) from the i-th conditional. For GLMB truncation, the conditionals are the categorical distributions given by [30, Proposition 3], which is restated in a slightly different form as follows.

Proposition 1. For each i ∈ {1:P}, let ℓ_ī denote ℓ_{1:i−1,i+1:P}. Then the i-th conditional, defined on {−1:M}, is given by

π_i(j | γ_+(ℓ_ī)) ∝ η_i(j) (1 − 1_{{γ_+(ℓ_ī)}}(j) 1_{{1:M}}(j)),

where η_i(j) denotes the (i, j) entry of the P × (M + 2) cost matrix.

Remark 1. The above result shows that the conditionals are completely characterized by the P × (M + 2) cost matrix in Fig. 1(a) (which can be pre-computed from the measurement set Z_+) and the values of γ_+ on ℓ_ī, i.e., {γ_+(ℓ_ī)}. Specifically, the unnormalized i-th conditional is simply given by the i-th row after entries with (positive) indices contained in {γ_+(ℓ_ī)} have been zeroed (or masked) out, as illustrated in Fig. 1(b). Since evaluating 1_{{γ_+(ℓ_ī)}}(j) (and hence the mask 1 − 1_{{γ_+(ℓ_ī)}}(j)) for each j incurs an O(P) complexity, computing the i-th conditional incurs an O(PM) complexity. Note that if each γ_+(ℓ_i) is sampled from its conditional, then γ_+ is also positive 1-1, i.e., a valid association map. Multiplication by the mask 1 − 1_{{γ_+(ℓ_ī)}}(j) ensures that any j violating the positive 1-1 condition has zero probability of being sampled.
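The masking described in Remark 1 can be made concrete with a small sketch (hypothetical helper names; the cost matrix is assumed stored with columns ordered j = −1, 0, 1, …, M, so column j + 1 holds index j):

```python
import numpy as np


def masked_conditional(eta, i, gamma):
    """Unnormalized i-th conditional: row i of the P x (M+2) cost matrix eta,
    after zeroing the positive entries already claimed by the other
    coordinates of gamma (gamma[k] in {-1, 0, 1, ..., M})."""
    row = eta[i].copy()
    for k, j in enumerate(gamma):
        if k != i and j > 0:
            row[j + 1] = 0.0  # column j + 1 corresponds to measurement j
    return row


def sgs_sweep(eta, gamma, rng):
    """One systematic-scan sweep: resample every coordinate in order.
    The mask guarantees the updated gamma remains positive 1-1."""
    for i in range(eta.shape[0]):
        w = masked_conditional(eta, i, gamma)
        gamma[i] = int(rng.choice(len(w), p=w / w.sum())) - 1
    return gamma
```

Note the per-coordinate cost of building the mask here is O(P) plus O(M) for sampling, matching the O(PM) per-conditional complexity discussed in Remark 1 when the indicator is evaluated column by column.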
Selecting significant GLMB components can be performed with O(TP²M) complexity via SGS, as shown in [30]. Noting that P (the number of hypothesized objects) is strongly correlated with M (the number of measurements), this complexity translates roughly to a cubic complexity, i.e., O(TP³) or O(TM³). Nonetheless, SGS has been extended to address multi-dimensional ranked assignment problems in multi-scan and multi-sensor GLMB filtering [8], [11], [12].

III. LINEAR COMPLEXITY GS FOR GLMB FILTERING
This section presents efficient linear complexity GS for selecting significant components in GLMB filtering. We begin with the widely-known random-scan GS (RGS) in Subsection III-A. Subsection III-B then presents tempered GS (TGS), a recent generalization that can overcome the drawbacks of RGS, but incurs an O(TP²M) complexity. In Subsection III-C, we develop a decomposition of the conditionals (of the stationary distribution) allowing TGS to be implemented with a linear complexity of O(T(P + M)). Salient special cases of the proposed linear complexity TGS, including deterministic-scan GS (DGS), are discussed in Subsection III-D. For completeness, the linear complexity TGS-based GLMB filter implementation is discussed in Subsection III-E.

A. RGS for GLMB Truncation
Whereas SGS generates the next iterate γ'_+ by traversing and updating all P coordinates of the current iterate γ_+, RGS only selects one coordinate at random to update [45], [46]. This means the transition kernel is given by π(γ'_+|γ_+) = (1/P) π_i(γ'_+(ℓ_i) | γ_+(ℓ_ī)) when γ_+ and γ'_+ differ at most in the i-th coordinate, and 0 otherwise. A generic RGS implementation would incur an O(TPM) complexity because computing the conditionals requires O(PM). Further, RGS is inefficient in the sense that it generates fewer distinct significant samples than SGS for the same number of iterates of the chain, leading to poorer GLMB approximations and tracking performance.
The two main factors affecting the efficiency of RGS are mixing time and sample diversity. Intuitively, mixing time is the number of iterations required for subsequent states to be treated as samples from the stationary distribution π. Sample diversity refers to the proportion of distinct samples in a given number of iterates of the chain. While the actual distribution of the samples is not relevant for GLMB truncation, fast mixing is necessary for efficient generation of distinct significant samples. Furthermore, even if the chain converges to the stationary distribution, sample diversity can be poor due to frequent revisiting of previous states in successive iterations. RGS's notoriously slow mixing [38], [45], [47], together with its observed poor sample diversity, means that it could take many iterations for significant GLMB components to be generated.

B. Tempered Gibbs Sampling
Similar to RGS, TGS also generates the next iterate γ'_+ by randomly selecting a coordinate to update. However, TGS provides an additional mechanism to improve mixing and sample diversity [38]. Specifically, a coordinate i ∈ {1:P} is chosen according to the distribution

ρ(i|γ_+) ∝ φ_i(γ_+(ℓ_i) | γ_+(ℓ_ī)) / π_i(γ_+(ℓ_i) | γ_+(ℓ_ī)),    (19)

where φ_i(·|γ_+(ℓ_ī)) is a bounded proposal, defined on {−1:M}, with the same support as π_i(·|γ_+(ℓ_ī)). Further, given the selection of the i-th coordinate, its state is updated by sampling from the proposal, i.e., γ'_+(ℓ_i) ∼ φ_i(·|γ_+(ℓ_ī)); see Algorithm 1 (note that TGS reduces to RGS in the special case φ_i = π_i).
The proposal φ_i controls sample diversity, and determines coordinate selection that can influence mixing [46], [48], [49]. The transition kernel of TGS is given by ρ(i|γ_+) φ_i(γ'_+(ℓ_i) | γ_+(ℓ_ī)) when γ_+ and γ'_+ differ at most in the i-th coordinate, and 0 otherwise.
In GLMB truncation, we are not interested in the importance weights since the goal is to generate distinct samples with significant GLMB weights (which are different from the importance weights). While TGS can circumvent the drawbacks of RGS by using fast mixing proposals that yield diverse samples, these may not be significant (in GLMB weights) because the tempered stationary distribution could be very different from π. One way to generate diverse and significant samples is to use proposals that approximate the conditional π_i, but are more diffuse, which can be achieved with a mixture consisting of the conditional and its tempered version, i.e.,

φ_i(·|γ_+(ℓ_ī)) = α π_i(·|γ_+(ℓ_ī)) + (1 − α) π_i^β(·|γ_+(ℓ_ī)),    (21)

where α, β ∈ (0, 1], and for any function h, h^β denotes the normalized β-th power of h, i.e., h^β(j) = h(j)^β / Σ_{j'} h(j')^β. This popular proposal preserves the modes of the conditional π_i(·|γ_+(ℓ_ī)) to capture significant samples, and at the same time, increases sample diversity via the more diffuse tempered term [38], [50]. Moreover, the state-informed coordinate selection strategy of TGS provides faster mixing [38], [51], [52].
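A minimal sketch of this mixture proposal for a categorical conditional (hypothetical function names; `p` plays the role of π_i(·|γ_+(ℓ_ī))):

```python
import numpy as np


def temper(p, beta):
    """Normalized beta-th power of a categorical pmf p; 0 < beta < 1
    flattens p, making the tempered version more diffuse."""
    q = np.asarray(p, dtype=float) ** beta
    return q / q.sum()


def mixture_proposal(p, alpha, beta):
    """phi = alpha * pi + (1 - alpha) * pi^beta: keeps the modes of pi
    (to capture significant samples) while the flatter tempered term
    spreads probability mass for sample diversity."""
    p = np.asarray(p, dtype=float)
    return alpha * p + (1.0 - alpha) * temper(p, beta)
```

For instance, with p = [0.9, 0.1] and β = 0.5, the tempered term is [0.75, 0.25], so the mixture shifts mass toward the low-probability value while the argmax is unchanged.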
C. Linear Complexity TGS

While the TGS kernel (Algorithm 1) avoids traversing all coordinates, it still incurs an O(P²M) complexity (and hence generating T iterates incurs O(TP²M) complexity) because:
• Computing the conditionals π_i(·|γ_+(ℓ_ī)) for all i ∈ {1:P} incurs O(P²M) complexity, since each conditional requires performing the positive 1-1 checks using the set inclusion 1_{{γ_+(ℓ_ī)}}(j), which incurs an O(PM) complexity; and
• Sampling the selected coordinate from its proposal incurs O(M) complexity.
Nonetheless, it is possible to further exploit the particular structure of the problem through the positive 1-1 constraint to implement this kernel with O(P + M) complexity.
The above discussion also holds for the tempered unnormalized conditional π_i^β(·|·), and is stated more concisely in the following.

Corollary 1. Suppose that the successive TGS iterates γ_+ and γ'_+ differ at the n-th coordinate. Then, for β > 0, and for each i ≠ n, j ∈ {−1:M}, the tempered conditional π_i^β(j|γ'_+(ℓ_ī)) differs from π_i^β(j|γ_+(ℓ_ī)) in at most the two entries j = γ_+(ℓ_n) and j = γ'_+(ℓ_n).

The above result means propagating each tempered conditional and its normalizing constant to the next iterate of the Markov chain can be performed with a constant time complexity. Consequently, O(T(P + M)) complexity TGS can be developed for selecting significant GLMB components. The pseudocode in Algorithm 2, herein referred to as TGS+, outlines the steps for generating iterates γ'_+ of the chain.
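The bookkeeping behind Corollary 1 can be sketched as follows (a hypothetical helper, assuming the unnormalized masked rows are cached as an array with columns ordered j = −1, 0, 1, …, M; updating the cached normalizing constants alongside is omitted for brevity):

```python
import numpy as np


def move_coordinate(masked, eta, gamma, n, j_new):
    """Incremental update of the cached masked conditionals when coordinate n
    moves from gamma[n] to j_new: in every other row, at most two columns
    change, releasing the old measurement and claiming the new one. This is
    constant work per row, hence O(P) per Gibbs iterate."""
    j_old = gamma[n]
    for i in range(len(gamma)):
        if i == n:
            continue
        if j_old > 0:  # measurement j_old is released by coordinate n
            masked[i, j_old + 1] = eta[i, j_old + 1]
        if j_new > 0:  # measurement j_new is now claimed by coordinate n
            masked[i, j_new + 1] = 0.0
    gamma[n] = j_new
    return masked
```

The release step is safe because the positive 1-1 property guarantees that only coordinate n could have held measurement j_old.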

Proposition 2. The importance-weighted samples generated by TGS+ are consistent: for any bounded function h, the importance-weighted estimate of the expectation of h under π, formed from the TGS+ iterates, converges almost surely as the number of iterates tends to infinity, with the coordinate probability distribution ρ(·|γ_+) given by (19). Further, the variance of the importance weights is stable, i.e., does not grow with the number of coordinates.

Proof. See Appendix (Subsection VI-B).
Remark 4. In TGS, the variance of the weights does not grow with the number of coordinates, and hence the algorithm scales gracefully with P. Combining tempering with importance sampling is a strategy used to improve slow mixing in generic MCMC [38]. However, in generic MCMC the variance of the weights grows exponentially with the number of coordinates [53], [54], which means that performance deteriorates for large values of P. Unlike generic tempering in MCMC, TGS only tempers the conditional of the selected coordinate, but still inherits the benefit of improved mixing [38].

D. Salient Special Cases
1) Random-scan GS: Under the proposed TGS framework, setting β = 1 in the proposal (21) translates to uniformly random coordinate selection. This special case, described by Algorithm 4, is indeed RGS, herein referred to as RGS+ to distinguish it from generic O(TPM) complexity RGS. Since ρ(·|γ_+) is a uniform distribution, it requires no computation. However, the masking steps for π_i(·|γ_+(ℓ_ī)) are still needed. Nonetheless, the complexity of RGS+ reduces to O(T(P + M)) because:
• Sampling n uniformly from {1:P} incurs O(P) complexity (line 3); and
• Updating the conditionals {π_i(·|γ_+(ℓ_ī))}_{i=1}^{P} (based on Corollary 1) takes O(P) complexity (lines 6-7).

2) Deterministic-scan GS:
The other extreme of complete randomness, namely deterministic coordinate selection, can also be accommodated by TGS+. The simplest deterministic scan is to traverse the coordinates according to the periodic sequence 1, 2, ..., P, i.e., the selected coordinate at the t-th iteration is given by c(t) = 1 + mod(t−1, P). To realize this in the TGS framework, the proposal is chosen so that the corresponding coordinate distribution ρ(·|γ_+) places all but an O(ε) fraction of its mass on coordinate c(t), where ε is a very small positive number. Hence, in the limiting case as ε tends to zero, only coordinate c(t) can be selected at the t-th iterate.
In principle, this special case has O(T(P + M)) complexity. However, since the coordinate is effectively selected according to the prescribed sequence c(t), the sampling step is not needed. Moreover, using the recursive construct in the initialization (Algorithm 3), it is possible to implement this so-called DGS+ special case as described in Algorithm 5, which only incurs an O(TM) complexity. A similar approach based on the reverse scan order, i.e., P, P − 1, ..., 1, is possible by setting c(t) = P − mod(t−1, P) in Algorithm 5 (line 3).
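The two scan schedules above are simple to state in code (1-indexed coordinates, matching the paper's notation):

```python
def forward_scan(t, P):
    """Forward deterministic scan order: c(t) = 1 + mod(t-1, P), t = 1, 2, ..."""
    return 1 + (t - 1) % P


def reverse_scan(t, P):
    """Reverse deterministic scan order: c(t) = P - mod(t-1, P)."""
    return P - (t - 1) % P
```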
Note that in the literature, the terms "systematic-scan GS", "sequential-scan GS", and "deterministic-scan GS" all refer to SGS. The proposed DGS+ is clearly not the same as SGS, but is closely related to it. Specifically, the sequence of every P-th iterate of DGS+, with α = 1 or β = 1, is indeed a sequence of SGS iterates. However, this approach to SGS (summarized in Algorithm 6), referred to as SGS+, has to traverse P coordinates to generate one iterate. Hence, for T iterations of DGS+, SGS+ effectively has T/P iterations. This also means that the complexity per iteration increases by a factor of P, and consequently SGS+ incurs an effective O(TPM) complexity, a drastic reduction from the O(TP²M) of [30].
Remark 5. Apart from the forward-scan coordinate selection discussed above, other deterministic scan orders are possible. For efficiency and sample diversity, it is important that the periodic coordinate sequence c(t) scans all P coordinates in one period.

E. TGS-GLMB Filter Implementation
The TGS-based implementation of the GLMB filter is the same as the SGS-based implementation in [30, Algorithm 2], with two key changes:
• SGS-based truncation is replaced by the proposed O(T(P + M)) TGS algorithm (Algorithm 2); and
• There is an additional O(PM) complexity step for computing the initial coordinate distribution (Algorithm 3).
Thus, the resulting GLMB filter implementation incurs an overall complexity of O(T(P + M) + PM + T log T). The additional T log T term arises from the need to remove duplicates from the TGS output. In practice, the number T of iterates is large, so typically T(P + M) ≫ PM; moreover, log T is small, so P + M ≫ log T. This is a substantial reduction from the O(TP²M) complexity of the SGS-based implementation.
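The duplicate-removal step responsible for the T log T term can be sketched in one line (a hypothetical helper; each Gibbs iterate is an association map of length P):

```python
def unique_maps(samples):
    """Remove duplicate association maps from the Gibbs output. Converting
    each map to a tuple and sorting costs O(T log T) comparisons for T
    samples; each surviving map becomes one GLMB component."""
    return sorted(set(map(tuple, samples)))
```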

IV. EXPERIMENTAL RESULTS
This section presents numerical studies for GLMB filtering with truncation via the proposed TGS-based schemes (TGS+, RGS+, and the forward/reverse-scan DGS+), the new recursive implementation SGS+, and the baseline RGS/SGS implementations of [30]. Note that, although all methods use the same initialization (Algorithm 3), only TGS+ requires ρ(·|γ_+), whereas RGS+/SGS+ do not require the tempering-related steps. No parallel implementations are used in our experiments.
A common linear Gaussian setup on a 3 km × 3 km surveillance region over 100 time steps is used throughout this section. The number of objects is time-varying due to various births and deaths. The single-object state is a 4D vector [x, ẋ, y, ẏ]ᵀ of 2D position and velocity which follows a constant velocity model with sampling period 1 s and process noise standard deviation σ_p = 5 m/s² on each axis. Existing objects have a constant survival probability of P_S = 0.99. The effects of individually varying key parameters affecting T, P, and M are also examined. In addition to numerical studies with the default parameters described above, further experiments are carried out according to Table II with varying numbers of GS iterations, total numbers of trajectories (or peak numbers of objects), plus detection and clutter rates. The effects of different mixture rates and tempering levels for TGS+ are also studied.
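For concreteness, here is a minimal sketch of one plausible per-axis realization of this constant-velocity setup; the paper does not show its exact noise parameterization, so the white-noise-acceleration form below is an assumption:

```python
import numpy as np


def cv_model(dt=1.0, sigma_p=5.0):
    """Per-axis constant-velocity model with state [position, velocity]:
    F is the state transition matrix, Q the process noise covariance for
    white-noise acceleration with standard deviation sigma_p (m/s^2)."""
    F = np.array([[1.0, dt],
                  [0.0, 1.0]])
    G = np.array([[dt ** 2 / 2.0],
                  [dt]])
    Q = sigma_p ** 2 * (G @ G.T)
    return F, Q
```

The full 4D state [x, ẋ, y, ẏ]ᵀ is obtained by applying the same (F, Q) independently on each axis.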
Evaluations are undertaken for computation time, tracking accuracy, and (association) sample diversity via (i) measured wall-clock times, (ii) the OSPA [55] and OSPA(2) [5] metrics, and (iii) the number of unique solutions per scan during GLMB filtering. Results on the tracking accuracy and sample diversity of RGS/SGS are the same as those of RGS+/SGS+ and hence omitted. Note that each Monte Carlo trial generates random births, motions, deaths (see the example in Fig. 3) and measurements according to the multi-object model parameters.
All empirical averages are reported over 1000 Monte Carlo runs using MATLAB R2020a on a machine with dual 64-Core AMD EPYC 7702 CPUs and 1 TB RAM.

A. Computation Time
Fig. 4 shows (in log scale) the measured wall-clock times averaged per scan (lower is better). It is imperative to examine both average and maximum times, particularly for latency-sensitive real-time applications [56]. As expected, the results in Figs. 4(a) and (b) show increasing computation times with increasing numbers of trajectories and iterations. The results in Figs. 4(c) and (d) show generally flat trends of computation times with varying rates of detection and clutter, since the plots use the same default values for the number of iterations and number of trajectories. In all cases, it can be seen that all linear complexity GS implementations have similar run times, but all are significantly faster than the higher complexity methods SGS+/RGS/SGS. Further, RGS+/SGS+ show better computational efficiency than RGS/SGS. Of the linear complexity truncation strategies, TGS+ is generally the most expensive, due to the additional calculations involved with tempering and coordinate selection. The deterministic-scan →DGS+/←DGS+ have run times similar to RGS+, depending on the fraction of time spent in sampling versus overheads. Nonetheless, TGS+/→DGS+/←DGS+ are observed to have improved tracking accuracy and sample diversity compared to RGS+, as will be seen and further discussed in the next subsections.

B. Tracking Accuracy
Fig. 5 shows the OSPA and OSPA(2) metrics via overlay bar charts (lower is better). Specifically, an average OSPA error (taken over all time steps) and a single OSPA(2) error (computed over the entire scenario window) are calculated with parameters p = 1 and c = 100 m. In essence, the OSPA distance captures the error between two sets of point estimates (i.e., single-scan), whereas the OSPA(2) distance captures the error between two sets of track estimates (i.e., multi-scan), both in a mathematically consistent and physically intuitive manner.
The resultant OSPA and OSPA(2) errors for varying numbers of trajectories and iterations are shown in Figs. 5(a) and (b). For a relatively small number of objects, the average filtering and tracking errors of all GS variants are virtually identical. As the number of objects is gradually increased, the corresponding errors for SGS+ (the most expensive variant) exhibit the smallest increase of all variants, as expected. Conversely, but also as expected, the errors for RGS+ show the largest increase, due to slow mixing and insufficient diversity. The errors for TGS+ are significantly better than those for RGS+, even though both have the same linear complexity, owing to its tempered, state-informed coordinate selection. Overall, the results suggest that TGS+/$\overrightarrow{\text{DGS}}^+$/$\overleftarrow{\text{DGS}}^+$ generally outperform RGS+ on tracking accuracy, even though they have similar complexities. In addition, there is a small disparity between $\overrightarrow{\text{DGS}}^+$ and $\overleftarrow{\text{DGS}}^+$, suggesting that the scan order can influence the results. Unsurprisingly, SGS+ generally outperforms TGS+, due to a significant disparity in complexity and hence running times. Consequently, the proposed linear complexity TGS+ based solutions have the potential to offer a robust trade-off between tracking performance and computational load.

C. Sample Diversity
The average number of unique samples over the entire scenario is shown in Fig. 6 (higher is better). A higher number of unique samples means a higher number of association hypotheses, or mixture components in the estimated GLMB filtering density, which generally results in a lower GLMB truncation error.
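The notion of unique samples can be made concrete: each Gibbs iteration draws an association vector, and duplicate draws collapse to a single component of the truncated GLMB density. A minimal counting sketch, where the function name and the (T, P) array convention are illustrative rather than taken from the paper:

```python
import numpy as np

def count_unique_samples(gamma_samples):
    """Count the distinct association vectors among T Gibbs draws.

    gamma_samples: (T, P) integer array; row t is the association
    vector gamma^(t) drawn at iteration t (one entry per hypothesized
    object). Each distinct row contributes one mixture component to
    the truncated GLMB filtering density.
    """
    return len({tuple(row) for row in np.asarray(gamma_samples)})
```

Under this convention, a sampler that revisits the same high-weight association repeatedly (slow mixing) yields few unique rows even for large T, which is exactly the behavior the diversity comparison below probes.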
The trend in Fig. 6(a) confirms that the number of unique samples increases with the number of trajectories, which is necessary to capture the corresponding increase in the diversity of the components of the filtering density. For the lowest number of objects, it can be seen in Fig. 5(a) that even though all GS variants achieve the same tracking accuracy, they generally produce different numbers of unique samples; this is because a small number of samples is still sufficient to capture the GLMB filtering density. As the number of objects increases, SGS+ produces more than twice as many unique samples as TGS+, but incurs a significant increase in time complexity, which is expected due to the exhaustive traversal of all coordinates in SGS+. In addition, the informed (single) coordinate selection in TGS+ produces more unique samples than RGS+, even though both have the same complexity. Note also from Fig. 6(a) that the number of unique samples from $\overrightarrow{\text{DGS}}^+$/$\overleftarrow{\text{DGS}}^+$ is higher than that from RGS+ but lower than that from TGS+. The relative trends in the number of unique samples are generally consistent with those of the tracking errors shown in Fig. 5(a).
Examination of Fig. 6(b) confirms that the number of iterations directly influences the number of unique samples. It can also be seen that TGS+ produces up to twice as many unique samples as RGS+. While $\overrightarrow{\text{DGS}}^+$/$\overleftarrow{\text{DGS}}^+$ produce slightly fewer unique samples than TGS+, there is also a corresponding reduction in the observed run times. As expected, SGS+ outperforms the linear complexity variants, but at a significant increase in computational complexity. Again, the relative increase in the number of unique samples is generally consistent with the increasing run times in Fig. 4(b) and with the decreasing OSPA and OSPA(2) values in Fig. 5(b). Inspection of the curves for increasing detection rates in Fig. 6(c) shows a decreasing trend, due to the lower number of components required to accurately represent the GLMB filtering density at higher SNR. Inspection of the curves for increasing clutter rates in Fig. 6(d) shows the converse trend.
It can be seen that TGS+/$\overrightarrow{\text{DGS}}^+$/$\overleftarrow{\text{DGS}}^+$ generally outperform RGS+ on sample diversity, at the same (or smaller) complexity. Compared to RGS+, the experimental results suggest that tempering and/or state-informed coordinate selection can significantly mitigate slow mixing and improve sample diversity, while retaining the benefit of linear complexity. Furthermore, the improvement in tracking accuracy of each method levels off once enough significant samples have accumulated.
In [30], it was shown that the standard Gibbs sampler (i.e., SGS) is π-irreducible. It was also shown in [57] that the Markov chain $\{\gamma^{(t)}\}_{t \geq 1}$ generated by the TGS extension converges to the target distribution.

Fig. 1: Computing the unnormalized i-th conditional. Multiplying the i-th row of the matrix in (a) with a masking function (that depends on $\gamma_+$) and normalizing the resulting function in (b) yields the i-th conditional.
New objects appear according to an LMB birth density with $N_B = 50$ components. Each LMB component has a fixed birth probability $P_{B,+}(\ell_i) = 0.01$ and a Gaussian birth density $f_{B,+}(x, \ell_i) = \mathcal{N}(\cdot\,; m_B^{(i)}, R_B)$ for $i \in \{1, \ldots, N_B\}$, where the means $m_B^{(i)} = [x^{(i)}, 0, y^{(i)}, 0]^T$ are uniformly spaced over the surveillance region, and $R_B = \mathrm{diag}([10, 10, 10, 10]^T)^2$. The implied mean birth rate of $\lambda_B = 0.5$ per scan, combined with the constant survival probability, results in a total of approximately $N_X = 50$ trajectories, with a peak of approximately 33 simultaneous objects. Observations are noisy 2D positions with a noise standard deviation of $\sigma_m = 10$ m on each axis. All objects have a constant detection probability of $P_D = 0.86$. Clutter follows a Poisson model with a mean rate of $\lambda_c = 90$ returns per scan and a uniform density over the observation space. Each GS implementation is run for $T = 5000$ iterations, and TGS+/$\overrightarrow{\text{DGS}}^+$/$\overleftarrow{\text{DGS}}^+$ use default mixture and tempering parameters $\alpha = \beta = 0.5$.
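As a sketch, the birth model above can be instantiated as follows. The 1000 m × 1000 m surveillance region and the grid layout of the means are assumptions for illustration (the paper only states that the means are uniformly spaced); the function name and dictionary layout are likewise our own:

```python
import numpy as np

def lmb_birth_components(n_b=50, p_b=0.01, region=((0., 1000.), (0., 1000.))):
    """Build an LMB birth model with n_b uniformly spaced components.

    Each component has birth probability p_b and a Gaussian density
    with mean [x, 0, y, 0] (position/velocity interleaved) and
    covariance R_B = diag([10, 10, 10, 10])^2. The grid shape used to
    space the means over the region is an illustrative assumption.
    """
    nx = int(np.ceil(np.sqrt(n_b)))          # grid columns
    ny = int(np.ceil(n_b / nx))              # grid rows
    xs = np.linspace(region[0][0], region[0][1], nx)
    ys = np.linspace(region[1][0], region[1][1], ny)
    means = [np.array([x, 0., y, 0.]) for x in xs for y in ys][:n_b]
    R_B = np.diag([10., 10., 10., 10.]) ** 2  # squared std devs on diagonal
    return [{"P_B": p_b, "mean": m, "cov": R_B} for m in means]
```

Note that the total expected births per scan, $\sum_i P_{B,+}(\ell_i) = 50 \times 0.01 = 0.5$, matches the mean birth rate $\lambda_B = 0.5$ stated above.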
Fig. 7 shows the experimental results for TGS+, as 2D histograms, for varying mixture and tempering parameters α and β. Recall from (21) that higher values of α or β weaken the effect of tempering. Note from the averaged wall-clock times over all scenarios, shown in Fig. 7(a), that weak tempering or mixing slightly reduces the computation time, while producing more unique solutions requires slightly longer run times.

TABLE I: Summary of frequent notations.

TABLE II: Parameter settings.