When the Closest Targets Make the Difference: An Analysis of Data Association Errors

Multi-object data fusion – combining measurements or estimates of more than one truth object from more than one observer – requires a first "data association" step of deciding which data have common truth objects. Here we focus on the case of two observers only, with the data association engine powered by a polynomially-complex list-matching algorithm such as the Jonker-Volgenant-Castanon (JVC), auction, or Munkres algorithm. The paper's purpose is to develop an approximation for the probability of assignment error: How often does the data association engine tell the fuser to combine data from truth objects that do not go together? We assume data with Gaussian errors and a Poisson field of truth objects, and we focus on the low-noise case where errors are infrequent and fusion makes sense. In this article, for isotropic, independent identically distributed errors, a single scalar parameter representative of the scene complexity is identified and, exploiting it, a reasonably simple approximate expression for the association error is derived.


I. INTRODUCTION

A. PROBLEM FORMULATION
Suppose we have n truth objects (let us for compactness call them "targets") {x_i}_{i=1}^n, with x_i ∈ B_r, where B_r ⊂ R^ν is the ball of radius r in the ν-dimensional Euclidean space R^ν, centered at the origin. The targets are observed by two sensors whose measurements are

y_i(1) = x_i + ε_i(1),   y_i(2) = x_{a(i)} + ε_i(2),   i = 1, . . . , n,   (1)

where {ε_i(s)}_{i=1}^n, s = 1, 2, are independent identically distributed (IID) real Gaussian vectors with zero mean and respective covariance matrices R_i(s) for s = 1, 2. The operator a(i) indicates that the data at sensor 2 may be permuted relative to sensor 1.
An assignment algorithm aims at deciding the association between targets and sensor measurements. The objective of the algorithm is to link the measurement indices of the two sensors within the set {1, 2, . . . , n}, with the goal of fusing data (e.g., [1]) to provide improved estimation: at best, it is quite suboptimal to fuse after an association error involving proximate targets, and at worst such an error can obscure the entire scene understanding via "track-switch" [17]. In the ideal situation with no assignment errors, we have a(i) = i for all i = 1, . . . , n. Let us introduce the association "costs" c_{k,l} that quantify the level of dissatisfaction caused by assignment of measurement k at Sensor 1 to measurement l at Sensor 2. In the case that R_i(1) = R_i(2) = σ²I, in which I denotes the identity matrix of size ν × ν, a sensible cost function is, for k, l ∈ {1, . . . , n},

c_{k,l} = σ^{−2} ||y_k(1) − y_l(2)||².   (2)

To formalize the assignment criterion, it is convenient to define the binary-valued decision variables b_{k,l}, with b_{k,l} = 1 if indices k and l are declared to be originated by the same target, and b_{k,l} = 0 otherwise. Using b_{k,l}, the optimization problem can be cast in the form [11]:

minimize Σ_k Σ_l c_{k,l} b_{k,l}   subject to   Σ_k b_{k,l} = 1 for all l,   Σ_l b_{k,l} = 1 for all k,   b_{k,l} ∈ {0, 1}.   (3)

In other words, we want to select exactly one entry in each row and exactly one entry in each column of the matrix [c_{k,l}] in such a way that the sum of the selected entries of the cost matrix is minimized over all possible choices. This is illustrated in Fig. 1. To be clear: ideally, the selections from the association matrix would be diagonal, as opposed to those shown as circled elements in Fig. 1. We intend to find the probability that they are not - that something more like what is actually shown in Fig. 1 is seen.
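The row-and-column selection in (3) can be made concrete with a small script. The sketch below (our own illustration, not from the paper) solves the two-list assignment by brute force over permutations, which is feasible only for tiny n; polynomial-time solvers such as JVC, auction, or Munkres (e.g., scipy.optimize.linear_sum_assignment) would be used in practice.

```python
import itertools
import math
import random

def assignment_cost_matrix(y1, y2, sigma):
    """Cost c[k][l] = ||y1[k] - y2[l]||^2 / sigma^2, as in (2)."""
    return [[sum((a - b) ** 2 for a, b in zip(yk, yl)) / sigma ** 2
             for yl in y2] for yk in y1]

def best_assignment(cost):
    """Brute-force solution of the two-list assignment problem (3):
    the permutation minimizing the summed cost.  Exponential in n, so
    for illustration only; JVC/auction/Munkres are polynomial-time."""
    n = len(cost)
    best_perm, best_val = None, math.inf
    for perm in itertools.permutations(range(n)):
        val = sum(cost[k][perm[k]] for k in range(n))
        if val < best_val:
            best_perm, best_val = perm, val
    return best_perm, best_val

# Toy scene: 4 targets in R^2, two sensors, small isotropic noise.
random.seed(0)
sigma = 0.05
targets = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(4)]
noisy = lambda x: tuple(c + random.gauss(0, sigma) for c in x)
y1 = [noisy(x) for x in targets]
y2 = [noisy(x) for x in targets]   # no permutation, i.e. a(i) = i

cost = assignment_cost_matrix(y1, y2, sigma)
perm, val = best_assignment(cost)
# With well-separated targets and low noise, the identity
# permutation is typically recovered.
```

The decision variables b_{k,l} of (3) are implicit here: b_{k,perm[k]} = 1 and all other entries are zero.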

B. BACKGROUND
Data association is a list-matching problem, with the number of lists equal to the number of sensors. Such optimization problems can seldom be solved exactly because of their combinatorial nature, and the exploration of suboptimal solutions is required [2], [7], [32], [33]. Here we specifically address the two-dimensional (or two-list) assignment problem - an attempt to assign the same labels to objects estimated at two remote stations. It is noted that the two-dimensional assignment problem is of polynomial complexity [5], [6], and there exist efficient means to solve it even for hundreds of objects, such as the Hungarian [21], [28], auction [3], [4], or JVC [15] algorithms.
The costs (as in (2)) arise from the normalized and negated logarithm of

min_a min_x p(y, x, a)   (4)

in [18], [29], in which (generically) p(·) is a mixed probability density/mass function, a represents the assignment, x the targets, and y the measurements - see (1). Note that (4) finds the maximum a-posteriori (MAP) estimate of the permutation a, under the (reasonable) assumption that each a has prior probability 1/n!. Another approach in [18] uses a diffuse prior on the targets x, and hence the costs amount to

min_a ∫ p(y, a | x) p(x) dx   (5)

while [12] uses a Gaussian prior in an elegant development.
From the perspective of this paper, the approach (4) offers that we can change (2) to

c_{k,l} = (y_k(1) − y_l(2))^T (R_k(1) + R_l(2))^{−1} (y_k(1) − y_l(2))   (6)

when the measurement noise is not isotropic (ε_i(j) has covariance R_i(j); see Footnote 1). Most of our analysis here deals with the isotropic case, but the general case is addressed in [36]. The authors of [27] offer the good suggestion in a footnote that the general case can be addressed in terms of averaged covariance matrices; in [36] an exact (but not explicit) expression is given for the general case, but it is shown that the results of this paper can be used directly and explicitly provided (although not exclusively) that the covariances are a property of the sensor or of the targets. The expressions (4) and (5) ignore the "subset problem": the list of targets observed and reported on from Sensor 1 may not be the same as that from Sensor 2. Further, there is the issue of sensor bias, meaning that (1) becomes

y_i(2) = x_{a(i)} + b + ε_i(2)   (7)

in the case of the usual translational bias (see Footnote 2). The bias b can be a parameter or can have some prior distribution (such as Gaussian), and good treatments of both of these extended cases are in [13], [19], [20], [31]. Both bias and the subset problem can have significant impact on any association engine's performance, and our analytical approach is not ready for these. The subject of this paper is neither bias nor the subset problem. Instead, we intend to develop an expression for the probability of an assignment error, an expression that is valid in the low error-rate regime and whose calculation is numerically straightforward.
Sea in 1971 seems to have been the first to address the problem of association error. That paper [34] takes a given target and assumes it is immersed in a Poisson field of measurements: it examines the likelihood of one of them being closer than the true measurement - a slightly different model from the Poisson-target model we adopt, although certainly elegant. Reference [27] refers to this as a "one-way" switch, as opposed to the "two-way" (global nearest neighbor) problem attacked in [35]. Mori and his co-authors in [26] note (see Footnote 3) that the controlling term is of the form of an expectation of an error function (here, instead and equivalently to the error function's complement, we use the standard normal exceedance probability, or Q-function), which they approximate exponentially to get a leading term for the correct-association probability. They also give results as a function of target density, and recognize the importance of a scalar parameter they call β that represents the expected number of targets per unit volume normalized to the measurement standard deviation. The β in [26] is tantamount to our χ (and the γ in [27]), which we have expressed in terms of an explicit Poisson-field assumption on the targets. As we shall see in Section V and Fig. 7, there is a likely typographical error in [26]; allowing for this, we get numerical results that reflect nicely the exact error probability we have derived.

Footnote 1: The log-determinant terms can be canceled by row and by column.
Footnote 2: Translational bias error can be a good approximation of more realistic nonlinear biases when the target grouping is sufficiently tight. It is usually expressed relative to an "anchor" sensor, in (7) the first.
Footnote 3: We have used [26] as a reference; in fact there is an accessible treatment in [25], and both are pre-dated by a technical report [23] and a conference publication [24].
We also note that [26] is somewhat more general than our work here, in that an arbitrary measurement covariance (there denoted as S) replaces our assumption of isotropic measurement noise (i.e., S a multiple of the identity matrix). However, the assumption is made in [26] that S is the same for all targets. In [36] it is shown that a simple association-error probability expression - as in [26] and here - seems to be possible only under the assumption that S is a property of the sensor or that it is a property of the targets. While these conditions are both generalizations of the common-S assumption, neither seems much more realistic than our assumption, and so we prefer to stick with isotropic measurement noise for clarity.
The same authors as in [26] refine their approach in 2014's [27] to include features. Target features - an amplitude likelihood is a common one, but other bespoke features as well as target classification suggestions from the sensors can be used - can be key in the association decision and, indeed, in some cases overshadow the distance information. Reference [27] examines various feature distributions and offers computational approaches for some. Ultimately it finds that unless the features are themselves Gaussian the resulting expressions, while useful, do not simplify to an appealing form.
In [8] the main results given here are used and quoted, but they were not derived there. More attention in [8] is given to simulation evidence directing the analytic framework, specifically that when there are association errors but these errors are relatively rare - in a sense, the easy-association case - almost all association errors are pairwise switches between the two closest (nearest to each other) targets. Several other papers, among them [13], [19], [20], also analyze the problem by simulation, and especially include bias. Levedahl [22] examines the association problem practically and, in addition to giving a swap probability expression based on a Poisson measurement model, notes that there are several sources of error beyond the misassociation we study here: failure to gate, missed detections/clutter, and bias. In [36] the results from [8] were used, and there were extensions to non-isotropic noise (as in (6)). Also, as a popular multi-sensor data fusion scheme uses iterative two-list matching (for example, associating the data from sensors 1 and 2 together, then associating the fused result with sensor 3, etc.), there was an extension to the probability of incorrect association for such schemes, as well as a suggestion for preferred sensor ordering. It is important to note that [36] worked with a fixed configuration of targets, aimed at developing a measure of "scene difficulty" as regards data fusion. On the other hand, this paper and [8] adopt a Poisson model for target location, so that results for a fixed configuration (as in [36]) need to be marginalized over the random closest-pair separation.

C. ROADMAP
Our goal in this paper is to approximate as closely as possible the probability of misassociation between two targets - and we reiterate that our results refer to isotropic Gaussian measurement noise, no missed detections or clutter-generated false alarms, and a Poisson field of targets. We are especially interested in the clean-data regime: that is, we want to report to the system when errors are "starting to be made," warning of a target scene that may become problematic for data fusion. To this end, we begin by pointing to simulation evidence in [8] that when errors are very unlikely (that is, when a tracking system is working well) the errors that do happen are almost always pairwise exchanges between the closest pair. We make no claim that such errors are the only ones that happen, and in that sense our expression undercounts the error rate, meaning that its complement is only an upper bound on the probability of perfect association. But it is close when the scene is clean.
Our plan of attack can be summarized as follows. First, we consider the statistical characterization of pairwise switches in the assignment problem under the assumption that a single switch occurs in a known pair of positions of the cost matrix. We derive the exact formula (25) for the probability of switch, conditioned upon the position of (actually, the separation between) the two targets from which the switched measurements originated. An approximation thereof, accurate on the probability tail, is given in (26)-(27). The expressions here are valid for any pair of targets, but the intention, as suggested above, is to apply them to those that are most proximate.
Then, classical results for the minimum distance between any pair of points in a Poisson field, in the regime of a large number of points per unit volume, are exploited to yield the asymptotic expression (29) and an approximation thereof, see (32). Thus, we have (25) - or (26) and (27) - for the probability of misassociation between any pair of targets with a fixed separation, we have [8] telling us only to worry about the closest pair, and we have (29) - or (32) - describing the probability density function of the closest pair. Hence, combining all of these, our main achievement is the expression (43) for the probability of a single pairwise switch.
Equation (43) is quite simple, but for the regime of small probability of error we proffer an even simpler (linear) expression (52) that relates the error probability P(E) to a characteristic parameter χ_ν. We identify the scalar χ_ν (ν is the target dimension) as the fundamental quantity capturing all the relevant physical parameters of the system: a relative scene density that describes the scene complexity.

II. PAIRWISE SINGLE ERROR PROBABILITY
Our analysis is limited to the case of single pairwise switch errors. A pairwise switch error occurs when there exists a single pair of indices (i, j) such that the measurement y_i(1) at Sensor 1, originated by target i, is associated with measurement y_j(2) at Sensor 2, originated by target j ≠ i. All other associations are correct, namely the decision matrix [b_{k,l}] is diagonal, except for the presence of ones at positions (i, j) and (j, i). In this section, it is assumed that the pair (i, j) is known; namely, we consider only two known vectors x_i and x_j. The event of a pairwise switch error among the measurements originated by vectors x_i and x_j is denoted by E_{i,j}. According to (3), E_{i,j} occurs when the wrong association cost is less than the correct one, i.e.,

c_{i,j} + c_{j,i} < c_{i,i} + c_{j,j}.   (8)

Let us denote by x·y = Σ_{i=1}^ν x_i y_i the standard inner product between real ν-vectors x and y. From (2) and (8), expanding the squared norms, we have

P(E_{i,j}) = P( (y_i(1) − y_j(1)) · (y_i(2) − y_j(2)) < 0 ).   (9)

Using (1) and defining the vectors

ξ_{i,j} = (x_i − x_j) / (√2 σ),   ε̃(s) = (ε_i(s) − ε_j(s)) / (√2 σ),   s = 1, 2,   (10)

from (9) we have

P(E_{i,j}) = P( (ξ_{i,j} + ε̃(1)) · (ξ_{i,j} + ε̃(2)) < 0 ),   (11)

where ε̃(s), s = 1, 2, are two independent, zero-mean Gaussian ν-vectors with identity covariance.
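A minimal numerical check of this reduction (our own code and parameter choices): the switch event can be simulated either by comparing the association costs directly or, equivalently, through the sign of the inner product of two Gaussian vectors whose common mean has norm ξ = ||x_i − x_j||/(√2 σ).

```python
import numpy as np

rng = np.random.default_rng(1)
nu, sigma = 3, 1.0
diff = np.array([1.0, 0.5, -0.5])                 # x_i - x_j (arbitrary choice)
xi = np.linalg.norm(diff) / (np.sqrt(2.0) * sigma)

trials = 200_000
# Direct simulation of the cost comparison: sign of (y_i(1)-y_j(1)).(y_i(2)-y_j(2))
e = sigma * rng.standard_normal((4, trials, nu))  # eps_i(1), eps_j(1), eps_i(2), eps_j(2)
d1 = diff + e[0] - e[1]                           # y_i(1) - y_j(1)
d2 = diff + e[2] - e[3]                           # y_i(2) - y_j(2)
p_direct = np.mean(np.sum(d1 * d2, axis=1) < 0)

# Reduced form: inner product of two identity-covariance Gaussian
# vectors whose common mean has norm xi.
mu = np.zeros(nu)
mu[0] = xi
w = mu + rng.standard_normal((trials, nu))
z = mu + rng.standard_normal((trials, nu))
p_reduced = np.mean(np.sum(w * z, axis=1) < 0)
```

The two estimates agree to Monte Carlo accuracy, illustrating that the switch probability depends on the target pair only through the normalized separation ξ.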
To simplify the evaluation of this probability, let us introduce a change of basis. Let Θ be an orthogonal ν × ν matrix whose columns form an orthonormal basis of R^ν with first element the vector ξ_{i,j}/||ξ_{i,j}||. Clearly, Θ^T Θ = Θ Θ^T = I. Orthogonal transformations preserve the inner product, namely

(ξ_{i,j} + ε̃(1)) · (ξ_{i,j} + ε̃(2)) = (Θ^T ξ_{i,j} + Θ^T ε̃(1)) · (Θ^T ξ_{i,j} + Θ^T ε̃(2)).   (12)

Note that Θ^T ξ_{i,j} = μ = [ξ, 0, . . . , 0]^T, where ξ = ||ξ_{i,j}||, and that Θ^T ε̃(s), s = 1, 2, has the same distribution as ε̃(s), because the vector ε̃(s) is spherically invariant. Therefore, (11) can be rewritten as

P(E_{i,j}) = P(W · Z < 0),   (13)

where we have denoted μ + ε̃(1) by W = [W_1, . . . , W_ν]^T and μ + ε̃(2) by Z = [Z_1, . . . , Z_ν]^T. Conditioning on Z = z:

P(W · z < 0) = P( Σ_{i=1}^ν W_i z_i < 0 ),   (14)

and the random variable Σ_{i=1}^ν W_i z_i is Gaussian, because it is a linear transformation of W; its mean is ξ z_1 and its variance is ||z||². Therefore, (14) can be rewritten as

P(W · z < 0) = Q( ξ z_1 / ||z|| ),   (17)

and removing the conditioning yields

P(E_{i,j}) = E_Z[ Q( ξ Z_1 / ||Z|| ) ].   (18)

Given the structure of the integrand, (18) can be more conveniently expressed using polar coordinates, see e.g. [37, p. 501]: z_1 = ρ cos θ_1, with ρ = ||z|| ≥ 0 and θ_1 ∈ [0, π]. The Jacobian determinant of the transformation is ρ^{ν−1} sin^{ν−2}(θ_1) sin^{ν−3}(θ_2) · · · sin(θ_{ν−2}). Thus, the integral in (18) can be written in terms of ρ and θ_1 alone; integrating out the remaining angles, which contribute the factor 2π^{(ν−1)/2}/Γ((ν−1)/2) (the surface area of the unit sphere in R^{ν−1}), we arrive at the final expression

P(E_{i,j}) = (2π)^{−ν/2} (2π^{(ν−1)/2}/Γ((ν−1)/2)) ∫_0^∞ ∫_0^π Q(ξ cos θ) exp( −(ρ² − 2ρξ cos θ + ξ²)/2 ) ρ^{ν−1} sin^{ν−2}(θ) dθ dρ.   (25)

Numerical evaluation of P(E_{i,j}) in (25) is straightforward. The integrals appearing in (25) also admit closed forms, either exact or approximate, in terms of hypergeometric functions, but the expressions are cumbersome and do not allow easy mathematical manipulations. Instead, in Appendix C we derive an approximation valid for large ξ, summarized in (C.20) for ν even and (C.21) for ν odd, and referred to in the sequel as (26) and (27), respectively. The coefficients c_m(ν) appearing in (26) and (27) are given in (C.19); see also Table 2 in Appendix C. In Appendix B an exponential Chernoff bound is also derived, see (B.6). Figs. 2 and 3 show P(E_{i,j}) computed by the exact formula (25) as a function of ξ, compared to the approximation provided by (26) and (27). Fig. 2 also shows the Chernoff bound (B.6).
We see that approximations (26) and (27) are excellent on the probability tail (large ξ). On the other hand, for small values of ξ, Fig. 3 reveals that the approximations are accurate for ν even and small, while they worsen for high-dimensional targets.
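The conditional-expectation form of the switch probability suggests a low-variance estimator: under our reading of the derivation, P(E_{i,j}) = E_Z[Q(ξ Z_1/||Z||)] with Z ~ N(ξ e_1, I_ν). The sketch below (our own parameter choices) compares this "smoothed" estimator with direct simulation of the sign of W·Z:

```python
import numpy as np
from math import erfc, sqrt

def Q(x):
    """Standard normal exceedance probability."""
    return 0.5 * erfc(x / sqrt(2.0))

def p_switch_smoothed(xi, nu, trials=100_000, seed=2):
    """Average Q(xi * Z_1 / ||Z||) over samples of Z ~ N(xi*e_1, I):
    Monte Carlo over the conditioning variable only."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((trials, nu))
    z[:, 0] += xi
    norms = np.linalg.norm(z, axis=1)
    return float(np.mean([Q(v) for v in xi * z[:, 0] / norms]))

def p_switch_direct(xi, nu, trials=400_000, seed=3):
    """Direct simulation of P(W.Z < 0)."""
    rng = np.random.default_rng(seed)
    mu = np.zeros(nu)
    mu[0] = xi
    w = mu + rng.standard_normal((trials, nu))
    z = mu + rng.standard_normal((trials, nu))
    return float(np.mean(np.sum(w * z, axis=1) < 0))

p_smooth, p_raw = p_switch_smoothed(2.0, 3), p_switch_direct(2.0, 3)
```

Because every sample of Z contributes a value of the Q-function rather than a 0/1 indicator, the smoothed estimator typically needs far fewer samples for the same accuracy, which is convenient deep in the probability tail.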
One important insight obtained from (26) and (27) is the functional dependence of P(E_{i,j}) upon ξ. Since the derivation is based on the assumption ξ >> 1, we expect a good approximation for the tail of P(E_{i,j}), i.e., when P(E_{i,j}) takes small values, and Fig. 2 confirms this. From (26) and (27) we see that the higher-order coefficients of the sums enclosed in square brackets (those associated with the smallest powers of ξ) have minor impact on the tail of P(E_{i,j}) and could be adjusted in order to better match the exact expression of P(E_{i,j}) given in (25) even for values of P(E_{i,j}) that are not especially small. Such a numerical adjustment might improve substantially the accuracy of the approximation for small values of ξ, with negligible impact on the tail, allowing us to end up with formulas valid in all regimes of practical interest. We do not dwell on these improvements as our interest is mainly in the asymptotic regime ξ >> 1.

III. MINIMUM DISTANCE BETWEEN HOMOGENEOUS POISSON POINTS IN A HYPERSPHERE
In this section, we exploit classical results for the minimum distance of homogeneous Poisson points in a hypersphere. Two different formulations are considered and are shown to be asymptotically equivalent.
1) RESULTS FOR POISSON POINTS

Consider a homogeneous Poisson point process of intensity λ in R^ν: for any integer k and any set of k disjoint regions A_1, . . . , A_k, the numbers of points falling in the regions are independent Poisson random variables, the one relative to A_i having mean λ m(A_i), where m(·) denotes volume (Lebesgue measure) (see Footnote 4). Recall that κ_ν = π^{ν/2}/Γ(ν/2 + 1) is the volume of the unit ball in R^ν. Now, let B_r denote the ball with radius r, with volume m(B_r) = κ_ν r^ν. In words, the classical result of [16] (see Footnote 5), with A replaced by B_r, reads as follows. The random variable

λ² κ_ν D^ν,   where D is the minimum distance between any two points of the process in B_r,   (29)

converges in distribution, as λ → ∞, to an exponential random variable with mean 2/m(B_r) = 2 κ_ν^{−1} r^{−ν}. Using the limiting result in (29) as an approximation yields

P(D > ω) ≈ exp( −(λ² κ_ν² r^ν / 2) ω^ν ),   (32)

which, in view of (29), is valid in the regime λ >> 1 (crowded regions). Expression (32) shows that the probability density function of the random variable D is given by

f_D(ω) = (ν/2) λ² κ_ν² r^ν ω^{ν−1} exp( −(λ² κ_ν² r^ν / 2) ω^ν )   (33)

for nonnegative ω, and otherwise zero.

Footnote 4: To avoid misunderstanding, we explicitly note that the general formula in [16] does not involve the measure of A, unless the Poisson process is homogeneous inside A and no points can lie outside A. That is, the version used here assumes that the function f(x) appearing in [16] reduces to the indicator of the set A. Note also that in [16] the notation m(A) has a different meaning.
Footnote 5: There is a typo in [16].
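The exponential limit is easy to check numerically. A sketch (our own parameter choices), using points in the unit square with ν = 2: since the expected number of point pairs closer than d is roughly C(n,2) π d², the scaled quantity n(n−1) π D²/2 should be approximately Exp(1) when n is large.

```python
import numpy as np

rng = np.random.default_rng(4)
n, runs = 200, 2000
T = np.empty(runs)
for r in range(runs):
    pts = rng.random((n, 2))                      # uniform points in the unit square
    d2 = np.sum((pts[:, None, :] - pts[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(d2, np.inf)                  # ignore self-distances
    dmin = np.sqrt(d2.min())
    # Expected number of pairs within d is ~ C(n,2)*pi*d^2, so the
    # scaled minimum distance below is approximately Exp(1).
    T[r] = 0.5 * n * (n - 1) * np.pi * dmin ** 2

print(T.mean(), np.median(T))   # ~1 and ~ln 2 = 0.693 for an Exp(1) variable
```

The sample mean near 1 and the sample median near ln 2 are the signature of an exponential law with unit mean; boundary effects enter only at higher order because the typical minimum distance is tiny compared with the region size.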

2) RESULTS FOR UNIFORM POINTS IN B r
As an alternative to the above derivation, let {U_k}_{k=1}^∞ be IID ν-dimensional random vectors of the Euclidean space R^ν, each with density p(x), x ∈ R^ν. Suppose ∫_{R^ν} p²(x) dx < ∞. A classical result, referred to here as (35), states that, if D_n denotes the minimum distance among the first n of these points, then, as n → ∞, the random variable n² D_n^ν converges in distribution to an exponential random variable with mean 2 κ_ν^{−1} ( ∫_{R^ν} p²(x) dx )^{−1}.
If the vectors are uniformly distributed over C_r, the hypercube of side r in R^ν, we have p(x) = r^{−ν} for x ∈ C_r, which yields ∫_{R^ν} p²(x) dx = 1/m(C_r) = r^{−ν}. Substituting in (35) reveals the following: the random variable n² D_n^ν converges in distribution to an exponential random variable with mean 2 κ_ν^{−1} r^ν.

3) COMPARISON BETWEEN THE TWO FORMULATIONS
To compare the two previous results, note that (29) implies that λ² κ_ν D^ν is asymptotically distributed as an exponential random variable with mean 2/m(B_r). On the other hand, setting n = λ m(C_r) = λ r^ν in the uniform-points result of (35) shows that λ² κ_ν D^ν is asymptotically exponential with mean 2/m(C_r). The two formulations are thus asymptotically equivalent: in both cases λ² κ_ν D^ν behaves as an exponential random variable whose mean is twice the reciprocal of the volume of the region containing the points.

4) SIMULATIONS
As a sanity check, we show in Fig. 4 the cumulative distribution function (CDF) of the random variable λ² κ_ν D^ν, for a Poisson field of ν-dimensional vectors in a ν-hypercube of side r. The solid lines refer to the empirical CDF obtained by 10³ Monte Carlo runs, while the dashed lines show the theoretical CDF predicted by (29). Fig. 4 refers to ν = 6 and r = 0.5 (blue), r = 1 (green), and r = 1.5 (red). For each value of r, four values of λ have been considered, such that n = λ r^ν = 10, 10², 5·10², 5·10³. As λ grows we see that the empirical CDFs approach their theoretical counterparts.

IV. APPROXIMATIONS FOR THE PAIRWISE ERROR PROBABILITY
In Sect. II we computed the probability P(E_{i,j}) of a pairwise switch between measurements originated by targets x_i and x_j, obtaining the exact expression (25) and the useful approximation (26)-(27); a compact form of this approximation is given in (C.18) in Appendix C. Recall that P(E_{i,j}) is a function of the normalized distance between the two vectors, ξ = 2^{−1/2} σ^{−1} ||x_i − x_j||. We assume now that the targets x_i and x_j are the closest pair among the set of many targets (that is, λ targets per unit volume), and therefore we interpret P(E_{i,j}) as being conditioned on the pair of closest vectors. The unconditional probability of a pairwise switch is

P(E) = ∫_0^∞ P(E_{i,j} | ξ) f(ξ) dξ,   (40)

where f(ξ) is the probability density function of the random variable ξ. Denoting by E[N] the expected number of targets, it is convenient to define the dimensionless parameter

χ_ν = (E[N])² (√2 σ/r)^ν = 2^{ν/2} κ_ν² λ² r^ν σ^ν,   (41)

which combines all the relevant system parameters and, as we shall see soon, determines the tail error probability. In the regime λ >> 1, from expression (32) we have

f(ξ) = (ν/2) χ_ν ξ^{ν−1} exp( −χ_ν ξ^ν / 2 ),   (42)

yielding

P(E) = (ν χ_ν / 2) ∫_0^∞ P(E_{i,j} | ξ) ξ^{ν−1} exp( −χ_ν ξ^ν / 2 ) dξ.   (43)

Note that P(E) is a function only of ν, χ_ν, and the set of coefficients {c_m(ν)}. Recall that:
- ν is the dimension of the space;
- κ_ν = π^{ν/2}/Γ(ν/2 + 1) is the (dimensionless) volume of the unit hypersphere in R^ν;
- λ is the (large) expected number of targets per unit volume, so that E[N] = λ κ_ν r^ν is the expected number of targets;
- r is the radius of the hypersphere;
- σ is the standard deviation of the noise;
- the coefficients {c_m(ν)} are defined in (C.19) (examples are given in Table 2).

Fig. 5 shows P(E) computed via (43), with f given in (42) and P(E_{i,j}) given in (25). The inset of Fig. 5 reproduces the same curves of the main plot, but with linear scale on the y-axis, and shows that the approximation worsens for large values of χ_ν, namely for large values of P(E).
The slope of the curves in Fig. 5 is shown by the dotted straight lines. As all curves have slope π/4 radians (i.e., unit slope on the log-log plot), it follows that the tail of P(E) is accurately described by the linear expression

P(E) ≈ α_ν χ_ν   (52)

for some α_ν > 0. To find α_ν, note that in the regime χ_ν → 0 we may approximate exp(−χ_ν ξ^ν/2) ≈ 1 in the integrals appearing in (43). Then, for ν ≥ 3, inserting the tail approximation (26)-(27) for P(E_{i,j} | ξ), evaluating the resulting integrals - see (45)-(50) - and using the definition of c_m(ν) in (C.19), we obtain the expression (51) for α_ν. The values of α_ν in (51) can be computed numerically in a straightforward way for any desired value of ν. For instance, for ν = 3, 6, 10, we find α_3 = 1.5384, α_6 = 31.5, α_{10} = 8.5575·10³. We have thus arrived at the following conclusion: as far as the tail of P(E) is concerned, one can use the simple expression (52), with α_ν given in (51). For easy reference, Table 1 reports the values of α_ν and, recalling the definition in (41), the products α_ν 2^{ν/2} κ_ν² and α_ν ν 2^{−ν/2}, for ν = 3, . . . , 12.
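The two-step construction behind (43) - sample the closest-pair separation of a random scene, then average the conditional switch probability - can be sketched directly in code (our own parameter choices; the conditional probability is estimated by Monte Carlo rather than by evaluating (25)):

```python
import numpy as np

rng = np.random.default_rng(5)
nu, r, exp_n = 3, 1.0, 20.0

def sample_scene_xi(sigma):
    """Closest-pair normalized distance xi = d_min / (sqrt(2)*sigma)
    for a Poisson number of targets uniform in the ball B_r."""
    n = max(int(rng.poisson(exp_n)), 2)
    g = rng.standard_normal((n, nu))              # uniform direction ...
    pts = g / np.linalg.norm(g, axis=1, keepdims=True)
    pts *= (r * rng.random(n) ** (1.0 / nu))[:, None]   # ... times radius ~ r*U^(1/nu)
    d2 = np.sum((pts[:, None] - pts[None, :]) ** 2, axis=-1)
    np.fill_diagonal(d2, np.inf)
    return np.sqrt(d2.min()) / (np.sqrt(2.0) * sigma)

def cond_switch_prob(xi, trials=20_000):
    """Monte Carlo estimate of P( (mu+e1).(mu+e2) < 0 ), ||mu|| = xi."""
    mu = np.zeros(nu)
    mu[0] = xi
    w = mu + rng.standard_normal((trials, nu))
    z = mu + rng.standard_normal((trials, nu))
    return np.mean(np.sum(w * z, axis=1) < 0)

def p_error(sigma, scenes=200):
    # Two-step marginalization: average the conditional switch
    # probability over the random closest-pair separation.
    return float(np.mean([cond_switch_prob(sample_scene_xi(sigma))
                          for _ in range(scenes)]))

p_lo, p_hi = p_error(0.02), p_error(0.2)
```

Shrinking σ stretches every scene's normalized separation ξ, so the marginalized error probability drops sharply - the qualitative behavior that (52) captures linearly in χ_ν.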
It is worth noticing that the error probability P(E) in (52) is proportional to (E[N])², as the minimum distance between randomly located points is stable when multiplied by the squared number of points, according to spacings theory [14], [30].

1) BEHAVIOR FOR LARGE ν
For asymptotic evaluations of (51) in the regime ν → ∞, it can be shown that the main contribution to α_ν comes from the central term of the summation, with index (ignoring the flooring operation) m = (ν − 3)/4. Considering only this contribution from the expression in square brackets in (51) yields the asymptotic expression (54) for α_ν, where we have repeatedly used the approximation Γ(x) ≈ x^x e^{−x}.
Recall from (1) that each component of the generic target x_i is affected by additive Gaussian noise of variance σ². Suppose ν is large. By the law of large numbers,

(1/ν) ||ε_i(s)||² = (1/ν) Σ_{k=1}^ν ε_{i,k}²(s) ≈ σ²,   (53)

which reveals that r̃_i ≈ σ√ν is the effective radius of the hyperball occupied by the generic target, whose volume is κ_ν (σ√ν)^ν. Likewise, when ν grows, it makes sense to let the radius of the surveyed region r grow as

r = √ν σ_s   (55)

for some σ_s > σ. A measure of the maximum number M of non-overlapping targets in a hyperball of radius r is given by the ratio of the volume of the surveyed region to the effective volume filled by a single target. With r given in (55) this ratio is (see Footnote 6)

M = κ_ν r^ν / ( κ_ν (σ√ν)^ν ) = (σ_s/σ)^ν.   (56)

The error probability P(E) given in (52) can be expressed in terms of the ratio σ_s/σ as follows:

P(E) ≈ α_ν χ_ν = α_ν (E[N])² 2^{ν/2} ν^{−ν/2} (σ/σ_s)^ν.   (57)

In the regime of large ν, from (57) we obtain the asymptotic behavior (58), where we used the asymptotic expression (54) for α_ν and the approximation Γ(x) ≈ x^x e^{−x}. In terms of asymptotic rate, namely, first-order approximation at the exponent, this implies

P(E) ≐ exp( −(ν/2) log( e σ_s² / (4 σ²) ) ),   (59)

where ≐ denotes equality to the first order in the exponent.

Footnote 6: With r held fixed as ν grows, one would get M = ν^{−ν/2} (r/σ)^ν, which does not render adequately the geometrical concept of how the hyperspheres occupied by the targets fill the space.
In (59), the quantity log(σ_s²/σ²) plays the role of an asymptotic performance index (a signal-to-noise ratio). As we have assumed σ_s > σ, it follows that log(σ_s²/σ²) > 0. From (59) we see that the error probability converges to zero exponentially fast as ν → ∞ provided that σ_s/σ > 2/√e ≈ 1.213. We can interpret the above results in light of the sphere-hardening phenomenon: when ν → ∞, it suffices that σ_s > 2σ/√e to ensure that the volume κ_ν r^ν = κ_ν (√ν σ_s)^ν contains a diverging expected number E[N] of effective targets, each filling the volume κ_ν (√ν σ)^ν, which do not overlap with each other, yielding a vanishing error probability. This brings an analogy with Shannon's channel coding results [10].
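The concentration ||ε_i|| ≈ σ√ν behind the "effective radius" argument is easy to visualize numerically; a small sketch with an arbitrary σ:

```python
import numpy as np

rng = np.random.default_rng(6)
sigma = 0.3
for nu in (10, 100, 1000):
    eps = sigma * rng.standard_normal((5000, nu))   # 5000 noise vectors in R^nu
    norms = np.linalg.norm(eps, axis=1)
    ratio = norms / (sigma * np.sqrt(nu))           # length relative to sigma*sqrt(nu)
    print(nu, ratio.mean(), ratio.std())            # mean -> 1, spread -> 0 as nu grows
```

As ν grows, the noise vector length concentrates tightly around σ√ν (the relative spread shrinks like 1/√(2ν)), which is the sphere-hardening effect invoked above.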
2) BEHAVIOR FOR SMALL ν

Using (53) and (55) in (43), we obtain the expression (60) for P(E), in which the constant δ = 1.0177 appears. It is seen that (60) is not monotonic in ν, for 3 ≤ ν ≤ 20. As an example, for σ_s/σ = 10, P(E) is decreasing for ν < 11 and increasing for ν > 11. An insightful tradeoff emerges. The larger the dimension of the observations, the more information is collected, which is beneficial in terms of error probability. On the other hand, large dimensions make the association problem more challenging, with a negative effect on the error probability. The same qualitative behavior observed by varying ν is seen if the product √ν σ_s is replaced by r and held constant, see (55).

V. NUMERICAL INVESTIGATIONS
The results of the first campaign of computer simulations, shown in Fig. 6, corroborate the previous analysis. Crosses refer to the actual case with multiple association errors, while circles refer to a single "pairwise switching," which is the assumption used to derive the analytical formulas. To be clear, the approximation in (43) is based on the assumption that there is only one "switch" of two targets that causes the association error, and that this switch takes place between the two targets that are closest to each other. Fig. 6 (with theoretical curves already shown in Fig. 5) confirms that single pairwise switching is the main source of error for small P(E). For denser target scenes - meaning larger χ_ν values, reflective either of greater measurement uncertainty (larger σ) or more targets - errors occur more frequently, as sometimes non-proximate pairs switch, and more perniciously sometimes there are multiple-target exchanges. Each such error is counted in simulation: if four targets were involved in a messy exchange then two errors would be counted. It is instructive to refer to [8], where all these possible events are clearly laid out and considered (mostly) by simulation. At any rate, the match between simulated points and theoretical predictions is very good in the addressed regime of small P(E). A second campaign of simulations gives the results summarized in Fig. 7. Here the number of Monte Carlo runs is 10⁶ and E[N] = 25. Note that χ_ν can be varied through both its numerator and denominator; accordingly, we report simulation points obtained by fixing r and varying σ, and also by fixing σ and varying r. We also offer a comparison with the approximate curves derived in [26]. The approximation of [26] is more accurate than the one here for "large" values of P(E), reflecting dense target scenes, and may be better suited for such regimes.
On the other hand, P(E) << 1 is key to the approximation here, so its match to empirical truth is significantly better than that of the formula provided in [26] for lower values of χ_ν. The computed results follow simulation decently up to quite high probabilities of misassociation for low-dimensional situations (smaller ν); as the dimension becomes large there are greater opportunities for targets and target groups to approach one another in complex ways, so for (say) ν = 9 the agreement seems to wane above 10%.
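A reduced version of such a simulation campaign can be sketched as follows (our own parameters; a fixed target count and exhaustive search over permutations stand in for the Poisson scene and a JVC-type solver):

```python
import numpy as np
from itertools import permutations

rng = np.random.default_rng(7)
nu, n = 2, 6
perms = np.array(list(permutations(range(n))))    # all 720 candidate assignments
rows = np.arange(n)

def association_error_rate(sigma, runs=500):
    """Fraction of wrongly assigned targets when the globally optimal
    assignment (3) is found by exhaustive search (n is small here)."""
    errors = 0
    for _ in range(runs):
        x = rng.uniform(-1.0, 1.0, (n, nu))       # targets
        y1 = x + sigma * rng.standard_normal((n, nu))
        y2 = x + sigma * rng.standard_normal((n, nu))
        c = np.sum((y1[:, None] - y2[None, :]) ** 2, axis=-1)
        totals = c[rows, perms].sum(axis=1)       # cost of every permutation
        best = perms[np.argmin(totals)]
        errors += int(np.count_nonzero(best != rows))
    return errors / (runs * n)

rate_clean, rate_noisy = association_error_rate(0.01), association_error_rate(0.15)
```

Counting every wrongly assigned target (not just pairwise swaps) mirrors the error-counting convention described above, and the clean-scene rate is dramatically smaller than the noisy-scene rate, as the analysis predicts.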

VI. CONCLUSION
A necessary first step prior to the fusion of multi-object/multi-sensor data for improved accuracy is to ensure that all data being fused refer to the same truth objects. In the case of two sensors, which we treat here, this is a list-matching problem, which is optimally solved surprisingly quickly, even for large numbers of objects. However, errors are sometimes made, and it is key to determine how often. This paper gives a (reasonably) simple approximation for the probability of a list-match error, valid in the case of isotropic measurement noise at each sensor and asymptotically tight in the low-error regime.
Expressions for sequential multi-sensor fusion error probability are in [36], as are those for the non-isotropic case. The latter simplifies nicely for the case that covariances are a property of the sensor or of the target (but not both) although simulation evidence has suggested that the approximations work well regardless. Sensor bias is examined by simulation in [8] and an easy overlay to the current work is suggested in [36], but remains a matter for future work. The case of the "subset problem" -that the object lists at the sensors are not the same, as when targets are missed or that target lists contain clutter objects -remains open, and will be the subject of future investigation.
We would finally like to note that our two-step approach -of finding the probability of a switch between a pair of targets with a given separation, then marginalizing this using a probability density describing the minimum separation of two targets in a field of Poisson targets -is somewhat different from other attacks on the problem that focus on a field of Poisson measurements. There is perhaps more intuitive appeal in focusing on Gaussian measurements coming from Poisson targets versus measurements themselves that are Poisson; but then again it is arguable whether the Poisson assumption is all that realistic anyway. However, our approach does have the advantage that one can stop after the first step, and simply use (25) -or (26) and (27) -to find the probability of a switch between two known targets. That is, if instead of describing an average fusion complexity one wanted an equation that explained the association difficulty of a particular scene, that is easily done using those midway-point equations.

APPENDIX A MOMENT-GENERATING FUNCTION
Let us consider two independent Gaussian random vectors x and y of size ν, whose entries are mutually independent. Suppose that the mean vectors are m_x and m_y, and the covariance matrices are diagonal: σ_x² I and σ_y² I, respectively, where I is the ν × ν identity matrix. In symbols: x ~ N(m_x, σ_x² I) and y ~ N(m_y, σ_y² I). Let us consider the Moment-Generating Function (MGF) of the inner product x^T y, namely φ(t) = E[exp(t x^T y)]. Given y = ȳ, it is easy to show that the random variable x^T ȳ is Gaussian with expected value m_x^T ȳ and variance σ_x² ||ȳ||². Thus, the (conditional) MGF of x^T ȳ is

exp( t m_x^T ȳ + t² σ_x² ||ȳ||² / 2 ),

and φ(t) is obtained by averaging this expression over the distribution of ȳ, see (A.8). Arranging the terms and defining h_t = 1 − (σ_x σ_y t)² and σ̃_t² = σ_y²/h_t, the integral appearing in (A.8) can be reduced to the MGF of a Gaussian random vector ȳ ~ N(m_y/h_t, σ̃_t² I), which yields (A.9). When x and y have the same distribution, namely σ_x = σ_y = σ_φ and m_x = m_y = m_φ, specializing (A.9) we obtain (A.10). Further assuming σ_φ = 1 and m_φ = ξ_{i,j} yields the final expression

φ(t) = (1 − t²)^{−ν/2} exp( ξ² t / (1 − t) ),   (A.11)

where ξ = ||ξ_{i,j}||. Using the notations adopted in the main text of this document, we see that the right-hand side of (A.11) gives the MGF of the inner product W·Z appearing in (13), see (10). Thus, defining U = W·Z for simplicity of notation, we have

φ_U(t) = E[exp(t U)] = (1 − t²)^{−ν/2} exp( ξ² t / (1 − t) ),   (A.12)

where W and Z are IID Gaussian vectors with covariance matrix I and mean vector with norm ξ. Note that the region of convergence of φ_U(t) is t ∈ (−1, 1). Note also that φ_U(t) is differentiable on (−1, 1) and that its second derivative is positive there. As is well known, the former gives E[U] = (d/dt) φ_U(t)|_{t=0}, and the latter implies that φ_U(t) is strictly convex over (−1, 1). It also follows that the same two properties hold for φ_U(−t).
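Under our reading of the final expression (A.11)-(A.12), the MGF is φ_U(t) = (1 − t²)^{−ν/2} exp(ξ² t/(1 − t)) for t ∈ (−1, 1). A Monte Carlo check of that closed form (parameters are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(8)
nu, xi, trials = 4, 1.5, 400_000

# U = W.Z with W, Z IID N(mu, I), ||mu|| = xi (mean along the first axis).
mu = np.zeros(nu)
mu[0] = xi
u = np.sum((mu + rng.standard_normal((trials, nu))) *
           (mu + rng.standard_normal((trials, nu))), axis=1)

def mgf_closed_form(t):
    # Conjectured closed form: (1 - t^2)^(-nu/2) * exp(xi^2 * t / (1 - t))
    return (1.0 - t * t) ** (-nu / 2.0) * np.exp(xi * xi * t / (1.0 - t))

checks = {t: (float(np.mean(np.exp(t * u))), float(mgf_closed_form(t)))
          for t in (-0.3, 0.2)}
```

The empirical averages of exp(tU) match the closed form at interior points of (−1, 1), and the sample mean of U matches E[U] = ξ², i.e., the derivative of the MGF at t = 0.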

APPENDIX B CHERNOFF BOUND
Let us consider P(E_{i,j}) = P(U < 0) = P(W·Z < 0), see (13). Using the expression of the MGF of U given in (A.12) of Appendix A, we obtain the following Chernoff bound:

P(U < 0) ≤ E[exp(−t U)] = φ_U(−t)   (B.1)

for any t ∈ [0, 1), where we have used Markov's inequality. Since this must be true for any t ∈ [0, 1), we have

P(U < 0) ≤ min_{t ∈ [0,1)} φ_U(−t) = φ_U(−t*),   (B.2)

where t* is the minimizer. Since φ_U(−t) is strictly convex in [0, 1), t* can be computed by solving (d/dt) φ_U(−t) = 0. The result is

t* = [ −(ν + ξ²) + sqrt( (ν + ξ²)² + 4νξ² ) ] / (2ν),   (B.5)

which belongs to [0, 1), as is more evident by rewriting it as

t* = 2ξ² / ( ν + ξ² + sqrt( (ν + ξ²)² + 4νξ² ) ).

Note that t* is a function of the ratio ξ²/ν. From (B.2), the final bound is

P(E_{i,j}) ≤ (1 − t*²)^{−ν/2} exp( −ξ² t* / (1 + t*) ),   (B.6)

with t* given in (B.5).
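With the MGF above, the bound and its minimizer can be computed directly; the sketch below (our own parameter choices, and our reading of the optimality condition, ν t(1 + t) = ξ²(1 − t)) checks that the bound indeed dominates a simulated P(W·Z < 0):

```python
import numpy as np

rng = np.random.default_rng(9)

def chernoff(xi, nu):
    """Bound P(W.Z < 0) <= min over t in [0,1) of E[exp(-t W.Z)].
    The minimizer solves the quadratic nu*t^2 + (nu + xi^2)*t - xi^2 = 0."""
    b = nu + xi * xi
    t = (-b + np.sqrt(b * b + 4.0 * nu * xi * xi)) / (2.0 * nu)
    bound = (1.0 - t * t) ** (-nu / 2.0) * np.exp(-xi * xi * t / (1.0 + t))
    return float(bound), float(t)

xi, nu, trials = 2.0, 3, 400_000
mu = np.zeros(nu)
mu[0] = xi
w = mu + rng.standard_normal((trials, nu))
z = mu + rng.standard_normal((trials, nu))
p_hat = float(np.mean(np.sum(w * z, axis=1) < 0))
bound, t_star = chernoff(xi, nu)
```

As expected for a Chernoff bound, the computed bound sits above the simulated probability, and t* lies strictly inside [0, 1); the gap narrows in the exponent as ξ grows.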