Bayesian Cramér-Rao Lower Bound for Magnetic Field-Based Localization

In this paper, we show how to analyze the achievable position accuracy of magnetic localization based on Bayesian Cramér-Rao lower bounds and how to account for deterministic inputs in the bound. The derivation of the bound requires an analytical model, e.g., a map or database, that links the position that is to be estimated to the corresponding magnetic field value. Unfortunately, finding an analytical model from the laws of physics is not feasible due to the complexity of the involved differential equations and the required knowledge about the environment. In this paper, we therefore use a Gaussian process (GP) that approximates the true analytical model based on training data. The GP ensures a smooth, differentiable likelihood and allows a strict Bayesian treatment of the estimation problem. Based on a novel set of measurements recorded in an indoor environment, the bound is evaluated for different sensor heights and is compared to the mean squared error of a particle filter. Furthermore, the bound is calculated for the case when only the magnetic magnitude is used for positioning and the case when the whole vector field is considered. For both cases, the resulting position bound is below 10cm indicating an high potential accuracy of magnetic localization.


I. INTRODUCTION
Magnetic localization uses position-dependent distortions of the Earth magnetic field. Distortions are caused by ferromagnetic material in the environment and are persistent over time as long the environment remains the same. The existence and the usefulness of such distortions have been shown for different environments and use cases. The majority of the literature published in this area is concerned with indoor localization of pedestrians and robots [1], [2], [3], [4], [5], [6], [7], [8], [9], where the distortions are due to steel in the walls, floors, and in furniture. For indoor environments, magnetic localization is often combined with an inertial navigation system or an odometer to improve the accuracy and reliability. Besides indoor environments, magnetic distortions were shown to enable localization also The associate editor coordinating the review of this manuscript and approving it for publication was Jenny Mahoney. along roads [10], [11], [12], railway tracks [13], [14], [15], and in the airspace [16]. While on roads and railway tracks, the distortions are well pronounced and can be measured with low cost magnetometers, in the airspace the magnetic variations have smaller amplitudes. Hence, high quality magnetometers in combination with an inertial navigation system are required in the airspace.
Position estimation based on magnetic distortions is a highly nonlinear estimation problem. The nonlinearity is introduced in the likelihood in which a map of the distortions is required as the link between the estimated position and the observed magnetometer measurements. Due to this nonlinearity, also nonlinear estimation algorithms are required. A popular approach for magnetic localization is the particle filter used, e.g., in [3], [4], [6], [10], [13], and [16]. Besides the particle filter also fingerprinting methods based on convolutional neuronal networks [8], k-nearest neighbors [2] and dynamic time warping [1] are proposed in the literature that perform position estimation based on a batch of data that is fitted to a database of magnetic fingerprints. In this paper we focus on approaches based on Bayesian filters because they allow an easy integration of motion models and control inputs and provide the uncertainties of the estimated position. In particular particle filters are investigated, were the advantage of the particle filter is its simple implementation and capability to handle also ''strong'' nonlinearities for which linearized filters like the unscented or extended Kalman filter might diverge or give unsatisfying results.
The sampling-based nature of the particle filter has the disadvantage of a high computationally cost compared to Kalman filters, particularly in large state spaces. Fortunately, when only position, speed, and acceleration have to be estimated, the complexity is small enough to carry out the calculation in real time with a reasonable update rate. Even though the particle filter was shown to achieve good results in different scenarios, it is still unclear if it is even close to optimality. In this paper, we therefore show how to obtain a lower bound on the mean squared error. The benefit of such a bound is that it allows to determine the minimal achievable position error for a given magnetic field and it can be used as a benchmark during the development of a localization system to check if a specific estimation algorithm is close to optimality. The bound can also be used to check if a given magnetic field, e.g., inside a room, enables localization with an accuracy that fulfills the requirements of the targeted application. If the requirements are not fulfilled, one has to think about using additional sensors in combination with the magnetometer to achieve the desired accuracy before developing the actual localization system.
The proposed lower bound is based on the Bayesian Cramér-Rao lower bound (BCRLB) introduced by Van Trees [17]. The BCRLB combines the Cramér-Rao bound defined by the inverse Fisher information matrix (FIM) with a state space model allowing to incorporate prior knowledge, e.g., a motion model into the bound. Furthermore, in [18] a recursive version of the bound was derived that reduces the computational complexity. The BCRLB requires a measurement model that is analytically known and that leads to a differentiable likelihood. For magnetic localization this causes some difficulties because the true magnetic field map and hence the likelihood is unknown in general. The magnetic map is only known (with some uncertainty) at positions at which it was measured. Using the laws of physics to describe the magnetic field in analytical form is not possible because this would require an exact knowledge of all materials in the environment, its physical properties, and dimensions. In addition, even when the environment is exactly known, the involved differential equations have to be solved with numerical methods. In this paper, we therefore follow the approach presented in [19], where Gaussian processes (GPs) are learned based on a training data set to describe the measurement model. In [19], the authors applied this idea to obtain a lower bound for localization with received signal strength fingerprints. In contrast to this, here we adapt this methodology to the problem of magnetic localization and we derive the equations required for implementation and we show how a deterministic input can be incorporated into the bound. To the best of our knowledge this has never be done before.
After an introduction to the topic in Section II and a derivation of the equations specific for magnetic localization in Section III, an evaluation on a novel data set is performed in Section IV. In the evaluation, we use magnetometer measurements recorded in our laboratory to investigate the achievable accuracy of magnetic localization. The bound derived from the measurements is compared to the mean squared error (MSE) of a particle filter to check if the filter is close to optimality. Furthermore, the bound for the magnetic field measured at different heights is investigated and we compare the bounds for the cases when the whole magnetic vector or only its magnitude is considered in the estimation.

II. BAYESIAN Cramér-RAO LOWER BOUND
In this section, the BCRLB and GPs are introduced and it is shown how they can be combined. In addition, an efficient recursive version of the bound is derived.

A. BCRLB FOR ANALYTICAL LIKELIHOOD MODELS
The BCRLB is a Bayesian bound on the minimal achievable MSE. In contrast to the classical Cramér-Rao bound, which is a bound on the variance of an unbiased estimator, the BCRLB is also valid for biased estimators. The BCRLB is defined by the inverse of the Bayesian information matrix (BIM) J and fulfills the matrix inequality where M is the MSE matrix for an estimatorx that estimates the state vector x given the observations z. The index of the expectation operator E x,z [·] indicates that the expectation is taken w.r.t. the joint probability density function (pdf) p(x, z).
The inequality sign states that the difference M − J −1 is a positive semi-definite matrix. The BCRLB and the BIM was first introduced by Van Trees in [17] and is given by The differential operator a b = ∇ b ∇ T a is an outer product of the nabla operator ∇ a = ∂ ∂a 1 · · · ∂ ∂a N T and when applied to a function gives its Hessian matrix. The regularity conditions for the bound are given in [17] and assume that for the bias the following limits are fulfilled for each element x i of vector x = x 1 · · · x n x T . In addition, the first and second partial derivatives of the joint density p(z, x) w.r.t. the state must be integrable to ensure the existence of the expectations in the bound. The BCRLB (2) can be also split into a data-dependent part and a part depending on prior knowledge by applying the Bayes' rule on the joint density p(z, where F(x) is the FIM of the likelihood p(z|x). The BIM therefore is the sum of the expectation of the FIM over the state distribution p(x) and the information obtained from the prior distribution itself. Connecting the BIM to the FIM is useful in practice because for many likelihoods the FIM can be found in the literature. In the BCRLB it is assumed that the measurement model and the likelihood p(z|x) is analytically known and the corresponding F(x) can be calculated. This is the case for many estimation and localization problems. For example, for global navigation satellite systems the range measurement model is given by the Euclidean distance between the receiver and satellite antenna and some additive white Gaussian noise (AWGN). For such models the FIM is well defined and easy to calculate. As we will see in the next section, this is not the case for magnetic localization.

B. BCRLB FOR MEASUREMENT-BASED LIKELIHOODS
In contrast to the previous section, now the case is addressed in which the measurement model itself is not known analytically but noisy measurements are available. In the following, we differentiate between the true but unknown measurement model where n k ∼ N (0, σ 2 n ) is AWGN andh (·) is the function that relates the state x k ∈ R n x to the measurements z k at time step k. Becauseh (·) is unknown, an approximatioñ h (x) ≈ h (x, D) based on some measured training data set D = {X D , Z D } is required, where Z D is a set of noisy measurements of the true measurement model and X D are the corresponding known inputs. Here only functions from an input space with dimension n x to a scalar outputh : R n x → R are considered. For vector-valued functions, e.g., the magnetic vector field, the measurement model is considered a combination of multiple scalar functions.
In principal, there are many ways to approximate a function but for the use in the BCRLB an approximation should be chosen that accounts for its stochastic nature introduced by the training data set. This paper follows the approach proposed in [19] where the measurement function is approximated with a GP. The advantage of the GP approximation is that the resulting function will be differentiable and for each input, here the state vector x, the function value is Gaussian distributed allowing for a straightforward calculation of the FIM and BIM. To properly derive the bound for GP based measurement models the BCRLB is conditioned on the training data. This is achieved by conditioning all pdfs on the training data set D. In contrast to the bound in [19], here we condition the bound also on a deterministic control sequence U which allows us to better control the state trajectories over which the bound is calculated, more details on this are presented in Section III. For brevity, in the following the training data set and the control input is combined in the set D u = {D, U}. The bound conditioned on D u is given by with the BIM The proof for the conditional bound is exactly the same as the proof for the normal BCRLB in (1) which can be found in [17]. The only difference is that the pdf p(z, x) is exchanged with the conditional pdf p(z, x|D u ). For the regularity condition this results in with the limits As before in (5), the bound consist of a data dependent part and a part accounting for the prior information The main step in obtaining the BCRLB is the calculation of the FIM where the likelihood p(z|x, D u ) conditioned on the training data is required.

1) LIKELIHOOD OF GP-BASED MEASUREMENT MODELS
In this section, we give a short overview on the GP equations that are required for the derivation of the GP based BCRLB. For a more detailed introduction about GPs and a detailed derivation of the following equations we recommend reading [20]. For the calculation of the BCRLB, the approximation of h (·) is considered to be a GP. Therefore, the output of the function itself is a Gaussian random variable with pdf where µ(x k , D u ) is the posterior mean and r(x k , D u ) the posterior covariance. The posterior mean and covariance of a GP are functions of the training data and the input for which the measurement function has to be evaluated where the shorthand notation K DD = K(X D , X D ) and K x k D = K(x k , X D ) is used to describe the covariance matrices between the training points with themselves and the covariance between the training points and the input for which the GP is evaluated. The covariance matrix K(·, ·) for two sets A and B of input points has the form and is fully described by the covariance function k (·, ·) evaluated at the different set elements {A} i and {B} j . The noise of the training data is accounted for with the variance σ 2 D . The choice of the covariance function depends on the underlying true function that is approximated. In this paper, the GP approximates a function that maps each input, here the position of a magnetometer, to the magnetic field value at that input. From practical experience it is known that the magnetic field changes with the position and that the field at two positions becomes less correlated when the distance between the points increases. As proposed in [5] and [7], this behavior can be captured with the well-known squared exponential kernel that decays with increasing distance between the input points. The parameter l is the length scale and controls how wide the kernel is and how rapid the covariance decays. The other parameter is the variance σ 2 k that is a measure of the overall variance of the function that is approximated. The mean function is assumed to be constant because we expect the magnetic field to vary around the value of the Earth magnetic field that can be considered homogenous in small areas like a room.
To obtain the likelihood required for the FIM in (12), the measurement model (6) is combined with (13). The observation z k is according to (6) the sum of two Gaussian random variables, one for the GP that describes the measurements function and one for the noise of the sensor. The sensor noise is considered white and independent from the GP. The likelihood of the observation is therefore the convolution of the pdf in (13) and the noise pdf The resulting likelihood is again Gaussian and the covariances of the GP and the noise are simply summed up. Intuitively this is plausible, the likelihood gets wider and less informative when the sensor noise, the uncertainty in the GP or both increase.

2) FISHER INFORMATION MATRIX OF GP-BASED LIKELIHOODS
In this section, the FIM for the likelihood in (17) is presented. The equations for the FIM are obtained in a straightforward manner from the equations of the general Gaussian case [21] [ The FIM in (18) is the element in i-th row and j-th column. To get the FIM for the GP-based measurement model therefore the derivatives of the mean and covariance in (17) w.r.t. the state variables are required. For any differentiable covariance function k (·, ·) and mean function m (·) the derivatives of the posterior mean are For the posterior covariance the derivatives are given by where we used the notation S = K(X D , X D ) + σ 2 D I and z = z D −m (X D ). For a specific GP model the only thing left to do when calculating the FIM is to plug in the derivatives of the mean and covariance function into (19) and (20). For the GP in this paper those are the derivatives of the squared exponential kernel (16) and the constant mean function where a is an arbitrary vector with appropriate dimension.

C. BCRLB FOR NONLINEAR FILTERING
Magnetic localization is a dynamic estimation problem because the position will change over time. Therefore, the BCRLB for the state x k at a specific time step k introduced in the previous section has to be extended to state sequences. VOLUME 10, 2022 Furthermore, during localization the state at the current time step typically depends on the previous one. This dependency can be described in form of a motion model as it is used, e.g., in Kalman and particle filters. For the following nonlinear motion models of the form are considered. The process noise w k−1 is assumed to be AWGN. In the derivation of the BCRLB for nonlinear filtering this is not required but for the implementation of the BCRLB the AWGN case allows for some simplifications. For the filtering bound let T be a vector in which the state sequence from time step 0 to k + 1 is stacked.
T is the vector containing the sequence of measurements. According to (8), for X 0:k+1 the BIM is where the index of J indicates for which state sequence it is valid and p k+1 is the joint pdf The dimension of J 0:k+1 is (k + 2)n x × n x (k + 2) and hence for long sequences the BIM becomes hard to invert. In [18], the authors address this issue with a recursive version of the bound in which only matrices with dimension n x × n x are inverted. In the next paragraph, we follow the proof given in [18] and point out where some assumptions have to be made about the GP in the measurement model in order to obtain an efficient recursive form of the bound. The basis for the recursive calculation is to split which allows us to write the BIM as a block matrix where we used that the differential operator in (24) can be written in block form Furthermore, if J 0:k+1 is the BIM for the sequence X 0:k+1 , the BCRLB for the single state x k+1 is given by the lower right block of J −1 0:k+1 . With block matrix (26) and the Schur complement we obtain In (28) only the upper left block matrix A k+1 has to be inverted but this still has the dimension (k + 1)n x × n x (k + 1).
To further reduce the dimension of the involved matrices the different blocks in (26) need to be broken down themselves into blocks. In the following the expectations are always taken w.r.t. the joint density p(X 0:k+1 , Z 1:k+1 |D u ) when no other density is explicitly mentioned in the index of the expectation operator. For the joint density it is also useful to define the identity The matrix A k+1 is split into four blocks Matrix B k+1 is split only into two blocks With (30) and (31) the BIM in (28) becomes The efficient form presented in [18] requires the matrix B (1) k+1 to be zero because this reduces (32) to where the lower right block of A −1 k+1 is obtained from (30) and the Schur complement It can be seen from the definition of B (1) k+1 in (31) that the matrix becomes zero when no part of the joint pdf in (29) depends on x k+1 and X 0:k−1 at the same time due the differentiation w.r.t. this state vectors. This is fulfilled when both, the likelihood and the motion model have the Markov property The matrices in (33) all have the dimension n x × n x but the calculation of (34) still involves larger matrices. Fortunately, due to the Markov property it can be shown [18], that where the matrices A k , B k , C k are the matrices in (26) for time step k and D 11 . Given (28), (34) and (36) the connection to the previous BIM is established by Inserting (37) into (33) results in the recursive bound for which all involved matrices have dimension n x × n x For better readability and comparability in the following we will use the notation from [18] where the bound is given by The Markov property required for the efficient recursive formulation of the filtering bound implies that the GP as a part of the likelihood must be learned offline. Theoretically, as it is the case for simultaneous localization and mapping, the GP can be also adapted online based on all available measurements. For the online learning case, the BCRLB has to be calculated by inverting the large BIM for the whole sequence and the derivation of the concrete equations for implementation becomes much more involved as pointed out also in [19]. Fortunately, the bound of interest here is for localization only and assuming an offline trained GP is reasonable in practice.

III. IMPLEMENTATION OF THE BOUND
In the following, the implementation of the recursive bound for motion models of the form (23) is discussed.

A. EQUATIONS FOR GENERAL NONLINEAR MOTION WITH AWGN AND GP-BASED LIKELIHOODS
For the calculation of (39), the matrices D 11 k , D 12 k and D 22 k are required. In addition the recursion must be initialized with the BIM J 0 at time step zero at which no measurement is available yet. From (30), (31), and (26) one can see, that the required matrices are Note, in the above equations the expectations are theoretically over the complete joint density but the variables not part of the argument are simply marginalized out of the joint density by taking the expectation. A connection to the FIM of the current measurement is obtained from rewriting D 22 For the system model (23) with the Gaussian pdf p(x k+1 |D u , x k ) = N (f (x k , U) , Q) this results in the following equations where F T k = ∇ x k f (x k , U) T is the Jacobian matrix of the nonlinear motion model w.r.t. the state vector. The matrices D 11 k and D 12 k only contain information related to the movement model while D 22 k contains the information obtained from the newest measurement. For most cases the expectations have to be calculated numerically because the involved integrals are intractable. One exception is the linear case, for which closed -form solutions exist.
In (42), the expectations are all taken w.r.t. pdfs of the state x conditioned on the data set and the inputs. After some thought, this makes sense, since the bound assumes that the motion model describes for each time step the distribution of the state vector in the state space, e.g., how likely a robot is at a certain position in space. In which part of the state space the state vector ends up depends on the realization of the process noise and all these possibilities must be reflected in the bound. For the likely case that no closed-form solution exists for the expectation, it can be obtained by simulating multiple state trajectories, evaluating the FIM and the other terms for the different trajectories and then averaging over them in each time step. Due to the law of large numbers, the numerical solution converges to the true solution when enough trajectories are used. The calculation of the expected values was also the reason to include inputs into the bound. The input gives some control about the simulated trajectories. This allows us to simulate more realistic trajectories, e.g., a pedestrian in a building most likely will not perform a random walk. Also, the input enables the simulation of trajectories that stay within a certain part of the state space for which the GP in the likelihood is ''well'' defined by sufficiently dense measurements. Including the inputs also alters the estimation problem. The resulting bound holds only for filters that also have access to the inputs. For the case where no inputs are available, the equations for the bound remain the same. One simply has to remove the conditioning on U in the involved pdfs. The numerical evaluation of the expectations comes with considerable computational costs but is still much faster than evaluating the MSE of a filter that requires not only the averaging over different realization of the state vector but also over different realization of the measurement noise.

B. EQUATIONS FOR 2D MOTION OF A WHEELED ROBOT
With the above equations it is now possible to easily evaluated the bound for different types of state space models. For this paper the focus is on localization of a wheeled robot in a 2D plane. The state vector is assumed to contain the x and VOLUME 10, 2022 y position and the heading ϕ The motion model is with the time T between two time steps and the inputs speed v k and turn rate ω k . The inputs are considered deterministic but process noise accounts for imperfections in the actuators that can not translate the input perfectly into motion. The process noise is assumed to be Gaussian

The Jacobian matrix for (44) is
For the observation model we will consider two different cases, first only the magnitude of the magnetic field is used in the likelihood and second the vector field is used. Using only the magnitude simplifies the mapping process because the magnetometer attitude during map creation is not required but may reduce the achievable accuracy. If the accuracy is actually reduced and by how much will be further investigated in the evaluation part of this paper. The question how much the accuracy is affected by the choice of the measurement model is also a nice example for the usefulness of lower bounds. The measurement model for the magnitude is scalar and of the form (6) with the likelihood defined by (17). The FIM for the calculation of the expected value in D 22 k is obtained from (18). For the vector case we follow the approch in [5] and assume that each component of the vector field is approximated by a separate GP. Hence the measurement model is given by with the magnetic map where each element is a GP that is independent from the others. The superscript n indicates that the magnetic field in the map is given in the local navigation frame and the index stands for the different coordinate axis. The measurement is obtained in the body frame b of the magnetometer, to relate the measurement to the map therefore the rotation matrix from navigation to body frame is required. For the rotation matrix we assume the three magnetometer axis are aligned with the vehicle frame as shown in Fig. 1. With this frame definitions the rotation matrix is Based on the measurement model (46) the likelihood for the vector measurements can be calculated applying the rules for affine transformation of multivariate Gaussian random variables. When the random variable a ∼ N (µ a , a ) is Gaussian distributed then the random variable b = Aa follows the Gaussian distribution b ∼ N (Aµ a , A a A T ). Thus, the mean and the covariance of the likelihood ).
(49) are given by and where µ z n (x k , D u ) is the mean vector in the navigation frame and z n is the covariance in the navigation frame In (52) r n i i ∈ {x, y, z} are the GP variances of the different vector field components. Based on the Gaussian likelihood above, the FIM is defined by the vector version of (18) presented in [21] [ In the FIM, all derivatives are w.r.t. the state vector therefore the dependency on D u can be dropped for brevity. The derivatives in (53) are element wise. The derivative of the mean vector (50) therefore is a vector of the derivatives of each component. The derivative w.r.t. x k is For y k the derivative have the same form and only ∂ ∂x k has to be exchanged for ∂ ∂y k . For the heading one obtains For the covariance matrix (51) the derivative is again a matrix containing the derivatives of each element. For x k the non-zero elements of the derivative are As for the mean, the derivative of the covariance w.r.t. y k has the same form as (56). For the heading derivative the non-zero elements are In the equations (54)-(57) for the FIM now the derivatives of the covariance (20) and mean (22) of the GP in the navigation frame can be inserted. With the FIM then the recursive BCRLB is obtained from (39) and (42).

IV. EVALUATION SETUP
In this section, the overall setup used for the evaluation is introduced. This includes the measurement setup, the hyperparameter optimization, and an overview on the parameters of the Monte Carlo simulation.

A. MEASUREMENT SETUP
The evaluation is based on measurements recorded in the Holodeck laboratory at the German Aerospace Center Institute of Communications and Navigation in Oberpfaffenhofen shown in the left part of Fig. 2. The magnetic field is measured with an array of magnetometers. The array shown in the right part of Fig. 2 consists of ten low-cost Kionix KMX62 magnetometers that are connected via CAN bus to a computer that records the data. The distance between two neighboring sensors is fixed to ≈ 0.2 m. The first sensor is mounted 5.4 cm above the floor and the tenth sensor is at a height of 1.85 m. For the recording of the magnetic vector field in the lab the array is moved around manually in a vertical position. The magnetic field therefore is measured in ten different planes with different heights. This enables to investigate the impact of the height on the achievable localization accuracy. The magnetometer measurements are timestamped and synchronized with ground truth pose information of an optical tracking system from Vicon. In Fig. 2 the cameras of the system mounted on the ceiling can be seen. The recorded poses contain the position and the attitude. This enables to correct rotations of the array during the measurements. Prior to the Holodeck measurements, the magnetometers are calibrated with ellipsoid fitting, see e.g. [22], based on a data set recorded on a meadow at the German Aerospace Center site without any buildings close by. The true magnetic flux density of the Earth magnetic field for calibration is obtained from the international geomagnetic reference field model and has a value of 48.61 µT. From the calibration data also the standard deviation of the sensor noise was estimated. The average value over all sensors was σ n ≈ 0.23 µT. The exact value varies from sensor to senor by ±21.26%.

B. HYPERPARAMETER OPTIMIZATION AND TRAINING DATA
The hyperparameters of the GP are obtained by maximizing the logarithm of the marginal likelihood. The optimization is performed with the GPML Matlab toolbox presented in [23] and uses Gaussian hyper priors on the value of the constant VOLUME 10, 2022 mean function and the standard deviation σ D of the noise of the training data. For each sensor the hyperparameters are optimized individually for each axis and the magnitude over all axes.
During the lab measurements, the magnetometers were moved freely while recording with an update rate of 200 Hz which resulted in a large amount of measurements. For training this amount is reduced by placing an equidistant grid in the measured area and then taking for each grid point only the measurement that was closest to it. When setting the distance between grid points there is a trade-off between computational complexity and the accuracy of the GP approximation. When the amount of points increases the GP is more likely to describe the approximated function accurately while the complexity will increase cubical with the number of grid points. Here the distance in the x-and ydirection between two neighboring point was set to 0.1 m. To check if this spacing is sufficient for a good approximation of the magnetic field, cross validation was performed with measurements that are not part of the training data set D.
In Fig. 3-Fig. 5 the result of the GP regression is shown for magnetometer 1, 6 and 10. The measurement locations for training and cross validation of sensor 1 are shown in Fig. 6. We only show here the GPs of three sensors because this already gives a good idea of the general behavior of the magnetic field at different heights. The hyperparameters obtained from optimization, the height and the dynamic range for the investigated sensors are listed in Table 1. The dynamic range is the difference between the maximum and minimum value of the magnetic field of the respective axis. The dynamic range and the hyperparameters confirm what is also apparent from Fig. 3, the magnetic field close to the floor has the highest spatial variability. The magnetic field and parameters of the other sensors show, that when getting higher above the floor the range and variability first decreases and then increases again while the sensor becomes closer to the ceiling. These results comply with our expectations, when a sensor is close to the floor it is strongly affected by ferromagnetic material like rebar. With an increasing distance to the floor this effect is reduced until the sensor becomes closer to the ceiling. This general  trend of the hyperparameters is also visible Fig. 7 where the length scale and the dynamic range is plotted over the ten different sensors heights. Figure 7 also shows that although the hyperparameters follow the same overall trend there are larger differences between the different field components, e.g., from 80-140 cm for the length scale. This could be expected because the influence of ferromagnetic material on a magnetic field is not necessarily the same for all field components as can be observed also from Fig. 3-5.
Before the BCRLB for the above shown magnetic fields is calculated, cross-validation is performed to verify that the learned GPs closely approximate the underlying magnetic field. The histogram of the residual between the   cross-validation measurements and the GP for sensor 1 evaluated at the respective inputs is shown in Fig. 8. The residual for the different sensor axis is approximately Gaussian distributed and has a standard deviation slightly larger than the value σ D obtained from hyperparameter  optimization. This seems to be the case also for the magnitude. The magnitude measurements follow theoretically a non-central chi distribution. Fortunately, when the magnitude is large compared to the measurement noise variance the chi distribution approaches the shape of a Gaussian with a variance equal to the measurement noise and a mean close to the true mean.
For a perfect approximation the residual would contain only sensor noise with a standard deviation of σ n ≈ 0.23 µT. We suspect that the additional ''noise'' is partially caused by inaccuracies in the ground truth. An position error of only 5 mm and an magnetic field gradient of 35 µT/m already causes errors of 0.175 µT. In addition, small attitude errors in the range of 0.2 • and a magnetic field magnitude of 35 µT could lead to deviations of 0.12 µT in the measurements of the different sensor axis. From this we conclude that the residuals are close to what is to be expected and hence the GP seems to capture the underlying magnetic field with a high accuracy.

C. PARAMETERS FOR MONTE CARLO SIMULATION
The numerical approximation of the exceptions in (42) is performed over sample trajectories obtained from the motion model in (44). The quality of the approximation increases with the number N MC of trajectories but using many trajectories also increases the computation time required to VOLUME 10, 2022  Fig. 10. The results indicate that even for 100 trajectories the bound is already close to the bound for N MC = 5000. To gain a little bit more accuracy in the calculation for the remainder of the paper N MC = 750 was selected.

V. RESULTS AND DISCUSSION
In this section, the achievable accuracy of magnetic localization is evaluated with the GP based BCRLB and system model (44). In the evaluation, the BCRLB is calculated for two cases. In the first case, only the magnetic magnitude is considered and in the second case, the whole vector is accounted for. For both cases, the bound for three sensor heights is calculated. Furthermore, the BCRLB is compared to the empirical root mean squared error (RMSE) of a sampling importance resampling particle filter. This comparison shows how well the filter performs and is at the same time a plausibility check for the bound.

A. BCRLB FOR MAGNETIC LOCALIZATION
In Fig. 11 and Fig. 12 the square root of the BCRLB diagonal elements for sensor 1, 6 and 10 are shown. Overall the achievable position accuracy is well below 10 cm. For the heading one can expect accuracies of a few degrees for the magnitude and below one degree for the vector case. Overall using the whole magnetic vector field results in a bound that is always below the one of the magnitude only case. In particular the heading is improved by using the whole vector, this is not only reflected in a higher accuracy but also in an significantly improved convergence time. This behavior was to be expected, because for the magnitude only case the measurements are independent from the heading and hence the entries of the FIM associated with the heading become zero. That the heading is still observable, in the sense that the BCRLB gets reduced over time, is due to the motion model. Over the motion the heading is related to the position which is related to the measurements. The observability therefore is not direct. It should be mentioned, that in this particular example it is assumed that the magnetometer is mounted in the center of rotation. If the sensor would be mounted with a lever arm w.r.t. the center of rotation a rotation would lead to a change in the measured magnitude and the FIM for the heading would contain non-zero entries improving the observability. How well this works will likely depend on the length of the lever arm, is the length small compared to the length scale heading changes would only lead to small changes in the magnitude which can vanish in the sensor noise. In the vector case if the position is known the heading can be obtained from only one measurement. This is the principle on which compasses are based. This direct observability of the heading results in a fast convergence of the vector bound and heading errors below one degree after the first measurement. For the magnitude case the one degree threshold is only reached after almost 6 s.
The comparison between the different sensors show, that overall sensor 1 close to the floor achieves most of the time the lowest values. Considering the definition of the FIM in (53) and the hyperparameters in Table 1 this is not surprising. Sensor 1 has the highest dynamic range and smallest length scale leading to more areas in the magnetic field with strong gradients that give rise to higher values in the FIM. Between Sensor 6 and Sensor 10 the differences are not that obvious anymore, e.g., the heading BCRLB of the vector case for Sensor 6 is smaller than for sensor 10 for most of the time. A similar behavior is observed for the x position between 4 s-8 s for the vector and the magnitude only case.
In summary the results show that in the given environment a sensor close to the floor is able to achieve localization accuracies below 1 cm. For sensors higher above the floor the performance degrades due to a reduced spatial variability in the magnetic field. Nevertheless even for Sensor 6 which has the largest distance from the floor and the ceiling the position accuracy was below 10 cm. If one can chose between the magnitude and the vector case clearly using the whole vector field is the better option from an accuracy point of few. This is not surprising, taking the magnitude of the vector field reduces the information and makes the magnetic field at different position less distinct. From a practical point of view the magnitude case is still interesting because it largely simplifies the mapping in which it is sufficient to know the sensor position in space but not its attitude.

B. PARTICLE FILTER PERFORMANCE
The BCRLB is a lower bound for the MSE of any filter. This means the bound can be used as optimality measure during filter development. In this section its shown how well the particle filter performs when applied to magnetic localization and if there is room for improvement. In particular the sampling importance resampling filter [24] is evaluated. The MSE is obtained from running the particle filter over the 750 state trajectories and 25 realization of the sensor noise along these trajectories. This means in total the filter is evaluated 18750 times. From the result the MSE if obtained by averaging. This approach is needed due the definition of the MSE in (1) as the expected value of the estimation error w.r.t. the joint pdf of the states and measurements.
The used implementation of the particle filter is mostly a standard sampling importance resampling filter. In each time step first a prediction based on the motion model and the inputs is performed and then the weights are updated. If the effective sampling size is below 0.8 · N p resampling is performed. During the implementation of the filter it quickly became apparent that evaluating the GP for every particle slows down the filter significantly. Due to the large number of required filter evaluations this becomes quickly unfeasible. The complexity of the filter is also a problem in practice where for most applications a real time position solution is desired. Therefore, we decided to modify the filter by not evaluating the GP variance as one might do in practice. The variance evaluation is the bottleneck as it requires a matrix vector multiplication while the mean only requires the calculation of the scalar product between two vectors. This can be easily seen from (14). The variance contains the term K x k D S −1 K Dx k that requires the left and right multiplication of a vector that depends on the current state x k and the square matrix S −1 with a dimension equal to the amount of training data. For the mean only K x k D S −1 (z D − m (X D )) has to be evaluated where S −1 (z D − m (X D )) is a vector that depends solely on the training data and hence can be calculated once offline. Neglecting the variance of the GP speeds up the particle filter significantly but also removes the information about the GP uncertainty. Therefore, to account for the variance of the GP in the weight update the standard deviation of the likelihood was set to 1.2 · σ n . To increase the stability of the filter also the process noise was increased by 20 % compared to the noise set in the Monte Carlo simulation. For this settings, Fig. 13 shows the RMSE of the particle filter in comparison to the square root of the BCRLB.
The result in the figure shows that in principle the filter RMSE follows the shape of the bound but cannot reach it. For the vector case, the filter RMSE of the 2D position is quite close to the bound with an difference in the order of millimeters after the filter has converged at around 1 s. When only the magnitude is considered, this gap is larger and more in the range of 1 cm. For position estimation it seems that including the whole magnetic vector improves the result significantly also for the filter. This seems reasonable, considering the whole magnetic vector field makes the field at a certain position more ''unique'' which helps the filter to converge toward the correct position. For the heading one can observe that overall the filter result for the vector field is well below the result and the bound for the magnitude only case. As before this was to be expected. Observing the full vector with each measurement enables the filter to directly observe the heading instead of having to wait until the sensor moves through the changing magnetic field as it the case for the magnitude only case. As already for the bound, this results in a faster decay in the heading uncertainty after initialization.

VI. CONCLUSION
In this paper, the BCRLB with deterministic inputs and GP-based likelihoods was adapted to the problem of magnetic field-based localization. For magnetic localization, two special cases were considered. In the first case, only the magnetic magnitude was assumed to be available to the estimator and in the second case, the whole magnetic vector field was considered. The bound for both cases was then evaluated based on measurements of an indoor environment with multiple sensors mounted in different heights above the floor. To the best of our knowledge this was never done before.
The first step of the evaluation was to optimize the GP hyperparameters individually for all sensors. The resulting hyperparameters show a clear height dependence. With an increasing distance to the floor and the ceiling, the length scale of the squared exponential kernel increases and the overall dynamic range of the magnetic field decreases. In the second step, the BCRLBs for three selected sensors were calculated. The bounds show that for the three sensor heights position accuracies below 10 cm are attainable, for the heading values below 1 • are possible. The results clearly indicate that using the magnetic vector instead of its magnitude improves the position and heading accuracy. This comes with no surprise since taking the magnitude reduces the information contained in the magnetic field. The magnitude case is still quite important in practice since it allows to reduce the effort required for creating the magnetic map by avoiding the need for an attitude reference system. In the final step of the evaluation the BCRLB was compared to the RMSE of a particle filter. The position RMSE of the filter is close to the bound for the vector case but never attains it. For the magnitude case a larger difference is observed. Hence, in both cases the results show that the used particle filter is not optimal w.r.t. the RMSE.