Orientation and Location Tracking of XR Devices: 5G Carrier Phase-Based Methods

Accurate knowledge of the three-dimensional (3D) orientations and 3D locations of user devices, such as wearable glasses, is of paramount importance in different extended reality (XR) use cases and applications. In this article, we address the corresponding six degrees-of-freedom (6DoF) tracking challenge of 5G-empowered XR devices. We describe a new estimation approach based on uplink (UL) carrier phase measurements, allowing for low-latency 3D orientation and 3D location tracking directly at the 5G network base-stations or gNodeBs (gNBs). Practical signal processing algorithms based on the extended Kalman filter (EKF) are described, and the applicable Cramér-Rao lower-bounds (CRLBs) are derived and presented. The related aspect of over-the-air estimation of the XR headset antenna constellation, or antenna geometry, is also addressed. Additionally, the important practical challenges related to user equipment (UE) clock drifting and integer ambiguities in carrier phase-based methods are both considered. Finally, an extensive set of numerical results is provided for an example indoor factory-like environment, covering both 3.5 GHz and 28 GHz network deployments. The obtained results demonstrate the feasibility of continuous 6DoF tracking through the proposed approach, with root mean squared error (RMSE) accuracies below one degree for the 3D orientation and below one centimeter for the 3D location. The results also demonstrate that UE clock drifting and carrier phase integer ambiguities can both be efficiently estimated and tracked as part of the overall proposed concept and methods.

Jukka Talvitie, Member, IEEE, Mikko Säily, Member, IEEE, and Mikko Valkama, Fellow, IEEE

Index Terms: 3D orientation estimation, 3D positioning, 5G NR, 6G, Bayesian filtering, carrier phase, clock drifting, extended reality, integer ambiguity, six degrees-of-freedom, tracking.

I. INTRODUCTION
Mobile communication networks, most notably the fifth generation (5G) New Radio (NR) standardized by the 3rd Generation Partnership Project (3GPP), offer improved bit rates and largely reduced latencies compared to earlier cellular systems [1], [2]. XR is one of the timely new use cases of 5G and 5G-Advanced, paving the way eventually towards the future metaverse in the sixth generation (6G) era [3]. XR is generally an umbrella term, covering different variants of augmented reality (AR), virtual reality (VR), and mixed reality (MR), with a wide variety of consumer-scale and industrial applications.

Fig. 1. Basic XR concept illustration highlighting the field-of-view and its potential impact on the content quality, together with the involved six degrees-of-freedom containing the XR device 3D position and 3D orientation.
The requirements and technical aspects of XR service characteristics in 5G NR are generally well defined and summarized in [4]. One of the distinct features is the requirement for continuous and highly accurate XR device orientation and location estimation, commonly referred to as 6DoF tracking [5], [6]. Depending on the more specific XR application, such 6DoF pose information (illustrated conceptually in Fig. 1) is needed at the network side even at millisecond-level update intervals, while the accuracy requirements can be in the order of centimeters [4], [5], [7]. While user equipment (UE) positioning, and more broadly radio-based situational awareness [8], [9], are generally inherent features of 5G and 5G-Advanced, the accuracy of the existing UL or downlink (DL) based methods falls far short of the XR requirements, especially when it comes to the 3D orientation estimation of the XR device [10], [11]. Terminology-wise, we use the terms UE and XR device interchangeably in the rest of this article, as the XR device operates as a UE from the cellular network perspective.

UE orientation estimation is conventionally performed at the UE side by using antenna array measurements, camera-based methods [12], [13] or inertial sensors such as accelerometers, gyros and magnetometers [14], [15]. When using antenna arrays, the orientation can be estimated directly as part of the channel parameter estimation, where the orientation is specifically linked to the angle-of-arrival (AoA) observations or measurements. Such methods have been developed and described, e.g., in [16], [17], [18], [19], [20], [21], [22], [23]. However, in order to reach high orientation estimation accuracy, large array sizes are commonly required, which increases implementation cost and potentially also introduces additional reference signal overheads and increased latency. Furthermore, and very importantly, most of the existing angle-based methods build on UE side processing and DL measurements. Hence, the orientation information has to be separately communicated from the XR device to the network via applicable messages. Such signaling or message protocols are always subject to latency while also consuming the UL bandwidth.

Fig. 2. Illustration of the fundamental problem geometry with a single TRP and a single UE. For presentation simplicity, the UE contains three antenna points illustrated in red. The figure also highlights the impact of 3D rotation at the UE side. The proposed concept builds on acquiring and processing UE antenna-level ranging estimates through 5G NR reference signals.
In general, inertial measurement sensors are able to provide accurate UE orientation estimation locally, see for example [14], [24] and the references therein. This applies, however, only under appropriate calibration conditions. For example, accelerometers are able to measure orientation displacement over time, but the errors accumulate as time evolves [14]. Furthermore, fusing camera-based or other optical data with inertial measurement unit (IMU) data can facilitate the orientation tracking, but in case of, e.g., new environments, variable lighting conditions and/or UE mobility, the camera- and IMU-based orientation tracking may lose the absolute orientation information, thus making the XR content rendering challenging. Use of local sensors is generally a valid approach, but in this article we deliberately pursue pure cellular radio based 6DoF tracking solutions.
To this end, in this article, we focus on 6DoF XR device tracking through cellular carrier phase measurements. However, the proposed methods can also be applied in the context of other applications calling for 6DoF tracking. While carrier phase-based methods are conceptually known and broadly utilized in the global navigation satellite system (GNSS) context, see for example [25] and the references therein, they are currently also considered in 3GPP standardization for enhanced positioning capabilities in the 5G-Advanced evolution [26], [27], [28]. Compared to the existing state-of-the-art in cellular orientation estimation, we show that the proposed carrier phase-based approach enables accurate orientation estimation with small numbers of antennas. As an extreme example, even a single antenna element can be exploited during orientation tracking. Additionally, and importantly, the UE orientation estimation can be performed based on UL signals, which enables real-time UE orientation estimation also at the gNB or the network transmission and reception point (TRP) side. Hence, the proposed approach enables an effective over-the-air (OTA) inertial sensor like functionality, with minimum latency, while avoiding the error accumulation problems of local IMUs and offering support for full 6DoF tracking.
For clarity, it is noted that the utilization of 5G/cellular carrier phase measurements has been addressed in the recent literature, e.g., in [29], [30], [31]. In [29], carrier phase measurements are utilized for line-of-sight (LoS)/non-line-of-sight (NLoS) identification in complex propagation environments. Actual carrier phase-based ranging and indoor positioning are addressed in [30], where the time-of-arrival (ToA) estimates obtained using already standardized DL reference signals are further enhanced through carrier phase-based processing. The work in [31], in turn, focuses on improved UE clock synchronization through carrier phase measurements, and thereon on improved localization accuracy through time-based positioning methods. However, none of these works consider the important 6DoF challenge and the related 3D orientation estimation, which is the main technical problem addressed in this article.
In relation to the existing literature, the contributions and novelty of this article can be stated and summarized as follows:

• We describe a new carrier phase ranging based 6DoF tracking concept for continuous estimation of XR device 3D orientation and 3D location in 5G and beyond cellular systems;
• We provide applicable CRLBs, and assess how the carrier phase-based measurement accuracy impacts the 6DoF estimation performance;
• We formulate EKF-based state models and tracking algorithms for practical signal processing implementations;
• We propose and describe new concepts and algorithms for over-the-air estimation or calibration of the XR device antenna constellation;
• We provide a further extended problem definition and EKF processing solutions to allow for estimating and tracking also the UE clock drifting and the carrier phase integer ambiguities;
• We provide extensive numerical performance results in the context of 5G NR with realistic deployment assumptions, demonstrating the feasibility of continuous 6DoF tracking using the proposed approach, even in the presence of UE clock drifting and carrier phase integer ambiguities.

The rest of this article is organized as follows: Section II describes the assumed system model in terms of the 6DoF geometry, XR device and network TRP antenna systems as well as the carrier phase measurement model. Section III provides the proposed 6DoF tracking methods while also addressing the performance bounds in the form of CRLBs. Applicable UL-based and DL-based variants of the proposed scheme are also discussed. In Section IV, important extensions to UE clock drifting and integer ambiguity tracking are provided. A large collection of numerical results is provided in Section V, while the conclusions are drawn in Section VI. Finally, selected mathematical details are provided in the Appendix.
Notations: Vectors are denoted by bold lowercase letters (e.g., $\mathbf{a}$), bold uppercase letters are used for matrices (e.g., $\mathbf{A}$), and scalars are denoted by normal font (e.g., $a$). The operators $(\cdot)^T$, $\mathrm{E}_x\{\cdot\}$, and $\|\cdot\|$ denote the transpose, expectation with respect to random variable $x$, and Euclidean norm, respectively.

II. SYSTEM MODEL
In this work, we consider standalone wearable XR devices with built-in 5G NR modem and computing capabilities, such as the device types XR5G-A4 or XR5G-A5 defined in [4]. Specifically, we consider the 6DoF scenario including the estimation of the 3D position and 3D orientation of the XR device, with the basic system geometry being illustrated in Fig. 2. We build on 5G NR carrier phase-based ranging measurements that can be obtained from physical signals as described, e.g., in [30], [31]. Hence, our emphasis in this article is on the concepts and signal processing on top of the raw carrier phase measurements, instead of individual ranging algorithms. For clarity, it is noted that the basic system model considered in this section excludes UE clock drifting and carrier phase integer ambiguity challenges. However, these are addressed explicitly in Section IV.

Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
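As a rough numerical illustration of why carrier phase observations support such fine ranging resolution, the sketch below computes the carrier wavelengths at the two bands considered in this article and converts a phase observation into a range. The relation used, range = wavelength × (phase/2π + integer number of cycles), is the generic textbook carrier phase model rather than the specific ranging algorithms of [30], [31], and the function names and the 5 degree example value are illustrative assumptions.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def wavelength(fc_hz: float) -> float:
    """Carrier wavelength in metres."""
    return C / fc_hz


def phase_to_range(phase_rad: float, n_cycles: int, fc_hz: float) -> float:
    """Generic carrier phase range model: fractional cycle plus integer cycles.

    n_cycles is the integer ambiguity, which is not observable from the
    phase alone (its tracking is the topic of Section IV of the article).
    """
    return wavelength(fc_hz) * (phase_rad / (2.0 * math.pi) + n_cycles)


for fc in (3.5e9, 28e9):
    print(f"{fc / 1e9:4.1f} GHz: wavelength = {100 * wavelength(fc):.2f} cm")

# Range error caused by an assumed 5 degree phase error at 28 GHz (in metres).
err = phase_to_range(math.radians(5.0), 0, 28e9)
print(f"5 deg phase error at 28 GHz: {1e3 * err:.3f} mm")
```

Since one full cycle spans only about a centimeter at 28 GHz, even a phase error of a few degrees maps to a sub-millimeter range error, provided that the integer number of cycles is resolved.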
To this end, the estimated unknown 6DoF state vector $\mathbf{s}$ is defined as

$$\mathbf{s} = \left[\boldsymbol{\theta}^T, \mathbf{p}_{\mathrm{UE}}^T\right]^T \in \mathbb{R}^6, \tag{1}$$

where $\boldsymbol{\theta} = [\alpha, \beta, \gamma]^T$ describes the 3D UE orientation given in rotations $\alpha$, $\beta$ and $\gamma$, which refer to yaw, pitch (zero at horizon) and roll of the device, respectively. Furthermore, $\mathbf{p}_{\mathrm{UE}} = [x_{\mathrm{UE}}, y_{\mathrm{UE}}, z_{\mathrm{UE}}]^T$ is the UE position, defined as the center point of rotation, where $x_{\mathrm{UE}}$, $y_{\mathrm{UE}}$, $z_{\mathrm{UE}}$ are the x-coordinate, y-coordinate and z-coordinate of the UE, respectively. For presentation simplicity, we consider a system with a single UE (i.e., the XR device) and $M$ active TRPs located at $\mathbf{p}_{\mathrm{TRP},m}$ for $m = 0, \ldots, M-1$. The UE is equipped with $N$ antennas with distinct transmit and receive RF chains. In this article, to support a straightforward generalization of the provided system model to scenarios with multiple gNBs and arbitrary antenna configurations, the TRP is defined as a single antenna element. Therefore, it is possible to consider arbitrary-shaped antenna arrays at the gNB side by simply deploying nearby TRPs according to the desired antenna array structure, and similarly, to define different gNB transceivers by deploying TRPs far away from each other. In practice, the TRP can also be implemented, e.g., as a phased antenna array as long as there is an unambiguous reference position or reference point for determining the range estimate.
By utilizing antenna-level carrier phase measurements, the range $r_{n,m}$ between the $n$th UE antenna and the $m$th TRP can in general be measured either at the TRP based on UL signals, or at the UE based on DL signals. Considering $M$ TRPs and $N$ UE antennas, the measurement model for the carrier phase-based range measurements is defined as

$$\mathbf{y} = \mathbf{h}(\mathbf{s}) + \mathbf{n}, \tag{2}$$

where $\mathbf{h} : \mathbb{R}^6 \to \mathbb{R}^{MN}$ is the measurement function, $\mathbf{n} \in \mathbb{R}^{MN}$ is the measurement noise, which determines the ranging error as $\mathbf{n} \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma})$, and $\boldsymbol{\Sigma} \in \mathbb{R}^{MN \times MN}$ is the measurement covariance matrix. Moreover, the measurement function can be further divided into TRP-wise models as

$$\mathbf{h}(\mathbf{s}) = \left[\mathbf{h}_0^T(\mathbf{s}), \ldots, \mathbf{h}_{M-1}^T(\mathbf{s})\right]^T, \tag{3}$$

where the measurement function of the $m$th TRP, $\mathbf{h}_m(\mathbf{s}) : \mathbb{R}^6 \to \mathbb{R}^N$, is given as

$$\mathbf{h}_m(\mathbf{s}) = \left[r_{0,m}, \ldots, r_{N-1,m}\right]^T. \tag{4}$$

Finally, the measurement function describing the range $r_{n,m}$ between the $m$th TRP and the $n$th antenna element of the UE/XR device can be written as

$$r_{n,m} = \left\| \mathbf{p}_{\mathrm{UE}} + \mathbf{R}(\boldsymbol{\theta})\boldsymbol{\rho}_n - \mathbf{p}_{\mathrm{TRP},m} \right\|, \tag{5}$$

where $\mathbf{R}(\boldsymbol{\theta}) \in \mathbb{R}^{3 \times 3}$ is the 3D rotation matrix, which applies the rotation determined by the UE orientation $\boldsymbol{\theta}$. The rotation matrix is defined according to 5G NR specifications in [32], and is also provided in the Appendix for presentation completeness and readers' convenience. Furthermore, $\boldsymbol{\rho}_n \in \mathbb{R}^3$, with $n = 0, \ldots, N-1$, is the position of the $n$th antenna element relative to $\mathbf{p}_{\mathrm{UE}}$ (UE local coordinates) without rotation (i.e., $\alpha = \beta = \gamma = 0$). An alternative method for representing orientations is to employ quaternions [33], which in specific use cases can offer advantages such as improved numerical stability. However, we follow the approach determined by the 3GPP [32], and therefore utilize the rotation matrix-based representation for the orientation in this work. Terminology-wise, we call the set $\boldsymbol{\rho}_n$, $n = 0, 1, \ldots, N-1$, the XR device antenna constellation in the following.

In Fig. 2, the considered system geometry is illustrated for a scenario with a single TRP and a single UE equipped with three antennas. The reference coordinate frame is defined by the TRP as $(x, y, z)$, which determines the positions of the UE antennas $\boldsymbol{\rho}_n$ with zero rotation. By applying a specific rotation $\mathbf{R}(\boldsymbol{\theta})$, the UE orientation can be altered, resulting in a new UE coordinate frame $(x', y', z')$. The UE orientation is defined based on yaw $\alpha$, pitch $\beta$ and roll $\gamma$, which describe rotations about the z-axis, y-axis and x-axis in respective order. By measuring the ranges $r_{n,m}$, as shown conceptually in Fig. 2, the UE 3D orientation and 3D position can be estimated and tracked through the proposed methods, described next in Section III.
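The geometry described above can be sketched numerically as follows. This is a minimal illustration rather than the authors' implementation: the composition order R(θ) = Rz(α)·Ry(β)·Rx(γ) is assumed here to match the yaw/pitch/roll convention of [32], the measurement is the noise-free distance between each rotated antenna point and each TRP, and the function names `rotation_matrix` and `measure_ranges` as well as the example values are hypothetical.

```python
import numpy as np


def rotation_matrix(theta):
    """R(theta) = Rz(alpha) @ Ry(beta) @ Rx(gamma); composition order assumed."""
    a, b, g = theta
    rz = np.array([[np.cos(a), -np.sin(a), 0.0],
                   [np.sin(a), np.cos(a), 0.0],
                   [0.0, 0.0, 1.0]])
    ry = np.array([[np.cos(b), 0.0, np.sin(b)],
                   [0.0, 1.0, 0.0],
                   [-np.sin(b), 0.0, np.cos(b)]])
    rx = np.array([[1.0, 0.0, 0.0],
                   [0.0, np.cos(g), -np.sin(g)],
                   [0.0, np.sin(g), np.cos(g)]])
    return rz @ ry @ rx


def measure_ranges(s, rho, p_trp):
    """Noise-free ranges stacked TRP by TRP, shape (M*N,).

    s     : (6,)   state [alpha, beta, gamma, x_UE, y_UE, z_UE]
    rho   : (N, 3) antenna positions in UE local coordinates (zero rotation)
    p_trp : (M, 3) TRP positions
    """
    theta, p_ue = s[:3], s[3:]
    antennas = p_ue + rho @ rotation_matrix(theta).T   # (N, 3) global positions
    diff = antennas[None, :, :] - p_trp[:, None, :]    # (M, N, 3)
    return np.linalg.norm(diff, axis=2).reshape(-1)


# One antenna 10 cm along the local x-axis, one TRP 1 m away on the x-axis.
rho = np.array([[0.1, 0.0, 0.0]])
p_trp = np.array([[1.0, 0.0, 0.0]])
r_unrotated = measure_ranges(np.zeros(6), rho, p_trp)
r_yaw_90 = measure_ranges(np.array([np.pi / 2, 0, 0, 0, 0, 0]), rho, p_trp)
print(r_unrotated, r_yaw_90)
```

In this toy case, a pure yaw rotation moves the antenna off the line towards the TRP and changes the range although the UE position is unchanged, which is the effect that makes the orientation observable from antenna-level ranges.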

III. PROPOSED METHODS AND BOUNDS
In this section, we describe the actual 6DoF estimation and tracking methods, while also providing the corresponding performance bounds in the form of CRLBs. Selected UL-based and DL-based variants of the proposed scheme are also discussed.

A. 6DoF Performance Bounds
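In order to analyze the performance of carrier phase-based 3D UE orientation and 3D position estimation at a theoretical level, we derive the CRLB, which provides a theoretical lower-bound for the covariance matrix of an unbiased estimator $\hat{\mathbf{s}}$, expressed formally as [34]

$$\mathrm{E}\left\{(\hat{\mathbf{s}} - \mathbf{s})(\hat{\mathbf{s}} - \mathbf{s})^T\right\} \succeq \mathrm{CRLB}(\hat{\mathbf{s}}) = \mathbf{J}^{-1}(\mathbf{s}), \tag{6}$$

where $\mathbf{J}(\mathbf{s}) \in \mathbb{R}^{6 \times 6}$ is the Fisher Information Matrix (FIM). The FIM element at the $i$th row and $j$th column is defined as

$$\left[\mathbf{J}(\mathbf{s})\right]_{i,j} = -\mathrm{E}_{\mathbf{y}}\left\{ \frac{\partial^2 \ell(\mathbf{y}|\mathbf{s})}{\partial s_i \, \partial s_j} \right\}, \tag{7}$$

where $\ell(\mathbf{y}|\mathbf{s})$ is the log-likelihood function of the measurements given the estimated state vector $\mathbf{s}$, and $s_i$ denotes the $i$th element of the state vector $\mathbf{s}$ defined in (1).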
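Considering the Gaussian measurement model defined in (2), the FIM can be conveniently obtained as [34]

$$\mathbf{J}(\mathbf{s}) = \mathbf{H}^T(\mathbf{s}) \, \boldsymbol{\Sigma}^{-1} \, \mathbf{H}(\mathbf{s}), \tag{8}$$

where $\mathbf{H}(\mathbf{s}) \in \mathbb{R}^{MN \times 6}$ is the Jacobian matrix of the measurement function, defined with respect to the state vector $\mathbf{s}$. Furthermore, the Jacobian matrix is constructed from two sub-matrices as

$$\mathbf{H}(\mathbf{s}) = \left[\mathbf{H}_{\theta}(\boldsymbol{\theta}), \; \mathbf{H}_{p}(\mathbf{p}_{\mathrm{UE}})\right], \tag{9}$$

where $\mathbf{H}_{\theta}(\boldsymbol{\theta}) \in \mathbb{R}^{MN \times 3}$ relates to the UE orientation state and $\mathbf{H}_{p}(\mathbf{p}_{\mathrm{UE}}) \in \mathbb{R}^{MN \times 3}$ relates to the UE position state. The Jacobian matrix related to the UE orientation state is given as

$$\mathbf{H}_{\theta}(\boldsymbol{\theta}) = \begin{bmatrix} \dfrac{\partial r_{0,0}}{\partial \alpha} & \dfrac{\partial r_{0,0}}{\partial \beta} & \dfrac{\partial r_{0,0}}{\partial \gamma} \\ \vdots & \vdots & \vdots \\ \dfrac{\partial r_{N-1,M-1}}{\partial \alpha} & \dfrac{\partial r_{N-1,M-1}}{\partial \beta} & \dfrac{\partial r_{N-1,M-1}}{\partial \gamma} \end{bmatrix}, \tag{10}$$

with the rows ordered as in (3) and (4), and where the partial derivatives, derived based on the fundamental system geometry and measurement function in (5), read as

$$\begin{aligned} \frac{\partial r_{n,m}}{\partial \alpha} &= \frac{\left(\mathbf{p}_{\mathrm{UE}} + \mathbf{R}(\boldsymbol{\theta})\boldsymbol{\rho}_n - \mathbf{p}_{\mathrm{TRP},m}\right)^T \mathbf{R}_{\alpha}(\boldsymbol{\theta}) \boldsymbol{\rho}_n}{r_{n,m}}, \\ \frac{\partial r_{n,m}}{\partial \beta} &= \frac{\left(\mathbf{p}_{\mathrm{UE}} + \mathbf{R}(\boldsymbol{\theta})\boldsymbol{\rho}_n - \mathbf{p}_{\mathrm{TRP},m}\right)^T \mathbf{R}_{\beta}(\boldsymbol{\theta}) \boldsymbol{\rho}_n}{r_{n,m}}, \\ \frac{\partial r_{n,m}}{\partial \gamma} &= \frac{\left(\mathbf{p}_{\mathrm{UE}} + \mathbf{R}(\boldsymbol{\theta})\boldsymbol{\rho}_n - \mathbf{p}_{\mathrm{TRP},m}\right)^T \mathbf{R}_{\gamma}(\boldsymbol{\theta}) \boldsymbol{\rho}_n}{r_{n,m}}. \end{aligned} \tag{11}$$

The matrices $\mathbf{R}_{\alpha}(\boldsymbol{\theta}), \mathbf{R}_{\beta}(\boldsymbol{\theta}), \mathbf{R}_{\gamma}(\boldsymbol{\theta}) \in \mathbb{R}^{3 \times 3}$ represent the partial derivatives of the rotation matrix $\mathbf{R}(\boldsymbol{\theta})$ with respect to $\alpha$, $\beta$ and $\gamma$ in respective order, expressed formally as

$$\mathbf{R}_{\alpha}(\boldsymbol{\theta}) = \frac{\partial \mathbf{R}(\boldsymbol{\theta})}{\partial \alpha}, \quad \mathbf{R}_{\beta}(\boldsymbol{\theta}) = \frac{\partial \mathbf{R}(\boldsymbol{\theta})}{\partial \beta}, \quad \mathbf{R}_{\gamma}(\boldsymbol{\theta}) = \frac{\partial \mathbf{R}(\boldsymbol{\theta})}{\partial \gamma}, \tag{12}$$

whose detailed descriptions are provided for readers' convenience in the Appendix. The corresponding Jacobian matrix related to the UE position state can, in turn, be given as

$$\mathbf{H}_{p}(\mathbf{p}_{\mathrm{UE}}) = \begin{bmatrix} \dfrac{\partial r_{0,0}}{\partial x_{\mathrm{UE}}} & \dfrac{\partial r_{0,0}}{\partial y_{\mathrm{UE}}} & \dfrac{\partial r_{0,0}}{\partial z_{\mathrm{UE}}} \\ \vdots & \vdots & \vdots \\ \dfrac{\partial r_{N-1,M-1}}{\partial x_{\mathrm{UE}}} & \dfrac{\partial r_{N-1,M-1}}{\partial y_{\mathrm{UE}}} & \dfrac{\partial r_{N-1,M-1}}{\partial z_{\mathrm{UE}}} \end{bmatrix}. \tag{13}$$

Moreover, based on the considered measurement function in (5), the partial derivatives can be expressed as

$$\left[ \frac{\partial r_{n,m}}{\partial x_{\mathrm{UE}}}, \; \frac{\partial r_{n,m}}{\partial y_{\mathrm{UE}}}, \; \frac{\partial r_{n,m}}{\partial z_{\mathrm{UE}}} \right] = \frac{\left(\mathbf{p}_{\mathrm{UE}} + \mathbf{R}(\boldsymbol{\theta})\boldsymbol{\rho}_n - \mathbf{p}_{\mathrm{TRP},m}\right)^T}{r_{n,m}}. \tag{14}$$

Finally, utilizing (6) and (8), the CRLBs for the unknown 6DoF state parameters can be stated as follows:

$$\begin{aligned} \mathrm{CRLB}(\hat{\alpha}) &= \left[\mathbf{J}^{-1}(\mathbf{s})\right]_{1,1}, & \mathrm{CRLB}(\hat{x}_{\mathrm{UE}}) &= \left[\mathbf{J}^{-1}(\mathbf{s})\right]_{4,4}, \\ \mathrm{CRLB}(\hat{\beta}) &= \left[\mathbf{J}^{-1}(\mathbf{s})\right]_{2,2}, & \mathrm{CRLB}(\hat{y}_{\mathrm{UE}}) &= \left[\mathbf{J}^{-1}(\mathbf{s})\right]_{5,5}, \\ \mathrm{CRLB}(\hat{\gamma}) &= \left[\mathbf{J}^{-1}(\mathbf{s})\right]_{3,3}, & \mathrm{CRLB}(\hat{z}_{\mathrm{UE}}) &= \left[\mathbf{J}^{-1}(\mathbf{s})\right]_{6,6}. \end{aligned} \tag{15}$$

The above CRLBs can be further exploited for obtaining the orientation error bound (OEB) and positioning error bound (PEB) [17] as

$$\mathrm{OEB} = \sqrt{\left[\mathbf{J}^{-1}(\mathbf{s})\right]_{1,1} + \left[\mathbf{J}^{-1}(\mathbf{s})\right]_{2,2} + \left[\mathbf{J}^{-1}(\mathbf{s})\right]_{3,3}}, \tag{16}$$

$$\mathrm{PEB} = \sqrt{\left[\mathbf{J}^{-1}(\mathbf{s})\right]_{4,4} + \left[\mathbf{J}^{-1}(\mathbf{s})\right]_{5,5} + \left[\mathbf{J}^{-1}(\mathbf{s})\right]_{6,6}}, \tag{17}$$

where the OEB represents the lower bound for the RMSE of the 3D orientation estimation, and the PEB represents the lower bound for the 3D positioning RMSE. Furthermore, when there is a-priori information on the estimated parameters, or the parameters are estimated recursively in time so that the preceding estimates provide information on the current estimate, the FIM can be (re-)expressed with two additive components as [35]

$$\mathbf{J}_{\mathrm{post}}(\bar{\mathbf{s}}) = \mathbf{J}_{\mathrm{prior}}(\bar{\mathbf{s}}) + \mathbf{J}_{\mathrm{meas}}(\bar{\mathbf{s}}). \tag{18}$$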
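The bound computation can be sketched numerically as follows. The sketch makes several assumptions that are not taken from the article: the Jacobian is approximated with central finite differences instead of the closed-form derivatives, the range noise is i.i.d. with standard deviation `sigma` (so that the covariance is a scaled identity), the rotation order is Rz·Ry·Rx, and the TRP layout, antenna constellation and UE pose are arbitrary example values.

```python
import numpy as np


def rot(theta):
    """R(theta) = Rz(alpha) Ry(beta) Rx(gamma); composition order assumed."""
    a, b, g = theta
    ca, sa, cb, sb, cg, sg = np.cos(a), np.sin(a), np.cos(b), np.sin(b), np.cos(g), np.sin(g)
    rz = np.array([[ca, -sa, 0.0], [sa, ca, 0.0], [0.0, 0.0, 1.0]])
    ry = np.array([[cb, 0.0, sb], [0.0, 1.0, 0.0], [-sb, 0.0, cb]])
    rx = np.array([[1.0, 0.0, 0.0], [0.0, cg, -sg], [0.0, sg, cg]])
    return rz @ ry @ rx


def h(s, rho, p_trp):
    """Ranges for s = [theta, p_UE], stacked TRP by TRP."""
    ant = s[3:] + rho @ rot(s[:3]).T
    return np.linalg.norm(ant[None] - p_trp[:, None], axis=2).ravel()


def jacobian(s, rho, p_trp, eps=1e-6):
    """Central-difference approximation of H(s), shape (M*N, 6)."""
    cols = []
    for i in range(6):
        d = np.zeros(6)
        d[i] = eps
        cols.append((h(s + d, rho, p_trp) - h(s - d, rho, p_trp)) / (2 * eps))
    return np.stack(cols, axis=1)


def error_bounds(s, rho, p_trp, sigma):
    """Return (OEB in rad, PEB in m) for i.i.d. range noise with std sigma."""
    H = jacobian(s, rho, p_trp)
    J = H.T @ H / sigma**2                 # FIM with Sigma = sigma^2 * I
    crlb = np.diag(np.linalg.inv(J))
    return np.sqrt(crlb[:3].sum()), np.sqrt(crlb[3:].sum())


# Hypothetical example geometry: 4 ceiling TRPs and a 3-antenna headset.
rho = np.array([[0.1, 0.0, 0.0], [-0.05, 0.08, 0.0], [-0.05, -0.08, 0.03]])
p_trp = np.array([[0.0, 0.0, 3.0], [10.0, 0.0, 3.0], [0.0, 10.0, 3.0], [10.0, 10.0, 2.5]])
s = np.array([0.3, 0.1, -0.2, 4.0, 5.0, 1.5])

oeb, peb = error_bounds(s, rho, p_trp, sigma=1e-3)
oeb2, peb2 = error_bounds(s, rho, p_trp, sigma=2e-3)
print(f"OEB = {np.degrees(oeb):.3f} deg, PEB = {100 * peb:.3f} cm")
```

Because the FIM scales with the inverse noise variance, both bounds scale linearly with the ranging standard deviation, which the second call with a doubled `sigma` makes visible.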
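In (18), the state vector notation is deliberately changed to $\bar{\mathbf{s}}$ to allow for including also rate-of-change parameters in the state definition, as shown concretely in (22). Additionally, $\mathbf{J}_{\mathrm{post}}(\bar{\mathbf{s}})$ is the FIM of the a-posteriori estimate, $\mathbf{J}_{\mathrm{prior}}(\bar{\mathbf{s}})$ is the FIM of the a-priori estimate, and $\mathbf{J}_{\mathrm{meas}}(\bar{\mathbf{s}})$ is the FIM obtained from the measurements, calculated conceptually similar to (8). The prior information of the state is determined as

$$\left[\mathbf{J}_{\mathrm{prior}}(\bar{\mathbf{s}})\right]_{i,j} = -\mathrm{E}_{\bar{\mathbf{s}}}\left\{ \frac{\partial^2 \ln P(\bar{\mathbf{s}})}{\partial \bar{s}_i \, \partial \bar{s}_j} \right\}, \tag{19}$$

where $P(\bar{\mathbf{s}})$ denotes the prior probability distribution of state $\bar{\mathbf{s}}$.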
Considering a system with dynamic parameters, which are estimated and tracked recursively over time steps $k$, the a-posteriori FIM can be re-expressed as

$$\mathbf{J}_{\mathrm{post}}[k] = \mathbf{J}_{\mathrm{prior}}[k] + \mathbf{J}_{\mathrm{meas}}[k], \tag{20}$$

wherein, for presentation convenience, we have omitted the state vector from the FIM notations while emphasizing time $k$. Furthermore, the a-priori FIM is defined as

$$\mathbf{J}_{\mathrm{prior}}[k] = \left( \mathbf{F} \, \mathbf{J}_{\mathrm{post}}^{-1}[k-1] \, \mathbf{F}^T + \mathbf{Q}[k] \right)^{-1}, \tag{21}$$

where $\mathbf{F}$ describes the state-transition matrix and $\mathbf{Q}[k]$ is the process covariance, which are both defined and further described along the upcoming EKF formulations in Section III-B, specifically (25) and (26). Numerical results of the stated bounds are provided in Section V.
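The recursive bound can be illustrated with a deliberately small toy model. The sketch below is not the 12-dimensional model of the article: it assumes a single position-like state with its rate-of-change, a linear measurement of the position only, and example values for the time step, process noise level and measurement noise. It only shows how the prior and measurement information combine over time.

```python
import numpy as np


def posterior_fim_step(J_post_prev, F, Q, J_meas):
    """One recursion: J_prior = (F J_post_prev^-1 F^T + Q)^-1, J_post = J_prior + J_meas."""
    J_prior = np.linalg.inv(F @ np.linalg.inv(J_post_prev) @ F.T + Q)
    return J_prior + J_meas


dt, q, sigma = 0.01, 1.0, 1e-3           # assumed time step, noise PSD, range std
F = np.array([[1.0, dt], [0.0, 1.0]])     # position/rate transition
Q = q * np.array([[dt**3 / 3, dt**2 / 2], [dt**2 / 2, dt]])
H = np.array([[1.0, 0.0]])                # only the position-like state is measured
J_meas = H.T @ H / sigma**2

# Initial information corresponding to prior std 0.1 (position) and 1.0 (rate).
J = np.diag([1.0 / 0.1**2, 1.0 / 1.0**2])
bounds = []
for _ in range(200):
    J = posterior_fim_step(J, F, Q, J_meas)
    bounds.append(np.sqrt(np.linalg.inv(J)[0, 0]))

print(f"steady-state position bound: {1e3 * bounds[-1]:.3f} mm (single-shot std: {1e3 * sigma:.3f} mm)")
```

In this toy setting the bound settles to a steady-state value below the single-shot measurement standard deviation, since each step adds the information propagated from the previous posterior.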

B. 6DoF Tracking Through Extended Kalman Filtering
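In the following, we formulate an efficient EKF-based approach for estimating and tracking the desired 6DoF parameters of an XR device in a time-varying mobile scenario. To this end, we first extend and redefine the static state vector, given in (1), to support the desired mobility, and define the time-varying state vector at the $k$th time step as

$$\bar{\mathbf{s}}[k] = \left[ \boldsymbol{\theta}^T[k], \; \Delta\boldsymbol{\theta}^T[k], \; \mathbf{p}_{\mathrm{UE}}^T[k], \; \mathbf{v}^T[k] \right]^T \in \mathbb{R}^{12}, \tag{22}$$

where $\Delta\boldsymbol{\theta}[k] = [\Delta\alpha[k], \Delta\beta[k], \Delta\gamma[k]]^T$ is the rotational rate-of-change, including distinct rates-of-change of the rotation angles $\alpha$, $\beta$ and $\gamma$, respectively. In addition, $\mathbf{v}[k] = [v_x[k], v_y[k], v_z[k]]^T$ is the 3D UE velocity, defined relative to the x-axis, y-axis and z-axis, respectively.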
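For the proposed EKF formulation, we assume that the UE velocity, as well as the rate-of-change of the UE rotation, are almost constant over a small time period. Consequently, we utilize a linear continuous white-noise acceleration (CWNA) state-transition model [36], and accordingly define the state-space model, consisting of the state-transition model and measurement model, as

$$\bar{\mathbf{s}}[k] = \mathbf{F} \, \bar{\mathbf{s}}[k-1] + \mathbf{w}[k], \tag{23}$$

$$\mathbf{y}[k] = \mathbf{h}(\mathbf{s}[k]) + \mathbf{n}[k], \tag{24}$$

where $\mathbf{F} \in \mathbb{R}^{12 \times 12}$ is the state-transition matrix, and $\mathbf{w}[k] \sim \mathcal{N}(\mathbf{0}, \mathbf{Q}[k])$, $\mathbf{w}[k] \in \mathbb{R}^{12}$, is the driving noise of the state-transition model. Furthermore, with the state ordered as angles, angular rates, position and velocity, the state-transition matrix is expressed as

$$\mathbf{F} = \begin{bmatrix} \mathbf{I}_3 & \Delta t \, \mathbf{I}_3 & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{I}_3 & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{I}_3 & \Delta t \, \mathbf{I}_3 \\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{I}_3 \end{bmatrix}, \tag{25}$$

where $\mathbf{I}_3$ is the $3 \times 3$ identity matrix, $\mathbf{0}$ is the $3 \times 3$ all-zero matrix, and $\Delta t$ is the time duration between two consecutive time steps.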
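Written out per component, the near-constant-rate assumption above corresponds to the following update equations. The noise terms $\mathbf{w}_{\theta}$, $\mathbf{w}_{\Delta\theta}$, $\mathbf{w}_{p}$ and $\mathbf{w}_{v}$ are hypothetical names used here only for the corresponding sub-vectors of the driving noise, and $\Delta\boldsymbol{\theta}$ and $\mathbf{v}$ denote the rotational rate-of-change and the UE velocity.

```latex
\begin{aligned}
\boldsymbol{\theta}[k] &= \boldsymbol{\theta}[k-1] + \Delta t\,\Delta\boldsymbol{\theta}[k-1] + \mathbf{w}_{\theta}[k], \\
\Delta\boldsymbol{\theta}[k] &= \Delta\boldsymbol{\theta}[k-1] + \mathbf{w}_{\Delta\theta}[k], \\
\mathbf{p}_{\mathrm{UE}}[k] &= \mathbf{p}_{\mathrm{UE}}[k-1] + \Delta t\,\mathbf{v}[k-1] + \mathbf{w}_{p}[k], \\
\mathbf{v}[k] &= \mathbf{v}[k-1] + \mathbf{w}_{v}[k].
\end{aligned}
```

As a simple worked example under assumed values, with $\Delta t = 1$ ms and a yaw rate of 90 degrees per second, the predicted yaw advances by 0.09 degrees per time step.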
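According to the CWNA state-transition model [36], the process covariance matrix can be given as

$$\mathbf{Q}[k] = \begin{bmatrix} \sigma_{\Delta\theta}^2 \, \mathbf{Q}_0 & \mathbf{0} \\ \mathbf{0} & \sigma_{v}^2 \, \mathbf{Q}_0 \end{bmatrix}, \quad \text{where} \quad \mathbf{Q}_0 = \begin{bmatrix} \dfrac{\Delta t^3}{3} \mathbf{I}_3 & \dfrac{\Delta t^2}{2} \mathbf{I}_3 \\ \dfrac{\Delta t^2}{2} \mathbf{I}_3 & \Delta t \, \mathbf{I}_3 \end{bmatrix}. \tag{26}$$

The magnitude of the process covariance can be scaled with $\sigma_{\Delta\theta}^2$ and $\sigma_{v}^2$, which describe the power spectral density of the angular acceleration of each rotation angle ($\alpha$, $\beta$ and $\gamma$), and the power spectral density of the UE acceleration with respect to each component axis ($x_{\mathrm{UE}}$, $y_{\mathrm{UE}}$, $z_{\mathrm{UE}}$), respectively. The measurement model in (24) follows the formulation provided in (2). Accordingly, $\mathbf{y}[k] \in \mathbb{R}^{MN}$ denotes the measurement vector, which is built on the measurement function $\mathbf{h} : \mathbb{R}^6 \to \mathbb{R}^{MN}$, defined in (3), and on the measurement noise $\mathbf{n}[k] \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma}[k])$. For the measurement function input, the velocity and rotational rate-of-change parameters are omitted from the EKF state vector so that

$$\mathbf{s}[k] = \left[ \boldsymbol{\theta}^T[k], \; \mathbf{p}_{\mathrm{UE}}^T[k] \right]^T \in \mathbb{R}^{6}. \tag{27}$$

Given the state-space model in (23) and (24), the EKF-based estimation and tracking proceeds through iterations of a prediction step and an update step. First, at time index $k$, the prediction step of the EKF is expressed as

$$\begin{aligned} \hat{\bar{\mathbf{s}}}^{-}[k] &= \mathbf{F} \, \hat{\bar{\mathbf{s}}}^{+}[k-1], \\ \mathbf{P}^{-}[k] &= \mathbf{F} \, \mathbf{P}^{+}[k-1] \, \mathbf{F}^T + \mathbf{Q}[k], \end{aligned} \tag{28}$$

where $\hat{\bar{\mathbf{s}}}^{-}[k] \in \mathbb{R}^{12}$ and $\mathbf{P}^{-}[k] \in \mathbb{R}^{12 \times 12}$ are the a-priori estimates of the mean and covariance of the state $\bar{\mathbf{s}}[k]$, respectively. After this, the a-posteriori estimates of the mean and the covariance, denoted as $\hat{\bar{\mathbf{s}}}^{+}[k] \in \mathbb{R}^{12}$ and $\mathbf{P}^{+}[k] \in \mathbb{R}^{12 \times 12}$, respectively, can be obtained by the update step of the EKF as

$$\begin{aligned} \hat{\bar{\mathbf{s}}}^{+}[k] &= \hat{\bar{\mathbf{s}}}^{-}[k] + \mathbf{K}[k] \left( \hat{\mathbf{y}}[k] - \mathbf{h}(\hat{\mathbf{s}}^{-}[k]) \right), \\ \mathbf{P}^{+}[k] &= \left( \mathbf{I} - \mathbf{K}[k] \mathbf{H}[k] \right) \mathbf{P}^{-}[k], \end{aligned} \tag{29}$$

where the Kalman gain $\mathbf{K}[k] \in \mathbb{R}^{12 \times MN}$ is defined as

$$\mathbf{K}[k] = \mathbf{P}^{-}[k] \, \mathbf{H}^T[k] \left( \mathbf{H}[k] \, \mathbf{P}^{-}[k] \, \mathbf{H}^T[k] + \boldsymbol{\Sigma}[k] \right)^{-1}. \tag{30}$$

In addition, $\hat{\mathbf{y}}[k]$ denotes the obtained measurements at time index $k$, and $\hat{\mathbf{s}}^{-}[k]$ is the estimated a-priori state vector without the velocity and rotational rate-of-change parameters, as used in (24). Furthermore, $\mathbf{H}[k] \in \mathbb{R}^{MN \times 12}$ is the Jacobian matrix for attaining the linear approximation of the non-linear measurement function. For the considered EKF formulation, with the state ordered as angles, angular rates, position and velocity, the Jacobian matrix can be written as

$$\mathbf{H}[k] = \left[ \mathbf{H}_{\theta}[k], \; \mathbf{0}_{MN \times 3}, \; \mathbf{H}_{p}[k], \; \mathbf{0}_{MN \times 3} \right], \tag{31}$$

where $\mathbf{H}_{\theta}[k]$ and $\mathbf{H}_{p}[k]$ are evaluated at the a-priori mean estimate $\hat{\mathbf{s}}^{-}[k]$ as derived and expressed in (10) and (13), respectively.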
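The prediction and update iteration can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the state is ordered as [angles, angular rates, position, velocity], the Jacobian is obtained by central finite differences rather than from the closed-form derivatives, the rotation order is Rz·Ry·Rx, and the geometry, noise levels and initial values are arbitrary example choices. All function and variable names are hypothetical.

```python
import numpy as np


def rot(theta):
    """R(theta) = Rz(alpha) Ry(beta) Rx(gamma); composition order assumed."""
    a, b, g = theta
    ca, sa, cb, sb, cg, sg = np.cos(a), np.sin(a), np.cos(b), np.sin(b), np.cos(g), np.sin(g)
    rz = np.array([[ca, -sa, 0.0], [sa, ca, 0.0], [0.0, 0.0, 1.0]])
    ry = np.array([[cb, 0.0, sb], [0.0, 1.0, 0.0], [-sb, 0.0, cb]])
    rx = np.array([[1.0, 0.0, 0.0], [0.0, cg, -sg], [0.0, sg, cg]])
    return rz @ ry @ rx


def h(pose, rho, p_trp):
    """Ranges for pose = [theta, p_UE], stacked TRP by TRP."""
    ant = pose[3:] + rho @ rot(pose[:3]).T
    return np.linalg.norm(ant[None] - p_trp[:, None], axis=2).ravel()


def cwna(dt, psd_rot, psd_acc):
    """F and Q for the state [theta, dtheta, p, v] (ordering assumed)."""
    I3, Z3, Z6 = np.eye(3), np.zeros((3, 3)), np.zeros((6, 6))
    F0 = np.block([[I3, dt * I3], [Z3, I3]])
    Q0 = np.block([[dt**3 / 3 * I3, dt**2 / 2 * I3], [dt**2 / 2 * I3, dt * I3]])
    return np.block([[F0, Z6], [Z6, F0]]), np.block([[psd_rot * Q0, Z6], [Z6, psd_acc * Q0]])


POSE = [0, 1, 2, 6, 7, 8]  # indices of theta and p_UE in the 12-dim state


def jacobian12(x, rho, p_trp, eps=1e-6):
    """Finite-difference Jacobian; rate and velocity columns stay zero."""
    H = np.zeros((rho.shape[0] * p_trp.shape[0], 12))
    for i in POSE:
        d = np.zeros(12)
        d[i] = eps
        H[:, i] = (h((x + d)[POSE], rho, p_trp) - h((x - d)[POSE], rho, p_trp)) / (2 * eps)
    return H


def ekf_step(x, P, y, F, Q, Sigma, rho, p_trp):
    """One EKF prediction and update step."""
    x_pred, P_pred = F @ x, F @ P @ F.T + Q
    H = jacobian12(x_pred, rho, p_trp)
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + Sigma)
    x_post = x_pred + K @ (y - h(x_pred[POSE], rho, p_trp))
    return x_post, (np.eye(12) - K @ H) @ P_pred


# Hypothetical scenario: 4 TRPs, 4 headset antennas, 1 mm range noise.
rng = np.random.default_rng(0)
rho = np.array([[0.1, 0.0, 0.0], [-0.05, 0.08, 0.0], [-0.05, -0.08, 0.03], [0.0, 0.0, 0.1]])
p_trp = np.array([[0.0, 0.0, 3.0], [10.0, 0.0, 3.0], [0.0, 10.0, 3.0], [10.0, 10.0, 2.5]])
dt, sigma = 0.01, 1e-3
n_meas = rho.shape[0] * p_trp.shape[0]
F, Q = cwna(dt, psd_rot=1.0, psd_acc=1.0)
Sigma = sigma**2 * np.eye(n_meas)

truth = np.array([0.3, 0.1, -0.2, 0.3, 0.0, 0.0, 4.0, 5.0, 1.5, 0.5, 0.2, 0.0])
x = truth.copy()
x[POSE] += 0.02                      # offset initial pose
x[[3, 4, 5, 9, 10, 11]] = 0.0        # rates initially unknown
P = np.diag([0.05**2] * 3 + [1.0] * 3 + [0.05**2] * 3 + [1.0] * 3)

for _ in range(300):
    truth = F @ truth                # noise-free constant-rate motion
    y = h(truth[POSE], rho, p_trp) + sigma * rng.standard_normal(n_meas)
    x, P = ekf_step(x, P, y, F, Q, Sigma, rho, p_trp)

pos_err = np.linalg.norm(x[6:9] - truth[6:9])
rot_err = np.linalg.norm(x[0:3] - truth[0:3])
print(f"position error: {100 * pos_err:.3f} cm, orientation error: {np.degrees(rot_err):.3f} deg")
```

The simulated motion here is noise-free and starts close to the true pose, so the sketch says nothing about initialization from scratch or about the clock drift and integer ambiguity effects treated in Section IV.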
Finally, it is noted that the considered nonlinear filtering problem could be solved through alternative methods, such as the unscented Kalman filter (UKF) or the cubature Kalman filter (CKF), which also build on Gaussian density approximations. However, since the underlying models are differentiable and because the EKF is shown through the numerical results to essentially reach the respective posterior bounds, the EKF is a well-justified nonlinear filtering approach and is thus adopted in this article. In general, the main advantage of the extended Kalman filter over alternative nonlinear filtering solutions, such as the UKF or the particle filter, is its relatively low computational complexity in relation to its filtering performance [37]. We have also verified through concrete numerical evaluations that the UKF and CKF yield essentially the same performance as the EKF-based solutions.

C. Uplink vs. Downlink Aspects
In general, the proposed methods can be utilized for either DL-based or UL-based measurements, depending primarily on which reference signals are available and utilized for carrier phase ranging. Furthermore, depending on the more specific approach, there is a need to report certain parameters or measurements between the TRPs and the UE. This directly affects the estimation latency and the amount of side-information signaling, and the approach used for a given XR application should eventually be selected based on the underlying performance requirements defined in [4].
In the UL-based approach, a feasible physical estimation resource is the uplink sounding reference signal (SRS) [1], [10], and hence the TRPs can obtain the ranging estimates through the corresponding carrier phase measurements. If the actual 6DoF UE tracking is then also pursued at the network, information about the UE antenna constellation needs to be communicated as side-information or, alternatively, estimated over-the-air as described in Section III-D. As a further alternative, the ranging measurements can be communicated from the network towards the UE, together with the TRP locations, after which the UE can execute the 6DoF tracking.
In the DL-based approach, in turn, the most feasible estimation resource is the downlink positioning reference signal (PRS) [1], [10], with which the UE can calculate the antenna-level range estimates through the carrier phase measurements. If the 6DoF tracking is then also pursued at the UE, side-information about the TRP locations is needed through DL signaling. As an alternative, the DL PRS based ranging measurements can be communicated towards the network, together with the UE antenna constellation information, which then allows the 6DoF tracking to run at the network.
Overall, given the strict latency requirements for photorealistic rendering in different XR use cases [4], [7], the UL-based approach with UE antenna configuration or constellation signaling allows for the lowest latency. Such an approach can be seen as an effective over-the-air inertial sensor like functionality with support for full 6DoF tracking without the error accumulation problems commonly observed with, e.g., IMUs [14]. Finally, it is noted that the provided EKF formulation also provides information about the prevailing rate-of-change of the XR device orientation, due to the adopted state variable formulation in (22). When such information is available in the network, with minimum latency through the UL-based approach, viewpoint prediction, proactive rendering and other similar schemes discussed, e.g., in [38], become technically feasible.
While the baseline approach for acquiring the headset antenna configuration or constellation at the network is through uplink feedback signaling, we next describe a novel OTA measurement and estimation approach for obtaining such knowledge directly from the uplink radio signals.

D. Extension to Over-the-Air Estimation or Calibration of Headset Antenna Constellation
The 6DoF tracking through EKF, presented in Section III-B, assumes a known headset antenna constellation. That is, the position of each antenna element, $\boldsymbol{\rho}_n$, with respect to the UE position with zero rotation was assumed to be known during the tracking process. However, in certain practical scenarios, such information on the antenna constellation might be unavailable, or it may be undesirable to share the information with the entity performing the 6DoF tracking. In such cases, it is possible to extend the proposed EKF to enable over-the-air estimation of the antenna constellation.
Considering that the headset antenna constellation, described by the relative antenna element positions $\boldsymbol{\rho}_n$, is not known, it is infeasible to jointly estimate and track the antenna constellation, the device position $\mathbf{p}_{\mathrm{UE}}$ and the device orientation $\boldsymbol{\theta}$ all simultaneously. As defined in the measurement function in (5), the element positions $\boldsymbol{\rho}_n$ are determined with respect to the device position (center point-of-rotation) and zero-orientation, which results in an implicit relation between the unknown antenna constellation and the unknown device position and orientation. Nonetheless, although the absolute joint estimates of the above parameters are out of reach, it is possible to estimate and track the constellation so that the antenna elements are positioned with respect to each other, revealing the fundamental geometric structure of the antenna constellation. The estimated antenna constellation is then connected to the true constellation through a specific rotation and position offset.
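The ambiguity can be made explicit with a short derivation. Following the system geometry of Section II, the ranges depend on the unknowns only through the global antenna positions $\mathbf{p}_{\mathrm{UE}} + \mathbf{R}(\boldsymbol{\theta})\boldsymbol{\rho}_n$. For an arbitrary rotation matrix $\mathbf{R}_0$ and offset vector $\mathbf{t}$ (both hypothetical symbols introduced here for illustration only), the transformed parameters below give exactly the same global antenna positions, and hence exactly the same range measurements:

```latex
\begin{aligned}
\mathbf{R}' &= \mathbf{R}(\boldsymbol{\theta})\mathbf{R}_0, \qquad
\boldsymbol{\rho}_n' = \mathbf{R}_0^T(\boldsymbol{\rho}_n - \mathbf{t}), \qquad
\mathbf{p}_{\mathrm{UE}}' = \mathbf{p}_{\mathrm{UE}} + \mathbf{R}(\boldsymbol{\theta})\mathbf{t}, \\
\mathbf{p}_{\mathrm{UE}}' + \mathbf{R}'\boldsymbol{\rho}_n'
&= \mathbf{p}_{\mathrm{UE}} + \mathbf{R}(\boldsymbol{\theta})\mathbf{t}
 + \mathbf{R}(\boldsymbol{\theta})\mathbf{R}_0\mathbf{R}_0^T(\boldsymbol{\rho}_n - \mathbf{t})
 = \mathbf{p}_{\mathrm{UE}} + \mathbf{R}(\boldsymbol{\theta})\boldsymbol{\rho}_n .
\end{aligned}
```

The mutual distances $\|\boldsymbol{\rho}_n' - \boldsymbol{\rho}_{n'}'\| = \|\boldsymbol{\rho}_n - \boldsymbol{\rho}_{n'}\|$ are unaffected by $\mathbf{R}_0$ and $\mathbf{t}$, which is why the geometric structure of the constellation remains identifiable even though its absolute placement does not.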
For the proposed over-the-air calibration of the XR headset antenna constellation, we consider four separate approaches:

1) Unknown Movement and Rotation: In the first approach, the XR device is moving and rotating in the area without any prior knowledge of the device positions or orientations. Consequently, the EKF state vector contains the position and orientation of the UE/XR device as well as the relative position of each antenna element. These state values are connected to the corresponding true values through a specific rotation and position offset.

2) Unknown Rotation at a Fixed Known Position: In the second approach, the XR device is rotating at a fixed and known position without knowing the orientations. Since the position of the UE/XR device is known, the EKF state vector is reduced accordingly, thus including only tracking of the orientation and antenna constellation.

3) Unknown Movement and Fixed Known Rotation: In the third approach, the XR device is moving in the area with a fixed and known orientation. In this case, the EKF state vector is reduced so that only the position and antenna constellation are tracked.

4) Known Movement and Rotation: In the fourth approach, the XR device is moving and rotating in the area, assuming perfect knowledge of the device positions and orientations at each time instant. As both the position and orientation of the UE/XR device are known, the EKF state vector includes only the unknown antenna constellation.

Out of these, approach 4) serves as the performance benchmark, while approaches 1), 2) and 3) are more realistic in practical applications.
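For all the considered approaches defined above, the estimation and tracking of the headset antenna constellation can be carried out by using the EKF formulations presented in Section III-B. However, the Jacobian matrix, shown in (31), must be extended according to the unknown antenna constellation parameters. To this end, considering measurements from the $m$th TRP, the Jacobian matrix $\mathbf{H}_{\rho,m} \in \mathbb{R}^{N \times 3N}$ for the antenna constellation $\boldsymbol{\rho}_n$ with $n = 0, \ldots, N-1$ can be written as

$$\mathbf{H}_{\rho,m} = \begin{bmatrix} \dfrac{\partial r_{0,m}}{\partial \boldsymbol{\rho}_0^T} & & \\ & \ddots & \\ & & \dfrac{\partial r_{N-1,m}}{\partial \boldsymbol{\rho}_{N-1}^T} \end{bmatrix}, \quad \text{where} \quad \frac{\partial r_{n,m}}{\partial \boldsymbol{\rho}_n^T} = \frac{\left(\mathbf{p}_{\mathrm{UE}} + \mathbf{R}(\boldsymbol{\theta})\boldsymbol{\rho}_n - \mathbf{p}_{\mathrm{TRP},m}\right)^T \mathbf{R}(\boldsymbol{\theta})}{r_{n,m}} \in \mathbb{R}^{1 \times 3}. \tag{32}$$

The block-diagonal structure follows from the fact that the range $r_{n,m}$ depends only on the position of the $n$th antenna element. For example, regarding the first approach with unknown movement and rotation, the Jacobian matrix can be constructed by augmenting the original Jacobian matrix shown in (31) with the columns related to the antenna constellation, obtained by stacking the matrices $\mathbf{H}_{\rho,m}$ over all $M$ TRPs.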
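The derivative of a single range with respect to the antenna position, and the resulting block structure of the TRP-wise Jacobian, can be checked numerically as sketched below. The analytic gradient used here, R(θ)ᵀ(p_UE + R(θ)ρ_n − p_TRP,m)/r_n,m, follows from differentiating the Euclidean distance between the rotated antenna point and the TRP. The rotation order Rz·Ry·Rx, the function names and the example values are assumptions made for this illustration.

```python
import numpy as np


def rot(theta):
    """R(theta) = Rz(alpha) Ry(beta) Rx(gamma); composition order assumed."""
    a, b, g = theta
    ca, sa, cb, sb, cg, sg = np.cos(a), np.sin(a), np.cos(b), np.sin(b), np.cos(g), np.sin(g)
    rz = np.array([[ca, -sa, 0.0], [sa, ca, 0.0], [0.0, 0.0, 1.0]])
    ry = np.array([[cb, 0.0, sb], [0.0, 1.0, 0.0], [-sb, 0.0, cb]])
    rx = np.array([[1.0, 0.0, 0.0], [0.0, cg, -sg], [0.0, sg, cg]])
    return rz @ ry @ rx


def range_nm(theta, p_ue, rho_n, p_trp_m):
    """Distance between one rotated antenna point and one TRP."""
    return np.linalg.norm(p_ue + rot(theta) @ rho_n - p_trp_m)


def drange_drho(theta, p_ue, rho_n, p_trp_m):
    """Analytic gradient of the range with respect to rho_n, shape (3,)."""
    R = rot(theta)
    d = p_ue + R @ rho_n - p_trp_m
    return R.T @ d / np.linalg.norm(d)


def jacobian_rho_m(theta, p_ue, rho, p_trp_m):
    """TRP-wise Jacobian of shape (N, 3N); row n is non-zero only in columns 3n..3n+2."""
    n_ant = rho.shape[0]
    H = np.zeros((n_ant, 3 * n_ant))
    for n in range(n_ant):
        H[n, 3 * n:3 * n + 3] = drange_drho(theta, p_ue, rho[n], p_trp_m)
    return H


theta = np.array([0.3, 0.1, -0.2])
p_ue = np.array([4.0, 5.0, 1.5])
p_trp_m = np.array([0.0, 0.0, 3.0])
rho = np.array([[0.1, 0.0, 0.0], [-0.05, 0.08, 0.0], [-0.05, -0.08, 0.03]])

# Central-difference check of the analytic gradient for the first antenna.
eps = 1e-6
numeric = np.array([
    (range_nm(theta, p_ue, rho[0] + eps * e, p_trp_m)
     - range_nm(theta, p_ue, rho[0] - eps * e, p_trp_m)) / (2 * eps)
    for e in np.eye(3)
])
analytic = drange_drho(theta, p_ue, rho[0], p_trp_m)
H_rho = jacobian_rho_m(theta, p_ue, rho, p_trp_m)
print(np.max(np.abs(numeric - analytic)))
```

Each gradient has unit norm, since moving an antenna along the line towards the TRP changes the range one-to-one, and the rotation only re-expresses that direction in UE local coordinates.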
After the estimation and tracking procedure, or at any time instant during the procedure, the estimated antenna constellation can be transformed into the desired frame of reference according to (34), where ρ̂_n is the estimated relative position of the nth antenna while θ_REF and p_UE,REF are the orientation and position of the XR device, all in the desired frame of reference. By defining the frame of reference according to the original definition of ρ_n (i.e., antenna elements defined relative to the device position with zero rotation), it is possible to evaluate the estimation accuracy for the antenna constellation. Furthermore, since the described procedure involves setting a specific reference orientation and position, it can also be considered a kind of calibration process, where the estimated frame of reference is associated with the used global frame of reference.
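The change of reference frame described above can be illustrated with a minimal sketch. It assumes that the transformation amounts to a rigid-body rotation and translation, and that the orientation angles (α, β, γ) form a Z-Y-X Euler sequence; both are assumptions made for illustration, and all function names and numerical values are hypothetical rather than taken from the article.

```python
import math

def rot_zyx(alpha, beta, gamma):
    """Rotation matrix Rz(alpha) * Ry(beta) * Rx(gamma); assumed angle convention, radians."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    return [
        [ca * cb, ca * sb * sg - sa * cg, ca * sb * cg + sa * sg],
        [sa * cb, sa * sb * sg + ca * cg, sa * sb * cg - ca * sg],
        [-sb, cb * sg, cb * cg],
    ]

def matvec(R, v):
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]

def transpose(R):
    return [[R[j][i] for j in range(3)] for i in range(3)]

def to_global(rho, theta, p):
    """Global antenna position: rotate the relative offset, then translate."""
    a = matvec(rot_zyx(*theta), rho)
    return [a[i] + p[i] for i in range(3)]

def to_reference(antenna_global, theta_ref, p_ref):
    """Relative antenna position in the frame defined by (theta_ref, p_ref)."""
    d = [antenna_global[i] - p_ref[i] for i in range(3)]
    return matvec(transpose(rot_zyx(*theta_ref)), d)

# Hypothetical estimated pose and relative antenna position (meters, radians).
theta_hat, p_hat = (0.30, -0.10, 0.20), [1.0, 2.0, 1.5]
rho_hat = [0.08, 0.00, 0.02]

antenna = to_global(rho_hat, theta_hat, p_hat)
# Re-express the same antenna relative to a reference pose with zero rotation.
rho_ref = to_reference(antenna, (0.0, 0.0, 0.0), [1.0, 2.0, 1.5])
print([round(c, 4) for c in rho_ref])
```

Because the transformation is rigid, the distances between antenna elements are unchanged; only the frame in which the constellation is expressed differs.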
The theoretical estimation accuracy for the antenna constellation can be derived by applying (32) to the derivations presented in Section III-A. In Fig. 3, the behavior of the corresponding CRLB for estimating the antenna constellation is illustrated using single-shot measurements from three TRPs, whose exact locations are provided in Section V. The figure presents the average CRLB over all antennas at different XR device locations, and shows that the estimation accuracy is the highest when the device is located close to a TRP. Although not visible in the figure, at a fixed location, the device orientation does not significantly affect the calibration bound. However, minor variations due to orientation can be observed, as different orientations alter the distance between the antenna elements and TRPs. The figure also clearly illustrates that the calibration performance is favorable when the device is located inside the convex hull formed by the three TRPs. This is due to the more favorable geometry in the corresponding area, compared to being located outside. Additional concrete numerical examples will be provided in Section V.

IV. ADDRESSING INTEGER AMBIGUITY AND CLOCK CHALLENGES
In this section, we extend the problem definition to cover carrier phase measurements with integer ambiguity as well as unknown UE clock drifting. Furthermore, we formulate a joint EKF-based tracking solution for UE position, UE orientation, integer ambiguities, and UE clock. After this, we describe the corresponding tracking method with differential measurements.

A. Carrier Phase Measurements With Unknown Integer Ambiguity and UE Clock Drifting
The carrier phase measurement for the nth antenna element, obtained at the mth TRP, can be written as [31]

φ_{n,m} = r_{n,m} + cδ − λ N^int_{n,m} + ν_{n,m},   (35)

where r_{n,m} is the range between the nth antenna element and the mth TRP, as described in (5). In addition, c is the speed of light, δ is the clock error between the UE and TRPs, λ is the carrier wavelength, and ν_{n,m} ∼ N(0, σ_ν^2) is measurement noise. Moreover, N^int_{n,m} denotes the integer ambiguity for the pseudorange between the nth antenna element and the mth TRP, where the pseudorange refers to the range with a clock error as r_{n,m} + cδ. The unknown UE clock δ is considered time-varying and is modeled at each time instant k following [39], [40].
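As an illustration of this measurement model, the following sketch generates a single carrier phase measurement expressed in meters. It assumes the phase equals the pseudorange minus λ times the integer ambiguity plus noise, which is consistent with the partial derivatives −λ and c used in Section IV-B, and it further assumes that the integer ambiguity is the number of whole wavelengths contained in the pseudorange. The function name and all numerical values are hypothetical.

```python
import math
import random

C = 299_792_458.0  # speed of light [m/s]

def carrier_phase(ant_pos, trp_pos, delta, wavelength, sigma_nu=0.0, rng=None):
    """Return (phase in meters, integer ambiguity) for one antenna/TRP pair."""
    r = math.dist(ant_pos, trp_pos)           # geometric range r_{n,m}
    pseudorange = r + C * delta               # range plus clock error
    # Assumption: the ambiguity counts the whole wavelengths in the pseudorange.
    n_int = math.floor(pseudorange / wavelength)
    noise = rng.gauss(0.0, sigma_nu) if rng is not None else 0.0
    return pseudorange - wavelength * n_int + noise, n_int

wavelength = C / 28e9                         # about 10.7 mm at 28 GHz
antenna = (0.5, -1.0, 1.6)                    # hypothetical antenna position [m]
trp = (5.0, 5.0, 3.0)                         # hypothetical TRP position [m]
delta = 2e-9                                  # hypothetical 2 ns clock error

phase, n_int = carrier_phase(antenna, trp, delta, wavelength)
noisy_phase, _ = carrier_phase(antenna, trp, delta, wavelength,
                               sigma_nu=0.01 * wavelength, rng=random.Random(0))
print(f"phase = {phase * 1e3:.3f} mm, integer ambiguity = {n_int}")
```

The noise-free phase stays within one wavelength, so the range information beyond that resides entirely in the unknown integer and in the clock term, which is why both must be estimated or removed.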

B. 6DoF Tracking Under Integer Ambiguities and UE Clock Errors
For joint tracking of position, orientation, integer ambiguities, and clock error, the state vector shown in (22) is extended in (37) with new state variables incorporating the integer ambiguities and the clock error together with their rates of change, where ΔN^int_{n,m} denotes the integer ambiguity rate-of-change for the nth antenna element and mth TRP. Compared to conventional approaches in the literature such as [41], [42], where integer ambiguity resolution is performed outside of the Kalman filter, we estimate all parameters within the EKF. In this way, we pursue a computationally more attractive solution for practical signal processing implementations, especially with multiple UE antennas and TRPs.
For tracking purposes, the state-transition models of integer ambiguity and clock are assumed to follow a CWNA model similar to the device position and orientation. Consequently, to obtain the a-priori estimates of the mean and covariance, denoted as ŝ^−_int&clk[k] and P^−_int&clk[k], respectively, the EKF prediction step considering integer ambiguities and clock bias can be performed by following the same approach as shown in (28). However, the state transition matrix and process covariance matrix are extended to block-diagonal matrices, where σ^2_ΔN^int and σ^2_Δδ denote the power spectral densities of process acceleration for integer ambiguity and clock, respectively. Furthermore, the a-posteriori estimates of the mean and the covariance, denoted as ŝ^+_int&clk[k] and P^+_int&clk[k], can be obtained according to (29) by considering the modified state variables and measurement model. To this end, by considering the measurement model with integer ambiguity and clock bias in (35), the Jacobian matrix for the EKF update phase is extended with partial derivatives with respect to the tracked integer ambiguities and clock as ∂φ_{n,m}/∂N^int_{n,m} = −λ and ∂φ_{n,m}/∂δ = c, respectively. Since the partial derivatives with respect to the orientation θ and position p_UE are identical to the ones presented in Section III-A, the Jacobian matrix for tracking with integer ambiguities and clock is obtained by extending H[k], described in (31), with these partial derivatives, where the clock-related column is expressed using 1_NM, a vector of all-ones with NM elements.
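The block-diagonal structure of the prediction step can be sketched as follows. The example uses the standard textbook discretization of a continuous white noise acceleration model for a single [value, rate] state pair, which is an assumption here since the article's own matrices in (28) are not reproduced in this section. The state ordering, function names, and the pairing of one block per tracked quantity are illustrative only, while the numerical values are those quoted in Section V-F.

```python
def cwna_blocks(dt, psd):
    """Transition and process-noise blocks for one [value, rate] state pair."""
    F = [[1.0, dt],
         [0.0, 1.0]]
    Q = [[psd * dt**3 / 3.0, psd * dt**2 / 2.0],
         [psd * dt**2 / 2.0, psd * dt]]
    return F, Q

def block_diag(*blocks):
    """Place square blocks along the diagonal of a zero matrix."""
    size = sum(len(b) for b in blocks)
    out = [[0.0] * size for _ in range(size)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, value in enumerate(row):
                out[offset + i][offset + j] = value
        offset += len(b)
    return out

dt = 1e-3                                    # 1 ms update interval (Section V-F)
F_int, Q_int = cwna_blocks(dt, psd=1e8)      # one integer ambiguity, FR1 value
F_clk, Q_clk = cwna_blocks(dt, psd=1e-20)    # clock
F = block_diag(F_int, F_clk)
Q = block_diag(Q_int, Q_clk)

for row in F:
    print(row)
```

In a full implementation, one such block would be needed per antenna and TRP pair, in addition to the blocks for orientation and position.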
As defined in (37), the proposed EKF formulation with unknown integer ambiguity and UE clock considers all state variables as real numbers, which does not reflect the true behaviour of integer ambiguities. Therefore, we propose using an additional EKF update phase, where the real-valued integer ambiguity estimates from ŝ^+_int&clk[k] are rounded and fed back to the EKF in the form of virtual integer ambiguity measurements. This approach enables estimation of rounded (integer precision) integer ambiguities inside the same EKF without introducing supplementary algorithms for integer ambiguity resolution. Accordingly, the additional EKF update step is defined in (38), together with the corresponding Kalman gain and the measurement vector ŷ^int[k] that includes the rounded integer estimates. Furthermore, H^int_N = [0_{NM×12}, I_NM, 0_{NM×(NM+2)}] and Σ^int_N[k] ∈ R^(NM×NM) are the constant measurement matrix and the covariance matrix for the virtual integer ambiguity measurements, respectively. To obtain the rounded integer ambiguity measurements ŷ^int[k] for the additional update step, we consider a simple rounding function.
Fig. 4. EKF processing steps for tracking with integer ambiguities and unknown clock bias. Conceptually similar processing steps are also utilized with differential carrier phase measurements. Instead of rounding, alternative integer ambiguity estimation methods can also be deployed.
Alternatively, one could also consider the least-squares ambiguity decorrelation adjustment (LAMBDA) method [43], where, with the estimated a-posteriori covariance P^+_int&clk[k], all integer ambiguities can be jointly estimated based on the least squares principle. Compared to the simple rounding scheme, the LAMBDA method can provide increased estimation robustness in challenging operation environments at the cost of higher computational complexity. This is because the LAMBDA method includes the possibility to scale the integer ambiguity search space, and thus improves integer ambiguity resolution under challenging cycle slips. However, since the search space grows exponentially when increasing the number of UE antennas and TRPs, the resulting computational complexity must be carefully considered in practical scenarios with real-time processing. An illustration of the proposed EKF processing steps is shown in Fig. 4.
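The additional update with virtual integer measurements can be illustrated with a small sketch. It assumes a measurement matrix that simply selects one ambiguity state and applies the standard Kalman update equations; the two-state example, the function name, and all numerical values are hypothetical and only meant to show how rounding pulls the float estimate towards an integer while correlated states are corrected through the covariance.

```python
import math

def virtual_integer_update(x, P, idx, meas_var):
    """Kalman update using the rounded value of state x[idx] as a measurement."""
    n = len(x)
    y = math.floor(x[idx] + 0.5)             # simple rounding to the nearest integer
    innovation = y - x[idx]
    S = P[idx][idx] + meas_var               # innovation variance (H selects x[idx])
    K = [P[i][idx] / S for i in range(n)]    # Kalman gain column
    x_new = [x[i] + K[i] * innovation for i in range(n)]
    P_new = [[P[i][j] - K[i] * P[idx][j] for j in range(n)] for i in range(n)]
    return x_new, P_new

# Hypothetical two-state example: [range-like state, float ambiguity estimate].
x = [0.10, 3.8]
P = [[0.04, 0.01],
     [0.01, 0.09]]
x_pp, P_pp = virtual_integer_update(x, P, idx=1, meas_var=1e-4)
print(x_pp, P_pp[1][1])
```

A small virtual measurement variance makes the update behave almost like a hard integer constraint, whereas a larger value keeps the filter more tolerant to an incorrect rounding decision.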

C. 6DoF Tracking Through Differential Carrier Phase Measurements
Considering the carrier phase measurement of the n_ref-th antenna element and m_ref-th TRP as a reference, a differential phase measurement is described in (39). Compared to the formulations and approach in Section IV-B, the differential measurement approach allows suppressing the impact of the UE clock offset without explicit clock tracking.
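The clock suppression property can be checked numerically with a short sketch. It assumes that the differential measurement is the difference between each carrier phase measurement and the reference measurement, and that the clock error is common to all antenna and TRP pairs; the ranges, ambiguities, and function names below are hypothetical.

```python
C = 299_792_458.0  # speed of light [m/s]

def phase(r, delta, n_int, wavelength):
    """Noise-free carrier phase in meters: range + clock term - ambiguity term."""
    return r + C * delta - wavelength * n_int

def differential(phases, ref):
    """Subtract the reference measurement from every other measurement."""
    return {k: v - phases[ref] for k, v in phases.items() if k != ref}

wavelength = C / 3.5e9
# Hypothetical ranges [m] and ambiguities, keyed by (antenna n, TRP m).
ranges = {(0, 0): 6.20, (1, 0): 6.31, (0, 1): 8.75, (1, 1): 8.69}
ambigs = {(0, 0): 70, (1, 0): 71, (0, 1): 100, (1, 1): 99}

def all_phases(delta):
    return {k: phase(ranges[k], delta, ambigs[k], wavelength) for k in ranges}

d_small = differential(all_phases(delta=1e-9), ref=(0, 0))
d_large = differential(all_phases(delta=5e-7), ref=(0, 0))
for key in d_small:
    print(key, f"{d_small[key]:.6f}", f"{d_large[key]:.6f}")
```

The differential values agree for both clock errors, while each difference still contains a difference of integer ambiguities that must be tracked.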
The tracked state vector with differential phase measurements now includes the integer ambiguity difference and its rate-of-change for each considered antenna element and TRP pair. Again, assuming the CWNA-based state-transition model, the EKF prediction step to obtain the a-priori estimates of the mean ŝ^−_diff[k] and covariance P^−_diff[k] can be performed based on (28) by considering the following modifications. Firstly, the state-transition matrix and the process covariance matrix for tracking with differential carrier phase measurements are described as block-diagonal matrices. Moreover, σ^2_ΔN^diff denotes the power spectral density of process acceleration for the integer ambiguity difference.
For the EKF update step with differential carrier phase measurements, the a-posteriori estimates of the mean and covariance, denoted as ŝ^+_diff[k] and P^+_diff[k], can be obtained based on (29) by appropriately modifying the state variables and measurement model. Accordingly, given the measurement model in (39), the Jacobian matrix for differential phase measurements includes additional terms for the integer ambiguity differences, given as ∂φ^diff_{n,m}/∂N^diff_{n,m} = −λ. Since the integer ambiguity differences N^diff_{n,m}[k] are also integer values, the additional EKF update step, described in (38), is also performed for the differential measurements to obtain the a-posteriori estimates of the mean and covariance as ŝ^++_diff[k] and P^++_diff[k], respectively. Regarding the additional EKF update step with differential measurements, the constant measurement matrix and the measurement covariance matrix are defined for the integer ambiguity difference states, while ŷ^diff[k] ∈ R^MN is the measurement vector including the estimates of rounded integer ambiguity differences. The EKF processing steps illustrated in Fig. 4 are conceptually also valid for the differential measurements.

D. Further Discussion
The key driver towards the proposed EKF processing with integer ambiguity estimation is feasible computational complexity, enabling practical real-time processing that can scale up to several user device antennas. The utilized rounding approach to handle the integer ambiguities is simple, but effective when utilized with the EKF through virtual integer ambiguity measurements, as shown in Fig. 4. However, the proposed overall concept with antenna-level measurements allows, as such, estimating integer ambiguities also with more advanced methods, such as the LAMBDA method, with further details available, e.g., in [31], [41], [42], [44]. In [41] and [42], a Kalman filter is processed independently using real (float) numbers, and a subsequent estimation method is used separately to resolve the integer ambiguity. In [42], a separate batch weighted nonlinear least-squares method is used to jointly estimate the UE position and real-valued integer ambiguities, but eventually both [41] and [42] utilize the LAMBDA method for integer ambiguity resolution. Moreover, a mixture Kalman filter is proposed in [44] as an elegant, but computationally fairly involved, solution based on particle sampling.
In practical setups with measurement outliers, or with longer measurement intervals, the mixture Kalman filter can provide robustness against cycle slips. Furthermore, by dynamically scaling the computational complexity according to the underlying estimation uncertainty, for example, by varying the number of used particles, the mixture Kalman filter could provide a favorable trade-off between computational complexity and robustness for the considered XR scenario.

A. Scenario and Assumptions
The performance of the proposed concept and methods is next numerically assessed in an indoor factory like setting, following the 3GPP evaluation guidelines and assumptions stated in [26], [32]. The size of the factory area within which a person wearing the XR headset is moving is limited to 10 m × 10 m, while both 3.5 GHz and 28 GHz network deployments with three TRPs are considered. In the baseline evaluations, the XR headset is assumed to contain five antenna points, while the number is also later varied. The considered gNB or TRP deployment locations are illustrated in Fig. 5. To account for different practical aspects and limitations, such as oscillator impairments, antenna reference point errors and antenna phase center offsets, we utilize the extensive carrier phase estimation accuracy results collected and reported in [26], while also varying the corresponding ranging estimation error variance to understand how the antenna-level ranging performance impacts the performance of the proposed 6DoF estimation and tracking concept. Furthermore, all numerical results are evaluated based on 100 separate UE test tracks with varying UE position and orientation. Within any individual UE test track, carrier phase measurements are obtained at intervals of Δt = 10 ms, which is the duration of one radio frame in 5G NR. In practice, the required update interval depends on the expected maximum velocities (or rates-of-change) of the tracked parameters and varies according to different system parameters, such as measurement noise. With certain considered simulation setups, larger intervals could be used, especially in ones with low measurement noise. Nonetheless, in 5G NR, it is possible to transmit PRSs or SRSs even with less than 1 ms intervals, and thereby influence the update interval. For the test tracks, the true UE movement and the true evolution of orientation angles are modeled using a non-linear kinematic model, which builds on the 3D-extension of the model presented in [45]. Each user track is 30 s long, and bounded within the 10 m × 10 m × 2.5 m operation area as illustrated in Fig. 5. Moreover, the values of orientation angles are limited to −180 deg ≤ α < 180 deg and −60 deg ≤ β, γ ≤ 60 deg to reflect physically feasible XR headset rotation. Inside a track, the UE velocity and the rate-of-change per orientation angle vary up to 4 km/h and 90 deg/s, respectively. Such a velocity corresponds essentially to human walking speed, while the orientation rate-of-change is well in line with the human head moving rates reported, e.g., in [46]. An example of a considered 3D UE track with varying 3D UE orientation is illustrated in Fig. 6 by showing separate antenna positions as functions of time. For visual clarity, only the positions of 3 antenna elements are shown, instead of all 5. In addition, to assist with perception of 3D space, projections of the antenna positions are shown in the XY-plane, XZ-plane and YZ-plane.
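To put the track parameters into perspective, the following short calculation derives the number of filter steps per track and the largest per-step changes implied by the stated track length, measurement interval, maximum velocity, and maximum rotation rate. The variable names are illustrative only.

```python
track_duration_s = 30.0      # length of one test track
dt_s = 10e-3                 # measurement interval (one 5G NR radio frame)
max_speed_kmh = 4.0          # maximum UE velocity
max_rate_deg_s = 90.0        # maximum orientation rate-of-change

steps = round(track_duration_s / dt_s)
max_speed_ms = max_speed_kmh / 3.6
move_per_step_mm = max_speed_ms * dt_s * 1e3
turn_per_step_deg = max_rate_deg_s * dt_s

print(f"{steps} steps per track")
print(f"up to {move_per_step_mm:.2f} mm of movement per step")
print(f"up to {turn_per_step_deg:.2f} deg of rotation per step")
```

At the maximum velocity, the device thus moves roughly one centimeter between consecutive measurements, which is comparable to the carrier wavelength at 28 GHz and indicates why shorter update intervals become relevant when integer ambiguities are tracked.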
Assuming uncorrelated measurements, the covariance matrix of the measurement noise is defined as Σ = σ_r^2 I_MN, where σ_r is the standard deviation of the ranging error. By following the reported carrier phase-based positioning and ranging results in [26], we evaluate results for the theoretical 6DoF bounds in Section V-B by varying the ranging error from σ_r = 0.1 mm to σ_r = 50 mm. Such a fairly large range covers well the individual accuracy results reported and discussed in [26], reflecting different practical assumptions. Moreover, for the 6DoF tracking results in Section V-C, we consider two fixed ranging error deviations, given as σ_r = 50 mm and σ_r = 5 mm, which correspond to 3.5 GHz and 28 GHz network deployments, respectively. These stem from the comprehensive ranging performance results provided in [26], serving as realistic practical example scenarios. These ranging standard deviation values are in the order of half the wavelength of the considered frequency bands; hence, one can argue that the ranging accuracy assumptions are even somewhat pessimistic, or at least not overly optimistic and thus safe.
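The relation between the assumed ranging standard deviations and the carrier wavelengths can be verified with a brief calculation, using only the carrier frequencies and σ_r values stated above. The function name is illustrative.

```python
C = 299_792_458.0  # speed of light [m/s]

def half_wavelength_mm(carrier_hz):
    """Half of the carrier wavelength in millimeters."""
    return 0.5 * C / carrier_hz * 1e3

for label, carrier_hz, sigma_r_mm in (("FR1, 3.5 GHz", 3.5e9, 50.0),
                                      ("FR2, 28 GHz", 28e9, 5.0)):
    half = half_wavelength_mm(carrier_hz)
    print(f"{label}: lambda/2 = {half:.2f} mm, sigma_r = {sigma_r_mm:.0f} mm, "
          f"ratio = {sigma_r_mm / half:.2f}")
```

Half the wavelength is approximately 43 mm at 3.5 GHz and 5.4 mm at 28 GHz, so both assumed σ_r values lie close to this scale.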
In order to provide statistical variation for the 6DoF tracking results in Section V-C, each track is further run over 100 trials with different measurement noise realizations. At the beginning of each track realization, the used EKF is initialized without assuming any prior information on the UE position, UE orientation, or related motion. The initial UE position and UE orientation, as well as their covariance, are obtained based on the Gauss-Newton algorithm [34] by exploiting the derived Jacobian matrix in (9). Moreover, the initial UE velocity and the rotational rate-of-change are set to zero with very large covariance. Regarding the state-transition model, the power spectral densities of angular acceleration and UE acceleration of the process covariance Q[k] are defined as σ^2_Δθ = 9·10^−4 and σ^2_v = 4·10^−4, respectively. Throughout the evaluations, we assume a known measurement covariance matrix Σ[k]. However, in practice the measurement covariance can be estimated on the fly, or learned in advance based on training data. Finally, the baseline results build on the assumption of known XR headset antenna geometry; however, results related to over-the-air estimation of the antenna constellation are also provided towards the end. Similarly, the extended problem scope with unknown integer ambiguities and drifting UE clock is considered towards the end of the numerical results.

B. 6DoF Bounds
In the following, we first study the measurement-related (no prior) bounds and their behavior as functions of the standard deviation of the carrier phase-based ranging error, σ_r. The behaviors of the corresponding posterior bounds are illustrated and studied along with the actual EKF-based tracking results in the next subsection.
First, CRLBs are evaluated and shown separately for the orientation parameters α, β and γ in Fig. 7. The RMSE curves are obtained based on the derived expressions in (15) as CRLB(α), CRLB(β) and CRLB(γ), respectively. As can be observed, highly accurate estimation of the orientation parameters is theoretically feasible, with 1° RMSE of an individual parameter calling for approximately 3 to 5 mm ranging accuracy in the considered system geometry. Based on [26], such ranging accuracies are feasible, especially at the millimeter-wave bands. We can also observe that the estimation of α is somewhat more accurate, theoretically, stemming from the favourable TRP geometry in the horizontal plane, in which α directly operates.
Fig. 7. CRLBs for the estimation of orientation parameters α, β and γ as functions of ranging error standard deviation σ_r.
Fig. 8. CRLBs for the estimation of x-coordinate, y-coordinate and z-coordinate as functions of ranging error standard deviation σ_r.
Fig. 9. OEBs as functions of ranging error standard deviation σ_r for three different XR device sizes. Also the corresponding 1st to 99th percentile distribution ranges are illustrated through shaded regions.
The corresponding no-prior bounds for the x-coordinate, y-coordinate and z-coordinate are illustrated in Fig. 8. The RMSE curves are evaluated based on the derived expressions in (15), as CRLB(x̂_UE), CRLB(ŷ_UE) and CRLB(ẑ_UE), respectively. Based on the bounds and their numerical behavior, below centimeter level positioning accuracy is theoretically feasible with the 10 mm ranging performance. It can also be observed from the figure that the estimation of the z-coordinate is less accurate, at least to some extent, stemming from the fact that all the network TRPs are deployed at mutually similar antenna heights as described in the previous subsection. Hence, taking this aspect into account in the TRP or gNB deployment is essential to maximize the 6DoF estimation performance.
Next, the average OEB behavior, as defined and expressed in (16), is evaluated and shown in Fig. 9 as a function of the ranging error standard deviation σ_r. Besides the assumed XR device antenna configuration, results include also two additional configurations, where the dimensions of the device, and thereby the antenna geometry, are either halved or doubled. This provides further insight into the dependency of the error and performance on the assumed antenna separations at the XR device. The shaded areas in the figure show the OEB domains between the 1st and 99th percentiles over all time steps and test tracks. It can clearly be observed that for a given antenna-level ranging standard deviation, the performance is very stable across the different paths and points along the tracks. We can also observe that in order to reach a true one degree 3D orientation RMSE, the ranging accuracy should be just a few millimeters. The figure also clearly shows that increasing the device size has a clear positive impact on the 3D orientation estimation accuracy, due to the improved antenna geometry.
Fig. 10. PEB as a function of ranging error standard deviation σ_r with the default XR device size. Also the corresponding 1st to 99th percentile distribution range is illustrated through shaded region.
Finally, the average PEB behavior, as defined and expressed in (17), is evaluated and shown in Fig. 10 while again varying the ranging error standard deviation σ_r. Regarding the PEB, the RMSE values with half-sized array or double-sized array are practically identical for the considered system configuration, and are thus not shown in the figure. The results in Fig. 10 again highlight the consistency of the estimation performance along the tracks and points of the individual tracks, assuming a given ranging standard deviation. Concretely, in order for the 3D location RMSE to be below 1 cm with high probability, antenna ranging accuracies in the order of 5 mm are required in the considered scenario.

C. 6DoF Tracking Results
Next, we study and present the performance of the proposed EKF and the corresponding posterior bounds derived and expressed in (20). We consider two alternative ranging error standard deviations, σ_r = 5 mm and σ_r = 50 mm = 5 cm, chosen based on the extensive carrier phase estimation studies in [26], and corresponding to 28 GHz and 3.5 GHz network deployments, respectively. As noted earlier, the results build on 100 independent UE tracks with time-varying position and orientation, while each track is also sampled 100 times with different measurement noise realizations for statistical purposes.
First, the average orientation estimation RMSEs for the proposed EKF as functions of time are shown in Fig. 11 with the two considered ranging error standard deviations. The figure also shows the corresponding posterior OEBs for reference. It can be clearly observed that after the initial transient, the proposed EKF yields performance very close to the respective posterior bound, hence implying an efficient estimator. Application-wise, it can also be observed that below one degree 3D orientation tracking is feasible, especially at the millimeter-wave frequencies.
Similarly, the average positioning RMSEs for the proposed EKF as functions of time are shown in Fig. 12, together with the applicable posterior PEBs, for the two considered ranging error standard deviations. The results again highlight the efficiency of the EKF solution, while also evidencing close to one centimeter 3D location tracking RMSE at the millimeter-wave bands.
The actual cumulative distributions of the absolute estimation errors with the proposed EKF for orientation angles α, β and γ are shown in Fig. 13, again considering the two antenna-level ranging error standard deviations of 5 mm and 50 mm. Similarly, the corresponding cumulative distributions of the absolute estimation errors for x-coordinate, y-coordinate and z-coordinate are shown in Fig. 14. The cumulative distributions demonstrate, overall, highly consistent tracking performance across the tracks and noise realizations. Additionally, similar to the no-prior performance findings in the previous subsection, the EKF performance is relatively better for α, compared to other orientation parameters, due to the more favourable horizontal geometry of the TRPs. The distributions also show that tracking the device z-coordinate is more challenging, as noted earlier, due to the fact that the antennas of all three TRPs are located essentially at the same height. Finally, we summarize selected key performance measures in Table I, covering both the frequency range 1 (FR1, 3.5 GHz) and FR2 (28 GHz) deployments. The table shows the 3D orientation and 3D position RMSEs with the proposed EKF formulation together with the RMSEs of the corresponding static estimator, which refers to the Gauss-Newton based estimation utilized as the EKF initialization approach. Additionally, the respective bounds are shown for comparison purposes. The results in the table clearly highlight the added value of a true tracking solution that incorporates information also from the preceding estimates. Specifically, as shown by the results in Table I, essentially one order of magnitude RMSE improvements can be obtained through EKF-based tracking compared to static estimation. Overall, the results demonstrate the feasibility of extremely accurate 6DoF XR device tracking, with below one degree and below one centimeter orientation and position RMSEs available at the millimeter-wave bands.
Besides the proposed EKF-based approach, corresponding CKF and UKF based solutions were also implemented and tested. However, the obtained results with CKF and UKF are practically identical to those of the EKF, and thus they are omitted from the numerical results shown in this article. It should be emphasized that under the condition of similar performance between EKF, CKF and UKF, the EKF is preferred from the computational complexity perspective.

D. Results With Reduced Headset Antenna Counts
Next, we briefly study how reducing the antenna count in the XR headset, from 5 down to 4 and 3, impacts the 6DoF tracking performance. For the considered 4 antenna scenario, we employ the antenna elements with indices 0, 1, 2 and 3 from the original antenna constellation, which was earlier illustrated in Fig. 5. Furthermore, for the 3 antenna scenario we remove one additional element, and thus employ only the antenna elements with indices 0, 1 and 2.
The average EKF tracking RMSEs of 3D positioning and 3D orientation estimation with reduced antenna counts are shown in Table II for both the FR1 (3.5 GHz) and FR2 (28 GHz) deployments. In addition to the EKF tracking results, the corresponding performance bounds (PEB, OEB) are again provided. All results are evaluated based on the same test tracks and parameters as with the original 5 antenna setup. According to the shown RMSEs, reducing the antenna count in the FR2 deployment has a rather small effect. Specifically, when reducing the antenna count from 5 to 3, the EKF RMSE for the 3D orientation increases from 0.83 deg to 1.14 deg, and correspondingly for the 3D position from 3.05 mm to 3.76 mm. However, the performance drop is more notable at FR1, where the RMSE of 3D orientation estimation is roughly doubled, and the RMSE of 3D positioning almost tripled, when reducing the antenna count from 5 to 3. The reason for this is that in the FR1 deployment, the assumed 50 mm ranging error standard deviation inflicts antenna-level measurement errors that are already at the level of antenna separation distances, which in turn introduces challenges to the estimator when the antenna count is reduced towards the critical number of 3 antennas. Nevertheless, the proposed EKF is still able to achieve RMSEs close to the theoretical bounds, thus demonstrating its feasibility for the considered tracking of XR device orientation and location even with only 3 headset antenna points. These findings also clearly motivate careful XR headset design in terms of optimizing the antenna positions.

E. Results With OTA Calibration of Headset Antennas
We next demonstrate and evaluate the proposed over-the-air estimation or calibration of the XR headset antenna constellation. The evaluations are performed using the original antenna constellation with 5 antenna elements according to Fig. 5, including the four different OTA estimation approaches described in Section III-D. The RMSEs of the antenna constellation estimates as functions of time for the four considered approaches are shown in Fig. 15. For all the approaches, we assume the FR2 deployment with 5 mm ranging standard deviation, and evaluate the performances by exploiting the same test tracks and orientations as described in Section V-A. Besides the EKF tracking results, the corresponding theoretical posterior bounds are derived and shown to illustrate the EKF performance with respect to an estimator with optimum performance.
In the first approach, namely the Unknown Movement and Rotation case, the XR device is moving and rotating in the operation area, while the EKF tracks jointly the device position and device orientation as well as the antenna constellation. It should be emphasized that during the tracking process the estimated parameters do not reflect the true parameters due to the implicit relation between the position, orientation and relative antenna coordinates. Thus, the accuracy of the antenna constellation estimation can be evaluated only after the estimated antenna constellation is transformed to the original frame of reference, as discussed and defined in (34). As seen in Fig. 15, of the four considered approaches, the approach with unknown movement and rotation is the most challenging, reaching an antenna constellation estimation RMSE of below 4 mm. However, it can also be seen that the proposed EKF gets close to the posterior performance bound, found to be approximately 3 mm at the end of the tracking process.
In the second approach of estimating the antenna constellation, namely the Unknown Rotation at a Fixed Known Position scenario, the XR device is rotating at a fixed known position, while the EKF tracks jointly the device orientation and the antenna constellation. For these evaluations, the XR device is located at a fixed position in the center of the area (p_UE = [0, 0, 1.5]^T), while the orientation changes according to the orientation of the earlier considered test tracks. As seen in Fig. 15, compared to the first approach, the estimation RMSE is improved due to the reduced process covariance through known device position, reaching an antenna constellation accuracy of below 2 mm. In the third approach, namely the Unknown Movement and Fixed Known Rotation scenario, the orientation is instead known and fixed at θ = [0, 0, 0]^T, while the device is moving with unknown position. Due to an interchangeable relationship between the antenna constellation and the device position, this approach results in lower estimation accuracy compared to the one with known position, eventually reaching around 3 mm accuracy.
Whereas the first three approaches can be considered more essential and relevant for practical scenarios and applications, the fourth approach, namely the Known Movement and Rotation reference case, works as a performance benchmark for estimating the antenna constellation. In this approach, similar to the first and third ones, the XR device is moving and rotating in the operation area, but now assuming that the device position and orientation are perfectly known at each time instant. Thus, the EKF now tracks only the unknown antenna constellation. As seen in Fig. 15, the RMSE of antenna constellation estimation reaches accuracies that are only fractions of a millimeter. Moreover, unlike with the other considered OTA calibration approaches, the estimation accuracy is now able to continuously improve in time due to the absence of process noise related to the device position and/or the orientation.

F. Results With Integer Ambiguity and UE Clock Drifting
For the results with integer ambiguity and clock bias, as well as with differential carrier phase measurements, we consider the same general parameters described in Section V-A, except that the filter update interval is reduced to 1 ms to support tracking of integer ambiguities. The standard deviation of the clock skew noise ξ[k] is defined as σ_ξ = 10^−10, which can result in clock variations corresponding to tens of meters of ranging error during one track. Moreover, the considered true integer ambiguities stem directly from the time-varying position, orientation and clock of the UE. For the standard deviation of the carrier phase measurement error, we consider three different values, namely σ_ν = 0.01λ m, σ_ν = 0.05λ m, and σ_ν = 0.1λ m, where λ is the carrier wavelength. These correspond to carrier phase error standard deviations of 3.6 deg, 18 deg, and 36 deg, respectively.
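The correspondence between the measurement error expressed in wavelengths, degrees, and millimeters can be checked with a short calculation. The sketch below assumes the nominal carrier frequencies of 3.5 GHz (FR1) and 28 GHz (FR2) used in this article; the function names are ours.

```python
C = 299_792_458.0  # speed of light in m/s

def wavelength_m(fc_hz):
    """Carrier wavelength in meters."""
    return C / fc_hz

def phase_std_deg(fraction_of_wavelength):
    """Phase error in degrees for an error given as a fraction of lambda."""
    return 360.0 * fraction_of_wavelength

def range_std_mm(fraction_of_wavelength, fc_hz):
    """Equivalent ranging error in millimeters."""
    return 1e3 * fraction_of_wavelength * wavelength_m(fc_hz)

for name, fc in (("FR1", 3.5e9), ("FR2", 28e9)):
    for frac in (0.01, 0.05, 0.1):
        print(
            f"{name}: {frac:.2f} lambda = {phase_std_deg(frac):.1f} deg"
            f" = {range_std_mm(frac, fc):.3f} mm"
        )
```

The phase values in degrees are independent of the carrier frequency, while the equivalent ranging error scales with the wavelength and is thus eight times smaller at 28 GHz than at 3.5 GHz.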
In the EKF process model, the additional parameters related to clock drift and integer ambiguities have been chosen based on brief testing over different parameter combinations. Regarding the approach with tracking the clock and integer ambiguities, we define the power spectral density of the clock acceleration as σ^2_{Δδ} = 10^−20. Furthermore, the power spectral density of the process acceleration for the integer ambiguity is defined as σ^2_{ΔN_int} = 10^8 for FR1, and σ^2_{ΔN_int} = 10^10 for FR2. The relatively large process noise is adopted in order to allow fast integer updates with the considered simple rounding scheme. However, the process noise magnitude can be considerably decreased when using more advanced integer resolution methods, such as the LAMBDA method. Then, regarding the approach with differential carrier phase measurements, the power spectral density of the process acceleration for the integer ambiguity difference is defined as σ^2_{ΔN_diff} = 10^10 for FR1, and σ^2_{ΔN_diff} = 10^12 for FR2. The covariance matrix for the virtual integer ambiguity measurements in the additional EKF update step is given as Σ. For all results with integer ambiguity, we initialize the EKF using differential measurements at FR1 (defined in (39)) and a non-linear least squares estimation based on the Gauss-Newton method with multiple initial starting points for the iterations. With the differential measurements, the initial clock error can be removed, allowing accurate UE position estimation. The initialization of the EKF is accomplished using 100 samples before proceeding to the proposed EKF processing steps, shown in Fig. 4. In practice, the computational burden and latency of such a brute-force approach could be mitigated by exploiting additional initial positioning methods supported in 5G NR, such as multi-round-trip-time (Multi-RTT), time-difference-of-arrival (TDoA), or AoA.

Fig. 16. Cumulative distributions of the absolute 3D positioning errors for tracking with integer ambiguities and unknown clock, as well as for tracking with differential measurements.
Fig. 17. Cumulative distributions of the absolute 3D orientation estimation errors for tracking with integer ambiguities and unknown clock, as well as for tracking with differential measurements.
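The reason why differential measurements remove the initial clock error can be sketched as follows. The example assumes a simplified, noise-free and ambiguity-free measurement model in meters, where each antenna-TRP measurement equals the geometric distance plus a common term c·δ caused by the UE clock offset δ. The antenna and TRP coordinates are hypothetical, and the reference index pair (0, 0) is an arbitrary choice.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def phase_ranges(ue_antennas, trps, clock_bias_s):
    """Simplified phase-derived ranges in meters for every antenna-TRP pair."""
    return {
        (n, m): math.dist(a, t) + C * clock_bias_s
        for n, a in enumerate(ue_antennas)
        for m, t in enumerate(trps)
    }

def differential(meas, ref=(0, 0)):
    """Subtract the reference measurement from all other measurements."""
    return {k: v - meas[ref] for k, v in meas.items() if k != ref}

# Hypothetical geometry in meters.
trps = [(10.0, 0.0, 3.0), (-5.0, 8.66, 3.0), (-5.0, -8.66, 3.0)]
ue_antennas = [(0.08, 0.0, 1.5), (-0.08, 0.0, 1.5), (0.0, 0.0, 1.55)]

no_bias = differential(phase_ranges(ue_antennas, trps, 0.0))
with_bias = differential(phase_ranges(ue_antennas, trps, 50e-9))  # 50 ns offset

# The common clock term cancels in every difference.
max_clock_residual = max(abs(no_bias[k] - with_bias[k]) for k in no_bias)
print(max_clock_residual)
```

A clock offset of 50 ns corresponds to roughly 15 m of ranging error in the undifferenced measurements, yet the differenced values agree to within floating-point precision. The cost is one fewer measurement and a correlated effective noise across the differences.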
In Fig. 16, cumulative distributions of absolute 3D positioning errors are presented for both FR1 and FR2, considering tracking with integer ambiguities and unknown clock as well as two separate standard deviations of the carrier phase measurement error. When comparing the positioning accuracy between FR1 and FR2, higher accuracy is achieved at FR2 due to the shorter wavelength. At FR1, EKF-based tracking with integers and clock ("Int&Clock"), presented in Section IV-B, performs similarly to the corresponding tracking with differential measurements ("Differential"), presented in Section IV-C. However, at FR2, tracking with integers and clock becomes more challenging due to more frequent integer ambiguity cycle slips. The 90% error percentiles with a measurement standard deviation of σ_ν = 0.01λ are below 12 mm at FR1 and below 4 mm at FR2, while with σ_ν = 0.05λ they are below 40 mm at FR1 and below 14 mm at FR2. The corresponding cumulative distributions for absolute 3D orientation errors are shown in Fig. 17. In contrast to the positioning results, there are no visible differences between tracking with integers and clock and tracking with differential measurements. This is due to the fact that orientation estimation is based on relative measurements between different antennas, and is thus insensitive to a common clock error between the antennas. With a measurement standard deviation of σ_ν = 0.01λ, the 90% error percentiles are below 0.27 deg at FR1 and below 0.06 deg at FR2, whereas with σ_ν = 0.05λ, the percentiles are below 0.89 deg at FR1 and below 0.19 deg at FR2.
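For reference, a 90% error percentile of the kind quoted above can be read from an empirical cumulative distribution as sketched below. The error samples are synthetic (Gaussian per-axis errors with a hypothetical 5 mm standard deviation) and do not reproduce the simulation results of this article; the nearest-rank percentile definition is one common choice among several.

```python
import math
import random

def error_percentile(errors, q):
    """Nearest-rank percentile: smallest sample with at least q% of samples <= it."""
    ordered = sorted(errors)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]

# Synthetic absolute 3D errors in millimeters (hypothetical 5 mm per-axis std).
random.seed(0)
errors_mm = [
    math.sqrt(sum(random.gauss(0.0, 5.0) ** 2 for _ in range(3)))
    for _ in range(10_000)
]

p90 = error_percentile(errors_mm, 90)
fraction_below = sum(e <= p90 for e in errors_mm) / len(errors_mm)
print(p90, fraction_below)
```

The returned value is the error level that 90% of the samples do not exceed, that is, the abscissa at which the empirical cumulative distribution reaches 0.9.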
Cumulative distributions of the clock estimation error are shown in Fig. 18 for the tracking with integers and clock. Again, the estimation accuracy at FR2 is higher compared to FR1. Considering the measurement standard deviation of σ_ν = 0.05λ, the 90% error percentiles for the clock estimation are below 0.03 ns at FR1 and below 0.02 ns at FR2.
As shown earlier in Fig. 14, the positioning performance is limited due to poor TRP geometry from the z-coordinate (altitude) point of view. Thus, for the final evaluation, we introduce an additional TRP at the position p_TRP,3 = [0, 0, 4]^T, which can be considered to be placed on the ceiling in the middle of the operation area. In Fig. 19, cumulative distributions of positioning errors are compared between 3 TRPs and 4 TRPs for tracking with integers and clock ("Int&Clock"), while assuming FR1 and two measurement standard deviations of σ_ν = 0.01λ and σ_ν = 0.10λ. Moreover, the cumulative distributions are shown separately for the horizontal 2D error defined in the xy-plane ("Hor.") and the altitude error defined in the direction of the z-axis ("Alt."). From the figure, it can be seen that with 3 TRPs the altitude error is considerably larger compared to the horizontal error, especially with σ_ν = 0.10λ. However, with 4 TRPs the horizontal and altitude errors are in the same range, and most importantly, there is a significant improvement in the altitude estimation accuracy compared to the original 3 TRP case. In addition, with 4 TRPs the 90% error percentiles for both the horizontal and altitude errors reach below 4 mm with σ_ν = 0.10λ. Finally, in Fig. 20, the corresponding cumulative distributions for the 3D orientation error are shown for the 3 TRP and 4 TRP scenarios. It can be seen that introducing the fourth TRP does not provide a similar increase in accuracy for orientation estimation as for positioning. Thus, orientation estimation accuracy is not as sensitive to the TRP geometry as positioning accuracy. Nonetheless, overall, when applying 6DoF tracking in the considered 5G NR scenario, it is essential to ensure an appropriate TRP geometry.
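The geometric reason behind the altitude improvement can be illustrated with a dilution-of-precision style calculation. The sketch below is a simplification and not the measurement model of this article: it assumes one range-like measurement per TRP with a known clock, so that the position error covariance is proportional to the inverse of H^T H, where the rows of H are unit vectors from the UE to the TRPs. The three wall-mounted TRP positions are hypothetical; only the ceiling TRP at [0, 0, 4]^T and the UE height of 1.5 m are taken from the text.

```python
import math

def unit_vector(src, dst):
    """Unit vector pointing from src to dst."""
    d = math.dist(src, dst)
    return [(b - a) / d for a, b in zip(src, dst)]

def normal_matrix(rows):
    """Compute H^T H for a list of 3-element rows."""
    return [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]

def inverse_3x3(a):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a00, a01, a02), (a10, a11, a12), (a20, a21, a22) = a
    c00 = a11 * a22 - a12 * a21
    c01 = a12 * a20 - a10 * a22
    c02 = a10 * a21 - a11 * a20
    det = a00 * c00 + a01 * c01 + a02 * c02
    return [
        [c00 / det, (a02 * a21 - a01 * a22) / det, (a01 * a12 - a02 * a11) / det],
        [c01 / det, (a00 * a22 - a02 * a20) / det, (a02 * a10 - a00 * a12) / det],
        [c02 / det, (a01 * a20 - a00 * a21) / det, (a00 * a11 - a01 * a10) / det],
    ]

def dops(ue, trps):
    """Return (horizontal, vertical) dilution of precision."""
    cov = inverse_3x3(normal_matrix([unit_vector(ue, t) for t in trps]))
    return math.sqrt(cov[0][0] + cov[1][1]), math.sqrt(cov[2][2])

ue = (0.0, 0.0, 1.5)
wall_trps = [(10.0, 0.0, 3.0), (-5.0, 8.66, 3.0), (-5.0, -8.66, 3.0)]  # hypothetical
ceiling_trp = (0.0, 0.0, 4.0)  # additional TRP position from the text

hdop3, vdop3 = dops(ue, wall_trps)
hdop4, vdop4 = dops(ue, wall_trps + [ceiling_trp])
print(f"3 TRPs: HDOP={hdop3:.2f}, VDOP={vdop3:.2f}")
print(f"4 TRPs: HDOP={hdop4:.2f}, VDOP={vdop4:.2f}")
```

With TRPs at nearly the same height as the UE, the unit vectors carry little vertical information and the vertical dilution is several times the horizontal one. In this simplified model, adding the overhead TRP brings the two into the same range while leaving the horizontal dilution essentially unchanged.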

VI. CONCLUSION
In this article, the so-called six degrees-of-freedom tracking challenge of XR headsets was addressed, in the context of 5G and beyond cellular networks. The proposed approach builds on antenna-level carrier phase measurements, obtained through either uplink or downlink reference signals, combined with Bayesian filtering in the form of an EKF to continuously track the XR headset 3D orientation and 3D location. The EKF formulation also allows for tracking the device velocity and orientation rate-of-change, which can provide added value in mobile XR applications, e.g., for viewpoint and/or field-of-view prediction and proactive rendering with minimum latency when operating in the uplink-based mode. Also, the applicable no-prior and posterior Cramér-Rao lower-bounds were stated and derived to assess the 6DoF performance limits. Furthermore, over-the-air estimation concepts and algorithms were proposed for acquiring the XR headset antenna geometry. Additionally, extended problem definitions and corresponding EKF solutions were provided to account for the important integer ambiguity and UE clock drifting challenges. An extensive set of numerical results was provided, evaluating and assessing the performance of the proposed methods and the corresponding performance bounds, covering results for both 3.5 GHz and 28 GHz network deployments. The obtained results clearly demonstrate that, especially in millimeter-wave networks, the proposed concept can facilitate highly accurate 3D orientation and 3D location tracking, with the best numerical examples showing RMSE accuracies below one degree and below one centimeter, respectively. Also, estimating the XR headset antenna constellation at RMSEs approaching one millimeter, as well as accurate tracking of the carrier phase integer ambiguities and UE clock drifting, were shown to be technically feasible.

R(θ) =

Fig. 3. Illustration of the CRLB for estimating the antenna constellation ρ_n, n = 0, ..., N−1, at different UE positions with three TRPs. The CRLB refers to the average of the single-shot measurement bounds of individual antennas.
m ref denotes the integer ambiguity difference between a specific measurement and the corresponding reference. Moreover, Δν_{n,m} = ν_{n,m} − ν_{n_ref,m_ref} is the effective measurement noise of the differential measurement.

Fig. 5. Illustration of the overall scenario, indicating the 3 TRP positions as well as the mobile XR device operation area (on the left), and the configuration of the antenna elements at the XR device (on the right).

Fig. 6. Example illustration of one simulation track showing the 3D positions of individual UE antenna elements as functions of time. For improved clarity, only the positions of 3 antenna elements out of 5 are shown. The grey curves show the projection of the antenna positions to the XY-, XZ- and YZ-planes.

Fig. 11. Average 3D orientation estimation RMSE for the proposed EKF and the corresponding posterior OEB with the two considered ranging error standard deviations.

Fig. 14. Cumulative distributions of the absolute estimation error for x-coordinate, y-coordinate and z-coordinate with the two considered ranging error standard deviations.

Fig. 15. Average 3D positioning RMSE and the corresponding posterior bounds for the online OTA calibration of the headset antenna constellation with four different approaches, while assuming 5 mm ranging standard deviation.

Fig. 18.Cumulative distributions of the absolute clock estimation error for tracking with integer ambiguities and unknown clock.

Fig. 19.Cumulative distributions of the absolute 2D horizontal errors and altitude errors for 3 and 4 TRPs, while assuming tracking with integer ambiguities and unknown clock at FR1.

Fig. 20. Cumulative distributions of the absolute 3D orientation estimation errors for 3 and 4 TRPs, while assuming tracking with integer ambiguities and unknown clock at FR1.

TABLE I
AVERAGE RMSES FOR 3D POSITION AND 3D ORIENTATION ESTIMATION, TOGETHER WITH THE CORRESPONDING BOUNDS AT FR1 (3.5 GHZ, 50 MM RANGING STD) AND FR2 (28 GHZ, 5 MM RANGING STD)

TABLE II
AVERAGE EKF TRACKING RMSES WITH REDUCED HEADSET ANTENNA COUNTS FOR 3D POSITION AND 3D ORIENTATION ESTIMATION, TOGETHER WITH CORRESPONDING BOUNDS AT FR1 (3.5 GHZ, 50 MM RANGING STD) AND FR2 (28 GHZ, 5 MM RANGING STD)