Height-Adaptive Human Gait Model Through Radar-Driven Pipeline With Two Co-Located mmWave MIMO Radars

In this paper, a methodology for developing a human walking model adapted to the individual measured by radar was proposed. The fundamental parameters used in the model are the gait parameters and physical dimensions. Empirical validation of the proposed methodology is undertaken, involving the acquisition of data using a 77-GHz FMCW radar. The data is collected from three distinct individuals walking five different trajectories with respect to the radar. Moreover, the gait parameter estimation accuracy is evaluated for the different walking trajectories of the target. The studied gait parameters are speed, step length, and step frequency. These could be estimated with mean errors up to 0.077 m/s, 9.3 cm, and 0.128 Hz for all trajectories, respectively. Nevertheless, these errors diminish to 0.022 m/s, 2.2 cm and 0.03 Hz, respectively when the targets walk in a straight trajectory aligned with the radar beam. Moreover, the feasibility of estimating body part dimensions directly from the radar data is investigated. It was found that only the total human height could be directly estimated using the employed hardware. Except for the tallest participant of 2.01 m, the height could be estimated with a mean absolute error up to 10.9 cm. Enhanced hardware configurations or the integration of machine learning techniques may improve the accuracy of body part dimension estimations.


I. INTRODUCTION
Human gait is a distinctive motion pattern that reflects unique characteristics of an individual.As a fundamental aspect of human locomotion, gait serves as a signature of the movement style and the physical condition of the person.The development of an accurate personalized model of human walking, capable of capturing these distinct signatures, holds great value for a variety of applications.
Significant applications where the gait model finds utility lie in healthcare and sports, specifically in gait monitoring for diagnostic and rehabilitative purposes.Medical professionals The associate editor coordinating the review of this manuscript and approving it for publication was Cheng Hu .
can analyze the gait model and its associated parameters to gain insights into medical conditions.As an example, several studies have demonstrated that self-selected gait speed and step length serve as reliable markers for diagnosing Parkinson's disease [1], [2], [3].On top of that, progression or degression in the human motor system and other changes in gait features can be analyzed when comparing models captured at different time instances.Another field of application is found within computer animation.The use of gait models in computer animation makes virtual characters more realistic.Human gait representation is crucial for creating lifelike and appealing characters in animated films, video games, and virtual reality (VR) experiences.The integration of a gait model into animation enables persons to incorporate their own unique walking style into their virtual twins.Furthermore, having an accurate model of the past motion of a person may be used to predict future motion.This prediction can be used in for example smart home systems.By utilizing gait models, smart home systems can employ predictive gait modelling to anticipate the needs of residents, such as adjusting lighting or room temperature based on their anticipated movements.
Gait measurements have been obtained through various sensing methods, such as vision-based sensors [4], [5].However, these raise concerns about privacy, and their performance can be easily affected by varying light conditions, such as low target illumination due to darkness or smoke.Alternatives utilizing laser optics, as demonstrated in [6] using lidar devices, can generate high-resolution point clouds for gait analysis but come with a higher cost.Another approach involves the use of motion sensors that make contact with the body, e.g.utilizing accelerometers and gyroscopes [7].These motion sensor-based methods exhibit high measurement accuracies.Nevertheless, one major drawback of wearables lies in the inconvenience of wearing the devices, as individuals may forget or feel uncomfortable with wearing them.
In contrast, radar-based measurements offer several advantages.Radar-based systems are low-cost, accurate and contactless while have a relatively low privacy sensitivity, making them suitable for gait monitoring applications.Consequently, there are plenty of studies that use radar for human gait measurements [8], [9], [10], [11], [12], [13].These works have successfully extracted gait parameters such as step time, step length, cadence and walking speed using monostatic radars.In the cited studies, the gait parameters were determined by exploiting the range and Doppler information over time.The measurements were done with the target walking either on a treadmill, or on a straight trajectory walking inbound or outbound with respect to the radar.The Doppler signature is well-expressed in these cases, since high Doppler shifts can be measured.In a realistic scenario, however, the target is unlikely to always follow a path aligned with the radar beam.When the location and orientation of the human relative to the radar changes, the Doppler signatures are affected as well.Hence, the gait estimation performance is expected to deteriorate.This has not been investigated in earlier work before.Therefore, in contrast to existing work on gait measurements with radar, the angular dependency on the accuracy of gait parameter estimation is studied in this work.
In addition to accurately determining gait parameters, an essential aspect of creating a lifelike human walking model involves estimating the size of body parts.Several studies exist in which the human body pose and/or keypoints are estimated, applying different methods [14].In [15] and [16], ellipses are fitted on a 2D input image made by a camera.These ellipses represent the body parts, scaled and rotated dependent on size and location of the human.In [17], a depth camera is used to generate a dense point cloud for body pose and shape estimation using a mathematical method.The aforementioned techniques are not feasible for radar, because here detected data points in space are rather sparse compared to an image.The sparse nature of the spatial radar scatterers presents a difficulty when it comes to estimating the size of body parts.To realize pose estimation from radar point clouds, machine learning (ML) was utilized in [18], [19], and [20].Using this technology, the positions of human keypoints could be estimated within a certain error margin.However, a huge amount of data and time is required to train a reliable neural network (NN).On top of that, training the NN requires a large variety of training data containing lots of different people and environments in order to reduce bias and overfitting.Another rather simple method to deduce the human height from radar measurements has been presented before in both [21] and [22].The gait parameter was substituted into a formula from [23] which was determined based on experimental data, in order to estimate the physical height of the human body.Thus, the presented method is highly dependent on the accuracy of the gait parameter estimation.Notably, the validation of the method in the former paper was limited to a single individual and was solely evaluated through simulations rather than empirical measurements.The method was tested using measurements on multiple targets in the latter paper, and the experiments had shown the formula can not be used directly for accurate body height estimation.Furthermore, in [22] a ML technique called random forest for regression was used to determine the height using gait parameters, resulting in slightly lower errors than the earlier method.In this work, the feasibility of estimating the human body dimensions directly from the post-processed radar data is studied.The determination of the body height parameter is conducted without utilizing any ML technique.This omits the need for large datasets, and related time and costs.Furthermore, the body height is estimated independently, without any direct empirical reliance on the gait parameters.Therefore, the estimation is less generalized compared to earlier work using typical relations between gait parameters and height.Also, the body dimension estimation performance will be evaluated for different aspect angles.
The contributions of this work are threefold: 1) Study of the angular dependency on gait parameter estimation accuracy when utilizing a monostatic radar.2) Study the feasibility of estimating body part dimensions from processed radar data and present a method of estimating the target's height directly from the radar data.The height estimation capability is evaluated under different target trajectories with respect to the radar.3) Present a novel pipeline for generating an individualized spatio-temporal human walking model from monostatic radar measurements.
The remainder of this paper is organized as follows.In the next section, the related background information is discussed.
Section III gives a description of the presented pipeline.Subsequently, in section IV the measurement approach is described in detail.In section V, results are presented and discussed, as well as the limitations of this work.Finally, conclusions are drawn and recommendations for future work are made in section VI.

II. THEORETICAL BACKGROUND
In radars, electromagnetic (EM) waves are emitted by an antenna, which are potentially reflected by objects in the environment.After reception of the reflected signal, the range, velocity and angular position of the objects with respect to the radar can be determined.The time needed for the wave to travel from the transceiver to an object at distance r and back is given by τ = 2r/c 0 , with c 0 the speed of light.Current highly accurate millimeter wave (mmWave) sensing devices employ frequency-modulated continuouswave (FMCW) signals with multiple-input multiple-output (MIMO) technology [24], [25], [26].An FMCW radar continuously transmits signals called chirps.The signal frequency of a chirp increases linearly over time with slope S. The chirp is characterized by the start frequency f c , the bandwidth B and the chirp duration T C (chirp cycle time).

A. RANGE ESTIMATION
Before being able to estimate the radar parameters (range, velocity, angle), the transmitter (TX) and receiver (RX) signals are mixed in a quadrature mixer and low-pass filtered.The output is an intermediate frequency (IF) signal, which is digitized by an analog-to-digital converter (ADC).The instantaneous frequency of the IF signal (also called beat frequency) is equal to the difference between the instantaneous frequencies of the two inputs of the mixer: This frequency is not a function of time and stays constant, as can be seen in Fig. 1. f T is the TX signal frequency and f R is the RX signal frequency.When multiple objects are present at distinct distances from the radar, the range fast-Fourier transform (FFT) can be taken over the chirp samples (i.e.fast time) to separate them.When the chirp signal is transformed into the frequency domain, for every detected object a peak is visible in the spectrum.The frequency axis, obtained when the IF signal is sampled N s times with a sampling frequency of F s by the ADC, is directly related to the range information: with 0 ≤ k < N s .Tc = N s /F s is the length of the sampled IF signal in time, also known as the ADC sampling window length.B = S Tc is the sweep bandwidth of the chirp signal.
The FMCW radar presents a resolution under which objects can not be differentiated anymore.The range resolution, which represents the minimum distance required to distinguish between two objects, can be determined by the following expression: A limit exists on FMCW radar range, called the maximum unambiguous range.Above this limit, the time duration for the EM wave to complete a roundtrip is longer than the chirp duration.The radar cannot ascertain whether the echo originates from the most recently transmitted chirp or the preceding one.This situation arises when the range exceeds the value of Note the dynamic range of Equation ( 2) is between zero and r max and does not include negative values.This is because the complex IF signal has a one-sided positive frequency spectrum with no power on the negative side of the spectrum.Therefore, the frequencies F s /2 < f IF ≤ F s which are aliased towards the negative frequencies (between −F s /2 and 0) can be interpreted as the positive frequencies above the Nyquist sampling frequency.Thus, the spectrum is extended to F s .

B. VELOCITY ESTIMATION
To determine the velocity v of an object, it is necessary to transmit at least two chirps.A sequence of equally-spaced chirps is called a chirp frame.The range-FFT may have peaks in the same location, but with different phases.Depending on the phase difference between two chirps φ = 4πvT c /λ, the velocity can be determined because the phase changes linearly due to equal spacing of the chirps in the frame: Here, λ is the wavelength of the chirp signal.By performing the Doppler-FFT over the chirps (i.e.slow-time), the velocities of different objects can be determined.For two objects at the same distance but different velocities, the phase between the received chirps will vary, enabling the differentiation and separation of the two objects.For a set of N c chirps spaced by T c , the velocity is given by with discrete-time angular frequency ω l = 2πl/N c and index is the frame duration time.
Analogous to the range resolution, the velocity resolution is also constrained.The velocity resolution is given by Furthermore, a detectable velocity range exists.This is given by A positive velocity means the object is moving away from the radar, and a negative velocity indicates the object is approaching the radar.

C. ANGLE ESTIMATION
When the radar antenna consists of at least two elements, it becomes feasible to perform direction of arrival (DOA) estimation.Due to the distinct locations of these elements, the measured distances from the same reflected signal exhibit slight differences.These differences arise from the time delay τ = d sin (θ)/c 0 associated with the DOA angle θ, where d represents the antenna spacing.Therefore, the received phase depends on the element location in space.
The phase difference between two elements is denoted as φ.It is assumed the DOA is equal to the direction of departure (DOD), i.e. the antenna aperture dimensions are small with respect to the distance to the reflecting object.In other words, a far-field assumption is made here, with the measured target residing at a distance larger than the Fraunhofer distance d Fraun = 2D 2 /λ, in which D is the maximum target dimension [27].The DOA can be computed as The angular resolution at boresight view with half wavelength antenna spacing is given by (10) in unit radians.N R is the number of receive antennas.
As shown in Equation (10), the angular resolution depends on the number of receive antennas.By increasing this number, the angular resolution is enhanced accordingly.This improvement can be effectively achieved by employing MIMO with N T transmit antennas.Multiplexing is required on the transmit antenna elements to make sure there is no simultaneous transmission.This will be explained in more detail in the following paragraph below.Subsequently, the beamformer utilized to compute the power spectrum as a function of the angle is explained.

1) TDM MIMO RADAR
When utilizing an N T × N R MIMO array antenna, a virtual array of N T × N R antennas can be synthesized, increasing angular resolution.For instance, in Fig. 2 the array has four RX elements with spacing d and two TX elements with spacing 4d.Due to the difference in travel distances of the signal from each transmit element, which is affected by the angle θ as 4d sin (θ), eight distinct channels are created, each with a unique time of flight.Therefore, the MIMO array can be treated as an eight-element uniform linear array (ULA).
The multiple transmit antennas are transmitting the chirp signal sequentially using time-division multiplexing (TDM).Within one frame, each TX transmits N c chirps.The chirp time duration is extended by a factor N T , leading to an increased chirp time of T c,TDM = T c N T .The improved angular resolution comes at the expense of reduced maximum velocity because the chirp time increases for the same frame duration.Alternatively, the velocity resolution is deteriorated if the chirp time is kept constant, resulting in a higher frame duration.

2) MVDR BEAMFORMING
When the DOA is estimated by taking the FFT over the spatial samples, the resolution is constrained by the number of elements.To overcome this limitation and achieve higher resolution, various methods can be employed including beamforming methods, subspace methods and parametric methods [28], [29].In this work, a beamforming method is adopted because it is sufficiently accurate for the intended purpose while having a relatively low computational complexity.Hence, only this method will be explained hereafter.A beamformer is a spatial filter which combines weighted element outputs linearly using a weight vector w.For an array with M elements, the weighted signal is given by The (•) * and (•) H operators represent the conjugate and Hermitian transpose, respectively.With N C time samples, the antenna observation signal of element m is given by x m (n) with n = 1, . . ., N C .This signal represents the slow-time signal (i.e.signal consisting of chirp samples in a frame), which is assumed to be narrowband.The narrowband assumption holds if the product of the signal bandwidth B s and the maximal travel time T between two array elements for an impinging plane wave B s T ≪ 1.In the system being utilized, this condition is met as the obtained IF signal, which is fed to the ADC, is narrowband.When all M elements are included, the observation signal is written as a vector x(n).The power spectrum for angle θ is computed by The beampattern displays a prominent peak in the spatial power spectrum, indicating the angular location where the highest power is concentrated.This corresponds to the direction from which most of the reflected power originates due to the presence of an object at that specific angle.The conventional beamformer, also known as the delay-and-sum beamformer, combines the output of each element coherently to obtain an enhanced signal.The downside of this method is the limited resolution when the number of antenna elements is small.The resolution of the minimum variance distortionless response (MVDR) beamformer is less limited by the number of elements, creating a sharp peak at the target DOA [30].This method was chosen because it has a higher spatial resolution than a conventional beamformer.On top of that, contrary to e.g. the MUSIC algorithm, the number of sources (i.e.scattering directions) does not have to be known and can vary arbitrarily.The MVDR beamformer is based on the following constrained optimization problem: min w H (θ) Rx w(θ ) (13) subject to w H (θ )a(θ) = 1, ( with a(θ) the steering vector and Rx the sample correlation matrix.The output power is minimized (first function), while ensuring that the signals from the desired direction θ remain undistorted (second function).So, power from all other directions is minimized while the beamformer concentrates only in one direction.
The sample correlation matrix is as follows: The solution of the optimization problem in Equations ( 13) and ( 14) gives an expression for the weight vector: The power function is then obtained by substituting Equation (16) into Equation (12): It is important to note that a matrix inversion of the sample correlation matrix is carried out, so this matrix needs to be full rank.In other words, the number of temporal samples must be larger than the number of elements (N > M ).The power function is computed for every range value, such that a range-azimuth plot is created.

D. SIGNAL MODEL 1) WAVEFIELD DESCRIPTION
Due to the far-field assumption, it is known the wave impinging on the antenna from every scattering point is a plane wave.The plane wave propagating and the coordinate system with respect to the array antenna can be seen in Fig. 3. θ is the azimuth angle, which is the angle between the x-axis and the orthogonal projection of the signal vector on the x-y plane.φ denotes the elevation angle, which is the angle between the signal vector and the z-axis.The Cartesian coordinates of an arbitrary antenna element are given by r = [x, y, z].The spatial frequency of the incoming wave is The spatial frequency is related to the temporal frequency through ||k|| 2 = ω 2 /c 2 0 .The steering vector for a wave coming in on an array of M elements is given by This expression for the steering vector can be substituted into Equation (17) to find the power function of the MVDR beamformer.

2) BACKSCATTERING
The FMCW emits a chirp of finite length.The signal propagates to the target and its energy is reflected by the target in K scattering points.Then, the signal is received by the radar again.If the operating frequency f 0 is high enough (i.e. the wavelength small compared to target dimensions), the scattering behaviour can be modelled as a summation of the different scattering responses [31].The signal model employed in this context [32] provides the received baseband signal, which can be mathematically represented as with ρ k the reflectivity function, R k the scalar range between the radar and the scattering point and k the phase of the baseband signal of the scattering point k.The reflectivity function contains factors like path loss, antenna gain, radar cross section (RCS), transmit power and other losses like atmospheric loss and system loss [33].So, the target is regarded as a number of point scatterers moving in space over time.Each scatterer induces a phase delay proportional to the range and a frequency shift proportional to the scatterer velocity.The phase of the signal reflected by scatterer k is The Doppler frequency shift induced by a single point scatterer moving over distance, varying the scalar range R k over time, is obtained by taking the time derivative of the phase: is the radial velocity of the point k in the moving object [34].So, if the range with respect to the radar stays constant over time, no Doppler shift will be visible.Contrary, if the object is moving directly towards or away from the radar, the highest Doppler shift is visible since the range changes the most over time.A positive Doppler shift means the target is moving towards the radar, and a negative Doppler shift indicates the target moves away from the radar.
The unit vector of the radar line of sight (LOS) direction is given by with R 0 the distance vector from the target to the radar, θ and φ the azimuth and elevation angles in the radar coordinate system, respectively.|| • || denotes the Euclidean norm, while (•) T represents the transpose operator.When the object is moving in space over a distance r towards a direction given by the unit vector n m , the observed micro-Doppler shift induced by the movement given in Equation ( 22) can also be written as: So, when the directions of n and n m are either equal or equal with opposite sign, the observed Doppler shift is maximum.When the movement unit vector is perpendicular to the radar LOS direction, the observed Doppler shift is minimum.
A distinct object is characterized by multiple scattering points.The different velocities of different moving objects result in different Doppler signatures, which can be used to distinguish between them.For example, the lower legs of a human body have a higher Doppler shift than the torso, as they move with greater velocity in the direction of the radar.However, the separation of velocity profiles for different body parts presents a challenge [35].This difficulty arises due to the received signal being a superposition of signals from multiple body parts, each with different Doppler shifts and RCSs.Additionally, multipath fading can introduce additional components to the received signal, making the task even more complex.Adding to the challenge, the radar can only measure the radial velocity, making it unable to directly measure the instantaneous velocity of individual body parts, as illustrated by Equation (24).

III. PROPOSED PIPELINE
The overall pipeline describing the methodology is illustrated in Fig. 4. The FMCW radar acquires measurements from the human subject.Subsequently, the acquired signal undergoes post-processing, where various radar processing techniques are employed, as elaborated in the following paragraphs.After post-processing, the extracted parameters are the human height dimension and the gait parameters.These parameters are then used to individualize the spatio-temporal walking model.In the present study, only the human height dimension is estimated; nevertheless, the proposed pipeline remains applicable when estimating dimensions of multiple human body parts, such as leg lengths, torso length and width, and so on.

A. POST-PROCESSING PIPELINE
Post-processing is a crucial part within the overall pipeline.Following the radar measurement, the raw radar data undergoes post-processing to extract valuable information.This involves a series of steps, as depicted in Fig. 5.In the following paragraphs, each block of the post-processing pipeline will be explained in detail.

1) RADAR DATA REPRESENTATION
The output data from the ADC is initially recorded in a binary format during the measurement process.After parsing, the data is represented comprehensibly in three dimensions.This three-dimensional representation takes the form of a cube, as depicted in Fig. 6.The first dimension entails the N s samples of each chirp IF signal, or fast-time.The second dimension involves the N c chirps in one frame, referred to as slow-time.Lastly, the third dimension represents the N R ×N T virtual receivers.As previously mentioned in Section II, the three dimensions can be converted to range, Doppler and angle, respectively.

2) STATIC CLUTTER REMOVAL
The radar system may encounter undesired echoes, referred to as clutter, which arise from contributions of objects other than the target.To ensure that only the behavior of the target is prominently visible in the acquired data, it is crucial to minimize the effects of clutter.Static clutter due to direct reflections from surfaces or objects such as walls and furniture have zero velocity and are therefore described by a DC component in the Doppler-FFT plot.A straightforward FIGURE 6. Radar cube.(From [25]).
yet effective approach to eliminate this DC component is by subtracting the average value from the signal values [35].The average signal value over slow-time in a single frame is computed and subtracted from the received signal in the frame.The number of chirps is given by N c and the signal is given by y[n].The computation of the averaged signal is as follows: 3

) RANGE-AZIMUTH ESTIMATION
Following static clutter removal, the resulting signal mainly consists of contributions from the non-static target of interest.The next step is to estimate the range r between the radar and the target.This is done by taking the FFT over the fast-time samples.The result is a range spectrum in which the peaks correspond to the principal wave reflections.Exploiting the signals of the antenna elements in the horizontal plane (xy plane), the azimuth angle θ with respect to the radar can be estimated using the theory described in Section II-C.After doing this for every range value, a range-azimuth plot is created.An example of a range-azimuth plot is shown in Fig. 7.This plot characterizes the range and azimuth DOA of the reflected waves.Subsequently, the spherical coordinates are converted into Cartesian coordinates to obtain the locations in the x-y plane (top view) for each frame: The center location of the target in the x-y plane is computed by taking the mean location of the detected target points.

4) TARGET DETECTION
When the range and azimuth angle of an object with respect to the radar are determined, its location can be estimated as described above.The location is expressed from a top view perspective (i.e.range-azimuth plot).The range-azimuth plot contains the targets, but is also likely to be surrounded by clutter and noise which are time-and position-variant.To effectively distinguish the target within this dynamic environment, a detector with a variable threshold is required, adapted to the background noise.The detector which is used is the constant false alarm rate (CFAR) detector.It is designed to maintain a predefined false alarm rate, ensuring that detections above a certain threshold are labelled as part of potential targets.A false alarm occurs when an object is erroneously detected as a target when it is, in fact, clutter or noise.This method is the most common algorithm for target detection, and it can be readily applied to the range-azimuth (or range-elevation) plot.Hence, it will be used in this work.
In a cell-averaging constant false alarm rate (CA-CFAR) detector the noise level is estimated using the samples of N cells reference cells surrounding the cell under test (CUT) [36], [37].The one-dimensional CA-CFAR detector can be seen in Fig. 8.The noise level is the average value of the reference cells.The scaling factor B is determined using a desired probability of false alarm P FA : The CUT is surrounded by a designated number of guard cells.These cells are intentionally excluded from the averaging process to ensure that the noise level remains uninfluenced by the presence of the target.Without the inclusion of guard cells, the computed noise level would include some of the target power as well because the target is usually spread over multiple cells.Instead of one dimension, the reference cells and guard cells are taken in two dimensions within this study.The guard cells encircle the CUT, while the reference cells surround the guard cells once more.

5) CLUSTERING
The process of identifying potential target cells is followed by clustering these cells to group together points belonging to the same object, enabling the differentiation of multiple objects while marking outliers.To achieve this, the densitybased spatial clustering of applications and noise (DBSCAN) algorithm is utilized [39].This algorithm is employed due to its ability to mitigate the presence of non-static clutter induced by multipath reflections, which could not be omitted before by the clutter removal.The clustering process relies on two input parameters, namely the neighbour search radius (ϵ) and the minimum number of points to identify a core point (MinPts).The algorithm starts with an arbitrary point p and finds all points within reach with respect to ϵ and MinPts.If the minimum number of points is not found within the search radius, p is a noise point (outlier).However, if the condition is met, p is a core point and a cluster is established.The algorithm iterates over the neighbouring points and retrieves all core points and border points belonging to the detected cluster.The algorithm repeats the search for all points (detected cells) until every point is labelled either as part of a cluster or as an outlier.
The detected and clustered points in the range-azimuth plot of Fig. 7 are highlighted in Fig. 9.The main cluster depicted in green at the center represents the target's location, containing P detected points, each characterized by unique values for r and θ.The smaller cluster on the right (colored in blue with red outliers) corresponds to clutter and is neglected automatically by the algorithm as it does not contain relevant information on the target of interest.This is done knowing only a single target resides in the room, corresponding to the bigger cluster.
After applying the DBSCAN algorithm to the points detected by CFAR, it is utilized once again on the center locations of the target.The center locations were obtained by taking the mean in both the x and y direction (i.e.dimensions of x-y plane shown in Fig. 3), representing the trajectory samples over the time frames.The clustering operation of the center locations makes sure (non-static) clutter is not part of the target trajectory and can be ignored.Furthermore, if a situation occurs in which multiple targets are present in the scene, these targets can be separated when they are not in proximity of each other.Nevertheless, this research only focuses on single targets, hence this situation is not taken into account.

6) TRAJECTORY FILTERING
From the previous steps, the trajectory samples (center locations) of the target are obtained.However, these samples may not all be accurately representing the torso location of the human due to measurement uncertainty.To make the temporal estimates more robust, filtering is applied to the system state x n , which contains both the location (x n , y n ) and velocity (ẋ n , ẏn ) of the target: First, the future target position is predicted using Newton's motion equations, relying on knowledge of the current state.The applied model assumes the velocity between two adjacent samples in the trajectory to be constant.This motion model is applicable because it is presumed the human walks with approximately the same gait speed during the whole trajectory.Nonetheless, the actual motion might deviate from the model, leading to a dynamic model uncertainty known as process noise.So, a prediction algorithm is required which takes into account both the measurement noise and the process noise.Both the measurement noise and the process noise are assumed to be independent zero-mean Gaussian white noise.The algorithm used to do this is the linear Kalman filter [40], [41], [42].This filter consists of two parts.The first part is the time update, in which the human location for the next frame is predicted by looking forward in time.This is done by using the state extrapolation equation, with the motion equations embedded in the state transition matrix F: with xn the estimated system state vector at time step n and x− n+1 the predicted system state vector for time step n + 1.Also, the uncertainty of the prediction is provided using the covariance extrapolation equation: with P n the covariance matrix of the state estimation at time step n, P − n+1 the predicted covariance matrix of the next state estimation and Q = E{w n w T n } the process noise matrix, with w the process noise vector.If no previous iterations are available, an initial estimate of both x0 and P 0 has to be made.This initial estimate of the location and velocity is based on the non-filtered samples.The initial location guess is taken to be the first sample of the non-filtered trajectory.The initial velocity guess is based on the first five non-filtered samples of the trajectory.To get this initial velocity estimate, the covered distance after five samples is divided by the elapsed time while covering this distance.
The second part of the Kalman filter is the measurement update, which combines this prediction with the actual measurement to correct its location.The first step is to compute the Kalman gain K n from the previously predicted covariance matrix and the observation matrix H: in which R n = E{v n v T n } is the measurement error covariance matrix, with v the measurement noise vector.The observation matrix relates the measurement vector z n to the state vector, i.e. z n = Hx n .The next step of the measurement update is updating the estimate with the measurement: Finally, the third and last step of the measurement update is updating the measurement uncertainty matrix After both the time and measurement are updated, the process is repeated iteratively in order to predict and update the system state vectors for all time samples.By performing these iterations, both the measurement and process noise are reduced.

7) GAIT FEATURE TIME-FREQUENCY REPRESENTATION
The subsequent step involves performing the Doppler-FFT such that for multiple time instants, the Doppler spectrum is generated.Due to target tracking, the range bins in which the target resides are known.Consequently, relevant range bins are selected to analyze the Doppler frequency of the body.The advantage of having knowledge of the target's range bins is that the Doppler signatures of objects in other range bins are not visible in the Doppler spectrum.This results in a cleaner Doppler spectrum in which the main contribution comes from the target of interest.To illustrate the change of (Doppler) frequency over time, the spectrogram is frequently used as a visualization method.It is computed using the short-time Fourier transform (STFT).The STFT is a moving window Fourier transform [43], in which the Fourier transform is taken over the samples inside the window which consists of a short segment of a longer time signal.The spectrogram is expressed as in which y(t) is the analyzed signal and w(t) the sliding window [32].The time-frequency representation is generated for the slow-time signal y(t) for every relevant range bin.This window is taken to be a Hamming window of length 256 with an overlap length of 194 and a FFT length of 256 as well.
As the body size exceeds the range resolution, it occupies multiple range bins.Therefore, the spectrograms of multiple range bins wherein the body potentially resides are summed to obtain the overall spectrogram of the body.The drawback of the STFT is the limited resolution trade-off.A shorter time window creates a better time resolution, but results in a worse frequency resolution and vice versa.Upon computing the spectrogram, the subsequent task is to extract a gait feature from it.Each body part exhibits a distinct micro-Doppler signature.As previously mentioned in Section II-D, separating Doppler profiles of different body parts within the spectrogram is challenging.Despite this, extracting the Doppler signature of the torso is relatively straightforward.This is primarily due to the torso's higher RCS compared to other body parts (e.g., limbs), resulting in the highest power Doppler frequency component representation.Additionally, the radar's approximate location is at torso height, causing the majority of the power to be directed towards the torso, given the directional antenna's orientation.To obtain the radial velocity (or Doppler frequency) of the torso from the measurements, the weighted mean of the spectrogram is taken [35].The power-weighted average frequency for time sample t is given by in which N is the size of the FFT during the STFT operation.So, k represents the k th frequency component.

8) ELEVATION ESTIMATION
Analogous to the range-azimuth estimation, a range-elevation plot can be generated as well.Instead of using array elements in the horizontal direction, this is done using a vertically orientated ULA in the z-direction.As for the range-azimuth points, CFAR detection is executed on the range-elevation plot.The z-coordinate of a detected point with range r and elevation angle φ in a coordinate system with the radar in the origin is given by

B. WALKING MODEL
The spatio-temporal walking model used in the pipeline is the well-known global human walking model presented by Boulic et al. [23].This kinematic model is based on experimental biomechanical data.The model describes the spatio-temporal motions of 12 human body parts, depending on a normalized gait velocity v g .The gait speed s g is normalized by the leg length L leg , which is the distance from the floor to the thigh in a static pose.The model assumes the walking motion is periodic, with phases having either double support of both legs or single support of one of the legs.One full gait cycle, also called a stride, is illustrated in Fig. 10.The cycle is expressed by the relative time and is a value between zero and one; it is given by and t rel,r ≡ tf g + φ g + 0.5 (mod 1), (39) for the left limbs and the right limbs, respectively.Here, t is the time instant, f g is the step frequency, and φ g is the cycle phase.The cycle phase has a value between zero and one, hence the modulo 1 included in the formula.Note the relative time of the limbs on the right side of the body has an offset of 0.5 compared to the left side, corresponding to half a gait cycle.
Positions of the body parts are computed relative to a human reference point.This is a point in the human body that has a certain translation and rotation with respect to the radar.Translations and rotations of body parts are described as a function of the velocity and the relative cycle time.The rotations and translations with respect to the body can be seen in Fig. 11.Translations include vertical T V , lateral T L and horizontal T H translations of the body around the human reference point.Rotations include forward/backward φ FB , left/right φ LR and torsion φ TO rotations of the body around the human reference point.The whole body also has a certain thorax angle φ T , representing the direction in which the human is moving.Furthermore, rotations of joints on the left and right side of the body are described.These joint rotations include flexing at the hip with angle φ H , knee with angle φ K , ankle with angle φ A , shoulder with angle φ S and elbow with angle φ E .For example, the rotation of the shoulder in degrees is given by φ S (t rel , v g ) = −3 − 9.88v g (0.5 + cos (2πt rel )) . ( For all the body joints rotations and translations expressions, the reader is referred to the original paper by Boulic et al.The human walking model was realized from scratch in MATLAB using the described expressions.An image of the skeleton of a walking human representing the model can be seen in Fig. 12, in which the 12 body parts are visible.The body parts are listed in Table 1, together with their default length.Note the head is modelled as a sphere.The default dimensions are based on the anthropometric data described in [44].These sizes can be scaled using parameters extracted from the radar measurement data.In this work, this is done using the total human height as a scalar.
The general walking model is adapted to each person using the gait parameters, the height parameter and the cycle phase.These are determined as described below.

1) GAIT PARAMETER ESTIMATION
Biomechanical walking parameters are extracted using information from both the estimated target trajectory and the micro-Doppler signature of the torso.The two gait parameters which are estimated and fed to the walking model are the normalized gait speed and the step frequency (convertible to cadence).Note that the step length can be computed from these two parameters.
As explained before in the post-processing pipeline, the Kalman-filtered trajectory is expressed by the system state vector x n for every frame.The system state vector contains the velocity in both the x and y direction.The gait velocity for frame n is derived from these values as Before it is fed to the model, the gait velocity needs to be normalized to a leg length of one meter such that the expressions in provided by the model are correct: The step frequency is directly determined using the torso Doppler signature obtained from the spectrogram.First, the moving average is subtracted from this signal such that the DC component is removed.Then, by taking the FFT of the resulting waveform, the step frequency is obtained by determining the highest peak in the frequency spectrum.Subsequently, the step length is readily computed using The walking model assumes a continuous motion.Therefore, not all combinations of the step frequency and velocity are realistic.If the parameters do not match to a certain extent, a not lifelike step length is computed and skidding of the feet will occur.

2) HUMAN HEIGHT ESTIMATION
The next step is to estimate the human body height from the radar data.To achieve this, the detected points in the range-elevation plot are used.A detected point with range r and elevation φ with respect to the radar is converted to the height h in Cartesian coordinates using with h r being the radar height with respect to the floor.After this conversion, an arbitrary number of scattering points are expressed within a certain height range.These points also include multipath reflections via the floor, the ceiling and the wall.These contributions are not of interest because these are not direct reflections from the human body to the radar.Therefore, these contributions will be omitted before making a height estimate.This is done by specifying upper and lower limits for both the range and the elevation angle, with r ∈ [R l , R u ] and φ ∈ [ l , u ].These bounds are based on the estimated ranges of the scattering points in the x-y plane, obtained from the horizontal radar.For P detected points in range-azimuth, a range vector r p,az ∈ R P×1 and azimuth angle vector θ p ∈ R P×1 can be constructed which describe the detected points.The limits for the points in range-elevation are then given by R l = min{r p,az } − 0.1, ( 45) The factor of 0.1 m in equation 45 is included to account for the distance between the two radars and for possible errors in range estimation.The maximum height which can be estimated is approximately 2.6 m, as shown in equation 46.
This value is chosen because people are generally shorter than this length.Due to the target's motion, the measured DOA φ of the backscattered EM waves exhibits fluctuations.These angle variations arise from phase changes caused by the target's movement [27], [45].As previously discussed, the target comprises multiple scattering points, each with changing aspect angles, leading to fluctuations in the relative amplitude and phase of the received echoes at the radar receiver.Consequently, the amplitude and phase of the signals received by the antenna also experience fluctuations.As a result, the signal power of some scatterers may be too low to be detected by CFAR, and the estimated DOAs may contain glint noise components, resulting in inconsistent values across different frames.Due to these uncertainties, it is not always expected that the maximum value of the height corresponds to the actual human height.In some frames, the maximum value may be too high, while in others, it may be too low.To obtain a reliable estimate, the height is determined based on the maximum height values from multiple frames.Firstly, the height maximum of each frame is calculated.Subsequently, a moving average of these maxima over 10 frames is computed.The highest value of this moving average is taken as the estimate of the human height.The default dimensions in Table 1 are then scaled using the estimated human height to achieve a more accurate model representing the target's true physical dimensions.

3) CYCLE PHASE ESTIMATION
The final component of the methodology involves estimating the cycle phase φ g , which attains values between zero and one.This parameter is determined through a method based on [46].At this stage of the pipeline, a model can be constructed with an arbitrary value of t rel by selecting different values of φ g , as indicated by equation 38.The model encompasses the trajectory, gait parameters, and human size parameters.Utilizing the model, the observed Doppler frequency can be computed over time using equation (22), given that the body parts' movements in 3D space are known with respect to the radar.These movements are described by the generated spatio-temporal model, enabling to determine the expressions for the micro-Doppler signatures of all body parts.The scattering point of a body part is taken to be in its center.Subsequently, the Doppler frequency of the torso, as simulated by the model f D,sim (t), is fitted to the Doppler frequency of the torso obtained from measurements f D,meas (t).A fit function is generated, computing the mean squared absolute error between the measured and simulated Doppler frequencies over time with T time samples.Then, the best value for φ g is selected corresponding to the minimum value of the fit function.This is described mathematically as

IV. MEASUREMENT APPROACH A. MEASUREMENT SCENARIO
Measurements were conducted to evaluate the performance of the presented pipeline.The measurement scenario is illustrated in Fig. 13.In the actual setup shown in Fig. 14, the trajectories were marked on the floor using tape, with blue tape indicating distances of 0.5 m.The radar was positioned on a table at a height of 1 meter.Three volunteers participated in the experiment, walking along five different predetermined trajectories.Each trajectory had a distinct angle with respect to the antenna beam.The volunteers moved outwards with respect to the radar and repeated this motion five times for each angle, walking at their self-chosen pace, step length, and step frequency.Upon reaching the end point, the target remained stationary until the radar measurement was completed.The volunteers provided consent to participate in the experiments.The walking angles indicated in Fig. 13 FIGURE 13.Measurement scenario.refer to the angles with respect to the y-axis, representing the broadside beam direction of the radar.However, the aspect angle varies depending on the human's location in relation to the radar.The aspect angle denotes the angle between the vector from the radar to the target and the vector in the direction of the target's movement.For each trajectory, the aspect angles are depicted as a function of the covered distance in Fig. 15.

B. EQUIPMENT AND EXPERIMENTAL SETUP
The measurements were performed using two Texas Instruments (TI) mmWave radars (two IWR1443 with IWR1443BOOST EVM board) [47].Each MIMO radar consists of 2 TX elements and 4 RX elements, realizing two virtual ULAs, each of 8 elements.One board is solely used for azimuth estimation, while the other is solely used for elevation estimation.Both boards operate simultaneously, with the total available bandwidth divided over the two boards.The radar boards are aligned and connected rigidly using two metal plates, as depicted in Fig. 16.The vertical displacement between the antennas of the two boards is 7.2 cm.Having two boards, from which one is 90 • tilted with respect to the other, allows the target to be measured in both azimuth and elevation with a good angular resolution.This comes at the cost of an increase in hardware and an increase of captured data which needs to be stored.The chirp emitted by the TI radar board and its configuration parameters is visualized in Fig. 17.The chirp is configured using the programmed parameters in Table 2.The chirp configuration results in the properties listed in Table 3.Several design choices were considered while configuring the chirp.The range resolution should be maximized, such that the human body scatterer locations can be localized as accurate as possible.Therefore, the sweep bandwidth should be as high as possible.In order to realize this, the available bandwidth should be divided efficiently over the two boards.The two radars used different start frequencies  to prevent interference.The frequency separation (500 MHz) is high enough to avoid interfering with each other.Using (1), the computed range corresponding to the intermediate frequency between the two radar boards is over 3 km, which is much higher than the maximum unambiguous range.On top of that, such a high IF will be low-pass filtered after the quadrature mixer.Therefore, the radars do not interfere with each other.The resulting sweep bandwidth is 3001.6MHz, creating a range resolution of approximately 5 cm with a maximum unambiguous range of over 10 m.The maximum range is considered sufficient since the application is indoor, and the dimensions of the room in which measurements were performed are smaller than 10 m.
A trade-off had to be made between the number of antenna elements and the velocity parameters.The aim is to measure the torso velocity of a walking person.Therefore, the maximum unambiguous velocity should not be too low, otherwise the velocity of a fast-walking person cannot be measured.The chosen chirp parameter results in a maximum unambiguous velocity of approximately 3.2 m/s.This is enough to capture the macro-Doppler gait velocity with micro-Doppler velocity variations of the torso.It is desired to have a virtual array which is as large as possible, such that the angular resolution is as good as possible.However, the number of feasible elements is limited by the velocity parameters due to TDM.Two virtual ULAs are realized, of which one array is horizontally orientated and the other is vertically orientated, enabling estimation of the azimuth and elevation angles, respectively.Each board uses four RX elements and two TX elements, creating a virtual ULA of eight elements.
The measurement trajectory distance is around 3.5 m, so a measurement time of 7 seconds was chosen.This is enough to capture the entire motion on every repetition.The periodicity is 0.1 seconds, so 70 frames were recorded.
Synchronous triggering of the two radar boards is crucial, such that the data frames are aligned.The two radar boards are controlled by a Cmod S7 FPGA board [49].In order to trigger a frame on the IWR1443, the FPGA gives a pulse to a specific pin.The active frame time matches the frame periodicity of the chirp configuration, and the active trigger time is chosen to be at least 25 ns, as indicated in the user's guide of the device [47].The trigger moment of the two boards should match exactly, such that no synchronization errors occur.Therefore, the physical length of trigger signal wiring is taken to be equal for both.The two radar boards are each connected to a PC using several provided cables.Capturing and saving measurement data is done using TI's mmWave Studio software.

C. CAPTURING THE GROUND TRUTH
A ground truth of the gait parameters is required to validate the measurements.The ground truth data of the step length was created by fastening an erasable marker on the shoe of the volunteer, leaving a dot on the ground after every step.After walking the designated trajectories, the stride lengths (i.e.distance between consecutive dots) were measured using a measuring tape.To measure the distance, the dots were projected onto the straight white trajectory line, and the distance from the starting point towards the projected point was measured.By measuring in this manner, the step width is omitted from the measurement, and the distance solely consists of the step length.The marked dots have a certain mean and standard deviation (SD) with respect to the initial location of the individual (starting point of trajectory).Consequently, the step lengths can be described by a mean and standard deviation.Moreover, the ground truth of the step frequency was captured using the inertial measurement unit (IMU) of a smartphone, which was attached to the ankle of the volunteer.The smartphone device is an OnePlus 8T, which contains a Bosch BMI260 series IMU [50], using a sampling rate of 200 S/s.The IMU measures the linear acceleration of the ankle, from which its frequency corresponds to the step frequency.Again, there is a certain variability and mean over the five repetitions in which the step frequency will be expressed.The ground truth velocity was computed by taking the product of the frequency and the step length (Eq.43).For the step lengths the mean and variability is known, but no information was saved on which dot belongs to which repetition.Therefore, the standard deviations of the ground truth velocities could not be computed.Thus, the gait velocity will only be expressed by a mean value.
The ground truth properties of the three volunteers are listed in Table 4.The distribution of the parameter values is assumed to be normal.Person 1 has a similar step length compared to person 2, but with a slightly lower total standard deviation.Person 3 has a significantly longer step length.Furthermore, person 1 and person 3 have comparable mean step frequencies, with person 3 having almost double the total standard deviation in step frequency.Person 2 walks with a higher cadence compared to the others.The average speeds of person 2 and person 3 are similar, while person 1 walks slower.The ground truth heights of the volunteers are 1.82 m, 1.61 m and 2.01 m, respectively.

A. EXPERIMENTAL PIPELINE VALIDATION
First, a validation process is conducted to ensure the functionality of various crucial components within the proposed pipeline.These components are the trajectory estimation, step frequency estimation, height estimation, and the estimation of the cycle phase.

1) TRAJECTORY ESTIMATION
Fig. 18 illustrates five instances of localization samples before and after Kalman filtering for various walking angles.The observed positions of the trajectories are prone to errors, but the implementation of the Kalman filter reduces these errors, resulting in continuous trajectories without abrupt outliers.It should be noted that only a fraction of the initial trajectory was extracted, excluding the acceleration and deceleration phases.The resultant trajectories align closely with the expected walking paths (indicated by striped lines), validating the goal of the approach.The filtered trajectory still has a certain deviation from the straight line.Note that the participants do not precisely follow the striped line either, since it was an indicative line to follow rather than a strict trajectory.It is impossible to walk straight without having minor deviations from the line.

2) STEP FREQUENCY ESTIMATION
For the frequency estimation, the initial step involved computing the spectrogram of the extracted trajectory, from which the step frequency was then estimated by examining this Doppler signature.Fig. 19 displays the spectrograms for various measurements taken at different angles, along with the corresponding Fourier transforms of the Doppler signatures.Also, a spectrogram without clutter removal is added and shown in Fig. 19a.The spectrograms with clutter removal show low backscattering at zero Doppler frequency even though there were multiple cluttering objects in the room, such as tables and lab equipment.This means the static clutter removal works, and it is further supported by the influential zero-Doppler component evident in Fig. 19a, which impacts the weighted mean.An observable sinusoidal pattern is evident in the spectrograms depicted in Figs.19b and 19d.This oscillating behavior represents the cyclic micro-Doppler gait movement, wherein the body undergoes slight back-and-forth motion, thereby introducing an additional Doppler shift on top of the macro-Doppler shift resulting from the individual's coarse velocity.The step frequency, as depicted in Figs.19c  and 19e as the main frequency component, corresponds to the micro-Doppler frequency of the torso.Notably, when the aspect angle between the radar and the target is low (approaching 0 • ), a distinct Doppler signature is observed due to the high radial velocity detected by the radar.However, with an increase in aspect angle, both the radial velocity component and the Doppler frequency decrease.As shown in Figs. 19f,19h,and 19j, the spectrograms indicate a weighted mean approaching the zero Doppler line when the aspect angle approaches 90 • .Nevertheless, the micro-Doppler frequency swing remains present due to the radar beam never being continuously perpendicular to the movement normal vector.As such, the step frequency can still be derived from the frequency spectra displayed in Figs.19g, 19i, and 19k.As the angle approaches 90 • (lower plots), the prominence of the stride frequency becomes more pronounced, and the step frequency becomes less noticeable.The stride frequency, which is half the step frequency, characterizes the frequency of an entire gait cycle involving a step with the right leg and a step with the left leg.The frequency of movement in the lateral plane is half that of the movement frequency observed in the sagittal plane.Consequently, the spectrogram exhibits lateral plane motion in contrast to the previous sagittal plane motion.

3) HEIGHT ESTIMATION
The computation of the heights of detected points in the range-elevation plot was carried out following the previously described methodology.Given that the volunteers exhibit varying body heights, it is expected that the heights of detected points would also vary accordingly.In Fig. 20, histograms depicting the body heights of three individuals walking in trajectory 1 are presented.The histograms correspond to human height dimensions of 1.82 m, 1.61 m, and 2.01 m for the three persons, respectively.The lower part of the histogram does not display any detected points for all three individuals.This absence of detection is likely attributed to the fact that the feet have a low RCS and may move out of the range limits set for detection due to their continuous motion during walking.Moreover, the limited field of view (FOV) of the radar in the elevation plane results in lower received power from the feet.Another property for all three graphs which stands out is the high backscattering at the radar height, which is approximately 1.07 m.This is due to the high RCS of the torso and the antenna gain being the highest at an elevation angle of 90 • .
The height points corresponding to person 1 are presented in Fig. 21 as a function of the frame number, where low frame numbers indicate the beginning of the trajectory and high frame numbers indicate the end of the trajectory.Fig. 21a displays the outcome for trajectory 1, wherein the human walks with an aspect angle of 0 • in front of the radar.At the start and the end of the trajectory, the number of detected points on head height is lower compared to the middle part.The difference in detected points can be attributed to the limited FOV at the beginning of the trajectory, leading to decreased backscattering when the target is in close proximity to the radar.For the end of the trajectory, the low amount of detected points on head height can be attributed to the lower backscattering power due to increased path loss.For trajectory 5 (Fig. 21b), the first and last parts of the trajectory show no detected points at all.The cause of this is again the limited FOV, but now in the horizontal plane.At the start and end of the target's trajectory, the azimuth angle θ for the vertically orientated array is too high (or too low) to be able to realize sufficient backscattered power for a successful detection.

4) CYCLE PHASE ESTIMATION
The cycle phase is estimated after computing the mean squared error between the simulated and measured Doppler frequencies of the human torso.The best fit and the measured Doppler frequency are shown in Fig. 22 for trajectory 1, trajectory 3 and trajectory 5, with measurements performed on person 1. Across all three trajectories, the measured Doppler frequency exhibits a consistent temporal behavior when compared to the Doppler frequency obtained from the fitted simulation.This confirms the velocity, step frequency and cycle phase estimations give a well-approximated simulated Doppler signature.The achieved accuracy is considered sufficient for estimating the cycle phase utilizing the model-fed gait parameters.This claim is based on the observation that the shapes of the frequency components of both the simulated and measured Doppler curves remain consistent across all presented results.

B. EXPERIMENTAL RESULTS OF PERFORMANCE 1) GAIT PARAMETER ESTIMATION
The gait parameters were estimated by applying the proposed algorithm on the experimental data.The computed values were compared to the ground truth data of the corresponding trajectory, ranging from number 1 to 5. The resulting estimation errors for velocity, step frequency and step length are presented in Fig. 23.
Figs. 24a, 24b and 24c show that the velocity was assessed with an overall error margin of less than 0.17 m/s for all single repetitions (red circles).The accuracy of the velocity estimation is not evidently dependent on the angle.The estimation for persons 1 and 2 was carried out with an overall maximum error of 0.12 m/s.So, person 3 displayed a relatively higher error with more spread compared to other participants.This is due to the fact that person 3 has a taller body with longer legs, resulting in long step lengths.Consequently, this individual was unable to fit as many steps within a single trajectory as the others before hitting the wall due to the confined space.As a result, this person had to end the walk earlier, thus covering a shorter distance.Due to the reduced covered distance, the accuracy of velocity estimation was lowered since the performance of the Kalman filter is dependent upon the number of available samples.In order words, the distance between acceleration (start of the trajectory) and deceleration (end of the trajectory) is shorter for person 3. Observed over five repetitions, the velocity is estimated with an error lower than 0.077 m/s for all angles for the three volunteers.
The depicted results in Figs.23d, 23e and 23f generally show an increase in the error of step frequency estimation as the angle relative to the radar increases.This trend is observed across all three individuals.For persons 1 and 2, the step frequency at angle 0 • , 22.5 • , and 45 • could be estimated with an error of less than 0.066 Hz.However, for the last two trajectories which have higher aspect angles, the step frequency error increases.This is in line with the results shown in Fig. 19 because here it is visible that the step frequency component is harder to derive for higher aspect angles as well.Extreme values of errors for persons 1 and 2 go up to 0.175 Hz and 0.204 Hz, respectively.This shows that for some repetitions, the spectrogram is changed in such a way that the step frequency is hard to extract.Nevertheless, the mean error can be reduced by taking the mean over the five repetitions (blue asterisks), since the extreme outliers are averaged out.The step frequency error of person 3 is below 0.063 Hz at an angle of 0 • or 22.5 • and increases towards a maximum of 0.177 Hz for higher aspect angles.Overall, the error shows similar behaviour as a function of the angle compared to persons 1 and 2. However, a remarkable outlier is visible at the second trajectory (22.5 • ).This error was caused by an inaccurate trajectory estimate in the x-y plane.Therefore, the selected range bins of the spectrogram were not the range bins where the target actually resided.Thus, a part of the spectrogram was generated for range bins without any human Doppler signatures, and the estimated step frequency is inaccurate.
As described in the methodology in section III, the step length was determined from the radar data by dividing the gait velocity by the step frequency.Consequently, errors that appear in these parameters will propagate through to the estimated step length parameter as well.This effect is visible in the data shown in Figs.23g, 23h and 23i.When evaluating the step length error for the first three trajectories for persons 1 and 2, respectively, the step length error is below 6.2 cm and 5.2 cm for all repetitions.For the last two walking angles, the errors in step frequency are relatively high and these errors reappear in the step length error as well.For these two walking angles, persons 1 and 2 have a maximum error of 9.4 cm and 9.7 cm, respectively.Moreover, for person 3, the step length errors are higher compared to the other two individuals.The cause of this is the relatively high error in both velocity and step frequency, which again influences the step length estimation.Only for the angle of 0 • the error is low because the velocity and step frequency errors are small here as well.The mean error of the step length for all three individuals averages out the outlying errors, making the mean error approach zero.The mean step length errors are maximally 5.0 cm, 4.3 cm, and 9.3 cm, respectively.
Thus, to make the gait estimation more accurate, outliers with higher errors can be averaged out if the trajectory is long enough.In order to show this effect, the gait parameters are estimated over the five measurements instead of over one measurement.The results are listed in table 5.The errors listed in this table were also plotted in Fig. 23.

FIGURE 23.
Estimation error values compared to ground truth.The errors for the velocity estimation, the step frequency estimation and the step length estimation are presented in the first, second and third row, respectively.The first column corresponds to person 1, the second column to person 2 and the third column to person 3.

2) HUMAN HEIGHT ESTIMATION
The height of the participants was estimated using the described method.The results are shown in Fig. 24.For person 1, the height is estimated relatively accurately compared to the other persons, especially at low aspect angles.For this specific individual, most of the estimated heights have an absolute error below 8.2 cm.Nevertheless, there are a number of extreme cases (4 out of 25) in which the errors are higher than 14 cm.One possible explanation for occurring errors is the limited resolution in both range and angle.The range resolution is 5 cm, while the angular resolution is 14.3 • .The resolution in the Cartesian plane depends on the resolution of the polar plane.This follows from equation (37).With increasing range, the area of range-angle cells in the Cartesian plane increases.Thus, resolution is reduced since more points in the plane are assigned to a single combination of r and φ.On top of that, the target is moving, so the scattering points are not stationary.So, the elevation angles of the scattering points on top of the head are not constant over time.Angular noise is introduced due to the changing phases of the echoes arriving at the receiver.
For person 2, the height is generally overestimated, especially at low aspect angles.The errors are maximally 15.9, 8.1, 6.9, 3.4, and 2.8 cm for the five angle values, respectively.So, there is an evident decline in height estimation error when the angle gets higher.The amount of points detected at and above head height is higher for low aspect angles.A possible explanation for this might be the increased backscattering power due to a higher RCS because of the shape of the head.Then, the CFAR detector detects more points around the actual scattering center.Compared to the other two persons, person 2 has a lower incident angle, which enhances this effect.The height estimation of person 3 always underestimated, with errors appearing to be even more than 20 cm for multiple measurement repetitions.There are a number of possible reasons for having such high error values.To start with, the FOV of the vertically-orientated radar is limited to a 3dB-beamwidth of ±28 in the elevation plane [47].So, the received power from high elevation angles corresponding to the location of the head is relatively low compared to the power received from boresight.Moreover, the RCS of the top of the head is low due to its shape.Incoming waves will be reflected to many different directions other than the direction where the radar is located due to diffuse reflection.Also, the backscattering from the head is low due to a high angle of incidence on the head.Contrary, the torso has a low angle of incidence and a smoother surface, in which the waves are reflected specularly to the radar.
The mean absolute error (MAE) of the height estimation is shown in Fig. 25.Evidently, the overall height estimation accuracy of person 3 is the worst.As explained before, this volunteer is too tall for the system to make an accurate height estimate for a number of reasons.Person 1 and person 2 show similar accuracy in MAE.However, for person 1 the error is higher for high angles, while for person 2 the error is higher for low aspect angles.Overall, the MAE for persons 1 and 2 stays below 10.9 cm.

C. COMPARISON WITH CURRENT STATE-OF-THE-ART
To evaluate the outcome of this research, it is compared to previous (state-of-the-art) papers.The performance of the gait parameter estimation algorithm in this work is compared to a number of references listed in Table 6.The listed parameters are the mean parameters taken over a certain measurement time (and covered distance) that were reported in the relevant work.The current work has errors in gait parameters which are in the same order of magnitude as previous work.It previous work on velocity parameter estimation when the target is walking in a straight trajectory towards the radar.The step length and step frequency errors in [8] are relatively low compared to other work, including this research.This is because the measurement scenario differs a lot from the scenario from the others; the target was walking on a treadmill for two consecutive minutes.Therefore, a lot of data is captured because the walk takes a long time.If the mean gait parameters are then computed, errors will be averaged out over an extensive time and the results approach the ground truth values.In other works, the walking distance varied between 5.2 and 10 m, which is much shorter, resulting in less accurate estimates.In the current research, the walking distance was short as well (approximately 3 m without acceleration and deceleration phases).The error values in the table are based on 5 repetitions, so the total walking distance is about 15 m.No previous work exists in which the gait parameter estimation performance is analyzed as a function of the aspect angle.So, the error for high aspect angles in this research can not be compared to similar outcomes of previous work.Nevertheless, it can be noted that the velocity and step frequency for trajectories with high aspect angles deteriorate with respect to low aspect angles.However, the gait parameters could be estimated with similar accuracy compared to results in earlier work for an aspect angle of 0 • .Previous works on body dimension estimation or human keypoint estimation mainly rely on the use of ML techniques.The performance of the gait parameter estimation algorithm in this work is compared to a number of references listed in Table 7.A NN is a much more powerful tool compared to the technique used in this work.It has lower height estimation errors compared to the approach presented here.However, it takes a lot of training data comprising different persons and environments and, in general, it lacks explainability in how the results are achieved.The method presented here can be applied to arbitrary radar data, but requires refinement.Its performance can be improved if both the FOV and the resolution of the radar are increased.Moreover, the method based on the empirical formula stated in [22] is far from accurate and should not be used.

D. LIMITATIONS
This study has several limitations.The first limitation is the number of participants, repetitions, and trajectories.If measurements had been performed on a larger number of people with more repetitions, more data would be available.This would have a positive effect on the reliability of the research.Now, conclusions have to be drawn based on only three participants performing three repetitions for five different trajectories.Additional participants and trajectories would be useful to verify the pipeline with more supportive data.Greater participant variability would lead to a wider range of step lengths, gait speeds, and body dimension parameters being introduced.In addition, the current work was limited to measuring one participant in the room.However, in realistic scenarios, multiple people could be present in the room.Furthermore, if large amounts of data had been captured, this could have been used to develop and train a NN which could capture the dimensions of multiple human body parts.It should be stated that an attempt had been made to infer human key points from radar data using the open-source model proposed in [20].However, the model was only trained on two persons in a fixed environment and the input requires being a point cloud.Hence, with the data from the current work, no accurate key point estimations could be made.
Secondly, the room in which the measurements were performed had limited space.Therefore, only a short trajectory could be measured.On one side, this is realistic since rooms where the system can potentially be deployed do not always have to be big.On the other side, the gait estimation accuracy is limited because the gait is measured for only a short period of time.A more reliable estimate could be made if the individual is measured for a long period.This has been confirmed by earlier work.
Thirdly, the hardware used is limiting the ability to estimate the human body dimensions.The radars used were two separate FMCW radars, both having a virtual ULA of eight elements.The radars work independently and are only able to estimate the angle in one plane (i.e.either azimuth plane or elevation plane).If a more sophisticated virtual array with more elements had been used, a two-dimensional virtual array could be realized.For example, if the resources would allow, this virtual array could have been 8×8 elements (comparable to the one used in [19]) instead of two arrays of 8 × 1.Then, the location of every detected point could be estimated in both azimuth and elevation because beamforming can be realized for θ and φ instead of only for either one of the two.Then, a point cloud could have been realized, from which the dimensions of more human body parts can be potentially estimated instead of just the height.
Another limitation is the method of computing the step length from the radar data and the velocity from the ground truth data.This was both done by using the relation in Eq. (43).Consequently, the computed gait parameter is dependent on the other two gait parameters.Therefore, inaccuracies inherent in the previously obtained parameters propagate into the computed parameter itself.This decreases the precision of the estimated step length and reduces the reliability of the established ground truth velocity.
Moreover, the method of measuring the ground truth step length does not enable to link the marked dots to a specific repetition.The step lengths were measured only after the five consecutive repetitions were performed.Therefore, the dots were not marked with the repetition number and could not be retraced.The step lengths could only be expressed using a mean value with a certain variation, instead of the absolute values between the marked steps.

VI. CONCLUSION
This research proposes a novel pipeline for developing a human gait model adapted to a measured person utilizing a monostatic FMCW radar.The methodology was validated and essential components of the pipeline were verified.Furthermore, the study explored the gait parameter estimation error under varying target locations and orientations concerning the radar.Leveraging experimental data from three individuals walking in five different trajectories, the study successfully estimated three gait parameters.However, it was observed that the accuracy of gait parameter estimation diminishes with an increase in the aspect angle of the target.This is because the Doppler signature is less expressed for higher aspect angles due to the radar beam and the vector of the movement direction approaching orthogonality.Furthermore, single repetitions exhibited higher parameter errors compared to the average over five repetitions.Consequently, enhancing the trajectory length or increasing the number of repetitions would be advantageous to improve parameter estimation performance.Compared to previous work employing similar scenarios, this work has similar accuracy in velocity, step length, and step frequency.Additionally, the research examined the feasibility of estimating body part dimensions from processed radar data.Using the hardware employed in this study, the only dimensional parameter that could be estimated was the total human height.The sparse nature and the noise of the detected points with noise make it hard to determine the height accurately.The performance of body part dimension estimation could be potentially enhanced by increasing the number of antenna elements, the FOV, and the resolution in range and angle.Nevertheless, this improvement is constrained by the limited available resources (time, frequency, and space) that must be carefully considered.Compared to earlier work on human keypoint estimation, ML techniques have shown to be more accurate than the technique used in this work.
Several questions arise from this research, warranting future investigations.First of all, the body part sizes in the model were scaled based on a single parameter, namely the human height.For future research, a method should be developed to extract dimensions for each individual body part.Then, the physical dimensions of the body parts in the gait model are more representative because they all will be independently determined instead of having a certain ratio between them.Utilizing a NN may be effective, given the promising results showcased in Table 7. Furthermore, future efforts could be directed towards creating a more sophisticated walking model.The current model used in this study, presented in [23], was derived from experimental data, thereby limiting its level of individualization in terms of flexing of joints.Nevertheless, gait is a complex and distinctive human characteristic, with joint rotational properties that may vary significantly from person to person.Some individuals might not exhibit the typical gait due to physical disabilities, resulting in deviating motions.It would be valuable to have a gait model which is individualized according to the exact movement of the target, instead of having some generalized properties.Another aspect of the research which demands future investigation is the hardware and its deployment.The radar setup used in this study limits the performance of the system in terms of height estimation.Future research could analyze the influence of the size of the virtual array and investigate the effects of different radar deployment heights and tilt angles.This analysis may lead to improvements in the overall system performance.Finally, the latency of the processing pipeline and the trade-offs for its improvement can be investigated as well.The present study did not address pipeline latency concerns, because the data was exported and processed offline.However, in future applications where real-time functionality is necessary, latency becomes an important aspect and should be improved.

FIGURE 9 .
FIGURE 9. Detected and clustered points in the range-azimuth plot of Fig. 7.

FIGURE 11 .
FIGURE 11.Human with twelve body parts and its rotations and translations.(From[32]).

TABLE 4 .
Ground truth parameters of the volunteers.

FIGURE 19 .
FIGURE 19.Spectrograms (a) without and (b,d,f,h,j) with static clutter removal of torso Doppler signatures of person 1. Fourier transforms of weighted mean of corresponding spectrograms are depicted in (c,e,g,i,k).The figures correspond to measurements for incrementing angles of 0 • , 22.5 • , 45 • , 67.5 • and 90 • , respectively.

FIGURE 20 .
FIGURE 20.Examples of histograms of heights aggregated over all frames for (a) person 1, (b) person 2 and (c) person 3 walking on trajectory 1.

FIGURE 21 .
FIGURE 21.Detected points of person 1 and the corresponding estimated height versus the frames for (a) trajectory 1 and (b) trajectory 5.

FIGURE 25 .
FIGURE 25.Mean absolute error of height estimation.

TABLE 5 .
Mean gait parameter estimation for different angles and persons and errors with respect to ground truth.

TABLE 6 .
Comparison of gait parameter estimation in this work to previous work.

TABLE 7 .
Comparison of human height parameter estimation in this work to previous work.