Analysis of the Recent AI for Pedestrian Navigation With Wearable Inertial Sensors

Wearable devices embedding inertial sensors enable autonomous, seamless, and low-cost pedestrian navigation. As appealing as it is, the approach faces several challenges: measurement noise, different device-carrying modes, different user dynamics, and individual walking characteristics. Recent research applies artificial intelligence (AI) to improve inertial navigation's robustness and accuracy. Our analysis identifies two main categories of AI approaches depending on how the inertial signals are segmented: 1) using human gait events (steps or strides) or 2) using fixed-length inertial data segments. A theoretical analysis of the fundamental assumptions is carried out for each category. Two state-of-the-art AI algorithms (SELDA, RoNIN), representative of each category, and a gait-driven non-AI method (SmartWalk) are evaluated on a 2.17-km-long open-access dataset, representative of the diversity of pedestrians' mobility surroundings (open-sky, indoors, forest, urban, parking lot). SELDA is an AI-based stride length estimation algorithm, RoNIN is an AI-based positioning method, and SmartWalk is a gait-driven non-AI positioning method. The experimental assessment shows the distinct features in each category and their limits with respect to the underlying hypotheses. SELDA, RoNIN, and SmartWalk achieve 8-m, 22-m, and 17-m average positioning errors (RMSE), respectively, on six testing tracks recorded with two volunteers in various environments.


I. INTRODUCTION
THE development of pedestrian navigation solutions has been an active field of research for almost two decades. The first technology employed is the global navigation satellite system (GNSS) working in open-sky outdoor conditions. Indoors, radio-beacon-based technology is deployed to locate pedestrians with ranging or mapping of signal footprints. These technologies are now widely commercialized but operate only in equipped infrastructure involving high installation and maintenance costs. Other approaches, aiming at fully autonomous navigation, are still being developed. They rely on image processing with simultaneous localization and mapping (SLAM), structure from motion or odometry methods, and inertial signals processing to infer the pedestrian's dynamics using wearable sensors. Inertial pedestrian navigation is very attractive because it does not require infrastructure, is operational with low-cost sensors that can be attached to several locations on the person's body (wrist, trousers pocket, etc.), and complies with the European General Data Protection Regulation recommendation promoting privacy-by-design technologies.
This work involved human subjects or animals in its research. Approval of all ethical and experimental procedures and protocols was granted by Comité pour les recherches impliquant la personne humaine.
Digital Object Identifier 10.1109/JISPIN.2023.3270123
However, pedestrian inertial navigation faces challenges. It accumulates positioning errors over time due to low-cost sensor noise. It lacks robustness when the pedestrian motion mode changes (slow/normal/fast walking, staircases, stationary, etc.) or when the wearable fixing point varies (handheld sensors, in the trouser/vest pocket, etc.). It also fails to adapt to individual walking gait characteristics (disability, injury). Artificial intelligence (AI) is interesting for simultaneously addressing all these varying conditions and thus providing improved robustness and positioning accuracy. Consequently, it is increasingly applied to inertial pedestrian navigation research. Less explicit than traditional physics-based approaches, it raises design and hyperparametrization issues.
This article aims at analyzing the recently proposed approaches in the field of pedestrian navigation with inertial wearable sensors to identify the key features that contribute to the success or limitations of robust and accurate positioning. It extends the previous conference proceeding [1], which classifies AI-based inertial pedestrian navigation methods into two main categories depending on the inertial signals segmentation. A detailed analysis of the fundamental hypotheses in each category, their likelihood, and their impact on positioning performance is conducted. A comparison with a non-AI pedestrian navigation approach is added. The experimental performance assessment is conducted with handheld inertial sensors on a larger open-access dataset in three different environments: 1) urban, 2) forest, and 3) a shopping mall parking lot, including indoor and outdoor parts and staircases.
The rest of this article is organized as follows. Section II presents the AI-based pedestrian navigation state-of-the-art methods, namely human-gait-driven and sampling-frequency-driven AI methods, along with their underlying hypotheses. The three methods selected for evaluation are described in Section III. Section IV is dedicated to the experimental evaluation and comparison of the three selected methods on pedestrian tracks. Finally, Section V concludes this article.

II. STATE-OF-THE-ART ON CURRENT AI METHODS FOR PEDESTRIAN INERTIAL NAVIGATION
AI-based methods for pedestrian inertial navigation can be classified into two classes [1]: (A) human-gait-driven and (B) sampling-frequency-driven methods. The first category is inspired by the nature of human walking and the inertial signals are segmented with the user's gait (step or stride) events, whereas the second category segments inertial data into fixed-length sequences, usually overlapped.

A. Human-Gait-Driven AI Methods
The inertial data sequences are segmented using the gait events (step or stride instants) and processed to estimate the gait vector of each segment. Gait events are derived from the cyclic inertial signals patterns. For instance, Abiad et al. [2] consider that a peak in the acceleration norm or a valley in the angular rate norm represents a step instant, regardless of the device's location. However, gait detection under irregular hand movements remains challenging.
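The peak-based idea described above can be illustrated with a minimal sketch. It is not the detector of [2]: the threshold, the minimum gap between peaks, and the synthetic signal are hypothetical values chosen only to show how step candidates can be extracted from the acceleration norm.

```python
import math

def acceleration_norm(samples):
    """Return the Euclidean norm of each (ax, ay, az) sample."""
    return [math.sqrt(ax * ax + ay * ay + az * az) for ax, ay, az in samples]

def detect_step_candidates(norm, threshold, min_gap):
    """Return indices of local maxima above `threshold`, at least `min_gap` samples apart."""
    peaks = []
    for i in range(1, len(norm) - 1):
        is_local_max = norm[i - 1] < norm[i] >= norm[i + 1]
        if not is_local_max or norm[i] < threshold:
            continue
        if peaks and i - peaks[-1] < min_gap:
            # Too close to the previous peak: keep only the larger of the two.
            if norm[i] > norm[peaks[-1]]:
                peaks[-1] = i
            continue
        peaks.append(i)
    return peaks

# Synthetic signal (assumption): gravity plus a 2 Hz oscillation sampled at 100 Hz.
fs = 100
signal = [(0.0, 0.0, 9.81 + 2.0 * math.cos(2 * math.pi * 2.0 * k / fs)) for k in range(300)]
steps = detect_step_candidates(acceleration_norm(signal), threshold=10.5, min_gap=30)
```

On real handheld data, irregular hand movements create spurious peaks, which is why a fixed threshold of this kind is generally not sufficient.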
Because estimating the step/stride length and estimating the walking direction rely on different features, the two tasks are often treated separately.
An alternative to feature engineering is end-to-end regression with a sequence of inertial measurements over a gait interval. Feature extraction is usually performed by a deep network. Gu et al. [13] use two stacked autoencoders, Yan et al. [14] use a deep belief network built with multiple Gaussian-Bernoulli restricted Boltzmann machines [15], and the authors in [16] use an LSTM [17]; the latter model is detailed in Section III. The regression task is done by one or several dense layers.
The main challenges of stride/step length estimation are the changing device locations, user dynamics, and individual walking characteristics. Classifying the device's location can improve robustness [3], [4]. Customizing the model for each individual is another way to improve accuracy [18].
2) Walking Direction Estimation: Estimating the user's walking direction with wearable inertial sensor measurements is a complex problem because the misalignment between the device's pointing direction and the user's walking direction is not necessarily constant. A common strategy is to express the inertial measurements in the navigation frame via device attitude tracking.
Liang et al. [6] used the device's yaw angle and magnetometer measurements as features to infer the user's walking direction using an OS-ELM network. The pedestrian heading estimation method in [19] does not need explicit device attitude tracking; instead, it performs data augmentation and adaptive alignment with a learnable spatial transformer network to make the model invariant to rotations of the inertial data.

B. Sampling-Frequency-Driven AI Methods
This branch of AI methods considers pedestrian positioning as an end-to-end problem. Inertial measurement sequences are segmented into fixed-length segments, usually overlapped. Deep networks are trained to infer the user's average velocity or change in position over a segment.
The constant sampling frequency of the inertial measurements is a necessary condition for this branch of methods. If it is not the case, data interpolation and synchronization are needed.
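The segmentation and resampling steps described above can be sketched as follows. This is a minimal illustration, not the preprocessing of any cited method: the function names are hypothetical, linear interpolation is one possible choice among others, and the window and stride values are only example settings.

```python
def resample_uniform(times, values, fs):
    """Linearly interpolate irregularly sampled values onto a uniform grid at `fs` Hz.

    Assumes `times` is strictly increasing and contains at least two samples.
    """
    step = 1.0 / fs
    n_out = int((times[-1] - times[0]) * fs + 1e-9) + 1
    out, j = [], 0
    for k in range(n_out):
        t = times[0] + k * step
        # Advance to the input interval [times[j], times[j + 1]] that contains t.
        while j + 1 < len(times) - 1 and times[j + 1] < t:
            j += 1
        t0, t1 = times[j], times[j + 1]
        w = (t - t0) / (t1 - t0)
        out.append(values[j] + w * (values[j + 1] - values[j]))
    return out

def fixed_length_segments(n_samples, window, stride):
    """Return (start, end) index pairs of overlapping fixed-length segments."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    return [(s, s + window) for s in range(0, n_samples - window + 1, stride)]

# Example values (assumptions): 1000 samples, 200-sample windows, 5-sample stride.
segments = fixed_length_segments(n_samples=1000, window=200, stride=5)
resampled = resample_uniform([0.0, 0.1, 0.3, 0.4], [0.0, 1.0, 3.0, 4.0], fs=10)
```

With a small stride, consecutive segments share most of their samples, so a recording yields many more segments than it contains steps or strides.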
RIDI [20], the "ancestor" of this branch, considers only strapdown configurations (leg pocket, in a bag, handheld, and body mounted). Raw inertial measurements are corrected by a device attitude tracking algorithm and a neural network, before being integrated twice to obtain the user's position. The same team later proposed RoNIN [21], which can operate beyond strapdown configurations and which we selected for experimental evaluation (see Section III). IONet, another pioneer of this category, regresses the user's walking direction change and displacement every time it receives a 200-frame segment (200 Hz).

C. Hypotheses Made for Each AI Category
The underlying hypotheses of each category are listed and analyzed.

1) Hypotheses for the Human-Gait-Driven AI Methods:
• Hypothesis GH1: Human walking is cyclic. A normal cycle of human locomotion is illustrated in Fig. 1. In [22], walking locomotion is described as a process in which the erect, moving body is supported by first one leg, and then the other. As the moving body passes over the supporting leg, the other leg swings forward to prepare for its next supporting phase.
• Hypothesis GH2: According to the work in [23], the upper- and the lower-body movements are correlated. During normal walking, the head and the trunk move up and down as the center of gravity follows the lower limbs' periodic movements, and arms flex and extend reciprocally. This hypothesis makes gait event detection from inertial signals plausible.
• Hypothesis GH3: Human paces are regular and constrained. According to a statistical study presented by [3], within 10 145 strides of gait measurements of different subjects and different walking dynamics, collected by a foot-mounted device, 99.5% of strides were within 1.55 m, and no stride exceeded 1.75 m. The mean and standard deviation of stride lengths are 1.33 m and 0.18 m, respectively.
• Hypothesis GH4: Step/stride length and inertial signals collected from different body parts are correlated. Empirical models are developed based on hip or foot-mounted sensors. Weinberg [24] found a correlation between the hip's vertical acceleration amplitude and stride length. Ladetto [25] found a correlation between acceleration variance and stride length. The hypothesis is plausible considering the correlation between foot movements and those of the rest of the body (GH2).
• Hypothesis GH5: A corollary of GH1, when the device location remains the same, the user's change in walking direction over two consecutive steps shall be approximately the same as the change in the device's pointing direction.
GH1 and GH3 are observed during natural walking, but certain users, such as senior citizens, can easily break these assumptions. GH2 is a simplification of reality since users can freely move their arms and hands while walking. As for GH4, even if these correlations exist, they are likely to vary significantly with the device carrying mode, which explains why most research works either consider a single carrying mode [14], [16] or perform carrying mode classification [3], [4]. GH5 can be easily corrupted by noisy movements (swinging, device in pocket/bag).
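To make GH3 and GH4 concrete, the sketch below uses the commonly quoted form of the Weinberg model, in which the length is proportional to the fourth root of the vertical acceleration amplitude over one gait interval. The constant k is a per-user calibration factor, and the numerical values below are hypothetical. The plausibility check simply reuses the 1.75 m bound reported in [3].

```python
def weinberg_length(vertical_acc, k):
    """Estimate a step/stride length from the vertical acceleration amplitude.

    Commonly quoted form: length = k * (a_max - a_min) ** 0.25, where k is a
    calibration constant that has to be tuned (assumption of this sketch).
    """
    amplitude = max(vertical_acc) - min(vertical_acc)
    return k * amplitude ** 0.25

def is_plausible_stride(length_m, max_length_m=1.75):
    """GH3-style sanity check: reject lengths outside the observed range."""
    return 0.0 < length_m <= max_length_m

# Hypothetical vertical acceleration samples (m/s^2) over one gait interval.
example = weinberg_length([9.0, 17.0, 1.0, 9.5], k=0.5)
```

Because the amplitude depends on where the device is carried, k would typically differ between carrying modes, in line with the limits of GH4 discussed above.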

2) Hypotheses for Sampling-Frequency-Driven AI Methods:
• Hypothesis FH1: The true kinematics of the user's center of mass are continuous and can be recovered from inertial measurements collected from different body parts.
• Hypothesis FH2: Each fixed-length segment is independent of the others. In other words, a segment contains sufficient information to recover the user's velocity or change in position over the same time window.
• Hypothesis FH3: The inertial signals are sampled at a constant sampling frequency.
This branch of methods is naturally suited to deep learning models, which require fixed-length inputs. FH2 is approximately true if we consider regular and cyclic movements, for which the user's speed can be estimated from the signal's frequency and amplitude. However, the correlation between the user's speed and the signal's frequency or amplitude can vary from one individual to another. The same hypothesis also implies that a segment contains sufficient information to yield a walking direction inference. According to a survey [27] on traditional methods for walking direction estimation with an unconstrained device, existing methods such as principal component analysis [28], [29], forward and lateral accelerations modeling [30], and frequency analysis of inertial signals [31] assume that the walking direction is observable with handheld inertial sensors during one step/stride. As a result, we expect noisy inferences from this approach.

III. SELECTED METHODS FOR THE EXPERIMENTAL ASSESSMENT: SELDA, RONIN, AND SMARTWALK
Among the gait-driven AI methods, we selected the pedestrian Stride-length Estimation method based on LSTM and Denoising Autoencoders [16] (referred to as SELDA in the rest of the article). It is representative of the category and comes with sufficient implementation and data processing details, along with a benchmarking dataset. RoNIN [21] is selected among the sampling-frequency-driven AI methods since the authors published their implementation, trained model weights, and a part of their dataset.
We also wanted to evaluate a complete non-AI gait-driven positioning method for comparison; thus, we selected SmartWalk. The method combines several techniques such as machine learning, statistical models, and an extended Kalman filter (EKF) to infer the user's trajectory; moreover, some parameters in the model are customized for each user.

A. SELDA: Pedestrian Stride-Length Estimation Based on LSTM and Denoising Autoencoders
The stride length estimation AI model proposed in [16] takes as input stride segments of inertial measurements (three-axis acceleration and angular rate) collected by a handheld smartphone, along with higher-level features from empirical models (Weinberg [24], Ladetto [25], Scarlett [32], etc.) computed from the acceleration segment.
SELDA requires the user to carry the device in "texting" mode, in such a way that the z-axis of the device points to the sky. Stride events for signal segmentation are detected by a foot-mounted device. This setup is constrained and impractical for deployment.
1) SELDA Dataset: The publicly available dataset [33] was collected by five volunteers of different genders and heights, holding the smartphone horizontally in front of their chest.
The dataset covers both indoor and outdoor scenarios including staircases.
The inertial signal sampled at 100 Hz is segmented by stride instants provided by the foot-mounted reference tracker, which also provides stride length ground truth.
2) Adaptation for Experimental Assessment: We use only 4 higher-level features instead of 35 since only 4 definitions are available.
Among learning samples from the SELDA dataset, we only consider the stride length range between 0.3 m and 1.8 m. The original dataset is split into a 6319-sample training set and a 175-sample validation set.
Since SELDA only estimates stride length, for illustration purposes, we chain three modules, namely stride detection, SELDA, and heading estimation, to build a positioning system. Stride instants and walking directions are provided by our foot-mounted reference tracker [34], [35].
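The way such a chain turns per-stride outputs into a trajectory can be sketched with a simple dead-reckoning accumulation. This is an illustrative sketch under stated assumptions: positions are expressed as (north, east) in meters, headings are measured clockwise from North in radians, and the function name is hypothetical.

```python
import math

def dead_reckon(stride_lengths, headings_rad, start=(0.0, 0.0)):
    """Accumulate (north, east) positions from per-stride lengths and headings."""
    north, east = start
    track = [(north, east)]
    for length, heading in zip(stride_lengths, headings_rad):
        # Heading 0 points North; positive headings rotate toward East.
        north += length * math.cos(heading)
        east += length * math.sin(heading)
        track.append((north, east))
    return track

# Two strides of 1 m: one toward North, then one toward East (hypothetical values).
track = dead_reckon([1.0, 1.0], [0.0, math.pi / 2])
```

Any bias in the stride length scales the whole trajectory, whereas a heading error rotates every subsequent displacement, which is why the two error sources are analyzed separately in Section IV.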

B. RoNIN: Robust Neural Inertial Navigation
RoNIN expresses acceleration and angular rate measurements in the navigation frame, via the device attitude provided by Android's game rotation vector (GRV) and a spatial alignment procedure. An AI model (RoNIN ResNet) is trained to infer the user's horizontal velocity $(V_x, V_y)$, given a fixed-length segment (200 frames) of acceleration and angular rate expressed in the navigation frame. Inferred velocities are integrated to obtain the user's trajectory. The RoNIN dataset, model implementation, and trained model weights are publicly available [36].
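The two operations surrounding the network, namely expressing a measurement in the navigation frame and integrating the inferred velocities, can be sketched as follows. This is not the RoNIN implementation: the scalar-first unit quaternion (w, x, y, z) mapping the device frame to the navigation frame, the rectangular integration rule, and the time step are assumptions of this sketch.

```python
import math

def rotate_to_navigation(q, v):
    """Rotate vector v by the unit quaternion q = (w, x, y, z)."""
    w, ux, uy, uz = q
    vx, vy, vz = v
    # c1 = u x v, c2 = u x c1, then v' = v + 2 w c1 + 2 c2.
    c1 = (uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx)
    c2 = (uy * c1[2] - uz * c1[1], uz * c1[0] - ux * c1[2], ux * c1[1] - uy * c1[0])
    return tuple(vi + 2.0 * w * a + 2.0 * b for vi, a, b in zip(v, c1, c2))

def integrate_velocity(velocities, dt, start=(0.0, 0.0)):
    """Integrate horizontal velocities (vx, vy) with a fixed time step dt."""
    x, y = start
    track = [(x, y)]
    for vx, vy in velocities:
        x += vx * dt
        y += vy * dt
        track.append((x, y))
    return track

# A 90-degree rotation about the z-axis maps the x-axis onto the y-axis.
half = math.radians(90.0) / 2.0
rotated = rotate_to_navigation((math.cos(half), 0.0, 0.0, math.sin(half)), (1.0, 0.0, 0.0))
# Forty velocity inferences spaced 0.025 s apart (hypothetical values).
path = integrate_velocity([(1.0, 0.5)] * 40, dt=0.025)
```

Because positions are obtained by summation, zero-mean noise in the velocities is partly averaged out, while any bias accumulates as drift.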
1) RoNIN Dataset: It is collected mainly in indoor environments by 100 volunteers and 3 Android devices, covering usual scenarios such as a smartphone in a bag, in the pocket, handheld, walking, sitting, etc.
The ground truth trajectories are provided by visual-inertial SLAM performed by a Tango phone attached to the volunteer's chest.
All measurements are synchronized and sampled at 200 Hz.
2) Importance of the GRV: GRV is a quaternion provided by the Android API, indicating the device's orientation w.r.t. some gravity-aligned reference frame [37]. To better understand it, we compare it to MAGYQ [38], a transparent EKF-based device attitude tracking algorithm. We rigidly attach our "home-made" navigation device, ubiquitous localization with inertial sensors and satellites (ULISS) (see Section IV), and a smartphone to an aluminum plate to align their z axes (see Fig. 2). We recorded the following two tracks.
• Track 2: random rotations during walking.
In Fig. 3(a) and (b), we plot the Euler angles of the smartphone given by GRV (top figure) and those of the ULISS given by MAGYQ (middle figure). The roll angle of ULISS evolves in the same way as the pitch angle of the Android device and the pitch angle of ULISS evolves in the same way as the Android device's roll angle. The ULISS yaw angle is opposite to the yaw angle of the Android device. In the bottom figure, we plot the variation of their angular offsets given by (1). The subscript "a" stands for Android and "u" stands for ULISS.
$$\Delta\mathrm{yaw} = \mathrm{yaw}_u - \mathrm{yaw}_u(t=0) + \left(\mathrm{yaw}_a - \mathrm{yaw}_a(t=0)\right) \tag{3}$$
The figures show that the GRV is good at estimating roll and pitch (related to gravity). There is no large difference between the GRV and the MAGYQ result. The offset between the GRV yaw and the ULISS yaw is almost constant during several minutes of recordings.
We observe occasional peaks in the bottom plots, which are due to slight synchronization lags or the difference in response time of the two devices.
We can conclude that the offset between the GRV and the device's orientation w.r.t. the North-East-Down frame is approximately constant for several minutes. For this reason, we replace the GRV with MAGYQ in our experiments to make RoNIN more transparent.
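The yaw offset of (3) can be computed as in the sketch below. The sum of the two yaw variations (rather than their difference) reflects the opposite yaw conventions of the two devices noted above. Wrapping the result to [-180, 180) degrees is an assumption added here to avoid artificial jumps, and the sample values are hypothetical.

```python
def wrap_deg(angle):
    """Wrap an angle in degrees to the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0

def yaw_offset(yaw_u, yaw_a):
    """Yaw offset of (3): ULISS yaw variation plus Android yaw variation."""
    u0, a0 = yaw_u[0], yaw_a[0]
    return [wrap_deg((u - u0) + (a - a0)) for u, a in zip(yaw_u, yaw_a)]

# Hypothetical yaw sequences in degrees with opposite sign conventions.
offsets = yaw_offset([10.0, 40.0, 100.0, 170.0], [-20.0, -50.0, -110.0, -180.0])
```

A sequence of offsets that stays close to a constant value indicates that the two attitude estimates differ only by a fixed rotation about the vertical axis.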

3) Adaptation for Experimental Assessment:
The RoNIN article [21] proposes three variants based on different deep learning models: 1) ResNet [39], 2) LSTM [17], and 3) temporal convolutional network [40] to estimate the user's position. We only assess RoNIN ResNet, since it yields the best results. We use the published model implementation and trained model weights for experimental assessment. We replace the GRV and the spatial alignment procedure with MAGYQ.

C. SmartWalk
It appeared interesting to make a comparison with an older solution based on classical signal processing techniques instead of recent AI tools: SmartWalk [40]. It is a pedestrian positioning algorithm fusing data from a triaxis accelerometer, a triaxis gyroscope, a triaxis magnetometer, a barometer, and a GNSS receiver. In this article, GNSS raw data are not included in the positioning algorithm to ensure a fair comparison with the other AI-based solutions. SmartWalk contains several modules.

1) Step Detection, Step Length Estimation, and Carrying Mode Classification: This module processes the wearable raw acceleration and angular rate readings at 200 Hz. The original step detection module was replaced with SmartStep [2], [42].
Step length s is estimated by a linear model [43] in which h is the user's height, f is the frequency of the acceleration magnitude, and a, b, and c are universal parameters learned on a group of users. A personalized calibration is also possible [44]. The device's carrying mode is classified into either texting or swinging mode. Irregular and static phases are also detected [43].
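As an illustration of such a linear model, the sketch below assumes the form s = h (a f + b) + c, which is one common way of combining the quantities listed above; the exact expression should be checked against [43]. The parameter values are placeholders, not the universal parameters of SmartWalk.

```python
def step_length(height_m, step_freq_hz, a, b, c):
    """Step length from user height and step frequency.

    Assumed form (to be checked against [43]): s = h * (a * f + b) + c.
    """
    return height_m * (a * step_freq_hz + b) + c

# Placeholder parameters (assumptions), for a 1.80 m user walking at 2 Hz.
s = step_length(height_m=1.8, step_freq_hz=2.0, a=0.1, b=0.2, c=0.05)
```

In a model of this kind, a personalized calibration amounts to re-estimating a, b, and c from a few walks of known length performed by the user.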
2) GMM for Walking Direction Inference: This module adopts a Gaussian mixture model (GMM) to model the misalignment between the device's pointing direction and the user's walking direction [45]. First, the device's accelerations are transformed into the local North-East-Down frame using the EKF-based device attitude tracking algorithm MAGYQ [38]. Then, the distribution of the horizontal accelerations, when the user is walking toward the North (0° heading), is modeled by a weighted sum of 2-D Gaussian distributions
$$p(\mathbf{x}) = \sum_{k} \tau_k\, \mathcal{N}(\mathbf{x};\, m_k, P_k)$$
where $\mathbf{x}$ is a horizontal acceleration sample and $\tau_k$ is the weight of Gaussian component $k$, parameterized by mean $m_k$ and variance $P_k$. A GMM is learned by expectation maximization for each individual and each device handling mode. Finally, the walking direction of a step is inferred by rotating the learned GMM by an angle $\theta$ that maximizes the log-likelihood of the cloud of horizontal accelerations over one step; $\theta$ is the inferred user heading for this step.
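The rotation-based inference can be sketched with a brute-force search over candidate headings. This is an illustrative sketch, not the SmartWalk implementation: the 1° grid search, the (north, east) coordinate convention, and the single-component mixture used in the example are assumptions. Rotating the model by θ is implemented as rotating the data by -θ, which gives the same likelihood.

```python
import math

def gaussian_2d_pdf(x, y, mean, cov):
    """Density of a 2-D Gaussian with covariance ((sxx, sxy), (sxy, syy))."""
    (sxx, sxy), (_, syy) = cov
    det = sxx * syy - sxy * sxy
    dx, dy = x - mean[0], y - mean[1]
    # Quadratic form d^T cov^-1 d, using the closed-form 2x2 inverse.
    quad = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

def log_likelihood(points, gmm, theta):
    """Log-likelihood of (north, east) points under the GMM rotated by theta."""
    c, s = math.cos(theta), math.sin(theta)
    total = 0.0
    for n, e in points:
        # Rotate the sample by -theta back into the frame of the learned model.
        n0, e0 = c * n + s * e, -s * n + c * e
        density = sum(w * gaussian_2d_pdf(n0, e0, m, p) for w, m, p in gmm)
        total += math.log(max(density, 1e-300))  # guard against log(0)
    return total

def infer_heading_deg(points, gmm, step_deg=1):
    """Return the candidate heading (degrees) with the highest log-likelihood."""
    candidates = range(0, 360, step_deg)
    return max(candidates, key=lambda d: log_likelihood(points, gmm, math.radians(d)))

# Hypothetical model learned for a North heading: (weight, mean, covariance).
gmm = [(1.0, (1.0, 0.0), ((0.04, 0.0), (0.0, 0.01)))]
# Hypothetical horizontal accelerations of one step taken toward the East.
heading = infer_heading_deg([(0.0, 1.0), (0.05, 1.0), (-0.05, 1.0)], gmm)
```

With only one step of data, the likelihood surface can be flat or multimodal, which is consistent with the noisy standalone GMM headings discussed in Section IV.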
3) Corrections: An EKF completes SmartWalk with the following corrections.
1) Identify the stairs with a barometer and use a fixed step length (30 cm) when the user is on stairways.
2) Use a fixed step length (50 cm) in the propagation model and the estimated step lengths as observations.
3) Fuse the GMM's inference with the device's pointing direction given by MAGYQ according to hypothesis GH5. High confidence is given to the GMM at the beginning of the trajectory to match the initial heading with the GMM prediction.
Table I summarizes the inputs and outputs of the three selected methods.

IV. EXPERIMENTAL ASSESSMENT
A. Experimental Setup
1) Hardware: The device ULISS [46], shown in Fig. 4(a), was developed by the GEOLOC laboratory at University Gustave Eiffel and is used for the experiments. It is a state-of-the-art inertial navigation system containing an Xsens MTi-7 IMU-Mag sensor, a barometer, and a GNSS receiver, providing acceleration, angular rate, magnetic field, and atmospheric pressure readings at 200 Hz and GNSS readings at 5 Hz, using GPS timestamps. It is used for the experimental assessment instead of a smartphone because the signal acquisition is controlled and the sensors are calibrated. One ULISS is mounted on the foot and is the reference solution, i.e., ground truth, with a 0.3% positioning error over the traveled distance [47]. It is the winning solution of the three-year French national competition (MALIN) for the noncollaborative positioning of soldiers in challenging indoor environments [35].
2) Scenarios: As shown in Fig. 4(b), the test person holds one ULISS horizontally (the z-axis points to the sky) and walks naturally, as requested by SELDA. All scenarios started outdoors for the initialization of MAGYQ and the foot-mounted reference solution. Both algorithms need a magnetometer calibration, performed away from strong artificial magnetic fields, for the initialization. For the sake of fairness, the ground truth initial walking direction is given to all three implemented positioning methods.
TABLE II: CHARACTERISTICS OF THE SIX TESTS IN OPEN ACCESS [47]
TABLE III: PERFORMANCE EVALUATION OF SELDA-
Four different surroundings, representing the diversity of common pedestrian navigation contexts, were chosen for the experiments. Fig. 5 shows the diversity of these environments. Tests 1-3 were recorded on the campus of the university by a healthy man (volunteer 1, 1.66 m height), and tests 4-6 in various environments (forest, city, and parking lot) by another healthy man (volunteer 2, 1.80 m height). The six recorded tracks are provided in open source [48]. They include raw inertial signals, calibration parameters, and ground truth trajectories. Table II summarizes the main characteristics of these six evaluation tracks.

B. Performance Evaluation
Three metrics are selected to evaluate the horizontal trajectories estimated by each selected method: the scale factor (SF), the endpoint error rate (EPR), and the root-mean-square error (RMSE).
1) Scale Factor: It is the ratio of the total length of the estimated trajectory $l_{es}$ to the total length of the ground truth trajectory $l_{gt}$. The ratio is expected to be close to 1.
$$\mathrm{SF} = \frac{l_{es}}{l_{gt}}$$
2) Endpoint Error Rate: It is the ratio of the endpoint error (EPE) to the ground truth trajectory's total length $l_{gt}$.
$$\mathrm{EPR} = \frac{\mathrm{EPE}}{l_{gt}}$$
3) Root-Mean-Square Error: It measures the standard deviation of the horizontal positioning error
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left[(x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2\right]}$$
where $n$ is the number of points in the trajectory, $(x_i, y_i)$ is the user's ground truth position at time step $i$, and $(\hat{x}_i, \hat{y}_i)$ is the predicted one.
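The three metrics can be computed as in the following sketch. It assumes that the estimated and ground truth trajectories are lists of (x, y) positions in meters sampled at the same time steps; the function names and the toy trajectories are hypothetical.

```python
import math

def path_length(track):
    """Total length of a polyline given as a list of (x, y) points."""
    return sum(math.dist(p, q) for p, q in zip(track, track[1:]))

def scale_factor(estimated, ground_truth):
    """SF: estimated trajectory length over ground truth trajectory length."""
    return path_length(estimated) / path_length(ground_truth)

def endpoint_error_rate(estimated, ground_truth):
    """EPR: distance between the two endpoints over the ground truth length."""
    return math.dist(estimated[-1], ground_truth[-1]) / path_length(ground_truth)

def rmse(estimated, ground_truth):
    """RMSE of the horizontal position error over time-aligned points."""
    n = len(ground_truth)
    squared = sum((x - xh) ** 2 + (y - yh) ** 2
                  for (x, y), (xh, yh) in zip(ground_truth, estimated))
    return math.sqrt(squared / n)

# Toy trajectories (hypothetical values): the estimate falls 1 m short at the end.
gt = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)]
est = [(0.0, 0.0), (3.0, 0.0), (3.0, 3.0)]
sf, epr, err = scale_factor(est, gt), endpoint_error_rate(est, gt), rmse(est, gt)
```

The SF only reflects the estimated walking distance, whereas the EPR and the RMSE also capture heading errors, so a method can have an SF close to 1 and still drift.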
The experimental results are reported in Table III. Estimated and ground truth trajectories are shown in Fig. 6 for the on-campus datasets (volunteer 1) and in Fig. 7 for the forest/urban/parking dataset (volunteer 2).

C. Analysis of the Inertial Pedestrian Positioning Estimates
1) Walking Distance: The SF evaluates the quality of the estimated walking distances. Table III shows that RoNIN always underestimates the walking distance. However, the standard deviation of RoNIN's SF (0.067 for volunteer 1 and 0.019 for volunteer 2) is smaller than that of SELDA (0.079 for volunteer 1 and 0.065 for volunteer 2), which either overestimates or underestimates the walking distance. Overall, RoNIN follows the changes in the pedestrian's dynamics better than SELDA. However, significant drifts are observed in the RoNIN trajectories for Tests 2 and 3. These observations are further detailed in Figs. 8-11. Figs. 8 and 9 show the stride lengths predicted by SELDA (blue) against the ground truth (orange) for both volunteers. To complete the analysis, stride lengths estimated by the state-of-the-art Weinberg model (green) are overlaid.
The "nearly flat" line of SELDA's predictions shows that it fails to capture the variations in the user's movement, especially when the user is taking stairs (smaller strides). Weinberg estimates, sharing one higher-level feature with SELDA, are much better at tracking variations in walking dynamics. RoNIN's velocity plots (see Figs. 10 and 11) show better performance in tracking walking changes: start, stop, and taking stairs. However, as foreseen by the theoretical analysis (FH2), the velocity estimates are noisy. The integration of the inferred velocities smooths this noise.
Overall, the two categories of methods show opposite behaviors. The failure of SELDA and the robustness of RoNIN can be explained by their training datasets. All stride length labels from SELDA's training set and their distribution are shown in Fig. 12. The mean and standard deviation of SELDA's stride length labels are 1.36 m and 0.078 m. Most of the labels are close to the mean value, which is the best guess that the model can achieve. On the other hand, RoNIN regresses the two components of the velocity, whose variations are larger because the walking direction can take any value.
Thanks to the sampling-frequency-driven data segmentation, RoNIN is more data-intensive than SELDA for the same amount of measurements. For example, if a normal walking   RoNIN, i.e., 200 measurement points in one segment, with a stride of 5 frames.
For volunteer 1, SmartWalk tracks the user's dynamics best (with a 0.038 standard deviation), whereas RoNIN does so for volunteer 2 (with a 0.019 standard deviation). To understand this observation, stride lengths estimated by SmartWalk's step model without (orange) and with (green) EKF correction are plotted against the ground truth (blue) in Figs. 13 and 14. Step instants detected by processing the handheld inertial sensor data are projected on the ground truth to observe the derived stride (2 steps) length. Strides larger than 2 m can be observed in the blue dots, illustrating the underdetection of gait events in the SmartWalk approach. Similar to SELDA, SmartWalk's estimates form a relatively flat line (although less flat than SELDA's) when there is no staircase. When the staircase detection function is operational with the barometer readings, SmartWalk outperforms both RoNIN and SELDA in scenarios with staircases (Tests 1 and 3). Without a stairway, RoNIN gives the best user dynamics tracking. On the other hand, SmartWalk can be sensitive to barometer noise and overdetect staircases. In Test 4, several tiny strides of 60 cm are inferred although no stairway is included in the track.
Despite SELDA's failure to track the user's dynamics, since it gives almost constant stride length inferences, its SFs are closer to 1 than those of the other two methods when stairs are not included in the track (Tests 2, 4, 5, and 6). Indeed, the constant stride length estimated by SELDA corresponds to an average learned from its training set. Because human walking is regular and constrained, the deviation of a stride length around the average is bounded.
2) Walking Direction: Only RoNIN and SmartWalk are compared for the walking direction estimation task since SELDA does not predict it. Noisy velocity inferences result in noisy walking directions estimated by RoNIN. SmartWalk performs better than RoNIN, since its mean RMSE is smaller for both volunteers 1 and 2. The EKF correction module is efficient here.
The benefit of SmartWalk's EKF correction is illustrated in Fig. 15 for volunteer 2. Similar to RoNIN, the standalone GMM (orange) yields noisy walking direction estimates. On the other hand, the trajectories estimated with MAGYQ's yaw angle (blue) have almost the same shape as the ground truth. Under the hypothesis that, when the carrying mode remains steady, the change in the user's walking direction over two consecutive steps is the same as the change in the device's pointing direction (Hypothesis GH5), the device's yaw angle can be used to correct the GMM results, since the MAGYQ results are more accurate and smoother. The fusion significantly improves the walking direction estimation, especially for turns (green).

V. CONCLUSION
This article analyzes the features of existing AI-based pedestrian inertial positioning techniques with wearable sensors at both the theoretical and experimental performance levels. A two-category classification, based on the inertial segmentation strategy, is presented: either using the human gait analysis or the inertial signals sampling frequency.
A theoretical analysis of the fundamental assumptions that allow the two categories of AI-based methods to function properly is carried out. A state-of-the-art algorithm from each category (SELDA and RoNIN) and a classical signal-processing-based algorithm (SmartWalk) are detailed and implemented for the 2.17-km experimental assessment in six scenarios and environments covering the diversity of pedestrians' mobility (open-sky, indoors, forest, urban, parking lot). The dataset is open-access.
SELDA uses the flat-foot instants of a foot-mounted tracker to segment inertial signals. This strategy was found to be inefficient for labeling wearables' training datasets [42] and unrealistic in real-life situations. To better capture the walking changes with gait-driven AI methods, different walking dynamics (various stride lengths) should be added to the training dataset, and adopting a user-centric approach could be beneficial.
Compared to SELDA, the signal segmentation and labeling strategies of RoNIN show better learning. It estimates the walking distance up to an SF, which is found to be stable for the same individual and different from one individual to another. However, noisy predicted velocities and a lack of precision in estimating turns result in significant positioning drifts.
SmartWalk shows that the gait analysis can provide efficient corrections to trajectory estimation: better displacement estimates on stairs, smooth walking direction on straight lines, and correct turning angle estimation. It is worth noticing that significant changes in the device's yaw angle can indicate turnings. However, hypothesis GH5 still needs to be tested under diverse scenarios other than the "texting" case.
Globally, SELDA, RoNIN, and SmartWalk achieve 8-m, 22-m, and 17-m average positioning errors (RMSE), respectively, on six testing tracks recorded with two volunteers in various environments, over a 2.17-km walking distance.
Both categories face challenges. Gait-driven AI methods need to improve their robustness to deal with different device poses and user dynamics. Sampling-frequency-driven AI methods need to reduce the noise in their predictions. A direction for improvement is to fuse the two approaches, capturing the user's dynamics with fixed-sampling-frequency-based processing and correcting the trajectory using estimated gait parameters. Finally, customizing the model parameters for each user is another promising strategy.