Drift-Free Visual SLAM for Mobile Robot Localization by Integrating UWB Technology

Visual simultaneous localization and mapping (vSLAM or visual SLAM) is an important technique for mobile robot localization in global navigation satellite system denied (GNSS-denied) environments. However, the positioning accuracy can degrade severely due to the lack of image feature points when the robot navigates in a spacious indoor space. The drift errors accumulated over time are generally inevitable and need to be mitigated by more sophisticated loop-closure algorithms. In this paper, we propose a drift-free visual SLAM technique for mobile robot localization by integrating ultra-wideband (UWB) positioning technology. The basic concept is to utilize the global constraint of UWB positioning to reduce the locally accumulated errors of visual SLAM localization within an extended Kalman filtering (EKF) framework. In our experiments, various SLAM approaches are performed in indoor scenes, and the evaluation and comparison demonstrate the feasibility of the proposed localization technique. By integrating UWB positioning, the overall drift error of the robot navigation is reduced by more than 50%.

with the visual servoing capability. Thus, sensing techniques and data processing algorithms are the core technologies for automated factories in the future. Among the various subsystems used for mobile robots, the self-localization module is of specific importance in robot mobility. When production lines are transformed into automated industrial systems to increase the throughput with high reliability, it is necessary to provide precise positioning and sufficient accuracy for robot localization.

Over the past few decades, technologies for simultaneous localization and mapping (SLAM) have been widely investigated [1], [2]. The objective is to obtain the location information of the mobile robot and, at the same time, construct the map of the environment during the robot navigation. Some commonly used sensors include infrared, sonar, 2D and 3D light detection and ranging (LiDAR), monocular cameras, stereo camera systems, depth cameras, and inertial measurement units (IMUs). In addition to the development of SLAM techniques with single sensors, the fusion of different SLAM approaches has been shown to provide better localization accuracy [3], [4]. Nevertheless, many unsolved problems remain even when sensor fusion methods are incorporated. Typical examples include the slipperiness of wheel contacts and the influence of unknown external forces. These might introduce errors which are difficult to mitigate by SLAM systems using on-board sensors alone, and therefore more sophisticated loop closure detection needs to be adopted [5]. The most important problem for a SLAM system is to deal with the measurement errors in position and orientation. The extended Kalman filter (EKF) is then adopted to fuse the localization results based on the confidence weighting [12].
In our proposed method, the localization failure and drift errors can be monitored continuously, and the UWB positioning is adopted for the optimization of the SLAM output whenever the uncertainty is larger than a threshold. Several experiments are performed in real-world environments, and the performance evaluation demonstrates the feasibility of our approach for precise indoor localization.

In the robotics research community, many SLAM approaches have been proposed over the last decades [13]. Most classic approaches in the existing literature utilize laser rangefinders (lidars) or cameras for environmental data acquisition. Among the various SLAM techniques, lidar-based approaches have been extensively investigated [14]. These techniques are frequently regarded as the main localization methods, and perform relatively well in robot navigation tasks. Generally speaking, 2D lidars are adopted for domestic applications such as cleaning robots or region exploration, while 3D lidars are utilized in applications such as self-driving vehicles or aerial robots. The obstacles in the detectable region are perceived as 3D point clouds with depth information. In lidar-based techniques, the 3D data obtained from different locations are registered to a common coordinate frame for comparison, and then used to calculate their relative orientation and position [15]. The frame-by-frame transformation is then used for mobile robot localization and environment map construction.

The EKF algorithm used in this work is based on the implementations in [3] and [32]. From modern control theory and statistical data processing, the noise-corrupted state vectors of a system can be estimated by the iteration of data measurements.
Thus, the objective is to derive the optimal value of the current system state from the one estimated at the previous time stamp and the present measurement. When adopted for mobile robot localization, this amounts to computing the 3-D state vector representing the position. By modeling the noise with a Gaussian distribution, the current state vector and the prediction error covariance matrix are obtained according to the kinematic model derived from Newtonian mechanics.

The system flowchart of the proposed drift-free visual SLAM technique is illustrated in Figure 1. In this work, we adopt RTAB-Map (Real-Time Appearance-Based Mapping) as the basic SLAM framework for the development [33], and incorporate the pre-established UWB positioning system for global localization. As shown in the figure, the SLAM system on board the robot for self-localization operates simultaneously with the UWB global positioning. By fusing the multimodal sources for robot localization, highly accurate results with drift corrections can be obtained based on the weightings of confidence levels. In addition, the data fusion of SLAM and UWB is carried out via the extended Kalman filter (EKF). If the measurement difference between the two systems is greater than a threshold, a relocalization procedure is performed based on the UWB global information.

The RTAB-Map module adopted in this work is based on a general RGB-D SLAM framework. It is also capable of using memory management to perform closed-loop detection. The essential task is to make long-range navigation possible with online loop closure detection. In the general SLAM framework, the computations of visual odometry for frame-to-frame and frame-to-map are usually carried out separately. The input images are acquired from stereo or RGB-D cameras, and the frame-to-map feature extraction is performed without directly matching for the nearest neighbors. Under the situation of feature loss, the ratio of the first and second nearest neighbors is compared. In the frame-to-frame approach, optical flow is adopted for feature matching on the key frames [34], [35]. This is then followed by a local bundle adjustment for the key frames, using geometric consistency for camera pose updates.
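The nearest-neighbor ratio comparison mentioned above can be sketched as follows. This is a minimal illustration in the style of Lowe's ratio test over brute-force descriptor distances; the function name, descriptor arrays, and the `ratio` value are hypothetical examples, not the actual RTAB-Map implementation.

```python
import numpy as np

def ratio_test_matches(desc_a, desc_b, ratio=0.7):
    """Keep a match only when the best neighbor is clearly closer
    than the second-best one (the first/second nearest-neighbor
    ratio comparison)."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)  # distances to all candidates
        order = np.argsort(dists)
        best, second = dists[order[0]], dists[order[1]]
        if best < ratio * second:                   # ambiguous matches are rejected
            matches.append((i, int(order[0])))
    return matches
```

A match whose two closest candidates are nearly equidistant is discarded, which is exactly what helps under feature loss: repeated or weak features produce ambiguous neighbors and are filtered out before pose estimation.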

To provide a global constraint for localization, we incorporate the UWB positioning system. The time of flight associated with the pulse signals between these modules is used to compute the distance by d = C · t, where t is the time of flight and C is the speed of light [36]. If the positioning is carried out with multiple modules, it is required to calculate the distances from the tag to those modules. In this configuration, the 3D coordinates are derived using trilateration. In the time difference of arrival (TDOA) approach, the tag position is instead determined by the intersections of hyperbolas. For this method to operate successfully, one crucial prerequisite is time synchronization among the different modules. In general, TDOA is more complicated in deployment and setup because of the time synchronization requirement between different modules. However, it only requires transmitting the signals instead of waiting for the responses from the tags. Consequently, more spare time for the localization of other tags can be achieved by using this approach.
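As a concrete sketch of the distance computation and the trilateration step, the following assumes ideal noise-free ranges and solves the linearized least-squares system obtained by subtracting the first anchor's range equation from the others; the anchor layout and function names are illustrative, not the actual system configuration.

```python
import numpy as np

C = 299_792_458.0  # speed of light (m/s)

def tof_distance(t_flight):
    """Range from a one-way time of flight: d = C * t."""
    return C * t_flight

def trilaterate(anchors, dists):
    """Linearized least-squares trilateration.
    anchors: (n, 3) module coordinates; dists: (n,) measured ranges.
    Expanding |x - p_i|^2 = d_i^2 and subtracting the first equation
    removes the quadratic term |x|^2, leaving a linear system."""
    anchors = np.asarray(anchors, dtype=float)
    d = np.asarray(dists, dtype=float)
    p0, d0 = anchors[0], d[0]
    A = 2.0 * (anchors[1:] - p0)
    b = (d0**2 - d[1:]**2
         + np.sum(anchors[1:]**2, axis=1) - np.sum(p0**2))
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x
```

With four or more non-coplanar anchors the system is overdetermined, which is why a least-squares solve is used rather than a direct inversion.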

Since the UWB positioning system measures distances and performs the localization using TDOA, this approach can be considered as GNSS positioning for the indoor environment. Thus, the drift error due to long-range mobile robot navigation can be avoided. The measurement error stays bounded within a certain range with no accumulation, and the localization estimates approximate the ground truth globally. As illustrated in Figure 3, the relative poses among multiple frames are generally estimated by SLAM-based localization techniques, and the UWB measurements are applied to correct the accumulated drift.

FIGURE 5. The schematic diagram for the UWB error correction. If the difference between the newly observed point and the previous position estimated by UWB is larger than a preset value, new coordinates will be used for the localization update.
The idea of the overall error correction based on the measurements of SLAM and UWB is illustrated in Figure 5. If the newly created position from SLAM and the current position estimate obtained by UWB have a difference larger than a value τ, then we use new position coordinates for the update. The new localization position (x_3, y_3) has the same direction as (x_2, y_2), but with an offset given by the threshold as follows:

x_3 = x_1 + τ · (Δx / Δxy),   (3)
y_3 = y_1 + τ · (Δy / Δxy),   (4)

where (x_1, y_1) is the previous position estimate, Δx and Δy are the displacements in the x and y directions, respectively, and Δxy = √(Δx² + Δy²) is the 2-D displacement in the Euclidean distance. While the error correction is carried out in the x and y directions, the drift error in the z-axis is filtered out by the EKF with the 2D mode setting.
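A minimal sketch of this truncation rule follows, assuming our reading of Eqs. (3) and (4): the update is clamped to a step of length τ from the previous estimate along the observed displacement direction. The function name and argument roles are illustrative.

```python
import numpy as np

def truncate_correction(p_prev, p_slam, tau):
    """Clamp the position update: if the jump from the previous
    estimate exceeds tau, keep only a tau-length step in the same
    direction (assumed reading of the truncation in Eqs. (3)-(4))."""
    p_prev = np.asarray(p_prev, dtype=float)
    p_slam = np.asarray(p_slam, dtype=float)
    delta = p_slam - p_prev            # (dx, dy)
    dxy = np.linalg.norm(delta)        # 2-D Euclidean displacement
    if dxy <= tau:
        return p_slam                  # small jumps pass through unchanged
    return p_prev + tau * delta / dxy  # direction preserved, length truncated
```

Because only the step length is limited while the direction is kept, a genuine motion is never redirected; a drifting SLAM output is merely slowed down until the EKF fusion pulls it back toward the UWB estimate.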

The localization sensor fusion of SLAM and UWB positioning is further carried out using the extended Kalman filter (EKF). We assume that the localization probabilities initially follow a Gaussian distribution for the iterative updates; the truncation operations in Eqs. (3) and (4) are then used to derive the final result. The Kalman filter was initially used to cope with the linear-quadratic estimation problem by finding the solutions recursively [37]. It addresses the problem of estimating the current state (or the state of a process) of a linear dynamic system under the perturbation of white noise. To properly estimate or predict the current state, the Kalman filter proceeds recursively by utilizing the current measurements and the previous states. Since the Kalman filter performs recursively until the optimal state estimate is achieved, it is commonly regarded as a powerful technique to minimize the error of state estimates. The EKF is an extended version of the Kalman filter designed to deal with non-linear models. It includes an extra linearization step in the prediction stage and the calculation of partial derivatives with respect to the state variables.

Given an n × 1 process state vector x_k, an m × 1 measurement vector z_k, and the control input u_k, where k denotes the time stamp, a general non-linear system and measurement model can be described by

x_k = f(x_{k-1}, u_k) + w_k,
z_k = h(x_k) + v_k,

where the random variables w_k and v_k represent the Gaussian white process noise and the measurement noise, respectively. Let P_k, Q_k and R_k be the covariance matrices for x_k, w_k and v_k, respectively. The EKF algorithm is carried out in two steps: prediction (time update) and correction (measurement update). In the prediction step, the state projection and error covariance estimate are computed from

x̂_k^- = f(x̂_{k-1}, u_k),
P_k^- = F_k P_{k-1} F_k^T + Q_k,

where F_k = ∂f/∂x and G_u = ∂f/∂u are the Jacobian matrices of the process model with respect to the state and the control input u_k. In the measurement update step, the measurement z_k becomes available and the EKF calculates the Kalman gain matrix

K_k = P_k^- H_k^T (H_k P_k^- H_k^T + R_k)^{-1},

where H is the Jacobian matrix of the measurement model. The gain is then incorporated with the measurement innovation to derive the estimated state x̂_k, followed by the state error covariance matrix update. The general scheme for the measurement update involves the following steps:

• Innovation: The state error between the actual and predicted measurements is computed.

• Matching: In the matching procedure, an assignment is processed from the measurements to the landmarks and then stored in the map.

• Estimation: The estimation of the state is given by

x̂_k = x̂_k^- + K_k (z_k - h(x̂_k^-)),
P_k = (I - K_k H_k) P_k^-,

where K_k is the Kalman gain and P_k is the new state covariance matrix.
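The prediction and correction steps above can be sketched in a generic form. The function handles f, h and their Jacobians F, H are placeholders for the kinematic and measurement models of a particular system; this is a textbook EKF iteration, not the exact implementation of [3] and [32].

```python
import numpy as np

def ekf_step(x, P, u, z, f, h, F, H, Q, R):
    """One EKF iteration: prediction with process model f and its
    Jacobian F(x, u), then correction with measurement model h and
    its Jacobian H(x). Q and R are the process and measurement
    noise covariances."""
    # prediction (time update)
    x_pred = f(x, u)
    Fk = F(x, u)
    P_pred = Fk @ P @ Fk.T + Q
    # correction (measurement update)
    Hk = H(x_pred)
    S = Hk @ P_pred @ Hk.T + R                 # innovation covariance
    K = P_pred @ Hk.T @ np.linalg.inv(S)       # Kalman gain
    x_new = x_pred + K @ (z - h(x_pred))       # innovation-weighted update
    P_new = (np.eye(len(x)) - K @ Hk) @ P_pred
    return x_new, P_new
```

For instance, with identity process and measurement models and equal prediction and measurement covariances, the update lands halfway between the prediction and the measurement, reflecting the equal confidence weighting.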

The flowchart of the proposed UWB-based SLAM drift error correction approach is illustrated in Figure 6. When both the SLAM and UWB systems start the positioning process, the localization status is continuously monitored. If a localization failure of the visual SLAM is detected, the coefficients of the covariance matrix are changed to a large value to represent its low confidence. Otherwise, the difference between the SLAM and UWB positioning results is calculated, and a threshold is used to determine whether the drift error correction will be performed. The noise covariance matrix of the EKF is then updated in its diagonal elements. Since the estimates from SLAM and UWB are our main concern, the weights for the position and orientation (6 DoF) are set to larger values compared to the velocities and angular velocities in the three directions.
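The failure handling and the threshold check of this flowchart can be sketched as follows. The covariance values (1 in normal operation, 999 on failure) follow the implementation described later in the paper, while the function shape and the returned flag are illustrative assumptions.

```python
import numpy as np

FAILURE_COV = 999.0  # large diagonal value: very low confidence
NORMAL_COV = 1.0     # default SLAM confidence

def slam_measurement_covariance(slam_ok, p_slam, p_uwb, tau):
    """Return the diagonal covariance for the SLAM measurement fed
    to the EKF, plus a flag telling whether the drift correction
    should be triggered. On a tracking failure the covariance is
    inflated so the fusion effectively falls back on UWB."""
    if not slam_ok:
        return np.eye(3) * FAILURE_COV, False
    diff = np.linalg.norm(np.asarray(p_slam, dtype=float)
                          - np.asarray(p_uwb, dtype=float))
    return np.eye(3) * NORMAL_COV, bool(diff > tau)
```

Inflating the covariance rather than dropping the SLAM input keeps the EKF pipeline uniform: the filter itself decides how little weight a failed measurement deserves.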

The proposed drift-free visual SLAM with UWB technique is implemented on a mobile robot system and tested in a real-world environment. Figure 7 shows the mobile robot and the indoor space used for our experiments. We construct an aluminum extrusion rack and install it on a Pioneer P3-DX robot to place the sensing and computing devices (see Figure 7(a)). The UWB modules are deployed in the open space for localization assistance [38] (see Figure 8). Figure 9 shows the trajectories obtained using the SLAM and UWB positioning systems (presented as red and blue curves, respectively). Figure 10 illustrates the localization result of the experiment. It is followed by setting the SLAM covariance matrix parameter to 1 to perform the EKF for localization fusion. In Figure 12(a), the blue, red and green curves represent the localization trajectories obtained using UWB, SLAM and the EKF fusion, respectively. The figure illustrates that, when the visual SLAM system does not localize successfully, the EKF will still utilize the positioning result from the previous covariance matrix computation. As a result, large errors will appear in the final fusion trajectory. In the implementation, we change the covariance matrix parameter to 999 (or any huge number) if a SLAM tracking failure is detected. As the fusion trajectories in Figure 12(b) indicate, the SLAM failure does not affect the result of the EKF fusion when the proposed adjustment is used.

FIGURE 11. Some of the images captured and used for the visual SLAM computation.

For the evaluation of SLAM algorithms with ground-truth data, most publicly available RGB-D datasets are collected in advance. The estimated and ground-truth positions of the i-th sample are denoted by (x(i), y(i), z(i)) and (x_gt(i), y_gt(i), z_gt(i)), respectively. We conduct the experiments with several navigation trajectories in an indoor scene, and evaluate the UWB, SLAM and EKF fusion localization. The experimental result of a mobile robot navigating a rectangular path is shown in Figure 13, with the localization trajectories derived from UWB, SLAM and the EKF fusion marked in blue, red and green, respectively. Table 1 tabulates the translation errors of the different localization methods, where SLAM' indicates the SLAM result with drift correction. The table shows that the mean errors along the x and y axes are reduced from 0.647 meters to 0.182 meters and from 1.399 meters to 0.387 meters after the drift correction, respectively. The results illustrate that our proposed technique is able to provide great improvements on drift errors. In the experiment, the EKF fusion can generally suppress the translation error. Table 1 also indicates that the UWB positioning has slightly better performance in the y direction, despite the noisy trajectory illustrated in Figure 13. The main reason is that the evaluation is carried out only on discrete sample locations, which lets the UWB localization approximate the ground truth even if the complete path is rather noisy. Figure 14 shows the CDF (cumulative distribution function) of the positioning error per meter. The mean and standard deviation of the EKF fusion results are 0.616 and 0.045, respectively. Since the proposed technique is able to perform in real-time, we consider the comparison in terms of computation time not significant.
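The per-axis translation errors reported in Table 1 can be computed with a straightforward routine like the one below; the temporal alignment of the estimated and ground-truth trajectories at common sample indices is assumed to be done beforehand, and the function name is illustrative.

```python
import numpy as np

def per_axis_mean_error(est, gt):
    """Mean absolute translation error along each axis, given the
    estimated and ground-truth positions at the same sample
    indices: est and gt are (n, 3) arrays of (x, y, z)."""
    est = np.asarray(est, dtype=float)
    gt = np.asarray(gt, dtype=float)
    return np.mean(np.abs(est - gt), axis=0)
```

Because the metric is evaluated only at discrete sample locations, a noisy but unbiased trajectory (such as the raw UWB one) can still score well, which matches the observation about the y-direction result above.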

The second experiment is performed on a more complicated navigation path to evaluate the stability of the proposed technique in a spacious indoor space. The mobile robot moves in straight lines, curves and irregular paths rather than traveling along a simple route. The translation errors are tabulated in Table 2. As indicated in the table, although the drift is more severe in the z-axis due to the lack of detected features for localization, we are able to suppress the errors in all directions, as shown in SLAM'. In addition, the EKF fusion result not only successfully reduces the error, but also avoids the UWB positioning noise. This demonstrates the feasibility of our approach for many challenging situations occurring in practical applications.

One major issue of current SLAM systems is the accumulation of drift errors during long-range navigation. In this work, we present an approach for indoor localization by integrating different positioning approaches. The basic idea is to mitigate the localization errors introduced by proprioceptive sensors through the integration with UWB technology. The localization trajectory in the global scale is split into a frame-by-frame basis to estimate the relative position of the robot movement. Since the error from UWB does not accumulate over time, the localization failure can be monitored continuously and the drift errors of the local trajectory are reduced by the UWB positioning. In the experiments, various SLAM approaches are conducted in real-world scenes for performance comparison. The evaluation has illustrated the robustness of the proposed drift-free visual SLAM integrating the UWB technology for indoor localization. In the development, we consider the application for automated factories in a scope of about 20 square meters. The limitation of the current system is mainly the assumption that the environment is relatively spacious without dynamic objects. In future work, we will evaluate more complex scenes, and consider applications in high-precision manufacturing using mobile robots.