Introduction
Simultaneous localization and mapping (SLAM) is a crucial technology for unmanned autonomous operations, enabling effective environment perception and real-time localization and mapping of mobile robots. SLAM methods can be categorized as LiDAR SLAM and visual SLAM based on the sensors used [1]. Visual SLAM, relying on cameras, faces uncertainties in outdoor environments due to light and environmental changes, impacting its accuracy and robustness [2]. In contrast, LiDAR SLAM offers significant advantages in measurement accuracy, range, and resistance to environmental interference [3]. Consequently, LiDAR SLAM excels in mapping and localization, finding widespread applications indoors and outdoors.
With 360
On the one hand, loop closure detection presents a challenging issue in LiDAR SLAM as it prevents significant deviations in state estimation over time [6], which crucial for accurate and effective back-end mapping. LiDAR-based loop closure detection methods are typically classified into local and global descriptors [3], [7]. Global descriptors, proposed to address the instability of matching local key points, calculate similarity between the current frame and others to detect loop closures [8]. Dube et al. [9] proposed SegMatch, a segment-based scene recognition method, as a loop closure detection module for global optimization of vehicle pose [10], [11]. However, due to additional point cloud processing, it increases the computational cost of loop closure detection.
On the other hand, incorporating back-end nonlinear global optimization, such as GTSAM [12] and Levenberg–Marquardt (LM), trajectory is optimized to produce high-precision map on a larger scale. The main methods for back-end global optimization include filter-base and graph-based approaches. Due to computational complexity and limited robustness of former, graph-based optimization dominates the field and with the following four categories:
least squares;
random gradient descent;
relaxation iteration;
manifold iteration, which can conveniently construct and solve large-scale factor graph problems.
Aiming at inaccuracy of positioning in existing LiDAR SLAM [3], [10], [13], especially in large-scale environments, the data information of multiple sensors is huge and complex, requiring a large amount of computing resources and effective data processing schemes, and for long-term system operation, it may lead to sensor drift, cumulative errors and other problems, in this article, by introducing loop closure detection and back-end optimization, FAST-LIO2 [14] is improved to reduce the cumulative errors and improve the accuracy of localization and mapping. The main contributions are as follows.
A tightly coupled localization method is proposed based on LiDAR key features and IMU. Curvature-based feature extraction scheme is proposed to enhance the spatial constraints of planar feature points. The initial pose estimation of LiDAR-Inertial Odometry is then obtained by an IEKF to improve the accuracy of the localization.
A loop closure detection method based on height-intensity scan context (HISC) global descriptor is proposed. By integrating geometric and intensity information, HISC global descriptor is designed and an adaptive distance threshold is introduced to correct the accumulated errors over time in LiDAR-IMU system, improve the trajectory and mapping accuracy of large-scale environment.
A feature matching is proposed for calculating the pose transformation between pairs of loop-point clouds, to reduce the computational cost of pose estimation. In addition, we introduce a trajectory global optimization method based on GTSAM, which constructs a factor graph using iSAM2 and incorporates GPS factors to impose absolute position constraints, thereby reducing mapping trajectory drift.
The rest of this article is organized as follows. In Section II, we discuss current research works of LiDAR SLAM methods. Section III describes the details of the proposed method, including LiDAR inertial odometry, loop closure detection, global optimization and mapping. The experiments are presented in Section IV. And the experimental results are discussed in Section V. Finally, Section VI concludes this article.
Related Work
At present, the research of LiDAR SLAM has been very extensive. In this section, we focus on analyzing the research status of LiDAR SLAM, loop closure detection, and tightly coupled of LiDAR and IMU sensors.
A. LiDAR SLAM
The development of SLAM can be categorized into three stages: early (1986–2010), middle (2010–2014), and modern (2014–present). In the early stage, methods based on Kalman filter (KF) held prominence, followed by the emergence of methods based on extended Kalman filter (EKF) and particle filter, among others. In 2010, the introduction of Karto SLAM [15] marked the arrival of optimization-based SLAM, which demonstrated better performance compared to filter-based methods. In 2014, Zhang Ji [5] proposed Lidar Odometry and Mapping in Real-time (LOAM), which marked the basic maturity of LiDAR SLAM, and separated the localization and mapping into two algorithms. One performs high-frequency odometry with low accuracy (localization), while the other operates at a lower frequency to perform point cloud matching and registration (mapping and odometry correction). By combining these, a high-precision and real-time LiDAR odometry system is achieved. However, it also has certain limitations, such as reduced optimization efficiency in feature-rich environments due to the lack of a closed-loop detection function [16]. As a result, enhancing the robustness of SLAM has become a new research focus, leading to continuous improvements in LOAM's performance.
Based on Ceres Solver [17], Wang et al. [1] proposed Fast LiDAR Odometry and Mapping (F-LOAM), which improves the frame matching accuracy of LOAM by removing scan-to-scan pose estimation, retaining only scan-to-map pose optimization, and adopting a noniterative two-level distortion compensation method to reduce computational complexity and cost. According to the official evaluation criteria of KITTI dataset [18], although F-LOAM achieved the best accuracy in pose estimation, it lacks loop closure detection, leading to accumulated errors and reducing effectiveness in large-scale scenarios [3]. Shan [19] proposed Light-weight and Ground-Optimized LiDAR Odometry and Mapping (LeGO-LOAM), which incorporated loop closure detection and lightweight and ground optimization on feature extraction, by combining ICP and Euclidean distance to identify loop closure points. Compared to other methods, it achieved similar or better accuracy. However, the loop closure detection in LeGO-LOAM is unstable and may occasionally result in detection errors or missed identifications [13].
B. Optimization Methods for LiDAR SLAM
Many researchers have attempted to enhance the performance of SLAM systems by incorporating additional modules. Among them, loop closure detection is a critical module in SLAM. Over time, LiDAR odometry accumulates errors, leading to drift problems in long-term navigation and mapping. To address this issue in large-scale scenarios, integrating a loop closure detection module in the back-end can correct odometry drift [1]. Kim et al. [8] proposed a method using scan context global descriptor to reduce the dimensionality of point cloud data frames, which are stored in a 2-D matrix, the rows represent the distance of the divided ground regions (i.e., bins) from the LiDAR center, and the columns represent the angle of these regions relative to the x-direction. By using the global descriptor to calculate the similarity between current frame and others, loop closure detection results can be obtained (e.g., [13] and [20]).
However, existing 3-D loop closure detection methods often use local or global geometric descriptors while neglecting the intensity information in point clouds. To address this limitation, a global descriptor called intensity scan context (ISC) was proposed based on F-LOAM, which incorporates both geometry and intensity information [21]. Furthermore, to enhance the efficiency of loop closure detection, a two-stage hierarchical rerecognition method was introduced. Optimized-sc-f-loam [3] is also a loop closure detection method based on F-LOAM. Unlike [21], it employs a feature-point matching method instead of using the original LiDAR point cloud to calculate the pose transformation between the current frame and the closed-loop frame, resulting in reduced computation time. In the back-end, global optimization is based on GTSAM, and an adaptive distance threshold is utilized for more precise loop closure detection.
Integrating an IMU is another approach to enhance the performance of LiDAR SLAM system. The incorporation of IMU can significantly improve the accuracy and robustness of the LiDAR odometry and compensate motion distortions in LiDAR scans. With the rapid development of robotics technology, localization and mapping techniques are being applied to increasingly complex and dynamic scenarios [22]. The combination of LiDAR and IMU offers advantages such as high accuracy, fast speed, and immunity to environmental lighting conditions, which makes it widely employed [23].
At present, LiDAR and IMU fusion can be categorized into the loosely-coupled and the tightly-coupled. The former deal with two sensor information separately to infer their motion constraints which are fused later (e.g., [5], [19], [24], and [25]). The authors in [26] proposed combining IMU measurements with attitude estimates, which obtained from a LiDAR-based Gaussian particle filter and a prebuilt map. Generally speaking, the loosely-coupled fusion method is computationally effective and has good real-time performance. However, the motion constraints between LiDAR and IMU may lead to information loss [4], making it difficult to ensure accuracy in high-speed motion or degraded scenes.
Another approach is the tightly-coupled method that directly fuses LiDAR information and IMU data by joint optimization, which can be further divided into optimization-based [27], [28] and extended KF-based approaches [4], [29]. The authors in [30] proposed a graph optimization based tightly-coupled approach, which combined prior information from the LiDAR-IMU odometry and optimization method based on rotational constraints to further refine the odometry pose, it can obtain a globally consistent and robust mapping trajectory. However, constraints and batch optimization in constructing local map windows are time consuming and have poor real-time performance. Shan et al. proposed LIO-SAM [28], which is based on incremental smoothing [31] that introduced a global factor graph consisting of LiDAR odometry factors, IMU preintegration factors, GPS factors, and loop closure factors, and it achieved high-precision global consistency in mapping and motion estimation for the robot. The authors in [29] proposed a LiDAR-IMU fusion odometry framework with high computational efficiency and good robustness, which adopted the iterative extended Kalman Filter similar to [4], and used forward propagation to predict the state as well as backward propagation to correct motion distortion in LiDAR scanning, and proposed a new Kalman gain calculation formula to reduce the computational complexity. But the system lacked back-end optimization and only worked in small environment. The authors in [14] inherited FAST-LIO and introduced a new data structure called the incremental k-d tree (ikd-tree) [32], which supports incremental update (e.g., point insertion and deletion) as well as dynamic rebalancing, significantly reducing the amount of computation. This approach improves the accuracy and robustness of odometry and map generation. However, similar to [29], this method is also limited to small-scale environments.
In summary, 3-D LiDAR faces challenges in long-range environments with sparse structural features, leading to localization and mapping failures and inadequate construction of surrounding environment maps. The tightly-coupled of LiDAR and IMU incurs high computational costs, significant time consumption, and large memory usage, which making it difficult to ensure real-time and accurate operation of the system. Therefore, improving the precise localization and mapping of 3D point cloud maps in large-scale environments, both indoors and outdoors, is of great significance for applications such as mobile robotics and autonomous driving.
Methodology
The proposed LiDAR SLAM method consists of three parts: 1) LiDAR-IMU odometry; 2) loop closure detection; and 3) global optimization and mapping, as in Fig. 1, to enable real-time six-DOF state estimation of the robot and establish a globally consistent map. To address the asynchronization issue between the LiDAR and IMU, the LiDAR point cloud data is first preprocessed. Considering challenges as the large quantity of raw LiDAR points, slow data processing speed, and computational complexity, local neighbor that based feature extraction is given to compute curvature and extracted feature points, including planar points and edge points. Next, the motion model provided by the IMU is employed to predict the state of feature points, which is fused and optimized using iterative Kalman Filter to estimate the robot's pose and position during motion. Simultaneously, the IMU backward propagation is employed to mitigate the motion distortion of the LiDAR, to obtain more accurate and robust localization results. Subsequently, a loop closure detection method based on the HISC is proposed to reduce the localization drift caused by cumulative errors in the LiDAR-IMU odometry. Then, an adaptive distance threshold is introduced to reduce the probability of wrong loop closure detection and the similarity between the current frame and historical frames is computed based on proposed global descriptor, which is constructed by the distortion-corrected feature points. Finally, by adding odometry factors, loop closure factors, and GPS factors to the iSAM2 factor graph, a global optimization-based GTSAM is presented to correct the accumulated drift errors and generate globally consistent trajectories and maps.
A. LiDAR-Inertial Odometry
1) Feature Extraction
Due to the large number of points contained in a single scan of LiDAR, traditional point cloud matching methods, such as ICP and NDT, have low computational efficiency. Therefore, it is necessary to process the raw point cloud data
\begin{align*}
\sigma _{k}^{(m,n)}=\frac{1}{|S_{k}^{(m,n)}|}\sum_{\text{p}_{k}^{(m,j)}\in S_{k}^{(m,n)}} {||\text{p}_{k}^{(m,j)}-\text{p}_{k}^{(m,n)}||} \tag{1}
\end{align*}
Extract feature points (yellow indicates edge points, green indicates planar points). (a) KITTI dataset. (b) Self-collected dataset.
2) Tightly-Coupled LiDAR-IMU With IEKF
To address the issue of localization and mapping failures in scenes with insufficient structural features, this article inherits the FAST-LIO2 [14] by integrating IMU and LiDAR with IEKF. This approach aims to handle degraded and weak-textured scenes and improve system stability. It mainly includes forward propagation based on IMU measurements and iterative updating based on LiDAR scanning.
1) Forward Propagation Based on IMU
State definitions as follows:
\begin{align*}
x={{[{{p}^{T}},{{v}^{T}},{{R}^{T}},b_{\omega }^{T},b_{a}^{T},{{g}^{T}}]}^{T}} \tag{2}
\end{align*}
Assume the optimal state estimate after fusing the last (i.e.,
\begin{align*}
{{\hat{x}}_{i+1}}&={{\hat{x}}_{i}}\boxplus \left(\Delta t\text{f}\left({{{\hat{x}}}_{i}},{{\mu }_{i}},0 \right) \right);{{\hat{x}}_{0}}={{\bar{x}}_{k-1}} \tag{3}
\\
{{{\hat{\text{P}}}}_{i+1}}&={{{\text{F}}}_{{{{\tilde{x}}}_{i}}}}{{{\hat{\text{P}}}}_{i}}{{\text{F}}_{{{{\tilde{x}}}_{i}}}}^{T}+{{\text{F}}_{{{w}_{i}}}}{{Q}_{i}}{{\text{F}}_{{{w}_{i}}}}^{T};{{{\hat{\text{P}}}}_{0}}={{{\bar{\text{P}}}}_{k-1}} \tag{4}
\end{align*}
\begin{align*}
{{\text{F}}_{{{{\tilde{x}}}_{i}}}}&=\frac{\partial \left({{x}_{i+1}}\boxminus {{{\hat{x}}}_{i+1}} \right)}{\partial {{{\tilde{x}}}_{i}}}|{{\tilde{x}}_{i}}=0,{{w}_{i}}=0 \tag{5}\\
{{\text{F}}_{{{{\tilde{x}}}_{i}}}}&=\left[ \begin{matrix}\text{I} & \text{I}\Delta t & 0 & 0 & 0 & 0 \\
0 & \text{I} & -\hat{R}{{ \lfloor {{a}_{m}}-{{b}_{a}} \rfloor }_{\wedge }} & 0 & -\hat{R}\Delta t & \text{I}\Delta t \\
0 & 0 & \text{Exp}(-({{\omega }_{m}}-{{b}_{\omega }})\Delta t) & \text{-I}\Delta t & 0 & 0 \\
0 & 0 & 0 & \text{I} & 0 & 0 \\
0 & 0 & 0 & 0 & \text{I} & 0 \\
0 & 0 & 0 & 0 & 0 & \text{I} \\
\end{matrix} \right] \tag{6}\\
{{\text{F}}_{{{w}_{i}}}}&=\frac{\partial ({{x}_{i+1}}\boxminus {{{\hat{x}}}_{i+1}})}{\partial {{w}_{i}}}|{{\tilde{x}}_{i}}=0,{{w}_{i}}=0 \tag{7}\\
{{\text{F}}_{{{w}_{i}}}}&=\left[ \begin{matrix}0 & 0 & 0 & 0 \\
-\text{I}\Delta t & 0 & 0 & 0 \\
0 & -\text{I}\Delta t & 0 & 0 \\
0 & 0 & -\text{I}\Delta t & 0 \\
0 & 0 & 0 & -\text{I}\Delta t \\
0 & 0 & 0 & 0 \\
\end{matrix} \right]. \tag{8}
\end{align*}
The forward propagation continues until reaching the end time of the current
2) Residual Computation
Assume the estimate of state
\begin{align*}
{\hat{\text{p}}}{{_{j}^{\text{k}}}^{W}}={\hat{\text{T}}}{{_{{{I}_{k}}}^{\text{k}}}^{W}}{\hat{\text{T}}}{{_{{{L}_{k}}}^{\text{k}}}^{I}}{{\text{p}}_{j}}^{L} \tag{9}
\end{align*}
For each LiDAR feature point, we assume that the nearest plane or edge defined by its neighboring feature points in the map represents its true location. The residual is defined as the distance between the estimated global coordinates
\begin{align*}
{{\text{z}}_{j}}^{\text{k}}&={{\mu }_{j}}^{T}({\hat{\text{p}}}{{_{j}^{\text{k}}}^{W}}-{{\text{q}}_{j}}^{W}) \tag{10}
\\
{{\text{z}}_{j}}^{\text{k}}&={{ \lfloor {{\mu }_{j}} \rfloor }_{\wedge }}({\hat{\text{p}}}{{_{j}^{\text{k}}}^{W}}-{{\text{q}}_{j}}^{W}) \tag{11}
\end{align*}
Substituting (9) into (10) and (11) yields the measurement model
\begin{align*}
0&=\mathrm{h}_{j}\left(x_{k}, n_{j}{ }^{L}\right) \\
& =\mathrm{h}_{j}\left(\hat{x}_{k}{ }^{\mathrm{k}} \boxplus \tilde{x}_{k}{ }^{\mathrm{k}}, n_{j}{ }^{L}\right) \\
& \simeq \mathrm{h}_{j}\left(\hat{x}_{k}{ }^{\mathrm{k}}, 0\right)+\mathrm{H}_{j}{ }^{\mathrm{k}} \tilde{x}_{k}{ }^{\mathrm{k}}+\mathrm{v}_{j} \\
& ={\mathrm{z}_{j}}^{\mathrm{k}}+\mathrm{H}_{j}{ }^{\mathrm{k}} \tilde{x}_{k}{ }^{\mathrm{k}}+\mathrm{v}_{j} \tag{12}
\end{align*}
3) State Iterative Update
The propagated state
\begin{align*}
x_{k} \boxminus \hat{x}_{k}&=\left(\hat{x}_{k}{ }^{\mathrm{k}} \boxplus \tilde{x}_{k}{ }^{\mathrm{k}}\right) \boxminus \hat{x}_{k}\\
&=\hat{x}_{k}{ }^{\mathrm{k}} \boxminus \hat{x}_{k}+\mathrm{J}^{\mathrm{k}} \tilde{x}_{k}{ }^{\mathrm{k}} \sim \mathcal {N}\left(0, \widehat{\mathrm{P}}_{k}\right) \tag{13}
\end{align*}
Besides the prior distribution, the state distribution of (14) is computed based on the measurement model derived from (12).
\begin{align*}
-\mathrm{v}_{j}=\mathrm{z}_{j}{ }^{\mathrm{k}}+\mathrm{H}_{j}{ }^{\mathrm{k}} \tilde{x}_{k}{ }^{\mathrm{k}} \sim \mathcal {N}\left(0, \mathrm{R}_{j}\right). \tag{14}
\end{align*}
Combining the prior distribution in (13) and the measurement model from (14) yields the posteriori distribution of the state
\begin{align*}
\min _{\tilde{x}_{k}^{\mathrm{k}}}\left(\left\Vert x_{k} \boxminus \hat{x}_{k}\right\Vert _{\hat{\mathrm{P}}_{k}}^{2}+\sum _{j=1}^{m}\left\Vert \mathrm{z}_{j}{ }^{\mathrm{k}}+\mathrm{H}_{j}{ }^{\mathrm{k}} \tilde{x}_{k}{ }^{\mathrm{k}}\right\Vert _{\mathrm{R}_{j}}^{2}\right). \tag{15}
\end{align*}
This MAP problem can be solved by IEKF method as follows:
\begin{align*}
\text{K}&={{({{\text{H}}^{T}}{{\text{R}}^{-1}}\text{H+}{{\text{P}}^{-1}})}^{-1}}{{\text{H}}^{T}}{{\text{R}}^{-1}} \tag{16}
\\
{{\hat{x}}_{k}}{ }^{\text{k+1}}&={{\hat{x}}_{k}}{ }^{\text{k}}\boxplus (-\text{K}{{\text{z}}_{k}}^{\text{k}}-(\text{I}-\text{KH}){{({{\text{J}}^{\text{k}}})}^{-1}}({{\hat{x}}_{k}}{ }^{\text{k}}\boxminus {{\hat{x}}_{k}})) \tag{17}
\end{align*}
The above process repeats until convergence, then, the final optimal state and covariance estimates after convergence are given as follows:
\begin{align*}
{{\bar{x}}_{k}}&={{\hat{x}}_{k}}{ }^{\text{k+1}} \tag{18}
\\
{{{\bar{\text{P}}}}_{k}}&=(\text{I-KH})\text{P}. \tag{19}
\end{align*}
Subsequently, the optimal state estimate
B. Loop Closure Detection
To address the issue of accumulated errors in the LiDAR-IMU system during long-term operation, this article introduced a loop closure detection method to reduce positioning drift. The stability is improved by correcting the trajectory and optimizing the map. The Scan Context [8] and Intensity Scan Context [21] algorithms reduced computational complexity by dimensionality reduction, mapping the 3-D point cloud data to a 2-D image. The global descriptor, as shown in Fig. 3(b), is calculated to measure the similarity between point cloud frames and determine if a loop closure is formed. However, it utilized the original point cloud from the LiDAR for constructing the global descriptor, resulting in high computational cost and a high rate of wrong loop closures. By using feature points after front-end IMU distortion corrected, we introduced a global descriptor to improve accuracy in measuring the similarity between point clouds. To better represent the characteristics of the point cloud, a combined consideration of intensity and height information is proposed, known as the HISC global descriptor. To address the limitation of fixed distance threshold, which may result in missing loop closures or false positives, an adaptive distance threshold is proposed to replace it to reduce the probability of false loop closure detection.
Using the top view of a point cloud from a 3-D scan (a), we partition ground areas into bins, that are divided according to azimuth (from 0 to
1) Global Descriptor of HISC
This article introduced a novel global descriptor, which constructed by height [8] and intensity [21] information. The height information provides an effective summary of the vertical structure of the surrounding buildings, eliminating the need for complex calculations to analyze the point cloud characteristics. The maximum height indicates the visible portion of the surrounding structure. This self-centered visibility allows for the analysis of place characteristics [8], making it useful for loop closure detection and verifying if the system passes through the same location. Moreover, objects exhibit different intensity values, and intensity information serves as a representation of the reflectance of surfaces in the environment. For instance, highly reflective materials like metal plates have higher intensity values, while concrete surfaces have lower values. Therefore, intensity information can effectively serve as a feature to assist laser odometry in location identification.
Assuming the LiDAR scan contains
\begin{align*}
\Omega (i, j)=\max _{\mathrm{p}_{k} \in S_{i, j}}\left(\eta _{k}+z_{k}\right). \tag{20}
\end{align*}
According to the above process, HISC global descriptor is finally expressed as the matrix
\begin{align*}
I=\left(a_{i j}\right) \in \mathbb {R}^{N_{r} \times N_{s}}, a_{i j}=\Omega (i, j). \tag{21}
\end{align*}
The matrix contains geometric information and intensity information of the environment. The Similarity Score between Scan Contexts and Two-phase Search Algorithm method [8] is employed to calculate the similarity between point cloud pairs according to
2) Adaptive Distance Threshold
The original Scan Context method uses a fixed distance threshold to detect and exclude point cloud pairs that have high similarity but are far apart. Setting a small fixed distance threshold may result in undetected loops, while setting a large threshold may detect more false loops. To address this problem, an adaptive distance threshold [3] is introduced to replace it, which effectively detects closed loops and reduces the occurrence of false loop closures.
When a loop closure was detected, the matrix in (21) is computed to obtain a pair of similarity loop-closing points, denoted as
\begin{align*}
\tilde{T}_{k}^{\text{loop}}={{T}_{\text{loop}}}^{{{W}^{-1}}}{{T}_{k}}^{W}. \tag{22}
\end{align*}
The distance
\begin{align*}
d=\sqrt{\tilde{T}_{k}^{\text{loop}}.{{x}^{2}}+\tilde{T}_{k}^{\text{loop}}.{{y}^{2}}+\tilde{T}_{k}^{\text{loop}}.{{z}^{2}}} \tag{23}
\end{align*}
Then, a linear function is constructed using the number of keyframes
\begin{align*}
{{d}_{\text{thre}}}=60+k/100. \tag{24}
\end{align*}
If
C. Global Optimization and Mapping
In order to address the problem of system error accumulation over time and loop closure detection, we proposed back-end optimization to correct system errors. While local optimization only considered the relationship between consecutive frames, it is prone to noise and motion deviations. Therefore, global optimization is introduced, which utilizes global observation data and pose to correct errors. Scan-to-map constraints are constructed for all observed key frames using the feature point matching method [5], forming a large-scale nonlinear least squares problem. The LM algorithm is employed to minimize the error functions, eliminate errors, and correct trajectory drift. In addition, the GTSAM library based on factor graph is employed for global optimization. The iSAM2 algorithm [31] is utilized to incorporate odometry factors, loop closure detection factors, and GPS factors. By using joint optimization, errors are effectively eliminated, enabling the determination of an optimal trajectory and pose for each key frame. This leads to more accurate map and positioning results, enhancing the precision of map construction.
1) Pose Estimation Based on Feature Matching
The feature points
\begin{align*}
\mathrm{p}_{k}{ }^{W}=R_{\text{loop }}^{W} *\left(R_{L}^{\text{loop }} * \mathrm{p}_{k}{ }^{L}+t_{L}^{\text{loop }}\right)+t_{\text{loop }}^{W} \tag{25}
\end{align*}
Then, a feature point matching-based method is used to match the planar points and edge points in the current point cloud frame with the submap. Point-to-line and point-to-plane positional constraints are then constructed as (26) and (27), respectively.
\begin{align*}
d_{\varepsilon _{k}}=\frac{\left|\left(\mathrm{p}_{k}{ }^{W}-\mathrm{p}_{l}{ }^{W}\right) \times \left(\mathrm{p}_{k}{ }^{W}-\mathrm{p}_{m}{ }^{W}\right)\right|}{\left|\mathrm{p}_{l}{ }^{W}-\mathrm{p}_{m}{ }^{W}\right|} \tag{26}
\end{align*}
\begin{align*}
d_{S_{k}}=\frac{\left|\begin{array}{c}\left(\mathrm{p}_{k}{ }^{W}-\mathrm{p}_{l}{ }^{W}\right) \\
\left(\mathrm{p}_{l}{ }^{W}-\mathrm{p}_{m}{ }^{W}\right) \times \left(\mathrm{p}_{l}{ }^{W}-\mathrm{p}_{j}{ }^{W}\right) \end{array}\right|}{\left|\left(\mathrm{p}_{l}{ }^{W}-\mathrm{p}_{m}{ }^{W}\right) \times \left(\mathrm{p}_{l}{ }^{W}-\mathrm{p}_{j}{ }^{W}\right)\right|} \tag{27}
\end{align*}
Then, using (26) and (27), nonlinear equations are constructed, and the L-M nonlinear optimization method is used to iteratively solve for the pose transformation matrix
\begin{align*}
\min \left\lbrace \sum _{i=1}^{\mathrm{m}} d_{\varepsilon }\left(\mathrm{p}_{i}(X)\right)+\sum _{j=1}^{\mathrm{n}} d_{\mathcal {S}}\left(\mathrm{p}_{j}(X)\right)\right\rbrace \tag{28}
\end{align*}
Finally, the pose transformation matrix
2) Pose Graph Construction
This module takes the transformation matrix
The constructed factor graph is then subjected to global optimization, resulting in optimized poses
Experiment
To evaluate the performance of the proposed algorithm, we conducted tests on both public dataset and self-collected dataset. The KITTI dataset sequences 05, 06, 07, and 09 are used to validate the localization accuracy of the proposed algorithm, and compared with A-LOAM [5] and FAST-LIO2 [14]. In order to verify the time efficiency of the proposed loop closure detection algorithm, that compared with SC-A-LOAM on KITTI sequences. In order to demonstrate the performance of the proposed global descriptor, the proposed method, SC [8] and ISC [21] methods were tested by comparing the RMSE of the generated trajectory on KITTI dataset. For further evaluation, self-collected dataset by VLP-16 LiDAR and N100 IMU are named Park dataset and Street dataset. A comparison of trajectory results was performed against SC-A-LOAM [20], Optimized-sc-f-loam [3], and FAST-LIO2 [14]. Then, comparisons were made with SC-A-LOAM, Optimized-sc-f-loam, LIO-SAM,FAST-LIO2 on the KITTI sequence 07, the outdoor Park dataset, and indoor dataset to validate the effectiveness of the proposed algorithm in mitigating mapping drift. Finally, the CPU usage and memory usage of five different algorithms running three different data sets are analyzed. The experimental data in this article is the best result of multiple measurements.
A. Hardware Platform
The experimental hardware platform in this article consisted of a laptop with an Intel i5-12500H, 16 GB RAM, and an NVIDIA GeForce RTX 3050 graphics card. The operating system used was Ubuntu 20.04. The software framework employed was Robot Operating System, which version is Noetic, and the primary programming language used was C++. The sensors utilized in this study included the VLP-16 mechanical LiDAR and the WHEELTEC N100 IMU.
For the experimental evaluation, the publicly available KITTI dataset, which contains data from outdoor autonomous driving scenarios, was utilized. The dataset includes 22 different outdoor environment sequences, and for 11 of these sequences (00-10), ground-truth trajectories obtained from GPS are provided. Each data sequence consists of grayscale images, color images, and LiDAR point cloud data. In addition, a campus environment data acquisition platform was used. Fig. 4(a) illustrates the configuration of this data collection platform, and Fig. 4(b) shows the environment.
Self-data collection in the campus environment. (a) Shows the self-data collection platform consisting of LiDAR and IMU. (b) Shows the self-data collection environment and the experimental operating environment.
B. Localization Accuracy
1) KITTI Dataset
In this article, we use the EVO evaluation tool to assess the localization accuracy of the proposed algorithm. Specifically, we compared the performance of the proposed algorithm with the ground-truth on KITTI dataset sequences 05, 06, 07, and 09. These sequences contain loop closures, making them suitable for evaluating the effectiveness of the improved loop closure detection algorithm proposed in this article. The evaluation was based on trajectory visualization, Absolute Trajectory Error (ATE), and Relative Pose Error (RPE).
Fig. 5 illustrates the visual comparison results of the proposed algorithm, A-LOAM, and FAST-LIO2 running on KITTI dataset sequences 05, 06, 07, and 09, along with the ground truth trajectories.
Visualization of trajectory comparison between A-LOAM, FAST-LIO, and our method on KITTI dataset sequences. (a), (b), (c), and (d) correspond to sequences 05, 06, 07, 09, respectively. (a) 05 sequence. (b) 06 sequence. (c) 07 sequence. (d) 09 sequence.
Table I presents the ATE statistics comparing the estimated trajectories obtained by the three algorithms on the sequences of the KITTI dataset with the ground truth trajectories. From the comparison of the statistical values in the table, it can be observed that the proposed algorithm has smaller error values in sequences 05 and 07. The proposed algorithm also achieves the lowest average error (mean) and root mean square error (RMSE) across all four sequences. Compared to the A-LOAM algorithm, the RMSE of the proposed algorithm are reduced by 65.1%, 73.6%, 78.3%, and 61.3% in the four sequences, respectively. Similarly, compared to the FAST-LIO2 algorithm, the RMSE is reduced by 59.4%, 1.16%, 36.4%, and 12.0%. These results demonstrate that the proposed algorithm achieves smaller trajectory estimation errors compared to the ground truth trajectories in the KITTI sequences, indicating higher accuracy of the algorithm.
The RPE is used to compare pose increments and describe the accuracy of the pose difference between two frames relative to the real pose under a fixed time difference. The error values of the three algorithms on the four sequences are shown in Table II. It can be observed that the algorithm proposed in this article does not achieve the minimum error in each case. However, for sequences 05, 06, and 09, the RMSE statistics of the proposed algorithm are the lowest among the three algorithms. In addition, the proposed algorithm also exhibits the majority of the lowest errors in other statistical values. Overall, the algorithm proposed in this article demonstrates superior performance.
2) Self-Collected Dataset
The experimental platform shown in Fig. 4 was used to collect dataset within the campus of Northwest A&F University, including the Park dataset and Street dataset, which contain LiDAR and IMU information. The satellite map is shown in Fig. 6, and it was used for evaluating the algorithm. The algorithm proposed was compared with three other algorithms: SC-A-LOAM [20] and Optimized-sc-f-loam [3], these only use LiDAR data, and FAST-LIO2 [14] which utilize both LiDAR and IMU data. The trajectories obtained by running different algorithms are shown in Fig. 7, which include long-distance loop closures. Fig. 8 displays the trajectories in the x, y, and z directions. Due to different numbers of keyframes extracted, the length of the displayed trajectories are different.
Satellite map of campus dataset. (a) The dataset consists of a long-distance navigation around the campus garden. (b) The dataset depicts walking around the teaching building. (a) Park dataset. (b) Street dataset.
Visual comparison of the trajectory projections on the x-y plane obtained by running four different algorithms on the campus environment data set. (a) Park dataset. (b) Street dataset.
Trajectories output in the x-y-z directions by running four different algorithms on the campus environment dataset. (a) Park dataset. (b) Street dataset.
As shown in Fig. 7, it can be observed that the algorithm proposed in [3] exhibited significant drift in the trajectories that obtained from both environments, indicating poor robustness of the algorithm. Although the method proposed in [20] obtained a complete trajectory in the Street dataset, suffered from severe drift and incomplete trajectory in Park dataset. The algorithm in [14] produced a complete trajectory but fails to form valid loop closures. As seen in Figs. 7 and 8, our method can generate complete trajectories and effectively detect loop closures, and the output trajectory in the z direction is smoother, indicating lower elevation errors.
C. Time Consumption of Loop Closure Detection
This module is compared with the A-LOAM algorithm with Scan Context (SC-A-LOAM) to evaluate the computational efficiency of the loop closure detection process using the publicly available KITTI dataset (05, 06, 07, 09), and the number of keyframes extracted by the algorithm is also recorded, as shown in Table III. According to the data, our method achieved the best computational time for four sequences. For each sequence, compared to the SC-A-LOAM algorithm, the time consumption of loop closure detection of the proposed algorithm is reduced by 64.4%, 59.4%, 78.3%, and 74.3%, respectively, achieving improved algorithm efficiency while ensuring algorithm accuracy.
D. Performance Analysis of Global Descriptor
A new method for computing global descriptors was proposed. We used four sequences (05, 06, 07, 09) from the KITTI dataset to validate its effectiveness. The RMSE of the ATE and RPE between the trajectories obtained using three different global descriptor computation methods and the ground truth was compared using the EVO evaluation tool, and the results are shown in Table IV. It can be observed that for the 05 sequence, the ATE error of our is slightly higher than ISC. However, for the sequences 06, 07, and 09, our method achieved the lowest error values. Compared to the SC and ISC methods, our method shows an average improvement of approximately 16.18% and 8.97% for ATE, 0.23% and 0.21% for RPE, respectively.
E. Analysis of Mapping
In the experiment of outdoor 3-D point cloud map reconstruction, we used the KITTI sequence 07 and self-collected dataset for analysis. Four LiDAR SLAM methods (SC-A-LOAM, Optimized sc-f-loam, LIO-SAM, FAST-LIO2) are compared to validate the mapping effectiveness of the proposed method in this article. The results are shown in Fig. 9–12.
Visualization of the global map reconstruction for KITTI sequence 07. (a) for A-LOAM algorithm with Scan Context loop closure detection, (b) for F-LOAM algorithm with loop closure detection, (c) shows the LIO-SAM algorithm, (d) shows the mapping results of the FAST-LIO2 algorithm, (e) for the proposed method in this article. (a) SC-A-LOAM. (b) Optimized-sc-f-loam. (c) LIO-SAM. (d) FAST-LIO2. (e) Our.
Visualization of the local map reconstruction for KITTI sequence 07 (showing the map reconstruction on straight roads and at corners). (a) SC-A-LOAM. (b) Optimized-sc-f-loam. (c) LIO-SAM. (d) FAST-LIO2. (e) Our.
Global Map of Park dataset. (a) SC-A-LOAM. (b) Optimized-sc-f-loam. (c) LIO-SAM. (d) FAST-LIO2. (e) Our.
Loop closures detection of Park dataset. (a) and (b) show severe trajectory drift and the absence of proper loop closures. (c) The closed loop can be detected. (d) Cannot form a closed loop. (e) The proposed method performs loop closure correction. (a) SC-A-LOAM. (b) Optimized-sc-f-loam. (c) LIO-SAM. (d) FAST-LIO2. (e) Our.
According to the global map in Fig. 9, which can be observed that the lane markings and directional contours are clearly visible in the mapping results of all three algorithms. However, in Fig. 9(a), the point cloud is dense and the generated map exhibits cumulative drift; In Fig. 9(b), the mapping result is blurry and obstacles on the lane cannot be clearly displayed. The environmental map established in Fig. 9(c) has blurred boundaries and cannot clear obstacles on the displayed road. The point cloud in Fig. 9(d) is dense, and the global map is blurry, but the local map Fig. 10(d) is clear and has clear boundaries. However, the map established in Fig. 9(e) of the algorithm in this article eliminates most of the “ghosts” generated by cumulative drift. Fig. 10 shows that the mapping results of our algorithm was neat on straight roads and corners, with no significant drift. It clearly displays obstacles such as cars on the road, indicating improved robustness of the algorithm.
Meanwhile, in order to verify the accuracy of the proposed algorithm on the outdoor self-built dataset, the global mapping effect of the above five algorithms running the Park dataset is shown in Fig. 11. It can be seen from the figure that the SC-A-LOAM algorithm and the Optimized sc-f-loam algorithm are fuzzy, with large cumulative errors and serious drift. FAST-LIO2 algorithm has clear map boundaries and obvious color contrast, but the point cloud is dense and cannot effectively detect the closed loop. Then, the map established by the LIO-SAM algorithm and this algorithm is clear and can effectively detect the closed loop (Fig. 12) and make trajectory correction. However, the previous map established by LIO-SAM algorithm will disappear when the running time is too long. The map established by the algorithm in this article eliminates most of the accumulated drift and has clear boundaries, which will have high robustness and positioning accuracy in actual scenes.
For the experiment of 3-D point cloud map reconstruction in indoor scenes, this article collects the datasets by VLP-16 and N100 inertial navigation in the indoor environment of the library. By comparing the four kinds of LiDAR SLAM method (SC-A-LOAM, Optimized-sc-f-loam, LIO-SAM, FAST-LIO2) to verify the effect of the proposed method.
Fig. 13 shows the indoor environment for data collection. Fig. 14 shows the mapping results obtained by five different algorithms. It can be seen from the figure that (a) the mapping is fuzzy and the outline of the building can be generally seen but not clearly. (b) The mapping effect of the algorithm cannot see obstacles and clear routes in the environment. (c) The contour of the algorithm is clear, and the indoor environment structure can be clearly seen. (d) The algorithm creates large shadows, dense point clouds, and cannot clearly distinguish indoor objects. (e) The map built by the algorithm in this article is relatively clear, there is no double shadow, and the environment structure can be seen, but the point cloud is relatively sparse, and the map is not dense enough. To sum up, LIO-SAM and the algorithm in this article have the best effect of indoor environment mapping.
Indoor library global map. (a) SC-A-LOAM. (b) Optimized-sc-f-loam. (c) LIO-SAM. (d) FAST-LIO2. (e) Our.
F. CPU and Memory Usage Analysis
This section mainly analyzes the CPU and memory usage of different algorithms running different datasets by comparing five different algorithms running three different datasets (KITTI 07 sequence, outdoor park, and indoor library). The results are shown in Fig. 15, which show the maximum CPU and memory usage when running the dataset.
CPU and memory usage when running three dataset using five different algorithms. (a) KITTI 07 sequence, (b) self-collected outdoor park datasets, (c) self-collected indoor library datasets.
As can be seen from the data in the figure, for the KITTI dataset 07 sequence, the CPU usage was 82%, 48.0%, 71.7%, 100%, 59.8%, and the memory usage was 42.8%, 36.8%, 37.9%, 99.2%, 37.5%, respectively. Optimized-sc-f-loam is seemed low with poor accuracy, followed by the algorithm in this article. For the self-collected outdoor dataset, the CPU usage is 100%, 99.0%, 60.6%, 96.0%, 46.5%, and the memory usage is 45.0%, 54.7%, 40.9%, 88.9%, 63.3%, respectively. The lowest CPU usage is the algorithm in this article, with a little memory usage increasing. For the self-collected indoor dataset, the CPU usage was 98.0%, 49.0%, 48.0%, 53.5%, 39.8%, and the memory usage was 28.9%, 28.4%, 29.9%, 46.7%, 30.9%, respectively. The CPU usage was the lowest in this algorithm with a little memory usage increasing. According to the above analysis, it can be seen that the CPU usage of the proposed algorithm is small in different datasets, but for some outdoor datasets, our algorithm will produce a little memory usage increasing due to accurate mapping in complex environment.
Discussion
In this article, we proposed a LiDAR-IMU tightly-coupled SLAM method with IEKF and loop closure detection to address the problems of decreased localization and mapping accuracy caused by the accumulation of errors in a wide range of indoor and outdoor scenes. A loop closure detection method based on HISC global descriptor and a trajectory global optimization method based on GTSAM are proposed. The trajectory accuracy was measured by comparing with the ground-truth value of the public KITTI dataset, which show that in most cases, the proposed algorithm has lower trajectory errors and is close to the ground truth, as well as achieve more accurate pose estimation and improve localization accuracy. Then, the Global Descriptor experimental results show that the characteristics of points can be displayed more comprehensively by combining the intensity information and the height information, which reducing the probability of error loop closure detection and improving the accuracy. At the same time, the backend global optimization reduces the drift of the mapping trajectory. When constructing the global descriptor, we set a threshold for dividing the point cloud into grids. From the experimental results of both public and self-built datasets, it can be seen that the consistency of intensity information in the same subspace is basically ensured. However, if in different lighting conditions, reflective surfaces, or occlusions, a smaller threshold may need to be set to ensure strength consistency, which may lead to a slight increasing in time consumption. Time comparison with SC-A-LOAM on KITTI dataset indicates that the positional constraints constructed on extracted feature points can address high point cloud volume, reduce estimation cost, and shorten system runtime. Through the introduction of loop closure detection and global optimization, the problems of motion trajectory drift and map overlap are solved.
Furthermore, to verify the generalization ability of the algorithm, we conducted experiments on the Velodyne dataset of FAST-LIO2. As ground truth was not provided, we compared the trajectory and mapping result of the benchmark framework and the improved network on this dataset, as shown in Figs. 16–17. The trajectory results show that our algorithm coincides completely with the FAST-LIO2 trajectory. According to the mapping results, although the point cloud density of our drawing is not as dense as that of FAST-LIO2, the result of mapping is clear with less memory consumption and time consumption. Overall, the proposed algorithm can quickly, accurately and clearly reconstruct 3-D scenes in real time in the lowest CPU usage during runtime, with a little memory usage increasing.
Trajectory results of FAST-LIO2 and Our algorithms on the publicly available Velodyne dataset are shown in the figure. They are 2-D visualization results, x-y-z three-axis visualization results, and RPY visualization results.
Figure shows FAST-LIO2 and our algorithm for global and local mapping. (a) FAST-LIO2. (b) Our.
Based on the above experimental results, the SC-A-LOAM algorithm and the Optimized sc-f-loam algorithm have poor trajectory accuracy and mapping effect on the outdoor park dataset, with severe trajectory drift and blurred mapping; the FAST-LIO2 algorithm cannot detect and form a closed loop when running on a dataset with loopback, and occupies a large amount of CPU and memory; The proposed algorithm and LIO-SAM have high mapping accuracy in indoor and outdoor environments, good real-time performance, can accurately identify and detect loops, with low CPU usage during runtime. The experiments have verified that our proposed algorithm has high accuracy of positioning map construction, which can be applied to urban roads, forest parks, and spacious indoor environments, and can build environmental maps in real time with low CPU usage and a little memory increasing.
Conclusion
In order to solve the problem of accuracy degradation caused by cumulative errors in a wide range of environments and the failure of localization map construction caused by LiDAR SLAM degradation in scenes with insufficient structural features, this article-based FAST-LIO2 proposed a LiDAR-IMU tightly-coupled SLAM method with IEKF and loop closure detection. The proposed method has been thoroughly evaluated in a variety of environments on both public datasets and data collected on self-built data acquisition platforms. The results show that compared with A-LOAM, FAST-LIO2, and LIO-SAM, the proposed method can achieve similar or better accuracy, and the CPU usage is low, and good real-time performance of the algorithm. Future work will continue to optimize back-end mapping and reduce memory usage.