In recent years, it has become evident that the simultaneous localization and mapping (SLAM) problem can be solved efficiently by exploiting the sparseness of the information matrix with techniques from sparse graphs and sparse linear algebra (see, e.g., [1], [2], [3], [4], [5]). However, most of the methods based on sparse representations have focused on building a single large-scale map, resulting in the need to update a large map whenever a new observation is made.

Alternatively, local submap joining [6], [7] provides an efficient way to build large-scale maps. In local submap joining, a sequence of small local submaps is built (e.g., using conventional extended Kalman filter (EKF) SLAM [8]) and then combined into a large-scale global map. During map joining [6], the state vector of the local submap is first transferred into the global coordinate frame. Common features present in both the local and global maps are identified, and an EKF is used to enforce identity constraints to obtain the global map. The resulting map covariance matrix is fully correlated, so the map fusion process is computationally demanding. Overall computational savings are nevertheless achieved because the frequency of global map updates is reduced.

This paper demonstrates that local submap joining can be achieved through the use of a sparse information filter. The proposed map joining filter, sparse local submap joining filter (SLSJF), combines the advantages of the local submap joining algorithms and the sparse representation of SLAM to substantially reduce the computational cost of the global map construction.

The paper is organized as follows. Section II presents the overall structure of the SLSJF and demonstrates that the associated information matrix is exactly sparse. The SLSJF algorithm is described in detail in Section III. Section IV provides simulation and experiment results. Section V discusses some properties of the SLSJF and some related work. Section VI concludes the paper.

## II. Overall Structure of SLSJF

This section presents the overall structure of the SLSJF and explains why it results in a sparse representation.

### A. Input and Output of SLSJF

The input to the SLSJF is a sequence of local submaps constructed by some SLAM algorithm. Local maps^{1} are denoted by
$$(\hat{X}^L, P^L)\eqno{\hbox{(1)}}$$where *X̂*^{L} (the superscript “*L*” stands for the local map) is an estimate of the state vector
$$\eqalignno{X^L &= (X_r^L, X_1^L,\ldots, X_n^L)\cr &=(x_r^L, y_r^L,\phi _r^L, x_1^L, y_1^L, \ldots, x_n^L, y_n^L)&\hbox{(2)}}$$and *P*^{L} is the associated covariance matrix. The state vector *X*^{L} contains the robot final pose *X*_{r}^{L} (the subscript “*r*” stands for the robot) and all the local feature positions *X*_{1}^{L},…,*X*_{n}^{L}, as typically generated by conventional EKF SLAM. The coordinate frame of a local map is defined by the robot pose at the moment the building of the local map is started, i.e., the robot starts at the coordinate origin of the local map.

It is assumed that the robot starts to build local map *k*+1 as soon as it finishes local map *k*. Therefore, the robot end pose of local map *k* (defined as the global position of the last robot pose when building local map *k*) is the same as the robot start pose of local map *k*+1 (see Fig. 1).

The output of SLSJF is a global map. The global map state vector contains all the feature positions and all the robot end poses of the local maps (see Fig. 1).
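As a concrete illustration of this input format, a local map can be represented as a small container holding the state estimate of (2) together with its covariance. This is only a hypothetical sketch; the class and field names below are not from the paper:

```python
from dataclasses import dataclass
import numpy as np

# Hypothetical container for one SLSJF input: a local map is a state
# estimate (robot final pose + n feature positions) with its covariance.
@dataclass
class LocalMap:
    X_hat: np.ndarray   # (3 + 2n,): (x_r, y_r, phi_r, x_1, y_1, ..., x_n, y_n)
    P: np.ndarray       # (3 + 2n, 3 + 2n) covariance matrix

    @property
    def n_features(self) -> int:
        # Everything after the 3-D robot pose is (x, y) feature pairs.
        return (self.X_hat.size - 3) // 2

# A toy local map with 4 features:
m = LocalMap(np.zeros(3 + 2 * 4), np.eye(3 + 2 * 4))
print(m.n_features)  # 4
```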

### B. Why Can Local Map Joining Have a Sparse Representation?

The reason why SLSJF can be developed is that the information contained in each local map is the relative position information about some “nearby objects”—the features and the robot start/end poses involved in the local map.

By including all the objects (all the features and all the robot start/end poses) in the global map state vector, the local map joining problem becomes a large-scale estimation problem with only “local” information (similar to smoothing and mapping (SAM) [2] and full SLAM [5]). When an extended information filter (EIF) is used to solve the estimation problem, a nonzero off-diagonal element of the information matrix (a “link” between the two related objects) occurs only when the two objects are within the same local map.^{2} Since the size of each local map is limited, any object will only have links with its “nearby objects,” no matter how many (overlapping) local maps are fused (see Fig. 1). This results in an exactly sparse information matrix.

Since all the objects involved in the local maps are included in the global state vector, no marginalization is required in the map joining process, and thus, the information matrix will stay exactly sparse all the time. Because most of the robot poses are marginalized out during the local map building process, the dimension of the global state vector is much less than that of SAM [2] and full SLAM [5].
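The sparsity argument above can be illustrated with a toy example: if each local map contributes an information block only over its own objects, then objects that never share a local map stay unlinked. The object counts, 1-D “objects,” and random information blocks below are purely illustrative:

```python
import numpy as np

# Toy sketch: each "local map" links a small set of global objects; fusing
# it adds a dense block only over those objects, so the global information
# matrix stays exactly sparse (hypothetical sizes, 1-D objects for brevity).
rng = np.random.default_rng(0)
n_objects = 8
local_maps = [(0, 1, 2), (2, 3, 4), (4, 5, 6, 7)]   # object indices per map

I_global = np.zeros((n_objects, n_objects))
for objs in local_maps:
    idx = np.array(objs)
    A = rng.standard_normal((len(idx), len(idx)))
    # A local map contributes an SPD information block only among its objects.
    I_global[np.ix_(idx, idx)] += A @ A.T + np.eye(len(idx))

# Objects never sharing a local map (e.g., 0 and 5) remain unlinked.
print(I_global[0, 5] == 0.0)   # no link: never in the same local map
print(I_global[2, 3] != 0.0)   # link: both appear in the second local map
```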

### C. Overall Structure of SLSJF

SLSJF fuses the local maps sequentially to build a global map, in a manner similar to [6] and [7], using the structure presented in Algorithm 1.

**Algorithm 1** Overall structure of SLSJF

**REQUIRE:** A sequence of *local maps*; each local map contains a state vector estimate and a covariance matrix

1: Set local map 1 as the *global map*

2: For *k* = 2:*p* (*p* is the total number of local maps), fuse *local map k* into the *global map*

3: End
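The sequential structure of Algorithm 1 can be sketched in a few lines of Python; the `fuse` callback and the stand-in data below are hypothetical placeholders for the actual SLSJF fusion step:

```python
# Minimal sketch of Algorithm 1 (function names are hypothetical):
# fuse local maps sequentially into a running global map.
def join_local_maps(local_maps, fuse):
    """local_maps: list of (X_hat_L, P_L) pairs; fuse(global_map, local_map)
    fuses one local map into the global map and returns the updated map."""
    global_map = local_maps[0]            # step 1: local map 1 is the global map
    for k in range(1, len(local_maps)):   # step 2: fuse local maps 2..p in order
        global_map = fuse(global_map, local_maps[k])
    return global_map

# Usage with a trivial stand-in "fuse" that just concatenates tuples:
maps = [("X%d" % i, "P%d" % i) for i in range(1, 6)]
result = join_local_maps(maps, lambda g, m: g + m)
print(len(result))  # 2 elements from map 1 plus 2 per fused map -> 10
```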

## III. SLSJF Algorithm

This section describes the various steps of the SLSJF algorithm, including global map initialization and update, reordering of the global state vector, state vector and covariance submatrix recovery, and data association.

### A. State Vector of the Global Map

The state vector of the global map contains the global positions of all features and all the robot end poses of local maps. For convenience, the origin of the global map is chosen to be the same as the origin of local map 1 (see Fig. 1).

After local maps 1 to *k* are fused into the global map, the global state vector is denoted *X*^{G}(*k*) (the superscript “*G*” stands for the global map) and is given by
$$\eqalignno{& X^G(k)\cr&= (X^G_1,\ldots, X^G_{n_1}, X^G_{1e}, X^G_{n_1+1},\ldots, X^G_{n_1+n_2}, X^G_{2e}, \cr&\quad \ldots \cr&\quad X^G_{n_1+\cdots +n_{k-1}+1},\ldots, X^G_{n_1+\cdots +n_{k-1}+n_k}, X^G_{ke})&\hbox{(3)}}$$where *X*^{G}_{1},…,*X*^{G}_{n1} are the global positions of the features in local map 1; *X*^{G}_{n1+1},…,*X*^{G}_{n1+n2} are the global positions of the features in local map 2 but not in local map 1; and *X*^{G}_{n1+⋅⋅⋅ +nk−1+1},…,*X*^{G}_{n1+⋅⋅⋅ +nk−1+nk} are the global positions of the features in local map *k* but not in local maps 1 to *k*−1. *X*^{G}_{ie} = (*x*^{G}_{ie},*y*^{G}_{ie},φ^{G}_{ie}) (1≤ *i*≤ *k*) is the global position of the robot end pose of local map *i*, which is also the robot start pose of local map *i*+1. Here, the subscript “*e*” stands for the robot “end pose.”

In the information filter framework, an information vector *i*(*k*) and an information matrix *I*(*k*) are used for map fusion. The relationship between the state vector estimate *X̂*^{G}(*k*), the corresponding covariance matrix *P*(*k*), and *i*(*k*), *I*(*k*) is [5]
$$I(k)\hat{X}^G(k)=i(k),\qquad P(k)=I(k)^{-1}.\eqno{\hbox{(4)}}$$

As *I*(*k*) is an exactly sparse matrix, it can be stored and updated efficiently. The state vector estimate *X̂*^{G}(*k*) is recovered after fusing each local map by solving the sparse linear equation [the first equation in (4)]. The whole dense matrix *P*(*k*) is neither computed nor stored in SLSJF. The small parts of *P*(*k*) required for data association are computed by solving a set of sparse linear equations, as outlined in Section III-C.3.
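The recovery step can be sketched with a toy sparse information pair; here `spsolve` stands in for whatever sparse solver an implementation uses, and the tridiagonal *I*(*k*) is purely illustrative:

```python
import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import spsolve

# Sketch of the recovery step: given the information pair (i, I), the state
# estimate solves I(k) X_hat = i(k).  A toy SPD tridiagonal I(k) is used.
n = 6
A = np.diag(np.full(n, 4.0)) + np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
I_k = csc_matrix(A)                 # exactly sparse information matrix
X_true = np.arange(1.0, n + 1.0)    # pretend this is the true state
i_k = I_k @ X_true                  # consistent information vector

X_hat = spsolve(I_k, i_k)           # recover the state estimate
print(np.allclose(X_hat, X_true))   # True
```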

When fusing local map *k*+1 into the global map, the features that are present in local map *k*+1 but have not yet been included in the global map, *X*^{G}_{n1+⋅⋅⋅ +nk+1}, …,*X*^{G}_{n1+⋅⋅⋅ +nk+nk+1}, together with the robot end pose of local map *k*+1, *X*^{G}_{(k+1)e}, are added into the global state vector *X*^{G}(*k*) in (3) to form the new state vector *X*^{G}(*k*+1).

### C. Data Association

Data association in SLSJF refers to finding the features in local map *k*+1 that are already included in the global map, together with their corresponding indexes in the global state vector. This is an essential step in any practically deployable SLAM algorithm, yet it is often neglected in many of the sparse information filter-based SLAM algorithms published in the literature.

Data association is a challenging problem in EIF SLAM if only the geometric relationships among features in the global and local maps are available. How this can be achieved efficiently in SLSJF is described in the following.

**Algorithm 3:** Data association between local map *k*+1 and the global map

**REQUIRE:** The *global map* and *local map* *k*+1

1: Determine the set of potentially overlapping local maps

2: Find the set of potentially matched features

3: Recover the covariance submatrix associated with *X*^{G}_{ke} and the potentially matched features

4: Use a statistical data association method to find the match
#### Set of Potentially Overlapping Local Maps

Local map *i* cannot overlap with local map *k*+1 if the distance between the origins of the two maps in the global coordinate frame is larger than the sum of the two local map radii plus the possible estimation error. The radius of a local map is defined as the maximal distance from the local map features to its origin. Fig. 2 illustrates the idea. Note that the location estimate of the origin of local map *i* is *X̂*^{G}_{(i−1)e} (for 2≤ *i* ≤ *k*), while that of local map 1 is (0, 0, 0).

#### Set of Potentially Matched Features

A feature from the potentially overlapping local maps is added to the potentially matched feature list if the distance from it to *X̂*^{G}_{ke} is smaller than the radius of local map *k*+1 plus the maximal possible estimation error. This list is further simplified by removing any member that is located further than a predetermined threshold distance from all features in local map *k*+1.
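The two distance gates above can be sketched as follows; the error margin `err`, the map radii, and all coordinates are hypothetical:

```python
import numpy as np

# Hedged sketch of the two gating tests (all thresholds are hypothetical).
def overlapping_maps(origins, radii, origin_new, radius_new, err=1.0):
    """Indices i with ||origin_i - origin_new|| <= r_i + r_new + err."""
    d = np.linalg.norm(origins - origin_new, axis=1)
    return np.where(d <= radii + radius_new + err)[0]

def candidate_features(features, origin_new, radius_new, err=1.0):
    """Features within radius_new + err of the new map's origin."""
    d = np.linalg.norm(features - origin_new, axis=1)
    return features[d <= radius_new + err]

# Three earlier local maps with radius 5 m; a new map whose origin is (8, 0):
origins = np.array([[0.0, 0.0], [10.0, 0.0], [40.0, 0.0]])
radii = np.array([5.0, 5.0, 5.0])
print(overlapping_maps(origins, radii, np.array([8.0, 0.0]), 5.0))  # [0 1]
```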

#### Covariance Submatrix Associated With *X*^{G}_{ke} and All Potentially Matched Features

The covariance submatrix can be obtained by first computing the corresponding columns of the covariance matrix *P*(*k*), and then extracting the desired rows.

Using (4), the *l*th column of the covariance matrix *P*(*k*), denoted *P*_{l}, can be obtained by solving the sparse linear equation [5]
$$I(k)P_l=e_l\eqno{\hbox{(5)}}$$where
$$e_l= [\overbrace{0,\ldots,0}^{l-1},1,0,\ldots,0]^T.\eqno{\hbox{(6)}}$$

Since the Cholesky factor *L*_{k} of *I*(*k*) is a lower triangular matrix satisfying *L*_{k} *L*_{k}^{T} = *I*(*k*), the sparse linear equation (5) can be solved efficiently by first solving *L*_{k} *q* = *e*_{l} and then solving *L*_{k}^{T}*P*_{l} = *q* [2]. Note that the Cholesky factorization *L*_{k} is already available from step 5 of Algorithm 2 when fusing local map *k* into the global map, as described in Section III-G.
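This two-triangular-solve recovery is easy to verify on a small dense example; the toy matrix below stands in for the sparse *I*(*k*):

```python
import numpy as np
from scipy.linalg import solve_triangular

# Sketch of recovering column l of P(k) = I(k)^{-1} from the Cholesky
# factor L_k (L_k L_k^T = I(k)): solve L_k q = e_l, then L_k^T P_l = q.
rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
I_k = B @ B.T + 5.0 * np.eye(5)          # toy SPD "information matrix"
L_k = np.linalg.cholesky(I_k)            # lower triangular factor

l = 2
e_l = np.zeros(5)
e_l[l] = 1.0
q = solve_triangular(L_k, e_l, lower=True)      # forward substitution
P_l = solve_triangular(L_k.T, q, lower=False)   # back substitution

print(np.allclose(P_l, np.linalg.inv(I_k)[:, l]))  # True
```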

#### Feature Matching

Since both the state estimates and the covariance matrices of the potentially matched features are available, any statistical data association algorithm (such as the simple nearest neighbor method [8] or the more robust joint compatibility test with branch and bound technique [12]) can be used to find the matching features.
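As an illustration, a minimal nearest-neighbor test with a chi-square gate might look as follows; the gate value, the innovation covariance, and the measurements are all hypothetical:

```python
import numpy as np

# Minimal nearest-neighbor gating sketch: accept the candidate with the
# smallest Mahalanobis distance, subject to a chi-square gate.
def nearest_neighbor(z, candidates, S, gate=5.99):   # 5.99 ~ chi2(2 dof, 95%)
    Sinv = np.linalg.inv(S)
    d2 = [float((z - c) @ Sinv @ (z - c)) for c in candidates]
    best = int(np.argmin(d2))
    return best if d2[best] <= gate else None        # None: no match found

S = np.array([[0.1, 0.0], [0.0, 0.1]])               # innovation covariance
cands = [np.array([5.0, 5.0]), np.array([2.0, 2.1])]
print(nearest_neighbor(np.array([2.0, 2.0]), cands, S))  # 1
```

The more robust joint compatibility test [12] would instead gate whole sets of pairings jointly, using the recovered covariance submatrix.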

### E. Update the Global Map

Suppose that local map *k*+1 is given by (1). Since the local map provides a consistent estimate of the relative positions from the robot start pose to the local features and the robot end pose, this map can be treated as an observation of the true relative positions with a zero-mean Gaussian observation noise whose covariance matrix is *P*^{L}.

More precisely, suppose that the data association result is *X*_{1}^{L} ↔ *X*_{i1}^{G}, …, *X*_{n}^{L} ↔ *X*_{in}^{G} (including both old and new features). Then, the local map state estimate can be regarded as an observation of the true relative positions from *X*_{ke}^{G} to *X*_{i1}^{G},…, *X*_{in}^{G}, *X*_{(k+1)e}^{G}, i.e.
$$z_{\rm map}=\hat{X}^L=H_{\rm map}(X^G)+w_{\rm map}\eqno{\hbox{(7)}}$$where *H*_{map}(*X*^{G}) is the vector of relative positions given by
$$\left(\matrix{(x_{(k+1)e}^G-x_{ke}^G)\cos \, \phi _{ke}^G + (y_{(k+1)e}^G-y_{ke}^G)\sin \, \phi _{ke}^G \cr (y_{(k+1)e}^G-y_{ke}^G)\cos \,\phi _{ke}^G - (x_{(k+1)e}^G-x_{ke}^G)\sin \, \phi _{ke}^G\cr \phi _{(k+1)e}^G - \phi _{ke}^G\cr (x_{i1}^G-x_{ke}^G)\cos \, \phi _{ke}^G + (y_{i1}^G-y_{ke}^G)\sin \, \phi _{ke}^G \cr (y_{i1}^G-y_{ke}^G)\cos \,\phi _{ke}^G - (x_{i1}^G-x_{ke}^G)\sin \, \phi _{ke}^G\cr \vdots \cr (x_{in}^G-x_{ke}^G)\cos \,\phi _{ke}^G + (y_{in}^G-y_{ke}^G)\sin \, \phi _{ke}^G \cr (y_{in}^G-y_{ke}^G)\cos \,\phi _{ke}^G - (x_{in}^G-x_{ke}^G)\sin \, \phi _{ke}^G}\right)$$and *w*_{map} is the zero-mean Gaussian “observation noise” whose covariance matrix is *P*^{L}.
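For a single feature, the relative-position entries of *H*_{map} amount to rotating the global difference vector into the frame of the robot end pose. A small sketch (the sample pose and feature are hypothetical):

```python
import numpy as np

# One feature's entries of H_map: the global difference rotated into the
# frame of the robot end pose X_ke = (x, y, phi), matching (7).
def to_local(x_ke, y_ke, phi_ke, x_g, y_g):
    dx, dy = x_g - x_ke, y_g - y_ke
    c, s = np.cos(phi_ke), np.sin(phi_ke)
    return np.array([dx * c + dy * s, dy * c - dx * s])

# A feature 1 m ahead of a robot facing +y (phi = pi/2) is at local (1, 0):
print(np.allclose(to_local(0.0, 0.0, np.pi / 2, 0.0, 1.0), [1.0, 0.0]))
```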

The “observation” *z*_{map} can now be used to update the information vector and the information matrix as follows:
$$\eqalignno{I(k+1)&= I(k)+\nabla H_{\rm map}^T (P^L)^{-1}\nabla H_{\rm map} \cr i(k+1)&= i(k)+\nabla H_{\rm map}^T(P^L)^{-1} \cr&\quad\times[z_{\rm map}-H_{\rm map}(\hat{X}^G(k))+\nabla H_{\rm map}\hat{X}^G(k)]&\hbox{(8)}}$$where ∇ *H*_{map} is the Jacobian of *H*_{map} with respect to *X*^{G}(*k*), evaluated at *X̂*^{G}(*k*).
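For a linear observation model, update (8) reduces to the standard information-filter form and can be checked directly; the sketch below uses a toy 2-D state (for a linear *H* the relinearization term in (8) collapses, so the update is exact):

```python
import numpy as np

# Information-form update for a linear observation z = H x + w, w ~ N(0, R):
#   I <- I + H^T R^{-1} H,   i <- i + H^T R^{-1} z
# (the nonlinear form (8) adds the term -H_map(x_hat) + grad(H_map) x_hat).
def eif_update(I, i, H, R, z):
    Rinv = np.linalg.inv(R)
    return I + H.T @ Rinv @ H, i + H.T @ Rinv @ z

# Prior on a 2-D state with mean (1, 2), then one observation of component 0:
I0 = np.eye(2)
i0 = I0 @ np.array([1.0, 2.0])
H = np.array([[1.0, 0.0]])
R = np.array([[1.0]])
I1, i1 = eif_update(I0, i0, H, R, np.array([3.0]))

x_hat = np.linalg.solve(I1, i1)         # recover the estimate as in (4)
print(np.allclose(x_hat, [2.0, 2.0]))   # average of prior 1 and observation 3
```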

Since *H*_{map} only involves two robot poses *X*^{G}_{ke}, *X*^{G}_{(k+1)e} and some local features (a small fraction of the total features in the global map), the matrix ∇ *H*_{map}^{T} (*P*^{L})^{−1}∇ *H*_{map} in (8) and the information matrix *I*(*k*+1) are both exactly sparse.

### F. Reorder the Global Map State Vector When Necessary

The purpose of reordering the global state vector is to make the computation of the Cholesky factorization (Section III-G), the state vector recovery (Section III-H), and the covariance submatrix recovery (Section III-C.3) more efficient. Many different reordering strategies are available. The strategy proposed here is a combination of approximate minimum degree (AMD) reordering [2] and reordering based on distances [4].

Whether to reorder the global map state vector or not depends on where the features in local map *k*+1 are located within the global state vector. If all of the features in local map *k*+1 are present within the *n*_{0} elements from the bottom of the global state vector,^{3} then the state vector is left unchanged. If this condition is violated, which happens only when closing a large loop, then the state vector is reordered.

The state vector is reordered using the following process. The robot pose *X*^{G}_{(k+1)e} and the features that are within distance *d*_{0}^{4} of *X̂*^{G}_{(k+1)e} are placed at the bottom part of the state vector. Their order is determined by their distances to *X̂*^{G}_{(k+1)e}: the smaller the distance, the closer the position to the bottom. All the other robot poses and features are placed in the upper part of the state vector and are reordered based on AMD.

The major advantage of reordering by AMD is that the number of fill-ins in the Cholesky factorization is reduced. The major advantage of reordering the nearby features based on distances is that, once the reordering is performed, another reordering will not be required for the fusion of the next few local maps. This is because the robot cannot observe features that are not located in the bottom part of the state vector until it travels a certain distance.
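The distance-based part of this strategy can be sketched as follows; the feature positions and *d*_{0} are hypothetical, and the AMD reordering of the far block is omitted:

```python
import numpy as np

# Sketch of the distance-based reordering: features within d0 of the robot
# pose are moved toward the bottom of the state vector, nearest last.
def reorder_by_distance(features, robot_xy, d0):
    d = np.linalg.norm(features - robot_xy, axis=1)
    near = np.where(d <= d0)[0]
    far = np.where(d > d0)[0]                  # would be AMD-reordered in SLSJF
    near_sorted = near[np.argsort(-d[near])]   # smaller distance -> closer to bottom
    return np.concatenate([far, near_sorted])

feats = np.array([[0.0, 1.0], [9.0, 9.0], [0.0, 3.0], [20.0, 0.0]])
order = reorder_by_distance(feats, np.array([0.0, 0.0]), d0=5.0)
print(order.tolist())  # far features first, then near ones, nearest last
```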

Once the state vector is reordered, the corresponding information matrix *I*(*k*+1) and information vector *i*(*k*+1) are reordered accordingly. For notational simplicity, they are still denoted as *I*(*k*+1) and *i*(*k*+1). Note that the Cholesky factorization of the reordered *I*(*k*+1) cannot be easily obtained from *L*_{k}.

### G. Compute the Cholesky Factorization of *I*(*k*+1)

The method used to compute the Cholesky factorization of *I*(*k*+1) depends on whether the global state vector was reordered in Section III-F or not.

*Case (i).* If the global state vector was not reordered in Section III-F, then the Cholesky factorization of *I*(*k*) (available from step 5 of Algorithm 2 when fusing local map *k*) is used to construct the Cholesky factorization of *I*(*k*+1) as follows.

By (8), the relation between *I*(*k*+1) and *I*(*k*) is
$$I(k+1) = I(k)+ \left[\matrix{ 0 &0 \cr 0 &\Omega}\right]\eqno{\hbox{(9)}}$$where Ω is a symmetric matrix determined by the term ∇ *H*_{map}^{T} (*P*^{L})^{−1}∇ *H*_{map} in (8), taken as small as possible (its upper-left element is nonzero). Its dimension is less than *n*_{0}, since otherwise the state vector would have been reordered.

Let *I*(*k*) and its Cholesky factorization *L*_{k} (a lower triangular matrix) be partitioned according to (9) as
$$I(k) = \left[\matrix{I_{11} &I_{21}^T \cr I_{21} &I_{22}}\right]\qquad L_k=\left[\matrix{L_{11} &0 \cr L_{21} &L_{22}}\right].\eqno{\hbox{(10)}}$$

According to (9) and (10), *I*(*k*+1) can be expressed by
$$I(k+1)=\left[\matrix{I_{11}&I_{21}^T\cr I_{21}&I_{22}^{k+1}}\right]=\left[\matrix{I_{11}&I_{21}^T\cr I_{21}&I_{22}+\Omega}\right].\eqno{\hbox{(11)}}$$

By Lemma 1 in the Appendix of [4], the Cholesky factorization of *I*(*k*+1) can be obtained by
$$L_{k+1} = \left[\matrix{ L_{11} &0 \cr L_{21} &L_{22}^{k+1}}\right]\eqno{\hbox{(12)}}$$where *L*_{22}^{k+1} is the Cholesky factorization of the low-dimensional matrix Ω +*L*_{22} *L*_{22}^{T} = *I*_{22}^{k+1}−*L*_{21} *L*_{21}^{T}.

Computing *L*_{k+1} by (12) is much more efficient than directly computing the Cholesky factorization of the high-dimensional matrix *I*(*k*+1).
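The block update (12) can be verified numerically on a small dense example: only the lower-right corner of the factor is recomputed, yet the result is an exact Cholesky factor of *I*(*k*+1). The sizes and random matrices below are purely illustrative:

```python
import numpy as np

# Sketch of the incremental factorization (12): after the sparse update (9),
# only the m-by-m lower-right block of L_k must be refactorized.
rng = np.random.default_rng(1)
n, m = 8, 3                                   # m = dimension of Omega (< n0)
B = rng.standard_normal((n, n))
I_k = B @ B.T + n * np.eye(n)                 # toy SPD information matrix
L_k = np.linalg.cholesky(I_k)                 # L_k L_k^T = I(k)

C = rng.standard_normal((m, m))
Omega = C @ C.T                               # symmetric PSD update block
I_k1 = I_k.copy()
I_k1[n - m:, n - m:] += Omega                 # I(k+1) as in (9)

# Incremental update: reuse L11 and L21, refactor only the corner block,
# using Omega + L22 L22^T = I22^{k+1} - L21 L21^T.
L22 = L_k[n - m:, n - m:]
L22_new = np.linalg.cholesky(Omega + L22 @ L22.T)
L_k1 = L_k.copy()
L_k1[n - m:, n - m:] = L22_new

print(np.allclose(L_k1 @ L_k1.T, I_k1))       # True: exact factor of I(k+1)
```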

*Case (ii).* If the global state vector has been reordered in Section III-F, then the Cholesky factorization of *I*(*k*) cannot be used to construct the Cholesky factorization of *I*(*k*+1). In this case, a direct Cholesky factorization of *I*(*k*+1) is performed to obtain *L*_{k+1}.

Since the reordering only happens occasionally, case (i) occurs most of the time.

## IV. Simulation and Experiment Results

In this section, simulation and experiment results are given to illustrate the accuracy and efficiency of SLSJF.

### A. Simulation Results

The 150 × 150 m^{2} simulation environment contains 2500 features arranged in uniformly spaced rows and columns. The robot started from the bottom-left corner of the square and followed a random trajectory, as shown in Fig. 3(a). A sensor with a 180^{∘} field of view and a 6 m range [the small semicircle near the bottom in Fig. 3(a)] was simulated to generate relative range and bearing measurements between the robot and the features. In total, there were 27 924 robot poses, from which 170 846 measurements were made. The robot observed 2270 features, most of them a number of times.

Six hundred small local maps were built by conventional EKF SLAM using the odometry and measurement information. Each local map contains around ten features. Fig. 3(a) shows the global map generated by fusing all 600 local maps using SLSJF. Data association in SLSJF was performed using the nearest neighbor method [8]. The global map was superimposed on the global map generated by fusing the 600 local maps using EKF sequential map joining [6], [7] and on the map generated by a single EKF SLAM. Close examination [e.g., Fig. 3(b)] shows that the feature position estimates computed by the three methods are all consistent. The feature position estimates of SLSJF and EKF sequential map joining are almost identical.

Fig. 3(c) shows the errors and 2σ bounds of the estimates of the 600 robot end poses obtained using the three methods. It is clear that the estimates are all consistent. It should be noted that in SLSJF, the robot end poses are included in the global state vector and are continuously updated. Therefore, the error and 2σ bounds of SLSJF are smaller than that of EKF sequential map joining and EKF SLAM, where the robot poses, except the most recent one, are not included in the state vector (hence are not updated).

Fig. 3(d) shows all the nonzero elements of the sparse information matrix obtained by SLSJF in black. Fig. 3(e) shows the CPU time^{5} required for the local map fusion using SLSJF and EKF sequential map joining. The total time for fusing all 600 local maps is 145 s for SLSJF and 7306 s for EKF sequential map joining (building the 600 local maps takes 95 s; conventional EKF SLAM takes more than 15 h to finish the map). Table I presents the detailed processing time for the two map joining algorithms. In SLSJF, the major computational cost is due to “data association,” which includes the time for covariance submatrix recovery. The “others” category, which includes reordering of the state vector, Cholesky factorization, and state vector recovery, also takes significant time. In contrast, “global map update” accounts for most of the computation time in EKF sequential map joining.

Fig. 3(f) compares the CPU time of SLSJF with the proposed reordering strategy against that of SLSJF with AMD-only reordering [2] (for the proposed reordering, the parameters are *n*_{0} = 400 and *d*_{0} = 15; for AMD-only reordering, the reordering is performed after fusing every five local maps; the parameters were chosen so that both algorithms perform at their best). The performance of the two reordering strategies is very similar, presumably because the MATLAB implementation of the AMD algorithm is very efficient.

### B. Experimental Results

SLSJF was also applied to the Victoria Park dataset that was first used in [14]. Neither ground truth nor noise parameters are available for this dataset. Published results for the vehicle trajectory and uncertainty estimates vary [3], [4], [13], [14], presumably due to different parameters used by various researchers. The results in this section, therefore, only demonstrate that SLSJF can be applied to this popular dataset.

Fig. 4(a) shows the map obtained by conventional EKF SLAM. The odometry and range-bearing observation data were used to build 200 local maps by EKF SLAM. Fig. 4(b) shows the global map obtained by joining the 200 local maps using SLSJF. Data association in SLSJF was performed using the nearest neighbor method [8]. Fig. 4(c) shows all the nonzero elements of the information matrix in black. The information matrix is not very sparse because the sensor range is relatively large (around 80 m) compared with the size of the environment (300 m × 300 m). Fig. 4(d) shows the CPU time required to fuse each of the 200 local maps. The total processing time for joining all 200 maps by SLSJF is 22 s (the time required for building the 200 local maps is 63 s).

## V. Related Work and Discussion

In this section, some of the properties of SLSJF and some related work are discussed.

### A. Different Ways to Achieve Sparse Representation

The sparse representations of SLAM recently proposed in the literature (e.g., [1], [2], [3], [4], [15]) make use of different state vectors and/or have different strategies for marginalizing out robot poses. In SAM [2], incremental SAM (iSAM) [13], tectonic SAM [11], and full SLAM [5], all the robot poses are included in the state vector and no marginalization is needed. However, the dimension of the state vector is very high, especially when the robot trajectory is long.

When all the previous robot poses are marginalized out as in conventional EIF SLAM, the information matrix becomes dense although it is approximately sparse [16]. The sparse extended information filter (SEIF) presented in [1] approximates the information matrix by a sparse one using sparsification, and this leads to inconsistent estimates [3].

The exactly sparse extended information filter (ESEIF) developed by [3] follows the conventional EIF SLAM algorithm, but marginalizes out the robot pose and relocates the robot from time to time. In this way, the information matrix is kept exactly sparse by sacrificing the robot location information once in a while.

In the decoupled SLAM (D-SLAM) algorithm [4], the robot pose is not incorporated in the state vector for mapping. The observations made from one robot pose are first transformed into relative position information among the observed features (the robot pose is marginalized out from the observations), and then the relative position information is used to update the map. This process also results in some information loss.

The D-SLAM map joining algorithm [15] first builds local maps and then marginalizes out the robot start and end poses from each local map; the resulting relative position information among features is fused into the global map in a way similar to the D-SLAM algorithm. The odometry information is maintained in the local maps, but there is still some information loss due to the marginalization of the robot start/end poses.

In SLSJF, the robot start and end poses of the local maps are never marginalized but kept in the global state vector. Thus, all the information from local maps is preserved.

If each local map is treated as one integrated observation, then SLSJF has some similarity to iSAM [13]. The role of local maps in SLSJF is also similar to that of the “star nodes” in graphical SLAM [17]. However, in graphical SLAM, the poses are first added to the graph and then “star nodes” are formed, whereas in SLSJF, most of the robot poses are marginalized out during the local map building steps and are never present in the global state vector.

### B. Computational Complexity

The map joining problem considered in this paper is similar to that studied in [6] and [7]. The computational complexity of the local map building is *O*(1) since the size of a local map is small. The computational complexity of the global map update is *O*(*n*^{2}) for the sequential map joining approach in [6] and the constrained local submap filter in [7].

In SLSJF, the robot start/end poses of the local maps are included in the global state vector, and the EIF implementation results in an exactly sparse information matrix. This makes SLSJF much more efficient than the EKF sequential map joining [6], [7].

Although simulation results show that SLSJF is computationally very efficient for large-scale SLAM, the computational complexity of several steps in SLSJF may not be *O*(*n*) in worst-case scenarios. For example, the number of fill-ins introduced in the Cholesky factorization depends on the environment and the robot trajectory. This influences the computational cost of the full Cholesky factorization step and of solving the sparse linear equations. Also, the computational cost of the proposed reordering is not yet well understood. In theory, SLSJF suffers the general *O*(*n*^{1.5}) worst-case cost for planar grids, as all sparse-factorization-based methods do [18]. This is similar to the treemap algorithm [19] and to SAM using the nested dissection algorithm [20].

Very recently, it was shown in [21] that the total computational cost of local map building and map joining can be reduced to *O*(*n*^{2}) by an EKF-based “divide and conquer SLAM” (D&C SLAM). Although D&C SLAM was shown to be much more efficient than the conventional EKF SLAM, it was not compared with the more efficient EKF sequential map joining [21].

The SLSJF has some similarity to the tectonic SAM algorithm [11]. Tectonic SAM is also an efficient submap-based approach, and the state vector reordering and Cholesky factorization are used in solving the least-square problem. The submap fusion in tectonic SAM uses a D&C approach, which is more efficient than the sequential map joining in SLSJF when data association is assumed. The major difference between the tectonic SAM and SLSJF is that in tectonic SAM, all the robot poses involved in building the local maps are kept, and the dimension of the global state vector is much higher than that of SLSJF.

### C. Requirements on SLSJF

In SLSJF, it is assumed that the local maps are consistent and accurate enough. If the local maps are inconsistent, SLSJF may produce wrong results due to the wrong information provided by the local maps. When the local maps are inaccurate, SLSJF may become inconsistent due to linearization errors.

Another assumption made in SLSJF is that a local map only involves “nearby objects.” This guarantees that the information matrix is exactly sparse, no matter how many local maps are fused. When this assumption does not hold, such as with vision sensors, SLSJF can still be applied, since a significant number of feature pairs will not be present concurrently in the same local map. However, the processes of selecting potentially matched features and reordering the state vector may need modification to keep the algorithm efficient.

Similar to [6] and [7], there is no requirement on the structure of the environment for SLSJF to be applicable. This is different from the efficient treemap SLAM algorithm [19], where the environment has to be “topologically suitable.” Another difference between SLSJF and the treemap SLAM algorithm is that covariance submatrix recovery and data association have been ignored in the treemap SLAM implementations available to date [19], [22], [23].

### D. Exact Covariance Submatrix Recovery

The covariance submatrix recovery in SLSJF is exact. This is different from the approximate covariance submatrix recovery methods (e.g., [1], [10]), where only an approximate or upper bound of covariance submatrix is computed. As pointed out in [10], the upper bound can only be used in nearest neighbor data association [8] but cannot be used in the more robust joint compatibility test [12].

An algorithm for exact recovery of the covariance submatrix was proposed in iSAM [13]. It has “*O*(*n*) time complexity for band-diagonal matrices and matrices with only a constant number of entries far from the diagonal, but can be more expensive for general sparse matrices” [13]. The covariance submatrix recovery in SLSJF is similar. The major advantage of SLSJF over iSAM is that the dimension of the state vector in SLSJF is much lower than that of iSAM. Thus, SLSJF may be more suitable for situations where the robot trajectory is very long and/or the observation frequency is high, which is true for many common sensors such as laser range finders.

### E. Incremental Cholesky Factorization for Recovery

The idea of incrementally computing the Cholesky factorization is motivated by [4]. The main difference between the recovery method in SLSJF and that in [4] is that SLSJF uses complete Cholesky factorization and a direct method for solving the linear equations, while [4] used approximate Cholesky factorization and a preconditioned conjugate gradient method.

The incremental Cholesky factorization also has some similarity with the QR factorization update in [13]. The QR factorization update in [13] is based on “Givens rotations,” while the incremental Cholesky factorization process in SLSJF is based on the “block-partitioned form of Cholesky factorization.” The performance of these two approaches is expected to be similar.

### F. Reordering of the Global State Vector

In SLSJF, the reordering of state vector aims to combine the advantages of AMD reordering (where the number of fill-ins is reduced [2], [13]), and the reordering by distance (where the efficient incremental Cholesky factorization procedure can be applied in most cases [4]).

The idea behind “reordering by distance” is to ensure that the robot observes only features in the bottom part of the state vector for as long as possible, no matter in which direction the robot moves. However, this is not the best reordering for indoor environments, where features in different rooms may be very close yet cannot be seen simultaneously. For indoor environments, knowledge of the structure of the environment (and of the possible robot trajectory) can be exploited to place “the features that are likely to be re-observed” near the bottom of the state vector.

### G. Consistency

The SLSJF algorithm does not contain any approximations (such as sparsification [1]) that can lead to estimator inconsistency. However, as is the case with all EKF/EIF-based estimation algorithms, inconsistencies may occur in SLSJF due to the errors introduced by the linearization process.

It has been suggested that local-map-based strategies can improve the consistency of SLAM by keeping the robot orientation error small [24], [25]. We have conducted many simulations and found that this is true for some scenarios, especially when the process noise, the feature density, and the sensor range are all small, or when sequential updating is used in the EKF while multiple features can be observed from one robot pose. In many practical scenarios, for example, the simulation results presented in Section IV-A, we found that both the EKF SLAM (with batch update) and the map joining results are consistent, mainly due to the small observation and odometry noises and the high feature density. When the noise values were gradually increased, both strategies became inconsistent, almost always at the same noise level. This is likely due to the fact that, in any submap joining algorithm, inconsistency in even one of the submaps leads to an inconsistent global map.

In SLSJF, all the robot start/end poses are in the global state vector, and there is no prediction step within the EIF. Thus, the SLSJF can be treated as a linearized least-squares solution with only one iteration in each map fusion step. In fact, at any map fusion step, the linearization error can be reduced further by recomputing the information matrix *I* and the information vector *i* as a sum of all the contributions in (8), using the new estimate as the linearization point for the Jacobians. This process can improve the consistency significantly, but at an additional computational cost.
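The relinearization pass can be sketched as one Gauss-Newton-style iteration: rebuild the information matrix and vector as a sum of per-measurement contributions, with each Jacobian evaluated at the current estimate, then solve for an improved estimate. The measurement models below are generic placeholders, not the paper's equation (8).

```python
import numpy as np

# Sketch of one relinearization pass: each measurement contributes
# H^T R^{-1} H to the information matrix and H^T R^{-1} (z - h(x) + H x)
# to the information vector, with H evaluated at the current estimate x.
def relinearize_and_solve(x, measurements):
    n = x.size
    info_mat = np.zeros((n, n))
    info_vec = np.zeros(n)
    for h, jacobian, z, R_inv in measurements:
        H = jacobian(x)            # Jacobian at the new linearization point
        r = z - h(x)               # residual at the current estimate
        info_mat += H.T @ R_inv @ H
        info_vec += H.T @ R_inv @ (r + H @ x)
    return np.linalg.solve(info_mat, info_vec)

# Trivial linear example: a direct observation of the state recovers z.
x0 = np.zeros(2)
meas = [(lambda x: x, lambda x: np.eye(2), np.array([1.0, 2.0]), np.eye(2))]
x1 = relinearize_and_solve(x0, meas)
```

For linear measurements one pass suffices; with nonlinear models the pass can be repeated until the estimate stabilizes, trading computation for consistency as noted above.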

### H. Treating the Local Map as a Virtual Observation

Many submap-based SLAM algorithms (either explicitly or implicitly) treat the local map as a virtual observation, but most of them treat a local map as “an observation made from the robot start pose to all the features in the local map.” In SLSJF, the local map is treated as “an observation made from the robot start pose to all the features in the local map and a virtual robot located at the robot end pose.” This motivates the inclusion of all the robot start/end poses in the global state vector to achieve an exactly sparse information matrix.

### I. Comparison With Two-Level Mapping Algorithms

The output of SLSJF is a single global stochastic map. This approach is different from two-level mapping algorithms (e.g., hierarchical SLAM [26], Atlas [27], network-coupled feature maps [28]), where a set of local maps is maintained and the relationship among these maps is described at a higher level. Though promising due to their reduced computational cost, the two-level mapping approaches require more work to completely resolve the question of how to treat the overlapping regions among local maps. As pointed out in [26], all the two-level mapping systems result in suboptimal solutions because the effect of the upper-level update cannot be propagated back to the local level.

## VI. Conclusion

By adding the robot start/end poses of the local maps into the global state vector, an exactly sparse extended information filter for local submap joining, the SLSJF, has been developed. There is no approximation involved in SLSJF apart from the linearization processes. SLSJF contains not only the filter steps but also two important steps that are essential for the real-world application of EIF-based algorithms: a covariance submatrix recovery step and a data association step. The sparse information matrix, together with the novel state vector and the covariance submatrix recovery procedure, makes the SLSJF algorithm computationally very efficient.

SLSJF achieves an exactly sparse information matrix with no information loss. The dimension of its state vector is significantly less than that of the full SLAM algorithm [5], where all the robot poses are included in the state vector. Since it does not matter how the local maps are built, SLSJF can also be applied to large-scale range-only or bearing-only SLAM problems: first use range-only or bearing-only SLAM algorithms to build the local maps, and then fuse the local maps together using SLSJF.

For the successful application of SLSJF to local map joining, it is important that all the local maps are consistent. Thus, it is essential to use reliable SLAM algorithms to build the local maps.

More work is required to determine the best reordering strategy for SLSJF, improve the robustness of SLSJF to linearization errors, and extend SLSJF to 3-D local map joining. Research along these directions is underway.