# Efficient View-Based SLAM Using Visual Loop Closures

## Abstract

This paper presents a simultaneous localization and mapping algorithm suitable for large-scale visual navigation. The estimation process is based on the viewpoint augmented navigation (VAN) framework using an extended information filter. Cholesky factorization modifications are used to maintain a factor of the VAN information matrix, enabling efficient recovery of state estimates and covariances. The algorithm is demonstrated using data acquired by an autonomous underwater vehicle performing a visual survey of sponge beds. Loop-closure observations produced by a stereo vision system are used to correct the estimated vehicle trajectory produced by dead reckoning sensors.

## I. Introduction

Simultaneous localization and mapping (SLAM) has been widely used to estimate the position of a robot in an initially unknown environment. In the original formulation [1], [2], [3], the state of the robot and the positions of a set of features extracted from observations of the environment are jointly estimated using an extended Kalman filter (EKF). The complexity of updating the filter after acquiring an observation is quadratic in the number of estimated features, which has prompted a large research effort to produce more scalable SLAM methods. Examples include partitioned updates [4] and submapping techniques [5], [6], [7].

Recently, there has been increasing interest in SLAM algorithms using an extended information filter (EIF), in which an observation update can be performed in constant time. Sparsification approximations can be used to ignore many near-zero elements of the information matrix in feature-based SLAM approaches [8], [9], while the information matrix is exactly sparse when past vehicle poses are maintained by the filter, such as in the viewpoint augmented navigation (VAN) framework [9], [10], [11]. Exploiting the sparsity of the information matrix can reduce both the computational complexity and memory requirements of the filter.

A related approach is the smoothing and mapping (SAM) framework [12], [13], which estimates the states of a set of features and a history of robot poses. Unlike EKF or EIF approaches in which linearization errors are permanently incorporated into the filter, the SAM algorithm can perform an iterative least-squares optimization process to converge to an optimal state estimate. The information matrix in the normal equations solved during each iteration possesses a similar sparsity structure to that of the VAN framework.

The main difficulty with information-form SLAM algorithms is the recovery of state estimates and covariances. State estimates are required in the EIF prediction, observation, and update operations, while state covariances are required for data association or loop-closure hypothesis generation. Efficient state estimate and covariance recovery is the main focus of this paper.

In a previous VAN implementation [9], [10], [11], state estimates and covariances were recovered using a Cholesky factor of the information matrix that was recalculated each time an image was acquired. Using Cholesky factorization modifications to keep a factor up to date in a SAM application was previously proposed, but not implemented due to the complexity of the algorithms when applied to sparse matrices [12], [13]. In this paper, the use of Cholesky factorization modifications in the VAN framework is investigated, utilizing a recently developed implementation [14].

In parallel to the paper presented here, an incremental SAM approach has been developed [15], [16], in which a QR factorization of the SAM measurement Jacobian is updated using Givens rotations. The two approaches are closely related, since the upper triangular matrix R in a QR factorization of the SAM measurement Jacobian is a Cholesky factor of the information matrix [13].

This paper is organized as follows. Section II provides a justification for using the VAN framework for visual navigation applications. Section III summarizes the information-form VAN filtering process. Section IV describes the Cholesky factorization process, and the modifications used to maintain a factor of the VAN information matrix. Section V describes state estimate recovery methods. Section VI describes state covariance recovery methods. Section VII outlines the process to generate loop-closure hypotheses. Section VIII presents the results of the efficient VAN algorithm applied to data acquired by an autonomous underwater vehicle (AUV). Finally, Section IX provides concluding remarks.

## II. SLAM Frameworks and Visual Navigation

Two main SLAM frameworks have been proposed: feature-based and view-based algorithms. In feature-based SLAM [1], [2], [3], [4], [5], [6], [7], [8], the positions of features are estimated, and a loop closure is performed by observing a previously initialized feature. In view-based SLAM [9], [10], [11], [17], a set of vehicle poses at locations where sensor data was acquired is estimated. A loop closure is performed by registering two sets of sensor data to produce an observation of the relative pose between the vehicle locations where the data was acquired.

A disadvantage of the view-based method is the need to find pairs of previously unused sensor data to construct independent loop-closure observations. Two relative pose measurements created using common feature observations will be correlated, and ignoring these correlations will cause the filter to become inconsistent. Applying multiple relative pose observations to the filter simultaneously while considering the correlations is possible; however, it is impractical since loop-closure events involving a single pose may occur at multiple different times. In comparison, the feature-based approach has no such problem, since the filter maintains all correlations and observations can be applied individually.

The feature-based approach has the disadvantages of requiring the filter to estimate the feature states, and the need to select which features will be used at the time they are first observed. In comparison, the view-based approach has the advantage that the selection of a subset of features used in a loop-closure observation can be delayed until the feature association process is performed.

As a result of these properties, feature-based approaches are more suitable for applications in which a small set of features can reliably be extracted and associated, while pose-based methods are more appropriate for applications in which large numbers of features can be extracted, particularly when it is uncertain which features can be associated in the future.

When evaluating the suitability of each framework for large-scale visual navigation, the properties of visual feature extraction and association algorithms need to be considered. A range of wide-baseline approaches suitable for loop-closure situations have been developed [18], [19], [20], [21]. Association of such features can typically be performed with high precision but low recall (incorrect feature associations are uncommon; however, the number of associations produced is small) [22], [23].

When used within a feature-based SLAM algorithm, the properties of wide-baseline visual feature extraction and association algorithms result in a difficult feature selection problem. Thousands of features can be extracted from an image; however, few will be matched in a loop closure situation. Estimating the positions of all features becomes infeasible; however, if only a few are selected, a loop-closure observation becomes unlikely. The ability to use all the sensor data, rather than a sparse set of previously selected features at a loop-closure event is a critical advantage for view-based SLAM algorithms in vision applications.

An additional benefit of the view-based approach for visual navigation applications is its ability to handle delayed observations. Visual feature extraction and association are time-consuming processes, so a delay is likely to occur between the time an image is acquired and a loop-closure observation is produced. In the view-based framework, a relative pose constraint can be applied between two previously augmented poses whenever the image analysis operations are complete.

Due to avoidance of the feature selection problem, the inherent ability to handle delayed observations, and the efficiency when using the information form, the view-based VAN framework will be utilized in this paper.

## III. VAN

### A. Estimated State Vector

In the VAN framework, the current vehicle state is estimated along with a selection of past vehicle poses, leading to a state estimate vector of the form

$$\hat{\mathbf{x}}^{+}(t_{k}) = \begin{bmatrix} \hat{\mathbf{x}}_{p_{1}}^{+}(t_{k}) \\ \vdots \\ \hat{\mathbf{x}}_{p_{n}}^{+}(t_{k}) \\ \hat{\mathbf{x}}_{v}^{+}(t_{k}) \end{bmatrix} = \begin{bmatrix} \hat{\mathbf{x}}_{t}^{+}(t_{k}) \\ \hat{\mathbf{x}}_{v}^{+}(t_{k}) \end{bmatrix} \tag{1}$$

where $\hat{\mathbf{x}}_{v}^{+}(t_{k})$ contains the current vehicle states, and $\hat{\mathbf{x}}_{t}^{+}(t_{k})$ is a vector of trajectory states consisting of $n$ past vehicle pose vectors.

The covariance matrix has the form

$$\mathbf{P}^{+}(t_{k}) = \begin{bmatrix} \mathbf{P}_{tt}^{+}(t_{k}) & \mathbf{P}_{tv}^{+}(t_{k}) \\ \mathbf{P}_{tv}^{+\mathsf{T}}(t_{k}) & \mathbf{P}_{vv}^{+}(t_{k}) \end{bmatrix}. \tag{2}$$

In the information form, the filter maintains the information matrix $\mathbf{Y}^{+}(t_{k})$, which is the inverse of the covariance matrix

$$\mathbf{Y}^{+}(t_{k}) = \left[\mathbf{P}^{+}(t_{k})\right]^{-1} \tag{3}$$

and the information vector $\hat{\mathbf{y}}^{+}(t_{k})$, which is related to the state estimate by

$$\hat{\mathbf{y}}^{+}(t_{k}) = \mathbf{Y}^{+}(t_{k})\, \hat{\mathbf{x}}^{+}(t_{k}). \tag{4}$$

The VAN information vector has the form

$$\hat{\mathbf{y}}^{+}(t_{k}) = \begin{bmatrix} \hat{\mathbf{y}}_{t}^{+}(t_{k}) \\ \hat{\mathbf{y}}_{v}^{+}(t_{k}) \end{bmatrix} \tag{5}$$

and the information matrix is

$$\mathbf{Y}^{+}(t_{k}) = \begin{bmatrix} \mathbf{Y}_{tt}^{+}(t_{k}) & \mathbf{Y}_{tv}^{+}(t_{k}) \\ \mathbf{Y}_{tv}^{+\mathsf{T}}(t_{k}) & \mathbf{Y}_{vv}^{+}(t_{k}) \end{bmatrix}. \tag{6}$$
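
The relationships in (3) and (4) can be sketched in numpy; the 2 × 2 block values below are illustrative assumptions, not data from the paper.

```python
import numpy as np

# Covariance-form quantities (illustrative values only)
P = np.array([[2.0, 0.5],
              [0.5, 1.0]])          # covariance matrix P^+(t_k)
x_hat = np.array([1.0, -2.0])       # state estimate x^+(t_k)

Y = np.linalg.inv(P)                # information matrix, eq. (3)
y_hat = Y @ x_hat                   # information vector, eq. (4)

# Recovering the estimate from the information form inverts eq. (4);
# in practice this solve is done with a Cholesky factor (Section IV).
x_rec = np.linalg.solve(Y, y_hat)
```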

### B. Estimation Process

The VAN estimation process uses the standard EIF three-step prediction, observation, and update cycle. The vehicle states are assumed to evolve according to a process model of the form

$$\mathbf{x}_{v}(t_{k}) = \mathbf{f}_{v}\big[\mathbf{x}_{v}(t_{k-1}),\, \mathbf{u}(t_{k})\big] + \mathbf{w}(t_{k}) \tag{7}$$

in which $\mathbf{u}(t_{k})$ is a vector of control inputs, and $\mathbf{w}(t_{k})$ is an error vector drawn from a zero-mean Gaussian distribution with covariance $\mathbf{Q}(t_{k})$.

When propagating the vehicle states to a new timestep with a prediction operation, a decision is required on whether or not the current vehicle pose should be kept in the state vector. The current vehicle pose should be kept if it marks a location where data that may be used in future loop-closure observations were acquired.

$$\hat{\mathbf{y}}^{-}(t_{k}) = \begin{bmatrix} \hat{\mathbf{y}}_{t}^{+}(t_{k-1}) \\ \hat{\mathbf{y}}_{v}^{+}(t_{k-1}) - \nabla_{x}^{\mathsf{T}}\mathbf{f}_{v}(t_{k})\, \mathbf{Q}^{-1}(t_{k}) \Big( \mathbf{f}_{v}\big[\hat{\mathbf{x}}_{v}^{+}(t_{k-1}),\, \mathbf{u}(t_{k})\big] - \nabla_{x}\mathbf{f}_{v}(t_{k})\, \hat{\mathbf{x}}_{v}^{+}(t_{k-1}) \Big) \\ \mathbf{Q}^{-1}(t_{k}) \Big( \mathbf{f}_{v}\big[\hat{\mathbf{x}}_{v}^{+}(t_{k-1}),\, \mathbf{u}(t_{k})\big] - \nabla_{x}\mathbf{f}_{v}(t_{k})\, \hat{\mathbf{x}}_{v}^{+}(t_{k-1}) \Big) \end{bmatrix} \tag{8}$$

$$\mathbf{Y}^{-}(t_{k}) = \begin{bmatrix} \mathbf{Y}_{tt}^{+}(t_{k-1}) & \mathbf{Y}_{tv}^{+}(t_{k-1}) & \mathbf{0} \\ \mathbf{Y}_{tv}^{+\mathsf{T}}(t_{k-1}) & \mathbf{Y}_{vv}^{+}(t_{k-1}) + \nabla_{x}^{\mathsf{T}}\mathbf{f}_{v}(t_{k})\, \mathbf{Q}^{-1}(t_{k})\, \nabla_{x}\mathbf{f}_{v}(t_{k}) & -\nabla_{x}^{\mathsf{T}}\mathbf{f}_{v}(t_{k})\, \mathbf{Q}^{-1}(t_{k}) \\ \mathbf{0} & -\mathbf{Q}^{-1}(t_{k})\, \nabla_{x}\mathbf{f}_{v}(t_{k}) & \mathbf{Q}^{-1}(t_{k}) \end{bmatrix} \tag{9}$$

$$\hat{\mathbf{y}}^{-}(t_{k}) = \begin{bmatrix} \hat{\mathbf{y}}_{t}^{+}(t_{k-1}) - \mathbf{Y}_{tv}^{+}(t_{k-1})\, \boldsymbol{\Omega}^{-1}(t_{k}) \Big( \hat{\mathbf{y}}_{v}^{+}(t_{k-1}) - \nabla_{x}^{\mathsf{T}}\mathbf{f}_{v}(t_{k})\, \mathbf{Q}^{-1}(t_{k})\, \boldsymbol{\delta}(t_{k}) \Big) \\ \mathbf{Q}^{-1}(t_{k})\, \nabla_{x}\mathbf{f}_{v}(t_{k})\, \boldsymbol{\Omega}^{-1}(t_{k})\, \hat{\mathbf{y}}_{v}^{+}(t_{k-1}) + \boldsymbol{\Psi}(t_{k})\, \boldsymbol{\delta}(t_{k}) \end{bmatrix} \tag{10}$$

$$\mathbf{Y}^{-}(t_{k}) = \begin{bmatrix} \mathbf{Y}_{tt}^{+}(t_{k-1}) - \mathbf{Y}_{tv}^{+}(t_{k-1})\, \boldsymbol{\Omega}^{-1}(t_{k})\, \mathbf{Y}_{tv}^{+\mathsf{T}}(t_{k-1}) & \mathbf{Y}_{tv}^{+}(t_{k-1})\, \boldsymbol{\Omega}^{-1}(t_{k})\, \nabla_{x}^{\mathsf{T}}\mathbf{f}_{v}(t_{k})\, \mathbf{Q}^{-1}(t_{k}) \\ \mathbf{Q}^{-1}(t_{k})\, \nabla_{x}\mathbf{f}_{v}(t_{k})\, \boldsymbol{\Omega}^{-1}(t_{k})\, \mathbf{Y}_{tv}^{+\mathsf{T}}(t_{k-1}) & \boldsymbol{\Psi}(t_{k}) \end{bmatrix} \tag{11}$$

For example, in the AUV application detailed in Section VIII, in which loop-closure observations are produced from stereo vision, prediction with augmentation (keeping the current pose) is performed after each stereo image pair is acquired. When propagating the filter forward from the time of a vehicle depth, attitude, or velocity observation, prediction without augmentation is performed.

Prediction with augmentation is performed as in [9], using (8) and (9), in which $\nabla_{x}\mathbf{f}_{v}(t_{k})$ is the Jacobian of the vehicle model with respect to the vehicle states.

The equations for prediction without augmentation can be obtained by marginalizing the previous pose from the augmented system of (8) and (9). The result is (10) and (11), in which three subterms are defined in (12)–(14):

$$\boldsymbol{\delta}(t_{k}) = \mathbf{f}_{v}\big[\hat{\mathbf{x}}_{v}^{+}(t_{k-1}),\, \mathbf{u}(t_{k})\big] - \nabla_{x}\mathbf{f}_{v}(t_{k})\, \hat{\mathbf{x}}_{v}^{+}(t_{k-1}) \tag{12}$$

$$\boldsymbol{\Omega}(t_{k}) = \mathbf{Y}_{vv}^{+}(t_{k-1}) + \nabla_{x}^{\mathsf{T}}\mathbf{f}_{v}(t_{k})\, \mathbf{Q}^{-1}(t_{k})\, \nabla_{x}\mathbf{f}_{v}(t_{k}) \tag{13}$$

$$\boldsymbol{\Psi}(t_{k}) = \Big( \mathbf{Q}(t_{k}) + \nabla_{x}\mathbf{f}_{v}(t_{k}) \left[\mathbf{Y}_{vv}^{+}(t_{k-1})\right]^{-1} \nabla_{x}^{\mathsf{T}}\mathbf{f}_{v}(t_{k}) \Big)^{-1}. \tag{14}$$

Observations are assumed to be made according to a model of the form

$$\mathbf{z}(t_{k}) = \mathbf{h}\big[\mathbf{x}(t_{k})\big] + \mathbf{v}(t_{k}) \tag{15}$$

in which $\mathbf{z}(t_{k})$ is an observation vector, and $\mathbf{v}(t_{k})$ is a vector of observation errors with covariance $\mathbf{R}(t_{k})$. The difference between the actual and predicted observations is the innovation

$$\boldsymbol{\nu}(t_{k}) = \mathbf{z}(t_{k}) - \mathbf{h}\big[\hat{\mathbf{x}}^{-}(t_{k})\big]. \tag{16}$$

The innovation is used to update the information vector and matrix

$$\hat{\mathbf{y}}^{+}(t_{k}) = \hat{\mathbf{y}}^{-}(t_{k}) + \mathbf{i}(t_{k}) \tag{17}$$

$$\mathbf{Y}^{+}(t_{k}) = \mathbf{Y}^{-}(t_{k}) + \mathbf{I}(t_{k}) \tag{18}$$

in which

$$\mathbf{i}(t_{k}) = \nabla_{x}\mathbf{h}(t_{k})\, \mathbf{R}^{-1}(t_{k}) \Big( \boldsymbol{\nu}(t_{k}) + \nabla_{x}^{\mathsf{T}}\mathbf{h}(t_{k})\, \hat{\mathbf{x}}^{-}(t_{k}) \Big) \tag{19}$$

$$\mathbf{I}(t_{k}) = \nabla_{x}\mathbf{h}(t_{k})\, \mathbf{R}^{-1}(t_{k})\, \nabla_{x}^{\mathsf{T}}\mathbf{h}(t_{k}) \tag{20}$$

where $\nabla_{x}\mathbf{h}(t_{k})$ is the Jacobian of the observation function with respect to the vehicle states.

The observation and update process is efficient in the information form, since only the elements of the information vector and matrix corresponding to the observed states are modified.

The prediction operation equations (8) and (12) require the prior vehicle pose state estimate, while in the update operation, (16) and (19) require the prior estimates of the observed states. In addition, the prediction operations require the vehicle process model to be linearized at the prior vehicle state estimate, and the update operation requires the observation model to be linearized at the estimate of the observed states. Once the necessary state estimates have been recovered, prediction (with or without augmentation) and observations are constant-time operations independent of the number of estimated poses.
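
The observation-update step of (16)–(20) can be sketched in numpy for a toy two-state system with a linear observation model; all numerical values are illustrative assumptions. Here `H` is the standard m × n observation Jacobian, so `H.T` plays the role of the gradient-convention $\nabla_{x}\mathbf{h}$ used in (19) and (20).

```python
import numpy as np

Y = np.array([[4.0, 1.0], [1.0, 3.0]])   # prior information matrix Y^-(t_k)
y = np.array([2.0, 1.0])                  # prior information vector y^-(t_k)
x_prior = np.linalg.solve(Y, y)           # recovered prior state estimate

H = np.array([[1.0, 0.0]])                # observation Jacobian (m x n)
R = np.array([[0.5]])                     # observation noise covariance
z = np.array([1.2])                       # actual observation

nu = z - H @ x_prior                      # innovation, eq. (16)
Rinv = np.linalg.inv(R)
i_vec = H.T @ Rinv @ (nu + H @ x_prior)   # eq. (19)
I_mat = H.T @ Rinv @ H                    # eq. (20)

y_post = y + i_vec                        # eq. (17)
Y_post = Y + I_mat                        # eq. (18)
x_post = np.linalg.solve(Y_post, y_post)  # posterior estimate, for checking
```

For a linear observation model, this additive information-form update is algebraically identical to the Kalman gain update, and only the observed block of `Y` and `y` changes.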

### C. Structure of the Information Matrix

Elements of the VAN information matrix off the block diagonal are nonzero only if an observation relating the two corresponding poses has been applied to the filter. Fig. 1(a) shows an example of an information matrix sparsity pattern and Markov graph that results from dead reckoning (DR). Since each pose is related to the previous and next pose through odometry constraints, DR results in a block tridiagonal matrix. The Markov graph provides a visual representation of the relationship between the estimated variables, with an edge in the graph corresponding to a nonzero block in the information matrix.

When loop-closure observations are applied to the filter, additional nonzero elements in the information matrix are created at the locations corresponding to the two observed poses. Fig. 1(c) displays the information matrix resulting from adding loop-closure observations between the first pose and each of the final two poses.

The sparsity of the information matrix is important for the computational efficiency and storage requirements of the filter. In large-scale applications with many augmented poses, EKF-based approaches are infeasible due to the memory requirements of dense covariance matrices.
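
The sparsity patterns of Fig. 1 can be sketched with scipy, using scalar (1 × 1) "blocks" for clarity; the numerical values are illustrative only.

```python
import numpy as np
import scipy.sparse as sp

# Odometry links pose i to pose i+1, giving a tridiagonal matrix;
# a loop closure between poses i and j adds entries at (i, j), (j, i).
n = 6
Y = sp.lil_matrix((n, n))
for i in range(n):
    Y[i, i] = 4.0                 # diagonal information from all constraints
for i in range(n - 1):
    Y[i, i + 1] = -1.0            # odometry constraint between consecutive poses
    Y[i + 1, i] = -1.0

# A loop closure between the first and last pose fills in (0, n-1)
Y[0, n - 1] = -1.0
Y[n - 1, 0] = -1.0
```

Only one pair of off-diagonal entries is added per loop closure, so the matrix stays sparse as long as the number of loop closures grows slowly relative to the number of poses.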

Fig. 1. Sparsity pattern of the VAN information matrix. Nonzero blocks in the information matrix correspond to edges in the Markov graph. (a) DR information matrix, in which odometry constraints produce a block tridiagonal structure. (b) DR Markov graph. (c) SLAM information matrix, in which two loop-closure observations constrain the final two poses relative to the initial pose. (d) SLAM Markov graph.

## IV. Cholesky Factorization and Modifications

The Cholesky factorization is commonly used to solve linear systems of the form

$$\mathbf{A}\mathbf{X} = \mathbf{B} \tag{21}$$

where A is a positive definite symmetric matrix and X is a matrix of unknowns.

In this SLAM application, a Cholesky factor of the information matrix will be used to recover state estimates and covariances. Relationships for the state estimate vector and covariance matrix in the form of (21) can be produced by rearranging (3) and (4) to obtain

$$\mathbf{Y}^{+}(t_{k})\, \hat{\mathbf{x}}^{+}(t_{k}) = \hat{\mathbf{y}}^{+}(t_{k}) \tag{22}$$

$$\mathbf{Y}^{+}(t_{k})\, \mathbf{P}^{+}(t_{k}) = \mathbf{I}. \tag{23}$$

The $\mathbf{L}\mathbf{D}\mathbf{L}^{\mathsf{T}}$ form of the Cholesky decomposition of the matrix A is defined by

$$\mathbf{A} = \mathbf{L}\mathbf{D}\mathbf{L}^{\mathsf{T}} \tag{24}$$

where L is a lower triangular matrix with all diagonal elements equal to one, and D is a diagonal matrix.

The solution to a system of equations in the form of (21) is calculated from the Cholesky factorization using a two-step forward and backward solve process. First, a forward solve is performed on the lower triangular system

$$\mathbf{L}\mathbf{Z} = \mathbf{B} \tag{25}$$

to recover the rows of the forward-solve result Z in order from first to last. The solution X can then be recovered using a backward solve operation on the upper triangular system

$$\mathbf{D}\mathbf{L}^{\mathsf{T}}\mathbf{X} = \mathbf{Z} \tag{26}$$

in which the rows of X are recovered in order from last to first.
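
The two-step solve of (25) and (26) can be sketched with numpy/scipy, deriving the LDLᵀ factors from the standard Cholesky factor (A = GGᵀ with G = L√D); the matrix values are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])    # positive definite symmetric A
B = np.array([1.0, 2.0, 3.0])

G = cholesky(A, lower=True)        # A = G G^T
d = np.diag(G) ** 2                # diagonal of D
L = G / np.diag(G)                 # unit lower triangular L, eq. (24)

Z = solve_triangular(L, B, lower=True)                  # forward solve, eq. (25)
X = solve_triangular(d[:, None] * L.T, Z, lower=False)  # backward solve, eq. (26)
```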

The structure of the Cholesky factor L of a sparse matrix is related to the sparsity pattern of the original matrix A. Nonzero elements in the Cholesky factor are present at the locations of all nonzero elements in the original matrix; however, additional nonzeros known as “fill-in” are introduced. Fill-in is undesirable, since additional nonzero elements increase the computational complexity of the factorization and equation-solving processes.

Many algorithms to calculate the Cholesky decomposition exist [24]. The experiments presented in Section VIII use an efficient sparse up-looking algorithm [14]. In Fig. 2, the right-looking factorization algorithm is used to demonstrate the process of fill-in. During each iteration of the algorithm, the jth column of the factor is produced by dividing the jth column of the active submatrix by its element on the diagonal. The jth variable is then eliminated by marginalizing it from the remaining active submatrix. The fill-in produced in the Cholesky factor is equivalent to the additional edges produced by marginalizing a variable from the Markov graph, in which the neighbors of an eliminated node form a clique.

### A. Reducing Fill-in With Variable Reordering

Fig. 2. Right-looking Cholesky factorization algorithm demonstrating fill-in. During each iteration of the algorithm, the jth column of the factor is produced by dividing the jth column of the active submatrix by its element on the diagonal. The jth variable is then eliminated (marginalized) from the active submatrix. Fill-in elements (nonzeros at the locations of zeros in the original matrix) are shown as dark grey matrix blocks and thick graph edges. (a) Factorization using the natural variable ordering, producing three blocks of fill-in. (b) Factorization using the AMD variable ordering, resulting in only one block of fill-in.

Fill-in can be reduced by reordering the variables to change the sequence in which they are eliminated during the factorization process. Since finding the optimal permutation that produces minimal fill-in is NP-hard, heuristic-based approaches, such as the approximate minimum degree (AMD) algorithm, are typically used [24], [25].

In each iteration of a right-looking Cholesky factorization, the AMD algorithm employs the greedy strategy of selecting for elimination the variable corresponding to the graph node with the smallest degree (the number of neighbors) or, equivalently, the sparsest row of the remaining active submatrix to be factorized.

Fig. 2(b) illustrates the factorization process for the matrix previously decomposed in Fig. 2(a), using the variable ordering produced by AMD. The selected order in which the poses are eliminated is 5, 4, 1, 2, 3. The benefit of variable reordering can be observed by comparing the three blocks of fill-in produced using the natural ordering with the single block of fill-in produced using the AMD ordering.
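
The effect of elimination order on fill-in can be illustrated in numpy with the classic arrowhead example (not the matrix of Fig. 2, which is an assumption-free choice for a deterministic demonstration): eliminating the densely coupled "hub" variable first fills the factor completely, while eliminating it last produces no fill-in at all.

```python
import numpy as np

n = 5
A = np.eye(n) * 4.0
A[0, 1:] = 1.0                     # hub variable coupled to all others
A[1:, 0] = 1.0

def factor_nonzeros(M, tol=1e-12):
    """Count nonzeros in the lower Cholesky factor of M."""
    L = np.linalg.cholesky(M)
    return int(np.sum(np.abs(L) > tol))

perm = np.arange(n)[::-1]          # reordering that eliminates the hub last
nnz_natural = factor_nonzeros(A)                        # hub first: dense factor
nnz_reordered = factor_nonzeros(A[np.ix_(perm, perm)])  # hub last: no fill-in
```

With the hub first, marginalizing it couples every remaining variable to every other, which is exactly the clique formation described above; with the hub last, the factor retains the sparsity of the original matrix.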

### B. Scalability of the Factorization Process

The computational complexity of the Cholesky factorization of a dense n × n matrix is O(n³). For sparse matrices, however, the complexity depends on the number of nonzeros in the Cholesky factor, which is influenced by the structure of the matrix being factorized and the variable ordering.

If the number of nonzeros in the Cholesky factor grows linearly with the number of estimated poses, as is the case for a DR VAN information matrix with a block tridiagonal structure or a VAN system with a constant number of loop closures, the complexity of the Cholesky decomposition process is O(n). In general, however, where the Cholesky factor contains O(n²) nonzeros, as can be expected in SLAM applications where the number of loop-closure observations grows linearly with the number of poses, the complexity of the factorization process is O(n³).

### C. Modifying a Factor

If a previously factorized system of equations is changed, it is often possible to efficiently modify an existing factor instead of repeating the computationally expensive factorization process.

The complex equations and algorithms used to compute modified components of a sparse Cholesky factorization will not be presented here. Instead, the focus will be on illustrating which components of the factorization change, and the resulting complexity of the operation. Further details on the Cholesky modification algorithms can be found in [26] and [27], and the implementation used in the experiments presented in the paper is described in [14].

Four Cholesky factor modification operations are used: row additions, row deletions, updates, and downdates. The row addition and deletion operations allow the introduction of a new variable or removal of an existing variable from the system of linear equations. A two-step process of row deletion and addition can be used to perform an arbitrary change to a row of the factorized matrix.

Update and downdate operations allow a special-case modification to the factorized system of equations. A modification of the form

$$\bar{\mathbf{A}} = \mathbf{A} + \mathbf{W}\mathbf{W}^{\mathsf{T}}, \quad \bar{\mathbf{B}} = \mathbf{B} + \boldsymbol{\Delta}_{\mathbf{B}} \tag{27}$$

where W is an n × k matrix, is known as a rank-k update, while a modification of the form

$$\bar{\mathbf{A}} = \mathbf{A} - \mathbf{W}\mathbf{W}^{\mathsf{T}}, \quad \bar{\mathbf{B}} = \mathbf{B} + \boldsymbol{\Delta}_{\mathbf{B}} \tag{28}$$

is a rank-k downdate.

Equations (27) and (28) include a change to the right-hand-side matrix B of the system of linear equations to enable the forward solve result Z to be modified in addition to the Cholesky factor.

Fig. 3 illustrates an update modification performed on the system of equations previously factorized in Fig. 2(b). For each of the modification operations, if an element in row j of the factorized matrix is modified (added, removed, updated, or downdated), the elements of the factor that are changed are limited to the columns j to n. Considering the Cholesky factorization process in Fig. 2, this is a logical result, since these columns of the Cholesky factor were previously produced after the modified variable was marginalized from the active submatrix.

Fig. 3. Cholesky factor update modification example. A system of equations of the form AX = B, with an existing factorization L and forward-solve result Z, is modified using an update matrix W and right-hand-side change matrix ΔB. In the resulting modified factor and forward-solve result, altered blocks are shown in black. (a) Original A matrix. (b) Original B matrix. (c) Original factor. (d) Original forward solve result. (e) Update matrix. (f) Right-hand side change matrix. (g) Modified factor. (h) Modified forward solve result.
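
The rank-1 case of the update in (27) can be sketched on a small dense factor using the standard rank-1 update recurrence; this is a generic textbook routine operating on the GGᵀ form of the factor, not the sparse implementation of [14], which additionally restricts work to the affected columns.

```python
import numpy as np

def chol_update(L, w):
    """Rank-1 update: return the lower Cholesky factor of L @ L.T + w @ w.T."""
    L, w = L.copy(), w.astype(float).copy()
    n = w.size
    for k in range(n):
        # Rotate the kth diagonal entry and the update vector together
        r = np.hypot(L[k, k], w[k])
        c, s = r / L[k, k], w[k] / L[k, k]
        L[k, k] = r
        if k + 1 < n:
            L[k + 1:, k] = (L[k + 1:, k] + s * w[k + 1:]) / c
            w[k + 1:] = c * w[k + 1:] - s * L[k + 1:, k]
    return L

# Illustrative use: update a factorized 3x3 information-style matrix
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
w = np.array([1.0, 2.0, 0.5])
L_up = chol_update(np.linalg.cholesky(A), w)
```

If `w` first becomes nonzero at row j, only columns j through n of the factor change, matching the modification pattern shown in Fig. 3.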

### D. Maintaining a Factor of the VAN Information Matrix

The information-form VAN filter operations of Section III-B can all be described using the row addition, row deletion, update, and downdate modifications.

The prediction with augmentation equations (8) and (9) can be implemented with row additions for the new pose variables, and an update on the previous pose states. Similarly, the prediction without augmentation equations (10) and (11) can be implemented with row removal and row addition operations to perform the changes to the current vehicle pose states, and a downdate on the previous pose states. The observation update equations (17) and (18) can simply be implemented with a single update modification.

Modifications are used to maintain an up-to-date factor after prediction and vehicle state observation operations. However, when a loop-closure observation is applied between past poses, the structure of the information matrix is significantly changed, causing the previous variable ordering to be ineffective in minimizing fill-in. Therefore, when a loop-closure observation is applied to the filter, a new variable ordering is found, and a new factor of the information matrix is calculated.

### E. Variable Ordering for Efficient VAN Operations

Prediction operations and observations of the current vehicle states are the most frequent procedures in a SLAM algorithm, with the number of loop-closure observations being relatively small. After considering the pattern of modified factor elements in Fig. 3, it is clear that ordering the vehicle states last will minimize the complexity of maintaining a factor of the VAN information matrix.

If the current vehicle states are ordered last, the number of elements in the factor that need to be recalculated is independent of the number of augmented poses, allowing the Cholesky factorization modifications for the prediction and vehicle state observation operations to be performed in constant time.

While ordering the vehicle states last may not result in the minimal amount of fill-in, the benefit of constant-time prediction and observation operations outweighs the additional computational cost of the extra fill-in caused by this constraint.

## V. State Estimate Recovery

### A. Complete State Recovery

The complete state estimate vector can be recovered by solving the relationship

$$\mathbf{Y}^{+}(t_{k})\, \hat{\mathbf{x}}^{+}(t_{k}) = \hat{\mathbf{y}}^{+}(t_{k}) \tag{29}$$

using the Cholesky factor of the information matrix and the process described in Section IV.

The efficiency of the forward and backward solve process used to solve (29) is dependent on the sparsity of the Cholesky factor. If the factor contains O(n) nonzero elements, as is the case for VAN systems with only odometry constraints or a constant number of loop-closure observations, the complete state estimate vector can be recovered in O(n) time. However, in general, where the Cholesky factor contains O(n²) nonzeros, as can be expected in SLAM applications where the number of loop-closure observations grows linearly with the number of poses, the computational complexity of recovering the complete vector is O(n²).

### B. Approximate Vehicle State Recovery

In a previous VAN implementation [9], [10], [11], approximate estimates of the current vehicle states were produced by partitioning the state vector into a "local" portion consisting of the states to be recovered, and the remaining "benign" states for which an approximate estimate is available. Using the subscript l for the local subvector and b for the benign states, the partitioned version of (29) is

$$\begin{bmatrix} \mathbf{Y}_{bb}^{+}(t_{k}) & \mathbf{Y}_{bl}^{+}(t_{k}) \\ \mathbf{Y}_{bl}^{+\mathsf{T}}(t_{k}) & \mathbf{Y}_{ll}^{+}(t_{k}) \end{bmatrix} \begin{bmatrix} \hat{\mathbf{x}}_{b}^{+}(t_{k}) \\ \hat{\mathbf{x}}_{l}^{+}(t_{k}) \end{bmatrix} = \begin{bmatrix} \hat{\mathbf{y}}_{b}^{+}(t_{k}) \\ \hat{\mathbf{y}}_{l}^{+}(t_{k}) \end{bmatrix}. \tag{30}$$

If the benign states have not changed significantly since they were last recovered, providing a good approximation $\tilde{\mathbf{x}}_{b}(t_{k})$ (a tilde is used to denote approximate estimates), an approximate estimate of the local states can be calculated with

$$\tilde{\mathbf{x}}_{l}(t_{k}) = \left[\mathbf{Y}_{ll}^{+}(t_{k})\right]^{-1} \left( \hat{\mathbf{y}}_{l}^{+}(t_{k}) - \mathbf{Y}_{bl}^{+\mathsf{T}}(t_{k})\, \tilde{\mathbf{x}}_{b}(t_{k}) \right). \tag{31}$$

Only one block of $\mathbf{Y}_{bl}^{+\mathsf{T}}(t_{k})$, corresponding to the previous-to-current pose cross-information submatrix, contains nonzero elements, allowing the approximate vehicle state estimate to be calculated in constant time.

The assumption underlying this approximation is that the past vehicle poses have not been significantly updated by observations applied to the filter since the estimates of the benign states were last recovered.

If an observation such as a loop closure or global positioning system (GPS) fix that provides a large correction to states with drifting estimates is applied to the filter, a significant correction will be propagated to the previous pose states. As a result, the accuracy of the approximation will be poor and the complete state vector including new estimates of the benign states would need to be recovered using the method of Section V-A.

### C. Exact Vehicle State Recovery

In Section IV-C, it was shown that a Cholesky factor and the forward solve result can be efficiently modified to reflect changes to the original system of linear equations. The only remaining operation required to solve the modified system of linear equations is the backward solve of the upper-triangular system of (26), which has the form

$$\begin{bmatrix} \mathbf{D}_{1} & & \\ & \ddots & \\ & & \mathbf{D}_{n} \end{bmatrix} \begin{bmatrix} \mathbf{L}^{\mathsf{T}}_{11} & \ldots & \mathbf{L}^{\mathsf{T}}_{n1} \\ & \ddots & \vdots \\ & & \mathbf{L}^{\mathsf{T}}_{nn} \end{bmatrix} \begin{bmatrix} \mathbf{X}_{1} \\ \vdots \\ \mathbf{X}_{n} \end{bmatrix} = \begin{bmatrix} \mathbf{Z}_{1} \\ \vdots \\ \mathbf{Z}_{n} \end{bmatrix}. \tag{32}$$

The backward solve recovers the variables in reverse order, from the last row to the first. The last block of the solution $\mathbf{X}$ can, therefore, be calculated by solving

$$\mathbf{D}_{n}\,\mathbf{L}^{\mathsf{T}}_{nn}\,\mathbf{X}_{n} = \mathbf{Z}_{n}. \tag{33}$$

If the current vehicle pose variables are ordered last, and the forward substitution result is updated along with the Cholesky factor each time it is modified during a prediction or observation operation, this approach allows the current vehicle state estimates to be recovered in constant time. This is an important improvement over the method of Section V-B, since it allows prediction and observation operations to be performed without corrupting the filter with approximate estimates. As a result, the EIF retains the same optimality properties as an EKF solution.
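
The constant-time recovery of (33) can be sketched as follows, using a plain $\mathbf{L}\mathbf{L}^{\mathsf{T}}$ factor in place of the paper's block $\mathbf{L}\mathbf{D}\mathbf{L}^{\mathsf{T}}$ (the trailing-block logic is identical); sizes are illustrative:

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular

# If the current pose variables are ordered last and the forward-solve
# result z (L z = y) is maintained alongside the factor, only the trailing
# block of the backward solve L^T x = z is needed for the current pose.
rng = np.random.default_rng(2)
n, p = 10, 2                        # total states, current-pose states
A = rng.standard_normal((n, n))
Y = A @ A.T + n * np.eye(n)
x_true = rng.standard_normal(n)
y = Y @ x_true

L = cholesky(Y, lower=True)
z = solve_triangular(L, y, lower=True)     # kept up to date with the factor

# Trailing block of L^T x = z: L_nn^T x_n = z_n, a p x p solve (constant in n).
x_n = solve_triangular(L[-p:, -p:].T, z[-p:], lower=False)

assert np.allclose(x_n, x_true[-p:])
```

Because $\mathbf{L}^{\mathsf{T}}$ is upper triangular, its last block-row involves only the last variables, so the trailing solve is exact, not approximate.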

SECTION VI

## Covariance Recovery

### A. Complete Inverse Recovery

Using the Cholesky decomposition of the information matrix, the complete covariance matrix can be recovered by solving

$$\mathbf{Y}^{+}(t_k)\,\mathbf{P}^{+}(t_k) = \mathbf{I}. \tag{34}$$

While an information matrix may be sparse, the corresponding covariance matrix is dense. Recovering the complete covariance matrix is, therefore, only feasible for problems with small state vectors.

### B. Recovery of Columns of the Inverse

The $j$th column of the covariance matrix can be recovered by solving

$$\mathbf{Y}^{+}(t_k)\,\mathbf{P}^{+}_{*j}(t_k) = \mathbf{I}_{*j} \tag{35}$$

where $\mathbf{P}^{+}_{*j}(t_k)$ is the $j$th column of the covariance matrix, and $\mathbf{I}_{*j}$ is the $j$th column of an identity matrix with the same dimensions as the information matrix.
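
A dense sketch of (35): with $\mathbf{Y} = \mathbf{L}\mathbf{L}^{\mathsf{T}}$, a covariance column is two triangular solves against a column of the identity (matrix size and column index are illustrative):

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular

# Column j of the covariance solves Y P_{*j} = e_j.
rng = np.random.default_rng(3)
A = rng.standard_normal((7, 7))
Y = A @ A.T + 7 * np.eye(7)        # SPD information matrix
L = cholesky(Y, lower=True)

j = 4
e_j = np.zeros(7)
e_j[j] = 1.0
w = solve_triangular(L, e_j, lower=True)          # forward solve
P_col = solve_triangular(L.T, w, lower=False)     # backward solve

assert np.allclose(P_col, np.linalg.inv(Y)[:, j])
```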

If the Cholesky factor contains O(n) nonzero elements, as is the case for VAN systems containing only odometry constraints or a constant number of loop closures, the computational complexity of recovering a column of the covariance matrix is O(n). However, in general, where the Cholesky factor contains O(n²) nonzeros, the complexity of recovering a column is O(n²).

### C. Recovery of the Sparse Inverse

Recovering the joint pose distributions used for loop-closure hypothesis generation requires the covariances of the augmented poses, which are located on the block diagonal of the covariance matrix. The covariance recovery method of Section VI-B is inefficient for this task, since many irrelevant elements of the inverse are calculated.

An alternative recovery method [28], [29], [30], [13] can be derived from the Takahashi relationship

$$\mathbf{A}^{-1} = (\mathbf{L}^{\mathsf{T}})^{-1}\mathbf{D}^{-1} - \mathbf{A}^{-1}(\mathbf{L} - \mathbf{I}). \tag{36}$$

If (36) is used to calculate the lower triangle of the inverse, the upper-triangular component $(\mathbf{L}^{\mathsf{T}})^{-1}$, which contains ones on its diagonal, can be ignored. Individual elements of the lower triangle of the inverse can, therefore, be calculated using the recursive relationship

$$[\mathbf{A}^{-1}]_{ij} = [\mathbf{D}^{-1}]_{ij} - \sum_{k=j+1}^{n} [\mathbf{A}^{-1}]_{ik}\,\mathbf{L}_{kj}, \quad \text{for } i \ge j. \tag{37}$$

In (37), an element of the inverse in column $j$ is described in terms of other elements of the inverse in columns $j$ to $n$, along with the Cholesky factorization components $\mathbf{L}$ and $\mathbf{D}$. If the matrices $\mathbf{A}$ and $\mathbf{L}$ are sparse, not all elements of the inverse need to be recovered.
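
The recursion (37) can be sketched with a dense reference implementation that visits every element (a sparse implementation would visit only the fill-in locations); the function name and sizes are illustrative:

```python
import numpy as np

def takahashi_inverse(L, d):
    """Inverse of A = L diag(d) L^T via recursion (37); L unit lower triangular."""
    n = L.shape[0]
    Ainv = np.zeros((n, n))
    for j in range(n - 1, -1, -1):              # columns, last to first
        for i in range(n - 1, j - 1, -1):       # rows below the diagonal first
            s = 1.0 / d[i] if i == j else 0.0   # [D^{-1}]_{ij}
            for k in range(j + 1, n):
                a_ik = Ainv[i, k] if k <= i else Ainv[k, i]  # use symmetry
                s -= a_ik * L[k, j]
            Ainv[i, j] = Ainv[j, i] = s
    return Ainv

rng = np.random.default_rng(4)
A = rng.standard_normal((6, 6))
Y = A @ A.T + 6 * np.eye(6)
C = np.linalg.cholesky(Y)          # Y = C C^T
d = np.diag(C) ** 2
L = C / np.diag(C)                 # unit lower triangular; Y = L diag(d) L^T

assert np.allclose(takahashi_inverse(L, d), np.linalg.inv(Y))
```

Processing columns from last to first ensures every element referenced on the right-hand side of (37) has already been computed (or is available by symmetry).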

The set of elements of the inverse at the locations of nonzeros in the Cholesky factor is known as the "sparse inverse," which is illustrated in Fig. 4. All elements of the sparse inverse can be calculated using only other members of the sparse inverse and the factorization components [29]. When applied to the factorization of a VAN information matrix, the sparse inverse includes the block diagonal, providing a method to recover the augmented pose covariances.

If the Cholesky factor contains O(n) nonzero elements, as is the case for VAN systems containing only odometry constraints or a constant number of loop closures, the sparse inverse can be recovered in O(n) time. However, in general, where the factor contains O(n²) nonzeros, the complexity of recovering the sparse inverse is O(n³).

Fig. 4. Structure of the sparse inverse matrix. In general, the inverse of a sparse matrix is dense. The sparse inverse matrix contains the elements of the inverse at the locations of non-zeros in the Cholesky factor. The sparse inverse may contain more nonzero elements than the original matrix due to fill-in in the Cholesky factor. (a) Original matrix. (b) Factor. (c) Inverse. (d) Sparse inverse.
SECTION VII

## Generating Loop-Closure Hypotheses

Since visual feature extraction and association is computationally expensive, generating a small set of loop-closure hypotheses on which image analysis will be performed is critical to the efficiency of the VAN algorithm. A pair of poses is accepted as a loop-closure hypothesis by evaluating the joint distribution of the poses to estimate the likelihood that the images acquired at each pose overlap.

Due to the computational complexity of recovering covariances from an information filter, a previous VAN implementation [9], [10], [11] used covariances recovered at previous timesteps to generate loop-closure hypotheses. Since the uncertainty of augmented past poses can only decrease, the use of old covariances is a conservative strategy. The filter is not corrupted, since no approximate values are used in any prediction or observation operation. However, the use of conservative covariances may increase the number of loop-closure hypotheses generated.

The conservative pose covariances can be used to create an approximation of the predicted joint distribution covariance of the form

$$\tilde{\mathbf{P}}_{(i,v)}(t_k) = \begin{bmatrix} \tilde{\mathbf{P}}_{ii}(t_k) & \mathbf{P}^{-}_{iv}(t_k) \\ \mathbf{P}^{-\mathsf{T}}_{iv}(t_k) & \mathbf{P}^{-}_{vv}(t_k) \end{bmatrix} \tag{38}$$

where $\tilde{\mathbf{P}}_{ii}(t_k)$ is the conservative covariance of pose $i$, and $\mathbf{P}^{-}_{iv}(t_k)$ and $\mathbf{P}^{-}_{vv}(t_k)$ are the optimal past-to-current cross covariance and current pose covariance, which can be recovered from the vehicle columns of the covariance matrix using the method of Section VI-B.

To maintain the set of conservative past pose covariances, the current vehicle pose covariance is appended to the set each time a new pose is augmented to the state vector. When a loop-closure observation that significantly changes the past pose distributions is applied to the filter, the approximate covariances are updated.

In previous VAN applications [9], [10], [11], each time a loop-closure observation was applied to the filter, an EKF update was performed on the approximate joint distribution covariance to yield an updated covariance for the past pose. Since all of the estimated poses are correlated, a loop-closure observation will reduce the uncertainty of all trajectory states. This approach, however, only reduces the uncertainty in one of the maintained pose covariances, leaving the others highly conservative.

The sparse inverse recovery method of Section VI-C provides an alternative method to efficiently update all the augmented pose covariances. While this operation is more computationally complex than the single-pose EKF update, the reduction in the conservative pose uncertainties will cause fewer loop-closure hypotheses to be analyzed, and is likely to result in an overall improvement in efficiency.

SECTION VIII

## Results

The SLAM algorithm described in this paper has been applied to data acquired by the AUV Sirius, a modified version of the SeaBED AUV [31] developed at the Woods Hole Oceanographic Institution. DR is performed using a Doppler velocity log (DVL) that provides the velocity of the vehicle in three axes relative to the seafloor, a compass and a tilt sensor that observe the vehicle's orientation, and a pressure sensor to measure depth. A stereovision rig is used to provide loop-closure observations. Due to the accuracy of the DVL over short distances, the vision system is not used to provide odometry information.

Fig. 5. Simplified image overlap model for loop-closure hypotheses. A planar terrain structure is assumed, along with zero vehicle roll and pitch, and a constant radial field of view of α. The altitudes of the stereo rig are $a_i$ and $a_j$, resulting in circular image footprints with radii $r_i$ and $r_j$. Under these assumptions, overlapping images occur if $\sqrt{x_{ij}^2 + y_{ij}^2} < r_i + r_j$.

In this application, loop-closure hypotheses are created using the simplified visibility model illustrated in Fig. 5, which is designed to be conservative and computationally efficient. In this model, the terrain is assumed to be planar, a conservative circular bound is used to approximate the stereo rig's field of view, and the vehicle is assumed to have zero roll and pitch (a reasonable approximation for the stable Sirius vehicle). Under these assumptions, image overlap occurs if the magnitude of the 2-D stereo rig displacement $[x_{ij}, y_{ij}]$ between poses $i$ and $j$ is less than the sum of the circular image footprint radii $r_i$ and $r_j$. A distribution for the 2-D displacement is created from the conservative joint pose covariance in (38). The likelihood of image overlap is calculated by integrating the 2-D displacement distribution over the circular region defined by $\sqrt{x_{ij}^2 + y_{ij}^2} < r_i + r_j$. In this experiment, an approximate integration is performed by sampling the 2-D displacement distribution on a 20 × 20 cell grid, as demonstrated in Fig. 6, and pose pairs with an overlap likelihood greater than 0.005 are accepted as loop-closure hypotheses.
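
The grid-based integration can be sketched as follows; the function name, mean, covariance, and radius values are illustrative, not taken from the experiment:

```python
import numpy as np

def overlap_likelihood(mean, cov, r_sum, cells=20):
    """P(|[x, y]| < r_sum) for [x, y] ~ N(mean, cov), by grid integration."""
    # Sample a square grid covering the overlap disc centred on the origin.
    xs = np.linspace(-r_sum, r_sum, cells)
    dx = xs[1] - xs[0]
    X, Y = np.meshgrid(xs, xs)
    pts = np.stack([X.ravel(), Y.ravel()], axis=1)
    inside = np.hypot(pts[:, 0], pts[:, 1]) < r_sum   # cells within overlap bound
    diff = pts[inside] - mean
    Sinv = np.linalg.inv(cov)
    norm = 1.0 / (2.0 * np.pi * np.sqrt(np.linalg.det(cov)))
    dens = norm * np.exp(-0.5 * np.einsum("ni,ij,nj->n", diff, Sinv, diff))
    return float(dens.sum() * dx * dx)   # density times cell area

p = overlap_likelihood(np.array([2.0, 0.0]), np.diag([1.0, 1.0]), r_sum=1.5)
assert 0.0 < p < 1.0
```

With a fixed 20 × 20 grid, the cost per pose pair is constant, which is what makes the test cheap enough to run against every candidate pair.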

Fig. 6. Calculating the likelihood of overlapping images for loop-closure hypotheses. In this example, the maximum distance for image overlap ($r_i + r_j$) is 1.5 m. The mean of the 2-D stereo pose displacement distribution is marked by a cross, and the one and two standard deviation ellipses are drawn in black. The gray circle shows the bounds of 2-D displacement vectors that support overlapping images. The displacement distribution is sampled on a grid, and cells within the overlap bounds are integrated to estimate the likelihood of overlapping images. A 20 × 20 grid has been used, requiring the calculation of 400 samples. The grayscale intensity of each grid cell displays the evaluated relative pose likelihood. In this example, the likelihood of overlapping images was calculated to be approximately 0.08.
Fig. 7. Stereovision loop-closure observation example. The left and right stereo images acquired at a first pose are shown on top of the stereo images acquired at a second pose. Features associated between all images are marked by lines joining their locations in both left and right frames.

Loop-closure observations are created using a six degree-of-freedom stereovision relative pose estimation algorithm [32]. The SURF algorithm [20] is used to extract and associate visual features, and epipolar geometry [33] is used to reject inconsistent feature observations within each stereo image pair. Triangulation [33] is performed to calculate initial estimates of the feature positions relative to the stereo rig, and a redescending M-estimator [34], [35] is used to calculate a relative pose hypothesis that minimizes a robustified registration error cost function. Any remaining outliers with observations inconsistent with the motion hypothesis are then rejected. Finally, the maximum likelihood relative vehicle pose estimate and covariance are calculated from the remaining inlier features. An example set of stereo image pairs and the visual features used to produce a loop-closure observation are presented in Fig. 7.
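
The redescending M-estimator idea can be illustrated on a toy 1-D location problem (a Tukey biweight inside an IRLS loop; the actual system fits a 6-DOF relative pose, and the function names and data here are hypothetical):

```python
import numpy as np

def tukey_weights(r, c=4.685):
    """Redescending Tukey biweight: weight falls to zero beyond |r| = c."""
    w = np.zeros_like(r)
    m = np.abs(r) < c
    w[m] = (1.0 - (r[m] / c) ** 2) ** 2
    return w

rng = np.random.default_rng(6)
# 100 inliers near 5.0 plus 10 gross outliers near 50.0.
data = np.concatenate([rng.normal(5.0, 0.1, 100), rng.normal(50.0, 1.0, 10)])

est = np.median(data)              # robust initial estimate
for _ in range(20):                # iteratively reweighted least squares
    w = tukey_weights(data - est)
    est = np.sum(w * data) / np.sum(w)

assert abs(est - 5.0) < 0.1        # outliers receive zero weight
```

Because the weight function redescends to zero, gross outliers are effectively removed from the fit rather than merely down-weighted, mirroring the outlier rejection step described above.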

In a deployment to survey sea sponges in the Ningaloo Marine Park near Exmouth in Western Australia, the AUV traversed a grid pattern within a square region of 150 m × 150 m, collecting 2156 pairs of stereo images. The ocean depth at the survey site is approximately 40 m, and the AUV maintained an altitude of 2 m above the seafloor. The vehicle trajectory is approximately 2.2 km in length, and required approximately 75 min to complete.

A comparison of the estimated trajectories produced by DR and SLAM is shown in Fig. 8. A total of 111 loop-closure observations were applied to the SLAM filter, shown by the red lines joining observed poses. Applying the loop-closure observations results in a trajectory estimate that suggests the vehicle drifted approximately 30 m southwest of the desired survey area.

While no ground truth for the survey is available, arguments for the superiority of the SLAM solution can be created by considering the consistency of the final vehicle position estimates with GPS observations acquired after the vehicle surfaced at the end of the mission, and the self-consistency of each estimated trajectory.

Estimates of the final vehicle position at the end of the mission produced by DR, SLAM, and GPS are listed in Table I. The difference between the SLAM estimate and GPS is approximately half that of the DR solution. It is likely that a large portion of the error in the SLAM solution was accumulated during the descent to the seafloor and the ascent to the surface, since no visual observations are available during these times to correct drifting estimates.

The superior self-consistency of the SLAM solution can be observed in mosaics of images acquired at trajectory crossover points. Fig. 9 presents mosaics for the crossover points marked "A" and "B" within the DR and SLAM trajectory estimates in Fig. 8. The mosaic of the DR crossover point in Fig. 9(a) is inconsistent, since images hypothesized to overlap contain no common features. In contrast, the mosaic of Fig. 9(b), produced using vehicle pose estimates from SLAM, displays accurately registered overlapping images, demonstrating the correction of DR drift.

Fig. 8. Comparison of DR and SLAM vehicle trajectory estimates. The SLAM estimates suggest the vehicle has drifted approximately 30 m southwest of the desired survey area. Mosaics of images acquired at the trajectory crossover points marked “A” and “B” are shown in Fig. 9. A video showing the evolution of the DR and SLAM trajectories is available at http://ieeexplore.ieee.org.
TABLE I FINAL VEHICLE POSITION ESTIMATES
Fig. 9. Mosaic reconstructions of crossover points in the estimated vehicle trajectories. (a) Mosaic of the region marked "A" in Fig. 8, demonstrating the inconsistency of the DR trajectory. Images predicted to contain overlap do not match due to significant localization errors. (b) Mosaic of the region marked "B" in Fig. 8, demonstrating the superior self-consistency of the SLAM trajectory.
TABLE II ESTIMATOR PROCESSING TIMES
Fig. 10. Evaluation of conservative covariance updating strategies. The trace of the conservative pose covariance submatrices has been used as a measure of their uncertainty. In an exploration style mission with few loop closures, updating a single pose covariance after each loop closure with the EKF update method produces little benefit. Updating all pose covariances using the sparse inverse recovery method each time a loop-closure observation is applied to the filter maintains conservative covariances that are close to optimal.

Table II lists the processing times for the SLAM algorithm with and without the use of Cholesky modifications, and using the approximate and exact vehicle state recovery methods. The exact vehicle state recovery process is slightly more efficient due to the complexity of the matrix inverse operation required by the approximate method. Using Cholesky factor modifications provides a significant advantage, since many computationally expensive factorization operations (worst case O(n3)) are replaced by constant-time modifications. When applied to larger datasets, the difference between modifying and recalculating the factor will be greater.

The processing times listed in Table II do not include the time required to produce the loop-closure observations. In the current implementation, all vision processing is performed on the same CPU as the SLAM filter. For the Ningaloo dataset, the vision processing required an additional 4 min and 1 s of processing time. In the future, the computationally expensive image analysis operations, such as feature extraction, may be performed on a separate device such as a graphics processing unit.

TABLE III LOOP-CLOSURE STATISTICS FOR CONSERVATIVE POSE UPDATING STRATEGIES

In this application, exact vehicle state recovery provides little benefit in accuracy over the approximate method. The DVL, orientation, and depth sensors provide high frequency and accurate observations, resulting in only small corrections to the past vehicle pose states. If DR is performed using each vehicle state recovery method, the maximum difference in the vehicle position estimates for the Ningaloo survey is 9 cm. The benefits of the exact vehicle state recovery method may be greater in other applications, where observations provide larger corrections to the past pose estimates.

The superiority of the sparse inverse method for updating the past pose conservative covariances is demonstrated in Fig. 10, where the trace of the covariances is used as a measure of their uncertainty. For comparison, optimal (nonconservative) values were produced by recovering the true pose covariances at each timestep. For a survey pattern with few crossover points, applying a single-pose EKF update after each loop closure provides little benefit. The strategy of updating the conservative poses using the sparse inverse method after each loop closure produces near-optimal results. The numbers of loop-closure hypotheses and observations produced when using each conservative pose update strategy are listed in Table III. As expected, the sparse inverse method results in a significant reduction in the number of generated loop-closure hypotheses.

Fig. 11. Growth of the number of nonzero elements in the Cholesky factor. The number of nonzeros grows linearly with the number of augmented poses between loop-closure observations, the first of which occurs when there are 1039 augmented past poses. The irregularities in the growth pattern result from the greedy nature of the AMD algorithm. Some loop closures cause a decrease in the number of nonzeros when a better variable ordering is found. The worst case number of nonzeros is O(n²) in the number of poses; however, the sparse set of loop closures in the Ningaloo experiment results in growth that is not much worse than linear.

The final state vector for the Ningaloo Marine Park experiment contains 25 884 variables from 2157 poses. Each vehicle pose contains 12 states: three for position, three for orientation, three for velocity, and three for angular velocity.

The final information matrix is 99.86% sparse, and its lower triangle contains 482 706 nonzero elements. Most of the nonzero elements result from odometry constraints; however, each of the 111 loop-closure observations results in a block of nonzeros below the block tridiagonal.

If the natural variable ordering is used, the Cholesky factor of the final information matrix contains 5 165 838 nonzero elements. The AMD variable ordering produces a factor with 804 222 nonzeros (approximately one-sixth of the number produced by the natural ordering), resulting in significant computational efficiency advantages when performing state estimate and covariance recovery.
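
The effect of a fill-reducing ordering can be illustrated with SciPy's SuperLU, using COLAMD as a stand-in for the AMD ordering used in the paper, on a synthetic odometry-plus-loop-closure sparsity pattern:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu

# Tridiagonal "odometry" structure plus a few long-range "loop closure" entries.
n = 200
A = sp.lil_matrix(sp.diags([np.ones(n - 1), 4.0 * np.ones(n), np.ones(n - 1)],
                           [-1, 0, 1]))
rng = np.random.default_rng(5)
for _ in range(20):
    i, j = rng.integers(0, n, size=2)
    if i != j:
        A[i, j] = A[j, i] = 0.5
A = sp.csc_matrix(A)

lu_nat = splu(A, permc_spec="NATURAL")   # no reordering
lu_amd = splu(A, permc_spec="COLAMD")    # fill-reducing ordering
# lu_nat.L.nnz vs lu_amd.L.nnz shows the fill-in difference; COLAMD
# typically produces far fewer nonzeros on this kind of pattern.

b = np.ones(n)
assert np.allclose(lu_nat.solve(b), lu_amd.solve(b))   # same solution either way
```

Both orderings solve the same system exactly; the ordering only changes the sparsity of the factors, and hence the cost of every subsequent solve.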

The growth in the number of nonzeros in the Cholesky factor for the Ningaloo experiment is shown in Fig. 11. In general, the number of nonzeros for SLAM is O(n²) in the number of poses; however, due to the sparse set of crossover points in the Ningaloo experiment, the number of nonzeros caused by new poses and odometry constraints (which grow linearly) outnumbers those from loop-closure observations. As a result, in this case, the growth in the number of nonzeros is not much worse than linear.

Fig. 12. Processing times for loop-closure observations. The displayed processing times were acquired on a 2.0 GHz Intel Pentium M Processor. Applying a loop-closure observation update to the information matrix is an O(n) operation in this implementation due to the use of a compressed row storage format. In general, the computational complexity of the Cholesky factorization and sparse inverse recovery operations is O(n³), and the forward solve operation is O(n²). In this experiment, the growth of the processing times is not much worse than linear due to the near-linear growth in the number of nonzeros in the Cholesky factor.
Fig. 13. 3-D reconstruction of the Ningaloo survey site. (a) Overview of the reconstruction. A gap is present in the data near the northeast corner, where the stereo rig failed to log images. (b) Detail view of a trajectory crossover point, where loop-closure observations have been applied to the filter. (c) 3-D detail view, showing the structure of the terrain, including a few of the sponges that were the target of the survey. A video showing the seafloor reconstruction in detail is available at http://ieeexplore.ieee.org.

The most computationally expensive operations in the SLAM algorithm are loop-closure observations. The processing times for each component of the loop-closure observations in the Ningaloo experiment are shown in Fig. 12. Updating the information matrix is an O(n) operation in this implementation due to the use of a compressed row storage format, which requires O(n) values to be shifted when a new nonzero element is inserted. Recalculating the Cholesky factor and recovering the sparse inverse to update the conservative pose covariances are the most time-consuming components. While the computational complexity of these operations, in general, is O(n³), the growth of their processing times in this experiment is not much worse than linear due to the near-linear growth in the number of nonzeros in the Cholesky factor.

A 3-D reconstruction of the survey site has been produced by triangulating features in the stereo images, and registering the point clouds in a common reference frame using the SLAM-estimated vehicle trajectory. The source images have then been projected onto the resulting mesh, which can be observed in Fig. 13.

SECTION IX

## Conclusion

A SLAM algorithm using the VAN framework was presented and demonstrated using data acquired by the Sirius AUV at Ningaloo Marine Park in Western Australia.

The use of Cholesky factorization modifications to update a decomposition of the information matrix avoids repeatedly performing the computationally expensive factorization process each time state estimates and covariances are recovered.

Through the selection of an appropriate variable ordering, recovery of the vehicle state estimates can be performed in constant time, allowing prediction and vehicle state observation operations to be performed without corrupting the filter with approximate vehicle state estimates.

Updating the conservative covariances of all past poses using the sparse inverse recovery method results in the generation of significantly fewer loop-closure hypotheses than the previously used single-pose update method.

Currently, all processing is performed offline on logged data. While the worst case computational complexity of some filter operations is O(n³) in the number of augmented poses, the result for a typical underwater survey with a sparse set of loop-closure events suggests an online implementation is feasible for our application.

### Acknowledgment

The authors thank the Australian Institute of Marine Science (AIMS) for providing ship time aboard the R/V Cape Ferguson. In particular, they wish to thank A. Heyward, M. Rees, J. Colquhoun, and the crew of the Cape Ferguson for providing this opportunity and lending a hand whenever necessary. They also acknowledge the help of all those working behind the scenes to keep our AUV operational, including P. Rigby, J. Randle, the late A. Trinder, and B. Crundwell.

## Footnotes

Manuscript received December 15, 2007; revised July 04, 2008. First published October 14, 2008; current version published October 31, 2008. This paper was recommended for publication by Associate Editor A. Davison and Editor L. Parker upon evaluation of the reviewers' comments. This work was supported in part by the Australian Research Council (ARC) Centre of Excellence Programme and in part by the New South Wales State Government.

The authors are with the ARC Centre of Excellence for Autonomous Systems, Australian Centre for Field Robotics, The University of Sydney, Sydney, NSW 2006, Australia (e-mail: i.mahon@cas.edu.au; s.williams@cas.edu.au; o.pizarro@cas.edu.au; m.roberson@cas.edu.au).

This paper has supplementary downloadable material available at http://ieeexplore.ieee.org provided by the authors. This material includes two movies showing the evolution of the estimated vehicle trajectory and a 3-D seafloor reconstruction. The size of the material is 12.5 MB.

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

## References

1. Estimating uncertain spatial relationships in robotics

R. Smith, M. Self, P. Cheeseman

Auton. Robot Veh., Vol. 8 pp. 167–193 1990

2. Dynamic map building for an autonomous mobile robot

J. J. Leonard, H. F. Durrant-Whyte, I. J. Cox

Int. J. Robot. Res., Vol. 11, issue (4) pp. 286–298 1992

3. A solution to the simultaneous localization and map building (SLAM) problem

M. W. M. G. Dissanayake, P. Newman, S. Clark, H. F. Durrant-Whyte, M. Csorba

IEEE Trans. Robot. Autom., Vol. 17, issue (3) pp. 229–241 2001-06

4. Optimization of the simultaneous localization and map-building algorithm for real-time implementation

J. Guivant, E. M. Nebot

IEEE Trans. Robot. Autom., Vol. 17, issue (3) pp. 242–257 2001-06

5. Efficient solutions to autonomous mapping and navigation problems

S. B. Williams

Ph.D. dissertation, Aust. Centre Field Robot., Univ. Sydney, Sydney, Australia, 2001

6. Robust mapping and localization in indoor environments using sonar data

J. D. Tardós, J. Neira, P. M. Newman, J. J. Leonard

Int. J. Robot. Res., Vol. 21, issue (4) pp. 311–330 2002

7. An atlas framework for scalable mapping

M. Bosse, P. Newman, J. Leonard, M. Soika, W. Feiten, S. Teller

Proc. IEEE Int. Conf. Robot. Autom. 2003, 2 pp. 1899–1906

8. Simultaneous localization and mapping with sparse extended information filters

S. Thrun, Y. Liu, D. Koller, A. Y. Ng, H. Durrant-Whyte

Int. J. Robot. Res., Vol. 23, issue (7–8) pp. 693–716 2004

9. Large-area visually augmented navigation for autonomous underwater vehicles

R. M. Eustice

Ph.D. dissertation, Massachusetts Inst. Technol./Woods Hole Oceanogr. Inst., Woods Hole, MA, 2005

10. Exactly sparse delayed-state filters for view-based SLAM

R. M. Eustice, H. Singh, J. J. Leonard

IEEE Trans. Robot., Vol. 22, issue (6) pp. 1100–1114 2006-12

11. Visually mapping the RMS Titanic: Conservative covariance estimates for SLAM information filters

R. M. Eustice, H. Singh, J. J. Leonard, M. R. Walter

Int. J. Robot. Res., Vol. 25, issue (12) pp. 1223–1242 2006

12. Square root SAM

F. Dellaert

Proc. Robot.: Sci. Syst. Cambridge, MA, 2005-06, pp. 177–184

13. Square root SAM: Simultaneous localization and mapping via square root information smoothing

F. Dellaert, M. Kaess

Int. J. Robot. Res., Vol. 25, issue (12) pp. 1181–1203 2006

14. Algorithm 8xx: CHOLMOD, supernodal sparse Cholesky factorization and update/downdate

Y. Chen, T. A. Davis, W. W. Hager, S. Rajamanickam

Dept. Comput. Inf. Sci. Eng., Univ. Florida, Gainesville, FL, Tech. Rep. TR-2006-005, 2006

15. iSAM: Fast incremental smoothing and mapping with efficient data association

M. Kaess, A. Ranganathan, F. Dellaert

Proc. IEEE Int. Conf. Robot. Autom. 2007, pp. 1670–1677

16. Fast incremental square root information smoothing

M. Kaess, A. Ranganathan, F. Dellaert

Proc. Int. Joint Conf. Artif. Intell. 2007, pp. 2129–2134

17. Visually augmented navigation in an unstructured environment using a delayed state history

R. Eustice, O. Pizarro, H. Singh

Proc. IEEE Int. Conf. Robot. Autom. 2004, 1 pp. 25–32

18. Distinctive image features from scale-invariant keypoints

D. G. Lowe

Int. J. Comput. Vis., Vol. 60, issue (2) pp. 91–110 2004

19. Robust wide baseline stereo from maximally stable extremal regions

J. Matas, O. Chum, M. Urban, T. Pajdla

Cardiff, U.K.
Proc. Br. Mach. Vis. Conf., British Machine Vision Assoc., 2002, Vol. 1, pp. 384–392

20. SURF: Speeded up robust features

H. Bay, T. Tuytelaars, L. V. Gool

Graz, Austria
Proc. 9th Eur. Conf. Comput. Vis., Springer 2006-05, Vol. 13, pp. 404–417

21. An affine invariant salient region detector

T. Kadir, A. Zisserman, M. Brady

Proc. Eur. Conf. Comput. Vis. 2004, pp. 404–416

22. A performance evaluation of local descriptors

K. Mikolajczyk, C. Schmid

Proc. IEEE Conf. Comput. Vis. Pattern Recognit. 2003, 2 pp. 257–263

23. A comparison of affine region detectors

K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, L. V. Gool

Int. J. Comput. Vis., Vol. 65, issue (1–2) pp. 43–72 2005

24. Direct Methods for Sparse Linear Systems

T. A. Davis

SIAM, 2006

25. Minimum degree reordering algorithms: A tutorial

S. Ingram, 2006. [Online]. Available: http://www.cs.ubc.ca/~sfingram/cs517_final.pdf

26. Modifying a sparse Cholesky factorization

T. A. Davis, W. W. Hager

SIAM J. Matrix Anal. Appl., Vol. 20, issue (3) pp. 606–627 1999

27. Row modifications of a sparse Cholesky factorization

T. A. Davis, W. W. Hager

SIAM J. Matrix Anal. Appl., Vol. 26, issue (3) pp. 621–639 2005

28. On computing the inverse of a sparse matrix

H. Niessner, K. Reichert

Int. J. Num. Methods Eng., Vol. 19, issue (10) pp. 1513–1526 1983

29. On computing certain elements of the inverse of a sparse matrix

A. M. Erisman, W. F. Tinney

Commun. ACM, Vol. 18, issue (3) pp. 177–179 1975

30. Bundle adjustment - A modern synthesis

B. Triggs, P. McLauchlan, R. Hartley, A. Fitzgibbon

Vision Algorithms: Theory and Practice (Lecture Notes in Computer Science), Berlin, Germany: Springer-Verlag, 2000, pp. 298–375

31. Seabed AUV offers new platform for high-resolution imaging

H. Singh, A. Can, R. Eustice, S. Lerner, N. McPhee, O. Pizarro, C. Roman

EOS Trans. Amer. Geophys. Union, Vol. 85, issue (31) pp. 289, 294–295 2004-11

32. Vision-based navigation for autonomous underwater vehicles

I. Mahon

Ph.D. dissertation, Aust. Centre Field Robot., Univ. Sydney, Sydney, Australia, 2008

33. Multiple View Geometry in Computer Vision

R. I. Hartley, A. Zisserman

Cambridge, U.K.
Cambridge Univ. Press, 2000

34. Robust Statistics

P. J. Huber

New York
Wiley, 1981

35. Robust Statistics

R. A. Maronna, R. D. Martin, V. J. Yohai

Berlin, Germany
Springer-Verlag, 2006

This paper appears in: IEEE Transactions on Robotics, October 2008, pp. 1002–1014. ISSN: 1552-3098. INSPEC Accession Number: 10301462. Digital Object Identifier: 10.1109/TRO.2008.2004888. Date of Original Publication: October 14, 2008; Date of Current Version: October 31, 2008.
