Conditional Weighted Linear Fitting for 2D-LiDAR-Mapping of Indoor SLAM

The ability to map an unknown environment is a fundamental milestone for autonomous robotic vehicles. Solutions in this field must combine efficiency, accuracy, and precision. We propose a novel methodology for map feature extraction in indoor environments. The mathematical model and its implementation are designed to operate with 2-D light detection and ranging (LiDAR) measurements. Map parameters and associated uncertainty levels are determined through bivariate linear regression. The final step is experimental validation, using a low-cost commercial LiDAR sensor. The main contributions of the proposed methodology lie in the domains of computational efficiency and uncertainty. In addition, the results prove the ability of our methodology to handle large volumes of data while maintaining restrained growth in computational time. This outcome suggests considerable potential for real-time applications with limited hardware resources. A second methodology, extracted from the current state of the art, is used in parallel for benchmarking purposes.


I. INTRODUCTION
AUTONOMOUS robotic systems in industrial, civil, and military applications are experiencing a fast-paced evolution in their information acquisition capabilities. They are increasingly required to perform repetitive or dangerous tasks with a substantial degree of autonomy. Such a paradigm requires many process subsystems, including autonomous mapping, positioning, and navigation in unknown and, sometimes, hostile environments.
In this context, simultaneous localization and mapping (SLAM) [1] plays an important role. Robotic elements are usually equipped with a large number of sensors. These must provide the highest possible level of environment awareness. Sensor information allows the robot to map the environment while positioning itself and navigating in it. The objective is to minimize erratic driving, thus evolving toward greater autonomy of the robotic elements.
Light detection and ranging (LiDAR) sensors [2] transmit coherent light signals in the near-infrared spectrum [3]. They perform triangulation or time-of-flight (TOF) measurements of the reflected signal [4]. Wavelengths close to the visible spectrum and the use of coherent signals allow higher resolutions than any other ecometric, radioelectric, or acoustic system [5], [6]. This technology can replace or coexist with other systems, such as the global positioning system (of limited use in indoor environments), LiDAR odometry [7], [8], inertial systems [9], or conventional cameras and radar [2], [3].
These sensors can be used in the mapping stage. Mapping an environment requires the identification of real references, which can be rigorously included in the representation. Accuracy in a representation implies scaling and dimensioning. This is achieved through the reduction of uncertainties, which are heavily linked to the quality with which the real measurements have been obtained and processed. Measurements are obtained by instruments of known accuracy. The uncertainty of the measuring instrument and of the processing approach is transferred to the map in a process of uncertainty propagation. The resulting map uncertainty levels can be reduced by obtaining redundant measurements from different positions. However, this leads to an increase in the complexity of the process in terms of time, memory, and computational cost. In a robotic vehicle, all of these are scarce. Real-time processing, weight, size, and power consumption, among others, impose a balance between mapping resources and the quality of the result.
Real-time mapping seeks to reduce the cartographic representation to a plane with geometric primitives that simplify reality. The amount of data involved in the mapping stage can be extremely high, and real-time applications require a reduction of this volume. One of the proposed solutions is feature-based mapping, which is very useful in indoor environments, where features can be observed from different positions over large areas and are repeated with a certain frequency. These keypoints are usually symbolized by straight sections, curved sections, and their intersections [10]. Typically, keypoints in the environment are detected and then defined by their characteristic parameters and uncertainties [11]. This idealization of the real environment reduces raw data to a few characteristic points that define the mapped area as closely as possible. This stage is crucial for the subsequent positioning of the robotic element.
This article describes a novel methodology, conditional weighted linear fitting (CWLF), for feature extraction in indoor environments. We propose a rectification of the profile by fitting it to linear segments using conditional regression. A bivariate distribution-based model is introduced to characterize the uncertainty of raw data. The uncertainty of each sensor point is propagated to the intersections of the obtained segments (corners). Eventually, these intersections are characterized by their position and their uncertainty matrix, and constitute the characteristic points of the mapped environment.
The model is validated using a commercial 2D-LiDAR sensor. The experimental results demonstrate increased computational efficiency compared to traditional SLAM methods, coupled with higher levels of accuracy. Consequently, the main contribution of this approach is to achieve higher accuracy at reduced computational cost.
The rest of this article is organized as follows. In Section II, a revision of related work is presented. Section III introduces the measurement error mathematical model, from which measurement weighting factors are derived. In Section IV, the CWLF methodology is developed in analogy with another classical feature extraction method, for comparison purposes. Section V contains the results of the 2D-LiDAR mapping as well as the comparative analysis of both methodologies. Finally, Section VI concludes this article.

II. RELATED WORK
Characterization of an indoor profile requires the unambiguous identification of recognizable reference points. This function is efficiently fulfilled by corners. These keypoints can be defined as intersections of line segments. The natural procedure for their characterization starts with the identification of the points corresponding to the segments of a polygonal contour, followed by the determination of their intersections.
Nguyen et al. [5] analyzed the different segmentation methods available for the first step of the process. Siadat et al. [12] provide a similar review. From the suggested alternatives, iterative end point fit (IEPF) is our choice for segmentation purposes.
Several authors discuss the extraction of line parameters from segmented data. Arras and Siegwart [13] define a line using its slope and distance to the origin. These parameters are directly calculated in polar coordinates, given the natural translation of LiDAR measurements from a fixed station into this system. Vandorpe et al. [14] suggested the characterization of the line using its slope and intercept, obtained through linear regression in Cartesian coordinates. The process is performed on both axes, so as to avoid tangent numerical overflows in the vicinity of π/2. Some authors, such as Zhi-yu et al. [15] or Premebida and Nunes [16], follow this methodology. Siadat et al. [12] identified the line using a normalized linear expression. Cartesian coordinates of the line points are defined by a set of coefficients, which in turn define the entire line. Parameter uncertainties can be obtained through least squares fitting.
Some authors, such as Arras and Siegwart [13] or Diosi and Kleeman [17], have suggested a hypothesis consisting of full error accumulation in the radial direction. This implies that the angular error is discarded. Essentially, this approach is equivalent to regression on the radial axis, neglecting angular uncertainty. Unfortunately, each scan of the sensor only provides one distance (radial) value per point, and subsequent scans do not necessarily reproduce the same set of points.
All the aforementioned approaches involve the management of large amounts of data, which are ultimately reduced to simple environmental features. High-resolution technology provides this volume of information with high robustness and precision. However, this comes at the expense of slower processing and more demanding storage requirements. In order to mitigate these hindrances, we propose a novel methodology, CWLF, which combines higher computational efficiency in the mapping phase of SLAM with a decrease in the uncertainty of the resulting maps.
In forthcoming sections, the efficiency of CWLF will be tested against a reference. The methodology of choice is developed in [13] by Arras and Siegwart (from now on, cited as Arras-Siegwart). For the purpose of homogenization in the benchmarking process, we will complement certain aspects of the original formulation following existing literature or our own developments.

III. PRELIMINARY FUNDAMENTALS
The proposed instrumentation combines a 2D-LiDAR sensor with a rotatory base featuring an encoder. Measurements are, thus, characterized by two parameters: distance and angle. Each parameter stems from an independent sensor, resulting in no correlation between them.
Measurements have an associated uncertainty. This uncertainty is zero-mean and defined by the standard deviations in angle and distance, σ_θ and σ_ρ, respectively [17], [18], [19]. In this article, we will assume that the overall error, in terms of radial and transverse distances, corresponds to a bivariate probability distribution derived from two independent normal distributions [20]. It can be expressed as
\[
f(\varepsilon_\rho, \varepsilon_t) = \frac{1}{2\pi\,\sigma_\rho\,\rho\sigma_\theta}\,\exp\!\left[-\frac{1}{2}\left(\frac{\varepsilon_\rho^2}{\sigma_\rho^2} + \frac{\varepsilon_t^2}{(\rho\sigma_\theta)^2}\right)\right]
\]
where ε_ρ and ε_t are the radial and transverse error components. The constant probability density contour is an ellipse defined by the exponent of the previous expression
\[
\frac{\varepsilon_\rho^2}{\sigma_\rho^2} + \frac{\varepsilon_t^2}{(\rho\sigma_\theta)^2} = k^2
\]
where k is a constant.
In the methodology proposed in this work, CWLF, we perform a rotation of the cross-radial system, considering the uncertainty of each observed point after translation to the Cartesian plane. In this new environment, the uncertainties are correlated. We perform linear regression on both axes, considering the accumulated uncertainty in a single axis, conditioned by its orthogonal counterpart. The weighting factors for each measurement are defined by the inverse of the conditioned uncertainty distribution. Similarly, the Arras-Siegwart methodology will be complemented with weighting factors equal to the inverse of the bivariate distribution. We will now expand on the formulation of these factors.

A. Bivariate Distribution for Arras-Siegwart
The Arras-Siegwart methodology description in [13] includes suggestions regarding the choice of weighting factors. However, none of them is suitable for a low-cost 2D-LiDAR sensor implementation. To overcome this handicap, we use the inverse of the covariance matrix determinant at each profile point as weighting factor
\[
\omega_i = \frac{1}{|\Sigma_i|}.
\]
This choice is justified by expressing the bivariate distribution [21] as
\[
f(\boldsymbol{\varepsilon}) = \frac{1}{2\pi\sqrt{|\Sigma|}}\,\exp\!\left(-\frac{1}{2}\,\boldsymbol{\varepsilon}^{\mathsf{T}}\Sigma^{-1}\boldsymbol{\varepsilon}\right) = \frac{1}{2S}\,\exp\!\left(-\frac{1}{2}\,\boldsymbol{\varepsilon}^{\mathsf{T}}\Sigma^{-1}\boldsymbol{\varepsilon}\right)
\]
where S = πρσ_θσ_ρ represents the surface of the constant probability density contour for k = 1 and |Σ| represents the determinant of the covariance matrix Σ. In short, points with higher dispersion (represented by the surface of the ellipse) have less prevalence.
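As a minimal sketch (our illustration, not the authors' implementation; the numerical sensor figures used below are placeholders), this weighting factor can be evaluated directly from the per-point deviations, since the determinant of the covariance matrix is invariant under the later change of basis:

```python
def arras_weight(rho, sigma_rho, sigma_theta):
    """Weight of a LiDAR point: inverse of its covariance determinant.

    In the principal (radial-transverse) axes the covariance is
    diag(sigma_rho**2, (rho*sigma_theta)**2); rotation does not change
    the determinant, so |Sigma| can be computed without building the matrix.
    """
    det = (sigma_rho ** 2) * (rho * sigma_theta) ** 2
    return 1.0 / det
```

With this choice, distant points (whose error ellipses have a larger surface) contribute less to the fit.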

B. Bivariate Distribution for CWLF
CWLF defines lines by means of linear regression. This is an error minimization mechanism in the ordinate axis, provided the abscissa value can be considered true. As suggested by Vandorpe et al. [14], regression is performed on the abscissa axis in the vicinity of π/2. Inverting the role of the axes prevents numerical overflows.
Assuming a point set (x_i, y_i), variance V_x and covariance C_xy are zero if the x_i data are assumed true. Unfortunately, this is not the case, since, in our implementation, Cartesian data are obtained from measurements expressed in the polar plane. This coordinate change introduces correlation between both dimensions and their uncertainties. As mentioned before, we assume zero-mean Gaussian distributions in the radial and transverse dimensions of the polar representation. The joint uncertainty distribution corresponds to a bell-shaped bivariate distribution. The horizontal section of this function yields an elliptical constant probability density contour. We can express this distribution in the Cartesian system [20] as
\[
f(\varepsilon_x, \varepsilon_y) = \frac{1}{2\pi\sigma_{\varepsilon_x}\sigma_{\varepsilon_y}\sqrt{1-p^2}}\,\exp\!\left\{-\frac{1}{2(1-p^2)}\left[\frac{\varepsilon_x^2}{\sigma_{\varepsilon_x}^2} - \frac{2p\,\varepsilon_x\varepsilon_y}{\sigma_{\varepsilon_x}\sigma_{\varepsilon_y}} + \frac{\varepsilon_y^2}{\sigma_{\varepsilon_y}^2}\right]\right\}
\]
where σ_εx and σ_εy are the deviations on the Cartesian axes and p is the correlation factor, related to covariance by
\[
C_{xy} = p\,\sigma_{\varepsilon_x}\sigma_{\varepsilon_y}.
\]
In the Cartesian system, the uncertainty variable is correlated because the elliptical contours do not lie on their principal axes (ρ, θ). In Fig. 1, the ellipse lies on the polar system (ρ, θ) of the 2D-LiDAR sensor. In order to operate with each of the error ellipses of the environment, a change of basis is performed to the Cartesian system. Thus, all point uncertainties are referenced to the same axis system. The Cartesian system forms an angle α with the polar system. The error density function is then defined by the Cartesian deviations and correlation factor, and the covariance matrix is given by
\[
\Sigma_{xy} = A\,\Sigma_{\rho\theta}\,A^{\mathsf{T}}, \qquad \Sigma_{\rho\theta} = \begin{bmatrix} \sigma_\rho^2 & C_{\rho\theta} \\ C_{\rho\theta} & (\rho\sigma_\theta)^2 \end{bmatrix}
\]
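The change of basis described above can be sketched as follows (an illustrative snippet; the function name is ours, and the zero default for the polar covariance reflects the independent-sensor assumption):

```python
import numpy as np

def cartesian_covariance(rho, theta, sigma_rho, sigma_theta, c_rho_theta=0.0):
    """Rotate the radial-transverse covariance of one LiDAR point into the
    common Cartesian frame: Sigma_xy = A @ Sigma_rt @ A.T."""
    # Transverse deviation of an angular error sigma_theta at range rho.
    sigma_t = rho * sigma_theta
    sigma_rt = np.array([[sigma_rho ** 2, c_rho_theta],
                         [c_rho_theta, sigma_t ** 2]])
    c, s = np.cos(theta), np.sin(theta)
    A = np.array([[c, -s],
                  [s, c]])        # change-of-basis (rotation) matrix
    return A @ sigma_rt @ A.T     # [[Vx, Cxy], [Cxy, Vy]]
```

For θ = 0 the radial axis coincides with x, so V_x = σ_ρ² and V_y = (ρσ_θ)²; at intermediate angles the off-diagonal term C_xy appears.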
where C_ρθ is the covariance of distance ρ and angle θ. The sensitivity matrix A is now the change-of-basis matrix from the radial-transverse system to the Cartesian system. Fig. 1 shows an ellipse whose axes lie on the radial-transverse system, forming an angle α = π/2 − θ with the Cartesian system. For the ith point, we may write
\[
A_i = \begin{bmatrix} \cos\theta_i & -\sin\theta_i \\ \sin\theta_i & \cos\theta_i \end{bmatrix}
\]
so that, for C_ρθ = 0,
\[
V_x = \sigma_\rho^2\cos^2\theta_i + (\rho_i\sigma_\theta)^2\sin^2\theta_i, \qquad
V_y = \sigma_\rho^2\sin^2\theta_i + (\rho_i\sigma_\theta)^2\cos^2\theta_i.
\]
Variance V_x is the quadratic addition of the deviations of both ellipse axes on the horizontal axis. In analogy, variance V_y is the quadratic addition of the deviations of both ellipse axes on the vertical axis. Geometrical analysis indicates that the length of the sides of the rectangle in which the ellipse is inscribed is equal to twice the deviations (2σ_εx, 2σ_εy) in the respective parallel axes, as shown in Fig. 1, where
\[
\sigma_{\varepsilon_x} = \sqrt{V_x}, \qquad \sigma_{\varepsilon_y} = \sqrt{V_y}.
\]
The assumption of no error in the abscissa values yields a conditioned probability [20] for the error in the ordinate axis. For ε_x = 0, we can express it as
\[
f(\varepsilon_y \mid \varepsilon_x) = \frac{f(\varepsilon_x, \varepsilon_y)}{f(\varepsilon_x)}
\]
where f(ε_x) is the marginal distribution for ε_x. The result is a Gaussian distribution with a variance defined by
\[
V_{c\,\varepsilon_y} = \sigma_{\varepsilon_y}^2\,(1 - p^2)
\]
where subindex c indicates its nature as conditioned variance. In parallel, for the purpose of regression on the abscissa axis, we consider the conditioned probability f(ε_x | ε_y), given by
\[
f(\varepsilon_x \mid \varepsilon_y) = \frac{f(\varepsilon_x, \varepsilon_y)}{f(\varepsilon_y)}
\]
where f(ε_y) is the marginal distribution for ε_y. The result is, again, a Gaussian distribution with a variance defined by
\[
V_{c\,\varepsilon_x} = \sigma_{\varepsilon_x}^2\,(1 - p^2).
\]
The inverses of V_c ε_y and V_c ε_x will be used as weighting factors in their respective regression algorithms.
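A minimal sketch of the conditioned-variance weights (our illustration; it assumes the 2 × 2 Cartesian covariance of a point is already available):

```python
def conditioned_weights(cov_xy):
    """Inverse conditioned variances of a zero-mean bivariate Gaussian.

    cov_xy is the 2x2 Cartesian covariance [[Vx, Cxy], [Cxy, Vy]].
    Returns (1/V_c_eps_y, 1/V_c_eps_x): the weights for the y-on-x and
    x-on-y regressions, respectively.
    """
    vx, vy, cxy = cov_xy[0][0], cov_xy[1][1], cov_xy[0][1]
    p2 = cxy * cxy / (vx * vy)     # squared correlation factor
    v_c_y = vy * (1.0 - p2)        # Var(eps_y | eps_x)
    v_c_x = vx * (1.0 - p2)        # Var(eps_x | eps_y)
    return 1.0 / v_c_y, 1.0 / v_c_x
```

For an uncorrelated point (C_xy = 0) the conditioned variances reduce to V_y and V_x; correlation shrinks them, increasing the weight.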
Constant probability contour representation allows for the assessment of the balance between radial and transversal uncertainties.

IV. METHODOLOGY
In this section, we discuss the parameter extraction from geometric primitives in indoor environments. We also present the procedure for propagating uncertainties from laser telemeter data to the final characteristics using the two considered methods. The weighting factor used for both methodologies is the inverse of the dispersions corresponding to a bivariate distribution.

A. Straight Line Characterization
1) Arras-Siegwart Method for Line Fitting: The parameters defining the line in the reference method [13] are (r, α). These values correspond to the polar parameters of the intersection point of the straight line with its perpendicular passing through the origin
\[
x\cos\alpha + y\sin\alpha = r.
\]
The distance and angle parameters considered in the method described in [13] are not uncorrelated. Their covariance matrix is given by
\[
C_{r\alpha} = J_{r\alpha}\, C_{\rho\theta}\, J_{r\alpha}^{\mathsf{T}}
\]
where the covariance C_ρθ between the radial and angular deviations is zero and J_rα is the Jacobian matrix of the partial derivatives of the line parameters (r, α) with respect to the sensor data (ρ_i, θ_i). The obtained covariance matrix is defined by
\[
C_{r\alpha} = \begin{bmatrix} \sigma_r^2 & \sigma_{r\alpha} \\ \sigma_{r\alpha} & \sigma_\alpha^2 \end{bmatrix}.
\]
2) CWLF for Line Fitting: CWLF performs a linear regression alternatively on the ordinate or abscissa axes, following the criterion suggested by Vandorpe et al. [14]. The rationale behind this axis choice is based on the density of the point set along N_1 and N_2. This solves the problems associated with linear regression on a single axis. Where the line has a horizontal trend, the point cloud given by the y_i values accumulates around ȳ; therefore, the optimal regression is y as a function of x [14].
Once the uncertainty of the y_i values is known, weighted regression can be applied using the expression
\[
y = mx + b.
\]
Each vertical error term is weighted using the inverse of its uncertainty
\[
\omega_i = \frac{1}{V_{c\,\varepsilon_y, i}}.
\]
The quadratic value of the Euclidean norm of the weighted error vector can be expressed as
\[
\lVert \varepsilon \rVert^2 = \sum_i \omega_i\,(y_i - m x_i - b)^2.
\]
From the previous expression, the slope m and the intercept b are estimated by minimizing this norm. The uncertainties of m and b are not uncorrelated. To decouple them, we perform a displacement of the axis to the centroid
\[
\xi_i = x_i - \bar{x}, \qquad \bar{x} = \frac{\sum_i \omega_i x_i}{\sum_i \omega_i}
\]
where x̄ is the weighted average of the x_i points.
Adding the displacement to the line model results in
\[
y = m\,(x - \bar{x}) + b.
\]
Now the expressions of the slope and intercept estimators result in
\[
\hat{m} = \frac{\sum_i \omega_i\,\xi_i\, y_i}{\sum_i \omega_i\,\xi_i^2}, \qquad \hat{b} = \frac{\sum_i \omega_i\, y_i}{\sum_i \omega_i}.
\]
The variances of estimators m and b are now independent [20] and can be obtained by propagating the uncertainties
\[
V_m = \frac{1}{\sum_i \omega_i\,\xi_i^2}, \qquad V_b = \frac{1}{\sum_i \omega_i}.
\]
In the case of vertical lines, N_2 is greater than N_1. In this case, the point set defined by the values of x_i accumulates in the vicinity of x̄. The value of Σ_i ω_i ξ_i² tends to zero and the computational result is undefined. As advocated in [14], the regression can then be performed over the line
\[
x = s\,(y - \bar{y}) + h.
\]
The weighting factor used here is
\[
\omega_i = \frac{1}{V_{c\,\varepsilon_x, i}}.
\]
Following the same approach as previously, the ordinate axis is displaced to the centroid. This prevents the dependence between s and h and provides two independent parameters
\[
\hat{s} = \frac{\sum_i \omega_i\,(y_i - \bar{y})\, x_i}{\sum_i \omega_i\,(y_i - \bar{y})^2}, \qquad \hat{h} = \frac{\sum_i \omega_i\, x_i}{\sum_i \omega_i}.
\]
The variances V_s and V_h follow their counterparts V_m and V_b, respectively, replacing references to x by references to y.
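The weighted fit with centroid shift can be sketched as follows (our illustrative code, not the authors' implementation):

```python
import numpy as np

def cwlf_fit_y(x, y, w):
    """Weighted regression y = m*(x - xbar) + b with a centroid shift.

    w are the weights 1/V_c_eps_y,i; the shift decorrelates the
    estimators of the slope m and the intercept b.
    """
    x, y, w = (np.asarray(a, dtype=float) for a in (x, y, w))
    xbar = np.sum(w * x) / np.sum(w)       # weighted centroid of the abscissas
    xi = x - xbar                          # shifted abscissas
    m = np.sum(w * xi * y) / np.sum(w * xi ** 2)
    b = np.sum(w * y) / np.sum(w)          # intercept at the centroid
    V_m = 1.0 / np.sum(w * xi ** 2)        # now-independent variances
    V_b = 1.0 / np.sum(w)
    return m, b, xbar, V_m, V_b
```

For near-vertical segments, the symmetric routine on x = s(y − ȳ) + h is obtained by swapping the roles of x and y.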

B. Corner Extraction
The methods presented in this section focus on the extraction of corners as the intersection of two consecutive straight lines. The first one is based on the polar parameters of the lines, while the second one uses the parameters extracted from the linear regression.
1) Arras-Siegwart for Corner Extraction: The method proposed in [13] considers the corner coordinates (x_c, y_c) from the intersection of two straight sections, (r_j, α_j) and (r_{j+1}, α_{j+1})
\[
x_c = \frac{r_j \sin\alpha_{j+1} - r_{j+1}\sin\alpha_j}{\sin(\alpha_{j+1} - \alpha_j)}, \qquad
y_c = \frac{r_{j+1}\cos\alpha_j - r_j \cos\alpha_{j+1}}{\sin(\alpha_{j+1} - \alpha_j)}.
\]
The uncertainty propagation of the straight line parameters to the corner [19] is reflected by the matrix
\[
C_c = J_c\, C_{r_j \alpha_j r_{j+1} \alpha_{j+1}}\, J_c^{\mathsf{T}}
\]
where J_c is the Jacobian matrix of the partial derivatives of the corner parameters (x_c, y_c) with respect to (r_j, α_j) and (r_{j+1}, α_{j+1})
\[
J_c = \begin{bmatrix}
\dfrac{\partial x_c}{\partial r_j} & \dfrac{\partial x_c}{\partial \alpha_j} & \dfrac{\partial x_c}{\partial r_{j+1}} & \dfrac{\partial x_c}{\partial \alpha_{j+1}} \\[2mm]
\dfrac{\partial y_c}{\partial r_j} & \dfrac{\partial y_c}{\partial \alpha_j} & \dfrac{\partial y_c}{\partial r_{j+1}} & \dfrac{\partial y_c}{\partial \alpha_{j+1}}
\end{bmatrix}
\]
and C_{r_j α_j r_{j+1} α_{j+1}} is the covariance matrix associated with the parameters of both lines.
2) CWLF for Corner Extraction: Once the parameters of the straight lines and their uncertainties have been obtained, the corners are calculated as the intersection point of two consecutive lines. Depending on the type of regression used to define each of the lines, different estimators are used. The three possible alternatives are presented hereafter.
Case a) Both lines are derived from the y = m(x − x̄) + b regression. The corner is
\[
x_c = \frac{b_{j+1} - b_j + m_j \bar{x}_j - m_{j+1}\bar{x}_{j+1}}{m_j - m_{j+1}}, \qquad y_c = m_j\,(x_c - \bar{x}_j) + b_j.
\]
The corner covariance matrix C_ca, function of the (m_j, b_j) and (m_{j+1}, b_{j+1}) parameters, is given by
\[
C_{ca} = J_{ra}\, C_{m_j b_j m_{j+1} b_{j+1}}\, J_{ra}^{\mathsf{T}}
\]
where J_ra is the Jacobian matrix of the partial derivatives of the corner parameters (x_c, y_c) with respect to (m_j, b_j) and (m_{j+1}, b_{j+1}) and C_{m_j b_j m_{j+1} b_{j+1}} is the diagonal matrix with the variances associated to the line parameters.
Case b) Both lines are derived from the x = s(y − ȳ) + h regression. By symmetry with case a), the covariance matrix C_cb for the (x_c, y_c) point, function of (s_j, h_j) and (s_{j+1}, h_{j+1}), is given by
\[
C_{cb} = J_{rb}\, C_{s_j h_j s_{j+1} h_{j+1}}\, J_{rb}^{\mathsf{T}}
\]
where J_rb is the Jacobian matrix of the corner parameters (x_c, y_c) and C_{s_j h_j s_{j+1} h_{j+1}} is the diagonal matrix with the variances associated to the line parameters.
Case c) One straight line comes from the x regression and the other from the y regression. The covariance matrix C_cc for (x_c, y_c), function of (m, b) and (s, h), is given by
\[
C_{cc} = J_{rc}\, C_{mbsh}\, J_{rc}^{\mathsf{T}}
\]
where J_rc is the Jacobian matrix of the corner parameters (x_c, y_c) and C_{mbsh} is the diagonal matrix with the variances associated to the line parameters.
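The first alternative (both lines from the y-on-x regression) can be sketched as follows. This is illustrative code only: the Jacobian is evaluated numerically here for brevity, whereas the analytical partial derivatives would normally be used.

```python
import numpy as np

def corner_case_a(mj, bj, xbarj, mj1, bj1, xbarj1, variances):
    """Intersect two lines y = m*(x - xbar) + b and propagate their
    parameter variances (V_mj, V_bj, V_mj1, V_bj1) to the corner."""
    def f(p):
        m1, b1, m2, b2 = p
        x = (b2 - b1 + m1 * xbarj - m2 * xbarj1) / (m1 - m2)
        return np.array([x, m1 * (x - xbarj) + b1])

    p0 = np.array([mj, bj, mj1, bj1], dtype=float)
    xc, yc = f(p0)
    # Numerical Jacobian of (xc, yc) w.r.t. (mj, bj, mj+1, bj+1)
    # by central differences.
    eps = 1e-6
    J = np.column_stack([(f(p0 + eps * e) - f(p0 - eps * e)) / (2 * eps)
                         for e in np.eye(4)])
    C = J @ np.diag(variances) @ J.T   # 2x2 corner covariance
    return xc, yc, C
```

The same pattern covers cases b) and c) by replacing the intersection function and the parameter vector.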

V. EXPERIMENTAL RESULTS
The algorithms proposed in Section IV, Arras-Siegwart and CWLF, have been executed on a laptop computer equipped with an Intel Core i7-8565U CPU and 8 GB of RAM. A SLAMTEC RPLIDAR S1 laser rangefinder has been used for the extraction of raw data from the environments. This low-cost device allows 360° laser scanning within a 40-m range with 5 cm of range uncertainty. Based on the TOF principle, the sensor captures distance and angle data approximately every 0.391°. The graphical representation of these data yields a polar plot of the mapped profile, centered on the sensor (see Fig. 2). The scanning frequency of the sensor is variable. The measurements used in our experimentation phase were acquired at a rate of 10 Hz.
The LiDAR profile is segmented into straight sections in order to extract the features that define the environment. As already mentioned, the IEPF method has been adopted for this purpose [12]. It is a particular application of the classic split-and-merge algorithm. The algorithm defines a line passing through the first and last points of the obtained point cloud. The distance of each point to the line is calculated. Should it exceed a certain threshold value, the most distant point is defined as a breaking point. This divides the data into two sets, in which the previous steps are performed recursively. The process ends when no more breaking points can be generated [12]. We have fixed the IEPF threshold

TABLE I CORNER UNCERTAINTIES OF ENVIRONMENT A IN MM: CWLF AND ARRAS-SIEGWART
distance at 50 mm for our experiments. The fitting is done according to the Arras-Siegwart and CWLF methods, described in Section IV. Fig. 2 shows the indoor environments mapped with the RPLiDAR S1 laser rangefinder. These environments are characterized by a large number of straight sections, obstacles, and hidden areas that prevent the profiles from being fully visible. We have tested the behavior of both methods in environments of different dimensions by analyzing two profiles. The largest one, which we have termed environment A, is approximately 9 m × 8 m in size. Environment B, which is comparatively smaller, is 3.5 m × 6 m. In environments of this size, one laser sweep yields approximately 1000 measurements. This amount of data must be reduced in order to process and store it efficiently. First, we apply the IEPF segmentation method to split raw data into straight sections. Afterwards, we estimate the parameters and the uncertainties of each straight section. To this end, both feature extraction methods (Arras-Siegwart and CWLF) are employed. The final result is environment A being characterized by 23 real corners and 36 straight sections, while environment B features 15 real corners and 18 straight sections, as shown in Fig. 2.
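The IEPF segmentation described above can be sketched as follows (a minimal recursive version for 2-D Cartesian points; the threshold is expressed in the same units as the coordinates):

```python
import numpy as np

def iepf(points, threshold=50.0):
    """Iterative end point fit: recursively split a point cloud at the
    point farthest from the chord joining its first and last points."""
    points = np.asarray(points, dtype=float)
    if len(points) <= 2:
        return [points]
    p0, p1 = points[0], points[-1]
    chord = p1 - p0
    # Perpendicular distance of every point to the chord (2-D cross product).
    rel = points - p0
    d = np.abs(chord[0] * rel[:, 1] - chord[1] * rel[:, 0]) / np.linalg.norm(chord)
    i = int(np.argmax(d))
    if d[i] <= threshold:
        return [points]           # no breaking point: a single straight section
    # Split at the breaking point and recurse on both halves.
    return iepf(points[:i + 1], threshold) + iepf(points[i:], threshold)
```

With the 50-mm threshold used in the experiments, a sweep of roughly 1000 points is reduced to a few tens of straight sections.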
Both methodologies present qualitatively equivalent results in terms of corners and straight sections. However, uncertainty levels and computational times differ significantly. Evidence of the former can be found in Tables I and II and Fig. 3. Save one outlier in each environment, the corner uncertainties obtained from CWLF are lower than their Arras-Siegwart counterparts. Regarding the latter, the comparative computational efficiency of the Arras-Siegwart and CWLF methods is analyzed hereafter as a function of the number of measurements involved in the calculations.
The corners selected for assessment are 20 (environment A) and 4 (environment B). The first one is a nonorthogonal corner, defined by a 130° angle, while 4 is an orthogonal corner, very common in indoor environments. Fig. 3 further expands on the computational efficiency difference for corner 20 of environment A. The behavior of the remaining corners, in both environments, is analogous. The blue bars represent the processing time required using the Arras-Siegwart methodology; the black bars represent the same magnitude using CWLF. The red line shows the ratio between the two. Computational times for CWLF are systematically lower, remaining in the order of μs. In the case of the Arras-Siegwart method, the order of ms is reached when 90 or more measurements are involved. Figs. 4 and 5 show a global comparison of the methods in terms of computational times and uncertainties. For each method, the two aforementioned corners have been studied at measurement dataset sizes equal to 50 and 100. In Arras-Siegwart, the uncertainty in both axes (σ_xc, σ_yc) increases with the distance of the observed points to the foot of the perpendicular from the origin. This relation is not present in CWLF, resulting in a reduction of uncertainty levels. This decrease corresponds to approximately 30% on the x-axis and 15% on the y-axis for environment A, whereas in environment B the decrease is about 78% on the x-axis and 68% on the y-axis. Furthermore, the inverse relation between the number of measurements and the uncertainty is easily verifiable.
The computational times associated with corner extraction include four computational steps: extraction of the straight lines that define the corner, estimation of their uncertainties, extraction of the corner parameters, and estimation of their uncertainties. The distribution for each method is shown in Figs. 4 and 5, which represent the uncertainties σ_xc and σ_yc, as introduced in Section IV, and the computational times, separating the four steps necessary to obtain the corners and their uncertainties. The main difference between both corners lies in time allocation. The nonorthogonal corner 20 requires more time than the orthogonal corner 4 for the extraction of the corner parameters. On the other hand, the Arras-Siegwart algorithm consumes 70% of its computational time in calculating the line uncertainty, whereas the CWLF algorithm spends 60% on the extraction of line parameters. Overall, CWLF significantly reduces the feature extraction times of a profile and the uncertainties associated with its keypoints.

VI. CONCLUSION
In this article, we have presented a new LiDAR-based 2-D mapping methodology. The outcome of this article revolves around two major aspects: uncertainty and computational time.
The introduction of measurement weighting factors based on a bivariate distribution has successfully led to uncertainty levels in the order of those of the instrument. Considering the same starting dataset, uncertainty is consistently reduced with reference to similar SLAM methodologies.
The ability to perform regression on both coordinate axes precludes the risk of numerical instabilities associated with extreme geometries. Similarly, assuming correlation between axial uncertainties adds to the overall uncertainty realism of the methodology.
The main contribution of CWLF resides in its capability to manage large volumes of raw data without significantly increasing computational times. This indirectly conditions the quality of the resulting map: uncertainty levels become increasingly determined by sensor error, rather than by available resources. Indeed, for a given computational timestep, CWLF can process a larger number of measurements, thus featuring lower uncertainties.
Future lines of work have the potential to further increase the efficiency of this methodology by reducing repetitive steps or exploring specific computational implementations. For instance, while the method successfully avoids numerical overflow in the vicinity of π/2 by choosing the regression axis, it requires a prior analysis of the point cloud density parameters. This limitation will be addressed in future research by exploring an alternative solution that does not require preanalysis.

Fig. 1. This error ellipse represents the constant probability density contour of a single point of the environment, associated with the laser rangefinder uncertainties σ_ρ and σ_θ.

Fig. 2. Indoor environments acquired with the RPLiDAR S1 sensor. The red star corresponds to the laser rangefinder position. The black data show the mapped environment, while corners are represented by blue circles. Relevant corners have been numbered according to the clockwise direction of the data acquisition.

Fig. 3. CWLF and Arras-Siegwart computational times, plotted as a function of the number of measurements employed for the extraction of corner 20 (environment A). The red line reflects the computational cost of the CWLF method with respect to the weighted Arras-Siegwart method. Note that the time axis is logarithmic and measured in μs.

Fig. 4. Comparative graph of the uncertainties and computational times of Arras-Siegwart and CWLF for the extraction of corner 20 of environment A. Both methods are benchmarked at 50 and 100 available measurements. Note that the scale is logarithmic.

Fig. 5. Comparative graph of the uncertainties and computational times of Arras-Siegwart and CWLF for the extraction of corner 4 of environment B. Both methods are benchmarked at 50 and 100 available measurements. Note that the scale is logarithmic.