A Local Feature Descriptor Based on Rotational Volume for Pairwise Registration of Point Clouds

Pairwise registration of point clouds suffers from several problems: keypoints are difficult to describe accurately, corresponding points are difficult to match reliably, and convergence is slow when the initial transformation matrix is uncertain. To address these problems, we propose a novel feature descriptor based on the ratio of rotational volumes to describe keypoints effectively, and on the basis of this descriptor we propose an improved coarse-to-fine registration pipeline for point clouds, in which coarse registration yields a good initial transformation matrix and fine registration based on a modified ICP algorithm then yields an accurate transformation matrix. Experimental results show that our proposed feature descriptor is robust to rotation, noise, scale and varying mesh resolution, requires less storage space and runs faster than the PFH, FPFH, SHOT and RoPS descriptors, and that our improved pairwise registration pipeline effectively solves the above problems.


I. INTRODUCTION AND RELATED WORK
With the development of laser scanning technology, the capacity to capture 3D spatial data has been greatly enhanced. However, limited by the field of view of the scanning device, each scan can only capture a partial point cloud of a 3D object. In order to obtain the complete 3D object, it is necessary to use registration technology to align the partial point clouds into a global coordinate framework. The core task of registering partial point clouds is to find the corresponding position and orientation of a pair of point clouds in a global coordinate framework, which is called pairwise registration [1]. At present, the most popular pairwise registration method is the Iterative Closest Point (ICP) algorithm [2], [3]. ICP searches for the nearest point pairs between a pair of point clouds, namely a source point cloud and a target point cloud, estimates the rigid-body transformation, and then applies it to the source point cloud. The search-and-transform process is performed iteratively until convergence is reached.
The associate editor coordinating the review of this manuscript and approving it for publication was Yongming Li .
Because the input point clouds might only partially overlap, complete point-to-point correspondences are often hard to find. Many efforts have been devoted to feature selection [4], which can improve the correspondence problem. Robust feature descriptors such as integral volume descriptors [5], moment invariants [6] and spherical harmonic invariants [7] have been proposed and used for registration of pairwise point clouds. These feature descriptors are invariant to translation and rotation, but are still sensitive to noise [8], so it is hard to find correct correspondences with them. In recent years, local feature descriptors have also been put forward, such as the point feature histogram (PFH) [9], rotational projection statistics (RoPS) [10]-[13], signature of histograms of orientations (SHOT) [14], multi-attribute statistics histograms (MaSH) [15], local feature statistic histogram (LFSH) [16], the binary shape context descriptor [17], [18], the voxel-based buffer-weighted binary descriptor [19], a 3D descriptor with global structural frames and local signatures of histograms [20], the signature of geometric centroids descriptor [21], etc. These local feature descriptors improve the search for correct point-to-point correspondences in noisy point clouds and outperform global feature descriptors in pairwise registration of point clouds. Moreover, there are also methods generating descriptors by machine learning, such as [22]-[25]. However, problems such as time efficiency and storage cost remain. Therefore, finding a feature descriptor that is robust to nuisances and highly efficient is still a great challenge for researchers studying pairwise registration of point clouds.
The contributions of this paper are summarized as follows:
• A feature descriptor based on rotational volume is presented, which simultaneously achieves high descriptiveness, robustness and time efficiency for the purpose of local surface representation and surface matching.
• A boundary point detection algorithm is presented, which can effectively obtain points on the boundary; based on this algorithm, we can effectively remove keypoints near the boundary, which enhances the descriptiveness of the remaining keypoints.
• An improved rigid transformation estimation via SAC is presented, which obtains a transformation estimate from a single pair of keypoints and their local reference frames, and selects the optimal transformation estimate according to the ratio of inliers.
• An improved coarse-to-fine pairwise registration pipeline for point clouds is proposed, which includes keypoint detection with boundary removal, feature description and matching, transformation estimation, and fine registration.
The remainder of this paper is organized as follows. In Section 2, we describe the process of our proposed feature descriptor based on rotational volume in detail. In Section 3, we describe our improved coarse-to-fine pairwise registration of point clouds, especially including a boundary point detection algorithm and a robust rigid transformation estimation via SAC. Section 4 discusses the experimental results about our proposed feature descriptor, keypoint detection and pairwise registration before concluding in Section 5.

II. A LOCAL FEATURE DESCRIPTOR BASED ON ROTATIONAL VOLUME
Local feature-based methods for pairwise registration align point clouds by using consistent point-to-point correspondences, which are usually generated by matching local feature descriptors between the two point clouds. Therefore, a local feature descriptor should be robust and distinctive under various nuisances such as noise, scale and varying point cloud resolution, in order to obtain sufficient correct point-to-point correspondences. Moreover, a local feature descriptor should be invariant to rigid transformation, including translation and rotation. In this section, a fast and robust local feature descriptor based on rotational volume is first introduced, and then the similarity measure of the feature descriptor is given.
In this section, we first generate a right-angled trapezoid by sequentially connecting two points within the neighborhood of a query point and their projections onto a special plane, the tangent plane of the query point's neighborhood. Then, we compute the volume of the shape generated by rotating this trapezoid around a special axis, which is the normal of that plane and is parallel to a right-angled side of the trapezoid. We name this volume the rotational volume, since it is obtained by rotation. Given a query point, its feature descriptor based on rotational volume describes the accumulation of the rotational volumes of the query point's neighbors distributed over different regions. For each point in a point cloud, its local surface, formed by its neighboring points, has subtle fluctuations that differ from other local surfaces, which makes its rotational-volume feature descriptor different from other points' descriptors. At the same time, the local surface of a point does not change under rotation and translation of the point cloud, which makes its feature descriptor invariant to rotation and translation. Moreover, the geometric structure of a point's local surface changes only slightly under nuisances such as noise, scale and varying point cloud resolution, which makes its feature descriptor robust to these nuisances. The rest of this section introduces how the feature descriptor of a query point is generated and how the similarity between our proposed descriptors is measured.

1) SELECTION OF NEIGHBORHOOD AND GENERATION OF LOCAL REFERENCE FRAME
Assume that P is the input point cloud, containing N points {p_1, p_2, ..., p_N}. Given a query point p_i (i ∈ [1, N]), its neighborhood is a sphere of radius r centered at p_i, and all the points in this neighborhood except p_i are called the neighboring points, or neighbors, of the query point. The sphere of p_i is named S_i, and the set of neighbors is named Nbhd(p_i). Moreover, a KD tree is used in this paper to accelerate the search for the neighbors of a query point.
For a query point p_i, we denote its neighbors by Nbhd(p_i) = {p_i^j | j = 0, 1, ..., k − 1}, and with those neighbors we define a local reference frame LRF_i centered at the query point p_i by eigen-analysis of a covariance matrix, as follows.
(1) Compute a weight for each neighbor of p_i that is inversely related to the local point density:

w_i^j = 1 / |Nbhd(p_i^j)|,

in which |Nbhd(p_i^j)| is the number of points in the support of the neighbor p_i^j. This weight is used to compensate for uneven sampling of the neighboring points, so that the neighboring points in sparsely sampled regions contribute more than the neighboring points in densely sampled regions.
(2) Compute a weighted covariance matrix cov(p_i) for p_i using its neighbors. The 3×3 covariance matrix is given by

cov(p_i) = Σ_j w_i^j (p_i^j − p_i)(p_i^j − p_i)^T / Σ_j w_i^j.

(3) Perform an eigenvalue decomposition of cov(p_i), obtaining its eigenvalues {λ_i^1, λ_i^2, λ_i^3} in order of decreasing magnitude and their corresponding eigenvectors {e_i^1, e_i^2, e_i^3}, which define repeatable, orthogonal directions in the presence of noise and clutter. It is worth noting that, compared to [26] and [27], the third eigenvector e_i^3 no longer represents the normal direction of the total least squares estimation and is sometimes clearly different from it. This does not affect performance, since in the case of local feature descriptors what matters is a highly repeatable and robust triplet of orthogonal directions, not its geometrical or topological meaning.
(4) Use p_i as the origin, and use e_i^1, e_i^3 and their cross product as the u, w and v axes respectively, to define a local reference frame LRF_i at p_i. Since an eigenvector of the covariance matrix encodes a direction in 3D space based on the amount of point-position variation, its orientation has a 180° ambiguity. Therefore, there are two choices for the orientation of the u and w axes. We determine the sign of each axis according to the principle that the sign of a local axis should be coherent with the majority of the vectors formed by p_i and its neighbors. In the following, we refer to the eigenvectors e_i^1 and e_i^3 as the u+ and w+ axes, and denote their opposite vectors by u− and w−. The criterion that decides whether the vector formed by p_i and one of its neighbors p_i^j is coherent with the u+ axis is

Sum(p_i^j) = 1 if (p_i^j − p_i) · u+ ≥ 0, and Sum(p_i^j) = −1 otherwise,

in which the vector is coherent with the u+ axis when Sum(p_i^j) equals 1, and coherent with the u− axis when Sum(p_i^j) equals −1. The u axis is therefore given by

u = u+ if Σ_j Sum(p_i^j) ≥ 0, and u = u− otherwise.

The same procedure is used to disambiguate the w axis. Finally, the v axis is obtained as w × u. So far, the local reference frame LRF_i(p_i, u, v, w) has been constructed.
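The LRF construction above (weighted covariance, eigen-decomposition, and sign disambiguation) can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the function name is hypothetical, and uniform weights are assumed when per-neighbor support sizes are not supplied.

```python
import numpy as np

def local_reference_frame(p_i, neighbors, counts=None):
    """Build a sign-disambiguated local reference frame at query point p_i.

    neighbors: (k, 3) array of neighboring points.
    counts: optional per-neighbor support sizes used as inverse weights
            (uniform weights if omitted -- an assumption of this sketch).
    """
    d = neighbors - p_i                      # vectors p_i -> neighbors
    wt = 1.0 / (np.ones(len(d)) if counts is None else np.asarray(counts, float))
    cov = (wt[:, None] * d).T @ d / wt.sum() # weighted covariance matrix
    eigval, eigvec = np.linalg.eigh(cov)     # eigenvalues in ascending order
    u = eigvec[:, 2]                         # e1: largest eigenvalue
    w_axis = eigvec[:, 0]                    # e3: smallest eigenvalue
    # 180-degree sign disambiguation: agree with the majority of p_i->neighbor vectors
    if np.sum(np.sign(d @ u)) < 0:
        u = -u
    if np.sum(np.sign(d @ w_axis)) < 0:
        w_axis = -w_axis
    v = np.cross(w_axis, u)                  # right-handed completion: v = w x u
    return u, v, w_axis
```

Because u and w are eigenvectors of a symmetric matrix, the returned triplet is orthonormal by construction regardless of the sign flips.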

2) SPACE PARTITIONING OF NEIGHBORHOOD
Taking the intersection p_i^north between the positive w direction and S_i as the north pole, the intersection p_i^south between the negative w direction and S_i as the south pole, and w as the middle axis, we divide S_i into m uniform regions along the longitude of S_i. We denote a region as Region_i^k (k ∈ [0, m − 1]) and the set of all Region_i^k as Region_i. In order to determine the neighboring points included in each Region_i^k, we project Region_i onto a plane that is parallel to the uv plane and passes through the point p_i^south. We denote the projection of S_i as C_Si, which is a circle, the projection of Region_i^k on C_Si as Sector_i^k (k ∈ [0, m − 1]), which is a sector, and the set of all Sector_i^k as Sector_i, in which each Sector_i^k corresponds to a Region_i^k. The space partitioning of the query point's neighborhood is shown in Fig. 1. Then, for each neighboring point, we need to determine its corresponding Region_i^k. This problem is hard to solve in 3D space, so we project each neighboring point into 2D space and determine the Sector_i^k that its projection falls in; the corresponding Region_i^k of each neighboring point then follows from the correspondence between Sector_i^k and Region_i^k. The detailed process is as follows:
(1) For a neighbor p_i^j in Nbhd(p_i), project it onto the uv plane according to

q_i^j = p_i^j − ((p_i^j − p_i) · w) * w,

in which q_i^j is the projection point of p_i^j on the uv plane, and * denotes multiplication between a scalar and a vector.
(2) Connect p_i with q_i^j and compute the angle α_i^j between the vector from p_i to q_i^j and the u axis according to

α_i^j = atan2(v · (q_i^j − p_i), u · (q_i^j − p_i)),

mapped into [0°, 360°); the schematic diagram is shown in Fig. 2.
(3) Determine the sector that q_i^j lies in according to the value of α_i^j. We divide S_i into m uniform regions, so C_Si, the projection of S_i, is also divided into m uniform sectors, and the central angle of each Sector_i^k is 360/m, which we denote as β. The index of the sector that q_i^j lies in can then be defined as

n_i^j = ⌊α_i^j / β⌋,

in which n_i^j is the sector index in Sector_i for the neighbor p_i^j in Nbhd(p_i).
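The three steps above (projection onto the uv plane, angle computation, sector lookup) can be sketched as follows. The function name is hypothetical; m = 24 matches the parameter setting used later in the experiments.

```python
import numpy as np

def sector_index(p_i, u, v, w, neighbor, m=24):
    """Assign a neighbor to one of m longitude sectors around query point p_i.

    u, v, w are the unit axes of the local reference frame at p_i.
    """
    d = neighbor - p_i
    q = neighbor - np.dot(d, w) * w          # step (1): project onto the uv plane
    pq = q - p_i                             # in-plane vector from p_i to q
    # step (2): angle against the u axis, mapped into [0, 360)
    alpha = np.degrees(np.arctan2(np.dot(pq, v), np.dot(pq, u))) % 360.0
    beta = 360.0 / m                         # central angle of one sector
    return int(alpha // beta)                # step (3): sector index
```

For example, with the frame axes aligned to the world axes, a neighbor lying along +u falls in sector 0, and one lying along +v falls in sector m/4.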

3) COMPUTATION OF ROTATIONAL VOLUME AND GENERATION OF FEATURE DESCRIPTOR
By space partitioning of the neighborhood, we can determine the neighboring points included in each Region_i^k; we can then calculate the sum of rotational volumes for each Region_i^k, the m sums form a vector, and finally we obtain the feature descriptor of the query point p_i by normalizing this vector. Therefore, the core of generating a feature descriptor for a query point is calculating the sum of rotational volumes for each Region_i^k, and the process is detailed as follows.
(1) Suppose a Region_i^k (k ∈ [0, m − 1]) includes the neighboring points Nbhd(p_i^k) = {p_i^k1, p_i^k2, ..., p_i^kt}, ordered by their distance to the w axis, and assign zero to V_i^k, which denotes the sum of rotational volumes in Region_i^k.
(2) Take out two adjacent points in order from Nbhd(p_i^k) and form t − 1 sequences such as <p_i^k1, p_i^k2>, <p_i^k2, p_i^k3>, ..., <p_i^k(t−1), p_i^kt>.
(3) For a sequence <p_i^k1, p_i^k2>, connect p_i^k1, p_i^k2 and their projection points to form a right-angled trapezoid T_i^k1k2, in which h_i^k1 denotes the distance between p_i^k1 and q_i^k1, h_i^k2 denotes the distance between p_i^k2 and q_i^k2, r_i^k1 denotes the radius of the circle through q_i^k1 centered at p_i^south, and r_i^k2 denotes the radius of the circle through q_i^k2 centered at p_i^south. The right-angled trapezoid T_i^k1k2 is rotated 360 degrees around the w axis and generates a new shape that consists of a solid frustum, a solid cylinder and a hollow cylinder, as shown in Fig. 4. Assuming r_i^k1 ≤ r_i^k2 and h_i^k1 ≥ h_i^k2 (the symmetric case is analogous), the volume V_i^tmp of this shape can be calculated as

V_i^tmp = π h_i^k2 ((r_i^k2)² − (r_i^k1)²) + (π/3)(h_i^k1 − h_i^k2)((r_i^k1)² + r_i^k1 r_i^k2 + (r_i^k2)²) − π (r_i^k1)² (h_i^k1 − h_i^k2),

in which the three terms correspond to the hollow cylinder, the solid frustum and the solid cylinder respectively.
(4) Add V_i^tmp to V_i^k, then take the next sequence from Nbhd(p_i^k) and execute step (3) until all the sequences have been handled.
(5) Calculate V_i^k for each Region_i^k according to the above steps and form an m-dimensional vector v_ds.
(6) Normalize v_ds by dividing each of its elements by the 1-norm of v_ds. The new vector is the feature descriptor of the query point p_i.
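The steps above can be sketched as follows. The closed form used in `rotational_volume` is algebraically equivalent to the frustum-plus-cylinder decomposition (it reduces to π h (r2² − r1²) for a tube and to the cone volume when r1 = 0 and h2 = 0); the function names and the (radius, height) input format are assumptions of this sketch.

```python
import numpy as np

def rotational_volume(r1, h1, r2, h2):
    """Volume of the solid obtained by rotating the right-angled trapezoid
    with height h1 at radius r1 and height h2 at radius r2 around the w axis."""
    if r1 > r2:                              # keep r1 <= r2; heights follow their radii
        r1, r2, h1, h2 = r2, r1, h2, h1
    return (np.pi * h1 * (r2**2 - r1**2)
            + np.pi / 3.0 * (h2 - h1) * (r2 - r1) * (r1 + 2.0 * r2))

def rv_descriptor(region_points, m):
    """Assemble the m-bin descriptor: region_points[k] is a list of (r, h)
    pairs (radius from the w axis, height above the south-pole plane) for
    the neighbors falling in Region k, ordered by radius."""
    v = np.zeros(m)
    for k, pts in enumerate(region_points):
        for (ra, ha), (rb, hb) in zip(pts[:-1], pts[1:]):  # adjacent pairs
            v[k] += rotational_volume(ra, ha, rb, hb)      # accumulate V_tmp
    n = np.abs(v).sum()                      # 1-norm normalization
    return v / n if n > 0 else v
```

Regions containing fewer than two neighbors simply contribute a zero bin, which the 1-norm normalization leaves untouched.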

4) SIMILARITY MEASURE OF FEATURE DESCRIPTOR
What rotational volume describes is the ratio of volumes, where each volume is generated by rotating a right-angled trapezoid formed by two points and their projected points. The neighborhood of a query point is divided into m regions, and the neighboring points in each region generate a rotational volume. When the geometric structure of a query point differs from that of a target point, their rotational volumes in each region differ, so the feature descriptors of the two points differ. When the geometric structures are the same, their rotational volumes are the same in each region, so the feature descriptors are the same. When the geometric structures are similar, their rotational volumes differ in some regions and are similar in others, so the feature descriptors are similar. The geometric structures of two points determine whether their rotational-volume feature descriptors are similar, which is the principle of exclusivity for two feature descriptors generated by rotational volume.
The similarity between two feature descriptors based on rotational volume can be described by the distance between them: the smaller the distance, the more similar they are. Supposing two feature descriptors D_1 = {V_1^1, V_2^1, ..., V_m^1} and D_2 = {V_1^2, V_2^2, ..., V_m^2}, their similarity can be defined as the L2 distance

dis(D_1, D_2) = sqrt(Σ_{k=1}^{m} (V_k^1 − V_k^2)²).

III. COARSE-TO-FINE PAIRWISE REGISTRATION OF POINT CLOUDS
Based on our proposed feature descriptor, an improved coarse-to-fine method for pairwise registration of point clouds is described in this section. It consists of five major modules: keypoint detection with boundary removal, feature descriptor generation, feature matching, robust transformation estimation, and fine registration. Fig. 5 shows the pairwise registration pipeline.

A. KEYPOINT DETECTION WITH BOUNDARY REMOVED
Based on the ISS detector, keypoint detection with boundary removal removes keypoints on the boundary from the keypoint set. Fig. 6 shows the pipeline of keypoint detection.

1) ISS DETECTOR
The ISS detector is based on the eigenvalue decomposition of the scatter matrix Σ(p) of the neighboring points centered at p, i.e.

Σ(p) = (1/|N(p)|) Σ_{p_j ∈ N(p)} (p_j − p)(p_j − p)^T,

in which p_j is a neighboring point of p. Given Σ(p), its eigenvalues in order of decreasing magnitude are denoted λ_1, λ_2, λ_3, and only points whose ratios between successive eigenvalues are below given thresholds are retained, i.e.

λ_2(p)/λ_1(p) < γ_21 and λ_3(p)/λ_2(p) < γ_32.

The rationale is to avoid detecting keypoints at points exhibiting a similar spread along the principal directions, where a repeatable canonical reference frame cannot be established. Among the remaining points, a keypoint is a point whose smallest eigenvalue is a local minimum or maximum among its neighborhood, i.e.

λ_3(p) > λ_3(p_j) for all p_j ∈ N(p), or λ_3(p) < λ_3(p_j) for all p_j ∈ N(p),

in which λ_3(p_j) is the smallest eigenvalue at a neighboring point p_j of p.
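A minimal sketch of the ISS saliency test described above, with illustrative threshold defaults (the paper does not specify its γ_21, γ_32 values here) and SciPy's `cKDTree` standing in for PCL's neighborhood search:

```python
import numpy as np
from scipy.spatial import cKDTree

def iss_keypoints(points, radius, gamma21=0.975, gamma32=0.975):
    """Eigenvalue-ratio pruning plus a local-extremum test on lambda_3."""
    tree = cKDTree(points)
    lam = np.full((len(points), 3), np.nan)
    for i, p in enumerate(points):
        nbrs = points[tree.query_ball_point(p, radius)]
        if len(nbrs) < 5:
            continue                          # too few neighbors to be stable
        d = nbrs - p
        cov = d.T @ d / len(d)                # scatter matrix centered at p
        lam[i] = np.sort(np.linalg.eigvalsh(cov))[::-1]  # descending order
    keep = []
    for i, p in enumerate(points):
        l1, l2, l3 = lam[i]
        if np.isnan(l1) or l2 / l1 >= gamma21 or l3 / l2 >= gamma32:
            continue                          # spread too similar: no repeatable frame
        idx = np.array([j for j in tree.query_ball_point(p, radius) if j != i],
                       dtype=int)
        others = lam[idx, 2]
        others = others[~np.isnan(others)]
        # keypoint if lambda_3 is a strict local extremum in the neighborhood
        if len(others) and (l3 > others.max() or l3 < others.min()):
            keep.append(i)
    return keep
```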

2) BOUNDARY POINT DETECTION ALGORITHM
Mian et al. [28] proposed a boundary detection method based on the observation that the number of points in the neighborhood of a boundary point is much lower than that of a non-boundary point. The method accords with people's intuition about boundary points, but in practice its effect is not good; moreover, the average number of neighboring points and the number of neighboring points of each point need to be calculated in advance, which is very time-consuming and not robust to noise. The boundary point detection in this paper is based on the fact that most neighboring points of a boundary point are distributed on one side of it, as shown in Fig. 7. Accordingly, if a point is a boundary point, most of its neighboring points are distributed on one side, which leaves a notch on the other side of the boundary point. Therefore, when these neighboring points are projected onto the tangent plane of the boundary point, there will also be a notch among the directions from the point to the projection points. Detecting the angle of the notch determines whether the point is a boundary point or not. The process of determining whether a point p in a point cloud P is a boundary point is as follows:
(1) Determine p's neighboring points within radius r and name them N(p), i.e.

N(p) = {p_j ∈ P | ‖p_j − p‖ ≤ r, p_j ≠ p}.
(2) Project N(p) onto the tangent plane that passes through p and whose normal is n, with

q_j = p_j − ((p_j − p) · n) n,

and name these projection points N(q).
(3) Find the nearest point to p and name it q_u, and then construct a local reference frame (p, u, v, w) whose w axis is p's normal n, u axis is the unit vector from p toward q_u, v axis is u × w, and origin is p.
(4) For each projection point q_j in N(q), compute the angle α_j between the vector from p to q_j and the u axis on the tangent plane.
(5) Sort these angles in ascending order and compute the included angle L_i between each pair of adjacent vectors, including the wrap-around gap between the largest and smallest angles, in which L_i describes the included angle between two adjacent vectors on the tangent plane, as shown in Fig. 8.
(6) Find the maximum angle L_max and determine that p is a boundary point if L_max > L_th. Here, L_th is an adjustable threshold.
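The notch test above can be sketched as follows; the function name and the default threshold are illustrative assumptions of this sketch, not the paper's L_th value.

```python
import numpy as np

def is_boundary_point(p, normal, neighbors, angle_threshold_deg=90.0):
    """Project the neighbors of p onto its tangent plane, sort them by
    angle, and flag p as a boundary point when the largest angular gap
    (the 'notch') exceeds the threshold."""
    n = normal / np.linalg.norm(normal)
    d = neighbors - p
    q = d - np.outer(d @ n, n)               # in-plane vectors p -> projections
    # in-plane frame: u points toward the nearest neighbor's projection
    u = q[np.argmin(np.linalg.norm(d, axis=1))]
    u = u / np.linalg.norm(u)
    v = np.cross(u, n)
    ang = np.degrees(np.arctan2(q @ v, q @ u)) % 360.0
    ang = np.sort(ang)
    gaps = np.diff(ang)                      # included angles between adjacent vectors
    wrap = 360.0 - ang[-1] + ang[0]          # gap across the 0-degree line
    return max(np.max(gaps, initial=0.0), wrap) > angle_threshold_deg
```

An interior point surrounded on all sides yields small, roughly uniform gaps, while a point on an edge leaves a single gap near 180 degrees.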

3) REMOVING KEYPOINTS ON THE BOUNDARY
The process of removing keypoints on the boundary is as follows: (1) Traverse all points in a point cloud P and collect all boundary points into a set named Boundaries according to the above boundary point detection algorithm.
(2) Traverse all points in the point cloud P and collect all keypoints into a set named Keypoints according to the above ISS detector.
(3) For each keypoint kp in Keypoints, find its nearest distance d_min to a boundary point in Boundaries and remove kp from Keypoints if d_min < 5mr.
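Step (3) reduces to a nearest-neighbor query against the boundary set; a small sketch with a hypothetical function name:

```python
import numpy as np
from scipy.spatial import cKDTree

def prune_keypoints(keypoints, boundary_points, mr):
    """Drop keypoints closer than 5*mr to any detected boundary point,
    where mr is the mesh resolution."""
    tree = cKDTree(boundary_points)
    d_min, _ = tree.query(keypoints)         # nearest boundary distance per keypoint
    return keypoints[d_min >= 5.0 * mr]
```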

B. FEATURE DESCRIPTOR GENERATION AND FEATURE MATCHING
Let P_s and P_t respectively denote the source and target point cloud. Before feature descriptor generation, we first detect keypoints for P_s and P_t, denoted KPS_s and KPS_t, and then generate a feature descriptor for each keypoint, yielding the descriptor sets DS_s and DS_t. For each descriptor d_i^s in DS_s, we find its nearest descriptor d_j^t in DS_t; by the matched feature pair <d_i^s, d_j^t>, we can find the matched keypoint pair <kp_i^s, kp_j^t>, which is a point-to-point correspondence. A KD tree is used here to accelerate the feature matching process. It is noted that the L2 norm is always employed to measure the similarity between two feature descriptors, as in [14], [29]. Therefore, n correspondences, denoted by a set C, can be obtained between DS_s and DS_t. However, only a part of them is correct, since P_s and P_t are usually only partially overlapped.
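The matching step can be sketched as a KD-tree-accelerated nearest-neighbor search under the L2 norm; the function name is an assumption of this sketch.

```python
import numpy as np
from scipy.spatial import cKDTree

def match_descriptors(ds_s, ds_t):
    """Match each source descriptor to its nearest target descriptor
    under the L2 norm, returning the correspondence set C as (i, j) pairs."""
    tree = cKDTree(ds_t)                     # KD tree over the target descriptors
    _, j = tree.query(ds_s)                  # nearest target index per source descriptor
    return [(i, int(t)) for i, t in enumerate(j)]
```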

C. ROBUST TRANSFORMATION ESTIMATION VIA SAC
So far, a correspondence set C has been established between P_s and P_t. The aim of robust transformation estimation is to estimate an optimal rigid transformation T from C in order to transform P_s to P_t. Popular methods include the RANSAC algorithm [30], [31] and its variants [32], [33]. However, they suffer from weak robustness or high time consumption. Here, we propose a sample consensus (SAC) method to estimate T with higher speed and higher accuracy.
Our proposed SAC method also estimates T from C within a fixed number of iterations. The difference is that our method uses a single correspondence to estimate a rigid transformation instead of three correspondences. Theoretically, one correspondence is insufficient to estimate a rigid transformation between 3D point clouds; however, if a correspondence is oriented with an LRF, it is sufficient [34]. The rigid transformation is estimated as

R = LRF_j^t · (LRF_i^s)^T,
t = kp_j^t − R · kp_i^s,

in which <kp_i^s, kp_j^t> is a keypoint pair, LRF_i^s and LRF_j^t are the 3×3 matrices whose columns are the local reference frame axes of kp_i^s and kp_j^t respectively, and R and t denote the rotation matrix and translation vector of the rigid transformation. It is noted that the LRF of each keypoint does not need to be calculated here, since it has already been calculated during feature descriptor generation; our proposed descriptor based on rotational volume constructs an LRF for each keypoint in advance. In contrast to classical RANSAC, the advantage of using a single correspondence is that the computational complexity is reduced from O(n³) to O(n), where n is the number of correspondences.
By the above method, we can obtain n rigid transformations from n correspondences, and the optimal rigid transformation is the one that maximizes the sample consensus. We use the ratio of inliers to measure the level of sample consensus, so the maximized sample consensus corresponds to the biggest ratio of inliers. In order to calculate the ratio of inliers for a correspondence, we first calculate its rigid transformation T_i by the formulas above; then we transform P_s via T_i and name the transformed point cloud P_s'; and then, for each point p in P_s', we search its nearest point in P_t, and we regard p as an inlier if the distance between p and its nearest point in P_t is less than a threshold. Finally, the ratio of inliers is defined as

rt_inliers = n_inliers / n_s,

in which n_inliers denotes the number of inliers, and n_s denotes the number of points in P_s'. Therefore, we can obtain the optimal rigid transformation T* and apply T* to transform P_s into a transformed point cloud P_s*. At this point, coarse registration of the pairwise point clouds is complete.
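The one-correspondence SAC loop can be sketched as follows. This is an illustrative implementation under the assumption that each LRF is stored as a 3×3 matrix with the frame axes as columns; all names are hypothetical.

```python
import numpy as np
from scipy.spatial import cKDTree

def sac_transform(kps_s, lrfs_s, kps_t, lrfs_t, corrs, cloud_s, cloud_t, tau):
    """Each correspondence (i, j), oriented by its two LRFs, yields one
    candidate rigid transform; keep the candidate with the highest
    inlier ratio under distance threshold tau."""
    tree = cKDTree(cloud_t)
    best, best_ratio = None, -1.0
    for i, j in corrs:
        R = lrfs_t[j] @ lrfs_s[i].T          # rotation aligning the two frames
        t = kps_t[j] - R @ kps_s[i]          # translation from the keypoint pair
        moved = cloud_s @ R.T + t            # transform the whole source cloud
        d, _ = tree.query(moved)             # nearest-target distance per point
        ratio = np.mean(d < tau)             # inlier ratio for this candidate
        if ratio > best_ratio:
            best, best_ratio = (R, t), ratio
    return best, best_ratio
```

Each iteration costs one nearest-neighbor sweep over the source cloud, so the loop over candidates is linear in the number of correspondences.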

D. FINE REGISTRATION
After coarse registration, the pairwise point clouds are already roughly aligned; further refinement uses a recently proposed modified ICP algorithm [35], which exhibits accurate performance even when a good initialization is not reliably available. Via the fine registration, we can merge the input pairwise point clouds into a seamless point cloud.

IV. EXPERIMENTS
In this section, experiments cover the feature descriptor based on rotational volume (Section 4.1), keypoint detection with boundary removal (Section 4.2) and coarse pairwise registration (Section 4.3). In the keypoint detection experiments, we show the detection process: the keypoints detected by the ISS detector, the boundary points detected by our proposed boundary point detection algorithm, and the final keypoints after removing keypoints near the boundary. In the pairwise registration experiments, we show the results of pairwise matching by keypoint detection, feature descriptor generation and feature matching, give the registration error of our proposed pairwise registration, and compare our proposed feature descriptor with state-of-the-art feature descriptors within our improved pairwise registration pipeline.

A. EXPERIMENTS OF FEATURE DESCRIPTOR BASED ON ROTATIONAL VOLUME
The task of this section is to compare our proposed feature descriptor with PFH, FPFH, SHOT and RoPS in terms of run time and robustness to rotation, noise, scale and varying mesh resolution. PFH, FPFH, SHOT and RoPS are implemented in PCL (Point Cloud Library) [36]. In these experiments, the parameter m in our proposed descriptor is set to 24, and the normal vector of a point in the point cloud is computed from its 10 nearest neighboring points.

1) DATASET
In these experiments, we use the Stanford 3D Scanning Repository [37] to test these feature descriptors. The repository contains nine 3D models: Bunny, Drill bit, Happy Buddha, Dragon, Armadillo, Lucy, Asian Dragon, Vellum manuscript and Thai Statue. These models were scanned using a Cyberware 3030 MS scanner, the Stanford Large Statue Scanner or an XYZ RGB auto-synchronized camera with a resolution of 100 microns. However, only three models, namely Armadillo, Bunny and Happy Buddha, are used in our experiments, because these three models were all scanned by a Cyberware 3030 MS scanner and have the same mesh resolution (mr), which is helpful for analyzing and comparing the performance of the feature descriptors. These three models are shown in Fig. 9. It is worth noting that we convert the models from the original ply (Polygon File Format) format to the pcd (point cloud data) format using the CloudCompare software so that they can be used in our algorithm.

2) RUN TIME
Table 1 shows the number of points and the number of keypoints in each model; these keypoints are obtained by ISS (Intrinsic Shape Signatures) [38], which is implemented in PCL. Table 2 shows the dimensions of the five feature descriptors and the run time of generating descriptors for each model's keypoints. It can be seen from Table 2 that the feature descriptor proposed in this paper is low-dimensional and fast, which means that it requires less storage space and has higher computational efficiency. The underlying reason is that our proposed feature descriptor only needs to calculate the local coordinate system and the ratio of rotational volumes over each keypoint's neighborhood. It is noted that in this experiment, PFH is faster than FPFH.
The reason is that the search radius of FPFH is twice that of PFH, which causes FPFH's time cost to increase more quickly than PFH's when the number of points used to calculate descriptors is small. Finally, it should also be noted that the RoPS descriptor cannot be generated directly from a point cloud; we first need to turn the point cloud into a triangle mesh using greedy projection triangulation before generating the RoPS descriptor. Therefore, generating the RoPS descriptor is very time-consuming, and Table 2 clearly shows that its run time is the longest.

3) ROBUSTNESS TO ROTATION
In order to evaluate the robustness of these feature descriptors to rotation, we rotate the models by increasing angles from 0° to 90°. For a feature descriptor robust to rotation, its values do not change or change only slightly. The process of the experiment is as follows: (1) Suppose a model M, taken from the models used in this experiment, and a ground-truth rotation matrix R_gt.
(2) Rotate M according to the rotation matrix R_gt and generate a new model M'.
(3) Calculate keypoints for M and denote them as KPS.
(4) From M', find the corresponding keypoint for each keypoint in KPS according to R_gt and denote these keypoints as KPS'. In this step, we first rotate a keypoint kp in KPS according to R_gt to generate a keypoint kp_tmp, and then we find the nearest point to kp_tmp in M' and denote it as kp'. The point kp' is the corresponding keypoint of kp; the other keypoints in KPS find their corresponding keypoints in KPS' by the same operation.
(5) Calculate the descriptor of each keypoint in KPS and denote the descriptors as DS, and calculate the descriptor of each keypoint in KPS' and denote the descriptors as DS'.
(6) For each descriptor d in DS, find the nearest descriptor d' in DS', and denote d and d' as an ordered pair <d, d'>.
(7) According to <d, d'>, find the corresponding keypoint pair <kp, kp'>, in which kp comes from M and kp' comes from M'.
(8) Judge the correspondence between kp and kp' by checking whether the distance between kp' and R_gt · kp is below a small threshold, which determines whether the descriptor is unchanged or only slightly changed by rotation.
(9) Calculate the accuracy of matched descriptors according to

accuracy = n_acc / n, (26)

in which n_acc denotes the number of matched descriptors, and n denotes the total number of descriptors. The accuracy reflects the robustness of a feature descriptor to rotation: the higher the accuracy, the more robust the descriptor. The average accuracy under different rotation angles for the three models is shown in Fig. 10, which makes clear that our proposed descriptor achieves the best accuracy among all compared feature descriptors and thus has the best robustness to rotation. There are at least two reasons for this. First, our proposed feature descriptor is based on a robust local reference frame that is invariant to rotation. Second, our proposed descriptor describes the ratio of rotational volumes of points in different partitions, and this ratio is also invariant to rotation.
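The evaluation loop of steps (5)-(9) can be sketched as follows; the function name is hypothetical, and the acceptance test is the distance check against the ground-truth rotated position.

```python
import numpy as np
from scipy.spatial import cKDTree

def matching_accuracy(kps, descs, kps_rot, descs_rot, R_gt, eps):
    """Match each descriptor of the original model to its nearest
    descriptor of the rotated model, and count a match as correct when
    the matched keypoint lies within eps of the ground-truth rotated
    position, per accuracy = n_acc / n."""
    d_tree = cKDTree(descs_rot)
    _, nn = d_tree.query(descs)              # nearest descriptor per keypoint
    pred = kps_rot[nn]                       # matched keypoints in the rotated model
    gt = kps @ R_gt.T                        # ground-truth rotated positions
    n_acc = np.sum(np.linalg.norm(pred - gt, axis=1) < eps)
    return n_acc / len(kps)
```

When the descriptors are perfectly rotation-invariant, every keypoint matches its own rotated copy and the accuracy is 1.0.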

4) ROBUSTNESS TO NOISE
In order to evaluate the robustness of these feature descriptors to noise, we add Gaussian noise with increasing standard deviations of 0.1, 0.2 and 0.3 mr to the models and denote the results as scenes, which are shown in Fig. 11. For a feature descriptor robust to noise, its values do not change or change only slightly. The process of the experiment is as follows: (1) Suppose a model M, taken from the models used in this experiment, and a ground-truth matrix Rt_gt, which includes a rotation matrix R_gt and a translation vector t_gt.
(2) Select a corresponding scene S_σ, which comes from the scenes used in this experiment and is generated by adding Gaussian noise with standard deviation σ to M.
(3) Rotate and translate S_σ according to Rt_gt, generating a new scene S'_σ.
(4) Calculate keypoints for M and denote them as KPS.
(5) From S'_σ, find the corresponding keypoint for each keypoint in KPS according to R_gt and t_gt, and denote the keypoints as KPS'. In this step, we first rotate and translate a keypoint kp in KPS according to R_gt and t_gt, generating a keypoint kp_tmp, and then find the nearest point to kp_tmp in S'_σ, denoted kp'. The point kp' is the corresponding keypoint of kp. By the same operation, the other keypoints in KPS can find their corresponding keypoints in KPS'. (6) Calculate the descriptor for each keypoint in KPS and denote the descriptors as DS, and calculate the descriptor for each keypoint in KPS' and denote the descriptors as DS'. (7) For each descriptor d in DS, find the nearest descriptor d' in DS', and denote d and d' as an ordinal pair <d, d'>. (8) According to <d, d'>, find the corresponding keypoint ordinal pair <kp, kp'>, in which kp comes from M and kp' comes from S'_σ. (9) Judge the correspondence between kp and kp' according to the following formula, in which σ denotes the standard deviation of the Gaussian noise between M and S_σ. Formula (27) determines whether a descriptor is unchanged or only slightly changed by noise. (10) Calculate the accuracy of matched descriptors according to formula (26). Here, the accuracy reflects the robustness of a feature descriptor to noise: the higher the accuracy, the more robust the feature descriptor is to noise. The average accuracy under different Gaussian noise levels for the three models is shown in Fig. 12, where it is clear that our proposed descriptor achieves the best accuracy among all feature descriptors, followed by RoPS. Fig. 12 shows that our proposed feature descriptor has the best robustness to noise.
The underlying reason is that our proposed feature descriptor describes the ratio of rotational volume, which encodes the spatial structure of a keypoint and its neighboring points; this structure remains invariant as long as the noise does not change the local geometry of the keypoint's neighborhood.
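Step (2) of the noise experiment can be sketched as follows. The helper name and the use of a seeded NumPy generator are our own illustrative choices.

```python
import numpy as np

def add_gaussian_noise(points, sigma_in_mr, mesh_resolution, rng=None):
    """Generate a noisy scene by adding zero-mean Gaussian noise whose
    standard deviation is a multiple of the mesh resolution
    (0.1, 0.2 or 0.3 mr in our experiments)."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.normal(0.0, sigma_in_mr * mesh_resolution, size=points.shape)
    return points + noise
```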

5) ROBUSTNESS TO VARYING MESH RESOLUTION
In real production and life, different hardware and configurations of sampling equipment lead to different mesh resolutions. Therefore, it is very important to test the performance of a feature descriptor under varying mesh resolution. The process of the experiment on robustness to varying mesh resolution is as follows: (1) Suppose a model M, which comes from the models used in this experiment, and a ground-truth rotation matrix R_gt.
(2) Resample M to δ of its original mesh resolution and name the new model as M δ .
(3) Calculate the accuracy between M and M_δ following the above experiment on robustness to rotation. The average accuracy under varying mesh resolution for the three models is shown in Fig. 13. It can be clearly seen from Fig. 13 that our proposed feature descriptor has the best accuracy when the mesh is resampled down to 0.75 of its original resolution, its average accuracy is close to that of the other feature descriptors at 0.5 of the original resolution, and its average accuracy is worse than that of the other descriptors at 0.25 of the original resolution. The reason is that the internal geometric structure and volume of the local surface change greatly when the mesh resolution varies greatly, which introduces a large error into the calculation of the volume ratio.
Although our proposed feature descriptor is not so good when the mesh is resampled to 0.25, such a large change in mesh resolution is very rare in real production and life. Meanwhile, our proposed feature descriptor performs well when the mesh is resampled to 0.75 and 0.5. Therefore, our proposed feature descriptor is robust to varying mesh resolution.
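Step (2) of this experiment can be sketched as follows. Random point subsampling is used here only as a rough stand-in for mesh resampling (keeping a fraction δ of the points is not exactly the same as resampling a mesh to δ of its resolution), and the function name is hypothetical.

```python
import numpy as np

def subsample(points, delta, rng=None):
    """Keep roughly a fraction `delta` (0.75, 0.5 or 0.25) of the points
    as a simple proxy for resampling a mesh to delta of its original
    resolution."""
    rng = np.random.default_rng() if rng is None else rng
    n_keep = max(1, int(round(delta * len(points))))
    idx = rng.choice(len(points), size=n_keep, replace=False)
    return points[np.sort(idx)]  # keep original point order
```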

6) ROBUSTNESS TO SCALE
In order to evaluate the robustness of these feature descriptors to scale, we multiply a point cloud by a scale factor. The process of the experiment is as follows: (1) Suppose a model M, which comes from the models used in this experiment, and a ground-truth scale factor ω.
(2) Multiply M by the scale factor ω to generate a new model M'.
(3) Calculate keypoints for M and denote the keypoints as KPS.
(4) From M', find the corresponding keypoint for each keypoint in KPS according to ω and denote the keypoints as KPS'. In this step, we first multiply a keypoint kp in KPS by ω, generating a keypoint kp_tmp, and then find the nearest point to kp_tmp in M', denoted kp'. The point kp' is the corresponding keypoint of kp. By the same operation, the other keypoints in KPS can find their corresponding keypoints in KPS'.
(5) Calculate the descriptor for each keypoint in KPS and denote the descriptors as DS, and calculate the descriptor for each keypoint in KPS' and denote the descriptors as DS'. (6) For each descriptor d in DS, find the nearest descriptor d' in DS', and denote d and d' as an ordinal pair <d, d'>.
(7) According to <d, d'>, find the corresponding keypoint ordinal pair <kp, kp'>, in which kp comes from M and kp' comes from M'. (8) Judge the correspondence between kp and kp' according to the following formula, which determines whether a descriptor is unchanged or only slightly changed under different scales. (9) Calculate the average accuracy between M and M' following the above experiment on robustness to rotation. The average accuracy under different scale factors for the three models is shown in Fig. 14. It is clear that our proposed feature descriptor achieves the best accuracy among all feature descriptors, followed by PFH and FPFH. Our proposed feature descriptor is robust to the scale factor because it is normalized, which makes it almost unaffected by scale changes. PFH and FPFH are almost unaffected by scale changes because they both describe angular relations between different sides, which are also robust to scale.
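The normalization argument can be checked directly: uniformly scaling a point cloud by ω multiplies every rotational volume by ω³, so the normalized ratios are unchanged. A minimal sketch, with a hypothetical helper name:

```python
import numpy as np

def volume_ratio_descriptor(volumes):
    """Normalize per-partition rotational volumes into ratios summing to 1,
    so that multiplying every volume by the same factor (e.g. omega**3
    under uniform scaling by omega) leaves the descriptor unchanged."""
    v = np.asarray(volumes, dtype=float)
    return v / v.sum()
```

This invariance is exactly why the descriptor is almost unaffected by scale changes: the common factor ω³ cancels in the ratio.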

B. EXPERIMENTS OF KEYPOINT DETECTION WITH BOUNDARY REMOVED
The task of this section is to validate whether our keypoint detection with boundary removed can detect boundary points in a point cloud and obtain keypoints far away from the boundary.

1) DATASET
In these experiments, we first use partial point clouds of Armadillo, Bunny and Happy Buddha from the Stanford 3D Scanning Repository. For Armadillo, we select 11 of its partial point clouds into our dataset, named from ArmadilloBack_0 to ArmadilloBack_330 with 60-degree intervals between them. For Bunny, we select 6 partial point clouds, named Bun000, Bun045, Bun090, Bun180, Bun270 and Bun315. For Happy Buddha, we select 15 partial point clouds, named from HappyStandRight_0 to HappyStandRight_336 with 24-degree intervals between them. In addition, we also use partial point clouds of Bremen from the Robotic 3D Repository. For Bremen, we select 13 of its partial point clouds, named from scan000 to scan012. Fig. 15 shows part of the point clouds in our dataset.

2) RESULTS OF KEYPOINT DETECTION WITH BOUNDARY REMOVED
Before obtaining keypoints, a key step is to obtain boundary points. In the process of determining whether a point is a boundary point, we choose its neighboring points within a radius of 4 mr and set different values of L_max. Fig. 16 shows the boundary points of the point clouds listed in Fig. 15 under different L_max, where red points are boundary points and green points are the original points of the point clouds in Fig. 15. From Fig. 16, it is clear that the number of boundary points is smallest when L_max is set to π, and the boundary points are discrete. By contrast, there are more boundary points when L_max is between π/2 and π/4. In particular, when L_max is set to π/3 or π/4, boundary points overlap with each other and form several clusters. In later experiments, we set L_max to π/2, which is a trade-off: too many boundary points decrease the time efficiency of detecting keypoints, while too few boundary points decrease the effectiveness of detecting keypoints.
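One common way to realize an angular-gap criterion of this kind (our illustration; the exact formulation in our method may differ) is to project a point's neighbours onto its tangent plane and flag the point as a boundary point when the largest angular gap between consecutive neighbours exceeds L_max:

```python
import numpy as np

def is_boundary_point(p, neighbors, normal, L_max):
    """Flag p as a boundary point when the largest angular gap between its
    neighbors, projected onto the tangent plane at p, exceeds L_max."""
    n = normal / np.linalg.norm(normal)
    # pick any vector not parallel to n, then build a tangent-plane basis
    a = np.array([1.0, 0.0, 0.0])
    if abs(n @ a) > 0.9:
        a = np.array([0.0, 1.0, 0.0])
    u = np.cross(n, a)
    u /= np.linalg.norm(u)
    v = np.cross(n, u)
    d = np.asarray(neighbors) - p
    # polar angles of the projected neighbors, sorted
    angles = np.sort(np.arctan2(d @ v, d @ u))
    gaps = np.diff(angles)
    wrap_gap = 2.0 * np.pi - (angles[-1] - angles[0])  # gap across the seam
    return bool(max(gaps.max(), wrap_gap) > L_max)
```

With L_max = π/2, a point whose neighbours all lie in a half-plane (a typical boundary configuration) is flagged, while a point surrounded on all sides is not.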
After obtaining the boundary points of a point cloud, we can obtain the original keypoints by the ISS detector and remove the keypoints on the boundary from them. In the process of determining whether a point is a keypoint, we choose its neighboring points within a radius of 4 mr in order to calculate the scatter matrix of p, and we set th_12 and th_23 to 0.975 (the default value in the ISS detector). In Fig. 17, columns (a) and (c) show the keypoints detected by the ISS detector for the different point clouds listed in Fig. 15, where red points are keypoints and green points are the original points of the point clouds. After obtaining the original keypoints, we remove each keypoint whose distance to the nearest boundary point is less than 5 mr. In Fig. 17, columns (b) and (d) show the keypoints that remain after removing the keypoints on the boundary. Comparing the keypoints in columns (a) and (c) with those in columns (b) and (d) of Fig. 17, it is clearly observed that our keypoint detection algorithm with boundary removed obtains keypoints away from the boundary, which is very important for generating the feature descriptor.
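The removal step can be sketched as a brute-force distance filter (illustrative only; a k-d tree would normally be used for the nearest-boundary query, and the function name is ours):

```python
import numpy as np

def remove_boundary_keypoints(keypoints, boundary_points, min_dist):
    """Keep only keypoints whose distance to the nearest boundary point is
    at least `min_dist` (5 mr in our experiments)."""
    # pairwise distances: (n_keypoints, n_boundary)
    d = np.linalg.norm(keypoints[:, None, :] - boundary_points[None, :, :],
                       axis=2)
    return keypoints[d.min(axis=1) >= min_dist]
```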

C. EXPERIMENTS OF PAIRWISE REGISTRATION
After obtaining keypoints from the point clouds, feature descriptor generation, feature matching, robust transformation estimation using SAC and fine ICP are performed in turn, and the result of pairwise registration is obtained. Here, we use the same dataset as in section 4.2. Among these steps, the most important is the pairwise matching of point clouds, which includes keypoint detection, feature descriptor generation and feature matching. By pairwise matching of point clouds, many correspondences between point clouds can be obtained, which is helpful for a good coarse pairwise registration; meanwhile, a good coarse pairwise registration is helpful for fine registration. It should be specially noted that the value of the radius r needs to be set in many cases, such as detecting keypoints, constructing the LRF, generating our proposed descriptor, and so on. Therefore, we uniformly set the radius r to 4 mr in order to decrease the number of parameters. In Fig. 18, the blue point cloud represents the source point cloud, the green point cloud represents the target point cloud, and a line represents a pair of keypoints between the source and target point clouds. It is specially noted that the original lines are not parallel but crisscross, because a rotation transformation exists between the source and target point clouds. However, to easily evaluate the accuracy of the matching pairs, we rotate the source point cloud and its keypoints according to the ground-truth rotation between the source and target point clouds. After this rotation, the more parallel the matching lines are and the larger their number, the more effective our proposed method is. It is visually obvious that there are many parallel matching lines in Fig. 18(a), 18(b) and 18(c) between the source and target point clouds.
At the same time, we further verify the accuracy of a matching line according to the following steps: first, for a pair of keypoints <kp_j^t, kp_i^s> represented by a line, we transform kp_i^s according to the ground-truth transformation between the source and target point clouds, and name the transformed keypoint kp_i^s'; second, we calculate the distance between kp_i^s' and kp_j^t; finally, if the distance is less than 0.5 mr, we regard kp_j^t and kp_i^s as a true matching keypoint pair and draw the matching line in red; otherwise, the false matching line is drawn in blue. From Fig. 18(a) to 18(c), it is clear that most matching lines are red, which further shows that our proposed pairwise matching of point clouds is effective. Four results of pairwise registration are presented in Fig. 19, in which blue points represent the source point cloud and green points represent the target point cloud. It is visually obvious that the source point cloud is registered seamlessly to the target point cloud and that the two point clouds fuse completely.
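The red/blue classification of matching lines can be sketched as follows (a vectorized illustration; the function name is ours):

```python
import numpy as np

def classify_matching_lines(kps_src, kps_tgt, R_gt, t_gt, mesh_resolution):
    """Transform each source keypoint by the ground-truth transformation
    and mark the pair as a true match (red line) when it lands within
    0.5 mr of its matched target keypoint; otherwise it is a false match
    (blue line)."""
    transformed = kps_src @ R_gt.T + t_gt  # row-vector form of R p + t
    dist = np.linalg.norm(transformed - kps_tgt, axis=1)
    return dist < 0.5 * mesh_resolution  # True -> red, False -> blue
```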
To further verify our pairwise registration, two criteria are employed for quantitative assessment of the registration results, i.e., the rotation error θ_r and the translation error θ_t, which are used in [15], [34]. θ_r represents the error between the ground-truth rotation matrix R_GT and the actual rotation matrix R_A. θ_t represents the error between the ground-truth translation vector t_GT and the actual translation vector t_A. They are defined as follows, where d_res is 1 mr, and R_GT and t_GT are known in advance. It should be noted how R_A and t_A are obtained.
In the registration process, we use SAC to estimate a coarse transformation matrix Rt_C between the source and target point clouds, and then use fine registration to obtain a fine transformation matrix Rt_F, so the final transformation matrix Rt is the product of Rt_F and Rt_C. We denote Rt_C and Rt_F as follows, where R_C and t_C represent the rotation matrix and translation vector of Rt_C respectively, and R_F and t_F represent the rotation matrix and translation vector of Rt_F respectively. Therefore, Rt can be obtained as follows, in which R_F R_C represents R_A and R_F t_C + t_F represents t_A. Theoretically, the smaller θ_r and θ_t are, the higher the accuracy of the pairwise registration. In the dataset used in section 4.2, there are 45 point clouds in total, forming 41 pairs. The registration errors of all 41 pairs of point clouds are shown in Table 3. One can see that most pairs of point clouds have been accurately registered (i.e., with small rotation and translation errors). To judge the correctness of a pairwise registration, thresholds are required. For example, we regard a pairwise registration as correct only when its rotation error and translation error are both smaller than 5 degrees and 5 mr respectively. Based on these thresholds, our pairwise registration achieves a registration accuracy of 90.2% on our dataset. The underlying reason is that our proposed feature descriptor in the pairwise registration pipeline is robust to rotation and noise. Therefore, based on our improved pairwise registration pipeline, we compare our feature descriptor with state-of-the-art feature descriptors. Table 4 shows the average registration errors of all 41 pairs of point clouds in our dataset. From Table 4, it is clearly shown that the average registration error of the pairwise registration based on our proposed feature descriptor is the smallest, followed by those based on RoPS and SHOT.
The underlying reason is that our proposed feature descriptor is more invariant to rotation and more robust to noise than the other feature descriptors.
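The composition of the coarse and fine transformations described above can be written out in homogeneous 4x4 form (a sketch; the helper names are ours):

```python
import numpy as np

def to_homogeneous(R, t):
    """Pack a rotation matrix and translation vector into a 4x4 matrix."""
    Rt = np.eye(4)
    Rt[:3, :3] = R
    Rt[:3, 3] = t
    return Rt

def compose(R_F, t_F, R_C, t_C):
    """Final transformation Rt = Rt_F @ Rt_C, so that
    R_A = R_F @ R_C and t_A = R_F @ t_C + t_F."""
    Rt = to_homogeneous(R_F, t_F) @ to_homogeneous(R_C, t_C)
    return Rt[:3, :3], Rt[:3, 3]
```

R_A and t_A returned here are the quantities plugged into the rotation and translation error criteria.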

V. CONCLUSION
In this paper, we propose a novel local feature descriptor based on rotational volume, which describes the ratio of the volumes of the geometric solids generated by rotating a point and its neighboring points around the normal. Experimental results show that our proposed feature descriptor has good robustness to rotation and noise, requires less storage space and runs faster than the PFH, FPFH, SHOT and RoPS descriptors. We also propose a boundary point detection algorithm in order to remove keypoints near the boundary, and a robust transformation estimation via SAC. Based on the above work, we implement pairwise registration of point clouds, and experimental results show that our pairwise registration pipeline is very effective. Of course, there are also some limitations of our proposed method: (1) Too many parameters need to be set, and we intend to set these parameters by an adaptive algorithm in the future. (2) Our proposed method only applies to rigid registration and is invalid for non-rigid registration.