A Maximum Entropy-Based Optimal Neighbor Selection for Multispectral Airborne LiDAR Point Cloud Classification

Multispectral light detection and ranging (LiDAR) technology was recently developed to improve the capability of thematic mapping by incorporating visible/infrared spectral information. Similar to image processing, point cloud classification usually considers contextual features derived from surrounding points to improve model accuracy. Some existing methods construct contextual features of point clouds by querying a fixed scale/number of neighbor points or by selecting a variable-size neighborhood based on some optimality criterion. Although these methods are able to collect neighbor points to derive contextual features, they may in turn introduce heterogeneity from the local neighborhood or select insufficient neighbor points, hindering classification performance. Therefore, we propose an optimal neighbor selection method based on the maximum entropy (MaxEnt) principle. More specifically, the proposed method determines the homogeneity of the local neighborhood of each point and constructs geometric and radiometric features by using MaxEnt to determine the optimal nearby points. The constructed contextual features then serve as input to various machine learning classifiers for point cloud classification. Extensive experiments are conducted to compare the performance of MaxEnt against six other neighbor selection methods. The experimental results demonstrate that MaxEnt achieves better classification results on multispectral airborne LiDAR data collected by Optech Titan, with an overall accuracy (OA) improvement of 7.3%–19.1%. Moreover, MaxEnt is shown to be more suitable than other existing neighbor selection methods for land cover scenarios with imbalanced classes caused by detailed and tiny objects, e.g., perimeter fencing and power lines.

Index Terms: Airborne laser scanning, contextual features, land cover, maximum entropy (MaxEnt), multispectral light detection and ranging (LiDAR), optimal neighbor selection, point cloud classification.

I. INTRODUCTION
Airborne light detection and ranging (LiDAR) systems collect 3-D point cloud data by emitting and receiving laser beams to and from the Earth's surface [1]. Such an active remote sensing technique facilitates various applications, including city planning, engineering surveying, land cover/use mapping, coastal engineering, and forestry studies [2], [3], [4]. To support these applications, effective and accurate 3-D point cloud classification and segmentation are essential. Unless an RGB camera is equipped on board, traditional monochromatic airborne LiDAR systems only collect point cloud data using a single-wavelength laser, which results in limited radiometric information [5], [6]. Subsequently, a multispectral airborne LiDAR system, named Optech Titan, was developed by Teledyne Optech in 2014 with three laser channels, namely, channel 1 (1550 nm), channel 2 (1064 nm), and channel 3 (532 nm) [7], [8], [9]. A multispectral LiDAR system can effectively mitigate the radiometric limitation of a monochromatic LiDAR system and fundamentally overcomes the drawbacks of RGB-based point cloud classification and segmentation with single-channel information [10].
Despite the advantage of providing additional spectral information, the laser beams generated by a multispectral LiDAR system with three separate laser channels result in three non-co-aligned laser datasets [6], [11]. To simultaneously exploit the three sets of point clouds for classification, a core channel can first be determined, and the point clouds from the other two channels can then be combined with the point cloud of the selected core channel [11]. As the data from the three channels share no common position, one can regard the points from the core channel as core points and search for the nearest neighbor points in the other channels based on Euclidean distance. Thus, apart from the intensity of each core point, the intensity values obtained from the nearest neighbor points of the other laser channels can be regarded as additional radiometric information.
Each individual point possesses unique information, including 3-D coordinates, backscattered intensity, number of returns, return number, etc. Nevertheless, the information embedded in the data point itself may be inadequate to generate representative features for accurate point cloud classification. As a result, neighbor points can be queried to infer additional features of each data point. Neighbor selection methods based on a fixed scale or a fixed number of neighbor points, such as the cylindrical neighbor selection method [12], [13], the spherical neighbor selection method [12], [14], and the k-nearest neighbor (kNN) selection method [12], [15], are commonly found in point cloud classification and semantic segmentation. These methods construct contextual information by selecting the neighbor points located within a fixed radius of a cylinder or sphere, or by selecting a fixed number of nearest neighbor points. However, these methods are empirical and limited by the particularity of the point cloud data in specific areas. Besides, relying on a fixed scale of neighbor points inevitably suffers from information redundancy, which may also interfere with the information representing contextual features [16], [17]. As a result, several attempts have been made to select only the critical neighbor points to overcome these drawbacks. One optimal neighbor selection method looks for an adaptive scale through a heuristic search for a specific number of neighbor points by sequentially increasing the neighbor size [18]. An alternative method presented in [19] selects a specific neighborhood size based on a consistent curvature value. Furthermore, Weinmann et al. [20], [21] proposed an optimal neighbor selection strategy that looks for the minimal eigenentropy of the neighbor points. These methods shed light on the flexibility of selecting neighbor points, since they are capable of estimating an optimal neighborhood scale. However, the majority of these studies only consider geometric features as an indicator to select an optimal neighborhood, and the ultimate neighbor size always tends to be small, which results in insufficient information and a lack of diversity [12].
To fill this research gap, we propose an optimal neighbor selection method that selects sufficient homogeneous neighbor points to enhance information diversity, reduce information redundancy, and improve classification accuracy. The proposed method is built on the maximum entropy (MaxEnt) principle to adaptively select homogeneous neighbor points and ignore heterogeneous neighbor points within a fixed scale of neighborhood. The MaxEnt principle [22], [23], [24] states that the partition maximizing the summed entropy of the probability distributions of the homogeneous and heterogeneous points provides the maximum information discrepancy, which implies that this partition best distinguishes homogeneous from heterogeneous neighbor points. This principle therefore facilitates the selection of sufficient homogeneous neighbor points while dropping the interfering heterogeneous neighbor points from the neighborhood.
After selecting optimal neighbor points in the core channel of a multispectral airborne LiDAR dataset, contextual feature vectors can be derived to represent the characteristics of each individual point. Geometric feature vectors include height information, i.e., the elevation of the point, as well as eigenvalue-based features, i.e., omnivariance, anisotropy, eigenentropy, sum of eigenvalues, local curvature, linearity, planarity, and sphericity [12], [25]. Radiometric feature vectors can be determined from the three laser channels, i.e., channel 1 (1550 nm), channel 2 (1064 nm), and channel 3 (532 nm) [26]. For the purpose of point cloud classification, the extracted feature vectors serve as input to nine commonly used machine learning classifiers, including support vector machine (SVM), decision tree (DT), random forests (RFs), kNN, Gaussian Naïve Bayes (GNB), linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), adaptive boost (AB), and multilayer perceptron (MLP). Most of them have been favorably used in LiDAR point cloud classification and segmentation [11], [12], [20]. All these models classify point clouds into seven self-defined land cover classes, i.e., "Road," "House," "Grass," "Tree," "Fence," "Water," and "Powerline." We use various evaluation metrics and compare MaxEnt with other existing neighbor selection algorithms [12], [18], [19], [20] to evaluate the classification results and the effectiveness of the proposed method. The major contributions of our work are summarized as follows.
1) To propose an optimal neighbor selection method based on the MaxEnt principle to enhance information diversity, reduce information redundancy, and improve classification accuracy.
2) To maximize the benefits of using multispectral airborne LiDAR intensity data to infer optimal neighbor points.
3) To compare the proposed MaxEnt method with other existing neighbor selection methods in terms of point cloud classification.
4) To assess the impact on detailed and tiny objects/land cover categories, such as "Fence" and "Powerline," that cause imbalanced classes.

The rest of the article is organized as follows. We first present the proposed algorithm in detail in Section II. Section III describes the dataset and the entire experimental setup. We then present the performance of the proposed method in Section IV and discuss the experimental results in Section V by comparing with other existing methods. Section VI concludes our findings in this study.

II. METHODOLOGY

A. Overall Workflow
The workflow of the proposed MaxEnt is presented in Fig. 1. For a given multispectral LiDAR dataset $L$, a core channel, $L_c$, is first selected among all available laser channels, e.g., $L_1$, $L_2$, and $L_3$. For instance, $L_c = L_2$ if channel 2 is selected as the core channel. We then take an arbitrarily selected point from the core channel, say $x_c = \{x, y, z, I_c\}$, with its geometric and radiometric information, as an example. The workflow begins by embedding the intensity values of the other two channels into the core channel, resulting in $x_c = \{x, y, z, I_1, I_2, I_3\}$. Then, each $x_c$ queries a fixed number of neighbor points $x_n$ in accordance with the kNN selection method, as shown in Fig. 1. The next step searches for the optimal neighbor points $x'$ within $x_n$ based on the MaxEnt principle. The selected optimal neighbor points may vary when exploiting different information, i.e., elevation ($z$), intensity of channel 1 ($I_1$), channel 2 ($I_2$), and channel 3 ($I_3$), or the combination of all the above features (see Fig. 1). Feature vectors can then be derived from the selected optimal neighbor points based on their geometric and radiometric properties. Finally, the extracted feature vectors serve as input to machine learning classifiers, which predict the land cover class of the core data point $x_c$.

B. MaxEnt Principle
The concept of thermodynamic entropy was first proposed by Clausius in the 1850s to describe the distribution of thermal energy [27]. Boltzmann [28] later gave thermodynamic entropy a probabilistic interpretation in 1901, which described the extent of disorder and uncertainty of information in the field of thermodynamics. Shannon [29] further developed the thermodynamic entropy proposed by Boltzmann [28] and extended it into a concept that measures the statistical uncertainty of the information of a given system. The entropy value of a variable represents the amount of information contained in this variable. For a system with $n$ discrete states, the entropy summation value of the system can be calculated from the probability distribution of these states as

$$S = -\sum_{i=1}^{n} p_i \log p_i \tag{1}$$

where $p_i$ refers to the probability of state $i$. A larger value of the entropy's summation ($S$) of the variable corresponds to greater uncertainty and a greater amount of information retained in the system.
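As a minimal sketch of the entropy summation described above, the following Python snippet evaluates the Shannon entropy of a discrete distribution. The function name `shannon_entropy` and the use of the natural logarithm are assumptions made for illustration; the choice of logarithm base only rescales the result.

```python
import math


def shannon_entropy(probabilities):
    """Return S = -sum(p_i * log(p_i)) over the states with p_i > 0.

    The natural logarithm is an assumption; another base only rescales S.
    States with zero probability contribute nothing to the sum.
    """
    return -sum(p * math.log(p) for p in probabilities if p > 0)


# Hypothetical distributions over four states.
uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.70, 0.10, 0.10, 0.10]

print("uniform:", shannon_entropy(uniform))
print("skewed: ", shannon_entropy(skewed))
```

The uniform distribution yields the larger value, which is the behavior the MaxEnt principle relies on.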
Because it represents the amount of information in a system, the concept of entropy has also been applied in other fields and extended to other principles, such as the MaxEnt principle proposed by Jaynes [22]. The MaxEnt principle states that the uncertainty reaches a maximum when all the states are equally likely. The uniform distribution maximizes the entropy and retains the largest amount of uncertainty and information. This conclusion is justified by Laplace's principle of insufficient reason [23], which implies that the best strategy for discriminating between two or more events is to consider all the states as equally distributed.

C. MaxEnt for Optimal Neighbor Selection
Similar to image classification, contextual features of point clouds can significantly influence the classification performance. The more relevant the derived contextual features are, the higher the classification accuracy that can be achieved. As contextual features are extracted at each data point with respect to the local neighborhood, it is vital to select representative neighbor points prior to point cloud classification. To address the limitations of existing fixed scale neighbor selection methods and optimal neighbor selection methods, we propose to adaptively select optimal neighbor points and extract representative contextual features for each point in the core channel $L_c$ of multispectral airborne LiDAR data. Based on the MaxEnt principle, the proposed algorithm divides a fixed scale of local neighborhood into homogeneous and heterogeneous points. The homogeneous neighbor points are then regarded as the selected optimal neighbor points, which can well represent the characteristics and improve the information diversity of the derived features while avoiding the introduction of heterogeneous points and redundant feature information.
Let $x$ be the input of the MaxEnt principle, which can be elevation and/or intensity. The nearest $k$ neighbors are then determined for each data point in the core channel. This is defined as the original neighborhood. The information of each data point in the core channel is represented as $x_c$, while $x_n$ refers to the information of the corresponding neighbor points. $\delta$ is defined as the absolute information difference between the core point and the original neighbor points, i.e.,

$$\delta = |x_c - x_n|.$$

Then, the $k$ neighbor points are divided into $l$ levels in accordance with the range of the absolute information difference $\delta$. $n_i$ is the number of neighbor points belonging to the $i$th level, i.e.,

$$\sum_{i=1}^{l} n_i = k.$$

The probability of the $i$th level can be computed as

$$p_i = \frac{n_i}{k}. \tag{5}$$

Then, the information entropy of state $x$ can be calculated using (1). The idea of the MaxEnt principle is to look for the specific level that divides the fixed number of original neighbor points into homogeneous and heterogeneous neighbor points such that the entropy's summation of these two sets of points is maximized.
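The level assignment and level probabilities can be illustrated with a short Python sketch. The function `level_probabilities`, the equal-width binning between 0 and the largest difference, and the sample values are all assumptions for illustration; the text above only states that the neighbors are divided into $l$ levels according to the range of $\delta$.

```python
def level_probabilities(core_value, neighbor_values, num_levels):
    """Bin |x_c - x_n| into equal-width levels and return (deltas, n_i, p_i).

    Assumption: levels are equal-width bins spanning [0, max(delta)];
    the largest difference is placed in the last level.
    """
    deltas = [abs(core_value - v) for v in neighbor_values]
    max_delta = max(deltas)
    counts = [0] * num_levels
    for d in deltas:
        if max_delta == 0:
            index = 0  # all neighbors are identical to the core point
        else:
            index = min(int(d / max_delta * num_levels), num_levels - 1)
        counts[index] += 1
    k = len(deltas)
    probabilities = [n / k for n in counts]
    return deltas, counts, probabilities


# Hypothetical elevations (m): one core point and six neighbors.
deltas, counts, probs = level_probabilities(
    10.0, [10.1, 10.2, 10.4, 12.0, 15.0, 20.0], num_levels=5
)
print(counts)
print(probs)
```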

D. Implementation of MaxEnt on Multispectral LiDAR Point Cloud Data
For multispectral airborne LiDAR point clouds with three laser channels, the available features include the elevation $z$ and the intensity of channel 1 ($I_1$), channel 2 ($I_2$), and channel 3 ($I_3$). To exploit the intensity information from the three channels simultaneously, we first define one channel, say channel 2, as the core channel and search for the nearest neighbor point, based on Euclidean distance, in channel 1 and channel 3, respectively. The nearest neighbor points of channel 1 and channel 3 found within three times the mean point spacing of channel 2 are assigned to the core data point of channel 2. Regarding the elevation, MaxEnt only searches the elevation of neighbor points from the core channel. Accordingly, the information of a data point $x_c$ can be represented as an array $\{x, y, z, I_1, I_2, I_3\}$. The pseudocode for the implementation of MaxEnt for multispectral point cloud data is as follows.
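The intensity embedding described in this paragraph can be sketched as a brute-force nearest neighbor search. This is an illustrative sketch only: the function `embed_intensity`, the tuple layout `(x, y, z, intensity)`, and the choice to return `None` when no point lies within the search radius are assumptions, and a spatial index such as a k-d tree would be preferable for real datasets.

```python
import math


def embed_intensity(core_points, other_points, max_distance):
    """For each core point, return the intensity of the nearest point from
    another channel, or None if no point lies within max_distance.

    Points are (x, y, z, intensity) tuples (assumed layout). The search is
    brute force, O(len(core_points) * len(other_points)).
    """
    embedded = []
    for cx, cy, cz, _ in core_points:
        best_distance, best_intensity = max_distance, None
        for ox, oy, oz, intensity in other_points:
            distance = math.dist((cx, cy, cz), (ox, oy, oz))
            if distance <= best_distance:
                best_distance, best_intensity = distance, intensity
        embedded.append(best_intensity)
    return embedded


# Hypothetical data: channel 2 is the core channel, channel 1 is embedded.
mean_point_spacing = 0.5  # assumed value, in metres
channel_2 = [(0.0, 0.0, 0.0, 50.0), (100.0, 100.0, 0.0, 60.0)]
channel_1 = [(0.1, 0.0, 0.0, 7.0), (5.0, 0.0, 0.0, 9.0)]
i1_for_core = embed_intensity(channel_2, channel_1, 3 * mean_point_spacing)
print(i1_for_core)
```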
Step 1 (Calculate Information Difference $\delta$): Search a fixed number $k$ of original nearest neighbor points, say 1000, for each point in $L_c$, and then construct the vector of absolute information differences between the core data point and its original neighbor points in the three channels. This step provides the absolute information differences of elevation and of the intensity of channel 1, channel 2, and channel 3, which are referred to as $\delta_z$, $\delta_{I_1}$, $\delta_{I_2}$, and $\delta_{I_3}$, respectively.

Step 2 (Divide Absolute Information Difference $\delta$ Into $l$ Levels): Divide each of the four absolute information differences, i.e., $\delta_z$, $\delta_{I_1}$, $\delta_{I_2}$, and $\delta_{I_3}$, into $l$ levels. Calculate $p_i$, $\forall i \in [1, l]$, for each $\delta$ according to (5). The number of levels $l$ can be chosen with reference to the elevation difference. For instance, if the elevation within the 1000 nearest neighbor points ranges from 0 to 100 m, we can divide the dataset into 20 levels with a 5-m range, i.e., $l = 20$.

Step 3 (Set an Initial Threshold Level $t$): Assign the first of the $l$ levels as the initial threshold level, i.e., $t = 1$. The corresponding information threshold can then be computed as $T = x_c + (\delta / l) \times t$ to divide the $k$ nearest original neighbor points $x_n$ into initial homogeneous neighbor points and initial heterogeneous neighbor points $x^{\Upsilon}$.

Step 4 (Separate Into Homogeneous Points and $x^{\Upsilon}$): If the absolute information difference $\delta$ between $x_c$ and $x_n$ is smaller than the information threshold $T$, $x_n$ is regarded as an initial homogeneous neighbor point; otherwise, it is regarded as an initial heterogeneous neighbor point $x^{\Upsilon}$.

Step 5 (Calculate the Entropy's Summation $S$): Calculate the Shannon entropy of the initial homogeneous neighbor points and of the heterogeneous neighbor points, $S^{\Upsilon}_t$, at the specific threshold level $t$. The entropy's summation at level $t$ is the sum of these two entropies.

Step 6 (Raise the Threshold Level $t$ by 1): If $t$ is no larger than $l$, repeat steps 4-6 until $t$ is equal to $l$. The iteration terminates after the entropy's summation has been computed for every $t \in [1, l]$.

Step 7 (Identify the Threshold Level $t$ That Maximizes the Summation of Entropy): The threshold level that maximizes the entropy's summation is regarded as the optimal threshold level $t'$, i.e., the $t \in [1, l]$ with the largest entropy's summation, which divides the original neighbor points into homogeneous points and heterogeneous points $x^{\Upsilon}$. The optimal information threshold is calculated by $T' = x_c + (\delta / l) \times t'$.

Step 8 (Determine the Homogeneous Points and $x^{\Upsilon}$): Determine the homogeneous and heterogeneous points by comparing the absolute information difference between $x_c$ and $x_n$ with the value of $T'$. If the absolute information difference is no larger than $T'$, the neighbor point is selected as a homogeneous point; otherwise, it is regarded as $x^{\Upsilon}$.

Step 9 (Refine Homogeneous Neighbor Points With Concave Hull): As some of the points within the region of the homogeneous points may be wrongly depicted as $x^{\Upsilon}$, a concave hull $H$ is constructed from the homogeneous points extracted in Step 8. The $x^{\Upsilon}$ determined in Step 8 that are located within $H$ are reassigned as homogeneous points. This process thus further refines the final set of homogeneous neighbor points.
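Steps 1-8 can be sketched for a single feature as follows. This is a hedged illustration rather than the authors' implementation: the function names are hypothetical, the levels are assumed to be equal-width bins of $\delta$, each entropy is assumed to be computed on the level counts renormalized within its own group (in the style of maximum-entropy thresholding), and the threshold is applied in the space of absolute differences. The concave hull refinement of Step 9 is omitted.

```python
import math


def group_entropy(counts):
    """Shannon entropy of the level counts of one group, renormalized to sum to 1."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)


def maxent_threshold_level(level_counts):
    """Return (t', sums): the 1-based level maximizing the summed entropy of
    the homogeneous group (levels 1..t) and heterogeneous group (levels t+1..l)."""
    num_levels = len(level_counts)
    sums = [
        group_entropy(level_counts[:t]) + group_entropy(level_counts[t:])
        for t in range(1, num_levels + 1)
    ]
    best_t = max(range(1, num_levels + 1), key=lambda t: sums[t - 1])
    return best_t, sums


def select_homogeneous(core_value, neighbor_values, num_levels):
    """Return indices of neighbors whose |x_c - x_n| is within the optimal threshold."""
    deltas = [abs(core_value - v) for v in neighbor_values]
    max_delta = max(deltas)
    if max_delta == 0:
        return list(range(len(deltas)))  # every neighbor matches the core point
    width = max_delta / num_levels  # assumed equal-width levels
    counts = [0] * num_levels
    for d in deltas:
        counts[min(int(d / width), num_levels - 1)] += 1
    best_t, _ = maxent_threshold_level(counts)
    threshold = width * best_t
    return [i for i, d in enumerate(deltas) if d <= threshold]


# Hypothetical elevations: eight neighbors close to the core point, two far away.
neighbors = [0.1, 0.2, 0.3, 0.4, 1.1, 1.2, 1.3, 1.4, 2.5, 4.0]
print(select_homogeneous(0.0, neighbors, num_levels=4))
```

Under these assumptions, the selected level depends on how the counts are distributed across levels, so the choice of $l$ and of the binning scheme affects which neighbors are retained.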
As four types of information are embedded in $L$, i.e., $z$, $I_1$, $I_2$, and $I_3$, MaxEnt is able to determine four optimal neighbor selection results for each $x_c$. However, each of these optimal neighbor selection results, i.e., those derived from $z$, $I_1$, $I_2$, and $I_3$, reflects the characteristics of the corresponding feature only. For example, if $x_c$ belongs to "Road," MaxEnt with the elevation feature can only discriminate $x_n$ if it belongs to either "Tree" or "House," since these are generally located at a higher elevation than "Road." On the other hand, if $x_n$ belongs to "Grass," the use of elevation is unable to maximize the entropy summation in order to distinguish $x_n$, i.e., "Grass," from $x_c$, i.e., "Road." In this case, the intensity features, i.e., $I_1$, $I_2$, and $I_3$, can aid in identifying the homogeneous points, i.e., those $x_n$ mostly belonging to "Road." Hence, the final $x'$ should be the intersection of the four independent optimal neighbor selection results, with cardinality $k'$, which implies that the selected points are based on the four types of information simultaneously. A graphical illustration of the optimal neighbor selection results derived by MaxEnt is depicted in Fig. 2, which shows seven types of core data points, including "Road," "House," "Fence," "Tree," "Grass," "Water," and "Powerline."
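The intersection of the four per-feature selections can be illustrated with Python sets. The function `intersect_selections`, the dictionary keys, and the index values are hypothetical.

```python
def intersect_selections(selections):
    """Return the sorted neighbor indices selected by every feature.

    `selections` maps a feature name to the indices of the neighbors that
    were judged homogeneous using that feature alone.
    """
    index_sets = [set(indices) for indices in selections.values()]
    if not index_sets:
        return []
    return sorted(set.intersection(*index_sets))


# Hypothetical per-feature results for one core point.
per_feature = {
    "z": [0, 1, 2, 3, 4, 5],
    "I1": [0, 1, 2, 4, 7],
    "I2": [0, 2, 4, 5, 7],
    "I3": [0, 2, 3, 4],
}
final_neighbors = intersect_selections(per_feature)
print(final_neighbors, "k' =", len(final_neighbors))
```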

E. Feature Extraction
Feature vectors of point clouds are extracted and derived from the optimal neighbor points. The feature vectors, as shown in Table I, are divided into two main categories: 1) geometric features that are derived from the 3-D coordinates and 2) radiometric features derived from the intensity of the three laser channels.
1) Geometric Features: Geometric features include elevation-based feature vectors, i.e., the mean of elevation $\bar{z}$ and the standard deviation of elevation $\sigma_z$ computed over the core data point $x_c$ and the $k'$ optimal neighbor points $x'$. Apart from elevation-based geometric features, 3-D spatial geometric features of the neighbor points are also exploited to represent the feature of $x_c$. These 3-D shape features are related to the three eigenvalues $\lambda_i$, $i \in \{1, 2, 3\}$, derived from the covariance matrix constructed from the 3-D coordinates of the $k'$ optimal neighbor points $x'$. Based on the three eigenvalues $\lambda_1$, $\lambda_2$, and $\lambda_3$, the geometric features can be represented by eight components, including linearity $L_\lambda$, planarity $P_\lambda$, sphericity $S_\lambda$, omnivariance $O_\lambda$, anisotropy $A_\lambda$, eigenentropy $E_\lambda$, sum of eigenvalues $\Sigma_\lambda$, and change of curvature $C_\lambda$. Some of these features require the eigenvalues to be normalized first, i.e., each $\lambda_i$ is replaced by $\lambda_i / \sum_{j} \lambda_j$, before the features are derived [25], as shown in Table I.
2) Radiometric Features: Similar to elevation-based geometric features, radiometric features can be constructed by calculating the mean value and standard deviation over the $k'$ optimal neighbor points $x'$. In Table I, $\bar{I}_j$ refers to the mean intensity derived from $x'$ with respect to the laser channel $j \in \{1, 2, 3\}$, and $\sigma_{I_j}$ refers to the standard deviation of intensity derived from $x'$ for channel $j \in \{1, 2, 3\}$. As mentioned above, the use of radiometric features can further aid in delineating classes having similar elevation but different spectral reflectance.
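The eight eigenvalue-based components can be sketched as below. The formulas are the definitions commonly used in the point cloud literature and are assumed, not confirmed, to match Table I; the function name `eigen_features` and the decision to normalize the eigenvalues for every feature except the sum are likewise assumptions.

```python
import math


def eigen_features(lambda_1, lambda_2, lambda_3):
    """Eigenvalue-based shape features for lambda_1 >= lambda_2 >= lambda_3 >= 0.

    Commonly used definitions (assumed to correspond to Table I). All
    features except the sum are computed from normalized eigenvalues.
    """
    if not (lambda_1 >= lambda_2 >= lambda_3 >= 0) or lambda_1 == 0:
        raise ValueError("expected lambda_1 >= lambda_2 >= lambda_3 >= 0, lambda_1 > 0")
    total = lambda_1 + lambda_2 + lambda_3
    e1, e2, e3 = lambda_1 / total, lambda_2 / total, lambda_3 / total
    return {
        "linearity": (e1 - e2) / e1,
        "planarity": (e2 - e3) / e1,
        "sphericity": e3 / e1,
        "omnivariance": (e1 * e2 * e3) ** (1.0 / 3.0),
        "anisotropy": (e1 - e3) / e1,
        "eigenentropy": -sum(e * math.log(e) for e in (e1, e2, e3) if e > 0),
        "sum_of_eigenvalues": total,
        "change_of_curvature": e3,  # lambda_3 / (lambda_1 + lambda_2 + lambda_3)
    }


# Hypothetical eigenvalues of a neighborhood covariance matrix.
features = eigen_features(4.0, 2.0, 1.0)
for name, value in features.items():
    print(f"{name}: {value:.4f}")
```

With these definitions, linearity, planarity, and sphericity sum to one, which offers a quick sanity check on any implementation.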

F. Machine Learning Classifiers
The purpose of point cloud classification is to estimate and assign a self-defined land cover class label to each data point $x_c$ by extracting the above-mentioned features and training machine learning classifiers. In our experimental work, nine commonly used machine learning classifiers are used to classify the multispectral airborne LiDAR point clouds, and some of them have been adopted in previous studies [11], [12], [20]. These include SVM, RFs, kNN, DT, GNB, LDA, QDA, AB, and MLP. Furthermore, we conduct experiments with a classic deep learning-based method, i.e., PointNet++ [30], and compare the classification results of our proposed method with those of PointNet++.

III. EXPERIMENTS
The performance of MaxEnt for multispectral airborne LiDAR point cloud classification is evaluated by comparing it with two categories of neighbor selection methods. The first category includes the use of a fixed scale of cylindrical or spherical neighborhood and a fixed number of kNN queries [12]. The second category comprises optimal neighbor selection methods that rely on assessing the change of curvature [18], the consistency of the curvature level [19], and the minimum of eigenentropy [20]. All these neighbor selection methods, together with the proposed MaxEnt, are examined using an identical multispectral airborne LiDAR dataset.

TABLE I SUMMARY OF GEOMETRIC AND RADIOMETRIC FEATURES FOR POINT CLOUD CLASSIFICATION

A. Dataset
The multispectral airborne LiDAR dataset covers a residential suburban area located in Scarborough, ON, Canada. Fig. 3(a) shows an aerial image and the spatial extent of the study area, which is enclosed by a parallelogram. The detailed information of the dataset is listed in Table II. The dataset was collected on September 3, 2014, by the Optech Titan system, which was flown at an altitude of 430 m and transmitted three laser beams simultaneously. As a result of the flight survey, the airborne LiDAR system generated three strips of 3-D point clouds in LAS data format with fields including xyz coordinates, backscattered intensity, number of returns, return number, GPS time, scan angle, etc. The numbers of generated points of channels 1-3 are 3 724 889, 4 391 470, and 5 030 194, respectively. The mean point spacing of the points from the three channels is calculated by considering all returns of the entire region. We extract a part of the dataset from the study area with a size of 1046 × 208 m, since the flight survey covered a large off-shore region at the east side that was trimmed from our experimental work. Finally, the numbers of points from the extracted dataset of channels 1-3 are 3 146 762, 3 532 946, […]. The seven manually labeled land cover classes [see Fig. 3(c)] are "Road," "House," "Tree," "Grass," "Fence," "Powerline," and "Water." Some of the data points that cannot be classified into any predefined category are labeled as "Others," and they are mainly located near the shore region at the east side. The 3-D point clouds displayed in terms of the combined intensity from the three channels and the manually labeled land cover classes are shown in Fig. 3(b) and (c). Fig. 4 illustrates an example of each land cover class in the 3-D point clouds.
Since land cover categories including "Fence" and "Powerline" are mainly located in the west side of the study area, as shown in Fig. 5, we conduct experiments independently on this specific part of the study area to validate the performance of the proposed method.

B. Data Preprocessing
The study area is located on moderate, undulating terrain with a ground elevation ranging from 38.63 to 92.79 m; therefore, the use of the absolute elevation value as a feature may induce inaccurate interpretation and affect the ability of the model to distinguish different land cover classes. For example, in Fig. 6(a), the elevation of "Road" or "Grass" in the dotted region is larger than that of "House" and even "Tree" in the west side of the study area. As a result, the point cloud dataset undergoes a ground filtering process, which separates elevated objects (off-ground points) from the terrain (ground points) prior to classification. Our experiments adopt cloth simulation for the ground filtering [31], [32], since it is an efficient algorithm that works well on relatively flat terrain. The filtering process turns the point cloud dataset upside down and then drops a simulated soft cloth from the top onto the inverted surface. The intersection between the point cloud and the covering cloth is regarded as the base for distinguishing between ground points and off-ground points. After ground filtering, height normalization is applied to the entire point cloud of the study area, as shown in Fig. 6(b). One can clearly notice that, after ground filtering and height normalization, the elevation of "Road" and "Grass" becomes close to that of the same classes located in the west side of the study area.

C. Evaluation Metrics
As a commonly used evaluation metric for airborne laser scanning point cloud classification, the overall accuracy (OA) is used to evaluate the classification results derived from MaxEnt and the six existing neighbor selection methods. The OA is defined as

$$\mathrm{OA} = \frac{\mathrm{TP} + \mathrm{TN}}{\mathrm{TP} + \mathrm{FP} + \mathrm{TN} + \mathrm{FN}} \tag{7}$$

where true positive (TP) refers to the points that belong to a given land cover class and are correctly classified into it, and true negative (TN) implies that the classifier correctly predicts the negative class. False negative (FN) represents the points that belong to the given land cover class but are recognized as any other land cover. False positive (FP) refers to the points that are annotated as other land covers but are misclassified as the given land cover class. In general, a larger value of OA corresponds to higher classification accuracy and better performance of the method.
Apart from that, the F1 score assesses the classification performance for each land cover category, and it is more suitable than OA when the number of points in each land cover varies greatly. The F1 score is calculated from precision and recall, which are defined as follows:

$$\mathrm{Precision} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}}, \qquad \mathrm{Recall} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}.$$

Then we can calculate the F1 score of a land cover class by

$$\mathrm{F1} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}.$$

Finally, an average F1 score among all land cover classes can be computed as

$$\mathrm{mF1} = \frac{1}{N} \sum_{i=1}^{N} \mathrm{F1}_i$$

where $N$ refers to the number of land cover categories.

To provide further insight, we also consider intersection-over-union (IoU) as a metric to quantitatively assess the classification performance of our proposed method and other existing methods. The IoU of each land cover category is calculated by

$$\mathrm{IoU} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP} + \mathrm{FN}}.$$

The overall mean IoU (mIoU) is ultimately generated by calculating the average value of the IoUs across all the land covers as

$$\mathrm{mIoU} = \frac{1}{N} \sum_{i=1}^{N} \mathrm{IoU}_i$$

where $N$ refers to the number of land cover categories. Larger values of mF1 and mIoU imply that the corresponding method demonstrates a higher capability in terms of classification performance.
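As a small worked example of these metrics, the sketch below derives OA, mF1, and mIoU from a confusion matrix. The function `classification_metrics` and the sample matrix are hypothetical, and OA is computed here as the fraction of correctly classified points, which is the usual multiclass reading of the definition above and is stated as an assumption.

```python
def classification_metrics(confusion):
    """Compute OA, per-class F1/IoU, mF1, and mIoU from a confusion matrix.

    confusion[i][j] is the number of points of true class i predicted as
    class j. OA is taken as correct / total (assumed multiclass form).
    """
    num_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[c][c] for c in range(num_classes))
    f1_scores, ious = [], []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denominator = precision + recall
        f1_scores.append(2 * precision * recall / denominator if denominator else 0.0)
        ious.append(tp / (tp + fp + fn) if tp + fp + fn else 0.0)
    return {
        "OA": correct / total,
        "F1": f1_scores,
        "IoU": ious,
        "mF1": sum(f1_scores) / num_classes,
        "mIoU": sum(ious) / num_classes,
    }


# Hypothetical two-class confusion matrix (rows: true, columns: predicted).
metrics = classification_metrics([[8, 2], [1, 9]])
print(metrics)
```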

D. Experimental Settings
In the subsequent experiments, we select channel 2 as the core channel because the 1064-nm laser wavelength offers the best separability of different land covers [11]. We then embed the intensity values from channels 1 and 3 in the core channel. Since querying neighbor points for the entire study area based on the kNN selection method or the k-dimensional tree method causes a high computational burden, we divide the study area into 20 portions to speed up the process. In this case, searching for neighbor points within the corresponding section of the study area helps improve the computational efficiency. With the manually labeled data shown in Fig. 5, we randomly select 1% of the total number of data points as training data for the nine machine learning classifiers.
Regarding the number of neighbor points k, it is critical to apply the same number to compare the classification performance of the proposed MaxEnt with other neighbor selection methods.We thus set k = 1000 due to the prolific memory of the computational platform.To be consistent with the searching range of MaxEnt, the number of searched neighbor points for fixed scale of neighbor selection methods and the number of original given neighbor points for optimal neighbor selection methods are all close to 1000.The workflow of other existing neighbor selection methods is shown in Fig. 7, which includes the corresponding examples of neighbor selection results.The following paragraphs briefly introduce six neighbor selection methods to compare against the proposed MaxEnt.
1) Fixed Scale of Cylindrical Neighbor Selection Method (Cylinder): Thomas et al. [12] proposed to select the neighbor points located within a cylinder of fixed radius. To determine the radius of the cylinder, we refer to the benchmark scale of neighbor points in MaxEnt. As a result, we define the radius of the cylinder as eight times the mean point spacing, which is closest to the benchmark scale of 1000 neighbor points in MaxEnt.
2) Fixed Scale of Spherical Neighbor Selection Method (Sphere): The mechanism is similar to that of the fixed scale of cylindrical neighbor selection method [12]. The only difference is that the Sphere method selects neighbor points located within a sphere of fixed radius rather than a cylinder. In this experiment, we define the radius of the sphere as ten times the mean point spacing (1058 neighbor points), which is closest to the benchmark scale of 1000 neighbor points in MaxEnt.
3) Fixed Scale of kNN Selection Method (kNN): As a fixed-scale neighbor selection method, it selects the k nearest neighbor (kNN) points instead of relying on a certain geometry. Accordingly, we set the parameter k to 1000 in the subsequent experiments.
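The three fixed-scale strategies above differ only in the distance test applied to candidate points. The sketch below illustrates this with a brute-force search in NumPy; it is not the authors' implementation, and the function names are hypothetical. In the experiments the radius would be a multiple of the mean point spacing (eight times for Cylinder, ten times for Sphere), and a k-dimensional tree would normally replace the brute-force distance computation.

```python
import numpy as np

def cylinder_neighbors(points, query, radius):
    """Indices of points inside a vertical cylinder (horizontal distance only)."""
    d_xy = np.linalg.norm(points[:, :2] - query[:2], axis=1)
    return np.flatnonzero(d_xy <= radius)

def sphere_neighbors(points, query, radius):
    """Indices of points inside a sphere (full 3-D distance)."""
    d = np.linalg.norm(points - query, axis=1)
    return np.flatnonzero(d <= radius)

def knn_neighbors(points, query, k):
    """Indices of the k nearest points, ordered by increasing 3-D distance."""
    d = np.linalg.norm(points - query, axis=1)
    k = min(k, len(points))
    idx = np.argpartition(d, k - 1)[:k]  # unordered k smallest distances
    return idx[np.argsort(d[idx])]

if __name__ == "__main__":
    # Hypothetical toy cloud: one point sits directly above the query point.
    pts = np.array([[0, 0, 0], [0, 0, 5], [1, 0, 0], [3, 0, 0]], dtype=float)
    q = np.array([0.0, 0.0, 0.0])
    print(cylinder_neighbors(pts, q, 1.5))  # includes the elevated point
    print(sphere_neighbors(pts, q, 1.5))    # excludes it
    print(knn_neighbors(pts, q, 2))
```

In the toy cloud the elevated point is accepted by the cylinder but rejected by the sphere, which mirrors how a cylindrical neighborhood can mix points from different heights. Note that the query point itself is returned as its own neighbor in this sketch.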
4) Change of Curvature Based Optimal Neighbor Selection Method (SurVar): SurVar was first introduced in [18] for multiscale feature selection on point clouds. Given a fixed number of neighbor points, i.e., 1000, the 3-D covariance matrix of the neighbor points can be constructed to represent the geometric feature. The surface variation of the neighbor points can be calculated by

$$C_\lambda = \frac{\lambda_3}{\lambda_1 + \lambda_2 + \lambda_3}$$

where $\lambda_1 \geq \lambda_2 \geq \lambda_3 \geq 0$ are the eigenvalues of the 3-D covariance matrix of the neighbor points. The method is based on the change of curvature $C_\lambda$. As the scale of neighbor points increases, the method searches for the location at which $C_\lambda$ increases significantly, which determines the specific neighborhood size and the corresponding value of k. This algorithm is inspired by the fact that sudden jumps in surface variation indicate a significant deviation in the normal direction of the surface.

5) Consistent Curvature Level Based Optimal Neighbor Selection Method (ConCur): Belton and Lichti [19] proposed to select a critical scale of neighbor points that has a consistent curvature level among the kNN points. This mechanism implies that the variance of the curvature should be nominally zero. Accordingly, the variance of the curvature is computed over the neighbor points, where $\kappa$ denotes the surface variance. If data points are located close to an edge, the curvature values of their neighbor points should gradually increase.
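To make the change-of-curvature criterion used by SurVar concrete, the following sketch computes the surface variation from the eigenvalues of the neighborhood covariance matrix and tracks it over growing neighborhood sizes. The function names are hypothetical, and the rule of picking the candidate k just before the largest increase is a simplifying assumption for illustration, not the exact criterion of [18].

```python
import numpy as np

def surface_variation(neighbors):
    """Smallest covariance eigenvalue divided by the sum of all three."""
    cov = np.cov(np.asarray(neighbors, dtype=float), rowvar=False)
    # eigvalsh returns eigenvalues in ascending order; clip tiny negatives.
    eig = np.clip(np.linalg.eigvalsh(cov), 0.0, None)
    total = eig.sum()
    return float(eig[0] / total) if total > 0 else 0.0

def select_k_by_jump(sorted_neighbors, ks):
    """Return the candidate k just before the largest increase of the
    surface variation, together with the whole profile.

    sorted_neighbors: (n, 3) array ordered by distance to the query point.
    ks: increasing list of candidate neighborhood sizes (at least two).
    """
    profile = np.array([surface_variation(sorted_neighbors[:k]) for k in ks])
    jump = int(np.argmax(np.diff(profile)))
    return ks[jump], profile

if __name__ == "__main__":
    # Hypothetical neighborhood: six coplanar points, then two elevated ones.
    pts = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0],
                    [2, 0, 0], [0, 2, 0], [1, 1, 3], [2, 2, 3]], dtype=float)
    k, profile = select_k_by_jump(pts, [4, 6, 8])
    print(k, profile)
```

On the toy neighborhood the surface variation stays at zero while only coplanar points are included and rises once the elevated points enter, so the sketch keeps the neighborhood size that precedes the jump.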

6) Minimum of Eigenentropy Based Optimal Neighbor Selection Method (MinEig):
As proposed in [20], the MinEig optimal neighbor selection strategy searches for the minimum value of eigenentropy derived from a range of neighbor points (e.g., 10-1000) in accordance with the benchmark scale of neighbor points. The eigenentropy of the neighbor points can be calculated as follows:

$$E_\lambda = -\sum_{i=1}^{3} \tilde{\lambda}_i \ln \tilde{\lambda}_i$$

where $\tilde{\lambda}_i = \lambda_i / \sum_{j=1}^{3} \lambda_j$, $i \in \{1, 2, 3\}$, refers to the normalized eigenvalue, such that the normalized eigenvalues sum to one. According to [20], MinEig aims to select the critical number k of neighbor points that minimizes the eigenentropy $E_\lambda$ by sequentially increasing the value of k.
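The eigenentropy search can be sketched in a few lines of NumPy. The snippet below is an illustrative reading of the description above rather than the reference implementation of [20]; the function names and the candidate list `ks` are assumptions.

```python
import numpy as np

def eigenentropy(neighbors):
    """Shannon entropy of the normalized covariance eigenvalues."""
    cov = np.cov(np.asarray(neighbors, dtype=float), rowvar=False)
    eig = np.clip(np.linalg.eigvalsh(cov), 0.0, None)
    total = eig.sum()
    if total == 0:
        return 0.0  # all points coincide; no spread to measure
    e = eig / total
    e = e[e > 0]  # treat 0 * ln(0) as 0
    return float(-(e * np.log(e)).sum())

def min_eigenentropy_k(sorted_neighbors, ks):
    """Return the candidate k with the smallest eigenentropy and all values.

    sorted_neighbors: (n, 3) array ordered by distance to the query point.
    ks: candidate neighborhood sizes, e.g. range(10, 1001, 10).
    """
    values = [eigenentropy(sorted_neighbors[:k]) for k in ks]
    return ks[int(np.argmin(values))], values

if __name__ == "__main__":
    # Hypothetical neighborhood: four collinear points, then two scattered ones.
    pts = np.array([[0, 0, 0], [1, 0, 0], [2, 0, 0], [3, 0, 0],
                    [0, 3, 1], [1, -2, 2]], dtype=float)
    print(min_eigenentropy_k(pts, [4, 6]))
```

The eigenentropy is zero for a perfectly linear neighborhood and reaches ln 3 when the three eigenvalues are equal, so minimizing it favors the neighborhood size at which the local structure is most clearly linear, planar, or volumetric.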

E. Determination of an Optimal l Value
As we set the initial scale of neighbor points to a specific number, i.e., 1000, such a range is capable of covering all land cover types according to empirical knowledge of the study area. To implement the proposed MaxEnt, we first divide the range of the information into l parts, where l must be set manually. We therefore conduct an initial experiment to explore the impact of the parameter l on the classification performance of MaxEnt. We vary l from 10 to 150 in steps of 10 and validate the method's performance on the DSA-Dataset. As shown in Fig. 8, the classification accuracy gradually increases from l = 10 and reaches its peak at l = 90. The classification accuracy fluctuates slightly afterward, i.e., for l between 90 and 150. Meanwhile, the computation time gradually increases with l and rises dramatically when l is greater than 130. Accordingly, we set l to 90, which provides the best classification performance at a relatively low time cost. Subsequently, in accordance with Algorithm 1, we use l to determine the information threshold T. The neighbor points with an information difference δ smaller than the threshold T are then selected as the initial homogeneous neighbor points x , while the remaining points are regarded as the initial heterogeneous neighbor points x ϒ.
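Algorithm 1 is not reproduced in this section, so the following is only a hedged sketch of how a threshold T could be derived once the information range is divided into l parts. It uses the classic maximum entropy thresholding scheme in the style of Kapur et al. as an assumed stand-in: the histogram of information differences is split at every candidate bin edge, and the edge that maximizes the sum of the Shannon entropies of the two resulting groups is kept. The names `maxent_threshold` and `delta` are hypothetical.

```python
import numpy as np

def maxent_threshold(delta, l=90):
    """Pick a threshold that maximizes the summed entropy of two groups.

    delta: 1-D array of information differences of the neighbor points.
    l: number of parts into which the range of delta is divided.
    """
    counts, edges = np.histogram(delta, bins=l)
    p = counts / counts.sum()
    best_t, best_h = 1, -np.inf
    for t in range(1, l):
        p0, p1 = p[:t].sum(), p[t:].sum()
        if p0 == 0 or p1 == 0:
            continue  # one side would be empty
        q0 = p[:t][p[:t] > 0] / p0  # distribution below the candidate edge
        q1 = p[t:][p[t:] > 0] / p1  # distribution above the candidate edge
        h = -(q0 * np.log(q0)).sum() - (q1 * np.log(q1)).sum()
        if h > best_h:
            best_t, best_h = t, h
    return edges[best_t]

if __name__ == "__main__":
    # Hypothetical information differences, evenly spread for illustration.
    delta = np.arange(100, dtype=float)
    T = maxent_threshold(delta, l=10)
    homogeneous = delta < T      # candidates kept as homogeneous neighbors
    heterogeneous = ~homogeneous
    print(T, homogeneous.sum(), heterogeneous.sum())
```

A larger l gives a finer set of candidate thresholds at the cost of more entropy evaluations, which is consistent with the trade-off between accuracy and computation time described above.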

IV. QUANTITATIVE AND QUALITATIVE RESULTS
In this section, we present the quantitative and qualitative experimental results on the DSA-Dataset in Section IV-A and on the ESA-Dataset in Section IV-B with respect to the classification performance of MaxEnt and the six existing neighbor selection methods.

A. DSA-Dataset
The DSA-Dataset mainly covers the west side of the study area. Since detailed, tiny classes such as "Fence" and "Powerline" only appear in this region, this area is intentionally selected to examine the impact of MaxEnt on imbalanced classes. Table III shows the classification accuracy of the different neighbor selection algorithms based on nine commonly used machine learning classifiers. Across the classifiers, the proposed MaxEnt generates the best classification results in most cases. Aside from GNB, for which MaxEnt applied to channel 3's intensity generates the best result, i.e., 0.815, MaxEnt applied to both the elevation and intensity values yields the best classification results. The accuracy achieved by MaxEnt(All) ranges from the lowest value of 0.641 with QDA to the highest of 0.953 with AB.
In terms of classifiers, QDA achieves the worst performance. For the six neighbor selection methods, QDA results in an OA between 0.405 and 0.443. MaxEnt-based methods boost the accuracy by 0.1-0.2, leading to an accuracy between 0.525 and 0.641. The second worst performance is found with the kNN classifier. The six neighbor selection methods produce classification results between 0.532 and 0.679, while MaxEnt-based methods produce slightly better results, with accuracy from 0.608 to 0.730. The SVM, LDA, and DT classifiers present a similar range of OA regardless of the neighbor selection method. The OA is above 0.8 in most cases, except for the Cylinder-based method, which produces an OA of 0.679, 0.733, and 0.740 with the three respective classifiers. The corresponding best performance is found with MaxEnt(All), with an OA between 0.877 and 0.921. On the other hand, the OA of GNB mostly lies between 0.7 and 0.8, with Cylinder and ConCur generating the worst results with accuracies of 0.619 and 0.642, respectively.

TABLE III OA OF NINE CLASSIFIERS WITH RESPECT TO CORRESPONDING NEIGHBOR SELECTION METHODS ON THE DSA-DATASET

Although the best classification results of MLP occur with MaxEnt-based methods, the other six neighbor selection methods, unlike in the above-mentioned scenarios, occasionally outperform MaxEnt. For example, SurVar and ConCur produce an OA of 0.731 and 0.795, respectively. These results are significantly better than those of all the MaxEnt-based methods (0.598-0.693), except MaxEnt(All) with an OA of 0.814. RF and AB achieve the highest classification accuracy among the nine classifiers. Both yield an OA higher than 0.95 when using MaxEnt(All). The other MaxEnt-based methods also deliver satisfactory classification performance, with an OA over 0.9. The classification results of the six neighbor selection methods are also comparable. Except for the Cylinder-based method, which generates an accuracy slightly over 0.83, both Sphere and kNN achieve an accuracy better than 0.88. The remaining three methods, i.e., SurVar, ConCur, and MinEig, all generate an OA ranging from 0.893 to 0.923.
Specifically, the classification results derived by AB are extracted in Table IV because AB performs best among all classifiers. Table IV shows the AB-derived classification results of each land cover class with respect to the neighbor selection methods. Similar to the OA, the MaxEnt-based methods produce the best mF1, ranging from 0.793 to 0.860, with MaxEnt(All) yielding the highest. The six existing neighbor selection methods generate a slightly smaller mF1, between 0.675 and 0.729. Fig. 9 presents the corresponding confusion matrix generated by MaxEnt(All) with an AB classifier applied to the DSA-Dataset. The corresponding classification results of the DSA-Dataset based on the different neighbor selection methods are also shown in Fig. 10.
In terms of the land cover class "Road," MaxEnt-based methods slightly outperform all other existing neighbor selection methods, except ConCur, i.e., F1 score = 0.892. When all radiometric and geometric features are adopted in MaxEnt, the F1 score reaches its best value, i.e., 0.910, which is 1.2%-3.6% higher than that obtained with elevation or the intensity of a specific channel only. Similarly, MaxEnt-based methods derive the highest F1 scores in "Tree" and "Grass." For both classes, the corresponding accuracy is larger than 0.9, and incorporating all the elevation and intensity features drives the F1 score to its maximum, i.e., 0.986 in "Tree" and 0.960 in "Grass." Despite that, the performance of the other six neighbor selection methods compares favorably with MaxEnt, where the worst performance is found with the Cylinder method, with an F1 score of 0.883 in "Tree" and 0.794 in "Grass." Although MaxEnt also presents the best classification results for the classes "House" and "Fence," both obtain their best results when MaxEnt is applied to elevation only, i.e., F1 score = 0.955 and 0.588, compared with MaxEnt applied to all features, i.e., 0.940 and 0.501. Another notable issue is the performance on imbalanced classes, i.e., "Fence" and "Powerline." These classes are tiny objects in the study scene and occupy less than 1.3% ("Fence": 549, 0.305%; "Powerline": 2189, 1.216%) of the data points in the DSA-Dataset. The six neighbor selection methods perform poorly, with an F1 score between 0.044 (SurVar) and 0.266 (kNN) in "Fence" and between 0.198 (SurVar) and 0.418 (Cylinder) in "Powerline." On the other hand, MaxEnt improves the performance on these detailed, tiny objects by incorporating optimal contextual features. This leads to an improved F1 score ranging from 0.297 to 0.588 in "Fence" and from 0.609 to 0.733 in "Powerline."

B. ESA-Dataset
Table V shows the classification results of the different neighbor selection methods based on nine commonly used machine learning classifiers on the ESA-Dataset. Across all classifiers, the proposed MaxEnt-based methods achieve the best classification results, with OA between 0.680 and 0.968. Except for one case in which an accuracy of 0.849 is obtained by using Channel 2, all the best results occur when MaxEnt exploits all information, including elevation and the intensity of Channels 1-3, i.e., when MaxEnt(All) is adopted. Unlike the results on the DSA-Dataset, the best classification results are produced by RF instead of AB. Despite that, the performance of both classifiers is similar. The OA of MaxEnt-based methods ranges from 0.92 to 0.97. The six neighbor selection methods produce comparable results, with accuracy mostly higher than 0.9, except for the Cylinder-based method, whose accuracy is slightly below 0.9. In contrast, QDA produces the worst classification results, similar to the DSA-Dataset. The MaxEnt-based methods attain an accuracy between 0.524 and 0.680. The six neighbor selection methods generate significantly worse results. The OA of QDA with contextual features derived from Sphere and MinEig is 0.288 and 0.216, respectively. SurVar and kNN even result in an OA below 0.2.
The rest of the classification results can be categorized into two scenarios. SVM, DT, and LDA all obtain their best classification results with MaxEnt(All), with an OA ranging from 0.924 to 0.951. MaxEnt applied to the respective intensity and elevation information achieves an accuracy between 0.88 and 0.91 in all cases. Regarding the six neighbor selection methods, except for DT, which yields a slightly better classification accuracy, both SVM and LDA generate a slightly worse classification accuracy than MaxEnt-based methods, with an accuracy from 0.77 to 0.86. On the other hand, although the best results are derived by MaxEnt(All) with accuracy larger than 0.8, the other MaxEnt-based methods can no longer maintain such a high level of accuracy when kNN and MLP are adopted. Also, the six neighbor selection methods generate an OA between 0.671 and 0.833, which may occasionally outperform the MaxEnt-based methods.
Similar to the DSA-Dataset, we analyze the results of each individual land cover class produced by the best classifier, i.e., RF, as shown in Table VI. Since the ESA-Dataset covers the entire study area, whose east side includes Lake Ontario, an additional class "Water" appears alongside the six other land cover classes. Again, the best performance of each class is found when MaxEnt-based methods are used. Except for "House" and "Fence," whose best results are obtained by using MaxEnt with elevation, the other five classes all reach their highest accuracy when MaxEnt(All) is used.
"Tree" and "Water" share a common pattern in terms of the F1 score. Both obtain their highest F1 score, i.e., >0.99, with MaxEnt-based methods, while the other six neighbor selection methods produce highly comparable results that are close to those of the MaxEnt-based methods, i.e., better than 0.95. In terms of the class "Road," the performance of MaxEnt-based methods is equal to or slightly better than that of the six neighbor selection methods. The best result is found with MaxEnt(All), with an F1 score of 0.891, while the existing methods produce an F1 score ranging from 0.806 to 0.866. For "House" and "Grass," although the best results appear with MaxEnt(Elevation) and MaxEnt(All), respectively, with F1 scores better than 0.92, existing neighbor selection methods may occasionally surpass MaxEnt-based methods. For instance, kNN and MinEig produce an F1 score greater than or equal to 0.91, which is higher than most of the MaxEnt-based methods, except MaxEnt(All), for "House." Sphere, kNN, ConCur, and MinEig all generate better classification results in "Grass" than all MaxEnt-based methods except MaxEnt(All).
"Powerline" and "Fence" show the worst performance among the seven land cover classes, similar to the DSA-Dataset. MaxEnt-based methods lift the results of "Powerline," with F1 scores between 0.436 and 0.635. Existing neighbor selection methods perform poorly; all of them, except the Cylinder-based method, produce an F1 score lower than 0.1. For "Fence," MaxEnt-based methods only just exceed an F1 score of 0.2, except for MaxEnt applied to the intensity of Channel 2, which results in an F1 score of 0.149. The six neighbor selection methods all generate poor classification results on "Fence," with F1 scores ranging from 0.029 to 0.170. For the proposed MaxEnt algorithm, the classification results on the DSA-Dataset are slightly better than those on the ESA-Dataset. The corresponding confusion matrix generated by MaxEnt(All) using the RF classifier on the ESA-Dataset is presented in Fig. 13.

A. MaxEnt Versus Six Neighbor Selection Methods
MaxEnt consistently outperforms all six neighbor selection methods by looking for optimal neighbor points before feature extraction and point cloud classification. Compared with the neighbor selection methods based on a fixed number of neighbor points, i.e., Cylinder, Sphere, and kNN, MaxEnt improves the accuracy by 13.0%-19.1%, 10.6%-11.9%, and 11.3%-11.7%, respectively, averaged over the results derived from the nine classifiers on both datasets. A notable improvement is also observed when comparing MaxEnt with the remaining three optimal neighbor selection methods, i.e., SurVar, ConCur, and MinEig. The average accuracy improvement is 8.8%-11.3%, 7.3%-9.6%, and 10.4%-10.5%, respectively. Regardless of the classifier, MaxEnt, particularly MaxEnt(All), produces the best classification results most of the time. These results can be explained by the rationale of each neighbor selection method, which may only be applicable under certain circumstances.
Although none of the existing neighbor selection methods is comparable with MaxEnt, the Cylinder-based method performs the worst among all methods. This can be explained by the mechanism of Cylinder, which selects neighbor points located within the circular cross section along the z-direction. This may introduce heterogeneity of land covers, such as understory grass cover being included as optimal neighbors of tree canopies. The Sphere and kNN methods use a fixed number of neighbor points, which may also induce heterogeneity in the selected neighbor points. Regarding the remaining three optimal neighbor selection methods, SurVar concentrates on detecting and extracting linear-type features on a point-sampled surface and selects homogeneous neighbor points located within the curved boundaries generated by line-type features. ConCur and MinEig also aim to detect whether a data point is located on a linear feature, boundary, or edge, or within a surface. Since airborne LiDAR point clouds collected over terrain with a mixture of land covers generally lack distinct surfaces or regular geometry, except for roads and buildings, these methods are inefficient in finding a surface of optimal neighbor points. MaxEnt, on the other hand, looks for optimal neighbor points by grouping those with highly similar intensity values and elevation via computing the sum of Shannon entropy. The maximum of the sum thus implies the largest amount of information retained therein. Therefore, MaxEnt is not limited to specific types of scenes or land cover patterns.

B. Merit of Using Multispectral LiDAR Intensity in Inferring Optimal Neighbor Points
Existing studies have demonstrated that the use of multispectral LiDAR intensity can improve the estimation of forest metrics and land cover classification [5], [33]. In our experiment, multispectral LiDAR intensity further improves the inference of optimal neighbor points by intersecting the resulting homogeneous neighbor points that achieve the MaxEnt. Averaged over all the classification results, except GNB, MaxEnt(All) improves the classification accuracy by 3.7%-5.7%, 6.3%-8.2%, 4.0%-6.1%, and 5.5%-7.7% compared with MaxEnt applied individually to elevation and to the intensity of channels 1-3, respectively.
In terms of land cover classes, MaxEnt(All) produces better classification results than the rest of the MaxEnt methods in all classes, except "House" and "Fence," for which MaxEnt(Elevation) achieves a better result. Specifically, a significant improvement is found for "Powerline." MaxEnt(All) improves the F1 score by 3.8%-5.3%, 12.4%-16.1%, 10.2%-13.8%, and 5.5%-19.9% compared with MaxEnt applied to elevation and to the three respective channels. MaxEnt(All) also benefits the land cover class "Grass," lifting the F1 score by 5.2%-11.6%, 4.4%-14.6%, 4.8%-13.3%, and 5.8%-13.2% with respect to the other four MaxEnt combinations. The F1 score improvement provided by MaxEnt(All), though not comparable to those of "House" and "Fence," ranges from 1% to 6%. There is no notable difference among the results of the five MaxEnt-based methods in "Water," as the F1 score is already better than 99%. In short, the experimental work further justifies the merit of multispectral LiDAR intensity, which can aid in inferring optimal contextual features for point cloud classification.

C. Impact of MaxEnt on Imbalanced Classes
Since both "Powerline" and "Fence" occupy less than 1.3% and 0.1% of the data points in the DSA- and ESA-Dataset, respectively, these imbalanced classes certainly have a negative impact on the classification results. Nevertheless, the proposed MaxEnt can help find optimal neighbor points for these classes and achieves significantly better results than the six other neighbor selection methods. For instance, ConCur achieves the worst classification results among the six neighbor selection methods, with an accuracy of 0.014 in "Powerline" and 0.011 in "Fence." MaxEnt(All) lifts the F1 score by 31.5%-62.1% in the case of "Powerline," while MaxEnt(Elevation) improves the F1 score by 17.1%-54.4% in the case of "Fence." A notable difference among the classification results of these two classes can also be seen in Fig. 10. Therefore, our proposed MaxEnt achieves both a high OA and a high F1 score compared with other existing methods and handles the class imbalance problem well.

D. Comparison to Deep Learning-Based Method
Apart from the classic machine learning classifiers, we also employ PointNet++ [30] as a benchmark deep learning-based method for comparison with our proposed approach. We extract 48%-65% of the data points as training data with all features to train PointNet++ and leave the remaining 35%-52% as testing data on the DSA- and ESA-Dataset. Table VII shows the corresponding F1 score of each individual land cover class, OA, mF1, and mIoU of the classification results generated by PointNet++. The results show that PointNet++ can produce accurate classification results, with OA of 90%-92.5%, mF1 score ranging from 0.737 to 0.802, and mIoU roughly between 0.6 and 0.7 on both datasets. However, our proposed MaxEnt(All) still outperforms the deep learning-based method with higher OA (by 4%-5%), F1 score (by 0.06-0.1), and mIoU (by 0.085-0.1) (refer to Tables IV and VI). Although the deep learning-based method is effective in classifying LiDAR point clouds, it also requires a large amount of training data. Compared with the deep learning-based method, our proposed method only requires a small amount of training data while generating better classification results.

VI. CONCLUSION
Multispectral airborne LiDAR data overcome the drawback of existing monochromatic airborne LiDAR systems, which lack rich spectral information. To maximize the benefits of using multispectral LiDAR intensity data, in this study, we propose an optimal neighbor selection method (MaxEnt) to select sufficient homogeneous neighbor points to aid in point cloud classification. The proposed MaxEnt is built upon the MaxEnt principle to adaptively select homogeneous neighbor points and remove heterogeneous neighbor points from a fixed scale of original neighbor points. The MaxEnt principle states that the largest sum of the entropies of the probability distributions of homogeneous and heterogeneous data provides maximum information discrimination; therefore, the derived probability distribution can aid in avoiding interfering information, reducing information redundancy, and enhancing information diversity from the neighbor points. As a result, MaxEnt can select sufficient homogeneous neighbor points while delineating and removing the interfering heterogeneous neighbor points. After the optimal neighbor points are selected, feature vectors can be constructed by using the geometric and radiometric information extracted from these homogeneous neighbor points. The extracted feature vectors then serve as the input for nine commonly used machine learning classifiers for point cloud classification, and each individual point is assigned a land cover category. The experimental results indicate that MaxEnt outperforms other existing neighbor selection methods for point cloud classification, with an OA improvement of 7.3%-19.1%. Moreover, multispectral LiDAR intensity together with elevation can aid in inferring optimal neighbor points. It improves the classification accuracy by 3.7%-8.2% compared with MaxEnt applied to only the elevation or intensity. Also, MaxEnt is shown to be more appropriate for classifying point clouds collected over study areas with detailed and tiny objects, such as "Powerline" and "Fence." The F1 score is significantly lifted by 17.1%-62.1% compared with the results derived by the six existing neighbor selection methods. In short, MaxEnt achieves a higher accuracy and mF1 score than other methods and thus shows a better capability. Future work will focus on incorporating the mechanism of MaxEnt in point cloud convolution to embrace the use of deep neural networks for semantic segmentation.

Fig. 2. Illustration of optimal neighbor selection results of seven land cover classes. (1a)-(7a) $x_n$ with a red star together with the surrounding $k$ points $x_c$ (for example, $k = 6000$); (1b)-(7b) the corresponding $k'$ optimal neighbor points ($x'$) derived by MaxEnt based on $z$, $I_1$, $I_2$, and $I_3$.

Fig. 3. (a) Aerial image and the spatial extent of the study area. (b) Multispectral LiDAR intensity data shown on the 3-D point clouds. (c) 2-D view showing the manually labeled point clouds with predefined land cover classes.

Fig. 6. (a) Original LiDAR point cloud and (b) point cloud after height normalization using cloth simulation.

Fig. 7. Illustration of results derived from six existing neighbor selection methods. (1a)-(6a) $x_n$ with a red star together with the surrounding $k$ points $x_c$ (for example, $k = 6000$); (1b)-(6b) the corresponding $k'$ optimal neighbor points derived.

Fig. 9. Confusion matrix generated by MaxEnt(All) based on the AB classifier on the DSA-Dataset.

Fig. 13. Confusion matrix generated by MaxEnt(All) based on the RF classifier on the ESA-Dataset.
Graduate Student Member, IEEE, Wai Yeung Yan, Senior Member, IEEE, and Derek D. Lichti

TABLE II SUMMARY OF THE MULTISPECTRAL AIRBORNE LIDAR DATASET

and 3 731 276, respectively. The study area includes various land cover classes, which are predefined and manually labeled into seven categories [see Fig.

TABLE IV F1 SCORE OF INDIVIDUAL LAND COVER CLASS, OA, MF1, AND MIOU USING AB ON THE DSA-DATASET

TABLE V OA OF NINE CLASSIFIERS WITH RESPECT TO CORRESPONDING NEIGHBOR SELECTION METHODS ON THE ESA-DATASET

TABLE VI F1 SCORE OF INDIVIDUAL LAND COVER CLASS, OA, MF1, AND MIOU USING RFS ON THE ESA-DATASET

TABLE VII F1 SCORE OF INDIVIDUAL LAND COVER CLASS, OA, MF1, AND MIOU USING POINTNET++ [30] ON THE DSA- AND ESA-DATASET