Classification of Terrestrial Lidar Data Directly From Digitized Echo Waveforms

Information derived from full-waveform (FW) light detection and ranging (lidar) data has already been shown to be relevant for point cloud analysis tasks. Relevant waveform attributes to populate the corresponding point’s feature vector are typically provided through a post-processing FW analysis (FWA) technique based on fitting the echo waveform with a parametric function describing the shape and location of the echo pulse in the waveform. Samples of the digitized echo are the primary source for any waveform analysis using parametric functions. On the other hand, for some FW lidar scanning systems, describing the complex system response model using a simple parametric function seems challenging or impractical. Earlier studies have shown the potential of a waveform’s digital samples as relevant waveform attributes for point cloud classification. The main goal of this study is to extend earlier experiments on direct exploitation of returned waveform signals collected by a FW terrestrial laser scanning (TLS) system to multireturn waveform signals for point cloud classification in a built environment. Furthermore, the classification performance on feature vectors containing calibrated waveform attributes, derived from a waveform processing approach performed in real-time by the FW TLS system, is evaluated on multiple-echo waveforms and compared with the classification performance derived from the proposed FW data classification technique via deep learning. Classification performance derived through the proposed technique demonstrates high information content of raw digitized waveform samples. Results show that feature vectors containing samples of digitized echoes carry more information about the physical properties of the target than those containing calibrated waveform attributes.

I. INTRODUCTION
TERRESTRIAL laser scanning (TLS) systems are increasingly being used for topographic and land cover mapping [1], [2], [3]. There are two distinct techniques used in both airborne and terrestrial lidar (light detection and ranging) systems, based on how the echo pulse is recorded: discrete return and full-waveform (FW) lidar systems.
Traditional discrete return lidar systems use a hardware-based echo detection technique. The number of detected echoes in the returned signal and their detection times depend on the detection method implemented by the hardware [4]. The travel time of each detected echo pulse is then converted into range information. Knowing the sensor position and the direction of the transmitted laser pulse, the 3-D coordinates of the reflecting target are calculated. In discrete return systems, the amplitude of the echo pulse is proportional to the target reflectance at the wavelength of the laser beam. The data collected by a laser scanning system are a discrete sample of the real world, represented by a dense set of points called a 3-D point cloud. Points can later be rendered in false color according to the intensity of the echo pulse corresponding to each individual point.
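The range-and-direction-to-coordinates step described above can be sketched as follows (a minimal illustration; the axis conventions and function name are ours, not the scanner's):

```python
import math

def polar_to_xyz(r, azimuth_deg, zenith_deg):
    """Convert a range measurement and beam direction (spherical
    coordinates in an assumed scanner frame) into the 3-D Cartesian
    coordinates of the reflecting target."""
    az = math.radians(azimuth_deg)   # horizontal angle from the +y axis
    ze = math.radians(zenith_deg)    # angle from the vertical (+z) axis
    x = r * math.sin(ze) * math.sin(az)
    y = r * math.sin(ze) * math.cos(az)
    z = r * math.cos(ze)
    return x, y, z

# A 10 m return fired horizontally (zenith 90 degrees) along +y:
x, y, z = polar_to_xyz(10.0, 0.0, 90.0)
```

In practice the sensor position and orientation must also be applied to move these scanner-frame coordinates into a project or geodetic frame.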
FW lidar, on the other hand, is an advancement in data acquisition and recording which has been the subject of research for the past 20 years [5], [6]. In contrast to discrete return systems, for each transmitted laser pulse, FW lidar digitally samples the full temporal energy profile of the returned signal (waveform) with a very high sampling rate (typically 500 MHz-2 GHz) [4]. Each echo of the digitized waveform is stored as a series of digital numbers (DNs). Each DN (or digital sample) represents the intensity of the echo at a certain time within the echo waveform. FW data provide additional information about the scattering properties of the target, which may be relevant for lidar point cloud analyses [6].
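As a minimal sketch of how the DN series encodes range, assuming a 1 GHz digitizer (one DN per nanosecond) and a single well-separated echo, the peak sample index can be converted into a round-trip travel time and then a range (values are illustrative, not from any instrument):

```python
C = 299_792_458.0   # speed of light, m/s
DT = 1e-9           # assumed 1 GHz digitizer -> 1 ns per sample

def peak_range(dns, t0=0.0):
    """One-way range to the target from the peak DN of a digitized
    echo. t0 is the pulse emission time in seconds (0 if the
    waveform is recorded relative to emission)."""
    peak_idx = max(range(len(dns)), key=lambda i: dns[i])
    travel_time = t0 + peak_idx * DT   # round-trip time of flight
    return C * travel_time / 2.0       # halve for the one-way range

waveform = [2, 3, 5, 40, 120, 80, 20, 6, 3, 2]   # illustrative DNs
# peak at index 4 -> 4 ns round trip -> roughly 0.6 m
```

Real systems refine the peak location to sub-sample precision before this conversion; the point here is only that the raw DN series carries the timing and intensity information jointly.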
Since the introduction of FW airborne lidar systems, different FW analysis (FWA) approaches have been explored to derive information about the scattering properties of the target to enhance the underlying lidar point cloud analyses. Currently, the most popular FWA technique is based on the decomposition and modeling of the digitized echo waveform using parametric functions such as the well-known generalized Gaussian function [4], [7].
Compared with discrete returns, applying FWA on FW lidar data usually results in a denser and more accurate representation of the study area, as well as the ability to estimate some geometric properties, such as roughness, slope, and spatial distribution in the laser beam cone of diffraction. Moreover, radiometric calibration of the waveform amplitude enables the extraction of valuable information about the reflectance properties of the target, such as its backscattering cross section and backscattering coefficient [8], [9]. The geometric and radiometric information derived from analyzing waveform signals has already been shown to be valuable for enhancing challenging lidar point cloud processing tasks such as classification, segmentation, and filtering [4], [8], [10], [11].
Although a large number of studies have focused on different FWA approaches and derived waveform attributes from the modeled echo waveform to characterize target properties, the information content of the raw waveform signal, without a priori decomposition into target attributes, has not been explored. This is more critical in FW TLS systems, where the traditional FWA approaches, similar to those applied on FW data acquired by airborne lidar systems, may not be applicable [35], [36], [37].
FW TLS systems are lidar instruments of high dynamic range that may be equipped with an echo digitizer of lower digitization rate relative to common airborne FW lidar systems. The high dynamic range of the TLS targets (due to the large range envelope) and central obscuration [28] lead to nonlinearity in the scale characteristics of the system response and a highly non-Gaussian echo shape [36]. As a result, standard shape-fitting techniques used on airborne FW lidar data to extract relevant waveform attributes for point cloud analyses may not be applicable to FW TLS data. On the other hand, even in lidar systems with a known system response model and linear scale characteristics, selecting an appropriate parametric model for fitting the waveform may have limitations. For example, the parametric model is required to be represented by a set of parameters, each of which is associated with a specific characteristic of the target [4], [29]. In addition, the accuracy of fitting the waveform depends on the sampling interval of the digitized waveform, which is governed by the sampling rate of the digitizer.

A. FW Analysis
FW lidar systems usually provide the end-user access to the raw digitized waveform for further analyses. The main objective of waveform analysis is to extract information about the scattering properties of the target(s) in the laser cone of diffraction. By designing a signal fitting algorithm and assuming that the waveform is composed of a series of components (echoes), waveform decomposition allows an accurate localization of echoes in the measured waveform, from which the distance to each target is calculated. Waveform decomposition, usually, leads to maximum target determination and localization, beyond the range resolution of the lidar system [38], [39].
In FW airborne laser scanning (ALS) systems, the underlying assumption is that the scattering properties of a cluster of targets can be described by a mathematical signal, and therefore parametric function is normally fit to the waveform in offline (post-processing) mode [7], [40]. A range of parametric functions may be explored to improve decomposition and modeling [41], [42], [43]. The choice of parametric function(s) to fit to the waveform for target localization and extracting target properties correlated with the parameters describing the shape of the echo waveform is guided by the characteristics of the underlying sensor system response, which is determined by the convolution of the system impulse response with the transmitted laser pulse [36]. The more closely the chosen parametric function approximates the system response model, the more accurately a waveform signal can be decomposed into its constituent echoes. Moreover, uncertainties in localizing each detected echo in the waveform and in evaluating echo parameters, such as echo amplitude and width, are governed by the chosen parametric function [22], [24], [36], [44].
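To make the parametric-fitting idea concrete, here is a minimal sketch under a strong assumption: a single, noise-free Gaussian echo sampled on a unit grid. Because the logarithm of a Gaussian is a parabola, the sub-sample peak position and width can be recovered from just the three samples around the maximum (full Gaussian decomposition as in [7], [40] generalizes this to sums of echoes via nonlinear least squares; the function name is ours):

```python
import math

def gaussian_peak_refine(y):
    """Sub-sample peak location and width of a sampled Gaussian echo.
    y: samples on a unit-spaced grid. Returns (mu, sigma).
    Fits a parabola to the logs of the three samples around the peak."""
    p = max(range(1, len(y) - 1), key=lambda i: y[i])
    lm, l0, lp = math.log(y[p - 1]), math.log(y[p]), math.log(y[p + 1])
    curv = lm - 2.0 * l0 + lp          # second difference; negative at a peak
    delta = 0.5 * (lm - lp) / curv     # sub-sample offset of the vertex
    sigma = math.sqrt(-1.0 / curv)     # standard deviation of the Gaussian
    return p + delta, sigma

# Noise-free Gaussian echo centred at sample 10.3 with sigma = 2:
y = [math.exp(-(t - 10.3) ** 2 / (2 * 2.0 ** 2)) for t in range(21)]
```

For a true Gaussian this recovers the parameters exactly; for distorted or merged echoes, which is precisely the TLS situation discussed below, the assumption breaks down and the estimates become biased.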
Commercial and experimental FW TLS systems are now available [45]. However, unlike their airborne counterparts with single-channel detection, TLS systems must contend with a much higher dynamic range of the laser power backscattered from very near to very far target distances. Nonlinearity in the scale characteristics of the system response of the TLS system can increase the uncertainty in fitting the echo waveform [28], [36]. Moreover, given the lower digitization rate in FW TLS systems (usually 500 MHz) than in FW ALS systems (usually 1 GHz), accurate waveform decomposition and modeling of the echo signal using typical parametric functions tend to be more challenging [36].
To cope with those limitations, Riegl Laser Measurement Systems, GmbH, has introduced V-Line lidar systems with waveform digitization and real-time FWA [45]. Rather than using complex parametric functions to capture variable echo shapes, the system response spanning the entire dynamic range of the lidar system is measured and stored in the instrument [28], [36]. This calibrated system response model is subsequently used to accurately decompose the measured waveform signals. Echo digitization and online waveform processing in such lidar systems not only enables accurate and precise ranging but also assigns valuable waveform-derived attributes including calibrated amplitude, calibrated relative reflectance, and pulse shape deviation to each measurement. While the first two waveform attributes provide information about the radiometric properties of the target, pulse shape deviation describes the degree of distortion of the echo pulse, due to its interaction with multiple close-by targets, slanted or distributed targets.
Compared with offline (post-processing) FWA, online waveform processing may yield similar or even more precise and accurate echo modeling for single-echo waveforms, but it has limitations dealing with overlapping echoes resulting from multitarget settings [46]. Furthermore, due to limited computational power in commercially available FW TLS systems, applying more sophisticated waveform analyses to extract waveform attributes, such as echo width, is not feasible [34].

B. Study Purpose
This article investigates the utility of the raw digitized waveforms recorded by a FW TLS system for multiclass point cloud classification. The main goal is to extend an earlier experiment, carried out by Pashaei et al. [37], which directly exploited the waveform for point cloud classification. Earlier experiments demonstrated the usefulness of exploiting digital samples of recorded echoes for direct classification of lidar points corresponding to single-echo waveforms. This study accounts for multireturn waveform signals, where waveforms are collected by a FW TLS system over a built environment.
The lidar system used in this study, a Riegl VZ-2000i FW TLS, not only digitizes and records the raw echo waveform for post-acquisition analysis but also provides calibrated waveform-derived attributes through a real-time FWA approach called online waveform processing [45], [46]. The online waveform processing performs a proprietary FWA procedure (in real-time) according to the estimated lidar system response measured and stored during calibration in the instrument by the manufacturer [36].
We hypothesize that the information content of the raw waveform signal can be exploited, without parametric fitting, to provide target scattering information for challenging point cloud analysis tasks, such as filtering, classification, and segmentation, at the same or higher accuracy as classification using waveform attributes derived from online waveform processing or, more typically, from post-processing FWA techniques that fit the waveform using parametric functions. This hypothesis is based on the fact that samples of the digitized waveform are the primary source of data for retrieving the system response model and extracting waveform attributes. On the other hand, uncertainty in recovering the real lidar system response or in fitting an appropriate parametric function may introduce additional uncertainty into the waveform attributes derived from each detected echo. Moreover, similar to FWA in post-processing mode using parametric fitting, the online waveform processing in Riegl V-Line systems provides a limited number of waveform attributes that may not be sufficient and/or discriminative enough for effective classification. Studies also show that engineering relevant waveform attributes for point cloud analyses is not a trivial task [29], [34].

In this work, the feature vector associated with each point in the TLS point cloud is constructed from the digital samples recorded for the corresponding echo waveform. Due to the high correlation between the intensity of the echo waveform and the distance to the target, and due to the importance of range in radiometric calibration of the waveform, the range to the target is used as the only geometric attribute in the feature vector of each point.
To evaluate the information content of the raw digitized waveform, the constructed waveform feature vectors are used for a TLS point cloud multiclass classification within a built environment, where a deep convolutional neural network (DCNN) model is implemented and used for automatic feature learning and feature vector classification. The classification performance is compared with that based on the analysis of calibrated online waveform feature vectors, provided by the online waveform processing, within the same DCNN model.
The contributions of this study are twofold: 1) regarding the promising results achieved in the earlier study [37], the utility of the digitized waveform samples (DNs) collected by a FW TLS system is explored for direct echo (point) classification in a multiple-echo waveform scenario rather than limiting to only single-echo waveforms; and 2) calibrated waveform features for each echo, derived from online waveform processing, are examined for multiclass point (echo) classification, and classification performance is compared with the performance achieved through classification of the feature vectors constructed from the raw digitized waveform samples of the corresponding echo.

II. STUDY AREA AND DATASET
A part of the Texas A&M University-Corpus Christi, TX, USA, campus with an area of 94 600 m² was selected as the study area, which includes both natural and man-made structures. Two criteria were considered for selecting distinct target classes. Each selected target class should be located at a variety of distances (ranges) from the scanner. In addition, the number of observations for each class needs to be large enough for optimal training of the DCNN-based classifier. In this study, seven target categories were selected for multiclass point cloud classification: asphalt roads, buildings, grass fields, tree canopy, tree trunks, light poles, and cars.
FW TLS data were collected using a Riegl VZ-2000i FW TLS system on August 31, 2021, from six different TLS positions. Fig. 1 illustrates the study area with two different views of the co-registered point cloud. Fig. 1(a) shows the side view of the study area represented by the co-registered point cloud color-coded based on the recorded intensity for echoes (points). The top view of the study area represented by the co-registered point cloud color-coded based on the point's height is shown in Fig. 1(b). Circles in Fig. 1(b) represent scan positions over the study area. Having multiple scan positions within the study area is critical for evaluating the robustness of the proposed approach. Instances for training and testing the underlying classifier can be chosen from either separate or a combination of scan positions. Furthermore, collecting FW TLS data from multiple scan positions significantly decreases correlations between the shape of the return waveform and the geometric properties of the target.
The waveform data and the corresponding point cloud at each scan position were collected in panoramic mode with a 360° horizontal field-of-view (FOV) and 100° (from −40° to +60°) vertical FOV using the scanner's high-speed acquisition mode with FW recording turned on. The pulse repetition rate (PRR) was set to 300 kHz, corresponding to 122 000 measurements per second, and the minimum stepping angle was set to 0.0024°, equivalent to 4 mm point spacing at 100 m. Registration and fine alignment of individual scan positions into a cohesive point cloud were performed with the Riegl RiSCAN PRO software package, version 2.12.1, using the multistation adjustment (MSA) plugin. MSA results reported by the RiSCAN PRO software show that the final horizontal and vertical precision of TLS scan co-registration was 0.006 and 0.004 m, respectively, with angular precision better than 0.004° for all the angular parameters. The co-registered point cloud was then georeferenced using the Riegl VZ-2000i's integrated RTK GNSS receiver, which received corrections from the Texas Department of Transportation (TxDOT) real-time network (RTN) during data acquisition. This approach typically provides absolute positional accuracy down to a few centimeters. Spatial referencing was set to North American Datum of 1983 (NAD83), National Adjustment 2011, State Plane Coordinate System, Texas South Zone, for the horizontal point cloud coordinates. Vertical coordinates were referenced to the NAD83 ellipsoid.
Technical specifications of the FW TLS system are given in Table I. The TLS system digitizes and records the entire echo waveform at a sampling rate of 500 MHz (one sample per 2 ns). The lidar system also provides waveform-derived attributes, including calibrated amplitude, calibrated relative reflectance, and pulse shape deviation, using the calibrated system response model and online waveform processing. Fig. 2 illustrates a digitized waveform signal including two echoes. For a point (echo) derived from a multiple-echo waveform, however, the detected echo may be merged with one or more nearby echoes located in the same waveform signal. Merged echoes occur when more than one target lies in the path of the transmitted laser beam at separations shorter than the range resolution of the laser pulse. Merged echoes in waveform signals make the definition of the corresponding feature vectors more challenging. For example, they make it difficult to construct feature vectors of a fixed size for each detected echo due to the variable number of digital samples for the merged echo.
As a remedy for the problem described above, it is possible to explore the optimum (minimum) number of digital samples required to represent a digitized echo for classification. The minimum number of digital samples required to optimally describe a digitized echo is used to construct waveform feature vectors of a certain size to be fed into the classification model.
The experiment is carried out for both the single- and multiple-echo waveforms to determine the minimum number of digital samples required to represent each echo in a waveform. For waveforms including merged echoes, this technique helps reduce the contribution of digital samples related to nearby echoes in the feature vector constructed for the underlying echo. Thus, the experiment to find the optimum (minimum) number of digital samples in a feature vector includes two classification scenarios: one for single-echo waveform data and the other for multiple-echo waveform data. In each experiment, feature vectors including different numbers of digital samples are constructed and classified. The optimum size of the waveform feature vector for classification is determined by comparing the classification performance for waveform feature vectors of different sizes.
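One way to realize the fixed-size feature vectors described above is a window of n samples centred on the detected echo's peak, zero-padded at the waveform boundaries (a sketch under our own conventions; the peak index is assumed to come from the system's echo detection):

```python
def echo_window(dns, peak_idx, n):
    """Extract a fixed-size window of n digital samples centred on a
    detected echo's peak sample. Positions outside the recorded
    waveform are zero-padded, so every echo yields a vector of the
    same length and the contribution of more distant samples
    (including nearby merged echoes) is limited."""
    half = n // 2
    window = []
    for i in range(peak_idx - half, peak_idx - half + n):
        window.append(dns[i] if 0 <= i < len(dns) else 0)
    return window

wf = [1, 2, 6, 30, 90, 55, 12, 4, 2, 1]   # illustrative DNs
# 5-sample window around the peak at index 4:
# echo_window(wf, 4, 5) -> [6, 30, 90, 55, 12]
```

Sweeping n over a set of candidate sizes and classifying the resulting vectors is then exactly the experiment described in the text.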

A. DCNN Architecture for FW TLS Data Classification
Since their introduction, DCNN architectures have significantly outperformed almost all traditional machine learning approaches for classification and segmentation tasks in an end-to-end manner [47]. While a large number of DCNN architectures have been developed for image and 3-D point cloud classification, the potential of DCNN architectures has not been fully explored for FW classification [48], [49].
In this experiment, the same DCNN architecture proposed in the earlier work, accomplished by Pashaei et al. [37], is used for point cloud (echo waveform) classification. The DCNN architecture is shown in Fig. 3. Online waveform feature vectors including a set of calibrated waveform attributes or an offline waveform feature vector including a subset of the digital samples (DNs) representing the digitized echo may be input to the DCNN.
The input to the network is a data matrix of size N × M, where N is the number of input instances simultaneously fed to the network for classification and M is the number of elements in the input feature vector. Referring to Fig. 3, the first block of the proposed DCNN architecture takes the 1-D input data and computes local features for each input vector using three 1-D convolutional kernels of size 1 × 1 with batch normalization. Each convolutional layer is then followed by a nonlinear activation function, such as the ReLU

f(x) = \max(0, Wx + b)

where x is the input vector or the feature vector computed in the earlier convolutional layer, W is the learnable weight parameter, and b is the bias parameter. Local features derived in the first convolutional block are fed into a max pooling layer to extract global features from the input feature vectors. The second part of the network concatenates the input vector with both the local and global feature vectors, and the resulting vector is fed to the second set of convolutional layers, where three 1-D kernels of size 1 × 1 with batch normalization and the ReLU activation function are applied to each individual input feature vector. To classify the input data, the feature vector resulting from the last convolutional layer is fed into the classifier defined on top of the DCNN architecture, where the class probability is calculated for each individual input vector by the softmax layer as

p_i = \frac{e^{y_i}}{\sum_{c=1}^{C} e^{y_c}}

where p_i is the probability of class i with output value y_i, and C is the total number of classes.

Because FW TLS data may include imbalanced numbers of instances for different classes, the DCNN model uses a weighted categorical cross-entropy loss for training. The loss function is formulated as

L_{CE} = -\sum_{n=1}^{N} \sum_{c=1}^{C} W_c \, t_{n,c} \log(Y_{n,c})

where L_{CE} is the categorical cross-entropy loss, t_{n,c} is the ground-truth value in one-hot vector representation, and Y_{n,c} is the predicted probability of class c for the input vector n. W_c is the weight for class c, which can be defined as

W_c = 1 - \frac{a}{b}

where a is the number of instances of the same target category and b is the total number of instances in all the target categories.

TABLE II: Total number of ground-truth instances generated from the collected TLS dataset over the study area.

TABLE III: Total number of instances for training and testing the proposed DCNN classifier under each scenario. Training and testing instances are random samples of the ground-truth data given in Table II.
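A minimal NumPy sketch of the classifier head and loss described above — softmax probabilities and a class-weighted cross-entropy. The weight form W_c = 1 − a_c/b is one common choice for down-weighting majority classes and is our reading of the text, not a quoted formula:

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax: p_i = exp(y_i) / sum_c exp(y_c)."""
    e = np.exp(logits - logits.max(axis=1, keepdims=True))  # shift for stability
    return e / e.sum(axis=1, keepdims=True)

def class_weights(counts):
    """Assumed weighting W_c = 1 - a_c / b, where a_c is the number of
    instances of class c and b is the total over all classes."""
    counts = np.asarray(counts, dtype=float)
    return 1.0 - counts / counts.sum()

def weighted_cross_entropy(logits, onehot, weights):
    """-sum_c W_c * t_{n,c} * log(Y_{n,c}), averaged over the N inputs."""
    probs = softmax(logits)
    return -np.mean(np.sum(weights * onehot * np.log(probs), axis=1))

# Illustration: three classes with counts 100, 100, 200
w = class_weights([100, 100, 200])        # -> [0.75, 0.75, 0.5]
loss = weighted_cross_entropy(np.log(np.array([[0.7, 0.2, 0.1]])),
                              np.array([[1.0, 0.0, 0.0]]),
                              np.ones(3))  # -> -log(0.7)
```

In a deep learning framework the same weighting is usually passed directly to the built-in cross-entropy loss rather than written by hand.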

B. Ground Truth for Point Cloud Classification
Ground-truth instances are required due to the use of supervised classification. By manual inspection of the co-registered point cloud, points representing the seven different target categories: asphalt road, building, grass, tree canopy, tree trunk, light pole, and car were extracted from the original point cloud. The total number of points belonging to each target category derived from single-echo and multiple-echo waveforms is given in Table II. Ground-truth instances for training and testing the classifier were generated by random sampling from the total number of ground-truth instances provided for each target category.
1) Training and Testing Datasets for Single-Echo Waveforms: To perform the first classification scenario, i.e., multiclass classification on points derived from single-echo waveforms to determine the minimum number of digital samples required to optimally represent the underlying echo with its feature vector, 200 000 instances were randomly sampled from the total available single-echo ground-truth instances for each target category given in Table II. From the sampled instances, 50% were used for training while the remaining 50% were used for testing. For a more accurate estimation of the classification performance and robust determination of the optimum number of waveform samples in the waveform feature vectors, ten separate sets of the training and testing datasets were generated as above and the classification performance was reported through averaging classification performance for all the testing datasets.
The optimum number of waveform samples in waveform feature vectors is examined by constructing feature vectors containing different numbers of digital waveform samples.

2) Training and Testing Datasets for Multiple-Echo Waveforms: To perform multiclass classification on multiple-echo waveforms, 200 000 instances related to multiple-echo waveforms for each target category, given in Table II, were randomly selected, where 100 000 were used for training and the remaining 100 000 were used for testing. The same procedure used in the first classification scenario for identifying the optimum number of required samples in waveform feature vectors was also performed for the multiple-echo waveform dataset. Table III illustrates the total number of ground-truth instances and the number of instances in each target category for training and testing the classifier under each scenario.

For the sake of completeness, and to provide a baseline for evaluating classification performance, point cloud classification based on the raw digitized waveform samples is compared with the classification performance of the same TLS point cloud data using feature vectors containing calibrated online waveform attributes for each point in the point cloud. Thus, for each point in the training or testing datasets, an online waveform feature vector is constructed which includes the range of the point, calibrated amplitude (in dB), calibrated relative reflectance (in dB), and pulse shape deviation (unitless).
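The dataset construction above can be sketched as follows (array layout, field order, and function names are our illustration, not the instrument's data format):

```python
import numpy as np

def online_feature_vector(range_m, amp_db, refl_db, dev):
    """Online waveform feature vector for one point: range plus the
    three calibrated attributes from online waveform processing."""
    return np.array([range_m, amp_db, refl_db, dev])

def split_half(instances, rng):
    """Random 50/50 split of one class's instances into training and
    testing sets, as used to build each of the repeated datasets."""
    idx = rng.permutation(len(instances))
    half = len(instances) // 2
    return instances[idx[:half]], instances[idx[half:]]

rng = np.random.default_rng(42)
# Ten illustrative points of one class (values are made up):
pts = np.stack([online_feature_vector(12.5, -3.1, -7.4, 0.02)
                for _ in range(10)])
train, test = split_half(pts, rng)
```

Repeating this split with a fresh permutation per run yields the ten independent train/test datasets used to average classification performance.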

A. Classification Using Calibrated Waveform Feature Vectors
The number of feature vectors containing the range and calibrated waveform attributes, derived from online waveform processing for single-echo and multiple-echo waveforms, is given in Table III. Those feature vectors were separately used for training and testing the underlying classifier. Fig. 4 visualizes the distribution of the measured online waveform attributes for different target categories using the instances given in Table II for points from both the single- and multiple-echo waveforms. The multiclass classification performance using calibrated waveform attributes on the testing datasets given in Table III for single-echo and multiple-echo waveforms is given in Table IV. According to the plot representing the range distribution in Fig. 4, the collected FW TLS data do not include a target category clustered at a certain range from the TLS instrument, owing to the large overlaps between the range distributions of different targets. The overlap between the distributions of the calibrated amplitude, relative reflectance, and pulse shape deviation for different target categories can also be seen in the related plots in Fig. 4. Referring to the amplitude distribution plot, large overlaps between the amplitude distributions of different target categories are clear. Although points related to asphalt roads and buildings show a relatively large separation in their amplitude values, there is still a significant overlap between their distributions. Large overlaps are also visible between the amplitude distributions of asphalt and grass and between building and light pole. Finally, tree trunk and car show large overlaps in their amplitude distributions with the remaining classes.
Regarding the plot illustrating the pulse shape deviation in Fig. 4, target categories other than tree canopy and grass have noticeably narrower distributions. The main reason for this observation is that a single transmitted laser pulse usually interacts with a cluster of targets located at separations much less than the range resolution of the laser pulse. Moreover, natural objects such as grass and tree canopy show a high degree of randomness in their orientation with respect to the direction of the laser pulse, leading to a larger range of pulse shape deviation.
However, unlike calibrated amplitude and pulse shape deviation, calibrated relative reflectance appears more relevant for discriminating different targets. According to the reflectance distribution plot, although there are significant overlaps between the reflectance distributions of some target categories, the separation in reflectance values between some target categories is more noticeable. Referring to the plot, there is a relatively large difference between the reflectance values of the buildings and asphalt road categories and between tree trunks and asphalt. On the other hand, similar to buildings and tree trunks, asphalt and tree canopy, grass and tree canopy, and light poles and cars have large overlaps in their reflectance distributions.
The overall accuracy given in Table IV for classification using feature vectors populated by the calibrated online waveform attributes derived from the single-echo waveforms (value above the horizontal line) and from the multiple-echo waveforms (value below the horizontal line) is 75% and 72%, respectively. The confusion matrices also show relatively similar values for correctly classified points and similar misclassification rates in almost all the target categories. The misclassification rate in each target category is consistent with the observations derived from the distribution plots in Fig. 4. According to Table IV, a large portion of the asphalt points have been predicted as tree canopy, grass, and car points, which is predictable from the calibrated relative reflectance plot in Fig. 4. Moreover, when examining Table IV and Fig. 4, building points are more likely to be labeled as tree trunks due to the relative closeness of their reflectance distributions. Comparing the reflectance plot and the confusion matrices, points belonging to the light pole and car categories may be misclassified as belonging to other target categories with roughly similar probabilities.
The confusion matrices given in Table IV also confirm that the classification performance for calibrated online waveform attributes from multiple-echo waveforms is approximately similar to the performance achieved for classifying single-echo waveforms. Knowing that some multiple-echo waveforms may include echoes returned from similar or different targets, the similarity in classification performance verifies the discrimination power of calibrated online waveform attributes for predicting class labels for each individual echo in multiple-echo waveforms.

B. Classification Using Raw Waveform Feature Vectors

The classification performance shown in Fig. 5 for feature vectors containing n, n ∈ {1, 3, . . . , 23}, waveform samples is based on the average overall accuracy calculated for ten different sets of training and testing feature vectors of size n + 1 (range plus n waveform samples for each echo pulse) randomly sampled from the original dataset given in Table II. The vertical bars in the plot show the standard deviation of the overall accuracy for each classification scenario. It is worth noting that the standard deviation of the overall accuracy illustrated in the plot, which ranges approximately from 1% to 4%, is cumulative: several factors contribute to the uncertainty of the final overall accuracy derived from the supervised training of the DCNN model. The main factors include the random initialization of the DCNN weights, the interclass similarity and intraclass variability of the instances randomly selected for each training scenario, and generalization bias with respect to the test set.
According to the plot given in Fig. 5, feature vectors containing just one sample of the digitized echo, i.e., the sample with the largest DN, lead to the lowest classification performance for both the single- and multiple-echo waveforms. The overall accuracy steadily increases up to feature vectors containing nine waveform samples, with a standard deviation of 1% in the calculated overall accuracy. Beyond this point, including additional digital echo samples in the feature vector does not lead to a significant enhancement in echo (point) classification. This observation is consistent with Fig. 2, where the tails of the digitized echo waveform signals do not typically provide useful information about the illuminated target. The same pattern in overall classification performance is also observed for multiple-echo waveforms. In addition, the 1% decrease observed in the overall accuracy of the single-echo scenario for some feature vectors including more than nine waveform samples may be attributed to the previously discussed factors that contribute to the uncertainty of the evaluated overall accuracy. Fig. 5 shows that the classification performance for multiple-echo waveforms improves similarly with additional waveform samples. However, due to the presence of merged echoes in some waveform signals, the overall classification accuracy for multiple-echo waveforms is lower than for single-echo waveforms regardless of feature vector size. The reduction in echo classification performance beyond nine samples per echo in the multiple-echo experiment is partly due to the partial laser energy received by each target and the contribution of digital samples related to nearby echo(es) to the feature vector of the underlying echo. Moreover, according to the plot in Fig. 5, the standard deviation of the classification performance for multiple-echo feature vectors is larger than for single-echo feature vectors of the same size.
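The assembly of such fixed-size feature vectors can be sketched as follows. Centering the sample window on the largest DN is an assumption made for illustration (the text only states that the n = 1 vector uses the sample with the largest DN), and the waveform array and range value are hypothetical.

```python
import numpy as np

def echo_feature_vector(waveform, echo_range, n_samples):
    """Build an (n_samples + 1)-element feature vector: range plus
    n_samples digitized amplitudes (DNs). The window is centered on
    the echo peak, an assumption for this sketch."""
    peak = int(np.argmax(waveform))
    half = n_samples // 2
    lo = max(0, peak - half)
    hi = min(len(waveform), lo + n_samples)
    lo = max(0, hi - n_samples)  # re-adjust near the right edge
    return np.concatenate(([echo_range], waveform[lo:hi]))

# Hypothetical 24-sample digitized echo (DNs) and a 57.3 m range.
wf = np.array([2, 3, 5, 9, 18, 30, 41, 35, 22, 12, 7, 4,
               3, 2, 2, 1, 1, 1, 0, 0, 0, 0, 0, 0], dtype=float)
fv = echo_feature_vector(wf, echo_range=57.3, n_samples=9)
print(fv.shape)  # (10,)
```

With n = 9 the vector has ten elements (range plus nine DNs), matching the size found above to saturate the classification performance.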
The variance of the classification performance increases as more digital samples are added to the multiple-echo waveform feature vectors. Table V gives the classification performance for single-echo and multiple-echo waveforms where, in both scenarios, the feature vectors contain nine waveform samples. The overall classification accuracy for feature vectors containing nine samples of the digitized echo signals is 82% and 80% for single-echo and multiple-echo waveforms, respectively. Comparing Tables IV and V, it is clear that point cloud classification using samples of digitized waveforms is more accurate than using calibrated waveform attributes, i.e., amplitude, relative reflectance, and pulse shape deviation. The confusion matrices in Table V show that the misclassified points follow the same pattern already observed in Table IV. Comparing the confusion matrices in Tables IV and V, significant improvements are noticeable for predicting instances belonging to the asphalt, grass, tree canopy, and light pole classes when using raw waveform samples rather than calibrated waveform attributes, for both the single- and multiple-echo scenarios. A possible explanation for the higher classification performance for the aforementioned classes is the capability of the DCNN model to capture the shape of the echo pulses from samples of the digitized echoes. Although the online waveform feature vectors include pulse shape deviation, referring to Fig. 4, this attribute does not seem discriminative enough for some target classes due to the large overlap between their distributions.
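The repeated random-sampling protocol used to report a mean and standard deviation of the overall accuracy per feature-vector size can be sketched as below. A toy nearest-centroid classifier stands in for the DCNN (the actual model is a deep 1-D convolutional network), and the two-class synthetic data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def nearest_centroid_fit_predict(Xtr, ytr, Xte):
    """Toy stand-in for the DCNN classifier used in the study."""
    classes = np.unique(ytr)
    cents = np.stack([Xtr[ytr == c].mean(axis=0) for c in classes])
    d = np.linalg.norm(Xte[:, None, :] - cents[None, :, :], axis=2)
    return classes[np.argmin(d, axis=1)]

def repeated_split_accuracy(X, y, n_runs=10, test_frac=0.3):
    """Mean and std of overall accuracy over n_runs random splits."""
    accs = []
    for _ in range(n_runs):
        idx = rng.permutation(len(X))
        n_te = int(test_frac * len(X))
        te, tr = idx[:n_te], idx[n_te:]
        pred = nearest_centroid_fit_predict(X[tr], y[tr], X[te])
        accs.append(np.mean(pred == y[te]))
    return float(np.mean(accs)), float(np.std(accs))

# Synthetic, well-separated 2-class data standing in for echo feature vectors.
X = np.vstack([rng.normal(0, 1, (100, 10)), rng.normal(3, 1, (100, 10))])
y = np.repeat([0, 1], 100)
mean_acc, std_acc = repeated_split_accuracy(X, y)
```

The standard deviation returned here corresponds to the vertical bars in Fig. 5; with the DCNN, additional sources of variance (weight initialization, sampled training instances) contribute to it as discussed above.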
Finally, Fig. 6 gives a qualitative representation of the TLS point cloud classification using calibrated online waveform attributes and raw digitized waveform samples. The quality of classification is shown for points (or echoes) derived from both the single- and multiple-echo waveform signals acquired over the study site.

V. CONCLUSION
In this study, the potential of raw waveform signals, collected and digitized by the Riegl VZ-2000i FW TLS system, for multiclass classification over the campus environment of Texas A&M University-Corpus Christi was explored. The proposed technique for using waveform attributes for TLS point cloud classification does not require typical FWA approaches that fit the waveform signals for decomposition and modeling and derive waveform attributes such as the echo amplitude, echo width, and calibrated backscatter cross section or backscattering coefficient. Instead, samples of the corresponding digitized raw waveform signal were used as waveform attributes for predicting the label of the underlying point. The DCNN classifier developed in this study requires no a priori information about the expected return waveform shape, since a hierarchy of waveform features is automatically learned within the deep learning architecture. Furthermore, comparing the classification performance derived from feature vectors populated by raw waveform samples with that derived from equivalent feature vectors containing the calibrated waveform attributes for the same target point shows that a set of samples representing the shape of the return waveform carries more information about the scattering properties of the target.
The proposed classification technique is useful for FW lidar systems with unknown or complex system response models, where typical FWA techniques may not be optimal. It may also be useful for FW lidar systems that rely on parametric fitting to extract common waveform attributes, such as echo amplitude and echo width, because fitting the waveform with parametric functions can be prone to a high degree of uncertainty. Another advantage of the proposed classification method is that it eliminates the need for feature engineering to extract meaningful attributes from the fit waveform for echo classification. Furthermore, given that the valuable information content of the digitized waveforms is exploited in a DCNN model without applying typical FWA techniques in a post-processing step, and that the trained model infers waveform labels quickly owing to the simple structure of the input data, a 1-D series of DNs, the proposed technique can be considered and explored for near real-time waveform or point cloud classification scenarios.
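The basic operation by which such a model learns features directly from the 1-D series of DNs is the strided dot product of a learned kernel with the waveform. The sketch below shows a single convolution-plus-ReLU stage; it is a minimal illustration, not the architecture used in the study, and the DN series and kernel are hypothetical.

```python
import numpy as np

def conv1d_valid(x, kernel, bias=0.0):
    """Single-channel 'valid' 1-D convolution followed by ReLU, the
    building block a 1-D DCNN stacks to learn waveform features
    directly from the digitized DN series."""
    k = len(kernel)
    out = np.array([np.dot(x[i:i + k], kernel) + bias
                    for i in range(len(x) - k + 1)])
    return np.maximum(out, 0.0)  # ReLU

# Hypothetical 12-sample DN series for one echo; an edge-like kernel
# responds strongly to the rising flank of the pulse.
dns = np.array([1, 1, 2, 6, 18, 35, 30, 14, 5, 2, 1, 1], dtype=float)
feat = conv1d_valid(dns, kernel=np.array([-1.0, 0.0, 1.0]))
print(len(feat))  # 10
```

In a trained network, the kernels are learned rather than hand-chosen, which is why no a priori pulse shape model is needed.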
Some limitations of the proposed point cloud classification approach should be kept in mind. To obtain a robust assessment, the approach should be evaluated within more complex built and natural environments with more target categories. Substantial variations in the returned waveforms, and consequently high interclass similarity and intraclass variability in the feature vectors of targets located at longer distances from the lidar system, may require more advanced analyses of the FW data. This issue can be more critical for targets located at distances shorter than the range resolution of the laser pulse, which results in complex, merged waveform signals.
Moreover, it is worth noting that the DCNN architecture proposed in this study was designed and trained for processing FW data collected by the Riegl VZ-2000i TLS system equipped with a digitizer of a certain sampling rate. Although the input layer of the DCNN model can easily be modified to accept feature vectors of a different size, fine-tuning is required when the trained model is used to classify waveform data collected by a FW lidar system in a different environment (e.g., an urban area) or under a different viewing angle (e.g., airborne FW lidar).
Future work may consider applying the same technique to FW data from different lidar sensors capable of waveform digitization, because the proposed approach is easily adaptable to data collected by FW airborne lidar and other modalities. In addition, calibrated waveform features derived from accurate fitting of the waveform signals can be assessed against raw waveform features for point cloud classification. The results of such a study would help assess the generalization of the proposed approach. Furthermore, developing more advanced DCNN architectures to effectively explore the waveform feature space for simultaneous echo localization (through regression) and classification may also be considered as future work. Different implementations of DCNN models for analyzing the waveform signals can be explored; for example, attention mechanisms in the convolutional layers and the choice of activation function may need to be investigated for efficient local and global feature learning and representation. The proposed raw waveform classification technique could also be used for advanced target identification or filtering procedures in complex environments, where including geometric information in the feature vector of each individual measurement can boost the performance of the FW lidar data analysis.

ACKNOWLEDGMENT
The statements, findings, conclusions, and recommendations are those of the author(s) and do not necessarily reflect the views of the funding agencies.