Adaptive Douglas-Peucker Algorithm With Automatic Thresholding for AIS-Based Vessel Trajectory Compression

The automatic identification system (AIS) is an important complement to terrestrial networks, radar systems, and satellite constellations. It has been widely used in vessel traffic service systems to improve navigational safety. Following the explosion in vessel AIS data, the storage, processing, and analysis of these data have become emerging research topics in recent years. Vessel trajectory compression is used to eliminate redundant information, preserve the key features, and simplify the data for further mining, thereby improving data quality and supporting accurate measurement for navigational safety. It is well known that trajectory compression quality depends significantly on the threshold selection. We propose an Adaptive Douglas-Peucker (ADP) algorithm with automatic thresholding for AIS-based vessel trajectory compression. In particular, the optimal threshold is adaptively calculated for each trajectory using a novel automatic threshold selection method, as an improvement and complement of the original Douglas-Peucker (DP) algorithm. The method is developed based on the channel and trajectory characteristics, a segmentation framework, and the mean distance. The proposed method is able to simplify vessel trajectory data and extract useful information effectively. Time series trajectory classification and clustering based on the ADP algorithm are discussed and analysed in this paper. To verify the reasonability and effectiveness of the proposed method, experiments are conducted on two different trajectory data sets from the inland waterway of the Yangtze River: trajectory classification based on the nearest neighbor classifier, and trajectory clustering based on spectral clustering. Comprehensive results demonstrate that the proposed algorithm can reduce the computational cost while preserving the clustering and classification accuracy.

and navigational safety [4]. Following the extensive installation and use of AIS equipment, the issues of vessel trajectory data storing, processing, and analysis arise as emerging research topics [5]. The accurate processing and extraction of massive trajectory data are vital for trajectory clustering, classification, and prediction [6]- [8].
The original AIS trajectory data contain massive noise and redundant information. The maritime navigational authorities need to manage and regulate vessels based on effective and real-time AIS data [9]. The visualization of vessel trajectories based on AIS data is conducive to detecting abnormal behaviors and aiding maritime surveillance [10]. The compression of AIS trajectory data is a valid data pre-processing step in practical applications, and also an effective way to visualize massive trajectories. Moreover, an automatic and rational method for selecting the simplification threshold is crucial in trajectory data compression. Therefore, an effective data compression algorithm and a threshold selection method are proposed and improved to solve these problems while retaining the main features.
Trajectories are described as different types of curves with all sorts of linear features, and consist of many spatiotemporal points. Many classical algorithms (e.g., online and batched compression techniques [11]) have been proposed and developed to compress trajectories while preserving their important geometrical properties. The online compression techniques include the Reservoir Sampling (RS) algorithm [12], the Sliding Window (SW) algorithm [13] and the Normal Opening Window (NOW) algorithm [14]. The batched compression techniques are mainly associated with three algorithms: uniform sampling [15], the Douglas-Peucker (DP) algorithm [16] and the Bellman algorithm [17]. The uniform sampling algorithm takes every i-th point of the trajectory coordinates. The DP algorithm [18] is a classical simplification algorithm that preserves the location, orientation, and shape of different trajectories based on a recursive refinement approach that retains the furthest vertices. The Bellman algorithm is able to preserve the geometric features of the original trajectory with a given number of points after simplification. The distances between points in the compression process are measured in two ways, the Perpendicular Euclidean Distance (PED) and the Time Synchronized Euclidean Distance (TSED) [19]. PED is the Euclidean distance from a point to the line, and does not consider the temporal factor. TSED is the Euclidean distance based on time-synchronised information, which takes the time interval ratio of different points as the weight to calculate the new projected location point.
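The two distance measures described above can be illustrated with a minimal Python sketch. The function names `ped` and `tsed`, the `(x, y, t)` tuple layout, and the planar treatment of coordinates are assumptions made for this illustration; they are not taken from the cited works.

```python
import math

def ped(p, a, b):
    """Perpendicular Euclidean distance from p to the line through a and b.

    Only the first two coordinates (x, y) are used; time is ignored.
    """
    (px, py), (ax, ay), (bx, by) = p[:2], a[:2], b[:2]
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0.0:
        # Degenerate base line: fall back to the point-to-point distance.
        return math.hypot(px - ax, py - ay)
    return abs(dx * (py - ay) - dy * (px - ax)) / length

def tsed(p, a, b):
    """Time-synchronized Euclidean distance for points given as (x, y, t).

    The time ratio of p between a and b is used as the weight that places
    the synchronized point on the segment from a to b.
    """
    px, py, pt = p
    ax, ay, at = a
    bx, by, bt = b
    if bt == at:
        return math.hypot(px - ax, py - ay)
    ratio = (pt - at) / (bt - at)
    sx = ax + ratio * (bx - ax)
    sy = ay + ratio * (by - ay)
    return math.hypot(px - sx, py - sy)

# A point that lies 3 units off the base line but is early in space for its time.
a, b, p = (0.0, 0.0, 0.0), (10.0, 0.0, 10.0), (2.0, 3.0, 5.0)
print(ped(p, a, b), tsed(p, a, b))
```

In this example the PED only sees the 3-unit lateral offset, while the TSED is larger because the point is compared with the position expected at its timestamp.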
Trajectory compression algorithms are widely used in various areas, such as maritime trajectory visualization, trajectory clustering, road traffic, pedestrian movement information, cartographic and map generalisation [20]. The theory of line compression has been widely used in trajectories processing. It's evident that the DP algorithm is one of the most effective methods to simplify and compress line data [21], and receives frequent usage [22].
Many different DP enhancements have been proposed to compress trajectories. Saalfeld [23] shows how the polyline simplified by the DP algorithm can be kept topologically consistent with itself and with adjacent features. Bertolotto and Zhou [24] extend Saalfeld's algorithm to reduce the processing time, and integrate the new algorithm with a web-mapping system. Gudmundsson et al. [25] propose an extended DP algorithm to retain the geometry of self-crossing lines. An appropriate threshold interval [26] is selected by comparing the experimental results of different DP thresholds based on the AIS trajectory visualization quality. The Spatial QUalIty Simplification Heuristic Method (SQUSHM) is proposed by Muckell et al. [27] to reduce the computation time based on the selection of local critical points. Chen et al. [28] put forward a fast polygonal approximation algorithm to simplify GPS trajectories based on an integral square synchronous distance error criterion. Zhang et al. [29] present a new threshold selection method that defines the threshold based on the minimum ship domain evaluation. Etienne et al. [30] propose an AIS trajectory simplification method based on the DP algorithm to reduce the computation time. However, how the simplification threshold can be determined automatically remains unclear. A line simplification method is introduced in map generalisation, and Pallero [31] puts forward a robust and easy-to-implement DP algorithm that guarantees lines without self-intersections. Birnbaum et al. [32] present a new trajectory compression algorithm that splits the trajectories into sub-trajectories based on their similarities. A new trajectory simplification algorithm named Trajic is proposed by Nibali and He [33] based on the delta compression approach to achieve a good compression ratio and a small error margin.
Zhao and Shi [34] conduct clustering analysis based on the DP compression and the improved Density-Based Spatial Clustering of Applications with Noise (DBSCAN). However, all the improved DP algorithms are only based on the trajectory shape without changing the algorithm or automatically selecting the threshold.
Trajectory classification and clustering [35] are fundamental for trajectory prediction, anomaly detection and collision avoidance [36]. Trajectory classification and clustering are the important research methods of data mining, which are conducive to extracting pattern information and detecting anomaly behaviors [37], [38]. The classification and clustering processes are known as supervised and unsupervised learning methods respectively. Data pre-processing is the first step of trajectory classification and clustering, which can receive more effective information. The similarity measurement method can help calculate the distances between trajectories, which is used to measure their similarity. The distances between trajectories are a vital factor for trajectory classification and clustering [39]. There are many distance measurement methods from previous studies, for instance, simple Euclidean Distance (ED) [40], Hausdorff distance [41], HMM (Hidden Markov Model) [42], DTW (Dynamic Time Warping) [43], LCSS (Longest Common Subsequence) [44] and so on. ED requires the equal length of all trajectories, and does not take into account the time information. Hausdorff distance is time-consuming. HMM distance sets a statistical model for each trajectory, however it has high time complexity. It has been proved that both Hausdorff and HMM have poor performance [45]. Compared with location similarity, LCSS involves more shape similarity and has high time cost. DTW can easily find the shape similarity of the trajectory, and warps the route from feature to feature [46]. Therefore, DTW is also adopted and developed in the process of similarity measurement.
The relevant literature indicates that the DP algorithm has been widely studied and used in different fields. To the best of our knowledge, no research has been conducted on automatic threshold selection or on assigning a separate threshold to each trajectory. The threshold in the original DP algorithm must be defined by its users to simplify the lines. Therefore, how to select the threshold automatically is the first research challenge to be addressed in this work. In addition, each time series trajectory is different from the others, so the second improvement is to select an appropriate threshold for each individual trajectory. These two improvements can provide useful insights and act as a solid foundation for future studies relating to time series trajectories. To address these two problems, we present an Adaptive DP (ADP) algorithm that selects the threshold for each trajectory automatically according to the characteristics of different trajectories. Meanwhile, classification and clustering experiments are carried out on different data sets to verify the effectiveness and robustness of the newly proposed ADP algorithm.
The remainder of the paper is organized as follows. The basic and improved algorithms are described in detail in Section II. Section III describes the proposed framework, which is used for classifying and clustering time series trajectories. In Section IV, numerical experiments are carried out on different data sets to validate the effectiveness and reasonability of the ADP in automatic threshold selection. Finally, Section V concludes the paper and outlines future work.

II. BASIC ALGORITHMS AND IMPROVED ALGORITHMS
A. THE BASIC DOUGLAS-PEUCKER ALGORITHM
The classical DP algorithm was proposed by Douglas and Peucker, and its essence is that line segments are used to approximate the original trajectory. The final simplified trajectory is topologically consistent with the original one, especially for the neighborhood characteristics in trajectories. The characteristic points are extracted and then used to reconstruct a trajectory that approximates the original one. The advantage of the basic DP is that it is invariant to translation and rotation, and the sampling result is deterministic once the curve and the threshold are given. However, the threshold must be pre-defined by the users to simplify the line. It is evident that the DP algorithm is able to compress trajectories effectively while preserving the main geometrical structures.
Suppose $T = (T_1, T_2, \cdots, T_i, \cdots, T_n)$ is the original trajectory. When the number of points is large enough, the original trajectory can be replaced by line segments. To decrease the number of trajectory points, we reconstruct the trajectory with fewer but more important points selected from the original point set $T$: $T' = (T_{k_1}, T_{k_2}, \cdots, T_{k_j}, \cdots, T_{k_m})$, $T' \subseteq T$. If the characteristic points are extracted accurately, the new line segments can approximate the original trajectory.
Fig. 1 is the schematic diagram of the original DP algorithm. The original trajectory is constructed by the line segments that connect 6 points $(T_1, T_2, \cdots, T_6)$. To preserve the main geometrical structure of the original trajectory and reduce the redundant trajectory points, it is necessary to extract the characteristic points from the original trajectory. A pre-defined threshold (i.e., tolerance) is selected as the benchmark to simplify the trajectory. The line $T_1 T_6$ connecting the first point ($T_1$) and the last point ($T_6$) is taken as the datum line (or base line). Then the perpendicular Euclidean distance from each point of the original trajectory to the datum line is calculated. If some of these distances are larger than the threshold (e.g., that of $T_2$), the point with the maximum perpendicular Euclidean distance is selected to divide the original trajectory into two sub-trajectories (e.g., $T_1 T_2$ and $T_2 T_6$). This procedure is applied recursively to each sub-trajectory until no point has a perpendicular distance larger than the threshold.
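The recursive procedure described above can be sketched in a few lines of Python. This is a minimal illustration of the classical DP scheme with a user-supplied threshold, assuming planar `(x, y)` points; the function names and the sample coordinates are made up for the example.

```python
import math

def _ped(p, a, b):
    """Perpendicular distance from p to the line through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length = math.hypot(dx, dy)
    if length == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    return abs(dx * (p[1] - a[1]) - dy * (p[0] - a[0])) / length

def douglas_peucker(points, threshold):
    """Return the characteristic points kept by the classical DP algorithm."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    # Find the interior point farthest from the datum line first-last.
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = _ped(points[i], first, last)
        if d > max_dist:
            max_dist, index = d, i
    if max_dist > threshold:
        # Split at the farthest point and simplify both halves recursively.
        left = douglas_peucker(points[:index + 1], threshold)
        right = douglas_peucker(points[index:], threshold)
        return left[:-1] + right  # drop the duplicated split point
    # Every interior point is within the tolerance: keep the end points only.
    return [first, last]

track = [(0, 0), (1, 2), (2, 0.1), (3, 0), (4, 0)]
print(douglas_peucker(track, 1.5))
print(douglas_peucker(track, 3.0))
```

With the smaller threshold the peak at `(1, 2)` survives as a characteristic point; with the larger one only the two end points remain, which shows how strongly the result depends on the chosen tolerance.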

B. THE ADAPTIVE DOUGLAS-PEUCKER ALGORITHM
The threshold in the original DP algorithm must be set in advance to simplify the line. Currently, there are few studies in the literature on the selection of the best threshold. Therefore, this research addresses the automatic selection of the threshold. Each time series trajectory is different from the others, hence it is beneficial to select the appropriate threshold for each trajectory automatically. The success of such improvements will lay a solid foundation for the subsequent trajectory classification and clustering.
The original DP has only one threshold for all trajectories, and it is difficult to determine. The ADP algorithm has a different threshold for each trajectory, and can automatically select the appropriate thresholds for different trajectories. The essence of ADP is to calculate the thresholds automatically according to the distances and characteristics of all feature points. ADP can further extract and preserve key features based on the channel characteristics, trajectory characteristics, segmentation framework, and mean distance. The improved DBSCAN is an effective reprocessing method to remove the noise points. The innovation of the improved DBSCAN is that the circular neighborhood is changed into a square neighborhood. Then the square sliding window can handle all the points according to the coordinates, ε, and MinPts. All the points in a data set are reprocessed to extract more efficient points. The criterion is whether the point coordinates are within the range of the square neighborhood. This improvement avoids data explosion and memory overflow. For instance, if there are 800,000 points, there will be 800,000 × 799,999 / 2 = 319,999,600,000 distance values between different points. The original DBSCAN algorithm fails to handle a problem of this size.
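The square-neighborhood criterion can be sketched as follows. This is only one possible reading of the description above, not the authors' implementation: the grid hashing, the function names, and the choice to count a point as its own neighbor are all assumptions of this sketch. The helper `pairwise_count` reproduces the distance count quoted in the text.

```python
from collections import defaultdict

def pairwise_count(n):
    """Number of distinct point pairs among n points."""
    return n * (n - 1) // 2

def square_neighborhood_filter(points, eps, min_pts):
    """Keep points with at least min_pts points (itself included) whose
    coordinates both lie within eps, i.e. inside a square window.

    Points are hashed into grid cells of side eps so that only the 3 x 3
    block of cells around a point is inspected, and no full pairwise
    distance table is ever stored.
    """
    grid = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        grid[(int(x // eps), int(y // eps))].append(idx)
    kept = []
    for x, y in points:
        cx, cy = int(x // eps), int(y // eps)
        count = 0
        for gx in (cx - 1, cx, cx + 1):
            for gy in (cy - 1, cy, cy + 1):
                for j in grid.get((gx, gy), ()):
                    qx, qy = points[j]
                    # Coordinate comparison only: no Euclidean distance needed.
                    if abs(qx - x) <= eps and abs(qy - y) <= eps:
                        count += 1
        if count >= min_pts:
            kept.append((x, y))
    return kept

print(pairwise_count(800_000))
```

Because membership is decided by comparing coordinates against the window, the memory cost stays proportional to the number of points rather than to the number of point pairs.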
The ADP algorithm is proposed based on the channel characteristics, trajectory characteristics, segmentation framework, and mean distance to select the threshold for each trajectory automatically.
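The exact threshold formula is not reproduced in this section, so the following sketch is purely hypothetical and should not be read as the ADP rule. It only illustrates how a mean-distance quantity could give each trajectory its own threshold: here the threshold is assumed to be the mean perpendicular distance of the interior points to the line joining the first and last points. The function name and sample tracks are invented for the example.

```python
import math

def mean_distance_threshold(points):
    """Hypothetical per-trajectory threshold: the mean perpendicular
    distance of interior points to the line from first to last point.

    ASSUMPTION: this stands in for the 'mean distance' ingredient named in
    the text; the channel characteristics and segmentation framework are
    not modelled here.
    """
    if len(points) < 3:
        return 0.0
    (ax, ay), (bx, by) = points[0], points[-1]
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    total = 0.0
    for px, py in points[1:-1]:
        if length == 0.0:
            total += math.hypot(px - ax, py - ay)
        else:
            total += abs(dx * (py - ay) - dy * (px - ax)) / length
    return total / (len(points) - 2)

straight = [(0, 0), (1, 0.1), (2, -0.1), (3, 0)]
winding = [(0, 0), (1, 1), (2, 3), (3, 0)]
# A nearly straight track gets a small threshold, a winding one a larger value.
print(mean_distance_threshold(straight), mean_distance_threshold(winding))
```

The point of the illustration is only that a data-driven quantity adapts to the shape of each trajectory, whereas a single user-defined tolerance does not.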
The pseudo code of the ADP algorithm is listed as follows.

C. THE DTW ALGORITHM
From the statistical point of view, the spatio-temporal AIS trajectory is essentially a kind of time series. Suppose $Q = \{q_1, q_2, \cdots, q_m\}$ and $C = \{c_1, c_2, \cdots, c_n\}$ denote two AIS trajectories (i.e., time series), where $q_i$ represents the value of the $i$-th point in series $Q$, $c_j$ represents the value of the $j$-th point in series $C$, and $m$ and $n$ indicate the lengths of $Q$ and $C$, respectively. $d(q_i, c_j)$ denotes the distance between $q_i$ and $c_j$. DTW is used to calculate the similarity between two time series. The process of DTW is described as follows. All points are sorted according to their time, and then a matrix $A_{m \times n}$ is constructed whose element $a_{ij}$ is the distance $d(q_i, c_j)$. A set of adjacent matrix elements in $A_{m \times n}$ is called a warping path, denoted by $W = \{w_1, w_2, \cdots, w_k, \cdots, w_K\}$ with $\max\{m, n\} \le K \le m + n - 1$, where the $k$-th element of $W$ is $w_k = (a_{ij})_k$. The warping path must meet the following constraints: (1) Boundary condition: $w_1 = a_{11}$, $w_K = a_{mn}$; (2) Continuity and monotonicity: for consecutive elements $w_k = a_{ij}$ and $w_{k+1} = a_{i'j'}$, $0 \le i' - i \le 1$ and $0 \le j' - j \le 1$. Together they ensure that every coordinate of the two trajectories appears in $W$, and that the dotted lines linking matched points of the two trajectories do not intersect. Consequently, the time at each point is also monotonic in $W$.
DTW finds the optimal warping path, i.e., the path with the minimum cumulative cost, based on dynamic programming [47]. The algorithm steps are described as follows:
Step 1. Starting from the start points of the two sequences, calculate the cumulative DTW distance D(i, j) between the two sequences point by point.
Step 2. The distance D(i, j) at the end points of the two sequences is the DTW distance of the two sequences.
The time complexities of the Euclidean distance and DTW are $O(n)$ and $O(n^2)$, respectively. DTW does not require the two sequences to be of equal length.
[Algorithm 1: ADP Algorithm. ε is the square sliding window, and MinPts is the number of points covered by the sliding window.]
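The two steps above can be sketched with the standard textbook recurrence $D(i, j) = d(q_i, c_j) + \min\{D(i-1, j-1), D(i-1, j), D(i, j-1)\}$, which is assumed here because the recurrence itself is not reproduced in the text. The optional `window` argument is a Sakoe-Chiba style band, offered as one common way to realise a warping window; the function name and the use of Euclidean point distances are likewise assumptions of this sketch.

```python
import math

def dtw_distance(q, c, window=None):
    """DTW distance between two point sequences via dynamic programming.

    q, c   : sequences of equal-dimension coordinate tuples
    window : optional band half-width |i - j| <= window (None = unconstrained)
    """
    m, n = len(q), len(c)
    if m == 0 or n == 0:
        raise ValueError("both sequences must be non-empty")
    # The band must be at least |m - n| wide, or the end cell is unreachable.
    w = max(m, n) if window is None else max(window, abs(m - n))
    D = [[math.inf] * (n + 1) for _ in range(m + 1)]
    D[0][0] = 0.0
    for i in range(1, m + 1):
        for j in range(max(1, i - w), min(n, i + w) + 1):
            cost = math.dist(q[i - 1], c[j - 1])
            # Step 1: extend the cheapest of the three admissible predecessors.
            D[i][j] = cost + min(D[i - 1][j - 1], D[i - 1][j], D[i][j - 1])
    # Step 2: the value at the end points is the DTW distance.
    return D[m][n]

q = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
c = [(0.0, 0.0), (2.0, 0.0)]
print(dtw_distance(q, c))
```

The example uses sequences of different lengths on purpose: the middle point of `q` is matched to one of the two points of `c`, which a plain point-by-point Euclidean distance could not do.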

III. THE PROPOSED METHOD FRAMEWORK
The proposed ADP algorithm automatically selects a threshold for each trajectory, and hence compresses the trajectories significantly and calculates the compression rate according to the characteristics of each trajectory. It reduces the amount of data, saves follow-up calculation time, and preserves the important structural properties well. Together, the ADP and DTW algorithms accelerate the data processing and the similarity measurement between massive time series. The ADP algorithm is proposed to compress the time series data sets, and the DTW algorithm with a warping window is introduced to calculate the distances between time series. Then classification and clustering analyses are carried out on two different time series data sets to verify the validity and effectiveness of the proposed algorithms. The experiment flowchart is shown as follows.
The threshold is the main factor that determines the trajectory compression quality. When its value becomes too small, it will lead to a high calculation cost, while if it becomes too large, it will not capture the original feature of the trajectory. Manual selection of the best compression threshold is the shortcoming of the current research of trajectory compression. To solve this problem, a novel ADP algorithm is proposed to automatically select the thresholds while preserving the structural and geometric characteristics well. Moreover, DTW is chosen to calculate the distance between the time series accurately. This paper not only presents a new algorithm, but also analyses its validity and feasibility through different experiments in the ensuing sections.

IV. EXPERIMENT RESULTS AND EVALUATION OF TWO DATA SETS
A. EXPERIMENTAL SETUP AND DATA SETS
Two experiments are performed using 64-bit Windows 10 on a 2.60 GHz Intel Core i7-5600U CPU equipped with 8 GB memory. We implemented the proposed ADP, classification, and clustering methods using MATLAB R2016a, and DTW with a warping window algorithm using MATLAB R2016a and C language.
To verify the accuracy and efficiency of the proposed ADP algorithm, numerical experiments are implemented based on real AIS trajectory data of an inland waterway for classification and of a bridge area waterway for clustering. The inland waterway data set is collected from the Yangtze River, and has 404 trajectories with 74,263 points. The AIS trajectory data set in the bridge area waterway consists of spatial-temporal trajectories with time, longitude, latitude, speed, etc., and is treated as three-dimensional time series. These experimental data are collected from the AIS base station in the Wuhan section of the Yangtze River. The bridge area waterway data set includes the AIS trajectory data of 377 vessels with 58,296 points. The visualization of the data sets is shown in Fig. 3.

B. TRAJECTORY COMPRESSION RESULT OF ADP ON A CLASSIFICATION DATA SET
In this paper, the validity of the proposed ADP algorithm is demonstrated by the real vessel trajectory data set. In the first step, the longitude range is [121.6830, 121.7502] and latitude range is [31.267, 31.3435] in the selected trajectory data set. Then the parameter ε is set to 0.0003 and the parameter MinPts is set to be 5 in the improved DBSCAN based on the longitude and latitude range of the trajectory points.
The proposed ADP algorithm is used for compressing the trajectories. The visualization of the original and compressed trajectories is shown in Fig. 4. Fig. 4(a) and Fig. 4(c) are the original vessel trajectories and the compressed trajectories, respectively. Meanwhile, Fig. 4(b) and Fig. 4(d) show the point data before and after compression, respectively. It can be seen from Fig. 4(b) and Fig. 4(d) that the data volume is significantly reduced. The number of points on all trajectories is 1,553 after the trajectory compression.
The number of points and the thresholds based on the ADP algorithm are shown in Fig. 5. Fig. 5(a) displays the number of points before and after compression, where the red line is the number of points in the original trajectories and the blue one is the number of points in the compressed ones. The number of points after compression is shown separately in Fig. 5(b) for clarity. The thresholds of different trajectories are shown in Fig. 5(c), and the range is [...]
The number of points and the thresholds of the up-bound trajectories are shown in Fig. 7. Fig. 7(a) displays the number of points before and after compression, where the red line indicates the number of points in the original trajectories and the blue one is the number of points of all trajectories after compression. The number of points after compression is shown separately in Fig. 7(b) for clarity. The thresholds of different trajectories are shown in Fig. 7(c), and the range is $[0, 4 \times 10^{-3}]$. The thresholds are automatically selected based on the features of different trajectories.
The visualization of the original and compressed trajectories of the down-bound vessels is shown in Fig. 8. Fig. 8(a) and Fig. 8(c) are the original vessel trajectories and the ones after compression, respectively. Meanwhile, Fig. 8(b) and Fig. 8(d) show the points before and after compression, respectively. It can be seen from Fig. 8(b) and Fig. 8(d) that the data volume is significantly reduced. There are 28,385 points on 179 down-bound trajectories, and only 725 points remain after using the ADP compression algorithm.
The number of points and the thresholds of the down-bound trajectories are shown in Fig. 9. Fig. 9(a) displays the number of points before and after compression, where the red line represents the number of points in the original trajectories and the blue one is that in the compressed trajectories. The number of points after compression is shown separately in Fig. 9(b) for clarity. The thresholds of different trajectories are shown in Fig. 9(c), and the range is $[0, 3 \times 10^{-3}]$. The thresholds are automatically selected based on the features of different trajectories.
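The point counts reported above for the down-bound trajectories can be turned into a compression rate with a short calculation. The definition used here, the fraction of points removed, is an assumption of this sketch, since the section does not state the formula it uses.

```python
def compression_rate(n_before, n_after):
    """Fraction of points removed by compression (assumed definition)."""
    if n_before <= 0:
        raise ValueError("n_before must be positive")
    return 1.0 - n_after / n_before

# Down-bound trajectories: 28,385 points before, 725 points after ADP.
rate = compression_rate(28_385, 725)
print(f"{rate:.2%} of the points removed")
print(f"{28_385 / 725:.1f} original points per retained point")
```

Under this definition, roughly 97% of the down-bound points are removed, which corresponds to about 39 original points for every retained characteristic point.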

C. TRAJECTORY COMPRESSION RESULT OF ADP ON CLUSTERING DATA SET
The data cleansing method used in this section is the same as in the above process. The original data set includes 377 trajectories, of which 324 trajectories with 25,678 points are preserved after data cleansing.
In the first step, the longitude range is [114.2746, 114.2919] and latitude range is [30.545, 30.562] in the selected trajectory data set. Then the parameter ε is set to 0.0006 and the parameter MinPts is set to be 4 in the improved DBSCAN based on the longitude and latitude range of trajectory points.
The visualization of the trajectories before and after compression based on the ADP algorithm is shown in Fig. 10. Fig. 10(a) and Fig. 10(c) are the original vessel trajectories and the ones after compression, respectively. Meanwhile, Fig. 10(b) and Fig. 10(d) show the point data before and after compression, respectively. It can be seen from Fig. 10(b) and Fig. 10(d) that the data volume after compression is significantly reduced.
The number of points and the thresholds are shown in Fig. 11. Fig. 11(a) displays the number of points before and after compression, where the red line is the number of points in the original trajectories and the blue one is the number of points in the trajectories after compression. The number of points after compression is shown separately in Fig. 11(b). The thresholds of different trajectories are shown in Fig. 11(c); the range is $[0, 1 \times 10^{-3}]$, and the thresholds are automatically selected based on the trajectory characteristics.

D. TRAJECTORY SIMILARITY MEASUREMENT BASED ON DTW
After data cleansing, there are 380 trajectories in the inland waterway data set and 324 trajectories in the bridge area waterway data set. The distances between the trajectories are calculated by DTW. The distance matrix visualization for the different data sets is shown in Fig. 12. Fig. 12(a) and Fig. 12(b) are the 2D image visualizations of the 380 × 380 distance matrix before and after trajectory compression, respectively. The 2D image visualizations of the 324 × 324 distance matrix before and after trajectory compression are shown in Fig. 12(c) and Fig. 12(d), respectively.
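Building such a distance matrix can be sketched as below. The matrix is symmetric with a zero diagonal, so only the upper triangle needs to be evaluated. The helper `start_point_gap` is a deliberately trivial stand-in for the DTW distance, used only to keep the example short and self-contained; both function names are hypothetical.

```python
import math

def distance_matrix(trajectories, dist):
    """Symmetric pairwise distance matrix; dist is any trajectory distance.

    Returns the matrix and the number of distance evaluations performed.
    """
    n = len(trajectories)
    matrix = [[0.0] * n for _ in range(n)]
    calls = 0
    for i in range(n):
        for j in range(i + 1, n):
            d = dist(trajectories[i], trajectories[j])
            matrix[i][j] = matrix[j][i] = d  # fill both triangles at once
            calls += 1
    return matrix, calls

def start_point_gap(a, b):
    """Placeholder distance (NOT DTW): gap between the two start points."""
    return math.dist(a[0], b[0])

tracks = [[(0, 0), (1, 0)], [(0, 1), (1, 1)], [(0, 5), (1, 5)], [(0, 6), (1, 6)]]
matrix, calls = distance_matrix(tracks, start_point_gap)
print(calls)
```

For the 380 inland waterway trajectories this scheme needs 380 × 379 / 2 = 72,010 distance evaluations, and for the 324 bridge area trajectories 324 × 323 / 2 = 52,326, which is why shortening each trajectory before DTW pays off.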

1) VISUALIZATION OF CLASSIFICATION RESULTS IN INLAND WATERWAYS
The classification results of the original and compressed data sets are shown in Fig. 13. The original trajectories are shown in Fig. 13(a), where the red lines are the trajectories of the up-bound vessels and the blue ones are those of the down-bound vessels. The classification result of the original data set is shown in Fig. 13(b), where the red, blue, green, and black colors represent different classes. The blue line among the black ones is a misclassified trajectory. Fig. 13(c) is the visualization of the compressed data set, where the red and blue lines have the same meaning as in Fig. 13(a). The classification result of the compressed data set is shown in Fig. 13(d), and the trajectories are clearly divided into four categories. The classification accuracy of these four categories is 100%. The original data set has 59,888 points, while the compressed data set has only 1,553. The calculation time and processing time are significantly reduced, which provides a theoretical basis and technical support for big data research and analysis in future.

2) VISUALIZATION OF TRAJECTORY CLASSIFICATION RESULTS BY COURSE
The classification results of the inland waterway data set based on different courses are compared in Fig. 14. Fig. 14(a) is the visualization of the up-bound vessel trajectories, and Fig. 14(b) shows the classification result of the up-bound vessel trajectories. The classification accuracy of the up-bound vessel trajectories is 100%. The visualization of the down-bound vessel trajectories is shown in Fig. 14(c), and the classification result of the down-bound vessel trajectories is shown in Fig. 14(d). The classification accuracy of the down-bound vessel trajectories is also 100%.
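Nearest neighbor classification on top of a precomputed distance matrix can be sketched as follows. The leave-one-out evaluation protocol, the function name, and the toy matrix and labels are assumptions of this illustration; the section does not state how the accuracy figures were computed.

```python
def loo_nearest_neighbor(dist_matrix, labels):
    """Leave-one-out 1-NN: each item takes the label of its closest other item.

    dist_matrix : square list of lists of pairwise distances (e.g. from DTW)
    labels      : true class label of each item
    """
    predictions = []
    for i, row in enumerate(dist_matrix):
        # Exclude the item itself, whose distance is always zero.
        nearest = min((j for j in range(len(row)) if j != i), key=lambda j: row[j])
        predictions.append(labels[nearest])
    correct = sum(p == t for p, t in zip(predictions, labels))
    return predictions, correct / len(labels)

# Toy matrix: items 0-1 are close to each other, as are items 2-3.
toy = [
    [0, 1, 9, 8],
    [1, 0, 7, 9],
    [9, 7, 0, 2],
    [8, 9, 2, 0],
]
labels = ["up-bound", "up-bound", "down-bound", "down-bound"]
predictions, accuracy = loo_nearest_neighbor(toy, labels)
print(predictions, accuracy)
```

Each prediction costs one scan over a row of the distance matrix, so the classifier itself is cheap once the DTW distances are available.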

1) VISUALIZATION OF CLUSTERING RESULTS IN BRIDGE WATERWAYS
Spectral Clustering (SC) is based on spectral graph partition theory, and its essence is to transform the clustering problem of a sample space into the optimal partition problem of a graph. It divides the graph into several subgraphs that have no intersections with each other. The points have the highest similarity within the same subgraph and the lowest similarity between different subgraphs. SC can identify a sample space with an arbitrary shape and converge to the global optimal solution. The basic idea of SC is to cluster the feature vectors obtained from the eigendecomposition of the similarity matrix of the sample data.
The clustering results of the data set in the bridge area waterway are shown in Fig. 15. Fig. 15(a) is the clustering result based on SC when the number of clustering centers is 2. Fig. 15(b) is the clustering result based on SC when the number of clustering centers is 3. Table 1 lists the cumulative contribution rate of the top ten eigenvalues based on ADP and DTW; the rates of the top two and top three eigenvalues are 95.23% and 98.73%, respectively. The number of clusters is therefore set to 2, and the performance with two and three clustering centers is shown and analysed in the experiments above. It can be clearly seen from Fig. 15 that the performance with two clustering centers is better than that with three. The verification of the number of clustering centers further proves the effectiveness of the proposed compression algorithm and the clustering algorithm.
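The cumulative contribution rate used to choose the number of clusters can be sketched as below. The eigenvalues in the example are invented for illustration and are not the values of Table 1; the 0.85 cutoff and the function names are likewise assumptions of this sketch, which also assumes non-negative eigenvalues.

```python
def cumulative_contribution(eigenvalues):
    """Cumulative share of the eigenvalue sum, largest eigenvalue first."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    rates, running = [], 0.0
    for value in ordered:
        running += value
        rates.append(running / total)
    return rates

def smallest_k(rates, cutoff):
    """Smallest number of leading eigenvalues whose share reaches cutoff."""
    for k, rate in enumerate(rates, start=1):
        if rate >= cutoff:
            return k
    return len(rates)

# Hypothetical eigenvalues, NOT the values reported in Table 1.
rates = cumulative_contribution([6.0, 3.0, 0.5, 0.3, 0.2])
print([f"{r:.1%}" for r in rates])
print(smallest_k(rates, 0.85))
```

When the first few eigenvalues already account for most of the total, as the 95.23% reported for the top two suggests, a small number of clusters is a natural choice.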

G. COMPARATIVE ANALYSIS OF TIME COMPLEXITY
The time complexities of the methods used in this work are as follows: DTW is $O(n^2)$, the nearest neighbor classification is $O(n)$, and spectral clustering is $O(n^2)$. In these time complexity expressions, n represents the number of AIS trajectories. The comparison results before and after trajectory compression are listed in Table 2. In the inland waterway data set, there are 380 trajectories, consisting of 59,888 points before compression and 1,553 points after compression. The running time of DTW before and after compression is 250.292 s and 183.762 s, respectively. The classification running time is 6.348 s and 3.256 s, respectively. Whether the course is considered or not, the classification accuracy after trajectory compression is always 100%.
The running time and the classification accuracy further verify the validity of the proposed trajectory compression algorithm.
The data set in the bridge area waterway includes 324 trajectories, consisting of 25,678 and 1,154 points before and after trajectory compression. The running time of DTW before and after compression is 174.246 s and 108.141 s, respectively. The clustering time is 5.322 s and 3.818 s, respectively. The clustering accuracy is 96.9% and 100%, respectively.
The accuracy of classification and clustering after trajectory compression is better than that before trajectory compression. The running time of each part after trajectory compression is less than that before trajectory compression. The comparison results before and after trajectory compression further prove the effectiveness and feasibility of our proposed trajectory compression algorithm.
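The running times quoted in this subsection can be summarised as speed-up factors and relative time savings. The sketch below uses only the figures stated in the text; the dictionary keys and helper names are chosen for the illustration.

```python
# (seconds before compression, seconds after compression), as quoted in the text
timings = {
    "DTW, inland waterway": (250.292, 183.762),
    "classification, inland waterway": (6.348, 3.256),
    "DTW, bridge area waterway": (174.246, 108.141),
    "clustering, bridge area waterway": (5.322, 3.818),
}

def speedup(before, after):
    """How many times faster the step runs after compression."""
    return before / after

def time_saved(before, after):
    """Fraction of the original running time that is saved."""
    return 1.0 - after / before

for name, (before, after) in timings.items():
    print(f"{name}: x{speedup(before, after):.2f}, "
          f"{time_saved(before, after):.1%} less time")
```

Expressed this way, the classification step roughly halves its running time, while the DTW step on the bridge area data set saves a little more than a third.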

V. CONCLUSION AND FUTURE WORK
In this paper, we propose a novel trajectory compression algorithm to extract valid trajectory features, accelerate the similarity measures between massive AIS trajectories, improve the accuracy of classification and clustering, and reduce the processing and running time. The quality of trajectory compression and the accuracy of similarity measurement are the key factors to determine trajectory classification and clustering. The traditional DP compression threshold needs to be set manually or selected by experimental comparison. The proposed method could significantly compress the AIS trajectories while maintaining the main geometrical structures, and also automatically calculate a different threshold for each trajectory. It is always important to guarantee the structural features and increase the compression quality in trajectory clustering and classification. Therefore, trajectory similarity measurement based on ADP, the classification accuracy, and the clustering accuracy could be significantly improved and accelerated in practical applications. It is of significance for realizing big data research in future. Numerous experiments of trajectory classification and clustering are implemented using different trajectory data sets to verify the effectiveness and feasibility of the new ADP.
To generalise the improved algorithm in future, we need to study trajectories with particular shapes, and then further realize big data analysis based on the proposed ADP algorithm. Thus, further studies should be conducted to investigate automatic threshold selection for special and chaotic trajectories. In addition, the automatic segmentation framework should also be further studied.