Subspace-Based Preprocessing Module for Fast Hyperspectral Endmember Selection

Endmember extraction algorithms (EEAs) play a crucial role in hyperspectral image (HSI) analysis, and yet they normally suffer from three flaws: 1) high computational burden, 2) weak noise robustness, and 3) high outlier sensitivity. To address these problems, this article proposes a fast subspace-based preprocessing module (FSPM) that selects a high-quality data subset for subsequent endmember extraction. Specifically, FSPM first transforms an HSI into a low-dimensional data subspace using singular value decomposition. For each component pair, FSPM then detects the convex hull vertices and applies a proposed local outlier score measure to remove potential outliers. FSPM finally transforms the retained data points into a noise-reduced data space for endmember identification. The proposed FSPM sheds new light on the preprocessing field, as it can rapidly suppress noise and outliers as well as remove redundant data points. Based on various validation metrics, experiments conducted on both synthetic and real HSIs indicate that the proposed FSPM is superior to current state-of-the-art preprocessing techniques.


I. INTRODUCTION
IN HYPERSPECTRAL images (HSIs), mixed pixels are mixtures of several distinct substances known as endmembers, weighted by their corresponding fractional abundances, owing primarily to the limited spatial resolution of the sensor or to homogeneous mixtures of distinct materials [1], [2]. To exploit mixed pixels, the process of decomposing a mixed pixel into a collection of endmembers and a set of abundances is called hyperspectral unmixing (HU), which involves endmember extraction and abundance estimation procedures [3], [4].
When light undergoes multiple scattering, endmembers mix according to a nonlinear mixture model (NLMM); otherwise, they follow a linear mixture model (LMM), which provides clear physical meaning. Based on the LMM assumption, HU algorithms nowadays are explored mainly from three perspectives: 1) statistical-based unmixing strategies, 2) spectral library-based sparse unmixing, and 3) fully constrained least-squares-based endmember extraction.
Statistical-based algorithms have recently received plenty of attention, partly because they can simultaneously estimate endmembers and abundances, but principally because their Bayesian paradigm can formulate HU issues with explicit mathematical and physical meaning. Classic statistical frameworks concentrate on nonnegative matrix factorization (NMF) [5], which blindly decomposes an HSI into the product of endmember and abundance matrices by alternately optimizing the two matrices. By imposing various constraints on the basic NMF, many NMF-based variants capture desirable unmixing performance, such as the minimum simplex volume constraint [2], the ℓ1/2 sparsity regularizer [6], the graph constraint [7], the collaborative sparsity (ℓ2,1) regularizer [8], etc.
Compared with statistical frameworks, sparse unmixing algorithms hinge on an existing spectral library instead of exploiting endmembers, meaning that they focus solely on estimating the abundance matrix. Several intrinsic abundance properties are generally explored, including total variation [9], collaborative sparsity [10], nonlocal information [11], spatial discontinuity information [12], etc. Nevertheless, applying a sparse unmixing algorithm to obtain abundances requires a spectral library, which is not always available.
Apart from EEAs, which determine endmembers from the entire dataset in an unsupervised manner, recent attempts equally highlight the field of preprocessing algorithms (PPAs), which normally search for a combination of spatial-spectral information to select a data subset for the subsequent endmember extraction task. Representative modules include spatial preprocessing (SPP) [26], spatial-spectral preprocessing (SSPP) [27], the spatial-spectral preprocessing module (SSPM) [28], regional clustering-based spatial preprocessing (RCSPP) [29], etc. More specifically, those PPAs follow similar paradigms that contain two strategies: 1) clustering- or superpixel-based techniques, including k-means [28], iterative self-organizing data analysis (ISODATA) [27], fuzzy C-means [30], and simple linear iterative clustering (SLIC) [29], are used to exploit spatial information by specifying the spatially local homogeneity of each target pixel, owing to the underlying fact that potential endmembers generally originate from spatially homogeneous areas, and 2) a set of projection bases, normally generated by principal component analysis (PCA), is used to assess pixels' spectral purity through their projections onto the bases [27]-[29]. Pixels with extreme projections are more likely to possess high spectral purity than their counterparts.
These strategies play an important role in the existing PPAs, which have achieved beneficial performance to some extent. In the meantime, however, dilemmas abound with respect to the spatial-spectral combination, spatial homogeneity exploitation, computational cost, outliers, and noise. First, these PPAs rely heavily on exploiting spatial and spectral information in two independent steps; in other words, they separately extract valuable information from the rich spectral bands. Second, they tightly link the determination of spatial information with homogeneous areas. Third, they suffer heavily from high execution time: the joint action of processing the image with the SPP or SSPP algorithms and then extracting the endmembers from the resulting image takes more time than directly applying the VCA algorithm to the original nonpreprocessed image [31]. To reduce the computational requirements of SPP and SSPP, two parallel computing-based extensions [32], [33] were recently proposed, which to some extent indicate a potential research line. More importantly, some PPAs neglect the impact of outliers prior to endmember extraction. Normally, outliers are special patterns in data without a well-defined notion of normal behavior, and their signatures are spectrally distinct from their surroundings or background representation [25], [34], [35]. In this regard, we divide the generation process of outliers into two types: the geographically anomalous type and the functionally anomalous type. The first generally corresponds to rare materials that occupy negligible proportions of the observed HSI. The second is more likely to be generated by detector failure, data transfer, or improper data correction [36].
It is worth noting that once potential outliers exist, they are likely to be selected as endmembers by most spectral-based EEAs because they deviate strongly from the background data, and they may force a vertex of the simplex to reside at a point beyond the nominal position of the endmember in order to enclose every point [1]. Compared with outliers, noisy pixels are normal spectral signatures corrupted by Gaussian or even sparse noise, which can be recovered using a specific denoising model. Moreover, the existing PPAs emphasize providing endmember candidates but barely consider noise, which extensively affects endmember extraction accuracy in poor noise scenarios. SPP and SSPP, respectively, adopt pixel reconstruction and multiscale Gaussian filtering to alleviate noise, but the resulting noise reduction is not sufficient.
Moreover, most PPAs that utilize spatial information depend heavily on two facts. The first is that HSIs intrinsically contain spatial and spectral attributes, where the spatial context characterizes pixels' spatial correlations: pixels that are spatially close to each other have higher correlations than distant ones. Based on this fact, SPP [26] holds the belief that endmembers are more likely to be found in homogeneous areas than in transition or heterogeneous areas, which guides the algorithmic structure of most subsequent PPAs. The second is that spatial information provides a crucial tool for PPAs to detect and remove the outliers that inevitably exist in the dataset, principally because outliers generally deviate from normal data behavior. By applying clustering methods, homogeneous areas are filled with consistent categories, while outliers' labels distinctly differ from those of their neighborhoods. Compared with the process of specifying spectral information, however, spatial exploitation is time-consuming since it generally involves traversing all pixels as well as their nearby pixels. Therefore, we investigate potential research lines involving how to elegantly extract spatial-spectral information and how to supply a noise and outlier removal module. The former strongly affects PPAs' computational cost, and the latter can effectively promote the quality of the endmember candidates.
We set out to test the hypothesis that traditional spatial-spectral data preprocessing can be recast as a subspace-based data exploitation process with lower computational requirements and weak noise/outlier interference. The proposed fast subspace-based preprocessing module (FSPM) rests on the following two facts: 1) homogeneous pixels lie within the simplex, whereas outliers deviate from it; therefore, the proposed FSPM avoids traversing the pixels of the HSI and only needs to remove potential outliers beyond the simplex; 2) methods that regard projections as pixels' spectral purity can be effectively replaced by finding vertices or boundary points of the simplex, which may require less computational burden and produce more beneficial pixels. In this regard, FSPM does not investigate the spatial-spectral information of the HSI but rather reduces the HSI into a low-dimensional subspace by performing singular value decomposition (SVD). The exploitation of spatial and spectral information in FSPM can be, respectively, regarded as masking the outliers that go beyond the convex hull and specifying the convex hull vertices. By iteratively identifying the desired data points in the subspace, FSPM then transforms the determined data points into a noise-reduced data space. Fig. 1 illustrates the preprocessing procedures of FSPM. Based on different validation metrics, experiments conducted on synthetic and real HSIs indicate that FSPM not only incurs a lower computational cost but also provides a more preferable endmember candidate set than other PPAs.
We make four chief contributions. 1) This article introduces a preprocessing module for quickly selecting a data subset. Instead of using traditional strategies that narrowly search for a combination of spatial-spectral information, the proposed FSPM treats preprocessing as a problem of subspace-based data subset selection. 2) This article unifies two core tasks of data preprocessing: fast spatial-spectral information exploitation and noise reduction. The former depends on iteratively specifying convex hull vertices in the affine space and removing outliers far away from the convex hull, and the latter reduces noise using SVD. Under these conditions, the proposed FSPM can obtain a desired endmember subset with negligible processing time. 3) To remove outliers, FSPM develops a new outlier removal method by considering the neighborhood sparsity of convex hull vertices, especially the vertically and horizontally extreme points. 4) Besides the current metrics such as the spectral angle distance (SAD) and root-mean-square error (RMSE), this article proposes a new metric, called the preprocessing performance measure (PPM), to validate PPAs' performance; it jointly evaluates the endmember accuracy and computational cost of a PPA when coupled with EEAs. The remainder of this article is organized as follows. Section II reviews several important issues regarding EEAs and PPAs. The FSPM is introduced in detail in Section III. Section IV displays the experimental results of the FSPM and the comparison algorithms. Section V concludes this article.

A. LMM
The LMM considers that the observed light is a linear combination of different materials (or endmembers). Suppose Y = [y_1, y_2, . . ., y_N] ∈ R^{L×N} is an HSI with L bands and N total pixels. It can be formulated as

Y = MA + W

where M ∈ R^{L×p}, A ∈ R^{p×N}, and W ∈ R^{L×N} are the endmember, abundance, and additive noise matrices, respectively, and p is the number of endmembers, which can be estimated by the virtual dimensionality (VD) [37]. Besides, the abundance nonnegativity and sum-to-one constraints are adopted, such that A ≥ 0 and 1_p^T A = 1_N^T, where 1 denotes the vector of all ones.
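As a minimal numerical sketch of the LMM above (the sizes and random signatures are illustrative stand-ins, not taken from any real HSI):

```python
import numpy as np

rng = np.random.default_rng(0)
L, N, p = 50, 200, 4                     # bands, pixels, endmembers (toy sizes)

M = rng.random((L, p))                   # endmember matrix M in R^{L x p}
A = rng.random((p, N))
A /= A.sum(axis=0)                       # sum-to-one constraint: 1_p^T A = 1_N^T
W = 0.01 * rng.standard_normal((L, N))   # additive noise matrix W

Y = M @ A + W                            # linear mixture model: Y = MA + W
```

Each column of Y is one mixed pixel, its abundances nonnegative and summing to one.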
B. EEA
1) Simplex Projection-Based EEA: Simplex projection-based EEAs such as PPI [14], OSP [15], and VCA [16] normally investigate the extreme projections onto determined projection bases. It is worth mentioning that there are some differences among such EEAs. PPI projects all pixel signatures onto a large number of randomly generated vectors, a.k.a. skewers. OSP projects the hyperspectral data onto an orthogonal subspace spanned by the already determined endmembers. VCA performs an affine transformation and finds potential endmembers by projecting the entire subspace data onto skewers. Although the endmembers extracted by VCA are inconsistent across multiple independent runs, VCA can avoid noise interference.
2) Simplex Volume-Based EEA: Simplex volume-based EEAs can be further divided into two categories. The first inflates a maximum matrix determinant by successively replacing pixels so as to reach the largest inner simplex volume. Representative methods include NFINDR [17], SGA [19], AVMAX [19], and the fast Gram determinant-based algorithm (FGDA) [38]. The second shrinks a minimum exterior simplex volume by iteratively updating the objective function. Classic methods include simplex identification via split augmented Lagrangian (SISAL) [39], MVES [20], and MVSA [21]. The key difference between these two strategies is that the former selects endmembers from the entire data space and can obtain the desired endmember extraction performance, especially for HSIs that contain pure pixels, while the latter can handle both pure and nonpure assumption-based scenarios.
3) NMF: NMF-based HU algorithms involve jointly estimating endmembers and abundances from the HSI. Based on the previous notations, NMF can be expressed as

Y ≈ MA, s.t. M ≥ 0, A ≥ 0.

To measure the difference between the original signal Y and the reconstructed signal MA, the cost function is considered as

J(M, A) = (1/2) ||Y − MA||_F^2

where ||·||_F denotes the Frobenius norm, and the cost function is also subject to 1_p^T A = 1_N^T. By considering endmember-oriented constraints, NMF-based variants can capture better unmixing performance even when there is noise or nonpure pixels, given as

J(M, A) = (1/2) ||Y − MA||_F^2 + λ φ(M)

where φ(M) denotes an endmember regularizer and λ its weight. Representative regularizers include minimum distance [40], minimum simplex volume [2], [8], endmember dissimilarity [41], and minimum dispersion [42]. Fig. 2 provides visual descriptions of the four endmember extraction strategies mentioned in this section.
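A minimal sketch of the alternating NMF optimization (plain Lee-Seung multiplicative updates for the Frobenius cost, without the sum-to-one constraint or the endmember regularizer):

```python
import numpy as np

def nmf_step(Y, M, A, eps=1e-9):
    """One round of multiplicative updates for min ||Y - MA||_F^2, M, A >= 0."""
    M = M * (Y @ A.T) / (M @ A @ A.T + eps)
    A = A * (M.T @ Y) / (M.T @ M @ A + eps)
    return M, A

rng = np.random.default_rng(1)
Y = rng.random((30, 100))                          # toy data matrix
M, A = rng.random((30, 5)), rng.random((5, 100))   # nonnegative initialization
err_before = np.linalg.norm(Y - M @ A, "fro")
for _ in range(50):                                # alternately optimize M and A
    M, A = nmf_step(Y, M, A)
err_after = np.linalg.norm(Y - M @ A, "fro")       # cost is non-increasing
```

The multiplicative form keeps both factors nonnegative without any explicit projection step.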
C. PPA
1) Pixel Reconstruction-Based PPA: Pixel reconstruction-based PPAs highlight the process of assessing spatial weights between target pixels and their surroundings and reconstructing the target pixels with the assessed weights. The basic formula can be simply expressed as

ŷ = Σ_{y_j ∈ N} w_j y_j

where w and N denote a weight vector and the set of neighbor pixels, respectively. SPP [26] measures the weights between each pixel and its adjacent pixels based on SAD calculations. In [43], local linear embedding (LLE) is used to generate spatially correlated weights for SPP. Additionally, Zhang et al. [44] introduce a local sparse representation model to exploit the sparsity of the weight vector. Nevertheless, the abovementioned PPAs must process all pixels without discarding redundant ones, which barely reduces the computational cost of the subsequent EEAs or of the PPAs themselves. Another inevitable shortcoming is that those PPAs improve endmember accuracy mainly in poor noise scenarios, owing to the pixel reconstruction process. When the noise is relatively light, such as 40 dB, the reconstruction may, in turn, degrade endmember accuracy, primarily because pixels with high spectral purity are reconstructed from their 5 × 5 or even 7 × 7 neighborhoods, which degrades their spectral purity.
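The SAD-weighted reconstruction idea can be sketched as follows; the inverse-angle weighting here is a plausible simplification for illustration, not the exact formula of SPP [26]:

```python
import numpy as np

def reconstruct_pixel(y, neighbors):
    """Reconstruct a target pixel as a weighted average of its neighbor spectra,
    with weights inversely proportional to the spectral angle (SAD)."""
    cos = neighbors.T @ y / (
        np.linalg.norm(neighbors, axis=0) * np.linalg.norm(y) + 1e-12)
    sad = np.arccos(np.clip(cos, -1.0, 1.0))   # spectral angle to each neighbor
    w = 1.0 / (sad + 1e-6)                     # smaller angle -> larger weight
    w /= w.sum()                               # normalize the weight vector
    return neighbors @ w
```

A spectrally pure pixel averaged with dissimilar neighbors drifts toward the neighborhood mean, which is exactly the purity-degradation effect described above.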
2) Pixel Removal-Based PPA: Compared with pixel reconstruction-based PPAs, pixel removal-based PPAs adopt a strategy that removes plenty of redundant pixels, which normally sit in heterogeneous areas or maintain relatively low spectral purity. By setting a specific retaining ratio, such as 20% of the total pixels, the rest of the pixels are removed, meaning that the subsequent endmember extraction process will be faster. PPAs following this strategy have been the dominant way to preprocess hyperspectral data over the past decade, including SSPP [27], SSPM [28], RCSPP [29], spatial edges and spectral extremes preprocessing (SE2PP) [31], region-based SPP [45], geodesic and Euclidean distance-based preprocessing (GEPP) [46], and spatial energy and spectral purity-based preprocessing (SESPP) [47]. However, such PPAs rely on traditional spatial exploitation processes such as k-means and SLIC, which are time-consuming. They simultaneously resort to projecting all pixels onto a set of bases to specify the pixels' spectral purity. By fusing spatial and spectral information, they design a specific joint scheme to select the desired endmember candidates. More importantly, in order to produce the desired preprocessing performance, they involve several cut-off parameters, such as the size of the sliding kernel window [28], the number of projection bases [27], and the purity threshold [29], which require careful tuning according to different hyperspectral data qualities. Fig. 3 details the two main preprocessing strategies, i.e., pixel reconstruction-based PPAs and pixel removal-based PPAs.

A. Algorithmic Motivation
The current mainstream PPAs separate the preprocessing process into two independent steps, i.e., spatial exploitation and spectral purity calculation. The first normally uses clustering-based methods to explore the spatial information that identifies pixels' homogeneity and underlying outliers. High homogeneity indicates that pixels are prone to be potential endmembers; otherwise, they are more likely to be transition pixels or even outliers. The second refers to generating many projection bases and capturing the projections of the pixels onto those bases. Pure pixels are more likely than mixed pixels to yield extreme projections. However, from the viewpoint of the data simplex, if the pure-pixel assumption holds, endmembers are seen as its vertices; otherwise, they are optimized from at least p−1 boundary pixels of each facet. Besides, pixels with high spatial homogeneity tend to be close to each other within the simplex because pixels are affine combinations of endmembers; outliers, on the contrary, lie far away from the simplex because their abnormal data behavior exhibits a neighborhood sparsity property. More specifically, according to the LMM, the pixels of an HSI often fall into a simplex determined by the endmembers. The intrinsic dimensionality of hyperspectral data is much lower than its ambient dimensionality defined by the number of bands, indicating that the subspace outside the simplex is occupied by spectral variations, noise, and potential outliers [43]. If the hyperspectral data were projected onto its intrinsic signal space, normal data points would still be correlated to each other in signal space, while abnormal data points (outliers) would go beyond the simplex, and noise and spectral variations would be mitigated. In this case, when a dataset that contains outliers is projected into a subspace, the data points in the subspace will show that the neighborhood of an outlier is sparse because it is far from the simplex.
Fig. 4 verifies the fact that the neighborhood of an outlier is sparser than that of normal data.
FSPM highlights a convex hull that can determine vertices, including the vertex and boundary pixels of the simplex. Moreover, outliers can be specified by detecting those points deviating from the convex hull. By applying this idea, FSPM shows at least three advantages: 1) it avoids searching spatial or spectral projection information and instead specifies the vertices of the convex hull; 2) it simplifies outlier removal by detecting data points far away from the convex hull instead of using clustering methods; and 3) it tackles the hyperspectral data in the subspace, which reduces both computational time and noise.
B. Proposed FSPM
1) Subspace Transformation: In order to avoid time-consuming steps, FSPM uses SVD to reduce the hyperspectral data into a p-dimensional subspace. Specifically, it first captures a sample correlation matrix by calculating YY^T/N and extracts the left eigenvector matrix U from that matrix. Then, it projects the original data Y onto U to generate a low-dimensional data subspace X = U^T Y, which measures the intrinsically principal information of the hyperspectral data. In order to avoid displacements between the original data and the projected data [21], we utilize a projective projection by introducing a scaling factor (see [3], [16], and [21] for more details). This step provides two obvious advantages: 1) it converts the L-dimensional data into p-dimensional projected data, which promotes computational efficiency, and 2) it masks a large percentage of the noise so that the convex hull vertices are more likely to be accurately found.
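The subspace transformation step can be sketched as follows (omitting, for brevity, the projective scaling factor mentioned above):

```python
import numpy as np

def svd_subspace(Y, p):
    """Project an L x N data matrix onto its p principal left singular vectors."""
    R = Y @ Y.T / Y.shape[1]      # sample correlation matrix Y Y^T / N
    U, _, _ = np.linalg.svd(R)    # left eigenvector matrix of R
    Up = U[:, :p]                 # keep the p principal directions
    return Up.T @ Y, Up           # X = Up^T Y is the p x N subspace data

rng = np.random.default_rng(2)
L, N, p = 40, 300, 3
Y = rng.random((L, p)) @ rng.random((p, N))  # noise-free rank-p toy data
X, Up = svd_subspace(Y, p)
Y_back = Up @ X                   # back-projection is exact for rank-p data
```

For noise-free rank-p data the top-p subspace loses nothing; for noisy data the discarded trailing directions carry mostly noise, which is the masking effect described above.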
It is worth mentioning that there are plenty of methods to generate the subspace, such as randomly generated orthonormal bases [48], principal component analysis [16], or NMF. There are two chief reasons why we adopt SVD instead of the alternatives. The first is that SVD is computationally efficient and provides the projection that best represents the data in the maximum-power sense; the second is that SVD can, to some degree, mask noise during the data transformation because it tends to reinforce the low-rank structure of the data matrix.
2) Detection of Convex Hull Vertices: After the abovementioned data transformation, FSPM emphasizes the process of finding convex hull vertices because endmembers are normally specified from the vertices if pure pixels exist, or formed by the boundary pixels if the pure-pixel assumption cannot hold. With the increase of data dimensions, however, the computational cost of convex hull algorithms such as the Graham scan [49] and the Jarvis march [50] goes up exponentially. Even though the p-dimensional data reside in a low-dimensional space with each dimension representing a primary component, it is still impractical to detect convex hull vertices using all components. In [48], Heylen and Scheunders restrict the data dimension D to the values D ∈ {1, 2, 3}, which is acceptable for endmember extraction thanks to both the light time complexity and the low memory requirements.
In this article, we restrict the detection process of convex hull vertices to 2-D and select the 2-D subspace combinations, yielding F = p(p − 1)/2 component pairs. Besides, we adopt quickhull (Qhull) [51] to find the vertices of each 2-D subspace owing to its reasonable complexity of O(N log(h)) in two and three dimensions, with h the number of vertices spanning the convex hull. Fig. 5 visually displays the detection process of convex hull vertices. Qhull starts by identifying the leftmost and rightmost points and then finds the convex hulls of the points on the two sides. By using this recursive strategy, all vertices can be quickly obtained.
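The pairwise vertex detection can be sketched with SciPy's ConvexHull, which wraps the Qhull library (the data here are random stand-ins for the subspace components):

```python
import numpy as np
from itertools import combinations
from scipy.spatial import ConvexHull   # SciPy's interface to the Qhull library

rng = np.random.default_rng(3)
p, N = 4, 500
X = rng.standard_normal((p, N))        # p x N subspace data (toy)

vertex_ids = set()
for i, j in combinations(range(p), 2): # the F = p(p-1)/2 component pairs
    hull = ConvexHull(X[[i, j], :].T)  # 2-D convex hull of one component pair
    vertex_ids.update(int(v) for v in hull.vertices)
```

The union over all component pairs gathers the boundary pixels of every 2-D projection, which is the candidate pool FSPM works with.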
3) Outlier Removal: The above description shows that the subspace-based view is meaningful and adequate for quickly selecting a high-quality data subset for identifying endmembers, but it is not satisfactory in the general case when potential outliers exist. Traditionally, PPAs such as SPP and SSPM employ a spatial exploitation process that can remove outliers to mitigate the error propagation of subsequent endmember extraction. These outliers exhibit abnormal spectral behavior and thus tend to be regarded as endmembers in the spectral domain, but they are liable to be excluded by considering their spatial neighbors in the spatial domain. However, the process of spatial information exploration is time-consuming. In this regard, FSPM discards this process and instead adopts an outlier detector to remove outliers from the data simplex. The outlier detector relies on the fact that homogeneous pixels normally lie inside the simplex, whereas endmembers and outliers potentially sit at its vertices. Fig. 4 visually describes the fact that homogeneous pixels are interior data located in the simplex, while outliers distinctly differ from the normal data representation.
The process of outlier detection hinges largely on three points. 1) Outliers are seen as abnormal points beyond the simplex, indicating that they normally take extreme values vertically or horizontally.
2) The neighborhood of outliers is extremely sparser than that of normal data.
3) The local density of normal data is significantly close to that of their neighborhood, while that of outliers is quite the contrary. It is worth mentioning that, for point 2), the neighborhood sparsity relies on the fact that a normal neighborhood set contains spectrally similar signatures and spatially consistent patterns, while an abnormal neighborhood set contains some spectrally distinct signatures. In this regard, when a dataset that contains outliers is projected into a subspace, the data points in the subspace will show that the neighborhood of an outlier is sparse, while the neighborhoods of normal data points are still correlated to each other. In other words, outliers that exist in the high-dimensional data can be clearly observed in the low-dimensional intrinsic signal space, in which the neighborhood of an outlier is sparse. Therefore, FSPM pays more attention to the four vertically and horizontally extreme points, because endmembers and outliers are more likely to have extreme coordinate values in the convex hull. For the four extreme points, their k-nearest neighbors (KNN) are specified to calculate the local density. To reduce the computational cost of neighborhood searching, we set a radius-fixed searching region to define the points' KNN.
Definition 1: For any positive integer k, the k-nearest neighbors of a point x within a searching region D with radius fixed at δ are defined such that: 1) if there are s (s ≤ k) data points within D, the nearest neighbors of x are those s points; otherwise, 2) the nearest neighbors of x are the k points with the smallest Euclidean distances to x.
For each extreme point, FSPM calculates its local density from the Euclidean distances between its neighbors and itself. For each component, i.e., the 1-D subspace X_{i,:}, however, the data mean and standard deviation explicitly differ from those of the other components, implying that specifying neighborhoods by Euclidean distance directly on the original component pair is affected by the differing data variances. In this regard, each component is standardized to zero mean and unit standard deviation, given by

X̃_{i,:} = (X_{i,:} − μ)/σ

where μ and σ, respectively, represent the data mean and standard deviation. It is worth noting that when there are only s (s ≤ k) neighbors in the fixed searching region D, we use the relation between k and s to weight the target point's local density. Specifically, if s = k, the target point's local density is weighted by s/k = 1; otherwise, the weight is less than 1, which weakens an outlier's local density.

Definition 2: Given a KNN set N = {z_i}_{i=1}^{s} of x, its local density (LoD) is defined as

LoD(x) = (s/k) · s / Σ_{i=1}^{s} ||x − z_i||_2.

We are also aware that an outlier's LoD differs from that of its neighbors, so the neighbors' LoD must equally be specified. However, calculating the LoD of all neighbors wastes execution cost because many neighbors are shared. Considering this fact, FSPM only evaluates the farthest neighbor lying in D and calculates its LoD.
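Definitions 1 and 2 can be sketched as below; the inverse-mean-distance form of the density is a reconstruction consistent with the description above, not necessarily the article's exact formula:

```python
import numpy as np

def local_density(x, points, k=5, radius=1.0):
    """Weighted local density of x over a radius-fixed searching region.

    If only s < k points fall inside the region, the density is down-weighted
    by s/k, which weakens the local density of sparse (outlier) neighborhoods.
    """
    d = np.linalg.norm(points - x[:, None], axis=0)
    inside = np.where((d > 0) & (d <= radius))[0]   # region D, excluding x itself
    if inside.size == 0:
        return 0.0
    knn = inside[np.argsort(d[inside])][:k]         # at most k nearest neighbors
    s = knn.size
    return (s / k) * s / d[knn].sum()               # weight times inverse mean gap
```

A point deep inside a cluster collects k close neighbors and a high density; an isolated point finds few or no neighbors inside D and its density collapses.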
Definition 3: Given a KNN set N = {z_i}_{i=1}^{s} of x, its farthest neighbor ẑ ∈ N is defined as

ẑ = arg max_{z_i ∈ N} ||x − z_i||_2.

By calculating the LoD of the target point and of its farthest neighbor, we can obtain the target point's local outlier score (LOS). The most crucial property of the LOS is that it is close to 1, or even larger than 1, for a normal data point, whereas it is far below 1 for an outlier. In this regard, a cut-off threshold can be used to distinguish normal points from outliers.
Definition 4: The local outlier score (LOS) of x is defined as

LOS(x) = LoD(x) / LoD(ẑ).

We are also aware that many outlier detectors have achieved desirable detection performance in the field of data mining, including statistical/probabilistic-based, linear model-based, density-based ones, and so on [52]. Statistical/probabilistic-based detectors such as the 3σ rule [53] and percentile-based box theory are fast strategies to identify outliers, yet they assume that the data points follow a Gaussian distribution, which is not always ensured. Linear model-based detectors such as principal component analysis assume that outliers generate higher reconstruction errors during the data transformation than normal data points do. Density-based detectors such as the local outlier factor (LOF) [54] and DBSCAN [55] identify outliers by considering the local density of all data points. Our proposed outlier removal method, i.e., the LOS, shows several clear advantages.
1) The LOS only takes into account convex hull vertices, especially the vertically and horizontally extreme points, rather than all data points, which allows outliers to be identified quickly.

[Algorithm 1: pseudocode of the proposed FSPM.]

In order to visually describe the intrinsic idea of the LOS, Fig. 6 provides four key parts in detail. Note that the process of removing potential outliers (i.e., extreme points along the vertical and horizontal axes) can run in parallel with the extraction of convex hull vertices, which is an underlying solution to further accelerate the preprocessing procedures of FSPM. Besides, Algorithm 1 provides detailed algorithmic structures to elaborate the proposed FSPM.
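Definitions 3 and 4 can be sketched together as follows (for brevity the KNN are taken globally rather than inside the region D, and x is assumed to be a column of `points`):

```python
import numpy as np

def lod(x, points, k=5):
    """Local density: weighted inverse mean distance to the (at most) k NN."""
    d = np.sort(np.linalg.norm(points - x[:, None], axis=0))
    d = d[d > 0][:k]                        # drop x itself, keep the k nearest
    return (d.size / k) * d.size / d.sum()

def los(x, points, k=5):
    """Local outlier score: LoD(x) / LoD(z_hat), z_hat the farthest of x's KNN."""
    d = np.linalg.norm(points - x[:, None], axis=0)
    knn = np.argsort(d)[1:k + 1]            # index 0 is x itself (distance 0)
    z_hat = points[:, knn[-1]]              # farthest neighbor among the KNN
    return lod(x, points, k) / lod(z_hat, points, k)
```

For a normal point the two densities are comparable, so the score stays near 1; for an outlier the numerator collapses while its farthest neighbor sits in dense data, driving the score far below 1.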

C. Implementation Issues
This section provides several crucial implementation details regarding FSPM.
There are four parameters: the ratio of pixel selection λ, the radius of the searching region δ, the number of neighbors k, and the cut-off threshold η. λ determines how many pixels will be extracted from the original hyperspectral data. According to the pseudocode of FSPM [see Algorithm 1], Line 11 is the most time-consuming step, requiring at least p(p − 1)/2 runs, each with complexity O(N log(h)). In this regard, if λ is too large, it obviously affects FSPM's time performance. Besides, since FSPM pays more attention to convex hull vertices, which are seen as potential endmembers, λ can be fixed at a relatively small ratio, such as 1% of the total pixels. δ specifies a neighborhood searching region that avoids finding the KNN from the entire dataset. Normally, δ can determine a circular searching region, but this involves a relatively complex searching procedure. In this article, we fix δ according to the vertically and horizontally extreme points because those points determine a proximate data range, and they are also easy to find from the points' coordinates. Let Max_v and Max_h, respectively, denote the vertical and horizontal maxima, and Min_v and Min_h the vertical and horizontal minima. We define δ along the two dimensions, i.e., δ_v = (Max_v − Min_v)/10 and δ_h = (Max_h − Min_h)/10. By applying δ_v and δ_h, each target point can quickly define its rectangular searching region. If we applied a circular searching region instead, δ would be fixed at √(δ_v² + δ_h²). For k, it determines the number of nearest neighbors and is fixed at 5 in our subsequent experiments. In terms of η, it is a cut-off threshold used to separate outliers from normal points. We are aware that this threshold is difficult to tune, partly because of the various anomalous types and mainly because of the potentially different sparsity of different datasets.
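The rectangular searching region can be sketched directly from the definitions above (the one-tenth rule and the circle radius follow the text; the data are a toy grid):

```python
import numpy as np

def search_radii(pair):
    """delta_v and delta_h: one tenth of the vertical/horizontal data ranges."""
    dv = (pair[1].max() - pair[1].min()) / 10.0
    dh = (pair[0].max() - pair[0].min()) / 10.0
    return dv, dh

def in_rectangle(x, pair, dv, dh):
    """Boolean mask of the points inside the rectangle region centered at x."""
    return (np.abs(pair[0] - x[0]) <= dh) & (np.abs(pair[1] - x[1]) <= dv)

pair = np.vstack([np.linspace(0, 10, 11),    # horizontal component
                  np.linspace(0, 20, 11)])   # vertical component
dv, dh = search_radii(pair)                  # dv = 2.0, dh = 1.0
mask = in_rectangle(pair[:, 0], pair, dv, dh)
```

The axis-aligned test needs only two absolute-value comparisons per point, whereas the circular region would use the single radius √(δ_v² + δ_h²) and a full Euclidean distance per point.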
In [34], Xu et al. created outliers by considering the idea that the spectral reflectance of several consecutive bands is significantly higher than that of the other bands. In [3] and [25], the authors created outliers based on the LMM, defined as

x = γ m_p + M_{p−1} ζ_{p−1}    (10)

where the abundance proportion of one endmember signature m_p is allowed to reach γ ∈ [1, 1.2]. For the remaining endmember signatures, collected in M_{p−1}, the corresponding abundance vector is ζ_{p−1}, subject to the important ASC constraint γ + 1^T_{p−1} ζ_{p−1} = 1. We adopt this outlier assumption to generate outliers for Experiment 2 and fix η at 0.1, which can accurately exclude outliers.
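The LMM-based outlier assumption above can be sketched as follows. This is an illustrative generator under stated assumptions (random toy signatures, uniformly drawn ζ rescaled to satisfy the ASC), not the authors' code:

```python
import numpy as np

def make_outlier(endmembers, p_idx, gamma, rng):
    """Generate one LMM-based outlier as described above (a sketch).

    `endmembers` is an L x p matrix of spectral signatures. The chosen
    signature receives an over-unity abundance gamma in [1, 1.2]; the
    remaining abundances zeta are rescaled to sum to 1 - gamma so that
    the ASC, gamma + 1^T zeta = 1, still holds.
    """
    L, p = endmembers.shape
    zeta = rng.random(p - 1)
    zeta = zeta / zeta.sum() * (1.0 - gamma)        # enforce the ASC
    others = np.delete(np.arange(p), p_idx)
    return gamma * endmembers[:, p_idx] + endmembers[:, others] @ zeta

rng = np.random.default_rng(1)
M = rng.random((224, 3))          # three toy signatures over 224 bands
x = make_outlier(M, p_idx=0, gamma=1.2, rng=rng)
# The full abundance vector still sums to one even though gamma > 1, so
# the outlier satisfies the ASC while leaving the abundance simplex.
```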
In Algorithm 1, we use a while-do loop to iteratively select convex hull vertices until the desired number λ × N is reached. It is worth mentioning that we remove already-determined vertices from F, which not only avoids a repetitive selection process but also gradually shrinks the data scale [see Line 25].
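The while-do loop above amounts to iterative convex hull peeling, which can be sketched as follows. The hull routine is a standard monotone-chain implementation; the sketch omits the LOS-based outlier test that Algorithm 1 interleaves with vertex extraction:

```python
import numpy as np

def hull_vertices(pts):
    """2-D convex hull via Andrew's monotone chain; returns vertex indices."""
    order = np.lexsort((pts[:, 1], pts[:, 0]))
    def half(idx):
        out = []
        for i in idx:
            # Pop points that would make a non-counter-clockwise turn.
            while len(out) >= 2:
                o, a = pts[out[-2]], pts[out[-1]]
                cross = (a[0] - o[0]) * (pts[i][1] - o[1]) \
                      - (a[1] - o[1]) * (pts[i][0] - o[0])
                if cross <= 0:
                    out.pop()
                else:
                    break
            out.append(i)
        return out[:-1]
    return np.array(half(order) + half(order[::-1]))

def peel_vertices(points, target):
    """Sketch of the while-do loop: repeatedly take the convex hull
    vertices and remove them from the data until `target` points are
    collected, shrinking the data scale at every pass."""
    remaining = np.asarray(points, dtype=float)
    selected = []
    while len(selected) < target and len(remaining) > 3:
        v = hull_vertices(remaining)
        selected.extend(remaining[v])
        remaining = np.delete(remaining, v, axis=0)
    return np.array(selected[:target])

rng = np.random.default_rng(2)
cloud = rng.normal(size=(500, 2))         # stand-in for one component pair
subset = peel_vertices(cloud, target=25)  # keep about 5% of the pixels
print(subset.shape)  # (25, 2)
```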

IV. RESULTS
This section details several experiments conducted on synthetic and real HSIs. All the algorithms ran on a PC with an Intel Core i7-2600K (at 3.4 GHz) and 16 GB RAM. Experimental results obtained from all the algorithms were averaged over 10 independent runs. Benchmark PPAs and EEAs were integrated to verify experimental performance. A spatial-spectral-based EEA called SENMAV [25] is considered to supply comparisons of experimental results. Besides, the parameters of the four benchmark PPAs are carefully set according to their original publications [26]-[29].

A. Experimental Preparations
2) Evaluation Metrics: 1) SAD: SAD assesses the spectral similarity between two spectra, given by

SAD(α, β) = arccos( (α^T β) / (‖α‖ ‖β‖) )    (11)

where α and β denote an extracted endmember spectrum and a library spectrum, respectively. The higher the spectral similarity between α and β, the smaller the SAD. 2) RMSE: RMSE evaluates the reconstruction error between two images, given by

RMSE(Y, Ŷ) = sqrt( (1/(L N)) ‖Y − Ŷ‖_F² )

where Y and Ŷ represent the original and the estimated image, respectively, both with L bands and N total pixels. The lower the RMSE values, the better the reconstruction performance.
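Both metrics can be sketched in a few lines (a minimal sketch; the element-wise averaging over L · N entries in RMSE follows the definition above):

```python
import numpy as np

def sad(alpha, beta):
    """Spectral angle distance (radians) between two spectra."""
    cos = alpha @ beta / (np.linalg.norm(alpha) * np.linalg.norm(beta))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))  # clip guards rounding

def rmse(Y, Y_hat):
    """Root-mean-square reconstruction error over an L x N image matrix."""
    return float(np.sqrt(np.mean((np.asarray(Y) - np.asarray(Y_hat)) ** 2)))

s = np.linspace(0.1, 1.0, 224)   # a toy 224-band spectrum
noisy = s + 0.01
print(sad(s, 2 * s) < 1e-6)      # True: SAD is invariant to spectral scaling
print(round(rmse(s, noisy), 2))  # 0.01
```

The scale invariance of SAD is what makes it suitable for comparing endmember shapes regardless of illumination-induced amplitude differences.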
3) Speedup: Traditionally, researchers record the computational time of PPAs and EEAs in order to evaluate the performance of PPAs. Although most PPAs have relatively low computational requirements, the existing PPAs suffer from the fact that the joint action of preprocessing the image with a PPA and then extracting the endmembers from the resulting image takes more time than directly applying the EEA. In other words, they barely produce the desired acceleration for subsequent EEAs. In this context, we consider the speedup metric to verify the acceleration performance of a PPA. Speedup measures the ratio between the computational cost of the EEA without a PPA and that of the PPA-EEA combination, defined as follows:

Speedup = T_eea / (T_ppa + T_ppa-eea)

where T_eea, T_ppa, and T_ppa-eea stand for the EEA's execution time on the original hyperspectral data, the PPA's preprocessing cost, and the EEA's execution time on the preprocessed data, respectively. If the speedup result is greater than 1, the PPA achieves the goal of accelerating the endmember extraction of the EEA. 4) PPM: In previous publications with regard to PPAs, the authors separately record the endmember accuracy of PPA-EEAs and EEAs but fail to consider whether PPAs have the ability to improve the endmember accuracy.
In this regard, we evaluate the endmember accuracy of both PPA-EEAs and EEAs and propose a new preprocessing performance measurement (PPM) to evaluate the effectiveness of PPAs from the time and endmember accuracy perspectives. The main motivation behind this metric is that existing metrics such as SAD and RMSE in the field of hyperspectral unmixing barely measure the difference between the endmember extraction accuracy obtained from the preprocessed image and that from the original nonpreprocessed image. We use a log operation to measure the difference between the SAD results obtained from the EEA and those obtained from the PPA-EEA combination, and unify the speedup results and SAD differences as

PPM = Speedup × log( SAD_eea / SAD_ppa-eea )

The chief benefit of PPM is that it directly provides performance comparisons among PPAs. Once the endmember extraction accuracy obtained from the PPA-EEA combination is higher than that of the EEA, PPM is positive, and negative otherwise. An optimal or ideal PPA should generate PPM values tending to positive infinity. It is worth mentioning that RMSE can also be used to compute PPM when SAD is not available [see Experiment 7].
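The two metrics can be combined as sketched below. Note that the exact form in which speedup and the log SAD ratio are unified is treated here as an assumption reconstructed from the description above:

```python
import math

def speedup(t_eea, t_ppa, t_ppa_eea):
    """Acceleration of the PPA-EEA combination over the EEA alone."""
    return t_eea / (t_ppa + t_ppa_eea)

def ppm(t_eea, t_ppa, t_ppa_eea, sad_eea, sad_ppa_eea):
    """Preprocessing performance measurement (assumed combination rule):
    positive when the PPA both accelerates the EEA and lowers its SAD."""
    return speedup(t_eea, t_ppa, t_ppa_eea) * math.log(sad_eea / sad_ppa_eea)

# A PPA that cuts extraction from 100 s to 2 s (plus 3 s of preprocessing)
# and improves SAD from 0.08 to 0.04 yields a clearly positive PPM.
print(speedup(100.0, 3.0, 2.0))           # 20.0
print(ppm(100.0, 3.0, 2.0, 0.08, 0.04))   # positive
```

With this convention, a PPA that accelerates extraction but degrades accuracy (SAD_ppa-eea > SAD_eea) is penalised with a negative PPM, matching the sign behaviour described above.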

3) Hyperspectral Datasets: a) Synthetic dataset (DC1): DC1 is a representative synthetic image generator that produces images according to different parameter settings [2]. Assume the desired image contains n × n pixels. To generate mixed pixels, the image is divided into equal blocks, each filled with one of the p spectral signatures, and a w × w spatial low-pass filter is then applied. For the purpose of validating preprocessing performance, the parameters were set and reported according to the specific experiment requirements [see Fig. 7(a)]. Besides, the endmembers are selected from the U.S. Geological Survey (USGS) library. b) Indian Pines: Indian Pines was gathered by the Airborne Visible Infrared Imaging Spectrometer (AVIRIS) sensor over the Indian Pines test site in Northwest Indiana. This image contains 145 × 145 pixels, 224 bands, and 16 ground-truth classes [see Fig. 7(b)]. After removing the noise and water absorption bands (the excluded bands were 104-108, 150-163, and 220), 200 bands were retained. c) Cuprite: Cuprite is a real hyperspectral dataset captured by AVIRIS in Las Vegas, NV, USA. After removing the noise and water absorption bands (the excluded bands were 1-6, 105-115, 150-170, and 221-224) and selecting a region of interest, we use a 250 × 190 × 182 hyperspectral cube [see Fig. 7(c)]. In [16], 14 types of endmembers are identified for endmember extraction, which can be estimated via VD [37]. In [6], the authors believe that there exist 10 endmembers. In [7],
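The block-fill-then-filter generation of DC1-style data can be sketched as follows. This is a toy sketch under stated assumptions (random block labelling, a uniform box filter, and SNR-scaled white Gaussian noise), not the exact generator of [2]:

```python
import numpy as np

def make_dc1(signatures, n=60, block=10, w=5, snr_db=30, seed=0):
    """Toy DC1-style synthetic HSI: n x n x L cube of mixed, noisy pixels.

    signatures: L x p matrix of endmember spectra.
    """
    rng = np.random.default_rng(seed)
    L, p = signatures.shape
    # 1) Fill each (block x block) tile with one of the p signatures.
    labels = rng.integers(0, p, size=(n // block, n // block))
    labels = np.kron(labels, np.ones((block, block), dtype=int))
    cube = signatures[:, labels].transpose(1, 2, 0)          # n x n x L
    # 2) Mix pixels with a w x w spatial low-pass (box) filter per band.
    kernel = np.ones((w, w)) / (w * w)
    mixed = np.empty_like(cube)
    for b in range(L):
        padded = np.pad(cube[:, :, b], w // 2, mode="edge")
        windows = np.lib.stride_tricks.sliding_window_view(padded, (w, w))
        mixed[:, :, b] = (windows * kernel).sum(axis=(-1, -2))
    # 3) Add white Gaussian noise scaled to the requested SNR (in dB).
    power = np.mean(mixed ** 2)
    sigma = np.sqrt(power / (10 ** (snr_db / 10)))
    return mixed + rng.normal(0.0, sigma, mixed.shape)

M = np.random.default_rng(3).random((50, 3))  # 3 toy signatures, 50 bands
Y = make_dc1(M, n=60, block=10, w=5, snr_db=30)
print(Y.shape)  # (60, 60, 50)
```

Enlarging the filter window w increases the mixing between neighbouring blocks, which is how the pure-pixel and no-pure-pixel variants used in Experiment 4 are obtained.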

1) Experiment 1 (Parameter Sensitivity Analysis):
To provide visual descriptions of the impact of different λ on experimental performance, this experiment was conducted on DC1 with 257 × 257 pixels, 224 bands, three endmembers, and 30-dB noise. The detailed parameter settings are tabulated in Table I. As can be seen from Fig. 8(a), when combined with FSPM, N-FINDR provides robust and accurate endmember extraction performance, and the endmember accuracy is only weakly related to λ. When λ goes up, the time consumption of FSPM also grows, but it remains lower than that of N-FINDR [see Fig. 8(b)]. Besides, the speedup results show a declining trend as λ increases, mainly because a small λ provides only a few pixels for subsequent endmember identification, so that N-FINDR can perform a very fast unmixing task [see Fig. 8(c)]. Fig. 9 visually displays the preprocessed data with different λ from 0.1% to 2.5%. As shown in Fig. 9(b), FSPM can provide the desired high-quality preprocessed pixels, which normally lie on the vertices and boundaries of the simplex.
2) Experiment 2 (Outlier Sensitivity Analysis): To assess the experimental performance of PPAs on outliers, this experiment considered two versions of DC1. The first contained three endmembers and three outliers, while the second did not contain outliers. To generate outliers, we first randomly selected three normal pixels from DC1 and then applied (10) to recreate outliers according to the endmembers and true abundance fractions of the three normal pixels. The three normal pixels were finally replaced by the three outliers, yielding a synthetic dataset containing outliers for this experiment. Fig. 10(a) and (b) provide the dataset used in this experiment and the outlier spectra, respectively. Table II displays the SAD results obtained from N-FINDR and the PPA-EEA combinations. It can be seen from this table that when the dataset contains outliers, N-FINDR and RCSPP-NFINDR show different SAD results on the three endmembers compared to those without outliers. Four PPA-EEA combinations, including SPP-NFINDR, SSPP-NFINDR, SSPM-NFINDR, and FSPM-NFINDR, are robust to outliers, especially FSPM, which not only removes outliers but also provides the lowest and most consistent SAD results.

3) Experiment 3 (Visual Comparisons of Spectral Signatures
Obtained From EEA and PPA-EEA Combinations): This experiment visually verified the spectral signatures extracted by the different algorithms, where the low-pass filter window was fixed at 10, the noise levels varied from 10 to 40 dB, and four endmembers were considered. As can be found in Fig. 11, compared to the endmember signatures obtained from the four PPA-EEA combinations and N-FINDR, FSPM-NFINDR is able to extract endmember signatures that accurately match the USGS spectra. Also, under all scenarios, FSPM-NFINDR shows the minimum SAD values compared to the other algorithms. (In Table II, "Y" and "N" denote that the dataset does and does not contain outliers, respectively; "×" indicates that the algorithm is sensitive to outliers, whereas the check mark indicates that it avoids the interference of outliers. The best results are in bold.)

4) Experiment 4 (Visualization of Preprocessed Data):
This experiment was conducted on DC1, which contains 237 × 237 pixels with three endmembers and 224 spectral bands. The noise level was fixed at 30 dB. To simulate real-world scenarios, we used two spatial low-pass filters to generate mixed pixels: the first used a 20 × 20 low-pass filter window, so that there are no pure pixels [see Fig. 12(a)], and the second used a 10 × 10 low-pass filter window to retain pure pixels [see Fig. 13(a)].
As can be seen from Fig. 12(a)-(g), FSPM (green points) not only mitigates noise but also provides a few high-quality data points that lie on the vertices and boundaries of the simplex. Compared to FSPM, SPP (orange points) and SSPP (pink points) can alleviate noise to some extent, but they barely produce the desired high-quality data points. Likewise, in Fig. 13(a)-(g), FSPM again finds a set of endmember candidates from noisy data. Although SPP removes a large percentage of noise owing to its pixel reconstruction process, it also affects the vertices of the simplex, which may impact the subsequent endmember accuracy. SSPM can identify the positions of pure pixels, but the preprocessed data are still affected by noise. In terms of RCSPP, it applies a superpixel algorithm to segment the dataset into a set of superpixels and retains high-quality pixels from each superpixel.

Fig. 14(a) displays the mean SAD (mSAD) trends averaged over the three endmembers. Compared to N-FINDR and the other PPA-EEA combinations, N-FINDR coupled with FSPM produces higher endmember accuracy. Fig. 14(b) illustrates the speedup results on the different datasets. Because FSPM yields only a few endmember candidates and has a negligible time burden itself, it provides a larger acceleration than the other PPAs on all the datasets when combined with N-FINDR. In terms of time consumption, Fig. 14 provides the corresponding comparisons. Since MVSA relies on data points near the simplex boundary, it may find it difficult to determine the desired simplex when the noise worsens. In this regard, SPP can reconstruct each pixel using its neighborhood, which alleviates noise and recovers the data simplex to some degree. Besides, FSPM improves MVSA's endmember accuracy according to the PPM results, and it also provides good acceleration for MVSA. In terms of MVC-NMF, it normally requires considerable computational time owing to its nonconvex regularizer, which hinders its optimization speed.

When combined with FSPM, MVC-NMF achieves an acceleration of over 130 times while simultaneously improving endmember accuracy. SENMAV provides a comparison from a spatial-spectral-based endmember extraction viewpoint. Compared with SENMAV, some PPAs can promote the EEA's endmember extraction performance from a spatial-spectral perspective.

7) Experiment 7 (Comparison Between EEAs and PPA-EEA Combinations on Indian Pines Dataset):
This experiment was conducted on Indian Pines, which contains 145 × 145 pixels, 200 pruned bands, and 16 classes. Since there are no available reference spectra for the 16 classes, mSAD was not provided for this dataset, but the 16 classes were regarded as the number of endmembers.

Table V reports the endmember extraction performances associated with five EEAs and five PPA-EEA combinations on the Cuprite dataset. It is worth noting that when the extracted endmembers are matched with the USGS library by considering SAD, this matching procedure is suboptimal because it may strongly depend on the order in which the endmembers are matched. In this case, the endmember matched first may affect the subsequent matching of the remaining endmembers [26]. To avoid this problem, we select six representative minerals to calculate mSAD, because they can be accurately matched with most of the relevant minerals in the USGS library under different endmember extraction scenarios: alunite, buddingtonite, dumortierite, kaolinite, muscovite, and montmorillonite. Table V also records the execution time of each EEA and PPA-EEA combination (in seconds) on the Cuprite hyperspectral dataset. As can be seen from Table V, when combined with FSPM, N-FINDR and VCA generate the best experimental results regarding all evaluation metrics and also provide the lowest time consumption. For MVSA, SSPM provides the best acceleration performance but barely promotes its endmember accuracy; compared to the other PPAs, FSPM provides lower SAD values as well as better PPM results for MVSA. In terms of MVC-NMF, SPP produces better endmember accuracy, but FSPM has the lowest reconstruction errors and the highest acceleration performance. For OSP, although FSPM provides the best speedup results, SSPM provides the lowest SAD, RMSE, and PPM results.
The reconstruction error maps obtained from all the algorithms are displayed in Fig. 16. We are aware that the number of estimated endmembers was fixed at 12, potentially resulting in higher RMSE results in some regions than when 14 types of endmembers are used. However, the algorithmic structures of the different endmember extraction methods also lead, to some extent, to insufficient endmember extraction performance. For instance, compared to pure-pixel-assumption-based EEAs, NMF-based algorithms such as MVC-NMF can produce lower RMSE results under the same setting of endmember numbers.

V. CONCLUSION
This article investigates a new fast preprocessing algorithm based on subspace exploitation, called FSPM. Specifically, FSPM does not follow the traditional process of spatial-spectral exploration of the HSI but rather first reduces the HSI into a low-dimensional subspace by performing SVD. FSPM then treats spatial and spectral information exploitation as the processes of removing outliers beyond the simplex and specifying convex hull vertices, respectively. After iteratively extracting the desired data points from the subspace, FSPM finally transforms the determined data points into a noise-reduced data space. Compared to existing representative PPAs, FSPM shows advantages in the quality of its endmember candidates and in its time requirements. Based on different validation metrics, experiments conducted on synthetic and real hyperspectral images indicate that FSPM not only has lower computational requirements but also provides a more preferable data subset than its counterparts.
We are also aware of underlying flaws that demand future research. The first is that the proposed FSPM requires several parameters to separate outliers from normal data. Although several of these parameters, including the ratio of pixel selection λ, the radius of the searching region δ, and the number of neighborhoods k, can be easily tuned, the most important parameter, i.e., the cut-off threshold η, requires a more careful tuning process according to the specific dataset and outlier scenarios. The second is that the computational burden of FSPM is affected by the number of endmembers.