Length Reduction of Singular Spectrum Analysis With Guarantee Exact Perfect Reconstruction via Block Sliding Approach

The conventional singular spectrum analysis is to divide a signal into segments where there is only one non-overlapping point between two consecutive segments. By putting these segments into the columns of a matrix and performing the singular value decomposition on the matrix, various one dimensional singular spectrum analysis vectors are obtained. Since different one dimensional singular spectrum analysis vectors represent different parts of the signal such as the trend part, the oscillation part and the noise part of the signal, the singular spectrum analysis plays a very important role in the nonlinear and adaptive signal analysis. However, as the length of each one dimensional singular spectrum analysis vector is the same as that of the original signal, there is a redundancy among these one dimensional singular spectrum analysis vectors. In order to reduce the required computational power and the required units for the memory storage for performing the singular spectrum analysis, this article proposes a method to reduce the total number of the elements of all the one dimensional singular spectrum analysis vectors. In particular, the length of the shift block for generating the trajectory matrix is increased from one to a positive integer greater than one under a certain criterion. In this case, the total number of the columns of the trajectory matrix is reduced. As a result, the total number of the off-diagonals of all the two dimensional singular spectrum analysis matrices is reduced. Hence, the total number of the elements of all the one dimensional singular spectrum analysis vectors is reduced. In order to guarantee exact perfect reconstruction, this article reformulates the de-Hankelization process. In particular, the first element of the off-diagonals of all the two dimensional singular spectrum analysis matrices is taken as the elements of the one dimensional singular spectrum analysis vectors. Exact perfect reconstruction condition is derived. Simulations show that exact perfect reconstruction can be achieved while the total number of the elements of all the one dimensional singular spectrum analysis vectors is significantly reduced.


I. INTRODUCTION
Singular spectrum analysis is a kind of nonlinear and adaptive time frequency analysis [1]. The signals are represented as the sum of the one dimensional singular spectrum analysis vectors [1]. Based on the rationale of judiciously selecting one The associate editor coordinating the review of this manuscript and approving it for publication was Jinming Wen . dimensional singular analysis vectors, successful applications [22] include signal denoising [2]- [5], underlying trend extraction [6], pattern recognition [7] and peak detection [12]. However, unlike the conventional linear and nonadaptive time frequency analysis such as the maximally decimated filter bank analysis [8], [9], the singular spectrum analysis is a kind of oversampled time frequency analysis [1]. In particular, the total number of the elements of all the one dimensional VOLUME 8, 2020 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ singular spectrum analysis vectors is L times the length of the original signal. Here, L is the total number of the one dimensional singular spectrum analysis vectors [1]. For some applications such as data compression, the increase in the total number of the elements in the representation would significantly degrade the application performance [10]. In the mean time, a uniform decimation has been applied to the subband components of a uniform filter bank to reduce the total number of the elements of all the represented components [19]- [21]. If this technique is applied to the singular spectrum analysis, then the rows of the trajectory matrix are the polyphase components of the original signal [16]- [18]. However, the two dimensional singular spectrum analysis matrices are obtained via performing the eigen value decomposition on the trajectory matrix [1]. Hence, the one dimensional singular spectrum analysis vectors are in general not the polyphase components of the original signal. As a result, whether exact perfect reconstruction of the original signal can be achieved or not via representing the rows of the trajectory matrix as the polyphase components of the original signal is unknown. This article is to address this issue.
Besides, Hankelization is to map the one dimensional signal to a trajectory matrix and the de-Hankelization is to map the two dimensional singular spectrum analysis matrices to the one dimensional singular spectrum analysis vectors. The most common de-Hankelization method is via the diagonal averaging approach [13]- [15]. That is, the average value of each off-diagonal of each two dimensional singular spectrum analysis matrix is computed [13]- [15]. These average values of these off-diagonals of each two dimensional singular spectrum analysis matrix from the elements of each one dimensional singular spectrum analysis vector [13]- [15]. Likewise, the norm approach is proposed to determine the elements of the one dimensional singular spectrum analysis vectors [11]. However, the lengths of these one dimensional singular spectrum analysis vectors are shorter than that of the original signal if the rows of the trajectory matrix are the polyphase components of the original signal. Therefore, summing up of these one dimensional singular spectrum analysis vectors cannot reconstruct the original signal. This article is to address this issue.
Overall, the novelty and the contribution of this article is to propose a method to reduce the total number of the elements of all the one dimensional singular spectrum analysis vectors. Here, the length of the shift block is increased to reduce the total number of the columns of the trajectory matrix. On the other hand, exact perfect reconstruction is guaranteed. To achieve this objective, the de-Hankelization is reformulated. In particular, the first element of the off-diagonals of all the two dimensional singular spectrum analysis matrices is taken as the elements of the one dimensional singular spectrum analysis vectors. Up to the authors' understanding, this approach for performing the de-Hankelization as well as this approach for reducing the total number of the elements of all the one dimensional singular spectrum analysis vectors with the guarantee exact perfect reconstruction have not investigated before.
The outline of this article is as follows. The conventional singular spectrum analysis is reviewed in Section II. The proposed method for reducing the total number of the elements of all the one dimensional singular spectrum analysis vectors is presented in Section III. Numerical simulation results are presented in Section IV. Finally, a conclusion is drawn in Section V.

II. REVIEW ON THE CONVENTIONAL SINGULAR SPECTRUM ANALYSIS
This section summarizes the procedures for performing the conventional singular spectrum analysis. Let the length of a signal be N . Denote the signal as x(n). Let the vector T ∈ R N Here, the superscript ''T '' is denoted as the transposition operator and R a is denoted as the set of the a dimensional real valued vectors. Denote the trajectory matrix as Here, a×b is denoted as the set of the a×b real valued matrices.
Let the left unitary matrix and the right unitary matrix obtained by performing the singular value decomposition on x be U ∈ R L×L and V ∈ (N −L+1)×(N −L+1) , respectively. Also, let the diagonal matrix obtained by performing the singular value decomposition on X be ∈ L×L . That is, Here, 0 a×b is denoted as the a×b zero matrix and the superscript ''H '' is denoted as the conjugate transposition operator. As the total number of the columns in the zero matrix is N−2 L+1 which is required to be a non-negative integer, N ≥2 L− 1 is required to be satisfied. Let the columns of U and the columns of V be u i ∈ R L for i = 0, . . . , L−1 and v i ∈ R N −L+1 for i = 0, . . . , N−L, respectively. That is, Also, let the diagonal elements of be λ 1 for i = 0, . . . , L−1. Define  [1], the conventional diagonal averaging method is employed. That is, the elements of the one dimensional singular spectrum analysis vectors are defined as:

A. CONSTRUCTION OF THE TRAJECTORY MATRIX
Let the length of the shift block for generating the trajectory matrix be D ∈ Z + . Here, Z + is denoted as the set of the positive integers. For the conventional singular spectrum analysis [1], D = 1. In this article, we consider the case that D > 1. In this case, the trajectory matrix as Since the index of the first element in the last column of the trajectory matrix is N−L which is required to be an integer multiple of D, N −L D ∈ Z + is required to be satisfied. It is worth noting that the k th row of X is the k th polyphase component of x(n) for k = 0, . . . , L−1 [8]. Define the mapping from the set of x to the set of X be f . That is, f (x) = X. Obviously, f is invertible if and only if L ≥ D This is because there is no data loss when L ≥ D and vice versa.

B. CONSTRUCTION OF THE TWO DIMENSIONAL SINGULAR SPECTRUM ANALYSIS MATRICES
As the total number of the columns in the zero matrix is N −L D −L+1 which is required to be a non-negative integer, N ≥ (D+1)L−D is required to be satisfied. Now, for i = 0, . . . , L−1. It can be checked easily that X = Obviously, the length of µ i for i = 0, . . . , L−1 is smaller than the length of x. Hence, the total number of the elements of all the one dimensional singular spectrum analysis vectors is significantly reduced.
For performing the conventional diagonal averaging based de-Hankelization [1], we have: in the above diagonal averaging based de-Hankelization process. However, these equations consist of the sum of one or more than one quadratic term. As it is required to access many different storage units especially when the size of the matrix is large, solving these equations requires a high computational power and the memory storage units. Moreover, there are (L−1)! and N −L D ! vector equations governing the orthogonal conditions among the columns of U and V, respectively. Furthermore, U, D, V and X are also related together by the singular value decomposition. Hence, it is very difficult to find µ i using X i for i = 0, . . . , L−1.
To address this difficulty, this article proposes to reformulate the de-Hankelization process without solving those complicated equations. It is worth noting that X i for i = 0, . . . , L−1 is a rank one matrix which is completely characterized by the first element of its off-diagonals. Hence, µ i,j is chosen as the first element of y i,j for i = 0, . . . , L−1 and for i = 0, . . . , L−1

D. RECONSTRUCTION OF THE TWO DIMENSIONAL SINGULAR SPECTRUM ANALYSIS MATRICES AND THE TRAJECTORY MATRIX
Let the reconstructed two dimensional singular spectrum Then, the trajectory matrix X can be recovered fromX. That isX = X. This is becauseX i = X i for i = 0, . . . , L−1.

E. RECONSTRUCTION OF THE ORIGINAL SIGNAL
Let the vector of the reconstructed signal based onX bẽ Here, the polyphase components ofx are chosen as the rows ofX as shown in Figure 1. In particular, by picking up the first D rows ofX and upsampling each row with the appropriate delays,x is formed by adding these delayed upsampled rows together.

F. SUMMARY OF THE PROCEDURES FOR PERFORMING THE DECIMATED SINGULAR SPECTRUM ANALYSIS USING THE PSEUDO CODE
To summarize the procedures for performing the decimated singular spectrum analysis, the pseudo code is described below: Decomposition Phase: Step 1: Construct the trajectory matrix X where Step 2: Perform the singular value decomposition on X to obtain U, and V. That is, X = U 0 Step 3: Compute the two dimensional singular spectrum analysis matrices Step 5: Compute the elements in the one dimensional singular spectrum analysis vectors µ i,k where
Step 7: Compute the reconstructed trajectory matrixX whereX = L−1 i=0X i . Step 8: Compute the reconstructed signalx where the polyphase components ofx are chosen as the rows ofX. That is, pick up the first D rows ofX, upsample each picked up row and delay it, thenx is the sum of these delayed upsampled rows.

G. REVIEW ON THE SINGULAR SPECTRUM OF A SIGNAL
For the conventional singular spectrum analysis, the distribution of the energy of the eigentriple to the total energy of all the eigentriples of the trajectory matrix is called the singular spectrum of a signal. That is, the distribution of where l = N −L+1 L and z represents the integer part of a real number z.

H. EXACT PERFECT RECONSTRUCTION
Since the total number of the elements of all the one dimensional singular spectrum analysis vectors is reduced, in general it is not guaranteed to achieve exact perfect reconstruction. This article aims to address this issue.
If L ≥ D, then exact perfect reconstruction of x can be achieved. That is,x = x. This result is summarized in the following theorem: This implies that x can be perfected reconstructed using µ i and g i for i = 0, . . . , L−1. This proves the necessary part.
On the other hand, if L < D, then the information is lost. As a result, x cannot be perfectly reconstructed using µ i and g i for i = 0, . . . , L−1. This proves the sufficient part and the proof is completed.
From Theorem 1, it can be seen that exact perfect reconstruction only depends on the length of the shift block for generating the trajectory matrix, the number of the singular spectrum analysis components and the length of original signal. It is independent of the signal itself. Also, it is independent on whether the mode mixing phenomenon occurs or not.

I. ANALYSIS ON THE REQUIRED COMPUTATIONAL POWER AND MEMORY ANDMEMORY STORAGE
It is worth noting that our proposed method only requires L N −L D +1 memory units to generate X. On the other hand, the conventional singular spectrum analysis approach requires L(N−L+1) memory units to generate X. Table 1,  Table 2, Table 3 and Table 4 show the required computational powers for performing the singular value decomposition on X, computing the one dimensional singular spectrum analysis vectors, reconstructing the original signal and performing the overall operations, respectively. It can be seen from Table 1 to Table 4 that our proposed method requires a smaller number of multiplications and the additions as well as a smaller number of the memory storage compared to the conventional diagonal averaging based de-Hankelization method.

IV. COMPUTER NUMERICAL SIMULATION RESULTS
Without the loss of the generality, consider an identically and independently Gaussian distributed random signal with the VOLUME 8, 2020 mean being equal to zero and the variance being equal to one. Here, N = 16 is illustrated. This is because the signal with a very long length is difficult to be illustrated numerically. Moreover, L = 4 is illustrated which is equal to 25% of the signal length. This ratio is illustrated because it is usually used in the singular spectrum analysis. Furthermore, D = 3 is illustrated. This is because it refers to the nearly maximal decimation case. It can be checked easily that both N −L D ∈ Z + and N ≥ (D+1)L-D are satisfied. For a particular realization, it is found that It is worth noting that X i for i = 0, . . . , L−1 are not the Hankel matrices. Besides, we have Similarly, X i for i = 0, . . . , L−1 are not the Hankel matrices. Besides, we have: whereμ 3,0 = 0.0077 0.0207 −0.0537 andμ 3,1 = 0.0953 0.1004 0.0369 0.0931 , as well as g 0 = −0.0887, g 1 = 0.3108, g 2 = −0.0851 and and g for g 3 = 0.0953· Likewise, the elements of µ i and g i , for i = 0, . . . , L−1 are expressed as the elements of X i . Next, we computeX i for i = 0, . . . , L−1. We also havẽ X t = X i for i = 0, . . . , L−1. Here, the elements ofX i are expressed as the elements of µ i and g i for i = 0, . . . , L−1. Then, we can computex andx. We still haveX = X and That means, exact perfect reconstruction on x is achieved. On the other hand, since L ≥ D, Theorem 1 also gives the same conclusion. It is worth noting that the signal representation consists of two major parts. The first part is the signal decomposition and the second part is the signal synthesis. For the signal decomposition, if the original signal composites of several components, then the ideal case for the signal decomposition is to represent the original signal as the sum of these signal components. If this is not the case, then the mode mixing phenomenon occurs. For the empirical mode decomposition, it represents the original signal as the sum of the intrinsic mode functions and different intrinsic mode functions are localized in different frequency bands. Hence, if the components of the original signal are localized in different frequency bands, then the mode mixing phenomenon may not occur. However, as the objective of the singular spectrum analysis is to represent the original signal as the sum of the components with different significances, where the significances of the components are determined by the eigenvalues of the trajectory matrix, there is no control on the occurrence of the mode fixing phenomenon. Based on this, now consider a signal separation problem. In particular, the length of the signal is equal to 1000. That is, N = 1000. The signal is composed of 4 finite length sinusoidal components. Their maximum absolute values are equal to 1,0.2,0.1 and 0.05, respectively, as well as their frequencies are equal to 0.005,0.08665,0.16835 and 0.25, respectively. That is, x(n) = sin(0.01πn)+0.2 sin(0.1733π n) +0.1 sin(0.3367πn)+0.05 sin(0.5π n) (27) for n = 0, . . . , N−1. Here, the finite length sinusoidal signals are used for the illustration becape of the following two reasons. First, as the finite length sinusoidal signals can be modeled by the products of the rectangle window functions and the corresponding infinite length sinusoidal signals, these signals are asymptotically separable. Second, these signals are widely used in many engineering applications such as in the communication applications. since there are four finite length sinusoidal components, L = 4 is chosen. In order to achieve the maximally decimation, D = 4 is chosen. Figure 2 plots the original signal. Then, we compute X j for i = 0, . . . , L−1. Similarly, X i for i = 0, . . . , L− 1 are not the Hankel matrices. Next, we and g i for i = 0, . . . , L−1. Obviously, the compute µ i and g i elements of µ i and g i for i = 0, . . . , L−1 are expressed in terms of the elements of X i . Figure 3 plots these four one dimensional singular spectrum analysis components obtained based on both our proposed de-Hankelization method and the conventional diagonal averaging based deHankelization method [1]. Here, it can be seen from Figure 3 that the maximum absolute values of the one dimensional singular spectrum analysis components obtained by our proposed de-Hankelization method are 1.1881, 0.2950, 0.1095 and 0.0094, respectively. On the other hand, the maximum absolute values of the one dimensional singular spectrum analysis components obtained by conventional diagonal averaging based de-Hankelization method are 1.0382,0.0600,0.0523 and 0.0073, respectively. Compare with the maximum absolute values of the original signal components which are 1,0.2,0.1 and 0.05, respectively, it can be seen that the errors obtained by our proposed deHankelization method are lower than those obtained by the conventional diagonal averaging based de-Hankelization  method. Next, we computeX i for i = 0, . . . , L−1. Obviously, the elements ofX i are expressed as the elements of µ i and g i for i = 0, . . . , L−1 We also haveX i = X i for i = 0, . . . , L−1. Finally, we can computeX and x. We still haveX = X andx = x. Figure 4 plots the reconstruction signal. It can be seen that exact perfect reconstruction is achieved. On the other hand, as the length of the reconstructed signal based on the diagonal averaging based de-Hankelization method is shorter than that of the  original signal, exact perfect reconstruction is impossible for the diagonal averaging based de-Hankelization method. Now, consider an example of the signal with the length the same as that of the previous example. The signal is also composed of 4 finite length sinusoidal components. Their maximum absolute values are the same as those of the previous example, but their frequencies are changed to 0.005, 0.01, 0.015 and 0.02, respectively. That is,  Figure 5 plots the original signal. Then, we compute X i for i = 0, . . . , L−1. Similarly, X i for i = 0, . . . , L−1 are not the and g for Hankel matrices. Next, we compute µ i and for g i i = 0, . . . , L−1. Obviously, the elements of µ i and g i for i = 0, . . . , L−1 are expressed as the elements of X i . Figure 6 plots these four one dimensional singular spectrum analysis components obtained based on both our proposed deHankelization method and the conventional diagonal averaging based de-Hankelization method [1]. Compare with the maximum absolute values of the original signal components, it can be seen that the errors obtained by our proposed de-Hankelization method are lower than those obtained by the conventional diagonal averaging based deHankelization method. Besides, it is worth noting that the obtained one dimensional singular spectrum analysis components are not the corresponding signal components. It can be seen from Figure 6 that there are many harmonics in the singular spectrum analysis components. That means, the mode mixing phenomenon occurs. However, it can be seen from Figure 6 that the second one dimensional singular spectrum analysis component obtained by our proposed de-Hankelization method contains a smaller number of harmonics compared to that obtained by the conventional diagonal averaging based de-Hankelization method [1]. Next, we computeX i for i = 0, . . . , L−1 Obviously, the elements ofX i are expressed as the elements of µ i and g i for i = 0, . . . , L−1. We also haveX i = X i for i = 0, . . . , L−1. Finally, we can computeX andx· We still haveX = X andx = x· Figure 7 plots the reconstruction signal. It can be seen that exact perfect reconstruction is achieved. On the other hand, as the length of the reconstructed signal based on the diagonal averaging based de-Hankelization method is shorter than that of the original signal, exact perfect reconstruction is impossible for the diagonal averaging based de-Hankelization method. Finally, our proposed de-Hankelization method is applied to a hyperspectral image. This image is taken from the Indian Pines dataset where the AVIRIS imaging spectrometer is used for taking the images with the range of the imaging wavelength being between 0.4µm and 2.5µm and the bandwidth being below 10 nm. Therefore, the image is with 200 consecutive bands. A pixel in this image is randomly selected and these 200 consecutive radiation intensity values at different bands are obtained. The xcoordinate and the y-coordinate of Figure 8 refer to the index of these 200 bands and the radiation intensity corresponding to different bands, respectively. Obviously, this signal cannot be represented as the sum of a small number  of the finite length sinusoidal components. That means, this is not an asymptotically separable signal. In this example, L = 4 and D = 4 are illustrated. First, we compute X i for i = 0, . . . , L−1. Likewise, X i for i = 0, . . . , L−1 are not the Hankel matrices. Second, we compute µ i and g i for i = 0, . . . , L−1. Obviously, the elements of µ i and g i for i = 0, . . . , L−1 are expressed as the elements of X i . Third, we computeX i for i = 0, . . . , L−1. We also haveX i = X i for i = 0, . . . , L−1. Here, the elements ofX i are expressed as the elements of µ i and g i for i = 0, . . . , L−1. Fourth, we can computeX. We still haveX = X. Finally, we computex Figure 9 shows both the original grey scale values (blue line) and the reconstructed grey scale values using the four singular spectrum analysis components obtained based on our proposed de-Hankelization method (red line). It can be seen that the red line coincides with the blue line. That means, the grey scale values can be exactly reconstructed. That is ,x = x. From here, we can see that exact perfect reconstruction can be achieved if Theorem 1 is satisfied no matter the original signal is an asymptotically separable signal or not. On the other hand, as the length of the reconstructed signal obtained by the diagonal averaging based de-Hankelization method is shorter than that of the original signal, exact perfect reconstruction is impossible for the diagonal averaging based de-Hankelization method. To demonstrate the advantages of the reduction of the total number of the elements in all the one dimensional singular spectrum analysis vectors, the denoising is performed in these one dimensional singular spectrum analysis vectors via the thresholding scheme. That is, the elements in the one dimensional singular spectrum analysis vectors are set to zero if they are smaller than a threshold value. Here, the lengths of the one dimensional singular spectrum analysis components based on our proposed deHankelization method are 53. On the other hand, those based on the conventional singular spectrum analysis approach with D = 1 are 200. Obviously, our proposed deHankelization method can significantly reduce the required computational power and the memory storage for the processing. More precisely, the total time taken for this denoising operation is computed. It is found that our proposed method requires 1.2s for the overall operation. On the other hand, the conventional singular spectrum analysis approach with D = 1 requires 8.4 s for the overall operation. Obviously, performing the denoising via our proposed method is more efficient than via the conventional singular spectrum analysis approach with D = 1.

V. CONCLUSION
This article proposes to increase the length of the shift block for generating the trajectory matrix to reduce the total number of the elements of all the one dimensional singular spectrum analysis vectors. To perfectly reconstruct the original signal, the de-Hankelization process is redefined in such a way that the first element of the off-diagonals of all the two dimensional singular spectrum analysis matrices is used to construct the one dimensional singular spectrum analysis vectors. The computer numerical simulation results show that exact perfect reconstruction can be achieved while the total number of the elements of all the one dimensional singular spectrum analysis vectors is significantly reduced.