GPR Clutter Reduction by Robust Orthonormal Subspace Learning

The clutter severely decreases the target visibility, thus the detection rates in ground penetrating radar (GPR) systems. Recently proposed robust principal component analysis (RPCA) based clutter removal method decomposes the GPR image into its low rank and sparse parts corresponding to clutter and target components. Motivated by its encouraging results, many lower complexity low rank and sparse decomposition (LRSD) methods such as go decomposition (GoDec) or robust non-negative matrix factorization (RNMF) have been applied to GPR. This paper proposes a new clutter reduction method using robust orthonormal subspace learning (ROSL). The raw GPR image is decomposed into its clutter and target parts via ROSL. The proposed method is faster than the popular RPCA. Although it has similar complexity, and similar performance with GoDec and RNMF for fine tuned parameters of these methods, the proposed method does not require any presetting of the algorithm parameters. Its performance remains independent for a broad range parameter value. Results demonstrate that the proposed method achieves 14 – 48% higher performance in terms of PSNR values than the state-of-the-art LRSD methods for an arbitrary parameter choice.


I. INTRODUCTION
Acommon problem in ground penetrating radar (GPR) systems is the presence of clutter which highly affects the target imaging/detection capabilities. The clutter is caused by several reasons such as the reflected signal from the ground surface (ground-bounce), the coupling signal between its transmitting and receiving antennas (direct-wave arrival) and the reflections from the subsurface discontinuities and other candidate objects similar to the target [1]. Besides the conventional approaches such as the simplest mean value extraction or subspace based approaches [2], [3], multiresolution-multidirection [4]- [6] and low rank and sparse decomposition (LRSD) based methods [7]- [11] have been extensively used to solve this and many similar problems [12], [13].
Subspace based clutter removal methods project the raw data matrix onto clutter and target subspaces via singular value decomposition (SVD), principal component analysis (PCA) or independent component analysis (ICA) [2]. The most dominant component corresponds to the The associate editor coordinating the review of this manuscript and approving it for publication was Qilian Liang . clutter while the remaining ones are used to represent the clutter-free target components. Thus, subspace based clutter reduction methods also provide a low rank representation of the input GPR data matrix. Although the highest eigenvalue indicates the dominant component for SVD or PCA, this ordering is not respected in ICA. Moreover, the target component may be split into many components for GPR images containing several targets. The recently proposed non-negative matrix factorization (NMF) can also be cited in the low rank methods. It has a better performance compared to other low rank based methods but a similar complexity [3].
As another approach, the target image can be decomposed into subimages using multiresolution analysis (MRA), followed by directional filtering based decomposition [4]- [6]. Target component can be recovered by direct inversion of the subbands containing target information. Despite their satisfactory target detection performance, the multiresolution-multidirection based approaches tend to spread the target part, decreasing the visual quality of the resulting images. Moreover, their complexity is much higher compared to subspace or LRSD based methods. VOLUME 8, 2020 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ LRSD based methods express an input matrix as the sum of a low rank and a sparse matrix. It is widely used in image and video processing topics [14] such as denoising [15], classification [16], restoration [17], scene change detection and foreground/background separation [18]. In GPR, the target part can be considered as sparse compared to the whole GPR image and the clutter part can be expressed by a low rank matrix as demonstrated in [7]- [11]. Besides GPR, the clutter problem is widely encountered in synthetic aperture radar (SAR) systems. Detection of small targets from sea clutter by maritime surveillance radars can be also cast into a LRSD problem where the small target is provided by the sparse component [12], [13].
Robust principal component analysis (RPCA) is used in [7], [9] to perform LRSD for clutter reduction in subsurface and through wall imaging applications. In RPCA [19], the non-convex rank minimization problem is relaxed by the convex nuclear norm minimization. However, the nuclear norm minimization requires extensive SVD computations, decreasing the efficiency of this method for field studies. Randomized low rank and sparse matrix decomposition or the go decomposition (GoDec) [20] proposes to replace SVD operations by bilateral random projections (BRP) as a less time consuming approach [21]. Another solution for rank minimization is non-convex matrix factorizations such as robust matrix factorization (RMF) or low rank matrix fitting (LMaFIT) both of which require a presetting of the rank parameter [22]. The recently proposed robust orthonormal subspace learning (ROSL) [22] can be thought as a non-convex relaxation of RPCA. It replaces the nuclear norm minimization with the group sparsity of the coefficients under orthonormal subspace for the modelling of the low rank part. Thus it does not require any a priori information such as the rank value which is bounded by the non-zero rows of the sparse coefficient matrix [22].
Another low rank modelling attempt may be cited as [3], where the low rank modelling of the GPR image is obtained by NMF. Since the clutter dominates the reflections from the target, a small rank value provides the clutter part, while the remaining corresponds to the target part. The idea has been developed in [11] where a robust version of NMF (RNMF) [23], which gives a sparse part besides the low rank one, is used for the decomposition of the GPR image. RNMF in [11] is reasonably fast and presents excellent results, but its performance is very sensitive to the value of the penalization parameter, requiring preprocessing for its choice.
Motivated by the success of LRSD methods for GPR clutter removal, this paper proposes a new LRSD based clutter reduction method which does not suffer from the drawbacks of the former ones. The decomposition of the GPR data matrix is carried on by ROSL [22]. ROSL is a non-convex relaxation of RPCA, it accelerates the LRSD method by using fast sparse coding methods, does not require the rank information, and has the same global minimum as RPCA since the group sparsity norm of the sparse coefficient is lower bounded by the nuclear norm [22]. Thus, it requires only the knowledge of a unique parameter which provides a constant performance in a broad interval for this single parameter.
The rest of the paper is organized as follows. Section II introduces a ROSL based GPR clutter reduction method. Results for simulated and experimental datasets as well as comparisons with state-of-the art LRSD based clutter reduction methods such as RPCA, GoDec and RNMF are presented in section III. This chapter also investigates the effect of the parameter choice in LRSD based methods and proves the superiority of ROSL among them. Concluding remarks are given in section IV.

II. ROSL BASED CLUTTER REDUCTION IN GPR
Let the GPR image be represented by a rectangular matrix X with dimensions M × N representing the depth (time) and the down-track (antenna position) index. Since the reflection from the target is dominated by the clutter and the target may be considered as sparse with respect to the whole GPR image, the GPR data matrix can be modeled using LRSD as where L ∈ R M ×N and S ∈ R M ×N correspond to clutter and target parts, respectively. L and S can be recovered from the input data matrix X ∈ R M ×N via rank minimization. Since this problem is non-convex, a convex solution may be obtained by nuclear norm minimization. RPCA [19] solves this minimization as where . * , . 1 and λ are the nuclear norm, L1-norm and the penalization parameter, respectively. RPCA is computationally expensive due to the use of iterative SVD operations [19]. An alternative solution is GoDec, which uses BRP instead of SVD [20]. For the rank minimization problem, one may use non-convex matrix factorization for fast low rank recovery. ROSL [22] offers a solution for LRSD problem by modelling the low rank part under an orthonormal subspace such as L is given as The rank of L, i.e. the sparse coefficient vector α is upper bounded by the number of nonzero rows of α defined as α row−0 L and S are recovered by min S,D,α where λ refers to the penalization parameter.
Using L1-norm as a relaxation for NP hard L0-norm and group sparsity α row−1 for α row−0 , the final form becomes min S,D,α Here α row−1 is defined as k i=1 α i 2 and (6) can be solved by augmented Langrangian method removing the equality constraint X = Dα + S and writing the augmented Langrangian function as where Y and µ denote the Langrange multiplier and the over-regularization parameter [22]. Using inexact alternating direction method (ADM) L(.) can be solved in 3 steps; The sparse component can be easily updated by thresholding as where T α (X ) = max{abs(X ) − α, 0} is the shrinkage function.
Step 1 presents a non-convex problem (simultaneous solution of D and α with constraint) which may be recast into two convex sub-problems updating one matrix while keeping the other constant. These sub-problems can be solved using the block coordinate descent (BCD) method.
The BCD sequentially updates (D t , α k ) pair for 1 ≤ t ≤ k is updated as where the residual is (11) and the magnitude shrinkage function is for X 2 = 0.
To speed up the convergence, the subspace dimension k is shrunk by discarding the zero pairs. For more details, refer to [22]. The steps of the proposed method are presented in Table 1.

III. EXPERIMENTAL RESULTS
The proposed ROSL method is compared with PCA [2], NMF [3] RPCA [9], GoDec [10] and recently proposed RNMF [11] to show its superiority over conventional subspace-based methods as well as the other recently proposed LRSD based methods. The visual and quantitative results are presented for simulated and real datasets, respectively to compare their performances. Peak signal-to-noise ratio (PSNR) is used for the quantitative analysis metric and the mean square error (MSE) can be expressed as where M and N are the dimensions of the input GPR data matrix. X and X ref denote the reconstructed clutter-free GPR image and reference images, respectively.
The raw GPR data matrix is decomposed into its low rank and sparse parts which correspond to the clutter and the target, respectively. There is no parameter selection for conventional PCA and NMF methods however LRSD based methods generally need fine-tuned parameter/parameters. Unlike RPCA [9], GoDec [10] or RNMF [11], proposed ROSL-based method does not require any presetting of rank parameter r, but only the value of the penalization parameter λ. Fig. 1 shows the change of PSNR value with respect to λ parameter. The empirical formulation of λ parameter is proposed in [9] as follows However, (15) is not always optimal for RPCA [9] and RNMF [11] for the clutter removal problem in GPR. These methods seem to need more fine tuning for λ parameter. However, the clutter removal performance of the proposed VOLUME 8, 2020  method remains constant for a comparatively larger interval of λ values. This phenomenon is shown in Fig. 1(a). The y-label shows the PSNR values in dB and the x-label shows the performance change with respect to penalization parameter λ in Fig. 1(a). Higher PSNR values indicate better results. Fig. 1(a) shows that RPCA produces the best results around λ = 0.22 and then the performance dramatically decreases. However, (15) proposes to use λ = 0.0625 which does not give a good result for RPCA. For the RNMF, the best performance is obtained around λ = 3 × 10 −4 and this high performance is valid for a very limited number of λ values compared to RPCA. Again (15) proposes to use λ = 0.0625 however the PSNR result provided in Fig. 1(a) is not satisfactory. The proposed method ROSL gives satisfactory results for values greater than λ = 0.02. thus it can be assumed that the parameter selection does not effect ROSL as in RPCA and RNMF. Thus, we can say that the proposed ROSL method is almost parameter-free for clutter removal in GPR problem.
As for GoDec, the penalization parameter is different from RPCA, RNMF and ROSL, thus Fig. 1(b) is given separately. As seen in Fig. 1(b), GoDec depends highly on the card(S) value and it produces good results around card(S) = 2000. For the other card(S) values, it produces results similar to the conventional subspace-based methods. Moreover, it can be observed in Fig. 1(b) that GoDec suddenly collapses for some card(S) values. Thus, card(S) parameter should be selected very carefully, otherwise the results can be very misleading.
RPCA [9], GoDec [10] and RNMF [11] require preprocessing to determine the optimum λ and card(S) values by making exhausting grid search. Although the λ value is defined conventionally as in (15), our experiments for both the simulated and the real datasets have demonstrated that RPCA and RNMF behave poorly for this empirical value when applied to the GPR clutter removal problem. Thus, RPCA and RNMF both require a preprocessing step to determine the optimal λ value for each new dataset as well as card(S) for GoDec. Several experiments validated that similar PSNR curves are obtained for ROSL independent of the datasets. Thus, we decided that a choice of λ value between (0.02 − 1) interval permits adequate decomposition of the input data in ROSL and any value used in this interval is appropriate for the visual and quantitative results for the next sections.

A. SIMULATED DATASET RESULTS
The gprMax simulation software, which is an open source software that simulates electromagnetic wave propagation [24], is used to construct the simulated dataset. The simulated dataset permits us to do a quantitative analysis besides the visual one since the reference target images can also be constructed by gprMax. The experimental design of the simulated dataset can be given as follows. The antenna frequency is selected as 1.5-GHz which is a commercial antenna in the library of gprMax (GSSI 1.5GHz, Model 5100). Various soil types, target materials and burial depths are used in the simulations to model different scenarios. Which is shown in the Fig. 1. The size of the GPR images in the simulated dataset is 256 × 183. A more detailed information is given in [8].
For the simulated dataset results, there is no parameter selection for the PCA [2] and NMF [3]. The critical λ parameters are selected as 3 × 10 −4 for RNMF [11], 1.8 × 10 −2 for RPCA [9] and 0.1 for ROSL [22]. The rank r is chosen as k = 1 for RNMF since the signal strength of the clutter dominates the signal reflected from the target. The k = 1 selection provides the clutter image directly, while k = N , which corresponds to the full rank case, provides the raw GPR image itself. For GoDec [10], the parameters are chosen as rank(L) = 1 and card(S) = 1 × 10 3 for the simulated dataset.
The first row of Fig. 3(a) and (b) show the raw GPR image and the reference target image for the simulated dataset where an aluminium target is buried in dry sand soil at 1 cm depth for this scenario. Fig. 3 (c)-(h) give the obtained target image results of PCA, NMF, RPCA, GoDec, RNMF and ROSL, respectively. PCA results present tails in vertical direction, while some clutter can still be observed in the NMF result. Since all the other methods are based on LRSD with optimum parameters, the obtained results are visually almost identical for this simple scenario. Fig. 4(a) and (b) show the raw GPR image and the reference target image for the simulated dataset where a plastic target is buried in dry clay soil at 2 cm depth for this scenario. Fig. 3(c)-(h) give the target image results of PCA, NMF, RPCA, GoDec, RNMF and ROSL, respectively. All the LRSD based methods successfully decompose the raw GPR image for the aluminium case. There is some remaining clutter in the PCA and NMF results. However the PCA result does not present any vertical lines as in the aluminum case. Fig. 5, Fig. 6, Fig. 7 and Fig. 8 presents the PSNR (dB) results at various burial depths and for different soil types for both target types in the simulated dataset. Fig. 4 and Fig. 5 present the quantitative results of the aluminum target buried in dry sand soil and the plastic target buried in dry clay soil for PCA, NMF, RPCA, GoDec, RNMF and the proposed ROSL method at various burial depths. It is clearly seen that LRSD based methods are superior over low rank based methods PCA and NMF. This phenomenon is also observed in the visual results. Although the visual results presented in Fig. 3 and Fig. 4 are similar, the quantitative results demonstrate the superiority of our method among LRSD based approaches. In the LRSD based methods, highest PSNR (dB) results are mostly achieved by RNMF and ROSL, RPCA follows them for the aluminum target scenario, while the performance of GoDec remains behind them except for the 0 cm burial depth. However, for the plastic target GoDec outperforms RPCA. RNMF presents the best scores overall and is closely followed by ROSL. Thus, the proposed ROSL method outperforms RPCA and GoDec and is slightly behind RNMF for the results in Fig. 4 and Fig. 5 even when the optimum parameters are exhaustively searched for RPCA, GoDec and RNMF unlike ROSL. Fig. 7 and Fig. 8 present the quantitative results at 2 cm burial depth with various soil types for the simulated dataset. Again, RNMF mostly gives the highest PSNR results. ROSL has the second best score, RPCA and GoDec follow them for the aluminum target case. For the plastic one, again RNMF has the best result followed by ROSL, GoDec and RPCA. In summary, quantitatively ROSL outperforms RPCA and GoDec for almost all of the cases and its performance remains slightly behind RNMF for some scenarios. GoDec presents better results than RPCA in the plastic case however RPCA defeats GoDec for aluminum cases. PCA and NMF present the worst scores for both cases, as expected.
The PSNR (dB) results are provided for the best parameter choices for the RPCA, GoDec and RNMF and there is no parameter choice for PCA and NMF. The performances of RPCA, GoDec and RNMF depend heavily on the choice VOLUME 8, 2020   of the parameters as it can be observed from the target images obtained by RPCA, GoDec, RNMF and ROSL from Fig. 9(b)-(e) and Fig. 10(b)-(e) for both plastic and aluminum target cases. Here other choices than the optimum ones     real datasets. The λ parameter for ROSL is set to 0.05. It can be observed that RPCA and RNMF loose most of the target information, while the remaining clutter is still present in the GoDec results for simulated dataset. ROSL is able to remove the clutter for a wide range of λ parameters as demonstrated in Fig. 1. To validate the difference in visual results for Fig. 3 and Fig. 4 vs. Fig. 9 and Fig. 10, the PSNR (dB) results of the optimum parameters and an arbitrary choice of the parameters are compared in Table 2 and 3, respectively.
The results show that, RNMF is the highest while GoDec is the least effected one from the parameter changes. The performance of ROSL is not effected even if we reduce the λ value by half. The visual and quantitative results show that ROSL is more robust to parameter changes.

B. REAL DATASET RESULTS
To test the performance of the proposed algorithm, two real GPR datasets are used. In the real dataset-I, PMA-3 and VOLUME 8, 2020    PMA-1 type anti-personal landmine, stone and copper strip targets are buried at the depth of 5 cm. The first two targets contain plastic material and the soil type is dry clay with small rocks. Thus the soil surface is rough and the irregularities are around 10 cm. The antenna frequency is 1 GHz. The size of the obtained GPR image (a B-Scan) is 512 × 197 and the experimental design of the real dataset-I is given in Fig. 11(a) and the obtained raw GPR image is given in Fig. 11(b) [25].
The real dataset-II is obtained from a project of the International Test and Evaluation Program for Humanitarian Demining [26]. SIR-3000, a commercial GPR antenna, with 1.5-GHz frequency was used. Five landmines, all PPM-2, with a diameter of roughly 13 cm were buried in the same directions with positions of 0.6, 1.4, 2.2, 3.0, and 3.8 m. and at burial depths 25, 20, 15, 10 and 5 cm, respectively. The soil type is magnetic sand, an artificial mixture to replicate soil with high magnetic susceptibility. The size of the obtained GPR image is 2049 × 401 and the experimental design of the real dataset-II is given in Fig. 12(a), the obtained raw GPR image is given in Fig. 12(b) [26]. Fig. 13 (a)-(h) show the raw and the obtained target images of real dataset-I and the target images obtained by RPCA, NMF, RPCA, GoDec, RNMF, and ROSL, respectively. It can be observed that PCA results present some disturbing vertical lines, while NMF fails to remove the clutter completely. Again LRSD methods are able to extract the complete  target information. GoDec and RNMF present clearer results. However, we should note that the performance of GoDec depends heavily on the choice of card(S) with respect to the GPR image size: a larger value fails to remove the clutter while a smaller value weakens the targets. The proposed method does not need any presetting for rank and cardinality parameters. Fig. 14 (a)-(h) show the raw and the obtained target images of the real dataset-II and the target images obtained by PCA, NMF, RPCA, GoDec, RNMF, and ROSL, respectively. This dataset is more challenging due to the soil type, thus the raw GPR data presents noise as it can be observed in Fig. 12 (a). Vertical lines are still present in the PCA result, while remaining clutter is observed in NMF result. As expected, GoDec, which decomposes the input image into clutter, target and noise parts presents the clearest target image. However, hyperbolas are not complete.The targets are more visible in RNMF and ROSL results despite the noise  part that has remained in the whole image. The horizontal clutter is completely eliminated in both results. RPCA presents the worst result.
To investigate the effect of the parameter choice on the results, the empirical formula given in (13) is used for RPCA and RNMF along an arbitrary parameter choice for GoDec and ROSL. The results are shown in Fig. 15 and 16. Again RPCA removes the clutter but looses some target information for both datasets, RNMF results seem to be better but still remain behind ROSL, the second target is hardly visible in the results of the real dataset-I. GoDec also fails to remove the clutter, while keeping the target for both datasets. To evaluate the clutter-target separation performance of the methods, energy profiles (plots of energy versus depth) are also used. Fig. 17 (a)-(e) show energy profiles obtained for raw data, RPCA, GoDec, RNMF and ROSL results, respectively. The A-scan, used in the energy calculation is shown by dotted red line in top row of Fig. 17. The middle row gives the energy profiles for raw data and target images obtained by each method. A closer look is given in the bottom row. The clutter seen between indices 400-500 of the energy profile corresponding to raw data is missing as expected in the results of the clutter removal methods. Some target information between indices (700-800) seem to be lost in RPCA. RNMF and ROSL present similar profiles. GoDec profile seems to be the best. However, we should remark that although GoDec completely removes the clutter for most of the scenarios, it looses some target information as shown in the clutter removal results for simulated and real scenarios. This may cause an undesired decrease in the detection rates of the targets.
The computational complexities of LRSD based methods RPCA, GoDec, RNMF and our method are given in the Table 4 [14], [18]. M and N are the size of the input GPR data matrix and r is the rank of the decomposition. According to the Table 4, the fastest method is ROSL and RNMF follows it. GoDec is faster than RPCA however it is slower than ROSL and RNMF. The number of computations is calculated for the Real dataset-II (M = 2049, N = 401 and r = 1) in the Table 4 to show the computational complexity. Since r = 1, the computational complexity of GoDec, RNMF and ROSL decrease considerably with respect to RPCA. We can conclude that the proposed ROSL based clutter removal method can compete both visually, quantitatively and in complexity with RNMF and outperforms it by the fact that it does not require a preprocessing step to determine its parameters unlike RNMF.

IV. CONCLUSION
A new LRSD based clutter reduction method is proposed for GPR. The proposed method separates the raw GPR image into its low rank and sparse parts, namely clutter and target parts via ROSL. Compared to other LRSD approaches VOLUME 8, 2020 already used in GPR such as RPCA and GoDec, ROSL does not require nuclear norm minimization, thus the use of SVD operations. ROSL outperforms RPCA in runningtime, it also has a faster implementation compared to GoDec which replaces SVD operations with bilateral projections. It does not require a presetting of cardinality and rank parameters as GoDec does. The proposed ROSL based method reaches approximately the performance of recently proposed RNMF, a robust non-negative factorization method. However, in RNMF the penalization parameter requires a preprocessing for the choice of the optimum parameter, while the proposed method does not present such an issue.