Dynamic Joint Reconstruction of Walls and Targets in Through-the-Wall Radar Imaging

Multipath exploitation with compressive sensing (CS) has been successfully applied in through-the-wall radar imaging (TWRI) given prior knowledge of the room geometry. However, in most practical applications and critical missions, the geometry of the room is not known. In this paper, a dynamic wall pursuit algorithm is proposed to simultaneously recover the room geometry and image the scene behind the wall. We demonstrate that moving targets increase the wall reconstruction accuracy. The algorithm exploits the fact that wrong wall positions populate the reconstructed image with false targets, and it uses the moving targets in the scene to increase the certainty of the detected wall positions. Simulation results show the effectiveness of this method even at low SNR values.


I. INTRODUCTION
Through-the-wall radar imaging (TWRI) aims at sensing through building walls using radio frequency (RF) signals. TWRI has attracted considerable research interest owing to its many important civilian and military applications. In TWRI, transmitted signals mainly reflect from stationary objects, moving targets and walls. Additional reflections are received through indirect paths, known as multipath signals. These multipath signals result in false targets known as ghosts. Knowledge of the room geometry improves target detection. However, in most practical applications, the geometry of the room is not known.
Multipath signals can be exploited to advantage [1]-[4]. In [1], the authors verified analytically that multipath ghosts appear near the walls and that their locations depend on those of the receivers. Assuming a multipath model for an enclosed four-wall room, they associated every target with its multipath ghosts. The authors in [2] used a Householder transformation to express specular reflections and to model multiple reflections and non-rectangular rooms.
To reduce the amount of data stored and processed in TWRI, compressive sensing (CS) was introduced to TWRI in [3]. This allows for high quality image reconstruction using only a small fraction of the measurements. Extensions of this work then followed in [5]-[9]. Combined multipath exploitation with CS was proposed in [4]. The authors in [4] started by assuming isotropic targets and equal attenuation in all propagation paths. These limitations were overcome by applying group sparsity with a mixed-norm optimization, which improved the reconstructed image quality. However, their work assumes the building geometry is known a priori, which is rarely the case in practice.
The associate editor coordinating the review of this manuscript and approving it for publication was Jing Liang.
An algorithm to eliminate multipath in TWRI without prior knowledge of the reflecting geometry was proposed in [10]. First, the impulse response of the primary target, which has the strongest reflection, is identified. Then a delay operator that matches similar reflections in the residual data is computed. The waveform is then updated to compensate for any distortions due to through-wall propagation. Another ghost suppression method that does not require prior knowledge of the room geometry was proposed in [5]. The authors exploited the aspect dependence feature in TWRI to achieve a ghost-free image under a CS framework.
In [6], a CS-based approach for joint scene reconstruction and wall position estimation was presented. The multipath model in [4] was extended to include a parameter that represents the position of the walls. After formulating the problem as finding the wall positions that result in the sparsest reconstructed image, they applied a nested optimization scheme using either a quasi-Newton method or genetic algorithms.
Using an M-sequence UWB radar system, the entire processing procedure for obtaining the contours of a scanned building with the Hough transform was tested on real measurement data [11]. The work in [12] proposed a wall reconstruction method based on the minimum spanning tree, which originates from graph theory. The authors in [13] first obtained multiple single-channel building layout images for independent channels of different views and used a noncoherent fusion method to combine them into a single-view layout image. While many researchers presented the use of SAR, the authors in [14]-[17] attempted layout reconstruction via a TWRI sensor network.
Most researchers assumed common building designs with regularity and rectilinearity. Oblique illumination was considered in [18] to enhance the radar returns from the corners. The correlogram of the received radar signal was compared with a known correlogram of the scattering response of an isolated reference corner reflector. Their procedure can be applied using compressed observations, which offers an alternative to the optimization encountered in conventional CS. To make the reconstruction more practical, de Wit and van Rossum attempted extraction of building features from standoff through-wall radar measurements [19].
The method presented in [7] relies on the fact that wrong wall positions populate the reconstructed image with false targets. Initially, the algorithm finds one wall position that minimizes the number of targets in the scene. The dictionary to be used in successive searches is updated with the multipath returns of the detected wall. This process is repeated until a termination criterion is satisfied. Unlike the work in [7], this work utilizes the dynamic nature of the moving targets and proposes a dynamic CS framework that jointly reconstructs walls and targets. We show that moving targets improve the accuracy of the detected wall positions even at low values of SNR. Whereas the work in [6] is highly sensitive to the initial wall positions, our proposed method does not require an initial wall position. Combined wall and target detection is very important in critical mission applications where little or no information about the walls is available.
The remainder of this paper is organized as follows. Section II presents the geometry of the scene and the multipath signal model. A proposed method for obtaining a ghost-free image and detecting the interior wall positions is presented in Section III and the dynamic version is discussed in Section IV. In Section V, simulation results are analyzed and discussed. Finally, we summarize and conclude in Section VI.

II. MULTIPATH SIGNAL MODEL

A. GEOMETRY OF THE SCENE AND SIGNAL MODEL
We concentrate on monostatic impulse radar based TWRI, though the proposed work can be easily translated to other types of radar. Assume an array of $M$ time-multiplexed transceivers is placed parallel to a wall. The transceivers interrogate the scene by transmitting wideband Gaussian pulses, $s(t)$. Multipath components stemming from multiple reflections of electromagnetic waves from the target to the walls, floor and ceiling pose a major challenge in TWRI. When the signal is received, multipath returns are interpreted as reflections from real targets, and thus ghost targets arise in the image. The accompanying figure illustrates the geometry of a simple TWRI scenario with one point target. The array is located parallel to the front wall at a back-off distance $d_{\text{off}}$. The front wall is a homogeneous wall of thickness $d$ and dielectric constant $\epsilon$.
The signal emitted from the $m$th transceiver and reflected from point targets is received back at the $m$th transceiver as:

$$y_m(t) = \sum_{p=1}^{P} \sum_{r=0}^{R} \sigma_{pr}\, s(t - \tau_{pr,m}), \tag{1}$$

where $P$ is the number of targets and $R$ is the number of multipath arrivals per target, with $r = 0$ representing the direct path. Only first order multipath signals are considered. The total reflectivity coefficient of target $p$ when the signal goes through the $r$th path, $\sigma_{pr}$, is assumed to be independent of aspect angle and frequency. $\tau_{pr,m}$ is the signal travel time associated with the $m$th transceiver for the signal reflecting from target $p$ through path $r$. The received signal is thus the superposition of multiple delayed and scaled copies of the transmitted signal. While wall attenuation and propagation loss are not considered in (1) for notational ease, they can be represented by a scaling factor [8].
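The superposition in (1) can be illustrated with a minimal sketch. The pulse shape (a Gaussian envelope on a cosine carrier, with the quoted width treated as full width at half maximum), the reflectivities and the delays below are illustrative assumptions, not values taken from the paper.

```python
import math

def gaussian_pulse(t, width=0.73e-9, fc=1.0e9):
    """Modulated Gaussian pulse s(t); the exact pulse shape is an assumption."""
    sigma = width / 2.355  # interpret `width` as full width at half maximum
    return math.exp(-0.5 * (t / sigma) ** 2) * math.cos(2 * math.pi * fc * t)

def received_signal(t, sigmas, delays):
    """Evaluate y_m(t) = sum_p sum_r sigma[p][r] * s(t - tau[p][r]) for one transceiver."""
    return sum(
        sigmas[p][r] * gaussian_pulse(t - delays[p][r])
        for p in range(len(sigmas))
        for r in range(len(sigmas[p]))
    )

# One target with a direct path (r = 0) and one wall multipath (r = 1).
sigmas = [[1.0, 0.5]]        # hypothetical reflectivities sigma_pr
delays = [[10e-9, 14e-9]]    # hypothetical travel times tau_pr,m in seconds

y_direct = received_signal(10e-9, sigmas, delays)  # peak of the direct return
y_ghost = received_signal(14e-9, sigmas, delays)   # peak of the multipath return
```

The second peak at 14 ns is what a direct-path-only imaging model would interpret as a ghost target.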
By sampling the received signal $y_m(t)$ at $\{t_k\}_{k=0}^{K-1}$, we obtain a $K \times 1$ column vector, $\mathbf{y}_m$. The scene is divided into $Q$ pixels, and the objective is to find the pixels that contain targets. The received signal in (1) can now be expressed in vector form as:

$$\mathbf{y}_m = \boldsymbol{\Psi}_m^{(\mathbf{w}_0)}\mathbf{x}_0 + \boldsymbol{\Psi}_m^{(\mathbf{w}_1)}\mathbf{x}_1 + \cdots + \boldsymbol{\Psi}_m^{(\mathbf{w}_R)}\mathbf{x}_R, \tag{2}$$

where the first product term represents the direct path while the others correspond to multipath signals from different walls. Only specular multipath components via interior walls are considered. Multipath from targets to targets, named interaction multipath, can be considered when building the sparse model as in [21]. The inner wall positions are discretized and each element in $\mathbf{w}_r$ represents a wall position in direction $r$, $r = 0, 1, \ldots, R$. If only one back wall and two side walls are present, $R = 3$. The vector $\mathbf{w}_r$ has at most one nonzero element, with a value of 1, corresponding to the position of the wall in direction $r$. If there is no wall in direction $r$, all the elements of $\mathbf{w}_r$ are zero. The length of $\mathbf{w}_r$, $W$, is proportional to the resolution and range considered for the inner walls.
The $q$th column of the $K \times Q$ matrix $\boldsymbol{\Psi}_m^{(\mathbf{w}_r)}$ contains the signal we expect to receive if the signal was transmitted from the $m$th transceiver, reflected from a target at pixel $q$ to the wall in direction $r$ (when $r$ equals zero, the signal does not reflect from any inner wall), and finally received back at the same transceiver. The order in which the signal reflects does not matter since the delay is the same in both cases. The $k$th element in the $q$th column is given by

$$\left[\boldsymbol{\Psi}_m^{(\mathbf{w}_r)}\right]_{k,q} = \frac{s(t_k - \tau_{qr,m})}{\|\mathbf{s}_{qr,m}\|_2}, \tag{3}$$

where $\|\mathbf{s}_{qr,m}\|_2$ is the $\ell_2$ norm of the signal in the $q$th column, so that each column in $\boldsymbol{\Psi}_m^{(\mathbf{w}_r)}$ is normalized [8]. The reflectivity values for all the $Q$ pixels in the scene are concatenated to form the vector $\mathbf{x}_r$; the $q$th element in $\mathbf{x}_r$ is equal to zero when no target exists in pixel $q$, or equal to $\sigma_q$ if a target exists. The positions of the nonzero values in each $\mathbf{x}_r$, for all values of $r$, should be the same. However, the values differ depending on the reflectivity of each path corresponding to a wall direction $r$.
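A minimal sketch of how one normalized dictionary column could be generated is given below. It assumes free-space propagation (front-wall refraction is ignored), a side wall modeled by the mirror-image method, and a hypothetical pulse shape; the function names are illustrative and not from the paper. The sampling parameters $K = 1085$ and 25 ps are those quoted in Section V.

```python
import math

C = 3.0e8  # free-space propagation speed in m/s (front-wall refraction ignored)

def pulse(t, width=0.73e-9, fc=1.0e9):
    """Hypothetical modulated Gaussian pulse s(t)."""
    sigma = width / 2.355
    return math.exp(-0.5 * (t / sigma) ** 2) * math.cos(2 * math.pi * fc * t)

def delay_direct(trx, pixel):
    """Round-trip delay of the direct path (r = 0)."""
    return 2.0 * math.dist(trx, pixel) / C

def delay_side_wall(trx, pixel, wall_x):
    """First-order multipath: one leg direct, the other via a side wall at x = wall_x."""
    mirrored = (2.0 * wall_x - pixel[0], pixel[1])  # image of the pixel behind the wall
    return (math.dist(trx, pixel) + math.dist(trx, mirrored)) / C

def dictionary_column(tau, K=1085, dt=25e-12):
    """Sampled, delayed pulse divided by its l2 norm."""
    col = [pulse(k * dt - tau) for k in range(K)]
    norm = math.sqrt(sum(v * v for v in col))
    return [v / norm for v in col]

trx, pixel = (0.0, 0.0), (0.5, 2.0)
tau0 = delay_direct(trx, pixel)
tau1 = delay_side_wall(trx, pixel, wall_x=1.5)
col0 = dictionary_column(tau0)
```

Because the mirrored leg is always longer than the direct leg, the multipath delay exceeds the direct delay, and it is the same whichever leg is traversed first.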
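Equation (2) can be presented in a compact form as

$$\mathbf{y}_m = \boldsymbol{\Psi}_m^{(\mathbf{w})}\mathbf{x}, \tag{4}$$

where $\boldsymbol{\Psi}_m^{(\mathbf{w})} = \left[\boldsymbol{\Psi}_m^{(\mathbf{w}_0)}\ \boldsymbol{\Psi}_m^{(\mathbf{w}_1)}\ \cdots\ \boldsymbol{\Psi}_m^{(\mathbf{w}_R)}\right]$ and $\mathbf{x} = \left[\mathbf{x}_0^T\ \mathbf{x}_1^T\ \cdots\ \mathbf{x}_R^T\right]^T$. Note that $\mathbf{w}_r$ reduces to a scalar that represents the position of the wall $r$ when dealing with stationary targets [7].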

B. IMAGING WITH COMPRESSED SENSING
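Working directly with the raw measurements is not efficient, as large amounts of data require large memory and intensive processing [8]. This problem can be addressed by employing CS, which measures linear projections of $\mathbf{y}_m$ [9]:

$$\tilde{\mathbf{y}}_m = \boldsymbol{\Phi}_m \mathbf{y}_m, \tag{5}$$

where $\boldsymbol{\Phi}_m$ is a $J \times K$ measurement matrix ($J \ll K$), and $\tilde{\mathbf{y}}_m$ is the compressed version of the signal $\mathbf{y}_m$. The obtained compressed signal can be represented as:

$$\tilde{\mathbf{y}} = \boldsymbol{\Phi}\boldsymbol{\Psi}^{(\mathbf{w})}\mathbf{x}, \tag{6}$$

where $\tilde{\mathbf{y}} = \left[\tilde{\mathbf{y}}_1^T \cdots \tilde{\mathbf{y}}_M^T\right]^T$, $\boldsymbol{\Psi}^{(\mathbf{w})} = \left[(\boldsymbol{\Psi}_1^{(\mathbf{w})})^T \cdots (\boldsymbol{\Psi}_M^{(\mathbf{w})})^T\right]^T$, and $\boldsymbol{\Phi} = \mathrm{Blkdiag}(\boldsymbol{\Phi}_1, \ldots, \boldsymbol{\Phi}_M)$ is a block diagonal (Blkdiag) matrix. To be able to recover $\mathbf{x}$ accurately from (6), the matrices $\boldsymbol{\Phi}_m$ have to be designed with minimum mutual coherence with $\boldsymbol{\Psi}_m^{(\mathbf{w})}$ [20]. Good results are expected when the elements of the matrix $\boldsymbol{\Phi}_m$ are independent and identically distributed Gaussian or Bernoulli random variables [8].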
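As a rough illustration of the per-transceiver projection and of the block diagonal structure, the sketch below draws an independent random $\pm 1$ (Bernoulli) matrix for each transceiver and stacks the compressed vectors. The sizes $K = 1085$ and $J = 217$ (20% of $K$) follow the figures quoted in Section V; the $1/\sqrt{J}$ scaling, the seed and the placeholder input signals are assumptions.

```python
import random

def bernoulli_matrix(J, K, rng):
    """J x K matrix with i.i.d. entries +/- 1/sqrt(J) (the scaling is an assumption)."""
    scale = 1.0 / (J ** 0.5)
    return [[scale if rng.random() < 0.5 else -scale for _ in range(K)]
            for _ in range(J)]

def compress(Phi_m, y_m):
    """Compute the compressed vector Phi_m @ y_m with plain Python lists."""
    return [sum(a * b for a, b in zip(row, y_m)) for row in Phi_m]

rng = random.Random(0)
M, K, J = 3, 1085, 217

# Placeholder received vectors, one per transceiver.
signals = [[rng.gauss(0.0, 1.0) for _ in range(K)] for _ in range(M)]

# Multiplying the stacked signal by Blkdiag(Phi_1, ..., Phi_M) is the same as
# compressing each transceiver separately and concatenating the results.
y_tilde = []
for y_m in signals:
    Phi_m = bernoulli_matrix(J, K, rng)
    y_tilde.extend(compress(Phi_m, y_m))
```

Only $MJ = 651$ values are kept instead of $MK = 3255$ samples.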
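If only direct path returns are considered, multipath signals will cause ghosts in the reconstructed image. Instead, if the wall positions are identified, i.e., $\mathbf{w}$ is known, a dictionary matrix $\boldsymbol{\Psi}^{(\mathbf{w})}$ can be constructed and a ghost-free image can be recovered by solving the optimization problem given by

$$\hat{\mathbf{x}} = \arg\min_{\mathbf{x}} \|\mathbf{x}\|_{1,2} + \lambda \left\|\tilde{\mathbf{y}} - \boldsymbol{\Phi}\boldsymbol{\Psi}^{(\mathbf{w})}\mathbf{x}\right\|_2^2, \tag{7}$$

where $\|\mathbf{x}\|_{1,2} = \sum_{q=1}^{Q} \left\|[x_0(q), x_1(q), \ldots, x_R(q)]\right\|_2$ is the mixed $\ell_{1,2}$ norm and $\lambda$ is the regularization parameter for the $\ell_2$ norm. With the correct dictionary, no ghosts will be present in the reconstructed scene. However, when wall positions are incorrectly represented in the dictionary, wrong multipath propagation delays will create ghost targets in the reconstructed image. Utilizing this fact, we propose in the next section a novel and effective reconstruction technique that jointly estimates walls and targets and reconstructs a ghost-free image.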

III. WALL PURSUIT
To get a ghost-free image, the estimated wall positions should be very accurate. An incorrect estimate of the wall positions results in a populated scene. The objective is to find the wall positions that minimize the number of targets in the scene and simultaneously obey (6). This can be expressed as a mathematical optimization problem:

$$(\hat{\mathbf{w}}, \hat{\mathbf{x}}) = \arg\min_{\mathbf{w},\,\mathbf{x}} \|\mathbf{x}\|_{0} \quad \text{subject to} \quad \tilde{\mathbf{y}} = \boldsymbol{\Phi}\boldsymbol{\Psi}^{(\mathbf{w})}\mathbf{x}. \tag{8}$$

The solution of (8) provides the wall positions and a ghost-free image, assuming the presence of at most one wall in each direction. Numerous methods can be used to solve (8). One approach is to exhaustively search over all possible wall combinations $\mathbf{w}$, update the matrix $\boldsymbol{\Psi}^{(\mathbf{w})}$ each time, solve (7), and then keep the answer that results in the sparsest estimate $\hat{\mathbf{x}}$. This exhaustive approach is expensive and time consuming since we must explore $O((W + 1)^R)$ different wall combinations; the complexity grows exponentially with the number of walls. Other less complex approaches include: Your Algorithm for L1 (YALL1), Gradient Projection for Sparse Reconstruction with Barzilai-Borwein steps (GPSR-bb), Spectral Projected-Gradient (SPGL1), Fixed-Point Continuation with Barzilai-Borwein steps (Fpc-bb), and Sparse Reconstruction by Separable Approximation (SpaRSA) [22]. For low SNR, the performance degrades dramatically with data compression. YALL1 appears to be the best in this case, as it still gives reasonable results with up to 50% compression. On the other hand, for high SNR values, most of the algorithms are not strongly affected when the percentage of reduction is increased, meaning that similar performance can be obtained with less data. Due to its superior performance at both low and high SNR, YALL1 is adopted for the remaining part of this work.

IV. DYNAMIC WALL PURSUIT
In many applications of TWRI, the target of interest is moving, thus the received signal is changing. This can be utilized to further improve the accuracy of wall detection. The presence of a moving target inside the room means that more information about the wall positions is gained with time. An alternative approach, which we call Dynamic Wall Pursuit (DWP), iteratively constructs the scene geometry. A few effective extensions are made. First, we scan for the most effective wall positions in each direction $r$ separately. The dictionary is updated with these wall positions. Further, we refine the positions by repeating this process. A change detection technique is used to suppress the signals coming from stationary objects [8], [9]. It is performed by subtracting the previous received signal from the current signal, and thus signals reflecting from stationary targets will be eliminated. This helps in increasing the sparsity of the scene, thus improving the performance of the CS algorithm during image reconstruction. Second, the multipath dictionary generator (MDG) accumulates the cost of each wall from different time frames [7]. Thus, information from earlier frames has an influence on the MDG's choice of wall updates. The number of refinement iterations is referred to as the depth of the algorithm. The wall positions converge to the correct positions after a couple of iterations and the final image is reconstructed with the refined dictionary. Figure 2 illustrates the iterative process. First, the MDG provides an initial dictionary $\boldsymbol{\Psi}^{(\mathbf{w})}_{[0]}$ that does not consider any inner walls. The subscript in $\boldsymbol{\Psi}^{(\mathbf{w})}_{[d]}$ between the brackets denotes the depth. The process searches for the most probable wall position in each wall direction $r$. This is demonstrated within the dashed boxes in Figure 2. There are $R$ layers and each layer represents one of the wall directions. Three layers are shown in the figure with the $r$th wall direction search layer at the top.
Using $\tilde{\mathbf{y}}$, $\boldsymbol{\Phi}$, and $\bar{\boldsymbol{\Psi}}^{(\mathbf{w})}_{[d]}$, an image $\mathbf{x}_r$ is built for the $w$th wall position and is assigned a confidence factor $s_r(w)$. The process is repeated for all $w = 1, 2, \ldots, W$. The confidence of all wall positions, stored in the vector $\mathbf{s}_r$, is then fed to the MDG, which updates the dictionary and sends it back to the layers for more refinement as needed. The iterative process terminates when all wall positions converge or a specific depth $D$ is reached. Compared with exhaustive search, the number of search combinations needed in this method is dramatically reduced to $O(DWR)$, which is a direct function of the depth, the number of recovered walls, and the range resolution.
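The control flow described above can be sketched as follows. This is only an outline under stated assumptions: `layer_cost` is a hypothetical callable standing in for the per-layer image reconstruction and evaluation, wall hypotheses are indexed from 0 rather than 1, and the toy cost used in the example is not a radar model.

```python
from typing import Callable, List, Optional

def wall_pursuit_loop(
    layer_cost: Callable[[int, int, List[Optional[int]]], float],
    R: int,
    W: int,
    D: int,
) -> List[Optional[int]]:
    """Search each wall direction separately, update, and repeat up to depth D."""
    walls: List[Optional[int]] = [None] * R  # depth 0: dictionary without inner walls
    for _ in range(D):
        updated: List[Optional[int]] = []
        for r in range(R):  # one search layer per wall direction
            costs = [layer_cost(r, w, walls) for w in range(W)]
            updated.append(min(range(W), key=costs.__getitem__))
        if updated == walls:  # all wall positions have converged
            break
        walls = updated  # dictionary update by the MDG
    return walls

# Toy example: the cost is smallest at a hidden "true" wall index per direction.
TRUE_WALLS = [4, 11, 17]
calls = [0]

def toy_cost(r, w, walls):
    calls[0] += 1
    return abs(w - TRUE_WALLS[r])

estimate = wall_pursuit_loop(toy_cost, R=3, W=20, D=3)
```

At most $D \times W \times R$ cost evaluations are performed, in line with the $O(DWR)$ count stated above.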
To illustrate the difference, assume that we are searching in three wall directions, $R = 3$, and for each wall we have twenty hypotheses, $W = 20$. The exhaustive method searches through $(20 + 1)^3 = 9261$ possible wall combinations, while the proposed method only searches through $D \times 20 \times 3 = 60D$ wall combinations. The depth, $D$, needed to get good results will be shown to be a small number, which makes the algorithm appropriate for real-time tracking applications.
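The two counts can be checked with a short calculation; the function names are illustrative only.

```python
def exhaustive_count(W: int, R: int) -> int:
    """Each of the R directions has W wall hypotheses plus the no-wall case."""
    return (W + 1) ** R

def dwp_count(D: int, W: int, R: int) -> int:
    """Each of the D depth levels scans W hypotheses in each of the R directions."""
    return D * W * R

n_exhaustive = exhaustive_count(W=20, R=3)          # 9261
n_dwp = [dwp_count(D, 20, 3) for D in (1, 2, 4)]    # [60, 120, 240]
```

Even at a depth of four, the layered search evaluates 240 hypotheses, roughly 2.6% of the exhaustive count.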
In the described framework, there can be many variations in the image reconstruction algorithm, image evaluation, dictionary update, and final image computation. When reconstructing the image at each layer, an imaging method that allows for accurate wall detection is preferred. A cost function that depends on wall position accuracy is evaluated. From each layer, the MDG receives the confidence factor of each wall position, $\mathbf{s}_r$. With the aid of these values, soft decisions on the wall positions, as well as hard ones, can be made.
The image reconstruction in each layer is done by solving

$$\hat{\mathbf{x}}_{r,w} = \arg\min_{\mathbf{x}} \|\mathbf{x}\|_{1,2} + \lambda \left\|\tilde{\mathbf{y}} - \boldsymbol{\Phi}\bar{\boldsymbol{\Psi}}^{(\mathbf{w})}_{[d]}\mathbf{x}\right\|_2^2, \tag{9}$$

where $\bar{\boldsymbol{\Psi}}^{(\mathbf{w})}_{[d]}$ is the same as $\boldsymbol{\Psi}^{(\mathbf{w})}_{[d]}$ but with the submatrix corresponding to direction $r$ updated each time we change $\mathbf{w}_r$. For each wall position $w$ in layer $r$, the resulting signal $\mathbf{x}_{r,w}$ will be the one that best obeys the block sparsity feature within the given constraints, since minimizing the mixed $\ell_{1,2}$ norm emphasizes this property. However, the signals resulting from incorrect wall position estimates are not as block sparse as the one resulting from the correct ones. Therefore, a reasonable image evaluation criterion is the mixed $\ell_{1,2}$ norm, $s_r(w) = \|\mathbf{x}_{r,w}\|_{1,2}$. After obtaining the cost of the different wall positions, each layer passes the vector $\mathbf{s}_r$ to the MDG. The MDG will then identify the minimum value in $\mathbf{s}_r$ for each layer:

$$w_{\min} = \arg\min_{w} s_r(w), \tag{10}$$

and consider the corresponding wall position to be the effective one in $\mathbf{w}_r$, $w_r(i) = \delta_{w_{\min}, i}$, $\forall i = 1, \ldots, W$. This allows the MDG to update the dictionary, $\boldsymbol{\Psi}^{(\mathbf{w})}$, with the new wall positions and pass it to the layers for further refinement. After the desired depth $D$ has been reached, the final image is reconstructed by solving (9) with the final model $\boldsymbol{\Psi}^{(\mathbf{w})}_{[D]}$. If $\mathbf{w}$ contains the correct wall positions, the desired ghost-free image of the scene will be obtained.
VOLUME 7, 2019
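The evaluation and selection steps can be illustrated with a small sketch. It assumes that the groups of the mixed norm are formed per pixel across the propagation paths, uses 0-based indices, and relies on made-up reflectivity values; the helper names are hypothetical.

```python
import math

def mixed_l12_norm(x_paths):
    """x_paths[r][q] is the reflectivity of pixel q via path r; one group per pixel."""
    Q = len(x_paths[0])
    return sum(math.sqrt(sum(x_r[q] ** 2 for x_r in x_paths)) for q in range(Q))

def select_wall(costs):
    """Return the minimum-cost hypothesis and the corresponding one-hot vector w_r."""
    w_min = min(range(len(costs)), key=costs.__getitem__)
    return w_min, [1 if i == w_min else 0 for i in range(len(costs))]

# Same coefficient magnitudes, different supports:
x_block = [[1.0, 0.0, 0.0], [0.5, 0.0, 0.0]]    # both paths map to pixel 0
x_scatter = [[1.0, 0.0, 0.0], [0.0, 0.5, 0.0]]  # multipath energy lands in pixel 1

s_block = mixed_l12_norm(x_block)      # sqrt(1.25), about 1.118
s_scatter = mixed_l12_norm(x_scatter)  # 1.0 + 0.5 = 1.5

w_min, w_r = select_wall([s_scatter, s_block, 1.4])
```

The block-sparse image has the smaller cost, which is why a wrong wall hypothesis, whose multipath energy spills into other pixels, is penalized by this criterion.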
To continuously suppress ghosts, the first received signal can be processed until the final updated dictionary $\boldsymbol{\Psi}^{(\mathbf{w})}_{[D]}$ is developed. This dictionary is then used to reconstruct later images. There is no guarantee that this approach will converge to the proper dictionary. Moreover, the performance of this approach declines at low SNR, and it does not utilize the dynamic information from moving targets. The dynamic nature of moving targets in the scene allows for multiple independent observations. Moving targets can be detected by change detection, which also suppresses multipath components from the front wall and stationary objects behind the wall. In addition, change detection increases the sparsity of the scene, which enhances the performance of the CS algorithm during image reconstruction [8].
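Change detection by frame subtraction can be sketched in a few lines. The sample values below are illustrative: a stationary return at index 2 and a moving return that shifts from index 1 to index 4 between frames.

```python
def change_detection(current, previous):
    """Subtract the previous frame from the current one, sample by sample."""
    return [c - p for c, p in zip(current, previous)]

stationary = [0.0, 0.0, 0.8, 0.0, 0.0, 0.0]   # e.g. wall or furniture return
moving_prev = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  # moving target, earlier frame
moving_curr = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0]  # moving target, current frame

previous = [s + m for s, m in zip(stationary, moving_prev)]
current = [s + m for s, m in zip(stationary, moving_curr)]

difference = change_detection(current, previous)
```

The stationary return cancels exactly, while the moving target leaves a positive component at its new position and a negative one at its old position.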
To keep track of the best wall positions, the MDG accumulates the confidence level of each reconstructed wall, $s_r(w)$, using a moving average. Information from earlier frames influences the MDG's decision on the updated wall position. The process continues until a certain depth $D$ is reached. The value of $D$ is selected to guarantee a certain number of detected walls. In most cases, when $D$ is between two and four, at least two walls are correctly detected.
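One possible way to accumulate the costs across frames is a cumulative running mean, sketched below. The paper does not specify the averaging window, so the cumulative form, the class name and the cost values are all assumptions.

```python
class CostAccumulator:
    """Running mean of the cost vector s_r over the frames seen so far."""

    def __init__(self, W):
        self.mean = [0.0] * W
        self.frames = 0

    def update(self, s_r):
        """Fold the cost vector of a new frame into the running mean."""
        self.frames += 1
        self.mean = [m + (s - m) / self.frames for m, s in zip(self.mean, s_r)]

    def best(self):
        """Index of the wall hypothesis with the lowest accumulated cost."""
        return min(range(len(self.mean)), key=self.mean.__getitem__)

acc = CostAccumulator(W=3)
acc.update([3.0, 1.0, 2.0])   # a noisy first frame favors hypothesis 1
first_choice = acc.best()
acc.update([1.0, 2.0, 3.0])
acc.update([1.0, 3.0, 3.0])
final_choice = acc.best()     # later frames shift the decision to hypothesis 0
```

A single misleading frame is outvoted once more frames are accumulated, which is the behavior the MDG relies on.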

V. SIMULATION RESULTS
To evaluate the efficacy of the DWP, three monostatic radars were spaced 0.25 m apart and located 1 m away from the front wall. The dimension of the imaged scene behind a 20 cm thick front wall is 3 m × 2.5 m, with a pixel size of 5 cm in both downrange and crossrange. The scene is interrogated using a Gaussian pulse of width 0.73 ns and carrier frequency of 1 GHz. The sampling interval is $\Delta t = 25$ ps and the total number of samples is $K = 1085$. The front wall is accessible, so the distance to the front wall can be measured, and the returns from the front wall are assumed to be calibrated and compensated. Four perfectly reflecting point targets are located behind the front wall in a 2.6 m × 2 m room. Back and side walls, which are assumed to be perfect reflectors, result in multipath reflections with amplitudes less than that of the direct reflections. Ray tracing is used to evaluate the received signal. Additive white Gaussian noise, at various SNR values, is added to the received radar signal to test the algorithm's robustness to noise. The SNR is defined as the power of the received signal, containing the returns from all the targets and the multipath returns, divided by the noise power. For the compressed data, linear projections of the returned signals with random ±1 sequences, known as the random modulation preintegration (RMPI) architecture [23], are used to reduce the number of CS measurements to 20% of the conventional Nyquist sampling rate.
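The derived quantities of this setup, together with one way of adding noise at a prescribed SNR, are sketched below. The noise is scaled from the average signal power as in the SNR definition above; the sinusoidal test signal and the seed are placeholders, not the simulated radar returns.

```python
import math
import random

K, dt = 1085, 25e-12
window_ns = K * dt * 1e9   # length of the recorded time window in ns (27.125)
J = round(0.2 * K)         # 20% of the samples kept as CS measurements (217)

def add_awgn(signal, snr_db, rng):
    """Add white Gaussian noise so that signal power / noise power matches snr_db."""
    power = sum(v * v for v in signal) / len(signal)
    noise_std = math.sqrt(power / 10 ** (snr_db / 10))
    noise = [rng.gauss(0.0, noise_std) for _ in signal]
    return [s + n for s, n in zip(signal, noise)], noise

def power_db(x):
    """Average power of a sequence in dB."""
    return 10 * math.log10(sum(v * v for v in x) / len(x))

rng = random.Random(0)
clean = [math.sin(2 * math.pi * 1.0e9 * k * dt) for k in range(K)]  # placeholder signal
noisy, noise = add_awgn(clean, snr_db=15.0, rng=rng)
measured_snr_db = power_db(clean) - power_db(noise)
```

With a finite number of samples the measured SNR only approximates the target value.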

A. WITHOUT UTILIZING MOVING TARGETS
First, the effectiveness of the method is evaluated without utilizing the presence of moving objects. The effect of various noise levels on the algorithm is investigated in Table 1. Specifically, for different SNR values, the success rate indicates how frequently the algorithm correctly estimates the wall geometry of the room. Also, for the cases where correct wall positions are estimated, we evaluate the required depth. At high SNR, the algorithm correctly recovers all wall positions. However, for low SNR, some walls are wrongly located, and the algorithm may oscillate. When the algorithm oscillates at some depth level, $d$, the MDG revisits wall positions that were already visited. The average depth required for the successful reconstruction of different numbers of walls with a given rate is reported in Table 1. A depth of zero means no iterations are needed. For the examined SNR range, at least two walls are correctly located when the algorithm is four levels deep.

B. WITH UTILIZING MOVING TARGETS
The previous setup is used to test the performance of the DWP. Four moving point targets move arbitrarily in different directions with speeds ranging between 0.5 and 1.5 m/s, which is the typical speed range of a walking human [24]. Since the pixel dimension is 5 cm, a frame rate of 10 frames per second should smoothly capture the moving target. The number of processed frames is 35 with a depth level of $D = 2$. The results are averaged over 100 runs. In each run, the initial target locations, moving paths, and speeds are all chosen uniformly at random within a defined range. No initial guess for the wall positions is needed. Figure 3 shows the results of the first five frames for a single target moving to the right. We do not obtain an accurate reconstruction of the scene at the first frame. However, as the target moves, more information is gathered and an accurate reconstruction is obtained after only 5 frames.
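The relation between target speed, frame rate and pixel size can be made concrete with a short calculation based on the figures quoted above; the function name is illustrative.

```python
def pixels_per_frame(speed_mps, frame_rate_hz=10.0, pixel_m=0.05):
    """Displacement of a target between consecutive frames, in pixels."""
    return speed_mps / frame_rate_hz / pixel_m

slow = pixels_per_frame(0.5)   # about 1 pixel per frame
fast = pixels_per_frame(1.5)   # about 3 pixels per frame
total_time_s = 35 / 10.0       # 35 processed frames span 3.5 s
```

At 10 frames per second, a walking target therefore shifts by roughly one to three pixels between consecutive frames.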
The performance of this algorithm is next evaluated at various noise levels. For the tested range of SNR values, the proposed method was always successful in reconstructing at least two walls. For the case of three walls, Figure 4(a) shows that it is successful in detecting all the walls in more than 95% of the cases when the SNR is above 15 dB. When information from the moving targets is utilized, results show that wall position detection is improved, and hence a better image of the scene is obtained. The quality of the recovered image improves with the number of correctly reconstructed walls because the reconstruction criterion is joint for walls and targets. When false wall positions are considered, the reconstructed image will be populated with false targets. The numerical evaluation of image quality depends on the imaging algorithm and the evaluation metric used. A detailed comparative study is presented by the authors in [22]. When DWP succeeds in detecting all the walls, it converges very fast. Figure 4(b) presents the average number of frames needed to correctly reconstruct the wall positions. When the SNR is 15 dB, 6 frames are needed on average for successful detection of the walls, as shown in Figure 4(b). Across all experiments, the maximum number of additional frames required at 15 dB was found to be 25.
In other surveyed work, the presented results are either for a specific building measured from multiple directions [11] or obtained in a very controlled lab environment [12]. The difference in scenario and the unavailability of the data prevented a direct numerical comparison. In [6], two methods were introduced to solve the optimization problem and reconstruct the walls, namely a quasi-Newton method and a genetic algorithm (GA). The quasi-Newton method outperforms the GA when the initial guess is close to the true wall locations. The GA approach provided rather poor overall accuracy irrespective of the initial guess. Further, the improved reconstruction performance of their approach comes at the cost of much higher numerical complexity [6]. The work of [11]-[13] reconstructs the building contour and does not perform joint reconstruction of walls and targets.

VI. CONCLUSION
A method for joint estimation of the positions of the inner building walls and reconstruction of a ghost-free image of moving targets under a compressive sensing framework was introduced. The DWP method utilizes the fact that moving targets increase the knowledge of the wall positions. The results show that utilizing the new information from different time frames significantly increases the performance of wall reconstruction. The algorithm was shown to be efficient, with an average of 6 frames needed to reconstruct all walls at an SNR of 15 dB.