SURE-Fuse WFF: A Multi-Resolution Windowed Fourier Analysis for Interferometric Phase Denoising

Interferometric phase (InPhase) images, acquired by phase imaging systems, often suffer from two major degradations: 1) phase wrapping, caused by the sinusoidal $2\pi $ -periodic sensing mechanism, and 2) noise, introduced by the acquisition process or the system. This work focuses on InPhase denoising, which is a fundamental restoration step to many posterior applications of InPhase, namely to phase unwrapping. The presence of sharp fringes, which arises from phase wrapping, makes InPhase denoising a hard inverse problem. Motivated by the local sparsity often exhibited by InPhase images in Fourier domain, we propose a multi-resolution windowed Fourier filtering (WFF) analysis that fuses WFF estimates with different resolutions, thus overcoming the WFF fixed resolution limitation. The proposed fusion relies on an unbiased estimate of the mean square error derived using the Stein’s lemma adapted to complex-valued signals. This estimate, known as SURE, is minimized using an optimization framework to obtain the fusion weights. Strong experimental evidence, using synthetic and real (InSAR & MRI) data, that the developed algorithm, termed as SURE-fuse WFF, outperforms the best hand-tuned fixed resolution WFF counterpart, as well as other state-of-the-art InPhase denoising algorithms, is provided.


Introduction
In the recent decade, interferometry has strongly benefited from advancements in phase imaging techniques and has significantly contributed to many fields including remote sensing [1,2], optical metrology [3,4], astronomy [5,6], surveillance [7,8], medical diagnostic [9][10][11], weather forecasting [12,13], etc.In such coherent imaging modalities, the information related to the physical and geometric properties such as shape, deformation, movement, refractive index, structure, etc. of the illuminated objects is coded in the phase of the underlying signals.
Radar and sonar interferometry [14,15], are relevant technologies in the context of phase imaging, in particular Interferometric Synthetic Aperture Radar & Sonar (InSAR/InSAS) [1,2,[16][17][18].InSAR is based on the interference of the electromagnetic field scattered by an object or surface and measured by antennas (or sensors) located at different positions, as illustrated in the fig. 1. InSAS is similar to InSAR but operated with sound waves [19].Given the positions s 1 and s 2 of the sensors, the height h of a given surface element, relative to a reference plane, is easily derived from r 1 − r 2 , i.e., the difference between the distances of the surface element arXiv:1811.04137v1[eess.SP] 9 Nov 2018 and the antennas.Since the propagation phases are given by φ 1 = 4π λ r 1 and φ 2 = 4π λ r 2 , where λ is the carrier wavelength, we conclude that that h = g(φ 1 − φ 2 ), where the function g is derived from the acquisition geometry.See [17] for further details.It happens, however, that the measured signals depend only on the principal (wrapped) values of the original phase (absolute phase) φ .Thus, the measured phase, which we term interferometric phase (InPhase) and takes values in [−π, π), is a non-linear function of the actual phase.This process, termed as phase wrapping, makes the absolute phase inaccessible in the direct measurement.A second major challenge is the noise introduced by the acquisition systems/processes and this further complicates the task of estimating the absolute phase.Owing to these two degradation mechanisms, the estimation of the absolute phase is a challenging inverse problem.Though an InSAR scenario is adopted for the discussion, the underlying inverse problem pops ups in many technologies that incorporate phase imaging, namely, Magnetic Resonance Imaging (MRI) [20,21], Optical Interferometry [4] and High Dynamic Range (HDR) Photography [22].
Most of the proposed approaches for absolute phase estimation follows a two-step procedure: In the first step, a clean phase is estimated in the interval [−π, π).This step is termed as the InPhase estimation or phase denoising; while in the second step, the absolute phase is inferred from the InPhase obtained from first step and this stage is known as phase unwrapping.In this paper, our main focus is the InPhase estimation which is the first step of the two-step approach.The sharp interferometric fringes from phase wrapping make the InPhase images different from the natural images.InPhase image denoising is difficult since the wrapping discontinuities should be preserved carefully for the second stage of unwrapping.Despite these difficulties, many strategies have been proposed to tackle this problem.Local polynomial approximation(LPA) [23] is an early stage attempt in InPhase estimation in which phase is approximated by zero order polynomials in a small local area (in a rectangular window).The lack of an adaptive window size selection is a major drawback of this method.An oversized window fails to estimate sharp discontinuities and non-smooth regions while, on the other hand, a small window may yield poor denoising performance.The selection of the window size is addressed in [24] in which zero-and first-order LPA of the phase are calculated in sliding windows of varying size.A criterion for adapting the window size is obtained from the zero-order approximations, whereas the filtering is performed using first-order approximations.The performance of the proposed algorithm, termed PEARLS, is also limited owing the limited modelling power of first-order polynomials in the areas containing discontinuities.
The recent trend in image denoising exploits non-local self-similarity of the natural images [25,26].The Block matching with 3D filtering (BM3D [27]) algorithm is a popular tool in this category in which collaborative filtering is applied to the groups of similar image patches.The non-local self-similarity has been exploited in the context of InPhase estimation.The methods NL-InSAR [28], SpInPhase [29], and MoGInPhase [30] are three representative examples of this class.SpInPhase and MoGInPhase exploit non-local properties of the InPhase images in complex domain and yield state-of-the-art denoising performance.SpInPhase reformulates the phase denoising problem as a dictionary-based sparse regression problem in the complex domain, where each patch is modelled as a sparse linear combination of the atoms of the dictionary.The sparse regression implicitly projects the noise from a high dimensional space to a low dimensional subspace, which largely attenuates the noise.In MoGInPhase, the patches of complex phase images are modelled using Mixture of Gaussian (MoG) densities in the complex domain.The phase patches, due to the non-local self-similarity, are well modelled by very few eigen-directions of the covariance matrices of the MoG components, which underlies the MoGInPhase denoising performance.
Both SpInPhase and MoGInPhase are data adaptive in the sense that they learn representations from the image to be denoised.In spite of the power of the data adaptive representations, the fact that, in many applications, the phase is mostly locally smooth makes fixed representations in the Fourier domain still powerful tools in phase modelling, as they yield sparse representations of local complex phase patches.A state-of-the-art algorithm of this class is the Windowed Fourier filtering (WFF) based algorithm [31].Arguably, the major weakness of WFF is the lack of adaptiveness of the used windows in windowed Fourier transform (WFT).
Research on automatic window size selection has been carried out in time-frequency analysis of one dimensional signals.In [32], a data-adaptive time-frequency representation is developed using Gaussian basis functions with varying time width and chirp rate; but this adaptive method has high computational complexity.A simple, computationally efficient and signal-dependent time-frequency representations is proposed in [33], in which the window is characterized by a free parameter, which is adapted over time.The temporal estimation of the parameter is done by maximizing a short-term quality measure of the time-frequency representation.More recently, a fast version of adaptive WFT using B-splines that have near optimal time-frequency localization is proposed in [34].A new adaptive short-time Fourier analysis and synthesis scheme, applied to speech enhancement, is proposed in [35].This scheme uses an efficient modified overlap-add procedure.It is to be noted that these adaptive time-frequency analysis methods are developed for real domain one dimensional signals, such as speech signal.
For the two-dimensional image analysis, the wavelet transform, due to its multi-resolution ability, outperforms the Fourier based analyses.We conclude this section by specially mentioning a series of work presented by T. Blu et al. [36][37][38][39] for natural images and videos in which a new strategy, termed as 'SURE-LET', is developed for wavelet-based denoising.In SURE-LET approach, the denoising process is parametrized as a sum of elementary non-linear processes with unknown weights.The optimal weights are computed by minimizing an estimate of the mean square error (MSE) between the clean image and the denoised one.The key point of this strategy is that it makes use of a statistically unbiased estimate of MSE, namely, Stein's unbiased risk estimate (SURE), that does not depend on the clean image.This a priori avoids the random process modelization of wavelet coefficients which is the basis of most of the wavelet-based denoising schemes.A similar strategy is followed in [40].Although the wavelet transform and other multi-resolution analysis have been developed for natural two-dimensional images, they do not adapt well to InPhase estimation, where the underlying signals are complex-valued.
In this paper, we adopt a SURE-based strategy and propose a new multi-resolution Fourier analysis for complex-valued InPhase signals.

Proposed approach
As already mentioned, the research herein developed is motivated by the fact that, in many applications, the phase Φ, is locally (i.e., in 2D neighborhoods) smooth [17].Therefore, the Fourier transform of a 2D patch tends to be sparse.These ideas have already been exploited in the literature.We refer to the WFF work [31,41], which exhibits competitive performance in InPhase denoising of locally smooth phase images.WFF works with fixed windows.This is acceptable if the smoothness of the image phase does not vary considerably across the image.If this is not the case, ie., the image smoothness varies widely across the image, the lack of adaptiveness of the window size greatly limits the WFF performance.The influence of window size was recently studied in windowed Fourier ridge algorithm [42].Without surprise, it was concluded that small windows are good for detecting sharp discontinuities whereas larger windows yield strong noise attenuation in smooth phase regions and/or for high amplitude of noise.
Aiming at endowing WFF with window size adaptiveness, and inspired by the works [36][37][38], this paper proposes a novel approach in which WFF estimates from different window sizes are fused in an optimal way.The fusion is pixel-wise, linear and is implemented in the complex domain such that the estimates from larger windows get higher weights in smooth regions and vice versa.The linear weights for the fusion is designed in such a way that the mean square error (MSE) between the clean and the estimated image is minimum.Since the oracle MSE is not available in a practical inverse problem, we resort to Stein's unbiased risk Estimate (SURE) adapted to the complex domain.

Contributions
The main contributions of this paper are summarized as follows: • Reformulation of InPhase estimation as sparse regression in the frequency domain using Fourier analysis.
• Mathematical formulation of a SURE-based unbiased estimate of the MSE, which is derived in the complex domain for the specific InPhase sparse regression model.• Design of multi-resolution WFF by pixel-wise linear fusion of WFF estimates with different resolutions.
The optimal fusion is achieved through a quadratic programming, designed to minimize the SURE.
The paper is organized as follows: Section 2 introduces the observation model and the underlying interferometric phase problem.A brief revisit of WFT-based denoising strategy, in the perspective of sparse regression in frequency domain, is presented in Section 3 to build the mathematical foundation for the main proposal.Section 4 discusses the resolution-related limitations of WFF and formulates the main research contribution, i.e., a multi-resolution WFF based on SURE-fusion.Strong experimental evidence, using synthetic and real (InSAR & MRI) data, of the competitiveness of the proposed algorithm termed as SURE-fuse WFF, is given in section 5. Finally, the paper concludes in Section 6.

Problem formulation
Let us assume that the images are defined in a 2D grid G := {1, ....m} × {1, ....n} with total number of pixels N = m × n.Also, let a k := (k 1 , k 2 ) ∈ G be a point in the 2D grid G. Herein, for the observed image z := {z k ∈ C, k ∈ G}, we assume the following model: where is the complex-valued noise, having n ℜ and n ℑ as its real and imaginary parts respectively.The noise n k is assumed to be zero-mean circular Gaussian [43] and white with variance σ 2 , i.e., n ℜ k and n ℑ k are zero-mean independent Gaussian random variables with variance σ 2 /2.
1 R ≥0 denotes the set of non-negative real numbers.
Model (1) captures the essential features of the interferometric phase estimation problem [28,29,44,45] and it is also a good approximation for magnetic resonance imaging (MRI) [17].Even in InSAR applications [28,44,45], model (1) endowed with a suitable spatially variant noise, is a good replacement for the true observation model, as shown in [29].The interferometric phase Φ 2π := {Φ 2π k ∈ [−π, π) , k ∈ G} is defined as where W(.) is the wrapping operator that performs component wise 2π-modulo wrapping operation defined by where the function mod (., 2π) denotes modulo-2π operation.Let a clean complex image be denoted as We remark that Φ 2π = arg(x).Hereafter, for the sake of mathematical convenience, we assume that the images z, x, n, a,Φ, and Φ 2π are stacked into a column vector of size N = m × n according to the lexicographic ordering of set G. Throughout our discussion, the components of these column vectors are indexed using their original 2-D grid location k ∈ G. Also, all the mathematical operations in model ( 1) is to be understood as component-wise.With these notations in hand, we define InPhase denoising as the estimation of Φ 2π from the noisy observation z.

InPhase Denoising via Sparse Regression in the Frequency Domain
We start by briefly reviewing the WFT and the inverse WFT (IWFT) in terms of analysis and synthesis frames.Then, we formulate a sparse regression framework in the frequency domain for InPhase denoising.

Two-dimensional windowed Fourier transform (WFT)
The discrete WFT of z, computed at a spatial point k ∈ G and frequency ω := (ω 1 , ω 2 ) ∈ [−π, π)2 , is defined as (see [31] for more details) where h s k is a Gaussian weighting window centered at location k, defined as and ω, k := Size of the window is controlled by the parameter s termed as scale. 2We remark that the analysis function h s k−k e − j ω,k in (4), with h s k being Gaussian, has the structure of a Gabor wavelet [46], which has optimal concentration properties in terms of time duration and frequency bandwidth [47].Also, we assume that both z and h s are periodic images of size m × n.An equivalent way to express Z k,ω is where denotes-2D cyclic convolution.The interpretation of ( 5) is clear: the WFT of z computed at a given space-frequency point (k, ω) may be obtained by modulating z k with the complex exponential e − j ω,k and then convolving the modulated signal with the weighting function h s k .In practice, we compute Z k,ω at frequencies (ω i , ω j ) = (2πi/m, 2π j/n) for (i, j) ∈ I ⊂ G.This can efficiently be implemented with Discrete Fourier Transform (DFT) or its fast version Fast Fourier Transform (FFT).
The IWFT associated with (4) is not uniquely defined.In this work, we adopt the following definition: where is the energy of h s and is a set of 2D-frequencies, and |W | denotes the cardinality of W .Following below is a derivation to obtain the perfect reconstruction condition (PRC) of the IWFT, i.e., the condition under which z k = z k .We start by substituting (4) in ( 6): Let D 1 and D 2 to be sub-multiples of m and n respectively and where r 1 , r 2 ∈ Z (integer) and δ (y 1 , y 2 ) = 1, (y 1 , y 2 ) = (0, 0) , 0, otherwise.
Evaluating (7) for the cases when δ (., .)= 1 in (8) (i.e.,when gives: In order to obtain PRC, we assume that the support of h s k 1 ,k 2 w.r.t.k 1 and k 2 are not greater than m D 1 and n D 2 respectively.With this assumption, all the terms in ( 9) vanishes except the one corresponding to r 1 , r 1 = 0, i.e, We emphasis that the PRC is achieved only when the window function h s k satisfies the aforementioned support constraints and when D 1 and D 2 are sub-multiple of m and n respectively. 3In all the proceeding IWFT expressions, a default assumption of PRC is taken.For the efficient FFT implementation, ( 6) is rewritten as: 3 The specific PRC settings adopted in our implementation, i.e., settings of D 1 and D 2 in relation to the window support, are explained in section 4.5.

Denoising via sparse regression in the frequency domain
Let A s : C N → C N×|W | be the linear operator representing the WFT analysis associated with the definition (5) and S s : C N×|W | → C N be the linear operator representing the IWFT synthesis associated with the definition (11).We recall that, according to the observation model ( 1), our objective is to estimate x from z.We attack this inference problem by solving an optimization problem with two facts in mind: 1. X := A s x is sparse or at least compressible 4  2. Since the A H s A s ∝ I, 5 if the noise n is circular complex Gaussian, independent, and identically distributed (i.i.d.), then A s n is also circular complex Gaussian and i.i.d.Therefore, we propose to solve the following optimization where Z = A s z and the final estimate of x is given by x = S s X.The motivation for the formulation ( 12) stems from the above two facts: the 0 term is a regularizer that promotes sparse WFT representations of the original data and the second term is the negative loglikelihood associated to the circular complex Gaussian i.i.d.noise.The relative weight between the two terms is set by the parameter λ > 0.
Optimization (12), although non-convex, has a simple closed-form solution: the hard-threshold given by where The WFT-based denoising strategy, formulated under a sparse regression framework, is summarized in the following steps: 1. Compute the WFT of z, i.e., Z = A s z give by (5).
2. Compute a sparse solution in the frequency domain by hard-thesholding Z yielding X = Θ λ H (Z) 3. Compute the IWFT of the thresholded signals to get the spatial domain estimate, i.e., x = S s X.
We term this denoising scheme, courtesy of [31], as windowed Fourier filtering(WFF).We remark that this paper does not claim the novelty WFF.Our proposal in section 3 aims to present WFF in a solid framework necessary to the following developments, in addition to a computational efficient implementation.This section is concluded by formally defining WFF as follows: This can also be rewritten as:

4
X is compressible if its magnitude decays rapidly.

5
H denotes Hermitian transpose operation

SURE-fuse WFF: A multi-resolution WFF for improved InPhase denoising
WFT and its variants are widely used in speech and natural image denoising.A significant related work, WFF [31] proposed by Kemao, shows that WFT is a powerful tool to efficiently represent interferometric fringes.Although WFT-based techniques are very popular, they are designed with fixed resolution, which sets a bottleneck to their performance.The spatial and frequency resolution of the WFT are determined by the size of the windowing function.Often in a WFT-based signal analysis, a trade-off between frequency and spatial (or temporal) resolution is considered.Usually, this trade-off is application specific and is set using the Heisenberg uncertainty principle.
We present an illustrative experiment, as a pre-discussion, to show the role of window size in WFF.Two different types of phase surfaces are simulated -i) a 'non-smooth mountain' (fig.2a), which has frequent ups and down (compared to the WFF window size), and ii) a 'smooth smooth mountain' (fig.2b), which is a smooth surface.The complex-valued image ae jΦ is generated by using each of these simulated surfaces as the absolute phase Φ with amplitude a set to unity.Observations as per model ( 1) are generated for low to high level of noise (σ ∈ {0.3, 0.5, 0.7, 0.9}).InPhase Φ 2π is estimated from these noisy observations using WFF [31].The experiments are repeated using various windows whose sizes are determined by the scales s ∈ S = {1, 2, 3, ..., 10}.
A small value of s corresponds to a small window and vice versa.The performance measurements, in terms of peak signal to noise ratio PSNR (46), are given in table 1 from which the following observations are made: i) The non-smooth mountain possesses abrupt variations and hence the spatial resolution is preferred over the frequency resolution.Here a small-window WFF yields better results compared to a large-window WFF.ii) The smooth mountain has a smooth terrain and here the frequency resolution is preferred over the spatial resolution.A larger window is preferred in such cases.However, in the latter case, since the surface is not entirely flat like a plane, there is an optimal window size (s = 7), beyond which, if the window size is increased, the performance starts to decrease.This experiment motivates the necessity of a WFT-based algorithm that can adapt the window size depending on the data.Also, the real life InPhase images are often more challenging than the two examples just considered.In real images, we often have different levels of smoothness across the image, abrupt peaks and valleys, sharp discontinuities, etc.As shown later, having adaptive window size selection yields large denoising gains.Inspired by the works [36][37][38], we address the adaptive window selection by proposing a pixel-wise fusion mechanism of WFF estimates having different resolutions.The fused estimate at a pixel k ∈ G is given by where x s k is the image estimate at the pixel k obtained from WFF of scale s and a s k is the corresponding fusion coefficient.The main challenge here is the design of a s k to obtain a data-adaptive fusion.We propose a strategy in which the fusion coefficients are computed by minimizing SURE, an unbiased estimate of mse.Before introducing the fusion framework, SURE, adapted to complex-valued signals, is derived below.

SURE: Unbiased estimate of mse
Let x := f(z) ∈ C N be the estimate of the true image x, stacked in vector form.Also we use the notation f k (z) to represent the k th component of the estimate x.The mse of x, denoted by mse( x), is given by Expression (18), the oracle mse, cannot be evaluated in a real-life scenario, as we do not have access to the true signal x.However, in the case of real-valued estimates, the Stein's lemma [48] opens the door to replace ( 18) by an unbiased estimate, which is a function of only the observations.Stein's lemma [48] for real-valued signals, courtesy of [36], is as follows: Lemma 1 (Stein's lemma for real-valued signals).Let g(z ) be a mapping from R N to R N such that x is a fixed parameter, and n is a zero-mean i.i.d Gaussian random vector with variance σ 2 .Then we have6 where T denotes the transpose operator and ∇ is the divergence operator, such that It is to be noted that Lemma 1 assumes g k (z ) is differentiable.This is a fundamental result, as it do not make any assumption regarding x that it is treated as a fixed parameter, and the source of randomness comes only from the additive Gaussian i.i.d.noise.We use Lemma 1 to derive SURE for complex-valued signal.

SURE for complex-valued estimators
Stein's Lemma demands the estimating function g to be differentiable.But the WFT-based estimator f : C N → C N operates on complex-valued signals.Hence we adopt a general framework of differentiation, termed as Wirtinger calculus [49,50], which can handle the differentiation involving complex-valued signals.Following this framework, for any real differentiable7 function f k (z), the derivative can be defined as: where the superscript ℜ and ℑ are used to denote the real and imaginary parts respectively, i.e., z k = z ℜ k + jz ℑ k .In order to derive SURE, we consider the expectation of mse defined in ( 18): Lemma 1 is applied to real and imaginary parts seperately.This is possible since the observation model (1) assumes that the real and imaginary noise components (n ℜ k and n ℑ k ) are zero-mean independent Gaussian random variables with variance σ 2 /2.
In (25), the only term depending the true variable x is x 2 .This energy term can be easily replaced as x 2 = z 2 − σ 2 .Using (25), we define the SURE, an unbiased estimate of mse, for complex-valued signal as: It is quite evident from (25) that ε sure is an unbiased estimate of mse( x), i.e., E {ε sure } = E {mse( x)} .If the standard deviation of mse( x) is "small", then ε sure is an extremely useful estimate of the mse( x), as it depends only on the observations z, the estimator f, and the noise variance σ 2 .

Threshold function for denoising
Stein's lemma 1 assumes differentiable functions-However, the WFF estimator defined in (15) contains the hard threshold function Θ λ H (.), which is not differentiable.As in [37], this roadblock is tackled by selecting the following threshold function, known as Linear expansion of threshold (LET) 1 in literature: where λ is the threshold parameter.In our implementation, λ is fixed as 3σ , where σ is the noise standard deviation.This value is obtained by an empirical tuning, specific to the SURE-fuse algorithm, which is explained in detail in section 4.5.Figure 3 shows the shape of an LET-based threshold function Θ λ L and a hard threshold function Θ λ H for σ = 0.3 and 0.9, representing a low and a high noise level respectively.The rationale behind LET is the following: i) it has almost shrinking capability, since Θ λ L (y) 0 for |y| λ ; ii) LET is differentiable and thus can be applied with mathematical formulation involving Stein's lemma 1; iii) LET exhibit anti-symmetry and thus avoids sign preferences; and iv) LET does not make changes to large valued y, i.e., Θ λ L (y) tends to y for large values of y.Equation ( 15) is modified using LET function to define the WFF let as 1 A soft thresholding function, which naturally might come to mind, is not applicable, as it is not differentiable as well.
4.3.Finding the divergence term, i.e., ∇. f(z) Calculation of ∇. f(z) involves the complex derivative in Wirtinger calculus sense [50].Expanding (28) and using the LET function (27) yields: = z k − 1 From eq. ( 30), the derivative may be written as (the details are developed in appendix A) The complex derivative and the derived risk estimate (SURE) is assessed in the following experiment.Here four different surfaces given in figs.10a to 10d are considered as the true absolute phase Φ with unit amplitude a. Observations following model ( 1) are generated for low to high level of noise (σ ∈ {0.3, 0.5, 0.7, 0.9}).The values of SURE (26) and of oracle mse are plotted in fig. 4 and all of them show that SURE is a very good approximation of the oracle mse.

SURE-Based fusing
Let us consider S different WFF let denoisers according to (28) with scales s = 1, 2, • • • , S. Let the windows be h i i=S i=1 and the corresponding WFF let estimates at pixel k, arranged in vector form, be f k (z For each pixel k ∈ G, a fused estimate f fuse k (z) is obtained by the following fusion formula: where We design an optimization frame work, aiming at finding best fusion coefficient a k , by minimizing SURE of f fuse k (z).To design an efficient multi-resolution WFF, f k (z) should contain WFF estimates having resolutions in all possible ranges.To accomplish this, S different windows are chosen in such a way that they cover a wide range of window sizes yielding estimates with low, medium and high resolutions.In such a choice of windows, the optimization is expected to pick up the best window out of the S available choices.Mathematically, this is possible only if the fusion coefficient for the best window is positive and real-valued and the coefficients for the other windows are close to zero.This indicates that even if we use complex-valued a k 's, the preferred feasible region of the SURE-fuse optimisation is a subspace that accommodates only real valued a k 's.Keeping this in mind, we design SURE-fusion using real-valued non-negative fusion coefficients, i.e., a k ∈ R S

≥0
We first define the mse of f fuse k (z) and then derive an optimizing frame work to minimize its SURE.It is to be noted that in (18), a global mse is defined.Hence the sample mean of the squared error is computed by  considering all the samples from the image, i.e., k ∈ G.But for a pixel-wise fusion process, mse for a pixel k is to be defined.This is taken care by defining a neighbourhood, ξ k , for each k and by assuming that the fusion coefficient is same within ξ k .We define ξ k to be the set of pixels that falls within a square window centred at k. 8  With this assumption in hand, the mse of f fuse k (z) is defined as: 8 Implementation settings of ξ k is explained in section 4.5.
In the above definition, we used the result a n = a k , ∀n ∈ ξ k , since we assumed that fusion coefficient is same in the neighbourhood ξ k .The unbiased estimate, SURE, of this mse can easily be define with the help of (26) as: = 1 = 1 (where * denotes the complex conjugate operator and c is a conxtant not depending on a k ) where The objective is to obtain a k such that it minimizes ε sure k .From (39), the following optimization problem is formulated: This is a quadratic program with non-negative constraints and can easily be solved.
In the following experiment, SURE-fusion is analysed using an image representation of the fusion coefficients {a k } k∈G .A Truncated Gaussian phase surface is used for the experiment.This is a Gaussian surface in which one quarter is removed to introduce sharp discontinuity.A highly noisy complex-valued observation is simulated as per model (1) (a = 1, σ = 0.9).SURE-fusion is done using four different scales s ∈ {2, 4, 6, 8}.The best scale, for each pixel, computed using pixel-wise minimum error (oracle), is indicated in fig.5b.From the image representation of the fusion coefficients (figs.5c to 5f), it is observed that the SURE-fusion is in agreement with the pixel-wise minimum error and adapts to the terrain topography.

Parameter settings of SURE-fuse WFF
WFT window: For the windowed Fourier analysis, we choose Gaussian function, i.e., s 2 .It is to be noted that the support of a Gaussian function should be larger than 6 times the standard deviation (6s/ √ 2), to accommodate most of the significant non-zero values of the function.Accordingly, the 2D-support of h, which we denote as n h , is chosen as: where the function y odd finds the nearest odd integer greater than or equal y.We prefer an odd-valued support to place the window symmetrically around the central pixel.PRC settings: For a window function with scale s and support n h , the values of D 1 and D 2 are carefully chosen to to ensure PRC.We recall the PRC criteria discussed in section 3: i) The support of h k 1 ,k 2 w.r.t.k 1 and k 2 are not greater than m D 1 and n D 2 respectively.Accordingly, we may set m D 1 = n h 1 and n D 2 = n h 2 .ii) D 1 and D 2 should be sub-multiples of m and n respectively.In order to guarantee i and ii together, me modify image size (m, n) to the multiple of (n h 1 , n h 2 ), in the respective dimensions, by padding zeros.The size of the new zero-padded image, denoted as (m, n), and the value of (D 1 , D 2 ) for perfect reconstruction are as follows: where is the ceiling function that performs 'rounds to the nearest higher integer' operation.The updated 2D-frequency resolution of our WFT analysis is 2π m/D 1 , 2π n/D 2 .
LET parameter λ : The parameter λ of the LET-based thresholding is heuristically tuned.It has been observed from the literature that this threshold performs well when expressed as a function of the noise level [36,37].In our implementation we perform an empirical tuning by observing the performance of SURE-fuse WFF, in terms of PSNR (46), for a wide range of of λ , expressed as a function of the noise level σ .The experiment is conducted with different test images and it is observed that the best value of λ in majority of cases is λ = 3σ .The tuning of λ done for 3 different test images are shown in fig.6 Neighbourhood ξ k for mse computation: Another important parameter is the size of the square neighbourhood ξ k which is used to compute mse.In general, for a better mse estimation, large number of samples are preferred.But on the other hand, since the non-local regions of the phase surface can be structurally very different, a very large neighbourhood is also not preferred.In our implementation we use 7 × 7 square neighbourhood centred at the pixel of interest.This is a heuristically tuned trade off.

Experiments & Results
In this section, we present a series of experiments to evaluate the performance of the developed algorithm and to compare it with the state-of-the-art.The SURE-fuse WFF is expected to beat the best resolution WFF or to show a result close to it.This is demonstrated by comparing the performance of SURE-fuse WFF with the individual WFFs that are being fused.We use the implementation by Kemao [31] for the individual WFFs.In addition to this, two algorithms, namely SpInPhase [29] and MoGInPhase [30] are also considered as the competitors, which are the state-of-the-art in InPhase denoising to the best of our knowledge.All these algorithms are tuned for the optimum performance as indicated in the respective papers.To evaluate the quality of the estimates, given that our main objective is InPhase denoising, we adopt peak signal-to-noise ratio (PSNR), defined as: where Φ is the true phase (unwrapped), Φ 2π , is the estimated wrapped phase and W is the wrapping operator defined in (3).In the initial parts, we deal with simulated data set that are carefully selected to represent various structurally different phase surfaces.In the final parts, we also use phase data from InSAR and MRI to illustrate the effectiveness of the developed algorithm in real world remote sensing and medical imaging scenarios.

SURE-fuse WFF versus WFF
Initially, the same experiments conducted with smooth (fig.2b) and non-smooth (fig.2a) mountains in section 4 are repeated here using SURE-fuse WFFs. Figure 7 shows the PSNR plots of SURE-fuse WFF and the individual WFFs as a function of noise variance.For better readability of the plots, WFFs with representative scales s = 1, 4, 7, 10 are shown; the performance of other WFFs lies in between.It can be observed from figs.7a and 7b that SURE-fusion's shows very competitive results, which are close to or better than the best WFF being fused.
In the above experiment, although a smooth and a non-smooth topology are considered, the structural properties of a particular topology is spatially similar throughout the image.In such cases, the optimum resolution does not much vary from pixel to pixel and the SURE-fusion will have close performance with the best WFF.Now we consider more complex structures where a single phase image has regions with both smooth and sharp phase variations.In such cases, SURE-fusion is expected to beat the best WFF, as fusion mechanism can pick-up the best resolution in a pixel-wise manner.To demonstrate this, we simulate a special surface, which we term as

SURE-fuse WFF versus state-of-the-art competitors
The multi-resolution ability of SURE-fuse WFF, in comparison with the normal WFF, was demonstrated in the previous section.In this section, we present more experiments and bring the comparisons with the state-of-the-art competitors.To consider a wide range of phase topologies, we use a data set which is shown in fig.10.Out of these surfaces, figs.10a to 10d are synthetic data.But we also use data from Interferometric SAR (InSAR), which is shown in fig.10e.This is the digital elevation model of a mountainous terrain around Longs Peak,  10e)-0 to 90 radians.A complex-valued data is created with each of these surfaces as the absolute phase (Φ) and with an invariant amplitude equal to 1.These is a very tough data set for the phase denoising algorithms as the underlying absolute phases contain wide range of structural properties that include smooth regions, planes, sharp peaks and valleys, sharp discontinuities, etc. Observations as per model ( 1) are generated for low to high level of noise (σ ∈ {0.3, 0.5, 0.7, 0.9}).
Ten different windows (S = 10), corresponding to the scales s ∈ {1, 2, 3, ...., 10}, are considered.Table 2 shows performance indicators of the developed algorithms in comparison with the state-of-the-art competitors.In the same table, we have also tabulated the results for the normal WFF, for each of the scales, using the implementation by Kemao [31].In table 2, the WFF results corresponding to the best window sizes are highlighted using rectangle boxes.It is evident from the table that in most of the cases, SURE-fusing yields the best result, even considering the best WFF estimate.Also in majority of the test cases, except for the Spear Plane, SURE-fusion beats SpInPhase and MoGInPhase with considerable margins.We would like to remark that the Spear Plane considered here has a periodic structure with high level of non-local self-similarity.For such structures, algorithms like SpInPhase and MoGInPhase are expected to perform very well as they are designed to exploit the non-local self-similarity.However, such synthetic images, a sinusoidal surface for another example, are structurally very different from the real life data.

Real MRI data
Experiments using real MRI interferograms are presented in this section.We remark that the MRI InPhase data shows same statistical properties as any other interferograms and is well modelled by (1) (please refer [17] for more details).Figure 11 shows the InPhase images collected from real MRI scan 2 from a human head along side, top and front orientations.These three phase images are used as clean InPhase images and iid white Gaussian noise with variance corresponds to σ ∈ {0.3, 0.5, 0.7, 0.9} is added, as per (1) to get the noisy data.Phase denoising is done for all these noise levels using SURE-fuse WFF with scale ∈ {1, 2, ..., 8}.In fig.12, SURE-fusion is comared with WFF.For better readability of the graphs, only the WFF plots corresponding to scale =  In this experiment, a moderate noise variance (σ = 0.5) is considered.In addition to this, the same experiment is repeated for other noise levels and the results are tabulated in table 3. From Figure 13 and table 3 it can be concluded that the developed algorithm is very competitive with the state-of-the-art, providing experimental evidence to its effectiveness and competitiveness to perform InPhase denoising in real data.

Interferometric SAR data (InSAR)
In this section, we present strong experimental evidences to demonstrate the robustness of SURE-fuse WFF in denoising real InSAR data.In SAR interferometry, the terrain topography is reconstructed from the InPhase data which are extracted from the complex-valued noisy SAR images collected at slightly displaced antennas.Let x 1 and x 2 be the acquired SAR images.Our focus is to denoise the noisy interferogram given by Φ 2π = arg (x 1 x * 2 ), where x * 2 denotes the complex conjugate of x 2 and represents the Hadamard product [51].The InSAR data used in the following experiments are distributed with the book [17].These data sets were generated based on real digital elevation models of mountainous terrains around Longs Peak and Isolation Peak, Colorado, using a high-fidelity InSAR simulator.We use six different SAR data, each having pixel size 152 × 152.The first three interferograms are collected from adjacent areas around Longs Pleak (LP), which we term as LP-1 (fig.14a), LP-2 (fig.14b), LP-3 (fig.14c).The latter three are from Isolation peak (ISP), and we term them as ISP-1 (fig.14d), ISP-2 (fig.14e), ISP-3 (fig.14f).The corresponding interferograms, corrupted with InSAR noise, are shown in figs.14g to 14l.The detailed description of the InSAR simulators are beyond the focus of this discussion and the readers are suggested to refer [17,Chapter 3].Also, for the additive and uniform noise variance modelling of the InSAR noise, we adopt the methodologies described in [29].
The estimation is carried out by neglecting the 'bad' pixels as indicated by the quality maps supplied along with the data.In fig.15, the InPhase estimates for two selected interferograms [Long Peak-1 (fig.14g) and Isolation Peak-2 (fig.14k)] are given for visual comparison.Careful examination of fig.15 shows ability of SURE-fusion to retain the minute details, which are over smoothened for its competitors.This is clearly supported by the PSNR values.The detailed results for the experiments conducted on these data sets are concluded in table 4. The displayed values indicates that SURE-fuse WFF brings remarkable improvements in quality of the estimates.This results provide strong experimental evidence of the advantages of SURE-fusion and leads to the conclusion that SURE-fusion is very well suited to the real and challenging InSAR data.to other two estimates.Also, in general, the interferometric fringes are more sharp and clear for SURE-fuse estimate.

Absolute phase imaging with phase unwrapping
The absolute phase estimation from noisy interferograms, is usually accomplished in two stages-Phase Denosing and Phase Unwrapping.This paper focuses on phase denoising part.But we remark that phase denoising is a crucial step in absolute phase estimation as the quality of the denoised InPhase images plays a major role in the success of the proceeding stage, i.e., phase unrapping.The preservation of sharp fringes and discontinuities of the interferogram is an important attribute of any phase denoising algorithm.In this section, the SURE-fuse WFF estimates are unwrapped using the state-of-the-art phase unwrapping algorithm, termed as PUMA [52].
The section contains both qualitative and quantitative evaluation of SURE-fuse WFF in comparison with its competitors.The quality of the unwrapped denoised phase ( Φ) is measured based on the number of estimated pixels having error less than π compared to the true phase image (Φ).We define number of error larger than π (NELP [29]) as: where High quality InPhase estimates should have low NELP values and vice versa.Also, based on the set J we define a new peak signal-to-noise ratio, denoted as PSNR a as follows: PSNR a := 10 log 10 4Nπ 2 where the notation Φ J stands for the restriction of Φ to the set J [29].

Phase unwrapping of simulated data set
Phase unwrapping is demonstrated using two different surfaces-Truncated Gaussian (fig.18a) and peak-valley (fig.18e) surfaces.Heavily noisy interferograms are generated (σ = 0.9, a = 1 in model ( 1)) using these surfaces as the absolute phases (Φ).The underlying absolute phase estimation is quite challenging as the interferograms are highly noisy and they contain sharp irregularities.These noisy wrapped phases are denoised using SURE-fuse WFF and its competitors and then unwrapped using PUMA [52].Figures 18b to 18d show the results for truncated Gaussian.Form a careful examination, the competitive nature of the SURE-fuse WFF in retaining the sharp discontinuities as well as the smooth and flat regions can be observed.This is supported by the NELP and PSNR a values.
The second surface in fig.18e is more challenging as it contains narrow peaks and pits.The SpInPhase estimate (fig.18f) has very poor quality on the smooth areas.The MoGInPhase estimate (fig.18g) has better smooth areas but many narrow peaks are missing as a result of the smoothening.The SURE-fuse estimate (fig.18h) is much better, compared to its competitors, in retaining the narrow peaks as well as the smooth regions.These qualitative observations are supported by NELP and PSNR a values with more than 2 dB improvement.We remark that, even in SURE-fuse estimate, some of the narrow peaks are missing or badly estimated; but this is a very challenging data with a heavily noisy interferogram.

Phase unwrapping of InSAR images
The absolute phase estimation is performed using real InSAR data described in section 5.   Finally, we repeat the previous experiments with real InSAR data using other phase denoising algorithms.The results tabulated in table 5 show that SURE-fuse WFF has around 1 dB (or more) improvement in PSNR a for most of the test cases.Also the NELP values of SURE-fuse are much better compared to that of its competitors.This is a very strong empirical proof that makes the SURE-fuse WFF a very good choice in real InSAR phase imaging applications.

Conclusion
This paper introduced SURE-fuse WFF, an algorithm based on multi-resolution windowed Fourier analysis, for interferometric phase image denoising.InPhase images, being natural images, show high level of sparsity in the frequency domain and WFT is a powerful tool to be exploited to denoise such images.Through a brief discussion on the pros and cons of WFT-based representations, we arrived at the conclusion that the fixed resolution (window size ) sets a major bottle neck in the performance of WFT.This issue has been addressed by proposing a new multi-resolution WFT, termed as SURE-fuse WFF, aiming at fusing the WFF-estimates having different resolutions to achieve multi-resolution.A linear, pixel-wise and data-dependent fusion mechanism in the complex domain is introduced.An optimization frame work that minimizes the unbiased estimate of mean square error, known as SURE, is developed to obtain the fusion weights.
Through a series of demonstrative experiments, we show the data dependent fusing ability of SURE-fuse WFF to achieve multi-resolution.A very detailed experiment section is provided to prove that SURE-fusion outperforms the best hand tuned WFF.Also strong qualitative and quantitative experimental evidences are provided using a set of synthetic, real InSAR and real MRI data to show that the SURE-fuse WFF outperforms the state-of-the-art InPhase denoising algorithms.
Appendix A : Derivation of divergent ∇. f (z) As per (21), calculation of ∇. f (z) involves the complex derivative ∂ f k (z) ∂ z k . Here the derivative w.r.t.complex-valued variable are not straight forward and we adopt Wirtinger calculus [50] which is explained in (22).According to this framework, ∂ z k ∂ z k = 1 and ∂ z k ∂ z * k = 0. Expanding (28) using the LET function ( 27) yields: Now from (4) we have,  )

9 Figure 3 .
Figure 3. Shapes of LET and Hard Threshold function for λ = 3σ

Figure 4 .
Figure 4. SURE vs oracle mse.WFF window corresponding to the scale s = 4 is used

( a ) 8 Figure 5 .
Figure 5. Image representation of SURE-fusion coefficients for a Truncated Gaussian surface of size:120 × 120 and phase range: 0 to 44 radians.

Figure 6 .
Figure 6.Emperical tuning for λ using 3 different test images.(The absolute phase surface are given in figs.10a, 10c and 10d respectively)

Figure 8 .
Figure 8. Peak-valley surface.Simulated phase surface of size 120 × 120 and phase range from 0 to 22 radians

, 4 & 8 ,
representing low, medium and large sized windows, are shown.Figures 12a to 12c show that SURE-fuse WFF always outperforms the best scale WFF.The comparison with the state-of-the-art competitors, using the same experiments is also done and the resulting estimates along with PSNR values are shown in fig.13.2The work was carried out on a 1.5 T GE Signa clinical scanner operating within Western General Hospital (WGH), University of Edinburgh.

(a)Figure 11 .
Figure 11.Real MRI interferograms of a human head obtained by scanning along 3 different directions

Figure 13 .
Figure 13.Clean, noisy and estimated MRI phase images.Results for SURE-fuse WFF versus state-of-the-art algorithms are shown.Top, middle and bottom rows contain side, top and front views respectively.

Figure 14 .Figure 15 .
Figure 14.Real digital elevation model data from Longs Peak (LP) and Isolation Peak (ISP).Top row : Clean interferogram, Bottom row: Corresponding noisy data with InSAR noise.
3 (figs.14a to 14f).The original digital elevation model (DEM) of the clean InSAR data collected from Longs Peak is shown in

Figure 20 .
Figure 20.Top row: DEM of the clean InSAR data collected from Isolation Peak.Bottom row: The absolute phase estimates obtained from the noisy interferograms of the top row (please see figs.14j to 14l for the noisy interferograms) using SURE-fuse WFF and PUMA unwrapping [52].

Table 2 .
Performance indicators for surfaces shown in fig.10.Sp:SpInPhase, MoG: MoGInPhase, s: scale of WFF.Best values among WFF estimates are shown in boxes.Best among Sp, MoG and SURE-fuse are shown in bold.

Table 3 .
Performance indicators for surfaces shown in fig.11.Best values are shown in bold.
Next we present experiments based on an InSAR Single Look Complex (SLC) data set, having pixel size 200 × 250, collected from an area near Evaggelistria, Greece (from 38 • 18'30"N to 38 • 19'32"N and from 23 • 01'15"E to 23 • 04'24"E), distributed by European Space Agency (ESA)3.As it is evident from fig.16, this is a highly challenging Interferometric data set which is corrupted with heavy noise.

Table 4 .
PSNR comparison for InSAR data

Table 5 .
Performance Comparison for Phase Unwrapping with real InSAR data