An Exact Solution for Sparse Sampling for Optimal Detection of Known Signals in Gaussian Noise

Detection of known signals of interest embedded in colored noise involves whitening the received samples followed by matched filtering. In many applications, due to computational constraints, it is critical to select only a subset of the received samples for detection. This letter addresses the problem of selecting a given number of temporal or spatial samples while maximizing detection performance for deterministic signals in first-order autoregressive Gaussian noise. A direct solution entails a combinatorial search in which the deflection coefficient is evaluated for every possible combination of sparse samples. This approach is infeasible when the number of samples is large, since the number of possible combinations increases factorially with the number of samples. We present an efficient method to whiten the Gaussian noise samples and express the deflection coefficient in a form that is amenable to dynamic programming. Exploiting dynamic programming, we specify a feasible and efficient procedure to find the optimal sparse samples in which the number of computational steps increases linearly with the number of samples. Conditions under which uniform sampling is optimal are also given.


I. INTRODUCTION
THE problem of selecting a subset of temporal or spatial samples from available data sets to optimize a parameter of interest has a rich history [1], [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12], [13], [14], [15], [16], [17], [18], [19], [20], [21], [22]. In this research, we address the problem of selecting M samples out of N samples while maximizing detection in a binary hypothesis problem, where the signal of interest is deterministic and the noise is Gaussian. For this detection problem, maximizing detection is equivalent to maximizing the deflection coefficient [23]. A direct solution requires evaluating the deflection coefficient for every possible combination of M samples, and the number of combinations grows factorially with N. Furthermore, for each possible combination, a matrix inversion is required to whiten the received data, which is computationally expensive. We present an alternative method to solve this problem efficiently, where the number of possible combinations that need to be analyzed increases linearly with N and M. The crux of our method is an alternative procedure to whiten colored noise and a novel way to express the deflection coefficient. The representation we provide for the deflection coefficient is amenable to Dynamic Programming (DP), which enables solving the sample selection problem in a time that increases linearly with M [24, pp. 37].

Some of the approaches that attempt to optimally choose M samples out of N samples or design optimal sampling patterns can be found in [15], [16], [17], [18], [19], [20], [21], [22]. They have met with varying degrees of success and cannot be said to be optimal in general. The rate of convergence of the detection metric of the discrete-time samples to that of the continuous-time observations is derived in [15]. Those results are asymptotic in nature and do not provide a method to find the sampling pattern that maximizes the detection metric. We provide a procedure for selecting optimal samples from finite data samples. Also, we implicitly assume the availability of only Nyquist samples and hence a discrete-time noise model, with no assumptions about the underlying continuous-time origin of the data. The sampling schemes in [16], [17] are applicable only when N → ∞. The approach in [18] designs sparse sampling patterns that maximize I-divergence, J-divergence, Bhattacharyya distance, and Chernoff distance. The work in [19] considers Kullback-Leibler divergence (KLD) and Chernoff distance as the optimality criteria and proves that the problem is NP-hard; it provides an algorithm that yields a suboptimal solution to the problem of sample selection while optimizing KLD and Chernoff distance. The works in [20], [21], [22] use KLD and Bhattacharyya distance as optimality criteria for Gaussian observations using submodular optimization. As a result, the approaches in [18], [19], [20], [21], [22] are suboptimal. Furthermore, the computation necessary to obtain the asymptotic solutions is considerable and not guaranteed to lead to the true asymptotic solution. Our approach applies to finite N, guarantees optimality, and is feasible for any value of N. Also, our method uses an autoregressive (AR) noise model of order p = 1. With a high enough order p, an AR model approximates any reasonable noise process [25]. An AR(1) process is commonly used to model noise processes with a single peak in the power spectrum, which arise in applications such as sonar, radar, and speech processing [26], [27].
Details on the estimation of the AR model parameters for a given data set are given in [25]. Future work will attempt to extend our results to higher order AR processes. The overall contributions of this letter are: 1) an alternative method to whiten the data for a signal embedded in AR(1) noise (Section II); 2) a novel representation of the deflection coefficient that lends itself to DP (Section II); 3) a DP method to find the subset of M samples that maximizes detection of deterministic signals (Section III); and 4) an analytical method to find the subset of M samples that maximizes detection of known constant signals (Section IV-A). Our method applies to both temporal and spatial sampling; we use the notation and terminology of temporal sampling.

II. SIGNAL MODEL AND AN ALTERNATIVE EXPRESSION FOR DEFLECTION COEFFICIENT
Consider the following discrete-time binary hypothesis problem for a known deterministic signal in noise,

H_0: x[n] = w[n],            n = 1, 2, . . . , N
H_1: x[n] = s[n] + w[n],     n = 1, 2, . . . , N,     (1)

where H_0 and H_1 represent the null and alternative hypotheses, respectively, and w[n] is a zero-mean discrete-time AR(1) process which satisfies the first-order difference equation

w[n] = −a[1] w[n − 1] + u[n],     (2)

in which u[n] is white Gaussian noise (WGN) with mean 0 and variance σ_u^2. We desire to choose samples {x[n_1], x[n_2], . . . , x[n_M]} so that the detection performance is maximized. Without loss of generality, we assume that the samples are arranged in ascending order, i.e.,

n_1 < n_2 < · · · < n_M.     (3)

We have also assumed in (3) that n_1 ≥ 1 and n_M ≤ N. Since |a[1]| < 1, the variance of w[n] is σ_w^2 = σ_u^2/(1 − a^2[1]) and the autocorrelation coefficient sequence is r_w[k] = (−a[1])^{|k|}. The optimal detector consists of a whitening filter followed by a replica-correlator [23], and the detection metric is the deflection coefficient. To find the M samples that are optimal for detection, we need to evaluate the deflection coefficient, which entails computing the inverse of a covariance matrix. To circumvent the covariance matrix inversion and facilitate a closed-form representation of the deflection coefficient, we employ an alternative approach to whiten the data samples. This approach to whitening noise was exploited by Grenander in [28, pg. 118] to find the Radon-Nikodym derivative for a continuous-time problem. Using this approach, as derived in the Appendix, an alternative expression for the deflection coefficient is

d_d^2 = s^2[n_1]/σ_w^2 + Σ_{i=1}^{M−1} (s[n_{i+1}] − (−a[1])^{n_{i+1}−n_i} s[n_i])^2 / (σ_w^2 (1 − (−a[1])^{2(n_{i+1}−n_i)})).     (6)

Expressing the deflection coefficient in the form given in (6) reveals its special structure: each term in the summation of (6) depends only on the consecutive pair of sample times (n_i, n_{i+1}). To maximize d_d^2 over the integer variables n_1, n_2, . . . , n_M, we frame the maximization problem as

max_{1 ≤ n_1 < n_2 < · · · < n_M ≤ N}  d_d^2(n_1, n_2, . . . , n_M).     (7)

To express (7) in a recursive form, let I_0(n_1) = s^2[n_1]/σ_w^2 and let g(n_i, n_{i+1}) denote the i-th term of the summation in (6), so that (7) can be rewritten as

max_{n_M} max_{n_{M−1}} · · · max_{n_1} { I_0(n_1) + Σ_{i=1}^{M−1} g(n_i, n_{i+1}) }.

Since the term inside the braces depends on n_1 only through I_0(n_1) + g(n_1, n_2), the successive backward recursion provides the expressions

I_{M−1}(n_M) = max_{n_{M−1}} (I_{M−2}(n_{M−1}) + g(n_{M−1}, n_M))
⋮
I_2(n_3) = max_{n_2} (I_1(n_2) + g(n_2, n_3))
I_1(n_2) = max_{n_1} (I_0(n_1) + g(n_1, n_2)),

and the maximum of (7) is max_{n_M} I_{M−1}(n_M).
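To make the recursion concrete, the following is a minimal Python sketch of the dynamic program implied by (6) and (7); it is our own illustration, not the authors' implementation. The function name optimal_sparse_samples, the array layout, and the straightforward O(MN^2) double loop are our choices, and rho stands for −a[1].

```python
import numpy as np

def optimal_sparse_samples(s, M, rho, sigma_u2=1.0):
    """Select M of the N sample times maximizing the deflection coefficient (6)
    for a known signal s[1..N] in AR(1) noise with rho = -a[1].
    I[k, n] is the best partial value of a (k+1)-sample selection ending at time n."""
    N = len(s)
    s = np.asarray(s, dtype=float)
    sigma_w2 = sigma_u2 / (1.0 - rho**2)          # stationary AR(1) variance

    def g(j, n):
        # Pairwise term of (6) for consecutive selected times j < n.
        r = rho ** (n - j)
        return (s[n - 1] - r * s[j - 1])**2 / (sigma_w2 * (1.0 - r**2))

    I = np.full((M, N + 1), -np.inf)              # columns 1..N are sample times
    prev = np.zeros((M, N + 1), dtype=int)        # back-pointers for the arg-max
    I[0, 1:] = s**2 / sigma_w2                    # I_0(n_1) = s^2[n_1]/sigma_w^2

    for k in range(1, M):
        # Leave room for the k earlier picks and the M-1-k remaining picks.
        for n in range(k + 1, N - (M - 1 - k) + 1):
            cand = [I[k - 1, j] + g(j, n) for j in range(k, n)]
            j_best = int(np.argmax(cand)) + k
            I[k, n] = cand[j_best - k]
            prev[k, n] = j_best

    # Backtrack through the stored arg-max indices to recover n_1*, ..., n_M*.
    n_M = int(np.argmax(I[M - 1]))
    picks = [n_M]
    for k in range(M - 1, 0, -1):
        picks.append(int(prev[k, picks[-1]]))
    return list(reversed(picks)), I[M - 1, n_M]
```

Because each stage only needs the previous stage's table, the storage and bookkeeping grow linearly with M, in line with the complexity claim of the letter; the inner loops above are deliberately unoptimized for readability.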

A. Detection of Constant Signals
Consider the maximization of (6) for s[n] = A over the M sparse samples. Then

d_d^2 = (A^2/σ_w^2) [ 1 + Σ_{i=1}^{M−1} (1 − (−a[1])^{n_{i+1}−n_i}) / (1 + (−a[1])^{n_{i+1}−n_i}) ],

where it is assumed that 0 < −a[1] < 1 (the noise power spectral density is lowpass). Now let (with a slight abuse of notation)

g(t) = (1 − (−a[1])^t) / (1 + (−a[1])^t),

so that each term of the summation is g(n_{i+1} − n_i), and therefore we wish to maximize

J(n_1, n_2, . . . , n_M) = Σ_{i=1}^{M−1} g(n_{i+1} − n_i).

It is easily shown that, for 0 < −a[1] < 1, g(t) is a strictly increasing function as well as a strictly concave function of t. With these observations we have the following theorem.
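As a quick numerical check of these two properties (our own illustration, assuming the form of g(t) written above and taking −a[1] = 0.88 as in the later example), the first differences of g are all positive and the second differences are all negative:

```python
import numpy as np

rho = 0.88                               # rho = -a[1], assumed in (0, 1) for lowpass noise
g = lambda t: (1 - rho**t) / (1 + rho**t)

t = np.arange(1, 50, dtype=float)
dg = np.diff(g(t))                       # first differences: positive  -> strictly increasing
d2g = np.diff(dg)                        # second differences: negative -> strictly concave
print(np.all(dg > 0), np.all(d2g < 0))   # True True
```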
Theorem 1 (Optimality of Uniform Samples for a DC-Level Signal): Assume that g(t) is a strictly increasing and strictly concave function over the continuous interval [1, T]. To maximize Σ_{i=1}^{M−1} g(t_{i+1} − t_i) subject to 1 ≤ t_1 < t_2 < · · · < t_M ≤ T, the optimal sample times are t_i* = 1 + (i − 1)Δ*, where Δ* = (T − 1)/(M − 1). Thus, the optimal sparse samples will be uniformly spaced. Furthermore, assuming that T is an integer, denoted by N, and that Δ* is an integer, the optimal samples will be at the integer sample times n_i* = 1 + (i − 1)(N − 1)/(M − 1). Note that the first sample is at n_1* = 1 and the last sample is at n_M* = N.

Proof: Using the one-to-one transformation Δ_i = t_{i+1} − t_i for i = 1, 2, . . . , M − 1, with t_1 retained, the objective becomes Σ_{i=1}^{M−1} g(Δ_i), where the constraints are now t_1 ≥ 1, Δ_i > 0, and, since t_M = t_1 + Σ_{i=1}^{M−1} Δ_i ≤ T, equivalently Σ_{i=1}^{M−1} Δ_i ≤ T − t_1. Since g(·) is strictly monotonically increasing, the total spacing budget T − t_1 should be as large as possible, so the maximizing value for t_1 must be its minimum value, or t_1* = 1. Thus, we need to maximize Σ_{i=1}^{M−1} g(Δ_i) over only the Δ_i, subject to Σ_{i=1}^{M−1} Δ_i = T − 1. Since g(·) is strictly concave, we have that

Σ_{i=1}^{M−1} λ_i g(Δ_i) ≤ g( Σ_{i=1}^{M−1} λ_i Δ_i )

for all λ_i ≥ 0 with Σ_{i=1}^{M−1} λ_i = 1, with equality if and only if Δ_1 = Δ_2 = · · · = Δ_{M−1}. Choosing λ_i = 1/(M − 1) shows that the objective is maximized when Δ_1 = Δ_2 = · · · = Δ_{M−1} = Δ* = (T − 1)/(M − 1). And finally, if the sample times are integers, which will be the case if Δ* = (T − 1)/(M − 1) = (N − 1)/(M − 1) is an integer, then the optimal sample times are as given in the theorem.
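A small brute-force check of Theorem 1 (our own example, using the g(t) written above): for N = 13 and M = 5, so that Δ* = (N − 1)/(M − 1) = 3 is an integer, exhaustive search over all C(13, 5) = 1287 subsets returns the uniformly spaced samples.

```python
from itertools import combinations
import numpy as np

rho, N, M = 0.88, 13, 5                 # (N - 1)/(M - 1) = 3 is an integer
g = lambda t: (1 - rho**t) / (1 + rho**t)
J = lambda idx: sum(g(idx[i + 1] - idx[i]) for i in range(M - 1))

best = max(combinations(range(1, N + 1), M), key=J)
print(best)                              # (1, 4, 7, 10, 13): uniform spacing, as the theorem predicts
```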

A. Detection of Sinusoidal Signals
Consider a sinusoidal signal s[n] = A sin(2πf_0 n), for n = 1, 2, . . . , N. Let A = 1, f_0 = 0.05, ρ = 0.88, N = 50, and M = 25. The signal samples are depicted in Fig. 1. The number of ways in which we can select 25 samples out of 50 samples is staggeringly large: approximately 1.2641 × 10^14. Thus, a brute-force solution of this problem requires tremendous computing resources. However, with DP, we obtain the exact solution for this problem in 2.9 ms using MATLAB on an Intel(R) Core(TM) i7-8750H CPU @ 2.20 GHz machine. The optimal 25 samples are depicted in Fig. 1. Fig. 2 plots the receiver operating characteristic (ROC) curve for M = 20, obtained using 100,000 realizations of the test statistic for each hypothesis. The ROC curve obtained using the theoretical value of d_d^2 aligns with the empirical ROC curve. The figure also plots the ROC curve corresponding to the naive sampling approach, in which the M samples with the largest signal powers are selected for detection. As the figure shows, the optimal approach outperforms the naive approach.
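For reference, the setup of this example can be reproduced with the optimal_sparse_samples sketch given after Section II (again our own Python code, not the authors' MATLAB implementation); the parameters below match those stated above, with rho standing for −a[1].

```python
import numpy as np

# Parameters of the sinusoidal example: A = 1, f0 = 0.05, rho = 0.88, N = 50, M = 25.
A, f0, rho, N, M = 1.0, 0.05, 0.88, 50, 25
n = np.arange(1, N + 1)
s = A * np.sin(2 * np.pi * f0 * n)

picks, d2 = optimal_sparse_samples(s, M, rho)   # DP sketch defined earlier
print(picks)                                    # the 25 selected sample times
print(d2)                                       # the maximized deflection coefficient
```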
APPENDIX

To whiten the selected samples without inverting the covariance matrix, form the variables

y[n_1] = x[n_1],    y[n_m] = x[n_m] − (−a[1])^{n_m − n_{m−1}} x[n_{m−1}],    m = 2, 3, . . . , M,     (13)

whose covariance at lag h, given in (14), follows from the autocorrelation of the AR(1) process. Substituting h = 0 in (14), we obtain the variance of y[n_m] for m = 1, 2, . . . , M, which is

var(y[n_1]) = σ_w^2,    var(y[n_m]) = σ_w^2 (1 − (−a[1])^{2(n_m − n_{m−1})}),    m = 2, 3, . . . , M.

Substituting h > 0 in (14) gives zero; therefore, the variables in (13) are uncorrelated. We divide the variables in (13) by their standard deviations to obtain the whitened samples ỹ = [ỹ_1, ỹ_2, . . . , ỹ_M]^T in (17). Under H_0 and H_1, ỹ is distributed as N(0, I_M) and N(μ, I_M), respectively, where μ is the whitened signal mean given in (18). Therefore, the optimal detector is T(ỹ) = ỹ^T μ > γ and the deflection coefficient is d_d^2 = μ^T μ [23]. This is the detection metric also for the case when the amplitude of s[n] is not known [23]. Using (18), the efficient closed-form expression for the deflection coefficient given in (6) follows.
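The whitening step can be sketched as follows (our own Python illustration of the innovations-style transform described above; the function whiten, the chosen sample times, and the Monte Carlo check are not from the letter). The empirical covariance of the whitened samples should be close to the identity matrix, confirming that they are uncorrelated with unit variance.

```python
import numpy as np

def whiten(x_sel, idx, rho, sigma_u2=1.0):
    """Whiten AR(1) samples x_sel taken at ascending times idx: subtract the
    prediction from the previous kept sample, then normalize by the standard deviation."""
    sigma_w2 = sigma_u2 / (1 - rho**2)
    y = np.empty(len(idx))
    y[0] = x_sel[0] / np.sqrt(sigma_w2)
    for m in range(1, len(idx)):
        r = rho ** (idx[m] - idx[m - 1])
        y[m] = (x_sel[m] - r * x_sel[m - 1]) / np.sqrt(sigma_w2 * (1 - r**2))
    return y

# Monte Carlo check under H0 that the whitened samples are nearly uncorrelated with unit variance.
rng = np.random.default_rng(0)
rho, N = 0.88, 50
idx = np.array([1, 5, 12, 20, 33, 50])           # arbitrary ascending sample times
Y = []
for _ in range(10000):
    u = rng.standard_normal(N)
    w = np.empty(N)
    w[0] = u[0] / np.sqrt(1 - rho**2)            # start the AR(1) process in stationarity
    for n in range(1, N):
        w[n] = rho * w[n - 1] + u[n]
    Y.append(whiten(w[idx - 1], idx, rho))
print(np.round(np.cov(np.array(Y).T), 2))        # approximately the 6 x 6 identity matrix
```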