Quantization in Compressive Sensing: A Signal Processing Approach

The influence of finite-length registers and quantization effects on the reconstruction of sparse and approximately sparse signals is analyzed in this paper. For nonquantized measurements, the compressive sensing (CS) framework provides highly accurate reconstruction algorithms that produce negligible errors when the reconstruction conditions are met. However, hardware implementations of signal processing algorithms involve finite-length registers and quantization of the measurements. An analysis of the effects of measurement quantization with an arbitrary number of bits is the topic of this paper. A unified mathematical model for the analysis of the influence of the quantization noise and the signal nonsparsity on the CS reconstruction is presented. An exact formula for the expected error energy in the CS-based reconstructed signal is derived. The theory is validated through various numerical examples with quantized measurements, including the cases of approximately sparse signals, noise folding, and floating-point arithmetic.


I. INTRODUCTION
Compressive sensing (CS) theory provides a rigorous mathematical framework for the reconstruction of sparse signals using a reduced set of measurements [1]-[9]. Advantages of CS are directly related to signal transmission and storage efficiency, which is crucial in big data setups. Moreover, the problem of the physical unavailability of measurements, or the problem of significant signal corruption, is also potentially solvable within the CS framework. Since the establishment of CS, the phenomena related to reduced sets of measurements and sparse signal reconstruction have been supported by a fundamental theory and well-defined mathematical framework, while the performance of the reconstruction process has been continuously improved by newly introduced algorithms, often adapted to perform in a particular context or to solve specific problems [10]-[15]. In real applications, many signals are sparse or approximately sparse in a certain transformation domain. This makes CS applicable in various fields of signal processing [14].
Ideally, the measurements used for the reconstruction should be taken accurately, assuming a very large number of bits in their digital form (providing high precision levels). However, this could be extremely demanding and expensive for hardware implementations [16]. Therefore, in practice, the measurements are quantized, meaning that they are represented using a limited number of bits. Such measurements bring robustness, memory efficiency, and simplicity to the corresponding hardware implementation (particularly in sensor design). This paper investigates the quantization influence on the CS reconstruction with a simple yet rigorous characterization of the related phenomena, through the derivation of the corresponding errors. The theory is supported by a relevant theoretical framework and a detailed statistical analysis, through extensive numerical experiments.
The most extreme case of quantization is representing the measurements using only one bit. In previous work [16]-[19], one-bit measurements are treated as sign constraints, as opposed to values to be matched in the mean squared sense during the reconstruction process. Quantization to one-bit measurements is suitable for hardware systems since the quantizers do not suffer from dynamic range issues. However, as the signs of the measurements do not provide amplitude information, the signal can be recovered only up to a constant scaling factor. Additionally, more measurements are needed for a successful reconstruction in such systems, exceeding the signal length. In this paper, we focus on the general B-bit quantization of the available measurements and its effect on the reconstruction accuracy.
If the reconstruction is performed without measurement quantization, the error will be zero or negligible. The quantization of the measurements inevitably introduces reconstruction errors. The effects of quantization in compressive sensing theory have been recently presented in [20]-[25]. The results mainly include the derivation of quantization error bounds and the adaptation of CS algorithms aiming to reduce the distortions related to the quantization [4], [5]. The upper bound of the reconstruction error, for strictly sparse signals, has been derived in [20]. Other reported results are focused on the worst-case analysis [21]. Exact asymptotic distortion rate functions have been derived in [21] for scalar quantization, where the reconstruction strategies have been adapted to accommodate quantization errors. An overview of the quantization phenomena in the compressive sensing context is presented in [22]. Therein, the fundamental analysis provides performance bounds only, with an additional focus on the Sigma-Delta quantization and the related theory. Recently, the effect of quantization on the estimation of sparsity order, support, and signals has been studied within a large number of Monte Carlo simulations in [23]. The most frequently used algorithms in compressive sensing are adjusted to the quantization effect in [24]. For the case of one-bit unlimited sampling, a quantization approach using the one-bit modulo samples is presented in [25], with the bounds of the reconstruction error.
This paper aims to fill the literature gap regarding the exact characterization of the quantization in compressed sensing, by deriving an explicit relation for the mean squared error, instead of the error bounds. The error produced by the quantization of measurements is analyzed from a practical signal processing point of view. Additional to that, the error appearing when approximately sparse signals are reconstructed under the sparsity constraint is also examined. The analysis is expanded to include the effect of the pre-measurements noise in the sparsity domain coefficients, known as the noise folding [26]. The presented theory is unified by exact relations for the expected squared reconstruction errors, derived to take into account all the studied effects. The results are validated using three different reconstruction algorithms. Moreover, we comment on the modifications of the derived relations, required to include the floating-point arithmetics.
The paper is organized as follows. In Section II, basic CS concepts and definitions are briefly presented. Section III introduces a common approach to solving the CS reconstruction problem, including a brief overview of relevant properties which characterize possible solutions. Section IV puts the quantization within the compressive sensing framework. In Section V, the concept of nonsparse (approximately sparse) signals reconstructed under the sparsity constraint is analyzed, leading to the reconstruction error equation which unifies the studied effects. The theory is expanded to take into account the noise folding effect in Section VI, while Section VII discusses the quantization in floating-point arithmetic. Numerical results verify the presented theory in Section VIII. The paper ends with the concluding remarks.

II. BASIC COMPRESSIVE SENSING DEFINITIONS
Definition: A discrete signal x(n), n = 0, 1, . . . , N − 1, is sparse in one of its representation domains X(k) if the number K of nonzero coefficients is much smaller than the total number of samples N, that is, K ≪ N.

Definition: A measurement of a signal is a linear combination of its sparsity domain coefficients X(k),

y(m) = sum_{k=0}^{N−1} X(k) a_m(k),

or, in matrix form,

y = AX,

where y is an M × 1 (M-dimensional) column vector of the measurements y(m), A is an (M × N)-dimensional measurement matrix with coefficients a_m(k) as its elements, and X is an N × 1 (N-dimensional) sparse column vector of the coefficients X(k). It is common to normalize the measurement matrix so that the energy of its columns is 1. In that case, the diagonal elements of the matrix A^H A are equal to 1, where A^H denotes the Hermitian transpose of A.

By definition, a measurement of a K-sparse signal can be written as

y(m) = sum_{i=1}^{K} X(k_i) a_m(k_i),

where k_1, k_2, . . . , k_K are the positions of the nonzero coefficients. The compressive sensing theory states that, under certain realistic conditions, it is possible to reconstruct a sparse N-dimensional vector X from a reduced M-dimensional set of measurements (M < N) in the vector y. The reconstruction conditions are defined in several forms. The most widely used are the forms based on the restricted isometry property (RIP) and the coherence index [1]-[4]. Although providing tight bounds, the RIP-based condition is of high computational complexity. For this reason, the coherence-based relation will be considered in this paper, along with some comments on its probabilistic relaxation.
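The measurement model above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's code; the dimensions, the random seed, and the use of a partial DFT matrix (M randomly selected rows, columns normalized to unit energy, as described later in the paper) are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, K = 256, 128, 10

# Partial DFT measurement matrix: M randomly chosen rows of the N-point
# DFT matrix, with columns normalized to unit energy (|a_m(k)| = 1/sqrt(M)).
rows = rng.choice(N, size=M, replace=False)
A = np.exp(2j * np.pi * np.outer(rows, np.arange(N)) / N) / np.sqrt(M)

# K-sparse coefficient vector X and the measurement vector y = A X.
X = np.zeros(N, dtype=complex)
X[rng.choice(N, size=K, replace=False)] = 1.0
y = A @ X
```

With this normalization, the diagonal of A^H A equals 1, as required by the definition.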
The reconstruction of a K-sparse signal X is unique if K < (1 + 1/µ)/2, where the coherence index µ is equal to the maximum absolute off-diagonal element of A^H A, assuming unity diagonal elements.
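The coherence index and the implied sparsity bound are directly computable. The helper name `coherence` and the example matrix are illustrative assumptions, not part of the paper:

```python
import numpy as np

def coherence(A):
    # Maximum absolute off-diagonal element of A^H A
    # (unit-energy columns are assumed).
    G = A.conj().T @ A
    np.fill_diagonal(G, 0.0)
    return float(np.max(np.abs(G)))

# Sketch: coherence of a small random partial DFT matrix and the
# corresponding uniqueness bound K < (1 + 1/mu) / 2.
rng = np.random.default_rng(1)
N, M = 64, 32
rows = rng.choice(N, size=M, replace=False)
A = np.exp(2j * np.pi * np.outer(rows, np.arange(N)) / N) / np.sqrt(M)

mu = coherence(A)
K_max = (1 + 1 / mu) / 2   # unique reconstruction requires K < K_max
```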
A simple proof will be provided later.
Formally, compressive sensing aims to solve the optimization problem

min ||X||_0 subject to y = AX,

or its corresponding relaxed convex form. Among many others, an approach based on matching the components corresponding to the nonzero coefficients can be used to solve this problem. It is further assumed that the CS reconstruction is based on such a methodology. The solution is discussed in the next section, since it will be used to model the quantization noise and the other studied effects.

III. PROBLEM SOLUTION
To perform the reconstruction, we use an iterative version of the orthogonal matching pursuit (OMP) algorithm from [10]. Assume that K nonzero values X(k) are detected at positions k ∈ K = {k_1, k_2, . . . , k_K}. The system of measurement equations then becomes

y = A_MK X_K,

where only the columns of A corresponding to the positions of the nonzero elements in X(k) are kept in A_MK, and X_K is the vector of the nonzero coefficients X(k), k ∈ K. The solution of this system is

X_K = pinv(A_MK) y = (A_MK^H A_MK)^{-1} A_MK^H y,

where pinv(A_MK) is the pseudo-inverse of the matrix A_MK, and A_MK^H A_MK is the K × K Gram matrix of A_MK. Therefore, the problem solution can be split into two steps: 1) detect the positions of the nonzero coefficients, and 2) apply the reconstruction at the detected positions.
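Step 2 alone (reconstruction at a known support) reduces to a pseudo-inverse. A minimal sketch, with an assumed Gaussian measurement matrix and arbitrary dimensions chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
N, M, K = 64, 32, 4
A = rng.standard_normal((M, N)) / np.sqrt(M)   # unit-energy columns in the mean
support = np.sort(rng.choice(N, size=K, replace=False))
X = np.zeros(N)
X[support] = rng.standard_normal(K)
y = A @ X

# With the support known, solve y = A_MK X_K through the pseudo-inverse,
# pinv(A_MK) = (A_MK^H A_MK)^{-1} A_MK^H.
A_MK = A[:, support]
X_K = np.linalg.pinv(A_MK) @ y
```

Since M ≥ K and y lies in the range of A_MK, the recovery at the known support is exact.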

A. Initial Estimate
Detection of the positions of the nonzero coefficients X(k) will be based on the initial estimate concept. An intuitive idea for the initial estimate comes from the fact that the measurements are obtained as linear combinations of the sparsity domain coefficients, with rows of the measurement matrix A acting as weights. It means that the back-projection of the measurements y onto the measurement matrix A, defined by

X_0 = A^H y,

can be used to estimate the positions of the nonzero coefficients.
For the coefficient at the kth position, the initial estimate X_0(k) takes the following form:

X_0(k) = sum_{m=0}^{M−1} y(m) a_m^*(k),

or

X_0(k) = sum_{i=1}^{K} X(k_i) µ(k_i, k),

where

µ(k_i, k) = sum_{m=0}^{M−1} a_m(k_i) a_m^*(k)

are the coefficients of mutual influence (interference) among the elements X(k). They are equal to the elements of the matrix A^H A, with µ(k, k) = 1. Note that the maximum absolute value of µ(k_i, k), for k_i ≠ k, is the coherence index µ from the reconstruction condition.
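The back-projection step can be sketched as follows. The dimensions, the seed, and the well-separated amplitudes are assumptions made so that the strongest peak of |X_0| clearly marks a true nonzero position:

```python
import numpy as np

rng = np.random.default_rng(3)
N, M, K = 128, 64, 3
rows = rng.choice(N, size=M, replace=False)
A = np.exp(2j * np.pi * np.outer(rows, np.arange(N)) / N) / np.sqrt(M)

support = rng.choice(N, size=K, replace=False)
X = np.zeros(N, dtype=complex)
X[support] = [3.0, 1.0, 0.5]
y = A @ X

# Back-projection of the measurements onto the measurement matrix:
# X_0 = A^H y.  Peaks of |X_0| indicate the likely nonzero positions;
# the other bins contain only the interference terms mu(k_i, k).
X0 = A.conj().T @ y
strongest = int(np.argmax(np.abs(X0)))
```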
For various values of k_i, the off-diagonal elements µ(k_i, k) of the matrix A^H A act as random variables, with different distributions for different measurement matrices. For the partial discrete Fourier transform (DFT) matrix, the distribution of µ(k_i, k) tends to a Gaussian distribution for 1 ≪ M ≪ N, while for an equiangular tight frame (ETF) measurement matrix, µ(k_i, k) takes only values such that |µ(k_i, k)| = µ. Distributions of µ(k_i, k) for other measurement matrices can also be easily determined.
The reduced set of measurements (samples) manifests as a noise in the initial estimate; the interference coefficient µ(k_i, k) therefore acts as a random variable, with mean value and variance given by

E{µ(k_i, k)} = δ(k − k_i),
var{µ(k_i, k)} = σ_µ² = (N − M)/(M(N − 1)), for k ≠ k_i,

for the partial DFT matrix, where δ(k) = 1 only for k = 0 and δ(k) = 0 elsewhere. In the analysis of the reconstruction error, we are interested in the variance σ_µ² of the random variable µ(k_i, k).
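The interference variance for the partial DFT matrix can be checked by a short Monte Carlo sketch. The dimensions, the number of trials, and the two tested column indices are arbitrary illustrative choices:

```python
import numpy as np

# Monte Carlo check of var{mu(k_i, k)} = (N - M) / (M (N - 1)), k != k_i,
# for the partial DFT matrix with random row selection.
rng = np.random.default_rng(4)
N, M, trials = 128, 32, 2000
vals = []
for _ in range(trials):
    rows = rng.choice(N, size=M, replace=False)
    col_i = np.exp(2j * np.pi * rows * 1 / N) / np.sqrt(M)   # column k_i = 1
    col_k = np.exp(2j * np.pi * rows * 5 / N) / np.sqrt(M)   # column k   = 5
    vals.append(np.vdot(col_i, col_k))                       # mu(k_i, k)
vals = np.array(vals)

theory = (N - M) / (M * (N - 1))
empirical = float(np.mean(np.abs(vals) ** 2))
```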

B. Detection of Nonzero Element Positions
The initial estimate can be used as a starting point for an analysis of the reconstruction performance and its outcomes. Potentially, such analysis can lead to the improvements of the reconstruction process. The detection can be done in one step or in an iterative way.
One-step detection: In an ideal case, matrix A H A should be such that the initial estimate X 0 contains K coefficients higher than the other coefficients. Then by taking the positions of the highest coefficients in (7) as the set K, the signal is simply reconstructed using (6).
Iterative detection: The condition that all K nonzero coefficients in the initial estimate X_0 are larger than the coefficient values X_0(k) at the originally zero-valued positions k ∉ K can be relaxed using an iterative procedure. To find the position of the largest coefficient in X based on X_0, it is sufficient that only this coefficient X_0(k) has a value larger than the values X_0(k) at the originally zero-valued positions k ∉ K.

Remark 1: Solution uniqueness. The worst case for the detection of a nonzero coefficient, with a normalized amplitude 1, occurs when the remaining K − 1 coefficients are equally strong (i.e., with unity amplitudes). Then, the influence of the other nonzero coefficients on the initial estimate of the considered coefficient may assume its highest possible value. The influence of the kth coefficient on the one at the ith position is equal to µ(k_i, k), given by (10). Its maximum possible absolute value is the coherence index µ. In the worst case, the amplitude of the considered coefficient in the initial estimate is 1 − (K − 1)µ. At a position where the original coefficient X(k) is zero-valued, in the worst case, the maximum possible contributions µ of all K coefficients sum up in phase to produce the maximum possible disturbance Kµ. The detection of the strongest coefficient is therefore successful if

1 − (K − 1)µ > Kµ,

producing the well-known coherence condition for the unique reconstruction, K < (1 + 1/µ)/2.
After the largest coefficient position is found and its value is estimated, this coefficient can be subtracted and the procedure can be continued with the remaining (K − 1)-sparse signal. If the reconstruction condition is met for the K-sparse signal, then it is met for all lower sparsities as well.
The procedure is iteratively repeated for each coefficient. The stopping criterion is that A M K X K = y holds for the estimated positions {k 1 , k 2 , . . . , k K } and coefficients X(k).
The method is summarized in Algorithm 1.

Algorithm 1 Reconstruction Algorithm
Input: Vector y, matrix A, assumed sparsity K
1: K ← ∅
2: e ← y
3: for i = 1 to K do
4:   k ← position of the highest value in |A^H e|
5:   K ← K ∪ {k}
6:   A_K ← columns of matrix A selected by set K
7:   X_K ← pinv(A_K) y
8:   y_K ← A_K X_K
9:   e ← y − y_K
10: end for
Output: Reconstructed X_R = X_K and positions K.
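The iterative procedure can be sketched in NumPy as follows. This is an illustrative OMP-style implementation of the steps above, not the authors' code; the helper name `reconstruct`, the partial DFT test matrix, and the signal model (unit amplitudes with a uniform perturbation, as in the later examples) are assumptions:

```python
import numpy as np

def reconstruct(y, A, K):
    # Iterative detection and reconstruction (OMP-style), following
    # the steps of Algorithm 1.
    positions = []
    e = y.copy()
    X_K = np.zeros(0, dtype=complex)
    for _ in range(K):
        k = int(np.argmax(np.abs(A.conj().T @ e)))  # strongest position in A^H e
        if k not in positions:
            positions.append(k)
        A_K = A[:, positions]                       # selected columns
        X_K = np.linalg.pinv(A_K) @ y               # re-estimate coefficients
        e = y - A_K @ X_K                           # residual for the next pass
    X_R = np.zeros(A.shape[1], dtype=complex)
    X_R[positions] = X_K
    return X_R, positions

# Usage sketch with a partial DFT measurement matrix.
rng = np.random.default_rng(5)
N, M, K = 128, 64, 5
rows = rng.choice(N, size=M, replace=False)
A = np.exp(2j * np.pi * np.outer(rows, np.arange(N)) / N) / np.sqrt(M)
X = np.zeros(N, dtype=complex)
X[rng.choice(N, size=K, replace=False)] = 1.0 + rng.uniform(0, 0.4, size=K)
X_R, pos = reconstruct(A @ X, A, K)
```

With nonquantized measurements and the reconstruction conditions met, the recovery is exact up to numerical precision, as stated in Remark 2.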
Remark 2: Solution exactness. The coherence index condition guarantees that the positions of the nonzero elements in X will be uniquely determined. Next, we will show that the values of nonzero coefficients will be recovered exactly.
The system of linear equations in (9), for k ∈ K, can be written in matrix form as

X_0K = B X_K,

where B is a K × K matrix with elements b_pi = µ(k_p, k_i), X_0K is the vector with K elements obtained from the initial estimate as X_0K(i) = X_0(k_i), and X_K is the vector with the K corresponding coefficients from the original signal. The influence of the other K − 1 coefficients on the considered coefficient is denoted by C_K.
The reconstructed coefficients X_R, at the nonzero coefficient positions, are obtained by minimizing

||y − A_K X_K||_2^2,

where A_K is the matrix obtained from the measurement matrix A by keeping the columns for k ∈ K. Since A_K^H y = X_0K, according to (7), the solution of (14) can be rewritten as

X_R = (A_K^H A_K)^{-1} A_K^H y = B^{-1} X_0K.

Since X_0K = B X_K, the reconstruction is exact,

X_R = B^{-1} B X_K = X_K.

The reconstruction algorithm produces the correct coefficient values X(k) at the selected positions k ∈ K. It means that the influence of the other K − 1 coefficients on each coefficient in the initial estimate X_0(k), denoted by C(k), is canceled out.

Fig. 1. Illustration of a system for the reconstruction of a sparse signal.
In summary, the reconstruction algorithm, for a coefficient at a position k ∈ K, works as an identity system for the original signal coefficient in X_0(k), eliminating the influence of the other coefficients at the same time (Fig. 1).

C. Noisy Measurements
Assume next that the observations are noisy, with a zero-mean, signal-independent additive noise ε of variance σ_ε² and covariance

E{εε^H} = σ_ε² I.

The variance of X_0(k) due to the input noise in the measurements is σ²_{X_0(k)} = σ_ε², since it has been assumed that the columns of A have unit energy. The noise variance in a reconstructed coefficient is (Remark 2 and Fig. 1) var{X_R(k)} = σ_ε². Since the noise is independent in each reconstructed coefficient, the total mean squared error (MSE) in the K reconstructed coefficients is

||X_R − X_K||_2^2 = K σ_ε².

If the partial DFT matrix is formed as a submatrix of the standard inverse DFT matrix (with normalization 1/N), then we would get ||X_R − X_K||_2^2 = K N² σ_ε²/M, as shown, for example, in [15].

IV. QUANTIZATION EFFECTS
Traditional CS theory does not consider the limitations in the number of bits used for the measurements representation. This can affect the reconstruction performance of the standard CS approaches.
The measurement quantization is particularly important in the hardware implementation context. One-bit measurements are the most extreme case, promising simple, comparator-based hardware devices [16]. The one bit represents the sign of the sample, i.e., y = sign{AX}. However, a larger number of measurements is required for an accurate reconstruction, which is difficult to achieve using only the signs of the measurements [16], [17].
A more general form of the hardware implementation uses a B-bit digital sample of a measurement. We will assume that the measurements are stored in (B + 1)-bit registers (one sign bit and B bits for the signal absolute value), whereas the reconstruction of the coefficients X(k) is done in a sense that is realistic for hardware purposes. The storage requirement is also significantly reduced for such measurements, since the total number of bits is reduced. Note that, for a complex-valued signal x(n), the measurements y_B are also complex, formed as

y_B = Q_B{Re{y}} + j Q_B{Im{y}},

where Q_B{·} denotes the B-bit quantization operator, so that both the real and imaginary parts of the measurements are quantized to B bits.

A. Quantization errors
Quantization influences the results of the compressive sensing reconstruction in several ways:
• Input signal quantization error, described by an additive quantization noise. This influence can be modeled as a uniform noise with values between the quantization level bounds.
• Quantization of the results of arithmetic operations. It depends on the way the calculations are performed.
• Quantization of the coefficients in the algorithm. Being deterministic for a given measurement matrix, this type of error is commonly neglected in the analysis.
In order to perform an appropriate and exact analysis, some standard assumptions are made:
• The quantization error is a white noise process with a uniform distribution.
• The quantization errors are mutually uncorrelated.
• The quantization errors are not correlated with the input signal.
The most important sources of error are the quantization of the measurements y(m) and the quantization of the measured sparse signal coefficients X(k), referred to as the quantization noise folding. They are analyzed next.

B. Input signal ranges
Assume that registers with B bits, with an additional sign bit, are used and that all measurements are normalized to the range −1 ≤ y(m) < 1.
The total number of bits in a register is b = B + 1.
In that case, it is important to notice that the sparse signal coefficients X(k) must be within the range − min{√M/K, 1} < X(k) < min{√M/K, 1}, so that y = A_MK X_K does not produce a value with amplitude greater than or equal to 1. For the partial DFT matrices, this condition holds in a strict sense, while for the Gaussian matrices it holds in a mean sense (all values whose amplitudes are greater than 1 are quantized to the closest level with amplitude below 1). Note that the butterfly schemes for the measurement calculation (as in the quantized FFT algorithms) could extend these bounds for X(k), so that the maximum range −1 < X(k) < 1 can be used.

C. Measurements quantization
For the B-bit registers, the digital signal values y_B are coded into a binary format. When the signal amplitude is quantized to B bits, the difference between the original and the quantized amplitude is called the quantization error. The quantization error is bounded by

|e(m)| ≤ ∆/2,

where the quantization step ∆ is related to B through

∆ = 2^{−B}.

The quantization error of a signal can be modeled as an additive uniform white noise affecting the measurements,

y_B = y + e,

where e is the quantization error vector with elements e(m).
The mean and variance of the quantization noise are calculated as [13]

µ_e = E{e} = 0, σ_e² = ∆²/12.

Note that, for a complex-valued signal, both the real and imaginary parts of the samples contribute to the noise. Therefore, in this case, the variance of the quantization noise can be written as

σ_e² = 2∆²/12 = ∆²/6.

Considering y_B as noisy measurements, the initial estimate will result in a noisy X_0(k). Since X_0(k) is calculated from (8), with the quantization noise in the measurements, its variance will be σ²_{X_0(k)} = σ_e². Therefore, the noise variance in the output (reconstructed) coefficients, for the system shown in Fig. 1, is equal to the input noise variance, var{X_R(k)} = σ_e². Since only K out of N coefficients are used in the reconstruction, the energy of the reconstruction error is

||X_R − X_K||_2^2 = K σ_e², (28)

where, for notational simplicity, we use ||X_R − X_K||_2^2 to denote the expected value of the squared norm-two of the vector X_R − X_K. The full and complete notation of the left side of (28) would be E{||X_R − X_K||_2^2}.
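A B-bit quantizer and the white-noise model σ_e² = ∆²/12 can be sketched as follows. The helper name `quantize`, the rounding-with-clipping behavior at the edge of the [−1, 1) range, and the test data are illustrative assumptions:

```python
import numpy as np

def quantize(v, B):
    # Uniform B-bit quantization of values in [-1, 1), step Delta = 2**(-B).
    # Complex inputs have real and imaginary parts quantized independently.
    delta = 2.0 ** (-B)
    if np.iscomplexobj(v):
        return quantize(v.real, B) + 1j * quantize(v.imag, B)
    return np.clip(np.round(v / delta) * delta, -1.0, 1.0 - delta)

# Sketch: empirical check of sigma_e^2 = Delta^2 / 12 on real-valued data.
rng = np.random.default_rng(6)
B = 8
y = rng.uniform(-1, 1, 200000)
e = quantize(y, B) - y
sigma2_model = (2.0 ** (-B)) ** 2 / 12
sigma2_meas = float(np.var(e))
```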

D. Sparsity to Number of Bits Relation
Based on the previous relations, the influence of the quantization with B bits can be related to the sparsity K. With ∆ = 2^{−B}, the error energy in the reconstructed coefficients, K∆²/6 = K 2^{−2B}/6, remains the same if

K 2^{−2B} = const.

It means that a reduction of B to B − 1 requires a sparsity reduction from K to K/4. The logarithmic form of the reconstruction error is

e² = 10 log_{10} ||X_R − X_K||_2^2 = 3.01 log_2 K − 6.02B − 7.78 [dB].
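The bits-to-sparsity tradeoff can be verified numerically. The function name is an illustrative assumption; the constants are the ones derived above:

```python
import numpy as np

def reconstruction_error_db(K, B):
    # e^2 = 3.01 log2(K) - 6.02 B - 7.78 [dB], complex measurements.
    return 3.01 * np.log2(K) - 6.02 * B - 7.78

# Removing one bit (B -> B - 1) is compensated by a four times
# smaller sparsity (K -> K / 4): the error energy is unchanged.
e_full = reconstruction_error_db(16, 8)
e_traded = reconstruction_error_db(4, 7)
```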

V. NONSPARSITY INFLUENCE
Due to many circumstances, the majority of signals in real-world scenarios are only approximately sparse (nonsparse). This means that a signal, in addition to the K large (sparse) coefficients, has N − K coefficients in the sparsity domain which are small but nonzero. Assume such an approximately sparse (or nonsparse) signal X. The signal is reconstructed under the K-sparsity constraint using Algorithm 1, with the reconstruction conditions satisfied in the compressive sensing sense, so that the algorithm can detect the K largest coefficients.
The reconstructed signal X_R then has K reconstructed coefficients with amplitudes X_R(k_1), X_R(k_2), . . . , X_R(k_K). The remaining N − K coefficients, which are not reconstructed, behave as a noise in these K largest coefficients. According to (13), the noise variance produced in a reconstructed coefficient by a nonremoved coefficient X(k), k ∉ K, is |X(k)|² σ_µ², with σ_µ² = (N − M)/(M(N − 1)) for the partial DFT matrix. The total energy of noise in the K reconstructed coefficients X_R will be

||X_R − X_K||_2^2 = K σ_µ² sum_{k ∉ K} |X(k)|²,

where X_K is the sparse version of the original (nonsparse) signal, i.e., a signal with the K largest coefficients from X and the others set to zero. Denoting the energy of the remaining signal, when the K largest coefficients are removed from the original signal, by

||X − X_K||_2^2 = sum_{k ∉ K} |X(k)|²,

we get

||X_R − X_K||_2^2 = K σ_µ² ||X − X_K||_2^2. (32)

For the partial DFT measurement matrix, the result is

||X_R − X_K||_2^2 = K (N − M)/(M(N − 1)) ||X − X_K||_2^2.

In the case when the signal is strictly K-sparse, i.e., X = X_K, and the reconstruction is performed with nonquantized measurements, the reconstruction would be ideal and the error would be ||X_R − X_K||_2^2 = 0 (or negligible). Since the measurements are quantized to B bits, an error of the form (28) is introduced.
In the case of a nonsparse signal, a general expression is obtained by combining (28) and (32),

||X_R − X_K||_2^2 = K σ_e² + K σ_µ² ||X − X_K||_2^2. (33)

This result will be validated by examples in the next sections, by calculating the signal-to-noise ratio (SNR) of each result and comparing it with the statistical SNR given by

SNR = 10 log_{10} ( ||X_K||_2^2 / (K σ_e² + K σ_µ² ||X − X_K||_2^2) ). (35)
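The unified error expression can be turned into a small predictor. This is a sketch under stated assumptions: the function name `expected_snr_db` is illustrative, the signal is complex-valued (so σ_e² = ∆²/6), the measurement matrix is a partial DFT one (so σ_µ² = (N − M)/(M(N − 1))), and X_K is taken as the K largest-magnitude coefficients:

```python
import numpy as np

def expected_snr_db(X, K, M, N, B):
    # Theoretical SNR combining the quantization error K*sigma_e^2 and
    # the nonsparsity error K*sigma_mu^2*||X - X_K||^2 (partial DFT case).
    idx = np.argsort(np.abs(X))[::-1]
    X_K = np.zeros_like(X)
    X_K[idx[:K]] = X[idx[:K]]
    nonsparse_energy = float(np.sum(np.abs(X - X_K) ** 2))
    sigma2_e = (2.0 ** (-B)) ** 2 / 6          # complex quantization noise
    sigma2_mu = (N - M) / (M * (N - 1))        # interference variance
    err = K * sigma2_e + K * sigma2_mu * nonsparse_energy
    return 10 * np.log10(float(np.sum(np.abs(X_K) ** 2)) / err)

# Strictly sparse unit-amplitude signal: the SNR reduces to 6.02 B + 7.78 dB.
X = np.zeros(64, dtype=complex)
X[:4] = 1.0
snr = expected_snr_db(X, K=4, M=32, N=64, B=6)
```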

VI. NOISE FOLDING QUANTIZATION
In addition to the nonsparse case, we extend the analysis to the case when a quantization noise z exists in the signal coefficients X, prior to taking the measurements [26]. In this case, the measurements are of the form

y = A(X + z) + e,

which can be rewritten as

y = AX + v,

where v = e + Az, and e denotes the total quantization noise affecting the signal measurements, with covariance σ_e² I. The quantization noise vector z is random, with covariance σ_z² I, and is independent of e. Therefore, the resulting noise v is characterized by the covariance matrix

C = σ_e² I + σ_z² A A^H.

If the considered measurement matrix A is formed as a partial Fourier matrix, the relation A A^H = (N/M) I holds. The variance of v is then

σ_v² = σ_e² + (N/M) σ_z²,

with the covariance matrix C = σ_v² I. However, for the sparse case, the quantization error is present in only K nonzero elements of X. It means that the variance of the noise Az is (K/M) σ_z², or

σ_v² = σ_e² + (K/M) σ_z². (41)

For the nonsparse partial DFT matrix case, the term K(N − M)/(M(N − 1)) ||X − X_K||_2^2 is added to the right-hand side of (41), where it is assumed that the quantization of the K largest elements in X is dominant in that part of the error. All previous relations, for the various measurement matrices, can be applied to this case.

VII. FLOATING POINT REGISTERS
In floating-point registers, the quantization error is modeled as a multiplicative error,

y_B(m) = y(m)(1 + e(m)),

where e is the quantization error vector with elements e(m). As in classical digital signal processing, for the analysis of floating-point arithmetic it will be assumed that the sparse coefficients X(k_i), i = 1, 2, . . . , K, are independent, equally distributed, zero-mean random variables with variance σ_X². The coefficients X(k_i) are statistically independent of the measurement matrix elements a_m(k). The mean value of the quantization error term is E{y(m)e(m)} = 0.

The variance of the quantization error term is

var{y(m)e(m)} = E{y²(m)} σ_e² = σ_X² σ_e² K/M,

for all measurement matrices with normalized-energy columns whose elements a_m(k) are equally distributed. This means that the quantization noise y(m)e(m) has the variance σ_X² σ_e² K/M, and we can write (Remark 2 and Fig. 1)

||X_R − X_K||_2^2 = K σ_X² σ_e² K/M.

All formulas, in the various considered scenarios, can now be rewritten to include the cases of nonsparse signals and noise folding. For example, if the measurements are normalized such that E{y²(m)} = σ_X² K/M = 1, then ||X_R − X_K||_2^2 = K σ_e², that is, the floating-point arithmetic produces the same result as the fixed-point arithmetic. However, if the range of the measurement values is lower, for example, E{y²(m)} = σ_X² K/M = 1/10, then the floating-point arithmetic will produce a ten times lower error, ||X_R − X_K||_2^2 = K σ_e²/10.
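The fixed-point versus floating-point comparison above amounts to a power scaling, which can be sketched directly (function names and the numeric values are illustrative assumptions):

```python
import numpy as np

def fixed_point_error(K, sigma2_e):
    # ||X_R - X_K||^2 = K sigma_e^2, independent of the measurement power.
    return K * sigma2_e

def floating_point_error(K, sigma2_e, Ey2):
    # ||X_R - X_K||^2 = K sigma_e^2 E{y^2(m)}, scales with measurement power.
    return K * sigma2_e * Ey2

K, sigma2_e = 10, 1e-6
err_norm = floating_point_error(K, sigma2_e, 1.0)   # normalized, E{y^2} = 1
err_weak = floating_point_error(K, sigma2_e, 0.1)   # ten times weaker power
```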

VIII. NUMERICAL RESULTS
Example 1: One realization of a sparse and a nonsparse signal will be considered as an illustration of the reconstruction.
a) Consider an N = 256-dimensional signal of sparsity K = 10, whose M = N/2 available measurements are stored in registers with B = 6 bits. The measurement matrix is a partial DFT matrix with randomly selected M out of N rows from the full DFT matrix, with columns being energy normalized. The sparsity domain coefficients are assumed in the form

X(k_p) = 1 + ν(p), for p = 1, . . . , K,
X(k_p) = 0, for p = K + 1, . . . , N, (47)

where ν(p) is a random variable with a uniform distribution from 0 to 0.4. Since this signal is sparse, the reconstruction error is defined by (28). The SNR is defined by (35) with ||X − X_K||_2^2 = 0. The original and the reconstructed signals are shown in Fig. 2 (top). The statistical SNR is SNR_st = 42.35 dB and the theoretical one is SNR_th = 42.56 dB.
b) The signal from a), with K = 10 significant coefficients, is considered here, but now assuming that the remaining N − K coefficients are small and nonzero-valued.
Example 2: The statistical results, obtained in 100 realizations, are presented in Fig. 3(a)-(c). Black dots represent the statistical results, SNR_st, and the dash-dot lines show the theoretical results, SNR_th. The agreement is high.
For nonsparse signals, we used the model in (47). Random changes of the coefficient amplitudes ν(p) are assumed from 0 to 0.2, while the amplitudes of the coefficients at the positions k_p ∉ K are of the form X(k_p) = exp(−p/(8K)), in order to reduce their influence to the quantization level. With such amplitudes of the nonsparse coefficients, the quantization error dominates in the reconstruction up to B = 14, while the nonsparse energy is dominant for B ≥ 16, as can be seen in Fig. 3(d)-(f). The statistics are again in full agreement with the theoretical results.
Finally, the noise folding effect is included, taking into account that the input coefficients X(k) are quantized, in addition to the quantization of the measurements y(m). Since the folding part of the quantization error is multiplied by K/M ≪ 1 in (41), the results do not differ from those presented in Fig. 3(a)-(c). In order to test the influence of noise folding, we assumed that the quantized input coefficients X(k) contain an additional noise. An additive complex-valued i.i.d. Gaussian noise with σ_z = 0.0001 is added to these coefficients. This noise is of such a level that it does not influence the quantization error for B < 14. However, for B ≥ 14, it becomes larger than the quantization error and its influence becomes dominant. The results with the quantization and the noise folding, with the additional noise, are shown in Fig. 3(g)-(i).

Example 3:
The statistical analysis is extended to other forms of the measurement matrices, namely the ETF, the Gaussian, and the uniform random matrix. All three forms of the signal and quantization error are considered here, with M = 128 measurements. Sparse and nonsparse signals described in Example 2 are used in the analysis. The reconstruction errors with various numbers of bits, B = {4, 6, 8, 10, 12, 14, 16, 18, 20, 24}, used in the quantization, and various assumed sparsity levels K, are shown in Fig. 4(a)-(c). The results for nonsparse signals, reconstructed with an assumed sparsity, are presented in Fig. 4(d)-(f). The noise folding is analyzed for a reduced number of bits in the quantization of X(k) and presented in Fig. 4(g)-(i).

Example 4:
The analysis of the quantization effects is done under the assumption that the quantization errors are uncorrelated. This condition is met for all previously considered matrices. However, in the case of the Bernoulli measurement matrix and a small signal sparsity, this condition does not hold, meaning that we cannot expect an accurate estimation of the statistical error using the previous formulas. To explain this effect, we start with the simplest case of a signal whose sparsity is K = 1. The measurements are y(m) = a_m(k_1)X(k_1). For all previously considered matrices, y(m) and y(n) are different for m ≠ n, and the quantization errors are independent. However, for the Bernoulli measurement matrix we have y(m) = ±X(k_1)/√M. These measurements produce only two possible quantization errors for all m = 1, 2, . . . , M. It means that the errors in the initial estimate sum up in phase, producing the mean squared error var{X_R(k)} = (M/2)σ_e², which is significantly higher than var{X_R(k)} = σ_e² in the other cases. For K = 2, we get the measurements y(m) = (±X(k_1) ± X(k_2))/√M, which again take only a few distinct values, producing highly correlated quantization errors.
Example 5: The presented theoretical analysis assumed the reconstruction in Algorithm 1, which is also used for the derivation of the theoretical results. Here we show that we may expect similar results for other reconstruction methods, as long as the reconstruction conditions are met. The simulations for the reconstruction with the partial DFT measurement matrix, with sparse and nonsparse signals, including noise folding, are repeated with the iterative hard thresholding (IHT) reconstruction method, given in Algorithm 2 [12], [24]. The theoretical and statistical errors are shown in Fig. 6(a)-(c), showing high agreement between the statistical and theoretical results.
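The Bernoulli, K = 1 degeneracy can be demonstrated in a few lines. The dimensions, seed, and the coefficient value X1 are arbitrary illustrative choices, and a plain round-to-nearest quantizer (without range clipping) is assumed:

```python
import numpy as np

rng = np.random.default_rng(7)
M, B = 64, 4
delta = 2.0 ** (-B)

# K = 1 with a Bernoulli matrix: every measurement is +-X(k1)/sqrt(M),
# so the quantization errors take at most two distinct values and are
# fully correlated, breaking the white-noise assumption.
X1 = 0.3777
y = rng.choice([-1.0, 1.0], size=M) * X1 / np.sqrt(M)
e = np.round(y / delta) * delta - y
n_distinct = len(np.unique(np.round(e, 12)))
```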
Example 6: In this example, the reconstruction of sparse and nonsparse signals, including noise folding, is performed using the Gaussian measurement matrix and the Bayesian-based method [28], [29], summarized in Algorithm 3. The theoretical and statistical errors for the signals from Example 1 are shown in Fig. 7(a)-(c). Again, a high agreement between the statistical and theoretical results is obtained.
[Algorithm 3, fragment: in each iteration, for each i, remove the columns of the matrix A selected by R, remove the elements of the array d_i selected by R, and remove the elements of the vector p selected by R; repeat until the stopping criterion is satisfied. The nonzero coefficients of the reconstructed vector X are in the vector V, with the corresponding positions in the vector p, X_{p_i} = V_i. Output: reconstructed signal vector X_R = V and the set of positions K = p.]

IX. CONCLUSIONS
The effect of quantization noise on the reconstruction of signals under the sparsity constraint is analyzed in this paper. If the measurements were not quantized, the reconstruction would be ideal, with negligible error. However, quantization is an unavoidable step, since the hardware realization of systems cannot store the exact values of samples. We derived the exact error of the reconstruction due to the quantization noise. The case when a signal is not strictly sparse is analyzed as well, along with the noise folding effect. The reconstruction performance is validated on numerical examples and compared with the statistical error calculation, showing high agreement between them.

[Figure caption, fragment: (a)-(c) Sparse signal with measurements quantized to fit B-bit registers, for various sparsities, using a uniform, Gaussian, and ETF measurement matrix, respectively. (d)-(f) Nonsparse signal with measurements quantized to fit B-bit registers, for various sparsities, using a uniform, Gaussian, and ETF measurement matrix, respectively. (g)-(i) Nonsparse signals when both the measurements y and the input coefficients X are quantized to B-bit fixed-point registers (quantization noise folding), using a uniform, Gaussian, and ETF measurement matrix, respectively.]

Influence of quantization on the reconstruction condition
Quantization noise can be included in the coherence-index-based relation for the reconstruction. The worst-case amplitude of the quantization-noise contribution to a normalized coefficient in the initial estimate is bounded by (∆/2) Σ_m |a_m(k)| = ν∆/2 ≤ √M ∆/2. This inequality follows from the relation between the norm-two and the norm-one of a vector. For the partial DFT matrix, the random partial DFT matrix, and the Bernoulli matrix, the equality ν = √M holds. Following the same reasoning as in Remark 1, we may conclude that, at a position where the original coefficient X(k) is zero-valued, the maximum possible disturbance is, in the worst case, Kµ + √M ∆/2, producing the condition for reconstruction K < (1 + (1 − √M ∆)/µ)/2. The influence of the quantization error on the uniqueness condition is negligible if √M ∆ ≪ 1 holds. The coherence-index-based values guarantee exact reconstruction; however, they are pessimistic. More practical relations can be obtained through a probabilistic analysis [6], [30]. The disturbance in the initial estimate at a position k ∈ K, due to the other coefficients and the quantization noise, behaves as a Gaussian random variable N(1, (K − 1)σ_µ² + σ_e²), for K ≫ 1. The initial estimate at k ∉ K behaves as N(0, Kσ_µ² + σ_e²). The probabilistic analysis may provide approximate relations among N, M, and K, for a given probability. We have performed the statistical analysis with various measurement matrices. The results lead to the conclusion that, for high probabilities of reconstruction, the influence of the quantization effects on the reconstruction condition may be neglected for B ≥ 4.
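The coherence-based bound, with and without the worst-case quantization term √M∆, can be evaluated numerically. A sketch under stated assumptions: the random partial DFT construction, the step delta = 2^(−B), and the bound form K < (1 + (1 − √M ∆)/µ)/2 follow the worst-case reasoning above, while the chosen N, M, and B are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(2)
N, M = 256, 64
B = 8
delta = 2.0 ** (-B)   # assumed quantization step (illustrative scaling)

# random partial DFT measurement matrix with unit-energy columns
rows = rng.choice(N, size=M, replace=False)
n = np.arange(N)
A = np.exp(-2j * np.pi * np.outer(rows, n) / N) / np.sqrt(M)

# coherence index: maximum off-diagonal magnitude of A^H A
G = np.abs(A.conj().T @ A)
np.fill_diagonal(G, 0.0)
mu = G.max()

# coherence-based sparsity bounds, without and with the
# worst-case quantization term sqrt(M)*delta
K_clean = 0.5 * (1.0 + 1.0 / mu)
K_quant = 0.5 * (1.0 + (1.0 - np.sqrt(M) * delta) / mu)
print(mu, K_clean, K_quant)
```

For B = 8 the two bounds differ by 0.5·√M∆/µ, a small fraction of one coefficient here, which is consistent with the statement that the quantization influence on the reconstruction condition is negligible for moderate register lengths.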
Note that, for large sparsities K, we have found that the reconstruction probability can be improved by increasing the upper limit on the number of iterations in Algorithm 1 by a few percent with respect to the expected sparsity K. After the iterations are completed, the expected sparsity K is used in the final reconstruction. This solves the problem that the iterative reconstruction in Algorithm 1 cannot produce the exact result if it misses one of the nonzero coefficient positions during the iterative process, for large K.
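The over-iteration idea in this remark can be sketched with a generic greedy (OMP-style) loop. This is an illustrative implementation, not the paper's Algorithm 1; the `extra` parameter and the final re-estimation on the K strongest positions are the point of the remark, while everything else is a standard matching-pursuit skeleton:

```python
import numpy as np

def mp_overselect(A, y, K, extra=2):
    """Greedy (OMP-style) reconstruction that runs K + extra selection
    iterations, then keeps only the K strongest components and
    re-solves the least-squares problem on that reduced support."""
    N = A.shape[1]
    support, r = [], y.copy()
    for _ in range(K + extra):
        k = int(np.argmax(np.abs(A.T @ r)))   # best-matching column
        if k not in support:
            support.append(k)
        xs, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        r = y - A[:, support] @ xs            # update the residual
    # keep the K strongest positions, then re-estimate on them only
    order = np.argsort(np.abs(xs))[::-1][:K]
    keep = [support[i] for i in order]
    xk, *_ = np.linalg.lstsq(A[:, keep], y, rcond=None)
    x = np.zeros(N)
    x[keep] = xk
    return x

rng = np.random.default_rng(3)
N, M, K = 128, 64, 5
A = rng.standard_normal((M, N)) / np.sqrt(M)
x_true = np.zeros(N)
x_true[rng.choice(N, size=K, replace=False)] = 1.0 + rng.random(K)
y = A @ x_true
x_rec = mp_overselect(A, y, K)
print(np.linalg.norm(x_rec - x_true))
```

Running a few extra iterations gives the greedy selection a chance to include a nonzero position it would otherwise miss; pruning back to K afterwards restores the expected sparsity, as described in the remark.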