Information Flow Optimization for Estimation in Linear Models Using a Sensor Network

The problem considered is one of maximizing the information flow through a sensor network tasked with estimating, at a fusion center, an underlying parameter in a linear observation model. The sensor nodes take observations, quantize them, and send them to the fusion center through a network of relay nodes. The links in the network are assumed to satisfy certain capacity constraints in terms of the maximum number of bits that can be transmitted on the links. Furthermore, the relay nodes are assumed to satisfy flow conservation constraints, i.e., the number of bits flowing into a relay node is equal to the number of bits flowing out of it. It is shown that this flow optimization problem for estimation can be cast as a Network Utility Maximization (NUM) problem by suitably defining the utility functions at the sensors. The inference problem considered is one of parameter estimation with a linear observation model, which is studied in both Bayesian and non-Bayesian settings. Upper bounds on the mean-squared error (MSE) of optimal linear estimators are obtained in both settings, and these bounds are used to construct utility functions for the corresponding NUM problems. It is verified via simulations that the bit assignments at the sensors obtained through the solutions to the NUM problems, in both the Bayesian and non-Bayesian settings, yield considerably better estimation performance than the Max-Flow solution that simply assigns bits to the sensors in such a way as to maximize the total bits transmitted to the fusion center.

The sensor nodes take observations, quantize them to a certain number of bits (to be optimized), and send these bits to intermediate nodes. The intermediate nodes simply act as relays and route the bits from the various sensor nodes towards the fusion center. The fusion center is tasked with inferring the state of nature, specifically estimating a (vector) parameter associated with an underlying linear observation model. The links in the network are assumed to satisfy certain capacity constraints in terms of the maximum number of bits that can be transmitted on the links. Furthermore, the relay nodes are assumed to satisfy flow conservation constraints, i.e., the number of bits flowing into a relay node is equal to the number of bits flowing out of it. The problem of interest is therefore the optimization of the information flow in the network (i.e., the optimization of the bit allocations in the network to minimize the MSE) for the estimation task at the fusion center. Note that the trade-off here is between the bits assigned to each sensor and the accuracy of the resulting parameter estimate.
An important application area for such information flow optimization in sensor networks is the Internet of Battlefield Things [1], where accurate inference needs to be made at a central node (fusion center) based on sensor observations that are compressed and relayed to the central node via a wireless mesh network with stringent communication constraints.
There has been related prior work on parameter estimation based on sequences of observations available through a set of distributed sensors (see, e.g., [2], [3], [4], [5], [6], [7]). In these works, the estimation problem is generally cast in a multi-terminal source coding framework, and the goal is to characterize the rate region for compression at the sensors for a given distortion constraint at the fusion center. Also, the focus in all of these works is on single-hop networks, where the sensors have a direct connection to the fusion center.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
There has also been work on the one-shot version of the estimation problem, of the kind considered in this paper, where the sensors send quantized observations to the fusion center only once (e.g., [8], [9], [10], [11], [12], [13], [14]). The focus in these works is either on designing optimal quantizers or on designing optimal/near-optimal estimation algorithms based on quantized sensor observations. The closest work to ours is that in [13], in which the authors consider a Bayesian parameter estimation problem with a linear observation model, with the parameter and the sensor noise having Gaussian distributions. They also make the simplifying assumption that the quantization noise is independent of the quantizer input and is uniformly distributed in the quantization interval. Furthermore, they consider a single-hop network model, and assume a constraint on the total number of bits that can be transmitted from the sensors to the fusion center. In this paper, we consider both Bayesian and non-Bayesian settings and, unlike [13], make no explicit distributional assumptions on the parameter, the sensor noise, or the quantization noise. Furthermore, we consider a general network structure with link capacity and flow conservation constraints, rather than a single-hop network with a constraint on the total number of bits transmitted from the sensors to the fusion center.
We consider linear observation models of the kind that arise in applications such as direction-of-arrival estimation using a distributed sensor network [15]. We propose estimators that are linear functions of the quantized sensor observations available at the fusion center through the sensor network, and analyze their MSE. We then show that the inference objective at the fusion center can be modeled as a sum of concave utility functions, each of which captures the information content in the quantized observations of an individual sensor. We can then cast the information flow optimization problem as a Network Utility Maximization (NUM) problem [16], which can be solved efficiently.
We extend the preliminary version of this work [17] by moving from approximate to exact bounds for the design of utility functions, and additionally study the Bayesian setting.

II. NETWORK MODEL AND UTILITY MAXIMIZATION
We model the network (see Fig. 1) as a directed graph $G = (V, E)$, where $V$ is the set of all nodes (sensors, relay nodes, and the fusion center) and $E$ is the set of all directed edges (links between the nodes). Let $S \subset V$ denote the set of sensor nodes, and let $t \in V$ denote the fusion center, which is a sink node. In this graphical model, each directed edge $(u, v) \in E$ has an associated capacity $c_{(u,v)}$, which is the maximum number of bits that can be transmitted from node $u$ to node $v$. Let the number of bits transmitted on the link $(u, v) \in E$ be $b_{(u,v)}$. The bit flow obeys flow conservation constraints at each relay node.
The performance of an algorithm that assigns a number of bits to each directed edge is measured in terms of the information content, regarding the state of nature to be inferred, in the messages sent by the sensor nodes. In this regard, suppose that there exists a concave utility function $g_s : \mathbb{R} \to \mathbb{R}$ that captures the notion of information content, relevant to the inference task, in the message transmitted from sensor node $s$. (We will characterize such utility functions for the inference task of interest in Section III.) The goal of information flow optimization is then to maximize the sum of the utilities associated with the sensors. Let the total number of bits transmitted from a node $u \in V$ be denoted by $b^+_u$, and into a node $u \in V$ be denoted by $b^-_u$:
$$b^+_u = \sum_{v : (u,v) \in E} b_{(u,v)}, \qquad b^-_u = \sum_{v : (v,u) \in E} b_{(v,u)}. \tag{1}$$
We can then pose the information flow optimization problem as follows:
$$\begin{aligned}
\max_{\{b_{(u,v)}\}} \quad & \sum_{s \in S} g_s(b^+_s) && \text{(2a)} \\
\text{s.t.} \quad & b_{(u,v)} \in \mathbb{Z}_{\ge 0}, \quad b_{(u,v)} \le c_{(u,v)}, \quad \forall (u,v) \in E, && \text{(2b)} \\
& b^+_u = b^-_u, \quad \forall u \in V \setminus (S \cup \{t\}). && \text{(2c)}
\end{aligned}$$
Observe that (2c) arises from the flow conservation constraints.
Relaxing the $b_{(u,v)}$'s to be real-valued, problem (2) becomes a convex program, and is therefore straightforward to solve. A variety of decentralized or distributed methods have been proposed to solve this kind of NUM problem (see, e.g., [16], [18], [19], [20], [21] and the references therein). To obtain a near-optimal solution to problem (2), the optimal solution of the real-valued relaxation can be converted to an integer-valued solution by rounding each real value down to the nearest integer.
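The relax-then-round-down step can be illustrated on a deliberately simple case. The sketch below is not the method of [16] or of this paper: it assumes a hypothetical topology in which all sensors share a single bottleneck link, and hypothetical utilities of the form $g_i(b) = -\kappa_i 4^{-b}$ (concave in $b$). The function names `relaxed_allocation` and `floor_allocation` and all numerical values are illustrative assumptions. The real-valued problem is solved by bisection on a link price, in the spirit of price-based NUM methods, and the result is then rounded down.

```python
import math

def relaxed_allocation(kappa, caps, shared_cap, iters=100):
    """Real-valued bit allocation for sensors sharing one bottleneck link.

    Assumed (hypothetical) utilities: g_i(b) = -kappa[i] * 4**(-b).
    """
    def demand(price):
        # Each sensor maximizes g_i(b) - price * b over [0, caps[i]];
        # the stationary point is b = log_4(kappa_i * ln(4) / price).
        bits = []
        for k, cap in zip(kappa, caps):
            b = math.log(k * math.log(4.0) / price, 4.0)
            bits.append(min(float(cap), max(0.0, b)))
        return bits

    if sum(caps) <= shared_cap:
        return [float(c) for c in caps]  # bottleneck is not binding

    lo, hi = 1e-12, max(kappa) * math.log(4.0)  # demand(hi) is all zeros
    for _ in range(iters):
        mid = math.sqrt(lo * hi)  # bisection on a logarithmic price scale
        if sum(demand(mid)) > shared_cap:
            lo = mid  # price too low: the link is over-subscribed
        else:
            hi = mid  # feasible: try a lower price
    return demand(hi)  # feasible by construction

def floor_allocation(real_bits):
    """Round each real-valued allocation down to an integer."""
    # The small offset guards against values such as 3.9999999999.
    return [math.floor(b + 1e-9) for b in real_bits]

real_bits = relaxed_allocation(kappa=[1.0, 1.0, 16.0], caps=[6, 6, 6], shared_cap=9)
int_bits = floor_allocation(real_bits)
```

For these assumed values, the relaxed solution is approximately (2.33, 2.33, 4.33), which rounds down to (2, 2, 4). Rounding down keeps the capacity constraint satisfied, at the cost of leaving one bit of the shared link unused in this instance.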

III. FLOW OPTIMIZATION FOR PARAMETER ESTIMATION
Consider the following linear model:
$$Y = A\theta + N,$$
where $Y = [Y_1^\top \cdots Y_m^\top]^\top$ represents all observations (potentially non-homogeneous, with $Y_i \in \mathbb{R}^{d_i}$) across the sensors, $A \in \mathbb{R}^{(d_1 + \cdots + d_m) \times D}$ is a known matrix, and $\theta \in \mathbb{R}^D$ is the unknown parameter vector to be estimated. The vector $N$ is of the form $N = [N_1^\top \cdots N_m^\top]^\top$, where the $N_i$'s are zero-mean, independent random vectors, and $N_i$ can be considered to be the noise associated with sensor $i$. Let the covariance matrix of $N$ be denoted by $\Omega$. We consider both the case where $\theta$ is modeled as random and the case where it is modeled as non-random. We restrict our attention to linear estimators. Let the quantized observations be denoted by $Y^q = [(Y^q_1)^\top \cdots (Y^q_m)^\top]^\top$, and let the difference between the quantized and non-quantized observations be $\epsilon = [\epsilon_1^\top \cdots \epsilon_m^\top]^\top = Y^q - Y$. We consider a quantization scheme that satisfies the following reasonable and practical assumptions: 1) In the non-Bayesian setting, we assume that for $i \neq j$,

A. Non-Bayesian Setting
Assume that $A$ has full column rank. It is well known that the generalized least squares (GLS) estimator is the best linear unbiased estimator (BLUE):
$$\hat{\theta}_{\mathrm{GLS}}(y) = (A^\top \Omega^{-1} A)^{-1} A^\top \Omega^{-1} y.$$
We consider the estimator given by $\hat{\theta}(y^q) = A^\dagger y^q$. The MSE of $\hat{\theta}(y^q)$ is given by
We have that
Note that the following can be considered to be an inner product between $X$ and $Z$ (where $X$ and $Z$ have bounded covariance matrices):
Applying the Cauchy-Schwarz inequality, we obtain (18).
Observe that it follows from Assumption 1 that $E_\theta[\epsilon \epsilon^\top]$ is a block diagonal matrix:
Moreover, from Assumption 2, we have
Thus the upper bound on the MSE of $\hat{\theta}(y^q)$ is:
where
For simplicity of notation, let
The upper bound in (21) can be further bounded in two ways. The first method involves bounding the middle term in (21):
The second method applies the inequality $x + 2\sqrt{xy} + y \le 2(x + y)$ to obtain the following upper bound on (21):
The difficulty in optimizing the MSE arises from the fact that the MSE has no known closed-form expression. These bounds help us circumvent this problem, and we propose to optimize these bounds instead of the MSE. Based on these bounds, we can define the utility function for sensor $i$ in two different ways as follows:
where $c_i$ depends on $B_i$ as defined in (4).
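The chain of bounds in this subsection can be sanity-checked numerically. The sketch below rests on several assumptions that are not taken from the paper: a scalar parameter ($D = 1$), three sensors with equal noise variance (so that the GLS weights reduce to $A^\top / (A^\top A)$), uniform noise, a midpoint uniform quantizer, and made-up numerical values. It also assumes that $A^\dagger A = I$, so that the estimation error equals $A^\dagger N + A^\dagger \epsilon$. Writing $x$ and $y$ for the mean squared norms of these two terms, the Cauchy-Schwarz inequality gives $\mathrm{MSE} \le x + 2\sqrt{xy} + y \le 2(x + y)$, which is the form suggested by the inequality quoted above.

```python
import math
import random

def quantize(value, bits, lo, hi):
    """Uniform midpoint quantizer with 2**bits cells on [lo, hi]."""
    levels = 2 ** bits
    step = (hi - lo) / levels
    index = min(levels - 1, max(0, int((value - lo) // step)))
    return lo + (index + 0.5) * step

def bound_check(bits, trials=20000, seed=0):
    """Return (empirical MSE, x + 2*sqrt(x*y) + y, 2*(x + y))."""
    rng = random.Random(seed)
    a = [1.0, 0.5, 2.0]      # hypothetical observation vector A (D = 1)
    theta, w = 0.3, 0.1      # hypothetical parameter and noise half-width
    norm_sq = sum(ai * ai for ai in a)
    a_dagger = [ai / norm_sq for ai in a]   # satisfies a_dagger . a = 1
    sq_err = x = y = 0.0
    for _ in range(trials):
        noise = [rng.uniform(-w, w) for _ in a]
        obs = [ai * theta + ni for ai, ni in zip(a, noise)]
        # Quantizer range covers every observation as long as |theta| <= 1.
        quant = [quantize(o, b, -abs(ai) - w, abs(ai) + w)
                 for o, b, ai in zip(obs, bits, a)]
        u = sum(p * n for p, n in zip(a_dagger, noise))                  # A_dagger N
        v = sum(p * (q - o) for p, q, o in zip(a_dagger, quant, obs))    # A_dagger eps
        estimate = sum(p * q for p, q in zip(a_dagger, quant))
        sq_err += (estimate - theta) ** 2
        x += u * u
        y += v * v
    mse, x, y = sq_err / trials, x / trials, y / trials
    return mse, x + 2.0 * math.sqrt(x * y) + y, 2.0 * (x + y)

mse, tight_bound, loose_bound = bound_check(bits=(4, 4, 4))
```

Because the Cauchy-Schwarz inequality also holds for empirical averages, the ordering `mse <= tight_bound <= loose_bound` holds for any seed and any bit assignment in this sketch, up to floating-point rounding.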

B. Bayesian Setting
In the Bayesian setting, where $\theta$ is modeled as a random vector $\Theta$, the linear minimum MSE (LMMSE) estimator is given by
The error covariance and MSE of the LMMSE estimator are:
We consider the estimator given by $\hat{\theta}(y^q) = W y^q + b$. The MSE of $\hat{\theta}(y^q)$ is given by
Using the same approach as in the non-Bayesian setting, we obtain the following upper bound on the MSE:
where $C$ is as defined in (20). Proceeding in a similar manner as in the non-Bayesian setting, we obtain the following utility functions for sensor $i$:
where

IV. PROPOSED SCHEME
In both the Bayesian and non-Bayesian settings, it can be easily checked that the utility functions defined are indeed concave, i.e., $g^{(1)}_i$ and $g^{(2)}_i$ are concave. It would be preferable to define the utility function for sensor $i$ as $g_i = \max\{g^{(1)}_i, g^{(2)}_i\}$. However, such a $g_i$ is not necessarily concave. We therefore propose to solve the optimization problem (2) in both settings by first solving the problem with the utility functions taken to be $g^{(1)}_i$, and then solving the problem with the utility functions taken to be $g^{(2)}_i$. The final bit assignment can be taken to be the solution with the higher optimal value.
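Two points in the proposed scheme can be illustrated with a small sketch: the pointwise maximum of two concave functions need not be concave, and the two-pass procedure simply keeps the run with the higher optimal value. The functions `g1` and `g2` below are toy concave quadratics chosen for illustration, not the utility functions of this paper, and `toy_solve` is a hypothetical stand-in for a solver of problem (2) that returns an (optimal value, bit assignment) pair.

```python
def is_midpoint_concave(g, points):
    """Check g((p + q) / 2) >= (g(p) + g(q)) / 2 for all pairs in `points`."""
    return all(g((p + q) / 2) >= (g(p) + g(q)) / 2 - 1e-12
               for p in points for q in points)

def g1(b):
    return -(b ** 2) - 1       # toy concave function, peak at b = 0

def g2(b):
    return -((b - 2) ** 2)     # toy concave function, peak at b = 2

def g_max(b):
    return max(g1(b), g2(b))   # pointwise maximum of the two

def toy_solve(g):
    """Stand-in solver: best single-sensor bit count in {0, 1, 2, 3}."""
    return max((g(b), b) for b in range(4))

def two_pass(solve, utilities):
    """Solve once per utility and keep the run with the higher optimal value."""
    results = [solve(g) for g in utilities]
    return max(results, key=lambda result: result[0])

grid = [0.0, 0.5, 1.0, 1.5, 2.0]
concavity = {name: is_midpoint_concave(g, grid)
             for name, g in (("g1", g1), ("g2", g2), ("g_max", g_max))}
best_value, best_bits = two_pass(toy_solve, [g1, g2])
```

In this toy case the midpoint test passes for `g1` and `g2` but fails for `g_max` (at the midpoint of 0 and 2), which is why each utility is optimized separately and the two optimal values are compared afterwards.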

V. SIMULATION STUDIES
We compared the proposed scheme (Opt-Info-Flow) against the well-known Max-Flow scheme [23], [24]. We implemented both schemes using the software packages CVXPY and CVXOPT [25], [26]. The Max-Flow solution is obtained by solving (2) with the utility functions taken to be identity functions. In all the simulation studies, the observations at the sensors are scalars, all quantizers are uniform quantizers, and the noise at the sensors is assumed to be i.i.d. according to a uniform distribution. Consider first the simple deterministically constructed network depicted in Fig. 2, with results reported in Table I.
We now consider a randomly constructed network, where we fix the network architecture as given in Fig. 3 and generate the edge capacities randomly, i.i.d. according to a uniform distribution. We fix an arbitrary realization of the above model parameters. For the non-Bayesian setting, $\theta$ is set to an arbitrary realization of the uniform distribution on $[0, 1]^3$, and for the purpose of selecting quantization intervals, we assume that $\|\theta\|_2 \le \sqrt{3}$. For the Bayesian setting, the underlying parameter vector $\Theta \in \mathbb{R}^3$ is modeled as a random vector with entries generated i.i.d. according to a truncated normal distribution (truncated to $[-1, 1]$). In both settings, the bit allocations at sensors {10, 11, 12, 13, 14} obtained by Max-Flow are {7, 6, 0, 9, 5}. The bit allocations obtained by Opt-Info-Flow are {5, 5, 5, 6, 6} and {5, 5, 5, 5, 7} in the non-Bayesian and Bayesian settings, respectively. The MSE calculated from simulations ($10^6$ iterations) is presented in Table II.
It is clear from Tables I and II that the proposed scheme has a much smaller MSE than the Max-Flow scheme.
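The qualitative difference between the two schemes can be reproduced on a very small example by exhaustive search. The sketch below uses a hypothetical three-edge network (two sensors, one relay, one fusion center), not the networks of Fig. 2 or Fig. 3, and an assumed concave utility $g(b) = -4^{-b}$ in place of the utilities derived in Section III. Exhaustive enumeration is used here only because the instance is tiny; it is not how the schemes were implemented in the paper.

```python
from itertools import product

# Hypothetical toy network: edge -> capacity in bits.
CAPACITY = {("s1", "r"): 6, ("s2", "r"): 6, ("r", "t"): 8}
SENSORS, RELAYS = ("s1", "s2"), ("r",)

def bits_out(flow, node):
    return sum(b for (u, _), b in flow.items() if u == node)

def bits_in(flow, node):
    return sum(b for (_, v), b in flow.items() if v == node)

def solve_exhaustively(utility):
    """Maximize the sum of sensor utilities over all feasible integer flows."""
    edges = list(CAPACITY)
    best = None
    for bits in product(*(range(CAPACITY[e] + 1) for e in edges)):
        flow = dict(zip(edges, bits))
        if any(bits_out(flow, r) != bits_in(flow, r) for r in RELAYS):
            continue  # violates flow conservation at a relay
        value = sum(utility(bits_out(flow, s)) for s in SENSORS)
        if best is None or value > best[0]:
            best = (value, {s: bits_out(flow, s) for s in SENSORS})
    return best

# Max-Flow: identity utilities, so only the total number of bits matters.
max_flow_value, max_flow_bits = solve_exhaustively(lambda b: b)
# Assumed concave utility: the marginal gain of each extra bit shrinks.
info_value, info_bits = solve_exhaustively(lambda b: -(4.0 ** -b))
```

Both runs deliver 8 bits to the fusion center in this instance. With identity utilities every split of those 8 bits is equally good, so the search keeps the first maximizer it encounters, which is unbalanced; with the concave utility the unique maximizer is the balanced split of 4 bits per sensor.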

VI. CONCLUSION
We showed that the problem of information flow optimization for linear estimation with linear observation models over a distributed sensor network can be cast as a NUM problem. We provided solutions in both Bayesian and non-Bayesian settings, and showed via simulations that they achieve lower MSE than the Max-Flow scheme. An interesting direction for future research is the extension to signal estimation in linear models, with sequential observations at the sensors and bit-rate constraints on the links.