Probabilistic Message-Passing Control

There is insufficient current understanding of how to apply fully decentralized control to networks of sparsely coupled nonlinear dynamical subsystems subject to noise to track a desired state. As exemplars, this class of problem is motivated by practical requirements of creating decentralized power grids robust to cascade failures, the digital transformation of Industry 4.0 managing IoT connectivity reliably, and controlling transport flow in smart cities by computing at the edge. We demonstrate that an approach utilizing probability theory to characterize and exploit the uncertainty in locally received information, and locally optimized messages passed between neighboring subsystems is sufficient to implicitly infer global knowledge. Thus, control of a global state could be realized through decentralized control signals applied only to local subsystems using only local information without any reference to a global current state. Given a global system that can be decomposed into a set of locally coupled subsystems, we develop a theoretical method of probabilistic message passing and probabilistic control signals all interacting only at the subsystem level, but which promotes a system-wide convergence to a desired state. Our theoretical results are corroborated using computational experiments on a network of a 10-node partially coupled system decomposed into four separated subsystems with control inputs applied and determined at the subsystem level. Comparing the results with a centralized control method utilizing information from all the nodes to achieve global state convergence validates our hypothesis that local decentralized probabilistic control can be affected by the mechanism of local probabilistic message passing without needing access to global centralized information. We also provide a set of numerical experiments increasing the network size showing that the decentralized algorithm is independent of the global system size.

has been highlighted [1] by the development of blockchain [2], both as a mechanism for supporting egalitarian cryptocurrency, and as an example of a decentralized ledger technology (DLT). Blockchain however, is but one recent impact of decentralization. The impact of distributed energy resources, the mandated requirement of ambitious renewable energy targets to national energy policies, and inefficiencies of energy transport across distance are motivating a future decentralized power grid at nation state level, favoring decentralized generation [3]. Similarly, traffic management systems to ease congestion and pollution across smart cities by computing at the edge requires a stable, controllable decentralized connected network of local autonomous systems [4]. For multiagent systems such as drone formations for agriculture or surveillance, there exists the generic unsolved distributed formation control problem of multirobot systems [5].
The demand for stable behavior of self-adaptive decentralized yet connected systems motivates the core of this article: to devise a theoretical and algorithmic framework for a fully decentralized control approach that can be applied to such noisy and nonlinear systems.
However, full decentralization leads to many challenges, not least of which is how can a decentralized system arrive at a global consensus that is optimal for the whole community when all decision making is made at the local subgroup level. Simple "unsupervised" self-organization in dynamical systems is insufficient in systems that require a specific global outcome. It is not a closed system problem and needs to respond and interact with external stimuli or interventions guiding the system to specific outcomes, such as confirming a block of financial transactions on a ledger, or a desired voltage output whilst dealing with fluctuating global power demand. Being decentralized, each local subgroup by definition does not respond to commands sent from a global controlling center. Each subgroup only sees information passed by its (presumably sparse) connection links to other local subgroups and modifies its local behavior to optimize its own characteristics. The challenge is how to locally direct each subsystem to a behavior that optimizes a global system benefit, when each subsystem does not have access to the global cost function, and only observes signals from its neighbors, and in turn will communicate its own local state to its own connected subgroups.
The final challenge is that in addition to dealing with nonlinear behavior within a subsystem we have to contend with noise, uncertainty, and stochasticity across the system, so conventional deterministic control techniques cannot be employed to direct the behavior of the decentralized subsystems. This implies we need an approach where subsystems and external control strategies need to be developed probabilistically. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ The focus of this article is to introduce the theoretical structure for the explicit control interventions in a decentralized but interconnected system under the assumption that each subsystem only has access to probabilistic information on the state of each connected subsystem. In a similar way, each subsystem conveys information on its current state to its own connected neighbors, all without having access to the global cost function of the whole system. A correcting probabilistic control signal is introduced at each subsystem, the precise form of which is one of the novel contributions of this article.
The methodology employed in this article is to consider the decentralized set of partially coupled subsystems where each subsystem's state is locally estimated and controlled independently using a probabilistic model for each subsystem and a probabilistic controller each of which needs to be estimated rather than predetermined. Each subsystem can measure the state of the local variables that describe its own dynamical behavior (i.e., the state of the subsystem is observable by the local subsystem). However, each subsystem only receives partial information on the current state of its neighboring subsystems, via probabilistic message passing, the form of which is derived in this article. Note that messages passed from a particular subsystem about its state are treated by the neighboring subsystems as external variables, and so cannot be influenced by local control interventions. This is the mechanism that allows us to decouple the problem into locally controllable decentralized subsystems.
The novelty is in treating the full control problem as intrinsically probabilistic, where the decentralization aspect is achieved by exploiting the statistical independence between subsystems, allowing a decomposition of the joint distribution into locally accessible probability distributions constituting the information in the message passing between connected subsystems.

II. REVIEW OF CURRENT APPROACHES
The deterministic dynamical systems-based approach has provided several useful developments, notably in synchronizing chaos [6], pinning control [7], [8], multiagent control [9], and mean-field optimal control [10]. These developments however have weaknesses. They either over-represent single-agent architectures as far as the controller design is concerned, which retain elements of centralization requiring knowledge of the global state, or are decentralized but decisions are based on disconnected knowledge, or on a simplifying averaging, and do not incorporate stochastic uncertainty.
Another fundamental problem in complex systems control is consensus in which the objective is to devise decentralized control laws for all subsystems constituting a complex system to ensure that the global objective of the complex system is achieved using only local knowledge available to each subsystem locally. A comprehensive analysis of the consensus problem along with the influence of direct information flow accompanied with changes in the network topology was given in [11]. A recent overview of decentralized consensus issues in modern blockchain networks was reviewed in [12], though without issues of stochastic uncertainty, and not addressing the environmentally unsustainable energy budget consumed by the main consensus algorithms at scale. Similarly, it has been proposed [13] to use a blockchain solution as a decentralized peer-to-peer energy trading model for smart grids, but again suffers from all the problems of scalability and constraints of the consensus algorithms. Also related to using blockchain for microgrids in distributed decision making [14]. Although not strictly a decentralized control algorithm, they investigate a decentralized consensus decision-making approach for cybersecurity of multimicrogrid power systems which are subject to incomplete information. Of note is their use of Bayesian games to incorporate decision making under uncertainty. In particular, they consider incomplete information being made available to each node, and use a Harsanyi transformation to convert this into a problem of complete but imperfect information. A fuzzy payoff matrix is used to determine a local decision which is recorded on a blockchain using modified consensus. This is a rare example of including uncertainty explicitly into the decision-making algorithm, although still constrained by the consensus approaches of blockchain, and limited by uncertainty in dynamics.
Although several methods for consensus and synchronization have been proposed in [15] and [16], there is still a lack of a reliable framework that can guarantee synchronization in the presence of heterogeneous and/or uncertain complex systems components. Some recent studies considered distributed adaptive synchronization to harmonize multiagent systems with large uncertainties and heterogeneous agents, but they mostly assumed linear systems and identical dimensions for all agents [17], [18]. For networks that change topology over time, adaptive control methods have been proposed to guarantee stability for uncertain switched linear systems [19], [20]. A robust synchronization method was proposed in [21] for linear multiagent systems with uncertainty in the agent dynamics. However, despite being proven robust, the aforementioned methods considered linear dynamical systems only, therefore, they are too restrictive to be used in real-world scenarios where the processes exhibit nonlinear dynamics and where their models are usually difficult to obtain. Similarly, the approach in [22] presented the consensus control problem of heterogeneous linear multiagent systems using a distributed proportional-integral protocol to ensure consensus of heterogeneous linear multiagent systems within directed communication graphs. However, in addition to the issues of being constrained to linear subsystems, the algorithm proposed depends on the eigenvalue information of the Laplacian matrix of the communication graph, and so requires knowledge of the whole system.
An approach to decentralization algorithms in tree-structured graphs was considered in [23] using a nonconvex optimization over tree-structured networks, where each node can solve smallscale optimization problems and communicate approximate value functions with its neighbors. However, stochastic uncertainty was not incorporated and the explicit large-scale global knowledge of the tree structure has to be assumed.
An approach to create a distributed output-feedback controller in a linear system using a triggering function to only transmit information between nodes at given interval events to achieve bounded consensus was described in [24]. As a leader-follower framework, the model assumed full deterministic knowledge of the leader dynamics, did not incorporate any noise or stochastic uncertainty in the dynamics or in the observed or estimated state values or control signals. Also, the knowledge of the full connectivity matrix is used to create the control gain in the linear system.
Going beyond linearity and fixed architectures, recent studies have investigated uncertain nonlinear multiagent systems [25]. Decentralized control methods have also been proposed to control large-scale and interconnected nonlinear systems [26], [27]. Nonetheless, most of these methods were based on the exploitation of traditional adaptive and optimal control approaches and the minimization of the mean of the objective function, thus yielding heuristic certainty equivalence controllers which are known to perform badly in the presence of noise, model uncertainty, and unmodeled dynamics. Other studies in complex systems reviewed recent advances on controllability and control of complex networks and explored the intricate interplay between the network topology and dynamical laws. Liu and Barabási [28] discussed the challenges that nonlinearity, high dimensionality, and constraints on the intervention impose on designing system-level control [29], and developed analytical tools to study the controllability of an arbitrary complex directed network [30].
Ma and Ma [31] proposed an algorithm using radial basis function neural networks to estimate the unknown nonlinearities in switched nonlinear large-scale systems for adaptive decentralized fault-tolerant control using certainty equivalence controllers. However, stochastic uncertainties (noise) and unknown dynamics limit the use of certainty equivalence controllers.
Another approach using a radial basis function neural network to estimate the intermediate nonlinear behaviors in a decentralized control system, but this time considering input saturation and time-dependent output constraints has recently been shown in [32]. A control algorithm was introduced using time-varying barrier Lyapunov functions to ensure output constraints and a backstepping approach with Lyapunov stability for finite time system stability was used for convergence. No noise effects or other stochastic uncertainty disruptions were included, scaling behavior was not analyzed, and the system used knowledge of a global Lyapunov function.
There are also message-passing approaches for the minimization of global network objective functions using only local decentralized optimizations. For instance, [33] devised a decentralized message-passing approach which was efficient for the basic decomposition of electrical load networks. This work gave full decentralization providing a minimization of a global cost function if the device objective functions are convex closed proper functions. However, it was not a probabilistic framework to deal with the inherent stochasticities, neither did it provide a mechanism of local control.
As indicated in this overview of existing research directions, there are gaps in knowledge surrounding insights for a robust framework of genuine decentralized control. This is especially acute for nonlinear partially coupled stochastic systems with inherent uncertainties, where no global information can be communicated to the local level.

III. SYSTEM MODEL
This section outlines an architecture and the components of the decentralized control framework. A global multicomponent system is decomposed into a set of nonoverlapping but loosely coupled subsystems. This loose coupling will be quantified later on by a set of probabilistic messages representing partial knowledge of one subsystem passed to another subsystem. The receiving subsystem treats these partial messages as external independent "environment" variables. At no point is global system knowledge needed at the local subsystem level.

A. Nomenclature
The global system is composed of K subsystems, labeled α, β, γ, . . . , where needed. t denotes time, t = {1, 2, . . . , |t|} is the time sequence, and |t| is the cardinality of t. The (locally) observable multivariate state of a given subsystem at time t is denoted x α t (or simply x t where no confusion occurs). Information received from a subsystem β by a subsystem α at time t is denoted y β→α t (generically abbreviated y t when there is no confusion). These are treated as independent external variables. We will also use the notation z t = [x t , y t ] to group together the observed and external input variables to a subsystem. The local control signal at subsystem α at time t is denoted u α t , or generically u t . Since we are developing a probabilistic framework, we assume that there is a conditional joint distribution of all variables at each subsystem, denoted s α [ . . . ]. We will write this as We also use s(x t | . . . ) and s(y t | . . .) to denote the distribution functions of the observed and external variables separately which will be clear from the context of variables. The control signals are also assumed to be taken from a conditional distribution function, denoted c(. . . ) = c α (u t |z t−1 ) applied at subsystem α. A particular optimized choice of a specific control strategy will be denoted c o . We also use D(J s ||J I ) to denote the Kullback-Leibler divergence (KLD) between the two probability distributions between the joint probability density function of the closed-loop control system J s and an ideal joint probability density J I .

B. Architecture
The global system is decomposed into K partially coupled subsystems each estimated and controlled independently with a probabilistic model and a probabilistic controller as shown in Fig. 1. Each subsystem can access its own local state that describes its dynamical behavior. This measured information of a particular subsystem is denoted by {x t }. Each subsystem communicates partial information that describes its current state to its neighboring subsystems via probabilistic message passing to be derived later. Messages passed from a particular subsystem about its state x t enter the neighboring subsystems as external variables which are denoted by y t by the receiving subsystem. The main objective of the subsystem's controller is to influence its own state x t while it treats the receipt of its neighbors' state information as external disturbances y t . It will be shown that the optimized probabilistic message passing between neighboring subsystems achieves global system Fig. 1. Architecture of the decentralized probabilistic control framework of complex systems. The set of sensors in the top parallelogram shape measure the state of the variables, x t that describe the dynamical behavior of that subsystem. TD is the temporal delay line. y t denotes the external signals that are received from other neighboring subsystems. The dynamics of each subsystem are estimated using two probabilistic models, s α (x α t |u α t , z t−1 ) and s α (y α t |y t−1 ). The optimized probabilistic controller of a particular subsystem is c * α (u α t |z t−1 ). The messages being passed depend on the architecture of other subsystems. In the figure, the outgoing messages from a subsystem are shown as a dark parallelogram shape while the incoming messages to the subsystems are shown as a light gray parallelogram shape. The figure shows mainly three subsystems α, β, and γ along with another subsystem which is shown to be further away with fading connection arrows to emphasize the existence of other subsystems that constitute a complex system. Two messages are shown being passed from subsystem α to subsystem β denoted as from the source subsystem are treated as external signals, y α t in the destination subsystem.
goals in a cooperative manner where various subsystems have access to different and local information. The probabilistic message passing to be developed in this article allows the subsystems to update local information using the partial information they hold on the state of their neighboring subsystems. Reconciliation of the subsystems' partial knowledge allows a consistent global approximation to emerge.
In the prescribed network of interacting components, the state of each subsystem x t is influenced by a multivariate control input u t . The states of the subsystems are also driven by part of the observed multivariate state values of their neighbors, y t−1 received through probabilistic message passing. However, we assume a probabilistic approach in which each of these variables (states, external variables, and control signals) are sampled from underlying probability distributions that need to be determined. We will use the independence between the external variables and the local state variables to decompose the full joint distributions into products of simpler distributions that need to be estimated. The interaction between all these random variables in a given subsystem can be described by a Markov-type probability density function as follows: The first conditional density model on the right-hand side of (1), s(x t |z t−1 , u t ) follows from the assumed Markovian property, whereas the second conditional density model, s(y t |y t−1 ) represents the assumption that y t are external variables to that subsystem with their dynamics uninfluenced by the control input u t or the subsystem states x t . In the following sections, we derive the probabilistic estimation and control problem of a subsystem.

C. Fully Probabilistic Design of Subsystem
Following the previous representation of a subsystem model, a subsystem objective is to derive a randomized control input probability distribution c(u t |z t−1 ) to achieve its objectives and consequently the objectives of the overall complex system. In this article, this optimization process is achieved by using the fully probabilistic design (FPD) methods of control. FPD selects the optimal control strategy, c o from the set c of randomized control strategies, formed by sequences of randomized control laws c = {c(u t |z t−1 )}, t ∈ t, as follows: where D(.) is the KLD between the joint probability density function of the closed-loop control system J s and an ideal joint probability density J I . The joint probability density function of the closed-loop system can be written explicitly using the system's input and output data as follows: The ideal joint pdf of the closed-loop system behavior J I specifies the preferred form for the joint distribution of the system behavior J s . For a subsystem model explained in Section III, it is given by, To clarify, the ideal distribution of the subsystem internal state, s I (x t |z t−1 ) can be specified according to the control objective that is required to be achieved from controlling the system under consideration. For example, for the objective of regulation around the origin, the mean of the distribution s I (x t |z t−1 ) needs to be set equal to zero and the covariance of this distribution will be specified by the maximum fluctuations that are allowed around the origin. Similarly, the mean of the ideal controller c I (u t |z t−1 ) will need to be set to zero to achieve the regulation objective, while the covariance of this distribution will specify the allowed range of optimal inputs.
In (4), the probability density function s(y t |y t−1 ) describing the desired behavior of the external variables y t is taken to be equal to the corresponding counterpart in (3). This follows from the assumption that the dynamics of external random variables are not influenced by the control input u t of the subsystem. The interpretation of the desired probability density function s(y t |y t−1 ) of the external variables, y t respects their externalities and allows them to evolve independently.
Therefore, once the various distributions given in (3) and (4) have been determined, the Kullback-Leibler minimization as specified by (2) determines the optimal randomized control strategy, which in turn needs to be selected from its distribution.
We determine each of the required distributions in the following.

D. Estimating the Probability Distribution Function of Subsystem Model
The FPD control method assumes the availability of a probability density function, s(x t , y t |z t−1 , u t ) describing the system model (1). This model is not prescriptive and cannot be found as the solution of a closed-form set of equations. We make the assumption that at a given time instant this unknown model function can be locally approximated as a conditionally dependent Gaussian distribution, parameterized by a mean and a covariance function that depends on the state and external variables and is adaptively modified at each time instant.
Consequently, we employ a machine learning approach to affect the function approximation of the distribution in an adaptive online manner. The use of a function-approximating nonlinear model to estimate the probabilistic form of the subsystems has several advantages, including the ability to implicitly detect complex nonlinear relationships between a subsystem output and its inputs.
To estimate the probabilistic model of a subsystem (1), we adopt the method proposed in [34]. In this method, the distribution of a system output was approximated using a Gaussian function but with functional forms of inputdependent means and global variances. The input-dependent means were estimated using machine learning models as Architecture of the neural network model that estimates the probabilistic models of the subsystems of a complex system. universal nonlinear function approximators. Following the estimation of the input-dependent means, the residual errors in the means' predictions are calculated which are approximately Gaussian random noise with zero mean and a global covariance matrix. Applying this approach to estimating the subsystem model function, the nonlinear relationships between the multivariate subsystem states as well as its external state variables, {x t , y t } with their corresponding inputs {(z t−1 , u t )} are obtained using a standard nonlinear interpolator such that the following inequalities hold: where {δ 1 , δ 2 } > {0, 0} are small tolerance values, and f (z t−1 , u t ),ĝ(y t−1 ) are neural network approximations of the subsystem state x t , and external state y t , respectively. Following this estimation, the residual errors from the estimation process can be obtained as follows: Here, e 1 (z t−1 , u t ) and e 2 (y t−1 ) represent the approximation errors satisfying |e 1 (z t−1 , u t )| ≤ δ 1 and |e 2 (y t−1 )| ≤ δ 2 , respectively. The global covariance of these approximation errors can then be calculated as follows: The estimated subsystem states and external state, {f (z t−1 , u t ),ĝ(y t−1 )} and their corresponding estimated autocovariances {R x t , R y t } are then used to characterize the Gaussian distribution description of the probability density functions of the subsystem state and its external state as follows [34]: This architecture is shown in Fig. 2.

E. Optimal Randomized Controller of Subsystem
The next distribution to estimate is the control distribution applied to each subsystem. For this, we make use of the following theorem.
Theorem 1: The randomized optimal control strategy for a subsystem model (1) and an ideal closed-loop model (4) that optimizes the FPD objective function defined in (2) is given by where this solution is obtained by solving the following recursive equation that represents the minimization of (2) with respect to the control input The full derivation of (10) can be found in [35]. Equation (10) constitutes the recurrence equation of the dynamic programming solution to the FPD control problem. Proof: The proof of this theorem can be found in [35].
For the general form of probability density functions of a system dynamics as given in (1), the integrals defined in Theorem 1, that are required to derive the optimized control strategy, will be hard to evaluate even if Gaussian probability density functions are assumed. This is due to the nonlinear dependency of these probability density functions on the state and control inputs. Therefore, the problem of the derivation of the optimal randomized controller is generically a nonlinear optimization problem that can be solved by setting the derivative of the recurrence functional equation (10) with respect to the control input equal to zero where is the derivative of the optimal cost-to-go function with respect to the system state vector. This solution yields the Riccati equation and statedependent control input in the linear quadratic regulator case which can be solved efficiently. In the case where the parameters of the system probability density functions are general nonlinear functions of the state and control input as specified by (8), the optimal control solution from (11) has no closed analytic form. Therefore, we extend the FPD adaptive critic approach to derive the near to optimal control inputs for single dynamical nonlinear systems such that optimal control inputs can be optimized for nonlinear decentralized dynamical systems with external state variables. The FPD adaptive critic approach is based on the dual heuristic programming (DHP) scheme of adaptive critic methods. The methodology and optimization method involved in the FPD adaptive critic approach for the proposed decentralized control framework remains the same as that for single systems and is taken as a ready methodology in this article. In particular, three parametric blocks called the controller or action network, the forward model and the critic network need to be implemented to optimize the control system and derive the optimal control inputs. The action network is responsible for estimating the conditional distribution of control signals as will be explained next, while the critic network provides an approximation to the derivative of the optimal cost-to-go function as specified in (12). The forward model can be either a mathematical model or neural network approximation to the conditional distribution of the system dynamics as discussed in Section III-D. The main differences between the implementation of the FPD adaptive critic approach in the proposed decentralized framework as compared to that of single dynamical systems come from the existence of the external state variables in the complex system decomposition and design. In particular, the inputs to the forward dynamical models need to be dealt with to take care of external state variables of the subsystems as discussed in Section III-D. In addition, the form of the recurrence equation (12) to be optimized involves multiple integrations as a consequence of the introduction of the external state to the state of the subsystems of a complex system.
Next, we focus on the problem of estimating the conditional distribution of control inputs based on the derived optimal control values from (11). This is required for further development of the probabilistic message passing proposed in this article. The conditional distribution of control inputs can be obtained using the same method that is discussed in Section III-D for estimating the conditional distribution of the forward dynamics of the controlled system. Following this method, a controller network can be optimized such that the error between the optimal control input u t , obtained from (11) and estimated control input h(z t−1 ) from the neural network is minimized. Once this network is optimized, information about the error between the optimal control u t and estimated control h(z t−1 ) will become available. Hence, the controller generates a control signal u t stochastically from a Gaussian distribution where h(z t−1 ) is the mean computed from the controller network and is a fixed covariance matrix equal to the average residual error between the output of the network and the optimal control signal,

IV. PROBABILISTIC MESSAGE PASSING
As discussed, the framework uses an architecture where neighboring subsystems are assumed to have access to partial copies of their neighbors' state measurements estimated via probabilistic message passing. Probabilistic message passing is required between the subsystems in order to achieve consensus among the subsystems on the information that they retain about the state of their neighbors. This decentralized probabilistic message-passing method, which forms the main innovation of this article, will be developed in this section. Without loss of generality, the method is outlined here for a complex system that is composed of two subsystems indexed by ρ ∈ {α, β}. Each subsystem can access its measured state values, x t , retains copies of the partial states of its neighbors y t , and optimizes its control input, u t that influences its state values. Moreover, the governing equation of a subsystem indexed by ρ is a general nonlinear equation as discussed in Section III-D.
We start our development by considering the optimally tuned closed-loop probability density description of subsystem α, evaluated at time t Using the probability density functions of the forward system dynamics of node α which is specified by (8), the designed randomized controller of node α given by (13), and treating the probability models s α (x α t |u α t , z α t−1 ) and s α (y α t |y α t−1 ) as multivariate Gaussian distributions, we can rewrite (14) according to the analysis given by (15) as shown at the bottom of the page.
Subsystem β receives information about the internal state, x α t of its neighboring subsystem α through message passing. This information about the state of subsystem α is treated as external variables in subsystem β, x α t → y β t . Therefore, in order to update the knowledge that subsystem β retains on its external variables, we need to integrate the closedloop probability density description of subsystem α over all state variables except its internal state x α t which needs to be passed to subsystem β in order to update its information on its external state variables, Using (15) in (16) yields (17), as shown at the bottom of the page. The evaluation of this integral is shown in (18), at the bottom of the page.
This is an integral over a Gaussian distribution with the dependent variable governing the nonlinear mean in the control input, and so does not have a closed-form representation. Under the assumption that the system is asymptotically close to an equilibrium, the dominant contribution from the integrand comes from the peak and its surrounding. We could consider a stationary phase or saddle point approximation to the full distribution. Alternatively, in this article, we use an approximation of the full distribution-based around its conditional mean and conditional variance, ignoring higher order cumulants. Thus, to evaluate this integral, the mean of M β←−α (x α t |z α t−1 ) can be obtained as follows: where . . . denotes the mean over the control signal. Now, the covariance of M β←−α (x α t |z α t−1 ) can be evaluated as follows: This means that the message to be passed from subsystem α to subsystem β can be approximated by the following Gaussian distribution: In our proposed framework, a subsystem exchanges messages with its neighbors to update its knowledge about the external variables it retains on the state of its neighbors. Thus, the probabilistic model that is evaluated in subsystem α as given by (21) will be passed to subsystem β to update its knowledge about its external variables, y β t . In particular, as shown in (8), node β estimates its external variables according to the following probabilistic model: By passing probabilistic messages about the information that subsystem α retains on the external variables of node β, the following probabilistic model becomes available to the external variables of subsystem β: Therefore, the information provided by the two probability density functions specified by (22) and (23) can be fused using Bayes' rule by multiplying the two together, i.e., considering the prediction model from node β as specified by (22) and the message-passed model from node α as specified by (23). The new pdf representing the fusion of the information from prediction and probabilistic message passing, is therefore given by (24) at the bottom of this page. Where in (24), the following definitions are used.
The two equations in (25) represent the message-passing update steps that are equivalent to the measurement update equations of the Kalman filter. This can be explicitly shown by noticing that the Kalman gain can be defined as follows: Using this definition of the Kalman equation in (25) yields the measurement update equations of the Kalman filter, V. EXPERIMENT The proposed theoretical decentralized framework is verified in this section on the small scale but difficult problem of a stochastic version of a coupled map nonlinear lattice with periodic boundary conditions [36] which was originally introduced in [37]. Although this system will have its own internal self-organized behavior, we will explore an explicit control problem where the criterion is that regardless of the starting configuration, control needs to move the system to the origin, regardless of any self-organizing tendency. The stochastic uncontrolled dynamic of the lattice is described by where i ∈ {1, 2, . . . L}, L is the length of the lattice, is the coupling strength, and periodic boundary conditions v i+L t = v i t . κ t is the Gaussian noise with covariance matrix 0.01I, and I is the identity matrix. The local map f (v, a) is taken to be the logistic map f (v) = av(1 − v) that satisfies the condition f (v * , a) = v * . This coupled map lattice exhibits chaotic characteristics in the regime 3.57 < a ≤ 4.0 and has a homogeneous steady state v * = 1 − 1/a. The linearized dynamics of this lattice about the homogeneous steady state is described by X t = AX t−1 + κ t , where is the L × L Jacobian matrix, X = v − v * , and η = (∂f (v * , a)/∂v)| v=v * . Note that this choice of local dynamics is not crucial to this article. It is used as an example of local nontrivial dynamical behavior that has a tendency to self-organize in the absence of control. This problem is difficult in that it is inherently nonlinear, and the noise induces fundamental global coupling and destabilizing perturbations. We start by considering a 10node lattice (L = 10) with a nonsymmetric division into four coupled subsystems (K = 4).
We will then scale this example network by exploring different sized networks with up to 1000 nodes.

A. 10-Oscillator Inhomogeneously Coupled Lattice
We simulate a coupled lattice of length L = 10 nonlinear oscillators, i.e., a lattice with ten oscillators. Two sets of experiments were conducted for comparison. The first set of experiments considers the decentralized control of the coupled map lattice according to the probabilistic message passing while the second set considers the globally centralized pinning control of the lattice. In these experiments, a = 3 and = 0.33 is simulated. Without loss of generality, the chosen network architecture is represented by the connectivity matrix (31), as shown at the bottom of the page.
The high-level control aim is to return the whole lattice state of ten nodes from its initial value X = X 0 to the origin (the fixed point position) or a state close to the origin.
In the decentralized control experiment, the control task is designated by four separated subsystems to be controlled by local knowledge, where each subsystem is responsible for controlling a different set of oscillators. Node α takes X t,1 = x α 1,t , X t,2 = x α 2;t , and X t,3 = x α 3;t as internal states, and X t,4 = y α 4,t , X t,5 = y α 5;t , X t,8 = y α 6,t , X t,9 = y α 7;t , and X t,10 = y α 8;t as external states. Hence, the system model of node α is described by, Node β takes X t,4 = x β 1,t and X t,5 = x β 2;t as internal states, and X t,1 = y β 1,t , X t,2 = y β 2;t , X t,3 = y β 3;t , X t,6 = y β 4;t , and X t,7 = y β 5;t as external states. Hence, the system model of node β is described by, where, Node γ takes X t,6 = x γ 1,t and X t,7 = x γ 2;t as internal states, and X t,4 = y γ 3,t , X t,5 = y γ 4;t , X t,8 = y γ 5;t , X t,9 = y γ 6;t , and X t,10 = y γ 7;t as external states. Hence, the system model of node γ is described by, Finally, node δ takes X t,8 = x δ 1,t , X t,9 = x δ 2;t and X t,10 = x δ 3;t as internal states, and X t,1 = y δ 4,t , X t,2 = y δ 5;t , X t,3 = y δ 6;t , X t,6 = y δ 7;t , and X t,7 = y δ 8;t as external states. Hence, the system model of node δ is described by, where, The unknown parameters of nodes α, β, γ , and δ are estimated using generalized linear models as discussed in Section III-D. After message-passing optimization, the evaluated dynamical evolution of the states of the lattice network and the resulting evaluated control signals are shown in Fig. 3(a) and (b), respectively. The figure shows that the decentralized controlled lattice is globally locked to the origin and that the designed probabilistic control and messagepassing approach has been effective in reconstructing the global desired state using only decentralized local knowledge.
We now consider a control mechanism which uses global knowledge and hence is centralized. This second experiment considers lattice control using probabilistic pinning control, where the length ten lattice is controlled using just two control signals that are placed adjacent at the lattice sides, thus yielding the following control dynamics: and κ t is Gaussian noise with covariance matrix 0.01I, I is the identity matrix, and u t = [u t,1 , u t,2 ] T is the vector of control inputs. For fair comparison, the parameters of the lattice that are defined in (36) are assumed to be unknown, thus estimated online using a generalized linear model. The resulting optimized states of the lattice network and the control signals are shown in Fig. 4(a) and (b), respectively. It can be observed that the centralized controller controlling all ten oscillators shows higher fluctuations in the transient and steady-state periods than the decentralized experiment.

B. Scaling Performance
In a decentralized approach to probabilistic control on networks of loosely connected subclusters, the expected network variance away from the desired control value should be independent of the total number of nodes in the system (above the individual subcluster size), and the magnitude of this variance would be set by the noise in the dynamics governing each individual subsystem. Also, we would expect that the computational time to converge using a sequential computer polling each subcluster in turn should only scale linearly with the number of subclusters, and not scale superlinearly as the total number of individual nodes increases at scale.
In this final experiment, we evaluate the variance away from a desired value and estimate the total computational time as the number of nodes is increased to a total network size of N = 10, 40, 100, 300, 600, 1000 nodes, respectively. For consistency with the previous experiment, each subcluster consists of 10-node coupled lattices, each divided into four loosely coupled subsystems, and each operating locally within its 10-node environment. The dynamics within a subcluster are governed by the same dynamical system as used in the previous experiment, but the parameters governing the dynamic behavior are randomized, thus allowing mixed chaotic/nonchaotic subcluster behavior.
Each node in each 10-node subgroup is randomly connected to a node in two neighboring subgroups with each node initially operating its own dynamical system.
For each subcluster, we evaluated the optimal probabilistic local controllers that minimize the Kullback-Leibler divergence and determine the optimal subsystem state at a given instant. The probabilistic messages are evaluated and passed between the subsystems to inform the local controllers about the state of their neighboring nodes which in turn allow the achievement of the global control objective of the considered network.
For each network size N = 10, 40, 100, 300, 600, 1000 decomposed into randomly connected blocks of 10-node 4-subcluster components, the computational time to converge and the resulting variance away from the desired value is recorded and displayed in Fig. 5.
In all of these experiments, the local controllers of subsystems in the network were able to synchronize their corresponding states after a few iterations. The average steady-state variances around the desired objective as well as the sequential computational time for different network sizes are shown in Fig. 5(a) and (b), respectively. Fig. 5(a) shows that the average network variance away from the desired state determined by the probabilistic controller is a constant, independent of the system size. The magnitude of this variance is set by the noise level in the stochastic dynamics of each subsystem. Fig. 5(b) shows that the sequential execution time scales linearly with the system size, and so when operating in a decentralized nonsequential mode, the time to convergence of the system is independent of the actual system size.
The figure confirms that the probabilistic message passing scales in a decentralized manner and as such has the capacity to scale to huge system sizes, limited only by the maximum size of a local subsystem.

VI. CONCLUSION
The hypothesis of this article was that a fully decentralized probabilistic framework to control connected subsystems using only locally available information could be devised to approximate globally optimal control. We have verified this hypothesis as follows.
1) The global system was taken as a set of partially coupled subsystems, where each subsystem had its own dynamics, the instantaneous state of which was estimated from a locally self-consistent process. 2) Each subsystem can measure its own state, but information on neighboring subsystems was treated as an external variable quantified only as a Gaussian probability distribution of evaluated mean and covariance. 3) Each subsystem was moderated by a local control signal. 4) The local optimal control signal was obtained from maximizing the KLD between an estimated and a desired probability distribution. 5) Estimating the relevant probability distributions was achieved by imposing an independence between subsystems allowing the full problem to be decomposed into a fully decentralized, self-consistent, locally estimable problem. 6) The probability density function of the system model needs to be evaluated, and in the absence of having an analytic solution, this was estimated online using a machine learning approach assuming conditionally dependent Gaussian distributions of unknown inputdependent mean and covariances to be learned. This had the benefit of detecting complex nonlinear relationships between local subsystem outputs and inputs, which is another novelty of this article. 7) Numerically comparing the proposed decentralized message-passing approach with a centralized (pinning) control approach on a small-scale nonlinear multicomponent 10-node partially coupled 4-subsystem example verified the hypothesis of local probabilistic control achieving global optimization. 8) Numerically demonstrating the approach on network graphs of size 10, 40, 100, 300, 600, 1000 nodes showed a convergent constant average and total network variance around the desired value, independent of the number of nodes, as would be expected in a correctly operating decentralized control algorithm. 9) Also the computational time on a sequentially operating computer scaled only linearly with the network size, thus reflecting that the actual real-time operation of such a network in decentralized subclusters would be constant, determined only by the size of each subcluster, not the total number of nodes. This demonstrates that the algorithm scales to very large networks bound only by the size of the largest subcluster. This article has followed an FPD approach and used input-dependent Gaussian distribution approximations for the unknown probability distributions, that are updated at each instant using a machine learning approach. This avoided the need to evaluate analytically intractable integrals numerically explicitly, though this was not a key focus of this article. Although the approximation was sufficient to support the article's hypothesis, this is one aspect that will benefit from more in depth evaluation in the future. The approach scales to very large systems since it is fully decentralized, and no aspect of the implementation requires any global system knowledge. This allows computational and memory resource to be decentralized to each subcluster. This article demonstrated the efficacy of allowing each subcluster to send a connection to other subclusters, with a connection signal determined from the decentralized message-passing algorithm.