
SECTION I

## INTRODUCTION

This paper is concerned with the modeling of networks involving an extremely large number of components. The conventional way to study large networks is computer modeling and simulation [1]. This approach involves representing the network in software and then applying a numerical simulation method to study how it behaves. Typically, each individual component is explicitly represented as a separate entity. As we confront larger and larger networks, the number of components that must be represented grows, which significantly lengthens the time it takes to write, manage, and run simulation programs. Simulating large networks typically requires expensive, highly sophisticated supercomputers with large parallel computing hardware and specialized software. It is not uncommon for a simulation run to take days or weeks, even on a large supercomputer; the larger the network, the longer it takes. The computational overhead associated with direct simulation thus severely limits the size and complexity of networks that can be studied in this fashion.

Our recent papers [2]–[5] address this problem by using continuum modeling to capture the global characteristics of large networks. In large networks, we are often more interested in the global characteristics of the entire network than in any particular individual component. Continuum models do away with the need to represent each individual component of a large network as a separate entity, and consider the behavior of the components on the scale of the aggregate rather than of the individual. Just as water can be treated as a continuous fluid instead of a large number of individual molecules, continuum modeling treats the large number of communicating components (or nodes) in a network collectively as a continuum. The continuum modeling strategies in [3]–[5] use partial differential equations (PDEs) to approximate large sensor or cellular networks modeled by a certain class of Markov chains. The PDE model represents the global characteristics of the network, while the individual characteristics of the components enter the model through the form and the parameters of the PDE.

PDEs are well suited to modeling continuum behavior. Although uncommon in network modeling, they are common in modeling many physical phenomena, including heat, sound, electromagnetism, and fluid flow. There are well-established mathematical tools for solving PDEs, such as the finite element method [6] and the finite difference method [7], incorporated into software packages such as MATLAB and COMSOL. We can use these tools to greatly reduce computation time. As a result, the effort to run the PDE models in a computer no longer suffers from the curse of sheer size. (In fact, as we will show, the larger the network, the more closely the PDE approximates it.) Continuum modeling thus provides a powerful way to deal with the number of components in large networks. This, in turn, makes it possible to carry out, with reasonable computational burden even for extremely large systems, network performance evaluation and prototyping, network design, systematic parameter studies, and optimization of network characteristics.

The work in this paper is motivated by the continuum modeling strategies in [3]–[5] mentioned above, and by the need for a rigorous description of the heuristic limiting process underlying the construction of their PDE models. We analyze the convergence of a class of Markov chains to their continuum limits, which are the solutions of certain PDEs. We consider a general Markov chain model in an abstract setting instead of that of any particular network model, for two reasons: first, our network modeling results involve a class of Markov chains modeling a variety of communication networks; second, Markov chain models akin to ours arise in several other contexts. For example, a recent paper [8] on human crowd modeling derives a limiting PDE in a fashion similar to our approach.

In the convergence analysis, we show that a sequence of Markov chains indexed by $N$, the number of components in the system that they model, converges in a certain sense to its continuum limit, which is the solution of a time-dependent PDE, as $N$ goes to $\infty$. The PDE solution describes the global spatio-temporal behavior of the model in the limit of large system size. We apply this abstract result to the modeling of a large wireless sensor network by approximating a particular global aspect of the network states (queue length) by a nonlinear convection-diffusion-reaction PDE. This network model includes the network example discussed in [3] as a special case.

### A. Related Literature

The modeling and analysis of stochastic systems such as networks is a large field of research, and many previous contributions share goals with the work in this paper.

In the field of direct numerical simulation, many efforts have been made to accelerate simulation. For example, parallel simulation techniques have been developed to exploit the computational power of multiprocessor and/or cluster platforms [9]–[12]; new mechanisms for executing simulations have been designed to improve the efficiency of event scheduling in event-driven simulations (see, e.g., [13], [14]); and fluid simulations, in contrast to traditional packet-level ones, have been used to simplify the network model by treating network traffic (not nodes) as continuous flows rather than discrete packets [15]–[18]. However, as the number of nodes grows extremely large, computer-based simulations involving individual nodes eventually become practically infeasible. In the remainder of this subsection, we review some existing results on the analysis of stochastic networks that do not depend on direct numerical simulation.

Our convergence analysis in this paper uses Kushner's ordinary differential equation (ODE) method [19]. This method essentially studies a “smoothing” limit as a certain “averaging” parameter goes to $\infty$, not a “large-system” limit as the number of components in the system goes to $\infty$. In contrast, the limiting process analyzed in this paper involves two steps: the first is similar to that in Kushner's ODE method, and the second is a “large-system” limit. (We provide more details about the two-step procedure in Section I-D.) In other words, while Kushner's method deals with a fixed state space, we treat a sequence of state spaces $\{\mathbb{R}^{N}\}$ indexed by increasing $N$, where $N$ is the number of components in the system.

Kushner's ODE method is closely related to the line of research called stochastic approximation, started by Robbins and Monro [20] and Kiefer and Wolfowitz [21] in the early 1950s, which studies stochastic processes similar to those addressed by Kushner's ODE method and has been widely used in many areas (see, e.g., [22], [23], for surveys). Among the many subsequent efforts, several ODE methods, including Kushner's, were first developed in the 1970s (see, e.g., [24], [25]) and extensively studied thereafter (see, e.g., [26]–[28]), often addressing problems outside the category of stochastic approximation (see, e.g., [19]).

The general subject of the approximation of Markov chains (or, equivalently, the convergence of sequences of Markov chains to certain limits) goes beyond the scope of ODE methods or stochastic approximation, and there are results on the convergence of different models to different limits. A huge class of Markov chains (discrete-time or continuous-time) that model various systems, phenomena, and abstract problems, and hence in general have very different forms from ours, have been shown to converge either to ODE solutions [29]–[31] (and, more generally, abstract Cauchy problems [32]) or to stochastic processes such as diffusion processes [19], [33]. These results use methods different from Kushner's, but share with it the principal idea of “averaging out” the randomness of the Markov chain. Their deeper connection lies in weak convergence theory [19], [33], [34] and the methods they have in common to prove such convergence: the operator semigroup convergence theorem [35], the martingale characterization method [36], and identification of the limit as the solution to a stochastic differential equation [19], [33]. The reader is referred to [19], [33] and the references therein for additional information on these methods.

Similar to Kushner's ODE method, all the convergence results discussed above differ from our approach in that they essentially study only the single-step “smoothing” limit as an “averaging” parameter goes to $\infty$, and do not have our second-step convergence to the “large-system” limit (PDE) as $N\to\infty$. There are systems in which the “averaging” parameter represents some “size” of the system (e.g., the population in the epidemic model, the metapopulations model, and the searching parasites problem [31], [32]). However, it is still the case that the convergence requires a fixed dimension of the state space of the Markov chain, as with Kushner's ODE convergence, and does not apply to the “large-system” limit in our second step. For example, in the epidemic model, the Markov chain represents the number of people in a population in two states, infected and uninfected, and the large-population limit is studied. This is a single-step limit, and the state space of the Markov chain always lies in $\mathbb{R}^{2}$. Notice that in these cases the Markov chains model the number or proportion of components in different states in the system, and, unlike in our model, the indexing or locations of the components are either unimportant or ignored. In contrast, in our case, the spatial index of nodes is addressed throughout.

In fact, a variety of other approximation methods for large systems, in general built on ideas different from the aforementioned ones, take a similar direction: they study the number or proportion of components in a certain state (or some related performance parameters), thus ignoring their order or the difference in their spatial locations. For example, the famous work of Gupta and Kumar [37], followed by many others (e.g., [38]–[41]), derives scaling laws of network performance parameters (e.g., throughput); and many efforts based on mean field theory [42]–[51] or on the theory of large deviations [52]–[56] study the convergence, with regard to the so-called empirical (or occupancy) measure or distribution, which essentially represents the proportion of components in certain states, to a deterministic function as the number of components grows large, treating the components as exchangeable in terms of their order or spatial indices. These approaches differ from our work at least in the sense that they study the statistical rather than the spatio-temporal characteristics of the system. As a result of treating the components without regard to their locations, when the limits obtained by these approaches are in fact differential equations, they are usually ODEs instead of PDEs. Note that the statistical parameters studied in these works correspond to deterministic quantities easily obtained from our deterministic limits, which directly approximate the state of the system. For example, the proportion of nodes with, say, empty queues in our network model can be calculated directly from the limiting PDE solution (in addition, their locations are directly observable); and the instantaneous throughput can be obtained by integrating the PDE solution at a certain time over the spatial domain.

Of course, there do exist numerous continuum models in a wide spectrum of areas such as physics, chemistry, ecology, economics, transportation, and sociology (e.g., [8], [57]–[63]), many of which use PDEs to formulate spatio-temporal phenomena and approximate quantities such as the probability density of particle velocity in thermodynamic systems, the concentration of reactants in chemical reactions, the population density in animal swarms, the wealth of a company in consumption-investment processes, the car density on highways, and the density of people in human crowds. All these works differ from the work presented here both in the properties of the systems being studied and in the analytic approaches. In addition, most of them study distributions of limiting processes that are random, while our limiting functions are themselves deterministic. We especially emphasize the difference between our results and those of the mathematical physics of hydrodynamics [64]–[69], because the latter have a similar style, deducing macroscopic behavior from microscopic interactions of individual particles, and in some special cases result in similar PDEs. However, they use an entirely different approach, which usually requires different assumptions on the systems, such as translation-invariant transition probabilities, conservation of the number of particles, and particular distributions of the initial state; and their limiting PDE is not a direct approximation of the system state, but the density of some associated probability measure.

There is a vast literature on the convergence of a large variety of network models different from ours to essentially two kinds of limits: the fluid limit (or functional law of large numbers approximation) [70]–[79] and the diffusion approximation (or functional central limit theorem approximation), under the so-called fluid and diffusion scalings, respectively, with the latter limit mostly studied in networks in heavy traffic [80]–[91]. (Some papers study both limits [92]–[94].) Unlike our work, this field of research focuses primarily on networks with a fixed number of nodes.

Our work is to be distinguished from approaches where the model is constructed as a continuum representation from the start. For example, many papers treat nodes as a continuum by considering only the average density of nodes [95]–[102], and others model network traffic as a continuum by capturing certain average characteristics of the data packet traffic, with the averaging being over possibly different time scales [103]–[105]. The latter shares a similar idea with the fluid simulations discussed at the beginning of this section.

### B. Markov Chain Model

We first describe our model in full generality. Consider $N$ points $V_{N}=\{v_{N}(1),\ldots, v_{N}(N)\}$ in a compact, convex Euclidean domain ${\cal D}$ representing a spatial region. We assume that these points form a uniform grid, though the model generalizes to nonuniform spacing of points under certain conditions (see Section IV for discussion). We refer to these $N$ points in ${\cal D}$ as grid points.

We consider a discrete-time Markov chain
$$X_{N,M}(k)=[X_{N,M}(k,1),\ldots,X_{N,M}(k,N)]^{\top}\in\mathbb{R}^{N}$$
(the superscript $\top$ denotes transpose) whose evolution is described by the stochastic difference equation
$$X_{N,M}(k+1)=X_{N,M}(k)+F_{N}(X_{N,M}(k)/M,U_{N}(k)).\tag{1}$$
Here, $X_{N,M}(k,n)$ is the real-valued state associated with the grid point $v_{N}(n)$ at time $k$, where $n=1,\ldots,N$ is a spatial index and $k=0,1,\ldots$ is a temporal index; $U_{N}(k)$ are i.i.d. random vectors that do not depend on the state $X_{N,M}(k)$; $M$ is an “averaging” parameter (explained later); and $F_{N}$ is a given function.

Treating $N$ and $M$ as indices that grow, equation (1) defines a doubly indexed family $X_{N,M}(k)$ of Markov chains. (We will later take $M$ to be a function of $N$ and treat this family as a sequence $X_{N}(k)$ in the single index $N$.) Below we give a concrete example of a system described by (1).
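To make the abstract recursion concrete, the following Python sketch simulates (1) for a toy choice of $F_{N}$. The names `simulate_chain` and `F_toy`, and the specific noise model (i.i.d. standard normal $U_{N}(k)$ with a nearest-neighbor drift), are illustrative assumptions, not part of the model above.

```python
import numpy as np

def simulate_chain(F, X0, M, num_steps, rng):
    """Simulate the recursion (1): X(k+1) = X(k) + F(X(k)/M, U(k)).

    U(k) is drawn here as an i.i.d. standard normal vector per step; any
    i.i.d. sequence independent of the state fits the model.
    """
    X = np.array(X0, dtype=float)
    trajectory = [X.copy()]
    for _ in range(num_steps):
        U = rng.standard_normal(X.shape)      # driving noise U_N(k)
        X = X + F(X / M, U)
        trajectory.append(X.copy())
    return np.array(trajectory)

def F_toy(x_norm, U):
    """Toy F_N: drift toward the average of the two neighbors, plus noise."""
    left = np.roll(x_norm, 1); left[0] = 0.0      # states beyond the boundary are zero
    right = np.roll(x_norm, -1); right[-1] = 0.0
    return 0.5 * (left + right) - x_norm + 0.1 * U

rng = np.random.default_rng(0)
traj = simulate_chain(F_toy, X0=np.ones(10), M=100, num_steps=50, rng=rng)
print(traj.shape)   # (51, 10): initial state plus 50 steps of a 10-component chain
```

The same driver function can simulate any model of the form (1) once $F_{N}$ and the law of $U_{N}(k)$ are supplied.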

### C. A Stochastic Network Model

In this subsection we demonstrate the various objects in the abstract Markov chain model analyzed in this paper on a prototypical example. We begin by describing a stochastic model of a wireless sensor network.

Consider a network of $N$ wireless sensor nodes uniformly placed over the domain ${\cal D}$. That is, the $N$ nodes are located on the grid points $V_{N}=\{v_{N}(1),\ldots, v_{N}(N)\}$ described above. We label the node at $v_{N}(n)$ by $n$, where $n=1,\ldots,N$. The sensor nodes generate, according to a probability distribution, data messages that need to be communicated to the destination nodes located on the boundary of the domain, which represent specialized devices that collect the sensor data. The sensor nodes also serve as relays for routing messages to the destination nodes. Each sensor node has the capacity to store messages in a queue, and is capable of either transmitting or receiving messages to or from its immediate neighbors. (Generalization to further ranges of transmission can be found in our paper [106].) At each time instant $k=0,1,\ldots$, each sensor node probabilistically decides to be a transmitter or receiver, but not both. This simplified rule of transmission allows for a relatively simple representation. We illustrate such a network over a two-dimensional domain in Fig. 1(a).

Fig. 1. (a) An illustration of a wireless sensor network over a two-dimensional domain. Destination nodes are located at the far edge. We show the possible path of a message originating from a node located in the left-front region. (b) An illustration of the collision protocol: reception at a node fails when one of its other neighbors transmits (regardless of the intended receiver). (c) An illustration of the time evolution of the queues in the one-dimensional network model.

In this network, communication between nodes is interference-limited because all nodes share the same wireless channel. We assume a simple collision protocol: a transmission from a transmitter to a neighboring receiver is successful if and only if none of the other neighbors of the receiver is a transmitter, as illustrated in Fig. 1(b). We assume that in a successful transmission, one message is transmitted from the transmitter to the receiver.

We assume that the probability that a node decides to be a transmitter is a function of its normalized queue length (normalized by an “averaging” parameter $M$). That is, at time $k$, node $n$ decides to be a transmitter with probability $W(n, X_{N,M}(k,n)/M)$, where $X_{N,M}(k,n)$ is the queue length of node $n$ at time $k$, and $W$ is a given function.

In this section, for the sake of explanation, we simplify the problem even further and consider a one-dimensional domain (a two-dimensional example will be given in Section II-E). Here, $N$ sensor nodes are equally spaced in an interval ${\cal D}\subset\mathbb{R}$ and labeled $n=1,\ldots,N$. The destination nodes are located on the boundary of ${\cal D}$ and labeled $n=0$ and $n=N+1$.

We assume that if node $n$ is a transmitter at a certain time instant, it randomly chooses to transmit one message to the right or the left immediate neighbor with probability $P_{r}(n)$ and $P_{l}(n)$, respectively, where $P_{r}(n)+P_{l}(n)\leq 1$. In contrast to strict equality, the inequality here allows for a more general stochastic model of transmission: after a sensor node randomly decides to transmit over the wireless channel, there is still a positive probability that the message is not transferred to its intended receiver (what might be called an “outage”).

The special destination nodes at the boundaries of the domain do not have queues; they simply receive any message transmitted to them and never themselves transmit anything. We illustrate the time evolution of the queues in the network in Fig. 1(c).

The queue lengths form a Markov chain network model given by (1), where
$$U_{N}(k)=[Q(k,1),\ldots,Q(k,N),\,T(k,1),\ldots,T(k,N),\,G(k,1),\ldots,G(k,N)]^{\top}$$
is a random vector comprising independent random variables: $Q(k,n)$ are uniform random variables on $[0,1]$ used to determine whether node $n$ is a transmitter; $T(k,n)$ are ternary random variables used to determine the direction in which a message is passed, taking values $R$, $L$, and $S$ (representing transmission to the right, to the left, and no transfer, respectively) with probabilities $P_{r}(n)$, $P_{l}(n)$, and $1-(P_{r}(n)+P_{l}(n))$; and $G(k,n)$ is the number of messages generated at node $n$ at time $k$, modeled by independent Poisson random variables with mean $g(n)$.

For a generic $x=[x_{1},\ldots,x_{N}]^{\top}\in\mathbb{R}^{N}$, the $n$th component of $F_{N}(x,U_{N}(k))$, where $n=1,\ldots,N$, is
$$\begin{cases}
1+G(k,n), & \text{if } Q(k,n-1)<W(n-1,x_{n-1}),\ T(k,n-1)=R,\\
& \quad Q(k,n)>W(n,x_{n}),\ Q(k,n+1)>W(n+1,x_{n+1});\\
& \text{or } Q(k,n+1)<W(n+1,x_{n+1}),\ T(k,n+1)=L,\\
& \quad Q(k,n)>W(n,x_{n}),\ Q(k,n-1)>W(n-1,x_{n-1});\\[4pt]
-1+G(k,n), & \text{if } Q(k,n)<W(n,x_{n}),\ T(k,n)=L,\\
& \quad Q(k,n-1)>W(n-1,x_{n-1}),\ Q(k,n-2)>W(n-2,x_{n-2});\\
& \text{or } Q(k,n)<W(n,x_{n}),\ T(k,n)=R,\\
& \quad Q(k,n+1)>W(n+1,x_{n+1}),\ Q(k,n+2)>W(n+2,x_{n+2});\\[4pt]
G(k,n), & \text{otherwise,}
\end{cases}\tag{2}$$
where $x_{n}$ with $n\leq 0$ or $n\geq N+1$ are defined to be zero, and $W$ is the function specifying the probability that a node decides to be a transmitter, as defined earlier. The three possible values of $F_{N}$ correspond to the three events that at time $k$, node $n$ successfully receives one message, successfully transmits one message, or does neither. The inequalities describe the conditions under which these events occur: for example, $Q(k,n-1)<W(n-1,x_{n-1})$ corresponds to the choice of node $n-1$ to be a transmitter at time $k$, $T(k,n-1)=R$ to its choice to transmit to the right, $Q(k,n)>W(n,x_{n})$ to the choice of node $n$ to be a receiver at time $k$, and so on.

We simplify the situation further by assuming that $W(n, y)=\min (1,y)$. (We use this assumption throughout the paper.) With the collision protocol described earlier, this provides the analog of a network with backpressure routing [107].
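As an illustration of the transition rule (2) under this choice of $W$, the following Python sketch simulates one step of the one-dimensional network. The function `network_step` is our own construction; for brevity it takes $P_{r}$, $P_{l}$, and the generation rate $g$ to be node-independent, whereas the model allows them to vary with $n$.

```python
import numpy as np

def network_step(X, M, P_r, P_l, g, rng):
    """One step of the 1-D network chain with W(n, y) = min(1, y).

    X holds the queue lengths of nodes 1..N; padded entries stand for the
    destination nodes (and beyond), which never transmit since their W is 0.
    """
    N = len(X)
    pad = np.concatenate(([0.0, 0.0], X / M, [0.0, 0.0]))   # padded indices 0..N+3
    W = np.minimum(1.0, pad)                                # transmit probabilities
    tx = rng.uniform(size=N + 4) < W                        # Q(k, n) < W(n, X_n/M)
    # Direction choices: 'R', 'L', or 'S' (no transfer, i.e., an outage).
    dirs = np.array([rng.choice(['R', 'L', 'S'], p=[P_r, P_l, 1.0 - P_r - P_l])
                     for _ in range(N + 4)])
    recv = np.zeros(N, dtype=int)
    sent = np.zeros(N, dtype=int)
    for i in range(2, N + 2):        # i is the padded index of node i - 1
        j = i - 2                    # 0-based position in X
        # Reception from the left neighbor: it transmits right while this node
        # and its other neighbor stay silent (the collision protocol) ...
        if tx[i - 1] and dirs[i - 1] == 'R' and not tx[i] and not tx[i + 1]:
            recv[j] = 1
        # ... or, symmetrically, from the right neighbor.
        elif tx[i + 1] and dirs[i + 1] == 'L' and not tx[i] and not tx[i - 1]:
            recv[j] = 1
        # Transmission to the left: receiver i-1 and its other neighbor i-2
        # must both be silent ...
        if tx[i] and dirs[i] == 'L' and not tx[i - 1] and not tx[i - 2]:
            sent[j] = 1
        # ... or to the right, symmetrically.
        elif tx[i] and dirs[i] == 'R' and not tx[i + 1] and not tx[i + 2]:
            sent[j] = 1
    G = rng.poisson(g, size=N)       # messages generated at this step
    return X + recv - sent + G
```

Iterating `network_step` produces a sample path of the queue-length chain; the per-step cost is $O(N)$.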

After presenting the main results of the paper, we will revisit this network model in Section II-E and present a PDE that approximates its global behavior as an application of the main results.

### D. Overview of Results in This Paper

In this subsection, we provide a brief description of the main results in Section II.

The Markov chain model (1) is related to a deterministic difference equation. We set
$$f_{N}(x)=E\,F_{N}(x,U_{N}(k)),\quad x\in\mathbb{R}^{N},\tag{3}$$
and define $x_{N,M}(k)=[x_{N,M}(k,1),\ldots,x_{N,M}(k,N)]^{\top}\in\mathbb{R}^{N}$ by
$$\begin{aligned}
x_{N,M}(k+1)&=x_{N,M}(k)+\frac{1}{M}f_{N}(x_{N,M}(k)),\\
x_{N,M}(0)&=\frac{X_{N,M}(0)}{M}\ \text{a.s.}
\end{aligned}\tag{4}$$
(“a.s.” is short for “almost surely”).

#### Example 1

For the one-dimensional Markov chain network model introduced in Section I-C, it follows from (2) (with the particular choice $W(n,y)=\min(1,y)$) that for $x=[x_{1},\ldots,x_{N}]^{\top}\in[0,1]^{N}$, the $n$th component of $f_{N}(x)$ in the corresponding deterministic difference equation (4), where $n=1,\ldots,N$, is (after some tedious algebra, as described in [3])
$$\begin{aligned}
&(1-x_{n})\left[P_{r}(n-1)\,x_{n-1}(1-x_{n+1})+P_{l}(n+1)\,x_{n+1}(1-x_{n-1})\right]\\
&\quad-x_{n}\left[P_{r}(n)(1-x_{n+1})(1-x_{n+2})+P_{l}(n)(1-x_{n-1})(1-x_{n-2})\right]+g(n),
\end{aligned}\tag{5}$$
where $x_{n}$ with $n\leq 0$ or $n\geq N+1$ are defined to be zero.
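The drift (5) translates directly into code. The following sketch evaluates $f_{N}$ and iterates the deterministic recursion (4); again we take $P_{r}$, $P_{l}$, and $g$ constant across nodes for brevity, whereas (5) allows node-dependent values.

```python
import numpy as np

def f_N(x, P_r, P_l, g):
    """The drift (5): expected one-step change of the normalized queue lengths.

    x is the normalized state in [0,1]^N; entries with index <= 0 or >= N+1
    are treated as zero, matching the boundary convention of (5).
    """
    xp = np.concatenate(([0.0, 0.0], x, [0.0, 0.0]))      # pad two zeros each side
    n = np.arange(2, len(x) + 2)                           # padded indices of nodes
    recv = (1 - xp[n]) * (P_r * xp[n - 1] * (1 - xp[n + 1])
                          + P_l * xp[n + 1] * (1 - xp[n - 1]))
    send = xp[n] * (P_r * (1 - xp[n + 1]) * (1 - xp[n + 2])
                    + P_l * (1 - xp[n - 1]) * (1 - xp[n - 2]))
    return recv - send + g

def iterate(x0, M, num_steps, P_r, P_l, g):
    """Iterate the deterministic recursion (4): x(k+1) = x(k) + f_N(x(k)) / M."""
    x = np.array(x0, dtype=float)
    for _ in range(num_steps):
        x = x + f_N(x, P_r, P_l, g) / M
    return x
```

With empty queues and no message generation, the drift vanishes, as (5) predicts.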

We analyze the convergence of the Markov chain to the solution of a PDE using a two-step procedure. The first step depends heavily on the relation between $X_{N,M}(k)$ and $x_{N,M}(k)$. We show that for each $N$, as $M\to\infty$, the difference between $X_{N,M}(k)/M$ and $x_{N,M}(k)$ vanishes, by proving that both converge in a certain sense to the solution of the same ODE. The basic idea is that as the “fluctuation size” of the system decreases and its “fluctuation rate” increases, the stochastic system converges to a deterministic “small-fast-fluctuation” limit, characterized as the solution of a particular ODE. In our case, the smallness of the fluctuation size and the largeness of the fluctuation rate are quantified by the “averaging” parameter $M$. We use a weak convergence theorem of Kushner [19] to prove this convergence.
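This first-step convergence can be observed empirically even in a one-component ($N=1$) toy example. In the sketch below, `run` is a hypothetical helper that simulates a single queue with arrival probability $a$ and departure probability $\min(1,X/M)$, alongside the deterministic recursion (4) with $f(x)=a-\min(1,x)$; the maximal gap between $X(k)/M$ and $x(k)$ typically shrinks as $M$ grows.

```python
import numpy as np

def run(M, K, a=0.3):
    """Compare a scalar chain X(k+1) = X(k) + F(X(k)/M, U(k)) with (4).

    Toy F: one arrival w.p. a, one departure w.p. min(1, X/M), so that
    f(x) = E F(x, U) = a - min(1, x). Returns the maximal gap |X(k)/M - x(k)|.
    """
    rng = np.random.default_rng(0)
    X, x, gap = 0.0, 0.0, 0.0
    for _ in range(K):
        U = rng.uniform(size=2)
        X = X + (U[0] < a) - (U[1] < min(1.0, X / M))
        x = x + (a - min(1.0, x)) / M
        gap = max(gap, abs(X / M - x))
    return gap

# Run K = 50*M steps (a fixed horizon in the rescaled time k/M); the gap
# typically shrinks as the averaging parameter M grows.
print([round(run(M, 50 * M), 3) for M in (10, 100, 1000)])
```

This is only an empirical illustration of the limit, of course, not a substitute for the weak convergence argument.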

In the second step, we treat $M$ as a function of $N$, written $M_{N}$ (thereby treating $X_{N,M_{N}}(k)$ and $x_{N,M_{N}}(k)$ as sequences of the single index $N$, written $X_{N}(k)$ and $x_{N}(k)$, respectively), and show that for any sequence $\{M_{N}\}$, as $N\to\infty$, $x_{N}(k)$ converges to the solution of a certain PDE (and we show how to construct the PDE). This is essentially a convergence analysis of the approximation error between $x_{N}(k)$ and the PDE solution. We stress that this differs from the numerical analysis of classical finite difference schemes (see, e.g., [7], [108], [109]), because our difference equation (4), which originates from particular system models, differs from those designed specifically for numerically solving differential equations. The difficulty in our convergence analysis arises both from the different form of (4) and from the fact that it is in general nonlinear. We provide not only sufficient conditions for the convergence, but also a practical criterion for verifying conditions that are otherwise difficult to check.

Finally, based on these two steps, we show that as $N$ and $M_{N}$ go to $\infty$ in a dependent way, the continuous-time-space extension (explained later) of the normalized Markov chain $X_{N}(k)/M_{N}$ converges to the PDE solution. We also characterize the rate of convergence. We note that special caution is needed for specifying the details of this dependence between the two indices $N$ and $M$ of the doubly indexed family $X_{N,M}(k)$ of Markov chains in the limiting process.

### E. Outline of the Paper

The remainder of the paper is organized as follows. In Section II, we present the main theoretical results and apply the results to the wireless sensor network introduced above, and present some numerical experiments. In Section III, we present the proofs of the main results. Finally, we conclude the paper and discuss future work in Section IV.

SECTION II

## MAIN RESULTS AND APPLICATIONS

### A. Construction of the Limiting PDE

We begin with the construction of the PDE whose solution describes the limiting behavior of the abstract Markov chain model.

For each $N$ and the grid points $V_{N}=\{v_{N}(1),\ldots,v_{N}(N)\}\subset{\cal D}$ as introduced in Section I-B, we denote the distance between any two neighboring grid points by $ds_{N}$. For any continuous function $w:{\cal D}\to\mathbb{R}$, let $y_{N}$ be the vector in $\mathbb{R}^{N}$ composed of the values of $w$ at the grid points $v_{N}(n)$; i.e., $y_{N}=[w(v_{N}(1)),\ldots,w(v_{N}(N))]^{\top}$. Given a point $s\in{\cal D}$, we let $\{s_{N}\}\subset{\cal D}$ be any sequence of grid points $s_{N}\in V_{N}$ such that as $N\to\infty$, $s_{N}\to s$. Let $f_{N}(y_{N},s_{N})$ be the component of the vector $f_{N}(y_{N})$ corresponding to the location $s_{N}$; i.e., if $s_{N}=v_{N}(n)\in V_{N}$, then $f_{N}(y_{N},s_{N})$ is the $n$th component of $f_{N}(y_{N})$.

In order to obtain a limiting PDE, we have to make certain technical assumptions on the asymptotic behavior of the sequence of functions $\{f_{N}\}$ that ensure that $f_{N}(y_{N},s_{N})$ is asymptotically close to an expression that looks like the right-hand side of a time-dependent PDE. Such conditions are familiar in the context of PDE limits of Brownian motion. Checking these conditions often amounts to a simple algebraic exercise. We provide a concrete example (the network model) in Section II-E where $f_{N}$ satisfies these assumptions.

We assume that there exist sequences $\{\delta_{N}\}$, $\{\beta_{N}\}$, $\{\gamma_{N}\}$, and $\{\rho_{N}\}$, functions $f$ and $h$, and a constant $c<\infty$, such that as $N\to\infty$, $\delta_{N}\to 0$, $\delta_{N}/\beta_{N}\to 0$, $\gamma_{N}\to 0$, $\rho_{N}\to 0$, and:

• Given $s$ in the interior of ${\cal D}$, there exists a sequence of functions $\phi_{N}:{\cal D}\to\mathbb{R}$ such that
$$f_{N}(y_{N},s_{N})/\delta_{N}=f(s_{N},w(s_{N}),\nabla w(s_{N}),\nabla^{2}w(s_{N}))+\phi_{N}(s_{N})\tag{6}$$
for any sequence of grid points $s_{N}\to s$, and, for $N$ sufficiently large, $\vert\phi_{N}(s_{N})\vert\leq c\gamma_{N}$; and
• Given $s$ on the boundary of ${\cal D}$, there exists a sequence of functions $\varphi_{N}:{\cal D}\to\mathbb{R}$ such that
$$f_{N}(y_{N},s_{N})/\beta_{N}=h(s_{N},w(s_{N}),\nabla w(s_{N}),\nabla^{2}w(s_{N}))+\varphi_{N}(s_{N})\tag{7}$$
for any sequence of grid points $s_{N}\to s$, and, for $N$ sufficiently large, $\vert\varphi_{N}(s_{N})\vert\leq c\rho_{N}$.

Here, $\nabla^{i}w$ denotes the collection of all $i$th-order derivatives of $w$, for $i=1,2$.

Fix $T>0$ for the rest of this section. Assume that there exists a unique function $z:[0,T]\times{\cal D}\to\mathbb{R}$ that solves the limiting PDE
$$\dot{z}(t,s)=f(s,z(t,s),\nabla z(t,s),\nabla^{2}z(t,s)),\tag{8}$$
with boundary condition
$$h(s,z(t,s),\nabla z(t,s),\nabla^{2}z(t,s))=0\tag{9}$$
and initial condition $z(0,s)=z_{0}(s)$, where $\dot{z}$ denotes the partial derivative of $z$ with respect to $t$ and $\nabla$ acts on the spatial variable $s$.
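Once $f$ and $h$ are identified, the limiting PDE can be solved by standard numerical methods. As a simple illustration (not the network PDE of this paper, which also has convection and reaction terms), the sketch below solves the pure-diffusion instance $f=D\,\nabla^{2}z$ with Dirichlet boundary operator $h(s,z)=z$ by an explicit finite-difference scheme; the name `solve_pde` and all parameter choices are ours.

```python
import numpy as np

def solve_pde(z0, D, T, ds, dt):
    """Explicit finite-difference scheme for the pure-diffusion instance of (8):
    dz/dt = D * d2z/ds2 on a 1-D interval, with Dirichlet boundary z = 0
    (i.e., the boundary operator h(s, z) = z in (9)).
    """
    assert D * dt / ds**2 <= 0.5, "stability (CFL) condition for the explicit scheme"
    z = np.array(z0, dtype=float)
    for _ in range(round(T / dt)):
        lap = (np.roll(z, 1) - 2 * z + np.roll(z, -1)) / ds**2   # discrete Laplacian
        z = z + dt * D * lap
        z[0] = z[-1] = 0.0                                       # enforce z = 0 on boundary
    return z

s = np.linspace(0.0, 1.0, 51)
z0 = np.sin(np.pi * s)       # eigenfunction: decays by the factor exp(-pi^2 * D * T)
zT = solve_pde(z0, D=0.1, T=0.5, ds=s[1] - s[0], dt=1e-3)
```

For this initial condition the exact solution decays by the factor $e^{-\pi^{2}DT}$, which the scheme reproduces closely; note that the grid here is a solver choice, independent of the network size $N$.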

Recall that $x_{N,M}(k)$ is defined by (4). Suppose that we associate the discrete time $k$ with points on the real line spaced apart by a distance proportional to $\delta_{N}$. Then, the technical assumptions (6) and (7) imply that $x_{N,M}(k)$ is, in a certain sense, close to the solution of the limiting PDE (8) with boundary condition (9). Below we develop this argument rigorously.

Establishing existence and uniqueness for the resulting nonlinear models is, in general, a difficult problem in the theoretical analysis of PDEs, and the techniques depend heavily on the particular form of $f$. Therefore, as is common in numerical analysis, we assume that existence and uniqueness have been established. Later, we apply the general theory to the modeling of networks with particular characteristics; the resulting limiting PDE is a nonlinear convection-diffusion-reaction problem. Existence and uniqueness for such problems for “small” data and short times can be established under general conditions. Key ingredients are coercivity, which holds as long as $z$ is bounded away from 1, and diffusion dominance, which holds as long as $z$ is bounded above.

### B. Continuous Time-Space Extension of the Markov Chain

Next we define the continuous time-space extension of the Markov chain $X_{N,M}(k)$.

For each $N$ and $M$, define
$$dt_{N,M}=\frac{\delta_{N}}{M},\quad t_{N,M}(k)=k\,dt_{N,M},\quad K_{N,M}=\left\lfloor\frac{T}{dt_{N,M}}\right\rfloor,\quad\text{and}\quad\tilde{T}_{N}=\frac{T}{\delta_{N}}.\tag{10}$$

First, we construct the continuous-time extension $X^{(o)}_{N,M}(\tilde{t})$ of $X_{N,M}(k)$ as the piecewise-constant time interpolant with interval length $1/M$, normalized by $M$:
$$X^{(o)}_{N,M}(\tilde{t})=X_{N,M}(\lfloor M\tilde{t}\rfloor)/M,\quad\tilde{t}\in[0,\tilde{T}_{N}].\tag{11}$$
Similarly, define the continuous-time extension $x^{(o)}_{N,M}(\tilde{t})$ of $x_{N,M}(k)$ by
$$x^{(o)}_{N,M}(\tilde{t})=x_{N,M}(\lfloor M\tilde{t}\rfloor),\quad\tilde{t}\in[0,\tilde{T}_{N}].\tag{12}$$

Let $X^{(p)}_{N,M}(t,s)$, where $(t,s)\in [0,T]\times{\cal D}$, be the continuous-space extension of $X^{(o)}_{N,M}({\mathtilde{t}})$ (with ${\mathtilde{t}}\in [0,{\mathtilde{T}}_{N}]$) by piecewise-constant space extensions on ${\cal D}$ and with time scaled by $\delta_{N}$ so that the time-interval length is $\delta_{N}/M:=dt_{N,M}$. By piecewise-constant space extension of $X^{(o)}_{N,M}$, we mean the piecewise-constant function on ${\cal D}$ whose value at each point in ${\cal D}$ is the value of the component of the vector $X^{(o)}_{N,M}$ corresponding to the grid point that is “closest to the left” (taken one component at a time). Then $X^{(p)}_{N,M}(t,s)$ is the continuous time-space extension of $X_{N,M}(k)$, and for each $t$, $X^{(p)}_{N,M}(t,\cdot)$ is a real-valued function defined on ${\cal D}$. We illustrate this in Fig. 2.

Fig. 2. An illustration of $X_{N,M}(k)$ and $X^{(p)}_{N,M}(t,s)$ in one dimension, represented by solid dots and dashed-line rectangles, respectively.

The function $X_{N,M}^{(p)}(t,s)$ with $(t,s)\in [0,T]\times{\cal D}$ is in the space $D^{\cal D}[0,T]$ of functions from $[0,T]\times{\cal D}$ to ${\BBR}$ that are càdlàg with respect to the time component, i.e., right-continuous at each $t\in [0,T)$ with left-hand limits at each $t\in (0,T]$. Define the norm $\Vert\cdot\Vert^{(p)}$ on $D^{\cal D}[0,T]$ by, for $x\in D^{\cal D}[0,T]$, $$\Vert x\Vert^{(p)}=\sup_{t\in [0,T]}\int_{\cal D}\vert x(t,s)\vert\,ds.\eqno{\hbox{(13)}}$$
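For a function that is piecewise constant in both time and space, the integral in (13) reduces to a sum and the supremum to a maximum, so the norm is easy to evaluate on grid data; a minimal sketch with an illustrative array:

```python
import numpy as np

def norm_p(x, ds):
    """Evaluate (13) on grid data: for a function piecewise constant in
    space with cell width ds, the spatial integral is exactly
    ds * sum |x|, and the sup over t becomes a max over time steps."""
    return np.max(ds * np.sum(np.abs(x), axis=1))

# illustrative samples: 2 time steps, 3 spatial cells of width ds = 0.1
x = np.array([[1.0, -2.0, 0.5],
              [0.5,  0.5, 0.5]])
print(norm_p(x, ds=0.1))  # max(0.35, 0.15) = 0.35
```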

### C. Main Results for Continuum Limit of the Abstract Markov Chain Model

In this subsection, we present the main theorem, Theorem 1, which states that under some conditions, the continuous-time-space extension $X^{(p)}_{N,M}$ of the Markov chain $X_{N,M}(k)$ converges to the solution $z$ of the limiting PDE (8) in the norm defined by (13), as $N$ and $M$ go to $\infty$ in a dependent way. By this we mean that we set $M$ to be a function of $N$, written $M_{N}$, such that $M_{N}\to\infty$ as $N\to\infty$. Then we can treat $X_{N,M_{N}}(k)$, $x_{N,M_{N}}(k)$, $X^{(p)}_{N,M_{N}}$, $dt_{N,M_{N}}$, $t_{N,M_{N}}$, and $K_{N,M_{N}}$ all as sequences of the single index $N$, written $X_{N}(k)$, $x_{N}(k)$, $X^{(p)}_{N}$, $dt_{N}$, $t_{N}$, and $K_{N}$ respectively. We apply such changes of notation throughout the rest of the paper whenever $M$ is treated as a function of $N$.

Define $z_{N}(k,n)=z(t_{N}(k),v_{N}(n))$ and $z_{N}(k)=[z_{N}(k,1),\ldots,z_{N}(k,N)]^{\top}\in{\BBR}^{N}$. Define the truncation error $$u_{N}(k,n)={{f_{N}(z_{N}(k),n)}\over{\delta_{N}}}-{{z_{N}(k+1,n)-z_{N}(k,n)}\over{dt_{N}}},\eqno{\hbox{(14)}}$$ and $u_{N}(k)=[u_{N}(k,1),\ldots,u_{N}(k,N)]^{\top}\in{\BBR}^{N}$. Define $$\varepsilon_{N}(k,n)=x_{N}(k,n)-z_{N}(k,n),\eqno{\hbox{(15)}}$$ and $\varepsilon_{N}(k)=[\varepsilon_{N}(k,1),\ldots,\varepsilon_{N}(k,N)]^{\top}\in{\BBR}^{N}$. By (4), (10), (14), and (15), we have that $$\eqalignno{\varepsilon_{N}(k+1)&=\varepsilon_{N}(k)+{{1}\over{M_{N}}}(f_{N}(x_{N}(k))-f_{N}(z_{N}(k)))+dt_{N}u_{N}(k)\cr&=\varepsilon_{N}(k)+{{1}\over{M_{N}}}(f_{N}(z_{N}(k)+\varepsilon_{N}(k))-f_{N}(z_{N}(k)))+dt_{N}u_{N}(k).&{\hbox{(16)}}}$$
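The following sketch evaluates the truncation error (14) for a toy choice, not the network model of the paper: $f_{N}$ is taken to be the discrete Laplacian, so that $f_{N}/\delta_{N}$ approximates $z_{ss}$, and $z$ is an exact heat-equation solution, for which $u_{N}$ is small, of order $ds_{N}^{2}$.

```python
import numpy as np

# Truncation error (14) for a toy choice (not the network model of the
# paper): f_N is the discrete Laplacian, so f_N / delta_N approximates
# z_ss, and z(t,s) = exp(-pi^2 t) sin(pi s) solves z_t = z_ss exactly.
N, M = 50, 1000
ds = 1.0 / (N + 1)
delta = ds**2            # delta_N = ds_N^2, as in Section II-E
dt = delta / M           # dt_N = delta_N / M, as in (10)
s = np.linspace(ds, 1.0 - ds, N)
z = lambda t: np.exp(-np.pi**2 * t) * np.sin(np.pi * s)

def f_N(y):
    yp = np.concatenate(([0.0], y, [0.0]))    # zero boundary values
    return yp[2:] - 2.0 * yp[1:-1] + yp[:-2]  # second difference (times ds^2)

k = 0
u = f_N(z(k * dt)) / delta - (z((k + 1) * dt) - z(k * dt)) / dt
print(np.max(np.abs(u)))  # small: both quotients approximate z_ss = z_t
```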

Let $\varepsilon_{N}=[\varepsilon_{N}(1)^{\top},\ldots,\varepsilon_{N}(K_{N})^{\top}]^{\top}$ and $u_{N}=[u_{N}(0)^{\top},\ldots,u_{N}(K_{N}-1)^{\top}]^{\top}$ denote vectors in the $(K_{N}N)$-dimensional vector space ${\BBR}^{K_{N}N}$. Assume that $$\varepsilon_{N}(0)=0.\eqno{\hbox{(17)}}$$ Then by (16), for fixed $z$, there exists a function $H_{N}:{\BBR}^{K_{N}N}\to{\BBR}^{K_{N}N}$ such that $$\varepsilon_{N}=H_{N}(u_{N}).\eqno{\hbox{(18)}}$$

Define the vector norm $\Vert\cdot\Vert^{(N)}$ on ${\BBR}^{K_{N}N}$ such that for $x=[x(1)^{\top},\ldots,x(K_{N})^{\top}]^{\top}\in{\BBR}^{K_{N}N}$, where $x(k)=[x(k,1),\ldots,x(k,N)]^{\top}\in{\BBR}^{N}$, $$\Vert x\Vert^{(N)}=ds_{N}\max_{k=1,\ldots,K_{N}}\sum_{n=1}^{N}\vert x(k,n)\vert.\eqno{\hbox{(19)}}$$ Define $$\mu_{N}=\lim_{\alpha\to 0}\sup_{\Vert u\Vert^{(N)}\leq\alpha}{{\Vert H_{N}(u)\Vert^{(N)}}\over{\Vert u\Vert^{(N)}}}.\eqno{\hbox{(20)}}$$

We now present the main theorem.

#### Theorem 1

(Main Theorem) Assume that:

1. there exist a sequence $\{\xi_{N}\}$ and $c_{1}<\infty$ such that as $N\to\infty$, $\xi_{N}\to 0$, and for $N$ sufficiently large, $\Vert u_{N}\Vert^{(N)}<c_{1}\xi_{N}$;
2. for each $N$, there exists an identically distributed sequence $\{\lambda_{N}(k)\}$ of integrable random variables such that for each $k$ and $x$, $\vert F_{N}(x,U_{N}(k))\vert\leq\lambda_{N}(k)$ a.s.;
3. for each $N$, the function $F_{N}(x,U_{N}(k))$ is continuous in $x$ a.s.;
4. for each $N$, the ODE ${\mathdot{y}}=f_{N}(y)$ has a unique solution on $[0,{\mathtilde T}_{N}]$ for any initial condition $y(0)$, where ${\mathtilde T}_{N}$ is as defined by (10);
5. $z$ is Lipschitz continuous on $[0,T]\times{\cal D}$;
6. for each $N$, (17) holds; and
7. the sequence $\{\mu_{N}\}$ is bounded.

Then a.s., there exist $c_{0}<\infty$, $N_{0}$, and ${\mathhat M}_{1}<{\mathhat M}_{2}<{\mathhat M}_{3},\ldots$ such that for each $N\geq N_{0}$ and each $M_{N}\geq{\mathhat M}_{N}$, $$\Vert X^{(p)}_{N}-z\Vert^{(p)}<c_{0}\max\{\xi_{N},ds_{N}\}.$$

This theorem states that as $N$ and $M_{N}$ go to $\infty$ in a dependent way, $X^{(p)}_{N}$ converges to $z$ in $\Vert\cdot\Vert^{(p)}$ a.s. We prove this in Section III-C.

### D. Sufficient Conditions on $f_{N}$ for the Boundedness of $\{\mu_{N}\}$

The key assumption of Theorem 1 is that the sequence $\{\mu_{N}\}$ is bounded (Assumption T1.7). The following theorem gives specific sufficient conditions on $f_{N}$ that guarantee that $\{\mu_{N}\}$ is bounded. This provides a practical criterion for verifying this key assumption, which is otherwise difficult to check directly.

Consider a fixed $z$. We assume that $f_{N}\in{\cal C}^{1}$ and denote the Jacobian matrix of $f_{N}$ at $x$ by $Df_{N}(x)$. Define, for each $N$ and for $k=0,\ldots,K_{N}-1$, $$A_{N}(k)=I_{N}+{{1}\over{M_{N}}}Df_{N}(z_{N}(k)),\eqno{\hbox{(21)}}$$ where $I_{N}$ is the identity matrix in ${\BBR}^{N\times N}$.

We denote the 1-norm on ${\BBR}^{N}$ and its induced norm on ${\BBR}^{N\times N}$ both by $\Vert\cdot\Vert_{1}^{(N)}$; i.e., for a vector $x=[x_{1},\ldots,x_{N}]^{\top}\in{\BBR}^{N}$, $$\Vert x\Vert_{1}^{(N)}=\sum_{n=1}^{N}\vert x_{n}\vert,\eqno{\hbox{(22)}}$$ and for a matrix $A\in{\BBR}^{N\times N}$ with $a_{ij}$ being its $(i,j)$th component, $$\Vert A\Vert_{1}^{(N)}=\max_{j=1,\ldots,N}\sum_{i=1}^{N}\vert a_{ij}\vert.\eqno{\hbox{(23)}}$$
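As a quick check, the induced 1-norm (23) is the maximum absolute column sum, which agrees with NumPy's built-in induced 1-norm; the matrix below is an arbitrary example.

```python
import numpy as np

def one_norm(A):
    """Induced matrix 1-norm (23): maximum absolute column sum."""
    return np.max(np.sum(np.abs(A), axis=0))

A = np.array([[1.0, -2.0],
              [3.0,  0.5]])   # arbitrary example matrix
print(one_norm(A))            # column sums 4.0 and 2.5, so the norm is 4.0
print(np.linalg.norm(A, 1))   # NumPy's induced 1-norm agrees
```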

We then have

#### Theorem 2

(Sufficient condition for key assumption) Assume that:

1. for each $N$, (17) holds;
2. for each $N$, $f_{N}\in{\cal C}^{1}$; and
3. there exists $c<\infty$ such that for $N$ sufficiently large and for $k=1,\ldots,K_{N}-1$, $\Vert A_{N}(k)\Vert_{1}^{(N)}\leq 1+c\,dt_{N}$, where $\Vert\cdot\Vert_{1}^{(N)}$ is defined by (23).

Then $\{\mu_{N}\}$ is bounded.

We prove this in Section III-D.
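To illustrate how Assumption T2.3 might be checked numerically, the sketch below uses a toy $f_{N}$ (a discrete Laplacian with zero boundary, not the network model of the paper): its Jacobian is a constant tridiagonal matrix, and the resulting $A_{N}$ of (21) has induced 1-norm at most 1, so the condition holds with $c=0$.

```python
import numpy as np

# Numerical check of Assumption T2.3 for a toy f_N (a discrete Laplacian
# with zero boundary, not the network model): its Jacobian is a constant
# tridiagonal matrix, and A_N of (21) then has induced 1-norm exactly 1,
# so the condition holds with c = 0.
N = 50
M = N**3
Df = (np.diag(-2.0 * np.ones(N)) + np.diag(np.ones(N - 1), 1)
      + np.diag(np.ones(N - 1), -1))       # Jacobian of the discrete Laplacian
A = np.eye(N) + Df / M                     # A_N(k) as in (21)
norm1 = np.max(np.sum(np.abs(A), axis=0))  # induced 1-norm (23)
print(norm1)  # 1 up to rounding: interior columns sum to 1, end ones to 1 - 1/M
```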

In Section III-E, we will show that these sufficient conditions hold for the network model described in Section I-C, and use this theorem to prove the convergence of its underlying Markov chain to a PDE.

### E. Application to Network Models

In this subsection, we apply the main results to show how the Markov chain modeling the network introduced in Section I-C can be approximated by the solution of a PDE. This approximation was heuristically developed in [3].

We first deal with the one-dimensional network model. Its corresponding stochastic and deterministic difference equations (1) and (4) were specified by (2) and (5), respectively.

For this model we set $\delta_{N}$ (introduced in Section II-A) to be $ds_{N}^{2}$. Then $$dt_{N,M}:=\delta_{N}/M=ds_{N}^{2}/M.$$

Assume that $$P_{l}(n)=p_{l}(v_{N}(n)){\rm~and~}P_{r}(n)=p_{r}(v_{N}(n)),\eqno{\hbox{(24)}}$$ where $p_{l}(s)$ and $p_{r}(s)$ are real-valued functions defined on ${\cal D}$ such that $$p_{l}(s)=b(s)+c_{l}(s)ds_{N}{\rm~and~}p_{r}(s)=b(s)+c_{r}(s)ds_{N}.$$ Let $c=c_{l}-c_{r}$. The values $b(s)$ and $c(s)$ correspond to diffusion and convection quantities in the limiting PDE. Because $p_{l}(s)+p_{r}(s)\leq 1$, it is necessary that $b(s)\leq 1/2$. In order to guarantee that the number of messages entering the system from outside over finite time intervals remains finite throughout the limiting process, we set $g(n)=Mg_{p}(v_{N}(n))dt_{N}$, where $g_{p}:{\cal D}\to{\BBR}$ is called the message generation rate. Assume that $b$, $c_{l}$, $c_{r}$, and $g_{p}$ are in ${\cal C}^{1}$. Further assume that $x_{N,M}(k)\in [{0,1}]^{N}$ for each $k$. Then $f_{N}$ is in ${\cal C}^{1}$.

We have assumed above that the probabilities $P_{l}$ and $P_{r}$ of the direction of transmission are the values of the continuous functions $p_{l}$ and $p_{r}$ at the grid points, respectively. This may correspond to stochastic routing schemes where nodes in close vicinity behave similarly based on some local information that they share, or to schemes with underlying network-wide directional configurations, continuous in space, designed to relay messages to destination nodes at known locations. On the other hand, the results can be extended to situations with certain levels of discontinuity, as discussed in Section IV.

By these assumptions and definitions, it follows from (5) that the function $f$ in (8) for this network model is $$\eqalignno{f(s, z(t,s),&\nabla z(t,s),\nabla^{2}z(t,s))\cr&=b(s){{\partial}\over{\partial s}}\left((1-z(t,s))(1+3z(t,s))z_{s}(t,s)\right)\cr&\quad+2(1-z(t,s))z_{s}(t,s)b_{s}(s)+z(t,s)(1-z(t,s))^{2}b_{ss}(s)\cr&\quad+{{\partial}\over{\partial s}}(c(s)z(t,s)(1-z(t,s))^{2})+g_{p}(s).&{\hbox{(25)}}}$$ Here, a single subscript $s$ denotes the first partial derivative with respect to $s$ and a double subscript $ss$ the second.

Note that the computations needed to obtain (25) (and later, (26), (48), and (49)) require tedious but elementary algebraic manipulations. For this purpose, we found it helpful to use the symbolic tools in MATLAB.

Based on the behavior of nodes $n=1$ and $n=N$ next to the destination nodes, we derive the boundary condition (9) of the PDE of this network. For example, node $n=1$ receives messages only from the right and encounters no interference when transmitting to the left. Replacing $x_{n}$ with $n\leq 0$ or $n\geq N+1$ by 0, it follows that the first component of $f_{N}(x)$ is $$\eqalignno{(1-x_{n})&P_{l}(n+1)x_{n+1}\cr&-x_{n}[P_{l}(n)+P_{r}(n)(1-x_{n+1})(1-x_{n+2})]+g(n).}$$ Similarly, the $N$th component of $f_{N}(x)$ is $$\eqalignno{(1-x_{n})&P_{r}(n-1)x_{n-1}\cr&-x_{n}[P_{r}(n)+P_{l}(n)(1-x_{n-1})(1-x_{n-2})]+g(n).}$$ Set $\beta_{N}$, defined in Section II-A, to be 1. Then from each of the above two functions we get the function $h$ in (9) for the one-dimensional network: $$\eqalignno{h(s, z(t,s),&\nabla z(t,s),\nabla^{2}z(t,s))\cr&=-b(s)z(t,s)^{3}+b(s)z(t,s)^{2}-b(s)z(t,s).&{\hbox{(26)}}}$$ Note that the function $h$ is the limit of $f_{N}(y_{N},s_{N})/\beta_{N}$, not $f_{N}(y_{N},s_{N})/\delta_{N}$ (whose limit is $f$). Solving $h=0$ for real $z$, we obtain the boundary condition $z(t,s)=0$.
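Assuming $b(s)\neq 0$, solving $h=0$ in (26) amounts to finding the real roots of $z^{3}-z^{2}+z$, which can be verified numerically (the tolerance below is an arbitrary choice):

```python
import numpy as np

# Real roots of z^3 - z^2 + z = 0, i.e., h = 0 after dividing out -b(s) != 0.
r = np.roots([1.0, -1.0, 1.0, 0.0])
real_roots = r[np.abs(r.imag) < 1e-9].real
print(real_roots)  # only z = 0: the factor z^2 - z + 1 has negative discriminant
```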

Let $z$ be the solution of the PDE (8) with $f$ specified by (25) and with boundary condition $z(t,s)=0$ and initial condition $z(0,s)=z_{0}(s)$. Assume that (17) holds. As in Section II-C, we treat $M$ as a sequence of $N$, written $M_{N}$. In the following theorem we show the convergence of the Markov chain modeling the one-dimensional network to the PDE solution.

#### Theorem 3

(Convergence of network model) For the one-dimensional network model, a.s., there exist $c_{0}<\infty$, $N_{0}$, and ${\mathhat M}_{1}<{\mathhat M}_{2}<{\mathhat M}_{3},\ldots$ such that for each $N\geq N_{0}$ and each $M_{N}\geq{\mathhat M}_{N}$, $\Vert X^{(p)}_{N}-z\Vert^{(p)}<c_{0}ds_{N}$.

We prove this in Section III-E.

#### 1. Interpretation of Limiting PDE

Now we make some remarks on how to interpret a given limiting PDE. First, for fixed $N$ and $M$, the normalized queue length of node $n$ at time $k$ is approximated by the value of the PDE solution $z$ at the corresponding point in $[0,T]\times{\cal D}$; i.e., ${{X_{N,M}(k,n)}\over{M}}\approx z(t_{N,M}(k),v_{N}(n))$.

Second, we discuss how to interpret $C(t_{o}):=\int_{\cal D}z(t_{o},s)ds$, the area below the curve $z(t_{o},s)$ for fixed $t_{o}\in [0,T]$. Let $k_{o}=\lfloor t_{o}/dt_{N,M}\rfloor$. Then we have that $z(t_{o},v_{N}(n))ds_{N}\approx{{X_{N,M}(k_{o},n)}\over{M}}ds_{N}$, the area of the $n$th rectangle in Fig. 3. Therefore $$C(t_{o})\approx\sum_{n=1}^{N}z(t_{o},v_{N}(n))ds_{N}\approx\sum_{n=1}^{N}{{X_{N,M}(k_{o},n)}\over{M}}ds_{N},$$ the sum of the areas of all the rectangles. If we assume that all messages in the queue have roughly the same number of bits, and think of $ds_{N}$ as the “coverage” of each node, then the area under any segment of the curve measures a kind of “data-coverage product” of the nodes covered by the segment, in units of ${\rm bit}\cdot{\rm meter}$. As $N\to\infty$, the total normalized queue length $\sum_{n=1}^{N}X_{N,M}(k_{o},n)/M$ of the network does go to $\infty$; however, the coverage $ds_{N}$ of each node goes to 0. Hence the sum of the “data-coverage products” can be approximated by the finite area $C(t_{o})$.

Fig. 3. The PDE solution at a fixed time that approximates the normalized queue lengths of the network.

#### 2. Comparisons of the PDE Solutions and Monte Carlo Simulations of the Networks

In the remainder of this section, we compare the limiting PDE solutions with Monte Carlo simulations of the networks.1

We first consider a one-dimensional network over the domain ${\cal D}=[{-1,1}]$. We use the initial condition $z_{0}(s)=l_{1}e^{-s^{2}}$, where $l_{1}>0$ is a constant, so that initially the nodes in the middle have messages to transmit, while those near the boundaries have very few. We set the message generation rate $g_{p}(s)=l_{2}e^{-s^{2}}$, where $l_{2}>0$ is a parameter determining the total load of the system.

We use three sets of values, $N=20$, 50, 80 with $M=N^{3}$, and show the PDE solution and the Monte Carlo simulation results with different $N$ and $M$ at $t=1~{\rm s}$. The networks have diffusion $b=1/2$ and convection $c=0$ in Fig. 4 and $c=1$ in Fig. 5, respectively; in both figures, the x-axis denotes the node location and the y-axis the normalized queue length.

Fig. 4. The Monte Carlo simulations (with different $N$ and $M$) and the PDE solution of a one-dimensional network, with $b=1/2$ and $c=0$, at $t=1~{\rm s}$.
Fig. 5. The Monte Carlo simulations (with different $N$ and $M$) and the PDE solution of a one-dimensional network, with $b=1/2$ and $c=1$, at $t=1~{\rm s}$.

For the three sets of the values of $N=20$, 50, 80 and $M=N^{3}$, with $c=0$, the maximum absolute errors of the PDE approximation are $5.6\times 10^{-3}$, $1.3\times 10^{-3}$, and $1.1\times 10^{-3}$, respectively; and with $c=1$, the errors are $4.4\times 10^{-3}$, $1.5\times 10^{-3}$, and $1.1\times 10^{-3}$, respectively. As we can see, as $N$ and $M$ increase, the resemblance between the Monte Carlo simulations and the PDE solution becomes stronger. In the case of very large $N$ and $M$, it is difficult to distinguish the results.

We stress that the PDEs only took fractions of a second to solve on a computer, while the Monte Carlo simulations took on the order of tens of hours.
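To give a sense of that cost, a minimal explicit finite-difference solver for the constant-coefficient case of Fig. 4, where $b=1/2$ and $c=0$ so that (25) reduces to $z_{t}=b\,\partial_{s}((1-z)(1+3z)z_{s})+g_{p}$, runs in well under a minute; the load constants $l_{1}$, $l_{2}$, the grid size, and the time step below are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Explicit finite-difference sketch of the limiting PDE for b = 1/2, c = 0,
# where (25) reduces to z_t = b * d/ds[(1-z)(1+3z) z_s] + g_p on [-1, 1],
# with z = 0 on the boundary. All numerical parameters are illustrative.
l1, l2, b, T = 0.2, 0.1, 0.5, 1.0
N = 200
s = np.linspace(-1.0, 1.0, N + 2)      # grid including the two boundary points
ds = s[1] - s[0]
g = l2 * np.exp(-s**2)                 # message generation rate g_p
z = l1 * np.exp(-s**2)                 # initial condition z_0
z[0] = z[-1] = 0.0                     # boundary condition z = 0
dt = 0.2 * ds**2 / b                   # conservative explicit time step
for _ in range(int(T / dt)):
    D = (1.0 - z) * (1.0 + 3.0 * z)    # nonlinear diffusion coefficient
    Dh = 0.5 * (D[1:] + D[:-1])        # coefficient at the cell interfaces
    flux = Dh * (z[1:] - z[:-1]) / ds  # D * z_s at the interfaces
    z[1:-1] += dt * (b * (flux[1:] - flux[:-1]) / ds + g[1:-1])
    z[0] = z[-1] = 0.0
print(z.max())  # peak normalized queue length at t = 1 s
```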

#### 3. A Two-Dimensional Network

The generalization of the continuum model to higher dimensions is straightforward, apart from more arduous algebraic manipulation. Likewise, the convergence analysis is similar to that of the one-dimensional case.

We consider a two-dimensional network of $N=N_{1}\times N_{2}$ sensor nodes uniformly placed over a domain ${\cal D}\subset{\BBR}^{2}$. Here we switch to a two-dimensional labeling scheme. We label the nodes by $(n_{1},n_{2})$, where $n_{1}=1,\ldots,N_{1}$ and $n_{2}=1,\ldots,N_{2}$, and denote the grid point in ${\cal D}$ corresponding to node $(n_{1},n_{2})$ by $v_{N}(n_{1},n_{2})$. This labeling scheme is more intuitive for this two-dimensional scenario, but is essentially equivalent to the single-label one. (e.g., if we set $n:=(n_{1}-1)N_{2}+n_{2}$ and ${\mathhat v}_{N}(n):=v_{N}(n_{1},n_{2})$, then ${\mathhat v}_{N}(n)$ form the same grid.)

Again let the distance between any two neighboring nodes be $ds_{N}$. Assume that node $(n_{1},n_{2})$ randomly chooses to transmit to the east, west, north, or south immediate neighbor with probabilities $P_{e}(n_{1},n_{2})=b_{1}(v_{N}(n_{1},n_{2}))+c_{e}(v_{N}(n_{1},n_{2}))ds_{N}$, $P_{w}(n_{1},n_{2})=b_{1}(v_{N}(n_{1},n_{2}))+c_{w}(v_{N}(n_{1},n_{2}))ds_{N}$, $P_{n}(n_{1},n_{2})=b_{2}(v_{N}(n_{1},n_{2}))+c_{n}(v_{N}(n_{1},n_{2}))ds_{N}$, and $P_{s}(n_{1},n_{2})=b_{2}(v_{N}(n_{1},n_{2}))+c_{s}(v_{N}(n_{1},n_{2}))ds_{N}$, respectively, where $P_{e}(n_{1},n_{2})+P_{w}(n_{1},n_{2})+P_{n}(n_{1},n_{2})+P_{s}(n_{1},n_{2})\leq 1$. Therefore it is necessary that $b_{1}(s)+b_{2}(s)\leq 1/2$. Define $c_{1}=c_{w}-c_{e}$ and $c_{2}=c_{s}-c_{n}$.

The derivation of the limiting PDE is similar to that of the one-dimensional case, except that we now have to consider transmission to and interference from four directions instead of two. We present the limiting PDE here without the detailed derivation: $$\eqalignno{{\mathdot z}=\sum_{j=1}^{2}\bigg[b_{j}{{\partial}\over{\partial s_{j}}}&\left((1+5z)(1-z)^{3}{{\partial z}\over{\partial s_{j}}}\right)+2(1-z)^{3}{{\partial z}\over{\partial s_{j}}}{{d b_{j}}\over{d s_{j}}}\cr&+z(1-z)^{4}{{d^{2}b_{j}}\over{d s_{j}^{2}}}+{{\partial}\over{\partial s_{j}}}\big(c_{j}z(1-z)^{4}\big)\bigg]+g_{p},}$$ with boundary condition $z(t,s)=0$ and initial condition $z(0,s)=z_{0}(s)$, where $t\in [0,T]$ and $s=(s_{1},s_{2})\in{\cal D}$.

We now compare the PDE approximation and the Monte Carlo simulations of a network over the domain ${\cal D}=[{-1,1}]\times [{-1,1}]$. We use the initial condition $z_{0}(s)=l_{1}e^{-(s_{1}^{2}+s_{2}^{2})}$, where $l_{1}>0$ is a constant. We set the message generation rate $g_{p}(s)=l_{2}e^{-(s_{1}^{2}+s_{2}^{2})}$, where $l_{2}>0$ is a constant.

We use three sets of values of $N_{1}\times N_{2}$ and $M$, where $N_{1}=N_{2}=20$, 50, 80 and $M=N_{1}^{3}$. We show the contours of the normalized queue length from the PDE solution and the Monte Carlo simulation results with the different sets of values of $N_{1}$, $N_{2}$, and $M$, at $t=0.1~{\rm s}$. The networks have diffusion $b_{1}=b_{2}=1/4$ and convection $c_{1}=c_{2}=0$ in Fig. 6 and $c_{1}=-2$, $c_{2}=-4$ in Fig. 7, respectively.

Fig. 6. The Monte Carlo simulations (from top to bottom, with $N_{1}=N_{2}=20$, 50, 80, respectively, and $M=N_{1}^{3}$) and the PDE solution of a two-dimensional network, with $b_{1}=b_{2}=1/4$ and $c_{1}=c_{2}=0$, at $t=0.1~{\rm s}$.
Fig. 7. The Monte Carlo simulations (from top to bottom, with $N_{1}=N_{2}=20$, 50, 80, respectively, and $M=N_{1}^{3}$) and the PDE solution of a two-dimensional network, with $b_{1}=b_{2}=1/4$ and $c_{1}=-2$, $c_{2}=-4$, at $t=0.1~{\rm s}$.

For the three sets of values of $N_{1}=N_{2}=20$, 50, 80 and $M=N_{1}^{3}$, with $c_{1}=c_{2}=0$, the maximum absolute errors are $3.2\times 10^{-3}$, $1.1\times 10^{-3}$, and $6.8\times 10^{-4}$, respectively; and with $c_{1}=-2$, $c_{2}=-4$, the errors are $4.1\times 10^{-3}$, $1.0\times 10^{-3}$, and $6.6\times 10^{-4}$, respectively. Again the accuracy of the continuum model increases with $N_{1}$, $N_{2}$, and $M$.

It took 3 days to run the Monte Carlo simulation of the network up to $t=0.1~{\rm s}$ with 80×80 nodes and maximum queue length $M=80^{3}$, while solving the PDE on the same machine took less than a second. We could not run Monte Carlo simulations of any larger networks, or for greater values of $t$, because of prohibitively long computation times.

SECTION III

## PROOFS OF THE MAIN RESULTS

This section is devoted solely to the proofs of the results in Section II. As such, the material here is highly technical and may be tedious to follow in detail, though we have tried to make it as readable as possible. The reader can safely skip this section without losing the main ideas of the paper.

We first prove Theorem 1 (Main Theorem) by analyzing the convergence of the Markov chains $X_{N,M}(k)$ to the solution of the limiting PDE in a two-step procedure. In the first step, for each $N$, we show in Section III-A that as $M\to\infty$, $X_{N,M}(k)/M$ converges to $x_{N,M}(k)$. In the second step, we treat $M$ as a function of $N$, written $M_{N}$, and for any sequence $\{M_{N}\}$, we show in Section III-B that as $N\to\infty$, $x_{N}(k)$ converges to the PDE solution. Combining the two steps, we show in Section III-C that as $N$ and $M_{N}$ go to $\infty$ in a dependent way, $X_{N}^{(p)}$ converges to the PDE solution, proving Theorem 1. We then prove Theorem 2 (Sufficient condition for key assumption) in Section III-D. Finally, in Section III-E, we prove Theorem 3 (Convergence of network model) using Theorems 1 and 2.

### A. Convergence of $X_{N,M}(k)/M$ and $x_{N,M}(k)$ to the Solution of the Same ODE

In this subsection, we show that for each $N$, $X_{N,M}(k)/M$ and $x_{N,M}(k)$ are close in a certain sense for large $M$ under certain conditions, by proving that both their continuous-time extensions converge to the solution of the same ODE.

For fixed $T$ and $N$, by (10), ${\mathtilde{T}}_{N}$ is fixed. As defined by (11) and (12) respectively, both $X^{(o)}_{N,M}({\mathtilde{t}})$ and $x^{(o)}_{N,M}({\mathtilde{t}})$ with ${\mathtilde{t}}\in [0,{\mathtilde{T}}_{N}]$ are in the space $D^{N}[0,{\mathtilde{T}}_{N}]$ of ${\BBR}^{N}$-valued càdlàg functions on $[0,{\mathtilde{T}}_{N}]$. Since they both depend on $M$, each of them forms a sequence of functions in $D^{N}[0,{\mathtilde{T}}_{N}]$ indexed by $M=1,2,\ldots$. Define the $\infty$-norm $\Vert\cdot\Vert_{\infty}^{(o)}$ on $D^{N}[0,{\mathtilde T}_{N}]$ by, for $x\in D^{N}[0,{\mathtilde T}_{N}]$, $$\Vert x\Vert_{\infty}^{(o)}=\max_{n=1,\ldots,N}\sup_{t\in[0,{\mathtilde{T}}_{N}]}\vert x_{n}(t)\vert,$$ where $x_{n}$ is the $n$th component of $x$.

Now we present a lemma stating that under some conditions, for each $N$, as $M\to\infty$, $X^{(o)}_{N,M}$ converges uniformly to the solution of the ODE ${\mathdot{y}}=f_{N}(y)$, and $x^{(o)}_{N,M}$ converges uniformly to the same solution, both on $[0,{\mathtilde{T}}_{N}]$.

#### Lemma 1

Assume, for each $N$, that:

1. there exists an identically distributed sequence $\{\lambda_{N}(k)\}$ of integrable random variables such that for each $k$ and $x$, $\vert F_{N}(x,U_{N}(k))\vert\leq\lambda_{N}(k)$ a.s.;
2. the function $F_{N}(x,U_{N}(k))$ is continuous in $x$ a.s.; and
3. the ODE ${\mathdot{y}}=f_{N}(y)$ has a unique solution on $[0,{\mathtilde{T}}_{N}]$ for any initial condition $y(0)$.

Suppose that as $M\to\infty$, $X^{(o)}_{N,M}(0)\mathrel{\mathop\rightarrow\limits^{P}} {y}(0)$ and $x^{(o)}_{N,M}(0)\to y(0)$, where “$\mathrel{\mathop\rightarrow\limits^{P}}$” represents convergence in probability. Then, for each $N$, as $M\to\infty$, $\Vert X^{(o)}_{N,M}-y\Vert_{\infty}^{(o)}\mathrel{\mathop\rightarrow\limits^{P}}{0}$ and $\Vert x^{(o)}_{N,M}-y\Vert_{\infty}^{(o)}\to 0$ on $[0,{\mathtilde{T}}_{N}]$, where $y$ is the unique solution of ${\mathdot{y}}=f_{N}(y)$ with initial condition $y(0)$.

To prove Lemma 1, we first present a lemma due to Kushner [19].

#### Lemma 2

Assume, for each $N$, that:

1. the set $\{\vert F_{N}(x, U_{N}(k))\vert: k\geq 0\}$ is uniformly integrable;
2. for each $k$ and each bounded random variable $X$, $$\lim_{\delta\to 0}E\sup_{\vert Y\vert\leq\delta}\vert F_{N}(X, U_{N}(k))-F_{N}(X+Y,U_{N}(k))\vert=0;$$ and
3. there is a function ${\mathhat{f}}_{N}(\cdot)$ (continuous by Assumption L2.2) such that as $n\to\infty$, $${{1}\over{n}}\sum^{n}_{k=0}{F_{N}(x, U_{N}(k))\mathrel{\mathop\rightarrow\limits^{P}}{\mathhat{f}}_{N}(x)}.$$

Suppose that, for each $N$, ${\mathdot{y}}={\mathhat{f}}_{N}(y)$ has a unique solution on $[0,{\mathtilde T}_{N}]$ for any initial condition, and that $X^{(o)}_{N,M}(0)\Rightarrow y(0)$, where “$\Rightarrow$” represents weak convergence. Then for each $N$, as $M\to\infty$, $\Vert X^{(o)}_{N,M}-y\Vert_{\infty}^{(o)}\Rightarrow 0$ on $[0,{\mathtilde T}_{N}]$.

We note that in Kushner's original theorem, the convergence of $X^{(o)}_{N,M}$ to $y$ is stated in terms of the Skorokhod norm [19], but this is equivalent to convergence in the $\infty$-norm in our case, where the time interval $[0,{\mathtilde T}_{N}]$ is finite and the limit $y$ is continuous [110].

We now prove Lemma 1 by showing that the Assumptions L2.1, L2.2, and L2.3 of Lemma 2 hold under the Assumptions L1.1, L1.2, and L1.3 of Lemma 1.

#### Proof of Lemma 1

Since $\lambda_{N}(k)$ is integrable, as $a\to\infty$, $E\vert\lambda_{N}(k)\vert 1_{\{\vert\lambda_{N}(k)\vert>a\}}\to 0$, where $1_{A}$ is the indicator function of the set $A$. By Assumption L1.1, for each $k$, $x$, and $a>0$, $$\eqalignno{E\vert F_{N}(x,U_{N}(k))\vert &1_{\{\vert F_{N}(x,U_{N}(k))\vert>a\}}\cr&\leq E\vert\lambda_{N}(k)\vert 1_{\{\vert F_{N}(x,U_{N}(k))\vert>a\}}\cr&\leq E\vert\lambda_{N}(k)\vert 1_{\{\vert\lambda_{N}(k)\vert>a\}}.}$$ Therefore as $a\to\infty$, $$\sup_{k\geq 0}E\vert F_{N}(x,U_{N}(k))\vert 1_{\{\vert F_{N}(x,U_{N}(k))\vert>a\}}\to 0;$$ i.e., the family $\{\vert F_{N}(x,U_{N}(k))\vert: k\geq 0\}$ is uniformly integrable, and hence Assumption L2.1 holds.

By Assumption L1.2, for each $k$ and each bounded $X$, a.s., $$\lim_{\delta\to 0}\sup_{\vert Y\vert\leq\delta}\vert F_{N}(X,U_{N}(k))-F_{N}(X+Y,U_{N}(k))\vert=0.$$ By Assumption L1.1, for each $k$ and each bounded $X$ and $Y$, a.s., $$\eqalignno{\vert F_{N}(X,U_{N}(k))&-F_{N}(X+Y,U_{N}(k))\vert\cr&\leq\vert F_{N}(X,U_{N}(k))\vert+\vert F_{N}(X+Y,U_{N}(k))\vert\leq 2\lambda_{N}(k).}$$ Therefore, for each $k$, each bounded $X$, and each $\delta$, a.s., $$\bigg\vert\sup_{\vert Y\vert\leq\delta}\vert F_{N}(X,U_{N}(k))-F_{N}(X+Y,U_{N}(k))\vert\bigg\vert\leq 2\lambda_{N}(k),$$ an integrable random variable. By the dominated convergence theorem, $$\eqalignno{\lim_{\delta\to 0}E\sup_{\vert Y\vert\leq\delta}&\vert F_{N}(X,U_{N}(k))-F_{N}(X+Y,U_{N}(k))\vert\cr&=E\lim_{\delta\to 0}\sup_{\vert Y\vert\leq\delta}\vert F_{N}(X,U_{N}(k))-F_{N}(X+Y,U_{N}(k))\vert=0.}$$ Hence Assumption L2.2 holds.

Since the $U_{N}(k)$ are i.i.d., by the weak law of large numbers and the definition of $f_{N}$ in (3), as $n\to\infty$, $${{1}\over{n}}\sum^{n}_{k=0}{F_{N}(x, U_{N}(k))\mathrel{\mathop\rightarrow\limits^{P}} {f}_{N}(x)}.$$ Hence Assumption L2.3 holds.

Therefore, by Lemma 2, for each $N$, as $M\to\infty$, $\Vert X^{(o)}_{N,M}-y\Vert_{\infty}^{(o)}\Rightarrow 0$ on $[0,{\mathtilde T}_{N}]$. For any sequence of random processes $\{X_{n}\}$, if $A$ is a constant, $X_{n}\Rightarrow A$ if and only if $X_{n}\mathrel{\mathop\rightarrow\limits^{P}} {A}$. Therefore, as $M\to\infty$, $\Vert X^{(o)}_{N,M}-y\Vert_{\infty}^{(o)}\mathrel{\mathop\rightarrow\limits^{P}}{0}$ on $[0,{\mathtilde T}_{N}]$. The same argument implies the deterministic convergence of $x^{(o)}_{N,M}$: as $M\to\infty$, $\Vert x^{(o)}_{N,M}-y\Vert_{\infty}^{(o)}\to 0$ on $[0,{\mathtilde T}_{N}]$. $\hfill\square$
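The law-of-large-numbers step in the proof above can be illustrated numerically with a toy $F(x,U)=x+U$, $U\sim{\rm Uniform}(-1,1)$ i.i.d., so that $f(x)=E[F(x,U)]=x$; the distribution, sample size, and seed are arbitrary choices.

```python
import numpy as np

# Toy illustration of the WLLN step: the running average of F(x, U_k)
# approaches f(x) = E[F(x, U)] = x for F(x, U) = x + U, U ~ Uniform(-1, 1).
rng = np.random.default_rng(0)
x, n = 3.0, 200000
avg = np.mean(x + rng.uniform(-1.0, 1.0, n))
print(abs(avg - x))  # close to 0 for large n
```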

Based on Lemma 1, we get the following lemma, which states that, for each $N$, $X^{(o)}_{N,M}$ and $x^{(o)}_{N,M}$ are close with high probability for large $M$.

#### Lemma 3

Let the assumptions of Lemma 1 hold. Then for any sequence $\{\zeta_{N}\}$, for each $N$ and for $M$ sufficiently large, $$P\{\Vert X^{(o)}_{N,M}-x^{(o)}_{N,M}\Vert_{\infty}^{(o)}>\zeta_{N}\}\leq 1/N^{2}{\rm~on~}[0,{\mathtilde{T}}_{N}].$$

#### Proof

By the triangle inequality, $$\Vert X^{(o)}_{N,M}-x^{(o)}_{N,M}\Vert_{\infty}^{(o)}\leq\Vert X^{(o)}_{N,M}-y\Vert_{\infty}^{(o)}+\Vert x^{(o)}_{N,M}-y\Vert_{\infty}^{(o)}.$$ By Lemma 1, for each $N$, as $M\to\infty$, $\Vert X^{(o)}_{N,M}-x^{(o)}_{N,M}\Vert_{\infty}^{(o)}\mathrel{\mathop\rightarrow\limits^{P}}{0}$ on $[0,{\mathtilde{T}}_{N}]$. This completes the proof. $\blackboxfill$

Since $X^{(o)}_{N,M}$ and $x^{(o)}_{N,M}$ are the continuous-time extensions of $X_{N,M}(k)$ and $x_{N,M}(k)$ by piecewise-constant extensions, respectively, we have the following corollary stating that for each $N$, as $M\to\infty$, $X_{N,M}(k)/M$ converges uniformly to $x_{N,M}(k)$.

#### Corollary 1

Let the assumptions of Lemma 1 hold. Then for any sequence $\{\zeta_{N}\}$, for each $N$ and for $M$ sufficiently large, we have that $$P\left\{\max_{{k=1,\ldots, K_{N,M}}\atop{n=1,\ldots,N}}\left\vert{{X_{N,M}(k,n)}\over{M}}-x_{N,M}(k,n)\right\vert>\zeta_{N}\right\}\leq{{1}\over{N^{2}}}.$$

### B. Convergence of $x_{N}(k)$ to the Limiting PDE

For the remainder of this section, we treat $M$ as a function of $N$, written $M_{N}$. We now state conditions under which $\varepsilon_{N}$ converges to 0 for any sequence $\{M_{N}\}$ as $N\to\infty$.

#### Lemma 4

Assume that:

1. there exist a sequence $\{\xi_{N}\}$ and $c_{1}<\infty$ such that as $N\to\infty$, $\xi_{N}\to 0$, and for $N$ sufficiently large, $\Vert u_{N}\Vert^{(N)}<c_{1}\xi_{N}$;
2. for each $N$, (17) holds; and
3. the sequence $\{\mu_{N}\}$ is bounded.

Then there exists $c_{0}<\infty$ such that for any sequence $\{M_{N}\}$ and $N$ sufficiently large, $\Vert\varepsilon_{N}\Vert^{(N)}<c_{0}\xi_{N}$.

#### Proof

By the definition (20) of $\mu_{N}$, for each $N$, there exists $\delta>0$ such that for $\alpha<\delta$, $$\sup_{\Vert u\Vert^{(N)}\leq\alpha}{{\Vert H_{N}(u)\Vert^{(N)}}\over{\Vert u\Vert^{(N)}}}\leq\mu_{N}+1.$$ By Assumption L4.1, $\Vert u_{N}\Vert^{(N)}\to 0$ as $N\to\infty$. Then there exists $\alpha_{1}$ such that for $N$ sufficiently large, $\Vert u_{N}\Vert^{(N)}\leq\alpha_{1}<\delta$, and hence $${{\Vert H_{N}(u_{N})\Vert^{(N)}}\over{\Vert u_{N}\Vert^{(N)}}}\leq\sup_{\Vert u\Vert^{(N)}\leq\alpha_{1}}{{\Vert H_{N}(u)\Vert^{(N)}}\over{\Vert u\Vert^{(N)}}}\leq\mu_{N}+1.$$ Therefore, for $N$ sufficiently large, $$\Vert\varepsilon_{N}\Vert^{(N)}=\Vert H_{N}(u_{N})\Vert^{(N)}\leq (\mu_{N}+1)\Vert u_{N}\Vert^{(N)}.$$ By Assumption L4.3, and because the derivation above does not depend on the choice of the sequence $\{M_{N}\}$, the proof is complete. $\blackboxfill$

### C. Proof of Theorem 1

We now prove the main theorem.

#### Proof of Theorem 1

By Lemma 4, there exist a sequence $\{\xi_{N}\}$ and $c_{2}<\infty$ such that as $N\to\infty$, $\xi_{N}\to 0$, and for $N$ sufficiently large, $\Vert\varepsilon_{N}\Vert^{(N)}\leq c_{2}\xi_{N}$.

Let $X_{N}=[X_{N}(1)^{\top},\ldots,X_{N}(K_{N})^{\top}]^{\top}/M_{N}$, $x_{N}=[x_{N}(1)^{\top},\ldots,x_{N}(K_{N})^{\top}]^{\top}$, and $z_{N}=[z_{N}(1)^{\top},\ldots,z_{N}(K_{N})^{\top}]^{\top}$ denote vectors in ${\BBR}^{K_{N}N}$. Hence $\varepsilon_{N}=x_{N}-z_{N}$.

For $x\in{\BBR}^{K_{N}N}$, where $x=[x(1)^{\top},\ldots,x(K_{N})^{\top}]^{\top}$ and $x(k)=[x(k,1),\ldots,x(k,N)]^{\top}\in{\BBR}^{N}$, we have that $$\Vert x\Vert^{(N)}\leq\max_{{k=1,\ldots, K_{N}}\atop{n=1,\ldots,N}}\vert x(k,n)\vert.$$ Therefore, by Corollary 1, there exists a sequence $\{{\mathtilde M}_{N}\}$ such that if for each $N$, $M_{N}\geq{\mathtilde M}_{N}$, then $$\sum_{N=1}^{\infty}P\left\{\Vert X_{N}-x_{N}\Vert^{(N)}>\xi_{N}\right\}\leq\sum_{N=1}^{\infty}1/N^{2}<\infty.$$ It follows from the first Borel-Cantelli lemma that a.s., there exists $N_{1}$ such that for $N\geq N_{1}$ and $M_{N}\geq{\mathtilde M}_{N}$, $\Vert X_{N}-x_{N}\Vert^{(N)}\leq\xi_{N}$.

By the triangle inequality, $$\Vert X_{N}-z_{N}\Vert^{(N)}\leq\Vert X_{N}-x_{N}\Vert^{(N)}+\Vert\varepsilon_{N}\Vert^{(N)}.$$ Therefore, a.s., there exists $N_{2}$ such that for $N\geq N_{2}$ and $M_{N}>{\mathtilde M}_{N}$, $$\Vert X_{N}-z_{N}\Vert^{(N)}<(c_{2}+1)\xi_{N}.\eqno{\hbox{(27)}}$$

Let $z_{N}^{(p)}(t,s)$, where $(t,s)\in[0,T]\times{\cal D}$, be the continuous-time-space extension of $z_{N}(k)$, defined in the same way that $X_{N}^{(p)}(t,s)$ is defined from $X_{N}(k)$. Then by this definition,
$$\Vert X_{N}^{(p)}-z_{N}^{(p)}\Vert^{(p)}=\Vert X_{N}-z_{N}\Vert^{(N)}.\tag{28}$$

Let $\Omega_{N}(k,n)=\Omega_{N}^{(t)}(k)\times\Omega_{N}^{(s)}(n)$ be the subset of $[0,T]\times{\cal D}$ containing $(t_{N}(k),v_{N}(n))$ over which $z_{N}^{(p)}$ is piecewise constant; i.e., $t_{N}(k)\in\Omega_{N}^{(t)}(k)$ and $v_{N}(n)\in\Omega_{N}^{(s)}(n)$, and for all $(t,s)\in\Omega_{N}(k,n)$, $z^{(p)}_{N}(t,s)=z^{(p)}_{N}(t_{N}(k),v_{N}(n))=z(t_{N}(k),v_{N}(n))$.

By (10), there exists a sequence $\{\bar M_{N}\}$ such that if for each $N$, $M_{N}\geq\bar M_{N}$, then for $N$ sufficiently large, $dt_{N}\leq ds_{N}$. By Assumption T1.5, there exists $c_{3}<\infty$ such that for $N$ sufficiently large, for $M_{N}\geq\bar M_{N}$, and for $k=1,\ldots,K_{N}$ and $n=1,\ldots,N$,
$$\vert z(t_{N}(k),v_{N}(n))-z(t,s)\vert\leq c_{3}\,ds_{N},\quad (t,s)\in\Omega_{N}(k,n).$$

Then we have that
$$\begin{aligned}\Vert z_{N}^{(p)}-z\Vert^{(p)}&=\sup_{t\in[0,T]}\int_{\cal D}\vert z_{N}^{(p)}(t,s)-z(t,s)\vert\,ds\\ &=\sup_{t\in[0,T]}\sum_{n}\int_{\Omega_{N}^{(s)}(n)}\vert z_{N}^{(p)}(t,s)-z(t,s)\vert\,ds\\ &=\max_{k}\sup_{t\in\Omega_{N}^{(t)}(k)}\sum_{n}\int_{\Omega_{N}^{(s)}(n)}\vert z_{N}^{(p)}(t,s)-z(t,s)\vert\,ds\\ &\leq\max_{k}\sum_{n}\int_{\Omega_{N}^{(s)}(n)}\sup_{t\in\Omega_{N}^{(t)}(k)}\vert z_{N}^{(p)}(t,s)-z(t,s)\vert\,ds\\ &=\max_{k}\sum_{n}\int_{\Omega_{N}^{(s)}(n)}\sup_{t\in\Omega_{N}^{(t)}(k)}\vert z(t_{N}(k),v_{N}(n))-z(t,s)\vert\,ds\\ &\leq\max_{k}\sum_{n}\int_{\Omega_{N}^{(s)}(n)}c_{3}\,ds_{N}\,ds=c_{3}\,ds_{N}\vert{\cal D}\vert,\end{aligned}\tag{29}$$
where $\vert{\cal D}\vert$ is the Lebesgue measure of ${\cal D}$.
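The first-order interpolation bound (29) can be illustrated numerically. The sketch below is an assumption-laden toy: it uses a smooth stand-in $z(t,s)=\sin(2\pi s)\cos(t)$ on the unit domain, builds a piecewise-constant extension from uniform grid samples, and checks that the sup-over-$t$ spatial $L^{1}$ error decays roughly like $ds_{N}$, consistent with the $c_{3}\,ds_{N}\vert{\cal D}\vert$ bound:

```python
import numpy as np

def pc_error(N, K=None):
    """Sup over t of the spatial L1 distance between a smooth stand-in z and
    its piecewise-constant sampling z_N^{(p)} on a uniform grid (toy setup)."""
    if K is None:
        K = 4 * N                       # dt_N <= ds_N, as required after (10)
    T = 1.0
    s_nodes = (np.arange(N) + 0.5) / N  # cell centers v_N(n) on D = [0, 1]
    t_nodes = np.arange(K + 1) * T / K
    z = lambda t, s: np.sin(2 * np.pi * s) * np.cos(t)
    fine = 4                            # fine sample points per spatial cell
    err = 0.0
    for k in range(K):
        for frac in (0.25, 0.75):       # sample times inside Omega^{(t)}(k)
            t = t_nodes[k] + frac * T / K
            s_fine = (np.arange(N * fine) + 0.5) / (N * fine)
            z_pc = z(t_nodes[k], s_nodes)[np.repeat(np.arange(N), fine)]
            # mean over the uniform fine grid approximates the integral over D
            err = max(err, np.mean(np.abs(z_pc - z(t, s_fine))))
    return err

e1, e2 = pc_error(20), pc_error(40)
assert e2 < e1          # error decreases under refinement
assert e2 < 0.7 * e1    # roughly first order in ds_N, matching (29)
```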

By the triangle inequality,
$$\Vert X_{N}^{(p)}-z\Vert^{(p)}\leq\Vert X_{N}^{(p)}-z_{N}^{(p)}\Vert^{(p)}+\Vert z^{(p)}_{N}-z\Vert^{(p)}.$$
Set $\hat M_{N}=\max\{\tilde M_{N},\bar M_{N}\}$. By (27), (28), and (29), a.s., there exist $c_{0}<\infty$ and $N_{0}$ such that for $N\geq N_{0}$ and $M_{N}\geq\hat M_{N}$,
$$\Vert X^{(p)}_{N}-z\Vert^{(p)}<c_{0}\max\{\xi_{N},ds_{N}\}.$$
$\square$

### D. Proof of Theorem 2

To prove Theorem 2, we first prove Lemmas 5 and 6 below.

First we provide in Lemma 5 a sequence bounding $\{\mu_{N}\}$ from above. By (18), for each $N$, for $k=1,\ldots,K_{N}$ and $n=1,\ldots,N$, we can write $\varepsilon_{N}(k,n)=H_{N}^{(k,n)}(u_{N})$, where $H_{N}^{(k,n)}$ maps $\mathbb{R}^{K_{N}N}$ to $\mathbb{R}$. Suppose that $H_{N}$ is differentiable at 0. Define
$$DH_{N}=\max_{k=1,\ldots,K_{N}}\sum_{i=1}^{K_{N}}\max_{j=1,\ldots,N}\sum_{n=1}^{N}\left\vert\frac{\partial H_{N}^{(k,n)}}{\partial u(i,j)}(0)\right\vert.\tag{30}$$

We have the following.

#### Lemma 5

Assume that:

1. for each $N$, (17) holds; and
2. for each $N$, $H_{N}\in{\cal C}^{1}$ locally at 0.

Then we have that for each $N$, $\mu_{N}\leq DH_{N}$.

#### Proof

Let $J_{N}$ be the Jacobian matrix of $H_{N}$ at 0. Note that $J_{N}\in\mathbb{R}^{K_{N}N\times K_{N}N}$. Let $J_{N}(l,m)$ be its $(l,m)$th component, where $l,m=1,\ldots,K_{N}N$. Then for $k,i=1,\ldots,K_{N}$ and $n,j=1,\ldots,N$,
$$\frac{\partial H_{N}^{(k,n)}}{\partial u(i,j)}(0)=J_{N}((k-1)N+n,(i-1)N+j).$$
Let $C_{N}(k,i)$ be the matrix in $\mathbb{R}^{N\times N}$ whose $(n,j)$th component, for $n,j=1,\ldots,N$, is
$$\frac{\partial H_{N}^{(k,n)}}{\partial u(i,j)}(0);$$
i.e., $C_{N}(k,i)$ is the $(k,i)$th block in the partition of $J_{N}$ into $N\times N$ blocks (there are $K_{N}\times K_{N}$ such blocks), where $k,i=1,\ldots,K_{N}$. Then by (30),
$$DH_{N}=\max_{k=1,\ldots,K_{N}}\sum_{i=1}^{K_{N}}\Vert C_{N}(k,i)\Vert_{1}^{(N)},\tag{31}$$
where $\Vert\cdot\Vert_{1}^{(N)}$ is defined by (23).

By (19) and (22), for $u=[u(1)^{\top},\ldots,u(K_{N})^{\top}]^{\top}\in\mathbb{R}^{K_{N}N}$, where $u(k)=[u(k,1),\ldots,u(k,N)]^{\top}\in\mathbb{R}^{N}$,
$$\begin{aligned}\Vert J_{N}u\Vert^{(N)}&=ds_{N}\max_{k=1,\ldots,K_{N}}\left\Vert\sum_{i=1}^{K_{N}}C_{N}(k,i)u(i)\right\Vert_{1}^{(N)}\\ &\leq ds_{N}\max_{k=1,\ldots,K_{N}}\sum_{i=1}^{K_{N}}\Vert C_{N}(k,i)u(i)\Vert_{1}^{(N)}\\ &\leq ds_{N}\max_{k=1,\ldots,K_{N}}\sum_{i=1}^{K_{N}}\Vert C_{N}(k,i)\Vert_{1}^{(N)}\Vert u(i)\Vert_{1}^{(N)}\\ &\leq\max_{k=1,\ldots,K_{N}}\sum_{i=1}^{K_{N}}\Vert C_{N}(k,i)\Vert_{1}^{(N)}\,ds_{N}\max_{l=1,\ldots,K_{N}}\Vert u(l)\Vert_{1}^{(N)}\\ &=DH_{N}\Vert u\Vert^{(N)},\end{aligned}$$
where the last equality follows from (31), (19), and (22). Therefore, for $u\neq 0$,
$$DH_{N}\geq\frac{\Vert J_{N}u\Vert^{(N)}}{\Vert u\Vert^{(N)}}.\tag{32}$$
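The chain of inequalities leading to (32) can be sanity-checked numerically. In the sketch below the Jacobian $J_{N}$ is a random stand-in, $DH_{N}$ is computed from its $N\times N$ blocks as in (31), and the norm $\Vert\cdot\Vert^{(N)}$ is assumed to have the form $ds_{N}\max_{k}\Vert\cdot\Vert_{1}$ from (19)/(22):

```python
import numpy as np

rng = np.random.default_rng(1)
K, N = 3, 5
ds = 1.0 / N                              # uniform spacing (assumption)
J = rng.standard_normal((K * N, K * N))   # random stand-in for J_N

# (31): DH_N is the max over k of the summed induced 1-norms of the
# N x N blocks C_N(k,i) of J_N.
C = J.reshape(K, N, K, N).transpose(0, 2, 1, 3)   # C[k, i] = block (k, i)
block_1norms = np.abs(C).sum(axis=2).max(axis=2)  # ||C(k,i)||_1, shape (K, K)
DH = block_1norms.sum(axis=1).max()

def grid_norm(v):
    # Assumed norm (19)/(22): ds_N * max_k ||v(k)||_1.
    return ds * np.max(np.sum(np.abs(v.reshape(K, N)), axis=1))

# (32): DH_N bounds the ratio ||J_N u||^{(N)} / ||u||^{(N)} for every u != 0.
for _ in range(200):
    u = rng.standard_normal(K * N)
    assert grid_norm(J @ u) <= DH * grid_norm(u) + 1e-9
```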

Note that if $u_{N}=0$, then by (16) and (17), $\varepsilon_{N}=0$. Therefore
$$H_{N}(0)=0.\tag{33}$$
By Assumption L5.2 and Taylor's theorem, there exists a function $\tilde H_{N}$ such that
$$H_{N}(u)=J_{N}u+\tilde H_{N}(u),\tag{34}$$
and
$$\lim_{\alpha\to 0}\sup_{\Vert u\Vert^{(N)}\leq\alpha}\frac{\Vert\tilde H_{N}(u)\Vert^{(N)}}{\Vert u\Vert^{(N)}}=0.\tag{35}$$

By (34) and the triangle inequality, we have that
$$\Vert H_{N}(u)\Vert^{(N)}\leq\Vert J_{N}u\Vert^{(N)}+\Vert\tilde H_{N}(u)\Vert^{(N)}.$$
Therefore, by (20),
$$\mu_{N}\leq\lim_{\alpha\to 0}\sup_{\Vert u\Vert^{(N)}\leq\alpha}\left(\frac{\Vert J_{N}u\Vert^{(N)}}{\Vert u\Vert^{(N)}}+\frac{\Vert\tilde H_{N}(u)\Vert^{(N)}}{\Vert u\Vert^{(N)}}\right).$$
Hence, by (32) and (35), the proof is complete. $\blacksquare$

Next we present in Lemma 6 a relationship between $f_{N}$ and $DH_{N}$. Define for each $N$ and for $k,l=1,\ldots,K_{N}$,
$$B_{N}^{(k,l)}=\begin{cases}A_{N}(k-1)A_{N}(k-2)\cdots A_{N}(l),&1\leq l<k;\\ I_{N},&l=k;\\ 0,&l>k,\end{cases}\tag{36}$$
where $A_{N}(l)$ is as defined by (21). We have the following.

#### Lemma 6

Assume that:

1. for each $N$, (17) holds; and
2. for each $N$, $f_{N}\in{\cal C}^{1}$.

Then we have that for each $N$, for $k,i=1,\ldots,K_{N}$ and $n,j=1,\ldots,N$,
$$\frac{\partial H_{N}^{(k,n)}}{\partial u(i,j)}(0)=B_{N}^{(k,i)}(n,j)\,dt_{N}.$$

#### Proof

By Assumption L6.2 and Taylor's theorem, for fixed $z$, there exists a function $\tilde f_{N}$ such that
$$f_{N}(x_{N}(k))-f_{N}(z_{N}(k))=Df_{N}(z_{N}(k))\varepsilon_{N}(k)+\tilde f_{N}(z_{N}(k)+\varepsilon_{N}(k),z_{N}(k)),$$
and for each $z$,
$$\tilde f_{N}(z,z)=0,\tag{37}$$
and
$$\lim_{\Vert\varepsilon\Vert^{(N)}\to 0}\frac{\big\Vert\tilde f_{N}(z+\varepsilon,z)\big\Vert^{(N)}}{\Vert\varepsilon\Vert^{(N)}}=0.\tag{38}$$
Then we have from (16) that for $k=0,\ldots,K_{N}-1$,
$$\varepsilon_{N}(k+1)=\varepsilon_{N}(k)+\frac{1}{M_{N}}Df_{N}(z_{N}(k))\varepsilon_{N}(k)+\frac{1}{M_{N}}\tilde f_{N}(z_{N}(k)+\varepsilon_{N}(k),z_{N}(k))+dt_{N}u_{N}(k).$$
Therefore
$$\varepsilon_{N}(k+1)=A_{N}(k)\varepsilon_{N}(k)+dt_{N}u_{N}(k)+\frac{\tilde f_{N}(z_{N}(k)+\varepsilon_{N}(k),z_{N}(k))}{M_{N}}.$$
For $k=0,\ldots,K_{N}-1$, define
$$\eta_{N}(k)=dt_{N}u_{N}(k)+\frac{\tilde f_{N}(z_{N}(k)+\varepsilon_{N}(k),z_{N}(k))}{M_{N}}.\tag{39}$$
Then $\varepsilon_{N}(k+1)=A_{N}(k)\varepsilon_{N}(k)+\eta_{N}(k)$. Therefore, for $k=1,\ldots,K_{N}$,
$$\varepsilon_{N}(k)=A_{N}(k-1)\cdots A_{N}(1)\eta_{N}(0)+A_{N}(k-1)\cdots A_{N}(2)\eta_{N}(1)+\cdots+A_{N}(k-1)\eta_{N}(k-2)+\eta_{N}(k-1).$$
Then it follows from (36) that for $k=1,\ldots,K_{N}$,
$$\varepsilon_{N}(k)=\sum_{l=1}^{k}B_{N}^{(k,l)}\eta_{N}(l-1).\tag{40}$$
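The unrolling step from the recursion $\varepsilon_{N}(k+1)=A_{N}(k)\varepsilon_{N}(k)+\eta_{N}(k)$ to the closed form (40) can be verified numerically. A minimal sketch with random stand-ins for $A_{N}(k)$ and $\eta_{N}(k)$, assuming zero initial error $\varepsilon_{N}(0)=0$:

```python
import numpy as np

rng = np.random.default_rng(2)
K, N = 6, 4
A = [rng.standard_normal((N, N)) for _ in range(K)]   # stand-ins for A_N(k)
eta = [rng.standard_normal(N) for _ in range(K)]      # stand-ins for eta_N(k)

# Direct iteration of eps(k+1) = A(k) eps(k) + eta(k), with eps(0) = 0
# (zero initial error is an assumption of this sketch).
eps = [np.zeros(N)]
for k in range(K):
    eps.append(A[k] @ eps[k] + eta[k])

def B(k, l):
    """(36): B^{(k,l)} = A(k-1) A(k-2) ... A(l) for l < k, and I for l = k."""
    M = np.eye(N)
    for m in range(l, k):
        M = A[m] @ M          # left-multiply so the product ends at A(k-1)
    return M

# (40): eps(k) = sum_{l=1}^{k} B^{(k,l)} eta(l-1) matches the iteration.
for k in range(1, K + 1):
    closed = sum(B(k, l) @ eta[l - 1] for l in range(1, k + 1))
    assert np.allclose(eps[k], closed)
```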

Write $\varepsilon_{N}(k)=H_{N}^{(k)}(u_{N})$. By (39),
$$\eta_{N}(k)=dt_{N}u_{N}(k)+\frac{\tilde f_{N}\left(z_{N}(k)+H_{N}^{(k)}(u_{N}),z_{N}(k)\right)}{M_{N}}.$$
Hence by (40), for $k=1,\ldots,K_{N}$,
$$\varepsilon_{N}(k)=\sum_{l=1}^{k}B_{N}^{(k,l)}dt_{N}u_{N}(l-1)+\sum_{l=1}^{k}B_{N}^{(k,l)}\frac{\tilde f_{N}\left(z_{N}(l-1)+H_{N}^{(l-1)}(u_{N}),z_{N}(l-1)\right)}{M_{N}}.$$

Denote by $g_{N}^{(k,l,n)}(\cdot):\mathbb{R}^{K_{N}N}\to\mathbb{R}$ the $n$th component of
$$B_{N}^{(k,l)}\tilde f_{N}\left(z_{N}(l-1)+H_{N}^{(l-1)}(\cdot),z_{N}(l-1)\right).$$
By (37) and (33), $g_{N}^{(k,l,n)}(0)=0$.

Let $\{e(i,j):i=1,\ldots,K_{N},\ j=1,\ldots,N\}$ be the standard basis for $\mathbb{R}^{K_{N}N}$; i.e., $e(i,j)$ is the element of $\mathbb{R}^{K_{N}N}$ with the $(i,j)$th entry equal to 1 and all other entries equal to 0. Then
$$\frac{\partial H_{N}^{(k,n)}}{\partial u(i,j)}(0)=B_{N}^{(k,i)}(n,j)\,dt_{N}+\frac{1}{M_{N}}\sum_{l=1}^{k}\left(\lim_{h\to 0}\frac{g_{N}^{(k,l,n)}(h\,e(i,j))}{h}\right).$$
It remains to show that
$$\lim_{h\to 0}\frac{g_{N}^{(k,l,n)}(h\,e(i,j))}{h}=0.$$
Denote by $\theta_{N}^{(l,d)}(\cdot):\mathbb{R}^{K_{N}N}\to\mathbb{R}$ the $d$th component of $\tilde f_{N}(z_{N}(l)+H_{N}^{(l)}(\cdot),z_{N}(l))$. Then
$$g_{N}^{(k,l,n)}(u)=\sum_{d=1}^{N}B_{N}^{(k,l)}(n,d)\theta_{N}^{(l-1,d)}(u).$$
Denote by $\tilde f_{N}^{(l,d)}(\cdot):\mathbb{R}^{N}\to\mathbb{R}$ the $d$th component of $\tilde f_{N}(z_{N}(l)+(\cdot),z_{N}(l))$. Then
$$\theta_{N}^{(l,d)}(u)=\tilde f_{N}^{(l,d)}(H_{N}^{(l)}(u)).\tag{41}$$
Then it remains to show that
$$\lim_{\Vert u\Vert^{(N)}\to 0}\frac{\theta_{N}^{(l,d)}(u)}{\Vert u\Vert^{(N)}}=0.\tag{42}$$

By Assumption L6.2 and by induction, it follows from (16) that for fixed $z$, $\varepsilon_{N}$ is a ${\cal C}^{1}$ function of $u_{N}$, because the composition of ${\cal C}^{1}$ functions is ${\cal C}^{1}$. Hence Assumption L6.2 here implies Assumption L5.2 of Lemma 5. By Assumption L5.2 and (33), there exists $c$ with $\vert c\vert<\infty$ such that for each $\varepsilon_{1}>0$, there exists $\delta_{1}(\varepsilon_{1})$ such that for $\Vert u\Vert^{(N)}<\delta_{1}(\varepsilon_{1})$,
$$\left\vert\frac{\big\Vert H_{N}^{(l)}(u)\big\Vert^{(N)}}{\Vert u\Vert^{(N)}}-c\right\vert<\varepsilon_{1}.$$
Hence for $\Vert u\Vert^{(N)}<\delta_{1}(\varepsilon_{1})$,
$$\big\Vert H_{N}^{(l)}(u)\big\Vert^{(N)}<(\vert c\vert+\varepsilon_{1})\Vert u\Vert^{(N)}.\tag{43}$$

By (38), $\lim_{\Vert x\Vert^{(N)}\to 0}\tilde f_{N}^{(l,d)}(x)/\Vert x\Vert^{(N)}=0$. Hence for each $\varepsilon_{2}>0$, there exists $\delta_{2}(\varepsilon_{2})$ such that for $0<\Vert x\Vert^{(N)}<\delta_{2}(\varepsilon_{2})$,
$$\big\vert\tilde f_{N}^{(l,d)}(x)\big\vert<\frac{\varepsilon_{2}}{\vert c\vert+1}\Vert x\Vert^{(N)}.\tag{44}$$

For each $\varepsilon$, let $\hat\varepsilon(\varepsilon)$ be sufficiently small that
$$(\vert c\vert+\hat\varepsilon(\varepsilon))\,\delta_{1}(\hat\varepsilon(\varepsilon))<\delta_{2}(\varepsilon)\tag{45}$$
and
$$\hat\varepsilon(\varepsilon)<1.\tag{46}$$
Then by (43) and (45), for $\Vert u\Vert^{(N)}<\delta_{1}(\hat\varepsilon(\varepsilon))$, $\big\Vert H_{N}^{(l)}(u)\big\Vert^{(N)}<\delta_{2}(\varepsilon)$. Therefore, in the case that $\big\Vert H_{N}^{(l)}(u)\big\Vert^{(N)}>0$, by (41) and (44),
$$\big\vert\theta_{N}^{(l,d)}(u)\big\vert=\big\vert\tilde f_{N}^{(l,d)}\big(H_{N}^{(l)}(u)\big)\big\vert<\frac{\varepsilon}{\vert c\vert+1}\big\Vert H_{N}^{(l)}(u)\big\Vert^{(N)}.$$
By (43) and (46),
$$\big\Vert H_{N}^{(l)}(u)\big\Vert^{(N)}<(\vert c\vert+\hat\varepsilon(\varepsilon))\Vert u\Vert^{(N)}<(\vert c\vert+1)\Vert u\Vert^{(N)}.$$
By the above two inequalities,
$$\frac{\big\vert\theta_{N}^{(l,d)}(u)\big\vert}{\Vert u\Vert^{(N)}}<\varepsilon.\tag{47}$$
By (37), $\tilde f_{N}^{(l,d)}(0)=0$. Therefore, in the case that $\big\Vert H_{N}^{(l)}(u)\big\Vert^{(N)}=0$, $\theta_{N}^{(l,d)}(u)=0$, and thus (47) still holds. Therefore, (42) holds. $\blacksquare$

Now we prove Theorem 2 using the preceding lemmas.

#### Proof of Theorem 2

By (30), Lemma 5, and Lemma 6, we have that
$$\begin{aligned}\mu_{N}&\leq\max_{k=1,\ldots,K_{N}}\sum_{i=1}^{K_{N}}\max_{j=1,\ldots,N}\sum_{n=1}^{N}\left\vert B_{N}^{(k,i)}(n,j)\right\vert dt_{N}\\ &=\max_{k=1,\ldots,K_{N}}\sum_{i=1}^{K_{N}}\left\Vert B_{N}^{(k,i)}\right\Vert_{1}^{(N)}dt_{N}\\ &\leq\max_{k=1,\ldots,K_{N}}K_{N}\max_{i=1,\ldots,K_{N}}\left\Vert B_{N}^{(k,i)}\right\Vert_{1}^{(N)}dt_{N}\\ &\leq T\max_{\substack{k=1,\ldots,K_{N}\\ i=1,\ldots,K_{N}}}\left\Vert B_{N}^{(k,i)}\right\Vert_{1}^{(N)},\end{aligned}$$
where $\Vert\cdot\Vert_{1}^{(N)}$ is defined by (23). Therefore, by (36) and the submultiplicativity of the induced norm,
$$\begin{aligned}\mu_{N}&\leq T\max_{\substack{k=1,\ldots,K_{N}\\ i=1,\ldots,k-1}}\left\Vert A_{N}(k-1)A_{N}(k-2)\cdots A_{N}(i)\right\Vert_{1}^{(N)}\\ &\leq T\max_{\substack{k=1,\ldots,K_{N}\\ i=1,\ldots,k-1}}\left\Vert A_{N}(k-1)\right\Vert_{1}^{(N)}\cdots\left\Vert A_{N}(i)\right\Vert_{1}^{(N)}.\end{aligned}$$
Then by Assumption T2.3, there exists $c<\infty$ such that for $N$ sufficiently large,
$$\mu_{N}\leq T(1+c\,dt_{N})^{K_{N}}.$$
As $N\to\infty$, $K_{N}\to\infty$, and
$$(1+c\,dt_{N})^{K_{N}}=\left(1+\frac{cT}{K_{N}}\right)^{K_{N}}\to e^{cT}.$$
Therefore $\{\mu_{N}\}$ is bounded. $\square$
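The final limit used above can be checked numerically: with $dt_{N}=T/K_{N}$, the factor $(1+c\,dt_{N})^{K_{N}}$ increases toward $e^{cT}$, so the bound $T(1+c\,dt_{N})^{K_{N}}$ stays bounded. A quick sketch (the values of $c$ and $T$ are arbitrary stand-ins):

```python
import math

c, T = 2.0, 1.0
# (1 + cT/K)^K for increasing K: the standard compound-growth limit.
vals = [(1 + c * T / K) ** K for K in (10, 100, 1000, 100000)]

assert all(v < math.exp(c * T) for v in vals)   # each value is below e^{cT}
assert vals[0] < vals[1] < vals[2] < vals[3]    # increasing in K
assert abs(vals[-1] - math.exp(c * T)) < 1e-3   # approaches e^{cT}
```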

### E. Proof of Theorem 3

We now prove Theorem 3 using Theorems 1 and 2.

#### Proof of Theorem 3

It follows from (5) that there exist $c_{1},c_{2}<\infty$ such that for $N$ sufficiently large and $k=0,\ldots,K_{N}-1$,
$$\begin{cases}\vert u_{N}(k,n)\vert<c_{1},&n=1\text{ or }N;\\ \vert u_{N}(k,n)\vert<c_{2}\,ds_{N},&n=2,\ldots,N-1.\end{cases}\tag{48}$$
Therefore, there exists $c_{3}<\infty$ such that for $N$ sufficiently large,
$$\max_{k=0,\ldots,K_{N}-1}\sum_{n=1}^{N}\vert u_{N}(k,n)\vert<c_{3},$$
and hence by (19), for $N$ sufficiently large,
$$\Vert u_{N}\Vert^{(N)}<c_{3}\,ds_{N}.$$
Hence Assumption T1.1 of Theorem 1 holds.

By (5), for each $N$ and for $x=[x_{1},\ldots,x_{N}]^{\top}\in[0,1]^{N}$, the $(n,m)$th component of $Df_{N}(x)$, where $n,m=1,\ldots,N$, is
$$\begin{cases}P_{l}(n)x_{n}(1-x_{n-1}),&m=n-2;\\ (1-x_{n})[P_{r}(n-1)(1-x_{n+1})-P_{l}(n+1)x_{n+1}]+P_{l}(n)x_{n}(1-x_{n-2}),&m=n-1;\\ -P_{r}(n-1)x_{n-1}(1-x_{n+1})-P_{l}(n+1)x_{n+1}(1-x_{n-1})-P_{r}(n)(1-x_{n+1})(1-x_{n+2})-P_{l}(n)(1-x_{n-1})(1-x_{n-2}),&m=n;\\ (1-x_{n})[P_{l}(n+1)(1-x_{n-1})-P_{r}(n-1)x_{n-1}]+P_{r}(n)x_{n}(1-x_{n+2}),&m=n+1;\\ P_{r}(n)x_{n}(1-x_{n+1}),&m=n+2;\\ 0,&\text{otherwise},\end{cases}$$
where $x_{n}$ with $n\leq 0$ or $n\geq N+1$ is defined to be zero. It then follows that for each $k$,
$$\Vert A_{N}(k)\Vert_{1}^{(N)}=1.\tag{49}$$
Hence Assumption T2.3 of Theorem 2 holds. We note that obtaining (48) and (49) requires tedious but elementary algebraic manipulation. One can also verify that the other assumptions of Theorems 1 and 2 hold. By Theorem 1, this completes the proof. $\square$

SECTION IV

## CONCLUSION

In this paper we analyze the convergence of a sequence of Markov chains to its continuum limit, the solution of a PDE, via a two-step procedure. We provide precise sufficient conditions for convergence and an explicit rate of convergence. Based on this convergence, we approximate the Markov chain modeling a large wireless sensor network by a nonlinear diffusion-convection PDE.

With the well-developed mathematical tools available for PDEs, this approach provides a framework to model and simulate networks with a very large number of components, for which Monte Carlo simulation is practically infeasible. Such a tool enables us to tackle problems such as performance analysis and prototyping, resource provisioning, network design, network parametric optimization, network control, network tomography, and inverse problems for very large networks. For example, we can now use the PDE model to optimize certain performance metrics (e.g., throughput) of a large network by adjusting the placement of destination nodes or the routing parameters (e.g., coefficients in the convection terms), with computational overhead that is negligible compared with performing the same task by Monte Carlo simulation.

For simplicity, we have treated sequences of grid points that are uniformly spaced. As with finite difference methods for differential equations, the convergence results can be extended to models with nonuniform point spacing, under assumptions ensuring that the points become dense in the underlying domain uniformly in the limit. For example, we could consider a double sequence of minimum point spacings $\{h_{i}\}$ and maximum point spacings $\{H_{i}\}$ with $H_{i}/h_{i}$ constant, and for each $i$, consider a model with nonhomogeneous point spacing between $h_{i}$ and $H_{i}$. We can also introduce a spatial change of variables that maps a nonuniform model to a uniform one; this changes the coefficients of the resulting PDE, by substitution and the chain rule. In this way we can extend our approach to nonuniform, even mobile, networks. We can further consider controlling the nodes so that the global characteristics of the network are invariant under node locations and mobility. (See our papers [106], [111] for details.)

The assumption made in (24) that the transmission probabilities behave continuously ensures that a limit exists as the number of nodes grows, and relates the behavior of networks with different numbers of nodes. The convergence results can be extended to the situation in which the probabilities change discontinuously on a finite number of lower-dimensional linear manifolds (e.g., points in one dimension, lines in two dimensions, planes in three dimensions), provided that all of the discrete networks under consideration have nodes on the manifolds of discontinuity.

There are other considerations regarding the network that can significantly affect the derivation of the continuum model. For example, transmissions could reach beyond immediately neighboring nodes, and interference between nodes could behave differently in the presence of power control. We can also consider boundary conditions other than sinks, including walls, semi-permeable walls, and compositions of these; and we can seek to establish continuum models for other domains, such as the Internet, cellular networks, traffic networks, and human crowds.

## Footnotes

The work of Y. Zhang and E. Chong was supported in part by the National Science Foundation (NSF) under Grant ECCS-0700559 and ONR under Grant N00014-08-1-110. The work of J. Hannig was supported in part by NSF under Grants 1007543 and 1016441. The work of D. Estep was supported in part by the Defense Threat Reduction Agency under Grant HDTRA1-09-1-0036, the Department of Energy under Grants DE-FG02-04ER25620, DE-FG02-05ER25699, DE-FC02-07ER54909, DE-SC0001724, DE-SC0005304, and INL00120133, the Lawrence Livermore National Laboratory under Grants B573139, B584647, B590495, the National Aeronautics and Space Administration under Grant NNG04GH63G, the National Institutes of Health under Grant 5R01GM096192-02, the National Science Foundation under Grants DMS-0107832, DMS-0715135, DGE-0221595003, MSPA-CSE-0434354, ECCS-0700559, DMS-1016268, and DMS-FRG-1065046, and the Idaho National Laboratory under Grants 00069249 and 00115474.

1This paper has supplementary downloadable material available at http://ieeexplore.ieee.org, provided by the authors. This includes eight multimedia AVI format movie clips, which show comparisons of network simulations and limiting PDE solutions. This material is 34.7 MB in size.

### Approximating Extremely Large Networks via Continuum Limits
