## I. INTRODUCTION

The Internet of Things (IoT) [1], [2] (including cyber-physical systems [3]) represents an evolution in computerized interconnectivity. Not only traditional computers, but also smart autonomous devices will seamlessly interconnect via machine-to-machine (M2M) technologies [4], [5], [6] to provide and consume new classes of service such as intelligent shopping, smart product management, and home automation. Such devices are expected to largely comprise physical sensors and RFID tags embedded in various hosts (e.g., vehicles, buildings, habitats, humans, utility grids, containers, garments, smart phones), enabling constant analysis of their environmental and operational states [7], [8]. Examples include real-time supply chain monitoring, remote patient monitoring, and infrastructure monitoring [9]. Further goals include using such capabilities to enable highly intelligent and responsive actuation, for example through dynamic public transit scheduling or efficient and predictive utility (electricity, water) management.

Fig. 1 shows some architectural elements representative of an IoT system. They include a collection of applications and services at the top layer that act on the collective knowledge gathered by the computerized intelligence embedded in sensors at the bottom layer. As the figure shows, these sensors may belong to various domains (e.g., different municipal agencies), all of which can be accessed by an individual IoT application (e.g., representing a wide-area traffic management or emergency response service). Between these two layers reside a number of supporting middleware services that facilitate the interaction between the applications and sensors. Such distributed services include the following: (1) an application gateway (IoT AppGW in the figure) for information collection, processing, and delivery; (2) a database (IoT DB) for information archival and query; and (3) an energy management server (IoT EMS) for adjusting the trade-off between devices’ resource consumption and provided data quality, which is especially important for energy-limited devices. While all the components in Fig. 1 require unique advancements to manifest the IoT vision, we focus our work on the challenge of balancing sensor nodes’ energy and data quality management.

FIGURE 1. The overall IoT architecture, where a middleware layer (composed of the IoT application gateway, EMS, and real-time operational database) bridges a variety of applications with the physical sensors.

There are various technical challenges in sensor energy and data quality management. A major one that drives our work involves the large-scale management of the heterogeneous devices that are expected to populate IoT systems. A great many sensor types, manufacturers, protocols, etc. are expected to co-exist, and hence any solution must be designed to operate as expected regardless of the device configuration. Regarding energy management, this motivates the need for a universal management approach that attempts to control MAC (medium access control) level energy consumption of nodes, as motivated by previous research [7], [10]. Furthermore, an efficient management scheme should minimize the transmission of control messages crossing different domains, and thus we seek a long-term optimal solution. Regarding data quality management, a universal measure of expression can be found in recent work on quality-of-information (QoI) management. Broadly speaking, QoI relates to the ability to judge whether information is fit-for-use for a particular purpose [11], [12], [13]. For the purposes of this paper, we assume that QoI is characterized by a number of attributes including accuracy, latency, and physical context (specifically, sensor coverage in this paper [11]).

To address the aforementioned challenges, we aim to design an energy management service (and supporting algorithms) that is transparent to and compatible with any lower-layer protocols and over-arching applications, while providing long-term energy efficiency under satisfactory QoI constraints. In support of our design, we first introduce the new concept of “sensor-to-task relevancy” to explicitly relate the sensing capabilities offered by a sensor (or a set thereof) to the QoI requirements of a task. Second, we use a generic information fusion function to compute the “critical covering set” of any given task when selecting the sensors to service a task over time. Third, we propose a runtime energy management framework based on these design elements to control the duty cycles of sensors in the long run, i.e., the control decision is made optimally considering long-term task usage statistics, with the service delay of each task serving as a constraint. Then, an extensive case study related to environmental monitoring is given to demonstrate the ideas and algorithms proposed in this paper, and simulations are conducted to support the performance analysis. Next, we incorporate signal propagation and processing latency into our proposal and investigate, both theoretically and experimentally, its impact on the measured delay probability. Finally, we provide some implementation guidelines to make the energy management framework more applicable under realistic scenarios. It should be noted that this is first-of-its-kind research that manages the energy usage of a variety of sensors from different domains, irrespective of how the provided sensing capabilities will be used by different applications.

The rest of this paper is organized as follows. Section II presents related research efforts. The system model, including the system flow of the proposed energy management framework, is described in Section III. The sensor-to-task relevancy and the critical covering set are introduced in Section IV, and the optimization problem of efficient energy management is formulated in Section V, where several solutions are also given and analyzed. In Section VI, the modeling of signal propagation and processing latency is thoroughly discussed and analyzed. In Section VII, a case study of environmental monitoring is explained in detail, and its simulation results are presented. Implementation guidelines are given in Section VIII. Finally, concluding remarks are drawn in Section IX. This paper largely extends [14] by introducing the delay model for all tasks in a probabilistic manner (see Section V-B), giving the explicit definition of the weight factor in the duty cycle optimization (see Section V-C), demonstrating extensive performance evaluation results (see Section VII), and adding the modeling and analysis of signal propagation and processing latency (see Section VI).

A summary of important symbols used in this paper is listed in Table 1.

Table 1. Summary of important symbols.

## II. RELATED WORK

Regarding the definition of sensor utility, [15] proposes a decentralized approach to the design of sensor networks based on utility functions, and [16] overviews the information-driven approach to sensor collaboration in ad hoc sensor networks, introducing a definition of information utility and its approximate measures. The concept of QoI has recently been proposed to judge whether retrieved information is fit-for-use for a particular task [17], [18]. Work described in [19] made further contributions to this area by proposing a QoI satisfaction index to describe the degree of QoI satisfaction that tasks receive from a wireless sensor network (WSN). [19] additionally describes a QoI network capacity metric to capture the marginal ability of the WSN to support the QoI requirements of a new task upon its arrival to the network. Based on these, the authors proposed an adaptive admission control scheme to optimally accommodate the QoI requirements of all incoming tasks by treating the whole WSN as a “black box.”

Existing work describes many different schemes for WSN node scheduling [20], [21], [22], [23], [24]. In [20], Ma et al. propose centralized and distributed algorithms to assign sensors consecutive time slots to reduce the frequency of operational state transitions. In [21], the authors describe an energy-efficient scheduling scheme for low-data-rate WSNs, where both the sensors’ energy requirements for specific operational states and the state transitions are considered. In [22], the authors propose a cross-layer framework to optimize global power consumption and balance the load in sensor networks. In [23], the authors utilize control theory to balance energy consumption and packet delivery success. In [24], the authors present novel schemes to reduce sleep latency while achieving balanced energy consumption among sensor nodes with energy harvesting capability. The authors in [25] study the QoI performance of WSN tracking applications with energy constraints, focusing on a duty-cycled network with random wake-up schedules for different nodes. In comparison, our work differs in that we explicitly consider the multi-dimensional QoI requirements of tasks and the capabilities of sensors, and use these in addition to energy to determine node duty-cycle schedules. This approach completely decouples the relations between sensors and applications, while dynamically controlling the energy consumption state of each sensor to achieve the desired QoI over the long run.

Finally, we note that there has been plenty of work on MAC-layer protocol designs for WSNs focusing on minimizing energy consumption in order to achieve a prolonged system lifetime, such as S-MAC [26] and T-MAC [27]. In contrast to such work, our proposal is a system-level management operation rather than a communications protocol; more importantly, it can also work with systems that engage in MAC-level energy conservation.

## III. SYSTEM MODEL

In this section, we present the modeling of sensors and tasks, and the overall system architecture and flow.

### A. Sensors

We consider an IoT sensory environment that comprises a collection $\mathcal {N}$ of $N$ sensors (indexed by $n\in \{1,2,\ldots , N\})$, plus a gateway (the sink). The sensors form a multi-hop network to transmit data to the gateway. Each sensor is associated with certain sensing, processing and communication capabilities. The sensing capability of a sensor represents its ability to offer a certain level of QoI to a task, but independently of any specific task. The sensing capability of sensor $n$ is described by the $K$-vector $\underline {c}_{n}\in \mathbb {R}^{K}$, whose entries include QoI attributes such as the measurement errors, latency in reporting, its coverage, etc. For each sensor $n$, the initial energy reserve is denoted as $E_{n}$, and the residual energy at time $t$ is denoted as $\bar {E}_{n}(t)$.

We assume that there are only two power consumption levels for a sensor: (a) $P^{\textrm {on}}_{n}$ when in active mode, and (b) negligibly small (relative to $P^{\textrm {on}}_{n}$) approximated with 0 when in sleeping mode.

### B. Tasks

We consider a collection $\mathcal {M}$ of $M$ tasks (indexed by $m\in \{1, 2, \ldots , M\}$). Each task represents a specific class of activities that may share a common spatial property but not temporal properties, such as starting time or duration. An instance of a task represents a single continuous period during which the task is in service. For example, “monitor the water quality at a certain location” may represent one of the $M$ tasks, while doing so between times $t_{1}$ and $t_{2}$, or $t_{3}$ and $t_{4}$, represents two instances of the same sensing task executed over two different time periods. Each task’s desired QoI is described by a $K$-vector $\underline {q}_{m}$, describing the desired accuracy, latency, coverage, etc. Note that the elements in $\underline {q}_{m}$ can be vectors as well, as a QoI requirement can be defined by more than one parameter, as illustrated in the case study in Section VII-A.

Finally, we consider tasks at a granularity at which they cannot be separated into sub-tasks. If a submitted task is a combination of different tasks, we incorporate the joint set of their QoI requirements into our framework.
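
As a concrete (hypothetical) illustration of this task model, the distinction between a task and its instances can be captured with two small containers; the names and numbers below are ours, not part of the framework:

```python
from dataclasses import dataclass, field

# Hypothetical containers for the task model above: a Task carries its
# desired QoI K-vector q_m, while each TaskInstance records one continuous
# service period of that task.
@dataclass
class TaskInstance:
    start: float      # time the instance enters service
    lifetime: float   # service duration l_{m,i}

@dataclass
class Task:
    task_id: int
    qoi_req: list                                  # desired QoI vector q_m
    instances: list = field(default_factory=list)

# "Monitor the water quality at a certain location" executed over two
# separate periods is one Task with two TaskInstance records, as in the text.
water_quality = Task(task_id=1, qoi_req=[0.95, 2.0, 100.0])
water_quality.instances += [TaskInstance(start=10.0, lifetime=5.0),
                            TaskInstance(start=30.0, lifetime=8.0)]
```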

### C. System Flow

We assume that our energy management scheme runs within the IoT EMS, interacting with both the applications and gateways of different underlying network domains, so that control signals can be computed, generated, and sent to the sensors (see Fig. 1).

We consider a discrete (or slotted) time system operation. Duty-cycling decisions are made by the EMS every $L$ time slots, which defines the duration of an $L$-slot frame in the system. The decisions are then sent to the gateways that coordinate the control operations of the corresponding sensors. For simplicity, we assume a sensor stays active for the rest of a frame after it is woken up. Unlike the sensors in $\mathcal {N}$, the gateway is assumed to have sufficient processing power and energy capacity. We assume that the frame length $L$ is far less than both the average service time of tasks and the average idle time between two consecutive instances of a task, which ensures that the probability of any task changing its status during a frame is negligible. We also assume that the service time of a task instance is known to the IoT EMS and gateway once it starts.

Fig. 2 illustrates the procedure for the proposed energy management framework during one frame, which can be summarized as follows:

1. At the sensor deployment stage, compute the critical covering sets (CCSs) of each task with information fusion algorithms, based upon the sensor capabilities and the desired QoI of the tasks (see Section IV).
2. At the beginning of a frame, the EMS makes a decision on when to activate/deactivate which sensor and for how long in the current frame based on the task model and sensor status, and sends the control message back to each sensor through its gateway (see Section V).
3. During a frame, each sensor follows its predetermined wake-up schedule without further communication with the gateway until the next frame.
FIGURE 2. System flow of the proposed energy management framework.

## IV. QOI-AWARE SENSOR-TO-TASK RELEVANCY AND CRITICAL COVERING SETS

In [17], the 5WH principle was proposed to summarize the information needs of a task and the sensing capabilities of a WSN, and in [28], the spatial relevancy of the provided information was introduced along with a way to measure it. In [29], a set of functions that model the sensing quality of each subset of nodes for each task is studied. Motivated by this prior work, we propose the relevancy of a sensor to a task as the degree to which the sensor can satisfy the task’s QoI requirements. Specifically, we define:
$$r_{nm} = f\Big (\underline {c}_{n}, \underline {q}_{m}\Big )\in [{0,1}], \quad \forall n\in \mathcal {N}, m\in \mathcal {M},$$
where $r_{nm}$ denotes the relevancy of sensor $n$ to task $m$, and $f(\cdot )$ is a generic relevancy function that takes values in [0, 1] by definition. A specific example of $f$ is given in Section VII-A.
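
To make the definition concrete, the following toy sketch shows one possible relevancy function $f$; its form (per-attribute satisfaction ratios combined by a minimum) is our own simplifying assumption, not the specific $f$ of Section VII-A:

```python
# Toy relevancy function f(c_n, q_m) in [0, 1] (an assumed form, not the
# paper's): each QoI attribute yields a satisfaction ratio capped at 1, and
# the overall relevancy is the minimum across attributes, so r_nm = 1 only
# when the sensor meets every requirement.
def relevancy(capability, requirement):
    """capability, requirement: equal-length K-vectors; larger capability
    values are assumed better (e.g., coverage radius)."""
    ratios = [min(1.0, c / q) if q > 0 else 1.0
              for c, q in zip(capability, requirement)]
    return min(ratios)

# Sensor meets the first requirement fully but only half of the second.
r = relevancy(capability=[0.9, 50.0], requirement=[0.9, 100.0])
```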

### A. Information Fusion

Some QoI requirements, like the coverage of a region, can be achieved by using a fusion algorithm (function) even if no individual sensor can meet the requirement. The authors in [28] propose to select a number of providers that cumulatively provide the most relevant information using an abstract, scalar-valued representation of QoI. While similar in principle, here we consider a more general way to accommodate a vector-valued QoI in information fusion. For ease of presentation, we use $g_{\mathcal {S}}(\cdot )$ for the generic fusion function; clearly, $g_{\mathcal {S}}$ takes a variable number of single-sensor capabilities and outputs an aggregated capability in all aspects. Note that there are a number of works on information fusion in WSNs (see [30], [31]) that can be applied as realizations of $g_{\mathcal {S}}(\cdot )$; further elaboration of $g_{\mathcal {S}}(\cdot )$ is beyond the scope of this paper. Denoting the capability of a subset $\mathcal {S}$ of sensors by $\underline {c}_{\mathcal {S}}$, we have:
$$\underline {c}_{\mathcal {S}} = g_{\mathcal {S}}\Big (\{\underline {c}_{n}| n \in \mathcal {S} \}\Big ).$$
Then, the relevancy of a subset of sensors to a task can be defined in the same way as that of a single sensor to a task, based upon their aggregated sensing capability, i.e.,
$$r_{\mathcal {S},m} = f\Big (\underline {c}_{\mathcal {S}}, \underline {q}_{m}\Big ), \quad \forall \mathcal {S} \subseteq \mathcal {N}, m \in \mathcal {M}.$$
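
As one illustrative realization of the generic $g_{\mathcal {S}}(\cdot )$, the sketch below assumes a single coverage attribute on a line segment, where the aggregated capability of a subset is the union of per-sensor coverage intervals; this is a deliberately simple stand-in for the fusion schemes of [30], [31]:

```python
# Minimal sketch of one possible g_S: fuse per-sensor coverage intervals
# into the total length covered by their union. An assumed single-attribute
# realization of the generic fusion function, for illustration only.
def fused_coverage(intervals):
    """intervals: list of (lo, hi) coverage intervals, one per sensor in S."""
    total, cur_lo, cur_hi = 0.0, None, None
    for lo, hi in sorted(intervals):
        if cur_hi is None or lo > cur_hi:   # disjoint: close the current run
            if cur_hi is not None:
                total += cur_hi - cur_lo
            cur_lo, cur_hi = lo, hi
        else:                               # overlapping: extend the run
            cur_hi = max(cur_hi, hi)
    if cur_hi is not None:
        total += cur_hi - cur_lo
    return total

# Two overlapping sensors plus a disjoint one jointly cover length 9.
c_S = fused_coverage([(0.0, 4.0), (3.0, 7.0), (10.0, 12.0)])
```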

### B. Critical Covering Set

We define the critical covering set (CCS) of a task as a set of sensors whose aggregated sensor-to-task relevancy achieves 1, and such that if the information retrieved from any of its sensors is lost, the aggregated relevancy drops below 1. The calculation of a CCS can be regarded as a set cover problem and can be solved by greedy algorithms [32]. It is worth noting that there is a nonzero probability that randomly deployed sensors cannot cover the entire area of interest. Furthermore, the desired QoI of certain tasks may be so demanding that even multiple collaborating sensors cannot satisfy it. Therefore, it is possible that a task has no CCS. In this paper, we focus on the scenario in which the density of deployed sensors is sufficient to guarantee the existence of CCSs for each task, noting that the system performance metric defined in Section V-B also fits the scenario in which no CCS exists for certain tasks. For ease of presentation, let $\mathbb {S}_{m},\forall m\in \mathcal {M}$, be the set of all CCSs for task $m$ and $\underline {\mathbf {S}} = \{\mathbb {S}_{m}\}$ the collection of all these sets.
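
Since CCS computation is cast as a greedily solved set cover problem, a sketch under a simplifying assumption (the QoI requirement is a finite set of coverage points, each sensor covering a subset of them) could look as follows; a pruning pass enforces the "critical" property that no remaining sensor is redundant:

```python
# Greedy set-cover sketch for one CCS, under the assumption that the task's
# requirement is a finite set of points and each sensor covers a subset.
def greedy_ccs(required, sensor_cover):
    """required: points the task must cover;
    sensor_cover: dict sensor_id -> set of covered points.
    Returns a critical covering set, or None if no full cover exists."""
    uncovered, chosen = set(required), []
    while uncovered:
        # pick the sensor covering the most still-uncovered points
        best = max(sensor_cover, key=lambda s: len(sensor_cover[s] & uncovered))
        if not sensor_cover[best] & uncovered:
            return None                    # requirement cannot be satisfied
        chosen.append(best)
        uncovered -= sensor_cover[best]
    # criticality: drop any sensor whose removal still leaves a full cover
    for s in list(chosen):
        others = [sensor_cover[x] for x in chosen if x != s]
        if others and set(required) <= set().union(*others):
            chosen.remove(s)
    return chosen

ccs = greedy_ccs({'a', 'b', 'c'}, {1: {'a', 'b'}, 2: {'b', 'c'}, 3: {'c'}})
```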

## V. QOI-AWARE ENERGY MANAGEMENT

As discussed earlier, in order to fully exploit energy-saving opportunities in an IoT sensory environment without sacrificing the QoI delivered to a task, both (a) the irrelevant sensors, i.e., sensors that are not relevant to any future incoming tasks, and (b) the redundant sensors, i.e., sensors that are not critical to any tasks, may be switched to the sleeping mode (OFF). In this section, we propose a framework to control the duty-cycling of these sensors based upon the task model outlined in Section III.

### A. Duty-Cycling of Sensors

The duty cycle of a sensor is defined as the fraction of time that the sensor is ON, i.e., $\Sigma _{n}(T)/T$, $\forall n\in \mathcal {N}$, where $\Sigma _{n}(T)$ is the aggregate ON time during the lifetime $T$. We express the aggregate ON time as a function of $T$ to explicitly describe its dependency on the lifetime. However, this straightforward definition does not directly reflect the energy spent while switching between the two modes. Therefore, we propose a generalized duty cycle to explicitly incorporate the extra energy penalty paid each time a sensor switches modes. Specifically, let $P^{\textrm {sw}}_{n}$ denote the energy sensor $n$ consumes each time it switches modes, $P^{\textrm {on}}_{n}$ its power consumption while awake, and $N_{n}(T)$ the number of mode switches sensor $n$ makes. Then the generalized duty cycle $\eta _{n}$ of sensor $n$ is defined as
$$\eta _{n}(T) = \frac {P^{\textrm {sw}}_{n}}{P^{\textrm {on}}_{n}} \cdot \frac {N_{n}(T)}{T} + \frac {\Sigma _{n}(T)}{T}, \quad \forall n\in \mathcal {N}.$$
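
The generalized duty cycle can be computed directly from the definition; the numbers below are assumed for illustration:

```python
# eta_n(T): the plain ON-time fraction plus a switching penalty that
# converts each of the N_n(T) mode switches into an equivalent ON time
# of P_sw / P_on.
def generalized_duty_cycle(p_sw, p_on, n_switches, on_time, lifetime):
    return (p_sw / p_on) * (n_switches / lifetime) + on_time / lifetime

# Assumed numbers: 4 switches at 0.5 J each, 1 W awake power, 20 s awake
# over a 100 s lifetime -> a 0.02 switching penalty on top of a 0.2 ON ratio.
eta = generalized_duty_cycle(p_sw=0.5, p_on=1.0, n_switches=4,
                             on_time=20.0, lifetime=100.0)
```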

The goal of energy management is to minimize the generalized duty cycle in an IoT sensory environment, without sacrificing the QoI levels attained. At the beginning of each frame, the IoT EMS informs the gateway of the decisions on when sensors switch modes in the current frame. Let $\underline {A}(t) = \{a_{n}(t)\}, ~0 \leq a_{n}(t) \leq L, ~n \in \mathcal {N}$, denote the mode switching times of sensors in the frame following the decision at time $t$. When $a_{n}(t) < L$, the $n$-th sensor will switch mode at time $t + a_{n}(t)$, and when $a_{n}(t) = L$, it will not switch mode in this frame. Clearly, the cardinality of the decision space of $\underline {A}(t)$ is $L^{N}$.

### B. Delay Model for Tasks

In practice, when the gateway sends wake-up signals to the corresponding sensors, they may not receive them immediately or be activated exactly at the scheduled time, mainly due to signal transmission and processing latency. Moreover, even if this latency is small enough to be neglected, it is likely that no active CCS of a task exists when a task instance starts. The task may therefore have to wait until the next frame, when the EMS informs the gateway to wake up a CCS for its service, as shown in Fig. 3. To this end, we introduce the delay model for all tasks as follows.

FIGURE 3. An illustrative example of service delay under the proposed task model, where four tasks follow different arrival and departure statistics; the maximum allowed delay tolerance is shown for a single instance. Tasks 1, 2, and 4 have two instances each, and task 3 has only one instance.

We denote by $d_{m, i}$ the “attained” service delay of task $m$ for its $i$-th instance. Note that this service delay may be tolerable, depending on the type of requesting application (e.g., a few seconds of delay in reporting water quality levels is most likely tolerable). We denote by $D_{m}$ the maximum tolerable delay for any instance of task $m$, defined as a fraction of the lifetime of the task instance:
$$D_{m}=\tau _{m} l_{m,i}, \quad \forall m \in \mathcal {M}, ~i\in \mathbb {N}^{+},$$
where $l_{m,i}$ is the lifetime of task $m$’s $i$-th instance, and $\tau _{m}\in [{0,1}]$ is the delay tolerance of task $m$. Clearly, a smaller $\tau _{m}$ represents a more stringent delay requirement. Then, let $\zeta _{m,i}$ indicate whether the system fails to satisfy the delay requirement of task $m$’s $i$-th instance:
$$\zeta _{m,i} \triangleq \begin{cases} 0, & \textrm {if }0 \leq d_{m,i} \leq \tau _{m} l_{m,i} \\ 1, & \textrm {if }d_{m,i} > \tau _{m} l_{m,i}, \end{cases}$$
$\forall m \in \mathcal {M}, ~i\in \mathbb {N}^{+}$. Define the number of instances of task $m$ as $I_{m}$. Considering the overall service delay for a task, its average measured delay failure probability is defined as:
$$\zeta _{m} = \frac {1}{I_{m}}\sum _{i=1}^{I_{m}} \zeta _{m,i}, \quad \forall m\in \mathcal {M}, ~I_{m}\in \mathbb {N}^{+},$$
and if the “attained” delay failure probability is smaller than a threshold $\xi _{m}$, we say its delay requirement is successfully satisfied:
$$\zeta _{m} \leq \xi _{m}, \quad \forall m\in \mathcal {M}.$$

Hence, we have introduced the delay model for all tasks in a probabilistic manner with the task-specified parameters $\xi _{m}$ (delay failure threshold) and $\tau _{m}$ (delay tolerance).
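
The delay model translates directly into a few lines of code: per-instance indicators $\zeta_{m,i}$ are averaged into $\zeta_m$ and compared against $\xi_m$. All inputs below are assumed numbers:

```python
# Sketch of the probabilistic delay model: flag every instance whose
# attained delay d_{m,i} exceeds tau_m * l_{m,i}, average the flags into
# zeta_m, and check the result against the failure threshold xi_m.
def delay_satisfied(delays, lifetimes, tau, xi):
    """delays[i], lifetimes[i]: attained delay and lifetime of instance i."""
    failures = [1 if d > tau * l else 0 for d, l in zip(delays, lifetimes)]
    zeta_m = sum(failures) / len(failures)
    return zeta_m, zeta_m <= xi

# Two of four instances exceed the tolerance tau * l = 1.0 time unit,
# so the measured failure probability 0.5 violates the threshold 0.3.
zeta, ok = delay_satisfied(delays=[0.2, 1.5, 0.0, 3.0],
                           lifetimes=[10.0] * 4, tau=0.1, xi=0.3)
```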

### C. Problem Formulation

At the beginning of each frame, the EMS informs the gateway of the decisions made on the energy consumption state of each sensor $n\in \mathcal {N}$, i.e., which set of sensors should be woken up for task service in the current frame, and which set of sensors is allowed to be turned OFF, given the historical task evolution and sensor activity information, which we denote by $\underline {\mathbf {H}}(t)$. For ease of presentation, we use $\Lambda$ to denote a generic task evolution model without specifying the mathematical details. Therefore, a decision policy $\nu$ is defined as a mapping from $\underline {\mathbf {H}}(t)$ to $\underline {A}(t)$, given the known task model and the pre-determined CCS information:
$$\underline {A}(t) = \nu (\underline {\mathbf {H}}(t)|\Lambda , \underline {\mathbf {S}}).$$
The goal of the EMS algorithm is to find the optimal decision policy $\nu ^{*}$ that optimizes the sensor duty-cycles under the delay failure thresholds of the tasks. We propose a performance metric to describe the system performance, and then formulate the corresponding optimization problem.

#### 1) Minimize Weighted Average Duty Cycle

As the active task changes over time, some sensors may be excessively depleted, which can greatly decrease the system lifetime. This model aims at minimizing the average duty cycle of the entire IoT sensory environment, while taking energy consumption fairness into consideration. The optimization problem is:
\begin{align} &\mathop {\textrm {minimize:}}_{\nu }~~ \overline {\eta } = \sum _{n\in \mathcal {N}}\beta _{n} \eta _{n}, \notag \\ &\textrm {subject to:}~~ \zeta _{m} \leq \xi _{m}, \quad \forall m\in \mathcal {M}, \end{align}
where $\beta _{n} \in [{0,1}]$ are weight factors with $\sum _{n\in \mathcal {N}} \beta _{n}=1$.

### D. A Greedy Algorithm

The optimization problem in (10) is generally NP-hard, and its optimal solution is difficult to find without an exhaustive search. In this work, we propose a greedy algorithm for the optimization problem (10). The algorithm is greedy in that at any decision point, it chooses the action that leads to the least marginal increment in $\overline {\eta }$.

We explicitly define the weight factors in (10) as a function of the normalized residual energy, i.e., the ratio between the remaining energy $\bar {E}_{n}(t)$ and the total energy reserve $E_{n}$, to achieve a degree of energy fairness among all sensors:
\begin{align} \kappa _{n}(t)=\exp \bigg (-\frac {\bar {E}_{n}(t)}{E_{n}}\bigg ), \quad \beta _{n}(t)=\frac {\kappa _{n}(t)}{\sum ^{N}_{n=1}\kappa _{n}(t)}, \quad \forall n\in \mathcal {N}. \end{align}
The exponential mapping is chosen so that $\beta _{n}(t)$ decreases as $\bar {E}_{n}(t)$ increases: sensors with less residual energy are assigned higher weights and are therefore less likely to be utilized at the decision moment. Consequently, a certain level of energy consumption fairness is achieved, and the network lifetime, defined as the time when the first sensor depletes its energy, is prolonged.
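
The weight computation can be sketched as follows; the residual and initial energies are assumed numbers:

```python
import math

# beta_n(t) from the definition above: kappa_n = exp(-E_res / E_init),
# normalized across sensors, so sensors with less residual energy
# receive larger weights in the objective.
def weight_factors(residual, initial):
    kappa = [math.exp(-r / e) for r, e in zip(residual, initial)]
    total = sum(kappa)
    return [k / total for k in kappa]

# The nearly depleted sensor (10% left) is weighted above the fresh one,
# steering the minimization away from using it.
beta = weight_factors(residual=[10.0, 90.0], initial=[100.0, 100.0])
```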

For ease of presentation, suppose the system starts at $t=0$. Denote by $t=iL$ the beginning of the $i$-th frame, where $i\in \mathbb {N}$, and by $\eta _{n}(t)$ the runtime generalized duty cycle of sensor $n$ up to time $t$, which can be updated recursively by
\begin{align} \eta _{n}(t)=&\frac {1}{t}\bigg [\frac {P^{\textrm {sw}}_{n}}{P^{\textrm {on}}_{n}}\Big (N_{n}(t)-N_{n}(t-L)\Big ) +\Big (\Sigma _{n}(t)-\Sigma _{n}(t-L)\Big ) \notag \\ &\qquad +\,\eta _{n}(t-L)\cdot (t-L)\bigg ], \end{align}
$t = iL, i\in \mathbb {N}^{+}$, with $\eta _{n}(0)$ defined to be zero. Note that $N_{n}(t) - N_{n}(t-L)$ and $\Sigma _{n}(t)-\Sigma _{n}(t-L)$ are the number of state switches and the aggregate ON time between time $t-L$ and time $t$ for sensor $n$, respectively. We define the marginal increase in the normalized energy consumption of sensor $n$ between time $t$ and $t+L$ as:
\begin{align} \Delta _{n}(t) \triangleq \frac {P^{\textrm {sw}}_{n}}{P^{\textrm {on}}_{n}}\Big (N_{n}(t+L) - N_{n}(t)\Big )+\Big (\Sigma _{n}(t+L)-\Sigma _{n}(t)\Big ), \end{align}
and clearly, we have:
$$\eta _{n}(t)=\frac {1}{t} \Big ( \Delta _{n}(t-L)+ (t-L) \cdot \eta _{n}(t-L)\Big ), \quad \forall n\in \mathcal {N}.$$
Further, define the weighted average marginal increase in the normalized energy consumption of all sensors between time $t$ and $t+L$ as
$$\overline {\Delta }(t) \triangleq \sum _{n\in \mathcal {N}}\beta _{n}(t)\Delta _{n}(t).$$
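
The per-frame bookkeeping above can be sketched as follows (the frame length and per-frame counts are assumed numbers):

```python
# Delta_n(t): marginal switch-penalized energy increment over one frame,
# and the recursive duty-cycle update that rebuilds eta_n(t) from
# eta_n(t - L) and Delta_n(t - L).
def marginal_increment(p_sw, p_on, d_switches, d_on_time):
    """d_switches, d_on_time: switches and ON time accrued within one frame."""
    return (p_sw / p_on) * d_switches + d_on_time

def update_duty_cycle(eta_prev, delta_prev, t, L):
    # eta_n(t) = (Delta_n(t - L) + (t - L) * eta_n(t - L)) / t, eta_n(0) = 0
    return (delta_prev + (t - L) * eta_prev) / t

L = 10.0
# Frame [0, L): 2 switches at 0.5 J (1 W awake power) plus 4 s of ON time.
delta0 = marginal_increment(p_sw=0.5, p_on=1.0, d_switches=2, d_on_time=4.0)
eta_L = update_duty_cycle(eta_prev=0.0, delta_prev=delta0, t=L, L=L)
```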

At the $i$-th decision point, EMS needs to predict the task activities in the current frame and prepare the sensors accordingly. Rather than a global algorithm that minimizes $\overline {\eta }$ throughout the lifetime of the IoT sensory environment, we specify an algorithm that minimizes $\overline {\Delta }(t)$ at each decision point while satisfactorily guaranteeing the service delay requirement, i.e., the delay failure probability.

Specifically, define $\Phi _{m,i}(t)$ as the probability that the $i$-th instance of task $m$ starts exactly at time $t$, and $\Psi _{m}(t)$ as the “preparation” probability that at least one CCS of task $m$ is active at time $t$. Then, we can rewrite the measured delay failure probability in (7) as the sum of the probabilities that no CCS of a task is prepared, conditioned on the task appearance:
$$\zeta _{m} = \sum _{t=0}^{T} \Phi _{m,i}(t) \Big (1- \Psi _{m}(t)\Big ), \quad \forall m\in \mathcal {M},$$
where $T$ denotes the task lifetime. Note that $\Phi _{m,i}(t)$ is zero almost everywhere: $\Phi _{m,i}(t)=0$ if (a) $t$ is not a task transition time, or (b) fewer than $i-1$ or more than $i+1$ instances of task $m$ have occurred. In other words, $\Phi _{m,i}(t)$ takes a non-zero value only at a task transition time at which task $m$ is expecting its $i$-th instance. Therefore, the above summation is easy to compute.

The task transition is modeled as a (discrete) semi-Markov process. A semi-Markov process is a stochastic process which moves from one state to another, with the successive states visited forming a Markov chain, and with the process staying in a given state for a random length of time (the holding time). The state space of a semi-Markov process is countable, and the distribution function of the holding times may depend on the current state as well as on the one to be visited next [33]. When modeling the task evolution by a semi-Markov model, the tasks are treated as the states. The behavior of the tasks can be summarized as follows:

• There is at most one task in service at any time slot. In the main text, we only consider the existence of task instances; we discuss the inclusion of an “idle task” in Section VIII;
• The service time of a task is known at the time it starts.

We denote by $\mathbf {P}=\{p_{k,m}\}$ the task transition matrix, where $p_{k,m}$ is the transition probability from task $k$ to task $m$. We also assume that $\mathbf {P}$ is known a priori to the gateway. In reality, $\mathbf {P}$ can be estimated from the task evolution history by the EM algorithm [34], which finds maximum likelihood or maximum a posteriori (MAP) estimates of parameters in statistical models in an iterative manner. The input of the EM algorithm is the observed data from the task evolution history, consisting of the time of occurrence and departure of each task instance. More details are given in Section VIII.
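
When the task history is fully observed, a plain maximum-likelihood count of consecutive task pairs already yields an estimate of $\mathbf{P}$; the EM machinery of [34] is typically needed when observations are incomplete. A sketch of the counting estimator:

```python
from collections import Counter

# Maximum-likelihood estimate of the task transition matrix P from a fully
# observed task evolution history (0-based task ids in arrival order).
def estimate_transitions(history, num_tasks):
    counts = Counter(zip(history, history[1:]))   # consecutive task pairs
    P = [[0.0] * num_tasks for _ in range(num_tasks)]
    for k in range(num_tasks):
        row_total = sum(counts[(k, m)] for m in range(num_tasks))
        for m in range(num_tasks):
            P[k][m] = counts[(k, m)] / row_total if row_total else 0.0
    return P

# Assumed history: task 0 is followed by task 1 twice and by task 2 once.
P = estimate_transitions([0, 1, 0, 1, 1, 0, 2], num_tasks=3)
```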

#### Lemma 5.1

For any given task $m$, its delay requirement is satisfied if the probability that any instance of it fails to meet the delay tolerance $\tau _{m}$ is upper-bounded by $\frac {\xi _{m}}{\pi _{m}}$, where $\xi _{m}$ denotes the delay failure threshold and $\pi _{m}$ denotes the steady-state probability of task $m$.

#### Proof

See Appendix A.

In our model, the EMS knows exactly when the current task ends and the next task starts, whereas which specific task succeeds the current one is uncertain. If there is no task transition in a frame, the system only needs to keep awake the CCS of the current task that leads to the least $\overline {\Delta }(t)$ and set the other sensors to sleep. Otherwise, if a task transition is bound to happen in a frame, the system has to wake up the corresponding sensors in preparation for all possible succeeding tasks under their specific delay failure thresholds. Suppose the current task will end during a frame and a new task will start at time $t^{\prime }$, where $t^{\prime } \in {[}iL, (i+1)L{)}$. Our greedy algorithm at the $i$-th decision point is the solution to the following optimization problem:
\begin{align} &\mathop {\textrm {minimize:}}_{\nu } ~~\overline {\Delta }(iL) \notag \\ &\textrm {subject to:} ~~ P_{m}(t^{\prime })\Big (1 - \Psi _{m}(t^{\prime })\Big ) \leq \xi _{m}, \quad \forall m\in \mathcal {M}, \end{align}
where $\overline {\Delta }(iL)$ is defined in (13) and (15), and $P_{m}(t^{\prime })$ is the transition probability from the current task to task $m$.

To solve the above optimization problem, the constraint can be rewritten as
$$\Psi _{m}(t^{\prime }) \geq 1 - \min \bigg \{1, \frac {\xi _{m}}{P_{m}(t^{\prime })}\bigg \}, \tag{18}$$
which gives a way of computing the preparation probability from the task transition model and the delay failure requirement. Essentially, (18) specifies the minimum required probability that a CCS exists for each task at the task transition time $t^{\prime }$. The EMS can therefore determine whether to wake up a CCS for each task according to $\Psi _{m}(t^{\prime })$. The collective decisions for all tasks can be made either jointly or separately: since CCSs of different tasks may overlap, a global optimum is achieved only by a joint decision, but this induces additional computational complexity, especially when $M$ is large. Therefore, as another “degree of greediness”, we let the EMS make the decision separately for each task. After deciding which tasks to prepare for, the EMS chooses and schedules the wake-up times of a subset of sensors that covers the selected group of tasks while inducing the minimum increase in the marginal energy consumption $\overline {\Delta }(iL)$. In the $(i+1)$-th frame, the task that follows the previous one is already known, and all irrelevant sensors prepared for the possible occurrence of other tasks can be sent to sleep.

It is worth noting that signal propagation and processing latency affects the control decision operation; we investigate this in Section VI. For now, we assume that once the decision is made at the EMS and forwarded to the gateway, the latter can control all corresponding sensors immediately, i.e., the wake-up time of the scheduled sensors is the task transition time $t^{\prime }$. If the selected sensors can cover the next arriving task right after the completion of the current task, the non-critical sensors are shut down at the start of the next frame $(i+1)L$. However, if the predicted sensors are incapable of covering the task that actually arrives, the gateway sends a new wake-up signal at time $(i+1)L$ and activates the sensors of the CCS that induces the minimum increase in the marginal energy consumption $\overline {\Delta }((i+1)L)$; this is possible because the gateway already knows the identity of this task at time $(i+1)L$. In the latter case, the service delay is easily seen to equal $(i+1)L-t^{\prime }$.

The algorithm can be summarized in the following steps:

1. At the beginning of each frame, shut down any sensor that is not critical to the current task, if there exist such sensors;
2. If no task transition is bound to happen in the current frame, keep the current status of each sensor until the next frame;
3. If a task transition is bound to happen in the current frame, for each task, compute the minimum required probability of existence of a CCS based on the delay failure threshold by (18), and determine (by random tests) whether to make preparation for that task according to the derived probability. At the time of task transition, wake up a subset of sensors that critically covers all the tasks to be prepared for, yet induces the minimum increase in the marginal energy consumption.
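As an illustration, the per-frame control decision above can be sketched in Python. The helpers `ccs_table` (task → list of CCSs as frozensets of sensor ids), `marginal_cost`, and the containers `P` and `xi` are hypothetical stand-ins for the quantities defined in the text, so this is a sketch rather than the paper's implementation:

```python
import random

def frame_decision(current_task, t_prime, P, ccs_table, marginal_cost, xi):
    """One greedy decision at a frame boundary (illustrative sketch).

    ccs_table maps each task to its list of CCSs (frozensets of sensor ids),
    marginal_cost(s) scores the marginal energy cost of waking the set s,
    P is the task transition matrix, and xi holds per-task delay failure
    thresholds. All of these names are hypothetical placeholders.
    """
    if t_prime is None:
        # No transition this frame: keep only the cheapest CCS of the current task
        return min(ccs_table[current_task], key=marginal_cost)
    awake = set()
    for m, ccs_list in ccs_table.items():
        p_m = P[current_task][m]           # transition probability to task m
        if p_m == 0.0:
            continue
        psi = 1.0 - min(1.0, xi[m] / p_m)  # minimum preparation probability, eq. (18)
        if random.random() < psi:          # random test: prepare for task m
            # Wake the CCS whose addition is cheapest given already-woken sensors
            best = min(ccs_list, key=lambda s: marginal_cost(s | awake))
            awake |= best
    return awake
```

Note that the decision is made per task (the third "degree of greediness"); a joint variant would search over combinations of CCSs instead.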

The algorithm is greedy in three aspects:

1. The algorithm satisfies the delay failure threshold every time task transitions happen;
2. The algorithm minimizes the marginal increase in energy consumption at every decision point;
3. The algorithm decides whether to prepare for the possible occurrence of each task separately.

As discussed earlier, we can revise the greedy algorithm so that it decides whether to prepare for the possible occurrence of each task jointly. While the computational complexity of the original algorithm grows linearly with $M$, that of the revised greedy algorithm grows exponentially with $M$, making it far more time consuming. To show the potential improvement, we demonstrate the results for both greedy algorithms in Section VII.

SECTION VI

## MODELING THE SIGNAL PROPAGATION AND PROCESSING LATENCY

In practice, because of the signal propagation and processing latency introduced by MAC protocols and routing algorithms, the selected CCS members cannot be woken up immediately after the control decision is made. Furthermore, many applications do not require immediate task service, but allow a certain service delay after the specified start time. Towards this end, in this section we refine our system model by explicitly considering the signal propagation and processing latency, modeled as a wake-up delay between the control decision and the moment when all CCS members have been successfully informed. In other words, the model considers the longest signal propagation delay from the EMS to a CCS member, denoted as a period of $\omega L$, $\omega \in \mathbb {Z}$. We also assume that once the wake-up signal has been sent, it cannot be revoked. We provide a thorough theoretical analysis of the new system model, coupled with experimental results on the impact of this signaling latency on the average measured delay failure probability.

### A. Model Description and Problem Formulation

Without loss of generality, we relabel the appearance sequence of all task instances sequentially by index $k=1,2,\ldots$, as shown in Fig. 4. The figure also gives a localized view of a specific time period, where the $k$-th instance among all tasks arrives at time $t^{\prime } \in (iL, (i+1)L)$, $i\in \mathbb {N}^{+}$, with lifetime $l_{k}$. Further, let the random variable $x$ denote the interval between $iL$ and $t^{\prime }$; then $t^{\prime }=\sum _{j=1}^{k-1} l_{j}=iL+x$.

FIGURE 4. An illustrative example of service delay with signaling latency.

After receiving the control decision from the EMS, the gateway sends a wake-up signal to the corresponding sensors at decision time $iL$. If at least one CCS of instance $k$ exists at time $t^{\prime }$, the CCS members will be woken up at time $(i+\omega )L$, i.e., after this control decision propagates to all sensors within $\omega L$ frames. The service delay is then
$$(i+\omega )L-t^{\prime }=(i+\omega )L-(iL+x)=\omega L-x. \tag{19}$$
However, if no CCS of instance $k$ exists at time $t^{\prime }$, then, knowing exactly which task will appear at the next decision time $(i+1)L$, the gateway sends a new wake-up signal that takes effect at time $(i+1+\omega )L$, i.e., after the signal propagation and processing latency $(1+\omega )L$ from the current time. In this case, the service delay is
$$(i+1+\omega ) L-t^{\prime }=(\omega +1) L-x. \tag{20}$$
Note that it is necessary to ensure $(\omega +1)L-x<l_{k}$, $\forall k\in \mathbb {N}^{+}$, since task instance $k$ must be handled before its termination. Consequently, $\omega <\lfloor l_{\textrm {min}}/L \rfloor -1$.
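The two delay cases (19) and (20) can be captured by a small helper. This is an illustrative sketch under the timing model above, not the paper's code:

```python
def service_delay(i, omega, L, x, ccs_ready):
    """Service delay for an instance arriving at t' = i*L + x (cf. Fig. 4).

    If a CCS was prepared, its members wake at (i+omega)*L, eq. (19);
    otherwise a corrective wake-up at (i+1)*L takes effect at
    (i+1+omega)*L, eq. (20).
    """
    t_prime = i * L + x
    if ccs_ready:
        return (i + omega) * L - t_prime       # = omega*L - x
    return (i + 1 + omega) * L - t_prime       # = (omega+1)*L - x
```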

We can compute the probability that the service delay incurred at any instance exceeds the specified delay tolerance $\tau _{m} l_{k}$, conditioned on whether a CCS is prepared for service or not, as
\begin{align} P_{m}(t^{\prime })\Psi _{m}(t^{\prime })\textrm {Pr}\left \{\omega L-x>\tau _{m} l_{k}\right \} \qquad \qquad \notag \\ {}+P_{m}(t^{\prime })\left (1-\Psi _{m}(t^{\prime })\right )\textrm {Pr}\left \{(\omega +1)L-x>\tau _{m} l_{k}\right \}\leq \xi _{m}, \tag{21} \end{align}
$\forall m\in \mathcal {M}$. It is worth noting that (21) exactly characterizes the constraint of our optimization problem in (17) and, more importantly, provides a way to theoretically derive $\textrm {Pr}\{\zeta _{m,i}=1\}$.

#### Theorem 6.1

Given the service time distribution in the previous assumptions, $\Psi _{m}(t^{\prime })$ in (21) satisfies
$$\Psi _{m}(t^{\prime })\geq \frac {F(\omega +1)-\frac {\xi _{m}}{P_{m}(t^{\prime })}}{F(\omega +1)-F(\omega )}, \tag{22}$$
where
$$F(\omega )=1-\frac {\tau _{m}}{\mu L}\exp \bigg (-\frac {\omega \mu L}{\tau _{m}}+\mu l_{\textrm {min}}\bigg )\bigg (\exp \bigg (\frac {\mu L}{\tau _{m}}\bigg )-1\bigg ). \tag{23}$$

#### Proof

See Appendix B.

We next discuss the feasibility of Theorem 6.1. Since $F(\omega )$ monotonically increases with $\omega$, $F(\omega +1)>F(\omega )$ always holds. Since $\Psi _{m}(t^{\prime })$ denotes the preparation probability that at least one CCS of task $m$ exists at time $t^{\prime }$, by definition it is a scalar between 0 and 1. Therefore, it requires
$$F(\omega ) \leq \frac {\xi _{m}}{P_{m}(t^{\prime })}\leq F(\omega +1). \tag{24}$$
However, since $\xi _{m}\in [0,1)$ is the maximum allowed delay failure probability specified by the application, it has no relation to the transition probability $P_{m}(t^{\prime })$ from the current task to task $m$; hence (24) may not always hold. Recall that $F(\omega )$ represents the probability that the incurred service delay exceeds the specified tolerance, i.e., $F(\omega )\triangleq \textrm {Pr}\{\omega L-x>\tau _{m} l_{k}\}$. When $\frac {\xi _{m}}{P_{m}(t^{\prime })}>F(\omega +1)$, we set $\Psi _{m}(t^{\prime }) = 0$: if the maximum allowed delay failure is high enough or the task arrival probability is low enough, there is no need to prepare sensors for the task. Conversely, when $\frac {\xi _{m}}{P_{m}(t^{\prime })}<F(\omega )$, we set $\Psi _{m}(t^{\prime })=1$, indicating that if the maximum allowed delay failure is very low or a task instance is very likely to arrive at time $t^{\prime }$, it is necessary to prepare sensors for service.
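Putting (22)–(24) and the two clamping rules together, the minimum preparation probability can be computed as follows. This is a sketch; the parameter values used in the test are for illustration only:

```python
import math

def F(omega, tau, mu, L, l_min):
    """Delay outage probability, eq. (23)."""
    return 1.0 - (tau / (mu * L)) * math.exp(-omega * mu * L / tau + mu * l_min) \
        * (math.exp(mu * L / tau) - 1.0)

def preparation_prob(omega, tau, mu, L, l_min, xi, p):
    """Minimum preparation probability Psi_m(t') per (22), clamped to [0, 1]
    according to the feasibility discussion of (24)."""
    r = xi / p
    lo, hi = F(omega, tau, mu, L, l_min), F(omega + 1, tau, mu, L, l_min)
    if r > hi:      # generous delay budget or rare task: no preparation needed
        return 0.0
    if r < lo:      # stringent requirement or likely task: always prepare
        return 1.0
    return (hi - r) / (hi - lo)
```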

In summary, with the constraint in (17) replaced by the new constraint (22) under this realistic delay model, the objective function in (17) and the proposed greedy algorithms still provide a sub-optimal solution. All other steps in Section V-D apply.

### B. Satisfactory Region of Delay Tolerance

The value of $F$ denotes the theoretically derived probability that the service delay exceeds the maximum tolerable threshold. Fig. 5 shows the value of $F$ with respect to (w.r.t.) different specified delay tolerance values, varying $\omega$ and $\mu L$. Consistent with the previous analysis, $F$ monotonically increases with $\omega$, since more severe signaling latency results in a higher delay outage probability. It can also be seen from the figure that a higher delay tolerance $\tau _{m}$ leads to a lower delay outage probability, and that this probability increases with $\mu L$, which characterizes the ratio of the frame size to the average duration of a task instance. A higher $\mu L$ (i.e., a larger frame size or a shorter instance duration) tightens the effective delay constraint imposed by tasks, and thus a higher delay outage is expected.

FIGURE 5. $F$ vs. delay tolerance threshold, varying parameters $\omega =3, 7$ and $\mu L=0.002, 0.003$.

As analyzed in Lemma 5.1, given task $m$, in order to achieve its delay requirement, the probability of delay failure at any of its instances should be upper-bounded by $\xi _{m}/\pi _{m}$. Based on our analysis of signaling latency in (21), we rewrite the steady-state form of Lemma 5.1 as
$$\Psi _{m} F(\omega )+\left (1-\Psi _{m} \right )F(\omega +1) \leq \frac {\xi _{m}}{\pi _{m}}, \tag{25}$$
where $\Psi _{m}$ denotes the corresponding preparation probability under the steady state of task transitions. Since $F(\omega )<F(\omega +1)$, we relax the constraint in (25) to
$$F(\omega ) < \frac {\xi _{m}}{\pi _{m}}, \quad \forall m\in \mathcal {M}. \tag{26}$$
Since $F$ is a non-increasing function of the delay tolerance $\tau _{m}$, it is interesting to observe that $\tau _{m}$ cannot be chosen arbitrarily: it is tightly coupled with the system parameters $L, \omega , \mathbf {P}_{m}$ and the task parameters $\mu , l_{\textrm {min}}$. In Fig. 5, we visualize condition (26), which eventually crosses all curves of different parameters. We call the region of $\tau _{m}$ that satisfies (26) its satisfactory region. Given those parameters, the system thus has its own feasible working range, beyond which higher system settings (such as $L$) must be configured. Deriving this lower-bounded region requires solving a transcendental equation, so numerical methods such as Newton's method are needed.
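As a sketch of such a numerical solution, the boundary of the satisfactory region can be located by solving $F(\omega ;\tau _{m})=\xi _{m}/\pi _{m}$ for $\tau _{m}$. The text suggests Newton's method; here a simple bisection search is used instead for robustness, exploiting that $F$ is monotonically decreasing in $\tau _{m}$ (the bracket endpoints are assumptions):

```python
import math

def F(omega, tau, mu, L, l_min):
    """Delay outage probability, eq. (23)."""
    return 1.0 - (tau / (mu * L)) * math.exp(-omega * mu * L / tau + mu * l_min) \
        * (math.exp(mu * L / tau) - 1.0)

def tau_boundary(omega, mu, L, l_min, target, lo=1e-4, hi=1.0):
    """Boundary of the satisfactory region: solve F(omega; tau) = target,
    with target = xi_m / pi_m, by bisection on an assumed bracket [lo, hi]."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if F(omega, mid, mu, L, l_min) > target:
            lo = mid    # outage still above target: tolerance must grow
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Tolerances above the returned value satisfy (26); below it, the delay requirement cannot be met under the given $L$, $\omega$, and task parameters.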

SECTION VII

## PERFORMANCE EVALUATION

In this section, we illustrate our methodology with a monitoring IoT application, such as using (randomly deployed) multi-functional sensors with a certain sensing range to measure the water quality, temperature, air pollution, or humidity at certain observation points. We present the system-pertinent solutions and the environment settings, and show the results.

### A. System Model and Simulation Setup

In our environmental monitoring system, each sensor with a certain monitoring capability is randomly deployed, and its spatial coverage follows the classic disk model. We assume that the sensory data within the sensing region is corrupted by noise during measurement and/or transmission. Fig. 6 shows an illustrative example of the sensing/monitoring graph for the sensor and task deployment, where $N=15$ sensors are deployed in a $600\times 600$ square area of unit distance to obtain $M=4$ different types of monitoring data at 4 specific locations (red squares), and a gateway is placed to collect the data from the sensors.

FIGURE 6. An illustrative example of the IoT application, where 15 sensors with certain sensing ranges (black dots) are randomly deployed to monitor the water quality, temperature, air pollution, or humidity of four locations (red squares), and a gateway is placed to collect the data.

In this example, we consider measurement accuracy and service delay (both with multiple metrics) as two QoI requirements. For the former, we define its probabilistic model as
$$\textrm {Pr}\Big \{|Z_{m}(t) - z_{m}|\geq \delta _{m} \Big \} \leq \epsilon _{m}, \quad \forall m\in \mathcal {M}, \tag{27}$$
where the random variable $Z_{m}(t)$ is the fused, sensor-retrieved information for task $m$ at time $t$, and $z_{m}$ is the actual but unknown information, i.e., the ground truth. Analogously to the desired QoI functions in [28], we define $\underline {q}_{m}$ as
$$\underline {q}_{m} = \Big \{Y_{m}, \left (\delta _{m}, \epsilon _{m}\right ) \Big \}, \quad \forall m \in \mathcal {M}, \tag{28}$$
where $Y_{m}$ and $\{\delta _{m}, \epsilon _{m}\}$ are the geographical location and accuracy requirement of task $m$, respectively. For service delay, tasks specify the required delay tolerance threshold $\tau _{m}$ and delay failure probability $\xi _{m}$, $\forall m\in \mathcal {M}$, as shown in Section V-B.

On the other hand, the sensing capability of sensor $n$, i.e., $\underline {c}_{n}$, can be defined as
$$\underline {c}_{n} =\Big \{ (X_{n}, r_{n}), \gamma _{n}\Big \}, \quad \forall n\in \mathcal {N}, \tag{29}$$
where $X_{n}$ is the location of the sensor and $r_{n}$ is its sensing radius. We model the measurement noise as additive white Gaussian noise (AWGN) with variance $\gamma _{n}$ for sensor $n$.

A sensor-to-task relevancy function for this model is
\begin{align} f(\underline {c}_{n}, \underline {q}_{m})=& f(X_{n}, r_{n}, \gamma _{n}, Y_{m}, \delta _{m},\epsilon _{m}) \notag \\ =& {1}\{\mbox {dist}(X_{n}, Y_{m}) \leq r_{n}\}\cdot \min \bigg \{ \frac {\epsilon _{m}}{\textrm {Pr}\{|W_{n}(t) - z_{m}| \geq \delta _{m} \}}, 1\bigg \} \notag \\ =& {1}\{\mbox {dist}(X_{n}, Y_{m})\leq r_{n}\}\cdot \min \bigg \{\frac {\epsilon _{m}}{2Q(\frac {\delta _{m}}{\sqrt {\gamma _{n}}})}, 1\bigg \}, \tag{30} \end{align}
$\forall n \in \mathcal {N}, m \in \mathcal {M}$, where ${1}\{{statement}\}$ is the indicator function that takes value 1 if the statement is true and 0 otherwise, dist($X_{n}, Y_{m}$) is the Euclidean distance between two points, the random variable $W_{n}(t)$ is the information retrieved from sensor $n$ at time $t$, and $Q(\cdot )$ is the tail probability of the standard normal distribution.
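Under the AWGN model, the relevancy (30) can be evaluated with the standard-normal tail function $Q(x)=\tfrac12\,\mathrm{erfc}(x/\sqrt 2)$. The following sketch assumes the noise and indicator semantics stated above; the function and argument names are illustrative:

```python
import math

def q_func(x):
    """Tail probability of the standard normal distribution, Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def relevancy(xn, yn, r, gamma, ym_x, ym_y, delta, eps):
    """Sensor-to-task relevancy per (30) (illustrative sketch).

    (xn, yn): sensor location, r: sensing radius, gamma: AWGN variance;
    (ym_x, ym_y): task location, (delta, eps): accuracy requirement.
    """
    if math.hypot(xn - ym_x, yn - ym_y) > r:   # task outside sensing range
        return 0.0
    # Pr{|W_n - z_m| >= delta} = 2*Q(delta/sqrt(gamma)) for AWGN noise
    return min(eps / (2.0 * q_func(delta / math.sqrt(gamma))), 1.0)
```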

If task $m$ is serviced solely by sensor $n$, then $Z_{m}(t) = W_{n}(t)$; otherwise, if it is serviced by a subset $\mathcal {S}$ of sensors, then $Z_{m}(t) = W_{\mathcal {S}}(t)$, where $W_{\mathcal {S}}(t)$ is the information fused from the subset $\mathcal {S}$. One possible information fusion algorithm over the relevant sensors is
$$W_{\mathcal {S}}(t)= \arg \min _{w} \frac {1}{|\mathcal {S}|} \sum _{n \in \mathcal {S}}\frac {1}{\gamma _{n}}\big |W_{n}(t) - w\big |^{2}= \frac {\sum _{n \in \mathcal {S}} \frac {W_{n}(t)}{\gamma _{n}}}{\sum _{n \in \mathcal {S}} \frac {1}{\gamma _{n}}}. \tag{31}$$
The right-hand side of (31) is a specific example of the fusion function $g_{\mathcal {S}}(\cdot )$ defined in Section IV-A. In particular, if all $\gamma _{n}$ are equal and $W_{n}(t) \sim \mathsf {N}(1, \gamma )$, the fused information of a group of $K$ relevant sensors is the average of the individual measurements and $W_{\mathcal {S}}(t) \sim \mathsf {N}(1, \gamma /K)$, where $\mathsf {N}(\mu , \sigma ^{2})$ is a Gaussian distribution with mean $\mu$ and variance $\sigma ^{2}$. Based on this fusion algorithm, the CCSs of every task can be computed during the sensor deployment stage and used in the online duty-cycling control.
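A minimal sketch of the inverse-variance fusion rule (31):

```python
def fuse(measurements, variances):
    """Inverse-variance weighted fusion of relevant sensors, eq. (31).
    Each measurement is weighted by the reciprocal of its noise variance."""
    num = sum(w / g for w, g in zip(measurements, variances))
    den = sum(1.0 / g for g in variances)
    return num / den
```

With equal variances this reduces to the plain average; a very noisy sensor contributes almost nothing to the fused estimate.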

Our numerical results are based on the environmental monitoring system discussed above and are obtained in MATLAB. The capabilities (excluding the sensing radii, which are illustrated in Fig. 6) of all sensors are $\gamma _{n}=3$, $\forall n\in \mathcal {N}$. Moreover, $P^{\textrm {sw}}_{n} = 5$ and $P^{\textrm {on}}_{n} = 1$, $\forall n\in \mathcal {N}$, and the initial energy reserve of each sensor is set to 20,000 units. We assume that with the predetermined working power, each sensor fully covers its sensing area. For all tasks in $\mathcal {M}$, the desired QoI satisfies $\epsilon _{m} = 0.1$, $\delta _{m} = 1$, $\tau _{m}=0$ (i.e., delay-sensitive applications). We assume that the service time of each task follows an identical exponential distribution with average duration $1/\mu =50$ time slots and minimum duration $l_{\textrm {min}}=25$ time slots (both sufficiently longer than the frame size $L$); thus the arrival of all tasks is a Poisson process. A total of 1,000 task instances are simulated. The task transition matrix is given by
$$\mathbf {P} = \begin{pmatrix} 0 &\quad 1/10 &\quad 2/5 &\quad 1/2 \\ 1/5 &\quad 0 &\quad 3/5 &\quad 1/5 \\ 1/3 &\quad 1/3 &\quad 0 &\quad 1/3 \\ 4/5 &\quad 1/10 &\quad 1/10 &\quad 0 \end{pmatrix}. \tag{32}$$
The sensor-to-task relevancy and the CCSs can therefore be computed offline; there are 10 candidate CCSs each for tasks 1, 2, and 4, and 20 CCSs for task 3, and each CCS consists of three sensors.
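The stationary distribution of this transition matrix, used later in the steady-state analysis, can be checked numerically by power iteration. A sketch (the chain here is irreducible and aperiodic, so the iteration converges):

```python
def stationary(P, iters=200):
    """Steady-state row vector pi with pi*P = pi, by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

# Task transition matrix from (32)
P = [[0.0, 0.1, 0.4, 0.5],
     [0.2, 0.0, 0.6, 0.2],
     [1 / 3, 1 / 3, 0.0, 1 / 3],
     [0.8, 0.1, 0.1, 0.0]]
```

The result matches the steady state $(0.33, 0.14, 0.25, 0.28)$ reported in Section VII-B.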

We consider the optimization problem in (10) with the energy-aware weight factor $\beta _{n}$ in (11), and the solution is obtained by using the proposed greedy algorithms outlined in Section V-D.

### B. Simulation Results

In Fig. 7, we arbitrarily pick five sensors and plot the evolution of their duty cycles over time. We set the system parameters as stated above with $\xi _{m}=0.04$, $\forall m \in \mathcal {M}$, and $L=20$ time slots. It can be seen from the figure that after a few fluctuations at the very beginning, when the sensors are trading off their energy consumption against the QoI provided to tasks, the duty cycle of each sensor soon converges. This is because our proposed greedy algorithm successfully selects the best set of sensors for service under the stochastic, but Markovian, task transitions, and the weight factors accurately capture the energy consumption state of all sensors and guarantee a degree of fairness among them.

FIGURE 7. Sensor duty cycle vs. time.

Next, we show the impact of two system parameters, the frame size $L$ and the sensor mode-switching power $P^{\textrm {sw}}_{n}$, on the average duty cycle of all deployed sensors, in Fig. 8 and Fig. 9, respectively. The delay failure threshold $\xi _{m}$ is set equal for all tasks and varied between 0 and 0.1. We observe that for fixed $L$ and $P^{\textrm {sw}}_{n}$, the average measured duty cycle decreases linearly as the required delay failure threshold increases: a higher $\xi _{m}$ relaxes the service delay requirement by allowing a certain fraction of task instances to fail, and in turn the sensors are allowed to spend more time in the sleeping mode. For fixed $\xi _{m}$, the duty cycle increases with $L$ and $P^{\textrm {sw}}_{n}$. A larger $L$ means less frequent system control decisions, and thus, in order to provide satisfactory service to the next task, the system tends to wake up more sensors than necessary. These unnecessary sensors stay awake until the next decision point, when they can be turned OFF by the EMS; clearly, the wasted ON time of sensors increases linearly with the frame length. Furthermore, a larger $P^{\textrm {sw}}_{n}$ implies a higher penalty for switching modes, making the control more reluctant to switch. The system decisions thus favor sensors that are already in the ON state, letting them continue servicing other tasks that may never appear. Consequently, the energy consumption of all sensors is not optimally allocated, resulting in both a larger average duty cycle and a larger variance.

FIGURE 8. Average duty cycle vs. delay failure threshold, by changing frame size $L=\{5,10,15,20\}$.
FIGURE 9. Average duty cycle vs. delay failure threshold, by changing switching power $P^{\textrm {sw}}_{n}=\{2,4,6,8\}$.

Similar trends are found when simulating the impact of the task accuracy requirement $\epsilon _{m}$, as shown in Fig. 10. As $\epsilon _{m}$ increases, i.e., as the QoI requirement becomes less stringent and allows larger measurement errors, the CCS of a task may involve fewer sensors for service, in turn reducing the average duty cycle.

FIGURE 10. Average duty cycle vs. delay failure threshold, by changing the task accuracy requirement $\epsilon _{m}=\{0.06,0.07,0.10,0.12\}$.

Then, we investigate the impact of the system parameter $L$ on the measured delay failure probability for different delay failure thresholds, i.e., whether the required delay parameters are successfully guaranteed by the greedy algorithm. Table 2 shows the result: the measured delay failure is satisfactorily below the delay failure threshold and, as expected, increases with the threshold. Interestingly, under the same delay failure threshold, the difference between the measured results for different frame sizes $L$ is insignificant. This is explained by our setting $\tau _{m}=0$, $\forall m\in \mathcal {M}$, i.e., we only consider delay-sensitive applications; as long as the prepared sensors are incapable of servicing the arriving task at the task transition time, a delay failure event is counted, irrespective of how large the frame size $L$ is (or when the next decision time will be, even if the sensors are well-prepared by then). We investigate the impact of the signal propagation and processing latency and the frame size $L$ on delay-tolerant applications at the end of this section.

Table 2. Average measured delay failure w.r.t. different frame sizes and delay failure thresholds.

Fig. 11 illustrates the energy depletion process for $\xi _{m}=0.04$ and $L=20$, with the other parameters as before. Besides the proposed greedy algorithm and its revised version (i.e., considering the CCSs of all tasks jointly rather than treating them separately), we also show the result of the optimal solution, where the EMS knows exactly which task succeeds the current one. The slope of a curve in the figure represents the energy depletion rate. We observe that the revised greedy algorithm achieves only slightly better performance than the basic greedy algorithm, at the expense of greater computational complexity. A gap can also be identified between the greedy algorithm and the genie-aided optimal solution, due to the potential error in predicting the next arriving task. Furthermore, we plot the energy depletion process for two extremes: the least used (sensor 12) and most used (sensor 5) sensors under the proposed greedy algorithm. As sensor 12 is located in the border area with a limited sensing range, it can only cover task 2, which is also covered by many other sensors (e.g., 2, 4, 5, and 15), and is thus least frequently used. Meanwhile, sensor 5 is located at the center with a relatively large sensing radius that allows it to service all four tasks, and is thus used most frequently. Nevertheless, the energy-aware weight factor in (11), which explicitly takes the residual energy of a sensor into consideration, helps to lower the variance between these two extremes so that a certain degree of energy consumption fairness is achieved, as shown in the following.

FIGURE 11. Normalized remaining energy vs. time.

We investigate the energy consumption fairness, quantified by Jain's fairness index,2 under the proposed greedy algorithm, as shown in Fig. 12. We compare the proposed energy-aware weight assignment with the equal weight assignment, i.e., $\beta _{n}=1/N$, $\forall n \in \mathcal {N}$. Clearly, for a fixed number of sensors $N$, Jain's index under the energy-aware approach is higher than under the equal assignment. Furthermore, when more nodes are deployed in a fixed geographic area, the increased node density helps to achieve better fairness among them, since any single task can potentially be serviced by more CCSs. This trend does not hold for the equal weight setting: the diversity gain cannot be exploited when all sensors receive the same weight irrespective of their different amounts of residual energy, and in turn the fairness level decreases with the number of sensors.
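Jain's fairness index from footnote 2 can be computed on the per-sensor energy consumptions. In this sketch, `consumed` is taken to be the consumed energy $E_{n}-\bar E_{n}$ of each sensor (our interpretation of the footnote):

```python
def jain_index(consumed):
    """Jain's fairness index on per-sensor energy consumption (footnote 2).
    Returns 1/N when one sensor does all the work, 1.0 when all are equal."""
    n = len(consumed)
    s = sum(consumed)
    return (s * s) / (n * sum(e * e for e in consumed))
```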

FIGURE 12. Fairness index (Jain’s) on energy consumption among all sensors.

To investigate the impact of propagation delay, we use the same task transition matrix $\mathbf {P}$ as in Section VII-A and set $\xi _{m}=0.1$, $\forall m\in \mathcal {M}$. A total of 1,000 task instances are simulated, with average duration $1/\mu =2000$ and minimum duration $l_{\textrm {min}}=25$ time slots. The other parameters follow the basic setting in Section VII-A.

According to the proof of Lemma 5.1, the steady state $\boldsymbol {\pi }$ is obtained as $(0.33, 0.14, 0.25, 0.28)$, where each element denotes the stationary probability of a specific task. Fig. 13 shows the simulation result for the average measured delay failure among all tasks w.r.t. different delay tolerance thresholds. It can be observed that $\omega =3$ successfully guarantees the required $\xi _{m}=0.1$ for all $\tau _{m}\in [0.02,0.1]$, whereas only part of the $\tau _{m}$ range satisfies the same requirement when $\omega =7$, consistent with our analysis in Section VI-B. Furthermore, a smaller $\mu L$ decreases the probability of delay failure occurrences, either through more frequent control decisions (smaller $L$) or longer instance durations (larger $1/\mu$), equivalently widening the satisfactory region of delay tolerance for a given $\xi$ and $\mathbf {P}$.

FIGURE 13. Average measured delay failure probability vs. delay tolerance threshold.
SECTION VIII

## IMPLEMENTATION GUIDELINES

We made a few simplifications and assumptions in Section III to ease the analysis, some of which suggest implementation guidelines for practice.

First, we assume that the CCSs are known a priori to the EMS. This is usually realized during the deployment stage, where the application owner deploys a certain number of devices to the specified network domain with known geographical locations. The offline computation can then be performed given the desired task locations and requirements.

Second, the computation of the task transition matrix requires advanced algorithms such as the EM algorithm [34], which uses an iterative procedure to compute the maximum likelihood estimate of a set of parameters of a given distribution (from empirical data). To apply it in our framework, EM needs the observed data from the task evolution history, including the start and end times of each task instance, from which the transition times between tasks can be derived. The EM algorithm then approximates the parameters of the given distribution, as well as its expected values. Note that these expected values represent the average number of transitions between each pair of tasks, whose normalized values are a “noisy” version of the hidden, true entries of the transition matrix $\mathbf {P}$.
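When the task sequence is observed directly and without noise, the EM procedure reduces to a simple count-based maximum likelihood estimate of $\mathbf {P}$. The following sketch illustrates that special case rather than the full EM algorithm of [34]:

```python
from collections import Counter

def estimate_transition_matrix(task_history, num_tasks):
    """Count-based MLE of the transition matrix from an observed task
    sequence (a stand-in for EM, which handles noisy/incomplete data)."""
    counts = Counter(zip(task_history, task_history[1:]))
    P = [[0.0] * num_tasks for _ in range(num_tasks)]
    for k in range(num_tasks):
        total = sum(counts[(k, m)] for m in range(num_tasks))
        if total:
            for m in range(num_tasks):
                P[k][m] = counts[(k, m)] / total  # normalized transition counts
    return P
```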

SECTION IX

## CONCLUSION

In this paper, a system-level energy management framework is proposed to provide a satisfactory QoI experience in IoT sensory environments. In contrast to past efforts, our proposal is transparent to and compatible with the lower-layer protocols in use, and preserves long-run energy efficiency without sacrificing any attained QoI levels. Specifically, we introduced the new concept of QoI-aware “sensor-to-task relevancy” to explicitly relate the sensing capabilities offered by a sensor to the QoI requirements of a task, and the novel concept of the “critical covering set” of a given task to guide the selection of sensors servicing that task over time. Energy management decisions are then made dynamically at runtime, optimizing for long-term traffic statistics under the service delay constraint. An extensive case study of environmental monitoring with sensor networks demonstrates the proposed ideas and algorithms, and simulations show their performance. To make the framework more applicable in realistic scenarios, we further incorporated the signal propagation and processing latency into the system model, and showed both theoretically and experimentally its impact on the average measured delay failure probability. Finally, based on our system model assumptions, we put forward some practical implementation guidelines and discussed the applicability of our proposal.

APPENDIX A

## PROOF OF LEMMA 5.1

Define $\boldsymbol {\pi }$ as the steady state of the transition matrix $\mathbf {P}$ of the Markov chain that models the task evolution. Due to the structure of the task model, all possible states of the Markov chain can mutually visit each other. There thus exists a steady state $\boldsymbol {\pi }=(\pi _{1}, \pi _{2}, \ldots , \pi _{M})$ for all $M$ tasks such that $\boldsymbol {\pi }\mathbf {P} = \boldsymbol {\pi }$ and $\sum _{m=1}^{M}\pi _{m}=1$; this $\boldsymbol {\pi }$ can be found by solving the corresponding set of linear equations. Then, given any task $m$, its average measured delay failure probability over all instances in (7) can be rewritten as
\begin{align} \zeta _{m} =& \lim _{I_{m}\rightarrow \infty }\frac {1}{I_{m}}\sum _{i=1}^{I_{m}} \zeta _{m,i} = \mathbb {E}\left [\zeta _{m,i}\right ]\notag \\ =&1\cdot \textrm {Pr}\{\zeta _{m,i}=1\}\pi _{m}+0\cdot \textrm {Pr}\{\zeta _{m,i}=0\}\pi _{m}\notag \\ =&\textrm {Pr}\{\zeta _{m,i}=1\}\pi _{m}, \quad \forall m\in \mathcal {M}. \tag{33} \end{align}
Therefore, for satisfactory delay performance under parameter $\xi _{m}$, we rewrite $\zeta _{m} \leq \xi _{m}$ as
$$\textrm {Pr}\{\zeta _{m,i}=1\} \leq \frac {\xi _{m}}{\pi _{m}}, \quad \forall m\in \mathcal {M}, ~i\in \mathbb {N}^{+}. \tag{34}$$

APPENDIX B

## PROOF OF THEOREM 6.1

Since the service time of each task follows an identical exponential distribution, the total number of instance occurrences over $(0,t]$ has a Poisson distribution, and the occurrences are distributed uniformly on any interval of time. Therefore, the random variable $x$ shown in Fig. 4 follows a uniform distribution on $(0,L)$. As $l_{k}$ is exponentially distributed with average $1/\mu$ and lower bound $l_{\textrm {min}}$, its probability density function is
$$f_{l}(l)= \begin{cases} \mu \exp (-\mu l+\mu l_{\textrm {min}}), & l>l_{\textrm {min}}, \\ 0, & \textrm {otherwise}. \end{cases} \tag{35}$$
From (21), we have
$$\Psi _{m}(t^{\prime })\geq \frac {\textrm {Pr}\{(\omega +1)L-x>\tau _{m} l_{k}\}-\frac {\xi _{m}}{P_{m}(t^{\prime })}}{\textrm {Pr}\{(\omega +1)L-x>\tau _{m} l_{k}\}-\textrm {Pr}\{\omega L-x>\tau _{m} l_{k}\}}. \tag{36}$$
Let $F(\omega )$ denote the probability that the incurred service delay exceeds the specified tolerance, i.e., $F(\omega )\triangleq \textrm {Pr}\{\omega L-x>\tau _{m} l_{k}\}$; then
\begin{align} F(\omega )=&\textrm {Pr}\Big \{l_{k}<\frac {\omega L-x}{\tau _{m}}\Big \}\notag \\ =&\int _{0}^{L} \frac {1}{L} \int _{l_{\textrm {min}}}^{\frac {\omega L-x}{\tau _{m}}}\mu \exp (-\mu l_{k}+\mu l_{\textrm {min}})\, dl_{k}\, dx\notag \\ =&1-\frac {\tau _{m}}{\mu L}\exp \bigg (-\frac {\omega \mu L}{\tau _{m}}+\mu l_{\textrm {min}}\bigg )\bigg (\exp \bigg (\frac {\mu L}{\tau _{m}}\bigg )-1\bigg ). \tag{37} \end{align}
Hence, substituting (37) back into (36), we obtain the closed-form expression for the bound on $\Psi _{m}(t^{\prime })$ in (22).

## Footnotes

This work was supported by the National Natural Science Foundation of China under Grant 61300179.

Corresponding Author: C. H. Liu

1The expectation maximization (EM) algorithms will be executed within the energy management server (EMS).

2It is defined as $\big (\sum _{n} (E_{n}-\bar {E}_{n})\big )^{2}/\big (N \sum _{n} (E_{n}-\bar {E}_{n})^{2}\big )$, $n\in \mathcal {N}$. The result ranges from $\frac {1}{N}$ (worst case) to 1 (best case); the larger the index, the better the fairness.
