Optimization of Web Service-Based Data-Collection System With Smart Sensor Nodes for Balance Between Network Traffic and Sensing Accuracy

Abstract-Web services integrate various components in the Internet of Things (IoT). In a Web service-based data-collection system with multiple smart sensor nodes periodically sampling and estimating the same unknown physical parameter of interest, a practical way to arrive at a minimum error estimate (MEE) is for the smart sensor nodes to first submit their estimates to the Web server and for the server to then pick the one with the minimum error. More submissions provide the Web server with more candidates to consider, which raises the probability that the server obtains the MEE, while also leading to more network traffic. Therefore, how to make the optimal tradeoff between network traffic and sensing accuracy arises as an interesting problem. This article proposes a network traffic-dependent probability threshold policy within an intended underlying optimization-theoretical framework to address this problem. The policy is such that the smart sensor nodes submit their estimates and corresponding estimation errors (ECEEs) to the Web server within a tolerable network traffic threshold while maximizing the probability of the server delivering the MEE. Theoretical analysis, simulation, and field experiments document and illustrate its performance.

Note to Practitioners-This article addresses the tradeoff between sensing accuracy and network traffic demand in Web service-based data-collection systems that operate in remote areas with limited network traffic. It helps to improve the operation efficiency of Internet-of-Things (IoT) systems that employ Web service technology by enabling the Web server to deliver the minimum error estimate with maximum probability while keeping the network traffic within a given range. Our simulation and experimental investigations show that the solution developed here outperforms existing solutions.

Index Terms-Internet of Things (IoT), multiple smart sensor nodes, network traffic, sensing accuracy, web service.

NOMENCLATURE
MEE: Minimum error estimate.
ECEE: Estimate and corresponding estimation error.
$s_i$: $i$th smart sensor node.
$n$: Number of all smart sensor nodes.
$N$: $\{s_1, s_2, \ldots, s_n\}$, the set of all smart sensor nodes.
1: An action, submit the estimate to the Web server.
2: An action, do not submit the estimate to the Web server.
$P_\pi$: Cumulative probability under $\pi$.
$C_\pi$: Cumulative network traffic consumed under $\pi$.
$Q$: Available network traffic in a periodic cycle.
$e_i$: Estimation error of $s_i$.
$\delta_i$: Indicates whether the server obtains the MEE by $s_i$ taking $a_i$.
$C_i(a_i)$: Network traffic consumed by $s_i$ taking $a_i$.
$c$: Average network traffic consumed by an estimate-submission without data block.
$\varepsilon$: Average variation of network traffic consumed by an estimate-submission per additional estimate submitted to the Web server.
$\zeta_i$: Represents that $s_i$ is the $i$th node submitting the estimate.
$\pi^*$: Optimal policy in $G$, i.e., the optimal policy for each sensor node to determine whether to submit its estimate to the Web server in order to maximize the probability of obtaining the minimum error estimate while maintaining the server network traffic at the required level.
$r$: Number of smart sensor nodes that must take 1.
$P(r)$: $= P_\pi$ ($P(r)$ is used in the Appendix).
$C(r)$: $= C_\pi$ ($C(r)$ is used in the Appendix).
$r^*$: $= \arg\max_{r \in \{1, 2, \ldots, n\}} P(r)$, the optimal $r$, i.e., the number of sensor nodes that must submit their estimates to the server under the optimal policy.

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

I. INTRODUCTION
With the capability to enable communication between machines via the World Wide Web, Web services have spread into various Internet-of-Things (IoT) applications to bridge different components, such as Web servers and smart sensor nodes [1]-[4], where the latter differ from traditional sensor nodes by having the intelligence to make decisions [5].
The structure of a Web service-based data-collection system with multiple smart sensor nodes is shown in Fig. 1, in which the nodes: 1) are independent of, and cannot communicate with, each other; 2) periodically sample and estimate the same unknown physical parameter of interest; 3) obtain their ECEEs and submit the ECEEs to the Web server according to some criteria; and 4) have no knowledge of each other's ECEE.
The Web server works according to the following mechanism [6], [7]. To initiate a transmission control protocol (TCP) connection with the Web client, the Web server must listen on a network port for the client request. If the TCP connection is successfully built, and the Web client request is made, then a response with several hundred bytes must be issued from the Web server to the client no matter whether this request is successful or not. If successful, then a data packet with the required information will be given; otherwise, a detailed reason for the failure will be included in the response.
The smart sensor nodes do not know and cannot obtain each other's estimates. A practical method for the Web server to obtain the MEE is therefore to have all the nodes first submit their estimates and then let the server pick the one with the minimum error [8]. However, each submission consumes network traffic, and the network resource may be too limited to support all the submissions. Therefore, new policies are needed to balance network traffic and sensing accuracy in environments where the available network resource is not sufficient for every node to issue a data packet to the Web server. Section II provides a review of the related literature.

Fig. 1. Structure of a Web service-based data-collection system with multiple smart sensor nodes (for each sensor node, its estimate and corresponding estimation error are private to it before reaching the Web server).
To achieve the abovementioned balance, this article attempts to find an effective policy that maximizes the probability of the Web server obtaining the MEE while keeping the cumulative network traffic consumed no larger than a given threshold, based on the six conditions listed as follows.
1) Each smart sensor node can obtain an estimate, which is private to the node before reaching the Web server.
2) The sensor nodes have no knowledge of each other's estimates or of whether the other nodes have submitted their estimates, and they cannot obtain this information from the Web server or by mutual communication.
3) To let the Web server obtain the MEE, some (at least one) estimate-submissions from the nodes to the Web server must happen, and the MEE will be selected by the Web server from among all the estimates submitted (the Web server can recognize the relative MEE).
4) Each smart sensor node will submit its estimate only if it considers that it has a smaller estimation error than all the previous nodes that have submitted their estimates.
5) Each smart sensor node has a node number that is arbitrary and exclusive (it is private to the node).
6) Other necessary resources, such as power, are sufficient for each sensor node to make its decision and submission.

Accordingly, this article makes the following two contributions.
1) An optimization-theoretical framework is established for the Web service-based data-collection system with multiple smart sensor nodes.
2) We develop a network traffic-dependent probability threshold policy for the smart sensor nodes to determine whether to submit their estimates to the Web server in order to maximize the probability of the server obtaining the MEE while maintaining the required level of network traffic.

The rest of this article is organized as follows. We will review the relevant literature in Section II, state and formulate the problem in Section III, present the algorithms in Section IV, carry out algorithm analysis in Section V, perform experiments in Section VI, and draw conclusions in Section VII.

II. LITERATURE REVIEW
Web services have been employed in sensor networks, where they play a role similar to that of a sink node but with more powerful functions [9]-[12]. However, resource constraints imposed by practical application conditions often prohibit the use of Web service technologies [13], [14]. To overcome this difficulty, the mutual communication of smart sensor nodes has been exploited to optimally allocate limited resources to multiple nodes [15]. For example, when the network traffic is limited, some smart sensor nodes are selected as proxies, and they first receive the information from the other nodes and then filter out the dummy information that wastes network traffic [16]. However, mutual communication between smart sensor nodes is not allowed everywhere [17], which drives us to focus on the design of strategies for smart sensor nodes to interact with the Web server under the condition that the network resource is constrained and the sensor nodes cannot exchange information.
The Web service-based data-collection system with multiple smart sensor nodes aims to increase the accuracy of estimates of the physical parameters of interest. Receiving more estimates from the smart sensor nodes allows the Web server to report with greater sensing accuracy. However, this creates challenges when the network traffic available to the smart sensor nodes is not sufficient to support every estimate-submission [18]. Hence, many studies have been carried out over the past decades. Based on the structure of the methods, we roughly divide them into three classes: probability-dependent threshold algorithm (PDTA) [19], network traffic exhaustive algorithm (NTEA) [20], and Web service request-assistant algorithm (WSRA) [21].
PDTA commands the smart sensor nodes to submit their estimates to the Web server with a fixed probability [19]. For example, Beyah and Cai [22] proposed a probabilistic network model in which the data of smart sensor nodes is transmitted probabilistically. Liu et al. [23] developed a data-collection scheme for the sensor networks to conduct data transmission probabilistically. NTEA requires the network traffic to support as many estimate-submissions as possible until exhausted [20]. For example, Kyusakov et al. [24] embedded simple object access protocol (SOAP)-based Web services into smart sensor nodes to enable them to use as much network traffic as possible. Glombitza et al. [25] implemented Web services on smart sensor nodes without restrictions on the use of network traffic supplied. WSRA lets the smart sensor nodes acquire knowledge of the relative MEE by requesting from the Web server [21]. For example, Han et al. [26] made use of the service-oriented architecture (SOA) to establish a building automation system with multiple smart sensor nodes that capture the needed information from the Web server. Khan et al. [27] comprehensively classified the mobility sink-dependent data collection and dissemination policies into three policies in all of which the mobility sink plays the role of the Web server and provides the needed information for the other smart sensor nodes. However, PDTA does not ensure that the server can obtain at least one estimate; NTEA always leads to waste of network traffic, and WSRA's feedback from the server to the smart sensor nodes must consume network traffic [28], [29]. Therefore, an effective strategy in practice to make the optimal tradeoff between network traffic and sensing accuracy is called for, with the available network traffic constrained.
To balance network traffic and sensing accuracy in a network traffic-constrained environment for the Web service-based data-collection system in which the smart sensor nodes cannot exchange information, the noncooperative game framework, in which each smart sensor node wants to maximize its utility function, seems to be a feasible one, as researched in our previous work [30]. However, what is solved in this article is the probability of the Web server achieving the MEE at the required level of network traffic. This probability is the sum of the utility functions of all smart sensor nodes, so the problem is a single-criterion optimization problem rather than the maximization of individual utility functions as in the standard noncooperative game with the Nash equilibrium [31]. Therefore, the noncooperative game aspect has to be eliminated. Instead, the intended underlying optimization framework aiming at a single-criterion optimization problem [32] is adopted. Accordingly, the main contribution is the formulation of the problem of smart sensor nodes determining whether to submit their estimates to the Web server, under network traffic constraints, as an optimization problem, and the further development of a technique to capture the optimal tradeoff between network traffic and sensing accuracy.

A. Problem Statement and Main Challenges
The data-collection system in Fig. 1 works periodically, as shown in Fig. 2. Time is divided into intervals of fixed length (we call each interval a periodic cycle and assume that it is chosen small enough so that the unknown physical parameter of interest does not change significantly within a cycle). In each cycle: 1) at the beginning of the cycle, the smart sensor nodes start to sample and estimate the unknown physical parameter; 2) when the periodic cycle ends, the nodes submit their estimates to the Web server according to some criterion (see footnote 3); and 3) the server picks the estimate with the minimum error from all the estimates submitted and regards it as the MEE. The procedure repeats in the next cycle [33].
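Step 3) of the cycle, in which the server keeps the submitted estimate with the smallest estimation error, can be illustrated with a minimal Python sketch. The `Submission` record and the `pick_mee` helper are hypothetical names introduced only for this illustration; the article does not prescribe a server-side data structure.

```python
from typing import NamedTuple, Sequence


class Submission(NamedTuple):
    """One ECEE received by the server (hypothetical record layout)."""
    node_id: int     # illustrative identifier of the submitting node
    estimate: float  # estimate of the physical parameter
    error: float     # corresponding estimation error


def pick_mee(submissions: Sequence[Submission]) -> Submission:
    """Return the submitted estimate with the minimum estimation error."""
    if not submissions:
        # Condition 3) requires at least one estimate-submission per cycle.
        raise ValueError("at least one estimate must be submitted")
    return min(submissions, key=lambda s: s.error)
```

The server can only return the relative MEE, i.e., the best estimate among those actually submitted, which is why the submission policy of the nodes determines whether the true MEE is obtained.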
Each smart sensor node can obtain an estimate, but does not know and cannot obtain the others' estimates. The estimates in different sensor nodes differ from each other (see footnote 4) and can successfully reach the Web server if they are submitted. Having all the sensor nodes submit their estimates to the Web server and then letting the server pick the one with the minimum error leads to unnecessary network traffic. For example, for the two sensor nodes $s_1$ and $s_2$ shown in Fig. 3, if $s_2$ has a smaller estimation error than $s_1$ and has submitted its estimate to the Web server, then it is unnecessary for $s_1$ to submit its estimate, because that estimate cannot help the Web server to decrease the estimation error, while its submission still consumes network traffic. Unfortunately, $s_1$ has no knowledge of $s_2$'s estimate or of whether $s_2$ has submitted its estimate, and it cannot acquire this information by asking $s_2$ or requesting it from the Web server, because $s_2$ and $s_1$ cannot mutually communicate, and the Web server mechanism shows that the server cannot actively send this information to $s_1$. Therefore, $s_1$ cannot compare its estimate with that of $s_2$ before making the decision. With this in mind, the problem this article would like to solve can be stated as follows, with the precise formulation given in Section III-B as an optimization problem.

Problem [Estimate-Submission Decision Problem (ESDP)]:
To design a policy for each smart sensor node to determine whether to submit its estimate to the Web server in order to maximize the probability of the server obtaining the MEE while maintaining the network traffic at the required level.
Accordingly, the process of ESDP can be described as follows: in each periodic cycle, when the data-sampling is stopped, each smart sensor node obtains an estimate and submits its estimate only if it believes that it has a smaller estimation error than the others submitting their estimates ahead of it.
The main challenge is that each smart sensor node, when making its decision, has no information on the others' estimates or on whether these estimates have been submitted to the Web server, and it cannot obtain this information from the server or by mutual communication.

B. Problem Formulation
Nomenclature lists the notations used in the rest of this article.
The action $a_i$ of each $s_i \in N$ is either 1 or 2, denoting, respectively, that $s_i$ submits or does not submit its estimate to the Web server. Then, since $P\{a_i = 1\} \in [0, 1]$, the policy space $G$ of $\pi$ can be given by [equation missing in the source]. In a periodic cycle, let $P_\pi$ denote the cumulative probability of the Web server obtaining the MEE under $\pi$ and $C_\pi$ denote the cumulative network traffic consumed under $\pi$. Now, we are ready to state ESDP as an optimization problem that can be solved by programming as

$$\pi^* = \arg\max_{\pi \in G} P_\pi \quad \text{subject to} \quad C_\pi \le Q \tag{4}$$

where $Q$ is the network traffic available in a periodic cycle, which is not sufficient to support all the smart sensor nodes $s_1, s_2, \ldots, s_n$ submitting their estimates. In the following, we will show that the formulation of ESDP in (4) can be written in terms of the parameters in (1).
Denote by $e_1, e_2, \ldots, e_n$ the estimation errors of $s_1, s_2, \ldots, s_n$; then, we have

$$\mathrm{MEE} = \min(e_1, e_2, \ldots, e_n) \tag{5}$$

and, recalling footnote 3, $s_1, s_2, \ldots, s_n$ are identical to each other except for their estimates, so we have [equation missing in the source]. Let $\delta_i$ be the indicator function of whether the Web server achieves $\min(e_1, e_2, \ldots, e_n)$ with $s_i$ taking $a_i$, that is,

$$\delta_i = \begin{cases} 1, & \text{if the Web server achieves } \min(e_1, e_2, \ldots, e_n) \text{ with } s_i \text{ taking } a_i \\ 0, & \text{otherwise.} \end{cases} \tag{7}$$

Then, by (7) and the definition of 1 and 2, we have [equation missing in the source], and further, by the definition of $U_i(a_i)$, we have [equation missing in the source].

Let constant $c$ denote the average network traffic consumed by an estimate-submission without data block. More estimate-submissions lead to more data blocks, and the network traffic consumed by an estimate-submission varies with the number of blocks. Let constant $\varepsilon$ denote the average value of such variation per additional estimate submitted to the Web server; then $C_i(a_i)$ is given by [equation missing in the source], where $\zeta_i$ denotes that $s_i$ is the $i$th smart sensor node submitting the estimate (see footnote 6). Therefore, the ESDP in (4) can be formally rewritten as [equation missing in the source].

6 The conditions 5) and 6) in Section II-A indicate that we can always let $s_i$ be the $i$th smart sensor node deciding whether to submit the estimate.

A. Our Algorithm: NTPTA
The network traffic-dependent probability threshold algorithm (NTPTA), which every $s_i \in N$ runs to determine whether to submit its estimate to the Web server, is formally given in Algorithm 1. Here, $\pi^*$ is the optimal policy for each sensor node to determine whether to submit its estimate to the Web server in order to maximize the probability of the server obtaining the MEE while maintaining the network traffic at the required level, and $r^*$, determined by (16) or (17) (for large $n$), is the number of sensor nodes that must submit their estimates to the server under the optimal policy $\pi^*$.
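Algorithm 1 is not reproduced in this text, so the following Python sketch shows only one reading of the node-side decision, pieced together from the definition of $r^*$ and from the threshold described in footnote 7: the first $r^*$ nodes always submit, and a later node $s_i$ submits when a sample $\theta$ from $U(0, 1)$ falls in $(0, r^*/(i-1)]$. The function name `ntpta_submits` and its arguments are hypothetical, and `r_star` is taken as an input because (16) and (17) are not restated here.

```python
import random


def ntpta_submits(i: int, r_star: int, rng: random.Random) -> bool:
    """Decide whether node s_i (1-indexed) submits its estimate.

    Assumed reading of footnote 7, not the authors' Algorithm 1 verbatim.
    """
    if i < 1 or r_star < 1:
        raise ValueError("i and r_star must be positive integers")
    if i <= r_star:
        # The first r* nodes must take action 1 (submit).
        return True
    # rng.random() is uniform on [0, 1); the difference from the open
    # interval (0, 1) in the text has probability zero.
    theta = rng.random()
    return theta <= r_star / (i - 1)
```

Under this reading, node $s_{r^*+1}$ still submits with probability 1, and the submission probability then decays as $r^*/(i-1)$ for later nodes.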

B. Other Algorithms
We compare NTPTA with three existing popular algorithms as summarized in Section II: PDTA, NTEA, and WSRA in Algorithms 2, 3, and 4, respectively.
Brief Description of WSRA: $\forall s_i \in N$: 1) if $i \in \{1, 2, \ldots, (Q/(2c + (n + 1)\varepsilon))\}$, then $s_i$ requests $e_{\mathrm{relative}}$ from the server and takes 1 if $e_i < e_{\mathrm{relative}}$ and 2 if $e_i \ge e_{\mathrm{relative}}$; and 2) otherwise, $s_i$ takes 2. In other words, the nodes in WSRA can compare their estimation error with that of the relative MEE requested from the server before taking action.
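The brief description of WSRA can be sketched in Python as follows. This is an illustration under stated assumptions rather than Algorithm 4 itself: the upper index bound $Q/(2c + (n+1)\varepsilon)$ is rounded down with `math.floor` (an assumption, since the rounding is not visible in the text), and the name `wsra_action` and the constants `SUBMIT` and `NOT_SUBMIT` are hypothetical.

```python
import math

SUBMIT = 1      # action 1: submit the estimate to the Web server
NOT_SUBMIT = 2  # action 2: do not submit the estimate


def wsra_action(i: int, e_i: float, e_relative: float,
                n: int, Q: float, c: float, eps: float) -> int:
    """Return the action of node s_i (1-indexed) under the WSRA description."""
    # Assumed rounding: only the first floor(Q / (2c + (n + 1) * eps)) nodes
    # may request the relative MEE error from the server.
    limit = math.floor(Q / (2 * c + (n + 1) * eps))
    if i <= limit:
        return SUBMIT if e_i < e_relative else NOT_SUBMIT
    return NOT_SUBMIT
```

When no estimate has reached the server yet, a caller could pass `math.inf` as `e_relative` so that the first requesting node always submits; this convention is also an assumption of the sketch.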

V. ANALYSIS OF ALGORITHM
The analysis consists of: 1) optimality analysis showing that NTPTA makes the optimal tradeoff between network traffic and sensing accuracy and 2) performance analysis showing that NTPTA is better than PDTA, NTEA, and WSRA.

7 We use a random sample $\theta$ obeying the uniform distribution $U(0, 1)$ to realize the probabilities. Specifically, $\theta$ decides the action of the sensor nodes that submit their estimates only with some probability: for such a sensor node $s_i$, if the random sample $\theta$ generated from $U(0, 1)$ lies in the interval $(0, r^*/(i-1)]$, then $s_i$ submits its estimate; otherwise, it does not.

8 $P_{\pi^*} = \frac{r^*}{n} + \sum_{i=r^*+1}^{n} \frac{1}{n} \cdot \frac{r^*}{i-1}$ is the exact value of $P_{\pi^*}$ no matter whether the number $n$ of sensor nodes is small or large, while $P_{\pi^*} \approx \frac{r^*}{n} - \frac{r^*}{n} \ln\frac{r^*}{n}$ is the approximate value of $P_{\pi^*}$ for large $n$ (the latter is easier to calculate than the former when $n$ is large). The approximation error is no larger than 5% for $n \ge 20$. For example, it is 4.25%, 1.71%, 0.85%, and 0.09% for $n = 20$, 50, 100, and 1000, respectively.

A. Optimality of NTPTA
Under condition 3) in Section II-A, we let $r$ denote the number of smart sensor nodes that must take 1.
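The two expressions for $P_{\pi^*}$ given in footnote 8 can be evaluated directly. The Python sketch below implements the exact sum and the logarithmic approximation as functions of $r$ and $n$; the function names are hypothetical, and $r$ is passed in explicitly because the value of $r^*$ depends on $Q$, $c$, and $\varepsilon$ through (16) or (17), which are not restated here.

```python
import math


def p_exact(r: int, n: int) -> float:
    """Exact value: r/n + sum_{i=r+1}^{n} (1/n) * r/(i-1)."""
    if not 1 <= r <= n:
        raise ValueError("r must satisfy 1 <= r <= n")
    tail = sum(r / (n * (i - 1)) for i in range(r + 1, n + 1))
    return r / n + tail


def p_approx(r: int, n: int) -> float:
    """Large-n approximation: r/n - (r/n) * ln(r/n)."""
    if not 1 <= r <= n:
        raise ValueError("r must satisfy 1 <= r <= n")
    x = r / n
    return x - x * math.log(x)


def relative_gap(r: int, n: int) -> float:
    """Relative difference between the approximation and the exact value."""
    exact = p_exact(r, n)
    return abs(p_approx(r, n) - exact) / exact
```

For a given pair $(r^*, n)$, `relative_gap(r_star, n)` gives the quantity that footnote 8 reports as the approximation error; reproducing the quoted percentages would require the values of $r^*$ used by the authors, which are not stated in this section.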

B. Performance Comparison
Let P PDTA , P NTEA and P WSRA denote, respectively, the cumulative probabilities of the Web server obtaining the MEE by PDTA, NTEA, and WSRA.
Proof: See the Appendix. Theorem 2 shows that the performance of NTPTA is better than that of PDTA/NTEA/WSRA in terms of maximizing the probability of the Web server obtaining the MEE under the same level of network traffic. This result follows because of the following.
1) NTPTA ensures that the Web server can obtain at least one estimate while PDTA does not.
2) NTPTA in every periodic cycle makes full use of the network traffic, while NTEA and WSRA both always waste at least (2cQ/(2c + (n + 1)ε)) − c(2Q/(2c + (n + 1)ε)) ≥ 0.
3) NTPTA does not request from the server while WSRA does.

In summary, NTPTA has a better performance than PDTA, NTEA, and WSRA by not only considering a more realistic model but also avoiding wasting network traffic.

A. Experimental Setup
In this section, we compare NTPTA against PDTA, NTEA, and WSRA through both simulations on MATLAB and field experiments on the experimental systems, as shown in Fig. 4.
The experimental system consists of smart sensor nodes, a hotspot (smartphone), a 4G base, the operators' network, and a Web server (laboratory-based server). The smart sensor nodes, which have Wi-Fi but no 4G function and work in the remote area, are connected to the Web server with the help of the hotspot, which converts between Wi-Fi and 4G. Specifically, the smart sensor nodes locally connect to the hotspot through a shared Wi-Fi network, while the hotspot remotely communicates with the Web server through the 4G network provided by the 4G base.
The Web server is implemented on a laboratory-based personal computer (PC) with the working voltage, resolution, CPU, and hard disk being 220-V ac, 1366 × 768, Processor Intel Core i3-3220 CPU at 3.30 GHz, and 500 GB, respectively.
The smart sensor node consists of temperature sensor ER-TH-M5, transmission media universal serial bus, electrical logic converter MAX232, microprocessor MSP430, and serial-interface/Wi-Fi converter HLK-RM04. The smart sensor nodes were assigned the task of sampling the temperature every 18 s before stopping sampling at the end of the periodic cycle, which was set as 5 min.
MAX232 [35] and HLK-RM04 [36] are both signal converters with working voltage 5 V, where the former converts the Recommended Standard 232 (RS232) to transistor-transistor logic (TTL) signal, while the latter converts the TTL to transmission control protocol (TCP) data package. Here, we supplied 5-V voltage for them and the universal serial bus.
As a mixed-signal processor, MSP430 can work under the voltage from 1.8 to 3.3 V [37], and we supplied 3.3 V here. It can work under the 0-25 MHz frequency, and we supplied 4 MHz here. By its P3.4 and P3.5 pins, it can send the commands to and receive the samples from ER-TH-M5, respectively, through the universal serial bus and MAX232 with Baud Rate 9600 b/s. By its P3.6 and P3.7 pins, it can send the samples to and receive the Web service response from the Web server, respectively, through the universal serial bus and HLK-RM04 with a baud rate of 9600 b/s. It makes use of its timer, Timer A (TA), to realize the duration of 18 s and 5 min.
As a digital sensor to sample the temperature, ER-TH-M5 can work under the voltage from 7 to 25 V [38], and we supplied 10 V here. It can work under the 4-MHz frequency, matching MSP430's, so we supplied 4 MHz here. By the TXD/D+ and RXD/D− pins, it can send the samples to and receive commands from MSP430, respectively, through the universal serial bus and MAX232 with Baud Rate 9600 b/s.
To calibrate all the sensors, we first selected sensors of the same type, i.e., all the temperature sensors are ER-TH-M5, and then reset them to their default settings. The noise produced in the temperature measurement is the thermal noise of the sensor. It obeys a normal distribution with mean value 0 and variance $0.5^2$, as given by the temperature sensor datasheet (see footnote 9). Its spectrum is a uniform distribution with a magnitude of $0.5^2$.
The Wi-Fi network uses the 2.4-GHz band and the IEEE 802.11b protocol. The channel capacity available to a single smart sensor node is approximately 1000 B. When all smart sensor nodes are operating, the channel capacity available to each sensor node is approximately between 1000 and 1300 B, with the total less than the available network traffic $Q$.

9 See https://wenku.baidu.com/view/8958c21314791711cc79174a.html.
We constrained the channel capacity by the built-in function "data limit" of the hotspot (smartphone). It allows us to set the channel capacity limit. When the limit is reached, the hotspot will be automatically disabled, the Wi-Fi network will disappear, and the connection of the smart sensor nodes to the Web server will be terminated. This means that the channel capacity constraint occurs in the hotspot.
The smart sensor nodes work periodically as stated in Section I, with the periodic cycle of 5 min. We obtained the MEE by the following steps in each periodic cycle: 1) each smart sensor node iteratively calculates its estimate and corresponding estimation error by the first and second equations, respectively, in (20), which is proposed in our previous work [33] (see (15) in [33]), where, at time $t$, $z_t$ is the sample, $\sigma_{z,t}^2$ is the sample variance, $\mu_t$ is the Bayesian posterior estimate ($\mu_t$ is initialized as the first sample, i.e., $z_1$), and $\sigma_t^2$ is the Bayesian posterior estimation error ($\sigma_t^2$ is initialized as the reference sensing accuracy given by the temperature sensor datasheet); 2) when the periodic cycle ends, the smart sensor node takes the last Bayesian estimate as the true measurement and submits it and the corresponding Bayesian estimation error to the Web server according to our proposed algorithm or the algorithms compared against ours; and 3) the server picks the Bayesian estimate with the minimum Bayesian estimation error from all the estimates submitted and regards it as the MEE.

In the simulations, the software package of MATLAB Simulink Communication Block Set in [39] was used to establish the simulation environment: 1) the smart sensor nodes and Web server were both simulated by existing sensor models in Simulink; 2) the network channel was modeled by the additive white Gaussian noise (AWGN) channel in Simulink, which allows the channel capacity limit to be set (in accordance with Fig. 4, we replaced Bluetooth in [39] with the Wi-Fi/4G network); and 3) the message collision and loss were calculated by the path loss block in Simulink. To emulate the real scenario, the simulations share the same noise with the field experiments, i.e., we added white Gaussian noise with mean value 0 and variance $0.5^2$ to the simulated observations. Its spectrum is a uniform distribution with a magnitude of $0.5^2$.
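Step 1) of the per-cycle procedure relies on (20), which is not reproduced in this text. As a rough illustration only, the Python sketch below uses the textbook conjugate Gaussian update for a scalar mean, which may differ from the exact form of (20) in [33]. The initialization follows the description above ($\mu$ starts at the first sample and $\sigma^2$ at the datasheet accuracy); the function names and the use of a single fixed sample variance are assumptions of this sketch.

```python
from typing import Sequence, Tuple


def gaussian_update(mu: float, sigma2: float,
                    z: float, sigma2_z: float) -> Tuple[float, float]:
    """One textbook Bayesian update of a Gaussian mean (assumed form)."""
    gain = sigma2 / (sigma2 + sigma2_z)     # weight given to the new sample
    mu_new = mu + gain * (z - mu)           # posterior estimate
    sigma2_new = (1.0 - gain) * sigma2      # posterior estimation error
    return mu_new, sigma2_new


def run_cycle(samples: Sequence[float], sigma2_init: float,
              sigma2_z: float) -> Tuple[float, float]:
    """Return the estimate and its error at the end of a periodic cycle."""
    if not samples:
        raise ValueError("a cycle needs at least one sample")
    mu, sigma2 = samples[0], sigma2_init    # initialization as described
    for z in samples[1:]:
        mu, sigma2 = gaussian_update(mu, sigma2, z, sigma2_z)
    return mu, sigma2
```

The pair returned by `run_cycle` plays the role of the ECEE that a node would submit in step 2); with the 18-s sampling interval and the 5-min cycle, `samples` would hold the readings collected within one cycle.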

B. Parameter Identification
Some parameters must be identified before the simulations and field experiments; we briefly state how this is done. By (16) or (17) or Algorithm 1, the parameters to be identified are $c$ and $\varepsilon$. For $c$, an estimate-submission package from the smart sensor node contains 800 B, and a Web service response package contains 200 B, without data block on the Internet, so $c = 800 + 200 = 1000$ B. For $\varepsilon$, it is calculated by

$$\varepsilon = \frac{C_{\mathrm{with}} - C_{\mathrm{without}}}{n_\varepsilon} \tag{21}$$

where $C_{\mathrm{with}}$ and $C_{\mathrm{without}}$ are the accumulative network traffic consumption with and without data block, respectively, and $n_\varepsilon$ is the number of $\varepsilon$ introduced by the data blocks that are caused by sensor nodes submitting estimates in a shared Wi-Fi network. We let ten smart sensor nodes submit their estimates successively. This means that $C_{\mathrm{without}} = 1000 \times 10 = 10\,000$ B and $n_\varepsilon = 55$ (see footnote 10). In the test, we measured $C_{\mathrm{with}} = 10\,055$ B, so we have $\varepsilon \approx (10\,055 - 10\,000)/55 = 1$ B.
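The identification of $c$ and $\varepsilon$ is plain arithmetic and can be replayed in a few lines of Python. The sketch assumes, following footnote 10, that the $j$th successive estimate-submission introduces $j$ units of $\varepsilon$, so that $k$ submissions introduce $k(k+1)/2$ units in total; the function names are hypothetical.

```python
def cumulative_traffic(k: int, c: float, eps: float) -> float:
    """Traffic of k successive submissions when the j-th one costs c + j*eps."""
    return k * c + eps * k * (k + 1) / 2


def identify_epsilon(c_with: float, c_without: float, k: int) -> float:
    """Estimate eps from measured traffic with and without data block."""
    n_eps = k * (k + 1) // 2  # 1 + 2 + ... + k units of eps
    return (c_with - c_without) / n_eps


# Values reported in the text: 800-B submission, 200-B response, ten nodes.
c = 800 + 200
c_without = 10 * c
eps = identify_epsilon(10_055, c_without, 10)
```

With these values, `cumulative_traffic(10, c, eps)` returns the measured 10 055 B, which is a quick consistency check between the per-submission model and the reported measurement.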
To emulate the real scenario, the simulations and field experiments share the same $c$ and $\varepsilon$, and both are carried out in the following three parts (see footnote 11).
1) Carry out experiments under ten pairs of $(n, Q)$, as shown in Table I.
2) Carry out experiments under ten pairs of $(n, Q)$, as shown in Table II, by changing the number $n$ of sensor nodes at the same level of available network traffic $Q$.
3) Carry out experiments under ten pairs of $(n, Q)$, as shown in Table III, by changing the available network traffic $Q$ while keeping the number $n$ of sensor nodes unchanged.

C. Simulation Results
Thirty replications of the simulation are performed in total, the same as the field experiments. The 30 simulation results consist of three parts and are summarized, respectively, as standard box plots in Fig. 5(a)-(c), which display the probability of the Web server obtaining the MEE under the ten pairs of $(n, Q)$ shown in the first column of Tables I-III, respectively. From Fig. 5(a)-(c), we can see that NTPTA does a better job than PDTA, NTEA, and WSRA in maximizing the probability of the Web server obtaining the MEE, no matter whether $(n, Q)$ comes from Table I, II, or III. In particular, the average probabilities over the 30 simulations by NTPTA, PDTA, NTEA, and WSRA are 0.9143, 0.8673, 0.8613, and 0.7296, respectively, which shows that NTPTA achieves a probability higher than those of PDTA, NTEA, and WSRA by 5.42%, 6.15%, and 25.31%, respectively, and validates the potential of NTPTA in improving the sensing accuracy for Web service-based data-collection systems.

10 The hotspot (smartphone) always consumes network traffic even when there is no estimate-submission, because some built-in applications cannot be completely shut down and always incur network traffic cost. The network data corresponding to such consumption and the estimate-submissions jointly produce data blocks. More estimate-submissions make more data blocks, and the network traffic consumed by an estimate-submission varies with the number of blocks. $\varepsilon$ denotes the average value of such variation per additional estimate submitted to the Web server. Therefore, for ten sensor nodes successively submitting estimates, the numbers of $\varepsilon$ introduced by the first, second, ..., tenth estimate-submissions are 1, 2, ..., 10, respectively, summing to 55. $\varepsilon$ may, of course, be positive, 0, or negative.

11 Because of the limits of the experiment field, the field experiments do not share the same number of sensor nodes with the simulations.

D. Field Experiment Results
Similar to the abovementioned simulation results, the 30 field experiment results consist of three parts (see footnote 12) and are summarized, respectively, as standard box plots in Fig. 5(d)-(f), which display the probability of the Web server obtaining the MEE under the ten pairs of $(n, Q)$ shown in the second column of Tables I-III, respectively. From Fig. 5(d)-(f), we can see that NTPTA still does the best job. In particular, the average probabilities over the 30 field experiments by NTPTA, PDTA, NTEA, and WSRA are 0.8867, 0.8487, 0.8233, and 0.7134, respectively, which shows that NTPTA achieves a probability higher than those of PDTA, NTEA, and WSRA by 4.48%, 7.70%, and 24.29%, respectively, and again validates the potential of NTPTA in improving the sensing accuracy for Web service-based data-collection systems.

12 Fig. 5 summarizes the simulation and experiment results as box plots [40], where the bottom and top of the box are the first and third quartiles, respectively, the bar inside the box is the median, the lower bar below the box is the lowest datum still within 1.5 interquartile range (IQR) of the first quartile (IQR is the difference between the third and first quartiles), the upper bar over the box is the highest datum still within 1.5 IQR of the third quartile, the lower red + below the box marks data more than 1.5 IQR below the first quartile, the upper red + over the box marks data more than 1.5 IQR above the third quartile (the lower and upper red + are both called extreme outliers), and the rest of the lines are used to shape the box plot [33]. This figure shows the probability of the Web server obtaining the MEE under different pairs of channel capacity limits and numbers of smart sensor nodes, not the error. Since the probability lies in [0, 1], the magnitude of its boxes or bars is bounded by 1.
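The relative improvements quoted for the simulations and the field experiments follow from the reported average probabilities. The Python sketch below recomputes them as $(P_{\mathrm{NTPTA}} - P_{\mathrm{other}})/P_{\mathrm{other}}$; the dictionary layout and function names are illustrative only, and the numbers are the averages stated in the text.

```python
from typing import Dict


def relative_gain(p_new: float, p_base: float) -> float:
    """Relative improvement of p_new over p_base, in percent."""
    return 100.0 * (p_new - p_base) / p_base


def gains_over_baselines(avg: Dict[str, float]) -> Dict[str, float]:
    """Relative gain of NTPTA over every other algorithm in avg."""
    return {name: relative_gain(avg["NTPTA"], p)
            for name, p in avg.items() if name != "NTPTA"}


# Average probabilities reported for the 30 simulations and field experiments.
simulation = {"NTPTA": 0.9143, "PDTA": 0.8673, "NTEA": 0.8613, "WSRA": 0.7296}
field = {"NTPTA": 0.8867, "PDTA": 0.8487, "NTEA": 0.8233, "WSRA": 0.7134}

sim_gains = gains_over_baselines(simulation)
field_gains = gains_over_baselines(field)
```

The recomputed values agree with the quoted percentages to within about 0.01 percentage point; small differences of that size are expected because the published averages are themselves rounded to four decimals.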
The average theoretical probabilities are plotted as dashed lines in Fig. 5. The relative ranking of the probabilities is the same in the simulations, the field experiments, and theory, although the field experiments show the largest variance. We attribute this variance to factors that are unknown and difficult to capture in the experiment field, such as sensor failures, network disturbance and instability, and data packet losses [41].
In summary, under the same network traffic, NTPTA achieves a higher probability of the Web server obtaining the MEE than PDTA, NTEA, and WSRA. We also make the following observations.
1) [...] Table II approximate the average value of Q in the first and second columns, respectively, of Table I, and the numbers 250 and 25 of sensor nodes in Table III approximate the average value of n in the first and second columns, respectively, of Table I.
2) Fig. 5 shows that the conclusion of Theorem 2 holds, i.e., under the same level of allowed network traffic, $P_{\pi^*} > P_{\mathrm{PDTA}} \geq P_{\mathrm{NTEA}} > P_{\mathrm{WSRA}}$.
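As a quick consistency check, the ordering stated in Theorem 2 can be compared against the reported average probabilities. The Python sketch below is only a sanity check on the published averages, which are empirical values rather than the theoretical quantities of the theorem; the helper name ranking_holds is ours.

```python
# Algorithms in the order claimed by Theorem 2 (pi* corresponds to NTPTA).
ORDER = ["NTPTA", "PDTA", "NTEA", "WSRA"]

# Average probabilities reported in the text.
averages = {
    "simulation": {"NTPTA": 0.9143, "PDTA": 0.8673, "NTEA": 0.8613, "WSRA": 0.7296},
    "field": {"NTPTA": 0.8867, "PDTA": 0.8487, "NTEA": 0.8233, "WSRA": 0.7134},
}


def ranking_holds(avg):
    """Check P_NTPTA > P_PDTA >= P_NTEA > P_WSRA for one set of averages."""
    p = [avg[name] for name in ORDER]
    return p[0] > p[1] >= p[2] > p[3]


for setting, avg in averages.items():
    print(setting, ranking_holds(avg))
```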

VII. CONCLUSION
For the Web service-based data-collection system in which multiple smart sensor nodes periodically sample and estimate the same unknown physical parameter of interest, this article has focused on designing a criterion by which each smart sensor node decides whether to submit its estimate to the Web server, so as to maximize the probability of the server obtaining the minimum error estimate while keeping network traffic within the required level. Within an optimization-theoretical framework, a new algorithm, NTPTA, has been proposed and validated by theoretical analysis, simulation, and field experiments. Smart sensor nodes are now foundations of CPS and the IoT from both the hardware and software points of view, and the Web service is an important communication element in both. NTPTA-based management of estimate-submissions therefore helps IoT systems and CPS operate in a network traffic-efficient way while maintaining acceptable levels of estimation error, especially under network resource constraints.
Since smart sensor nodes can, in some cases, communicate with each other, our future work will focus on designing similar criteria for such scenarios.

APPENDIX
Proof of Theorem 1: First, we prove the sufficiency that some smart sensor nodes must take 1 (see condition 3), so, without loss of generality, we consider that $s_{k_1}, s_{k_2}, \ldots, s_{k_r}$ are these nodes ($r, k_1, k_2, \ldots, k_r \in \{1, 2, \ldots, n\}$), that is, [...] Then, by (9), (22), and the definition of $U_i(a_i)$, we have [...] $\min(e_1, e_2, \ldots, e_n) \in$ or $\notin \{e_{k_1}, e_{k_2}, \ldots, e_{k_r}\}$. If the latter holds, then $\forall s_i \in N - \{s_{k_1}, s_{k_2}, \ldots, s_{k_r}\}$, it always has the intelligence to consider that $s_{k_1}, s_{k_2}, \ldots, s_{k_r}, \ldots, s_{k_{j-1}}$ have submitted their estimates ahead of it ($j \in \{r+1, r+2, \ldots, n\}$, $k_1, k_2, \ldots, k_{j-1} \in \{1, 2, \ldots, n\}$), and if $\min(e_{k_1}, e_{k_2}, \ldots, e_{k_{j-1}}) \in \{e_{k_1}, e_{k_2}, \ldots, e_{k_r}\}$, then there are valid grounds for it to perform the estimate-submission since its estimation error may be smaller than those of $s_{k_1}, s_{k_2}, \ldots, s_{k_{j-1}}$. With this in mind and the conditions 1), 2), and 4) in Section II-A, we have [... $\min(e_{k_1}, e_{k_2}, \ldots, e_{k_{j-1}}) \in \{e_{k_1}, e_{k_2}, \ldots, e_{k_r}\}$ ...], further by (7) and (24) [... $P\{e_{k_x} = \min(e_{k_1}, e_{k_2}, \ldots, e_{k_{j-1}})\}$ ...] Then, by (9) and (25) and the definition of $U_i(a_i)$, we have [...]

The condition 5) in Section II-A shows that the subscripts of $s_1, s_2, \ldots, s_n$ are arbitrary, so, by (23) and (26), and relabeling $s_{k_1}, s_{k_2}, \ldots, s_{k_n}$ and $a_{k_1}, a_{k_2}, \ldots, a_{k_n}$ as $s_1, s_2, \ldots, s_n$ and $a_1, a_2, \ldots, a_n$, respectively, $U_i(a_i)$ can be written as [...] $s_{k_1}, s_{k_2}, \ldots, s_{k_r}$ must submit their estimates, so, by the definition of $\varepsilon$ and $\zeta_i$, we have [...] and by (25), we have [...] Similar to (27), we have [...] In a periodic cycle, an earlier estimate-submission suffers fewer data blocks than the later ones, so (30) can be given by [...]

Equations (27) and (31) show that $U_i(a_i)$ and $C_i(a_i)$ both depend on $r$. Thus, we denote $P(r)$ as the cumulative probability of the Web server obtaining $\min(e_1, e_2, \ldots, e_n)$ and $C(r)$ as the cumulative network traffic consumed, that is, [...] Then, by (27) and (32), we have

$$P(r) = \frac{r}{n} + \sum_{i=r+1}^{n} \frac{1}{n} \cdot \frac{r}{i-1} \tag{34}$$

and by (31) and (33), we have [...] Therefore, (14) can be specifically rewritten as $\max_{r \in \{1, 2, \ldots, n\}} \frac{r}{n}$ [...] By (34), we have [...] and by (35), we have [...] which means that $P(r)$ and $C(r)$ are both increasing functions of $r$. Thus, the solution denoted by $r^*$ for (36) is given by [...] and $\max_{r \in \{1, 2, \ldots, n\}} P(r) = P(r^*)$.
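To make (34) concrete, the following Python sketch evaluates $P(r)$ exactly and then picks a threshold under a traffic limit. Two points are assumptions on our part. First, because $P(r)$ and $C(r)$ are both increasing in $r$, we take $r^*$ to be the largest $r$ whose traffic does not exceed the limit $Q$; this is our reading of the argument, not a formula quoted from the proof. Second, the cost function toy_cost is a hypothetical stand-in for $C(r)$, loosely modeled on footnote 10 (a fixed cost per submission plus 1, 2, ..., r multiples of a per-block increment); it is not the article's $C(r)$. The names p_mee, best_threshold, and toy_cost are ours.

```python
from fractions import Fraction
from typing import Callable, Optional


def p_mee(r: int, n: int) -> Fraction:
    """P(r) from (34): r/n + sum_{i=r+1}^{n} (1/n) * r/(i-1)."""
    if not 1 <= r <= n:
        raise ValueError("r must satisfy 1 <= r <= n")
    tail = sum(Fraction(r, n * (i - 1)) for i in range(r + 1, n + 1))
    return Fraction(r, n) + tail


def best_threshold(n: int, traffic_limit: int,
                   cost: Callable[[int], int]) -> Optional[int]:
    """Largest r in {1, ..., n} with cost(r) <= traffic_limit (assumed r*).

    Returns None when even r = 1 exceeds the limit.
    """
    feasible = [r for r in range(1, n + 1) if cost(r) <= traffic_limit]
    return max(feasible) if feasible else None


def toy_cost(r: int) -> int:
    """Hypothetical traffic model: 20 units per submission plus 1 + 2 + ... + r."""
    return 20 * r + r * (r + 1) // 2


n = 10
r_star = best_threshold(n, traffic_limit=120, cost=toy_cost)
print("r* =", r_star, "P(r*) =", float(p_mee(r_star, n)))
```

Exact fractions avoid rounding noise when comparing $P(r)$ for neighboring values of $r$; for example, $P(n-1)$ and $P(n)$ both equal 1 under (34).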
By $P_{\pi^*} = P_{\mathrm{PDTA}}$, (50), and (52), we have [...] and by (51) and (53), we have [...] Further by (56), we have [...] (note that $c$ is a positive constant). By (55) and (56), we have [...] By (58), we have [...] and then, combining with (56) and (59), we have [...] (note that $\varepsilon$ is a positive constant), which means that $C_{\pi^*} < C_{\mathrm{PDTA}}$ under $P_{\pi^*} = P_{\mathrm{PDTA}}$. Network traffic and sensing accuracy are two contradictory metrics, so, under $C_{\pi^*} = C_{\mathrm{PDTA}}$, we have $P_{\pi^*} > P_{\mathrm{PDTA}}$.