Eliciting Truthful Data From Crowdsourced Wireless Monitoring Modules in Cloud Managed Networks

To facilitate efficient cloud managed resource allocation solutions, collection of key wireless metrics from multiple access points (APs) at different locations within a given area is required. In unlicensed shared spectrum bands, collecting metric data can be challenging for a cloud manager, as independent self-interested APs can operate in these bands in the same area. We propose an intelligent crowdsourcing solution that incentivizes independent APs to truthfully measure and report data relating to their wireless channel utilization (CU). Our work focuses on challenging scenarios where independent APs can take advantage of recurring patterns in CU data by utilizing distribution aware strategies to obtain higher reward payments. We design truthful reporting methods that utilize logarithmic and quadratic scoring rules for reward payments to the APs. We show that when measurement computation costs are considered, these scoring rules no longer ensure incentive compatibility under certain scenarios. To address this, we present a novel reward function which incorporates a distribution aware penalty cost that charges APs for distorting reports based on recurring patterns. Along with synthetic data, we also use real CU data values crowdsourced using multiple independent measuring/reporting devices that we deployed at the University of Oulu.


I. INTRODUCTION
The evolution of wireless networks to the 5th generation (5G) and beyond is driven by extremely low-latency demands, improved throughput requirements and additional use cases for wireless access, such as connectivity support for augmented/virtual reality, autonomous robotic systems, and internet of things [1], [2]. Moreover, the deployment of 5G and beyond wireless networks is viewed as an evolution that builds not only on licensed spectrum bands but also on various unlicensed shared spectrum bands [3].
For enterprise wireless networks in unlicensed shared spectrum bands, such as the wireless networks deployed in a large university campus, an office building, or an airport, cloud managed network configuration platforms are being developed for efficient resource utilization [4]. Cloud managed platforms can improve their performance by utilizing real-time edge analytics of key wireless metrics, such as wireless channel utilization (CU) [5]. Recent developments in artificial neural networks (ANN) [6] have encouraged industry and academia to focus increasingly not only on instantaneous wireless metric values but also on the prediction of metric values. Predicted metric values can in turn be used to enable proactive resource allocation (PRA).
For a resource controller in cloud managed networks, PRA can be a more challenging task in shared unlicensed spectrum bands, as APs belonging to multiple independent networks can operate in the same unlicensed channels. For example, to have sufficiently accurate metric data, collection of data through crowdsourcing from independent networks is required. This is also required for ANN design at the resource controller. However, independent networks can be self-interested and may tend to report inaccurate metric data, including invalid and outdated information [7], because reporting accurate data requires measuring a metric, which involves computational costs. This is particularly important for cloud managed solutions utilized for enterprise wireless networks, such as networks deployed in a large university campus, an office building, or an airport, where network utilization can often have recurring patterns. For example, Fig. 1 shows real CU data collected for a period of one week in the Tellus area of the University of Oulu. It can be seen that there is a daily pattern on weekdays, with high CU from 8 am to 6 pm and lower CU at other times of the same days. Moreover, our detailed data collection over a period of 2 months shows that there are also weekday and weekend patterns in CU. Independent networks operating in enterprise scenarios can exploit these data patterns to devise non-truthful reporting strategies which can give them higher reward payments than honest measurements would.
The associate editor coordinating the review of this manuscript and approving it for publication was Cesar Vargas-Rosales.
VOLUME 8, 2020
In this paper, we present a method that incentivizes multiple independent APs to truthfully monitor and send spectrum data, such as their measured CU related values. CU is defined as the fraction of time a frequency channel is being used for transmission by all wireless devices active in the channel [8]. We study the issue of truthful reporting of CU data using incentive-compatible methods based on proper scoring rules, such as the logarithmic and quadratic scoring rules. A method is called incentive compatible if all APs achieve their best reward outcome by honestly measuring the CU related data values and truthfully reporting them to the data collection entity. We show that, when faced with measurement computation costs, truthful reporting of CU data under the proper scoring rules may no longer be behaviorally incentive compatible. In fact, we show that when the computation costs of measurements are incorporated, independent reporting APs (agents) can increase their payoff by distorting the reported CU data in a particular way. To address this, we present a novel reward function which introduces a cost measure that takes into account the distortions in reporting by the APs. In our work, we use not only synthetic CU data values collected by simulated agents but also real CU data values crowdsourced using multiple independent measuring/reporting devices that we deployed at the University of Oulu.
The main contributions of this paper can be summarized as follows:
1) We design a crowdsourcing scheme which incentivizes multiple independent APs (agents) operating in unlicensed shared spectrum to truthfully measure and report their CU related data to a cloud managed platform. We focus on eliciting the entire probability distribution of the measured CU values within the given time interval, as the probability distribution contains much more information than the single mean CU value.
2) We consider various information elicitation methods based on logarithmic and quadratic scoring rules, which can incentivize agents to report their true measured CU related data. We use both synthetic and real CU data in our study. We analyze the impact of observation errors on the logarithmic and quadratic scoring rules and provide closed-form expressions that quantify the expected difference in reward between truthful and non-truthful reporting in the presence of observation errors. We compare the derived closed-form expressions with simulations and show that the simulated values are within 1% of those obtained from the closed-form expressions.
3) Private/independent agents need to spend their computational resources to measure the true CU values. CU in the unlicensed spectrum of enterprise wireless networks can often exhibit recurring patterns (see Fig. 1). We show that, when faced with measurement computation costs, truthful reporting of CU data under the proper scoring rules may no longer be behaviorally incentive compatible. To address this, we design a reward function which incorporates a cost measure that penalizes the agents for distorting their reports based on recurring patterns in measured data. We show that the proposed reward function outperforms the logarithmic and quadratic scoring rules and incentivizes the agents to report true measured data even under high computation costs.
The rest of this article is organized as follows. Section II presents related work. Section III presents our system model. Section IV presents the information eliciting and probability scoring methods. Section V presents the distribution aware reward payment method. In Section VI, we provide our concluding remarks and discuss a possible future direction for this work.

II. RELATED WORK
Cloud managed wireless networks have attracted strong interest in the wireless research community because of their capability to have centralized insights into key network metric data that can help in efficient resource management decisions [9]. For cloud managed networks in unlicensed shared spectrum, a technique to improve both overall network capacity and performance perceived by end users has been proposed in [10]. To support a variety of virtual network functions for network flexibility, a cloud managed network architecture has been proposed in [11].
The works in [10], [12]-[14] have shown that flexible wireless network monitoring can improve the performance of a cloud managed wireless network. Wireless network monitoring requires collection and processing of data related to key wireless metrics. To support network monitoring in cloud managed networks, crowdsourcing based solutions for collecting wireless metrics data have gained considerable attention [12], [15]. The works in [12], [15] have highlighted the opportunities and challenges of using crowdsourcing for wireless network monitoring. In [16] and [17], crowdsourcing is used to collect network coverage related data, and [18] has used crowdsourcing for spectrum sensing.
In wireless networks using shared spectrum bands, it makes sense for a cloud managed wireless operator to collect data relating to key metrics not only from its own APs but also, through crowdsourcing, from APs owned by other private networks in the same area. However, private APs or APs belonging to other networks need to spend their computation and energy resources to participate in such a crowdsourcing task. Independent private networks act as autonomous agents, and an incentive mechanism needs to be designed which can motivate them to participate truthfully in a crowdsourcing task. Various works have proposed the use of contract theory to design incentive mechanisms for wireless networks (see [19] and references therein).
A pricing based incentive mechanism for wireless powered networks has been introduced in [20] to maximize the agent's utility for honestly reporting its channel gain. In [21], the authors consider the problem of optimal task assignment in mobile data crowdsourcing and propose methods that incentivize strategic workers to truthfully report their private worker quality and data to the requester. However, none of these works considers the reporting of probability distribution values, and they ignore the possible exploitation of recurring patterns in data by the reporting agents.
The crowdsourcing requester entity may want to elicit from the agents a single measured value, namely the mean of the measurements performed within some time interval, or it may want to elicit the entire probability distribution of the measured values within the same time interval, as the probability distribution contains much more information than the single mean value. In Fig. 2, we present a taxonomy of information eliciting mechanisms used for single value and probability distribution elicitation (see [22]). In Fig. 2, the left branch contains the truth agreement mechanism and the output agreement mechanism, which are utilized when a single value is to be elicited. The truth agreement method considers an ideal scenario in which the requester is assumed to have access to the ground truth, which in general is not the case for wireless networks. The right branch contains the proper scoring rules, such as the quadratic and the logarithmic scoring rules, which have been widely utilized for designing incentive mechanisms where the entire probability distribution is to be elicited from strategic agents [22]-[24]. The quadratic and the logarithmic scoring rules have been utilized to elicit the whole probability distribution of a measured value [22].
In this paper, we apply the truthful elicitation mechanisms to enterprise network settings where a cloud managed wireless operator wants the APs to truthfully monitor and send spectrum data, such as the probability distribution of their measured CU values, and the cloud manager does not have access to the ground truth. Moreover, different from existing works, we consider distribution aware selfish APs which can save the computation costs associated with the measurements: instead of measuring and sending the true data, they simply use the underlying distributions describing the data and send data generated from those distributions.

III. SYSTEM MODEL
We consider a cloud-managed wireless network operator that uses shared spectrum and has $N_p$ APs in a given area, each of which is equipped with a radio frequency (RF) monitoring module. We call these monitoring modules peer APs. In the same area, there are $N_a$ other APs deployed by independent network owners which also have RF monitoring modules. We call these monitoring modules agent APs. Hence, the total number of APs equipped with RF monitoring modules is $N = N_p + N_a$. Table 1 summarizes the notations used in this paper.
Measuring only mean values of CU over some time interval may not be enough to understand CU behavior; hence, from each AP, the wireless operator is interested in eliciting truthful information about the probability distribution of CU values. A single CU value represents the percentage of time the channel is being used for transmission by the wireless devices, which is generally indicated by a value between 0% and 100%. Since a wireless channel can be used by multiple wireless technologies, a CU value indicates the amount of transmission from multiple wireless devices on a channel [25].
Histograms are a computationally simple way of obtaining a probability distribution. Over a period of time, the peer and agent APs measure the frequency distribution of CU by constructing histograms of measured CU values, where each bin of the histogram represents a CU state. For a given interval of $t$ time units, a histogram $\check{H}_i$ of CU can be given by
$$\check{H}_i = \{(S_1, \theta_1), (S_2, \theta_2), \cdots, (S_k, \theta_k)\}, \quad (1)$$
where $S_1, S_2, \cdots, S_k$ represent the partitioning of CU values (ranging from 0 to 100) into $k$ contiguous intervals commonly known as bins, and $\theta_j$ is the count value of bin $S_j$. Each bin in this work represents a CU state. Each CU state is defined as $S_j = [\underline{S}_j, \overline{S}_j)$, with $\underline{S}_j$ as the minimum value and $\overline{S}_j$ as the maximum value. When a sample of the measured CU value falls within some bin (state) $S_j$, the counter for that state is incremented by one; otherwise it remains unchanged. Let us denote a measured CU value at a given time instant $t$ by $\gamma$. The count values for $k = 5$ CU states are given by $\theta_1, \theta_2, \cdots, \theta_5$, where $\Theta = (\theta_1, \theta_2, \ldots, \theta_5)$ represents the frequency distribution of the 5 CU states. To convert the obtained frequency distribution of the $k = 5$ CU states to a probability distribution, the cloud manager simply divides the count value in each state by the total count over all 5 states.
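The histogram construction and normalization described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the equal-width bin edges, the function names `cu_histogram` and `to_probabilities`, and the sample CU values are all assumptions, since the text does not state the bin boundaries.

```python
from typing import List, Sequence


def cu_histogram(samples: Sequence[float], k: int = 5) -> List[int]:
    """Count CU samples (in percent, 0..100) into k CU states.

    Assumption: the k states are equal-width, half-open intervals.
    """
    width = 100.0 / k
    counts = [0] * k
    for gamma in samples:
        # Clamp the index so that a sample of exactly 100 % lands in the last state.
        j = min(int(gamma // width), k - 1)
        counts[j] += 1
    return counts


def to_probabilities(counts: Sequence[int]) -> List[float]:
    """Divide each state count by the total count over all states."""
    total = sum(counts)
    if total == 0:
        raise ValueError("empty histogram")
    return [c / total for c in counts]


# Hypothetical CU samples observed during one reporting interval.
samples = [3.0, 12.5, 27.0, 41.0, 44.9, 63.0, 88.0, 100.0]
counts = cu_histogram(samples)
probs = to_probabilities(counts)
print(counts, probs)
```

In this sketch the AP would send `counts` and the cloud manager would apply `to_probabilities`, mirroring the division of labor described in the text.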
In Fig. 3, we illustrate both the synthetic and the real data collection models. A snapshot of a collected 5-state CU histogram converted to a probability distribution for real data is shown in Fig. 1c.

A. CU STATE MODEL USING REAL DATA
We use real CU data in our work, which we collected over a period of almost two months using three independent CU measurement devices deployed in the University of Oulu. One of the CU measurement devices is shown in Fig. 1d. The three devices measure real-time CU in a 2.4 GHz channel and use the measurements to compute the histogram $\check{H}_i$. Every $t = 22$ seconds, each device sends the histogram $\check{H}_i$, representing the frequency distribution of CU states, to the cloud manager. One of the measurement devices is considered to work as the peer, while the other two devices are considered to work as agents. The measurement devices are implemented on Xilinx's Zynq-7000 system on chip mounted with RF transceivers. Details regarding the design and implementation of the three measurement devices can be found in [25].

B. CU STATE MODEL USING SYNTHETIC DATA
For the synthetic data case, we consider a data generating source that, at some fixed period, outputs a CU value based on a given probability distribution over five CU states. Each peer AP and each agent AP observes the CU value. Both the peer and the agent can have errors in their observed CU values, and we quantify the error in a measured CU value with the error probability $P_e$. The histogram $\check{H}_i$ is constructed using the CU values, and at the end of each interval $t$, both the peer and the agent send the histogram $\check{H}_i$ to the cloud manager.
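A minimal Python sketch of this synthetic model is given below. It assumes that an erroneous observation is replaced by one of the other four states chosen uniformly at random; the text only specifies the error probability, so this error model, the function name `observe_states`, and the example distribution are illustrative assumptions.

```python
import random
from typing import List, Sequence


def observe_states(true_dist: Sequence[float], n_samples: int, p_e: float,
                   rng: random.Random) -> List[int]:
    """Build a CU state histogram from noisy observations of a synthetic source."""
    k = len(true_dist)
    counts = [0] * k
    for _ in range(n_samples):
        # The source emits a CU state according to the given distribution.
        state = rng.choices(range(k), weights=true_dist)[0]
        if rng.random() < p_e:
            # Assumption: an observation error lands uniformly on another state.
            state = rng.choice([s for s in range(k) if s != state])
        counts[state] += 1
    return counts


rng = random.Random(0)
true_dist = [0.1, 0.2, 0.4, 0.2, 0.1]   # hypothetical source distribution
peer_counts = observe_states(true_dist, 1000, 0.0, rng)    # error-free peer
agent_counts = observe_states(true_dist, 1000, 0.1, rng)   # agent with P_e = 0.1
print(peer_counts, agent_counts)
```

Note that in this sketch the peer and the agent draw independent samples from the same distribution, whereas in the model above both observe the same emitted CU value; sharing the emitted sequence would be a straightforward extension.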

IV. INFORMATION ELICITATION AND PROBABILITY SCORING METHODS
We focus on the multiple-value elicitation scenario, in which each agent and peer is asked to report the entire frequency distribution $(\theta_1, \theta_2, \ldots, \theta_5)$ of CU states observed by the module in a given time interval $t$. When the agent reports the entire frequency distribution, we utilize proper scoring rules to reward the reporting AP agent. A scoring rule is said to be proper if it is incentive compatible, which means that an agent cannot get a higher reward by reporting non-truthful information than by reporting truthful information [26]. Fig. 4 illustrates the reporting and reward payment mechanism. A scoring rule serves two functions: it provides an incentive to report truthfully and it allows evaluation of reporting accuracy. As an incentive mechanism, it aims to pay a reward to an agent for reporting truthful information about the measured event. As an evaluating mechanism, it estimates the relative accuracy of the agent's measurements [24]. We use two different proper scoring rules: the logarithmic scoring rule and the quadratic scoring rule. The operator pays the reward to the reporting AP agent based on its reported probability distribution compared against that of the reference peer AP (the operator's own AP).

1) QUADRATIC SCORING RULE
In the quadratic scoring rule, the agent's payoff is derived from the sum of squared distances between the reference distribution and the observed relative distribution [27]. According to this rule, the reporting AP agent's payoff is given by Eq. 2, in which $\varphi_i$ represents the measured probability of a CU state $i$ by the agent AP.

2) LOGARITHMIC SCORING RULE
The logarithmic scoring rule is also used to elicit the agent's beliefs in terms of subjective probabilities. However, the logarithmic scoring rule attaches larger penalties than the quadratic scoring rule [28]. The logarithmic scoring rule deducts for inaccuracy by adding the natural log of the occurred event's probability, which is never positive, to the base score [26]. The reward $R_i$ is given by Eq. 3, where $E$ is the entropy of the prior probability distribution.
Remark 1: The work in [22] has shown that the two proper scoring rules motivate the APs to report truthfully because the difference in expected reward between truthful reporting and non-truthful reporting is greater than 0 for both the logarithmic and the quadratic scoring rules whenever the report differs from the truth. For the logarithmic rule, this difference (Eq. 4) is the Kullback-Leibler divergence $D_{KL}$ between the truthful and the reported distributions, which satisfies $D_{KL} \geq 0$ with equality only when the two distributions are identical. Hence, reporting non-truthfully can only lower the payoff compared with reporting truthfully. For the quadratic scoring rule, the difference (Eq. 5) is likewise always $\geq 0$, with equality only if the two distributions are equal.
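The statement in Remark 1 can be checked numerically with the textbook forms of the two rules. The sketch below assumes the standard logarithmic score $\ln q_i$ and the standard quadratic score $2q_i - \sum_j q_j^2$ for a report $q$ when state $i$ occurs; these may differ from the paper's Eq. 2 and Eq. 3 by constants such as the base score or the entropy term $E$, and the distributions used are hypothetical.

```python
import math
from typing import Callable, Sequence


def log_score(report: Sequence[float], outcome: int) -> float:
    """Textbook logarithmic score: natural log of the reported probability."""
    return math.log(report[outcome])


def quadratic_score(report: Sequence[float], outcome: int) -> float:
    """Textbook quadratic (Brier-type) score."""
    return 2.0 * report[outcome] - sum(q * q for q in report)


def expected_score(score: Callable[[Sequence[float], int], float],
                   truth: Sequence[float], report: Sequence[float]) -> float:
    """Expected score of `report` when outcomes follow `truth`."""
    return sum(p * score(report, i) for i, p in enumerate(truth) if p > 0)


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


truth = [0.1, 0.2, 0.4, 0.2, 0.1]    # hypothetical true CU state distribution
report = [0.2, 0.2, 0.2, 0.2, 0.2]   # hypothetical non-truthful report

log_gap = (expected_score(log_score, truth, truth)
           - expected_score(log_score, truth, report))
quad_gap = (expected_score(quadratic_score, truth, truth)
            - expected_score(quadratic_score, truth, report))

# The gaps equal the KL divergence and the squared distance, respectively.
print(log_gap, kl_divergence(truth, report))
print(quad_gap, sum((p - q) ** 2 for p, q in zip(truth, report)))
```

Under these assumed forms, the loss from misreporting is exactly the KL divergence for the logarithmic rule and the squared Euclidean distance for the quadratic rule, both of which vanish only for a truthful report.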

A. OBSERVATION ERRORS AND REWARDS
In practice, CU state observations (measurements) of APs can have errors. In our work, for the synthetic data case we take into account the impact of observation errors by considering that, in a given interval $t$, an AP measures a CU state correctly with probability $P_c < 1$. It is important to note that the real data case automatically incorporates measurement errors, as the real measurement sensor used for collecting data has some, albeit limited, errors. The closed-form expressions given in Eq. 2 and Eq. 3 change due to observation errors and/or when non-truthful values are sent. For observations with error probability $P_e = 1 - P_c$, the true measured observation with error, $\varphi_i^e$, is given by Eq. 6. When the AP reports non-truthfully in the presence of observation errors, the report $\hat{\varphi}_i^e$ is given by Eq. 7. For observations with errors, the expected difference in payoff between truthful reporting and non-truthful reporting is given by Eq. 8 for the logarithmic rule and by Eq. 9 for the quadratic rule. In Section IV-C, we verify the average reward results for Eq. 2 and Eq. 3 given by the closed-form expression derived in Eq. 6 by comparing them with the estimated average reward from a Monte Carlo simulation.

B. COMPUTATION COST
An RF module in an AP must process in-phase and quadrature (IQ) samples to obtain CU related statistics in real time. In such an RF monitoring system, a higher bandwidth incurs a higher computational cost, because a higher sampling rate is required for processing the IQ samples. For example, monitoring a 2 MHz channel requires a sampling rate of 4 Msamples/s, whereas a 20 MHz channel requires a sampling rate of 40 Msamples/s to satisfy the Nyquist sampling rate. As a result, when the computation costs of monitoring are taken into account, an AP agent may be able to increase its reward $R_i$ by saving the computational cost. In such a scenario, it is not favourable for a reporting AP agent to perform the computation for each given interval $t$ and report truthfully. In order to evaluate the effect of computational cost on the AP agent's reward under the logarithmic and quadratic scoring rules, we present the agent's reward for both cases, i.e., with and without the computational cost.
The reward with the computation cost is given by Eq. 10, where $R_i$ represents the AP agent's reward using the scoring rules; $V_r$ represents the value of the monitoring reward per unit, which is in some digital monetary unit, such as bitcoin; $C$ represents the amount of computation done to process CU data samples for a given channel bandwidth in a given interval; and $\sigma$ represents the cost of computation per unit (where 1 unit represents the computation for 1 MHz of channel bandwidth).
In reality, this cost is due to energy consumption and the use of more RF monitoring resources for IQ sample processing, and it is again expressed in some digital monetary unit, such as bitcoin. Note that $\sigma = 0$ means that there is no computation cost per unit.
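To make the role of the quantities $V_r$, $C$ and $\sigma$ concrete, the following Python sketch assumes that the net reward is the scoring-rule reward scaled by $V_r$ minus the computation cost $C\sigma$. This additive form is an assumption made for illustration and is not confirmed by the text shown here; the function names and numeric values are likewise hypothetical.

```python
def computation_cost(bandwidth_mhz: float, sigma: float) -> float:
    """Cost C * sigma, with C taken as the monitored bandwidth in MHz (1 unit = 1 MHz)."""
    return bandwidth_mhz * sigma


def net_reward(score: float, v_r: float, bandwidth_mhz: float, sigma: float) -> float:
    """Assumed form: value of the scoring-rule reward minus the computation cost."""
    return v_r * score - computation_cost(bandwidth_mhz, sigma)


# Hypothetical values: a per-round score of 0.5, unit reward value, 20 MHz channel.
no_cost = net_reward(0.5, 1.0, 20.0, 0.0)        # sigma = 0: no computation cost
with_cost = net_reward(0.5, 1.0, 20.0, 0.0175)   # illustrative cost per MHz
print(no_cost, with_cost)
```

With these illustrative numbers, monitoring a 20 MHz channel costs 0.35 per interval, which removes most of the scoring-rule reward.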

C. PERFORMANCE EVALUATION
In this subsection, we evaluate the performance of the probability scoring methods using synthetic data and APs with noisy measurements and also with and without computation cost. Table 2 presents the simulation parameters and values for synthetic data.

1) PERFORMANCE EVALUATION USING THE SYNTHETIC DATA
We consider the scenarios where the AP is asked to report the probability distribution $(\theta_1, \theta_2, \ldots, \theta_5)$, and the probability scoring methods using the quadratic and logarithmic scoring rules presented in the previous subsections are utilized. In Fig. 5, we plot the average reward per round for an AP as a function of the probability of correct measurement. We compare the average reward results for Eq. 2 and Eq. 3 given by the closed-form expression derived in Eq. 6 with the estimated average reward from a Monte Carlo simulation. Some of the results are also tabulated in Table 3. Observe that the rewards estimated from the Monte Carlo simulations are within 1% of those obtained by applying Eq. 6 in Eq. 2 and Eq. 3.

2) PERFORMANCE EVALUATION USING THE REAL DATA
Next, we present the results using the real data, which was collected in one of the busiest places in the University of Oulu; the data collection details are given in Section III-A. It can be seen from Fig. 1a that the real measured data shows fluctuations in CU values. In general, filtering is used to compensate for fluctuations in real data. To take this into account, we also evaluate the impact on the average reward when filtering, such as moving average filtering, is applied to the real CU data. We compare three different reporting strategies.
Strategy 1: The agent reports the true observed frequency distribution $(\theta_1, \theta_2, \ldots, \theta_5)$ of the CU states every time interval $t = 22$ seconds for the thirteen days and gets the reward $R_i$ in each round of reporting using the probabilistic scoring method. We call this the honest reporting strategy.
Strategy 2: The agent reports the true observed frequency distribution $(\theta_1, \theta_2, \ldots, \theta_5)$ of the CU states every time interval $t = 22$ seconds for the first day (24 hours) only. To save the computation cost in the other 12 days, the agent, instead of performing real monitoring, sends the same frequency distribution that was measured for the same time interval $t$ of the first day. For example, at time interval $t$ of day 2 the agent reports the frequency distribution $(\theta_1, \theta_2, \ldots, \theta_5)$ measured at the same time of day 1, and so on. We call this the simple dishonest reporting strategy.
Strategy 3: If the agent sends the same data at the same time on all the remaining days, this is easily detectable. To avoid easy detection, we consider a third strategy. In this strategy, just as in the simple dishonest strategy, the agent first reports the true observed frequency distribution $(\theta_1, \theta_2, \ldots, \theta_5)$ of the CU states every time interval $t = 22$ seconds for the first day (24 hours). However, to save the computation cost in the other 12 days, instead of sending the same data as monitoring reports, the agent generates data based on the distribution of the first day's data and sends that data as the monitoring reports. For example, for day 2 onwards the agent uses the data of day 1 to find the distribution for the particular interval, generates $(\theta_1, \theta_2, \ldots, \theta_5)$ of the CU states using that distribution, and sends this as the monitoring report. We call this the distribution-based dishonest reporting strategy.
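The three reporting strategies can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and for the distribution-based strategy the sketch assumes that the agent draws as many synthetic observations as the stored day 1 histogram contained, which the text does not specify.

```python
import random
from typing import List, Sequence


def honest_report(measured_today: Sequence[int]) -> List[int]:
    """Strategy 1: report the histogram actually measured in this interval."""
    return list(measured_today)


def simple_dishonest_report(day1_counts: Sequence[int]) -> List[int]:
    """Strategy 2: replay the histogram stored for the same interval of day 1."""
    return list(day1_counts)


def distribution_dishonest_report(day1_counts: Sequence[int],
                                  rng: random.Random) -> List[int]:
    """Strategy 3: sample a new histogram from the day 1 distribution.

    Assumption: the number of synthetic draws equals the day 1 sample count.
    """
    k = len(day1_counts)
    n = sum(day1_counts)
    draws = rng.choices(range(k), weights=day1_counts, k=n)
    counts = [0] * k
    for state in draws:
        counts[state] += 1
    return counts


rng = random.Random(42)
day1 = [0, 5, 10, 5, 0]    # hypothetical day 1 histogram for one interval
today = [1, 4, 9, 5, 1]    # hypothetical true measurement on a later day

print(honest_report(today))
print(simple_dishonest_report(day1))
print(distribution_dishonest_report(day1, rng))
```

Only the honest strategy needs `today`, which is what allows the two dishonest strategies to skip the measurement computation after day 1.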

a: AVERAGE REWARD UNDER THE THREE STRATEGIES AND THE IMPACT OF DATA FILTERING
In Fig. 6, we evaluate the average reward performance for the three strategies under the probabilistic scoring method.
In the figure, we present the average reward as a function of the moving mean window ($M_w$). $M_w$ is the parameter of the moving average filter applied to the CU data, where each mean is calculated over a sliding window of length $M_w$ across neighboring elements of the CU data vector. Note that $M_w = 1$ means that no moving average filter is applied to the data, and $M_w = 3$ means that a moving average filter of length 3 is applied. It can be seen from the figure that, for the honest reporting strategy, the use of a moving average filter with $M_w = 3$ can increase the agent's reward by almost 17%. It can also be seen from the figure that the honest reporting strategy results in the highest average reward compared with the two dishonest reporting strategies. Moreover, the figure shows that filtering has little impact on the reward for the two dishonest reporting strategies.
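A moving average filter with window $M_w$ can be sketched as below. The sketch assumes a centered window that shrinks at the ends of the data vector; the text does not state how the edges are handled, so that choice, the function name `moving_mean`, and the sample values are assumptions.

```python
from typing import List, Sequence


def moving_mean(values: Sequence[float], window: int) -> List[float]:
    """Centered moving average; the window is truncated at the vector edges."""
    if window < 1:
        raise ValueError("window must be at least 1")
    left = (window - 1) // 2    # neighbors taken before the current element
    right = window // 2         # neighbors taken after the current element
    out = []
    for i in range(len(values)):
        lo = max(0, i - left)
        hi = min(len(values), i + right + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


cu = [10.0, 20.0, 60.0, 20.0, 10.0]   # hypothetical CU values in percent
unfiltered = moving_mean(cu, 1)       # M_w = 1: data is unchanged
smoothed = moving_mean(cu, 3)         # M_w = 3: spike at index 2 is damped
print(unfiltered, smoothed)
```

With $M_w = 3$ the isolated spike of 60 is pulled toward its neighbors, which is the kind of fluctuation the filtering is meant to compensate for.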

b: AVERAGE REWARD UNDER THE THREE STRATEGIES AND THE IMPACT OF COMPUTATION COST
In order to evaluate the impact of the computation cost (see Eq. 10) on the reward performance, we calculate $\rho_R$ as a function of increasing computation cost, where $\rho_R$ is the ratio of the agent's reward when it reports honestly to the agent's reward when it reports dishonestly. To evaluate the impact of the computation cost $C\sigma$, we first set $\sigma = 0$, which represents the case where no computation cost is assumed. For modeling $C\sigma > 0$, we fix the cost per unit $\sigma$ to a value greater than 0 and increase $C$. Fig. 7 presents these results for the case when no filtering is performed on the monitoring reports, and Fig. 8 shows the results when filtering is performed on the monitoring reports. In the figures, a computation cost of 0.05 means that an agent is required to measure a 2.8 MHz channel, whereas the maximum computation cost of 0.35 means that the agent is required to measure a 20 MHz channel. The two figures show that as the cost of computation increases, the ratio $\rho_R$ decreases, and it can go below 1 for a 20 MHz channel. This means that the agents have an incentive to use the dishonest reporting strategies, as these can result in rewards equal to or higher than those of the honest reporting strategy. The reason can be explained as follows. Under Strategy 1, each agent exerts monitoring effort every day, so there is always a computational cost. Under the other two strategies, each agent reports according to the day 1 distribution, so there is a computational cost for monitoring during day 1 but no monitoring cost on the other days. It can also be seen from the two figures that the quadratic scoring rule leads to a smaller $\rho_R$ than the logarithmic rule. This means that, when the computation cost is taken into account, agents have more incentive to follow the two dishonest reporting strategies under the quadratic scoring rule.
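The trend described above can be illustrated with a small Python sketch. It assumes that the net per-round reward is the scoring-rule reward minus the computation cost, that an honest agent pays the cost on all 13 days, and that a dishonest agent pays it on day 1 only. The scoring-rule rewards of 0.60 and 0.45 are hypothetical numbers, not values from the paper.

```python
from typing import List


def reward_ratio(r_honest: float, r_dishonest: float, cost_per_round: float,
                 n_days: int = 13, measured_days_dishonest: int = 1) -> float:
    """Ratio of average net reward: honest reporting over dishonest reporting."""
    honest_net = r_honest - cost_per_round
    # A dishonest agent only measures on day 1, so its average cost is diluted.
    dishonest_net = r_dishonest - cost_per_round * measured_days_dishonest / n_days
    if dishonest_net <= 0:
        raise ValueError("dishonest net reward must be positive for a meaningful ratio")
    return honest_net / dishonest_net


costs: List[float] = [0.0, 0.05, 0.15, 0.25, 0.35]
ratios = [reward_ratio(0.60, 0.45, c) for c in costs]
for c, rho in zip(costs, ratios):
    print(f"cost={c:.2f}  rho_R={rho:.3f}")
```

With these illustrative numbers the ratio starts above 1 when there is no computation cost and falls below 1 at the highest cost, reproducing the qualitative behavior described for the 20 MHz channel.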

V. DISTRIBUTION AWARE REWARD PAYMENTS
To discourage the agents from using the past data distribution in their future reports, we introduce a novel distribution aware cost function which takes into account changes in the underlying data distributions over time. The basic idea behind this new cost function is that the requesting operator performs computations which measure the changes over time in the probability distribution of the CU data. The cost function relies on a basic property of the data reported by the multiple AP agents and the operator's own peer AP: although the peer and the agents perform independent measurements, they measure the same process (CU in a channel). Hence, although their reports can differ from each other in a given interval, their distributions cannot differ widely. The idea is that if the AP agent reports based on some previous day's data, its reported distribution is likely to differ more from the peer AP's reported distribution than if the AP agent reports real measured data.

A. EMD BASED DISTRIBUTION AWARENESS
We propose to use a statistical distance based technique called the earth mover's distance (EMD) to create distribution awareness. In simple words, the EMD can be defined as the minimum amount of effort needed to transform a probability distribution $\alpha$ (which represents the CU probability distribution at time $t$) into a probability distribution $\beta$ (which represents the CU probability distribution at a different time $t'$). The effort can be defined as: effort = (number of normalized CU count values moved) × (number of bins over which they are moved). Simply put, the idea of the EMD is to imagine two probability distributions as piles of dirt and calculate the minimum amount of effort needed to reshape the first pile so that it has the same shape as the second pile. The important feature of the EMD is that it takes distance into account. With increasing dissimilarity of two CU distributions, the EMD increases because the probabilities need to be moved over a larger number of bins (distances). We confirm this claim with results in the subsequent paragraphs, but first we explain how the EMD is computed. In our work, the EMD is computed from the difference between the two cumulative CU histograms of $\hat{H}_j$ and $\hat{H}_m$. A cumulative histogram can be defined as a mapping that counts the cumulative number of observations in all of the bins up to the specified bin. The cumulative histogram of a histogram $\hat{H}_i$ is defined as
$$c_n(\hat{H}_i) = \sum_{l=1}^{n} \theta_l, \quad n = 1, \ldots, k.$$
The EMD between the two histograms $\hat{H}_j$ and $\hat{H}_m$ can then be represented as
$$\mathrm{EMD}(\hat{H}_j, \hat{H}_m) = \sum_{n=1}^{k} \left| c_n(\hat{H}_j) - c_n(\hat{H}_m) \right|,$$
where $k = 5$ for the five CU states. In Fig. 9 and Fig. 10, we present box plots of EMD values showing the CU distribution differences between the different reporting strategies using the real CU dataset. Fig. 9 shows the results for reports without filtering, and Fig. 10 shows the results when filtering is performed on the reports. Tables 4 and 5 show the median, minimum and maximum EMD values between the different reporting strategies with and without filtering.
Strategy 1 means that the EMD is calculated between the peer AP reports and an AP agent sending honestly measured reports; Strategy 2 means that the EMD is calculated between the peer AP reports and an AP agent sending dishonest reports using the simple dishonest strategy; and Strategy 3 means that the EMD is calculated between the peer AP reports and an AP agent sending dishonest reports using the distribution-based dishonest strategy. Fig. 9 and Table 4 show that the median value (red line in the box) is 0.2 for Strategy 1, whereas for Strategy 2 and Strategy 3 the median values increase slightly to 0.23 and 0.26, respectively. The figure also shows that the 75th percentile values (top edges of the boxes) almost double for the dishonest reporting strategies. The whiskers in the figure extend to the most extreme data points not considered outliers, and the outliers are plotted using the '+' symbol. Both the outliers and the whiskers have higher values for the dishonest strategies than for honest reporting. The impact of filtering on the EMD values is presented in Fig. 10 and Table 5. Filtering reduces the EMD values for Strategy 1 much more significantly than the EMD values for the two dishonest reporting strategies.
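The cumulative-histogram computation of the EMD described in this subsection can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names `normalize` and `emd` and the example counts are assumptions introduced here, and the histograms are taken to be lists of per-state counts over the same k = 5 CU states.

```python
from itertools import accumulate


def normalize(counts):
    """Turn raw per-state counts into a probability distribution."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("histogram must contain at least one observation")
    return [c / total for c in counts]


def emd(hist_a, hist_b):
    """1-D EMD: sum of absolute differences between cumulative histograms."""
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    cum_a = accumulate(normalize(hist_a))  # running sum over bins 1..b
    cum_b = accumulate(normalize(hist_b))
    return sum(abs(a - b) for a, b in zip(cum_a, cum_b))


# Hypothetical counts for the k = 5 CU states (not taken from the dataset).
peer_counts = [10, 20, 40, 20, 10]
agent_counts = [20, 20, 40, 10, 10]
print(emd(peer_counts, agent_counts))
```

Moving all probability mass from the first state to the last state gives the largest possible value for five bins, which matches the intuition that EMD grows with the number of bins over which mass has to be moved.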

Cloud Manager Part
Data: Input CU data from a peer AP $i$ and from each agent AP $j$, represented by $\hat{H}_i$ and $\hat{H}_j$, respectively, where $\hat{H}_i = \{(S_1, \theta_1), (S_2, \theta_2), \cdots, (S_k, \theta_k)\}$ // same for $\hat{H}_j$
Obtain prob dist: $P_i = \{\theta_1, \theta_2, \ldots, \theta_k\} / \sum_{n=1}^{k} \theta_n$ // same step for $P_j$
Penalty cost: $\delta = \mathrm{EMD}(P_i, P_j)$
Calculate accuracy:
    if (scoring_rule = logarithmic)
        …
AP Agent Part
    if (strategy = 1)
        Report true CU state at each instance t
    elseif (strategy = 2)
        if (Day = 1)
            Report and store true CU state at each instance t
        else
            Report CU state measured at the same instance t of Day 1
        end
    elseif (strategy = 3)
        if (Day = 1)
            Report and store true CU state at each instance t
        else
            Report value generated at time instance t based on the distribution of the Day 1 data
        end
    end
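The three reporting strategies in the listing above can be illustrated with the following sketch. It is an assumption-laden illustration rather than the authors' code: the function name `make_reports`, the representation of a day as a list of CU states, and the use of sampling with replacement from the Day 1 values to mimic "the distribution of the Day 1 data" are all choices made here. All days are assumed to contain the same number of reporting instances.

```python
import random


def make_reports(true_days, strategy, rng=None):
    """Return the per-day reports an agent sends under a given strategy.

    true_days: list of days, each a list of measured CU states.
    strategy 1: honest, report the measured state at every instance.
    strategy 2: replay the Day 1 state stored for the same instance.
    strategy 3: draw each report from the empirical Day 1 distribution.
    """
    rng = rng or random.Random(0)
    day_one = list(true_days[0])  # Day 1 is always measured and stored
    reports = [list(day_one)]
    for day in true_days[1:]:
        if strategy == 1:
            reports.append(list(day))
        elif strategy == 2:
            reports.append(list(day_one))
        elif strategy == 3:
            # Sampling stored values with replacement follows their histogram.
            reports.append(rng.choices(day_one, k=len(day)))
        else:
            raise ValueError("strategy must be 1, 2 or 3")
    return reports


# Hypothetical CU states (1..5) for three short days.
days = [[1, 2, 3, 3], [4, 5, 1, 2], [2, 2, 2, 5]]
print(make_reports(days, 2))
```

Under Strategies 2 and 3 the later days in `true_days` are never read, which mirrors the computation savings that make dishonest reporting attractive.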

B. EMD BASED PENALTY COST
To discourage dishonest reporting, which yields computation savings and can in turn lead to a higher value of the reward given in Eq. 10, we use the EMD between the peer AP and the agent AP reports as a penalty cost whenever it exceeds some threshold value τ. Writing $R$ for the reward of Eq. 10, the reward obtained by an agent is then given as
$$R_{\delta} = \begin{cases} R - \delta, & \text{if } \delta > \tau \\ R, & \text{otherwise.} \end{cases} \tag{13}$$
In other words, if the EMD between the peer AP and the agent AP reports exceeds the threshold τ, the agent AP incurs a cost penalty equal to the EMD value δ given in Eq. 12. We use the median EMD value as τ, which means that the agent AP incurs the cost penalty when the EMD between the CU distributions reported by the peer AP and the agent AP exceeds their median EMD value calculated over the past 24 hours.
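The thresholded penalty can be sketched as follows. The sketch assumes that the base reward (the quantity of Eq. 10) and the EMD value δ are already available as numbers; the function names and the example EMD history are hypothetical and serve only to show how τ and the penalty interact.

```python
from statistics import median


def penalty(delta, tau):
    """Penalty equals the EMD value when it exceeds the threshold, else zero."""
    return delta if delta > tau else 0.0


def penalized_reward(base_reward, delta, tau):
    """Base reward (assumed to be computed elsewhere) minus the EMD penalty."""
    return base_reward - penalty(delta, tau)


# Hypothetical EMD values between peer and agent reports over the past 24 hours.
past_emds = [0.15, 0.20, 0.22, 0.18, 0.30]
tau = median(past_emds)  # threshold: median EMD of the past 24 hours

print(penalized_reward(1.0, 0.10, tau))  # below threshold, no penalty
print(penalized_reward(1.0, 0.30, tau))  # above threshold, penalty applied
```

Because the comparison is strict, an EMD exactly equal to τ is not penalized in this sketch, in line with the condition that the EMD must exceed the threshold.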
To evaluate the impact of the penalty cost on the reward performance, we calculate $\rho_R$ as a function of increasing computation cost, where $\rho_R$ is the ratio of the agent's reward when it reports honestly to its reward when it reports dishonestly. Fig. 11 presents these results when no filtering is performed on the monitoring reports, and Fig. 12 shows the results when filtering is performed. The two figures show that the penalty cost increases $\rho_R$ compared to the results in Fig. 7 and Fig. 8, where the reward payment contained no penalty cost. In other words, the penalty cost lets an agent earn relatively more reward for honest reporting than for a dishonest reporting strategy. Moreover, Fig. 11 and Fig. 12 show that even for a high computation cost of 0.35, $\rho_R$ is well above 1 for all scenarios and for both the quadratic and logarithmic scoring rules, so the AP agent achieves its best reward outcome simply by reporting honestly. Without the penalty cost, $\rho_R$ fell below 1, which gave the agent more incentive to follow one of the two dishonest reporting strategies.
Fig. 11 shows the ratio of the agent's utility under the new cost function, i.e., the penalty associated with the distance between the ground-truth and agent-reported histograms. In Fig. 11 the threshold is τ = 0.2, which is the median distance between the ground-truth and agent histograms for Strategy 1, i.e., honest reporting without movmean (Table 4). If the distance between the ground-truth and agent histograms at interval t is greater than the threshold, the penalty in Eq. 13 is incurred. This cost is imposed to force the agent to exert effort and send true information about the CU state to the controller. Fig. 11 shows that for both scoring rules the agent's utility for the honest strategy improves compared to Fig. 7. Unlike in Fig. 7, for the logarithmic scoring rule the ratio between honest and dishonest reporting is always higher than 1. Moreover, for the quadratic scoring rule the ratio remains higher than 1 at almost double the computation cost, which means that the proposed penalty in Eq. 13 improves the agent's utility for the honest strategy compared to Fig. 7. This is because the penalty in Eq. 13 is incurred whenever the agent is dishonest, which motivates the agent to report true information about the CU state.
In Fig. 12, τ = 0.18218, which is the median distance between the ground-truth and agent histograms for Strategy 1, i.e., honest reporting with movmean (Table 5). If the measured distance between the ground-truth and agent histograms at interval t is greater than the threshold, the cost in Eq. 13 is incurred. Fig. 12 shows that for both scoring rules the agent's utility is highest for the honest strategy. Compared to Fig. 8, which considers only the computation cost, including the penalty on the agent's reported information improves the agent's utility for the honest strategy. Furthermore, for both scoring rules the ratio between Strategy 1 and Strategy 2 is lower than the ratio between Strategy 1 and Strategy 3. This is because, with filtering, the distance between the ground-truth and agent histograms improves more for Strategy 2 than for Strategy 3, and hence the agent's utility improves for Strategy 2.

VI. CONCLUDING REMARKS AND FUTURE DIRECTION
We design a crowdsourcing solution to incentivize independent wireless networks to truthfully measure and send their CU data samples to a cloud managed wireless operator. We use both synthetic data and real crowdsourced CU data collected through multiple independent measuring/reporting devices deployed in the University of Oulu. We show that the real wireless CU in an enterprise network exhibits recurring patterns, which independent access points (APs) can exploit to devise non-truthful reporting strategies that save measurement computation costs and obtain higher rewards than truthful measurement and reporting. In our work, we use proper scoring rules to reward the reporting AP agents. We evaluate the performance of the scoring rules under CU measurement errors and provide closed-form expressions that quantify their performance loss due to measurement errors. We also show that when the AP agents' computation costs are taken into account, the proper scoring rules are no longer behaviorally incentive compatible under high computation costs. To address this, we incorporate a distribution-aware penalty cost in the reward payments to the agent AP. We show that the new reward payment scheme performs better and enables truthful reporting of CU data even under high computation costs. To compensate for fluctuations in the real data, we also evaluate our results with and without a moving-average filter; the results show improved performance for the proposed mechanism when filtering is used.
In our future work, we intend to use a game-theoretic model to study the strategic interactions between the cloud manager and the independent AP agents when crowdsourcing wireless data that exhibits recurring patterns over a period of time. Another possible direction is to implement the proposed incentive mechanism to monitor and measure a wider range of key wireless metrics, such as interference and power.

HAMED AHMADI (Senior Member, IEEE) received the Ph.D. degree from the National University of Singapore, in 2012. He has held different academic and industrial positions in Ireland and the U.K. He is currently an Assistant Professor with the Department of Electronic Engineering, University of York, U.K. He is also an Adjunct Assistant Professor with the School of Electrical and Electronic Engineering, University College Dublin, Ireland. His Ph.D. degree was funded by the Institute for Infocomm Research, A-STAR. He has published more than 50 peer-reviewed book chapters, journals, and conference papers. His current research interests include design, analysis, and optimization of wireless communications networks, application of machine learning in wireless networks, airborne networks, wireless network virtualization, blockchain, the Internet of Things, cognitive radio networks, and small cell and self-organizing networks. He is a Fellow of the U.K. Higher Education Academy, the Networks Working Group Co-Chair, and a Management Committee Member of the COST Action 15104 (IRACON). He serves as a member of the editorial boards of IEEE ACCESS, Frontiers in Blockchain, and Wireless Networks (Springer).