Secure IoT Data Outsourcing With Aggregate Statistics and Fine-Grained Access Control



I. INTRODUCTION
The Internet of Things (IoT) is developing rapidly, and the use of IoT devices has increased dramatically in recent years. It was forecast that the IoT market would grow from more than 15 billion devices in 2015 to more than 75 billion in 2025 [1]. These devices have the potential to significantly improve the living standard of their users through interactions with the physical and digital worlds [2]. For example, users with smart home and wearable devices can obtain seamless and customized services from digital housekeepers, doctors, and fitness instructors [3]. Managing a constant stream of data collected from a variety of devices is a significant burden for IoT users with limited storage and computing resources. The ''pay-as-you-go'' cloud computing model is an efficient alternative for managing data on behalf of customers. Users can outsource a large amount of IoT data to the cloud and recover it whenever they need it. However, since IoT embeds different kinds of sensors and other devices into a variety of things in our daily life, IoT data usually involves much private information about users [12]. Examples include the heart rate of a user at a certain moment collected from a smart sphygmomanometer, exercise data collected from a smart watch, and the like. To protect the outsourced data, an intuitive approach is to encrypt the data before outsourcing it. However, encryption introduces new problems.

The associate editor coordinating the review of this manuscript and approving it for publication was Vyasa Sai.
The first challenge is how to perform aggregate statistical analysis on encrypted data as accurately as possible. For example, we may want to learn about our health condition over a certain period of time, or whether our exercise has reached the average level of people in a certain area [10], [11]. Several attempts have been made to solve this problem. Sun et al. [12] use homomorphic encryption to encrypt the IoT data so that service providers can process the requests of users without acquiring the plaintext data. However, homomorphic encryption is not yet mature. Its encryption and decryption are quite inefficient, and only a few homomorphic operations are supported by privacy homomorphism. These bottlenecks hinder the widespread use of the Internet of Things. Moreover, the cloud service provider processes all of the data analysis, which inevitably causes transmission latency and degraded service when traffic between IoT devices and the cloud becomes extremely large. Most importantly, the trusted third party is in charge of homomorphic encryption and decryption, meaning that it can obtain all the plaintext data. Such a strong security assumption is quite problematic. Apart from homomorphic encryption, no better solution has been proposed so far.
The second problem is how to achieve precise access control over encrypted data. The data owner may want to define an access policy and enable only users that satisfy the policy to access the corresponding data. Attribute-based encryption (ABE) is a promising approach to realize this. It enables the data owner to define an access policy over a universe of attributes that a user needs to possess in order to decrypt the ciphertext, and to enforce this policy on the data.

VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
ABE has two variants, key-policy attribute-based encryption (KP-ABE) [7] and ciphertext-policy attribute-based encryption (CP-ABE) [8]. The latter is well suited for access control in IoT due to its expressiveness in describing the access policy of a ciphertext. Huang et al. [9] achieve secure data access control in IoT data outsourcing using CP-ABE. However, they did not consider aggregation of the encrypted data, so data analysis on the ciphertexts is not possible.
Besides, due to the extremely large volume of traffic between IoT devices and the cloud, centralized cloud computing systems suffer from high transmission latency and degraded service. Fog computing is a promising technology to solve this problem. Fog computing extends the cloud computing paradigm to the edge of the network, and it is characterized by low latency and widespread geographical distribution [5], [6]. Fog computing can be regarded as a tool for cloud-based services that acts as an interface between the users and the cloud. In this paper, we adopt fog as an auxiliary tool and propose a secure IoT data outsourcing scheme. As far as we know, our scheme is the first to achieve encrypted data aggregation and precise access control simultaneously in IoT data outsourcing. The main contributions of this paper are summarized as follows.
• We propose a novel and practical IoT data outsourcing scheme. Specifically, we combine Corrigan-Gibbs et al.'s computation of aggregate statistics with Bethencourt et al.'s ciphertext-policy attribute-based encryption (CP-ABE) to support both secure aggregate statistics and fine-grained access control of outsourced IoT data. In addition, the introduction of fog computing enables our scheme to provide real-time and low-latency services.
• Security analysis demonstrates that our scheme protects the confidentiality of outsourced IoT data and ensures that only users whose attribute sets satisfy the access policy can recover the corresponding data.
• A comprehensive performance comparison between our scheme and several present works is given, showing that our scheme behaves better in the computation overhead.

II. RELATED WORK

A. AGGREGATION OF OUTSOURCED IoT DATA
The secure aggregation of outsourced IoT data is an urgent need, but very few works have solved this problem. A large number of researchers have begun to pay attention to security and privacy issues in the Internet of Things (IoT) [17]-[19]. Fan et al. [20] designed secure and privacy-preserving RFID protocols from the perspective of data collection and transmission, to make sure that the IoT data collected by RFID tags can only be transmitted to legitimate readers. Doukas et al. [11] proposed to encrypt IoT data using public key encryption to ensure the confidentiality of data. However, the computation overhead of public key encryption is high, especially in the Internet of Things, where many devices are limited in computation capability. Moreover, none of the above schemes considers the aggregation and analysis of encrypted IoT data. Both Sun et al. [12] and Gong et al. [21] attempted to protect data using homomorphic encryption. Specifically, Sun et al. introduced a trusted third party to encrypt and decrypt IoT data for resource-constrained users, which means that all the private data is transparent to the trusted third party. Gong et al.'s scheme is somewhat simpler, and encryption and decryption are both performed on the user side. However, their scheme can only decide whether the result lies in the interval [0,1]. Another problem with these two works is that the cloud server engages in the computation over every data item, meaning that the cloud needs to interact frequently with a very large number of IoT users and can easily become the bottleneck of the whole system. Recently, Guan et al. [22] proposed an anonymous and privacy-preserving data aggregation scheme for fog-enhanced IoT systems. They adopted Paillier encryption, which is additively homomorphic, to outsource the aggregation of IoT data to the cloud through fog nodes. However, the smart devices still need to bear expensive computation in the process of data collection and data aggregation.

B. ACCESS CONTROL OF OUTSOURCED IoT DATA
Secure access control is another desirable function in IoT data outsourcing, since IoT users may want to share their data with certain people. Attribute-based cryptography is a well-known technology for guaranteeing data confidentiality and fine-grained data access control. As early as 2011, Yu et al. [23] adopted KP-ABE to achieve fine-grained data access control in wireless sensor networks. Then Hu et al. [24] employed CP-ABE to realize secure data communication between wearable sensors and data consumers. Yeh et al. [25] proposed a cloud-based fine-grained health information access control framework, which was the first scheme suitable for lightweight IoT devices. Only symmetric cryptography is required on IoT devices such as wireless body sensors. However, the computational cost of the encryption and decryption phases is linear in the complexity of the policy when ABE is used directly in fog computing. Zhang et al. [26] and Huang et al. [9] introduced fog computing to alleviate the burden on IoT users. They partially outsourced expensive encryption and decryption operations to the fog servers, so that the computation that users need to perform in encryption and decryption is independent of the number of attributes in the policies. Even so, users still have to bear a certain amount of computation, and such encryption inevitably hinders the aggregation and statistical analysis of data. To the best of our knowledge, there is no scheme that supports both fine-grained access control and encrypted data aggregation in the context of the Internet of Things (IoT).

C. FOG COMPUTING IN INTERNET OF THINGS
In this paper, we adopt fog computing as the intermediate layer between IoT users and the cloud to alleviate the burden on the cloud, as in many previous works [4]-[6]. The main characteristics of fog computing include low latency, location awareness, and widespread geographical distribution. It can therefore play an essential role in IoT applications such as healthcare and activity tracking, connected vehicles, and the smart grid. Reference [13] proposed a distributed dataflow programming model as a basis for fog-based IoT applications. The authors discussed the core requirements that fog-based IoT applications need to meet and identified several issues that had not been considered in previous works. The above works focus primarily on the principles and concepts of fog computing and its significance in the context of the Internet of Things (IoT). Sarkar and Misra [14] first proposed a mathematical formulation for fog computing and demonstrated its significance by experiment. They showed that, for a scenario where 25% of the IoT applications demand real-time and low-latency services, the mean energy expenditure in fog computing was 40.48% less than in the conventional cloud computing model. Farahani et al. [15] discussed the applicability of IoT in healthcare and medicine specifically, and they presented a holistic architecture of an IoT eHealth ecosystem. Oteafy and Hassanein [16] gave a recent survey on advancements and challenges in IoT, and they also presented a number of high-yield directions that would further propagate IoT development in the fog.

III. PRELIMINARY

A. PRIO: CORRIGAN-GIBBS'S COMPUTATION OF AGGREGATE STATISTICS
Corrigan-Gibbs and Boneh [27] give a simplified version and an extended version of Prio. The former does not provide robustness, meaning that a single malicious client can completely corrupt the protocol output by submitting an invalid value, while the latter provides robustness.
In this paper, we adopt only the simplified version of Prio. One reason is that the simplified version also provides privacy, i.e., the servers learn only the result of the aggregation and nothing else about the clients' private inputs. The other reason is that, in our IoT data outsourcing scheme, clients may want to recover their data stored in the cloud at any moment, so we assume that clients have no incentive to upload invalid values. Suppose each client $i$ holds a private value $x_i$ and that the servers want to compute the sum of the clients' private values $\sum_i x_i$. The simplified Prio scheme for computing sums proceeds in three steps:
• Upload. Each client $i$ splits its private value $x_i$ into $s$ shares, one per server, using a secret-sharing scheme. In particular, the client picks random values $x_{i,1}, \ldots, x_{i,s}$ subject to the constraint $x_i = x_{i,1} + \cdots + x_{i,s}$. The client then sends, over an encrypted and authenticated channel, one share of its submission to each server.
• Aggregate. Each server $j$ computes the sum of its own shares, $S_j = \sum_i x_{i,j}$.
• Publish. All servers publish their values $S_j$, and the sum $\sum_i x_i$ is obtained as $\sum_j S_j$.
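
The three steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the Prio implementation: the modulus `P`, the function names, and the sample values are assumptions made for the example, and arithmetic is done modulo a prime so that individual shares look uniformly random.

```python
import secrets

# Assumed prime modulus for the illustration (not specified in the text).
P = 2**61 - 1

def split(value, s, p=P):
    """Upload step: split `value` into s additive shares modulo p."""
    shares = [secrets.randbelow(p) for _ in range(s - 1)]
    shares.append((value - sum(shares)) % p)  # last share fixes the sum
    return shares

def aggregate(uploads, p=P):
    """Aggregate step: server j sums the j-th share of every client."""
    return [sum(column) % p for column in zip(*uploads)]

def publish(server_sums, p=P):
    """Publish step: the total is the sum of the servers' values S_j."""
    return sum(server_sums) % p

# Hypothetical private values of three clients, shared among s = 3 servers.
values = [72, 65, 80]
uploads = [split(x, 3) for x in values]
total = publish(aggregate(uploads))
print(total)  # 217
```

No single server sees more than one random-looking share per client, yet the published values add up to the true sum.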

B. BEAVER'S MULTI-PARTY COMPUTING (MPC) PROTOCOL
As in Prio by Corrigan-Gibbs and Boneh [27], multiplication is implemented with Beaver's multi-party computing (MPC) protocol [28], which proceeds as follows.
Suppose the servers want to compute the product of a private value $x$ and a constant $A$, and that each server $i$ holds a share $[x]_i$. Then server $i$ can compute a share of $Ax$ locally as $[Ax]_i = A \cdot [x]_i$.
Suppose the servers want to compute the product of two private values $x$ and $y$, and that each server $i$ holds the shares $[x]_i$ and $[y]_i$. Beaver showed that the servers can use pre-computed multiplication triples to implement the multiplication. A multiplication triple is a one-time-use triple of values $(a, b, c)$ chosen at random subject to the constraint $a \cdot b = c$. Each server $i$ holds a share $([a]_i, [b]_i, [c]_i)$ of the triple, and the servers use these shares to jointly compute the product $xy$. To do so, each server $i$ uses its shares $[x]_i$ and $[y]_i$ along with the first two components of its multiplication triple to compute the following values: $[d]_i = [x]_i - [a]_i$ and $[e]_i = [y]_i - [b]_i$. The servers broadcast these values so that every server can reconstruct $d$ and $e$, and server $i$ then computes its share of the product as $[xy]_i = de/s + d[b]_i + e[a]_i + [c]_i$, where $s$ is the number of servers.
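
A minimal Python sketch of the triple-based multiplication follows. It is an illustration under stated assumptions, not the protocol implementation of [28]: the modulus `P`, the helper names, and the dealer function `make_triple` are hypothetical, all servers are simulated in one process, and division by the number of servers is done with a modular inverse.

```python
import secrets

# Assumed prime modulus for the illustration.
P = 2**61 - 1

def split(value, s, p=P):
    """Additively share `value` among s servers modulo p."""
    shares = [secrets.randbelow(p) for _ in range(s - 1)]
    shares.append((value - sum(shares)) % p)
    return shares

def make_triple(s, p=P):
    """Dealer: pick random a, b, set c = a*b, and share all three."""
    a, b = secrets.randbelow(p), secrets.randbelow(p)
    return split(a, s, p), split(b, s, p), split(a * b % p, s, p)

def beaver_multiply(x_shares, y_shares, triple, p=P):
    """Return one share of x*y per server, consuming one triple."""
    a_sh, b_sh, c_sh = triple
    s = len(x_shares)
    # Each server masks its shares; the masked values are broadcast.
    d = sum(x - a for x, a in zip(x_shares, a_sh)) % p
    e = sum(y - b for y, b in zip(y_shares, b_sh)) % p
    s_inv = pow(s, -1, p)  # de/s is computed as de * s^{-1} mod p
    return [(d * e * s_inv + d * b + e * a + c) % p
            for a, b, c in zip(a_sh, b_sh, c_sh)]

x_shares, y_shares = split(6, 3), split(7, 3)
product_shares = beaver_multiply(x_shares, y_shares, make_triple(3))
print(sum(product_shares) % P)  # 42
```

Summing the output shares gives $de + db + ea + c = (x-a)(y-b) + (x-a)b + (y-b)a + ab = xy$, which is why only the masked values $d$ and $e$ ever need to be revealed.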

C. ACCESS TREE
Let $T$ be a tree representing an access structure. Each non-leaf node $x$ of the tree represents a threshold gate, described by its number of children $num_x$ and a threshold value $k_x$, where $0 < k_x \leq num_x$. When $k_x = 1$, the threshold gate is an OR gate, and when $k_x = num_x$, it is an AND gate. Each leaf node $x$ of the tree is described by an attribute and a threshold value $k_x = 1$. Besides, the parent of the node $x$ in the tree is denoted as $parent(x)$, and the attribute associated with the leaf node $x$ is denoted as $attr(x)$. The access tree $T$ also defines an ordering between the children of every node by numbering the children of the node from 1 to $num$. The function $index(x)$ returns the number associated with the node $x$, where the index values are uniquely assigned to nodes in the access structure for a given key in an arbitrary manner.
Let $T_x$ be the subtree of $T$ rooted at the node $x$. If a set of attributes $S$ satisfies the access tree $T_x$, we denote it as $T_x(S) = 1$. We compute $T_x(S)$ recursively as follows. If $x$ is a non-leaf node, evaluate $T_{x'}(S)$ for all children $x'$ of node $x$; $T_x(S)$ returns 1 if and only if at least $k_x$ children return 1. If $x$ is a leaf node, $T_x(S)$ returns 1 if and only if $attr(x) \in S$.
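
The recursive evaluation of $T_x(S)$ can be sketched as follows. The `Node` class, the function name `satisfies`, and the example policy are assumptions introduced for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set

@dataclass
class Node:
    """Access-tree node: a leaf carries an attribute, an inner node a threshold k_x."""
    threshold: int = 1
    attribute: Optional[str] = None
    children: List["Node"] = field(default_factory=list)

def satisfies(node: Node, attrs: Set[str]) -> bool:
    """Return True when the attribute set satisfies the subtree rooted at node."""
    if not node.children:                      # leaf: attr(x) must be in S
        return node.attribute in attrs
    satisfied = sum(1 for child in node.children if satisfies(child, attrs))
    return satisfied >= node.threshold         # at least k_x children return 1

# Hypothetical policy: (doctor AND cardiology) OR owner.
policy = Node(threshold=1, children=[
    Node(threshold=2, children=[Node(attribute="doctor"),
                                Node(attribute="cardiology")]),
    Node(attribute="owner"),
])

print(satisfies(policy, {"doctor", "cardiology"}))  # True
print(satisfies(policy, {"doctor"}))                # False
```

The AND gate is the 2-of-2 threshold node and the OR gate is the 1-of-2 root, matching the definition of threshold gates above.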

D. CIPHERTEXT-POLICY ATTRIBUTE-BASED ENCRYPTION (CP-ABE)
In a ciphertext-policy attribute-based encryption (CP-ABE) system, a user's private key is associated with an arbitrary number of attributes expressed as strings, and the encrypting party specifies an associated access structure over attributes when encrypting a message. A user is able to decrypt a ciphertext only if the user's attribute set satisfies the access structure. A CP-ABE scheme consists of the following four fundamental algorithms:
• $Setup(\lambda) \to (PK, MK)$ is the setup algorithm that takes the security parameter $\lambda$ as input and outputs the public parameters $PK$ and a master key $MK$.
• $Enc(PK, m, T) \to C$ is the encryption algorithm that takes the public parameters $PK$, a message $m$, and an access structure $T$ over the universe of attributes as inputs and outputs a ciphertext $C$.
• $KeyGen(PK, MK, S) \to SK$ is the key generation algorithm that takes the public parameters $PK$, the master key $MK$, and a set of attributes $S$ that describe the key as inputs and outputs a private key $SK$.
• $Dec(PK, C, SK) \to m$ is the decryption algorithm that takes the public parameters $PK$, a ciphertext $C$, and a private key $SK$ as inputs and outputs the original message $m$ only if the set $S$ of attributes satisfies the access structure $T$.

IV. PROBLEM FORMULATION

A. SYSTEM MODEL
There are five types of entities in our IoT data outsourcing model: data owners, the trusted authority (TA), the cloud service provider (CSP), fog servers, and users, as shown in Fig. 1. The black arrows and blue arrows in the figure represent the information transmission in the process of data upload and data recovery, respectively. Note that the data recovery process begins with the download request initiated by IoT users. The concrete role that each entity plays is described as follows.
• The data owner has a considerable amount of data collected from IoT devices and is limited in computation and storage ability. He therefore wants to outsource the data to the cloud and enable authorized users to access them later. In addition, he wants the fog servers to perform real-time aggregate statistical analysis, either on his own data or on his and others' data together.
• The Trusted Authority (TA) is a party fully trusted by all the other parties. It is in charge of generating system parameters, generating CP-ABE secret keys, and pre-computing one-time-use multiplication triples for fog servers.
• The Cloud Service Provider (CSP) is an entity that provides cloud storage service. Explicitly, it stores $t$ ciphertext shares for each IoT data item, where $t$ is the number of fog servers that help encrypt the plaintext. It is also responsible for verifying whether the attribute set of a user who wants to recover a data item satisfies the access tree defined by the data owner.
• Fog servers are nodes deployed at the network edge. They can offer a variety of services, such as helping encrypt data shares and uploading ciphertexts to the CSP for storage. Real-time data operations such as the summation, the product, and the variance are also their responsibility.
• The user equipped with IoT devices is also limited in computation and storage ability. He can gain access to the corresponding ciphertext data stored in the CSP only when his attribute set satisfies the access policy defined by the data owner.

B. SECURITY MODEL
Our threat model considers two types of attackers:
1) An inside attacker refers to the CSP or fog servers. They are assumed to be ''honest-but-curious'' in our scheme, namely, they will execute the assigned tasks honestly but would like to learn as much secret information as possible. For instance, they may attempt to extract useful information from the data. In addition, we allow collusion between fog servers. However, we require that the number of colluding fog servers is at most $t - 2$ if the number of fog servers is $t$, namely, at least two fog servers are honest, so that the original data cannot be guessed by the colluding fog servers.
2) An outside attacker refers to a malicious user who intends to obtain some knowledge about data that he neither owns nor has access to.
Therefore, all the sensitive IoT data need to be fully protected against both inside attackers and outside attackers. Moreover, pairwise authenticated and encrypted channels must be established between each IoT client and each fog server to protect data shares. To this end, we assume the existence of a public infrastructure and the basic cryptographic primitives (public-key encryption, digital signatures, etc.) that make secure channels possible.

C. DESIGN GOALS
In this paper, we construct a secure IoT data outsourcing scheme supporting fine-grained access control and data aggregation. Specifically, the scheme aims at achieving the following security goals and functions.
• Data Confidentiality: Our scheme should guarantee the confidentiality of data, implying that a user whose attribute set does not satisfy the access tree of the data cannot obtain any knowledge about it. Besides, fog servers cannot know or infer the data being operated on in the process of aggregate statistical analysis. The data should also be well protected against the curious CSP, and even against the trusted authority (TA).
• Fine-Grained Access Control: The data owner can define expressive and flexible policies so that the data can only be accessed by the users whose attributes satisfy these policies.
• Real-Time Data Aggregation: Some real-time data aggregation such as the addition, the multiplication, and the variance can be realized with the fog servers as an intermediate layer between users and the CSP. Besides, the computational complexity on the user side should be as small as possible.

V. SCHEME DESCRIPTION
Our scheme considers both the secure storage and the secure computation of outsourced IoT data, and achieves fine-grained access control at the same time. When a user wants to upload a data item $D$ to the CSP, he first splits it into $t$ shares using a secret-sharing scheme, where $t$ is the number of fog servers. For example, he can choose $t$ random numbers such that their sum equals $D$. He also defines an access tree $T$ for $D$ so that only a user whose attribute set $S$ satisfies $T$ can recover $D$ later. Then he sends each share along with $T$ to a fog server through an encrypted and authenticated channel. The fog servers store shares of multiple data items temporarily so that they can jointly compute the summation, the product, and the variance of these data without revealing the original data. They also use ciphertext-policy attribute-based encryption (CP-ABE) and symmetric encryption to encrypt their shares, and then send the encrypted data shares to the CSP for storage. The CSP finally stores a set of encrypted data shares and an access tree for $D$.
If a user with attribute set $S$ wants to access a data item $D$ stored on the CSP, he sends a download request to the CSP. Once the CSP has checked that $S$ satisfies the access tree $T$ of $D$, it sends $S$ to the Trusted Authority (TA) and each encrypted data share to the fog server that uploaded it. Then the TA generates a CP-ABE secret key using the attribute set $S$ and sends it to each fog server. Upon receiving the encrypted data share from the CSP and the CP-ABE secret key from the TA, each fog server decrypts its data share and sends it to the user over the encrypted and authenticated channel. Finally, the user recovers the data by adding the shares. The construction details of our scheme are described as follows.

A. SYSTEM SETUP
In this phase, the two encryption schemes used in our construction, CP-ABE and symmetric encryption, are initialized. Specifically, the $Setup_{CP-ABE}$ algorithm produces the following materials: public parameters $PK = (G_0, g, h = g^{\beta}, e(g, g)^{\alpha})$, where $G_0$ is a bilinear group of prime order $p$, $g$ is a generator of $G_0$, $e : G_0 \times G_0 \to G_1$ denotes the bilinear map, and $\alpha, \beta \in Z_p$ are two random exponents. We also define the Lagrange coefficient $\Delta_{i,S}$ for $i \in Z_p$ and a set $S$ of elements in $Z_p$: $\Delta_{i,S}(x) = \prod_{j \in S, j \neq i} \frac{x - j}{i - j}$. We additionally employ a hash function $H : \{0, 1\}^* \to G_0$, which we model as a random oracle. The master key $MK$ is $(\beta, g^{\alpha})$, which is kept by the trusted authority (TA).
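
As a small worked illustration of the Lagrange coefficient $\Delta_{i,S}(x)$, the sketch below evaluates it modulo a prime and uses it to interpolate a polynomial at 0. The toy modulus 101, the function name, and the sample polynomial are assumptions chosen for readability; in the scheme the modulus is the group order $p$.

```python
def lagrange_coeff(i, S, x, p):
    """Compute Delta_{i,S}(x) = prod_{j in S, j != i} (x - j)/(i - j) mod p."""
    num, den = 1, 1
    for j in S:
        if j != i:
            num = num * (x - j) % p
            den = den * (i - j) % p
    return num * pow(den, -1, p) % p  # division as a modular inverse

# Toy example: q(z) = 5 + 3z + 2z^2 over Z_101, evaluated at z = 1, 2, 3.
p = 101
points = {1: 10, 2: 19, 3: 32}
S = list(points)

# Interpolating at x = 0 recovers the constant term q(0).
q0 = sum(points[i] * lagrange_coeff(i, S, 0, p) for i in S) % p
print(q0)  # 5
```

The same coefficients appear in the exponent during decryption, where a threshold number of child values is combined to recover the value at the parent node.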

B. DATA UPLOAD
1. Suppose that user $i$ wants to upload a private data item $D_i$. He first splits it into $t$ shares using a secret-sharing scheme, one per fog server. Specifically, the user chooses $t$ random values $(D_{i,1}, \ldots, D_{i,t})$ subject to the constraint $D_i = D_{i,1} + \cdots + D_{i,t}$. Then he sends each share of $D_i$, along with the access policy $T_i$ he defined, to a fog server over an encrypted and authenticated channel.
2. Upon receiving the secret share $D_{i,j}$, each fog server $j$ stores the share temporarily (hours, days, or weeks, depending on the needs) for data aggregation such as addition and multiplication. In addition, it encrypts the share with CP-ABE and then symmetrically encrypts the result using its secret key $k_j$. The specific encryption process is as follows.
(1) The fog server $j$ invokes the $Enc_{CP-ABE}$ algorithm to encrypt the share $D_{i,j}$ under the access tree $T_i$. It first chooses a polynomial $p_x$ for each node $x$ in $T_i$ in a top-down manner, starting from the root node $R_i$. For each node $x$ in the tree, the degree $d_x$ of the polynomial $p_x$ is set to one less than the threshold value $k_x$ of that node, that is, $d_x = k_x - 1$. Starting with the root node $R_i$, the algorithm chooses a random $s_{i,j} \in Z_p$ and sets $p_{R_i}(0) = s_{i,j}$. Then it chooses $d_{R_i}$ other points of the polynomial $p_{R_i}$ randomly to define it completely. For any other node $x$, it sets $p_x(0) = p_{parent(x)}(index(x))$ and chooses $d_x$ other points randomly to define $p_x$ completely. Besides, the fog server $j$ chooses another random $DK_{i,j} \in Z_p$ as the key to symmetrically encrypt $D_{i,j}$. Let $Y_i$ be the set of leaf nodes in $T_i$. The fog server $j$ outputs a partial ciphertext
$\widetilde{CT}_{i,j} = \{T_i,\ \tilde{C}_{i,j} = DK_{i,j} \cdot e(g, g)^{\alpha s_{i,j}},\ c_{i,j} = Enc_{DK_{i,j}}(D_{i,j}),\ C_{i,j} = h^{s_{i,j}},\ \forall y \in Y_i : C_{i,j,y} = g^{p_y(0)},\ C'_{i,j,y} = H(attr(y))^{p_y(0)}\}$ (1)
(2) The fog server $j$ uses its secret key $k_j$ to symmetrically encrypt $\tilde{C}_{i,j}$ computed in the previous step and obtains the new ciphertext
$CT_{i,j} = \{T_i,\ \hat{C}_{i,j} = Enc_{k_j}(\tilde{C}_{i,j}),\ c_{i,j},\ C_{i,j},\ \forall y \in Y_i : C_{i,j,y},\ C'_{i,j,y}\}$ (2)
Finally, fog server $j$ uploads $CT_{i,j}$ to the cloud server.
3. Upon receiving the ciphertexts of all the shares of $D_i$ from the $t$ fog servers, the cloud server stores $CT_i = \{CT_{i,j}\}_{j=1,\ldots,t}$ for each data item $D_i$. Note that the first component of $CT_{i,j}$ uploaded by each fog server $j$ is the same, namely $T_i$, so the cloud server needs to store only one copy.
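
The top-down polynomial assignment in step 2(1) can be sketched as follows. This sketch only computes the exponents $p_y(0)$ assigned to the leaves and omits all group operations and pairings. The `Node` class, the modulus `P`, and the function names are assumptions for illustration; the polynomial is fixed by drawing random coefficients, which is one way of choosing the remaining points at random, and attributes are assumed to be unique within the tree.

```python
import secrets
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Assumed prime modulus; in the scheme this is the group order p.
P = 2**61 - 1

@dataclass
class Node:
    threshold: int = 1
    attribute: Optional[str] = None
    children: List["Node"] = field(default_factory=list)

def eval_poly(coeffs, x, p=P):
    """Horner evaluation of sum(coeffs[k] * x**k) modulo p."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

def assign_shares(node: Node, secret: int, out: Dict[str, int], p=P) -> None:
    """Set p_x(0) = secret with degree k_x - 1, then recurse into the children."""
    if not node.children:
        out[node.attribute] = secret          # leaf exponent p_y(0)
        return
    coeffs = [secret] + [secrets.randbelow(p) for _ in range(node.threshold - 1)]
    for index, child in enumerate(node.children, start=1):
        # Child gets p_child(0) = p_parent(index(child)).
        assign_shares(child, eval_poly(coeffs, index, p), out, p)

# Hypothetical 2-of-2 (AND) policy over two attributes.
tree = Node(threshold=2, children=[Node(attribute="doctor"),
                                   Node(attribute="cardiology")])
s = secrets.randbelow(P)
leaf_exponents: Dict[str, int] = {}
assign_shares(tree, s, leaf_exponents)
```

For the AND gate above, the root polynomial has degree 1, so the root secret can be recovered only from both leaf values, as $s = 2 \cdot p(1) - p(2)$.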

C. DATA AGGREGATION

1) ADDITION
Suppose $n$ data $(D_1, D_2, \ldots, D_n)$ are to be added, and that each fog server $j$ ($j = 1, 2, \ldots, t$) stores a set of data shares $DS_j = \{D_{i,j} \mid i \in \{1, 2, \ldots, n\}\}$. The addition is implemented as follows: each fog server $j$ computes $S_j = D_{1,j} + D_{2,j} + \cdots + D_{n,j}$ and publishes $S_j$. Then the sum of these $n$ data is $S = S_1 + S_2 + \cdots + S_t = D_1 + D_2 + \cdots + D_n$. If the mean value of these data is needed, it can be computed by dividing $S$ by $n$.

2) MULTIPLICATION
When fog servers need to compute the product $P$ of a constant $A$ and a data item $D$ whose shares they store, each fog server $j$ can compute a share $P_j$ of $P$ by multiplying its stored share $D_j$ by $A$, that is, $P_j = A \cdot D_j$. Then $P = P_1 + P_2 + \cdots + P_t = A \cdot D_1 + A \cdot D_2 + \cdots + A \cdot D_t = A \cdot (D_1 + D_2 + \cdots + D_t) = A \cdot D$, where $t$ is the number of fog servers. Note that in this case the private data $D$ can be easily inferred by fog servers. In practice, however, fog servers usually need to compute more complex algebraic formulas such as $A \cdot D_1 + B \cdot D_2$ or $A \cdot D_1 + D_2 \cdot D_3$, and we give only the basic multiplication of a constant and a private data item here.
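
The local computation for a formula such as $A \cdot D_1 + B \cdot D_2$ can be sketched as below. The modulus `P`, the helper names, and the sample constants are assumptions for illustration; each fog server works only on its own shares and no interaction is needed.

```python
import secrets

# Assumed prime modulus for the illustration.
P = 2**61 - 1

def split(value, t, p=P):
    """Additively share `value` among t fog servers modulo p."""
    shares = [secrets.randbelow(p) for _ in range(t - 1)]
    shares.append((value - sum(shares)) % p)
    return shares

def linear_share(A, d1_share, B, d2_share, p=P):
    """Fog server j's share of A*D1 + B*D2, computed from its own shares only."""
    return (A * d1_share + B * d2_share) % p

# Hypothetical inputs: D1 = 10, D2 = 4 shared among t = 3 fog servers.
d1, d2 = split(10, 3), split(4, 3)
published = [linear_share(3, x, 5, y) for x, y in zip(d1, d2)]
print(sum(published) % P)  # 50
```

Because the result mixes two private inputs, publishing it does not directly reveal either $D_1$ or $D_2$, unlike the single-constant case.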
The multiplication of two data items $D_1$ and $D_2$ can be implemented through Beaver's multi-party computing (MPC) protocol. Suppose each fog server $j$ holds a share $(a_j, b_j, c_j)$, where $(a, b, c)$ is a one-time-use multiplication triple pre-computed by the trusted authority (TA) and subject to the constraint $a \cdot b = c$. Fog server $j$ computes $d_j = D_{1,j} - a_j$ and $e_j = D_{2,j} - b_j$ and then broadcasts $d_j$ and $e_j$. Each fog server $j$ can thus reconstruct $d$ and $e$ and further compute $P_j = de/t + d b_j + e a_j + c_j$. Then fog server $j$ publishes $P_j$, and the product of $D_1$ and $D_2$ is $P = P_1 + P_2 + \cdots + P_t = D_1 \cdot D_2$.

3) VARIANCE
If we need to know the variance of $n$ data $(D_1, D_2, \ldots, D_n)$, fog servers can compute the variance $V = \frac{1}{n}\sum_{i=1}^{n} D_i^2 - \left(\frac{1}{n}\sum_{i=1}^{n} D_i\right)^2$ as follows.
Suppose fog server $j$ stores a share $D_{i,j}$ of $D_i$, and that it also holds a set of shares $\{(a_{i,j}, b_{i,j}, c_{i,j})\}_{i=1,\ldots,n}$ of multiplication triples, which are pre-computed by the trusted authority (TA) and subject to the constraint $a_i \cdot b_i = c_i$.
For each data share $D_{i,j}$, the fog server $j$ computes $S_j = \ldots$

D. DATA RECOVERY
1. When a user with an attribute set $S$ wants to recover a data item $D_i$ stored in the cloud, he sends a download request for $D_i$ to the cloud server. Since the server stores the access tree $T_i$ of $D_i$, it first checks whether $T_i(S) = 1$. If not, the server rejects the download request. Otherwise, the server sends the attribute set $S$ to the trusted authority (TA) and the ciphertext $CT_{i,j}$ ($j = 1, \ldots, t$) to each fog server $j$.
2. The trusted authority (TA) runs the $KeyGen_{CP-ABE}$ algorithm to generate the secret key. It first chooses a random $r \in Z_p$ and then a random $r_a \in Z_p$ for each attribute $a \in S$. Then it computes the key as
$SK = (K = g^{(\alpha + r)/\beta},\ \forall a \in S : K_a = g^{r} \cdot H(a)^{r_a},\ K'_a = g^{r_a})$ (3)
and sends $SK$ to all fog servers.
3. Upon receiving $CT_{i,j}$ from the cloud server and $SK$ from the trusted authority (TA), fog server $j$ first uses its secret key $k_j$ to symmetrically decrypt $\hat{C}_{i,j}$ in $CT_{i,j}$ by $\tilde{C}_{i,j} = Dec_{k_j}(\hat{C}_{i,j})$ and obtains $\widetilde{CT}_{i,j}$. Then it can invoke the $Dec_{CP-ABE}$ algorithm to recover the data share $D_{i,j}$.
The $Dec_{CP-ABE}$ algorithm is realized by a recursive algorithm $DecryptNode(\widetilde{CT}_{i,j}, SK, x)$. If the node $x$ is a leaf node and $a = attr(x) \in S$, then $DecryptNode(\widetilde{CT}_{i,j}, SK, x) = \frac{e(K_a, C_a)}{e(K'_a, C'_a)} = \frac{e(g^{r} \cdot H(a)^{r_a}, g^{p_x(0)})}{e(g^{r_a}, H(a)^{p_x(0)})} = e(g, g)^{r p_x(0)}$. If $a = attr(x) \notin S$, then $DecryptNode(\widetilde{CT}_{i,j}, SK, x) = \perp$.
If the node $x$ is a non-leaf node, the algorithm $DecryptNode(\widetilde{CT}_{i,j}, SK, x)$ proceeds as follows. For all nodes $c$ that are children of $x$, it calls $DecryptNode(\widetilde{CT}_{i,j}, SK, c)$ and stores the output as $DN_c$. Let $S_x$ be an arbitrary $k_x$-sized set of child nodes $c$ such that $DN_c \neq \perp$. If no such set exists, then the node is not satisfied and the function returns $\perp$. Otherwise, it computes $DN_x = \prod_{c \in S_x} DN_c^{\Delta_{i,S'_x}(0)} = e(g, g)^{r p_x(0)}$ and returns the result, where $i = index(c)$ and $S'_x = \{index(c) : c \in S_x\}$. Therefore, if the tree $T_i$ is satisfied by $S$, the fog server $j$ can obtain $DN = DecryptNode(\widetilde{CT}_{i,j}, SK, R_i) = e(g, g)^{r \cdot p_{R_i}(0)} = e(g, g)^{r s_{i,j}}$ and then decrypt by computing
$\tilde{C}_{i,j} / (e(C_{i,j}, K)/DN) = DK_{i,j} \cdot e(g, g)^{\alpha s_{i,j}} / (e(h^{s_{i,j}}, g^{(\alpha + r)/\beta}) / e(g, g)^{r s_{i,j}}) = DK_{i,j}$.
Finally, each fog server $j$ decrypts $D_{i,j}$ by $D_{i,j} = Dec_{DK_{i,j}}(c_{i,j})$ and sends $D_{i,j}$ to the user over the encrypted and authenticated channel.
4. After receiving all the $t$ shares of $D_i$, the user is able to recover $D_i$ by a simple addition, $D_i = D_{i,1} + \cdots + D_{i,t}$.

VI. SECURITY ANALYSIS
Our scheme aims at achieving secure data aggregation and precise access control for outsourced IoT data. First of all, we should guarantee the correctness of data aggregation on the fog server side, meaning that the results of addition, multiplication, and variance computed by fog servers are all correct. Through the detailed description in Section V-C, and the hypothesis in the security model that fog servers are honest but curious, it is obvious that correctness is fulfilled. As is described in Section IV-C, the security goal of the scheme is to realize data confidentiality and fine-grained access control. We will demonstrate the security of our scheme in these two aspects in the security analysis below.

A. DATA CONFIDENTIALITY
In our scheme, an IoT data D is first split into shares D 1 , . . . , D t by its owner before being uploaded, where t is the number of fog servers and D = D 1 + . . . + D t . Then these data shares are sent to fog servers over an encrypted and authenticated channel. It can be observed that the data is secure as long as at least two fog servers do not collude. Even if fog servers 1, 2, . . . , t − 2 collude, they can not infer D without D t−1 and D t .
These t data shares will be encrypted by fog servers using the ciphertext-policy attribute-based encryption (CP-ABE) and symmetrical encryption before being uploaded to the cloud service provider (CSP) finally. Concretely, the data share D j is symmetrically encrypted in the form c j = Enc DK j (D j ) by fog server j as in Equation 2. Here the encryption key DK j is uniformly chosen at random in Z p , thus the encryption of D j can be regarded as ''one-time pad''. Katz et al. have given a formal proof in [29] that one-time pad encryption scheme is perfectly-secret.
The symmetric encryption key DK j is not kept by the fog server j in our scheme. Instead, it is first encrypted using the ciphertext-policy attribute-based encryption (CP-ABE) as in Equation 1. Bethencourt et al. have demonstrated in [8] that their CP-ABE scheme is secure against chosen plaintext attacks (CPA-Secure). They argued that no efficient adversary that acts generically on the groups underlying their CP-ABE scheme could break the security of CP-ABE scheme with any reasonable probability, and they proved their argument using the generic bilinear group model and the random oracle model. Then DK j is further symmetrically encrypted by fog server j using its secret key k j as in Equation 2, which will enhance the security of DK j undoubtedly.
Another, more critical reason for further encrypting $DK_j$ symmetrically under fog server $j$'s secret key is to protect $DK_j$ against the other fog servers during data recovery. Specifically, when a user whose attribute set $S$ satisfies the access policy of data $D$ wants to recover $D$, the trusted authority (TA) generates the secret key $SK$ from $S$ and sends $SK$ to each fog server that has uploaded a share of $D$. If $DK_j$ were encrypted using the CP-ABE scheme alone and the ciphertext $CT_j$ were somehow obtained by another fog server, that server could recover $DK_j$ by calling the CP-ABE decryption algorithm $\mathrm{Dec}(PK, CT_j, SK)$. In our scheme, however, the final encryption of $DK_j$ is $CT_j = \{T,\ \tilde{C}_j = \mathrm{Enc}_{k_j}(DK_j \cdot e(g, g)^{\alpha s_j}),\ c_j = \mathrm{Enc}_{DK_j}(D_j),\ C_j = h^{s_j},\ \forall y \in Y : C_{j,y} = g^{p_y(0)},\ C'_{j,y} = H(\mathrm{attr}(y))^{p_y(0)}\}$. No fog server other than fog server $j$ can decrypt $\tilde{C}_j$, so none of them can call the CP-ABE decryption algorithm to obtain $DK_j$. Based on the above analysis, our scheme achieves data confidentiality.

B. FINE-GRAINED ACCESS CONTROL
The purpose of access control is to make sure that data can be correctly recovered by a user whose attribute set satisfies the associated access policy, while those who do not meet the policy learn nothing about the data. The data recovery process of our scheme shows directly that the former is achieved. For an invalid user, the attribute set $S$ fails the cloud's check $T(S) = 1$ on the access tree, so the user is unable to proceed with the normal data recovery process. Moreover, even if the user somehow gets the ciphertext of a data item $D$, the user cannot decrypt any $CT_j$, let alone recover the data.
Fine granularity is reflected in the ability to specify the access rights of individual users flexibly. By utilizing CP-ABE in our scheme, the data owner is able to enforce expressive and flexible access policies. Specifically, the access policy of encrypted data supports complex operations to represent any desired attribute set. For example, we can represent a tree with "AND" and "OR" gates by using 2-of-2 and 1-of-2 threshold gates, respectively. Therefore, our scheme achieves fine-grained access control by construction.
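As an illustration of how threshold gates express such policies, the following Python sketch evaluates the check T(S) = 1 on a small access tree. It models only the policy logic, not the CP-ABE cryptography; the `Node` class, the helper names, and the example attributes are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    attribute: Optional[str] = None   # set only for leaf nodes
    threshold: int = 0                # k of a k-of-n gate (internal nodes)
    children: List["Node"] = field(default_factory=list)

def leaf(attribute):
    return Node(attribute=attribute)

def gate(k, *children):
    """A k-of-n threshold gate over the given children."""
    return Node(threshold=k, children=list(children))

def AND(a, b):
    return gate(2, a, b)   # 2-of-2 threshold gate

def OR(a, b):
    return gate(1, a, b)   # 1-of-2 threshold gate

def satisfies(node, attrs):
    """Return True when the attribute set attrs satisfies the tree, i.e. T(S) = 1."""
    if node.attribute is not None:
        return node.attribute in attrs
    satisfied = sum(1 for child in node.children if satisfies(child, attrs))
    return satisfied >= node.threshold

# Example policy: doctor AND (cardiology OR emergency)
policy = AND(leaf("doctor"), OR(leaf("cardiology"), leaf("emergency")))
```

With this policy, the attribute set {"doctor", "emergency"} passes the check, while {"cardiology"} alone does not.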

VII. PERFORMANCE ANALYSIS
In this section, we evaluate the performance of our secure IoT data outsourcing scheme, which is proposed to achieve both real-time aggregate statistics and fine-grained access control of outsourced IoT data. Since [12], [21], [22] are among the very few works that consider secure IoT data aggregation, and [9], [26] are two state-of-the-art works dealing with access control in IoT, we analyze our scheme in contrast to these five schemes. An overall comparison of these schemes is given in Table 1, where "√" indicates that the scheme supports the corresponding function and "×" indicates that it does not. We concentrate on the computation overhead on the client side and the fog server side, including the data upload and data recovery processes. To facilitate comparison, we choose the Advanced Encryption Standard (AES)-256 in Cipher-Block Chaining (CBC) mode as the symmetric key encryption scheme. Moreover, we utilize the PBC library [30] and the jPBC library [31] and choose the type A pairing with 160-bit security level to conduct simulation experiments. Specifically, we adopt a desktop with an Intel(R) Core(TM) i7-8700 CPU @3.20GHz and Linux version 4.19.36-1-MANJARO as a fog node, and an Android phone MI 6X with MIUI 10.3 and Android 9.0, a Snapdragon 660 CPU, and 6 GB memory as the IoT device.

A. COMPUTATION OVERHEAD
In this section, we analyze the computation overhead on the client side and the fog side theoretically. Let Pair denote one pairing operation on $e : G_1 \times G_2 \to G_T$, Exp denote one exponentiation in group $G_1$, Mul denote one multiplication in group $G_1$, and Add denote one addition in group $\mathbb{Z}_p$.
In Sun et al. [12], a data owner needs to perform $2n$ multiplications to encrypt a data item before uploading it, where $n$ is the number of shares that a data item is divided into. Data decryption needs $2n + 4$ multiplications and $2n + 1$ additions. In Gong et al. [21], a data owner needs to perform 512 multiplications and 448 additions, which follows from their adoption of eight rounds of eight-order matrix multiplication. This does not include the overhead of other operations such as the XOR and hash operations of DES encryption. Data recovery needs the same number of multiplications and one more eight-order matrix addition. In Guan et al. [22], to enable the cloud and the fog to achieve data aggregation, the data owner needs to perform 2 exponentiations and 1 multiplication during data collection and data aggregation. Note that they outsource data aggregation but not data storage to the cloud, so there is no data recovery to consider. In Huang et al. [9], upon receiving the partial ciphertext computed by fog nodes, the data owner needs to perform 4 exponentiations and 3 multiplications to finally encrypt the data. In Zhang et al. [26], the computation overhead is similar, being 3 exponentiations and 3 multiplications. In these two schemes, the computation required by the user to decrypt a data item is the same, namely 2 multiplications and 1 pairing operation. In our scheme, by contrast, a data owner only needs to perform $n - 1$ additions to upload an IoT data item, where $n$ is the number of fog servers and also the number of shares that the data item is divided into. Since data upload and data recovery are symmetric processes, the computation overhead of data recovery is also $n - 1$ additions. The detailed computation overhead on the client side is listed in Table 2.
In terms of the computation overhead on the fog server side, Huang et al. [9], Zhang et al. [26] and our scheme all adopt fog computing and utilize ciphertext-policy attribute-based encryption (CP-ABE) to accomplish fine-grained access control of IoT data, so the computation overhead on the fog server side depends on the access policy. Guan et al. [22], in contrast, use fog nodes only to perform outsourced aggregation of IoT data, with no need to upload data to the cloud or to help users recover data from the cloud, and the computation overhead on the fog side in their scheme depends on the number of smart devices in the fog. Therefore, we compare our scheme with Huang et al. [9] and Zhang et al. [26]. Let $|Y|$ denote the number of leaf nodes of the access tree. In Huang et al. [9], the fog server needs to perform $(2|Y| + 2)$ exponentiations to compute the partial ciphertext for an IoT data item that needs to be uploaded, and 2 pairing operations and 2 multiplications to recover a partial ciphertext. In Zhang et al. [26], the computation overhead for data recovery is the same, while they save $|Y|$ exponentiations during data upload. In our scheme, the computation needed by each fog server for uploading a data item is $(2|Y| + 2)$ exponentiations and 1 multiplication, and that for recovering a data item is 1 pairing operation and 2 multiplications. The detailed computation overhead on the fog server side of these three schemes is listed in Table 3.

B. EXPERIMENTAL ANALYSIS
According to the above analysis, our scheme is intuitively more efficient than the other five schemes, especially on the client side. We demonstrate the effectiveness of our scheme through experiments in this section. Based on the experimental setting, our experiment results are as follows: 1) The times required by a modular addition in $\mathbb{Z}_p$, a multiplication and an exponentiation in $G_1$, and a pairing on $e : G_1 \times G_2 \to G_T$ on the fog side are 2.19 µs, 2.19 µs, 0.94 ms, and 0.75 ms, respectively. 2) The times required by a modular addition in $\mathbb{Z}_p$, a multiplication and an exponentiation in $G_1$, and a pairing on $e : G_1 \times G_2 \to G_T$ on the IoT device are 0.14 ms, 0.13 ms, 31.57 ms, and 50.39 ms, respectively.
Combining the experiment results with Table 2 and Table 3, we obtain Fig. 2 and Fig. 3, which provide a clearer and more intuitive comparison between our scheme and the other works. In Fig. 2, the number of shares that an IoT data item is divided into is assumed to range from 3 to 10. In Fig. 3, the number of attributes in an access policy is assumed to range from 5 to 50.
It can be seen from Fig. 2 that on the client side, our scheme has the lowest computation overhead in both data upload and data recovery. The computation overhead of Sun et al. [12] is slightly higher than ours, and Gong et al. [21] has the highest computation overhead. In Fig. 2 (a), the computation overhead during data upload decreases in the order Huang et al. [9], Zhang et al. [26], Guan et al. [22]. Even so, the computation overhead of Guan et al. [22] is much higher than ours even when the number of divided shares in our scheme reaches ten. Fig. 2 (b) shows that during data recovery, Huang et al. [9] and Zhang et al. [26] have the same computation overhead on the client side.
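The combination of operation counts and measured timings can be sketched in a few lines of Python. This is a rough cost model, assuming each total is a plain weighted sum of the per-operation times reported above; the dictionary layout and function names are hypothetical, and only the schemes whose counts are stated in terms of Pair, Exp, Mul, and Add are included.

```python
# Per-operation timings in milliseconds, copied from the measurements above.
IOT = {"Add": 0.14, "Mul": 0.13, "Exp": 31.57, "Pair": 50.39}
FOG = {"Add": 0.00219, "Mul": 0.00219, "Exp": 0.94, "Pair": 0.75}

def cost_ms(counts, timings):
    """Weighted sum: number of operations times the time of one operation."""
    return sum(count * timings[op] for op, count in counts.items())

def client_upload_counts(n):
    """Client-side upload counts; n is the number of shares (fog servers)."""
    return {
        "Guan [22]": {"Exp": 2, "Mul": 1},
        "Huang [9]": {"Exp": 4, "Mul": 3},
        "Zhang [26]": {"Exp": 3, "Mul": 3},
        "Ours": {"Add": n - 1},
    }

def fog_counts(y):
    """Fog-side counts; y is |Y|, the number of leaf nodes of the access tree."""
    return {
        "Huang [9]": {"upload": {"Exp": 2 * y + 2},
                      "recovery": {"Pair": 2, "Mul": 2}},
        "Zhang [26]": {"upload": {"Exp": y + 2},
                       "recovery": {"Pair": 2, "Mul": 2}},
        "Ours": {"upload": {"Exp": 2 * y + 2, "Mul": 1},
                 "recovery": {"Pair": 1, "Mul": 2}},
    }

def client_upload_ms(n):
    return {name: cost_ms(c, IOT) for name, c in client_upload_counts(n).items()}

def fog_ms(y, phase):
    return {name: cost_ms(c[phase], FOG) for name, c in fog_counts(y).items()}
```

Under this model, `client_upload_ms(10)` gives about 1.26 ms for our scheme (nine additions at 0.14 ms each) against about 126.67 ms for Huang et al. [9], which is dominated by four exponentiations on the IoT device.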
On the fog server side, Fig. 3 (a) shows that our scheme has the highest computation overhead during data upload, followed closely by Huang et al. [9] with a gap of 2.19 µs, the cost of the one additional multiplication. The computation overhead of Zhang et al. [26] is the lowest, nearly half of ours. However, during data recovery, the computation overhead of our scheme is the lowest, as can be seen from Fig. 3 (b). Huang et al. [9] and Zhang et al. [26] have the same computation overhead, almost twice that of ours.
Based on the above comparison, we can draw a conclusion that our scheme enjoys a better performance as a whole, considering that we achieve both secure data aggregation and fine-grained access control of outsourced IoT data, while the other schemes just achieve one of the two functions.

VIII. CONCLUSION
In this paper, we propose a secure IoT data outsourcing scheme that supports both real-time aggregate statistical analysis and fine-grained access control of outsourced IoT data. By utilizing Prio, Corrigan-Gibbs et al.'s system for computing aggregate statistics, and Beaver's multi-party computation (MPC) protocol, fog servers can perform aggregation such as addition, multiplication, and variance on the IoT data uploaded by the data owner, without knowing the original data. Ciphertext-policy attribute-based encryption (CP-ABE) helps us realize fine-grained access control, allowing only a user whose attribute set satisfies the access policy to recover the corresponding data. The security analysis shows that our scheme ensures correctness and data confidentiality. The performance analysis and experiments demonstrate the efficiency of our scheme, indicating that it is suitable for resource-constrained IoT devices such as the MI phone used in our experiment and can thus be further used in real-time health monitoring and many other IoT environments. In future work, we will seek ways to protect the confidentiality of the results computed by fog servers, which is not considered in this scheme. Another problem is that the storage overhead in our scheme increases with the number of shares that a data item is divided into, which is also a focus of our further research.
LING LIU received the B.S. degree from the School of Telecommunications Engineering, Xidian University, in 2013. She is currently pursuing the Ph.D. degree with the School of Cyber Engineering, Xidian University. Her research interests include cloud computing security, network and system security, and applied cryptography.
HE WANG received the Ph.D. degree in information security from Xidian University, China. She is currently a Lecturer with the School of Cyber Engineering, Xidian University. Her research interests include information security and quantum cryptography protocol.
YUQING ZHANG received the Ph.D. degree in cryptography from Xidian University, China. He is currently a Professor with the School of Cyber Engineering, Xidian University, and the Director of the National Computer Network Intrusion Protection Center, University of Chinese Academy of Sciences. He has published more than 100 research articles in international journals and conferences, such as ACM CCS, USENIX, the IEEE TPDS, and the IEEE TDSC. His research interests include network and system security, and applied cryptography. He has served as the Program Chair for more than five international workshops (e.g., SMCN-2017), and a PC member for more than ten international conferences in networking and security, such as the IEEE Globecom 16/17, the IEEE CNS 17, and IFIP DBSec 17.