Blockchain and NFTs for Time-Bound Access and Monetization of Private Data

Digital data has enabled organizations to anticipate future threats, opportunities, and trends. However, digital data owners do not know how their data is accessed, shared, and monetized. In this paper, we propose using blockchain technology and non-fungible tokens (NFTs) to enable time-bound access and monetization of private data. Our approach allows users to upload encrypted content and mint it into NFTs. Other users can access the NFTs' content by requesting a purchase or a license. Purchasing content transfers the ownership of the NFTs to the buyer, whereas licensing it permits access to the private data for a limited period of time, after which the data is automatically deleted. Our approach uses a decentralized application (DApp), proxy reencryption (PRE), the InterPlanetary File System (IPFS), and a trusted execution environment (TEE) to manage a fully decentralized and robust system. We implement a proof-of-concept system in an Ethereum-based environment, which is used for testing and vulnerability checks. We present the cost and security analyses and discuss the generalization aspect of the solution. Our smart contracts and testing scripts are publicly available under an open-source license.

works [4], [5], [6]. Furthermore, additional features can be integrated as part of the new solution, such as the ability to present data ownership certificates, time-bound data sharing, and secure data monetization. Such a solution can benefit from a decentralization framework, in which no entity can take control over the actions of the data owner [7], [8].

The associate editor coordinating the review of this manuscript and approving it for publication was Yassine Maleh.
A primary use case that embodies all aspects of data ownership, sharing, and monetization is concerned with patient data. Patients are increasingly cautious about their right to own their data. Ideally, patients should be able to directly receive and verify their medical reports and information, such as x-ray scans and genomic sequences, so no other party can claim ownership over the data or have access to it without permission from the patient. Figure 1 illustrates this use case, which also shows the ability of the patient to grant a paid license of a subset of the data to a medical institution, where the license expires after one week.

the data with, and the duration before the data sharing

we discuss our proposed solution's feasibility, usability, security, and resilience to malicious attacks.

This paper is organized into five sections. Section II reviews the existing solutions for data ownership, sharing, and monetization. Section III presents our proposed architecture and approach details. Section IV discusses the implementation of our algorithms and smart contract functions, in addition to evaluating the implementation in terms of functionality and cost. Finally, we examine our approach in Section V and recap our discoveries in Section VI.

1 https://github.com/AnonGitter20220510/nft-content-sharing

This section presents the existing solutions addressing storage and ownership, time-bound data sharing, and data monetization. Our exploration of the previous literature found no proposal that integrates the three aspects into a single system. Therefore, we discuss each component separately.

A. STORAGE AND OWNERSHIP

The current state of storage systems presents users with an abundance of commercial solutions, such as Google Drive, Microsoft OneDrive, and Dropbox. Although these systems do not offer proof of ownership, they provide users with the means to control who can access the data. However, common aspects of such solutions include their expensive pricing and limitations in feature sets for personal and small-scale use [24]. In addition, centralized solutions fail to defend against attacks and increasingly suffer in terms of availability [25], [26].

Open-source solutions offer more flexible and independent categories of storage and ownership systems [27]. For example, open-source software (OSS) guarantees that the user can potentially read and even make changes to the running processes. However, setting up a cheap and reliable OSS data storage system can quickly become a tedious job for the ordinary user. One more problem with OSS solutions is that they expose personal devices to allow data sharing and interoperability across networks.

More recently, an exciting new type of storage system has emerged. Filecoin, Storj, and Arweave are decentralized storage solutions that build peer-to-peer networks and incentivize their nodes to keep the files stored [28], [29], [30]. Filecoin is currently the most popular decentralized storage solution. It uses IPFS as the basis of the storage network and relies on proof-of-storage to ensure incentives are paid to the deserving nodes. An interesting aspect of decentralized solutions is their low pricing, which can be as little as one-tenth the price of Amazon S3 Buckets [24]. However, since decentralized storage systems only provide the means to upload and retrieve data, they do not offer proof of data ownership.

implementing an expiration date for the shared data.

One approach to the data expiration problem is to set an expiration date for the access token granted by the owner, after which it would not be possible for the receiving party to proceed with the retrieval. This approach is only favorable for problems where the data in question is dynamically produced and streamed to the other party [38].
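The token-expiry idea can be sketched as follows. This is a minimal illustration, assuming a hypothetical `AccessToken` record and `can_retrieve` check (neither name comes from the cited work). It only gates retrieval at the sharing service and does nothing about copies the receiving party has already obtained, which is why the approach suits streamed data.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class AccessToken:
    """Hypothetical token issued by the data owner to a receiving party."""
    holder: str
    expires_at: datetime  # timezone-aware expiry instant


def can_retrieve(token: AccessToken, now: datetime) -> bool:
    """Allow retrieval only strictly before the expiry instant."""
    return now < token.expires_at


# Example: a token that expires one week after it is issued.
issued = datetime(2022, 1, 1, tzinfo=timezone.utc)
token = AccessToken(holder="clinic", expires_at=issued + timedelta(weeks=1))
```

Passing `now` explicitly, rather than reading the clock inside the function, keeps the check deterministic and easy to test.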

The research is advancing to a more confidential and time-bound sharing mechanism where the receiving party would not be able to duplicate or copy the data even after

ware updates with all security patches [39], [40]. To avoid these limitations on the client-side, researchers proposed alternative solutions that assure data deletion and enforce self-expiration, using techniques such as attribute-based encryption and revocation, TEE, threshold secret sharing, and frequently colliding hash tables [41], [42], [43]. The drawback of these techniques is that they must reside on a centralized server that cannot be trusted. As far as decentralized

ing time and fees, especially for international transfers [16]. 4) Weak protection against data manipulation, leading to a more time-consuming and challenging auditing process [46].

Researchers and commercial solutions have extensively explored monetizing genomic data and incorporating blockchain into its process [18], [45]. Early suggestions incentivized patients to share their medical data with researchers using an automated smart contract. For instance, Nebula is a private blockchain platform for sharing genomic data that uses proprietary storage, ledger, and analytics solutions [16]. In addition to genomic data, researchers proposed letting patients sell their encrypted medical records by listing them on a blockchain-based marketplace with mechanisms to assure proof of delivery [47]. Blockchain-based marketplace proposals were also introduced to a broader range of industries, such as real-time IoT data streams and general end-user documents [48], [49].

The research in the monetization field has been limited to special and targeted use cases, such as paying small fees in return for medical information or paying royalties to obtain digital artworks. Additionally, the majority of the marketplace research does not support the proposed design. To the best of our knowledge, we are the first to propose and implement a secure, global, and generalized architecture for encrypted data monetization.

In this section, we discuss our proposed approach by describing the different layers in the architecture and presenting the responsibilities of the system's entities and how they interact with it.

Our proposed architecture comprises four general components: the end-user who interacts through the DApp, the blockchain smart contracts, the off-chain oracle nodes, and the decentralized storage. The types of interactions among the entities are depicted more clearly in Figure 2. Although our design is agnostic of any specific framework, we chose to present the solution with Ethereum and IPFS terminologies because of their common usage in the academic literature.

An end-user in our system is an entity that interacts with the smart contracts, the oracle nodes, and the decentralized storage through the DApp. We categorize the end-user capabilities into three personas that are not mutually exclusive:

2) Owner: The owner of the NFT assets, which could be either uploaded by the same end-user or a different one (producer). It can optionally delegate the minting process to be done by the producer. It has the exclusive right to grant sharing of data assets with consumers.

VOLUME 10, 2022

Considering that all personas require the end-user to interact with the decentralized storage, our system requires the end-user to have a verified registration linking the smart contract with their storage address. Our mechanism requires the user to publish a verify.txt file containing their Ethereum address (EA) on the InterPlanetary Name Service (IPNS), and the smart contracts can later verify the validity of the proof. IPNS is a service that creates human-readable, mutable addresses pointing to IPFS content.

The DApp is where users can send requests to and receive updates from the smart contracts. From the perspective of the network, it is the DApp that gets connected to the blockchain gateways (e.g., Infura) and nodes [50]. Therefore, the DApp

task of each node. The reputation scores are used to eliminate unsatisfactory nodes and to calculate the trust model ratings. Rather than directly using the reputation scores, we incorporate Laplace's rule of succession to produce a probabilistic reputation score [52], [53]. The probabilistic score is capable of fairly comparing nodes with a high reputation score and few participations against ones with a slightly lower reputation score and many participations.
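A small worked sketch shows how the rule of succession tempers a score that rests on few observations. It assumes the reputation is normalized to [0, 1] and treated as the success ratio over p participations; the function name and this mapping are illustrative assumptions, not necessarily the exact formula used in the smart contracts.

```python
def probabilistic_reputation(reputation: float, participations: int) -> float:
    """Laplace's rule of succession: (successes + 1) / (trials + 2).

    Assumption: `reputation` is in [0, 1] and is interpreted as the fraction
    of successful outcomes over `participations` trials.
    """
    if not 0.0 <= reputation <= 1.0:
        raise ValueError("reputation must be within [0, 1]")
    if participations < 0:
        raise ValueError("participations must be non-negative")
    successes = reputation * participations
    return (successes + 1.0) / (participations + 2.0)


# A perfect score over 2 tasks versus a slightly lower score over 50 tasks.
newcomer = probabilistic_reputation(1.0, 2)   # (2 + 1) / (2 + 2)
veteran = probabilistic_reputation(0.9, 50)   # (45 + 1) / (50 + 2)
```

Under these assumptions the veteran node ranks above the newcomer even though its raw reputation is lower, and a node with no history starts at 0.5.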

Interactions between end-users and oracle nodes are not supervised by the trust models but instead managed by other means of incentivizing correct actions, such as paying collateral. Additionally, oracles are prohibited by OracleSC from participating in multiple requests at the same time. Locking oracles prevents them from being drawn into more requests than they are willing or able to answer.

The decentralized storage component can be considered an unmanaged entity, albeit crucial to the system. The two required capabilities in a decentralized storage solution are the ability to add and publish content assets. Adding a content asset will result in a content-addressable URL and, therefore, an inability to update the asset on the same URL, whereas publishing an asset will result in a fixed URL that points to the latest version of it. These fundamental capabilities are offered in the IPFS network. However, one downside of IPFS is the lack of monetary incentives, and as a result, it lacks guarantees to store content for long periods. Filecoin solves that problem by embedding monetary incentives based on proof-of-storage on IPFS.
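The difference between adding and publishing can be illustrated with a toy in-memory store. This is only a sketch under simplifying assumptions: the `ToyStore` class and its method names are hypothetical, and a plain SHA-256 hex digest stands in for a real IPFS content identifier, which uses a different encoding.

```python
import hashlib


class ToyStore:
    """Toy model of content-addressed storage with mutable name pointers."""

    def __init__(self) -> None:
        self._blocks = {}  # content address -> bytes (immutable entries)
        self._names = {}   # published name -> latest content address

    def add(self, data: bytes) -> str:
        """Store data under an address derived from the content itself."""
        address = hashlib.sha256(data).hexdigest()
        self._blocks[address] = data
        return address

    def publish(self, name: str, address: str) -> None:
        """Point a fixed name at a (possibly new) content address."""
        if address not in self._blocks:
            raise KeyError("unknown content address")
        self._names[name] = address

    def resolve(self, name: str) -> bytes:
        """Return the latest content published under the name."""
        return self._blocks[self._names[name]]


store = ToyStore()
v1 = store.add(b"report v1")
v2 = store.add(b"report v2")   # changed content yields a different address
store.publish("my-report", v1)
store.publish("my-report", v2)  # same name, now pointing at the new version
```

Because the address is a function of the bytes, updating an asset necessarily changes its address, while the published name stays fixed and follows the latest version.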

Uploading an asset to the network may optionally be accompanied by encrypting the asset data beforehand and injecting it with an identifier that can be used to trace back the content to its uploader or supposed owner without relying on blockchain to reveal the identity. As a result, regardless of the storage or blockchain networks, or if an asset gets minted, there will always be a claimed identity associated with uploaded data, which remains unverified if the minting is not carried out [54].

The content structure of the IPNS storage is not restricted to a fixed format, except for the verify.txt file, which must be at the root. Figure 7 captures an example of the IPNS content structure, and Figure 8 showcases an instance of an ERC721-compliant JSON metadata file. For any of the content assets published on IPNS, the user will get an IPNS path such as IPNS/conference/meeting.mp3.json and a content identifier (CID) to retrieve the asset from the

as an end-user or an oracle node. End-users must also verify

which is assumed to be valid, and proceeds to mint the content asset as an original NFT.

The producer and the content owner are two separate entities in certain use cases. Instead of requiring the producer to pay the minting fees, our approach lets the producer only upload the encrypted assets, as depicted in Figure 4.

The uploaded assets are sent directly to the owner's address and reencrypted so that only the owner can access them.

reencryption and content verification fees to ContentSC, which sends them to OracleSC.

When an end-user only wants to access the material of encrypted content NFTs, our approach offers the means to request licensing for using the content. The owner shares the data with the consumer in this use case, similar to the purchasing scenario. However, since the NFT ownership is not transferred to the consumer, we propose a mechanism to allow access to the data in a time-bound manner. The time-bound restrictions are enforced by a data viewer DApp running in a TEE. Figure 6 presents the interactions for this use case.

1a-3a) As in the purchasing use case, the consumer queries the distributed search engine and requests licensing for it. The owner approves and proceeds with the reencryption process.

We implement the necessary smart contract functions that the various types of entities can call to evaluate our proposed approach. This section explains how we chose to build a prototype of the system and put it into use.

For the Ethereum-based smart contract development, we used the Solidity language to write the contract logic.

cryptor's reputation threshold, a reward amount, a completion state, and a chosen reencryptor.

A primary step for END_USER registration is to specify their IPNS. First, the user calls algorithm 1, which updates the IPNS verification attributes, resets the verification state, holds the verification fees, and initiates a verification task with a time limit of 1 hour. During this period, IPNS verifiers submit their responses to algorithm 6, which updates the attributes and locks the verifier from any further participation. Once the task timeout expires, the verifiers can call algorithm 7 to unlock their accounts, update their reputations, and receive their rewards. A timeout oracle then calls the ReturnVerify function to submit the new verification state to ContentSC, which triggers the callback function SetIPNSState.
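The life cycle of such a verification task can be sketched off-chain as follows. The class, its method names, and the tie-breaking rule (a tie counts as not verified) are assumptions made for illustration; only the one-hour limit, the one-response-per-verifier locking, and the majority-based outcome come from the description.

```python
class VerificationTask:
    """Toy model of a time-limited verification task with one vote per verifier."""

    TIME_LIMIT = 3600  # seconds; the one-hour limit stated in the text

    def __init__(self, started_at: int) -> None:
        self.started_at = started_at
        self.responses = {}  # verifier id -> Boolean validation response

    def _is_open(self, now: int) -> bool:
        return now < self.started_at + self.TIME_LIMIT

    def respond(self, verifier: str, valid: bool, now: int) -> None:
        """Record a response; a verifier is locked after its first response."""
        if not self._is_open(now):
            raise ValueError("task has timed out")
        if verifier in self.responses:
            raise ValueError("verifier already responded")
        self.responses[verifier] = valid

    def outcome(self, now: int) -> bool:
        """After the timeout, report whether a strict majority said 'valid'."""
        if self._is_open(now):
            raise ValueError("task is still open")
        yes_votes = sum(1 for valid in self.responses.values() if valid)
        return yes_votes * 2 > len(self.responses)


task = VerificationTask(started_at=0)
task.respond("verifier-1", True, now=10)
task.respond("verifier-2", True, now=20)
task.respond("verifier-3", False, now=30)
result = task.outcome(now=3600)
```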

Uploading a new content asset and minting it begins with the user calling algorithm 2, which updates the content attributes and calls StartVerify to initiate a new content verification task with a time limit of 1 hour. Content verifiers can call algorithm 6 during the hour, which updates the verification attributes and locks the oracles. After the timeout, the verifiers and the timeout oracle call algorithm 7 and ReturnVerify, respectively. Upon receiving the verification, ContentSC calls MintAsset to mint the content.

that the lowest correct score is higher than the highest incorrect score, and the lowest score is equal to or greater than zero. The score is then converted to a change in reputation by mapping it to the reputation range, which in our implementation is [0, 2^16). The Solidity language does not offer native support for floating-point numbers. Therefore, our code uses the larger range to accommodate the reputations. The oracle's reputation and participation count are updated based on a simple dynamic averaging formula, and the reward is calculated based on the ratio of the oracle's score to the summation of correct scores.

For delegated uploading and minting, the owner needs to approve the delegate by calling AddDelegate, which gives privileges for another user to reencrypt content from their keys to the owner's keys. Then, the producer and the owner call algorithm 2 and PayDelegate, respectively, which automatically invokes the StartReencrypt function in OracleSC. At this point, the smart contract waits for a reencryptor oracle with a satisfactory reputation score to handle the processing. If one is available, it calls algorithm 8 and, upon completing the reencryption process, calls algorithm 9 to unlock its account and be assigned an initial score based on its response time. Once the reencrypted content is uploaded and verified by the content verifiers, or once the owner confirms the delivery of the content asset by calling algorithm 3, the smart contract transfers the reward to the reencryptor, updates their reputation, and calls back MintAsset to mint the asset for the owner.

Purchasing and licensing content takes a similar route to delegated upload. The interested user makes a purchase or a licensing request by calling RequestSharing, which locks the entity from making additional requests and locks the content from taking other purchases. If the owner approves, they call AcceptSharing. After the user pays for the content with PayForSharing, StartReencrypt is triggered, making OracleSC wait for a satisfactory reencryptor to complete the task by calling algorithm 8 and algorithm 9. Once the delivery to the new owner or the consumer is confirmed by algorithm 4, the sharing process is considered complete. If the sharing is a purchase, then the buyer can update the metadata, which calls StartVerify and ends with triggering algorithm 5 if the content is valid. If the sharing is a time-bound license, the timeout oracle notifies the smart contract when the license expires, which forces the TEE-based data viewer DApp to drop the asset data.
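The ordering of calls in the sharing flow can be summarized as a small state machine. The state names and the `ConfirmDelivery` label (standing in for algorithm 4) are hypothetical; the other call names are taken from the description, and the sketch omits payments, locking, and the oracle steps.

```python
from enum import Enum, auto


class SharingState(Enum):
    IDLE = auto()
    REQUESTED = auto()
    ACCEPTED = auto()
    REENCRYPTING = auto()
    COMPLETE = auto()


# Allowed transitions: (current state, function called) -> next state.
TRANSITIONS = {
    (SharingState.IDLE, "RequestSharing"): SharingState.REQUESTED,
    (SharingState.REQUESTED, "AcceptSharing"): SharingState.ACCEPTED,
    (SharingState.ACCEPTED, "PayForSharing"): SharingState.REENCRYPTING,
    # "ConfirmDelivery" is a placeholder name for algorithm 4.
    (SharingState.REENCRYPTING, "ConfirmDelivery"): SharingState.COMPLETE,
}


def step(state: SharingState, call: str) -> SharingState:
    """Advance the sharing request, rejecting out-of-order calls."""
    try:
        return TRANSITIONS[(state, call)]
    except KeyError:
        raise ValueError(f"{call} is not allowed in state {state.name}")


state = SharingState.IDLE
for call in ("RequestSharing", "AcceptSharing", "PayForSharing", "ConfirmDelivery"):
    state = step(state, call)
```

Encoding the flow as a transition table makes the invariant explicit: payment cannot precede the owner's approval, and delivery cannot be confirmed before reencryption starts.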

We used the Truffle suite as a development framework and deployment environment. Our code was written using Solidity language version 0.8.0 and compiled using the Solc compiler of the same version, with optimization settings enabled at 200 runs. Truffle also includes Ganache, a local Ethereum testing network (testnet), used to deploy the compiled contracts. Lastly, Truffle tests, which were built on the Mocha JavaScript test framework, were used to run asynchronous and automated assertion tests. These tests could

    Calculate maxScore as $s_{\max} = \max(s^{+}, s^{-})$;
    if response does not match majority then
        Normalize score as $\hat{s} = s + s_{\max}$;
    end
    Calculate $\Delta r \leftarrow \hat{s}\, r_{\max} / (2 \max(s^{+}, s^{-}))$;
    Update reputation as $r \leftarrow (r p + \Delta r) / (p + 1)$;
    Update participations as $p \leftarrow p + 1$;
    if response matches majority then
        Calculate reward as $a \leftarrow$

tests use 17 accounts as detailed in Table 3.

on roles, so they can't be used by accounts that aren't supposed to.

2) Minting: Users intending to add content must have a valid IPNS address. Therefore, the owner and producer call the SetIPNS function and provide the IPNS addresses 0x123 and 0x234, along with 0.15 Ether and 0.05 Ether as rewards for oracles to verify their addresses for them. The verifiers call RespondVerify within an hour and set the inputs to the verification request identifier of type uint256 (256-bit unsigned integer) and validation response of type Boolean. After an hour elapses, the oracles call FinalizeVerify and specify the verification request identifier to receive their rewards. The first timeout oracle then calls ReturnVerify with the request identifier as an input to update the states of the IPNS addresses. Our test varies the responses of the verifier oracles to confirm the robustness of the approach. Verifiers and timeout oracles that responded correctly gained reputation and received rewards, while those that responded incorrectly lost reputation and did not recover the transaction fees they paid.

The owner proceeds with adding a content asset by calling AddContentFor and providing the inputs starting with their own Ethereum address as beneficiary,

valid, resulting in transferring the NFT from the original owner to the buyer.

5) License: Similar to the purchase test case, the third user, as a consumer, calls RequestSharing with input 2 and payment 0.7 Ether to request a license of the second content asset from its owner. The owner accepts the request with the AcceptSharing function given input 2, which emits the event depicted in Figure 11. Finally, the consumer calls PayForSharing and provides input 2 for the asset identifier and a payment of 0.2 Ether for the reencryptors.

The testing of our implementation extends to measuring the average cost of calling the various functions in the smart contracts. The cost analysis is based on the inputs and monetary rewards provided in the assertion testing. Table 4 shows the cost of deploying and calling functions of the OracleSC. The deployment of the contract is the most expensive operation, compared to the registration functions. The costs of the remaining oracle functions should indicate the corresponding

of interactions scales linearly with the growth of users, assets, and oracles.

The main factors that affect the throughput and latency of the implemented system are the deployment network and the number of transactions per task. At the time of writing, the mainnet Ethereum and Polygon networks set a block size limit of 30 million gas, whereas their block time durations are 13 seconds and 2 seconds, respectively. Therefore, assuming the network is not congested and the entities pay fair fees to the miners, a transaction's latency averages half the block time duration. As for the number of transactions (txn) per task, they vary depending on the action and the desired number of verifiers.

type. This is done in addition to further requirements to ensure unintended entities do not modify ledger information. Moreover, unlike traditional NFT solutions that place blind trust in the user's claims of ownership, our approach requires all users who wish to mint NFTs to verify their IPNS address and verify that they (or their delegates) uploaded the content onto the IPFS network. These requirements minimize attacks of stealing content and claiming ownership over it. Finally, it is worth mentioning that although the uploader of the content can be traced back to their EA, the substance is not verified and accepted as claimed by the user. This is where the manual rating reputation system comes into play, since upon a successful content purchase or licensing, the receiver can submit a rating for the original owner and the NFTs.

Our security evaluation extends to quantitatively investigating and assessing the vulnerabilities hidden in the implementation. Specifically, we utilize the Slither smart contract analysis framework [56] to audit the developed Solidity smart contracts: ContentSC and OracleSC. Our test uses Slither version 0.8.2 and does not exclude any vulnerability detector from running. The tool reported 152 vulnerabilities, 136 of which are related to formatting and optimization improvements. Among the remaining 16 vulnerabilities, we manually investigated the ones marked as 'High' and 'Medium' impact and found them to be false positives. However, 10 'Low' impact vulnerabilities remain unaddressed, which can be resolved in security patches to the smart contracts.

Our approach is intended to be generalizable so that it can be adopted in various systems. A wide range of industries where encrypted data ownership is essential can adopt the proposed approach. For example, in the medical field, the hospital is considered the producer of the data and can provide the encrypted electronic health records of the patient (the owner), as described in the delegated upload and minting process. Additionally, the patient may allow other parties, such as a medical research institution, to have license-based access to the encrypted data without sharing the encryption key with the institution and with the guarantee that access never exceeds the time-bound rules set for the shared content. A different scenario can be embodied in the digital art medium, where artists can mint their work in the form of encrypted content (image, video, or 3D model) and let other collectors propose a price. What sets our approach apart from existing solutions is the automation of the data transfer process while maintaining complete confidentiality when the data is in transit.

The modules we developed for our proposed architecture could be used in other systems as well, such as: 1) Hybrid cryptosystem reencryption and a fully decentralized reputation system. 2) Mechanisms to upload encrypted content once and share it multiple times. 3) Tracing content-addressable assets back to their original uploader even on networks such as IPFS. Our

We compare our solution with the existing solutions. We choose cutting-edge works that propose and implement decentralized solutions for the comparison. Table 7 shows how our proposed solution compares against the referenced literature. The comparison covers the three main components:

that can self-assess the performance of oracles and update their reputations with mechanisms to prevent skewed results because of a low number of participants. The core functions of the decentralized system were developed using the Solidity programming language and deployed on an Ethereum-based testnet for extensive testing that concluded in a fully functional system with efficient gas costs and a lack of common vulnerabilities. Lastly, we discussed how safe our system is against attacks and how well our approach generalizes to other areas, such as medicine and digital art.