Long-term secure distributed storage using quantum key distribution network with third-party verification

The quantum key distribution network with Vernam's One Time Pad encryption and secret sharing are powerful security tools to realize an information theoretically secure distributed storage system. In our previous work, a single-password-authenticated secret sharing scheme based on the QKD network and Shamir's secret sharing was experimentally demonstrated; it confirmed ITS data transmission, storage, authentication, and integrity. To achieve data integrity, an ITS message authentication code tag is employed and a data owner of the secret sharing performs both the MAC tag generation and verification. However, for a scenario in which the data owner and end users are different entities, the above approach may not work since the data owner can cheat the end users. In this paper, we resolve this problem by proposing an ITS integrity protection scheme employing a third-party verification with time-stamp.


I. INTRODUCTION
Long-term protection of integrity, authenticity, and confidentiality is required for critical information assets, for example, medical information such as genomic data and classified national information. Information leakage may cause criminal and/or civil penalties to data owners and system providers [9,10]. In addition, these information assets should remain available in the face of disasters and various technical faults, in view of reasonable "Business Continuity Plans (BCPs)". Distributed data backup in distant places is one of the effective solutions to the BCP issue. To combine this approach with long-term security, transmission and storage of critical data must be strictly protected even against future technologies including quantum computers, which are becoming realistic given recent development efforts around the world [11,12]. In this regard, a distributed data storage system with information theoretical security, which consists of quantum key distribution (QKD) [1,2] links and a secret sharing protocol [4], is a very promising solution. A QKD link enables two remote users to share an Information Theoretically Secure (ITS) key (random number). Vernam's One Time Pad (OTP) [3] with such keys provides ITS data transmission.
Shamir's secret sharing scheme [4] can realize an ITS storage system, if data transmission and authentication are carried out in an ITS way. These two schemes can provide ITS confidentiality of transmission and storage of data. Besides this, integrity protection of data is also an important concern in practice.
Here, integrity means that illegitimate or accidental changes of data can be discovered (see, e.g., Ref. [13]). Moreover, the correct data must be shared between the data owner and the ultimate user of the data (whom we call an end user). This is the main topic of this paper. In our previous work [5], we proposed an ITS distributed storage system with ITS authentication based on a user-friendly single-password-authenticated secret sharing (SPSS) scheme and secure transmission using QKD, and demonstrated it in the Tokyo QKD Network [8]. In this scheme, the ITS message authentication code (MAC) tag, generated by the universal2 hash function [6,14,15] with a password used as the key, is added to the stored data. Thereby, the data owner alone can confirm the ITS integrity of the data when reconstructing it. We note that several methods have been reported that also protect data integrity against share holders' cheating by using hash functions [16,17], while our method [5] enables the data owner to detect falsification of the data simultaneously with the single-password authentication. (An outline of this scheme is given in section 3.)

In some cases, it is valuable that the integrity protection function be provided by a third party. If the data owner and the end user are different, for example when saving a testament, the integrity check of the testament data by the third party is extremely important since either the data owner or the end user may tamper with the data. To provide the third-party integrity check, in our subsequent work, we proposed the Long-term INtegrity and COnfidentiality protection System (LINCOS) under some security assumptions [13]. In this system, the secret data is reconstructed at regular intervals, and integrity of the data is protected by commitment renewal guaranteed by the evidence service and time-stamp service (i.e., the third party) in the authenticated network. Such a verification system with a third party is widely used in time-stamp services [18]. It is known that commitment schemes used for data integrity cannot achieve ITS binding and ITS hiding simultaneously [19]. In [13], therefore, only ITS hiding is employed.

In this paper, we propose another third-party verification scheme with MAC, which realizes both ITS binding and ITS hiding simultaneously. The proposed system is based on the distributed storage system in [5] but additionally introduces the third-party verifier. We use the universal2 hash function [6,14,15] to calculate MAC tags, which preserves the ITS confidentiality. Unlike in the case of [5], where the MAC tag generation and verification are both performed by the data owner alone, in the present scheme the MAC tag generator and the MAC verifier are different persons. A problem with this setting is that the MAC generator can easily falsify the MAC tags, owing to the properties of the universal2 hash function. Therefore, in the proposed scheme, we introduce "a trusted calculator" [20] as the MAC tag generator. Such a concept of secure computation using trusted hardware, which is trusted but has small long-term memory capacity, is used in practice (an example of such a device is in [21,22]). In our system, the trusted calculator computes shares and MAC tags. In the calculation of MAC tags, the calculator uses a random number provided from a QKD network and memorizes this random number until verification by the end user.

The proposed scheme is experimentally demonstrated in a QKD network testbed, referred to as the Tokyo QKD Network [8]. A distributed storage system with four share holders is implemented, supported by the five-node QKD network. In the same system, we also implement a simple share renewal function based on Pedersen's verifiable secret sharing scheme, which may be of independent interest for the practical implementation of share renewal.

The paper is organized as follows. In section 2, we describe our verification scheme with the trusted calculator, which calculates shares and MAC tags and memorizes the random numbers used in the tag generation of the MAC. A summary of the SPSS scheme is given in section 3. The experimental results of the verification scheme with the simple share renewal function are given in section 4. The conclusion is given in section 5.

Motivation and basic ideas
Conventional secret sharing schemes usually consist only of a data owner and share holders. We consider here a new setting where there is an additional player, called an end user, who receives the data from the data owner and casts doubt on its integrity. The goal of our third-party verification function is to resolve a dispute about data integrity, even if a malicious end user raises false doubts about the secret data disclosed by the data owner. Figure 1 shows a basic configuration of a secret sharing scheme with data integrity. Note here that a third party, the verifier, is newly introduced. This is necessary in order to ensure data integrity to an end user [13]. As mentioned in the introduction, an efficient way to achieve data integrity is to use a commitment scheme [13]. However, in commitment schemes, ITS binding and ITS hiding have not been achieved simultaneously [19]. To break through this limitation, we adopt a trusted server with small long-term memory as a key player to realize ITS binding and ITS hiding in a distributed data backup system on the QKD network. The overview of our scheme is described below.

Setting and the goals
Given an $N_{\mathrm{SH}}$-out-of-$N_{\mathrm{SH}}$ secret sharing (SS) scheme SS, we add to it a third-party verification function as follows. The underlying SS scheme SS can be arbitrary; hence we do not specify its details here.

Players and their roles:
We consider the situation where there are the following players with the following roles (Fig. 2).
• Data owner is the original owner of data $D \in \mathcal{D}$ ($\mathcal{D}$ is the set of all possible data). He asks the share calculator to register and store $D$. Whenever requested by an end user, he must retrieve and release $D$.
• Share calculator calculates shares $S = (s_1, \ldots, s_{N_{\mathrm{SH}}})$ of $D$ using the SS scheme SS, and sends $s_h$ to share holder $h \in \{1, \ldots, N_{\mathrm{SH}}\}$. He also generates the MAC tag $\sigma$ using a random number $r_{\mathrm{MAC}}$ provided from the QKD network. The MAC tag $\sigma$ serves as the evidence that $D$ was received at time $t_1$. He then asks the verifier to store $\sigma$. He memorizes $r_{\mathrm{MAC}}$.
• Share holders. There are $N_{\mathrm{SH}}$ share holders. Each of them, indexed by $h \in \{1, \ldots, N_{\mathrm{SH}}\}$, stores a share $s_h$ of $D$ calculated by the share calculator.
• Verifier receives and stores the tag $\sigma$ delivered from the share calculator, along with the time $t_2$ of its receipt.
• End user: An end user requests the data owner to send $D$ and $t_1$. He can detect a possibility of receiving false data with the help of the share calculator and the verifier. We stress that the end user may not be specified until the data reconstruction phase.
As mentioned above, we assume that the share calculator is trusted but has small long-term memory capacity. That is, the share calculator can store small data (the random numbers) for a long time but can store large data (the original data and its shares) only for a short time [21,22]. In other words, the share calculator is fully trusted but kept minimal for practical purposes. One important role of the share calculator is to store the random numbers used in the calculation of MAC tags. This makes it impossible for the data owner to guess the MAC tag of the data.

Goals (Security Criteria)
We let $k \in \mathbb{N}$ be the security parameter. Typically, $k$ is 256. Our goal in the scheme defined below is to fulfill the following security criteria. Below, $|\mathcal{D}|$ denotes the cardinality of the set $\mathcal{D}$ of all possible data $D$.

• SC1 (Integrity from the viewpoint of the end user): Except with a probability $\le 2^{-k} \log_2 |\mathcal{D}|$, an honest end user can detect when the data owner reveals data $D''$ that differs from the $D$ that was registered.
• SC2 (Integrity from the viewpoint of the data owner): Except with a probability $\le 2^{-k} \log_2 |\mathcal{D}|$, the data owner, if honest, can refute a false claim made by an end user that he received data $D''$ that differs from $D$.
• SC3 (Secrecy): The amount of information that the end user or the verifier obtains concerning $D$, prior to the reveal phase, is less than $k$ bits.

Description of our scheme
In this subsection we define our scheme.

Assumptions
We begin by listing the underlying assumptions.
• A1 (Share calculator): The share calculator is trusted, meaning that he follows the procedure specified in the next subsection correctly and leaks no information.
• A2 (Data reconstruction phase): At the data reconstruction phase of SS, the share calculator can verify that he indeed recovered the correct data $D$.
• A3 (Verifier): The verifier is honest but curious, meaning he follows the procedure specified in the next subsection correctly but may leak information.
• A4 (Channels with perfect security in the ITS sense): Each pair of players (all players listed in Sec. 2.1.1) is connected by a channel with perfect secrecy and perfect authenticity in the ITS sense. That is, every player pair can use a channel where no eavesdropping or modification is possible, even by an adversary equipped with unlimited computational power.
Several remarks are in order concerning these assumptions. First, none of the assumptions above restricts the behavior of the data owner and the end user; hence these two players can always deviate from our scheme in any way. Second, item A2 can be guaranteed, e.g., by assuming (i) that share calculators always submit correct shares, or (ii) that the underlying SS scheme SS is equipped with certain cheater detection mechanisms (see, e.g., Refs. [5,16,17]). Third, item A3 entails that the verifier, when asked, answers the correct values of $\sigma$ and $t_2$, but does not necessarily keep them secret. Finally, we stress that item A4 (the perfectly secure channels) can easily be realized in QKD networks, where every player pair p, q has access to an arbitrarily long shared secret key with perfect security. The secrecy of the channel can be guaranteed by OTP using this key. The authenticity can be guaranteed, e.g., by ITS message authentication codes, which use each part of the key only once (see, e.g., Ref. [15]).
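The secrecy half of assumption A4 can be illustrated with a minimal sketch of Vernam's OTP. The function names are hypothetical, and `secrets.token_bytes` is only a stand-in for a key delivered by a QKD link; it is not a QKD key source.

```python
import secrets

def otp_encrypt(message: bytes, key: bytes) -> bytes:
    """XOR each message byte with one key byte (Vernam's One Time Pad)."""
    if len(key) < len(message):
        raise ValueError("OTP key must be at least as long as the message")
    return bytes(m ^ k for m, k in zip(message, key))

# XOR is its own inverse, so decryption is the same operation.
otp_decrypt = otp_encrypt

message = b"share for holder 1"
key = secrets.token_bytes(len(message))  # assumption: stand-in for a QKD-supplied key
ciphertext = otp_encrypt(message, key)
recovered = otp_decrypt(ciphertext, key)
```

The key is consumed once and must never be reused for another message; reuse would void the information theoretical guarantee.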

Specification of the MAC tag σ
In generating the MAC tag $\sigma$, the share calculator uses an almost universal2 hash function $u$, which has the following property.

Lemma 1 (Existence of an almost universal2 function):
There exists a function : ℛ MAC ×  → {0,1}  for which Here || denotes the cardinality of set , and 1[] is the function that equals 1 if proposition  holds, and 0 otherwise.Proof: Let ℛ MAC be a finite field   with the size  satisfying 2  ≥  ≥  −1 2  .Then let  be the hash function family given in Theorem 3.5 of Ref. [23].
In order to guarantee ITS, we require that the variable $r_{\mathrm{MAC}}$ be a true random number [15], e.g., a quantum random number provided from a QKD system. We also require that $r_{\mathrm{MAC}}$ be generated anew every time the scheme is started [15].
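As an illustration of a keyed hash of this kind, the following is a minimal sketch of Toeplitz matrix multiplication over GF(2), the family the experiment in section 3 reports using. The function name, the byte encoding of the time and data, and the use of `secrets.randbits` in place of QKD-supplied random numbers are all assumptions made for this example.

```python
import secrets

def toeplitz_tag(r_bits: int, msg: bytes, k: int) -> int:
    """Return the k-bit tag T*m over GF(2).

    T is the k x n Toeplitz matrix defined by the n + k - 1 seed bits in
    r_bits (n = message length in bits, big-endian bit order): row i is the
    n-bit window of the seed starting at bit i.
    """
    n = 8 * len(msg)
    m = int.from_bytes(msg, "big")
    mask = (1 << n) - 1
    tag = 0
    for i in range(k):
        row = (r_bits >> i) & mask
        parity = bin(row & m).count("1") & 1  # inner product of row and message mod 2
        tag |= parity << i
    return tag

k = 256                                   # tag length, matching the security parameter
t1 = (1000).to_bytes(8, "big")            # hypothetical encoding of the receipt time
data = b"example secret data"
msg = t1 + data                           # concatenation of time and data
r_mac = secrets.randbits(8 * len(msg) + k - 1)  # fresh seed, used for this tag only
sigma = toeplitz_tag(r_mac, msg, k)
```

For two distinct messages of equal length, a uniformly random seed makes the tags collide with probability 2^-k, which is why the seed must be fresh and kept secret from the party whose data is being tagged.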

Procedures
The procedure of our scheme consists of two phases, the data registration phase and the data reconstruction phase. The latter includes the integrity check of the reconstructed data $D'$ corresponding to $D$.
(1) Data registration phase: 1.1 Initiation by the data owner: The data owner sends data $D \in \mathcal{D}$ to the share calculator.

Security of our scheme
We state and then prove the security of our scheme.

Optional: Simplified scheme achieving a weaker security (computational security)
So far, we have restricted ourselves to a scheme achieving information theoretical security. In this subsection, we consider the case of computational security, a weaker notion of security, and show that it admits a simplification of our scheme. The basic idea is simply to replace the almost universal2 hash function $u$, used in the above scheme, with a computationally collision resistant hash function [14], instantiated, e.g., by SHA-512. More precisely, we prepare an optional mode in which the data owner and the end user themselves calculate hash values $h(t_1|D)$ using the SHA-512 function $h$. In this option, they need not use the share calculator. The data owner calculates $h(t_1|D)$ and sends it to the verifier. The end user, after receiving $D''$, calculates $h(t_1|D'')$ and sends it to the verifier. The verifier checks whether $h(t_1|D'') = h(t_1|D)$ and $t_1 \le t_2$ hold and informs the end user of the result.
While this option can omit the share calculator, the data integrity guarantee (SC1 and SC2) is weakened; it can no longer be guaranteed in the ITS sense, but only in the sense of computational security. That is, the success probability of an attack in SC1 or SC2 can no longer be bounded by $2^{-k} \log_2 |\mathcal{D}|$, but can only be shown to be negligible (with respect to the security parameter $k$) against a probabilistic polynomial-time (PPT) malicious user [6]. On the other hand, the secrecy (SC3) still holds, as long as we set the MAC tag length to be smaller than $k$ (see the proof of Theorem 1).
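The simplified mode can be sketched as follows. This is only an illustration: the integer time stamps, the fixed 8-byte encoding of the registration time, and the function names are assumptions, and the authenticated channels to the verifier are not modeled.

```python
import hashlib

def h(t1: int, data: bytes) -> bytes:
    """SHA-512 of the registration time concatenated with the data."""
    return hashlib.sha512(t1.to_bytes(8, "big") + data).digest()

def verifier_check(registered: bytes, claimed: bytes, t1: int, t2: int) -> bool:
    """Verifier's test: hash values agree and the time stamps are ordered."""
    return claimed == registered and t1 <= t2

t1, t2 = 1000, 1001                      # hypothetical registration and receipt times
D = b"last will and testament"
registered = h(t1, D)                    # sent to the verifier by the data owner

honest = verifier_check(registered, h(t1, D), t1, t2)
forged = verifier_check(registered, h(t1, b"altered testament"), t1, t2)
```

Here `honest` evaluates to `True` and `forged` to `False`; a forgery would have to find a SHA-512 collision, which is a computational rather than an information theoretical barrier.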

Advantage of our schemes (for both information theoretical security and computational security)
If there is a conflict during the authentication process, the verifier makes it possible to judge whether the data owner or the end user is correct. As mentioned above, no commitment scheme for data integrity applications can achieve ITS binding and ITS hiding simultaneously [19]. Our third-party verification scheme is based on the use of a trusted third party [24]. The novelty of our scheme is the introduction of the share calculator to guarantee the integrity of the data. Trust assumptions on processing hardware are practical and have often been introduced in secure multiparty computation studies [20,25]. In our case, we require that share calculations be performed secretly and that the memory used to calculate MAC tags be long-term secure but small. We think these assumptions are acceptable for practical use.

III. SPSS scheme and share renewal process
In the previous section, we introduced the third-party verification scheme, which can be combined with an arbitrary secret sharing (SS) scheme, SS. In this section, we choose the underlying SS scheme, SS, to be the single-password-authenticated secret sharing (SPSS) given in our previous paper [5], and discuss details of the combined scheme. Then we present its demonstration on an actual QKD network testbed, called the Tokyo QKD Network [8], in section 4. Our SPSS scheme [5] achieves ITS data transmission, storage, authentication, and integrity. In particular, the data owner can verify the integrity of the data when he reconstructs the data by himself. By combining this scheme with the third-party verification scheme of the previous section, we can enhance the data integrity guarantee function. In this section, we outline the procedure of the combined scheme. Further details and the security proof of the SPSS scheme are described in the supplemental information in Ref. [5]. We introduce the (3,4)-threshold scheme below.
(1) Registration phase  The data owner sends data D and password P to the share calculator.The share calculator informs received time  1 to the data owner.For the efficiency of calculation, Mersenne primes should be used in the following calculation.Of course, other primes can be applied.To better understanding, we show an example of calculation with Mersenne prime.Since each calculation in the finite field with prime order  = 2  − 1 can deal with only blocks of length at most  − 1 bits, secret data , which has generally a much longer length, needs to be divided into pieces of ( − 1) -bit block, e.g. pieces;  =   | −1 | ⋯ | 1 .The data owner sets a ( − 1) -bit password  , which should have sufficient entropy against the on-line dictionary attack, then computes a message authentication code, MAC tag=     +  −1  −1 + ⋯ +  1 , which is denoted as  +1 , and finally adds it to the data for later purpose of message authentication by the data owner.(1-3) They are then sent to the corresponding share holders from the share calculator.
(1-4) Each share holder stores the set of shares.
(1-5) Simultaneously, the share calculator computes the other MAC tag of $t_1|D$ by using the random number $r_{\mathrm{MAC}}$ for verification, and sends $t_1$ and the MAC tag to the verifier as mentioned in section 2. $r_{\mathrm{MAC}}$ and $t_1$ are stored in the share calculator. Here, we used Toeplitz matrix multiplication [26] as the almost universal2 hash function $u$ (satisfying the property of Lemma 1) for calculating the MAC tag. The overall process of (1-1) to (1-5) is shown in Fig. 3 (1).
(2) Pre-computation and communication phase (2-1) Each share holder generates a random number, denoted as   for the  -th storage server, and makes its shares   (1),   (2),   (3),   (4) by using polynomial   of degree at most 1.Furthermore each server generates shares of the "0"  0 (1),  0 (2),  0 (3),  0 (4) by using polynomial  0 of degree at most 2, such that  0 (0) = 0 should hold so as to keep confidentiality of the share in data reconstruction phase without changing the value of the data share.
(2-2) The share holders send these shares to each other.
(3-3) The share calculator computes the MAC tag from  1 (0), … ,   (0) by using the password.If  +1 (0) is equal to calculated MAC tag, the share calculator determines that the stored data has been successfully reconstructed and sends the data to the owner.If necessary, the data owner or the end user can check the data integrity as mentioned in section 2. These processes are shown in Fig. 3
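The underlying (3,4)-threshold sharing of one data block can be sketched as below. This is a simplified illustration of Shamir's scheme only: the Mersenne prime 2^61 - 1, the function names, and the use of `secrets` in place of QKD-supplied random numbers are assumptions, and the password-based MAC block and the masking with shares of "0" are omitted.

```python
import secrets

PRIME = 2**61 - 1  # a Mersenne prime; illustrative field size, not the one used in the experiment

def make_shares(secret: int, threshold: int = 3, n: int = 4) -> dict:
    """Evaluate a random polynomial of degree threshold-1 with f(0) = secret at x = 1..n."""
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def f(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % PRIME
        return acc

    return {x: f(x) for x in range(1, n + 1)}

def reconstruct(shares: dict) -> int:
    """Lagrange interpolation at x = 0 from a dict {x: f(x)}."""
    secret = 0
    for xi, yi in shares.items():
        num, den = 1, 1
        for xj in shares:
            if xj != xi:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

block = int.from_bytes(b"secret!", "big")   # one data block shorter than 61 bits
shares = make_shares(block)
subset = {x: shares[x] for x in (1, 2, 4)}  # any three of the four shares suffice
recovered = reconstruct(subset)
```

Longer data would be split into blocks and each block shared in the same way, as the registration phase describes.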

IV. Experimental setup and demonstration on the QKD Network
The third-party verification scheme described in sections 2 and 3 is experimentally demonstrated in a QKD network testbed. As mentioned above, the QKD network can provide not only secure communication lines but also random numbers generated by physical random number generators, because these devices are built into QKD systems. Random numbers generated by intrinsically non-deterministic physical processes are useful for various cryptographic applications. The verifiable share renewal function is useful for realizing long-term security. Several verification methods have been reported [27][28][29]. We use Pedersen's protocol [28] to verify the share renewal among the share holders. In our scheme, the share holders renew shares by adding shares of "0" while verifying whether their partial information is correct. The detailed process is described in the Appendix. This scheme enables verification of the share renewal with ITS hiding but with only computationally secure binding, since it relies on the hardness of the discrete logarithm problem. Therefore, if an eavesdropper and/or an adversary has a quantum computer, the data integrity in this scheme (binding, often called correctness of the data) can be compromised. However, this scheme would still be useful for protecting against malicious insiders who do not have quantum computers. In fact, the share renewal can be carried out before the number of compromised share holders exceeds the threshold. Furthermore, outsiders cannot get information about the share renewal, because transmission lines are encrypted by OTP. Note that even if this process is eavesdropped, no information about the secret data is leaked. This share renewal function is an optional countermeasure against malicious classical cyber attacks.
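The effect of adding shares of "0" can be sketched as follows. This is a minimal illustration without the Pedersen verification step; the field size, the helper names, and the choice that every one of the four holders deals one zero-sharing are assumptions made for the example.

```python
import secrets

PRIME = 2**61 - 1  # illustrative Mersenne-prime field

def eval_poly(coeffs, x):
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % PRIME
    return acc

def share(secret, degree=2, holders=(1, 2, 3, 4)):
    """Shares of `secret` from a random polynomial of the given degree."""
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(degree)]
    return {x: eval_poly(coeffs, x) for x in holders}

def reconstruct(points):
    """Lagrange interpolation at x = 0."""
    total = 0
    for xi, yi in points.items():
        num, den = 1, 1
        for xj in points:
            if xj != xi:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        total = (total + yi * num * pow(den, -1, PRIME)) % PRIME
    return total

secret = 123456789
old = share(secret)
# Each holder deals a sharing of 0 (polynomial with constant term 0).
zero_deals = [share(0) for _ in old]
# Each holder adds all zero-shares addressed to it to its old share.
new = {x: (old[x] + sum(d[x] for d in zero_deals)) % PRIME for x in old}
renewed_secret = reconstruct({x: new[x] for x in (1, 3, 4)})
```

The renewed shares still reconstruct the same secret, while old and new shares no longer lie on a common polynomial, so shares leaked before a renewal cannot be usefully combined with shares leaked after it.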

QKD network structure
The structure of our system is shown in Fig. 4. The QKD network [8] works as a secure key supply infrastructure. Secret sharing or other services are installed in this QKD network. The data owner, the share calculator, share holders, the verifier, and the end user communicate through OTP-encrypted communication lines for which secure keys are provided from the QKD network. The share calculator also requests random numbers from the key supply agent (KSA) of the QKD network to calculate shares or MAC tags. Once the keys or random numbers have been supplied, the key data in the QKD network are erased and the responsibility for key management moves to the application users. Keys generated in each QKD link are pushed up to servers, called key management agents (KMAs). Each KMA is set in a physically protected place, referred to as "a trusted node". A KSA is integrated into each KMA. The KSAs supply the keys to users. A key management server (KMS) gathers link information and instructs KMAs to execute key relay according to requests from the application layer. The Tokyo QKD Network consists of five nodes (called Koganei-1-4 and Ohtemachi-1) connected by six QKD links. The QKD links consist of the QKD systems provided by NEC [30], Toshiba [31], NTT-NICT [32], Gakushuin [33], and SeQureNet [34]. Specifications of each QKD link are listed in Table 1 [5,13]. The key relays are carried out in the key management layer of the Tokyo QKD Network with OTP. Each communication line in the application layer is also encrypted by OTP and authenticated with a MAC tag based on the Wegman-Carter protocol [15] by using keys from the QKD network. Therefore, each player uses ITS communication. Moreover, high-quality physical random numbers are provided from the KSA of the QKD network to the share calculator to calculate shares and MAC tags. In the application layer, servers of the data owner, the share calculator, share holders, the verifier, and the end user are set up. In this experiment, the data owner is set in Ohtemachi-1. Share holders 1-4 are located in Koganei-1-4, respectively. The verifier and the share calculator are set in Koganei-1 and Koganei-2, respectively. The terminal of the end user is established in Koganei-3.

Experimental results
A prime number q is used in the calculation of shares over a Galois field. As a condition for carrying out the share renewal, q must be a divisor of p-1. From the viewpoint of fast computation, q should be selected from Mersenne primes. However, it is not easy to find a prime that meets these conditions. Therefore, we selected p and q from the data sets in [35]. When we carried out this protocol, the threshold was set to (3,4), as shown in Fig. 3. The experimental results are shown in Fig. 5. These results indicate practical processing times, including data transfer time. Compared with previous results, about three times the processing time is necessary. This is because we did not use a Mersenne prime in the calculation of shares. To improve throughput, we will need to find Mersenne primes that meet the conditions mentioned above.
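The role of the condition that q divides p-1 can be seen in a toy sketch of Pedersen-style share verification. The primes p = 23 and q = 11, the generators, and all function names are illustrative assumptions and far too small for real use; in particular, the discrete logarithm of h with respect to g must be unknown in practice, which is not the case here.

```python
import secrets

p, q = 23, 11                      # toy primes with q | p - 1
assert (p - 1) % q == 0
g = pow(2, (p - 1) // q, p)        # element of order q in the multiplicative group mod p
h = pow(3, (p - 1) // q, p)        # second element of order q (toy choice)

def commit(a: int, b: int) -> int:
    """Pedersen commitment g^a * h^b mod p."""
    return pow(g, a, p) * pow(h, b, p) % p

def deal(secret: int, degree: int = 2, n: int = 4):
    """Share `secret` over GF(q) and publish commitments to the coefficients."""
    f = [secret % q] + [secrets.randbelow(q) for _ in range(degree)]
    fb = [secrets.randbelow(q) for _ in range(degree + 1)]   # blinding polynomial
    commitments = [commit(a, b) for a, b in zip(f, fb)]
    ev = lambda c, x: sum(cj * pow(x, j, q) for j, cj in enumerate(c)) % q
    shares = {i: (ev(f, i), ev(fb, i)) for i in range(1, n + 1)}
    return shares, commitments

def verify(i: int, share, commitments) -> bool:
    """Holder i checks its share against the published commitments."""
    s, t = share
    rhs = 1
    for j, E in enumerate(commitments):
        rhs = rhs * pow(E, pow(i, j), p) % p
    return commit(s, t) == rhs

shares, commitments = deal(7)
all_ok = all(verify(i, sh, commitments) for i, sh in shares.items())
s1, t1 = shares[1]
tampered_ok = verify(1, ((s1 + 1) % q, t1), commitments)
```

Because g and h have order q, exponents can be reduced modulo q, which is exactly why the share arithmetic is done in GF(q) with q dividing p-1. In this sketch `all_ok` is `True` and `tampered_ok` is `False`.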

V. Conclusion
We propose and demonstrate third-party verification with information theoretical security in a distributed storage system built on the QKD network. We demonstrated, for the first time to the best of our knowledge, a distributed storage system with information theoretically secure data transmission, storage, authentication, and data integrity with third-party verification in a real metropolitan area network. By establishing the trusted share calculator and the verifier, falsification of data by the data owner or the end user becomes extremely difficult, and accidental data leakage from the verifier is also prevented with information theoretical security. We use the universal2 hash function to calculate MAC tags; therefore, the calculation load can be reduced compared with the SHA families. This suggests that our scheme may be suitable for guaranteeing the integrity of big data. Moreover, we added a share renewal function to our previous system, the "single-password-authenticated secret sharing (SPSS) system", which enables ITS authentication, data transfer, data storage, and data reconstruction, in order to resist classical cyber attacks. In the verification scheme, we use the universal2 hash function; however, we also developed another option with a strongly universal2 hash function. This option prevents accidental data leakage more efficiently, though more random numbers are necessary [6,14]. Our proposed scheme will make a significant contribution to enhancing the functions of the long-term secure distributed storage system. Our scheme consists of five constituent members (data owner, share calculator, share holder, end user, and verifier) because it includes time-stamp and verification functions. On the other hand, there exists a simpler ITS commitment scheme [24], and its realization with the QKD network is an interesting future work. Another important future direction is an improvement of the share renewal process. We demonstrated an optional, simple share renewal function based on Pedersen's verifiable secret sharing scheme against malicious insiders or classical cyber attacks. The security of this share renewal is based on computational complexity; therefore, it is effective only against malicious attackers who do not have quantum computers and need a certain time to compromise the share holders. In the future, it is desirable to realize long-term integrity based on a novel ITS share renewal function, e.g., using a trusted calculator.

FIGURE 1. Conceptual view of a secret sharing scheme with data integrity.The verifier enforces the third-party verification.

FIGURE 2. Conceptual view of third-party verification scheme.Long-term means elapsed time between (1) data registration phase and (2) data reconstruction and verification phase.The end user is not always fixed at data registration phase.

FIGURE 3. Schematic diagram of distributed storage with password authenticated secret sharing and share renewal scheme.

FIGURE 4. Schematic view of the layer structure of QKD Network and our third-party verification system.

FIGURE 5. Processing time of the registration phase (Registration), the pre-computation and communication among share holders phase (Communication), the share renewal phase (Renewal), and the data reconstruction phase (Reconstruction) as functions of data size.

1.2 Share calculator: The share calculator executes the following processing.
1.2.1 Share calculation: Record the receipt time $t_1$ of $D$, sent by the data owner. Then calculate shares $S = (s_1, \ldots, s_{N_{\mathrm{SH}}})$ of $D$ and send $s_h$ to share holder $h$.
1.2.2 Calculation of tag $\sigma$: Choose $r_{\mathrm{MAC}}$ randomly ($r_{\mathrm{MAC}} \in_R \mathcal{R}_{\mathrm{MAC}}$) and calculate a MAC tag $\sigma = u(r_{\mathrm{MAC}}, t_1|D)$, where $t_1|D$ denotes the concatenation of $t_1$ and $D$. Then send $t_1$ and $\sigma$ to the verifier via an authenticated channel (cf. item A4 in Sec. 2.3.1).
1.2.3 Post-processing: Record $t_1$, $r_{\mathrm{MAC}}$ and erase $D$, $S$ from the memory.
1.3 Verifier records the receipt time $t_2$ of messages $t_1$ and $\sigma$, sent by the share calculator.

(2) Data reconstruction phase:
2.1 Initiation by the data owner: The data owner requests the share calculator to send $D$.
2.2 Data reconstruction:
2.2.1 The share calculator collects from the share holders the shares $S = (s_1, \ldots, s_{N_{\mathrm{SH}}})$ of $D$. If he could collect fewer than $N_{\mathrm{SH}}$ shares, he announces "abort" of the entire scheme. From the collected shares, he calculates data $D'$ (which is supposed to equal $D$) and sends it to the data owner.
2.2.2 The data owner sends $t_1$ and $D''$ to the end user. If the data owner is honest, $D'' = D'$.
2.3 Integrity check by the end user:
2.3.1 The end user sends $D''$ and $t_1$ to the share calculator.
2.3.2 The share calculator calculates a MAC tag $\sigma'' = u(r_{\mathrm{MAC}}, t_1|D'')$. He then sends $\sigma''$ and $t_1$ to the verifier.
2.3.3 The verifier verifies the integrity of $D''$ by searching his memory for the MAC tag $\sigma$ registered with time $t_1$ and by checking whether it satisfies $\sigma'' = \sigma$ and $t_1 \le t_2$. If the check is successful, he sends "success" to the data owner and the end user; otherwise he sends "fail".
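The two phases above can be traced end to end with a small simulation. This is a sketch under several assumptions: the underlying SS scheme is taken to be a simple XOR-based 4-out-of-4 sharing, the hash is a Toeplitz-style function, time stamps are plain integers supplied by the caller, `secrets` stands in for QKD random numbers, and the ITS channels between players are not modeled. All class and function names are hypothetical.

```python
import secrets
from functools import reduce

K = 256  # tag length in bits (security parameter k)

def u(r_mac: int, msg: bytes, k: int = K) -> int:
    """Toeplitz-style k-bit tag of msg over GF(2), keyed by r_mac."""
    n = 8 * len(msg)
    m = int.from_bytes(msg, "big")
    mask = (1 << n) - 1
    tag = 0
    for i in range(k):
        tag |= (bin((r_mac >> i) & mask & m).count("1") & 1) << i
    return tag

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encode(t1: int, data: bytes) -> bytes:
    return t1.to_bytes(8, "big") + data  # concatenation of t1 and the data

class Verifier:
    def __init__(self):
        self.records = {}
    def store(self, t1, sigma, t2):            # step 1.3 (t2 stands in for the verifier's clock)
        self.records[t1] = (sigma, t2)
    def check(self, t1, sigma_claimed):        # step 2.3.3
        sigma, t2 = self.records[t1]
        return "success" if sigma_claimed == sigma and t1 <= t2 else "fail"

class ShareCalculator:
    def __init__(self, verifier, n_sh=4):
        self.verifier = verifier
        self.holders = [dict() for _ in range(n_sh)]  # each dict models one share holder
        self.memory = {}                              # small long-term memory
    def register(self, data, t1, t2):
        # step 1.2.1: XOR-based N-out-of-N shares
        shares = [secrets.token_bytes(len(data)) for _ in self.holders[1:]]
        shares.append(reduce(xor, shares, data))
        for holder, s in zip(self.holders, shares):
            holder[t1] = s
        # step 1.2.2: fresh random number and tag
        msg = encode(t1, data)
        r_mac = secrets.randbits(8 * len(msg) + K - 1)
        self.verifier.store(t1, u(r_mac, msg), t2)
        # step 1.2.3: keep only the small values; data and shares are not retained
        self.memory[t1] = (r_mac, len(data))
    def reconstruct(self, t1):                 # step 2.2.1
        return reduce(xor, (holder[t1] for holder in self.holders))
    def integrity_check(self, t1, claimed):    # steps 2.3.1 and 2.3.2
        r_mac, length = self.memory[t1]
        if len(claimed) != length:             # simplification: reject length changes outright
            return "fail"
        return self.verifier.check(t1, u(r_mac, encode(t1, claimed)))

verifier = Verifier()
calc = ShareCalculator(verifier)
D = b"last will and testament"
calc.register(D, t1=1000, t2=1001)
recovered = calc.reconstruct(1000)
honest_result = calc.integrity_check(1000, recovered)
forged_result = calc.integrity_check(1000, b"last will and testameNt")
```

The share calculator keeps only the random number and a length per registration, mirroring the small long-term memory assumption, while the verifier keeps only the tag and the two time stamps.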
SC1: This corresponds to the case where the data owner chooses incorrect data $D''$ ($\ne D'$) registered with time $t_1$ without knowing $r_{\mathrm{MAC}}$, and sends $D''$, instead of $D$, to the end user in step 2.2.2. An honest end user can detect this alteration of $D$ in step 2.3.3, except with a probability no larger than $2^{-k} \log_2 |\mathcal{D}|$, due to the almost universality2 (cf. Lemma 1) of function $u$.
SC2: This corresponds to the case where, after the scheme is finished, the end user chooses incorrect data $D''$ ($\ne D'$) without knowing $r_{\mathrm{MAC}}$, and claims that the data owner sent $D''$ with time $t_1$ instead of $D'$ in step 2.2.2. The data owner, if honest, can refute such a claim by (i) calculating the MAC tag $\sigma'' = u(r_{\mathrm{MAC}}, t_1|D'')$ with the help of the share calculator, and (ii) then demonstrating to the end user, with the help of the verifier, that $\sigma''$ differs from the correct tag $\sigma$ stored by the verifier. This refutation fails only when $\sigma''$ equals $\sigma$ accidentally, which occurs with a probability no larger than $2^{-k} \log_2 |\mathcal{D}|$, again due to the almost universality2 of function $u$.
SC3: This is because the verifier is correlated with the data only through the MAC tag $\sigma \in \{0,1\}^k$.