Key Agreement Using Physical Identifiers for Degraded and Less Noisy Authentication Channels

Secret-key agreement using physical identifiers is a promising security protocol for the authentication of users and devices with small chips, owing to its lightweight security. In the previous studies, the fundamental limits of such a protocol were analyzed, and the results showed that two auxiliary random variables were involved in the capacity region expressions. However, with two auxiliary random variables, it is difficult to directly apply the expressions to derive the computable forms of the capacity regions for certain information sources such as binary and Gaussian sources, which hold importance in practical applications. In this paper, we explore the structure of authentication channels and reveal that for the classes of degraded and less noisy authentication channels, a single auxiliary random variable is sufficient to express the capacity regions. As specific examples, we use the expressions with one auxiliary random variable to derive the computable forms for binary and Gaussian sources. Numerical calculations for the Gaussian case show the trade-off between secret-key and privacy-leakage rates under a given storage rate, which illustrates how the noise in the enrollment phase affects the capacity region.


I. INTRODUCTION
IN THE age of fast and momentous advancements in communication technologies, the number of Internet-of-Things (IoT) devices has increased remarkably. Since IoT devices equipped with small chips have resource-constrained capabilities, they may not be suitable for deploying high-profile cryptography schemes such as public-key encryption/decryption for device authentication. Lightweight security protocols that are readily feasible at the physical layer have recently been receiving greater attention since they enable the devices to communicate securely with low latency and low power consumption [2]. Secret-key agreement in which physical identifiers are used as information sources to generate secret keys for authentication, called an authentication system in this paper, has emerged as a promising candidate since it provides a low-complexity design, consumes less power, and preserves secrecy [3]. As authentication can be performed on demand, the cost is lower than that of key storage in non-volatile random access memories [4], [5]. Physical identifiers can be physical unclonable functions (PUFs), which exploit intrinsic manufacturing variations of integrated circuits to produce source sequences [6]. Several PUF designs have been proposed over the last few decades and can be largely classified into either strong PUFs or weak PUFs. We focus on weak PUFs such as static random-access memory (SRAM) PUFs and ring oscillator (RO) PUFs since they produce reliable challenge-response pairs that can be used as unique cryptographic keys for IoT device security [7]. Although their generating processes differ, PUFs and biometric identifiers have several aspects in common, and nearly all assumptions and analyses of PUFs can be applied to biometric identifiers [8]. Thus, the theoretical results developed in this study should be applicable to the scenario where biometric identifiers are treated as sources.
A block diagram of the data flows of an authentication system with PUFs is illustrated in Figure 1, and the system consists of two phases, i.e., the enrollment (top) and authentication (bottom) phases. In the enrollment phase, observing a measurement of the source sequence via a channel, which is assumed to be noise-free in some previous studies, the encoder generates a pair of secret key and helper data. The helper data is shared with the decoder via a noiseless public channel to assist in the reconstruction of the secret key. In the authentication phase, the decoder estimates the secret key using the helper data and another measurement observed through a channel in this phase [9], [10]. In this paper, the channels in the enrollment and authentication phases are called the enrollment channel (EC) and authentication channel (AC), respectively. EC and AC model the noises added to the identifiers during the enrollment and authentication phases, respectively.
Fig. 1. A basic concept of secret-key agreement using physical and biometric identifiers [9].
Relevant practical applications of the system described above include biometrics-based access control systems [11], fuzzy extractor schemes [12], [13], and field-programmable gate array (FPGA) based key generation with PUFs for IoT device authentication [14]. As a connection to physical layer security, PUFs are deployed to assist with key generation in poor scattering environments to enhance the randomness of bit sequences extracted from wireless channels, and it has been demonstrated that a higher secret-key generation rate is realizable [15].

A. Related Work
Seminal studies [9] and [10] independently investigated the fundamental limits of secret-key and privacy-leakage rates, called the capacity region, of authentication systems. The capacity region elucidates the best possible trade-off between secret-key and privacy-leakage rates. The revealed trade-off may provide direct insights and serve as a significant indicator for researchers seeking to design good practical codes that achieve the largest achievable secret-key rate and the lowest implementable privacy-leakage rate for an authentication system. 2 In [9], eight different systems were taken into consideration, but among them, the generated-secret (GS) and chosen-secret (CS) models are the two major systems that are closely related to real-life applications and have been frequently analyzed in subsequent studies. 3 The secret-key capacity increases for multiple rounds of enrollments and authentications in the GS model [16] and CS model [17] with SRAM PUFs. The work [9] has been extended to include a storage constraint [18], a multi-identifier scenario with joint and distributed encoding [19], and polar codes for achieving the fundamental limits [20]. All the theoretical results mentioned above [9], [10], [11], [12], [13], [14], [15], [16], [17], [18], [19], [20] are clarified under a common assumption, i.e., the EC is noiseless, and this particular model is known as a visible source model. Recently, the capacity regions of the GS and CS models have been characterized in a more realistic setting where the EC is noisy [21], and this model is called a hidden source model. As an extended scenario of authentication systems, GS and CS models that involve not only secret-key authentication but also user identification can be found in, e.g., [22], [23], and [24].
2 Note that when referring to the capacity region in the upcoming sections, it includes an extra dimension, the storage rate, along with the secret-key and privacy-leakage rates. To decrease memory usage in the public database, the storage rate should be minimized, similar to the privacy-leakage rate.
3 The difference between the two models appears in the enrollment phases. In the GS model, the secret key is extracted from the measurement of identifiers observed at the encoder and does not need to be saved in the public database. By contrast, in the CS model, the secret key, chosen uniformly and independently of other random variables, is combined with the measurement. The combined information, which contains data relevant to the secret key but not the plain form of the key, is stored in the database so that the decoder can reliably estimate the secret key. Hence, compared with the GS model, the minimum storage rate required for the CS model is larger in general. See [9, Section III] for a more comprehensive explanation.
For practical code constructions on the authentication systems, some state-of-the-art approaches for binary source sequences are investigated in [25] for polar codes and in [26] for both Wyner-Ziv and nested polar codes. Compared to the simulation results in [25] and [26], better performance in terms of secret-key versus storage ratio is achieved by deploying nested randomized polar subcodes [27]. Lately, a model with non-binary sources is developed in [28] with multilevel coding, and its performance is also evaluated by taking coded modulation and shaping techniques into consideration [29].
The capacity region of a GS model with the structure of AC following the channel of the wiretap channels or two-receiver broadcast channels with confidential messages [30] was investigated in [31]. In this model, AC is composed of the channel to the encoder, referred to as the main channel, and the channel to the eavesdropper (Eve), referred to as Eve's channel. Eve can obtain not only the helper data transmitted over public channels but also a correlated sequence of the source identifiers via her channel. This setup can be viewed as the source model of key-agreement problems [32], [33], [34] with one-way communication only and a privacy constraint. The privacy constraint is imposed to minimize the information leakage of the identifiers, and in general, its analysis becomes challenging especially when the noise in the enrollment phase is taken into account [21]. An extension of the work [31] by considering noisy EC and action cost at the decoder was presented in [35], and in both [31] and [35], it was shown that the resulting expressions of the capacity regions involve two auxiliary random variables for a general class of ACs.
In a different setting, the GS and CS models with jointmeasurement channels, where EC and AC are modeled as broadcast channels [36] to assume correlated noises in the measurements, were examined in [37]. Models with joint-measurement channels that incorporate Eve's channel can be found in [38]. These studies analyzed the capacity regions for some classes of broadcast channels, e.g., degraded and less noisy channels [36]. In a similar manner, we also investigate the capacity regions of the authentication systems for similar classes of channels, but the models and the point to which we direct our attention are different from those of [37] and [38]. More precisely, we deal with the models with separate measurements as in [35], and focus on the structure of AC, e.g., the main channel is less noisy than Eve's channel or Eve's channel is degraded with respect to the main channel, to simplify the expressions of the capacity regions with two auxiliary random variables that have been characterized in the paper.

B. Motivations
In real-life applications, the observations of PUFs and biometric identifiers are usually corrupted by noise. For instance, the measurements of PUFs' signals are affected by surrounding environments of integrated circuits such as temperature variation, change of supply voltage, and electronic noise [3], [8]. Likewise, a scanned picture of a fingerprint corresponds to a noisy version of its original image. Therefore, the assumption of the hidden source model as in [35] is considered to be a more realistic setting compared to that of the visible source model [31]. We thus adopt the setting of [35] on our model.
As we mentioned in the previous subsection, the expressions of the capacity regions of the GS and CS models characterized in [35] under a general class of AC involve two auxiliary random variables. Nevertheless, these expressions are impractical for developing the computable and tight bounds for some specific information sources and channels directly. Therefore, we explore and identify the classes of ACs that require only one auxiliary random variable for expressing the capacity regions, and use the simplified expressions to derive the computable forms for those specific sources and channels.
In this paper, we first investigate and characterize the capacity region with a single auxiliary random variable of the authentication systems for discrete sources and then apply this result to derive the capacity regions of GS and CS models for binary sources and channels. As an application of the systems with binary sources, it is well-known that SRAM-PUF responses are binary, and the outputs of sources and channels of SRAM PUFs can be modeled as binary bit sequences [16].
Furthermore, the measurements of the majority of PUFs are represented by continuous values. As an instance, the samples generated by RO PUFs obey a Gaussian distribution [39]. In addition, the noise in most communication channels is modeled as additive white Gaussian noise (AWGN). Motivated by this nature, we later extend the GS and CS models considered in [35] to characterize the capacity regions for Gaussian sources and channels.

C. Summary of Contributions
Unlike the technique used in [31] and [35], we apply information-spectrum methods [34], [40] to derive our main results. An advantage of leveraging these methods is that the argument does not depend on the size of the source alphabet, so it can also encompass continuous sources. The main contributions of this work are listed as follows:
• We demonstrate that one auxiliary random variable suffices to characterize the capacity regions of the GS and CS models when ACs are in the class of less noisy channels. Though less noisy ACs are a subclass of the general class of ACs, our results are not obtainable by a trivial reduction from the result derived in [35] under the general class of ACs.
• We apply the simplified expressions to derive the capacity regions for binary sources under less noisy ACs, which is a more general setting than the one discussed in [35, Section IV]. To obtain the tight regions, we establish a new lemma and use it to match the inner and outer bounds.
• The work [41] is extended to characterize the closed-form expressions of the capacity regions for a hidden source model. Also, numerical calculations for the Gaussian case are provided to demonstrate the trade-off between secret-key and privacy-leakage rates in the visible and hidden source models and to capture the effect of noise in the enrollment phase on the capacity region.

D. Modeling Assumptions
We assume that each symbol in the source sequences is independently and identically distributed (i.i.d.). Techniques such as principal component analysis [42] and transform-coding-based algorithms [43] can be applied to convert biometric and physical identifiers into a vector having (nearly) independent components. However, under various environments and conditions, it may not be feasible to completely remove the correlations among symbols in the source sequence. For simplicity of analysis, in this paper, we derive all results under the assumption that every symbol of the source and measurement vectors is generated i.i.d. according to a joint distribution.
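The decorrelation step mentioned above can be sketched numerically. The following is a minimal illustration, on synthetic data with assumed dimensions, of how a PCA-style transform converts correlated measurement components into (nearly) uncorrelated ones; it is not the algorithm of [42], only the basic idea.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical correlated measurements: n samples of a 4-dimensional
# identifier whose components share a common factor (hence are correlated).
n = 20000
common = rng.normal(size=(n, 1))
raw = 0.6 * common + 0.8 * rng.normal(size=(n, 4))

# PCA: project onto the eigenvectors of the sample covariance matrix.
cov = np.cov(raw, rowvar=False)
_, eigvecs = np.linalg.eigh(cov)
decorrelated = raw @ eigvecs

# The components are now uncorrelated: off-diagonal covariances are ~0.
cov_after = np.cov(decorrelated, rowvar=False)
off_diag = cov_after - np.diag(np.diag(cov_after))
print(np.max(np.abs(off_diag)))  # numerically ~0
```

In practice the residual correlations are only approximately removed, which is exactly the caveat stated above.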
In principle, Eve can be classified as either a passive or active eavesdropper. In this paper, we only focus on a passive attack and do not address the issues of active attacks on PUFs, e.g., machine learning and side-channel attacks [35, Section IV]. The obtained results are analyzed under the common assumption that a PUF is capable of fending off these invasive attacks that may transform the physical features of PUF outputs permanently [8].

E. Notation and Organization
Italic uppercase A and lowercase a denote a random variable and its realization, respectively. A^n = (A_1, ..., A_n) represents a string of random variables, and the subscript indicates the position of a random variable in the string. P_A(·) denotes the probability mass function of the random variable A. H(·) and H_b(·) denote the Shannon entropy and the binary entropy function, respectively. For other notation, refer to Table I.
The rest of this paper is organized as follows: In Section II, we introduce the system models and formulate achievability definitions. Section III derives the capacity regions of the authentication systems with one auxiliary random variable, and Section IV focuses on binary and Gaussian examples. Finally, concluding remarks and future work are given in Section V.

A. System Models
The GS and CS models, with mathematical notation, are depicted in Figure 2. The sequences (X̃^n, X^n, Y^n, Z^n) are i.i.d., and their joint distribution factorizes as P_{X̃^n X^n Y^n Z^n} = ∏_{t=1}^{n} P_{X̃_t|X_t} · P_{X_t} · P_{Y_t Z_t|X_t}. Let S_n = [1 : M_S] and J_n = [1 : M_J] be the sets of secret keys and helper data, respectively. Here, M_S and M_J stand for the largest values in the sets from which the secret key and helper data take values. The random vectors X̃^n and (Y^n, Z^n) denote the measurements of the identifier X^n, generated from the i.i.d. source P_X, via EC (X, P_{X̃|X}, X̃) and AC (X, P_{YZ|X}, Y × Z), respectively. Assume that all alphabets X̃, X, Y, and Z are finite; this assumption will be relaxed in Section IV-B.
In the GS model, observing the measurement X̃^n, the encoder e generates helper data J ∈ J_n and a secret key S ∈ S_n; (J, S) = e(X̃^n). The helper data J is shared with the decoder via a noiseless public channel. Observing Y^n, the decoder d estimates the secret key generated at the encoder using Y^n and the helper data J; Ŝ = d(Y^n, J), where Ŝ denotes an estimate of the secret key S. In the CS model, the secret key S is chosen uniformly from S_n and is independent of other random variables. It is embedded into the measurement X̃^n to form the helper data J; J = e(X̃^n, S). The decoder, similar to the decoder of the GS model, produces the estimate Ŝ = d(Y^n, J).
As the helper data J is sent over public channels, Eve can completely eavesdrop on this information. In addition to the helper data, Eve has a sequence Z^n, an output of the marginal channel P_{Z|X}, and both J and Z^n are exploited to learn the secret key S as well as the source identifier X^n. In essence, the information leaked to Eve regarding the identifier cannot be made negligible because of the high correlation among X^n, J, and Z^n. However, it is possible to make S and (J, Z^n) almost independent, so Eve may be able to recover only some insignificant bits but not the entire secret key from the data available on her side.
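The source model of this subsection can be summarized with a small simulation. The sketch below uses assumed toy parameters (binary symmetric channels for the EC and for the marginals of the AC) to draw i.i.d. samples according to the factorization P_{X̃^n X^n Y^n Z^n} = ∏_t P_{X̃_t|X_t} · P_{X_t} · P_{Y_t Z_t|X_t} and checks the empirical crossover probabilities.

```python
import random

random.seed(1)

def bsc(x, p):
    """Binary symmetric channel: flip the input bit with probability p."""
    return x ^ (random.random() < p)

# Illustrative parameters (assumptions, not from the paper):
# X ~ Bern(1/2), EC = BSC(0.05), AC marginals BSC(0.1) and BSC(0.2).
n = 50000
p_ec, p_y, p_z = 0.05, 0.10, 0.20

X  = [random.randint(0, 1) for _ in range(n)]
Xt = [bsc(x, p_ec) for x in X]   # enrollment measurement X~^n
Y  = [bsc(x, p_y) for x in X]    # authentication measurement Y^n
Z  = [bsc(x, p_z) for x in X]    # Eve's observation Z^n

# Empirical crossover probabilities should match the channel parameters.
print(sum(a != b for a, b in zip(X, Xt)) / n)  # ~0.05
print(sum(a != b for a, b in zip(X, Y)) / n)   # ~0.10
print(sum(a != b for a, b in zip(X, Z)) / n)   # ~0.20
```

Here Y and Z are drawn conditionally independently given X for simplicity; the general model allows an arbitrary joint P_{YZ|X}.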

B. Problem Formulations for the GS and CS Models
In this section, the formal achievability definitions of the GS and CS models are provided. We begin with the GS model.
Definition 1: A tuple of secret-key, storage, and privacy-leakage rates (R_S, R_J, R_L) ∈ R_+^3 is said to be achievable for the GS model if for sufficiently small δ > 0 and large enough n there exist pairs of encoders and decoders satisfying
Pr{Ŝ ≠ S} ≤ δ, (1)
(1/n)H(S) ≥ R_S − δ, (2)
(1/n)log M_J ≤ R_J + δ, (3)
I(S; J, Z^n) ≤ δ, (4)
(1/n)I(X^n; J, Z^n) ≤ R_L + δ. (5)
Also, R_G is defined as the closure of the set of all achievable rate tuples for the GS model, called the capacity region. □ The technical meaning of each constraint in Definition 1 can be interpreted as follows: Condition (1) evaluates the error probability of estimating the secret key. This is related to the reliability of the authentication systems, and the probability must be bounded by a sufficiently small number δ. Equation (2) is the constraint on the secret-key rate; the generated key should be nearly uniform in the entropy sense so as to extract as large a key size as possible. Constraint (3) is imposed to minimize the size of the local random codebook that is required for enrollment and authentication. The rate of the codebook must not exceed a given storage rate R_J.
Equation (4) measures the information leaked about the secret key to Eve, called secrecy leakage, and it is evaluated under a strong secrecy criterion, which requires that the amount of leakage be bounded by a small value regardless of the block length n. In other words, Eve can obtain only a negligible amount of information regarding the secret key through the helper data and the correlated sequence. The last condition (5) assesses the amount of privacy leakage of the biometric or physical identifiers to Eve. In general, unlike the secrecy leakage (4), it is infeasible to make this amount vanish since the helper data itself is generated from X̃^n, a correlated sequence of X^n, and Z^n is also correlated with X^n. However, it is important to minimize this quantity to protect the sensitive data of users or the characteristics of PUFs embedded inside the integrated circuits of IoT devices.
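To make the secrecy-leakage metric concrete, the following one-bit toy example (not the paper's scheme) computes I(S; J) exactly for helper data formed by masking the key with a uniform measurement bit; the uniform mask makes the leakage exactly zero.

```python
from itertools import product
from math import log2

# Toy setting (an assumption for illustration): the key S and the
# measurement bit X~ are independent uniform bits, and the public helper
# data is J = S XOR X~.  We compute I(S; J) by exact enumeration.
p_sj = {}
for s, xt in product([0, 1], repeat=2):
    j = s ^ xt
    p_sj[(s, j)] = p_sj.get((s, j), 0.0) + 0.25  # P(s, x~) = 1/4

p_s = {s: sum(v for (s2, _), v in p_sj.items() if s2 == s) for s in (0, 1)}
p_j = {j: sum(v for (_, j2), v in p_sj.items() if j2 == j) for j in (0, 1)}

# Mutual information I(S; J) = sum P(s,j) log2( P(s,j) / (P(s)P(j)) ).
leak = sum(v * log2(v / (p_s[s] * p_j[j])) for (s, j), v in p_sj.items())
print(leak)  # 0.0 -- the uniform mask makes J independent of S
```

Of course, in the actual system J must also enable key reconstruction from a noisy measurement, which is why nonzero privacy leakage (5) is unavoidable even when the secrecy leakage (4) vanishes.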
The achievability definition of the CS model is given below. Definition 2: A tuple of (R_S, R_J, R_L) ∈ R_+^3 is said to be achievable for the CS model if for any δ > 0 and large enough n there exist pairs of encoders and decoders satisfying all the requirements imposed in Definition 1 with (2) replaced by
(1/n)log M_S ≥ R_S − δ. (6)
We define R_C as the capacity region of the CS model. □ The interpretations of the constraints in Definition 2 are the same as those in Definition 1; therefore, the details are omitted. In (6), enforcing the secret key to be uniform is no longer needed for the CS model, as the secret key is uniformly chosen from the set S_n.
There are other possible ways to define the secrecy leakage and privacy leakage in authentication systems, such as conditional entropy and variational distance. Nevertheless, in this paper, we adopt mutual information as the main metric, as in [31] and [35], so that our main results can be readily connected with those established in the previous studies.

C. The Capacity Regions With Two Auxiliary Random Variables
To facilitate the understanding of our main contributions in Section III, we highlight a complete characterization of the capacity regions of the GS and CS models (without action costs) derived in [35] for discrete sources.
where the auxiliary random variables U and V satisfy the Markov chain V − U − X̃ − X − (Y, Z) and their cardinalities are bounded as |V| ≤ |X̃| + 6 and |U| ≤ (|X̃| + 6)(|X̃| + 5). □ The single-letter expressions of the regions above involve two auxiliary random variables U and V. Theorem 1 tells us that, similar to the conclusion drawn in [33] for the key-agreement problem, two auxiliary random variables are required for expressing the capacity regions of the authentication systems for the general class of ACs.
In general, once the single-letter expressions for discrete sources are established, it is common to characterize a computable form of the capacity region for special cases via such expressions. However, it is challenging to directly employ the expressions in (7) and (8) to derive a computable form of the capacity regions for binary and Gaussian sources due to the difficulty of handling two auxiliary random variables. In the next section, we explore the classes of ACs for which the capacity regions can be expressed by one auxiliary random variable.

III. STATEMENT OF MAIN RESULTS
As mentioned in the introduction, the structure of the AC in our system model is similar to the channel of two-receiver broadcast channels with confidential messages. In discussions of broadcast channels, degraded, less noisy, and more capable channels are three important classes that are often considered because the single-letter characterization of the capacity region of these types of broadcast channels is determinable [36, Chapter 5]. The class of degraded channels can be further subdivided into two classes: physically degraded and stochastically degraded channels. It is known that the latter class is larger than the former. In this section, we examine the characterization of the capacity regions for these important channel classes.
Prior to the presentation of our main results, the formal definitions of physically and stochastically degraded channels and of less noisy channels [36] are given. In order to avoid confusion with the AC of the authentication systems, we denote the conditional probability of a two-receiver broadcast channel by P_{BC|A}(b, c|a) for (a, b, c) ∈ A × B × C, and P_{B|A}(b|a) and P_{C|A}(c|a) denote the conditional marginal distributions of the broadcast channel.
Definition 3 (Physically Degraded Channel): (A, P_{C|A}, C) is physically degraded with respect to (A, P_{B|A}, B) if P_{BC|A}(b, c|a) = P_{B|A}(b|a) · P_{C|B}(c|b) for some transition probabilities P_{C|B}.
(Stochastically Degraded Channel): We say that (A, P_{C|A}, C) is stochastically degraded with respect to (A, P_{B|A}, B) if there exists a channel (B, P_{C|B}, C) such that P_{C|A}(c|a) = Σ_{b∈B} P_{C|B}(c|b) P_{B|A}(b|a). □
A clear relation among these classes of channels is that degraded channels are a subclass of less noisy channels, and less noisy channels are a subclass of more capable channels.
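Stochastic degradedness can be checked constructively. The sketch below, with illustrative parameters, verifies that a BSC(ϵ) is stochastically degraded with respect to a BEC(q) whenever q ≤ 2ϵ, by explicitly exhibiting an intermediate channel P_{C|B} and composing the transition matrices as in Definition 3.

```python
# Stochastic degradedness (Definition 3) for main channel BEC(q) and
# Eve's channel BSC(eps).  Parameters are illustrative assumptions.
q, eps = 0.2, 0.15
assert q <= 2 * eps  # the regime in which the construction below works

# P_{B|A}: rows = input a in {0,1}; columns = BEC output b in {0, e, 1}.
P_B_A = [[1 - q, q, 0.0],
         [0.0, q, 1 - q]]

# Intermediate channel B -> C: flip a clean symbol with probability delta,
# and resolve an erasure by a fair coin.  delta is chosen so the overall
# crossover probability is (1 - q)*delta + q/2 = eps.
delta = (eps - q / 2) / (1 - q)
P_C_B = [[1 - delta, delta],
         [0.5, 0.5],
         [delta, 1 - delta]]

# Compose: P_{C|A}(c|a) = sum_b P_{C|B}(c|b) P_{B|A}(b|a).
P_C_A = [[sum(P_B_A[a][b] * P_C_B[b][c] for b in range(3)) for c in range(2)]
         for a in range(2)]
print(P_C_A)  # rows ~ [[0.85, 0.15], [0.15, 0.85]], i.e., a BSC(0.15)
```

When q > 2ϵ no such intermediate channel exists, which is why the less noisy class considered next is strictly larger.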
In some literature, e.g., [44], the notion of less noisy channels is phrased in terms of noisier channels. More precisely, it is said that (A, P_{C|A}, C) is noisier than (A, P_{B|A}, B) if for every random variable W such that W − A − (B, C), we have I(B; W) ≥ I(C; W); equivalently, (A, P_{B|A}, B) is less noisy than (A, P_{C|A}, C). In this manuscript, we sometimes use the terms "less noisy" and "noisier" interchangeably.
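The less-noisy ordering can also be probed numerically. By the known comparison of erasure and symmetric binary channels, a BEC(q) is less noisy than a BSC(ϵ) when q < 4ϵ(1 − ϵ); the sketch below (assumed parameters, binary W only) samples random test channels W − A and confirms that I(B; W) − I(C; W) never goes negative.

```python
import random
from math import log2

random.seed(0)

def hb(x):
    """Binary entropy function in bits."""
    return 0.0 if x in (0.0, 1.0) else -x * log2(x) - (1 - x) * log2(1 - x)

# B = BEC(q) output, C = BSC(eps) output; illustrative parameters inside
# the window where the BEC is less noisy than the BSC.
q, eps = 0.5, 0.2
assert 2 * eps < q < 4 * eps * (1 - eps)

for _ in range(200):
    w1 = random.random()                       # P(W = 1)
    a_w = [random.random(), random.random()]   # P(A = 1 | W = w)
    p_w = [1 - w1, w1]
    p_a1 = sum(p_w[w] * a_w[w] for w in range(2))

    # BEC(q): erasures carry no information, so I(B; W) = (1-q) I(A; W).
    i_bw = (1 - q) * (hb(p_a1) - sum(p_w[w] * hb(a_w[w]) for w in range(2)))

    # BSC(eps): P(C = 1 | W = w) = P(A = 1 | W = w) * (1 - 2*eps) + eps.
    c_w = [a_w[w] * (1 - eps) + (1 - a_w[w]) * eps for w in range(2)]
    p_c1 = sum(p_w[w] * c_w[w] for w in range(2))
    i_cw = hb(p_c1) - sum(p_w[w] * hb(c_w[w]) for w in range(2))

    assert i_bw >= i_cw - 1e-12
print("less-noisy ordering held on all sampled W")
```

Such a randomized check can only falsify, not prove, the ordering; the proof is the single-letter characterization cited above.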
In order to simplify the statement of our main results, we define five new rate regions. The following rate constraints are used in the newly defined rate regions.
Definition 4: Rate regions of secret-key, storage, and privacy-leakage rates are defined via the rate constraints (9)-(13) as follows:
A_1: the region (14), in which the auxiliary random variable U satisfies (9), (11), and (13);
A_2: the region (15), in which U satisfies (9), (12), and (13);
A_3: the region (16), in which U satisfies (10), (11), and (13);
A_4: the region (17), in which U satisfies (10), (12), and (13);
where the auxiliary random variable U in the regions (14) and (15) satisfies the Markov chain U − X̃ − X − Y − Z, and the auxiliary random variable U in the regions (16) and (17) satisfies U − X̃ − X − (Y, Z). The cardinality of the alphabet U of the auxiliary random variable U in all regions above is bounded by |U| ≤ |X̃| + 3. Also, define the region A_5, in which no auxiliary random variable appears. □
The regions A_1 and A_2 in Definition 4 correspond to the capacity regions of the GS and CS models when the AC is physically or stochastically degraded, and the regions A_3 and A_4 are related to the capacity regions of the GS and CS models for less noisy ACs. The region A_5 is used in a special case for degraded, less noisy, and Gaussian ACs, and no auxiliary random variable is involved in its expression.
We start presenting our main results by showing a theorem when AC is degraded.
Theorem 2: Suppose that the AC P_{YZ|X} has a structure such that Eve's channel P_{Z|X} is physically degraded with respect to the main channel P_{Y|X}, meaning that the Markov chain X − Y − Z holds. The capacity regions of secret-key, storage, and privacy-leakage rates of the GS and CS models are given by R_G = A_1 and R_C = A_2. Conversely, if the Markov chain X − Z − Y holds, the regions are characterized as R_G = R_C = A_5. □ The proof of Theorem 2 is similar to that of Theorem 3; therefore, it is omitted.
Remark 1: The capacity regions for physically and stochastically degraded ACs are given in the same form as in Theorem 2. This is because the capacity region depends only on the marginal distributions (P_{X̃|X}, P_{Y|X}, P_{Z|X}), and for the model considered in this paper, these distributions coincide for both physically and stochastically degraded ACs.
The following theorem states the capacity regions of the GS and CS models for less noisy ACs.
Theorem 3: If the AC P_{YZ|X} has a structure such that P_{Y|X} is less noisy than P_{Z|X}, i.e., I(Y; W) ≥ I(Z; W) for every random variable W such that W − X − (Y, Z), we have that R_G = A_3 and R_C = A_4. For the case where P_{Z|X} is less noisy than P_{Y|X}, i.e., I(Y; W) ≤ I(Z; W) for every W such that W − X − (Y, Z), the capacity regions of the systems are given by R_G = R_C = A_5. □ The proof of Theorem 3 is available in Appendix A. By a method similar to that used in [9, Section V-A], it can be checked that both R_G and R_C are convex. When Eve has no side information (Z is independent of the other random variables), Theorems 2 and 3 naturally reduce to the capacity regions given in [21].
Note that the assumption of less noisy channels in Theorem 3, i.e., I(Y; W) ≥ I(Z; W) (or I(Y; W) ≤ I(Z; W)), implies in particular that I(Y; U) ≥ I(Z; U) (or I(Y; U) ≤ I(Z; U)) for every U satisfying the Markov chain U − X̃ − X − (Y, Z). This fact is utilized in the proof of the theorem.
Remark 2: The class of more capable channels includes less noisy channels as a special case [36]. When the AC is in the class of more capable channels, i.e., I(X; Y) ≥ I(X; Z) or I(X; Y) ≤ I(X; Z), it is not yet known whether the capacity region can be characterized by one auxiliary random variable. More specifically, due to the impact of noise in the enrollment phase, the condition I(X; Y) ≥ I(X; Z) does not guarantee that I(Y; U) ≥ I(Z; U) holds for every admissible U, making it difficult to identify the sign of the right-hand side of the secret-key rate constraint in (10). The same observation applies to the case in which I(X; Y) ≤ I(X; Z).
An observation from the theorems and remarks above is that for wiretap channels, the fundamental limits, e.g., the capacity-equivocation regions, depend on the channel P_{YZ|X} only through the marginal distributions of the main channel P_{Y|X} and Eve's channel P_{Z|X} [45]. This conclusion may also apply to a visible source model of the authentication systems. However, in the hidden source model, the capacity regions hinge not only on the marginal distributions of the AC P_{YZ|X} but also on the EC P_{X̃|X}.

A. Binary Sources
In this section, the characterization of a binary example for Theorem 3 in the case where Eve's channel is noisier than the main channel is presented.
Consider the source random variable X ∼ Bern(1/2); P_{X̃|X} is a binary symmetric channel with crossover probability p ∈ [0, 1/2], P_{Y|X} is a binary erasure channel with erasure probability q ∈ [0, 1], and P_{Z|X} is a binary symmetric channel with crossover probability ϵ ∈ [0, 1/2]. Note that if ϵ and q satisfy 2ϵ < q < 4ϵ(1 − ϵ), then P_{Y|X} is less noisy than P_{Z|X}, but P_{Z|X} is not a degraded version of P_{Y|X}. An illustration of this setting is given in Figure 3. Let the test channel P_{U|X̃} be a binary symmetric channel with crossover probability β ∈ [0, 1/2]. The optimal rate regions of the GS and CS models in this case are given below.
Theorem 4: For binary sources, when the main channel is less noisy than Eve's channel, the capacity regions of the GS and CS models are given in terms of the binary entropy function and the convolution operation *, defined as x * y = x(1 − y) + (1 − x)y for x ∈ [0, 1] and y ∈ [0, 1]. □ The proof of Theorem 4 is given in Appendix C. In [35], the rate region of the GS model for binary sources was derived under the assumptions that X̃ = X (EC is noiseless) and the AC is physically degraded, i.e., the Markov chain X − Y − Z holds. Theorem 4 is provided under a more general setting, and the key idea for deriving it is to apply Mrs. Gerber's Lemma [46] in the reverse direction of Eve's channel to obtain an upper bound on the conditional entropy H(Z|U). However, the bound obtained this way is not tight. We establish a simple lemma (Lemma 3) to acquire the optimal upper bound on H(Z|U) and match the outer region with the inner region.
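The building blocks of Theorem 4 are the binary entropy function and the binary convolution. The snippet below (with illustrative parameters only) implements both and evaluates, for example, H_b(β * p), which is the conditional entropy H(X|U) induced by cascading the BSC(β) test channel with the BSC(p) EC.

```python
from math import log2

def hb(x):
    """Binary entropy function H_b(x) in bits."""
    return 0.0 if x in (0.0, 1.0) else -x * log2(x) - (1 - x) * log2(1 - x)

def star(x, y):
    """Binary convolution x * y = x(1-y) + (1-x)y."""
    return x * (1 - y) + (1 - x) * y

# Assumed example parameters: EC crossover p, test-channel crossover beta,
# and (q, eps) inside the "less noisy but not degraded" window.
p, beta, q, eps = 0.05, 0.1, 0.5, 0.2
assert 2 * eps < q < 4 * eps * (1 - eps)

# The cascade U - X~ - X of BSC(beta) and BSC(p) is a BSC with crossover
# beta * p, so H(X | U) = H_b(beta * p).
print(star(beta, p))      # ~0.14 = 0.1*0.95 + 0.9*0.05
print(hb(star(beta, p)))  # ~0.584 bits
```

These two primitives are exactly what appear in the single-letter rate expressions once Mrs. Gerber's Lemma and Lemma 3 are applied.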

B. Scalar Gaussian Sources
Unlike the discrete sources, for Gaussian sources we provide the capacity regions of the system for a general class of Gaussian ACs. The data flows for Gaussian sources are depicted at the top of Figure 4. Assume that the source is given by X ∼ N(0, 1), and the channels P_{X̃|X}, P_{Y|X}, and P_{Z|X} are modeled as
X̃ = ρ_1 X + N_1, Y = ρ_2 X + N_2, Z = ρ_3 X + N_3, (25)
where |ρ_1|, |ρ_2|, |ρ_3| < 1 are the correlation coefficients of the channels, and N_1 ∼ N(0, 1 − ρ_1²), N_2 ∼ N(0, 1 − ρ_2²), and N_3 ∼ N(0, 1 − ρ_3²) are Gaussian random variables, independent of each other and of the other random variables.
Using a technique of transforming the exponent part of the joint distribution as in [47], or equivalently the covariance matrix, the source can be rewritten as

X = ρ_1 X̃ + N_x, (26)

where N_x ∼ N(0, 1 − ρ_1²) is independent of all other random variables. A depiction of the data flows for (26) is displayed at the bottom of Figure 4, and the capacity regions for Gaussian sources are derived via (26) instead of (25). The result is given below.
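The forward model (25) and the equivalent reverse form (26) can be checked empirically; the following sketch (our own illustration, with assumed example values ρ_1² = 0.9, ρ_2² = 0.8, ρ_3² = 0.5 taken from the numerical section) samples the variables and verifies the correlation coefficients.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
rho1, rho2, rho3 = np.sqrt(0.9), np.sqrt(0.8), np.sqrt(0.5)

# Forward model (25): hidden source X and its noisy observations.
X = rng.standard_normal(n)
Xt = rho1 * X + np.sqrt(1 - rho1**2) * rng.standard_normal(n)  # enrollment X~
Y = rho2 * X + np.sqrt(1 - rho2**2) * rng.standard_normal(n)   # main channel
Z = rho3 * X + np.sqrt(1 - rho3**2) * rng.standard_normal(n)   # Eve's channel

# Reverse form (26): X = rho1 * X~ + Nx reproduces the same joint law,
# so the pair (X~, Xr) has the same correlation rho1 as (X, X~).
Xr = rho1 * Xt + np.sqrt(1 - rho1**2) * rng.standard_normal(n)

print(np.corrcoef(X, Xt)[0, 1], np.corrcoef(Xt, Xr)[0, 1])
```

Both printed correlations should be close to ρ_1, illustrating why the regions can be derived via (26) instead of (25).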
Theorem 5: Under the condition ρ_2² > ρ_3², i.e., X̃ − X − Y − Z (cf. [34, Lemma 6]), the capacity regions of the GS and CS models for Gaussian sources are given by (27) and (28), where the auxiliary random variable U satisfies (9), (11), and (13) for the GS model and (9), (12), and (13) for the CS model, and in both cases satisfies the Markov chain U − X̃ − X − Y − Z. Unlike Theorem 2, the random variable U is continuous and its cardinality is unbounded. For the case of ρ_2² ≤ ρ_3², i.e., X̃ − X − Z − Y, the regions are characterized in the same form. □ Theorem 5 can be proved by a method similar to that used for Theorem 3, and we therefore omit the detailed proof. In Theorems 2, 3, and 5, when the structure of the ACs is such that the main channel is degraded with respect to Eve's channel or is noisier than Eve's channel, the capacity regions of the GS and CS models are given in the same form. Secret-key generation at a positive rate is then not possible, and the minimum storage rate is zero, but the minimum privacy-leakage rate can still be positive depending on the joint marginal densities of (X, Z). Even when no encoding is needed, e.g., when U is set to a constant, the information leaked to Eve via her channel P_{Z|X} is at minimum rate I(Z; X), which equals the capacity of this channel. This quantity corresponds to an uncontrollable amount of privacy leakage at the encoder; it is avoidable if the privacy leakage is instead constrained by conditional mutual information, i.e., I(X^n; J|Z^n), as in [48]. Note that due to the unbounded cardinality of the auxiliary random variable, the regions in (27) and (28) are not directly computable. Next, we show that the parametric forms, i.e., computable expressions, of Theorem 5 are determined by a single parameter. The parameter α appearing in the following corollary acts as an adjusting parameter for the variance of the auxiliary random variable U.
Unlike the random variables (X̃, X, Y, Z), whose variances are always one, the auxiliary random variable U can be any Gaussian random variable with variance in the range (0, 1]. Corollary 1: When the condition ρ_2² > ρ_3² is satisfied, the regions in (27) and (28) can be computed as the expressions below, respectively, and that of (29) is given in a corresponding form.

C. Behaviors of the Capacity Region for Gaussian Sources
In this section, we investigate the ultimate (asymptotic) limits of the secret-key and privacy-leakage rates and provide some numerical results for Gaussian sources. For brevity, we focus only on the GS model. First, we find expressions for the optimal secret-key and privacy-leakage rates under a fixed storage rate for the hidden source model. Let us fix the storage rate R^α_J as in (33), and define the two rate functions R*_S(R^α_J) and R*_L(R^α_J) as in (34); using the value of α, they can be written in closed form as in (35). The asymptotic limits of the secret-key and privacy-leakage rates as R^α_J tends to infinity are given in (36). For the visible source model, the asymptotic limits for a given storage rate are determined by substituting ρ_1² = 1 into (36), i.e., (37), whose rate functions are defined in the same manner as (34) and correspond to the maximum secret-key rate and the minimum privacy-leakage rate of this model under a fixed storage rate. One can see from the second equation of (37) that the optimal value of the privacy-leakage rate increases linearly with the storage rate.
Next, we provide some numerical calculations of the region R_G in (30) and examine special points of both the visible source model (ρ_1² = 1) and the hidden source model (ρ_1² < 1). The following two scenarios are considered. 1) ρ_1² varies over the three values 1.0, 0.9, and 0.7, while (ρ_2², ρ_3²) is fixed at (0.8, 0.5). This is the case where the enrollment channel P_{X̃|X} may change while the authentication channel P_{YZ|X} remains the same. 2) ρ_1² is fixed at 0.9, while (ρ_2², ρ_3²) is one of the pairs (0.8, 0.5), (0.7, 0.6), or (0.6, 0.7). This is the opposite of Scenario 1). Figures 5 and 6 plot the optimal trade-offs between secret-key and storage rates (R^α_J, R*_S(R^α_J)) and between privacy-leakage and storage rates (R^α_J, R*_L(R^α_J)), respectively, for Scenario 1). Figures 7 and 8 illustrate the same rate pairs for Scenario 2). These figures are obtained by calculating the values of R^α_J defined in (33), and R*_S(R^α_J) and R*_L(R^α_J) defined in (35), with respect to the parameter α. In this calculation, we set the step size of α to 10^{-5}, which was found to be sufficiently small for the numerical implementation.
From Figures 5 and 6, the visible source model achieves a better secret-key rate but leaks more privacy of the physical identifiers to Eve than the hidden source model. More precisely, the asymptotic values of the secret-key rate are lim_{R^α_J→∞} R*_S(R^α_J) = (1/2)log(5/2) = 0.458 nats, 0.338 nats, and 0.195 nats when ρ_1² is equal to 1.0, 0.9, and 0.7, respectively. This indicates that as ρ_1² decreases, meaning that more noise is introduced to the identifiers in the enrollment phase, the secret-key rate becomes smaller.
[Fig. 7. Projection of the capacity region R_G in (30) with different ρ_2² and ρ_3² onto the R_J-R_S plane.]
Conversely, in terms of the privacy-leakage rate, the asymptotic limits become lim_{R^α_J→∞} R*_L(R^α_J) → ∞, 0.861 nats, and 0.538 nats when ρ_1² is equal to 1.0, 0.9, and 0.7, respectively. Evidently, when ρ_1² is low, less information about the identifiers leaks to Eve. Because the noise in the enrollment phase serves as a fixed filter [49] that obscures the privacy of the identifiers, when ρ_1² is small (the variance of the noise added to the identifiers in the EC is large), the amount of information leaked to Eve is also small. By contrast, when ρ_1² approaches 1, the effectiveness of the filter lessens, the hidden source model behaves similarly to the visible source model, and a larger amount of privacy of the identifiers can be leaked.
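The quoted asymptotic values can be reproduced numerically. The closed forms below are a reconstruction inferred from the numbers quoted above and are meant to be consistent with (36); they are our assumption for illustration, not expressions copied from the paper.

```python
import numpy as np

def rs_limit(r1, r2, r3):
    """Asymptotic secret-key rate in nats; arguments are the squared
    correlations (rho1^2, rho2^2, rho3^2). Reconstructed closed form:
    (1/2) ln((1 - r1*r3) / (1 - r1*r2))."""
    return 0.5 * np.log((1 - r1 * r3) / (1 - r1 * r2))

def rl_limit(r1, r2, r3):
    """Asymptotic privacy-leakage rate in nats; diverges as r1 -> 1.
    Reconstructed closed form:
    (1/2) ln((1 - r1*r2) / ((1 - r1) * (1 - r3)))."""
    return 0.5 * np.log((1 - r1 * r2) / ((1 - r1) * (1 - r3)))

r2, r3 = 0.8, 0.5
for r1 in (1.0, 0.9, 0.7):
    rl = float('inf') if r1 == 1.0 else rl_limit(r1, r2, r3)
    print(f"rho1^2={r1}: R_S -> {rs_limit(r1, r2, r3):.3f}, R_L -> {rl:.3f}")
```

With (ρ_2², ρ_3²) = (0.8, 0.5), these expressions return 0.458, 0.338, and 0.195 nats for the secret-key limits and ∞, 0.861, and 0.538 nats for the privacy-leakage limits, matching the values in the text.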
For Scenario 2), Figures 7 and 8 show that the achievable secret-key rate gradually decreases and the privacy-leakage rate rises as ρ_2² declines and ρ_3² increases. This can be verified explicitly by comparing the values of the secret-key and privacy-leakage rates in (35) for different pairs (ρ_2², ρ_3²) under the same storage rate. These behaviors suggest that when the noise variance of the measurements observed through the main channel is large, corresponding to the case where a low-quality quantizer (e.g., one with few quantization levels) is deployed at the decoder, the result is a small secret-key generation rate and a high privacy-leakage rate. This effect becomes particularly pronounced when Eve uses a high-quality quantizer.
In authentication systems, it is desirable to achieve a high secret-key rate while maintaining low storage and privacy-leakage rates, but these numerical results reflect the difficulty of achieving such a rate tuple simultaneously. Therefore, to prevent a significant loss of privacy, when designing practical codes for an authentication system it is important not only to increase the secret-key rate but also to weigh its balance against the storage and privacy-leakage rates.

V. CONCLUSION
In this paper, we investigated the classes of ACs for which the capacity regions of the GS and CS models can be characterized by one auxiliary random variable. The obtained results revealed that only a single auxiliary random variable is required to characterize the capacity regions for degraded and less noisy ACs. Moreover, the capacity regions of the authentication systems for both binary and Gaussian sources were derived. All the expressions derived in this work are not only tight but also readily computable. They may serve as a performance benchmark when practical channel codes such as LDPC and polar codes are constructed for the authentication systems as in [25] for a visible source model. We also provided some numerical calculations for the Gaussian case to demonstrate the impact of noise in the enrollment phase on the capacity region as well as to examine the trade-off between secret-key and privacy-leakage rates for a given storage rate.
For future work, a natural extension is to investigate whether polar codes can achieve all rate points in the capacity region for binary sources. Indeed, for the typical key-agreement problem [20], polar codes were shown to achieve the fundamental limits by exploiting the degraded and less noisy properties of the main and Eve's channels. Due to the similarity between the key-agreement problem and our model, it may be possible to show that such codes achieve the fundamental limits of the authentication systems as well. Furthermore, extending the results in Section IV-B to the vector Gaussian case is another interesting research topic.

APPENDIX A PROOF OF THEOREM 3
This appendix provides the proof of the capacity regions for less noisy ACs. We only provide the proof of (21) since that of (22) follows similarly by simply setting the auxiliary random variable U to be constant. The proof is divided into two parts, namely, the converse and achievability parts. For the converse part, the derivation of each rate constraint for the GS and CS models is discussed in detail, while in the achievability part, only the key point is addressed.
[Fig. 9. The possibility of a reduction of Theorem 1 to obtain the outer regions of the GS and CS models for less noisy ACs.]

A. Converse Part
Note that, following the same technique as in [35], the outer bounds derived for the general class of ACs also hold for less noisy ACs. Figure 9 illustrates the possibility of directly deducing the capacity regions of the GS and CS models for less noisy ACs from the expressions with two auxiliary random variables seen in Theorem 1.
More specifically, it is possible to derive the outer bound on R_G in Theorem 3 directly from the region in (7) by exploiting the long Markov chain V − U − X̃ − X − (Y, Z), as shown in Figure 10, together with the property of less noisy channels; however, the same approach cannot be applied to the CS model. We therefore prove the GS and CS models via different approaches, beginning with the GS model and then giving the detailed argument for the CS model.
Converse Proof of GS Model: Since the bounds on R_J in the regions (7) and (21) are unchanged, we only need to check the constraints on the secret-key and privacy-leakage rates. The bound on the secret-key rate is transformed as shown, where (a) follows from the Markov chains V − U − Y and V − U − Z, derivable from the Markov chain V − U − X̃ − X − (Y, Z) (cf. Figure 10), and (b) follows because less noisy ACs satisfy I(Y; V) ≥ I(Z; V). Likewise, for the bound on the privacy-leakage rate, we obtain (39), where (a) is due to the Markov chain V − X − (Y, Z) and (b) is due to the property that I(Y; V) ≥ I(Z; V) for less noisy ACs. Hence, the converse proof of the GS model is complete. □ Converse Proof of CS Model: Observe that the right-hand side of the storage-rate bound of the CS model with two auxiliary random variables can be reshaped as (40), where (a) is due to the Markov chains U − X̃ − Y and V − U − (Y, Z), and (b) follows from the Markov chain U − X̃ − Z.
In (40), since I(Y; V) ≥ I(Z; V) for less noisy ACs, this lower bound cannot be further reduced to the one in (12). Hence, the technique used for the GS model cannot be applied to derive the outer bound directly from the region with two auxiliary random variables (cf. eq. (8)) for the CS model, and an alternative approach is required. Here, we make use of a standard technique that relies on a suitable choice of auxiliary random variables and Fano's inequality.
Suppose that a rate tuple (R_S, R_J, R_L) is achievable, implying that there exists a pair of encoders and decoders such that all requirements in Definition 2 are satisfied for small enough δ > 0 and block length n ≥ n_0 (n_0 ≥ 1). For t ∈ [1 : n], define the auxiliary random variables U_t = (J, S, Y^n_{t+1}, Z^{t−1}) and V_t = (J, Y^n_{t+1}, Z^{t−1}). Under these settings, it is easy to verify that the Markov chain V_t − U_t − X̃_t − X_t − (Y_t, Z_t) holds. Analysis of Secret-Key Rate: Define δ_n = (1/n)(1 + δ log M_S); this quantity is related to the upper bound in Fano's inequality. From (6), the secret-key rate can be bounded as in (41), where (a) is due to an argument similar to [21, eq. (40)], (b) holds by the Markov chains V_t − U_t − Y_t and V_t − U_t − Z_t, and (c) follows from the Markov chain V_t − U_t − (Y_t, Z_t) and the property of less noisy channels that I(Y_t; V_t) ≥ I(Z_t; V_t). Analysis of Storage Rate: For the CS model, note that the secret key S is independent of the random variables (X^n, X̃^n, Y^n, Z^n), and the helper data J is a function of (X̃^n, S). From (3), the storage rate is bounded from below as in (42) and (43) by terms of the form I(X̃^n; J, Z^n|S) − I(X̃^n; Z^n) = I(X̃^n; J|Z^n, S) + I(X̃^n; Z^n|S) − I(X̃^n; Z^n), where (a) is due to the Markov chain Z^n − (X̃^n, S) − J and the independence of S from the other random variables, (b) is due to the Markov chain X̃_t − (X̃^{t−1}, J, S, Z^n) − Y^n_{t+1}, and (c) follows because conditioning reduces entropy.
Analysis of Privacy-Leakage Rate: We can develop the right-hand side of (5) as n(R_L + δ) ≥ I(X^n; J, Z^n) = I(X^n; J, S, Z^n) − I(X^n; S|J, Z^n) ≥ I(X^n; J, S, Z^n) − H(S) = I(X^n; J, S|Z^n) + nI(X; Z) − H(S), leading to (44), where (a) follows from steps similar to those between (42) and (43), and (b) follows since H(S) is upper bounded by the last inequality in (41). The proof wraps up with the standard single-letterization argument using a time-sharing random variable Q, uniformly distributed on [1 : n] and independent of the other random variables. More specifically, define X = X_Q, X̃ = X̃_Q, Y = Y_Q, Z = Z_Q, and U = (U_Q, Q), so that U − X̃ − X − (Y, Z) forms a Markov chain; finally, letting n → ∞ and δ ↓ 0, from (41), (43), and (44), we obtain R_C ⊆ A_4.
For the cardinality bound on the set U of the auxiliary random variable U, we apply the support lemma [36, Lemma 3.4] to show that |U| ≤ |X̃| + 3. More precisely, |X̃| − 1 continuous functions suffice to preserve H(X̃), and four more elements are necessary to preserve the conditional entropies H(X̃|U), H(X|U), H(Y|U), and H(Z|U). See [21, Appendix A] for a detailed discussion. This completes the converse proof for the CS model. □

B. Achievability Proof
In the proof, we provide only the main contribution of this part, namely the analysis of the privacy-leakage rate for the GS model, since the other constraints can be proved by techniques developed in previous studies. As seen in the reduction of (39), the maximum lower bound on the privacy-leakage rate for less noisy ACs is smaller than that for general ACs, so the important objective of the analysis is to check whether this decreased bound is achievable. The proof for the CS model follows similarly to that of the GS model, with a one-time pad operation to conceal the secret key, and is thus omitted; the reader may refer to [1, Appendix A] for a detailed discussion.
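The one-time pad operation mentioned above can be sketched in a few lines; this toy illustration (our own, with an arbitrary 16-bit key length) shows why masking the chosen key with an independent uniform key hides it perfectly while remaining invertible at the decoder.

```python
import secrets

def otp_mask(s: int, w: int) -> int:
    """XOR one-time pad: masking and unmasking are the same operation."""
    return s ^ w

n_bits = 16
s = secrets.randbits(n_bits)   # externally chosen secret key (CS model)
w = secrets.randbits(n_bits)   # uniform pad, e.g., a GS-style generated key
j = otp_mask(s, w)             # masked value placed in the helper data
assert otp_mask(j, w) == s     # the decoder re-derives w and unmasks s
```

Since w is uniform and independent of s, the masked value j is itself uniform and independent of s, which is the property the CS achievability proof exploits.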
Fix the test channel P_{U|X̃} and let γ > 0 be sufficiently small. Set R_S = I(Y; U) − I(Z; U) − 6γ, R_J = I(X̃; U|Y) + 4γ, and R_L = I(X; U|Y) + I(X; Z) + 7γ, and let the sizes of the set of helpers and the set of secret keys be |J_n| = exp{nR_J} and |S_n| = exp{nR_S}. Define the sets

T_n = {(u^n, x̃^n) : (1/n) log [P_{U^n|X̃^n}(u^n|x̃^n) / P_{U^n}(u^n)] ≤ I(X̃; U) + γ},
A_n = {(u^n, y^n) : (1/n) log [P_{Y^n|U^n}(y^n|u^n) / P_{Y^n}(y^n)] ≥ I(Y; U) − γ},
K_n = {(u^n, x̃^n, x^n) : (1/n) log [P_{X̃^n|U^n X^n}(x̃^n|u^n, x^n) / P_{X̃^n|X^n}(x̃^n|x^n)] ≥ I(X̃; U|X) − γ},

where U^n ∼ ∏^n_{t=1} P_{U_t} with P_{U_t} = P_U for t ∈ [1 : n]. Next, we describe the codebook and the enrollment (encoding) and authentication (decoding) procedures.
Random Code Generation: Generate exp{n(I(X̃; U) + 2γ)} i.i.d. sequences ũ^n from P_U and denote the set of these sequences by Q_n. Let g_n : X̃^n → Q_n ⊂ U^n be the mapping of the measurement x̃^n into ũ^n. The mapping rule is to search for ũ^n such that (ũ^n, x̃^n) ∈ T_n. If there are multiple such ũ^n, the encoder picks one at random; if no such sequence exists, ũ^n_1 is chosen. Now prepare M_J = exp{nR_J} bins and assign each sequence ũ^n ∈ Q_n to one of the M_J bins according to a uniform distribution on J_n. This random assignment is denoted by φ_n(ũ^n), and j = φ_n(ũ^n), j ∈ J_n, denotes the index of the bin to which ũ^n belongs. Also, let F_n be a universal hash family of functions [50] from Q_n to S_n. A function f_n : Q_n → S_n is selected uniformly from F_n and satisfies P_{F_n}({f_n ∈ F_n : f_n(ũ^n) = f_n(û^n)}) ≤ 1/|S_n| for any distinct sequences ũ^n, û^n ∈ Q_n, where P_{F_n} is the uniform distribution on F_n.
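The collision property required of the universal hash family can be illustrated with one standard construction, a random binary matrix over GF(2); this is our choice for illustration, since the proof only needs the abstract property P{f(ũ^n) = f(û^n)} ≤ 1/|S_n|.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 12, 4                       # input length n, key length m, |S| = 2**m

def sample_hash():
    """Sample f(u) = M u over GF(2) with M uniform in {0,1}^{m x n}."""
    return rng.integers(0, 2, size=(m, n))

def apply_hash(M, u):
    return tuple((M @ u) % 2)

# For any fixed pair of distinct inputs, a random member of this family
# collides with probability exactly 2**-m, matching the bound 1/|S|.
u1 = rng.integers(0, 2, size=n)
u2 = u1.copy()
u2[0] ^= 1                         # distinct from u1
trials = 20_000
coll = 0
for _ in range(trials):
    M = sample_hash()
    if apply_hash(M, u1) == apply_hash(M, u2):
        coll += 1
print(coll / trials)               # close to 2**-4 = 0.0625
```

Here a collision happens exactly when M maps the difference u1 ⊕ u2 to zero, which occurs with probability 2^{-m} over the random matrix.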
In the actual encoding and decoding processes, the set Q n and the random functions φ n and f n are fixed.
Encoding: Observing x̃^n, the encoder first uses g_n to map this sequence to ũ^n ∈ Q_n. It then determines the index j of the bin to which ũ^n belongs, i.e., j = φ_n(ũ^n), and generates a secret key s = f_n(ũ^n). The index j is shared with the decoder for authentication.
Decoding: Seeing y^n, the decoder looks for a unique û^n such that j = φ_n(û^n) and (û^n, y^n) ∈ A_n. If such a û^n is found, the decoder sets ψ_n(j, y^n) = û^n and distills the secret key ŝ = f_n(û^n). Otherwise, the decoder outputs ŝ = f_n(ũ^n_1) and an error is declared.
The random codebook C_n consists of the set Q_n = {U^n_i : i ∈ [1 : exp{n(I(X̃; U) + 2γ)}]} and the functions (g_n, φ_n, ψ_n, f_n), and it is revealed to all parties.
By an argument similar to that used to evaluate the error probability of the Wyner-Ziv problem for general sources in [51], the error probability of the authentication system, averaged over the random codebook, vanishes for large enough n. The bound on the storage rate is straightforward from the rate setting. The secret-key rate can be proved via [52, Lemma 3], and using [34, Lemma 12] and [52, Lemma 3] together, the secrecy leakage can be made negligible for large enough n.
In the remainder of this proof, we evaluate the average performance of the privacy-leakage rate (5) over all possible C n . Before diving into the detailed analysis, we introduce some useful lemmas for the analysis.
Lemma 1: For large enough n, (45) holds, where E_{C_n}[·] denotes the expectation over the random codebook C_n. □ By the definition of the set K_n, the probability Pr{(U^n, X̃^n, X^n) ∉ K_n} → 0 for large enough n, and therefore, using [51, Lemma 1], (45) is guaranteed to hold. For a detailed discussion of this lemma, the reader may refer to the appendix of [51].
The following lemma is needed for the analysis of the privacy-leakage rate. It was proved in [53, Lemma 4] for a strong typicality set [36] and in [47, Lemma A4] for a modified weak typicality set [9]. Here, a different proof based on information-spectrum methods is given.
Lemma 2: We have that (46) holds, where r_n = (1/n)(1 − log(1 − γ)) + γ log|X̃|, and r_n tends to zero as n approaches infinity and γ ↓ 0. Proof: The proof is given in Appendix B. □ Analysis of Privacy-Leakage Rate: For (5), we have that I(X^n; J, Z^n|C_n) = I(X^n; J|C_n) + I(X^n; Z^n|J, C_n) ≤ I(X^n; J|C_n) + nI(X; Z), denoted (47), where (a) holds because, for a given C_n, the Markov chain J − X^n − Z^n holds and (X^n, Z^n) are independent of C_n, and (b) follows because conditioning reduces entropy.
Next, we focus on bounding the term I(X^n; J|C_n) in (47). Through the chain of inequalities in (48), this term is bounded by nR_J − nH(X̃|X) + H(X̃^n|X^n, g_n(X̃^n), C_n) plus vanishing terms, where (c) holds as (X̃^n, X^n) are independent of C_n, (d) follows because J is a function of g_n(X̃^n), i.e., J = φ_n(g_n(X̃^n)), (e) is due to the Markov chain g_n(X̃^n) − (X̃^n, J) − Y^n, (f) follows because conditioning reduces entropy, (g) follows because the codeword g_n(X̃^n) can be estimated from (J, Y^n) with high probability, so that Fano's inequality applies, (h) follows from Lemma 2, (i) is due to the Markov chain U − X̃ − X, and the last equality holds because we set R_J = I(X̃; U|Y) + 4γ. Merging (47) and (48), we obtain I(X^n; J, Z^n|C_n) ≤ n(I(X; U|Y) + I(X; Z) + 8γ) for large enough n, which gives the desired bound on the privacy-leakage constraint (5) in Definition 1; this also shows that the decreased lower bound on the privacy-leakage rate in (39) is achievable.

APPENDIX B PROOF OF LEMMA 2

Define a binary random variable T = 1{(g_n(X̃^n), X̃^n, X^n) ∈ K_n}, where 1{·} denotes the indicator function. When T = 0, it is straightforward from Lemma 1 that E_{C_n}[P_T(0)] ≤ γ.
In the remaining derivations, let c_n and ũ^n be realizations of the random codebook C_n and of the mapping g_n(X̃^n), i.e., ũ^n = g_n(x̃^n). The conditional entropy on the left-hand side of (46) can be evaluated as

H(X̃^n|X^n, g_n(X̃^n), C_n) ≤ H(T) + H(X̃^n|X^n, g_n(X̃^n), T, C_n) ≤ 1 + E_{C_n}[P_T(0)]H(X̃^n) + Σ_{c_n} P_{T,C_n}(1, c_n) H(X̃^n|X^n, g_n(X̃^n), T = 1, C_n = c_n), (50)

where the last inequality is due to Lemma 1. Next, we concentrate on bounding the conditional entropy H(X̃^n|X^n, g_n(X̃^n), T = 1, C_n = c_n) in (50). For a given C_n = c_n, define the probability distribution

P_{g_n(X̃^n)X̃^n X^n|T}(ũ^n, x̃^n, x^n|1) = P_{g_n(X̃^n)X̃^n X^n}(ũ^n, x̃^n, x^n)/P_T(1) for (ũ^n, x̃^n, x^n) ∈ K_n, and zero otherwise, (51)

with P_T(1) = Σ_{(ũ^n, x̃^n, x^n) ∈ K_n} P_{g_n(X̃^n)X̃^n X^n}(ũ^n, x̃^n, x^n), which is obvious from the definition of the random variable T. For every tuple (ũ^n, x̃^n, x^n) ∈ K_n, observe that

P_{X̃^n|X^n g_n(X̃^n) T}(x̃^n|x^n, ũ^n, 1) (a)= P_{g_n(X̃^n)X̃^n X^n}(ũ^n, x̃^n, x^n)/P_{g_n(X̃^n)X^n T}(ũ^n, x^n, 1) ≥ P_{g_n(X̃^n)X̃^n X^n}(ũ^n, x̃^n, x^n)/P_{g_n(X̃^n)X^n}(ũ^n, x^n) = P_{X̃^n|g_n(X̃^n)X^n}(x̃^n|ũ^n, x^n), (52)

where (a) is due to (51). Also, we have that

log [1/P_{X̃^n|X^n g_n(X̃^n)}(x̃^n|x^n, ũ^n)] = log [P_{X̃^n|X^n U^n}(x̃^n|x^n, ũ^n)/P_{X̃^n|X^n g_n(X̃^n)}(x̃^n|x^n, ũ^n)] + log [P_{X̃^n|X^n}(x̃^n|x^n)/P_{X̃^n|X^n U^n}(x̃^n|x^n, ũ^n)] + log [1/P_{X̃^n|X^n}(x̃^n|x^n)] (b)≤ log [P_{X̃^n|X^n U^n}(x̃^n|x^n, ũ^n)/P_{X̃^n|X^n g_n(X̃^n)}(x̃^n|x^n, ũ^n)] − n(I(X̃; U|X) − γ) + n(H(X̃|X) + γ) = log [P_{X̃^n|X^n U^n}(x̃^n|x^n, ũ^n)/P_{X̃^n|X^n g_n(X̃^n)}(x̃^n|x^n, ũ^n)] + n(H(X̃|X, U) + 2γ) (53)

for all large n, where (b) follows by applying the condition of the set K_n to the second term and because, by the law of large numbers and the i.i.d. property of (X̃^n, X^n), log [1/P_{X̃^n|X^n}(x̃^n|x^n)] ≤ n(H(X̃|X) + γ) for large enough n.
In light of (50), we have that

H(X̃^n|X^n, g_n(X̃^n), T = 1, C_n = c_n) ≤ H(X̃^n|X^n, g_n(X̃^n), T = 1) = Σ_{(ũ^n, x̃^n, x^n) ∈ K_n} P_{g_n(X̃^n)X̃^n X^n T}(ũ^n, x̃^n, x^n, 1) · log [1/P_{X̃^n|X^n g_n(X̃^n) T}(x̃^n|x^n, ũ^n, 1)] (c)≤ Σ_{(ũ^n, x̃^n, x^n) ∈ K_n} P_{g_n(X̃^n)X̃^n X^n T}(ũ^n, x̃^n, x^n, 1) · log [1/P_{X̃^n|X^n g_n(X̃^n)}(x̃^n|x^n, ũ^n)] (d)≤ Σ_{(ũ^n, x̃^n, x^n) ∈ K_n} P_T(1) · P_{g_n(X̃^n)X̃^n X^n|T}(ũ^n, x̃^n, x^n|1) · log [P_{X̃^n|X^n U^n}(x̃^n|x^n, ũ^n)/P_{X̃^n|X^n g_n(X̃^n)}(x̃^n|x^n, ũ^n)] + n(H(X̃|X, U) + 2γ), (54)

where (c) and (d) follow from (52) and (53), respectively, and (e) is due to (55) shown below. To derive (55), define A^n = (g_n(X̃^n), X̃^n, X^n) and a^n = (ũ^n, x̃^n, x^n) for brevity. From (51), it follows that P_{g_n(X̃^n)X̃^n X^n}(ũ^n, x̃^n, x^n) = P_T(1) · P_{A^n|T}(a^n|1), and thus we have that

Σ_{a^n ∈ K_n} P_{A^n|T}(a^n|1) log [P_{X̃^n|X^n U^n}(x̃^n|x^n, ũ^n)/P_{X̃^n|X^n g_n(X̃^n)}(x̃^n|x^n, ũ^n)] (f)≤ log Σ_{a^n ∈ K_n} P_{A^n|T}(a^n|1) · [P_{X̃^n|X^n U^n}(x̃^n|x^n, ũ^n)/P_{X̃^n|X^n g_n(X̃^n)}(x̃^n|x^n, ũ^n)] = log Σ_{a^n ∈ K_n} P_{g_n(X̃^n)X^n}(ũ^n, x^n) · P_{X̃^n|X^n U^n}(x̃^n|x^n, ũ^n)/P_T(1) ≤ log Σ_{(ũ^n, x^n) ∈ Q_n × X^n} P_{g_n(X̃^n)X^n}(ũ^n, x^n) · Σ_{x̃^n ∈ X̃^n} P_{X̃^n|X^n U^n}(x̃^n|x^n, ũ^n) − log P_T(1) (g)≤ −log(1 − γ), (55)

where (f) is due to Jensen's inequality and (g) follows from Lemma 1, which implies that P_T(1) ≥ 1 − γ. Lastly, substituting (54) into (50), it follows that H(X̃^n|X^n, g_n(X̃^n), C_n) ≤ n(H(X̃|X, U) + 2γ + r_n), (56) where r_n = (1/n)(1 − log(1 − γ)) + γ log|X̃|. □

APPENDIX C PROOF OF THEOREM 4
Note that due to the uniformity of the source, the reverse channel P_{X|X̃} is also a binary symmetric channel with crossover probability p, and the entropies H(X), H(X̃), and H(Z) are all equal to one. Achievability: We begin by proving the inner region of R_G. Each rate constraint in the region can be bounded as shown, where (a) follows because Y = X with probability 1 − q, and (b), (c), and (d) are achieved by choosing the test channel P_{U|X̃} to be a binary symmetric channel with crossover probability β. For the CS model, we argue only for the storage rate, as the other constraints follow the same analysis as in the GS model; there, (a) follows from the Markov chain U − X̃ − Z. □ Converse Part: Before the proof, we introduce a simple lemma that is used to match the inner and outer bounds of the capacity regions for binary sources.
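As a numerical illustration (our own sketch, not taken from the paper), the achievable rate settings of Appendix A can be specialized to this binary example with a BSC(β) test channel, using the single-letter quantities I(X̃; U) = 1 − h(β), I(Y; U) = (1 − q)(1 − h(β * p)), and I(Z; U) = 1 − h(β * p * ϵ); all parameter values below are assumptions.

```python
import numpy as np

def h2(x):
    """Binary entropy in bits."""
    x = np.clip(x, 1e-12, 1 - 1e-12)
    return -x * np.log2(x) - (1 - x) * np.log2(1 - x)

def conv(a, b):
    """Binary convolution a * b = a(1 - b) + (1 - a)b."""
    return a * (1 - b) + (1 - a) * b

# Assumed parameters: X ~ Bern(1/2), enrollment BSC(p), main BEC(q),
# Eve BSC(eps), test channel BSC(beta).
p, q, eps = 0.05, 0.3, 0.1

for beta in (0.05, 0.1, 0.2):
    a = conv(beta, p)              # U -> X is a BSC with crossover beta * p
    I_UY = (1 - q) * (1 - h2(a))   # through the BEC(q) main channel
    I_UZ = 1 - h2(conv(a, eps))    # through the BSC(eps) Eve channel
    R_S = I_UY - I_UZ              # secret-key rate I(Y;U) - I(Z;U)
    R_J = (1 - h2(beta)) - I_UY    # storage rate I(Xt;U) - I(Y;U) = I(Xt;U|Y)
    print(f"beta={beta}: R_S={R_S:.4f} bits, R_J={R_J:.4f} bits")
```

Smaller β (a finer quantization of X̃) yields a larger secret-key rate at the cost of a larger storage rate, mirroring the trade-off discussed in Section IV.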