On the Zero-Error Capacity of the Modulo-Additive Noise Channel With Help

The zero-error helper capacity of the modulo-additive noise channel is studied both in the presence and in the absence of feedback. In its presence, a complete solution of said capacity is provided. In its absence, a solution is provided when the alphabet size is prime. For all other cases, upper and lower bounds are derived, and a necessary and sufficient condition for positivity is provided. Thanks to the help, the zero-error capacity may increase by more than the help’s rate, and it can be positive yet smaller than one bit.


I. INTRODUCTION
THIS paper investigates the extent to which the zero-error capacity can benefit from a rate-limited description of the noise. We study both encoder assistance, where the description is provided to the encoder before transmission begins, and decoder assistance, where it is provided to the decoder. We show that, perhaps paradoxically, the zero-error helper capacity can be calculated as a function of the description rate even for some channels whose no-help zero-error capacity is unknown. This is not a contradiction, because a zero-rate description is not tantamount to no description: it still allows for a binary description of length that is sublinear in the blocklength. In fact, as we shall see, the solution of the zero-rate help case is the key to the general solution.
We focus on the memoryless modulo-additive noise channel (MMANC) whose time-k output Y_k corresponding to the time-k input x_k is

    Y_k = x_k ⊕ Z_k,

where {Z_k} ∼ IID Q_Z is the channel noise; x_k, Z_k, and Y_k all take values in the set A = {0, 1, . . ., |A| − 1}; and "⊕" denotes mod-|A| addition. The channel law is thus

    Q_{Y|X}(y|x) = Q_Z(y ⊖ x),

where "⊖" denotes mod-|A| subtraction. A key role is played by the cardinality |S| of the support set S of Q_Z.

Example 1: When |S| = 2, the MMANCs corresponding to |A| being 3, 5, and 7 are, respectively, the Triangle channel, Shannon's Pentagon channel [1], and the Heptagon channel (a.k.a. the 3/2, 5/2, and 7/2 channels, respectively).
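To make the channel concrete, here is a minimal numerical sketch of the MMANC (the function names and the specific noise probabilities are ours, chosen for illustration; only the support matters for the zero-error questions studied here):

```python
import random

def mmanc_output(x, z, A):
    """Time-k channel output Y_k = x_k (+) Z_k, i.e., (x + z) mod |A|."""
    return (x + z) % A

def channel_law(y, x, A, Q_Z):
    """Channel law Q(y|x) = Q_Z(y (-) x), with Q_Z given as a dict on A."""
    return Q_Z.get((y - x) % A, 0.0)

# Pentagon channel: |A| = 5 with a noise PMF of support S = {0, 1}.
A = 5
Q_Z = {0: 0.5, 1: 0.5}
S = set(Q_Z)
y = mmanc_output(3, random.choice(sorted(S)), A)
```

For the input x = 3 the output lies in 3 ⊕ S = {3, 4}; two inputs can be confused only if their noise-shifted output fans overlap.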
In the presence of a noiseless feedback link from the receiver to the encoder, we calculate the zero-error helper capacity both for encoder and for decoder assistance (Theorem 3). In its absence, we derive upper and lower bounds on the zero-error helper capacity (Theorem 5) and establish a positivity result for the zero-error capacity (Corollary 3): if the assistance rate is positive, then so is the capacity; otherwise, the capacity is positive if and only if (iff) the support S of the noise is a strict subset of A. When the cardinality of A is prime (as in Example 1), we calculate the zero-error helper capacity in Theorem 4 using structured codes. Calculating the zero-error helper capacity without feedback when |A| is not prime is left as an open problem.
The rest of the paper is organized as follows. Section II introduces some notation, defines the key quantities of interest, and surveys some of the literature that touches on this work. Section III presents the paper's main results and some of their consequences. The proof of Theorem 3, pertaining to feedback, is presented in Section IV. The proofs of Theorems 4 and 5, pertaining to the no-feedback setting, are presented in Sections V-A and V-B, respectively.

A. Notation
Unless stated otherwise, all logarithms in this paper are to base 2. The positive integers are denoted Z_+, and if n ∈ Z_+, then [n] denotes the set {1, 2, . . ., n}. The cardinality of a set K is denoted |K|, and the set of all probability mass functions (PMFs) on K is denoted P(K).
Mod-|A| addition "⊕" and mod-|A| subtraction "⊖" are extended to n-tuples, which are usually designated in boldface, componentwise:

    x ⊕ y = (x_1 ⊕ y_1, . . ., x_n ⊕ y_n),    (4)

and likewise for x ⊖ y. If B ⊆ A^n is a set of n-tuples, then B* denotes B \ {0}, i.e., B without the all-zero n-tuple. For B, B′ ⊆ A^n, we denote the sumset and the difference set by

    B ⊕ B′ = {b ⊕ b′ : b ∈ B, b′ ∈ B′},    B ⊖ B′ = {b ⊖ b′ : b ∈ B, b′ ∈ B′},

and for x ∈ A^n, we write x ⊕ B and x ⊖ B for {x} ⊕ B and {x} ⊖ B. We use {ξ}^+ to denote max{0, ξ}.

1 Throughout this paper, "rare-error capacity" and "rare-error feedback capacity" refer to the supremum of the achievable rates, in the sense that the probability of error tends to zero as the blocklength tends to infinity [2]. We refrain from calling it Shannon capacity lest it be confused with the Shannon capacity of a graph.
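The componentwise operations and set operations above can be sketched as follows (the helper names are ours):

```python
def mod_add(x, y, A):
    """x (+) y: componentwise mod-|A| addition of n-tuples."""
    return tuple((a + b) % A for a, b in zip(x, y))

def mod_sub(x, y, A):
    """x (-) y: componentwise mod-|A| subtraction of n-tuples."""
    return tuple((a - b) % A for a, b in zip(x, y))

def diff_set(B, Bp, A):
    """Difference set B (-) B' = {b (-) b' : b in B, b' in B'}."""
    return {mod_sub(b, bp, A) for b in B for bp in Bp}

A = 3
x, y = (0, 1, 2), (2, 2, 1)
s = mod_add(x, y, A)  # componentwise sum mod 3
```

Subtraction inverts addition componentwise, which is the fact the decoder exploits throughout the paper.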

B. Definitions and Preliminaries
A blocklength-n code for a (general) discrete memoryless channel (DMC) Q_{Y|X}(·|·) with input alphabet X and output alphabet Y consists of a message set M = {1, 2, . . ., |M|} and an encoding function

    f : M → X^n,  m ↦ x(m).

Since the codewords x(1), . . ., x(|M|) need not be distinct, the codebook C = {x(1), x(2), . . ., x(|M|)} is a multiset (i.e., an unordered collection of elements that may repeat). Its cardinality is |M|. A sequence of codes, indexed by the blocklength n, is said to have transmission rate lim inf_{n→∞} (1/n) log |M|. The zero-error capacity C_0 [1] is the supremum of rates R for which there exists a sequence of rate-R codes, indexed by the blocklength, that fulfill the zero-error requirement that to every output sequence y ∈ Y^n there correspond at most one compatible message, i.e., a message m ∈ M for which

    Q^n_{Y|X}(y | x(m)) > 0.    (7)

Here Q^n_{Y|X}(y|x) denotes the product Π_{i=1}^n Q_{Y|X}(y_i|x_i). A necessary and sufficient condition for C_0 to be positive is that there exist channel inputs x and x′ that cannot produce a common output, i.e., for which no y ∈ Y satisfies both Q_{Y|X}(y|x) > 0 and Q_{Y|X}(y|x′) > 0. This characterization can be used, for example, to conclude that C_0 is zero for the Triangle channel. It also shows that, whenever C_0 is positive, we can transmit a bit by using the channel once (with the input x or x′). Consequently, C_0 cannot be positive yet smaller than one bit. As we shall see, this is not the case in the presence of help (Remark 3).
Determining the zero-error capacity for general DMCs is an open combinatorial problem and is one of the holy grails of information theory. It is known for some specific channels, including the Pentagon channel: Shannon showed that (1/2) log 5 ≤ C_0 ≤ log(5/2) in 1959 [1], and Lovász proved, using algebraic graph theory, that the lower bound is tight in 1979 [11]. The zero-error capacity of the 7/2 channel is to date unknown.
The problem is greatly simplified if the time-i channel input may depend not only on the message m but also on the past channel outputs y^{i−1}, which are revealed to the encoder via a feedback link from the channel output to the encoder. A blocklength-n encoder now consists of n functions x_1(m), x_2(m, y_1), . . ., x_n(m, y^{n−1}), and the zero-error feedback capacity C_0F is defined like C_0 except that x(m) in (7) is replaced by x(m, y) = (x_1(m), x_2(m, y_1), . . ., x_n(m, y^{n−1})). Since the encoder may ignore the feedback link,

    C_0F ≥ C_0.

The zero-error feedback capacity C_0F was determined by Shannon:

Theorem 1 ([1]): On a DMC, if C_0 = 0, then the zero-error feedback capacity C_0F is also zero. Else, C_0F = − log π_0, where

    π_0 = min_{P ∈ P(X)} max_{y ∈ Y} Σ_{x ∈ X_y} P(x),

and X_y comprises the inputs that can induce the output letter y with positive probability:

    X_y = {x ∈ X : Q_{Y|X}(y|x) > 0}.

Note that, since C_0F > 0 iff C_0 > 0, and since C_0F ≥ C_0, also C_0F cannot be positive yet strictly smaller than one. We shall see that this is not true in the presence of zero-rate help (Remark 3).
Applying Theorem 1 to the MMANC yields the following corollary.
Corollary 1: On the MMANC, if C_0 = 0, then the zero-error feedback capacity C_0F is also zero. Else,

    C_0F = log |A| − log |S|.

Proof: We can lower-bound π_0 by lower-bounding the maximum over y ∈ A by the average:

    max_{y∈A} Σ_{x∈X_y} P(x) ≥ (1/|A|) Σ_{y∈A} Σ_{x∈X_y} P(x) = |S|/|A|,

where the inequality lower-bounds the maximum over A by the arithmetic average, and the equality holds because, in the double sum, each x ∈ A is contained in X_y for exactly |S| different y's in A. The corollary follows by noting that this lower bound on π_0 is tight, as can be seen by considering P equiprobable. ■

Henceforth, we focus on MMANCs. A helper is an altruistic party that has no message to send and only wishes to assist the transmission. To do so, it observes the noise sequence Z = Z^n (noncausally); it produces a rate-limited description T of it; and it reveals the description to the encoder, or to the decoder, or to both. It is incognizant of the transmitted message. More formally, a blocklength-n helper, represented by the helping function h : A^n → T, observes the noise sequence Z and describes it as T = h(Z), with T taking values in a finite set T. For a given sequence of coding schemes, the help rate R_h is defined as lim sup_{n→∞} (1/n) log |T|. We distinguish between two kinds of assistance:

Decoder assistance corresponds to the scenario where the description T is revealed to the decoder, as in Fig. 1a. In this scenario we use C_0,dec(R_h) to denote the supremum of rates R for which there exists a sequence of zero-error coding schemes (without feedback) with transmission rate at least R and with help rate no larger than R_h. By zero-error we now mean that for any y ∈ A^n and t ∈ T, at most one message m is compatible with (y, t), in the sense that

    Q^n_{Y|X}(y | x(m)) > 0  and  h(y ⊖ x(m)) = t.    (16)

In the presence of feedback, we denote the analogous capacity C_0F,dec(R_h): we merely replace x(m) with x(m, y) in (16).
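As a numerical check of Corollary 1, the following sketch (function name ours) evaluates max_y Σ_{x∈X_y} P(x) for the equiprobable input PMF; per the proof above this attains π_0, so −log π_0 = log(|A|/|S|):

```python
from math import log2

def pi0_uniform(A, S):
    """max over y in A of the total probability of X_y = {x : y (-) x in S}
    under the equiprobable input PMF P."""
    return max(sum(1.0 / A for x in range(A) if (y - x) % A in S)
               for y in range(A))

# Pentagon channel (|A| = 5, |S| = 2): C_0F = log(5/2).
pi0 = pi0_uniform(5, {0, 1})
feedback_capacity = -log2(pi0)
```

Every output letter is reachable from exactly |S| inputs, which is why the maximum equals the average here.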
Encoder assistance corresponds to the scenario where T is revealed noncausally to the encoder, as in Fig. 1b. In the absence of feedback, the encoding function is f : M × T → A^n, (m, t) ↦ x(m, t), and C_0,enc(R_h) is defined with the requirement that to every y ∈ A^n there correspond at most one compatible message m, in the sense that, for some t ∈ T,

    Q^n_{Y|X}(y | x(m, t)) > 0  and  h(y ⊖ x(m, t)) = t;    (17)

and C_0F,enc(R_h) is defined analogously by replacing x(m, t) with x(m, t, y) = (x_1(m, t), x_2(m, t, y_1), . . ., x_n(m, t, y^{n−1})) in (17).
Assumption 1: We shall assume throughout that

    |S| ≥ 2.

If |S| = 1, then the noise is deterministic, and even without feedback or help the zero-error capacity is log |A|. Moreover, help is then useless: the noise sequence Z takes a single value with probability one, so for any sequence of helpers h, the description h(Z) is deterministic with probability one and conveys no information.
In this paper, we present results on C 0,dec (R h ) and C 0,enc (R h ), the zero-error capacity in the presence of decoder or encoder assistance, and on C 0F,dec (R h ) and C 0F,enc (R h ), the analogous quantities in the presence of feedback.

C. Related Work
Related to our work is the following theorem [12] on the case where the noise is of full support:

Theorem 2 (MMANC With Noise of Full Support [12]): On the MMANC with rate-R_h decoder or encoder assistance, if S = A, then

    C_0,dec(R_h) = C_0,enc(R_h) = min{R_h, log |A|},    (20a)

and in particular,

    C_0,dec(0) = 0    (20b)

and

    C_0,enc(0) = 0.    (20c)

Our present work extends this result by studying the general case, where the noise need not be of full support. As we shall see ahead (Corollary 3), the condition S = A is also necessary for (20b) to hold, and likewise for (20c). We also study the effect of feedback on the zero-error helper capacity.
Also related to our results is the work of Merhav on error exponents [8]. To see the relevance, note that on a DMC, the reliability function E(R) equals infinity iff R can be achieved with zero error. The intuition is the following. Let the transition matrix be Q(y|x), and let α denote its smallest nonzero entry. For any input sequence x^n, all output sequences of positive probability have probability at least α^n. Consequently, if rate R is not achievable with zero error, then every rate-R code has a positive error probability, which is then at least α^n, so

    E(R) ≤ −log α < ∞,

whose contrapositive can be rewritten as: if E(R) = ∞, then rate R is achievable with zero error. For our MMANC, Merhav [8, Eq. (57)] derived an upper bound on the reliability function for R < log |A|. This upper bound is finite iff R > log |A| − log |S| + R_h: only in this range of rates does there exist a PMF Q̃_Z that is feasible in the minimization and satisfies supp(Q̃_Z) ⊆ S. This implies an upper bound on the zero-error capacity:

    C_0,dec(R_h) ≤ log |A| − log |S| + R_h.

When |A| is a prime, our results (Theorem 4) show that this upper bound is tight.
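For intuition, here is a small sketch of the resulting upper bound, capped at log |A| (the function name is ours; by Theorem 4 ahead, the bound is tight when |A| is prime):

```python
from math import log2

def zero_error_helper_bound(A, S_size, R_h):
    """Upper bound min{log|A|, log|A| - log|S| + R_h} on the zero-error
    helper capacity of the MMANC."""
    return min(log2(A), log2(A) - log2(S_size) + R_h)

# Heptagon channel (|A| = 7, |S| = 2):
b0 = zero_error_helper_bound(7, 2, 0.0)  # zero-rate help: log(7/2)
b1 = zero_error_helper_bound(7, 2, 1.0)  # help rate log|S| = 1: full log 7
```

Note that one bit of help already saturates the bound at log |A| here, since log |S| = 1.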

A. Feedback Link Present
The following theorem addresses the zero-error helper capacity of the MMANC in the presence of a feedback link.
Theorem 3 (Assistance and Feedback): On the MMANC with feedback and rate-R_h decoder or encoder assistance,

    C_0F,dec(R_h) = C_0F,enc(R_h) = min{ log |A|, log |A| − log |S| + R_h }.

B. Feedback Link Absent
When the cardinality |A| of the alphabet A is prime, the zero-error helper capacity without feedback is determined in the following theorem:

Theorem 4 (Assistance Without Feedback: Prime Cardinality): On the MMANC with rate-R_h decoder or encoder assistance, if |A| is prime, then

    C_0,dec(R_h) = C_0,enc(R_h) = min{ log |A|, log |A| − log |S| + R_h }.

When |A| is not necessarily a prime, we provide the following upper and lower bounds:

Theorem 5 (Assistance Without Feedback: General Cardinality): On the MMANC with rate-R_h decoder assistance,

    C_0,dec(R_h) ≤ min{ log |A|, log |A| − log |S| + R_h }    (27)

and

    C_0,dec(R_h) ≥ min{ log |A|, max{ C_0, (1/2) log(|A|/|S|) } + R_h }.    (28)

These bounds also hold for C_0,enc(R_h). Theorem 5 has two corollaries. The first characterizes the zero-error helper capacity when the noise support "tessellates" the alphabet A, and thus strengthens Theorem 2.
Corollary 2 (Special MMANCs): If C_0 = log |A| − log |S|, then

    C_0,dec(R_h) = C_0,enc(R_h) = min{ log |A|, log |A| − log |S| + R_h }.

Proof of Corollary 2: Follows from Theorem 5 by substituting log |A| − log |S| for C_0 on the RHS of (28) and noting that the result matches the RHS of (27).
■ When S = A (in which case C_0 = 0 and hence, indeed, C_0 = log |A| − log |S| = 0), the corollary recovers Theorem 2. But see Corollary 3 ahead for a stronger statement.
For another application of this corollary, consider the MMANCs with |A| = 4 and with S = {0, 1} or S = {0, 2} (so |S| = 2). In both cases C_0 = log |A| − log |S| = 1 (the single-letter codebook {0, 2} or {0, 1}, respectively, has disjoint output fans), so the corollary yields

    C_0,dec(R_h) = C_0,enc(R_h) = min{ 2, 1 + R_h }.

The second corollary to Theorem 5 provides a necessary and sufficient condition for the positivity of the zero-error helper capacity.
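The tessellation in the |A| = 4 examples above can be verified by brute force (a sketch; the function name is ours):

```python
from itertools import combinations

def tessellates(S, A):
    """Return True if some set of translates of S partitions {0, ..., A-1}."""
    k, r = divmod(A, len(S))
    if r != 0:
        return False
    for D in combinations(range(A), k):
        tiles = [frozenset((d + s) % A for s in S) for d in D]
        covered = set().union(*tiles)
        # disjoint cover of the whole alphabet <=> sizes add up exactly
        if len(covered) == A and sum(len(t) for t in tiles) == A:
            return True
    return False
```

For S = {0, 1} the translates {0, 2} tile Z_4; for S = {0, 2} the translates {0, 1} do.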
Corollary 3 (Positivity): The following statements are equivalent:
i) C_0,dec(R_h) > 0;
ii) C_0,enc(R_h) > 0;
iii) C_0F,dec(R_h) > 0;
iv) C_0F,enc(R_h) > 0;
v) R_h > 0 or S is a strict subset of A.

Proof of Corollary 3: The equivalence of iii), iv), and v) follows from Theorem 3 on feedback. The implications v) =⇒ i) and v) =⇒ ii) follow from the lower bound (28) and the analogous result for C_0,enc(R_h): if R_h > 0 or S ⊊ A, then the RHS of (28) is positive. The implication i) =⇒ v) follows from the upper bound (27): if R_h = 0 and S = A, then the RHS of (27) is zero. The proof that ii) =⇒ v) is similar. ■

The above theorems and corollaries have some noteworthy implications:

Remark 1 (Benefit of Assistance): Assistance can increase the zero-error capacity by more than its rate. Even zero-rate assistance can increase the zero-error capacity: on the Pentagon channel, it raises the zero-error capacity from Lovász's (1/2) log 5 to log(5/2), i.e., to C_0F (Corollary 1); on the Triangle channel, it raises the zero-error capacity from zero to log(3/2), which even exceeds C_0F (the latter being zero for this channel).
Remark 2: Thanks to Theorem 4, the zero-error capacity with a helper can sometimes be determined even if it is unknown in the absence of help, e.g., for the Heptagon channel.
Remark 3 (Less Than One Bit): As on the Gel'fand-Pinsker channel with feedback [13], in all cases (with or without feedback, and with decoder or encoder assistance), the zero-error capacity can be positive yet smaller than 1 bit. This is not the case in the absence of assistance.

IV. FEEDBACK LINK PRESENT
In this section, we study the zero-error feedback capacity with a helper and establish Theorem 3; see Figs. 1a and 1b with the feedback link. To this end, we need the following lemma, stating that feedback does not increase the helper rare-error capacity on the MMANC.
Lemma 1: On the MMANC with feedback and rate-R_h decoder or encoder assistance, the rare-error capacities are given by

    min{ log |A|, log |A| − H(Q_Z) + R_h }.    (33)

Proof: In light of [3, Theorem 12] and [4, Theorem 8], which establish that the RHS of (33) can be achieved without feedback, we only need to prove a converse. To that end, we prove the stronger claim that, even if the description T is presented to both encoder and decoder, the rare-error feedback capacity does not exceed the RHS of (33). We assume R_h ≤ H(Q_Z), because otherwise the result is obvious.
Consider a message M that is drawn equiprobably from the message set M. For any sequence of coding schemes of rate R with rate-R_h assistance and vanishing probabilities of error,

    nR = H(M)
       ≤ I(M; Y, T) + nδ_n    (36)
       = I(M; Y | T) + nδ_n    (37)
       = H(Y | T) − H(Y | M, T) + nδ_n
       ≤ H(Y) − H(Y | M, T) + nδ_n
       = H(Y) − H(Z | M, T) + nδ_n    (40)
       = H(Y) − H(Z | T) + nδ_n    (41)
       ≤ H(Y) − H(Z) + H(T) + nδ_n
       ≤ n( log |A| − H(Q_Z) + R_h + δ_n ),

where (36) holds for some {δ_n} tending to zero by Fano's inequality; (37) and (41) hold because T is a function of Z, so (Z, T) is independent of M; and (40) holds because, in the presence of feedback and help, Z is a function of (Y, M, T) and Y is a function of (Z, M, T). Dividing the inequalities by n and letting n tend to infinity establishes the converse. ■

Proof of Theorem 3: We first establish the converse for decoder assistance. If Q̃_{Y|X} is any auxiliary MMANC over A whose noise PMF Q̃_Z ∈ P(A) is absolutely continuous with respect to Q_Z (i.e., whose support is contained in S, denoted Q̃_Z ≪ Q_Z), then its rare-error feedback capacity with decoder assistance C̃_F,dec(R_h) forms an upper bound on C_0F,dec(R_h), because any error-free coding scheme for the original channel is also error-free on the auxiliary channel. Indeed, for any y ∈ A^n and t ∈ T, the absolute-continuity hypothesis implies that

    Q̃^n_{Y|X}(y | x) > 0 =⇒ Q^n_{Y|X}(y | x) > 0,

so if a message m is compatible with (y, t) on the auxiliary channel (in the sense that Q̃^n_{Y|X}(y|x(m, y)) > 0 and h(y ⊖ x(m, y)) = t), then it is also compatible with (y, t) on the original channel.
Therefore, upon minimizing over the choice of Q̃_{Y|X} to get the tightest bound,

    C_0F,dec(R_h) ≤ min_{Q̃_Z ≪ Q_Z} C̃_F,dec(R_h)
                 = min_{Q̃_Z ≪ Q_Z} min{ log |A|, log |A| − H(Q̃_Z) + R_h }    (47)
                 = min{ log |A|, log |A| − log |S| + R_h },    (48)

where (47) follows from Lemma 1, and (48) holds because, subject to a support constraint, the uniform PMF maximizes entropy. Similar arguments apply also to encoder assistance. We now turn to the direct part.
• Case 1: R_h ≥ log |S|. In this case feedback is unnecessary. The codebook comprises all the |A|^n distinct sequences in A^n. Using ⌈n log |S|⌉ bits, the helper describes the noise sequence Z precisely. The decoder (resp. encoder) subtracts the noise from the received sequence (resp. from the codeword to be transmitted), so the codeword, and hence the message, can be recovered error-free. This establishes the achievability of log |A| bits per channel use.
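Case 1's scheme, in which the helper reveals Z exactly and the decoder cancels it, can be sketched as follows (the names are ours):

```python
import random

def transmit_with_full_help(codeword, A, S, rng):
    """R_h >= log|S|: the helper describes Z exactly; the decoder subtracts it."""
    z = [rng.choice(sorted(S)) for _ in codeword]       # noise sequence in S^n
    y = [(xi + zi) % A for xi, zi in zip(codeword, z)]  # channel output
    t = tuple(z)                                        # exact noise description
    return [(yi - ti) % A for yi, ti in zip(y, t)]      # decoder cancels Z

rng = random.Random(0)
word = [0, 3, 1, 4, 2]
decoded = transmit_with_full_help(word, 5, {0, 1}, rng)
```

Since Z ∈ S^n, describing it exactly takes ⌈n log |S|⌉ bits, matching the stated help rate.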
• Case 2: R_h = 0. A two-phase coding scheme is proposed. In Phase 1, we follow the construction (for a uniform input distribution) in Shannon's proof of Theorem 1 in [1], where the encoder sequentially reduces the decoder's ambiguity. In the i-th channel use, thanks to the feedback, the encoder reconstructs the list of messages compatible with Y^{i−1} and evenly assigns them to the different input symbols (in a manner agreed upon with the decoder prior to transmission). Only |S| of the |A| input symbols are compatible with Y_i, so the number of compatible messages is reduced by a factor of roughly |S|/|A|. More precisely, Shannon showed that if |M| = ⌊(|A|/|S|)^n⌋, then after n channel uses, the number of compatible messages is at most |A|^2. The final ambiguity is removed in Phase 2, where the helper comes into play. Since the messages that are compatible with the outputs from Phase 1 are known to the encoder, and since their number does not exceed |A|^2, the encoder can inform the decoder which compatible message was sent in two additional clean channel uses. To clean these two channel uses, the helper informs the decoder (resp. encoder) of the exact value of Z_{n+1}^{n+2} ∈ S^2, and the decoder (resp. encoder) subtracts the noise after (resp. before) the transmission. The rate of help is therefore

    lim_{n→∞} (1/(n+2)) log |S|^2 = 0,

and the transmission rate is

    lim_{n→∞} (1/(n+2)) log ⌊(|A|/|S|)^n⌋ = log |A| − log |S|.

• Case 3: 0 < R_h < log |S|. We divide the transmission block into two parts of relative lengths R_h/log |S| and 1 − R_h/log |S|. We then apply the aforementioned coding schemes for helper rates of log |S| and zero, respectively. The total rate achieved by this time-sharing scheme is

    (R_h/log |S|) log |A| + (1 − R_h/log |S|)(log |A| − log |S|) = log |A| − log |S| + R_h.    ■
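The list-size bookkeeping of Phase 1 can be sketched as follows; each feedback use cuts the number of compatible messages by roughly |S|/|A| (an illustrative ceiling iteration, not Shannon's exact accounting):

```python
from math import ceil

def phase1_list_size(num_messages, A, S_size, n):
    """Iterate the ambiguity reduction: each of the n uses leaves at most
    ceil(m * |S| / |A|) of the m currently compatible messages."""
    m = num_messages
    for _ in range(n):
        m = ceil(m * S_size / A)
    return m

# Pentagon channel: |M| = floor((5/2)^10) = 9536 messages, n = 10 uses.
remaining = phase1_list_size(9536, 5, 2, 10)
```

The residual list is tiny (well below |A|^2 = 25), which is what Phase 2 cleans up in two helped channel uses.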

V. FEEDBACK LINK ABSENT
In this section, we provide proofs pertaining to the zero-error helper capacity in the absence of feedback; see Figs. 1a and 1b without the feedback link.

A. Prime Cardinality
We begin with the case where |A| is a prime, which we denote p. We denote by F_p the finite field of cardinality p and identify it with the set Z_p = {0, . . ., p − 1} under mod-p arithmetic.
Proof of Theorem 4: Since feedback cannot hurt, it follows from Theorem 3 that we only need to prove the direct part. This is trivial unless |S| < |A|, which we proceed to assume. We first focus on decoder assistance.
• Case 1: R_h ≥ log |S|. The achievability in this case is as in the proof of Theorem 3, where the feedback link is ignored.
• Case 2: R_h = 0. We will construct a sequence of blocklength-n codebooks of rate log(|A|/|S|) − ϵ_n that can be decoded error-free utilizing rate-ϵ′_n decoder assistance, for some {ϵ_n} and {ϵ′_n} tending to zero. The codes we construct have two key properties. The first is that they are L-list-decodable [14], [15], [16], where L ∈ Z_+ grows subexponentially with n. That is, every y ∈ A^n is compatible with at most L messages. This guarantees that the decoder's ambiguity can be eliminated with a sublinear number of bits. Elias [14] established the existence of such codebooks of rate log(|A|/|S|) − Θ(L^{−1}). But this is not enough, because, in the absence of feedback, neither the transmitter nor the helper can determine the list facing the decoder. This is where the second property comes in: to overcome this issue and enable the helper to remove the ambiguity, we shall impose a linear structure on the code, and this is where the assumption that |A| is a prime will be essential: it will allow us to view A as a field.
The existence of linear L-list-decodable codes can be established via a variation on a theme of Elias [14], using tools that have been applied successfully in the analysis of random linear codes (e.g., [17], [18], [19]).
Lemma 2: Consider an MMANC with |A| = p, where p is prime. Given L ∈ Z_+, define

    R_L ≜ { log |A| − log |S| − (log |A|)/⌈log_p(L + 1)⌉ }^+.    (52)

Then, for any n ∈ Z_+, there exists a blocklength-n linear code over the field F_p of rate (log |A|/n) ⌊nR_L/log |A|⌋ that is L-list-decodable.

Proof: Assume R_L > 0 (because otherwise there is nothing to prove). Given some blocklength n, let the message set be M = F_p^k, with k ∈ Z_+ to be specified later. A generic message m ∈ M is thus represented by a k-vector, and the transmission rate is k/n in base-p logarithm, or (k/n) log p bits per channel use.
Pick a random (n × k)-matrix A whose entries are drawn IID equiprobably from F p , and consider the encoding rule m → X(m) = Am.Let C be the random linear code (multiset) it induces.This encoding rule maps any ℓ linearly independent messages to independent codewords, each having IID equiprobable random components.
Among any (L + 1) distinct messages, at least ℓ ≜ ⌈log_p(L + 1)⌉ are linearly independent, so the probability that there exists some y ∈ A^n that is compatible with (L + 1) messages is upper bounded by the probability that there exists some y ∈ A^n that is compatible with ℓ linearly independent messages. The latter, by the Union Bound, is strictly smaller than

    |A|^n p^{kℓ} (|S|/|A|)^{nℓ},    (53)

which is at most one when (k/n) log p ≤ R_L; choosing k = ⌊nR_L/log p⌋ thus establishes the lemma. ■

We now use Lemma 2 to prove the case R_h = 0. Let {L_n} be a sequence of positive integers tending to infinity subexponentially in n, let C_n be a corresponding sequence of linear L_n-list-decodable codes of rate (log |A|/n)⌊nR_{L_n}/log |A|⌋, and let C′_n comprise the distinct codewords of C_n. Then

    |C′_n| ≥ |C_n|/L_n    (60)

(because C_n, being L_n-list-decodable, contains no codeword more than L_n times). This latter property and the fact that {L_n} is subexponential imply that {C′_n} has the desired rate:

    lim_{n→∞} (1/n) log |C′_n| ≥ lim_{n→∞} (1/n)( log |C_n| − log L_n )    (61)
                              = lim_{n→∞} R_{L_n}
                              = log |A| − log |S|,    (64)

where (61) follows from (60) and the fact that {L_n} is subexponential; and (64) holds because {L_n} tends to infinity. We next show that, although the helper is incognizant of the list of messages that are compatible with the received sequence, a ⌈log L_n⌉-bit description of the noise sequence (which is of zero rate, as L_n is subexponential in the blocklength n) suffices to guarantee zero-error transmission of the codebook C′_n. To this end, we propose the following helper. To simplify its description, we drop the subscript n.
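List decodability can be checked by brute force at toy scale (our own illustrative code, not the construction's actual parameters): the sketch below builds a small random linear code over F_3 and measures the worst-case number of messages compatible with any output.

```python
import itertools
import random

def worst_list_size(G, p, S, n):
    """Max over outputs y in A^n of the number of messages m with
    y (-) Gm in S^n (counting repeated codewords with multiplicity)."""
    k = len(G[0])
    codewords = [tuple(sum(G[i][j] * m[j] for j in range(k)) % p
                       for i in range(n))
                 for m in itertools.product(range(p), repeat=k)]
    return max(sum(1 for x in codewords
                   if all((yi - xi) % p in S for yi, xi in zip(y, x)))
               for y in itertools.product(range(p), repeat=n))

random.seed(1)
p, n, k, S = 3, 4, 2, {0, 1}
G = [[random.randrange(p) for _ in range(k)] for _ in range(n)]  # generator
L = worst_list_size(G, p, S, n)
```

The code is L-list-decodable in the lemma's sense precisely when this worst case is at most L.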
For z, z′ ∈ S^n, write z ∼ z′ if z ⊖ z′ ∈ C′. Since C′ is a subgroup of F_p^n, this relation is an equivalence relation, and z ∼ z′, i.e., z and z′ are equivalent, iff z and z′ belong to the same coset of C′. We shall use [z] ⊆ S^n to denote the equivalence class containing z.
Our proposed helper assigns labels (descriptions) only to noise sequences in S^n, and it does so in such a way that, unless identical, equivalent noise sequences are assigned differing labels. Such a helper leads to zero errors, because if x ∈ C′ is transmitted and x ⊕ z is received (where z ∈ S^n), then the decoder can confuse x with some x′ only if: (i) x′ is also a codeword; (ii) x ⊕ z = x′ ⊕ z′ for some z′ ∈ S^n; and (iii) z and z′ have the same label. The former two conditions imply that z ∼ z′, and hence that z and z′ are identical or else of differing labels. The third condition then implies that they are, in fact, identical, so x′ equals x.
It remains to verify that we can find a labeling rule as above with at most L different labels. This will follow once we show that, for every z ∈ S^n,

    |[z]| ≤ L.    (66)

To establish (66), we note that the L-list-decodability property of C′, applied to the output sequence y = z, implies that at most L codewords x ∈ C′ satisfy z ⊖ x ∈ S^n; since [z] = (z ⊕ C′) ∩ S^n, this is tantamount to every coset of C′ intersecting S^n in at most L points. This establishes (66) and concludes the achievability proof.
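The coset-labeling helper can be sketched on a toy linear code (our own illustrative parameters): within each coset, distinct noise sequences receive distinct labels, which is exactly the property used above.

```python
import itertools

def coset_labels(code, S, p, n):
    """Assign labels to noise sequences in S^n so that distinct sequences
    in the same coset of the linear code get different labels."""
    labels, count = {}, {}
    for z in itertools.product(sorted(S), repeat=n):
        # canonical coset representative: lexicographic minimum of z (-) code
        rep = min(tuple((zi - ci) % p for zi, ci in zip(z, c)) for c in code)
        labels[z] = count.get(rep, 0)
        count[rep] = labels[z] + 1
    return labels

# toy repetition code over F_3 (a subgroup of F_3^2)
code = [(0, 0), (1, 1), (2, 2)]
labels = coset_labels(code, {0, 1}, 3, 2)
```

Here each coset meets S^2 in at most 2 points, so 2 labels (a single bit of help) suffice.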
The case with encoder assistance is essentially identical. If R_h ≥ log |S|, the rate log |A| is achievable as in the proof of Theorem 3. If R_h = 0, the relation

    C_0,enc(0) ≥ C_0,dec(0)

holds because, in the presence of encoder assistance, any zero-rate help to the encoder can be conveyed to the decoder with negligible extra help and negligible loss in rate: the encoder simply appends a frame to convey the help, with the frame being of sublinear length (because the help to be conveyed is of zero rate); it requests that the helper provide it with a precise description of the noise affecting the frame (with the extra help being negligible because the frame is short); and it subtracts that noise from the transmission in that frame so as to render it noise-free. For intermediate values of R_h, the achievability follows by time sharing. ■

Remark 4 (Gap to Capacity vs. L): By Lemma 2, as the number of labels L tends to infinity, it is possible to communicate error-free at transmission rates that converge to the zero-error helper capacity, with the gap to capacity decaying in L like O(1/log L). Although irrelevant to the computation of the capacity, it might be interesting to investigate whether the gap to capacity can decay faster in L.
Remark 5 (L-List-Decodability and (ℓ, L)-List-Recoverability): On the MMANC, L-list-decodability is related to (zero-error) list-recoverability [19]: Given ℓ, L ∈ Z_+ and a finite set X, a codebook C ⊆ X^n is (ℓ, L)-list-recoverable if, for any collection of n subsets S_1, S_2, . . ., S_n of X, each of which has no more than ℓ elements,

    |{ x ∈ C : x_i ∈ S_i for all i ∈ [n] }| ≤ L.    (70)

On the MMANC, (|S|, L)-list-recoverability implies L-list-decodability because, given any output sequence y ∈ A^n, we can substitute y_i ⊖ S for each S_i in (70) to recover L-list-decodability.
Remark 6: If, instead of defining R_L as in (52), we defined

    R_L ≜ { log |A| − log |S| − (|S| log |A|)/⌈log_p(L + 1)⌉ }^+,    (71)

then the resulting weaker version of Lemma 2, while still sufficient for our purposes, could have been recovered from the literature on (ℓ, L)-list-recoverability, specifically from the result that a random linear code of such rate is (|S|, L)-list-recoverable with high probability [17], [20] and, a fortiori, with positive probability.

Remark 7: The factor of |S| in the numerator of the second term on the RHS of (71), which is absent from (52), can be improved for large |A| using [21, Theorem 5.1] and [20].

B. General Case
We now turn to the general case where |A| need not be prime and establish Theorem 5.
Proof of Theorem 5: As in the proof of Theorem 4, we only need to prove the direct part, and we focus on decoder assistance, so our goal is to establish that

    C_0,dec(R_h) ≥ min{ log |A|, max{ C_0, (1/2) log(|A|/|S|) } + R_h }.    (72)

The achievability for encoder assistance will then follow as in the proof of Theorem 4. To establish (72), we propose the following coding scheme based on time sharing.
• Case 1: R_h = log |S|. That C_0,dec(log |S|) ≥ log |A| follows from the proof of Theorem 3, where the feedback link is not utilized.
• Case 2: R_h = 0. To show that

    C_0,dec(0) ≥ (1/2) log(|A|/|S|),    (73)

we first introduce some notation. (The bound C_0,dec(0) ≥ C_0 is trivial, because the decoder may ignore the help.) Given a codebook C ⊆ A^n and a noise sequence z ∈ S^n, let F_z(C), or F_z for short, be the confusion set of z, comprising the noise sequences confusable with z:

    F_z ≜ ( z ⊕ (C ⊖ C)* ) ∩ S^n.    (75)

The proposed helper assigns confusable noise sequences different labels: if z′ ∈ F_z, then the labels assigned to z and z′ are different. This guarantees error-free recovery of the noise sequence and hence, if the code has no repeating codewords, also of the transmitted message.
As we next argue, the number of different labels required is at most

    max_{z ∈ S^n} |F_z| + 1.    (77)

Indeed, the number of required labels is the chromatic number of the confusion graph of the noise sequences, i.e., the undirected graph with vertex set S^n and with z connected to z′ if z′ ∈ F_z. This graph is well-defined because

    z′ ∈ F_z ⟺ z ∈ F_{z′}.

The degree of a vertex z in this graph is |F_z|, and our claimed upper bound on the number of required labels follows from the fact that the chromatic number of any graph is upper bounded by its maximum degree plus 1.
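The "maximum degree plus 1" bound invoked above is achieved constructively by the standard greedy coloring, sketched here on a small cycle standing in for a confusion graph (illustrative only):

```python
def greedy_coloring(vertices, adjacent):
    """Color vertices greedily; never needs more than (max degree + 1) colors,
    since a vertex's neighbors block at most (degree) colors."""
    colors = {}
    for v in vertices:
        used = {colors[u] for u in colors if adjacent(v, u)}
        c = 0
        while c in used:
            c += 1
        colors[v] = c
    return colors

# 5-cycle: every vertex has degree 2, so at most 3 colors are used.
verts = list(range(5))
adj = lambda a, b: (a - b) % 5 in (1, 4)
col = greedy_coloring(verts, adj)
```

In the proof, the colors play the role of the helper's labels on S^n.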
It remains to establish the existence of a code with no repeating codewords that induces small confusion sets.This is established by the following lemma, whose proof is postponed to the appendix.
Lemma 3: On the MMANC, let L > 3 be an integer, and define

    R_L ≜ (1/2) { log |A| − log |S| − (2 log |S|)/⌈log_3 L⌉ }^+.    (79)

Then for any n ∈ Z_+, there exists a codebook of cardinality ⌊2^{nR_L}⌋, of differing codewords, for which

    max_{z ∈ S^n} |F_z| ≤ L − 1.

With the aid of the lemma, we can conclude the achievability for zero-rate help using arguments similar to those we used in the proof of Theorem 4: Consider a sequence of blocklength-n codebooks whose existence is guaranteed by Lemma 3 when we substitute L_n for L, where {L_n} tends to infinity subexponentially in n. With these codebooks and the proposed helper, transmission is error-free, the helping rate is zero, and the transmission rate approaches (1/2) log(|A|/|S|), thus proving (73).
• Case 3: 0 < R_h < log |S|. Follows from time sharing, by dividing the transmission block into two parts of relative lengths R_h/log |S| and 1 − R_h/log |S| and applying the aforementioned schemes. ■

APPENDIX
Proof of Lemma 3: Without loss of generality, assume R_L > 0 and |S|^n > L (otherwise the result is obvious). Define

    R ≜ R_L,    (80)

and let M = {1, . . ., ⌊2^{nR}⌋}. Generate a random codebook C = {X(1), . . ., X(|M|)} by drawing its codewords independently, each equiprobably from A^n. We will show that, with positive probability, the properties (i) C contains no repeating codewords, and (ii) max_{z∈S^n} |F_z| ≤ L − 1 hold simultaneously.
Using the Union Bound, we can upper-bound the probability that Property (i) is violated as follows:

    Pr[ ∃ m ≠ m′ with X(m) = X(m′) ] ≤ |M|^2 |A|^{−n} ≤ 2^{2nR} |A|^{−n}.    (83)

As for Property (ii),

    Pr[ max_{z∈S^n} |F_z| ≥ L ]
    ≤ Σ_{z∈S^n} Pr[ |F_z| ≥ L ]    (85)
    = Σ_{z∈S^n} Pr[ ∃ L distinct z′_1, . . ., z′_L ∈ S^n with z′_i ⊖ z ∈ D* for all i ∈ [L] ]    (86)
    = Σ_{z∈S^n} Pr[ ∃ L distinct ξ_1, . . ., ξ_L ∈ S^n ⊖ z with ξ_i ∈ D* for all i ∈ [L] ],    (87)

where (85) follows from the Union Bound; (86) follows from the definition of F_z in (75); and in (87) we introduced

    D ≜ C ⊖ C.    (88)

To analyze the probabilities appearing on the RHS of (87), we first rule out degenerate collections of n-tuples by passing to tri-independent sub-collections.

Lemma 4: Among any L distinct n-tuples ξ_1, . . ., ξ_L ∈ A^n, there exist at least ⌈log_3 L⌉ that are tri-independent. ■

With the aid of Lemma 4, and defining

    ℓ ≜ ⌈log_3 L⌉,    (89)

we can return to the RHS of (87) to conclude that

    Pr[ |F_z| ≥ L ] ≤ Σ Pr[ ξ_i ∈ D* for all i ∈ [ℓ] ],    (90)

where the sum extends over all tri-independent collections ξ_1, . . ., ξ_ℓ ∈ S^n ⊖ z. Ignoring, for now, the constraint that ξ_1, . . ., ξ_ℓ be in S^n ⊖ z, we claim:
Claim 1: Given ℓ tri-independent n-tuples ξ_1, . . ., ξ_ℓ ∈ A^n and a random codebook C generated as above,

    Pr[ ξ_i ∈ D* for all i ∈ [ℓ] ] ≤ 2^{2nRℓ} |A|^{−nℓ}.    (92)

Proof: Since {ξ_1, . . ., ξ_ℓ} are tri-independent and hence, a fortiori, nonzero, the event {ξ_i ∈ D* for all i ∈ [ℓ]} is contained in the union, over all choices of ℓ pairs {(m_i, m′_i)}_{i∈[ℓ]} of distinct messages, of the events

    { X(m_i) ⊖ X(m′_i) = ξ_i for all i ∈ [ℓ] }.    (93)

Hence, by the Union Bound,

    Pr[ ξ_i ∈ D* for all i ∈ [ℓ] ] ≤ 2^{2nRℓ} |A|^{−nℓ},    (94)

where (94) follows from Lemma 5 ahead and from the fact that the number of events in the union (93) is upper bounded by |M|^{2ℓ} ≤ 2^{2nRℓ}. ■

From (87) and (90) we now obtain

    Pr[ max_{z∈S^n} |F_z| ≥ L ]
    ≤ Σ_{z∈S^n} Σ_{tri-indep. ξ_1,...,ξ_ℓ ∈ S^n⊖z} Pr[ ξ_i ∈ D* for all i ∈ [ℓ] ]    (96)
    ≤ |S|^{n(ℓ+1)} 2^{2nRℓ} |A|^{−nℓ},    (98)

where (96) follows from the Union Bound, and (98) follows from Claim 1 and from the fact that the number of summands is at most |S|^n · |S|^{nℓ}. From (83) and (98) (and the Union Bound) we infer that the two properties (i) and (ii) hold simultaneously with probability strictly larger than

    1 − 2^{2nR} |A|^{−n} − |S|^{n(ℓ+1)} 2^{2nRℓ} |A|^{−nℓ},    (99)

which is positive because, by (79) and (80), each of the two subtracted terms is at most |S|^{−n} ≤ 1/2, with the bound on the first term strict. Thus, with positive probability, the random codebook C satisfies both desired properties simultaneously. This concludes the proof of Lemma 3 (assuming Lemma 5 ahead). ■

We next state and prove Lemma 5.

Lemma 5: Let the ℓ n-tuples {ξ_i}_{i=1}^ℓ be tri-independent, and let {(m_i, m′_i)}_{i∈[ℓ]} be ℓ distinct pairs of distinct messages. Consider the graph G whose vertices are the messages appearing in these pairs and whose edge set is {e_1, . . ., e_ℓ}, where e_i denotes the edge (m_i, m′_i). If X(1), . . ., X(|M|) are drawn IID, each equiprobably from A^n, then the probability

    Pr[ X(m_i) ⊖ X(m′_i) = ξ_i for all i ∈ [ℓ] ]

equals |A|^{−nℓ} if the graph G is acyclic, and equals zero otherwise.
We next consider the case where the graph G is acyclic. A fortiori, G contains no self-loops, so proving the lemma in this case is tantamount to proving that the events

    { X(m_i) ⊖ X(m′_i) = ξ_i },  i ∈ [ℓ],

are independent, because it can be readily verified that for any m ≠ m′ and ξ ∈ A^n,

    Pr[ X(m) ⊖ X(m′) = ξ ] = |A|^{−n}.

Because the codewords are chosen independently, the events corresponding to edges in different connected components of G are independent. We can therefore focus on one nonempty connected component, say G_1 = (V_1, E_1), and show that the events corresponding to E_1 are independent. That is, we need to show that

    Pr[ X(m_i) ⊖ X(m′_i) = ξ_i for all e_i ∈ E_1 ] = |A|^{−n|E_1|}.    (110)

To prove this, we will show by induction on |E_1| that the system of linear equations (with |E_1| equations and |V_1| variables) corresponding to this connected component, namely,

    x(m_i) ⊖ x(m′_i) = ξ_i,  e_i ∈ E_1,    (111)

has |A|^n solutions. This is to be expected because G is acyclic, so G_1 is a tree, and hence |V_1| = |E_1| + 1. If |E_1| = 1, there are two variables and one equation, so the number of solutions is, indeed, |A|^n. If |E_1| ≥ 2, let m_0 be a degree-one vertex of G_1 (which is guaranteed to exist because G_1 is a tree). As such, x(m_0) appears in only one of the equations, and it is therefore uniquely determined by the remaining (|V_1| − 1) variables. The number of solutions is thus the same as for the system that remains when we remove x(m_0) and the equation in which it appears, which leaves us with (|E_1| − 1) equations corresponding to the induced subgraph G_1[V_1 \ {m_0}]. This subgraph is still a tree, and hence, by the induction hypothesis, this restricted system with (|E_1| − 1) equations and (|V_1| − 1) variables has |A|^n solutions. This concludes the induction and establishes that, as we claimed, the system of equations (111) has |A|^n solutions.
We can now complete the proof of (110):

    Pr[ X(m_i) ⊖ X(m′_i) = ξ_i for all e_i ∈ E_1 ] = |A|^n / |A|^{n|V_1|} = |A|^{−n|E_1|},    (113)

where the last equality holds because G_1 is a tree, so |V_1| = |E_1| + 1. ■
We conclude with two clarifications regarding Lemma 2 and its use in the proof of Theorem 4 for the case R_h = 0. First, in the Union-Bound step of the proof of Lemma 2, the bound holds because, when m_1, . . ., m_ℓ are linearly independent, the codewords X(m_1), . . ., X(m_ℓ) are independent, each having IID equiprobably distributed components, and because |X_y| = |S| for every y ∈ A; there, k is chosen as k = ⌊nR_L/log p⌋. Second, the codewords of the code C_n produced by Lemma 2 need not be distinct, which is why the proof of Theorem 4 passes to the code C′_n comprising the distinct codewords of C_n, with {L_n} tending to infinity subexponentially, e.g., L_n = Θ(n).