The Levenshtein’s Sequence Reconstruction Problem and the Length of the List

In the paper, the Levenshtein’s sequence reconstruction problem is considered in the case where the transmitted words are chosen from an <inline-formula> <tex-math notation="LaTeX">$e$ </tex-math></inline-formula>-error-correcting code, at most <inline-formula> <tex-math notation="LaTeX">$t$ </tex-math></inline-formula> substitution errors occur in each of the <inline-formula> <tex-math notation="LaTeX">$N$ </tex-math></inline-formula> channels and the decoder outputs a list of length <inline-formula> <tex-math notation="LaTeX">$\mathcal {L}$ </tex-math></inline-formula>. Previously, when <inline-formula> <tex-math notation="LaTeX">$t = e+\ell $ </tex-math></inline-formula> and the transmitted word is long enough, the numbers of required channels were determined for <inline-formula> <tex-math notation="LaTeX">$\mathcal {L}=1, 2~\text {and }~\ell +1$ </tex-math></inline-formula>. Here we determine the exact number of channels in the cases <inline-formula> <tex-math notation="LaTeX">$\mathcal {L}= 3, 4, \ldots, \ell $ </tex-math></inline-formula>. This also provides the size of the largest intersection of <inline-formula> <tex-math notation="LaTeX">$\mathcal {L}$ </tex-math></inline-formula> balls of radius <inline-formula> <tex-math notation="LaTeX">$t$ </tex-math></inline-formula> (with respect to substitutions) centered at the words with mutual Hamming distances at least <inline-formula> <tex-math notation="LaTeX">$2e+1$ </tex-math></inline-formula>. Furthermore, with the aid of covering codes, we also consider the list sizes in the cases where the length <inline-formula> <tex-math notation="LaTeX">$n$ </tex-math></inline-formula> is rather small (improving previously known results). After that we study how much we can decrease the number of required channels when we use list-decoding codes. 
Finally, the majority algorithm is discussed for decoding in a probabilistic set-up; in particular, we show that the output word of the decoder can be verified to be the transmitted one with high probability.


Introduction
In this paper, Levenshtein's sequence reconstruction problem, which was introduced in [2], is studied when the errors are substitution errors. For many related sequence reconstruction problems (concerning, for instance, deletion and insertion errors) consult, for example, [2]-[8]. Originally, the motivation for the sequence reconstruction problem came from biology and chemistry, where the familiar redundancy method of error correction is not suitable. The sequence reconstruction problem has recently come back into focus, since it was pointed out that the problem is highly relevant to information retrieval in advanced storage technologies, where the stored information is either a single copy which is read many times, or has several copies [5, 9]. This problem (see [5]) is especially applicable to DNA data storage systems (see [10]-[13]), where DNA strands provide numerous erroneous copies of the information and the goal is to recover the information using these copies.
Let us denote the set {1, 2, . . ., n} by [1, n]. Denote by F the finite field of two elements and by F^n the binary Hamming space. The support of the word x = x_1 . . . x_n ∈ F^n is defined via supp(x) = {i | x_i ≠ 0}. Let us denote by 0 = 00 . . . 0 ∈ F^n the all-zero word and by e_i ∈ F^n the word with 1 in the ith coordinate and zeros elsewhere. The Hamming weight w(x) of x ∈ F^n is |supp(x)|.
The Hamming distance is defined as d(x, y) = w(x + y) for x, y ∈ F^n. Let us denote the Hamming ball of radius t centered at x ∈ F^n by B_t(x) = {y ∈ F^n | d(x, y) ≤ t} and its cardinality by V(n, t) = Σ_{i=0}^{t} (n choose i).

Next we consider the sequence reconstruction problem. For the rest of the paper, let C ⊆ F^n be any e-error-correcting code. A codeword x ∈ C is transmitted through N channels where, in each of them, at most t substitution errors can occur. In the sequence reconstruction problem, our aim is to reconstruct x based on the N distinct outputs Y = {y_1, . . ., y_N} from the channels (see Fig. 1).
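As a quick sanity check of this notation, the ball volume V(n, t) can be verified by brute-force enumeration. The following sketch is our own illustration (the function names are not from the paper) and represents words as binary tuples.

```python
from itertools import product
from math import comb

def dist(x, y):
    """Hamming distance d(x, y) = w(x + y) over F_2."""
    return sum(a != b for a, b in zip(x, y))

def ball_volume(n, t):
    """V(n, t) = sum_{i=0}^{t} C(n, i), the size of B_t(x)."""
    return sum(comb(n, i) for i in range(t + 1))

# Cross-check V(n, t) by enumerating the ball around the all-zero word.
n, t = 7, 2
zero = (0,) * n
ball = [y for y in product((0, 1), repeat=n) if dist(zero, y) <= t]
assert len(ball) == ball_volume(n, t)  # V(7, 2) = 1 + 7 + 21 = 29
```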
It is assumed that t > e(C) (if t ≤ e(C), then only one channel is enough to reconstruct x). For ℓ ≥ 1, let us denote t = e(C) + ℓ = e + ℓ for the rest of the paper. The situation where we sometimes obtain a short list of possibilities for x, instead of always recovering x uniquely, is considered in [14, 15]. Based on the set Y and the code C, the list decoder (see Fig. 1) outputs a list T(Y) of codewords containing x; the length of this list is denoted by L. The value L depends on e, ℓ, n and N. Obviously, one would like to have as small L as possible. Observe that we consider the worst case scenario of the output channels regarding L. In such a situation, the channels are sometimes called adversarial; for example, see [16]. The problem of minimizing L is studied, for example, in [14, 15], [17]-[20]. Also probabilistic versions of this problem have been studied (often under the name trace reconstruction), for example in [21, 22].
In this paper, we mainly consider the relation between N and L for various n after we fix the two parameters ℓ and e (while letting C be any e-error-correcting code). The sequence reconstruction problem is also closely related (see [17]) to information retrieval in associative memory introduced by Yaakobi and Bruck [14, 15].
The structure of the paper is as follows. In Section 2, we recall some of the known results. In particular, it is pointed out that if we have at least (resp. less than) V(n, ℓ − 1) + 1 channels, then the list size is constant with respect to n (resp. there are e-error-correcting codes with list size depending on n). In Section 3, we give the complete correspondence between the list size and the number of channels when we have more than V(n, ℓ − 1) + 1 channels and n is large enough. It is sometimes enough to increase the number of channels only by a constant amount in order to decrease the list size (see Corollary 13). Section 4 focuses on improving the bounds on the list size when n is not restricted and we obtain strictly more than V(n, ℓ − 1) + 1 channels. Section 5 is devoted to the list size when we have less than V(n, ℓ − 1) + 1 channels. The final section deals with reconstruction with the aid of a majority algorithm on the coordinates of the output words in Y.

Known results
In this section we present some known results on how the two values N and L are linked. The basic idea in estimating L is the following: we analyse the maximum number N of output words we can fit in the intersection of L balls of radius t centered at codewords. As expected, the length L of the outputted list strongly depends on the number of channels.
Previously, in [2] and [14], the problem has been considered for L = 1 and L = 2, respectively. Moreover, in [23], the exact number of channels N required to have L constant in n has been presented; see Theorems 4 and 5. The following theorem gives the exact number of channels required to have L = 1.
The next result is a reformulation of a result by Yaakobi and Bruck [14, Algorithm 18] proven in [23].
The bound in Theorem 3 can be improved to 2^ℓ, which has been shown to be tight in [23].
Besides the 2^ℓ part, also the value V(n, ℓ − 1) + 1 for the number of channels is tight, that is, if the value of N is smaller, then the list size L can be linear with respect to n.

Let us denote for the rest of the paper
Although the bound for L in Theorem 4 cannot be improved in general, we can improve it, when n is large, to ℓ + 1. Moreover, the bound ℓ + 1 is tight.
Finally, in [14, Theorem 6], the authors have given the exact number of channels required to have L ≤ 2. All in all, N is known precisely only for these three values when L is constant in n. In the following section, we give the missing values for N.

List size with more channels
In this section, we give the exact bound for the number of channels N = N_h + 1 (when n is large) which is required to guarantee L < h for every constant value h. Previously, N_h was known only for the three values h = 2, 3 and ℓ + 2. To achieve this, we need two technical lemmas from [23].
In the following lemma, it is shown that, when n is large, if any three codewords in T(Y) differ within some subset D of coordinates of constant size b, then there exists an output word y which differs from these codewords in at least ℓ − 1 coordinate positions outside of D. Notice that supp(w + z) gives the set of coordinates in which w and z differ. The subsequent lemma shows that the distance between any two codewords in T(Y) is either 2e + 1 or 2e + 2.
We denote by N(n, ℓ, e, h) the maximum number of t-error channels such that there exists a set of output words Y ⊆ F^n satisfying |Y| = N(n, ℓ, e, h) and |T(Y)| ≥ h for some e-error-correcting code C. By Theorems 5 and 6, N(n, ℓ, e, h) = V(n, ℓ − 1) for all ℓ + 2 ≤ h ≤ n/(e + 1) when n is large enough. Hence, when we use the notation N_h = N(n, ℓ, e, h), we assume that h is the smallest integer for which N(n, ℓ, e, h) attains that value. Observe that if N ≥ N_h + 1, then L < h for all e-error-correcting codes. In particular, the difference between N(n, ℓ, e, h) and N_h is that N(n, ℓ, e, h) exists for each value of h but may give the same value for different choices of h, while N_h does not exist for every choice of h but each value it attains is unique.
We need the following two technical notations for the next theorem: for a word y of weight w with i_j = |supp(y) ∩ supp(c_j)|, let W_w = {(i_1, . . ., i_h) | ⌈(w + 1 − ℓ)/2⌉ ≤ i_j ≤ e + 1 for each j ∈ [1, h] and Σ_{j=1}^{h} i_j ≤ w}, and let W′_w be defined in the same way except that the condition on i_1 is replaced by ⌈(w − ℓ)/2⌉ ≤ i_1 ≤ e. In the following theorem, we give the maximum number of channels N_h which gives list size L = h for some e-error-correcting code; if N > N_h, then L < h for all e-error-correcting codes.

Let us then count the number of words in the intersection ∩_{i=1}^{h} B_t(c_i). Clearly, each word y with w(y) ≤ ℓ − 1 belongs to the intersection, contributing V(n, ℓ − 1) words to it. Assume then that w(y) = w ≥ ℓ. As d(y, c_j) ≤ t for all j ∈ [1, h], we have w(y) + w(c_j) − 2|supp(y) ∩ supp(c_j)| ≤ t. Denote i_j = |supp(y) ∩ supp(c_j)|. Assume first that w(c_j) = e + 1 for all j. Then y ∈ B_t(c_j) if and only if w + e + 1 − 2i_j ≤ t. Hence, e + 1 ≥ i_j ≥ (w + 1 − ℓ)/2. Moreover, Σ_{j=1}^{h} i_j ≤ w since w(y) = w and supp(c_{j_1}) ∩ supp(c_{j_2}) = ∅ for each j_1 ≠ j_2. In other words, y ∈ ∩_{i=1}^{h} B_t(c_i) if and only if (i_1, . . ., i_h) ∈ W_w. In the case where w(c_k) = e for some k, say k = 1, we have e ≥ i_1 ≥ (w − ℓ)/2. Thus, y ∈ ∩_{i=1}^{h} B_t(c_i) if and only if (i_1, . . ., i_h) ∈ W′_w. Hence, summing over W_w (resp. W′_w) counts the words of weight w ≥ ℓ in ∩_{i=1}^{h} B_t(c_i). Together these give the claim.
Observe from the proof that the bounds given in Theorem 10 are tight. Notice that, geometrically, the output sets giving the maximal list size are more complicated than, for example, in Theorem 5 (where a ball of volume V(n, ℓ − 1) is essential). If we increase N by one, then L decreases, since we can no longer place all the output words within the intersection of the t-balls centered at the codewords in T(Y). Another observation is that although the sums do not include an explicit upper bound for w, there is one: the definition of W_w gives that w ≤ 2e + ℓ + 1 and that of W′_w that w ≤ 2e + ℓ.
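For small parameters, the intersections appearing in Theorem 10 can be explored by brute force. The sketch below is our own illustration (not the paper's formula): it places h codewords with pairwise disjoint supports of size e + 1, as in the proof, and counts the words within distance t = e + ℓ of all of them.

```python
from itertools import product

def dist(x, y):
    return sum(a != b for a, b in zip(x, y))

def intersection_size(centers, t, n):
    """Count words lying within distance t of every center, i.e. the
    candidate output words when T(Y) consists of these codewords."""
    return sum(all(dist(y, c) <= t for c in centers)
               for y in product((0, 1), repeat=n))

# Codewords arranged as in the proof: disjoint supports of size e + 1.
e, ell, n = 1, 2, 8
t = e + ell

def codeword(j):
    c = [0] * n
    for i in range((j - 1) * (e + 1), j * (e + 1)):
        c[i] = 1
    return tuple(c)

sizes = [intersection_size([codeword(j) for j in range(1, h + 1)], t, n)
         for h in (2, 3, 4)]
# The intersection can only shrink as more codewords are added, and every
# word of weight at most ell - 1 = 1 belongs to it, so each count >= V(8, 1) = 9.
assert sizes[0] >= sizes[1] >= sizes[2] >= 9
```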
In the following theorem, we improve the previous result by showing that the two binomial sums within the max are actually equal.
Proof. Observe that the claim follows from Theorem 10 if we can prove that the two binomial sums in the claim are equal. For that, we consider the subsums S_a and S′_a collecting the terms of weights w = ℓ + 2a and w = ℓ + 2a + 1, and we claim that S_a = S′_a for each non-negative integer a. Assume now that the codewords c_i ∈ T(Y) are arranged as in the proof of Theorem 10, that is, each codeword has weight e or e + 1; hence, their supports do not intersect. Recall that there is at most one codeword of weight e and, if it exists, then we use the set W′_w in our binomial sum. For further rearranging of the Hamming space, we assume that if every codeword has weight e + 1, then supp(c_i) = [(e + 1)(i − 1) + 1, (e + 1)i], and if there exists a codeword of weight e, then it is denoted by c′_1 (replacing c_1) and supp(c′_1) = [1, e]. Recall that w denotes the weight of a word y in the intersection of the t-balls, and denote by Y_{2a} (resp. Y′_{2a}) the set of such words of weight ℓ + 2a in the first (resp. second) arrangement. Considering in turn the words y ∈ Y_{2a}, y ∈ Y_{2a+1} and y ∈ Y′_{2a} \ Y_{2a}, and comparing their weights and the intersections |supp(y) ∩ supp(c_i)|, we obtain S_a = S′_a, and the claim follows.

In the following theorem, we show that N_h exists and is unique for each value h ∈ [3, ℓ + 1] when n is large. As we have seen previously, this does not hold when h ≥ ℓ + 2. Namely, we show that N(n, ℓ, e, h) > N(n, ℓ, e, h + 1) for each such h and thus, each N_h exists and attains a unique value.
Proof. Since N(n, ℓ, e, h + 1) denotes the maximum value of |∩_{i=1}^{h+1} B_t(c_i)| over all sets T(Y) = {c_1, . . ., c_{h+1}}, we clearly have N(n, ℓ, e, h) ≥ N(n, ℓ, e, h + 1). As we have seen in the previous theorems, the maximum value of N(n, ℓ, e, h + 1) is attained when the codewords in T(Y) = {c_1, . . ., c_{h+1}} have supports supp(c_i) = [(i − 1)(e + 1) + 1, i(e + 1)]. We show that when we choose h of these codewords, then we can fit strictly more words in the intersection of their t-balls; in particular, there exists a word y which lies in ∩_{i=1}^{h} B_t(c_i) but not in B_t(c_{h+1}). Thus, the claim follows.
Using Theorem 11, we can improve the bound L ≤ ℓ + 1 of Theorem 6 just by adding a constant number (e + 1)^{ℓ+1} of channels.
Proof. We calculate the value N_h of Theorem 11 with h = ℓ + 1. For this purpose, we consider the set W_w with w ≥ ℓ. We get that i_j ≥ 1 for each j ∈ [1, h] and thus, w ≥ Σ_{j=1}^{h} i_j ≥ ℓ + 1. On the other hand, a short computation shows that w ≥ ℓ + 2 leads to Σ_{j=1}^{h} i_j > w (a contradiction). Therefore, as w = ℓ + 1 and i_j = 1 for each j, we have W_w = {(1, 1, . . ., 1)}. Thus, the sum corresponding to W_w in Theorem 11 gives the claim.

In the following corollary, we present the asymptotic behaviour of N_h in n. Notice that Corollary 13 considers the case h = ℓ + 1 and Corollary 14 the cases 3 ≤ h ≤ ℓ.
Proof. Let e and ℓ be fixed. By Theorem 11, the summand attains its maximum value when w − Σ_{j=1}^{h} i_j is as large as possible, since w ≤ 2e + ℓ + 1 (recall that e and ℓ are constants with respect to n). This occurs exactly when i_j = ⌈(w + 1 − ℓ)/2⌉ for each j and ⌈(w + 1 − ℓ)/2⌉ is as small as possible. In particular, when w ∈ {ℓ, ℓ + 1}, we have ⌈(w + 1 − ℓ)/2⌉ = 1, and when w = ℓ + 1 + a for some integer a ≥ 1, we have ⌈(w + 1 − ℓ)/2⌉ = 1 + ⌈a/2⌉. Moreover, when w = ℓ, we have w − Σ_{j=1}^{h} i_j ≤ ℓ − h; when w = ℓ + 1, we have w − Σ_{j=1}^{h} i_j ≤ ℓ + 1 − h; and when w = ℓ + 1 + a, we have w − Σ_{j=1}^{h} i_j ≤ ℓ + 1 + a − h(1 + ⌈a/2⌉) < ℓ + 1 − h. Thus, we may concentrate on the case w = ℓ + 1.
Furthermore, as w = ℓ + 1 and i_j = 1 for each j = 1, 2, . . ., h, we have Π_{j=1}^{h} (e + 1 choose i_j) = (e + 1)^h. Recall that e is constant in n. Hence, for large n, it is enough to consider the binomial coefficient (n − h(e + 1) choose ℓ + 1 − h). Moreover, the second largest binomial coefficient is (n − h(e + 1) choose ℓ − h), and the claim follows.

In Theorem 2, a tight bound on the number of channels needed to guarantee the list size L ≤ 2 is presented when the code C has minimum distance d. Observe that when we choose h = 3 in Theorem 11, we attain the number of channels required to have L ≤ 2: if N ≥ N_3 + 1, then L ≤ 2 for any e-error-correcting code C, and this value of N is equal to the lower bound obtained in Theorem 2.
Proof. We first show that the formulation of the bound follows from Theorem 11 when h = 3. This gives the minimum number of channels N required to have L ≤ 2. We have renamed the indices for convenience (the index i_1 is saved for later use in the proof). Earlier, we have used W_w only under the assumption that w ≥ ℓ. However, here we may also allow w < ℓ. Hence, we may have some binomial coefficients with i_j < 0 for some j. In these cases (and when i_j > e + 1), we use the common convention that the binomial coefficient attains the value 0. We now get that 2i_2 ≥ 2⌈(w + 1 − ℓ)/2⌉ ≥ i_1 + i_2 + i_3 + i_4 − ℓ + 1, and similar inequalities hold for i_3 and i_4. Since we do not have to take into account the lower bound i_1 ≥ 0 (cases with i_1 < 0 contribute 0 to the binomial sum) or the cases with i_j > e + 1 for j ∈ {2, 3, 4}, we can consider the system of inequalities (3). Our goal is to show that this system of inequalities is equivalent to the system of inequalities (4)-(8). Let us first show that the second system of inequalities follows from the first one.
Inequality (6) follows from the first system, and we obtain Inequality (7) in a similar manner. Moreover, from Inequalities (4) and (5) we obtain the upper bound in (8). Finally, the lower bound inequality in (8) follows directly from (3). Let us then show that the first system of inequalities follows from the second one. First of all, Inequality (3) follows immediately from Inequality (8). Assume first that i_4 ≥ i_3. Then the upper bound of Inequality (8) gives i_2 ≤ ℓ − 1 − i_1 − i_4 + i_3, which implies Inequalities (4) and (5). The case with i_3 ≥ i_4 is similar. Finally, we may add the lower bounds i_j ≥ 0 for all j ∈ {1, 2, 3, 4} due to the binomial coefficient context. Similarly, we notice that if i_1 ≥ ℓ, then i_4 < 0. Thus, we may also add the upper bound i_1 ≤ ℓ − 1. Hence, the first part of the claim follows.
Let us then derive the bound of Theorem 2 by Yaakobi and Bruck from this new lower bound. The case with d = 2e + 1 is included in the Appendix, and here we consider only the case with d = 2e + 2. When d = 2e + 2, Theorem 2 can be presented in the following way: if N is at least the bound stated there, then L ≤ 2 for any e-error-correcting code C (with minimum distance d = 2e + 2). Next, we modify the presentation we obtained for N_3 into the formulation above.
Let us denote i′_2 = e + 1 − i_2, and recall that a binomial coefficient attains the value 0 when its lower index is negative, so the terms with i_4 < 0 vanish. Notice that the upper bound of i_3 can be replaced by t − (i_1 + i_4), since t − (i_1 + i_4) ≥ e + 1 as i_1 + i_4 ≤ ℓ − 1, and (e + 1 choose i_3) = 0 when i_3 > e + 1. A similar substitution applies to i′_2. By comparing these inequalities with the bounds used in Theorem 2, we notice that they are identical. The case with d = 2e + 1 is similar and is included in the Appendix. Hence, we get the claim.

New Bounds with the aid of Covering Codes
Notice that although we have the bound L ≤ ℓ + 1 when n is rather large (see Theorem 6), for smaller lengths our best bound is still L ≤ 2^ℓ (see Theorem 4) when the number of channels satisfies N ≥ V(n, ℓ − 1) + 1. Although this bound is attained in some cases (see [23]) and thus cannot be improved in general, we can try to get a smaller list size L by increasing the number of channels, as we have seen in Theorem 10. In this section, we utilize covering codes when we increase the number of channels. A code C ⊆ F^n is an R-covering code if for every word x ∈ F^n there exists a codeword c ∈ C such that d(x, c) ≤ R. For an excellent source on results concerning covering codes, see [24]. Let us denote by k[n, R] the smallest possible dimension of a linear R-covering code of length n.
Let us next present the well-known Sauer-Shelah lemma (see [25, 26]). Let F be a family of subsets of [1, n], where n is a positive integer. We say that a subset S of [1, n] is shattered by F if for any subset E ⊆ S there exists a set F ∈ F such that F ∩ S = E. The Sauer-Shelah lemma states that if |F| > Σ_{i=0}^{k−1} (n choose i), then F shatters a subset of size (at least) k. Since the subsets of [1, n] can naturally be interpreted as words of F^n, the Sauer-Shelah lemma can be reformulated as follows.

Theorem 16 ([25, 26]). If Y ⊆ F^n is a set containing at least V(n, k − 1) + 1 words, then there exists a set S of k coordinates such that for any word w ∈ F^n with supp(w) ⊆ S there exists a word s ∈ Y satisfying supp(w) = supp(s) ∩ S. Here we say that the set S of coordinates is shattered by Y.
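The reformulated lemma is easy to check computationally on small examples. The helper below is our own sketch (hypothetical function names): it searches for coordinate sets on which every 0/1 pattern is realized by some word of Y.

```python
from itertools import combinations
from math import comb

def V(n, k):
    """V(n, k) = sum_{i=0}^{k} C(n, i)."""
    return sum(comb(n, i) for i in range(k + 1))

def shattered_sets(Y, n, k):
    """Return the k-subsets S of coordinates shattered by Y: every 0/1
    pattern on S occurs as the restriction of some word in Y."""
    found = []
    for S in combinations(range(n), k):
        patterns = {tuple(y[i] for i in S) for y in Y}
        if len(patterns) == 2 ** k:
            found.append(S)
    return found

# |Y| = V(5, 1) + 1 = 7 words: the lemma promises a shattered 2-set.
Y = [(0, 0, 0, 0, 0), (1, 1, 0, 0, 0)]
Y += [tuple(1 if j == i else 0 for j in range(5)) for i in range(5)]
assert len(Y) == V(5, 1) + 1
assert shattered_sets(Y, 5, 2)  # e.g. the coordinate pair {1, 2} is shattered
```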
Observe that each Hamming ball of radius e contains at most one codeword of C. Thus, if the intersection of the balls of radius t centered at the output words of Y can be covered by k balls of radius e, then we have |T(Y)| ≤ k. This approach is formulated in the following lemma.
Lemma 17 ([23]). Let C ⊆ F^n be an e-error-correcting code. If for any set of output words Y = {y_1, . . ., y_N} we have ∩_{i=1}^{N} B_t(y_i) ⊆ ∪_{i=1}^{k} B_e(β_i) for some words β_1, . . ., β_k ∈ F^n, then L ≤ k.

Notice that Lemma 17 also gives a decoding algorithm. Indeed, if the words β_i are known, then, since there is at most one codeword in each B_e(β_i), we can use the decoding algorithm of C on each β_i and add the obtained codeword to the list T(Y).
Theorem 18. Let C be an e-error-correcting code. If the number of channels satisfies N ≥ V(n, ℓ + 2R − 1) + 2 − 2^{ℓ+2R−k[ℓ+2R,R]}, then L ≤ 2^{k[ℓ+2R,R]}.

Proof. Let x be the input word. We show that with this number of outputs we can guarantee that there exists a set S of ℓ + 2R coordinates such that, within the coordinates of S, a subset Y′ ⊆ Y contains a linear R-covering code of length ℓ + 2R. Due to Theorem 16, we know that if we had more output words, namely |Y| ≥ V(n, ℓ + 2R − 1) + 1, then we would have a set S of coordinates such that a subset Y′ ⊆ Y contains all the 2^{ℓ+2R} words of length ℓ + 2R among the coordinates of S. Let D be a linear R-covering code in F^{ℓ+2R} with dim(D) = k[ℓ + 2R, R]. Notice that any coset u + D, u ∈ F^{ℓ+2R}, of the linear code D is also an R-covering code, and there are 2^{ℓ+2R−dim(D)} distinct cosets. Therefore, the set Y can miss any 2^{ℓ+2R−dim(D)} − 1 words of F^{ℓ+2R} and still the remaining subset contains at least one R-covering code of length ℓ + 2R. Consequently, Y′ contains an R-covering code of size 2^{k[ℓ+2R,R]}, because Y can be obtained from a set of V(n, ℓ + 2R − 1) + 1 words by removing some 2^{ℓ+2R−dim(D)} − 1 words. Now let s ∈ F^n be a word such that supp(s) = S and let Y_1 = {y_1, . . ., y_{2^{k[ℓ+2R,R]}}} ⊆ Y be the subset of output words corresponding to the R-covering code. Denote β_i = s + y_i for i = 1, . . ., 2^{k[ℓ+2R,R]}. Since the words in Y_1 form, among the coordinates corresponding to S, an R-covering code of length ℓ + 2R, we know that there exists y_j, j ∈ {1, . . ., 2^{k[ℓ+2R,R]}}, such that the words y_j and x + s differ in at most R places among the coordinates of S. Consequently, as d(x, y_j) ≤ t, the words x and β_j = y_j + s have distance at most t − (ℓ + R) + R = e from each other. Therefore, by Lemma 17, we get the claim.

Note that if ℓ = 5 and N ≥ V(n, 4) + 1, then, by Theorem 4, we have L ≤ 2^5 = 32. If we have N ≥ V(n, 6) − 6, then (using as the linear 1-covering code D the Hamming code of length 7) we obtain, by the previous result, that L ≤ 16.
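The example above uses the fact that the Hamming code of length 7 is a linear 1-covering code of dimension k[7, 1] = 4, which is easy to verify exhaustively. A sketch with the standard parity-check construction (the variable names are ours):

```python
from itertools import product

# Parity-check matrix of the [7, 4] Hamming code: columns are 1..7 in binary.
H = [[(j >> b) & 1 for j in range(1, 8)] for b in range(3)]

def syndrome(x):
    return tuple(sum(h[i] * x[i] for i in range(7)) % 2 for h in H)

code = [x for x in product((0, 1), repeat=7) if syndrome(x) == (0, 0, 0)]
assert len(code) == 16  # dimension 4, matching k[7, 1] = 4

def dist(x, y):
    return sum(a != b for a, b in zip(x, y))

# Covering radius 1: every word of F^7 lies within distance 1 of a codeword.
cov = max(min(dist(x, c) for c in code) for x in product((0, 1), repeat=7))
assert cov == 1
```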

List size with less channels
By the following theorem (of [23]), it is clear that if we have less than V(n, ℓ − 1) + 1 channels, then the list size cannot in general be constant for e-error-correcting codes of length n.
Theorem 19 ([23]). Let V(n, ℓ − p − 1) + 1 ≤ N ≤ V(n, ℓ − p), where 0 ≤ p ≤ ℓ − 1. Moreover, let C ⊆ F^n be an e-error-correcting code such that L is maximal. Then the list size L can be polynomial in n, with an exponent depending on p.

Due to this result, in order to have a smaller list size, let us concentrate on certain e-error-correcting codes, namely, those with at most M codewords within any ball of radius e + a, for some a > 0.
Theorem 20. Let N ≥ V(n, ℓ − a − 1) + 1, where 0 ≤ a ≤ ℓ − 1. Let C be an e-error-correcting code such that |B_{e+a}(u) ∩ C| ≤ M for every u ∈ F^n. Then L ≤ 2^{ℓ−a} M.

Proof. Assume that we received the set Y of output words from the channels. Due to the number of channels, we know, by Theorem 16, that there exists a subset Y′ ⊆ Y of size |Y′| = 2^{ℓ−a} such that, within some set S of ℓ − a coordinates, these words attain all possible words of length ℓ − a. Suppose x is the input word and denote Y′ = {y_1, . . ., y_{2^{ℓ−a}}}. Let β_i = y_i + s, i = 1, . . ., 2^{ℓ−a}, where supp(s) = S. It is easy to check that x has distance at most e + a from one of the β_i's. Since there are at most 2^{ℓ−a} words β_i and at most M codewords within distance e + a from each of them, we obtain the bound L ≤ 2^{ℓ−a} M.
The previous result is useful when our e-error-correcting code is a code for traditional list-decoding, see [27]. When the number of channels is less than V(n, ℓ − 1) + 1, it also gives, for every e-error-correcting code with suitable a, a smaller exponent for n compared to Theorem 19 (see Corollary 21(ii) below), or even constant bounds (see Corollary 21(i)). The following corollary follows straightforwardly from applying the results of [27].

In what follows, we continue our study of the list size for codes with |B_{e+a}(u) ∩ C| ≤ M for every u ∈ F^n. First, we introduce two technical lemmas. Lemma 22 can be seen as a reformulation of [23, Lemma 13] for a smaller number of channels, and Lemma 23 as a reformulation of Lemma 9.
such words, which is strictly less than N, and hence the claim holds.

Together with the two previous lemmas, we can now prove an upper bound for L depending on the maximum number M of codewords in a ball of radius e + a. In what follows, we show, using an iterative approach, that there exists a central word w ∈ F^n with respect to T(Y). We begin the iterative process by considering a subset C_0 = {c_1, c_2} ⊆ T(Y) such that c_1 = 0 and w(c_2) = e + a + 1 + p_1 with 1 ≤ p_1 ≤ e + a + 1. Indeed, we may assume that such a codeword c_2 exists since otherwise we are immediately done, w = 0 being the searched central word. Observe that the weight of a central word w ∈ W_{C_0} satisfies p_1 ≤ w(w) ≤ e + a + 1. In particular, there are exactly (e + a + 1 + p_1 choose p_1) central words w ∈ W_{C_0} of weight p_1. For each such central word w, we may assume that there exists a codeword c ∈ T(Y) such that d(w, c) > e + a + 1, as otherwise w ∈ W_{T(Y)} and we are done. Now we form a new code C_1 by adding such a codeword c for each w ∈ W_{C_0} with w(w) = p_1. Since the number of added codewords is at most (2e + 2a + 2)^{p_1}, we have |C_1| ≤ (2e + 2a + 2)^{p_1} + 1. Notice that there are no central words in W_{C_1} of weight at most p_1. Furthermore, by the previous observation for W_{C_S}, the set W_{C_1} is nonempty. Assume now that p_2 > p_1 is the smallest weight of a central word in W_{C_1}, and let w be a central word with respect to C_1 of weight p_2. The support of w is a subset of ∪_{c∈C_1} supp(c), since otherwise there would exist a central word w′ ∈ W_{C_1} with w(w′) < w(w) (a contradiction). Therefore, the number of central words of weight p_2 in W_{C_1} is bounded. Again, for each central word w ∈ W_{C_1} of weight p_2, there exists a codeword c ∈ T(Y) such that d(w, c) > e + a + 1. Now we form a new code C_2 by adding such a codeword c for each w ∈ W_{C_1} with w(w) = p_2. The process can be iteratively continued by forming a new code C_i based on the previous code C_{i−1}, until we have reached p_i = e + a + 1 or we have already found
a central word with respect to T(Y). In what follows, the iterative process is explained in more detail:

• Let p_i > p_{i−1} be the smallest weight of a central word in W_{C_{i−1}}. By the observation above for W_{C_S}, the set W_{C_{i−1}} is nonempty as |C_{i−1}| ≤ b/(2e + 2a + 2) (see also Equation (10)).

• Let w ∈ W_{C_{i−1}} be of weight p_i. As previously, the support of w is a subset of ∪_{c∈C_{i−1}} supp(c), since otherwise there would exist a central word w′ ∈ W_{C_{i−1}} with w(w′) < w(w) (a contradiction). Therefore, the number of central words of weight p_i in W_{C_{i−1}} is at most (2e + 2a + 2)^{p_i}.
• Again, for each central word w ∈ W_{C_{i−1}} of weight p_i, there exists a codeword c ∈ T(Y) such that d(w, c) > e + a + 1. Now we form a new code C_i by adding such a codeword c for each w ∈ W_{C_{i−1}} with w(w) = p_i. Thus, the number of codewords grows by a factor of at most (2e + 2a + 2)^{p_i}.

Notice that since 1 ≤ p_1 < p_2 < · · · < p_i ≤ e + a + 1, we reach p_j = e + a + 1 at some point (or the central word w with respect to T(Y) has already been found in an earlier step). By (9), W_{C_j} is nonempty. Thus, in conclusion, there exists a central word w ∈ F^n with respect to T(Y).
Let us now translate the Hamming space so that w = 0 and thus T(Y) ⊆ B_{e+a+1}(0). In other words, we have w(c_i) ≤ e + a + 1 for each i. Recall that N ≥ V(n, ℓ − a − 1) + 1. Thus, there exists a word y ∈ Y such that w(y) ≥ ℓ − a. Moreover, since d(c_i, y) ≤ t for each i, we have w(y) ≤ t + e + a + 1. The proof now divides into the following three cases depending on the weight w(y).
(i) Assume first that ℓ − a ≤ w(y) ≤ t. Now the support of each c_i ∈ T(Y) of weight e + a + 1 intersects supp(y), since otherwise d(y, c_i) ≥ (e + a + 1) + (ℓ − a) = t + 1 (a contradiction). Hence, T(Y) ⊆ ∪_{i∈supp(y)} B_{e+a}(e_i).
(ii) Assume then that t ≤ w(y) ≤ t + e + a. Let y_s be a word such that supp(y_s) ⊆ supp(y) and w(y_s) = t. Now the support of each c_i ∈ T(Y) of weight e + a + 1 intersects supp(y_s), since otherwise d(y, c_i) ≥ t + 1 (a contradiction). Hence, T(Y) ⊆ ∪_{i∈supp(y_s)} B_{e+a}(e_i).
(iii) Assume finally that w(y) = t + e + a + 1. Then we have w(c_i) = e + a + 1 for every c_i ∈ T(Y) as d(y, c_i) ≤ t. Let y_s be a word such that supp(y_s) ⊆ supp(y) and w(y_s) = t + 1. Again, the support of each c_i ∈ T(Y) intersects supp(y_s) and hence T(Y) ⊆ ∪_{i∈supp(y_s)} B_{e+a}(e_i).

Based on the cases (i)-(iii), the set of codewords T(Y) is always contained in a union of at most t + 1 balls of radius e + a. Thus, as each ball of radius e + a contains at most M codewords, we obtain L ≤ (t + 1)M.

Decoding with majority algorithm
In this section, we focus on decoding the transmitted word x = (x_1, x_2, . . ., x_n) ∈ C based on the set Y of output words using a majority algorithm. For the rest of the section, we assume that each word of B_t(x) is outputted from a channel with equal probability. Here we actually allow, unlike elsewhere in the paper, some of the output words y_i to be equal. The probabilistic set-up has been studied for different error types, for example, in [21, 22]. Our approach differs from these articles in that we have an upper limit on the possible number of errors in any single channel. This allows us to have a verifiability property in Theorem 28, unlike, for example, in [21, 22]. By the verifiability property we mean that although we cannot be certain in advance that we will deduce the transmitted word correctly, we can sometimes deduce the word with complete certainty after seeing the output words. In other words, some output word sets have properties which allow us to know the transmitted word with certainty.
First we describe the (well-known) majority algorithm using terminology and notation similar to [14]. The coordinates of the output words y_j ∈ Y are denoted by y_j = (y_{j,1}, y_{j,2}, . . ., y_{j,n}). Furthermore, the numbers of zeros and ones in the ith coordinates of the output words are denoted by m_{i,0} = |{j ∈ [1, N] | y_{j,i} = 0}| and m_{i,1} = N − m_{i,0}, respectively. Based on Y, the majority algorithm outputs the word z = (z_1, . . ., z_n), where z_i = 0 if m_{i,0} > m_{i,1}, z_i = 1 if m_{i,1} > m_{i,0}, and z_i = ? if m_{i,0} = m_{i,1}. In other words, for each coordinate of z, we choose ?, 0 or 1 based on whether the numbers of 0s and 1s are equal or which one occurs more frequently. Observe that the coordinate z_i outputted by the majority algorithm is equal to x_i if and only if at most ⌈N/2⌉ − 1 errors occur in the ith coordinates of Y. Observe that the complexity of the majority algorithm is Θ(Nn) and, since merely reading all the output words takes Θ(Nn) time, the majority algorithm has optimal time complexity.
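The coordinatewise rule above can be sketched as follows (the function name is ours, not from [14]):

```python
def majority_decode(Y):
    """Coordinatewise majority vote over the output words Y.
    Returns z with z_i in {0, 1, '?'} ('?' marks a tie m_{i,0} = m_{i,1})."""
    N, n = len(Y), len(Y[0])
    z = []
    for i in range(n):
        m1 = sum(y[i] for y in Y)   # m_{i,1}
        m0 = N - m1                 # m_{i,0}
        z.append('?' if m0 == m1 else int(m1 > m0))
    return z

# Three channels, each flipping at most t = 1 coordinate of x = 0000:
Y = [(0, 0, 0, 0), (1, 0, 0, 0), (0, 0, 1, 0)]
assert majority_decode(Y) == [0, 0, 0, 0]
```

Each coordinate is handled in O(N) time, so the whole algorithm runs in the Θ(Nn) time mentioned above.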
In [14, Example 1], it is shown that the majority algorithm does not always output the correct transmitted word x when the number of channels N is equal to the value in Theorem 1, even if we take the e-error-correction capability of C into account. In [7], a modification of the majority algorithm is presented for decoding, and it is shown that if the number of channels satisfies the bound of Theorem 1, then the output word of the algorithm belongs to B_e(x) and can be uniquely decoded to x. In what follows, we demonstrate that, with high probability, the word z is within distance e from x with a significantly smaller number of channels (than in [7]). For example, Monte Carlo simulations for the values t = 5 and n = 28 are illustrated in Table 1.
For this purpose, we first consider a variant of the so-called multiple birthday problem; the multiple birthday problem has been studied, for example, in [28] and [29]. Here we assume that s, q, n and t are integers satisfying 2 ≤ s ≤ q and 0 ≤ t ≤ n. A throw consists of placing t balls randomly into n buckets in such a way that each ball lands in a different bucket and each t-subset of the buckets is equally likely for a throw. Denote by C_t(n, q, s) the event that after q throws at least one bucket contains at least s balls, and by Pr[C_t(n, q, s)] its probability. Observe that if t = 1, then we are actually considering the multiple birthday problem; furthermore, if t = 1 and s = 2, then the case is the (regular) birthday problem. In the following theorem, we present (based on Pr[C_t(n, q, s)]) a lower bound on the probability that the output z of the majority algorithm is equal to x.

Theorem 25. Let x be the transmitted codeword of C ⊆ F^n and N the number of channels. The probability that the output z of the majority algorithm is equal to x is at least 1 − Pr[C_t(n, N, ⌈N/2⌉)].

Proof. Let P_1 denote the probability that some coordinate of the outputs Y contains at least ⌈N/2⌉ errors when at most t errors occur in each channel, and P_2 the corresponding probability when exactly t errors occur in each channel. It is immediate that P_1 ≤ P_2. Observe that the outputs Y of the N channels in the case of the probability P_2 can be represented as the above-described variant of the multiple birthday problem as follows: an output y_i ∈ Y with exactly t errors can be considered as a throw of t balls into n buckets, and there are N throws in total. Therefore, P_2 = Pr[C_t(n, N, ⌈N/2⌉)] and the claim follows.
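The throw model can be simulated directly. The following Monte Carlo sketch is our own (the parameter choices are hypothetical, not the paper's t = 5, n = 28 simulation) and estimates Pr[C_t(n, q, s)]:

```python
import random
from collections import Counter

def estimate_prob(n, q, s, t, trials=2000, seed=1):
    """Monte Carlo estimate of Pr[C_t(n, q, s)]: after q throws of t balls
    into n buckets (distinct buckets within a throw, each t-subset equally
    likely), some bucket holds at least s balls."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        counts = Counter()
        for _ in range(q):
            counts.update(rng.sample(range(n), t))  # one throw of t balls
        if counts and max(counts.values()) >= s:
            hits += 1
    return hits / trials

# Degenerate sanity checks: with t = n every bucket gets a ball per throw,
# so some bucket reaches s balls exactly when q >= s.
assert estimate_prob(5, 4, 4, 5) == 1.0
assert estimate_prob(5, 3, 4, 5) == 0.0
```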
In order to obtain a lower bound on the probability 1 − Pr[C_t(n, N, ⌈N/2⌉)], we require an upper bound on Pr[C_t(n, N, ⌈N/2⌉)]. In the following lemma, we present such an upper bound, loosely based on a recursive idea in [28] for computing the exact probability of the multiple birthday problem.
Lemma 26. Let s, q, n and t be integers satisfying 2 ≤ s ≤ q and 0 ≤ t ≤ n.
(i) The probability Pr[C_t(n, q, s)] is at most
$$\sum_{i=s}^{q} n \binom{i-1}{s-1} \left(\frac{t}{n}\right)^{s} \left(\frac{n-t}{n}\right)^{i-s} \left(1 - \Pr[C_{t-1}(n-1, i-1, s)]\right),$$
where Pr[C_0(n, q, s)] = 0 and Pr[C_t(n, q, s)] = 0 if q < s.
(ii) Furthermore, we obtain that
$$\Pr[C_t(n, q, s)] \le n \binom{q}{s} \left(\frac{t}{n}\right)^{s}.$$

Proof. (i) Observe first that we clearly have Pr[C_0(n, q, s)] = 0. Let then i be an integer such that s ≤ i ≤ q. Denote by C_t(n, q, s, i) the event that after the ith throw of t balls there exist for the first time at least s balls in some bucket; notice that after the ith throw it is possible that s balls appear in multiple buckets. Using this notation, we have
$$\Pr[C_t(n, q, s)] = \sum_{i=s}^{q} \Pr[C_t(n, q, s, i)]. \qquad (11)$$
The probability Pr[C_t(n, q, s, i)] can be calculated based on the following facts: (i) Let B be one of the buckets that first attains s balls. Clearly, the bucket B can be chosen in n ways.
(ii) The s − 1 throws placing balls into B before the ith throw can be chosen from the previous i − 1 throws in $\binom{i-1}{s-1}$ ways.
(iii) As the probability that a ball of a single throw lands in B is $\binom{n-1}{t-1}/\binom{n}{t} = t/n$, the probability of the event that the s selected throws put balls into B is equal to $(t/n)^{s}$. (iv) As the probability that no ball of a single throw lands in B is $\binom{n-1}{t}/\binom{n}{t} = (n-t)/n$, the probability of the event that no other throw (than the s selected ones) puts a ball into B is equal to $((n-t)/n)^{i-s}$. (v) Finally, let B′ denote the set of n − 1 buckets other than B and let P_i denote the probability that no bucket in B′ contains at least s balls after the first i − 1 throws, with the conditional assumption that the events of (iii) and (iv) occur. Observe that if a ball of a throw lands in B, then the throw puts t − 1 balls into the buckets of B′, and otherwise t balls land in B′.
Thus, in conclusion, we have
$$\Pr[C_t(n, q, s, i)] \le n \binom{i-1}{s-1} \left(\frac{t}{n}\right)^{s} \left(\frac{n-t}{n}\right)^{i-s} P_i.$$
Therefore, by (11), we obtain that Pr[C_t(n, q, s)] is at most
$$\sum_{i=s}^{q} n \binom{i-1}{s-1} \left(\frac{t}{n}\right)^{s} \left(\frac{n-t}{n}\right)^{i-s} P_i;$$
notice that P_i is equal to the probability that no bucket in B′ contains at least s balls after s − 1 throws with t − 1 balls and i − s throws with t balls have been performed in the buckets of B′ (corresponding to the events of (iii) and (iv), respectively). Therefore, we obtain that P_i ≤ 1 − Pr[C_{t−1}(n − 1, i − 1, s)] since at least t − 1 balls are thrown in each of the i − 1 throws into the n − 1 buckets of B′ (by the observation in (v)). Hence, the claim immediately follows.
(ii) For the second upper bound, we first notice that by the so-called hockey stick identity for binomial coefficients, we have
$$\sum_{i=s}^{q} \binom{i-1}{s-1} = \binom{q}{s}.$$
Therefore, as (n − t)/n ≤ 1 and P_i ≤ 1, the second claim immediately follows.
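For concreteness, the closed-form bound of part (ii), in the form reconstructed above as n·binom(q, s)·(t/n)^s, can be evaluated numerically; `collision_bound` is our illustrative name for it:

```python
import math

def collision_bound(n, q, s, t):
    """Upper bound n * binom(q, s) * (t/n)**s on Pr[C_t(n, q, s)]
    (Lemma 26(ii) in the form reconstructed here), capped at 1."""
    return min(1.0, n * math.comb(q, s) * (t / n) ** s)
```

For the classical birthday setting (n = 365, q = 23, s = 2, t = 1) the bound evaluates to 253/365 ≈ 0.693, a valid but loose overestimate of the true probability ≈ 0.507.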
In the following theorem, the upper bound (ii) of the previous lemma is applied to estimate the probability in Theorem 25. Observe that, according to the second claim of the theorem, the probability that the majority algorithm outputs the transmitted word is as close to one as required when the rather weak condition n > 2t·e is satisfied, where e here denotes Napier's constant, and N is large enough.
Theorem 27. Let C be an e-error-correcting code and x the transmitted word of C. The probability that the output z of the majority algorithm is equal to the transmitted word x is at least
$$1 - n \binom{N}{\lceil N/2 \rceil} \left(\frac{t}{n}\right)^{\lceil N/2 \rceil}.$$
Moreover, if n > 2t·e, where e is Napier's constant, then this probability tends to one as N → ∞. Indeed, applying the upper bound $\binom{N}{\lceil N/2 \rceil} \le (2e)^{\lceil N/2 \rceil}$, we obtain that
$$1 - n \binom{N}{\lceil N/2 \rceil} \left(\frac{t}{n}\right)^{\lceil N/2 \rceil} \ge 1 - n \left(\frac{2te}{n}\right)^{\lceil N/2 \rceil} \to 1$$
as N → ∞ since n > 2t·e. Thus, the second claim follows.
In Table 1, we illustrate various approaches to approximating the probability that the output z of the majority algorithm equals the transmitted word x when t = 5 and n = 28 (satisfying the condition of Theorem 27): the lower bound on the probabilities of Theorem 25 together with Lemma 26 and Theorem 27, as well as Monte Carlo simulations with 100000 samples. Observe that the majority algorithm outputs z = x with high probability for a significantly smaller number N of channels compared to Theorem 1, for which the required number of channels is 41709 when e = 0, t = 5 and n = 28; here e = 0 is chosen in order to meet the requirement z = x.
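The Monte Carlo column of such an experiment can be reproduced in spirit with a short simulation (our sketch; the paper's exact experimental setup may differ). With exactly t errors per channel and the all-zero word transmitted without loss of generality, coordinatewise majority voting gives:

```python
import random

def majority_success_rate(n, t, N, trials=500, seed=1):
    """Estimate the probability that coordinatewise majority voting over
    N channels recovers the transmitted word, when each channel flips
    exactly t of the n bits (distinct positions, chosen uniformly).
    The all-zero word is transmitted WLOG; ties (N even, exactly N/2
    ones) are resolved to 0 as a convention of this sketch."""
    rng = random.Random(seed)
    ok = 0
    for _ in range(trials):
        ones = [0] * n  # ones[i] = number of channels showing 1 in coord i
        for _ in range(N):
            for i in rng.sample(range(n), t):
                ones[i] += 1
        # coordinate decoded to 1 iff a strict majority of channels shows 1
        z = [1 if 2 * c > N else 0 for c in ones]
        if all(b == 0 for b in z):
            ok += 1
    return ok / trials
```

For n = 28 and t = 5, already moderate values of N (a few dozen channels) should make the success rate very close to one, in line with the discussion above.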
Above, we saw that it is highly probable that the majority algorithm works correctly with a significantly smaller number of channels than the one given in Theorem 1, which was required by the algorithm presented in [7]. In what follows, we take another approach to verifying the output. Notice that if x = z, then the number of errors is exactly $\sum_{i=1}^{n} m_i$. In addition, if x ≠ z, then for each coordinate i in which the words differ, max{m_{i,0}, m_{i,1}} = N − m_i is contributed to the sum of errors (instead of m_i). The following theorem is based on the idea that even the modified sum (on the left-hand side of (13)) has to satisfy Inequality (12).

Theorem 28. Let C be an e-error-correcting code, let m′_i denote the integers m_i ordered in such a way that m′_1 ≥ m′_2 ≥ ⋯ ≥ m′_n, and let z be the output word of the majority algorithm. We have d(x, z) ≤ k if k is a positive integer such that
$$\sum_{i=1}^{n} m_i + \sum_{i=1}^{k+1} \left(N - 2m'_i\right) > tN. \qquad (13)$$

Proof. Let k be a positive integer satisfying (13). Suppose to the contrary that d(x, z) ≥ k + 1. This implies that for at least k + 1 coordinates i, the number of errors is max{m_{i,0}, m_{i,1}} = N − m_i. Therefore, by the ordering of the m′_i, the number of errors is at least
$$\sum_{i=1}^{n} m_i + \sum_{i=1}^{k+1} \left(N - 2m'_i\right).$$
Thus, due to (13), we have a contradiction with the maximum number of errors being tN, and the claim follows.
Observe that (13) allows us to estimate the accuracy of z. In particular, if k ≤ e, then d(x, z) ≤ k ≤ e and the word z can be decoded to x since C is an e-error-correcting code. Furthermore, if k > e, then x ∈ C ∩ B_k(z) and the decoding algorithm outputs a list of words containing x. Moreover, the size of the list is at most max_{u ∈ F^n} |C ∩ B_k(u)|, which is closely related to traditional list decoding (see [27]). In conclusion, the theorem gives us a condition guaranteeing that the transmitted word can be decoded uniquely or with a certain accuracy.
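The check of Theorem 28 is mechanical once the disagreement counts m_i = min{m_{i,0}, m_{i,1}} have been observed. The sketch below (our illustration; it assumes condition (13) in the form reconstructed here, namely Σ m_i + Σ_{i=1}^{k+1}(N − 2m′_i) > tN) returns the smallest qualifying k:

```python
def accuracy_radius(m, N, t):
    """Smallest positive k satisfying condition (13) as reconstructed
    here: sum(m_i) plus the sum of the k+1 smallest values N - 2*m'_i
    exceeds t*N, where m'_1 >= ... >= m'_n is m sorted decreasingly.
    By Theorem 28, d(x, z) <= k then holds. Returns None if no k works.
    `m` lists m_i = min{m_{i,0}, m_{i,1}} for each coordinate."""
    ms = sorted(m, reverse=True)
    total = sum(ms)
    for k in range(1, len(ms) + 1):
        # the k+1 largest m'_i yield the k+1 smallest values N - 2*m'_i
        extra = sum(N - 2 * ms[i] for i in range(min(k + 1, len(ms))))
        if total + extra > t * N:
            return k
    return None
```

If the returned k satisfies k ≤ e, the output z of the majority algorithm can be decoded uniquely to x, as discussed above.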
The probability that in Theorem 28 there exists a suitable k satisfying (13) can be analysed analytically, as shown below, but first we approximate it using Monte Carlo simulations. In Table 2, the probability is approximated using 100000 samples for n = 24, t = 7, e = 2, 3, 4 and a varying number of channels N; here we choose k = e and we strive for the exact transmitted word x.
From the table, we can see that as the number of channels increases, it becomes very likely that the condition of Theorem 28 is satisfied for k = e. In what follows, we further study analytically the probability that for a set Y of outputs there exists an integer k satisfying the conditions of Theorem 28. For this purpose, let $\overline{C}_t(n, q, s)$ denote the event that after q random throws of t balls, each of the n buckets contains at most s balls. Furthermore, regarding the total number of errors occurring in the channels, let Er(r) denote the event that at least r errors happen in the outputs Y in total. Moreover, let p(N) denote the parity of N, i.e., p(N) = 1 if N is odd, and otherwise p(N) = 0. Now we are ready to formulate the following theorem.
Theorem 29. Let C be an e-error-correcting code and α a positive integer smaller than ⌈N/2⌉. The probability that a positive integer k satisfies (13) is at least
$$\Pr\left[\overline{C}_t(n, N, \lceil N/2 \rceil - \alpha) \cap Er\left(tN - (k+1)(2\alpha - p(N)) + 1\right)\right].$$

Proof. Assume that (i) at most ⌈N/2⌉ − α errors occur in each coordinate of the outputs Y and that (ii) at least tN − (k + 1)(2α − p(N)) + 1 errors occur in the channels in total (when at most t errors occur in each channel). Now the difference (N − m_i) − m_i = N − 2m_i gives the number of additional errors occurring in a coordinate in the case that max{m_{i,0}, m_{i,1}} = N − m_i errors happen instead of m_i = min{m_{i,0}, m_{i,1}} errors. The difference N − 2m_i can be estimated based on the parity of N as follows: if N is even, then m_i ≤ N/2 − α gives N − 2m_i ≥ 2α, and if N is odd, then m_i ≤ (N + 1)/2 − α gives N − 2m_i ≥ 2α − 1. In conclusion, we have N − 2m_i ≥ 2α − p(N). Moreover, by the assumption (i), each coordinate contains fewer than N/2 errors, and hence the number of errors occurring in the ith coordinates equals m_i. Therefore, using the notation of Theorem 28, we have
$$\sum_{i=1}^{n} m_i + \sum_{i=1}^{k+1} \left(N - 2m'_i\right) \ge tN + 1$$
by the assumption (ii). Thus, the integer k satisfies the condition of Theorem 28.
In order to estimate the probability of the events of the assumptions (i) and (ii) occurring simultaneously, we denote by A the event that at most ⌈N/2⌉ − α errors occur in each coordinate of the outputs Y when at most t errors happen in each channel. Clearly, since in the event $\overline{C}_t(n, N, \lceil N/2 \rceil - \alpha)$ the outputs Y of the channels can be interpreted as in Theorem 25, we have $\overline{C}_t(n, N, \lceil N/2 \rceil - \alpha) \subseteq A$. Hence, the probability of the events of the assumptions (i) and (ii) occurring simultaneously is at least
$$\Pr[A \cap Er(tN - (k+1)(2\alpha - p(N)) + 1)] \ge \Pr\left[\overline{C}_t(n, N, \lceil N/2 \rceil - \alpha) \cap Er\left(tN - (k+1)(2\alpha - p(N)) + 1\right)\right]$$
and the claim follows.
Observe that if A and B are events (independent or dependent), then the probability of the event A ∩ B can be estimated by the inclusion–exclusion principle as follows:
$$\Pr[A \cap B] \ge \Pr[A] + \Pr[B] - 1.$$
The lower bound on $\Pr[\overline{C}_t(n, N, \lceil N/2 \rceil - \alpha)] = 1 - \Pr[C_t(n, N, \lceil N/2 \rceil - \alpha + 1)]$ immediately follows by Lemma 26(ii) and, hence, the first claim follows. Assume then that t − µ < k and n > 2t·e/A. For the limit of the second claim, observe first that hσ√N + 1 + p(N)(k + 1) ≤ N when N is large enough since k is a (fixed) constant. Therefore, straightforward calculations give the required estimate for large enough N, and the second claim follows. Notice that the condition t − µ < k of the lemma is rather undemanding for our purposes. Observe that the lemma is usually applied with k = e since then the output z of the majority algorithm can be uniquely decoded to the transmitted word x according to Theorem 28. In this case, the condition can be formulated as t − e < µ. Moreover, we trivially have $\mu > t p_t = t\binom{n}{t}/V(n, t)$. Furthermore, estimating the volume V(n, t) from above, we obtain for n ≥ t² + t − 1 that µ > t − 1, where the assumption n ≥ t² + t − 1 is used in the second inequality of the estimation. Furthermore, we have t − 1 ≥ t − e when e ≥ 1. Therefore, as µ > t − 1, it is actually enough for the requirement t − e < µ that t − 1 ≥ t − e. In other words, the condition t − µ < e is satisfied when n ≥ t² + t − 1 and e ≥ 1. Moreover, even this quite meagre condition on n is more than we actually need due to the rather crude estimates; for example, the condition t − µ < e is met for the values n = 10, e = 1 and t = 5 (as well as for all n > 10 with the same choices of e and t).
As explained before the previous lemma, we can get the lower bound of Corollary 30 as close to 1 as desired for carefully chosen h and α.

In what follows, we have renamed the indices for convenience in such a way that i_4 corresponds to i_1 of Theorem 11 (the index i_1 will be saved for later use in the proof). Moreover, we have
$$W'_w = \left\{(i_2, i_3, i_4) \;\middle|\; \text{for } j \in \{2, 3\}:\ (w+1-\ell)/2 \le i_j \le e+1,\ (w-\ell)/2 \le i_4 \le e,\ w \ge i_2 + i_3 + i_4\right\}.$$
Again, if we have a binomial coefficient with i_j < 0 for some j, then we use the common convention that the binomial coefficient attains the value 0. Then, as in the case d = 2e + 2, we obtain that
$$N_3 = \sum_{w \ge 0} \sum_{(i_2, i_3, i_4) \in W'_w} \binom{n - 3e - 2}{w - i_2 - i_3 - i_4}\binom{e+1}{i_2}\binom{e+1}{i_3}\binom{e}{i_4}.$$
Let us denote i_1 = w − i_2 − i_3 − i_4. We again get that 2i_2 ≥ i_1 + i_2 + i_3 + i_4 − ℓ + 1 and a similar inequality for i_3. Moreover, for i_4, we have 2i_4 ≥ i_1 + i_2 + i_3 + i_4 − ℓ. Since we do not have to take into account the lower bound i_1 ≥ 0 (cases with i_1 < 0 increase the binomial sum by 0) or the cases with i_j > e + 1 for j ∈ {2, 3, 4}, we can consider the system of inequalities (15)–(17). Our goal is to show that this system is equivalent to the system of inequalities (18)–(20). Let us first show that the second system of inequalities follows from the first one.
Inequality (18) follows from the inequalities above, and we obtain Inequality (19) in a similar manner; indeed, the upper bound follows from the fact that i_3 is an integer. Moreover, from Inequalities (16) and (17) we obtain i_2 ≤ ℓ − 1/2 − i_1 − i_4 − 1/2 + i_3 and i_2 ≤ ℓ − 1/2 − i_1 − i_3 + i_4 + 1/2, respectively. Together, these imply the upper bound of Inequality (20). Finally, the lower bound inequality in (20) follows directly from (15). Let us then show that the first system of inequalities follows from the second one. First of all, Inequality (15) follows immediately from Inequality (20). Assume first that i_4 + 1/2 > i_3. Then the upper bound of Inequality (20) is i_2 ≤ ℓ − 1 − i_1 − i_4 + i_3, which implies Inequalities (16) and (17). Notice that we cannot attain the lower bound in Inequality (17) in this case.
Finally, we may add the lower bounds i_j ≥ 0 for all j ∈ {1, 2, 3, 4} due to the binomial coefficient context. Similarly, we notice that if i_1 ≥ ℓ, then i_3 < 0. Thus, we may also add the upper bound i_1 ≤ ℓ − 1. Now we are ready to compare this bound with the bound of Theorem 2 by Yaakobi and Bruck.

The decoder D gives an estimation T_D = T_D(Y) = {x_1, ..., x_{|T_D|}} of the sequence x which we try to reconstruct. We denote by L_D the maximum cardinality of the list T_D(Y) over all possible sets Y of output words. The decoder is said to be successful if x ∈ T_D. In this paper, we focus on the smallest possible value of L_D over all successful decoders D, in other words, on L = min_{D successful} L_D. Let us denote
$$T = T(Y) = C \cap \bigcap_{y \in Y} B_t(y),$$
where Y is a set of N output words.
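For small parameters, the candidate set T(Y) can be computed by brute force. The sketch below is an illustration added here (not the paper's algorithm) and assumes the form T(Y) = C ∩ ⋂_{y∈Y} B_t(y) used in this reconstruction setting:

```python
def hamming(u, v):
    """Hamming distance between two equal-length tuples."""
    return sum(a != b for a, b in zip(u, v))

def candidate_list(code, outputs, t):
    """Brute-force T(Y): the codewords lying within Hamming distance t
    of every output word, i.e. the code intersected with the radius-t
    balls centered at the outputs."""
    return [c for c in code if all(hamming(c, y) <= t for y in outputs)]
```

For instance, with the length-4 repetition code and two outputs of weight one, only the all-zero codeword survives for t = 1, while both codewords survive for t = 3; the list size |T(Y)| is exactly the quantity bounded by L in the discussion above.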
in the proof of Theorem 10 within the notations W_w and W′_w. When w(y_i) = w = 2a + ℓ and we are considering the case S_a, then |supp(y_i) ∩ supp(c_j)| = i_j ≥ a + 1. Let us denote by Y_{2a}, Y_{2a+1}, Y′_{2a} and Y′_{2a+1} the sets of output words contributing to the sums S_a and S′_a, respectively, where Y_w (resp. Y′_w) contains the output words of weight w + ℓ. Let us denote D = [h(e + 1) + 1, n].

The bound of Theorem 2 by Yaakobi and Bruck looks quite different from the bound in Theorem 11. However, Theorem 2 can be obtained as a corollary of Theorem 11, as shown in Corollary 15. The new presentation in Corollary 15 somewhat simplifies the inequalities for the indices compared to Theorem 2.

Corollary 15. Let n ≥ n(e, ℓ, b), b ≥ max{3t, 4e + 4} and ℓ ≥ 2.
Then for any codeword c ∈ T(Y) and any set D ⊆ [1, n] with cardinality |D| = b there exists an output word y ∈ Y such that |supp(y + c) \ D| ≥ ℓ − a − 1.

Proof. Let us assume without loss of generality that c = 0. Thus, w(y) ≤ t for each y ∈ Y. Let us count the number of binary words of weight at most t which have |supp(y) \ D| < ℓ − a − 1. There are at most
Moreover, let C be an e-error-correcting code such that there are at most M codewords in any ball of radius e + a. Then L ≤ max{(t + 1)M, b/(2e + 2a + 2)}.

Proof. Let us denote T(Y) = {c_1, c_2, ..., c_L}. If L ≤ b/(2e + 2a + 2), then the claim follows. Assume now that L > b/(2e + 2a + 2). Assume then, without loss of generality, that c_1 = 0. By Lemma 23, we have w(c_i) ≤ 2e + 2a + 2 for each i ∈ [1, L]. Let Z be a subset of F^n. We say that w ∈ F^n is a central word with respect to Z if d(w, z) ≤ e + a + 1 for all z ∈ Z. Moreover, the set of all central words with respect to Z is denoted by W_Z. In what follows, we first show a useful observation stating that if a subset C_S ⊆ T(Y) is such that 0 ∈ C_S and |C_S| ≤ b/(2e + 2a + 2), then there exists a word w ∈ F^n such that w(w) ≤ e + a + 1 and d(w, c) ≤ e + a + 1 for any c ∈ C_S, i.e., W_{C_S} ≠ ∅. Since w(c) ≤ 2e + 2a + 2 for any c ∈ C_S, we have
$$\left|\bigcup_{c \in C_S} \operatorname{supp}(c)\right| \le (2e + 2a + 2) \cdot \frac{b}{2e + 2a + 2} = b.$$
Therefore, by Lemma 22, there exists an output word y ∈ Y such that |supp(y) \ ⋃_{c∈C_S} supp(c)| ≥ ℓ − 1 − a. Let then w ∈ F^n be such that supp(w) = supp(y) ∩ ⋃_{c∈C_S} supp(c). Now, as d(y, c) ≤ t = e + ℓ for any c ∈ C_S, we have d(w, c) ≤ e + a + 1. Moreover, w(w) ≤ e + a + 1 since 0 ∈ C_S.

$$\lim_{N \to \infty} \left(1 - \Pr[C_t(n, N, \lceil N/2 \rceil)]\right) = 1.$$

Proof. The first claim follows immediately by applying Theorem 25 and Lemma 26. Assuming m and r are integers such that 1 ≤ r ≤ m, we have the following well-known upper bound on the binomial coefficient:
$$\binom{m}{r} \le \left(\frac{me}{r}\right)^{r}.$$
Applying this bound, we obtain
$$\Pr[\overline{C}_t(n, N, \lceil N/2 \rceil - \alpha)] > 1 - \frac{2te}{A}\left(\frac{2te}{An}\right)^{\lceil N/2 \rceil - \alpha},$$
and the lower bound approaches one as N tends to infinity since n > 2t·e/A and ⌈N/2⌉ − α → ∞ due to Inequality (14) and the assumption t − µ < k. Thus, the second claim follows.

Table 1: The lower bound of Theorem 25 together with Lemma 26 and Theorem 27, as well as Monte Carlo approximations with 100000 samples, of the probability that z = x when n = 28, t = 5 and N = 11, 21, 31, 41, 101.

We show that if a certain criterion (see Theorem 28) is met for the outputs Y, then we can verify that the output z of the majority algorithm belongs to B_e(x). For this purpose, notice first that the total number of errors occurring in the ith coordinates of the outputs Y is at least m_i = min{m_{i,0}, m_{i,1}}. On the other hand, at most t errors occur in each channel and, hence, the total number of errors in the channels is at most tN. Thus, we obtain that
$$\sum_{i=1}^{n} m_i \le tN. \qquad (12)$$