SECTION I

LDPC codes are known as a class of capacity-approaching codes, in the sense of Shannon's limit, when decoded with sum-product decoding [1]. Computer experiments have shown that LDPC codes can achieve good error-correcting performance. On the other hand, there is no formula for an accurate error-rate evaluation of sum-product decoding of a given LDPC code. LDPC codes have been adopted in standards for communication products, for example, DVB [2] and wireless communication [3]. For practical use, an extremely small error-rate is expected, and the smaller the required error-rate, the higher the computational cost of computer experiments. The ideal goal of our research is a formula for the accurate error-rate of a given LDPC code. Obtaining such a formula is therefore meaningful not only for theoretical interest but also for the development of consumer equipment.

One research trend on LDPC codes is the theoretical analysis of code space and its related structures, e.g., “minimum distance” [4], [5], [6], “weight distribution” [7], [8], [9], “stopping set” [10], “trapping sets,” “near codewords” [11], “pseudocodewords” [12], [13], and so on. These approaches provide not only bounds or approximate values of error-rates, but also guidelines for the construction of good LDPC codes.

In this paper, we generalize sum-product decoding by introducing an additional parameter “initialization.” In the process, we introduce the concept of correctable error set. While many of the previously mentioned approaches tend to identify errors that are caused by the suboptimality of iterative decoding, we attempt to characterize errors that are guaranteed to be corrected by iterative decoding.

The paper is organized as follows: In Section II, we briefly review sum-product decoding. In Section III, we introduce the concept of a correctable error set for the BSC by fixing parameters of the sum-product decoding. A few examples of correctable error sets are given. In Section IV, we analyze theoretically the word-error-rate of sum-product decoding from the point of view of multivariable functions. In Section V, we establish a relation between the correctable error set and a set of syndromes. This relation allows us to reduce the computational complexity of determining the correctable error set. Section VI presents another result that reduces computational complexity, using the symmetry of the parity-check matrix. In Section VII, we introduce two applications of our results: one gives a relation among the "initialization," the "iteration," and the "word-error-rate"; the other is a computational complexity reduction for computer experiments for quantum LDPC codes. In Section VIII, we conclude the paper and propose directions for future research.

SECTION II

Consider the following general communication scenario: a sender chooses a codeword $c$ of a binary linear error-correcting code associated with an LDPC matrix $H$ and sends $c$ to a receiver over a noisy channel. If the noisy channel is a binary symmetric channel (BSC) with crossover probability $q$, each bit of the codeword $c$ flips with the probability $q$ and the receiver obtains a bit sequence $y$. The receiver inputs the parameters $H$, $l_{\max}$, $q$ and the received word $y$ to a sum-product decoder ${\rm SPDec}$, where $l_{\max}$ is a parameter called the maximum iteration number. The communication succeeds if the output is $c$, and fails otherwise. The word error-rate is the failure probability of the communication scenario. In this paper, we analyze the word error-rate of LDPC codes with sum-product decoding over a BSC.

In this section we review the definition of the sum-product decoding algorithm. The algorithm takes as an additional input a bit sequence $s$, referred to as a syndrome in the following definition. If we set $s=0$, the algorithm becomes the standard sum-product decoding.

Let $H=(H_{m,n})_{1\leq m\leq M, 1\leq n\leq N}$ be a binary matrix of size $M\times N$. Define $$\begin{aligned}A(m)&:=\{1\leq n\leq N\mid H_{m,n}=1\},\\ B(n)&:=\{1\leq m\leq M\mid H_{m,n}=1\}.\end{aligned}$$ The set $A(m)$ (resp. $B(n)$) is called the column (resp. row) index set of the $m$th row (resp. $n$th column). The sum-product decoding algorithm is performed as follows.

**Input:** a binary matrix $H$, a bit sequence $y=(y_{1}, y_{2},\ldots, y_{N})\in\{0,1\}^{N}$, the crossover probability $p\in [0,1]$ of a BSC, an integer $l_{\max}$, and a bit sequence $s=(s_{1},\ldots, s_{M})\in\{0,1\}^{M}$.

**Output:** $c=(c_{1}, c_{2},\ldots, c_{N})\in\{0, 1,\emptyset\}^{N}$.

- Step 1 (initialize): Let $Q^{0}=(q_{m,n}^{(0)})$, $Q^{1}=(q_{m,n}^{(1)})$, $R^{0}=(r_{m,n}^{(0)})$, and $R^{1}=(r_{m,n}^{(1)})$ be matrices of size $M\times N$. For all $1\leq m\leq M$, $1\leq n\leq N$ with $H_{m, n}=1$, set $$q_{m, n}^{(0)}=q_{m,n}^{(1)}=1/2,$$ and set $l=1$.
- Step 2 (row process): For each $1\leq m\leq M$, $n\in A(m)$ and $i\in\{0, 1\}$, compute $$r_{m, n}^{(i)}:=K_{m, n}\sum_{\substack{(c_{1},\ldots,c_{N})\in X(m)^{(s_{m})}\\ c_{n}=i}}\ \prod_{x\in A(m)\setminus\{n\}}q_{m, x}^{(c_{x})}P(y_{x}\vert c_{x}),$$ where $X(m)^{(i)}:=\{c\in\{0,1\}^{N}\vert c H_{m}^{T}=i\}$, $H_{m}$ is the $m$th row of $H$, and $$P(a\vert b):=\begin{cases}1-p&{\rm if\ }a=b,\\ p&{\rm if\ }a\ne b,\end{cases}$$ for $a$, $b\in\{0, 1\}$. $K_{m, n}$ is a normalizing constant such that $r_{m,n}^{(0)}+r_{m, n}^{(1)}=1$.
- Step 3 (column process): For each $1\leq n\leq N$, $m\in B(n)$, and $i=0, 1$, compute $$q_{m, n}^{(i)}:=K^{\prime}_{m, n}\prod_{x\in B(n)\setminus\{m\}}r_{x, n}^{(i)},$$ where $K^{\prime}_{m, n}$ is a normalizing constant such that $q_{m, n}^{(0)}+q_{m, n}^{(1)}=1$.
- Step 4 (temporary word): For $1\leq n\leq N$ and $i=0, 1$, compute $$Q_{n}^{(i)}:=K^{\prime\prime}_{n}P(y_{n}\vert c_{n}=i)\prod_{x\in B(n)}r_{x, n}^{(i)}$$ and $$\hat{c}_{n}:=\begin{cases}0&{\rm if\ }Q_{n}^{(0)}>Q_{n}^{(1)},\\ 1&{\rm if\ }Q_{n}^{(0)}<Q_{n}^{(1)},\\ \emptyset&{\rm if\ }Q_{n}^{(0)}=Q_{n}^{(1)},\end{cases}$$ where $K^{\prime\prime}_{n}$ is a normalizing constant such that $Q_{n}^{(0)}+Q_{n}^{(1)}=1$.
- Step 5 (parity check): If some $\hat{c}_{n}$ is $\emptyset$, go to Step 6. If $(\hat{c}_{1},\ldots,\hat{c}_{N}) H^{T}=s^{T}$, output $(\hat{c}_{1},\ldots,\hat{c}_{N})$ and stop the algorithm. Otherwise, go to Step 6.
- Step 6 (count the iteration number): If $l<l_{\max}$, increment $l$ and go to Step 2. If $l=l_{\max}$, output $(\hat{c}_{1},\ldots,\hat{c}_{N})$ and stop the algorithm.
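The steps above can be sketched in code as follows. This is our own minimal floating-point sketch (the function and variable names are ours, not the paper's); the row process is written as a brute-force sum over the bits of $A(m)\setminus\{n\}$, which is exponential in the row weight and therefore only suitable for toy matrices.

```python
from itertools import product

def syn_dec(H, y, p, l_max, s):
    """Sketch of SynDec[H, y, p, l_max, s]; s = 0 recovers SPDec."""
    M, N = len(H), len(H[0])
    A = [[n for n in range(N) if H[m][n]] for m in range(M)]  # A(m)
    B = [[m for m in range(M) if H[m][n]] for n in range(N)]  # B(n)
    P = lambda a, b: 1 - p if a == b else p
    # Step 1 (initialize): all q-messages start at 1/2
    q = {(m, n): [0.5, 0.5] for m in range(M) for n in A[m]}
    r = {(m, n): [0.5, 0.5] for m in range(M) for n in A[m]}
    c_hat = [None] * N
    for _ in range(l_max):
        # Step 2 (row process): fixing c_n = i forces the parity of the
        # remaining bits of A(m) to be i XOR s_m
        for m in range(M):
            for n in A[m]:
                others = [x for x in A[m] if x != n]
                tot = [0.0, 0.0]
                for bits in product((0, 1), repeat=len(others)):
                    w = 1.0
                    for x, c in zip(others, bits):
                        w *= q[m, x][c] * P(y[x], c)
                    tot[(sum(bits) % 2) ^ s[m]] += w
                z = tot[0] + tot[1]                    # K_{m,n}
                r[m, n] = [tot[0] / z, tot[1] / z]
        # Step 3 (column process)
        for n in range(N):
            for m in B[n]:
                pr = [1.0, 1.0]
                for x in B[n]:
                    if x != m:
                        pr[0] *= r[x, n][0]
                        pr[1] *= r[x, n][1]
                z = pr[0] + pr[1]                      # K'_{m,n}
                q[m, n] = [pr[0] / z, pr[1] / z]
        # Step 4 (temporary word); None plays the role of the erasure
        # symbol (the normalization K''_n is omitted, since only the
        # comparison of Q^{(0)} and Q^{(1)} matters)
        for n in range(N):
            Q = [P(y[n], 0), P(y[n], 1)]
            for x in B[n]:
                Q[0] *= r[x, n][0]
                Q[1] *= r[x, n][1]
            c_hat[n] = 0 if Q[0] > Q[1] else 1 if Q[1] > Q[0] else None
        # Step 5 (parity check)
        if None not in c_hat:
            syn = [sum(c_hat[n] for n in A[m]) % 2 for m in range(M)]
            if syn == list(s):
                return c_hat
    return c_hat  # Step 6: maximum iteration number reached

# toy example: length-3 repetition code with one bit error
H = [[1, 1, 0], [0, 1, 1]]
assert syn_dec(H, [0, 0, 1], 0.1, 10, [0, 0]) == [0, 0, 0]
```

With $s=Hy^{T}$ and the received word replaced by the all-zero word, the same routine acts as the syndrome decoder discussed in Section V.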

In general, the input $p$ may differ from the channel crossover probability $q$ for various reasons: for example, $p$ might be a quantized value of $q$, or $q$ might not be known accurately.

In Step 4, the temporary bit is set to $\emptyset$ if $Q_{n}^{(0)}=Q_{n}^{(1)}$. This convention is required to properly define the correctable error set, which is one of the motivations of this paper (see Section III).

Another popular definition of sum-product decoding replaces Steps 2 and 3 with the following:

- Step 2' (row process): For each $1\leq m\leq M$, $n\in A(m)$ and $i\in\{0, 1\}$, compute $$r_{m, n}^{(i)}:=K_{m, n}\sum_{(c_{1},\ldots,c_{N})\in X(m)^{(i\oplus s_{m})}}\prod_{x\in A(m)\setminus\{n\}}q_{m, x}^{(c_{x})},$$ where $X(m)^{(i)}:=\{c\in\{0, 1\}^{N}\vert c H_{m}^{T}=i\}$, $H_{m}$ is the $m$th row of $H$, $\oplus$ is the XOR operation, and $P$ is defined as in Step 2 above. $K_{m, n}$ is a normalizing constant such that $r_{m, n}^{(0)}+r_{m, n}^{(1)}=1$.
- Step 3' (column process): For each $1\leq n\leq N$, $m\in B(n)$, and $i=0, 1$, compute $$q_{m,n}^{(i)}:=K^{\prime}_{m, n}P(y_{n}\vert c_{n}=i)\prod_{x\in B(n)\setminus\{m\}}r_{x, n}^{(i)},$$ where $K^{\prime}_{m, n}$ is a normalizing constant such that $q_{m, n}^{(0)}+q_{m, n}^{(1)}=1$.

These replacements define an equivalent algorithm, i.e., the two algorithms always produce the same output.

SECTION III

Assuming $s=0$, we denote the output of the sum-product decoding as ${\rm SPDec}[H, y, p, l_{\max}]$. Then for a linear code the following statements are equivalent:

- ${\rm SPDec}[H, y, p, l_{\max}]=0$,
- ${\rm SPDec}[H, c+y, p, l_{\max}]=c$,

where $c$ is any codeword defined by $H$. This suggests that we can define the **correctable error set** for an LDPC code, under a fixed initialization $p$ and maximal iteration number $l_{\max}$, as the set of error patterns corrected by ${\rm SPDec}$. Let the correctable error set be denoted by ${\cal E}_{H, p, l_{\max}}$. Note that if we change $p$ or $l_{\max}$, the correctable set ${\cal E}_{H, p, l_{\max}}$ changes. Significantly, ${\cal E}_{H, p, l_{\max}}$ is independent of the crossover probability $q$ of the channel.

Let the parity-check matrix $H_{1}$ be the $10\times 20$ matrix $$H_{1}=\begin{pmatrix}10000100001000010000\\ 01000010000100001000\\ 00100001000010000100\\ 00010000100001000010\\ 00001000010000100001\\ 10000010000010000010\\ 01000001000001000001\\ 00100000100000110000\\ 00010000011000001000\\ 00001100000100000100\end{pmatrix}.$$

We can obtain the correctable error set ${\cal E}_{H_{1}, p, l_{\max}}$ by exhaustive search for fixed $p$ and $l_{\max}$.

Case $p=0.258$ and ${l}_{\max}=5$: ${\cal E}_{H_{1}, 0.258, 5}=\{0\}$, where 0 is the all-zero vector.

Case $p=0.258$ and ${l}_{\max}=16$: ${\cal E}_{H_{1}, 0.258, 16}=\{0\}$.

Case $p=0.257$ and ${l}_{\max}=5$: ${\cal E}_{H_{1}, 0.257, 5}=\{0\}$.

Case $p=0.257$ and $l_{\max}=16$: ${\cal E}_{H_{1}, 0.257, 16}=\{0,e_{1},\ldots, e_{20}\}\cup E$, where $e_{i}$ is the $i$th unit vector, and $$\begin{aligned}E=\{&(10000000000000000001),(00000010000000100000),\\ &(01000000000000010000),(00000000010010000000),\\ &(00000001001000000000),(00100000000000001000),\\ &(00001000000000000010),(00000100000001000000),\\ &(00000000100100000000),(00010000000000000100)\}.\end{aligned}$$

Case $p=0.220$ and ${l}_{\max}=5$: ${\cal E}_{H_{1}, 0.220, 5}=\{0,e_{1},\ldots, e_{20}\}\cup E$.

Case $p=0.220$ and ${l}_{\max}=16$: ${\cal E}_{H_{1}, 0.220, 16}=\{0,e_{1},\ldots, e_{20}\}\cup E$.

The cardinality of ${\cal E}_{H_{1}, p, l_{\max}}$ for $l_{\max}=16$ is plotted in Fig. 1 as a function of $p$. For this case, we observe that the best values of $p$ represented in Fig. 1 are in the set $\{0.019, 0.020, 0.021, 0.022\}$. $\hfill{\square}$
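The exhaustive search used in this example simply decodes every vector of $\{0,1\}^{N}$ and keeps those that decode to the zero word. Below is a self-contained sketch of that loop; for brevity it uses a deterministic serial bit-flipping decoder on the [7,4] Hamming code as a stand-in for ${\rm SPDec}$ (the matrix, decoder, and all names here are ours, not the paper's), but the enumeration is identical for any decoder.

```python
from itertools import product

# [7,4] Hamming code: the columns of H are the nonzero vectors of F_2^3
H = [[1, 0, 1, 0, 1, 0, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]
M, N = len(H), len(H[0])

def bf_dec(y, l_max=10):
    """Serial bit-flipping: flip the bit with the worst check score."""
    c = list(y)
    for _ in range(l_max):
        syn = [sum(H[m][n] * c[n] for n in range(N)) % 2 for m in range(M)]
        if not any(syn):
            break
        # score = (#unsatisfied - #satisfied) checks containing bit n
        score = [sum(1 if syn[m] else -1 for m in range(M) if H[m][n])
                 for n in range(N)]
        c[max(range(N), key=score.__getitem__)] ^= 1   # ties: lowest index
    return tuple(c)

zero = (0,) * N
E = {e for e in product((0, 1), repeat=N) if bf_dec(e) == zero}
units = {tuple(int(i == n) for i in range(N)) for n in range(N)}
# for this decoder the correctable error set is exactly the zero word
# plus the seven single-bit errors (one correctable error per syndrome)
assert E == {zero} | units
```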

Table I shows the weight distribution of the correctable error set, obtained with $l_{\max}=16$ and initialization $p=0.01$, for an array-type (3, 11) LDPC code, known as an FSA code of type (3, 11) [6]. This is a quasi-cyclic LDPC code with base model matrix $(m_{i,j})_{1\leq i\leq 3, 1\leq j\leq 11}$, $m_{i,j}:=(i-1)\times (j-1)$, and circulant size 11; its length is 121 $(=11\times 11)$. We observe that for this value of $p$, ${\rm SPDec}$ achieves bounded distance decoding, since the minimum distance of this code is 6. Furthermore, 87% of the weight-3 error patterns are also correctable.

From Table I, we observe that the error of weight 121 is correctable for the array-type (3, 11) LDPC code. In fact, this phenomenon occurs for any $0\leq p<0.5$ and $l_{\max}\geq 1$ whenever every row weight of the parity-check matrix is odd. The reason is the following: in Step 2 of ${\rm SPDec}$, $r_{m, n}^{(i)}$ is obtained from the bits $x\in A(m)\setminus\{n\}$. Under the assumption that the row weight is odd, the value $r_{m,n}^{(i)}$ for the all-one error is the same as the value for the all-zero error. Therefore, the all-one error is correctable, "without iteration."
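The argument above can be checked numerically for a single check row: with uniform $q$-messages and $s_{m}=0$, the Step-2 message computed from the all-one received word coincides with the one computed from the all-zero word exactly when the row degree is odd. A small sketch (the helper function is ours):

```python
from itertools import product

def row_message(d, y_bit, p):
    """Step-2 message r^{(i)} for one degree-d check; every received
    bit equals y_bit, q-messages are uniform 1/2, and s_m = 0."""
    P = lambda a, b: 1 - p if a == b else p
    tot = [0.0, 0.0]
    for bits in product((0, 1), repeat=d - 1):  # the bits of A(m)\{n}
        i = sum(bits) % 2                       # c_n = i forces this parity
        w = 1.0
        for c in bits:
            w *= 0.5 * P(y_bit, c)
        tot[i] += w
    z = tot[0] + tot[1]
    return [tot[0] / z, tot[1] / z]

close = lambda u, v: all(abs(a - b) < 1e-12 for a, b in zip(u, v))
# odd degree: the all-one word is indistinguishable from the all-zero word
assert close(row_message(3, 1, 0.2), row_message(3, 0, 0.2))
# even degree: the two messages are swapped instead
assert close(row_message(4, 1, 0.2), row_message(4, 0, 0.2)[::-1])
```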

SECTION IV

In Section III, it was observed that the correctable error set ${\cal E}_{H, p, l_{\max}}$ is independent of the BSC crossover probability $q$. This suggests treating $p$ as a separate input to ${\rm SPDec}$. We call $p$ the **initialization** of the decoder. As a result, the word-error-rate is regarded as a function of the two variables $p$ and $q$ for fixed parameters $H$ and $l_{\max}$. Let it be denoted by ${\rm WER}_{H, l_{\max}}(p,q)$. With this notation, the word-error-rate of the original sum-product decoding is equal to ${\rm WER}_{H, l_{\max}}(q,q)$. The following theorem presents properties of ${\rm WER}_{H, l_{\max}}(p,q)$:

Let $H$ be a parity-check matrix and ${\rm WER}_{H, l_{\max}}(p,q)$ the word-error-rate for an initialization $p$, a crossover probability $q$, and the maximum iteration number $l_{\max}$. Then we have the following:

- For a fixed initialization $p_{0}$, ${\rm WER}(p_{0}, q)$ is a polynomial in $q$.
- For any initialization $p$, ${\rm WER}(p, 1/2)\times 2^{N}$ is an integer, where $N$ is the number of columns of $H$. Furthermore, ${\rm WER}(p, 1/2)$ is a discrete function of $p$.
- For some parity-check matrix $H$, ${\rm WER}(p,p)$, the word-error-rate of the original sum-product decoding, is not a continuous function of $p$.

Theorem IV.1 3) implies that there may be no polynomial representation of the word-error-rate ${\rm WER}(p, p)$ as a function of $p$ for some LDPC codes. On the other hand, by Theorem IV.1 1) for a fixed initialization $p_{0}$, there exists a polynomial representation of the word-error-rate ${\rm WER}(p_{0}, q)$ as a function of $q$ for any LDPC code. This fact motivates us to investigate sum-product decoding with a fixed initialization.

Before the proof is given, we recall the weight enumerator: for a set ${\cal E}\subset\{0,1\}^{N}$, the weight enumerator with variables $X$ and $Y$ is defined as $$A_{\cal E}[X, Y]:=\sum_{e\in{\cal E}}X^{N-{\rm wt}(e)}Y^{{\rm wt}(e)},$$ where ${\rm wt}(e)$ is the Hamming weight of $e$.

- As we pointed out, it is possible to define the correctable error set ${\cal E}_{H, p_{0}, l_{\max}}$ by fixing $H$, $p_{0}$, and $l_{\max}$. Let $A_{{\cal E}_{H, p_{0}, l_{\max}}}[X, Y]$ be the weight enumerator of ${\cal E}_{H, p_{0}, l_{\max}}$. Over a BSC with crossover probability $q$, the probability that the error vector is contained in ${\cal E}_{H, p_{0}, l_{\max}}$ is equal to $A_{{\cal E}_{H, p_{0}, l_{\max}}}[1-q, q]$. This implies that $${\rm WER}(p_{0}, q)=1-A_{{\cal E}_{H, p_{0}, l_{\max}}}[1-q, q],$$ which is a polynomial in $q$.
- By the argument above, we have $${\rm WER}(p, 1/2)=1-A_{\cal E}[1/2, 1/2].$$ On the other hand, $$A_{\cal E}[1/2, 1/2]=\sum_{e\in{\cal E}}(1/2)^{N}=\vert{\cal E}\vert\times2^{-N},$$ where $\vert A\vert$ is the cardinality of the set $A$. Therefore, $${\rm WER}(p, 1/2)\times 2^{N}=2^{N}-\vert{\cal E}\vert,$$ which is an integer.
- Define a finite set ${\cal A}\subset\mathbb{Z}[X, Y]$ as $${\cal A}:=\left\{\sum a_{i}X^{N-i}Y^{i}\ \middle|\ a_{0}=1,\ 0\leq a_{i}\leq 2^{N}\right\},$$ where $\mathbb{Z}[X, Y]$ is the ring of polynomials in $X$ and $Y$ over the integers. Then any weight enumerator of a correctable error set is an element of ${\cal A}$, since $${\rm SPDec}[H, (0, 0, \ldots, 0), p, l_{\max}]=0\tag{1}$$ for any $H$, $0<p<1/2$, $l_{\max}>0$, and there is only one vector of Hamming weight zero, so that $a_{0}=1$. Note that the cardinality of $\{0,1\}^{N}$ is $2^{N}$; therefore the cardinality of a correctable error set is at most $2^{N}$, which implies $a_{i}\leq 2^{N}$.

The word-error-rate ${\rm WER}(q, q)$ over a BSC with crossover probability $q$ is represented by $${\rm WER}(q, q)=1-f_{i}[1-q, q]$$ for some $f_{i}\in{\cal A}$. Since $f_{i}$ is a polynomial, $f_{i}[1-q, q]$ is a function of $q$, and each $1-f_{i}[1-q, q]$ is a continuous function.

Next, we observe the discontinuity for the matrix $H_{1}$ of Example III.1 with $l_{\max}=5, 16$. We have $${\cal E}_{H_{1}, 0.258, l_{\max}}=\{(0,0,\ldots,0)\},$$ whose weight enumerator is $$f_{0}=X^{20}.$$ Then $${\rm WER}(0.258, 0.258)=1-f_{0}[1-0.258, 0.258].$$ Recall that $${\cal E}_{H_{1}, 0.220, l_{\max}}\ne\{(0,0,\ldots,0)\}.$$ Denoting the weight enumerator at $q=0.220$ by $f_{1}$, we obtain $$f_{1}=X^{20}+20 X^{19}Y+10 X^{18}Y^{2}.$$ This implies $f_{0}\ne f_{1}$.

On the other hand, $f_{0}[1-q, q]$ never crosses any other $f[1-q, q]$, $f\in{\cal A}$, in the interval $(0, 0.5)$, since the coefficients of $f[X,Y]-f_{0}[X,Y]$ are nonnegative and at least one of them is positive. By (1), every $f[X, Y]$ has the form $$f[X, Y]=X^{20}+\sum_{1\leq i\leq 20}a_{i}X^{20-i}Y^{i},$$ where each $a_{i}$ is a nonnegative integer. Since $f_{0}[X, Y]=X^{20}$, we have $$f[X, Y]-f_{0}[X, Y]=\sum_{1\leq i\leq 20}a_{i}X^{20-i}Y^{i}.$$ Note that $f[X, Y]-f_{0}[X, Y]=0$ if and only if $f[X, Y]=f_{0}[X, Y]$. Recall that $$f_{0}[1-0.258, 0.258]=1-{\rm WER}(0.258, 0.258),\qquad f_{1}[1-0.220, 0.220]=1-{\rm WER}(0.220, 0.220),$$ and $f_{1}\ne f_{0}$. Therefore, the word-error-rate is discontinuous in the interval $(0.220, 0.258)$. $\blacksquare$

From the proof of 2), we have $\vert{\cal E}\vert=2^{N}\times (1-{\rm WER}(p,1/2))$. Therefore, we can easily read the word-error-rate ${\rm WER}(p,1/2)$ from the graph of the cardinalities of the correctable error set (see Fig. 1).
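As a numeric illustration of 1) and 2), the correctable error set of Example III.1 at $p=0.220$ (the zero vector, the twenty unit vectors, and the ten weight-two vectors of $E$) yields the enumerator $f_{1}$ above and the integer property of ${\rm WER}(p, 1/2)$. A small sketch (variable names are ours):

```python
weight2 = ["10000000000000000001", "00000010000000100000",
           "01000000000000010000", "00000000010010000000",
           "00000001001000000000", "00100000000000001000",
           "00001000000000000010", "00000100000001000000",
           "00000000100100000000", "00010000000000000100"]
N = 20
errors = {(0,) * N}                                        # the zero vector
errors |= {tuple(int(i == n) for i in range(N)) for n in range(N)}  # e_1..e_20
errors |= {tuple(map(int, w)) for w in weight2}            # the set E

a = [0] * (N + 1)                 # a_i = number of set elements of weight i
for e in errors:
    a[sum(e)] += 1
assert a[:3] == [1, 20, 10] and not any(a[3:])   # f_1 = X^20 + 20X^19Y + 10X^18Y^2

def wer(q):
    """WER(p0, q) = 1 - A_E[1-q, q], a polynomial in q."""
    return 1 - sum(ai * (1 - q) ** (N - i) * q ** i for i, ai in enumerate(a))

# Theorem IV.1 2): WER(p, 1/2) * 2^N = 2^N - |E| is an integer (here 2^20 - 31)
assert abs((1 - wer(0.5)) * 2 ** N - len(errors)) < 1e-9
```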

The key point of the proof of 3) is not only the distinctness of the $f_{i}$'s but also the finiteness of the set ${\cal A}$. In fact, if the cardinality is infinite, it is possible for $f$ to be continuous. For example, define $${\cal A}=\{f_{i}\mid f_{i}(x):=i\ \ (0\leq i, x\leq 1)\}$$ as a set of constant functions. Although $f_{i}\ne f_{j}$ for all $i\ne j$, the function $$f(x):=f_{x}(x)=x$$ is continuous.

The finiteness of ${\cal A}$ in the proof of 3) is due to the finiteness of the set of error vectors. This is one of the reasons we assume the communication channel is a BSC.

SECTION V

From Theorem IV.1 2), the word-error-rate of fixed initialization decoding (FID) is closely related to the cardinality of the correctable error set. In this section, we discuss properties of correctable error sets.

We denote the output of the sum-product syndrome decoding by ${\rm SynDec}[H, y, p, l_{\max}, s]$, which emphasizes the input $s$, called a syndrome.

For any $H$, $y$, $p$, $l_{\max}$, the following two statements are equivalent:

- ${\rm SPDec}[H, y, p, l_{\max}]=0$
- ${\rm SynDec}[H, 0, p, l_{\max}, H y^{T}]=y$

Theorem V.1 follows from the following proposition:

Let $Q$ and $R$ be the matrices updated by ${\rm SPDec}[H, y, p, l_{\max}]$ and let $\bar{Q}$ and $\bar{R}$ be the matrices updated by ${\rm SynDec}[H, 0, p, l_{\max}, H y^{T}]$.

For any $1\leq m\leq M$, $1\leq n\leq N$, and $1\leq l\leq l_{\max}$, the following hold over a binary symmetric channel: $$q_{m,n}^{(i)}=\bar{q}_{m,n}^{(i\oplus y_{n})},\quad r_{m,n}^{(i)}=\bar{r}_{m,n}^{(i\oplus y_{n})},\quad Q_{n}^{(i)}=\bar{Q}_{n}^{(i\oplus y_{n})}.$$

The key idea of the proof is to observe the two decoders ${\rm SPDec}$ and ${\rm SynDec}$ simultaneously.

For Step 1 (initialize), we remark that $$q_{m,n}^{(i)}=\bar{q}_{m,n}^{(i\oplus y_{n})}=1/2,\quad r_{m,n}^{(i)}=\bar{r}_{m,n}^{(i\oplus y_{n})}=1/2.$$

For $1\leq l\leq l_{\max}$, we prove by induction that the proposition holds at the $(l+1)$th iteration, assuming it holds at the $l$th iteration.

For Step 2 (row process), since the channel is a binary symmetric channel, $$\begin{aligned}r_{m,n}^{(i)}&= K_{m,n}\sum_{(c_{1},\ldots,c_{N})\in X(m)^{(0)}}\prod_{x\in A(m)\setminus\{n\}}q_{m,x}^{(c_{x})}P(y_{x}\vert c_{x})\\ &= K_{m,n}\sum_{(c_{1},\ldots, c_{N})\in X(m)^{(0)}}\prod_{x\in A(m)\setminus\{n\}}q_{m, x}^{(c_{x})}P(0\vert c_{x}\oplus y_{x}).\end{aligned}$$ Since $q_{m,x}^{(c_{x})}=\bar{q}_{m,x}^{(c_{x}\oplus y_{x})}$ holds at the previous iteration, the last term is $$K_{m,n}\sum_{(c_{1},\ldots, c_{N})\in X(m)^{(0)}}\prod_{x\in A(m)\setminus\{n\}}\bar{q}_{m, x}^{(c_{x}\oplus y_{x})}P(0\vert c_{x}\oplus y_{x}).$$ Since $H y^{T}=s$, the term above is equal to $$K_{m,n}\sum_{(c_{1}\oplus y_{1},\ldots, c_{N}\oplus y_{N})\in X(m)^{(s_{m})}}\prod_{x\in A(m)\setminus\{n\}}\bar{q}_{m,x}^{(c_{x}\oplus y_{x})}P(0\vert c_{x}\oplus y_{x}).$$ By the definition of $\bar{r}_{m,n}^{(i\oplus y_{n})}$, this is equal to $\bar{r}_{m,n}^{(i\oplus y_{n})}$.

For Step 3 (column process), since $r_{x,n}^{(i)}=\bar{r}_{x,n}^{(i\oplus y_{n})}$ holds as shown above, $$q_{m,n}^{(i)}= K^{\prime}_{m,n}\prod_{x\in B(n)\setminus\{m\}}r_{x,n}^{(i)}= K^{\prime}_{m,n}\prod_{x\in B(n)\setminus\{m\}}\bar{r}_{x,n}^{(i\oplus y_{n})}=\bar{q}_{m,n}^{(i\oplus y_{n})}.$$

For Step 4 (temporary word), since $r_{x,n}^{(i)}=\bar{r}_{x,n}^{(i\oplus y_{n})}$ holds as shown above, $$\begin{aligned}Q_{n}^{(i)}&= K^{\prime\prime}_{n}P(y_{n}\vert c_{n}=i)\prod_{x\in B(n)}r_{x,n}^{(i)}\\ &= K^{\prime\prime}_{n}P(0\vert c_{n}=i\oplus y_{n})\prod_{x\in B(n)}\bar{r}_{x,n}^{(i\oplus y_{n})}\\ &=\bar{Q}_{n}^{(i\oplus y_{n})}.\end{aligned}$$ Therefore, we obtain the proposition. $\blacksquare$

At the $l_{0}$th iteration of the decoding, we have ${\rm SPDec}[H, y, p, l_{\max}]=0$

- $\iff$ for $1\leq l<l_{0}$, the parity check is not satisfied at the $l$th iteration, and at the $l_{0}$th iteration we have $Q_{n}^{(0)}>Q_{n}^{(1)}$ for all $1\leq n\leq N$;
- $\iff$ for $1\leq l<l_{0}$, the parity check is not satisfied at the $l$th iteration, and at the $l_{0}$th iteration we have $\bar{Q}_{n}^{(y_{n})}>\bar{Q}_{n}^{(1\oplus y_{n})}$ for all $1\leq n\leq N$, by Proposition V.1;
- $\iff$ at the $l_{0}$th iteration, we have ${\rm SynDec}[H, 0, p, l_{\max}, H y^{T}]=y$.

$\blackboxfill$

For $H$, $p$, and $l_{\max}$, $${\cal E}_{H, p,l_{\max}}=\{y\in\{0,1\}^{N}\mid {\rm SynDec}[H, 0, p,l_{\max}, s]=y\ \text{for some}\ s\in\{0,1\}^{M}\}.$$

We call a syndrome $s\in\{0, 1\}^{M}$ a **decodable syndrome** if there exists $y\in{\cal E}_{H, p, l_{\max}}$ such that ${\rm SynDec}[H, 0, p, l_{\max}, s]=y$. Let us denote the set of the decodable syndromes by ${\cal D}_{H, p, l_{\max}}$.

Thanks to Corollary V.2, we can determine the correctable error set whenever the co-dimension $N-K$ is small. For example, the code of Example III.2 has $N=121$ and $K=90$, so an exhaustive search over all $2^{121}$ error patterns is impossible. However, the co-dimension of the parity-check matrix is $N-K=31$, so the $2^{31}$ syndromes can be enumerated instead; this is how Table I is obtained, by Corollary V.2.
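The syndrome-side enumeration of Corollary V.2 can be sketched on a toy example, again with a deterministic serial bit-flipping stand-in decoder on the [7,4] Hamming code (the matrix and names are ours): one decoding per syndrome, here $2^{3}=8$ decodings instead of $2^{7}=128$, recovers the whole correctable error set.

```python
from itertools import product

H = [[1, 0, 1, 0, 1, 0, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]
M, N = len(H), len(H[0])
syndrome = lambda c: tuple(sum(H[m][n] * c[n] for n in range(N)) % 2
                           for m in range(M))

def bf_syn_dec(s, l_max=10):
    """Decode the all-zero word toward target syndrome s (bit-flipping)."""
    c = [0] * N
    for _ in range(l_max):
        unsat = [a ^ b for a, b in zip(syndrome(c), s)]
        if not any(unsat):
            break
        score = [sum(1 if unsat[m] else -1 for m in range(M) if H[m][n])
                 for n in range(N)]
        c[max(range(N), key=score.__getitem__)] ^= 1
    return tuple(c)

E = set()
for s in product((0, 1), repeat=M):      # 2^M syndromes, not 2^N errors
    y = bf_syn_dec(s)
    if syndrome(y) == s:                 # s is a decodable syndrome
        E.add(y)

zero = (0,) * N
units = {tuple(int(i == n) for i in range(N)) for n in range(N)}
assert E == {zero} | units               # same set as the error-side search
```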

Table I allows us to obtain a theoretical formula for ${\rm WER}(0.01, q)$ with $l_{\max}=16$ for an FSA code of type (3, 11). Fig. 2 compares the theoretical values with the computer experimental results for the word-error-rate over a BSC. Assuming an AWGN channel with one-bit quantized output, the SNR values 4.0, 5.0, 6.0, 7.0, and 8.0 dB correspond to crossover probabilities $q=0.026615$, 0.015044, 0.007475, 0.003162, and 0.001093, respectively. Note that the numerical calculations in the computer experiments are performed in floating point, while our analysis of the sum-product decoding is theoretical. We observe that the theoretical and experimental values match well, as expected. Since we have a theoretical formula as a polynomial in $q$, it is possible to calculate word-error-rates at high SNR values. For example, ${\rm WER}(0.01, 0.00001)=9.04\times 10^{-12}$.
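The SNR-to-crossover-probability conversion quoted above appears consistent with hard-decision BPSK demodulation at rate $R=90/121$, i.e., $q=Q(\sqrt{2R\cdot 10^{{\rm SNR}/10}})$. This mapping is our reconstruction and is not stated explicitly in the text; the sketch below checks it against the quoted values.

```python
from math import erfc, sqrt

def crossover(snr_db, rate=90 / 121):
    """Hard-decision BPSK: q = Q(sqrt(2 * R * Eb/N0)) (our assumption)."""
    ebn0 = 10 ** (snr_db / 10)
    x = sqrt(2 * rate * ebn0)
    return 0.5 * erfc(x / sqrt(2))       # the Gaussian Q-function

quoted = {4.0: 0.026615, 5.0: 0.015044, 6.0: 0.007475,
          7.0: 0.003162, 8.0: 0.001093}
assert all(abs(crossover(snr) - q) < 2e-5 for snr, q in quoted.items())
```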

SECTION VI

Let $H$ be an $M\times N$ low-density parity-check matrix and let $\sigma$ be an $M\times M$ permutation matrix on the index set $[M]:=\{1, 2,\ldots, M\}$. The permutation $\sigma$ acts naturally on the rows of $H$. Let $\sigma H$ denote the matrix obtained from $H$ by permuting its rows by $\sigma$, and let $\sigma s$ denote the permuted vector of a column vector $s$ by $\sigma$. Similarly, $\tau$ denotes an $N\times N$ permutation matrix on the index set $[N]$, which acts on the columns of $H$. Let $H\tau$ denote the matrix obtained from $H$ by permuting its columns by $\tau$, and let $s\tau$ denote the permuted vector of a row vector $s$ by $\tau$.

The following is a natural observation:

Let $0\leq p\leq 1$ and let $l_{\max}$ be a positive integer. For any sequence $y\in\mathbb{F}_{2}^{N}$, the following are equivalent:

- ${\rm SPDec}[H, y, p, l_{\max}]=0$,
- ${\rm SPDec}[\sigma H, y, p, l_{\max}]=0$,
- ${\rm SPDec}[H\tau, y\tau, p, l_{\max}]=0$,

for any permutation $\sigma$ of the row index set of $H$ and any permutation $\tau$ of the column index set of $H$.

The equivalence of 1. and 2. follows directly from the definition of the sum-product decoding, since a row permutation of the parity-check matrix does not change the temporary word in Step 4.

The equivalence of 1. and 3. is obtained from the following equality: $${\rm SPDec}[H\tau, y\tau, p, l_{\max}]={\rm SPDec}[H, y, p, l_{\max}]\tau.$$ The equality follows from the definition of the sum-product decoding. $\blacksquare$

In the previous section, we discussed the relation between syndrome decoding and the correctable error set. The following is a statement similar to Proposition VI.1.

Let $0\leq p\leq 1$ and let $l_{\max}$ be a positive integer. For any sequence $y\in\mathbb{F}_{2}^{N}$, the following are equivalent:

- ${\rm SynDec}[H, 0, p, l_{\max}, Hy^{T}]=y$,
- ${\rm SynDec}[H\tau, 0, p, l_{\max}, Hy^{T}]=y\tau$,

for any permutation $\tau$ of columns of $H$.

The proof is similar to that of Proposition VI.1. $\blacksquare$

The parity-check matrix $H$ of an LDPC code can be characterized by a bipartite graph $([M],[N], H)$ with vertex sets $[M]=\{1, 2,\ldots, M\}$ and $[N]=\{1, 2,\ldots, N\}$. This bipartite graph is called the Tanner graph of $H$. It is therefore natural to define an automorphism of an LDPC code as a graph automorphism of its Tanner graph, whereas an automorphism of a linear code is defined as an index permutation which stabilizes the code space. We define an automorphism $(\sigma,\tau)$ of an LDPC code with parity-check matrix $H$ as a pair of index permutations on $[M]$ and $[N]$ which satisfies $$\sigma^{-1}H\tau=H.$$ If we define the product of automorphisms $(\sigma_{1},\tau_{1})$ and $(\sigma_{2},\tau_{2})$ by $$(\sigma_{1},\tau_{1})\times (\sigma_{2},\tau_{2}):=(\sigma_{1}\sigma_{2},\tau_{1}\tau_{2}),$$ then the automorphisms constitute a finite group ${\rm Aut}(H)$. Note that we regard $\sigma$ and $\tau$ as permutation matrices of size $M\times M$ and $N\times N$, respectively, since they act on $H$ as index permutations.
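A concrete instance of the condition $\sigma^{-1}H\tau=H$ (a toy example of ours, phrased directly in terms of index shifts to sidestep matrix-convention details): for a circulant parity-check matrix, the pair of cyclic shifts of the row and column indices is an automorphism of the Tanner graph.

```python
N = 5
first_row = [1, 1, 0, 1, 0]    # an arbitrary circulant pattern (ours)
H = [[first_row[(n - m) % N] for n in range(N)] for m in range(N)]

# shifting the row index by 1 (sigma) and the column index by 1 (tau)
# maps entry (m, n) to (m + 1, n + 1) and leaves the circulant fixed
H_shifted = [[H[(m + 1) % N][(n + 1) % N] for n in range(N)]
             for m in range(N)]
assert H_shifted == H          # (sigma, tau) is an automorphism of H
```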

Since an automorphism stabilizes the Tanner graph, we obtain the following result:

Let $H$ be a parity-check matrix. Let ${\cal E}_{H, p, l_{\max}}$ (resp. ${\cal D}_{H, p, l_{\max}}$) be a correctable error set (resp. a decodable syndrome set) with initialization $p$ and maximal iteration number $l_{\max}$.

- For any error vector $y$ and any automorphism $(\sigma,\tau)\in{\rm Aut}(H)$, $$y\in{\cal E}_{H, p, l_{\max}}\iff y\tau\in{\cal E}_{H, p,l_{\max}}.$$
- For any syndrome $s$ and any automorphism $(\sigma,\tau)\in{\rm Aut}(H)$, $$s\in{\cal D}_{H, p, l_{\max}}\iff\sigma s\in{\cal D}_{H, p,l_{\max}}.$$

- 1) By the definition of a correctable error set, $$y\in{\cal E}_{H, p, l_{\max}}\iff {\rm SPDec}[H, y, p, l_{\max}]=0.$$ By Proposition VI.1, $${\rm SPDec}[H, y, p, l_{\max}]=0\iff {\rm SPDec}[H\tau, y\tau, p, l_{\max}]=0.$$ Since $(\sigma,\tau)$ is an automorphism of $H$, $${\rm SPDec}[H\tau, y\tau, p, l_{\max}]=0\iff{\rm SPDec}[\sigma H, y\tau, p, l_{\max}]=0.$$ By applying Proposition VI.1 again, $${\rm SPDec}[\sigma H, y\tau, p, l_{\max}]=0\iff{\rm SPDec}[H, y\tau, p, l_{\max}]=0\iff y\tau\in{\cal E}_{H, p, l_{\max}}.$$
- 2) Let $s\in{\cal D}_{H, p, l_{\max}}$ and $y={\rm SynDec}[H, 0, p, l_{\max}, s]$. Then $y\in{\cal E}_{H, p, l_{\max}}$ and $H y^{T}=s$. If $(\sigma,\tau)$ is an automorphism of $H$ (i.e., $\sigma^{-1}H\tau=H$), then $(\sigma^{-1},\tau^{-1})$ is an automorphism too (i.e., $H=\sigma H\tau^{-1}$, equivalently $H\tau=\sigma H$). By 1) applied to the automorphism $(\sigma^{-1},\tau^{-1})$, $y\in{\cal E}_{H, p, l_{\max}}$ implies $y\tau^{-1}\in{\cal E}_{H, p, l_{\max}}$. Note that $\tau^{T}=\tau^{-1}$, since $\tau$ is a permutation matrix. Therefore, $$H (y\tau^{-1})^{T}=H\tau y^{T}=(\sigma H) y^{T}=\sigma s.$$ Thus $\sigma s\in{\cal D}_{H, p, l_{\max}}$. $\blacksquare$

Proposition VI.3 generalizes the main theorem of [16], that an error vector obtained by a quasi-cyclic permutation of a correctable error is also a correctable error. Matsunaga et al. pointed out in [16] that this result reduces the computational cost of computer experiments for calculating bit error-rates. We point out that the same idea is applicable to reducing the computational cost of determining the correctable error set. In group-theoretic terms, $$O_{x}:=\{gx\in\mathbb{F}_{2}^{M}\mid g\in{\rm Aut}(H)\}$$ is called the orbit of $x\in\mathbb{F}_{2}^{M}$. The orbits $\{O_{x}\}$ form a partition of $\mathbb{F}_{2}^{M}$. Therefore the number of decodings needed to determine the correctable error set is reduced to the number of orbits.

Let $H$ be a parity-check matrix of size $M\times N$ whose columns consist of all weight-two vectors in $\mathbb{F}_{2}^{M}$, where $N=M(M-1)/2$. Then the group of check node permutations $\sigma$ of its Tanner graph is the symmetric group $S_{M}$ of degree $M$; in other words, $S_{M}$ acts on $[M]$. The orbits are $O_{0}, O_{1},\ldots, O_{M}$, where $O_{i}=\{x\in\mathbb{F}_{2}^{M}\mid{\rm wt}(x)=i\}$. Therefore we can determine the correctable error set by $M+1$ computer experiments.

In general, the number of orbits can be determined from the following theorem.

Let $G$ be a finite group acting on a set $[M]$. Then the number of orbits of $[M]$ under $G$ is $$\frac{1}{\vert G\vert}\sum_{g\in G}\#\{m\in [M]\mid g m=m\}.$$
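A quick check of this orbit-counting formula (Burnside's lemma) against direct enumeration, for the cyclic shift group acting on the syndrome space $\mathbb{F}_{2}^{5}$ (the group and action are our toy example):

```python
from itertools import product

M = 5
vectors = list(product((0, 1), repeat=M))
shift = lambda v, k: tuple(v[(i + k) % M] for i in range(M))

# direct enumeration: one canonical (minimal) representative per orbit
direct = len({min(shift(v, k) for k in range(M)) for v in vectors})

# Burnside: average the number of fixed points over the group C_5
burnside = sum(sum(1 for v in vectors if shift(v, k) == v)
               for k in range(M)) // M
assert direct == burnside == 8   # the 8 binary necklaces of length 5
```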

As a direct corollary of Theorem VI.5, we can count how many syndromes are sufficient to determine the correctable error set of an FSA code of type $(3, p)$. This number is $${{2^{3p-2}+(p-1)(3\cdot 2^{p}-2p+4)}\over{p^{2}}},$$ which represents roughly a reduction by a factor of $p^{2}$ with respect to the total number of syndromes $2^{3p-2}$. For example, for $p=11$, this number is $17{,}748{,}308\approx 2^{24.1}$, which is significantly smaller than the number of syndromes of an FSA code of type (3, 11), i.e., $2^{31}$.

SECTION VII

We performed experiments with our decoding method, FID, for the MacKay code of length 504 and rate 0.5 [14] (see Table II). FID with $p=0.10$ and $l_{\max}=10$ always outperforms the original sum-product decoding for crossover probabilities 0.02, 0.03, 0.04, and 0.05. For $l_{\max}=10$, the difference between FID with $p=0.10$ and ${\rm SPDec}$ is larger than for $l_{\max}=30$.

Table III summarizes another experimental result, for the difference-set cyclic (DSC) code (273, 191) [15].

FID with $p=0.07$ always outperforms the original sum-product decoding at crossover probability $q=0.02$. At crossover probability 0.04, the original sum-product decoding shows the best performance among the tested FIDs, and it is almost the same as FID with $p=0.07$. Similarly to the MacKay code case, the difference between FID with $p=0.07$ and ${\rm SPDec}$ for $l_{\max}=10$ is larger than for $l_{\max}=100$ at crossover probability $q=0.02$.

Fig. 3 depicts the word-error-rate as a function of the maximal iteration number for the DSC code (273, 191) with FID at $p=0.01, 0.07$ and crossover probability $q=0.01$. The figure indicates that both initializations 0.01 and 0.07 lead to the same WER for a very large number of iterations. On the other hand, a suitable initialization, $p=0.07$, converges to the final word-error-rate much faster than the $p=q=0.01$ case.

We introduce an application to the evaluation of the word-error-rate of quantum LDPC codes. It is known that the experiment for quantum CSS codes is implementable on a classical computer [18]. A quantum LDPC code of CSS type is a pair of LDPC codes associated with parity-check matrices $H_{x}$ and $H_{z}$ such that $H_{x}H_{z}^{T}=0$. A quantum codeword of the CSS code is characterized by a complex linear combination of quantum states of the form $\sum_{d\in D^{\perp}}\vert c+d\rangle$, where $D^{\perp}$ is the dual code associated with $H_{z}$ and $c$ is a codeword defined by $H_{x}$.

The experiment is composed of the following processes:

- Set a quantum Pauli channel with probabilities satisfying $p(I)+p(X)+p(Z)+p(XZ)=1$.
- Randomly, generate a pair of error vectors $e_{x}$, $e_{z}$ related to the Pauli channel.
- Calculate syndromes $s_{x}$, $s_{z}$ by $s_{x}=H_{x}e_{x}$, $s_{z}=H_{z}e_{z}$.
- Input syndromes $s_{x}$ and $s_{z}$ to a sum-product syndrome decoder and obtain outputs $e_{x}^{o}$, $e_{z}^{o}$.
- Decoding succeeds if $e_{x}-e_{x}^{o}\in\langle H_{z}\rangle$ and $e_{z}-e_{z}^{o}\in\langle H_{x}\rangle$. Decoding fails otherwise, where $\langle H\rangle$ is a code space generated by $H$.
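The five steps above can be sketched on a classical computer as follows. This is a toy instance, not the paper's setup: we take $H_x=H_z$ equal to the Hamming (7, 4) check matrix (which satisfies $H_xH_z^T=0$, giving the Steane CSS code), and a single-error syndrome lookup stands in for the sum-product syndrome decoder; all names are ours.

```python
import numpy as np

# Steane-code check matrices: H_x = H_z = Hamming(7,4) check matrix.
H = np.array([[1, 0, 1, 0, 1, 0, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]])
Hx = Hz = H
assert not np.any(Hx @ Hz.T % 2)  # CSS condition H_x H_z^T = 0

def gf2_rank(ints):
    """Rank over GF(2) of rows encoded as integers."""
    pivots, rank = [], 0
    for row in ints:
        for p in pivots:
            row = min(row, row ^ p)
        if row:
            pivots.append(row)
            rank += 1
    return rank

def to_int(v):
    return int("".join(str(int(b)) for b in v), 2)

def in_code_space(H, v):
    """Test v in <H>, the row space of H over GF(2)."""
    rows = [to_int(r) for r in H]
    return gf2_rank(rows + [to_int(v)]) == gf2_rank(rows)

def lookup_decode(H, s):
    """Single-error decoder: the Hamming syndrome is the error position."""
    pos = s[0] + 2 * s[1] + 4 * s[2]
    e = np.zeros(H.shape[1], dtype=int)
    if pos:
        e[pos - 1] = 1
    return e

# Step 1: a Pauli channel (I, X, Z, XZ probabilities summing to 1).
p_I, p_X, p_Z, p_XZ = 0.94, 0.03, 0.02, 0.01
rng = np.random.default_rng(1)
trials, failures = 1000, 0
for _ in range(trials):
    # Step 2: sample the X and Z parts of the Pauli error.
    pauli = rng.choice(4, size=7, p=[p_I, p_X, p_Z, p_XZ])
    ex = ((pauli == 1) | (pauli == 3)).astype(int)
    ez = ((pauli == 2) | (pauli == 3)).astype(int)
    # Step 3: syndromes.
    sx, sz = Hx @ ex % 2, Hz @ ez % 2
    # Step 4: syndrome decoding (stand-in decoder).
    exo, ezo = lookup_decode(Hx, sx), lookup_decode(Hz, sz)
    # Step 5: success iff the residual errors lie in the dual code spaces.
    ok = in_code_space(Hz, (ex - exo) % 2) and in_code_space(Hx, (ez - ezo) % 2)
    failures += not ok
print("word-error-rate estimate:", failures / trials)
```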

From Theorem V.1, we can omit the 3rd step. We can also replace the 4th step with “Input error vectors $e_{x}$ and $e_{z}$ to a sum-product decoder and obtain outputs $c_{x}^{o}$, $c_{z}^{o}$.” Finally, we can replace the 5th step with “Decoding succeeds if $c_{x}^{o}\in\langle H_{z}\rangle$ and $c_{z}^{o}\in\langle H_{x}\rangle$. Decoding fails otherwise.”

Hence, we reduce the computational complexity of the experiment. Note that we can also replace the sum-product decoder with our FID. This makes the method applicable to a security evaluation of quantum cryptography, specifically the BB84 protocol [19].

SECTION VIII

In this paper, we introduced the concepts of a correctable error set and of fixed initialization decoding, by noticing that, for a BSC, the sum-product decoder with a given iteration number depends only on the initialized probability of error. Although this value has conventionally been selected as the BSC crossover probability, we showed that other selections can provide better performance or faster convergence. We also proved that for any fixed initialization $p$ (i.e., any given correctable error set), the word-error-rate can be represented as a polynomial in the BSC crossover probability. This suggests that the word-error-rate can be analytically derived from (total or partial) knowledge of the correctable error set.
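The polynomial representation can be made concrete: if a decoder succeeds exactly on a correctable error set $E$, then over a BSC with crossover probability $q$ the word-error-rate is $1-\sum_{e\in E}q^{w(e)}(1-q)^{n-w(e)}$, a polynomial in $q$. The following sketch evaluates this for a toy set (all errors of weight at most 1 on $n=5$ bits); the set and names are ours, for illustration only.

```python
# Word-error-rate as a polynomial in the crossover probability q,
# assuming decoding succeeds exactly on the correctable error set E.
n = 5
E = [frozenset()] + [frozenset([i]) for i in range(n)]  # weight <= 1

def wer(q):
    # WER(q) = 1 - sum over correctable e of q^w(e) * (1-q)^(n-w(e))
    return 1 - sum(q ** len(e) * (1 - q) ** (n - len(e)) for e in E)

print(wer(0.0))  # -> 0.0: the all-zero error is always corrected
```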

For further research, the following topics seem meaningful from the points of view of theory and practice:

- While these results have been derived for the BSC, the same concepts can be extended to any discrete-input, discrete-output channel model, in particular quantized versions of the AWGN channel.
- Construct an LDPC code whose length is practically large but whose correctable error set can be identified. One approach is to construct a bipartite graph with high symmetry. For example, consider a parity-check matrix $H$ whose columns consist of all column vectors of Hamming weight 2. Then its length is $M(M-1)/2$, where $M$ is the number of rows of $H$; e.g., the length is 4950 for $M=100$. In fact, the computational complexity of determining the correctable error set is $M$, thanks to the symmetry of $H$. Unfortunately, the LDPC code associated with this $H$ does not show good error-correcting performance. However, this example implies that it is not impossible to determine correctable error sets for codes of large length.
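The construction in the last item can be sketched as follows (a toy check of the column count; the function name is ours):

```python
from itertools import combinations
import numpy as np

# Parity-check matrix whose columns are all M-bit vectors of Hamming
# weight 2: one column per unordered pair of row indices, so the code
# length is M*(M-1)/2.
def weight2_check_matrix(M):
    cols = []
    for i, j in combinations(range(M), 2):
        c = np.zeros(M, dtype=int)
        c[i] = c[j] = 1
        cols.append(c)
    return np.array(cols).T  # shape (M, M*(M-1)/2)

Hw = weight2_check_matrix(100)
print(Hw.shape)  # -> (100, 4950): length 4950 for M = 100
```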

The authors thank the anonymous reviewers, and Mr. W. DeMeo and Mr. J. Kong for their valuable comments and suggestions to improve the quality of the paper.

This work was supported by KAKENHI 22760286. The paper was presented in part at the 2010 IEEE International Symposium on Information Theory (ISIT) and in part at the 32nd Symposium on Information Theory and Its Applications (SITA 2009, in Japanese).

M. Hagiwara is with the National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba City, 305-8568, Japan, and also with the Center for Research and Development Initiative, Chuo University, Tokyo, 112-8551, Japan (e-mail: hagiwara.hagiwara@aist.go.jp).

M.P.C. Fossorier is with the ETIS, ENSEA/UCP/CNRS UMR 8051, Cergy-Pontoise, 95014, France.

H. Imai is with the AIST and Department of Electrical, Electronic and Communication Engineering, Faculty of Science and Engineering, Chuo University, Tokyo, 112-8551, Japan.

Communicated by I. Sason, Associate Editor for Coding Theory.

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
