Girth-Based Sequential-Recovery LRCs

In this paper, we prove that a linear block code with girth <inline-formula> <tex-math notation="LaTeX">$2(t+1)$ </tex-math></inline-formula> is a <inline-formula> <tex-math notation="LaTeX">$t$ </tex-math></inline-formula>-sequential-recovery locally repairable code (LRC) with locality <inline-formula> <tex-math notation="LaTeX">$r$ </tex-math></inline-formula> if its parity-check matrix has column weight at least 2 and row weight at most <inline-formula> <tex-math notation="LaTeX">$r+1$ </tex-math></inline-formula>. This gives a new connection between sequential-recovery LRCs and linear block codes. We also show that the repair time of the <inline-formula> <tex-math notation="LaTeX">$t$ </tex-math></inline-formula>-sequential-recovery LRCs obtained from linear block codes through this connection is at most <inline-formula> <tex-math notation="LaTeX">$\lceil t/2 \rceil $ </tex-math></inline-formula>.


I. INTRODUCTION
As erasure-correcting codes, locally repairable codes (LRCs) [4] were proposed to improve repair efficiency: only a few nodes are needed to repair any single erasure. The locality of an erased symbol is the number of other symbols needed to repair it, which is the key concept of LRCs. A linear code is an LRC with locality r if each symbol can be locally repaired from at most r other symbols [4].
Various LRCs for multiple erasures have been proposed [17], [18], [22]. These LRCs are divided into sequential- and parallel-recovery LRCs according to whether the repair process is sequential or parallel. Let C be a linear code of length n and $c = (c_1, c_2, \ldots, c_n)$ be a codeword of C. The code C is said to be a t-sequential-recovery (t-seq) LRC with locality r if, for any s ≤ t erasures, there exists an arrangement $(j_1, j_2, \ldots, j_s)$ of the s erasure positions such that, for each u = 1, 2, . . . , s, there exists a subset $R(j_u) \subset \{1, 2, \ldots, n\} \setminus \{j_u\}$ satisfying 1) $|R(j_u)| \leq r$, 2) $R(j_u) \cap \{j_u, j_{u+1}, \ldots, j_s\} = \emptyset$, and 3) $c_{j_u} = \sum_{l \in R(j_u)} a_l c_l$ for some $a_l \in \mathbb{F}_q$.
The associate editor coordinating the review of this manuscript and approving it for publication was Zihuai Lin.
Repair time is another metric for t-seq LRCs, defined as the maximum number of steps needed to repair any t erasures [23]. Obviously, the repair time of a t-seq LRC is at most t, and it is indeed t in general. To reduce the repair time, LRCs with joint sequential-parallel recovery were proposed for any t erasures [23], where some u (≤ t) erasures can be repaired locally and in parallel. So, their repair time is at most t − u + 1. Note that a t-parallel-recovery (t-para) LRC has repair time 1 for any t.
Let C be a linear block (or low-density parity-check) code with a parity-check matrix $H = (h_{i,j})$, i = 1, 2, . . . , m; j = 1, 2, . . . , n. H can be represented as a Tanner graph with m check nodes and n variable nodes [15]. The i-th check node and the j-th variable node are connected if the element $h_{i,j}$ is non-zero. In the Tanner graph, the length of every cycle is even and at least four [3]. The girth is the length of the shortest cycle in the Tanner graph. The girth is an important parameter of low-density parity-check (LDPC) codes, since it heavily affects their performance under sum-product decoding [5]. Many constructions of LDPC codes with large girth have been proposed [1], [5], [8], [9], [12], [13], [14], [21], [24], [25]. When the graph has a cycle of length 2s, the same cycle can be seen as a pattern of non-zero entries in the corresponding matrix. The patterns of 4-cycles and 6-cycles in the matrix are summarized in Fig. 1 [13], [16].
Consider the definition of sequential-recovery LRCs from the viewpoint of the parity-check matrix. The linear code C is a t-seq LRC if, for any s ≤ t erasures, there exists a row of its parity-check matrix whose support contains the coordinate of precisely one of the s erasures. Therefore, we ask what form of parity-check matrix satisfies this condition.
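As an illustration of the Tanner graph and its girth, the following Python sketch builds the graph from a parity-check matrix and searches for the shortest cycle by breadth-first search from every node. The function name `tanner_girth` and the example matrix `H_K4` (the vertex-edge incidence matrix of the complete graph on four vertices) are assumptions made for this illustration and are not taken from the cited constructions.

```python
from collections import deque

def tanner_girth(H):
    """Return the girth of the Tanner graph of H, or None if it has no cycle."""
    m, n = len(H), len(H[0])
    # Nodes 0..m-1 are check (row) nodes; nodes m..m+n-1 are variable (column) nodes.
    adj = [[] for _ in range(m + n)]
    for i in range(m):
        for j in range(n):
            if H[i][j] != 0:
                adj[i].append(m + j)
                adj[m + j].append(i)
    best = None
    for root in range(m + n):
        dist = {root: 0}
        parent = {root: -1}
        queue = deque([root])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    parent[w] = v
                    queue.append(w)
                elif parent[v] != w:
                    # A non-tree edge closes a walk of this length through root.
                    length = dist[v] + dist[w] + 1
                    if best is None or length < best:
                        best = length
    return best

# Hypothetical example: rows are the 4 vertices of K4, columns are its 6 edges.
H_K4 = [
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1, 0],
    [0, 1, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1],
]
```

For this example every pair of columns shares at most one row, so there is no 4-cycle, and a triangle of K4 gives a 6-cycle; the girth is therefore 6.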
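The parity-check condition above can be tested exhaustively on small matrices. The following Python sketch is a brute-force check under stated assumptions: the names `has_single_hit_row` and `is_t_seq` and the example matrix `H_K4` (incidence matrix of the complete graph on four vertices, girth 6, column weight 2) are hypothetical, and only the rows of the given matrix are examined, not all codewords of the dual code.

```python
from itertools import combinations

def has_single_hit_row(H, erased):
    """True if some row of H is non-zero in exactly one erased coordinate."""
    erased = set(erased)
    return any(sum(1 for j in erased if row[j] != 0) == 1 for row in H)

def is_t_seq(H, t):
    """Check the row condition for every erasure pattern of size 1..t."""
    n = len(H[0])
    return all(
        has_single_hit_row(H, E)
        for s in range(1, t + 1)
        for E in combinations(range(n), s)
    )

# Hypothetical example: rows are the 4 vertices of K4, columns are its 6 edges.
H_K4 = [
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1, 0],
    [0, 1, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1],
]
```

With girth 6 = 2(2 + 1), the check succeeds for t = 2. It fails for t = 3 because the three columns of a triangle (for instance columns 0, 1 and 3) meet every row either twice or not at all.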
Our contribution: In this paper, we prove two theorems for sequential-recovery LRCs as follows.

Theorem 1: A linear block code is a t-seq LRC with locality r if its parity-check matrix satisfies the following:
1) the girth is 2(t + 1), 2) the column weight is at least 2, and 3) the row weight is at most r + 1.

Theorem 2: The repair time of the t-seq LRCs from Theorem 1 is at most $\lceil t/2 \rceil$.
The t-seq LRCs obtained from linear block codes by Theorem 1 have two advantages. One is that the parameters t and r of the LRCs from Theorem 1 can be any positive integers. The other is that the t-seq LRCs from Theorem 1 have a small repair time compared with some other known t-seq LRCs, as shown in Table 1. In general, the performance of LRCs is considered in terms of repair efficiency r, local repair capacity t, and repair time, which differs from the metrics used for other error-correcting codes [23]. All three metrics are discussed in this paper. Thus, various t-seq LRCs can be obtained from parity-check matrices with different girth, column weight, and row weight.
Section II shows the proof of Theorem 1. Section III calculates the repair time of the t-seq LRCs from the linear block code by Theorem 1. Section IV concludes the paper.

II. RELATIONSHIP BETWEEN t-SEQ LRCs AND GIRTH
The support of a vector $u = (u_1, u_2, \ldots, u_n)$ is defined as $\mathrm{supp}(u) = \{i : u_i \neq 0\}$, and its weight is $w(u) = |\mathrm{supp}(u)|$. An m × n matrix $M = (m_{i,j})$ is represented as a Tanner graph (bipartite) with m row nodes $r_1, r_2, \ldots, r_m$ on one side and n column nodes $c_1, c_2, \ldots, c_n$ on the other side. In this graph, nodes $r_i$ and $c_j$ are connected if $m_{i,j} \neq 0$. Figure 2 shows two matrices and their corresponding Tanner graphs.
We consider the connectivity of the columns of a matrix as follows. Two columns of a matrix are said to be connected if the corresponding column nodes are connected by a path in its Tanner graph. A matrix M is said to be connected if every pair of its columns is connected. Figure 2 shows two matrices, which are connected in (a) and non-connected in (b). Observe that the second column in (b) is not connected to the remaining columns. Note that a path does not have to be a cycle. Therefore, a connected matrix M has the property that there exists a rearrangement of all columns such that the intersection of the supports of any two adjacent columns is non-empty. For example, the columns of $H_1$ in Fig. 2 can be reordered (the second and fifth are swapped) so that any two adjacent columns are connected.
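The notion of connected columns can be made concrete by grouping columns whose supports are linked through shared rows. The Python sketch below is only an illustration: the names `column_components` and `is_connected` and the matrix `M_EXAMPLE` are hypothetical and are not the matrices of Fig. 2.

```python
def column_components(M):
    """Group column indices of M into classes of mutually connected columns."""
    n = len(M[0])
    # supports[j] is the set of row indices where column j is non-zero.
    supports = [{i for i, row in enumerate(M) if row[j] != 0} for j in range(n)]
    unseen = set(range(n))
    components = []
    while unseen:
        start = min(unseen)
        unseen.discard(start)
        comp, stack = {start}, [start]
        while stack:
            j = stack.pop()
            for k in list(unseen):
                # Columns j and k are adjacent if they share a row.
                if supports[j] & supports[k]:
                    unseen.discard(k)
                    comp.add(k)
                    stack.append(k)
        components.append(comp)
    return components

def is_connected(M):
    """A matrix is connected if all its columns fall into one class."""
    return len(column_components(M)) == 1

# Hypothetical example: columns 0 and 2 share row 0; column 1 is isolated.
M_EXAMPLE = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
]
```

Here `column_components(M_EXAMPLE)` yields two classes, {0, 2} and {1}, so the matrix is non-connected in the sense defined above.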
Next, for a connected matrix, we give a sufficient condition for the existence of a cycle.
Here, $I_1(M)$ denotes the set of row indices of M whose row weight is 1.
Proof: Since M is a connected matrix, without loss of generality, we may assume that all adjacent columns of M are connected. That is, if we use $h_1, h_2, \ldots, h_s$ to denote the columns of M, then $\mathrm{supp}(h_j) \cap \mathrm{supp}(h_{j+1}) \neq \emptyset$ for j = 1, 2, . . . , s − 1. We denote by $b_i$ the i-th row of M, i = 1, 2, . . . , m. The length of the resulting cycle is $2[(\beta - \alpha + 1) + 1] \leq 2(s + 1)$, and equality is achieved when α = 1 and β = s.
VOLUME 10, 2022
Lemma 1: Let C be a linear block code of length n and girth 2(t + 1). Let H be its parity-check matrix of size m × n, whose column weight is at least 2. Let E be an s-subset of {1, 2, . . . , n}, for 1 ≤ s ≤ t, and H(E) be the corresponding submatrix containing only the columns indexed by E. Then, H(E) has at least two rows whose weight is 1.
Proof: Let $h_j$ be the j-th column of H, for j = 1, 2, . . . , n. For any subset E, we denote by $I_1(E)$ the set of row indices of the corresponding submatrix H(E) whose row weight is 1:
$$I_1(E) = \{i : |\mathrm{supp}(b_i)| = 1\},$$
where $b_i$ is the i-th row of H(E). That is to say, we claim that, for any nonempty subset E of size ≤ t,
$$|I_1(E)| \geq 2. \tag{1}$$
The proof is divided into two cases: 1) H(E) is a connected matrix and 2) H(E) is a non-connected matrix.
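First, Case 1) is proved by induction on the size of E. When |E| = s = 1, it is obvious that $|I_1(E)| \geq 2$ since the column weight of H is at least 2. When 2 ≤ s ≤ t, the s-subset E is divided as
$$E = E' \cup \{\delta\},$$
where $E'$ is an (s − 1)-subset. Assume the induction hypothesis that $|I_1(E')| \geq 2$ for any (s − 1)-subset $E'$. The number of rows in H(E) whose weight is 1 can be counted as
$$|I_1(E)| = |I_1(E') \setminus \mathrm{supp}(h_\delta)| + |I_0(E') \cap \mathrm{supp}(h_\delta)|, \tag{2}$$
where $I_0(E') = \{i : b'_i = 0\}$ and $b'_i$ is the i-th row of $H(E')$. To count the size of $I_1(E)$, we classify the relation between $I_1(E')$ and $h_\delta$ into the following three subcases: i) $|I_1(E') \cap \mathrm{supp}(h_\delta)| = 0$, ii) $|I_1(E') \cap \mathrm{supp}(h_\delta)| = 1$, and iii) $|I_1(E') \cap \mathrm{supp}(h_\delta)| \geq 2$.
For Subcase i), we get $|I_1(E)| \geq |I_1(E')| \geq 2$ based on (2). For Subcase ii), we get $|I_1(E)| = |I_1(E')| - 1 + |I_0(E') \cap \mathrm{supp}(h_\delta)| \geq 2$, since $|I_1(E')| \geq 2$ by the induction hypothesis and $|I_0(E') \cap \mathrm{supp}(h_\delta)| \geq 1$ because $w(h_\delta) \geq 2$. Indeed, if $I_0(E') \cap \mathrm{supp}(h_\delta) = \emptyset$, then H(E) has a cycle of length ≤ 2s, which is shown similarly to the proof of Proposition 1. This contradicts the girth 2(t + 1) of the code. Subcase iii) is impossible, since $H(E) = (H(E') \mid h_\delta)$ would have a cycle of length ≤ 2s by Proposition 1, which contradicts the girth 2(t + 1) of the code.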
Next, we consider Case 2). H(E) is a non-connected matrix with τ connected submatrices $H(E_1), \ldots, H(E_\tau)$ such that $E = E_1 \cup \cdots \cup E_\tau$ and no column of $H(E_i)$ is connected to a column of $H(E_j)$, for 1 ≤ i < j ≤ τ. It is obvious that $|I_1(E)| = \sum_{l=1}^{\tau} |I_1(E_l)| \geq 2\tau \geq 2$ by Case 1).
Now, we will continue to prove Theorem 1. Let C be a linear block code of length n and girth 2(t + 1). Let $H = (h_{i,j})$, i = 1, 2, . . . , m; j = 1, 2, . . . , n, be its parity-check matrix, whose column weights are at least 2 and row weights are at most r + 1. Then, any s ≤ t erasures can be repaired locally and sequentially by Algorithm 1. For readability, we first give some notation: $a_l$ denotes the l-th row of H and $b_l$ denotes the l-th row of H(E), where E is the set of erasure positions that are not yet repaired. The size of I in Line 3 is at least 2 as long as E ≠ ∅ by Lemma 1. Clearly, J in Line 4 is also a non-empty set because I is a non-empty set. So, the algorithm runs until E becomes empty, which means that all s erasures are repaired.
In Line 5, for each u ∈ J, such a row index l exists since the elements of J come only from the supports of rows of H(E) whose weight is 1. Since $\mathrm{supp}(b_l) = \{u\}$, the erased symbol $c_u$ is repaired by $|\mathrm{supp}(a_l) \setminus \{u\}| \leq r$ symbols, which are either unerased symbols or already repaired erasures.
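The repair procedure discussed above can be sketched in Python as a peeling loop. This is a reconstruction from the surrounding description (rows of weight 1 in H(E), the set J of erasures they cover, and one repairing row l per u ∈ J), not a transcription of Algorithm 1; the name `repair_schedule` and the example matrix `H_K4` (incidence matrix of the complete graph on four vertices, girth 6, so t = 2 and r = 2) are assumptions for illustration.

```python
def repair_schedule(H, erasures):
    """Peel erasures round by round.

    Returns a list of rounds; each round maps an erased column index u to the
    index l of a row of H whose restriction to the unrepaired set E has
    support exactly {u}.
    """
    E = set(erasures)
    rounds = []
    while E:
        step = {}
        for l, row in enumerate(H):
            hit = [j for j in E if row[j] != 0]
            if len(hit) == 1:               # row l has weight 1 inside H(E)
                step.setdefault(hit[0], l)  # keep the first usable row per erasure
        if not step:
            raise ValueError("no row covers exactly one unrepaired erasure")
        rounds.append(step)
        E -= set(step)                      # these erasures are now repaired
    return rounds

# Hypothetical example: rows are the 4 vertices of K4, columns are its 6 edges.
H_K4 = [
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1, 0],
    [0, 1, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1],
]
```

For the erasure set {0, 1}, row 1 covers only column 0 and row 2 covers only column 1, so both erasures are repaired in a single round, each from the two other symbols in its row.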

Set E = {1}.
Step 2: The erasure $e_1$ can be locally repaired by the first or the fourth row of H.
Remark 2: Recently, irregular girth-8 type-II LDPC codes of length 2mKP were proposed in [13], for any integers m ≥ 3, P > 6 and K ≥ 1. In the mKP × 2mKP parity-check matrix, the first mKP columns have weight 2, the last mKP columns have weight m + 1, and the row weight is the constant m + 3. Therefore, by Theorem 1, such an LDPC code is a 3-seq LRC with locality m + 2.
Example 2: We assume the same m, K and P as in Remark 2, which
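The parameter mapping used in Remark 2 follows directly from Theorem 1: girth 2(t + 1) gives t, and maximum row weight r + 1 gives r. A minimal Python sketch of this arithmetic is given below; the function name `theorem1_parameters` and its argument names are hypothetical.

```python
def theorem1_parameters(girth, min_column_weight, max_row_weight):
    """Map parity-check matrix statistics to (t, r) as in Theorem 1."""
    if girth < 4 or girth % 2 != 0:
        raise ValueError("Tanner graph girth must be an even integer >= 4")
    if min_column_weight < 2:
        raise ValueError("Theorem 1 requires column weight at least 2")
    t = girth // 2 - 1      # from girth = 2(t + 1)
    r = max_row_weight - 1  # from row weight <= r + 1
    return t, r
```

For the codes of Remark 2 with m = 3, the girth is 8, the smallest column weight is 2, and the row weight is m + 3 = 6, so `theorem1_parameters(8, 2, 6)` gives t = 3 and r = 5 = m + 2.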

III. REPAIR TIME OF THE LRCs FROM THEOREM 1
Next, we will show that, in each loop of Algorithm 1, at least 2 erasures can be repaired in parallel when the number of erasures is at least 2.
Lemma 2: Let C be a linear block code of length n and girth 2(t + 1). Let H be its parity-check matrix of size m × n, whose column weight is at least 2. Let E be an s-subset of the column indices, for 2 ≤ s ≤ t, and H(E) be the corresponding submatrix containing only the columns indexed by E. Then, H(E) has at least two rows of weight 1 whose supports are disjoint.
Proof: By Lemma 1, H(E) has at least two rows of weight 1, as stated in (1). The proof is divided into two cases: 1) H(E) is a connected matrix and 2) H(E) is a non-connected matrix.
For Case 1), from the proof of Lemma 1, the s-subset E is divided as $E = E' \cup \{\delta\}$, where $E'$ is an (s − 1)-subset.
For Case 2), let $H(E_1), \ldots, H(E_\tau)$ be the connected submatrices of H(E) as in the proof of Lemma 1, and let $\alpha_l$ be a row index of $I_1(E_l)$, for 1 ≤ l ≤ τ. It is obvious that the supports of those τ rows of H(E) indexed by $\alpha_l$ are disjoint.
Now, we will continue to prove Theorem 2. For the t-seq LRCs from Theorem 1, any t erasures are repaired by Algorithm 1. When |E| ≥ 2, there exist at least 2 rows indexed by I whose supports are disjoint, by Lemma 2. So, at least two erasures can be locally repaired in each loop when the number of unrepaired erasures is larger than one. Therefore, any t erasures can be repaired in at most $\lceil t/2 \rceil$ loops.
We compare the repair time of the t-seq LRCs from Theorem 1 with those constructed by others in Table 1. The first three codes in Table 1 are general t-seq LRCs, in which t erasures are locally repaired one by one. The last two codes are LRCs with joint sequential-parallel recovery, which have a smaller repair time than the general t-seq LRCs. For the t-seq LRCs from Theorem 1, each loop repairs at least 2 erasures locally and in parallel when there is more than one unrepaired erasure.
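The bound on the number of loops can be checked numerically on a small example. The Python sketch below is an illustration under stated assumptions: it uses the vertex-edge incidence matrix of the Petersen graph (graph girth 5, hence Tanner-graph girth 10, column weight 2 and row weight 3, so t = 4 and r = 2), which is our own choice of example and does not appear in Table 1. The helper names are hypothetical, and the peeling loop is a reconstruction of the repair process described above.

```python
from itertools import combinations
from math import ceil

def petersen_incidence():
    """Vertex-edge incidence matrix of the Petersen graph (10 rows, 15 columns)."""
    edges = []
    for i in range(5):
        edges.append((i, (i + 1) % 5))          # outer 5-cycle
        edges.append((i, i + 5))                # spokes
        edges.append((5 + i, 5 + (i + 2) % 5))  # inner pentagram
    H = [[0] * len(edges) for _ in range(10)]
    for j, (a, b) in enumerate(edges):
        H[a][j] = 1
        H[b][j] = 1
    return H

def repair_rounds(H, erasures):
    """Number of parallel rounds needed to peel all erasures; None if stuck."""
    E = set(erasures)
    rounds = 0
    while E:
        repaired = set()
        for row in H:
            hit = [j for j in E if row[j] != 0]
            if len(hit) == 1:       # this row repairs exactly one erasure
                repaired.add(hit[0])
        if not repaired:
            return None
        E -= repaired
        rounds += 1
    return rounds

def max_repair_rounds(H, t):
    """Worst-case number of rounds over all erasure patterns of size 1..t."""
    n = len(H[0])
    worst = 0
    for s in range(1, t + 1):
        for E in combinations(range(n), s):
            k = repair_rounds(H, E)
            if k is None:
                return None
            worst = max(worst, k)
    return worst

H_PETERSEN = petersen_incidence()
T = 4
```

For this matrix the worst case over all patterns of at most 4 erasures is 2 rounds, which equals ⌈4/2⌉; three erased edges forming a path already need two rounds, because the middle edge can only be repaired after one of its neighbours.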

IV. CONCLUDING REMARK
In this paper, we propose a new connection between sequential-recovery LRCs and the girth of linear block codes. A linear block code with girth 2(t + 1) is a t-seq LRC if its parity-check matrix has column weight at least 2. Note that the converse does not hold. Let $H'$ be the matrix obtained by adding an additional row under H in Example 1 as follows. It is obvious that the code corresponding to $H'$ is also a 3-seq LRC. However, we can get a 6-cycle in $H'$ by connecting the underlined 1s, so the girth of $H'$ is at most 6 < 2(3 + 1).
We also show that, for the t-seq LRCs from Theorem 1, any s ≤ t erasures are locally repaired in joint sequential-parallel mode. At least 2 erasures are repaired locally and in parallel in each step of the repair process when the number of unrepaired erasures is at least 2.