Generalized Joint Shuffled Scheduling Decoding Algorithm for the JSCC System Based on Protograph-LDPC Codes

The joint shuffled scheduling decoding (JSSD) algorithm can reduce the decoding complexity of the joint source-channel coding (JSCC) system based on double protograph low-density parity-check (P-LDPC) codes. However, the JSSD algorithm does not work when the linking matrix between variable nodes (VNs) of the source P-LDPC code and check nodes (CNs) of the channel P-LDPC code is adopted in such a system, and this linking matrix has a significant influence on the system performance. In this paper, a generalized joint shuffled scheduling decoding (GJSSD) algorithm is designed to work for this system, and the JSSD algorithm can be regarded as a special case of it. Simulations show that the proposed GJSSD algorithm reduces the decoding complexity and improves performance compared with the joint belief-propagation (JBP) algorithm.


I. INTRODUCTION
Shannon's classical separation principle is a milestone of information theory. It states that joint data compression and channel coding cannot bring gains, so the source code and the channel code can be optimized separately while keeping separate source-channel coding (SSCC) optimal [1], [2]. However, for finite-length communications, the residual redundancy left by the source code may not be utilized efficiently to improve the error-correcting performance at the channel decoder in the SSCC system. Therefore, the joint source-channel coding (JSCC) system, which exploits this residual redundancy, has drawn increasing attention. The JSCC system offers advantages in error-correcting performance [2], complexity [3], [4] and transmission delay [5], and these advantages promote its application in image processing [6], video transmission [7] and so on.
The JSCC system in which one low-density parity-check (LDPC) code [8] is used for source compression and another LDPC code is used for channel error correction, named the double LDPC (D-LDPC) JSCC system, was shown to perform well in practical applications by utilizing the joint Tanner graph on the decoder side [2]. As a subclass of multi-edge type LDPC codes, protograph LDPC (P-LDPC) codes are constructed from a protograph, which is a Tanner graph with a relatively small number of nodes [9]. Compared with general LDPC codes, P-LDPC codes offer additional advantages such as simpler and faster encoding. Capacity-approaching constructions of P-LDPC codes were discussed in [10], and rate-compatible codes were designed for various applications [11]-[13]. P-LDPC codes were also designed for non-standard channels in [14] and introduced to bit-interleaved coded modulation [15], [16]. The JSCC system that adopts P-LDPC codes as both source code and channel code was studied in [17]-[19] and is named the double P-LDPC (DP-LDPC) JSCC system. The DP-LDPC JSCC system can be seen as an evolution of the D-LDPC JSCC system, which gains the benefits of the D-LDPC JSCC system and inherits the excellent properties of P-LDPC codes.
The associate editor coordinating the review of this manuscript and approving it for publication was Zesong Fei.

A. RELATED WORKS AND MOTIVATION
In recent years, considerable research efforts have been devoted to investigating the DP-LDPC JSCC system. These efforts can be divided into two stages: an initial exploration stage and a development stage.
In the initial exploration stage, the impact of the source statistic on the performance of the system was studied in [17]. It was observed that this system obtains a better bit error rate (BER) with lower source entropy. This observation promoted the invention of the source protograph extrinsic information transfer (PEXIT) tool, which can be used to calculate the source decoding thresholds. Through the source PEXIT (SPEXIT) tool, a matching criterion was proposed to coordinate the relation between the source entropy and the source coding rate [17].
In the development stage, the research focus was mainly on code designs from different perspectives over different channels. The source code was optimized to obtain a lower error floor [18], [28]. The channel code was redesigned to improve the waterfall performance, for which the joint PEXIT (JPEXIT) tool was proposed to analyze the performance [19]. The linking matrix connecting variable nodes (VNs) of the channel code and check nodes (CNs) of the source code, named the type-I linking matrix, and the linking matrix connecting CNs of the channel code and VNs of the source code, named the type-II linking matrix, were studied in [20]-[22]. The type-II linking matrix was introduced in [20], and further study showed that it has a significant influence on the system performance [22]. The code pair of source code and channel code was jointly optimized in [23], [29]. Moreover, three or four components among the source code, channel code, type-I linking matrix and type-II linking matrix were jointly optimized over the additive white Gaussian noise (AWGN) channel with binary phase shift keying (BPSK) modulation [25], or over the Rayleigh fading channel [26] with M-ary differential chaos shift keying (DCSK) modulation [27].
With more in-depth research on the DP-LDPC JSCC system, research interest gradually shifted to low-complexity decoding design, which is desirable for practical applications of this system. The works mentioned above all adopted the joint belief-propagation (JBP) algorithm, also known as the joint flooding decoding algorithm. The JBP algorithm has a high decoding complexity and is therefore not suitable for low-power practical applications. The decoding complexity of LDPC codes is mainly determined by two factors: the number of edges in the parity-check matrix and the maximum number of iterations [34]. The first factor is determined by the code design: once the structure of an LDPC code is designed, the number of edges is fixed. Thus, reducing the number of iterations provides an alternative way to reduce the decoding complexity. As is well known, scheduling decoding algorithms can efficiently reduce the number of iterations in the SSCC system. There are various scheduling methods, such as the shuffled scheduling method [32]-[41], the layer scheduling method [36]-[38], the informed dynamic scheduling method [39], [40], and so on, but they were all designed for the SSCC system and are not suitable for the JSCC system.
Recently, a joint shuffled scheduling decoding (JSSD) algorithm was proposed for the DP-LDPC JSCC system, which was the first work to apply a scheduling decoding algorithm to this system [41]. However, the JSSD algorithm only considered three components, namely the source code, the channel code and the type-I linking matrix, while the type-II linking matrix was not taken into account. Without the type-II linking matrix, the JSSD algorithm cannot take full advantage of the complete structure of the DP-LDPC JSCC system, since the influence of the type-II linking matrix on the performance of the system is not negligible [20], [22]. Therefore, it is beneficial for the DP-LDPC JSCC system to have a decoding algorithm that supports the shuffled mode with a complete structure including all four components of the system.

B. CONTRIBUTIONS
In this paper, a generalized joint shuffled scheduling decoding (GJSSD) algorithm is designed for low-decoding-complexity applications of the DP-LDPC JSCC system. The GJSSD algorithm has a complete structure including all four components, and the JSSD algorithm can be regarded as a special case of the GJSSD algorithm when the type-II linking matrix is a zero matrix. The simulation results show that the GJSSD algorithm obtains a reduction of nearly two-fifths in decoding complexity and achieves a performance improvement compared with the JBP algorithm. Furthermore, the GJSSD algorithm obtains a lower error floor than the JSSD algorithm.
The rest of the paper is organized as follows. Section II briefly introduces the DP-LDPC JSCC system model. The proposed GJSSD algorithm is presented in Section III. The simulation results are shown in Section IV. The last section concludes the paper.

II. DP-LDPC JSCC SYSTEM MODEL
For the DP-LDPC JSCC system, a joint base matrix $B_J$ of size $(m_{sc} + m_{cc}) \times (n_{sc} + n_{cc})$ is defined by
$$B_J = \begin{pmatrix} B_{sc} & B_{L1} \\ B_{L2} & B_{cc} \end{pmatrix}, \tag{1}$$
where $B_{sc}$ of size $m_{sc} \times n_{sc}$ represents the source protograph base matrix, $B_{cc}$ of size $m_{cc} \times n_{cc}$ represents the channel protograph base matrix, $B_{L1}$ of size $m_{sc} \times n_{cc}$ represents the connecting edges between the variable nodes (VNs) in the channel protograph and the check nodes (CNs) in the source protograph, and $B_{L2}$ of size $m_{cc} \times n_{sc}$ represents the connecting edges between the CNs in the channel protograph and the VNs in the source protograph. $B_{L1}$ and $B_{L2}$ are also called the type-I and type-II linking base matrices, respectively. The corresponding joint parity-check matrix $H_J$ of size $(M_{sc} + M_{cc}) \times (N_{sc} + N_{cc})$ is obtained by the progressive edge growth (PEG) algorithm [31], as
$$H_J = \begin{pmatrix} H_{sc} & H_{L1} \\ H_{L2} & H_{cc} \end{pmatrix}, \tag{2}$$
where $H_{sc}$ of size $M_{sc} \times N_{sc}$ is the parity-check matrix of the source P-LDPC code, $H_{cc}$ of size $M_{cc} \times N_{cc}$ is the parity-check matrix of the channel P-LDPC code, $H_{L1}$ of size $M_{sc} \times N_{cc}$ represents the connections between the CNs in the source P-LDPC code and the VNs in the channel P-LDPC code, and $H_{L2}$ of size $M_{cc} \times N_{sc}$ represents the connections between the VNs in the source P-LDPC code and the CNs in the channel P-LDPC code. $H_{L1}$ and $H_{L2}$ are called the type-I and type-II linking matrices, respectively.
Fig. 1 and Fig. 2 show the system model and the joint Tanner graph, respectively. A point-to-point communication system is considered. At the encoder of the DP-LDPC JSCC system, the sequence of information bits, $s = (s_1, s_2, \ldots, s_{N_{sc}})$, $s_i \in \{0, 1\}$, is generated from a binary independent and identically distributed (i.i.d.) Bernoulli source. The source entropy is calculated by
$$H(p) = -p \log_2 p - (1 - p) \log_2 (1 - p), \tag{3}$$
where $p = \Pr(s_i = 1)$ is the probability of a "1" at the source and $p \neq 0.5$. Next, the source sequence $s$ is compressed by
$$b^{T} = H_{sc}\, s^{T}. \tag{4}$$
Then, a part $s'$ of the source sequence $s$ and the compressed sequence $b$ are combined into a new sequence $[s', b]$, and the codeword sequence $c$ is generated by
$$c = [s', b]\, G^{new}_{cc}, \tag{5}$$
where $G^{new}_{cc}$ is the generator matrix of a new parity-check matrix. Finally, the sequence of encoded bits is transmitted over an AWGN channel with BPSK modulation as the sequence $x$. At the receiver, the joint source-channel decoder (JSCD), which is shown in Fig. 2, recovers the information bits from the corrupted received sequence $y$ using the JBP decoding algorithm. The source decoder and the channel decoder of the JSCD work in parallel and exchange messages along the edges between them.
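The block layout of the joint base matrix and the source entropy can be checked numerically. The following is a minimal Python sketch, assuming the four blocks are arranged with $B_{sc}$ and $B_{L1}$ in the top block row and $B_{L2}$ and $B_{cc}$ in the bottom block row, which is the only arrangement compatible with the block sizes stated above. The toy blocks are hypothetical and are not the R4JA or AR3A protographs used in the simulations; the helper names `assemble_joint_matrix` and `binary_entropy` are illustrative only.

```python
import math


def assemble_joint_matrix(B_sc, B_L1, B_L2, B_cc):
    """Stack four blocks as [[B_sc, B_L1], [B_L2, B_cc]] (lists of rows)."""
    # Blocks that sit side by side must have the same number of rows,
    # blocks that sit on top of each other the same number of columns.
    if len(B_sc) != len(B_L1) or len(B_L2) != len(B_cc):
        raise ValueError("row counts of horizontally adjacent blocks differ")
    if len(B_sc[0]) != len(B_L2[0]) or len(B_L1[0]) != len(B_cc[0]):
        raise ValueError("column counts of vertically adjacent blocks differ")
    top = [list(a) + list(b) for a, b in zip(B_sc, B_L1)]
    bottom = [list(a) + list(b) for a, b in zip(B_L2, B_cc)]
    return top + bottom


def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) source, with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


# Hypothetical toy blocks: m_sc = 1, n_sc = 2, m_cc = 2, n_cc = 3.
B_sc = [[1, 2]]
B_L1 = [[1, 0, 0]]
B_L2 = [[0, 0], [1, 0]]
B_cc = [[1, 1, 0], [0, 1, 2]]

B_J = assemble_joint_matrix(B_sc, B_L1, B_L2, B_cc)
print(len(B_J), "x", len(B_J[0]))  # (m_sc + m_cc) x (n_sc + n_cc)

# Source entropies for the values of p used in the simulations.
for p in (0.01, 0.02, 0.03):
    print(p, round(binary_entropy(p), 4))
```

Setting every entry of `B_L2` to zero in this sketch gives the three-component structure that the JSSD algorithm works on.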

III. THE GENERALIZED JOINT SHUFFLED SCHEDULING DECODING ALGORITHM
Consider the $i$-th iteration of the joint decoder. Assume that the codeword sequence $c$ is transmitted over an AWGN channel with zero mean and variance $\sigma^2$. The maximum number of decoding iterations is set to $I_{max}$. Denote the set of all CNs connected to the $n$-th variable node (VN) by $M(n)$. Similarly, denote the set of all VNs connected to the $m$-th check node (CN) by $N(m)$. Moreover, denote by $M(n) \backslash m$ the set $M(n)$ with the $m$-th CN excluded, and by $N(m) \backslash n$ the set $N(m)$ with the $n$-th VN excluded. Also, use $F^{cc}_n$ ($n = 1, \ldots, N_{cc}$) to represent the channel log-likelihood ratio (LLR) of the $n$-th VN in the channel decoder and $F^{sc}_n$ ($n = N_{cc} + 1, \ldots, N_{cc} + N_{sc}$) to represent the source LLR of the $n$-th VN in the source decoder.
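The neighbor sets used in this notation can be read directly off a parity-check matrix. The short Python sketch below is illustrative only: it uses a small (7, 4) Hamming parity-check matrix as a stand-in rather than a lifted P-LDPC matrix, indexes nodes from 0 instead of 1, and the helper names `neighbor_sets` and `exclude` are hypothetical.

```python
def neighbor_sets(H):
    """Return (M, N): M[n] lists the CNs joined to VN n, N[m] the VNs joined to CN m."""
    rows, cols = len(H), len(H[0])
    M = [[m for m in range(rows) if H[m][n]] for n in range(cols)]
    N = [[n for n in range(cols) if H[m][n]] for m in range(rows)]
    return M, N


def exclude(nodes, node):
    """Return the neighbor list with one node removed, e.g. N(m) without n."""
    return [k for k in nodes if k != node]


# Stand-in parity-check matrix of a (7, 4) Hamming code.
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]

M, N = neighbor_sets(H)
print("M(3) =", M[3])                     # CNs connected to VN 3
print("N(0) =", N[0])                     # VNs connected to CN 0
print("N(0) \\ 3 =", exclude(N[0], 3))    # N(0) with VN 3 excluded
```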
For illustration, eight types of log-likelihood ratios (LLRs) are defined as follows, which are shown in Fig. 2.
Step 1: LLR updating.
Channel decoder:
1) For $1 \le n \le N_{cc}$ and for each $m \in M(n)$, process the next two steps jointly.
1.1) Horizontal step: compute $\varepsilon^{cc,(i)}_{mn}$ using Eq. (6).
1.2) Vertical step: [...]
2) For [...]
3) For [...]
Source decoder:
1) For $N_{cc} + 1 \le n \le N_{cc} + N_{sc}$ and for each $m \in M(n)$, process the next two steps jointly.
1.1) Horizontal step: compute $\varepsilon^{sc,(i)}_{mn}$ using Eq. (11).
1.2) Vertical step: [...]
2) [...]
3) For $N_{cc} + 1 \le n \le N_{cc} + N_{sc}$ and for each $m \in M(n)$, compute ${}^{sc \to cc,(i)}_{mn}$.
Step 2: Hard decision. [...]
Step 3: Stopping criterion. If $H_{sc} \hat{s}^{T} = b^{T}$ and $H_{cc} \hat{c}^{T} = 0$ are both satisfied, or $i = I_{max}$, the iteration is stopped and the algorithm goes to Step 4. Otherwise, set $i = i + 1$ and go to Step 1.
Step 4: Output $\hat{s}$ as the decoded source sequence.
VOLUME 9, 2021
After initialization, the GJSSD algorithm contains four steps: LLR updating, hard decision, stopping criterion and output of the decoded source sequence. The major difference between the GJSSD algorithm and the JBP algorithm lies in the LLR-updating process, while the other three steps are the same [41]. The LLR-updating process in the GJSSD algorithm is sequential, whereas that in the JBP algorithm is parallel. The JSSD algorithm can be regarded as a special case of the GJSSD algorithm with $B_{L2} = 0$. Different from the JSSD algorithm, the GJSSD algorithm takes the influence of $B_{L2}$ into full consideration. The major differences between the GJSSD algorithm and the JSSD algorithm are as follows.
In the initialization process of the GJSSD algorithm, the values ${}^{sc \to cc,(0)}_{mn}$ and the values ${}^{cc \to sc,(0)}_{mn}$, associated with $B_{L2}$, are initialized as 0 and $F^{cc}_n$, respectively. These values are not included in the JSSD algorithm. In the LLR-updating process of the GJSSD algorithm, the LLRs associated with $B_{L2}$ are involved in the computation of the values $\varepsilon^{cc,(i)}_{mn}$ in the channel decoder and the values $z^{sc,(i-1)}_{mn}$ in the source decoder, respectively (Eq. (6) and Eq. (12)). Furthermore, the values ${}^{cc \to sc,(i)}_{mn}$ and the values ${}^{sc \to cc,(i)}_{mn}$ are computed in the channel decoder and the source decoder, respectively (Eq. (10) and Eq. (15)). These calculations are not involved in the JSSD algorithm. With the help of the complete structure including $B_{sc}$, $B_{cc}$, $B_{L1}$ and $B_{L2}$ in $B_J$, the GJSSD algorithm can make full use of the structure of the DP-LDPC JSCC system to achieve better performance than the JSSD algorithm.
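The sequential update order that distinguishes shuffled scheduling from flooding can be illustrated on a single code. The sketch below is a generic shuffled belief-propagation decoder for one small parity-check matrix, written under stated assumptions: it is not the GJSSD algorithm, it contains no linking matrices, and it does not reproduce Eqs. (6)-(15). The function name `shuffled_bp_decode`, the (7, 4) Hamming stand-in matrix and the input LLR values are all hypothetical. LLRs follow the convention used here, so a positive value favors bit 0.

```python
import math


def shuffled_bp_decode(H, llr, max_iter=30):
    """Generic shuffled BP on one code: VNs are processed one after another,
    and each horizontal step already uses the messages of VNs updated
    earlier in the same iteration."""
    rows, cols = len(H), len(H[0])
    N = [[n for n in range(cols) if H[m][n]] for m in range(rows)]
    M = [[m for m in range(rows) if H[m][n]] for n in range(cols)]
    # z[(m, n)]: VN-to-CN message, initialized with the intrinsic LLR.
    z = {(m, n): llr[n] for m in range(rows) for n in N[m]}
    # eps[(m, n)]: CN-to-VN message, initialized with zero.
    eps = {(m, n): 0.0 for (m, n) in z}
    hard = [0] * cols
    limit = 1.0 - 1e-12  # keeps atanh away from +/-1
    for it in range(1, max_iter + 1):
        for n in range(cols):
            # Horizontal step for every CN attached to VN n.
            for m in M[n]:
                prod = 1.0
                for k in N[m]:
                    if k != n:
                        prod *= math.tanh(z[(m, k)] / 2.0)
                prod = max(-limit, min(limit, prod))
                eps[(m, n)] = 2.0 * math.atanh(prod)
            # Vertical step: extrinsic messages and a posteriori LLR of VN n.
            total = llr[n] + sum(eps[(m, n)] for m in M[n])
            for m in M[n]:
                z[(m, n)] = total - eps[(m, n)]
            hard[n] = 0 if total >= 0 else 1
        # Stopping criterion: all parity checks satisfied.
        if all(sum(hard[n] for n in N[m]) % 2 == 0 for m in range(rows)):
            return hard, it
    return hard, max_iter


# Stand-in (7, 4) Hamming parity-check matrix; all-zero codeword assumed sent.
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]
llr = [-1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0]  # first bit received unreliably
decoded, iters = shuffled_bp_decode(H, llr)
print(decoded, iters)
```

A flooding decoder would instead compute all check-node messages from the previous iteration before updating any variable node. Reusing freshly updated messages within the same iteration is the reason shuffled schedules tend to need fewer iterations.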

IV. SIMULATION RESULTS
In this section, simulation results are reported in two parts: performance analysis of the GJSSD algorithm and performance comparison between the GJSSD algorithm and the JBP algorithm. The simulations are all implemented in C++ with Visual Studio 2013. In all simulations, the length of the information bits generated by the source is set to 3200 and the parity-check matrix is obtained from the protograph with a lifting factor of 800 by the PEG algorithm. The maximum iteration number $I_{max}$ is 30. The R4JA codes with rate 1/4 and rate 1/2 are used as source codes, and the AR3A code with rate 1/2 is used as the channel code. Their corresponding joint base matrices are given as follows. $B_{L1}$ and $B_{L2}$ are contained in $B_{J4}$. Without loss of generality, choose $a_{i,j} = 1$ ($i = 1, 3, 4$; $j = 5$) and the other $a_{i,j} = 0$ when $B_{L2} \neq 0$ in all simulations. The label ($p_{value}$, $r$, $L$, $D_{name}$) in the simulation figures means that source bits of length $L$ with $R_{total} = r$ are decoded by the method $D_{name}$ when $p = p_{value}$. For example, the label (0.02, 2, 3200, GJSSD) means that source bits of length 3200 with $R_{total} = 2$ are decoded by the GJSSD algorithm when $p = 0.02$. Here, the method used in [41] can be seen as the GJSSD algorithm with $B_{L2} = 0$.

FIGURE 4. The average number of iterations using the GJSSD algorithm with different source entropies, $R_{total}$ and lengths of source bits.
A. PERFORMANCE ANALYSIS OF THE GJSSD ALGORITHM
The BER results first show the impacts of the total code rate $R_{total}$, the length of the source bits and the source entropy. When $p = 0.02$, the GJSSD algorithm with $R_{total} = 1$ and $L = 3200$ achieves coding gains of 1.6 dB and 2.5 dB at BER $= 1 \times 10^{-6}$ compared to $R_{total} = 2$ with $L = 12800$ and $L = 3200$, respectively. When $R_{total} = 2$ and $L = 3200$, the GJSSD algorithm with $p = 0.01$ has coding gains of 1.7 dB and 4.7 dB at BER $= 1 \times 10^{-6}$ compared to $p = 0.02$ and $p = 0.03$, respectively. When $R_{total} = 2$ and $p = 0.03$, the GJSSD algorithm with $L = 12800$ obtains a 3.3 dB coding gain at BER $= 1 \times 10^{-6}$ compared with $L = 3200$.
Fig. 4 shows the average number of iterations of the GJSSD algorithm under different conditions. When $R_{total} = 2$ and $L = 3200$, the GJSSD algorithm with $p = 0.01$ converges at the lowest $E_b/N_0$, followed by $p = 0.02$, while $p = 0.03$ requires the highest $E_b/N_0$ at the same BER level. When $R_{total} = 2$ and $p = 0.03$, the GJSSD algorithm with $L = 3200$ and with $L = 12800$ has nearly the same average number of iterations.
Fig. 5 and Fig. 6 exhibit the BER performance and the average number of iterations of the GJSSD algorithm, respectively, when $B_{L2} \neq 0$ (solid line) and $B_{L2} = 0$ (dashed line). In Fig. 5, when $p = 0.03$, the GJSSD algorithm with $B_{L2} \neq 0$ works successfully while the algorithm with $B_{L2} = 0$ does not. This is because the source statistic is larger than the source decoding threshold, so the algorithm with $B_{L2} = 0$ cannot work. When $p = 0.02$, the GJSSD algorithm with $B_{L2} \neq 0$ achieves a lower error floor than the algorithm with $B_{L2} = 0$, with a negligible performance loss in the waterfall region at low-to-moderate SNR.
In Fig. 6, one can observe that the average iteration number of the GJSSD algorithm with $B_{L2} \neq 0$ is the same as that with $B_{L2} = 0$ when $p = 0.01$. When $p = 0.02$, the GJSSD algorithm with $B_{L2} \neq 0$ needs slightly more iterations than with $B_{L2} = 0$. When $p = 0.03$, the GJSSD algorithm with $B_{L2} = 0$ cannot work and oscillation occurs.
Fig. 7 shows the BER performance of the GJSSD algorithm in the DP-LDPC JSCC system and of the shuffled decoding algorithm in the SSCC system. One can easily see the superiority of the GJSSD algorithm. For example, the GJSSD algorithm achieves a 2.4 dB coding gain at BER $= 2 \times 10^{-6}$ compared with the shuffled decoding algorithm in the separated mode.
These simulation results also demonstrate the rationality and validity of the GJSSD algorithm, summarized as follows:
1) In terms of error-correction performance, the factors in descending order of influence are $R_{total}$, the source entropy and the code length. A smaller $R_{total}$, a lower source entropy and a longer code length together yield a better BER performance.
2) The convergence speed is related to $R_{total}$ and the source entropy, but not to the code length. At the same $E_b/N_0$, the smaller the $R_{total}$ or the source entropy, the faster the convergence.
3) Compared to the GJSSD algorithm with $B_{L2} = 0$, the GJSSD algorithm with $B_{L2} \neq 0$ achieves a lower error floor, with a small loss in the waterfall region at low-to-moderate SNR and a slightly larger iteration number.

B. PERFORMANCE COMPARISON BETWEEN GJSSD ALGORITHM AND JBP ALGORITHM
Fig. 8 and Fig. 9 show the average number of iterations at different $E_b/N_0$ for the joint decoder and for the separate source and channel decoders, respectively. In Fig. 8, the GJSSD algorithm converges faster than the JBP algorithm when $p = 0.01$, $p = 0.02$ and $p = 0.03$, for both $B_{L2} \neq 0$ and $B_{L2} = 0$. For example, in Fig. 8(a), with $B_{L2} \neq 0$, the average number of iterations is reduced from 13.88 to 8.68 at $E_b/N_0 = -1$ dB when $p = 0.01$, from 12.73 to 7.99 at $E_b/N_0 = 1$ dB when $p = 0.02$, and from 9.91 to 6.10 at $E_b/N_0 = 4.5$ dB when $p = 0.03$. Details of the convergence speeds of the JBP and GJSSD algorithms when $p = 0.01$, $p = 0.02$ and $p = 0.03$ can be found in Table 1, Table 2 and Table 3. In these three tables, the ratio $\gamma$ is defined as $\gamma = T_{GJSSD}/T_{JBP}$, where $T_{GJSSD}$ and $T_{JBP}$ denote the iteration numbers of the GJSSD algorithm and the JBP algorithm at convergence, respectively. The GJSSD algorithm converges faster than the JBP algorithm if $\gamma < 1$. Moreover, the smaller the value of $\gamma$, the faster the convergence of the GJSSD algorithm. From the tables, one can find that $\gamma$ is nearly 0.6 in all cases, whereas for the SSCC system $\gamma$ is 0.5 [30].
Fig. 9 shows the unequal convergence rates (UCR) phenomenon [41] in the DP-LDPC JSCC system. One can find that the GJSSD algorithm restrains the UCR to a certain extent, although slight UCR still exists in some regions with a high signal-to-noise ratio (SNR), while the UCR in the JBP algorithm is obvious for both $B_{L2} \neq 0$ and $B_{L2} = 0$. Fig. 10 compares the BER performance of the GJSSD algorithm and the JBP algorithm under different conditions. From Fig. 10(a), one can find that the GJSSD algorithm still outperforms the JBP algorithm ($B_{L2} \neq 0$) by nearly 0.4 dB at the $1 \times 10^{-5}$ BER level when $p = 0.03$, 0.25 dB [...] floor when $p = 0.02$, compared with the JBP algorithm. Compared to the algorithms with $B_{L2} = 0$, one can see that the GJSSD algorithm lowers the error floor level, and the GJSSD algorithm also obtains a lower error floor than the JBP algorithm with $B_{L2} = 0$.
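As a rough cross-check of the reported ratio, the same quotient can be formed from the three pairs of average iteration numbers quoted above for Fig. 8(a). This Python sketch is illustrative only: the tables define $\gamma$ at convergence, whereas these pairs are single operating points, so the results indicate the order of magnitude rather than reproduce the table entries. The name `iteration_ratio` is hypothetical.

```python
# (JBP, GJSSD) average iteration numbers quoted in the text for Fig. 8(a),
# keyed by the source probability p.
quoted = {
    0.01: (13.88, 8.68),  # at Eb/N0 = -1 dB
    0.02: (12.73, 7.99),  # at Eb/N0 = 1 dB
    0.03: (9.91, 6.10),   # at Eb/N0 = 4.5 dB
}


def iteration_ratio(t_gjssd, t_jbp):
    """Ratio of GJSSD to JBP iteration numbers; values below 1 favor GJSSD."""
    return t_gjssd / t_jbp


ratios = {p: iteration_ratio(gjssd, jbp) for p, (jbp, gjssd) in quoted.items()}
for p, gamma in ratios.items():
    print(f"p = {p}: ratio = {gamma:.3f}, reduction = {100 * (1 - gamma):.1f}%")
```

All three ratios fall slightly above 0.6, which is in line with the reduction of nearly 40% in iteration number stated in the summary.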
In summary, compared to the JBP algorithm, the GJSSD algorithm can reduce the decoding iteration number by nearly 40% for convergence, and achieve better waterfall performance and lower error floor.

V. CONCLUSION
This paper designs a generalized joint shuffled scheduling decoding (GJSSD) algorithm to fully exploit the complete structure of the DP-LDPC JSCC system. The factors that influence the proposed algorithm, including the code length, the code rate and the source entropy, are comprehensively investigated. Through simulations, it is found that the GJSSD algorithm can reduce the decoding complexity by nearly 40 percent at moderate-to-high signal-to-noise ratio (SNR) compared to the JBP algorithm, and achieve a lower error floor than the JSSD algorithm. In the future, the performance of the proposed decoding algorithm will be studied for specific applications, for example over in-body or on-body channels.
• $\varepsilon^{cc,(i)}_{mn}$ represents the LLR sent from the $m$-th CN to the $n$-th VN, and $z^{cc,(i)}_{mn}$ represents the LLR sent from the $n$-th VN to the $m$-th CN, in the channel decoder.
• $\varepsilon^{sc,(i)}_{mn}$ represents the LLR sent from the $m$-th CN to the $n$-th VN, and $z^{sc,(i)}_{mn}$ represents the LLR sent from the $n$-th VN to the $m$-th CN, in the source decoder.
• ${}^{sc \to cc,(i)}_{n}$ is the LLR sent from the corresponding CN in the source decoder to the $n$-th VN in the channel decoder, and ${}^{cc \to sc,(i)}_{n}$ is the LLR sent from the $n$-th VN in the channel decoder to the corresponding CN in the source decoder. These two types of LLRs are indexed only by $n$ because each CN in the source decoder is connected to only one VN in the channel decoder.
• ${}^{sc \to cc,(i)}_{mn}$ is the LLR sent from the $n$-th VN in the source decoder to the $m$-th CN in the channel decoder, and ${}^{cc \to sc,(i)}_{mn}$ is the LLR sent from the $m$-th CN in the channel decoder to the $n$-th VN in the source decoder.
Based on the above definitions, the generalized joint shuffled scheduling decoding algorithm is described as follows.
Initialization: For all $m$ and $n$, set $\varepsilon^{sc,(0)}_{mn} = 0$, $\varepsilon^{cc,(0)}_{mn} = 0$ and ${}^{sc \to cc,(0)}_{n} = 0$. For $n = 1, \ldots, N_{cc}$, set $z^{cc,(0)}_{mn} = F^{cc}_n = 2 y_n / \sigma^2$, where $y_n = (1 - 2 x_n) + G_n$ and $G_n \sim N(0, \sigma^2)$. For $n = N_{cc} + 1, \ldots, N_{cc} + N_{sc}$, set $z^{sc,(0)}_{mn} = F^{sc}_n = \ln((1 - p)/p)$. Set $i = 1$.
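The two intrinsic LLRs used in the initialization can be evaluated directly from the formulas above. The following Python sketch is a minimal illustration; the function names `channel_llr` and `source_llr` and the sample values of $y_n$ and $\sigma^2$ are assumptions made for the example.

```python
import math


def channel_llr(y, sigma2):
    """Channel LLR F_cc = 2 * y / sigma^2 for a BPSK symbol received over AWGN."""
    return 2.0 * y / sigma2


def source_llr(p):
    """Source LLR F_sc = ln((1 - p) / p), with p the probability of a 1."""
    return math.log((1.0 - p) / p)


# A received sample of 0.8 with noise variance 0.5 (hypothetical values).
print(channel_llr(0.8, 0.5))

# Source LLRs for the values of p used in the simulations.
for p in (0.01, 0.02, 0.03):
    print(p, round(source_llr(p), 4))
```

The source LLR grows as $p$ moves away from 0.5, so a lower-entropy source supplies stronger prior information to the source decoder. For $p = 0.5$ the source LLR is zero and carries no information.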


FIGURE 6. The average number of iterations using the GJSSD algorithm when $B_{L2} \neq 0$ (solid line) and $B_{L2} = 0$ (dashed line).

FIGURE 7. Comparison of BER performances using the GJSSD algorithm and the separated shuffled decoding algorithm.

FIGURE 8. The average numbers of iterations for different types of decoding algorithms under different conditions. The solid line represents the GJSSD algorithm and the dashed line represents the JBP algorithm. (a) $B_{L2} \neq 0$; (b) $B_{L2} = 0$.

FIGURE 9. The average numbers of iterations for the source decoder (solid line) and the channel decoder (dashed line) with different types of decoding algorithms under different conditions.

FIGURE 10. Comparison of BER performances between the GJSSD algorithm and the JBP algorithm under different conditions. The solid line represents the GJSSD algorithm and the dashed line represents the JBP algorithm. (a) $B_{L2} \neq 0$; (b) $B_{L2} = 0$.