Block-Sparsity Log-Sum-Induced Adaptive Filter for Cluster Sparse System Identification

In this work, an effective adaptive block-sparsity log-sum least mean square (BSLS-LMS) algorithm is proposed to improve the convergence performance of cluster sparse system identification. The main idea of the proposed scheme is to add a new block-sparsity-inducing term to the cost function of the LMS algorithm. We utilize the $l_1$ norm of the adaptive tap-weights together with a log-sum term as a mixed constraint; by optimizing this cost function through gradient descent, the proposed adaptive filtering method iteratively moves the identified signals towards the optimal solutions and finally identifies the unknown system accurately. The cluster-sparse system response, with arbitrary block lengths and average sparsity, is generated by a Markov-Gaussian (M-G) model. For white Gaussian input data, theoretical formulas for the steady-state misadjustment and convergence behavior of the BSLS-LMS are derived for a general sparse system and a block sparse system, respectively. Numerical experiments demonstrate that the proposed adaptive BSLS-LMS algorithm achieves much better convergence behavior than conventional sparse signal recovery solutions. The experimental study also verifies the consistency between the simulation results and the theoretical analysis.


I. INTRODUCTION
Signal reconstruction technologies have attracted much attention in the fields of channel estimation, image recovery, and sparse unknown system identification [1]-[6]. In many cases, sparse unknown systems have only a few nonzero entries, and these limited nonzero or large coefficients appear independently at different locations over a long impulse response. Different from the general sparse signal, another type of sparse signal, called the block-structured sparse signal, has a special cluster structure formed by its non-zero coefficients [7]-[10]. Nonzero coefficients can be located randomly in general sparse systems, but for a block sparse system, the impulse response typically consists of one or several clusters in which the nonzero coefficients gather. Multiple-input multiple-output (MIMO) communication networks, satellite-linked paths and other practical applications are typical examples of multi-cluster sparse structures.
A number of block-sparsity signal recovery algorithms based on conventional sparse signal recovery methods have been designed. Convex optimization algorithms have been applied to block sparse signal reconstruction and studied in [10]-[12]. Mixed $l_{2,1}$-norm programs for block-sparse signal recovery were proposed in [10], [11]. The block version of the zero-point attracting projection algorithm, employing an approximate $l_{2,0}$ norm as the cost function, was proposed in [12]. Convex relaxations equivalent to the original nonconvex formulation using the $l_{q,1}$ norm were studied in [13]. Greedy pursuit algorithms have been proposed for block-sparse systems [14], [15]. The block version of the orthogonal matching pursuit (BOMP) algorithm, which identifies block-sparse signals successfully via a mixed-optimization approach, was investigated in [11]. Model-based compressive sampling matching pursuit (CoSaMP) was proposed in [15], and Bayesian compressed sensing framework-based algorithms were developed in [16]-[18]. The Stochastic Taps Normalized Least Mean Square (STNLMS) [19], the Select and Queue with a Constraint (SELQUE) [20] and the improved M-SELQUE algorithm [21] were applied to identify an unknown system by estimating the scattered region. The aforementioned algorithms can reconstruct unknown block sparse signals using prior knowledge of the block partition and cluster structure. To recover a block signal accurately, the block structure is fairly important. However, this key prior knowledge of the block partition of the sparse coefficients is not always available in practice. In fact, the locations and group sizes of the sparse clusters in an unknown system are random and entirely unknown.
Recently, block sparse least mean square adaptive filtering has received extensive attention [14], [22]-[26]. These algorithms can recover the block-sparse signals of unknown clustering sparse systems without prior knowledge of the block-structured non-zero coefficients. Sparse adaptive algorithms have been shown to outperform other algorithms due to their low computational complexity and robustness against noise. To accelerate the convergence of the coefficients, zero-attracting block-sparsity-induced LMS algorithms have been proposed and applied to recover block-sparse impulse responses in previous studies [22]-[24]. Since most of the partitions in block-sparse systems are zero-coefficient groups, zero-attraction algorithms generally converge faster than other algorithms. The main idea of zero-attracting block-sparsity-aware algorithms is to insert a block-sparsity constraint into the cost function of the standard LMS algorithm, such as the $l_0$-norm LMS and block-sparse LMS algorithms [25], or a mixed $l_{2,1}$ norm or approximate mixed $l_{2,0}$ norm of the uniformly divided filter tap-weight vector [26].
It has been reported that log-sum minimization requires fewer measurements to recover sparse signals than $l_1$ minimization [27]. This advantage of the log-sum penalty function has been demonstrated in the Reweighted Zero-Attracting LMS (RZA-LMS) algorithm [28]. Inspired by the RZA-LMS and new developments in compressive sensing [29], [30], we propose a new log-sum penalty LMS to identify unknown sparse systems and theoretically justify the proposed penalty as an alternative sparsity-aware function. In the proposed block sparse LMS algorithm, we insert a mixed $l_1$-norm and log-sum penalty on the coefficients of the unknown system into the cost function, so that the attractor selectively promotes the zero taps instead of uniformly promoting zeros on all taps. In this paper, the M-G model [25] is applied to generate and describe block-sparse systems. Then, the optimal group size of the sparse structure is derived based on the M-G model and the proposed algorithm. It is demonstrated that the behavior of the BSLS-LMS algorithm is better than that of the block-sparse LMS (BS-LMS) when the group size is properly selected. The convergence behavior of the algorithm is evaluated using the mean square deviation (MSD). The experimental results show that the proposed BSLS-LMS algorithm performs better than the BS-LMS and other block sparse filtering algorithms. In addition, when the number of blocks in the system increases, the block-sparsity log-sum LMS remains robust and its performance loss is very small compared with the standard LMS. The main contributions of this paper are summarized as follows:
• We propose a new block-sparse system identification algorithm to reconstruct the system response.
• We present theoretical expressions for the steady-state misadjustment and transient convergence behavior of the proposed algorithm with an appropriate group partition size for white Gaussian input data. Then, we theoretically demonstrate that the BSLS-LMS has much better convergence behavior than previous methods [19]-[21], [25].
• We propose a new cost function by utilizing the $l_1$ norm of the adaptive tap-weights and the log-sum as a mixed penalty. The proposed adaptive filtering method can iteratively move the identified signals towards the optimal solutions and finally identify the unknown system accurately. The proposed adaptive algorithm has low computational complexity and much better convergence behavior than conventional sparse signal recovery solutions.
The rest of the paper is organized as follows. In Section II, the log-sum block sparse algorithm is proposed to improve reconstruction performance, and an analysis of its computational complexity is provided. In Section III, theoretical results on the steady-state performance and convergence of the BSLS-LMS in general sparse systems are presented, and Section IV extends the analysis to block sparse systems. In Section V, several experiments are implemented to verify the theoretical derivation, and the effectiveness of the BSLS-LMS is evaluated by comparing it with existing block sparse algorithms. Finally, Section VI concludes the paper.

II. BLOCK-SPARSITY LOG-SUM LEAST MEAN SQUARES ALGORITHM
The unknown system coefficient vector to be identified is denoted as $\mathbf{s} = [s_1, s_2, \cdots, s_L]^T$, and the input signal at time instant $n$ is $\mathbf{x}_n = [x_n, x_{n-1}, \cdots, x_{n-L+1}]^T$, where $L$ is the length of the unknown system, both $\mathbf{s}$ and $\mathbf{x}_n$ are real-valued vectors, and $(\cdot)^T$ denotes transposition. The output signal of the unknown system at time instant $n$ is defined as
$$d_n = \mathbf{x}_n^T \mathbf{s} + v_n, \tag{1}$$
where $v_n$ is the additive white Gaussian noise of the unknown system. We denote by $\hat{\mathbf{s}}$ the adaptive tap-weights; in other words, $\hat{\mathbf{s}}$ is the reconstructed version of the unknown coefficient vector $\mathbf{s}$. As a result, the estimation error at time instant $n$ between the desired output of the unknown system and the output of the adaptive filter is
$$e_n = d_n - \mathbf{x}_n^T \hat{\mathbf{s}}_n, \tag{2}$$
where $\hat{\mathbf{s}}_n = [\hat{s}_{1,n}, \hat{s}_{2,n}, \cdots, \hat{s}_{L,n}]^T$ represents the adaptive tap-weight vector. In many practical scenarios, the nonzero coefficients of the unknown system appear in the form of clusters rather than at random locations. To define block sparsity, we consider $\mathbf{s}$ as a concatenation of blocks, and throughout this paper we assume that there are $M$ block groups in the unknown system and that the group partition size of every block is $G$, with $G < L$. The block sparse signal is represented as
$$\hat{\mathbf{s}} = \left[\mathbf{w}_1^T, \mathbf{w}_2^T, \cdots, \mathbf{w}_M^T\right]^T. \tag{3}$$
We adopt the mixed $l_1$-norm and log-sum constraint to evaluate the block sparsity of $\hat{\mathbf{s}} = [\hat{s}_1, \hat{s}_2, \cdots, \hat{s}_L]^T$ as
$$\sum_{i=1}^{M} \log\left(\varepsilon + \frac{\|\mathbf{w}_i\|_1}{\eta}\right), \tag{4}$$
where $\mathbf{w}_i = [\hat{s}_{(i-1)G+1}, \hat{s}_{(i-1)G+2}, \ldots, \hat{s}_{iG}]^T$ denotes the $i$th group of $\hat{\mathbf{s}}$. We assume that the signals in the vector $\hat{\mathbf{s}}$ can always be evenly divided into $M$ groups of length $G$ by appending zero taps at the end of $\hat{\mathbf{s}}$. Aiming to identify the unknown system by exploiting block sparsity, we propose a new cost function that combines the expectation of the estimation error, the mixed $l_1$-norm and log-sum constraint of the tap-weight vector, and the balance parameters, i.e.,
$$J_n = \frac{1}{2} E\left[e_n^2\right] + \gamma \sum_{i=1}^{M} \log\left(\varepsilon + \frac{\|\mathbf{w}_{i,n}\|_1}{\eta}\right), \tag{5}$$
where $\varepsilon$ ($0 < \varepsilon < 1$) and $\eta$ ($0 < \eta < 1$) are positive constants that, respectively, guarantee that the cost function is well-defined and sharpen its sensitivity to the sparsity of the block-sparse signal. The regularization parameter $\gamma$ is a positive factor that balances the strength of the block-sparsity penalty against the mean squared error, and it should be chosen appropriately to ensure satisfactory performance [24]. Note that this idea differs from the RZA-LMS, whose constraint function is $\gamma \sum_{i=1}^{L} \log\left(1 + |\hat{s}_i(n)|/\eta\right)$, whereas here we introduce the additional constant $0 < \varepsilon < 1$. Using gradient descent, the new recursion formula of the adaptive tap-weights can be written as
$$\hat{\mathbf{s}}_{n+1} = \hat{\mathbf{s}}_n + \mu e_n \mathbf{x}_n - \rho f(\hat{\mathbf{s}}_n), \tag{6}$$
where $\bar{\eta} = 1/\eta$, $\mu$ is the step-size, and $\rho = \frac{1}{2}\mu\gamma\bar{\eta}$ denotes a zero-attraction parameter that balances the strength of the block-sparse penalty term for a given step-size. The group zero-attraction function $f(\cdot)$ acts elementwise as
$$f(\hat{s}_{k,n}) = \frac{\mathrm{sgn}(\hat{s}_{k,n})}{\varepsilon + \bar{\eta}\,\|\mathbf{w}_{\lceil k/G \rceil, n}\|_1}, \quad k = 1, \ldots, L, \tag{7}$$
where $\mathrm{sgn}[\cdot]$ is the sign function. The proposed algorithm and a comparison of the computational cost are detailed in Table 1 and Table 2, respectively.
According to the recovery process of the log-sum penalty LMS summarized in Table 1, the computational complexity of the new algorithm is O(L). The detailed calculations per iteration are listed and compared with those of the $l_0$-LMS and BS-LMS. As shown in Table 2, the computational cost of the BSLS-LMS is less than that of the BS-LMS [25]. In practical applications, since unknown systems vary slowly, a partial-update strategy can be used to save computation.
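To make the recursion concrete, the following minimal Python sketch implements the update in (6)-(7) as reconstructed above; the function name, default parameter values, and start-up handling are our own illustrative assumptions, not the authors' reference implementation.

```python
import numpy as np

def bsls_lms(x, d, L, G, mu=0.02, rho=5e-4, eps=0.08, eta=0.02):
    """Sketch of the BSLS-LMS recursion (6)-(7).

    x : input sequence, d : desired output sequence,
    L : filter length (assumed padded to a multiple of G),
    G : group partition size, mu : step-size,
    rho : zero-attraction parameter, eps/eta : balance parameters.
    """
    s_hat = np.zeros(L)
    eta_bar = 1.0 / eta
    err = np.empty(len(d))
    for n in range(len(d)):
        # Regressor x_n = [x_n, x_{n-1}, ..., x_{n-L+1}]^T, zero-padded at start-up.
        xn = np.zeros(L)
        m = min(n + 1, L)
        xn[:m] = x[n::-1][:m]
        e = d[n] - xn @ s_hat                     # estimation error, Eq. (2)
        # Group zero-attraction f(s_hat), Eq. (7): each tap is pulled toward
        # zero with strength set by the l1 norm of its own group.
        f = np.empty(L)
        for i in range(0, L, G):
            w_l1 = np.abs(s_hat[i:i + G]).sum()
            f[i:i + G] = np.sign(s_hat[i:i + G]) / (eps + eta_bar * w_l1)
        s_hat += mu * e * xn - rho * f            # gradient-descent update, Eq. (6)
        err[n] = e
    return s_hat, err
```

Note that the group loop touches each tap exactly once, so the per-iteration cost stays O(L), consistent with the complexity discussion above.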
In fact, the literature [27], [28] shows that when $\varepsilon = 0$, the log-sum penalty is virtually the same as the $l_0$-norm. Therefore, it is plausible that when $\varepsilon$ is very small, the behavior of the log-sum penalty function in (4) is similar to that of the $l_0$-norm. Moreover, we restrict $\eta$ to $0 < \eta < 1$, which guarantees better convergence performance of the log-sum penalty block LMS than the approximate $l_0$-norm LMS.
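As a quick numerical illustration of this point, the short script below evaluates the group log-sum penalty (4) on vectors with an increasing number of non-zero groups; for small $\varepsilon$ the penalty grows almost in lockstep with the group-support count, i.e., it behaves like a group $l_0$ measure. The group size, group count, and parameter values here are illustrative assumptions.

```python
import numpy as np

G, M = 4, 8                      # group size and number of groups (assumed)
eps, eta = 1e-3, 0.02            # small eps pushes the penalty toward l0-like behavior
rng = np.random.default_rng(0)

def logsum_penalty(s, G, eps, eta):
    w = s.reshape(-1, G)                               # one row per group
    return np.sum(np.log(eps + np.abs(w).sum(axis=1) / eta))

for k in range(M + 1):           # k = number of non-zero groups
    s = np.zeros(M * G)
    s[:k * G] = rng.standard_normal(k * G)             # activate the first k groups
    print(f"{k} non-zero groups -> penalty {logsum_penalty(s, G, eps, eta):8.2f}")
```

Each zero group contributes the constant $\log\varepsilon$ and each active group a roughly constant $\log(\|\mathbf{w}_i\|_1/\eta)$ term, so the printed penalty increases almost linearly with $k$.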

III. PERFORMANCE ANALYSIS OF THE BSLS-LMS ALGORITHM IN GENERAL SPARSE SYSTEMS
In the proposed BSLS-LMS algorithm, the sparse constraint term adopts a mixed $l_1$ norm with a modified log-sum penalty on the impulse response coefficients of the unknown system, which behaves similarly to the $l_0$ pseudo-norm of the coefficient vector and forces the solution of the proposed algorithm to be sparse. The basic assumptions on the system are as follows: (a) the input signal $x_n$ follows an i.i.d. zero-mean Gaussian distribution; (b) the input vector $\mathbf{x}_n$, the tap-weights $\hat{\mathbf{s}}_n$, and the additive white noise $v_n$ are independent of each other; and (c) the variance of $x_n$ is $\sigma_x^2$, and all the tap-weights $\hat{\mathbf{s}}_n$ can be modeled by Gaussian variables. To provide a theoretical basis for this new induced term as an alternative sparsity-aware function, we analyze the mean and mean-squared performance of the BSLS-LMS under the above assumptions in this section.
The rationale for the above assumptions is the following: in the standard LMS, the tap-weights $\hat{s}_{j,n}$ converge to their optimal values uniformly under an i.i.d. Gaussian input signal. In the proposed block sparse log-sum LMS, due to the group zero-attraction term in (6), uniform convergence still holds globally. Therefore, the magnitude of the instantaneous tap-weight is very close to that of the corresponding unknown coefficient in its group. Numerical experiments have verified that this assumption is valid.

A. MEAN PERFORMANCE BASED ON THE PROPOSED ALGORITHM
The misalignment vector of the identification is defined as
$$\mathbf{r}_n = \hat{\mathbf{s}}_n - \mathbf{s}. \tag{8}$$
Combining (1), (2), and (6), the update equation of the misalignment vector can be formulated as
$$\mathbf{r}_{n+1} = \left(\mathbf{I} - \mu\,\mathbf{x}_n\mathbf{x}_n^T\right)\mathbf{r}_n + \mu v_n \mathbf{x}_n - \rho f(\hat{\mathbf{s}}_n). \tag{9}$$
Taking the expectation of (9) and using assumption (c), $E(\mathbf{r}_n)$ and $E[f(\hat{\mathbf{s}}_n)]$ converge as $n$ goes to infinity, and one has
$$E(\mathbf{r}_\infty) = -\frac{\rho}{\mu\sigma_x^2}\,E\left[f(\hat{\mathbf{s}}_\infty)\right]. \tag{10}$$
Since each entry of $f(\cdot)$ in (7) is bounded in magnitude by $1/\varepsilon$, the upper bound of the deviation is
$$\left|E(\mathbf{r}_\infty)\right| \le \frac{\rho}{\mu\sigma_x^2\,\varepsilon}\,\mathbf{1}, \tag{11}$$
where $\mathbf{1}$ is the vector with all entries equal to one and the inequality holds elementwise. This means that the proposed algorithm has a stability condition under which the coefficient misalignment vector converges.
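The bound in (11) can be checked empirically. The self-contained sketch below runs the recursion (6)-(7) on a toy cluster-sparse system with white Gaussian input, averages the steady-state misalignment, and compares it with $\rho/(\mu\sigma_x^2\varepsilon)$; the system, run length, and parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
L, G = 32, 4
mu, rho, eps, eta = 0.01, 1e-4, 0.08, 0.02
eta_bar = 1.0 / eta
sigma_x, sigma_v = 1.0, 0.01

s = np.zeros(L)
s[4:8] = rng.standard_normal(4)           # one non-zero cluster

s_hat = np.zeros(L)
x_buf = np.zeros(L)                       # x_n = [x_n, ..., x_{n-L+1}]^T
r_sum = np.zeros(L)
count = 0
for n in range(20000):
    x_buf = np.roll(x_buf, 1)
    x_buf[0] = sigma_x * rng.standard_normal()
    e = x_buf @ s + sigma_v * rng.standard_normal() - x_buf @ s_hat
    f = np.empty(L)
    for i in range(0, L, G):
        f[i:i + G] = np.sign(s_hat[i:i + G]) / (eps + eta_bar * np.abs(s_hat[i:i + G]).sum())
    s_hat += mu * e * x_buf - rho * f
    if n >= 15000:                        # accumulate steady-state misalignment r_n
        r_sum += s_hat - s
        count += 1

bound = rho / (mu * sigma_x**2 * eps)     # right-hand side of (11)
print(np.abs(r_sum / count).max(), "<=", bound)
```

In runs of this kind the largest entry of $|E(\mathbf{r}_\infty)|$ stays well below the bound, since the value $1/\varepsilon$ is approached only when a whole group's $l_1$ norm vanishes.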

B. MEAN SQUARED STEADY STATE ANALYSIS OF THE BSLS-LMS ALGORITHM
In this section, we derive the steady-state mean squared deviation (MSD) between the original signal and the estimated signal, and we then deduce a criterion for selecting the zero-attraction parameter so that the proposed algorithm outperforms other block sparse algorithms.
Multiplying each side of (9) by its transpose yields
$$\mathbf{r}_{n+1}\mathbf{r}_{n+1}^T = \left[\left(\mathbf{I} - \mu\,\mathbf{x}_n\mathbf{x}_n^T\right)\mathbf{r}_n + \mu v_n\mathbf{x}_n - \rho f(\hat{\mathbf{s}}_n)\right]\left[\left(\mathbf{I} - \mu\,\mathbf{x}_n\mathbf{x}_n^T\right)\mathbf{r}_n + \mu v_n\mathbf{x}_n - \rho f(\hat{\mathbf{s}}_n)\right]^T. \tag{12}$$
We denote by $\mathbf{S}_n$ the second-order moment matrix of the coefficient misalignment vector $\mathbf{r}_n$,
$$\mathbf{S}_n = E\left[\mathbf{r}_n\mathbf{r}_n^T\right], \tag{13}$$
and by $D_n$ the MSD at iteration $n$,
$$D_n = E\left[\|\mathbf{r}_n\|_2^2\right] = \mathrm{tr}[\mathbf{S}_n]. \tag{14}$$
Substituting (9) into (13), taking the expectation of both sides, and utilizing the independence assumption together with the Gaussian moment factorization, we obtain
$$\mathbf{S}_{n+1} = \left(1 - 2\mu\sigma_x^2 + 2\mu^2\sigma_x^4\right)\mathbf{S}_n + \mu^2\sigma_x^4\,\mathrm{tr}[\mathbf{S}_n]\,\mathbf{I} + \mu^2\sigma_v^2\sigma_x^2\,\mathbf{I} - \rho\left(1 - \mu\sigma_x^2\right)\left(E\left[\mathbf{r}_n f^T(\hat{\mathbf{s}}_n)\right] + E\left[f(\hat{\mathbf{s}}_n)\mathbf{r}_n^T\right]\right) + \rho^2 E\left[f(\hat{\mathbf{s}}_n)f^T(\hat{\mathbf{s}}_n)\right]. \tag{15}$$
Using the fact that $D_n = \mathrm{tr}[\mathbf{S}_n]$ and taking the trace of both sides of (15), one obtains
$$D_{n+1} = \left(1 - 2\mu\sigma_x^2 + (L+2)\mu^2\sigma_x^4\right)D_n + \mu^2\sigma_v^2\sigma_x^2 L - 2\rho\left(1 - \mu\sigma_x^2\right)E\left[\mathbf{r}_n^T f(\hat{\mathbf{s}}_n)\right] + \rho^2 E\left[\|f(\hat{\mathbf{s}}_n)\|_2^2\right]. \tag{16}$$
From (16), we obtain the condition of convergence as
$$0 < \mu < \frac{2}{(L+2)\sigma_x^2}.$$
Using (15) and considering the $k$th diagonal element at the steady state, we obtain
$$\left(2\mu\sigma_x^2 - 2\mu^2\sigma_x^4\right)E\left[r_{k,\infty}^2\right] = \mu^2\sigma_x^4 D_\infty + \mu^2\sigma_v^2\sigma_x^2 - 2\rho\left(1 - \mu\sigma_x^2\right)E\left[r_{k,\infty} f(\hat{s}_{k,\infty})\right] + \rho^2 E\left[f^2(\hat{s}_{k,\infty})\right]. \tag{17}$$
To derive $E[r_{k,\infty} f(\hat{s}_{k,\infty})]$, we partition the index set $\{k = 1, 2, \cdots, L\}$ into two groups according to the unknown coefficients. We define $C_{NZ}$ and $C_Z$ as the index sets of the adaptive non-zero tap-weights and zero tap-weights, respectively.
When $k \in C_{NZ}$, we consider that the tap-weight $\hat{s}_{k,n}$ has the same sign as the corresponding unknown coefficient $s_k$.
Therefore, for $k \in C_{NZ}$, the sign of $f(\hat{s}_{k,\infty})$ in (7) is fixed by $\mathrm{sgn}(s_k)$, and the cross-moment $E[r_{k,\infty} f(\hat{s}_{k,\infty})]$ can be evaluated accordingly. When $k \in C_Z$, the steady-state tap-weight fluctuates around zero, and the corresponding moments follow directly. According to assumption (c) and the properties of the Gaussian distribution, the required expectations $E[f(\hat{s}_{k,\infty})]$ and $E[f^2(\hat{s}_{k,\infty})]$ are obtained in (25) and (26). For $k \in C_{NZ}$, substituting (25) and (26) into (17), we obtain (27), where $\beta \triangleq 1 - \mu\sigma_x^2$. For $k \in C_Z$, combining (25), (26) and (17), we obtain (28), where $\omega$ denotes $E[r_{k,\infty}^2]$, $k \in C_Z$, for simplicity. Summing (27) and (28) over all $k$ gives (29), where $Q$ denotes the number of tap-weights in the non-zero group partitions, that is, $Q(G) \triangleq |C_{NZ}(G)|$. It can be further derived that (30) holds, where $\Phi_0 \triangleq 2 - \mu\sigma_x^2$ and $\Phi_Q \triangleq 2 - (Q+2)\mu\sigma_x^2$; we also define $\Phi_L \triangleq 2 - (L+2)\mu\sigma_x^2$. Combining (27) and (28), it can be concluded that $\omega$ satisfies the quadratic equation (31). After solving (31), the final mean squared deviation of the log-sum block sparse LMS is obtained as (32).
In the steady-state MSD formula (32), the first term on the right-hand side, $\mu L \sigma_v^2/\Phi_L$, is the steady-state MSD of the traditional LMS, and the remaining two terms are caused by the group zero attraction. If $\rho = 0$, these terms vanish and the block sparse log-sum LMS reduces to the standard LMS. When the group zero-attraction contribution is negative, the proposed block sparse algorithm has a smaller steady-state MSD, i.e., better steady-state performance than the standard LMS; from this observation we derive the condition under which the block sparse log-sum LMS is superior to other adaptive LMS algorithms in the steady state.
The log-sum penalty function has greater sparsity-encouraging potential than the approximate $l_0$ norm [22]. As shown in (32), the final MSD grows with both the zero-attraction parameter $\rho$ and the power of the measurement noise, so a large $\rho$ leads to a large MSD, whereas a small $\rho$ means weak zero attraction, and weak zero attraction slows the convergence. Therefore, in particular applications, the parameter $\rho$ is determined by a trade-off between the convergence rate and the signal estimation accuracy. The proposed algorithm attains the minimum steady-state MSD when $\rho$ is taken as the minimizer $\rho_{opt}$ of (32); substituting $\rho_{opt}$ back into (32) yields the minimum steady-state MSD $D_{min}$.

IV. PERFORMANCE OF THE LOG-SUM LMS FOR BLOCK SPARSE SYSTEMS
The performance of the log-sum block sparse LMS in block-sparse applications is further studied by using the M-G model. An appropriate assumption $2 \le Q(G) \ll L$ is adopted, which means that the partition size $G$ is very small relative to the filter length, so that the system response remains sparse at the group-partition level. The principle behind this assumption is that $G$ is an important predefined parameter that needs to be carefully selected; since the log-sum LMS penalizes sparsity at the group-partition level, a too large $G$ will destroy the sparsity. In this paper, we study the performance of the BSLS-LMS for an unknown system response generated by the M-G model proposed in [22]. Following the approach in [22], we define the model parameter $\theta$ for even $G$ accordingly.
In block sparse systems, the remaining parameters in (32) are expressed in terms of the M-G model parameters. Here we apply the approximation $\mu\sigma_x^2 \to 0$, under which $\beta \approx 1$ and $\Phi_0 \approx 2$. We denote by $\tilde{D}_{min}$ the minimum transient MSD. Utilizing these expressions in the minimum-MSD formula, we obtain an intermediate result for $\tilde{D}_{min}$. In a sparse unknown system, we assume that $2 \le Q(G) \ll L$, and the expression can then be simplified accordingly. We denote the optimal group partition size by $G_{opt}$, which can be found numerically as
$$G_{opt} = \arg\min_{G} \tilde{D}_{min}(G).$$
This equation gives the selection criterion for the optimal partition size.
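For reference, the following sketch generates a cluster-sparse impulse response using a two-state (zero/active) first-order Markov chain whose active taps are drawn from N(0, 1). This is one plausible reading of an M-G generator, parameterized by a target average sparsity and mean cluster length; it is not necessarily the exact parameterization used in [22].

```python
import numpy as np

def markov_gaussian_response(L, avg_sparsity, avg_cluster_len, rng=None):
    """Cluster-sparse impulse response from a two-state Markov chain.

    The transition probabilities are chosen so that the stationary
    probability of the active state equals avg_sparsity and the mean
    run length of the active state equals avg_cluster_len.
    """
    rng = rng or np.random.default_rng()
    p10 = 1.0 / avg_cluster_len                        # P(active -> zero)
    p01 = p10 * avg_sparsity / (1.0 - avg_sparsity)    # P(zero -> active)
    h = np.zeros(L)
    active = rng.random() < avg_sparsity               # stationary start
    for k in range(L):
        if active:
            h[k] = rng.standard_normal()               # Gaussian active tap
        stay = (1.0 - p10) if active else p01
        active = rng.random() < stay
    return h

# Example: an L = 800 response with ~2% active taps in clusters of mean length 8.
h = markov_gaussian_response(800, 0.02, 8, np.random.default_rng(7))
```

With these transition probabilities the chain's stationary activity probability works out to exactly avg_sparsity, so the generated responses match the target average sparsity over many realizations.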

V. SIMULATION RESULTS
In this section, we design several experiments to verify the effectiveness of the proposed algorithm. The first experiments evaluate the performance of the log-sum sparse LMS for a general sparse system and a block sparse system under different signal-to-noise ratios, and the last experiment validates the theoretical analysis of the BSLS-LMS. All the simulation results show that the convergence rate of the BSLS-LMS is remarkably superior to those of other classical algorithms when the group partition size $G$ is close to its optimal value $G_{opt}$.

A. THE PERFORMANCE OF THE BSLS-LMS ON CHANNEL ESTIMATION FOR COOPERATIVE SYSTEM
In this section, the particular application considered is the estimation of the channel state information (CSI) of a communication system. We apply the proposed algorithm to study the performance of channel estimation in a cooperative communication network. Channel estimation is a system identification problem that seeks to identify the CSI of the unknown system. Suppose the cooperative channel model is $d_n = \mathbf{x}_n^T \mathbf{h} + v_n$ as in [32], where $\mathbf{h} = \mathbf{h}_1 * \mathbf{h}_2$ is the unknown channel impulse response, corresponding to $\mathbf{s}$ in (1). Denote by $h_1(i)$ $(i = 0, 1, \cdots, L_1 - 1)$ the baseband channel between the source node and the relay node, and by $h_2(i)$ $(i = 0, 1, \cdots, L_2 - 1)$ the channel impulse response from the relay node to the destination node. The cascaded channel $\mathbf{h}$ (with length $L = L_1 + L_2 - 1$) is the convolution of $\mathbf{h}_1$ and $\mathbf{h}_2$.
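A minimal sketch of this cascaded-channel construction, using the case 1 dimensions described below (the helper name and random draw are illustrative):

```python
import numpy as np

def sparse_channel(length, n_nonzero, rng):
    """Random sparse channel: n_nonzero zero-mean, unit-variance Gaussian
    taps at randomly selected positions."""
    h = np.zeros(length)
    idx = rng.choice(length, size=n_nonzero, replace=False)
    h[idx] = rng.standard_normal(n_nonzero)
    return h

rng = np.random.default_rng(2)
L1 = L2 = 16
h1 = sparse_channel(L1, 4, rng)    # source -> relay
h2 = sparse_channel(L2, 4, rng)    # relay -> destination
h = np.convolve(h1, h2)            # cascaded channel h = h1 * h2
assert h.size == L1 + L2 - 1       # L = 31, as in case 1
```

The convolution places the non-zero taps of the cascaded response at sums of the two channels' active support indices, which tends to produce grouped non-zero taps in the cooperative channel.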
Two cases are designed to assess the estimation performance of the block sparse log-sum constraint LMS algorithm through comparison with several existing classical algorithms, including the conventional LMS, the Zero-Attracting LMS (ZA-LMS), the RZA-LMS, the reweighted $l_1$-norm penalized LMS [29]-[31] and the reweighted $l_p$-norm constrained LMS [32]. The non-zero coefficients of the channel impulse response are Gaussian variables with zero mean and unit variance, and their positions are randomly selected. The input signal and additive noise are zero-mean Gaussian sequences with various SNRs.
In case 1, we assume that the channel vectors of the cooperative relay system have the same length $L_1 = L_2 = 16$, so the length of the convolved channel vector is $L = L_1 + L_2 - 1 = 31$. There are four large coefficients of $\mathbf{h}_j$ $(j = 1, 2)$ uniformly distributed and the rest are all zeros, hence the sparsity of the system is $4/31$. In case 2, the concatenated channels have the same length $L_1 = L_2 = 32$ and the convolved channel vector length is $L = L_1 + L_2 - 1 = 63$. Eight random tap-weights of $\mathbf{h}_j$ $(j = 1, 2)$ are nonzero in case 2, and the sparsity of the cooperative channel is $8/63$. The convergence of the proposed algorithm is tested at a low SNR of 10 dB and a high SNR of 20 dB, respectively. The step-size parameter is set as $\mu = 0.02$ for all algorithms. The average estimated mean squared errors (MSEs) between the actual channel state information and the estimated CSI are shown in Figure 1 and Figure 2. The parameters of the proposed BSLS-LMS channel estimation algorithm are set as follows: $\rho = 5 \times 10^{-4}$, $\eta = 0.02$ and $\varepsilon = 0.08$. For the reference algorithms, all the adjustable parameters are properly chosen to obtain the fastest convergence speed. For the ZA-LMS and RZA-LMS, we set the zero-attraction parameters as $\rho_{ZA} = \rho_{RZA} = 5 \times 10^{-4}$ and $\varepsilon_{RZA} = 10$.
The MSE results of the estimated channel impulse response in sparsity case 1 are shown in Figures 1(a) and 1(b), and the results for sparsity case 2 are shown in Figures 2(a) and 2(b). It can be seen from Figures 1 and 2 that the convergence of the sparsity-aware parameter estimation algorithms degrades as the number of non-zero channel coefficients increases. Comparing the convergence curves of all the algorithms, we conclude that the block-sparsity log-sum LMS algorithm is generally superior to the other algorithms. However, when SNR = 10 dB, the performance of the reweighted $l_1$-norm constrained sparse filtering algorithm gets closer to that of the log-sum penalty LMS algorithm. In both cases, the proposed algorithm achieves much better MSE performance as the SNR becomes large. In Figure 3, the channel length is the same as in the system of Figure 2. As the number of non-zero coefficients increases, the convergence performance of the sparse parameter estimation algorithms decreases accordingly. Through analysis of the convergence curves, we conclude that the performance of the proposed algorithm is superior to those of all the other algorithms at both SNR = 10 dB and SNR = 20 dB. We also evaluated the MSE performance against the iteration number in the dense-channel scenario, as shown in Figure 4; the block sparse log-sum algorithm clearly outperforms the traditional algorithm and all the other sparsity-aware constrained adaptive filter algorithms, which follows from the upper bound on $E(\mathbf{r}_\infty)$ in (11).
The MSEs between the coefficients estimated by the BSLS-LMS algorithm and the CSI are shown in Figures 1-4. The proposed BSLS-LMS algorithm converges faster and has better steady-state performance than all the reference algorithms. When there are very few non-zero coefficients in the impulse response of the unknown system, all the sparsity-aware adaptive filters have faster convergence speeds and better steady-state performance than the traditional LMS. In addition, the proposed method achieves a lower MSE than the other algorithms. After 500 iterations, even as the number of non-zero coefficients increases, the BSLS-LMS still achieves the best convergence among the algorithms at both low and high SNRs.

B. ON THE PERFORMANCE OF THE BLOCK LOG-SUM LMS
In the second experiment, we test the convergence performance of the BSLS-LMS algorithm by identifying block-sparse systems and comparing it with the reference algorithms, including the BS-LMS [25], $l_0$-LMS [30], STNLMS [19], SELQUE [20], and M-SELQUE [21]. The unknown systems have the same length $L = 800$, and the impulse response of each unknown system is generated by the M-G model. Based on assumption (a), the input signal and the additive noise are independent zero-mean Gaussian sequences, and the variance of the input signal is unity, i.e., $\sigma_x^2 = 1$. In this experiment, the SNRs are set as 40 dB and 20 dB, and the balance parameters $\varepsilon$ and $\eta$ are set as 0.03 and 0.008, respectively. For the BSLS-LMS and BS-LMS, the step-size $\mu$ is always set to half of the maximum value $1/(L\sigma_x^2)$, and the zero-attraction parameter $\rho_{opt}$ and the group partition size $G_{opt}$ are set as $8.66 \times 10^{-7}$ and 4, respectively. For the reference algorithms, all the parameters are properly tuned to obtain their fastest convergence speed and best performance. The simulation results are averaged over 10 independent runs for each unknown system; 100 unknown systems are generated and identified, and the learning curves of identifying these systems are then averaged.
The simulation results are plotted in Figure 6 for the unknown systems whose impulse response is plotted in Figure 5(a). When there are four clusters in the system response, the BS-LMS reaches the steady state fastest at both 20 dB and 40 dB, while the convergence performance of the BSLS-LMS is better than those of the other block-sparsity recovery algorithms.
In the third experiment, there are more than four clusters in the various unknown block-sparse systems being identified, whose impulse responses are plotted in Figure 5(b). The experimental results are shown in Figure 7, where the BSLS-LMS and BS-LMS show their advantages. Note that the BSLS-LMS, like the BS-LMS, does not need to detect the active regions of the unknown system. The convergence speeds of the SELQUE and STNLMS are obviously reduced because the gap between two clusters and the other active regions is treated as one long active region [20]; meanwhile, the other two algorithms cannot detect the active regions effectively. Although the M-SELQUE algorithm can detect all the regions, its convergence is still poor when the unknown system has multiple clusters.
According to Figure 7, the BSLS-LMS is always the best choice for identifying the various block-sparse systems generated by the M-G model with different parameter sets. Utilizing the block-sparsity prior, the BSLS-LMS has almost the same convergence speed as the $l_0$-LMS but a lower steady-state error; on the other hand, compared with the BS-LMS, the BSLS-LMS shows a slower convergence speed but better steady-state performance. Although the M-SELQUE algorithm performs better than the other active-region-detection algorithms, its convergence speed is still slower than that of the proposed algorithm. The main reason is that more iterations are needed to determine the locations of the non-zero coefficients when there are more, and more scattered, clusters, so the convergence rate decreases correspondingly, which is highly likely to be caused by the use of the M-G model. From the learning curves, the convergence performances of the SELQUE and STNLMS are poor because they are not suitable for multi-cluster systems. It can be seen from Figures 6 and 7 that the convergence performance of the proposed algorithm becomes slightly worse as the number of clusters increases. In summary, we can infer that the BSLS-LMS is more robust than the other classical algorithms, especially for sparse systems with multiple distributed clusters.
In the fourth experiment, the steady-state MSDs of the BSLS-LMS with various group partition sizes $G$ are tested against $\rho$. The results are plotted in Figure 8 for the unknown systems shown in Figure 5(c). The step-size is set as $0.66/(L\sigma_x^2)$, and $G$ is set as 1, 5, and 15. For each $G$, $\rho$ is varied from $10^{-9}$ to $10^{-5}$. It can be seen from Figure 8 that the theoretical steady-state performance of the log-sum LMS is in good agreement with the simulation results. For each group partition size, the steady-state MSD of the BSLS-LMS first decreases as $\rho$ increases from $10^{-9}$, which means that an appropriate group zero attraction helps to decrease the amplitudes of the coefficients in $C_Z$. Meanwhile, as $\rho$ continues to increase, the stronger group zero attraction enlarges the deviation of the coefficients in $C_{NZ}$. The minimum steady-state MSD and the corresponding optimal $\rho$ differ for different group partition sizes. We conclude that the simulated optimal $\rho$ agrees well with the theoretical $\rho_{opt}$, shown by the solid squares in Figure 8.
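A reduced-scale version of this sweep can be reproduced with the sketch below, which measures the empirical steady-state MSD of the recursion (6)-(7) over a grid of $\rho$ values; the system size, run lengths, and noise level are scaled-down illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(3)
L, G = 64, 4
mu = 0.66 / L                         # 0.66 / (L * sigma_x^2) with sigma_x^2 = 1
eps, eta = 0.03, 0.008
eta_bar = 1.0 / eta
sigma_v = 1e-2                        # additive-noise level (illustrative)

s = np.zeros(L)
s[8:16] = rng.standard_normal(8)      # one non-zero cluster

for rho in np.logspace(-9, -5, 9):    # sweep the zero-attraction parameter
    s_hat = np.zeros(L)
    x_buf = np.zeros(L)
    msd, cnt = 0.0, 0
    for n in range(30000):
        x_buf = np.roll(x_buf, 1)
        x_buf[0] = rng.standard_normal()
        e = x_buf @ s + sigma_v * rng.standard_normal() - x_buf @ s_hat
        f = np.empty(L)
        for i in range(0, L, G):
            f[i:i + G] = np.sign(s_hat[i:i + G]) / (eps + eta_bar * np.abs(s_hat[i:i + G]).sum())
        s_hat += mu * e * x_buf - rho * f
        if n >= 25000:                # average the steady-state MSD
            msd += np.sum((s_hat - s) ** 2)
            cnt += 1
    print(f"rho = {rho:.1e}  steady-state MSD = {msd / cnt:.3e}")
```

Plotting the printed values against $\rho$ reproduces the U-shaped trade-off described above: the MSD first drops as the zero attraction suppresses the coefficients in $C_Z$, then rises once the attraction starts biasing the coefficients in $C_{NZ}$.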

VI. CONCLUSION
In this paper, we proposed a new block sparse LMS algorithm for unknown system identification. Specifically, we utilized the $l_1$ norm and the log-sum of the adaptive tap-weights as a mixed constraint in the cost function of the algorithm. The Markov-Gaussian model was adopted to generate the impulse response of the cluster-sparse unknown system. In addition, based on the expressions for the mean square misalignment, the performance of the block sparse log-sum LMS was theoretically analyzed, and the results proved that the proposed algorithm is superior to other algorithms in theory. Finally, several experiments were designed to verify the theoretical results, and the simulation results demonstrated the superior performance of the proposed algorithm. Moreover, the proposed BSLS-LMS algorithm has low computational complexity, which makes the method practical for channel estimation and system identification. We can therefore conclude that the theoretical results predict the trend of the MSD well at signal-to-noise ratios of 20 dB and 40 dB, and that the simulation results agree well with the analytical values.

VII. ABBREVIATIONS
Please see Table 3.
JUN SUN received the bachelor's and master's degrees from Zhengzhou University and Huazhong University of Science and Technology, in 2003 and 2008, respectively. He is currently pursuing the Ph.D. degree with Zhengzhou University. He is also a Lecturer with the School of Electronic Information, Zhongyuan University of Technology. His research interests include massive MIMO systems and signal processing in wireless communication systems.
BING NING (Member, IEEE) received the B.E. degree in electronic information engineering from the Nanjing Artillery Academy of PLA, Nanjing, China, in 2009, and the Ph.D. degree in information and communication system from Zhengzhou University, Zhengzhou, China, in 2016. She is currently working in the Zhongyuan University of Technology as a Lecturer. From October 2012 to January 2013, she was a Visiting Researcher with the Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, USA. Her research interests are in the areas of wireless communications, including cognitive radio, non-orthogonal multiple access, and multiple-input multiple-output.