Efficient ADMM-based Algorithms for Convolutional Sparse Coding

Convolutional sparse coding improves on the standard sparse approximation by incorporating a global shift-invariant model. The most efficient convolutional sparse coding methods are based on the alternating direction method of multipliers and the convolution theorem. The only major difference between these methods is how they approach a convolutional least-squares fitting subproblem. This letter presents a solution to this subproblem, which improves the efficiency of the state-of-the-art algorithms. We also use the same approach for developing an efficient convolutional dictionary learning method. Furthermore, we propose a novel algorithm for convolutional sparse coding with a constraint on the approximation error.


I. INTRODUCTION
Sparse representations are widely used in various applications of signal and image processing [1]-[8]. The sparse synthesis model admits that natural signals can be approximated using a linear combination of only a small number of atoms (columns) of a dictionary (matrix). A common formulation of the sparse coding problem is

$$\min_{x} \; \Gamma(x) \quad \text{subject to} \quad \|s - Dx\|_2^2 \leq \epsilon, \tag{1}$$

where $D = [d_1, d_2, \ldots, d_K]$, $d_k \in \mathbb{R}^n$, $k = 1, \ldots, K$, is the dictionary, $x \in \mathbb{R}^K$ is the vector of sparse coefficients, and $s \in \mathbb{R}^n$ is the signal. Moreover, $\epsilon$ is the upper bound on the energy of the approximation error, and $\Gamma(\cdot)$ is a function that measures the level of sparsity of a vector, for example, the number of nonzero elements (denoted by $\|\cdot\|_0$) or its convex relaxation, the $\ell_1$-norm (denoted by $\|\cdot\|_1$). The problem of finding sparsity-promoting dictionaries is called dictionary learning [9], [10].
The applications of sparse representations and dictionary learning usually involve either or both extraction and estimation of local features. Typically, this is handled by a prior decomposition of the original signal into vectorized overlapping blocks (e.g., patches in image processing). As a drawback, this strategy results in multi-valued representations, so that each point in the signal is estimated multiple times. Moreover, since the relationships among neighboring blocks are ignored, dictionaries learned using this approach tend to contain shifted versions of the same features.
Convolutional sparse coding (CSC) incorporates a single-valued and shift-invariant model that represents the entire signal. In this model, the product $Dx$ in the standard sparse coding problem is replaced by a sum of convolutions. The convolutional form of the standard sparse coding problem (1) can be written as

$$\min_{\{x_k\}} \; \sum_{k=1}^{K} \Gamma(x_k) \quad \text{subject to} \quad \Big\| s - \sum_{k=1}^{K} d_k * x_k \Big\|_2^2 \leq \epsilon, \tag{2}$$

where $*$ denotes the convolution operator (usually, with "same" padding), and $x_k \in \mathbb{R}^n$ and $d_k \in \mathbb{R}^m$, $k = 1, \ldots, K$, are the sparse coefficient maps and the dictionary filters, respectively.
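For concreteness, the convolutional synthesis $\sum_k d_k * x_k$ can be evaluated efficiently through the convolution theorem. The following sketch is illustrative NumPy code (our own naming, not the reference implementation), using circular convolution as a stand-in for "same" padding:

```python
import numpy as np

def conv_sum(dicts, maps):
    """Sum of K circular convolutions d_k * x_k computed via the FFT.

    dicts: (K, n) array of filters zero-padded to the signal length n.
    maps:  (K, n) array of sparse coefficient maps.
    """
    D = np.fft.fft(dicts, axis=1)   # per-filter spectra
    X = np.fft.fft(maps, axis=1)    # per-map spectra
    # Convolution theorem: pointwise multiply, sum over k, invert.
    return np.real(np.fft.ifft(np.sum(D * X, axis=0)))

# Tiny usage example: a single shifted impulse in x reproduces the filter.
n, K = 16, 2
d = np.zeros((K, n)); d[0, :3] = [1.0, -2.0, 1.0]
x = np.zeros((K, n)); x[0, 5] = 1.0
s = conv_sum(d, x)   # the filter d[0] appears at positions 5..7
```

The shift-invariance of the model is visible here: moving the impulse in `x` moves the copy of the filter in `s` by the same amount.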
Several applications have shown that the CSC model performs better in handling natural signals, such as audio and images, in comparison with its standard version [11]- [18].
A majority of available CSC algorithms, including [19]-[27], are based on the alternating direction method of multipliers (ADMM) framework [28]. ADMM breaks the CSC problem into two main sub-problems, one of which is a sparse approximation problem, efficiently addressed using hard-thresholding (when $\Gamma(x) = \|x\|_0$) or a shrinkage operator (when $\Gamma(x) = \|x\|_1$), while the other entails a convolutional least-squares regression. An efficient solution to the second sub-problem, based on the convolution theorem and the Sherman-Morrison formula, is given in [21]. The CSC problem (2) is typically addressed by solving its unconstrained equivalent, which is written as

$$\min_{\{x_k\}} \; \frac{1}{2}\Big\| s - \sum_{k=1}^{K} d_k * x_k \Big\|_2^2 + \lambda \sum_{k=1}^{K} \Gamma(x_k), \tag{3}$$

where $\lambda > 0$ is a Lagrange multiplier. It is known that there is a unique $\lambda$ for each $\epsilon$. However, the appropriate value of $\lambda$ also depends on $s$ and $\{d_k\}_{k=1}^K$. Thus, despite being more convenient to solve, the unconstrained reformulation introduces data dependency to the CSC algorithm.
A common approach for convolutional dictionary learning (CDL) entails optimizing the filters and the sparse coefficient maps using a batch of $P$ training signals [20]-[23]. This problem can be formulated as follows

$$\min_{\{d_k\},\{x_k^p\}} \; \sum_{p=1}^{P} \Big( \frac{1}{2}\Big\| s_p - \sum_{k=1}^{K} d_k * x_k^p \Big\|_2^2 + \lambda \sum_{k=1}^{K} \|x_k^p\|_1 \Big) \quad \text{subject to} \quad d_k \in \mathcal{D}, \; k = 1, \ldots, K, \tag{4}$$

where $\mathcal{D} = \{d \in \mathbb{R}^m \,|\, \|d\|_2 = 1\}$. The CDL problem is usually addressed by alternating optimization with respect to $\{x_k^p\}_{k=1}^K$ and $\{d_k\}_{k=1}^K$ [19]-[21]. Several works have shown that solving (4) with respect to $\{d_k\}_{k=1}^K$ can also be done effectively and efficiently using ADMM in the frequency domain [29].
This paper presents a direct method for solving the convolutional least-squares regression, which yields a constant-factor improvement in the complexity of the available CSC algorithms. The same method can be used to improve the efficiency of existing CDL methods. Additionally, using our solution to the unconstrained CSC problem, we propose an efficient CSC algorithm with a constraint on the energy of the approximation residuals. MATLAB implementations of the proposed algorithms are available in a GitHub repository [30].
Throughout the paper, we use $(\cdot)^T$ to denote the (non-conjugate) transpose operator, $\bar{(\cdot)}$ the complex conjugate, $\hat{(\cdot)}$ the discrete Fourier transform of a signal, and $(\cdot)^\star$ the solution to an optimization problem. Moreover, we use $\odot$ and $\oslash$ to denote the element-wise multiplication and element-wise division operators, respectively.

II. PROPOSED ALGORITHMS

A. Unconstrained CSC
In this work, we consider the convex formulation of the CSC problem, i.e., we use $\Gamma(x) = \|x\|_1$.
Using variable splitting, problem (3) in ADMM form can be reformulated as [28]

$$\min_{\{x_k\},\{z_k\}} \; \frac{1}{2}\Big\| s - \sum_{k=1}^{K} d_k * z_k \Big\|_2^2 + \lambda \sum_{k=1}^{K} \|x_k\|_1 \quad \text{subject to} \quad x_k = z_k, \; k = 1, \ldots, K. \tag{5}$$

The augmented Lagrangian corresponding to (5) is written as

$$L = \frac{1}{2}\Big\| s - \sum_{k} d_k * z_k \Big\|_2^2 + \lambda \sum_{k} \|x_k\|_1 + \sum_{k} y_k^T (z_k - x_k) + \frac{\rho}{2} \sum_{k} \|z_k - x_k\|_2^2, \tag{6}$$

where $\rho > 0$ is the penalty parameter and $\{y_k\}_{k=1}^K$ are Lagrangian multipliers. Defining $u_k = (1/\rho)\, y_k$, the scaled-form ADMM iterations are expressed as

$$\begin{aligned}
\{z_k^{t+1}\} &= \arg\min_{\{z_k\}} \; \frac{1}{2}\Big\| s - \sum_{k} d_k * z_k \Big\|_2^2 + \frac{\rho}{2} \sum_{k} \|z_k - x_k^t + u_k^t\|_2^2, \\
\{x_k^{t+1}\} &= \arg\min_{\{x_k\}} \; \lambda \sum_{k} \|x_k\|_1 + \frac{\rho}{2} \sum_{k} \|z_k^{t+1} - x_k + u_k^t\|_2^2, \\
u_k^{t+1} &= u_k^t + z_k^{t+1} - x_k^{t+1}.
\end{aligned} \tag{7}$$

The second subproblem (x-update step) can be addressed in an element-wise manner using a shrinkage (soft-thresholding) operator. The solution is written as

$$x_k^{t+1} = \mathcal{S}_{\lambda/\rho}\big( z_k^{t+1} + u_k^t \big), \tag{8}$$

with the shrinkage operator defined as follows

$$\mathcal{S}_\gamma(v) = \operatorname{sign}(v) \odot \max(|v| - \gamma,\, 0). \tag{9}$$

The only challenging step is solving the first subproblem (z-update step). In a general form, this step entails solving the optimization problem

$$\min_{\{z_k\}} \; \frac{1}{2}\Big\| s - \sum_{k} d_k * z_k \Big\|_2^2 + \frac{\rho}{2} \sum_{k} \|z_k - w_k\|_2^2, \tag{10}$$

where $w_k = x_k^t - u_k^t$. Using the convolution theorem, problem (10) in the Fourier domain can be written as

$$\min_{\{\hat{z}_k\}} \; \frac{1}{2}\Big\| \hat{s} - \sum_{k} \hat{d}_k \odot \hat{z}_k \Big\|_2^2 + \frac{\rho}{2} \sum_{k} \|\hat{z}_k - \hat{w}_k\|_2^2. \tag{11}$$

Note that the filters $\{d_k\}_{k=1}^K$ are zero-padded to the size of $\{z_k\}_{k=1}^K$ before performing the discrete Fourier transform. Denoting

$$\zeta_i = [\hat{z}_1(i), \ldots, \hat{z}_K(i)]^T, \quad \delta_i = [\hat{d}_1(i), \ldots, \hat{d}_K(i)]^T, \quad \omega_i = [\hat{w}_1(i), \ldots, \hat{w}_K(i)]^T, \tag{12}$$

with $i = 1, \ldots, n$, problem (11) can be addressed as $n$ independent problems:

$$\min_{\zeta_i} \; \frac{1}{2}\big| \hat{s}(i) - \delta_i^T \zeta_i \big|^2 + \frac{\rho}{2} \|\zeta_i - \omega_i\|_2^2, \quad i = 1, \ldots, n. \tag{13}$$

Equating the derivative with respect to $\zeta_i$ to zero, we have

$$\big( \bar{\delta}_i \delta_i^T + \rho I \big)\, \zeta_i = \bar{\delta}_i\, \hat{s}(i) + \rho\, \omega_i, \tag{14}$$

which gives

$$\zeta_i^\star = \omega_i + \frac{\hat{s}(i) - \delta_i^T \omega_i}{\rho + \delta_i^T \bar{\delta}_i}\, \bar{\delta}_i. \tag{15}$$

Defining

$$\hat{c}_k = \bar{\hat{d}}_k \oslash \Big( \rho \mathbf{1} + \sum_{j} \bar{\hat{d}}_j \odot \hat{d}_j \Big), \qquad \hat{r} = \hat{s} - \sum_{j} \hat{d}_j \odot \hat{w}_j, \tag{16}$$

the solution to the z-update step based on (15) can be written as

$$\hat{z}_k^\star = \hat{w}_k + \hat{c}_k \odot \hat{r}, \quad k = 1, \ldots, K. \tag{17}$$

Computational Complexity: The available ADMM-based CSC algorithms usually address the z-update step by computing

$$\zeta_i^\star = \big( \bar{\delta}_i \delta_i^T + \rho I \big)^{-1} \big( \bar{\delta}_i\, \hat{s}(i) + \rho\, \omega_i \big), \tag{18}$$

which follows directly from (14). Solving problem (18) using direct matrix inversion results in a time complexity of $\mathcal{O}(K^3)$ [19]. However, the work of [21] demonstrated that this can be reduced to $\mathcal{O}(K)$ using the Sherman-Morrison formula. The time complexity of the proposed method is also $\mathcal{O}(K)$. However, using further simplifications, the proposed approach eliminates the need for explicit matrix inversion and requires fewer computations. In particular, performing the z-update step on a batch of $P$ images using the proposed method requires $((4K + 1)P + 3K + 1)n$ flops, while it takes $(7KP + 3K + 1)n$ flops using the method of [21], indicating a considerable improvement provided by our method.
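The two update steps above can be sketched in a few lines of NumPy. This is an illustrative sketch under our own naming conventions (not the authors' MATLAB implementation); arrays are assumed to hold the per-signal spectra stacked along the first axis:

```python
import numpy as np

def z_update(D_hat, W_hat, s_hat, rho):
    """Closed-form frequency-domain z-update:
    ẑ_k = ŵ_k + ĉ_k ⊙ r̂, with ĉ_k = conj(d̂_k) / (ρ + Σ_j |d̂_j|²)
    and r̂ = ŝ − Σ_j d̂_j ⊙ ŵ_j.
    D_hat, W_hat: (K, n) complex arrays; s_hat: (n,) complex array."""
    r_hat = s_hat - np.sum(D_hat * W_hat, axis=0)
    denom = rho + np.sum(np.abs(D_hat) ** 2, axis=0)
    return W_hat + np.conj(D_hat) * r_hat / denom

def shrink(v, gamma):
    """Soft-thresholding operator S_γ(v) = sign(v) ⊙ max(|v| − γ, 0)."""
    return np.sign(v) * np.maximum(np.abs(v) - gamma, 0.0)
```

One ADMM iteration then consists of `z_update` in the Fourier domain, an inverse FFT, `shrink` with threshold λ/ρ for the x-update, and the running dual update `u += z - x`.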

B. Constrained CSC
The ADMM formulation of the constrained CSC problem (2) is given as

$$\min_{\{x_k\},\{z_k\}} \; f(\{z_k\}) + \lambda \sum_{k=1}^{K} \|x_k\|_1 \quad \text{subject to} \quad x_k = z_k, \; k = 1, \ldots, K, \tag{19}$$

where $f(\{z_k\})$ is an indicator function of the constraint set in (2), that is,

$$f(\{z_k\}) = \begin{cases} 0, & e(\{z_k\}) \leq \epsilon, \\ +\infty, & \text{otherwise}, \end{cases} \tag{20}$$

where

$$e(\{z_k\}) = \Big\| s - \sum_{k} d_k * z_k \Big\|_2^2. \tag{21}$$

The ADMM iterations are

$$\begin{aligned}
\{z_k^{t+1}\} &= \arg\min_{\{z_k\}} \; f(\{z_k\}) + \frac{\rho}{2} \sum_{k} \|z_k - x_k^t + u_k^t\|_2^2, \\
\{x_k^{t+1}\} &= \arg\min_{\{x_k\}} \; \lambda \sum_{k} \|x_k\|_1 + \frac{\rho}{2} \sum_{k} \|z_k^{t+1} - x_k + u_k^t\|_2^2, \\
u_k^{t+1} &= u_k^t + z_k^{t+1} - x_k^{t+1}.
\end{aligned} \tag{22}$$

The z-update step requires solving the following optimization problem

$$\min_{\{z_k\}} \; f(\{z_k\}) + \frac{\rho}{2} \sum_{k} \|z_k - w_k\|_2^2, \qquad w_k = x_k^t - u_k^t. \tag{23}$$

Depending on $\{w_k\}_{k=1}^K$, problem (23) either has a trivial solution or it is equivalent to an equality-constrained optimization problem. This can be expressed as

$$\{z_k^\star\} = \begin{cases} \{w_k\}, & e(\{w_k\}) \leq \epsilon, \\ \arg\min_{\{z_k\}} \sum_{k} \|z_k - w_k\|_2^2 \;\; \text{subject to} \;\; e(\{z_k\}) = \epsilon, & \text{otherwise}. \end{cases} \tag{24}$$

Using a suitable Lagrange multiplier $\nu$, the problem in the second term of (24) can be reformulated as

$$\min_{\{z_k\}} \; \frac{1}{2}\Big\| s - \sum_{k} d_k * z_k \Big\|_2^2 + \frac{\nu\rho}{2} \sum_{k} \|z_k - w_k\|_2^2, \tag{25}$$

which has the same form as problem (10). Finding the solution of (25) using (17) and plugging it into (21) gives

$$e_\nu = \frac{1}{n}\Big\| \hat{s} - \sum_{k} \hat{d}_k \odot \hat{z}_k^\nu \Big\|_2^2, \tag{26}$$

where the division by $n$ is required by Parseval's theorem. Thus, problem (23) is simplified to a single-variable optimization problem for finding the optimal multiplier $\nu^\star$, which satisfies

$$e_{\nu^\star} = \epsilon. \tag{27}$$

Considering that $e_\nu$ is monotonically increasing in $\nu > 0$, this problem can be efficiently addressed, for example, using the secant method. Once $\nu^\star$ is known, the z-update can be performed as

$$\hat{z}_k^\star = \hat{w}_k + \hat{c}_k^{\nu^\star} \odot \hat{r}, \tag{28}$$

where $\hat{c}_k^{\nu^\star}$ and $\hat{r}$ are calculated using (16) with $\rho$ replaced by $\nu^\star \rho$.
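A minimal sketch of this constrained z-update follows. This is our own illustrative NumPy code: the secant starting points, the positivity clamp, and the stopping rule are arbitrary choices, not taken from the letter.

```python
import numpy as np

def constrained_z_update(D_hat, W_hat, s_hat, rho, eps, iters=50):
    """Constrained z-update sketch: if {w_k} already satisfies the error
    bound, return it unchanged (trivial case); otherwise locate the
    multiplier ν* with e(ν*) = ε by the secant method."""
    n = s_hat.shape[-1]
    res_hat = s_hat - np.sum(D_hat * W_hat, axis=0)   # ŝ − Σ_k d̂_k ⊙ ŵ_k

    def solve(nu):
        # Unconstrained closed form with ρ replaced by νρ.
        r_hat = res_hat / (nu * rho + np.sum(np.abs(D_hat) ** 2, axis=0))
        return W_hat + np.conj(D_hat) * r_hat

    def err(nu):
        # Residual energy, divided by n as required by Parseval's theorem.
        Z = solve(nu)
        return np.sum(np.abs(s_hat - np.sum(D_hat * Z, axis=0)) ** 2) / n

    if np.sum(np.abs(res_hat) ** 2) / n <= eps:
        return W_hat                       # constraint already satisfied
    nu0, nu1 = 1e-3, 1.0                   # arbitrary secant starting points
    f0, f1 = err(nu0) - eps, err(nu1) - eps
    for _ in range(iters):
        if f1 == f0:
            break
        nu2 = max(nu1 - f1 * (nu1 - nu0) / (f1 - f0), 1e-12)  # keep ν > 0
        nu0, f0, nu1 = nu1, f1, nu2
        f1 = err(nu1) - eps
        if abs(f1) <= 1e-12 * eps:
            break
    return solve(nu1)
```

Since e(ν) is smooth and monotonically increasing in ν, the secant iteration converges quickly; each trial ν costs only one evaluation of the closed-form solution.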

C. Dictionary Update
Addressing the CDL optimization problem (4) over $\{d_k\}_{k=1}^K$ is equivalent to solving the following optimization problem

$$\min_{\{d_k\}} \; \sum_{p=1}^{P} \frac{1}{2}\Big\| s_p - \sum_{k=1}^{K} d_k * x_k^p \Big\|_2^2 + \sum_{k=1}^{K} \Omega(d_k), \tag{29}$$

where $\Omega(d_k)$ is an indicator function associated with the constraint set in (4). Problem (29) can be efficiently addressed using the consensus ADMM method [29]. The consensus ADMM formulation of problem (29) is given as

$$\min_{\{g_k^p\},\{d_k\}} \; \sum_{p=1}^{P} \frac{1}{2}\Big\| s_p - \sum_{k=1}^{K} g_k^p * x_k^p \Big\|_2^2 + \sum_{k=1}^{K} \Omega(d_k) \quad \text{subject to} \quad g_k^p = d_k, \;\; \forall k, p, \tag{30}$$

with the ADMM iterations

$$\begin{aligned}
\{g_k^{p,t+1}\} &= \arg\min_{\{g_k^p\}} \; \sum_{p} \frac{1}{2}\Big\| s_p - \sum_{k} g_k^p * x_k^p \Big\|_2^2 + \frac{\sigma}{2} \sum_{k,p} \|g_k^p - d_k^t + v_k^{p,t}\|_2^2, \\
d_k^{t+1} &= \arg\min_{d_k} \; \Omega(d_k) + \frac{\sigma}{2} \sum_{p} \|g_k^{p,t+1} - d_k + v_k^{p,t}\|_2^2, \\
v_k^{p,t+1} &= v_k^{p,t} + g_k^{p,t+1} - d_k^{t+1},
\end{aligned} \tag{31}$$

where $\sigma > 0$ is the penalty parameter. The first subproblem (g-update) is similar to problem (10). Thus, it can be efficiently addressed using the approach proposed in Section II-A. The use of the Fourier domain-based approach requires $\{g_k^p\}$ to be the same size as $\{x_k^p\}$. As a result, the filters $\{d_k\}_{k=1}^K$ are zero-padded to the size of $\{x_k^p\}$ to be conformable with $\{g_k^p\}$. The second subproblem ($d_k$-update) can be solved simply by projecting $\frac{1}{P}\sum_{p=1}^{P} (g_k^{p,t+1} + v_k^{p,t})$ onto the constraint set, that is, by mapping the entries outside the filter support to zero and then normalizing the $\ell_2$-norm.
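The $d_k$-update projection can be sketched as follows (illustrative code with our own names; `m` denotes the filter support length, and the zero-padding convention follows the text above):

```python
import numpy as np

def project_filter(g, m):
    """Project a zero-padded filter estimate g ∈ R^n onto the constraint set
    {d : support in the first m entries, ||d||₂ = 1}: zero the entries
    outside the filter support, then normalize the ℓ₂-norm."""
    d = np.zeros_like(g)
    d[:m] = g[:m]
    nrm = np.linalg.norm(d)
    return d / nrm if nrm > 0 else d

def d_update(G, V, m):
    """d_k-update: average the consensus variables over the batch, then
    project. G: (P, n) array of g_k^{p,t+1}; V: (P, n) array of v_k^{p,t}."""
    return project_filter(np.mean(G + V, axis=0), m)
```

The projection is cheap (O(n) per filter), so the cost of the dictionary phase is dominated by the Fourier-domain g-update.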

D. CDL Algorithm
CDL problem (4) is addressed by alternating between the CSC (see Section II-A) and dictionary update (see Section II-C) subproblems. We use a single iteration for each subproblem. This approach has been shown to be effective while simplifying the algorithm [21], [29]. We also use the variable coupling approach suggested in [31], which has been shown to provide better numerical stability [21], [29]. Specifically, the sparse codes $\{x_k^p\}_{k=1}^K$ and the constrained filters $\{d_k\}_{k=1}^K$ are passed to the next subproblem.
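Putting the pieces together, one outer CDL alternation can be sketched as follows. This is an illustrative NumPy sketch under our own naming and hyperparameter conventions (e.g., σ for the dictionary-phase penalty), with circular convolutions standing in for "same" padding; it is not the authors' MATLAB implementation:

```python
import numpy as np

def _solve(A_hat, W_hat, s_hat, rho):
    """Closed-form Fourier-domain solver for the convolutional least-squares
    step: min 1/2||s − Σ_k a_k * z_k||² + ρ/2 Σ_k ||z_k − w_k||²."""
    r_hat = s_hat - np.sum(A_hat * W_hat, axis=0)
    return W_hat + np.conj(A_hat) * r_hat / (rho + np.sum(np.abs(A_hat) ** 2, axis=0))

def shrink(v, gamma):
    return np.sign(v) * np.maximum(np.abs(v) - gamma, 0.0)

def project(g, m):
    d = np.zeros_like(g); d[:m] = g[:m]
    nrm = np.linalg.norm(d)
    return d / nrm if nrm > 0 else d

def cdl_iteration(S, D, X, U, V, m, lam, rho, sigma):
    """One outer alternation: a single ADMM iteration of CSC followed by a
    single consensus-ADMM dictionary update. Shapes: S (P, n) signals,
    D (K, n) zero-padded unit-norm filters, X/U/V (P, K, n)."""
    P, K, n = X.shape
    D_hat = np.fft.fft(D, axis=1)
    # --- CSC phase: z-, x-, and u-updates per training signal ---
    for p in range(P):
        Z = np.real(np.fft.ifft(_solve(D_hat, np.fft.fft(X[p] - U[p], axis=1),
                                       np.fft.fft(S[p]), rho), axis=1))
        X[p] = shrink(Z + U[p], lam / rho)
        U[p] += Z - X[p]
    # --- dictionary phase: g-, d-, and v-updates (filters and maps swap roles) ---
    G = np.empty_like(X)
    for p in range(P):
        G[p] = np.real(np.fft.ifft(_solve(np.fft.fft(X[p], axis=1),
                                          np.fft.fft(D - V[p], axis=1),
                                          np.fft.fft(S[p]), sigma), axis=1))
    D = np.stack([project(np.mean(G[:, k] + V[:, k], axis=0), m) for k in range(K)])
    V += G - D
    return D, X, U, V
```

As described above, the constrained filters and the thresholded sparse codes are the variables carried from one phase to the next.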

III. EXPERIMENTAL RESULTS
In this section, we first compare the proposed unconstrained CSC algorithm with the state-of-the-art method, which uses the Sherman-Morrison formula in the convolutional fitting step (the SM method) [21]. Then, we compare our unconstrained and constrained CSC methods in terms of convergence speed. Finally, we compare the proposed CDL algorithm with three available methods. All methods are based on the same alternating approach explained in Section II-D and use ADMM in both phases (CSC and dictionary update). All compared methods use the SM method in the CSC phase. The compared dictionary learning methods are based on the conjugate gradient method (CG) [21], the iterative Sherman-Morrison method (ISM) [21], and a method based on the consensus ADMM framework and the Sherman-Morrison formula (SM-cns) [29]. All experiments were run on an Intel Core i5-8365U 1.60 GHz CPU. The algorithm complexities have been compared in Section II-A.

A. CSC Results
The proposed constrained and unconstrained CSC methods are compared in Fig. 2. Specifically, we executed the unconstrained CSC method using λ = 0.05, then used the observed quadratic approximation error as the bound ε for the constrained method. The compared CDL methods are all equally effective; however, as can be seen, the proposed method is substantially faster. This is achieved by using the method explained in Section II-A instead of the Sherman-Morrison formula in both the z-update step (CSC phase) and the g-update step (dictionary update phase). In Fig. 4, the convergence speeds of the proposed CDL method and SM-cns using different dictionary sizes (K) are compared. The improved computational efficiency of the proposed method can be clearly observed.

IV. CONCLUSION
An efficient solution for the convolutional least-squares fitting problem has been presented.
The proposed method has been used to substantially improve the efficiency of state-of-the-art convolutional sparse coding and dictionary learning algorithms. In addition, a novel method for convolutional sparse approximation with a constraint on the approximation error has been proposed.