A Family of Derivative-Free Conjugate Gradient Methods for Constrained Nonlinear Equations and Image Restoration

In this paper, a derivative-free conjugate gradient method for solving nonlinear equations with convex constraints is proposed. The proposed method can be viewed as an extension of the three-term modified Polak-Ribière-Polyak method (TTPRP) and the three-term Hestenes-Stiefel conjugate gradient method (TTHS) using the projection technique of Solodov and Svaiter [Reformulation: Nonsmooth, Piecewise Smooth, Semismooth and Smoothing Methods, 1998, 355-369]. The proposed method adopts the adaptive line search scheme proposed by Ou and Li [Journal of Applied Mathematics and Computing 56.1-2 (2018): 195-216], which reduces the computational cost of the method. Under the assumption that the underlying operator is Lipschitz continuous and satisfies a condition weaker than monotonicity, the global convergence of the proposed method is established. Furthermore, the proposed method is extended to solve an image restoration problem arising in compressive sensing. Numerical results are presented to demonstrate the effectiveness of the proposed method.


I. INTRODUCTION
Let ϕ : R^n → R^n be a map and let Ω be a nonempty, closed and convex subset of R^n. We consider the following problem: find a vector v such that

\[ \phi(v) = 0, \qquad v \in \Omega. \tag{1} \]

Problem (1) arises in a wide variety of applications that include chemical equilibrium systems [1], economic equilibrium problems [2], power flow equations [3], non-negative matrix factorisation [4], [5], phase retrieval [6], [7], nonlinear compressed sensing [8], learning constrained neural networks [9] and financial forecasting problems [10].

(The associate editor coordinating the review of this manuscript and approving it for publication was Yeliz Karaca.)

Iterative methods such as the Newton method, the fixed-point method, quasi-Newton methods and conjugate gradient methods have been used to solve the unconstrained version of (1), that is, the case Ω = R^n. Among these, conjugate gradient methods (see [11] for instance) are well known and particularly effective for solving large-scale unconstrained optimization problems due to their simplicity and low storage requirements. Motivated by the projection scheme of Solodov and Svaiter [12], several researchers have extended conjugate gradient methods for unconstrained optimization to solve large-scale nonlinear equations. Wang et al. [13] extended the work of Solodov and Svaiter [12] and proposed a projection-type method for the nonlinear equation (1) based on an inexact Newton backtracking approach. Furthermore, Ma and Wang [14] presented a modification of the extragradient algorithm with a projection for solving constrained nonlinear monotone equations. Generalizing the idea of Zhang and Zhou [15], Yu et al. [16] proposed a constrained version of the spectral gradient projection algorithm for solving nonlinear monotone equations, in which computing the iterates requires neither matrix storage nor the solution of linear systems of equations. For recent articles see [17]-[28] and the references therein.
In [29] and [30], Zhang, Zhou and Li proposed the three-term modified Polak-Ribière-Polyak (TTPRP) and the three-term Hestenes-Stiefel (TTHS) conjugate gradient methods, and Cheng [31] proposed a two-term modified PRP method (TMPRP). The search directions of these methods are sufficiently descent independently of the line search, and the methods were used to solve unconstrained optimization problems. Quite recently, Cheng, Xiao and Hu [32] proposed a derivative-free conjugate gradient method for large-scale systems of equations, which combines a derivative-free form of the three-term modified Polak-Ribière-Polyak method (TTPRP) [29] and the two-term modified PRP method (TMPRP) of Cheng [31] through a linear combination. The numerical results reported indicate that the method competes with CG_DESCENT [33].
Inspired by [32], we use the projection method [12] and the modified line search scheme proposed in [34] to extend the TTPRP and TTHS methods to the nonlinear equation with convex constraints (1). Our proposed search direction can be viewed as an affine combination of derivative-free versions of the TTPRP and TTHS directions, and it satisfies the descent condition. It is worth noting that, unlike mainstream line searches, the modified scheme of [34] has an adaptive property that reduces the computational cost of the line search. Furthermore, under the assumption that the underlying operator is Lipschitz continuous and satisfies a condition weaker than monotonicity, we establish the global convergence of the proposed method; with this weaker condition, a larger class of mappings can be considered. Numerical experiments are reported to show the robustness and effectiveness of the proposed method. Furthermore, the method is extended to solve an image restoration problem arising in compressive sensing.
The paper is organized as follows. In Section II, we recall the key methods relevant to this work and state the proposed algorithm. In Section III, the global convergence of the proposed method is established. In Section IV, we provide numerical experiments to validate the efficiency of the proposed method. Finally, in Section V, the proposed algorithm is used to solve the ℓ1-norm regularized compressive sensing problem.

II. ALGORITHM
In this section, we begin by focusing on the conjugate gradient method designed to solve the following unconstrained optimization problem:

\[ \min_{v \in \mathbb{R}^n} \varphi(v), \tag{2} \]

where φ : R^n → R is a smooth nonlinear function whose gradient at the point v_t is ψ(v_t) =: ψ_t. A nonlinear conjugate gradient method for solving (2) generates a sequence of iterates recurrently by

\[ v_{t+1} = v_t + \alpha_t d_t, \tag{3} \]

with

\[ d_t = -\psi_t + \beta_t d_{t-1}, \qquad d_0 = -\psi_0, \tag{4} \]

where β_t is a suitable scalar parameter known as the conjugate gradient parameter and α_t is a positive stepsize determined by a line search. Recently, Zhang, Zhou and Li [29] proposed a three-term Polak-Ribière-Polyak (TTPRP) conjugate gradient method with its search direction defined as follows:

\[ d_t = -\psi_t + \beta_t^{PRP} d_{t-1} - \theta_t y_{t-1}, \qquad d_0 = -\psi_0, \tag{5} \]

where \( \beta_t^{PRP} = \psi_t^T y_{t-1} / \|\psi_{t-1}\|^2 \), \( \theta_t = \psi_t^T d_{t-1} / \|\psi_{t-1}\|^2 \) and \( y_{t-1} = \psi_t - \psi_{t-1} \). Note that, throughout, ‖·‖ denotes the Euclidean norm.
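As an illustration, the TTPRP direction (5) takes only a few lines of NumPy. The sketch below is ours (the names `ttprp_direction`, `g`, `g_prev`, `d_prev` are not from the paper); the final assertion checks the sufficient descent identity ψ_t^T d_t = -‖ψ_t‖² numerically.

```python
import numpy as np

def ttprp_direction(g, g_prev, d_prev):
    """Three-term PRP (TTPRP) direction of Zhang, Zhou and Li:
    d_t = -g_t + beta_t d_{t-1} - theta_t y_{t-1}, with
    beta_t = g_t^T y_{t-1} / ||g_{t-1}||^2,
    theta_t = g_t^T d_{t-1} / ||g_{t-1}||^2 and y_{t-1} = g_t - g_{t-1}."""
    y = g - g_prev
    denom = np.dot(g_prev, g_prev)
    beta = np.dot(g, y) / denom
    theta = np.dot(g, d_prev) / denom
    return -g + beta * d_prev - theta * y

# the third term cancels the d_{t-1} contribution in g_t^T d_t, so the
# descent identity g_t^T d_t = -||g_t||^2 holds independently of the line search
rng = np.random.default_rng(0)
g, g_prev, d_prev = rng.normal(size=5), rng.normal(size=5), rng.normal(size=5)
d = ttprp_direction(g, g_prev, d_prev)
assert np.isclose(np.dot(g, d), -np.dot(g, g))
```

The cancellation in the two rightmost terms is what makes the descent property hold for any stepsize rule, which is the point emphasised in the text.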
It is easy to see that the search direction generated by (5) satisfies

\[ \psi_t^T d_t = -\|\psi_t\|^2, \tag{6} \]

which implies that d_t is a sufficient descent direction of φ at v_t. Zhang et al. [30] proposed the TTHS method, whose search direction has the following form:

\[ d_t = -\psi_t + \frac{\psi_t^T y_{t-1}}{d_{t-1}^T y_{t-1}}\, d_{t-1} - \frac{\psi_t^T d_{t-1}}{d_{t-1}^T y_{t-1}}\, y_{t-1}, \qquad d_0 = -\psi_0. \tag{7} \]

VOLUME 8, 2020

It is also evident that the search direction defined by the TTHS method satisfies (6). In both works, that is [29] and [30], the numerical results indicate that the methods are efficient and outperform the CG_DESCENT method [33]. Motivated by the good practical behaviour of TTPRP and TTHS, we extend the TTPRP and TTHS methods to solve (1). Writing d_t^* for the direction determined by (5) and d_t^{**} for the direction determined by (7), we consider the affine combination

\[ d_t = (1-\lambda_t)\, d_t^* + \lambda_t\, d_t^{**}, \tag{8} \]

where {λ_t} is a bounded sequence. The direction (8) can be rewritten in the compact form (9), in which the conjugate gradient parameters of (5) and (7) are combined accordingly. We construct the search direction in the form (9) from a theoretical point of view only. Observe that setting λ_t = 0 recovers the TTPRP method, while λ_t = 1 yields the TTHS method.

In the following, we focus on solving (1). The projection procedure systematically generates a sequence of iterates {x_t} by

\[ x_t = v_t + \alpha_t d_t, \]

where α_t is the stepsize determined by a line search procedure that is described later and the search direction d_t has the form (11), where ϕ_t = ϕ(v_t) for simplicity and y_{t−1} is defined in terms of successive values of ϕ. In order to ensure boundedness of the proposed direction (11), motivated by the idea of Li and Fukushima [35], we introduce the modification u_{t−1} defined in (13). Based on the above, we next describe our algorithm for solving the nonlinear equation (1). But first, we recall the fundamental concept and property of the projection operator. The projection operator P_Ω is defined as the mapping from R^n to the nonempty closed convex set Ω given by

\[ P_\Omega[v] = \arg\min\{\|v - y\| : y \in \Omega\}. \]

The projection operator has the well-known nonexpansive property, that is, for any u, y ∈ R^n,

\[ \|P_\Omega[u] - P_\Omega[y]\| \le \|u - y\|. \tag{14} \]

Algorithm 1: Input.
Choose an arbitrary initial point v_0 ∈ Ω and the positive constants: Tol ∈ (0, 1), ∈ (0, 1), ζ ∈ (0, 1), ς > 0, λ > 0, τ ∈ (0, 2). Set t := 0.
Step 1. If ‖ϕ(v_t)‖ ≤ Tol, stop. Otherwise, go to Step 2.
Step 2. Compute the search direction d_t by formula (15), where the scalars and η_t are defined as in (12).
Step 3. Determine the stepsize α_t = ζ^m, where m is the smallest non-negative integer such that the line search (16) is satisfied, with ξ_t defined by (17) and µ_t ∈ [µ_min, µ_max] ⊆ (0, 1].
Step 4. Compute x_t = v_t + α_t d_t.
Step 5. If x_t ∈ Ω and ϕ(x_t) = 0, stop. Otherwise, compute the next iterate by

\[ v_{t+1} = P_\Omega\!\left[v_t - \tau \gamma_t\, \phi(x_t)\right], \qquad \text{where } \gamma_t = \frac{\phi(x_t)^T (v_t - x_t)}{\|\phi(x_t)\|^2}. \]

Step 6. Set t := t + 1 and return to Step 1.

Lemma 2.1: Let the direction d_t be generated by (15). Then, for every t ≥ 0,

\[ \phi_t^T d_t = -\|\phi_t\|^2. \tag{21} \]

Proof: Recall that {λ_t} is a bounded sequence. For t = 0, (21) obviously holds. For t ∈ N, we have
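For concreteness, the overall scheme of Algorithm 1 can be sketched as follows. This is a simplified, non-authoritative sketch: it uses the plain direction d_t = −ϕ(v_t) instead of the affine TTPRP/TTHS combination (15), a basic Armijo-type backtracking in place of the adaptive line search (16), and the standard Solodov–Svaiter hyperplane projection step; all function and parameter names are ours.

```python
import numpy as np

def solve(phi, proj, v0, zeta=0.5, sigma=1e-4, tau=1.2, tol=1e-6, max_iter=1000):
    """Derivative-free projection sketch in the spirit of Algorithm 1.
    Simplifications: d_t = -phi(v_t) instead of the TTPRP/TTHS affine
    combination, and plain backtracking instead of the adaptive line search."""
    v = proj(np.asarray(v0, dtype=float))
    for t in range(max_iter):
        f = phi(v)
        if np.linalg.norm(f) <= tol:          # stopping test (cf. Step 1)
            return v, t
        d = -f                                # descent: f^T d = -||f||^2
        alpha = 1.0
        while True:                           # backtracking line search
            x = v + alpha * d
            fx = phi(x)
            if -np.dot(fx, d) >= sigma * alpha * np.linalg.norm(d) ** 2:
                break
            alpha *= zeta
        if np.linalg.norm(fx) <= tol:         # analogue of Step 5
            return proj(x), t
        # Solodov-Svaiter hyperplane projection step
        gamma = np.dot(fx, v - x) / np.dot(fx, fx)
        v = proj(v - tau * gamma * fx)
    return v, max_iter

# usage: a Problem-1-style exponential map on the nonnegative orthant
phi = lambda v: np.exp(v) - 1.0
proj = lambda v: np.maximum(v, 0.0)          # projection onto R^n_+
v, iters = solve(phi, proj, np.full(50, 2.0))
assert np.linalg.norm(phi(v)) <= 1e-6 and iters < 1000
```

The hyperplane weight `gamma` is the standard Solodov–Svaiter quantity: the hyperplane {z : ϕ(x_t)^T (z − x_t) = 0} separates v_t from the solution set, and the update projects v_t first onto (a relaxation of) that hyperplane and then back onto Ω.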

III. CONVERGENCE ANALYSIS
To establish the convergence result, the following assumptions are required.
Assumption 1:
A1. The solution set of (1), denoted by Sol ϕ, is nonempty.
A2. The mapping ϕ : R^n → R^n is Lipschitz continuous on R^n, that is, there exists a constant L > 0 such that

\[ \|\phi(u) - \phi(w)\| \le L \|u - w\| \quad \text{for all } u, w \in \mathbb{R}^n. \]

A3. For any v ∈ Sol ϕ and w ∈ R^n, it holds that

\[ \phi(w)^T (w - v) \ge 0. \]

Assumption A3 is clearly a weaker condition than monotonicity.
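To see that A3 is strictly weaker than monotonicity, consider the scalar map ϕ(v) = v(1 + |sin v|), whose unique zero is v* = 0. The quick numerical check below (an illustrative example of ours, not taken from the paper) confirms that ϕ(w)(w − v*) ≥ 0 at every sampled point, so A3 holds, while the monotonicity inequality fails for some pairs.

```python
import numpy as np

# phi(v) = v * (1 + |sin v|) has the unique zero v* = 0 and satisfies A3:
# phi(w) * (w - 0) = w^2 * (1 + |sin w|) >= 0 for every w in R.
phi = lambda v: v * (1.0 + np.abs(np.sin(v)))

w = np.linspace(-10.0, 10.0, 10001)
assert np.all(phi(w) * (w - 0.0) >= 0.0)      # A3 holds at every sample

# yet phi is NOT monotone: (phi(u) - phi(w))(u - w) < 0 for some pairs,
# e.g. phi decreases between u = 2.5 and w = 2.6
u, w2 = 2.5, 2.6
assert (phi(u) - phi(w2)) * (u - w2) < 0.0
```

Maps of this kind lie outside the monotone class treated in much of the earlier literature, which is what the remark after A3 refers to.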
Lemma 3.1: Let the sequences {d_t} and {v_t} be generated by Algorithm 1. Then there always exists a stepsize α_t satisfying the line search (16).
Proof: Suppose, for the sake of contradiction, that there exists t_0 ≥ 0 such that the line search (16) fails to hold for every non-negative integer m; then we have the resulting inequality, where ξ_{t_0} is defined by (17). By the continuity of ϕ, letting m → ∞ yields −ϕ_{t_0}^T d_{t_0} ≤ 0, which contradicts (21). This completes the proof.
Lemma 3.2: Suppose the mapping ϕ is Lipschitz continuous and the sequences {v_t} and {x_t} are generated by Algorithm 1. Then the stepsize bound (25) holds.
Proof: From the line search procedure (16), if α_t ≠ 1, then ζ^{−1}α_t fails to satisfy the line search test. Combining this with the sufficient descent condition (21) and the Lipschitz continuity of ϕ yields the desired inequality (25).
Lemma 3.3: Suppose Assumption 1 holds and the sequence {v_t} is generated by Algorithm 1. Then, for any v* ∈ Sol ϕ, the sequence {‖v_t − v*‖} is convergent; moreover, the sequence {v_t} is bounded.
Proof: Using the weaker condition of monotonicity given by Assumption A3, we obtain the corresponding estimate. Since ξ_t ≥ µ_min > 0 for all t, combining the nonexpansive property of the projection operator (14) with the above, we infer that the sequence {‖v_t − v*‖} is non-increasing and convergent; therefore {v_t} is bounded, that is, there exists a positive constant k_0 such that ‖v_t‖ ≤ k_0 for all t ≥ 0.
Since ϕ is continuous and {v_t} is bounded, {ϕ(v_t)} is bounded; that is, there exists a positive constant k_1 such that ‖ϕ(v_t)‖ ≤ k_1 for all t ≥ 0. Moreover, using the boundedness of {ϕ_t} together with (30), we deduce that the sequence {x_t} is bounded owing to the boundedness of {v_t}; that is, there exists a constant k_2 > 0 such that ‖x_t‖ ≤ k_2 for all t, and furthermore {ϕ(x_t)} is bounded.

Theorem: Suppose Assumption 1 holds and the sequence {v_t} is generated by Algorithm 1. Then

\[ \liminf_{t \to \infty} \|\phi_t\| = 0, \tag{36} \]

and, furthermore, the sequence {v_t} converges to a solution of (1).
Proof: We prove (36) by contradiction. Suppose (36) fails to hold; that is, there exists a constant r > 0 such that ‖ϕ_t‖ ≥ r for all t ≥ 0. Combining this with the sufficient descent condition (21) implies that ‖d_t‖ ≥ r for all t ≥ 0. Using the Lipschitz continuity of ϕ in the definition of y_{t−1}, together with the definition of u_{t−1} in (13), and since the sequences {ϕ_t} and {ϕ(x_t)} are bounded by (33) and (34) respectively, it follows that {d_t} is bounded for all t ≥ 1. Since {λ_t} is a bounded sequence, we get from (35) that there exist a constant b ∈ (0, 1) and an integer n_0 such that the corresponding bound holds for all t > n_0 with t ∈ N. Moreover, since by the Lipschitz continuity of ϕ the sequences {ϕ_t} and {ϕ(x_t)} are bounded, ϕ(v_t + α_t d_t) and the sequence {d_t} are bounded. Now, by (25), we obtain an inequality which contradicts (35). Thus, (36) holds.

IV. NUMERICAL EXPERIMENTS

Problem 1:
This problem is the Exponential function [39] with constraint set Ω = R^n_+. Problem 3: The Nonsmooth function [40] with constraint set Ω = R^n_+.
Problem 4 [41]: with Ω = R^n_+. Problem 5: Strictly convex function [13], with constraint set Ω = R^n_+. Problem 6: Strictly convex function II [13], with constraint set Ω = R^n_+. Problem 7: Tridiagonal exponential function [42] with constraint set Ω = R^n_+. Problem 9: The Trig exp function [39] with constraint set Ω = R^n_+. Problem 10: The function ϕ_i(v), i = 1, 2, 3, ..., n, with Ω = R^n_+.

The parameters for DF-PRPMHS in the numerical experiments are ζ = 1, = 0.8, ς = 10^{-4}, τ = 1.2, Tol = 10^{-6}, and the sequences λ_t = 1/(2t+5)^2 and µ_t = 1/exp(t+1)^{t+1}. The parameters for the algorithms compared against are chosen as given in their respective papers. We considered dimensions n = 1000, 5000, 10000, 50000 and 100000 for each test problem. The algorithms are terminated either when the stopping criterion ‖ϕ_t‖ ≤ 10^{-6} is met or when the number of iterations exceeds 1000; whenever an algorithm does not converge within 1000 iterations, a failure is recorded. The performance profile developed by Dolan and Moré [44], based on the number of iterations, the number of function evaluations and the CPU computing time, is used to assess the detailed output of the methods.
All the methods successfully solved all the test problems in the numerical experiments. Figures 1, 2 and 3 show that the methods are competitive, with DF-PRPMHS slightly outperforming the methods compared. Based on the performance profile for the number of iterations in Figure 1, DF-PRPMHS is the better algorithm for solving (1), since it solved about 62% of the test problems with the fewest iterations, compared to NHCGPM, MHSPM and STTCGM, which solved 15%, 10% and 18% of the test problems with the fewest iterations, respectively.
From Figure 2, it is easy to see that DF-PRPMHS also outperforms NHCGPM, MHSPM and STTCGM in terms of the number of function evaluations, since DF-PRPMHS is the best solver for approximately 69% of the test problems, while NHCGPM, MHSPM and STTCGM are best for approximately 10%, 10% and 18%, respectively. Finally, regarding CPU time, Figure 3 shows that the methods are competitive, with DF-PRPMHS slightly faster than the compared methods. Overall, DF-PRPMHS is superior to NHCGPM, MHSPM and STTCGM with respect to the Dolan and Moré performance profile metrics, that is, the number of iterations, the number of function evaluations and the CPU processing time.
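The Dolan–Moré profile used above admits a compact implementation. The sketch below (with hypothetical cost data, not the paper's results) computes ρ_s(τ), the fraction of problems that solver s finishes within a factor τ of the best cost on each problem.

```python
import numpy as np

def performance_profile(T, taus):
    """Dolan-More performance profile. T[p, s] is the cost (iterations,
    function evaluations or CPU time) of solver s on problem p, with
    np.inf recording a failure. Returns rho[s][j]: the fraction of
    problems that solver s solves within a factor taus[j] of the best."""
    ratios = T / T.min(axis=1, keepdims=True)         # performance ratios
    return np.array([[np.mean(ratios[:, s] <= tau) for tau in taus]
                     for s in range(T.shape[1])])

# hypothetical iteration counts: 4 problems x 2 solvers (one failure)
T = np.array([[10., 20.], [8., 8.], [30., 15.], [12., np.inf]])
rho = performance_profile(T, taus=[1.0, 2.0, 4.0])
assert rho[0][0] == 0.75      # solver 0 is (tied) best on 3 of 4 problems
assert rho[1][0] == 0.5       # solver 1 is (tied) best on 2 of 4 problems
assert rho[0][1] == 1.0       # solver 0 solves everything within factor 2
```

The value ρ_s(1) is the fraction of problems on which solver s is (tied) best, which is exactly the 62%/15%/10%/18% comparison quoted above.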
It is worth mentioning that the compared methods (NHCGPM, MHSPM and STTCGM) made use of a line search of the form (39). It can be observed that the right-hand side of (39) becomes too large when v_t is far from the solution of the nonlinear equation with convex constraints (1); as a result, the computational cost of that line search increases. To reduce this cost, the adaptive line search scheme proposed in [34] is adopted here, which gives an insight into the good numerical performance of the proposed algorithm. Detailed numerical results are reported in Table 2.

V. APPLICATION TO IMAGE RESTORATION PROBLEMS
Image restoration is concerned with reconstructing or estimating uncorrupted images from noisy, blurred ones. The blurring may be caused by optical distortions, motion of objects during imaging, or turbulence in the atmosphere. Many science and engineering areas, such as aerial photography, remote sensing, electron microscopy and medical imaging, have current or potential applications of image restoration [45]. For instance, in medical imaging, since the human visual system can be the key to decoding the elusive functions of the brain, a large amount of medical and neuroscience research is devoted to understanding the human visual system (see, for example, [46], [47]).
In this section, we focus on the use of mathematical algorithms to perform image restoration. The general image restoration problem can be formulated as the inversion of the following observation model:

\[ b = Av + \varepsilon, \tag{40} \]

where b ∈ R^m represents the observed data, v ∈ R^n is the unknown image, ε ∈ R^m is the noise and A ∈ R^{m×n} (m < n) is a linear mapping. In order to address problem (40), one of the tools usually employed is ℓ1-regularization. The restoration is obtained by approximately solving the unconstrained optimization problem

\[ \min_{v \in \mathbb{R}^n} \frac{1}{2}\|Av - b\|^2 + \omega \|v\|_1, \tag{41} \]

where ω > 0 is the regularization parameter and ‖·‖_1 denotes the ℓ1-norm. Given the approximate equivalence between problems (1) and (41) (see [48], [49]), in what follows we illustrate the efficiency of our algorithm in approximating (41). Some recent methods for image restoration include multi-channel and multi-model based autoencoding priors for grayscale image restoration [50], formatted learning for image restoration [51], image restoration by combined order regularization with optimal spatial adaptation [52], multi-level encoder-decoder architectures for image restoration [53], a Riemannian loss for image restoration [54] and modulating image restoration with continual levels via adaptive feature modification layers [55]. In this test, the efficiency of the proposed algorithm (DF-PRPMHS) is illustrated in restoring blurred and noisy images. We considered four images of different sizes, degraded using a Gaussian blur operator and Gaussian noise with standard deviation 10^{-2}. The parameters selected to implement the algorithm are: τ = 1; ς = 10^{-4}; = 0.55; λ_t = 1/(2t+5)^2; ζ = 1; µ = 1. In addition, we compare its performance with the iterative shrinkage/thresholding (IST) method [56], designed for wavelet-based image deconvolution, and with the PSGM method [57], to reflect the performance of the DF-PRPMHS method in restoring blurred and noisy images.
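For reference, the reformulation behind the approximate equivalence of (41) and (1) follows the splitting v = u − w with u, w ≥ 0 used in [48], [49]. The sketch below assembles the resulting map; the function names are ours, and ϕ(z) = min{z, Hz + c} (componentwise minimum) is the standard form of the reformulated equation under this splitting.

```python
import numpy as np

def l1_as_constrained_equation(A, b, omega):
    """Splitting v = u - w with u, w >= 0 turns (41) into a problem of the
    form (1) on z = [u; w] >= 0, with
        H = [[A^T A, -A^T A], [-A^T A, A^T A]],
        c = omega * ones + [-A^T b; A^T b],
    and the componentwise-minimum map phi(z) = min{z, Hz + c}."""
    AtA, Atb = A.T @ A, A.T @ b
    n = A.shape[1]
    H = np.block([[AtA, -AtA], [-AtA, AtA]])
    c = omega * np.ones(2 * n) + np.concatenate([-Atb, Atb])
    phi = lambda z: np.minimum(z, H @ z + c)
    recover = lambda z: z[:n] - z[n:]                 # v = u - w
    return phi, recover

# sanity check with A = I, where (41) is solved by soft-thresholding b
A, b, omega = np.eye(2), np.array([3.0, -0.5]), 1.0
phi, recover = l1_as_constrained_equation(A, b, omega)
z_star = np.array([2.0, 0.0, 0.0, 0.0])  # encodes v* = soft(b, 1) = [2, 0]
assert np.allclose(phi(z_star), 0.0)
assert np.allclose(recover(z_star), [2.0, 0.0])
```

Since H is positive semidefinite, the resulting problem falls within the class of monotone constrained equations that the projection-type methods of this paper are designed for.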
It is worth noting that the iterative procedure for all the algorithms starts from the same initial point and ends when the tolerance satisfies Tol < 10^{-5}. Figure 4 shows the original images. In Figures 5 and 6, the blurred and noisy images and the images restored by the various algorithms are presented: the images in column (a) are the blurred and noisy images, those in column (b) are the images restored by DF-PRPMHS, those in column (c) are the images restored by IST and those in column (d) are the images restored by PSGM. In Table 1, the numerical results obtained from the implementation of the algorithms are presented. The performance of the methods is analysed using the signal-to-noise ratio (SNR) [58], the peak signal-to-noise ratio (PSNR) and the structural similarity index (SSIM) [59]. From the outcome of the experiment as reported in Table 1, we can see that the images restored by DF-PRPMHS are closer to the originals than those from IST and PSGM for all the test images, as reflected by their larger SNR, PSNR and SSIM values.
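The SNR and PSNR figures reported in Table 1 follow the usual definitions; a minimal sketch (assuming images scaled to [0, 1], with helper names of ours):

```python
import numpy as np

def snr(x, x_hat):
    """Signal-to-noise ratio (dB) of the restored image x_hat w.r.t. x."""
    return 10.0 * np.log10(np.sum(x**2) / np.sum((x - x_hat)**2))

def psnr(x, x_hat, peak=1.0):
    """Peak signal-to-noise ratio (dB); peak = 1.0 for images in [0, 1]."""
    mse = np.mean((x - x_hat)**2)
    return 10.0 * np.log10(peak**2 / mse)

# a flat toy "image" with a constant restoration error of 0.05
x = np.full((8, 8), 0.5)
restored = x + 0.05
assert abs(snr(x, restored) - 20.0) < 1e-6          # 10*log10(16/0.16)
assert abs(psnr(x, restored) - 26.0206) < 1e-3      # 10*log10(1/0.0025)
```

Both metrics increase as the restored image approaches the original, which is why the larger values in Table 1 indicate better restorations; SSIM is computed differently (from local luminance, contrast and structure statistics, see [59]).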

CONCLUSION
We have presented a derivative-free conjugate gradient method that combines the conjugate gradient directions of the three-term modified PRP method (TTPRP) and the three-term HS conjugate gradient method (TTHS) to solve nonlinear equations with convex constraints, using a projection technique and a modified line search procedure. Under the assumption that the underlying operator is Lipschitz continuous and satisfies a weaker monotonicity condition, the global convergence of the proposed method is established. A further contribution of this paper is the application of our approach to ℓ1-norm regularized compressive sensing problems. The computational experiments on restoring blurred and noisy images show that the proposed approach is competitive with the compared methods.