A Secure-Transmission Maximization Scheme for SWIPT Systems Assisted by an Intelligent Reflecting Surface and Deep Learning

Recently, the demand for spectral and energy efficiency has increased significantly, along with new breakthroughs in programmable meta-material techniques. The integration of an intelligent reflecting surface (IRS) into simultaneous wireless information and power transfer (SWIPT) systems has attracted much attention from operators in advanced wireless communication networks (WCNs) such as fifth-generation (5G) and sixth-generation (6G) networks. However, an IRS-assisted SWIPT system faces many security risks and can easily be compromised by eavesdroppers. In this paper, we investigate the physical-layer secure transmission optimization problem in an IRS-assisted SWIPT system where a power-splitting (PS) scheme is installed in the user equipment (UE). In particular, our purpose is to maximize the system secrecy rate by jointly finding optimal solutions for the transmitter power, the PS factor of the UE, and the phase shifts matrix of the IRS under the required minimum harvested energy and maximum transmitter power. We propose an alternating optimization (AO)-based scheme to obtain optimal solutions. The proposed AO-based scheme can effectively solve both convex and non-convex problems; however, applying it in practice still poses some difficulties due to its complexity and long computation time. This is because many mathematical transformations are used and the optimal solution needs a number of iterations to achieve convergence. Therefore, we also propose 5 types of data and deep neural network (DNN) structures to potentially achieve efficiency in computations by using a deep learning (DL)-based approach. The simulation results indicate that the proposed IRS scheme improves the average secrecy rate (ASR) by up to 38.91% when the number of reflecting elements is high (30 elements) compared to a scheme without an IRS. We also observe that the DL-based approach not only provides performance similar to the AO-based scheme but also significantly reduces computation time.


I. INTRODUCTION
In recent years, wireless communication technologies have developed dramatically. The demand for quality of service (QoS) has also increased because of the rapid increase in the number of users, resulting in a scarcity of spectrum resources [1]. In addition, power consumption is constantly increasing due to expanding network infrastructure such as transmission lines, terminal equipment, and base stations (BSs). Therefore, it is becoming increasingly important to save energy. Efficient energy management helps to overcome the bottleneck of wireless network applications operating under battery and energy constraints. It not only reduces a device's dependence on battery power and its power consumption, but also provides a continuous power source for the long-term operation of devices on the network. As a result, the simultaneous wireless information and power transfer (SWIPT) transmission technique was developed to fulfill these requirements [2]-[5]. In the SWIPT system, the received signal can be used for energy harvesting (EH) and information decoding (ID). In addition, to simultaneously perform power transfer and information transmission in a SWIPT system, two practical structures are used: power splitting (PS) and time switching (TS) [5].
The associate editor coordinating the review of this manuscript and approving it for publication was Zheng Yan.
VOLUME 10, 2022. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
On the other hand, a SWIPT system also suffers negative effects, such as channel attenuation and interference signals from wireless transmission environments. Besides that, tall objects like trees, traffic signs, or buildings can block the communication link between the transmitter and the receivers in wireless communication networks (WCNs). All of them reduce the quality of the communication link and weaken the information and energy received. Fortunately, with the breakthrough developments in meta-materials in recent years, the intelligent reflecting surface (IRS) was developed and is considered an effective solution to overcome these negative effects [6], [7]. In addition, several variants of the IRS have been developed, such as the large intelligent surface (LIS) [8], [9], the large intelligent metasurface (LIM) [10], and the reconfigurable intelligent surface [11], [12]. An IRS includes an array of low-cost, passive reflecting elements. Each reflecting element is able to change the frequency, phase, amplitude, or even polarization of an incident signal [6], [7]. An IRS is introduced to generate an additional reflected link. Along with the signals received via direct communication links, the additional reflected signal can be added to suppress the channel interference of undesired receivers and improve the received signal for desired receivers. The IRS is more energy- and cost-efficient than a conventional relay system. This is because a relay system transmits and receives signals with active RF chains, whereas the IRS reflects the incident signal by reconfiguring its phase shifts without RF chains. Hence, the beamforming design in an IRS is regarded as nearly passive. The IRS also has quite low power consumption, and owing to its light weight and compact size, it is easily installed in indoor environments (e.g., on ceilings and walls) and in outdoor environments (e.g., on road signs, moving trains, and building facades).
Currently, we are living in an era of information and data explosion where sharing and exchanging information between devices takes place every day and every hour. Personal data and private communications easily become targets of security threats such as eavesdroppers (Eave's) [13]. Thus, a private conversation or communication in an advanced SWIPT network, even in combination with an IRS, may be secretly overheard. Therefore, the secure transmission problem in IRS-enabled SWIPT systems is becoming increasingly important.
Motivated by the above, and with the goals of efficient energy management, secure transmission, and signal enhancement through the IRS technique, we investigate secure transmission in an IRS-enabled SWIPT system, which is a current research topic of interest.

II. RELATED WORKS
A lot of work has already investigated the system secrecy rate optimization problem with many modern techniques, such as artificial noise (AN)-based anti-jamming and multi-antenna beamforming [13]-[15], or even in the SWIPT system itself [16], [17]. Liu et al. [16] studied the secure transmission optimization problem in SWIPT systems with multiple energy receivers (ERs) and an information receiver (IR). They aimed to optimize the ERs' weighted sum energy and the IR's secrecy rate. The authors in [17] maximized the system sum secrecy rate while satisfying the constraints on the ER's minimum harvested energy and the IR's minimum data rate. There, the secure transmission problem was addressed in a SWIPT-enabled non-orthogonal multiple access (NOMA) system that consisted of multiple IRs, multiple ERs, and a BS. Studies on optimal secrecy rates have also been conducted in WCNs with the help of IRSs [18], [19], where the authors considered an IRS-assisted wireless transmission system in which a single-antenna eavesdropper attempts to listen to communications. The secrecy rate is maximized by optimizing the IRS's reflect beamforming and the transmitter's beamforming. Both works used an alternating optimization (AO) algorithm for solving the optimization problems. Simulation results showed a significant improvement in the secrecy communication rate of the proposed scheme compared to a scheme not using an IRS.
Furthermore, there have been many studies on the secure transmission of IRS-assisted SWIPT systems. However, the secrecy rate optimization problem was not considered as the main optimization problem [20]- [23]. More specifically, the authors in [20]- [22] aim to optimize transmit beamforming while ensuring the constraints of the QoS and harvested energy. Niu et al. [23] maximized the minimum robust information rate among the legitimate IRs while the ERs are considered as potential Eave's. In addition, the IRS-assisted SWIPT system was considered in [24], [25] to optimize secrecy rate. However, the PS factor was not jointly optimized in the secure transmission of IRS-assisted SWIPT system, which is an important factor that can prolong the uptime and improve the energy efficiency of devices.
In summary, in all of the aforementioned work, the secure transmission optimization problem was mostly considered in the following system models: the SWIPT system without an IRS, the conventional IRS-assisted WCN system, the IRS-assisted SWIPT system with no secrecy rate optimization, and the IRS-assisted SWIPT system with secrecy rate optimization but without a PS scheme. Most recently, the secure transmission optimization problem was studied in an IRS-assisted SWIPT system with separate IRs and ERs [26]. At the ERs, the harvested energy was formulated by a practical non-linear model. In addition, the secrecy rate was maximized subject to constraints on EH at the ERs and transmit power at the BS, by optimizing the AN covariance, the BS's transmit beamforming, and the IRS's reflective beamforming. The AO algorithm was also implemented to solve the target problem. However, our work differs from [26], although the secure transmission issue is also considered in an IRS-assisted SWIPT system. In this paper, we consider a unified user equipment (UE) with a PS scheme, where the secrecy rate is maximized by additionally optimizing the PS factor at the UE. Furthermore, in our work, the computational efficiency of the optimization algorithm is also studied in comparison with the proposed deep learning (DL)-based approach, which the previous works did not take into account. Table 1 compares existing works related to IRS and SWIPT systems.
Although the optimization algorithm-based approach is a very powerful approach for solving most optimization problems including convex and non-convex problems, it still faces many challenges when deployed in many applications with low computation time requirements. This disadvantage comes from the implementation of optimization algorithms, which are based on iterations and complex mathematical transitions from non-convex problems to convex problems. Fortunately, the DL technique can effectively overcome these issues. DL technology has shown high efficiency when applied in WCNs [27]. Sun et al. [28] investigated the weighted minimum mean square error (WMMSE) discussed in [29], and the interference was approximated by using a deep neural network (DNN). Results showed that the WMMSE problem can be well-approximated with low computation time through a DNN model.
In this paper, to take advantage of the IRS and the SWIPT system, we investigate an IRS-assisted SWIPT system in which the IRS is deployed to improve the security of the communication link between a single-antenna transmitter and a single-antenna UE despite eavesdropping by a single-antenna Eave', as shown in Fig. 1. We not only study the secure transmission optimization problem in the IRS-assisted SWIPT system with a PS scheme in the UE, but also consider a neural network for achieving computational efficiency. The optimization problem of secure transmission is difficult to solve because it is non-convex. Fortunately, non-convex optimization problems can be effectively solved using the feasible point pursuit-successive convex approximation (FPP-SCA) algorithm [30] and the AO method [18], [19], [26]. The FPP-SCA algorithm replaces the non-convex functions (non-convex constraints, or even non-convex objective functions) with convex upper bounds at each iteration. Specifically, the concave terms are approximated around a feasible point by a convex function, and the optimal solution of the convex problem in the current iteration serves as the feasible point for the next iteration. On the other hand, the AO method optimizes one or more variables while fixing the remaining variables, in an alternating manner. Regarding the DL-based approach, training and running stages are required. After the optimization algorithm reaches feasible solutions, the optimal output along with the corresponding input is used as the training data for the DNN model. If the DNN is well-trained (i.e., the trained network can provide predictive outputs almost identical to the feasible solutions of the optimization algorithm), then the trained DNN can be applied to estimate the optimal output in the running stage with lower computation time.
In a nutshell, this paper's main contributions are as follows.
• We consider an IRS-assisted SWIPT system where a signal is transmitted to the UE while an Eave' tries to listen to the transmitter-UE communication. By deploying an IRS in the system, network security can be enhanced, and eavesdropping can be reduced. Furthermore, the UE is equipped with a PS scheme that allows it to decode information and harvest energy simultaneously. We formulate the secure transmission problem of an IRS-assisted SWIPT system with a PS scheme to maximize the system secrecy rate by finding the optimal solutions for the transmitter's power, the UE's PS factor, and the IRS's phase shifts matrix.
• We propose an AO-based scheme for solving the optimization problem, in which the FPP, SCA, and penalty methods are used.
• A DL-based approach is considered to improve computational performance. Specifically, 5 types of data and DNN structures are proposed.
Notations: Matrices and vectors are denoted by boldface capital and lower-case letters, respectively, while $(\cdot)^H$ and $(\cdot)^T$ represent the Hermitian and the transpose operations, respectively. The absolute value of a scalar is denoted by $|\cdot|$. $\mathrm{diag}\{\cdot\}$ represents the diagonal matrix whose diagonal elements are the elements of the input vector. $\mathbb{C}^{m \times n}$ represents the space of $m \times n$ complex matrices. $\mathcal{CN}(0, \sigma^2)$ denotes the circularly symmetric complex Gaussian (CSCG) distribution with zero mean and variance $\sigma^2$, and '$\sim$' means "distributed as". The symbols $\mathbb{E}\{\cdot\}$ and $\mathrm{Tr}(\cdot)$ represent the expectation and trace operations. $\mathbf{Q} \succeq 0$ means that $\mathbf{Q}$ is a positive semi-definite (PSD) matrix. The terms $\mathrm{Im}(a)$ and $\mathrm{Re}(a)$ represent the imaginary part and the real part of complex number $a$. Table 2 lists other notations used in this paper.
The subsequent sections of this paper are organized as follows. Section III presents the formulation of the problem with the system model, the proposed AO-based scheme and the proposed DL-based approach. Analysis and discussion of the simulation results are in Section IV. Finally, Section V presents the conclusion.

III. FORMULATION OF THE PROBLEM
A. CHANNEL MODEL
In this paper, an IRS-assisted SWIPT system is considered, consisting of a transmitter, a UE, an Eave', and an IRS (as shown in Fig. 1). The transmitter, the UE, and the Eave' each use a single omni-directional antenna, while the IRS is a uniform linear array (ULA) of $M$ reflecting elements indexed by $\mathcal{M} = \{1, \ldots, M\}$. The IRS is connected to a smart controller, which can configure the IRS phase shifts in real time for desired signal propagation [12], [31]. The UE is equipped with a PS scheme. The IRS is placed parallel to the x-axis and is located in the x-z plane. Let $\boldsymbol{\Phi} = \mathrm{diag}\{\beta_1 e^{j\phi_1}, \ldots, \beta_M e^{j\phi_M}\}$ denote the phase shifts matrix of the IRS, where $\phi_m$ and $\beta_m$ are the phase shift and the amplitude reflection coefficient of the $m$-th element. In practice, when designing elements of the IRS, the amplitude reflection coefficient is often set to 1 to achieve maximum signal reflection, such that we have $\beta_m = 1, \forall m$. In addition, we assume that the center point of the IRS is the reference point, where the horizontal coordinates and altitude are indicated by $\mathbf{w}_I = [x_I, y_I]^T$ and $z_I$, respectively. Therefore, the distance of the communication link from a particular user node to the IRS is approximately equal to the distance from the corresponding user node to the reference point of the IRS. The horizontal coordinates of the transmitter, the UE, and the Eave' are denoted by $\mathbf{w}_T = [x_T, y_T]^T$, $\mathbf{w}_U = [x_U, y_U]^T$, and $\mathbf{w}_E = [x_E, y_E]^T$, respectively. Because the location of the Eave' is uncertain, knowledge of the channel state information (CSI) between the transmitter and the Eave' is difficult to obtain. However, many methods and assumptions have been considered in recent studies to solve this problem. This knowledge may range from a complete lack of CSI (the approach based on studying the compound wiretap channel [32]) to partial CSI (optimizing the AN transmit covariance [33] or relaxing the orthogonality constraint [34]) and statistical CSI (meeting a target performance criterion in terms of SNR or rate at the receiver based on allocating enough power [13]) or even CSI uncertainty (adopting a deterministic model [35]-[37]). In addition, there are some methods to identify the presence of an Eave', such as detection-theoretic methods based on its local oscillator leakage power and mutual communication between the legitimate nodes based on realizations of a constructed random variable [38]. Moreover, it is reasonable to assume that the CSIs of links related to the Eave' can be known when the Eave' is considered an active user that is untrusted by the legitimate user [13]. Besides, several channel estimation techniques for IRS-assisted systems have been proposed recently, such as those mentioned in [10], [39]. Therefore, to characterize the performance limit of the secure transmission IRS-assisted SWIPT system, the CSIs of the channels involved are assumed to be either completely known at the BS/IRS or achievable based on existing channel estimation techniques. In general, for the sake of simplicity in our scenario, the channels involved are modeled as Rayleigh and Rician fading channels as follows.
Let $h_{TU} \in \mathbb{C}^{1 \times 1}$ and $h_{TE} \in \mathbb{C}^{1 \times 1}$ denote the channel gains of the transmitter-UE (T-U) and transmitter-Eave' (T-E) links, respectively. We assume that the T-U and T-E links follow the Rayleigh fading model, $h_{TU} = \sqrt{\rho_l d_{TU}^{-\alpha}}\,\tilde{h}_{TU}$ and $h_{TE} = \sqrt{\rho_l d_{TE}^{-\alpha}}\,\tilde{h}_{TE}$, where $\tilde{h}_{TU}$ and $\tilde{h}_{TE}$ denote CSCG random variables, $d_{TU}$ and $d_{TE}$ denote the distances of the corresponding communication links, calculated by $d_{TU} = \|\mathbf{w}_T - \mathbf{w}_U\|_2$ and $d_{TE} = \|\mathbf{w}_T - \mathbf{w}_E\|_2$, respectively, $\alpha$ denotes the path loss exponent, and $\rho_l$ denotes the path loss at reference distance $D_0 = 1$ m [40].
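As an illustration of the Rayleigh model described above, the sketch below draws one realization of each direct-link channel. It assumes the common form h = sqrt(ρ_l d^(-α)) h̃ with a unit-variance CSCG variable h̃; the function names, the values of ρ_l and α, and the node coordinates are hypothetical and chosen only for the example.

```python
import math
import random


def cscg_sample(rng, variance=1.0):
    """Draw one circularly symmetric complex Gaussian sample CN(0, variance)."""
    sigma = math.sqrt(variance / 2.0)  # variance is split evenly over real/imag parts
    return complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))


def rayleigh_channel(w_tx, w_rx, rho_l, alpha, rng):
    """Assumed model: h = sqrt(rho_l * d**(-alpha)) * h_tilde, h_tilde ~ CN(0, 1)."""
    d = math.dist(w_tx, w_rx)  # Euclidean distance between horizontal coordinates
    return math.sqrt(rho_l * d ** (-alpha)) * cscg_sample(rng)


rng = random.Random(0)
# Hypothetical positions in metres (not taken from the paper).
w_T, w_U, w_E = (0.0, 0.0), (40.0, 0.0), (45.0, 5.0)
h_TU = rayleigh_channel(w_T, w_U, rho_l=1e-3, alpha=3.0, rng=rng)
h_TE = rayleigh_channel(w_T, w_E, rho_l=1e-3, alpha=3.0, rng=rng)
print(abs(h_TU) ** 2, abs(h_TE) ** 2)  # instantaneous channel power gains
```

Averaged over many draws, the power gain of such a channel approaches ρ_l d^(-α), which is a quick sanity check for any channel generator of this kind.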
In fact, the IRS can be installed on the facade of a building, so the links from the transmitter to the IRS, and from the IRS to the UE and to the Eave', might not be blocked by obstructions like trees or traffic signs. As a result, there is a line-of-sight (LoS) component in these channels, which are modeled as Rician fading channels.
In this model, $d_{cl}$, $\alpha_{cl}$, and $\beta_{cl}$ represent the distances, the path loss exponents, and the Rician factors of the related communication links $cl \in \{TI, IU, IE\}$, respectively. The distances of the related communication links are measured to the reference point of the IRS. In the LoS components, the carrier wavelength $\lambda_c$ and the antenna separation determine the array response; $\psi_{cl} \in \{\psi_{TI}, \psi_{IU}, \psi_{IE}\}$ denotes the cosine of the angle of the related communication link, in which $\psi_{TI} = \frac{x_I - x_T}{d_{TI}}$ denotes the cosine of the angle of arrival (AoA) for the propagation path from the transmitter to the IRS, while $\psi_{IU} = \frac{x_U - x_I}{d_{IU}}$ and $\psi_{IE} = \frac{x_E - x_I}{d_{IE}}$ denote the cosines of the angles of departure (AoD) of the propagation paths from the IRS to the UE and to the Eave', respectively.
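The direction cosines defined above, and the phase progression they induce across a ULA, can be illustrated with a short sketch. It assumes the usual ULA response in which element m carries the phase -2π(Δ/λ_c)·m·ψ; the sign convention, the name `spacing` for the antenna separation Δ, the ground-level transmitter, and all numeric values are assumptions made for the example.

```python
import cmath
import math


def direction_cosine(x_from, x_to, distance):
    """Cosine of the AoA/AoD, e.g. psi_TI = (x_I - x_T) / d_TI."""
    return (x_to - x_from) / distance


def ula_response(num_elements, psi, spacing, wavelength):
    """Assumed ULA response: element m has phase -2*pi*(spacing/wavelength)*m*psi."""
    step = -2.0 * math.pi * (spacing / wavelength) * psi
    return [cmath.exp(1j * step * m) for m in range(num_elements)]


# Hypothetical geometry: transmitter at ground level, IRS reference point at altitude z_I.
x_T, y_T = 0.0, 0.0
x_I, y_I, z_I = 30.0, 10.0, 5.0
d_TI = math.sqrt((x_I - x_T) ** 2 + (y_I - y_T) ** 2 + z_I ** 2)
psi_TI = direction_cosine(x_T, x_I, d_TI)
a_TI = ula_response(num_elements=4, psi=psi_TI, spacing=0.05, wavelength=0.1)
print(psi_TI, [round(abs(a), 6) for a in a_TI])
```

Every entry of the response has unit modulus, and consecutive entries differ by the same phase step, which is the property the LoS component of a Rician ULA channel relies on.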

B. COMMUNICATION MODEL
The transmitter sends the signal $x_t = \sqrt{P}s$, where $s$ denotes the information-bearing symbol, which follows a CSCG distribution with $\mathbb{E}\{|s|^2\} = 1$, and $P$ denotes the transmitter power. In this paper, the IRS is assumed to be able to impose an additional time delay on the incident signals, which not only helps the coherent superposition of multiple copies of the desired signals but also guarantees their synchronization in time. Specifically, one possible approach is to cascade delay-adjustable elements [42] with the existing phase-adjustable elements [6]. In addition, to ensure that the incident signals are reflected independently by all IRS elements, signal coupling among neighboring IRS elements is assumed not to exist. Moreover, due to the severe path loss, we only consider signals that are reflected by the IRS once [40], [43] and ignore signals that are reflected by the IRS two or more times. The received signals at the UE and the Eave' are the superposition of the direct-link signal and the IRS-reflected signal plus antenna noise, where $n_U \sim \mathcal{CN}(0, \sigma_U^2)$ and $n_E \sim \mathcal{CN}(0, \sigma_E^2)$ denote the noise from the antenna at the UE and the Eave', respectively. By using the PS scheme, the UE is able to execute EH and ID simultaneously. In the PS structure, the received signal is divided into ID and EH streams with PS factors $\theta$ and $(1 - \theta)$, respectively, where $\theta \in (0, 1)$. The ID process is only executed on the ID stream at the UE, and thus, the signal-to-noise ratio (SNR) at the UE and the Eave' can be obtained. Accordingly, the achievable rates at the UE and the Eave' follow from these SNRs, where $v \sim \mathcal{CN}(0, \delta_U^2)$ is the noise of the circuit on the ID stream at the UE shown in Fig. 1. On the EH stream, the EH process is executed, and thus, the harvested energy at the UE is determined from the power of the EH stream, where $\mu \in (0, 1]$ denotes the efficiency of the EH process on the EH stream at the UE. In this paper, for simplicity in computation, the UE is assumed to harvest all the energy from the received signal, and thus, $\mu$ is fixed at 1 ($\mu = 1$) for the remainder of this paper.
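The communication model above can be made concrete with a minimal numerical sketch. It assumes an effective channel equal to the direct link plus the single-bounce IRS path, a UE SNR of θP|g|²/(θσ_U² + δ_U²), an Eave' SNR of P|g|²/σ_E², and a harvested energy of μ(1 − θ)P|g|² with the noise contribution neglected. These closed forms, the conjugation convention for the IRS-UE channel, the function names, and all numeric values are assumptions for illustration rather than the paper's exact expressions.

```python
import cmath
import math


def effective_channel(h_direct, h_ti, h_iu, phases):
    """Direct link plus single-bounce IRS path: h_direct + h_iu^H diag(e^{j phi}) h_ti."""
    reflected = sum(b.conjugate() * cmath.exp(1j * phi) * a
                    for a, b, phi in zip(h_ti, h_iu, phases))
    return h_direct + reflected


def ue_rate(p, theta, gain, sigma2_u, delta2_u):
    """Rate on the ID stream: antenna noise is scaled by theta, circuit noise is not."""
    snr = theta * p * gain / (theta * sigma2_u + delta2_u)
    return math.log2(1.0 + snr)


def eave_rate(p, gain, sigma2_e):
    """Rate at the Eave', which decodes the full received signal."""
    return math.log2(1.0 + p * gain / sigma2_e)


def harvested_energy(p, theta, gain, mu=1.0):
    """Power of the EH stream, neglecting the noise contribution (an assumption)."""
    return mu * (1.0 - theta) * p * gain


# Hypothetical channel values for a 4-element IRS.
h_tu = 1e-3 * cmath.exp(1j * 0.3)
h_ti = [2e-2 * cmath.exp(1j * 0.1 * m) for m in range(4)]
h_iu = [3e-2 * cmath.exp(-1j * 0.2 * m) for m in range(4)]

# Co-phase every reflected path with the direct path.
aligned = [cmath.phase(h_tu) - cmath.phase(b.conjugate() * a)
           for a, b in zip(h_ti, h_iu)]
g_zero = effective_channel(h_tu, h_ti, h_iu, [0.0] * 4)
g_aligned = effective_channel(h_tu, h_ti, h_iu, aligned)

P, theta = 1.0, 0.6
r_u = ue_rate(P, theta, abs(g_aligned) ** 2, sigma2_u=1e-9, delta2_u=1e-9)
e_h = harvested_energy(P, theta, abs(g_aligned) ** 2)
print(abs(g_zero), abs(g_aligned), r_u, e_h)
```

The example also shows the basic trade-off of the PS scheme: raising θ increases the UE rate and lowers the harvested energy, which is why θ is optimized jointly with the other variables.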

C. OBJECTIVE PROBLEM
For secure transmission, the achievable rate of the legitimate user should be maximized, whereas that of the Eave' should be minimized. The secrecy rate, defined as the difference between the achievable rates of the user and the Eave' [13], is commonly used as the performance metric. Therefore, the secrecy rate at the UE in bits/second/Hertz (bps/Hz) is this difference passed through the function $(x)^+ = \max(x, 0)$. In this work, we aim to maximize the system secrecy rate by optimizing the PS factor, $\theta$, the transmitter power, $P$, and the phase shifts matrix, $\boldsymbol{\Phi}$, subject to constraints on the required harvested energy and power. The resulting secrecy rate optimization problem is referred to as problem (11), where $P_{\max}$ denotes the required maximum transmitter power, and $e$ represents the required minimum harvested energy.
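For orientation, the structure of the secrecy rate and of problem (11) described above can be written compactly as below. This is a sketch under assumptions: the symbols $R_U$, $R_E$, and $E$ for the two achievable rates and the harvested energy are names introduced here, the unit-modulus constraint reflects the choice $\beta_m = 1$, and the order and labels of the constraints in the original formulation may differ.

```latex
\begin{align}
R_{\mathrm{sec}} &= \left( R_U - R_E \right)^{+}, \qquad (x)^{+} = \max(x, 0), \\
\max_{P,\,\theta,\,\boldsymbol{\Phi}} \; R_{\mathrm{sec}}
\quad \text{s.t.} \quad
& E \ge e, \\
& 0 \le P \le P_{\max}, \\
& 0 < \theta < 1, \\
& \left| [\boldsymbol{\Phi}]_{m,m} \right| = 1, \; \forall m \in \mathcal{M}.
\end{align}
```

The coupling of $P$, $\theta$, and $\boldsymbol{\Phi}$ in both the objective and the energy constraint is what makes the problem non-convex and motivates treating the variables in blocks.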

D. THE PROPOSED AO-BASED SCHEME FOR THE SECURE TRANSMISSION PROBLEM
In this section, we propose an AO-based algorithm for solving problem (11), which provides the optimal values of $P$, $\theta$, and $\boldsymbol{\Phi}$ in an alternating manner. Since the AO method optimizes one or more variables while fixing the remaining variables, in the proposed scheme $P$ and $\theta$ are optimized for a fixed $\boldsymbol{\Phi}$ by the FPP-SCA method, while $\boldsymbol{\Phi}$ is optimized for given $P$ and $\theta$ by FPP-SCA and a penalty method.
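The alternating pattern itself can be sketched on a toy problem. The example below is not the paper's sub-problems: it minimizes a hypothetical two-variable function whose per-block minimizer has a closed form, so that x stands in for the block (P, θ) and y for the block Φ. The objective, function names, and tolerances are all illustrative assumptions.

```python
def objective(x, y):
    """Toy non-convex objective, convex in x for fixed y and vice versa."""
    return (x * y - 2.0) ** 2 + 0.1 * (x - y) ** 2


def best_block_given_other(other):
    """Closed-form minimizer over one variable with the other fixed.

    Setting the derivative 2*other*(x*other - 2) + 0.2*(x - other) to zero
    gives x = 4.2*other / (2*other**2 + 0.2).
    """
    return 4.2 * other / (2.0 * other ** 2 + 0.2)


def alternating_optimization(y0, tol=1e-12, max_iter=1000):
    y = y0
    x = best_block_given_other(y)          # first update of block 1
    history = [objective(x, y)]
    for _ in range(max_iter):
        y = best_block_given_other(x)      # update block 2 with block 1 fixed
        x = best_block_given_other(y)      # update block 1 with block 2 fixed
        history.append(objective(x, y))
        if history[-2] - history[-1] < tol:
            break
    return x, y, history


x_opt, y_opt, history = alternating_optimization(y0=1.0)
print(x_opt, y_opt, history[-1], len(history))
```

Because each block update can only lower the objective, the recorded values form a non-increasing sequence, which is the same monotonicity argument used for the convergence of AO-type schemes.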

1) FINDING P, θ WITH A GIVEN Φ
Since $\boldsymbol{\Phi}$ is fixed, constraint (11e) is satisfied. We then remove the logarithm function and introduce two variables $u, v \ge 0$. Specifically, $u^2$ is used for the numerator and $v$ is used for the denominator of the objective, which yields an inequality that is always guaranteed. Therefore, problem (11) can be changed to problem (12), which in turn can be changed to problem (13). Due to the non-convexity of $-\frac{u^2}{v}$ in objective function (13a) and of $-\frac{1}{\theta}$ in constraint (13b), problem (13) is non-convex. Therefore, we apply the first-order Taylor approximations (14) and (15) to these two terms. After that, non-convex problem (13) is reformulated into an approximated convex problem, with one sub-problem per iteration $n$, using the FPP-SCA method. Following the FPP-SCA method [30], slack variables $s_1, s_2, s_3, s_4$ are also added into problem (13) to generate a feasible point. By substituting (14) and (15) and adding the slack variables, non-convex problem (13) is converted into convex problem (16), where $\lambda$ is a trade-off factor between the slack term and the objective function. Then, the convex optimization problem can be solved by the interior-point method [44], [45] with a solver tool like Matlab's CVX [46]. Finally, the proposed FPP-SCA algorithm is presented in Algorithm 1.
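The two linearizations used in this step can be checked numerically. The sketch below writes out the standard first-order Taylor expansions of u²/v around (u_n, v_n) and of 1/θ around θ_n; since both functions are convex for positive arguments, the expansions are global lower bounds, so their negatives upper-bound the concave terms. The exact arrangement of (14) and (15) in the original may differ, and the function names are hypothetical.

```python
def quad_over_lin_lower(u, v, u_n, v_n):
    """First-order Taylor expansion of u**2 / v around (u_n, v_n), with v_n > 0."""
    return 2.0 * u_n / v_n * u - (u_n / v_n) ** 2 * v


def inverse_lower(theta, theta_n):
    """First-order Taylor expansion of 1 / theta around theta_n > 0."""
    return 2.0 / theta_n - theta / theta_n ** 2


# Expansion point (hypothetical values).
u_n, v_n, theta_n = 1.5, 2.0, 0.4

# The bounds are tight at the expansion point ...
print(quad_over_lin_lower(u_n, v_n, u_n, v_n), u_n ** 2 / v_n)
print(inverse_lower(theta_n, theta_n), 1.0 / theta_n)

# ... and never exceed the true functions elsewhere.
grid = [0.1 * k for k in range(1, 40)]
worst_gap = min(u ** 2 / v - quad_over_lin_lower(u, v, u_n, v_n)
                for u in grid for v in grid)
print(worst_gap)
```

Tightness at the expansion point is what lets each SCA iterate remain feasible for the next convex sub-problem, while the global bound guarantees that the surrogate never overstates the true objective.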
In each iteration of Algorithm 1, the convex problem is solved with Matlab's CVX to obtain the optimal solution, and the $(n + 1)$-th iteration reuses the optimal solution of the $n$-th iteration as its feasible point. In other words, the optimal solutions $u^*$ and $v^*$ are assigned to $u^{(n+1)}$ and $v^{(n+1)}$, respectively. This process is repeated until convergence. Algorithm 1 converges to a stationary point when the initial point is feasible, because the surrogate functions satisfy the convergence conditions mentioned in Section II-C of [47]. Furthermore, the FPP-SCA approach with additional slack variables yields a non-increasing cost sequence [30], and thus, Algorithm 1 is guaranteed to converge.

2) FINDING Φ WITH A GIVEN P, θ
Algorithm 1 Proposed FPP-SCA Algorithm
Input: fixed initial phase shifts matrix $\boldsymbol{\Phi}$, convergence conditions $(\varepsilon_1, \varepsilon_2)$, a feasible point with an initial point $u^{(0)}, v^{(0)}, P^{(0)}, \theta^{(0)}$, $\lambda = 100$, required minimum harvested energy ($e$), required maximum transmit power ($P_{\max}$), and $n = 0$
Output: the optimal values $P^*, \theta^* \Rightarrow R^*_{\mathrm{sec}}$
Initialization (at $n = 0$): choose $P^{(0)} = P_{\max}$ and $\theta^{(0)} = 0.5$ such that they satisfy constraints (12e) and (12f), respectively; then calculate $u^{(0)}$ and $v^{(0)}$ such that they satisfy constraints (12b) and (12c), respectively.
Iteration: solve problem (16) using the Matlab CVX solver and update the iterate; stop when the first convergence measure is $\le \varepsilon_1$ and $s_1 + s_2 + s_3 + s_4 \le \varepsilon_2$.
Return: $P^* \leftarrow P^{(n)}$, $\theta^* \leftarrow \theta^{(n)}$; calculate the optimal secrecy rate $R^*_{\mathrm{sec}}$ based on $P^*$ and $\theta^*$.
By removing the logarithm function, performing some computational operations, and fixing $P$ and $\theta$, problem (11) with regard to (w.r.t.) $\boldsymbol{\Phi}$ becomes problem (17). Let $\mathbf{q} = [e^{j\phi_1}, \ldots, e^{j\phi_M}]^H$ and $\tilde{\mathbf{q}} = [\mathbf{q}; 1]$. The numerator and denominator of (17a) are then converted into expressions in $\tilde{\mathbf{q}}$, and problem (17) is rewritten as the more tractable problem (22). The optimal solution to problem (22) is not easy to find, since objective function (22a) is not only non-concave w.r.t. $\tilde{\mathbf{q}}$ but also fractional. In addition, constraint (22c) is a non-convex quadratic equality constraint for each $m$. Let $\mathrm{Tr}(\mathbf{Q})$ denote the trace of matrix $\mathbf{Q}$, and define $\mathbf{Q} = \tilde{\mathbf{q}}\tilde{\mathbf{q}}^H$, where $\mathbf{Q}$ is a PSD matrix and $\mathrm{rank}(\mathbf{Q}) = 1$. Then, problem (22) is transformed into problem (23). By adding the two variables $u, v$ to transform the fractional function in a way similar to the transformation from problem (11) to problem (13) in Section III-D1, problem (23) changes to problem (24). Since we will also apply the FPP-SCA method to solve this problem, it is essential to find a feasible point for the final convex problem. Therefore, we find feasible point $\mathbf{Q}^{(0)}$ from problem (24). Because variables $u$ and $v$ appear only in constraints (24b) and (24c), we do not use these constraints when finding $\mathbf{Q}^{(0)}$, which gives problem (25). Solving problem (25), we can find feasible point $\mathbf{Q}^{(0)}$. Next, problem (24) has rank-1 constraint (23e), so it is still non-convex. Therefore, we use the penalty method to handle the rank-1 constraint, as mentioned in [48], [49]. We know that all eigenvalues of $\mathbf{Q}$ are non-negative, since $\mathbf{Q}$ is a PSD matrix. Thus, $\mathrm{Tr}(\mathbf{Q}) \ge \lambda_{\max}(\mathbf{Q})$ holds, where $\lambda_{\max}(\mathbf{Q})$ is the maximum eigenvalue of $\mathbf{Q}$. Moreover, $\mathrm{Tr}(\mathbf{Q}) = \lambda_{\max}(\mathbf{Q})$ if and only if $\mathrm{rank}(\mathbf{Q}) = 1$. From this insight, $\mathrm{Tr}(\mathbf{Q}) - \lambda_{\max}(\mathbf{Q})$ should become smaller in each of the subsequent iterations. By using the penalty method, we add the term $\eta\,(\mathrm{Tr}(\mathbf{Q}) - \lambda_{\max}(\mathbf{Q}))$ to objective function (24a), where $\eta$ is the penalty factor, and problem (24) is rewritten as problem (26). In problem (26), the rank-1 solution of $\mathbf{Q}$ can be obtained when the penalty factor is large enough. However, problem (26) is still non-convex. We apply the first-order Taylor approximation to the $\frac{u^2}{v}$ function, as seen in (14), and convert the non-convex problem into the iterative optimization problem (27) using the FPP-SCA method, which retains constraints (26e), (23c), and (23d).
Regarding the non-convex term $-\lambda_{\max}(\mathbf{Q})$, we observe that $\lambda_{\max}(\cdot)$ is a convex function [49]. Therefore, we can approximate the $\lambda_{\max}(\cdot)$ function in an iterative manner. We recall Theorem 1 regarding the maximum eigenvalue, which is given in [50].
Theorem 1: Let $\mathbf{X}$ and $\mathbf{Y}$ be PSD matrices. Then $\lambda_{\max}(\mathbf{X}) - \lambda_{\max}(\mathbf{Y}) \ge \mathbf{y}_{\max}^H (\mathbf{X} - \mathbf{Y})\, \mathbf{y}_{\max}$ holds, where $\lambda_{\max}(\cdot)$ is the maximum eigenvalue function, and $\mathbf{y}_{\max}$ is the eigenvector corresponding to the maximum eigenvalue of $\mathbf{Y}$.
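Theorem 1 can be checked numerically, together with the way it turns the penalty term Tr(Q) − λ_max(Q) into a convex surrogate. The sketch below uses NumPy and assumes a unit-norm eigenvector, which `numpy.linalg.eigh` returns; the function names and the random test matrices are hypothetical.

```python
import numpy as np


def lambda_max_and_vector(mat):
    """Largest eigenvalue and its unit-norm eigenvector for a Hermitian matrix."""
    vals, vecs = np.linalg.eigh(mat)   # eigenvalues are returned in ascending order
    return vals[-1], vecs[:, -1]


def rank_one_gap(q_mat):
    """Tr(Q) - lambda_max(Q); zero when the PSD matrix Q has rank at most one."""
    lam, _ = lambda_max_and_vector(q_mat)
    return np.trace(q_mat).real - lam


def linearized_gap(q_mat, q_prev):
    """Surrogate Tr(Q) - [lambda_max(Q_prev) + w^H (Q - Q_prev) w], linear in Q."""
    lam_prev, w = lambda_max_and_vector(q_prev)
    lower = lam_prev + np.real(np.vdot(w, (q_mat - q_prev) @ w))
    return np.trace(q_mat).real - lower


rng = np.random.default_rng(0)
a = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
b = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
q_new = a @ a.conj().T     # generic full-rank PSD matrix
q_prev = b @ b.conj().T    # previous iterate (also PSD)

# By Theorem 1 the surrogate upper-bounds the true penalty term.
print(rank_one_gap(q_new), linearized_gap(q_new, q_prev))
```

Since the surrogate is linear in Q and never smaller than the true gap, minimizing it inside a convex sub-problem also pushes Tr(Q) − λ_max(Q) toward zero.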
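From Theorem 1, we obtain the following inequality for the PSD matrices $\mathbf{Q}$ and $\mathbf{Q}^{(n)}$: $\lambda_{\max}(\mathbf{Q}) - \lambda_{\max}(\mathbf{Q}^{(n)}) \ge (\mathbf{w}_{\max}^{(n)})^H (\mathbf{Q} - \mathbf{Q}^{(n)})\, \mathbf{w}_{\max}^{(n)}$, where $\mathbf{w}_{\max}^{(n)}$ is the eigenvector corresponding to the maximum eigenvalue $\lambda_{\max}(\mathbf{Q}^{(n)})$ of $\mathbf{Q}^{(n)}$. Using $\mathbf{w}_{\max}^{(n)}$, we solve convex sub-problem (29) in the $n$-th iteration. Then, the optimal solutions $u^*$, $v^*$, and $\mathbf{Q}^*$ of the $n$-th convex sub-problem are used in the $(n + 1)$-th iteration (i.e., we update $u^{(n+1)}$, $v^{(n+1)}$, and $\mathbf{Q}^{(n+1)}$ with $u^*$, $v^*$, and $\mathbf{Q}^*$, respectively). We obtain $\mathbf{Q} = \lambda_{\max}(\mathbf{Q})\, \mathbf{w}_{\max} \mathbf{w}_{\max}^H$ when $\mathrm{Tr}(\mathbf{Q}) \approx \lambda_{\max}(\mathbf{Q})$. After that, the optimal solution vector is $\tilde{\mathbf{q}} = \sqrt{\lambda_{\max}(\mathbf{Q})}\, \mathbf{w}_{\max}$, and the optimal phase shifts vector $\mathbf{q}^*$ can be calculated as $\mathbf{q}^* = [\tilde{\mathbf{q}}]_{(1:M)}$. From the definitions $\boldsymbol{\Phi} = \mathrm{diag}\{e^{j\phi_1}, \ldots, e^{j\phi_M}\}$ and $\mathbf{q} = [e^{j\phi_1}, \ldots, e^{j\phi_M}]^H$, we get the optimal phase shifts matrix $\boldsymbol{\Phi}^*$ from $\mathbf{q}^*$ with $\boldsymbol{\Phi}^* = \mathrm{diag}(\mathbf{q}^{*H})$. Algorithm 2 presents the proposed iterative algorithm based on FPP-SCA and the penalty method, while Algorithm 3 presents the proposed overall iterative algorithm for solving main problem (11).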
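The recovery of the phase shifts from a (near) rank-one Q can be sketched with NumPy as follows. Two steps go beyond what the text states and are assumptions of this example: the eigenvector is rescaled so that the last entry of q̃ equals 1 (an eigenvector is only defined up to a common phase), and the entries are projected onto the unit circle to absorb numerical error. The function name is hypothetical.

```python
import numpy as np


def recover_phase_shifts(q_mat):
    """Recover q* and Phi* = diag(q*^H) from a (near) rank-one PSD matrix Q."""
    vals, vecs = np.linalg.eigh(q_mat)              # ascending eigenvalues
    q_tilde = np.sqrt(vals[-1]) * vecs[:, -1]       # q_tilde = sqrt(lambda_max) * w_max
    q_tilde = q_tilde / q_tilde[-1]                 # assumption: fix the phase so the last entry is 1
    q = q_tilde[:-1]                                # first M entries
    q = q / np.abs(q)                               # assumption: project onto unit modulus
    phi_mat = np.diag(np.conj(q))                   # Phi* = diag(q*^H)
    return q, phi_mat


# Build a rank-one Q from known phases and check that they are recovered.
true_phases = np.array([0.3, -1.2, 2.5, 0.0])
q_true = np.exp(-1j * true_phases)                  # q = [e^{j phi_1}, ..., e^{j phi_M}]^H
q_tilde_true = np.concatenate((q_true, [1.0]))
q_mat = np.outer(q_tilde_true, q_tilde_true.conj())

q_hat, phi_hat = recover_phase_shifts(q_mat)
print(np.angle(np.diag(phi_hat)))                   # recovered phases phi_m
```

When the solver returns a Q that is only approximately rank one, the same routine yields the best rank-one fit in the dominant-eigenvector sense, which is why the penalty term is driven close to zero before this step.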
Regarding Algorithm 2, we need to determine the initial feasible points $u^{(0)}$, $v^{(0)}$, and $\mathbf{w}_{\max}^{(0)}$ of convex sub-problem (29), where $\mathbf{w}_{\max}^{(0)}$ is the eigenvector corresponding to the maximum eigenvalue $\lambda_{\max}(\mathbf{Q}^{(0)})$ of the initial feasible point $\mathbf{Q}^{(0)}$. The initial feasible point $\mathbf{Q}^{(0)}$ can be obtained by solving problem (25) (step 1 in Algorithm 2). After that, $u^{(0)}$ and $v^{(0)}$ can be calculated such that constraints (27b) and (27c) are satisfied, respectively (step 2 in Algorithm 2). Note that, in step 2, calculating $u^{(0)}$ and $v^{(0)}$ requires the matrices $\mathbf{A}_1$ and $\mathbf{A}_2$. As analyzed in Section III-D2, the matrices $\mathbf{A}_1$ and $\mathbf{A}_2$ are involved in the calculation of the values $B$ and $C$, respectively, which are also computed based on $P$ and $\theta$. Because Algorithm 2 finds the phase shifts matrix with $P$ and $\theta$ fixed, the transmit power $P$ and the PS factor $\theta$ used here are the optimal transmit power and the optimal PS factor obtained from Algorithm 1.
For the convergence analysis, similar to Algorithm 1, in Algorithm 2 the $n$-th optimal solution $\mathbf{Q}^*$, $u^*$, $v^*$ is a feasible point of problem (29) at the $(n + 1)$-th iteration, and the optimal value of problem (29) is non-increasing over the iterations and converges to a stationary point. Consequently, the convergence of Algorithm 2 is guaranteed.

3) THE COMPUTATIONAL COMPLEXITY OF THE PROPOSED AO-BASED ALGORITHM
In this section, we discuss the computational complexity of the proposed AO-based scheme. The computational complexity comes mainly from steps 2 and 3 in Algorithm 3, which include the computational complexity of Algorithm 1 and Algorithm 2. As observed in Algorithm 1, problem (16) consists only of scalar non-negative variables. Therefore, the computational complexity of Algorithm 1 can be neglected. Besides, at step 1 of Algorithm 2, problem (25) is solved once to find the initial feasible point $\mathbf{Q}^{(0)}$, and thus, the computational complexity of this step can also be ignored. Finally, the computational complexity of the proposed overall algorithm comes mainly from step 3 to step 7 of Algorithm 2 when solving problem (29). It is noteworthy that convex sub-problem (29) can be solved by using the interior-point method. Therefore, the computational complexity can be calculated based on Theorem 3.12 of [45]. According to this theorem, in each iteration, when a semi-definite programming problem with an $n \times n$ PSD matrix and $m$ constraints is given, the computational complexity is $\mathcal{O}\left(\sqrt{n} \log\left(\frac{1}{\xi}\right) \left(mn^3 + m^2n^2 + m^3\right)\right)$, where $\xi > 0$ is the solution accuracy and $\mathcal{O}(\cdot)$ is the big-O notation. For problem (29), since the PSD matrix $\mathbf{Q}$ is an $(M + 1) \times (M + 1)$ matrix, we can set $n = M + 1$. In addition, as observed in problems (23) and (27), we can set $m = 4$ due to constraints (27b), (27c), (27d), and (23c), which are related to the PSD matrix $\mathbf{Q}$. If we denote the number of iterations required for the proposed algorithm to converge as $K_1$, the total computational complexity of the proposed algorithm is approximated as $\mathcal{O}\left(K_1 \log\left(\frac{1}{\xi}\right) (M + 1)^{3.5}\right)$ because of the small number of constraints ($m = 4$).
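To give a feel for how this bound scales with the number of reflecting elements, the short sketch below evaluates the per-iteration expression with n = M + 1 and m = 4, up to the constant hidden by the big-O notation. The function names, the accuracy ξ, and the iteration count are hypothetical inputs chosen for the example.

```python
import math


def sdp_iteration_cost(n, m, xi):
    """sqrt(n) * log(1/xi) * (m*n**3 + m**2*n**2 + m**3), up to a constant factor."""
    return math.sqrt(n) * math.log(1.0 / xi) * (m * n ** 3 + m ** 2 * n ** 2 + m ** 3)


def ao_total_cost(num_elements, num_iterations, xi, m=4):
    """Total cost for K_1 iterations with an (M + 1) x (M + 1) PSD variable."""
    return num_iterations * sdp_iteration_cost(num_elements + 1, m, xi)


# Relative growth when the IRS size goes from 10 to 30 elements.
cost_10 = ao_total_cost(num_elements=10, num_iterations=20, xi=1e-4)
cost_30 = ao_total_cost(num_elements=30, num_iterations=20, xi=1e-4)
print(cost_30 / cost_10)
```

For large M the m·n³ term dominates, so doubling M + 1 multiplies the cost by roughly 2^3.5, in line with the (M + 1)^3.5 approximation.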

E. LEARNING TO OPTIMIZE: THE PROPOSED DEEP LEARNING-BASED APPROACH
In the previous section, we proposed the AO-based scheme, which provides the optimal solution but has high complexity and a long computation time. Therefore, in this section, we consider a DL-based approach to predict the transmit power, the PS factor, and the phase shifts vector. For the DL-based approach, a simple DNN model, called a feedforward neural network (FFNN), is used. Fig. 2 shows the overall flow of the DL-based approach with its training and running stages, where the training data are based on the solutions of the AO-based scheme.

1) PREPARING DATA SAMPLES AND DNN TRAINING STAGE
In this work, the DNN-based method uses the optimal solutions obtained by the AO method (transmit power, PS factor, and phase shifts) as training data. Choosing reasonable data and DNN structures for the training process contributes to a significant improvement in performance. Therefore, in this section, we investigate 5 types of data and DNN structures, as shown in Fig. 3.
First, we generate N samples of the channel power gain on the related communication links {h TU , h TE , h TI , h IU , h IE }. By using the proposed scheme, we can get the optimal solution for transmit power P * , PS factor θ * , and the phase shifts vector q * corresponding to the channel power gain of the related communication links.
We convert the optimal phase shifts vector $q^*$ into its real part and imaginary part, denoted $\mathrm{Re}(q^*)$ and $\mathrm{Im}(q^*)$, respectively.
We denote the channel gain matrix as $X$, with $Y$ being the output matrix of the optimization solution, which contains the optimal power allocation, the PS factor, and the real and imaginary parts of the phase shifts vector. The size and structure of the training data depend on the type of structure. For the construction of the data for the training stage, we present the case in which all the channels are used (DL AC). The case in which only partial channels are used (DL PC) is handled similarly, but only with the channel gains $\{h_{TU}, h_{TI}, h_{IU}\}$. The training data structures are described as follows.
• Type 1 (Fig. 3a): the optimal transmit power and PS factor are constructed separately (and thus are also trained separately). The optimal phase shifts vector is used directly and combined with the predicted transmit power and PS factor in order to calculate the secrecy rate. The input matrix is $X^{1}_{1}$ and the output matrices are $Y^{1}_{1}$ and $Y^{1}_{2}$, where the superscripts of $X$ and $Y$ indicate the type of training data structure (from Type 1 to Type 5) and the subscripts are the distinctive numbering. Because $h_{TU}, h_{TE} \in \mathbb{C}^{1 \times 1}$ and $h_{TI}, h_{IU}, h_{IE} \in \mathbb{C}^{M \times 1}$, where $M$ is the number of reflecting elements, and the number of samples for training data is $N$, the training input and output data sizes of Type 1 are $X^{1}_{1} \in \mathbb{C}^{(2+3M) \times N}$ and $Y^{1}_{1}, Y^{1}_{2} \in \mathbb{C}^{1 \times N}$, respectively.
• Type 2 (Fig. 3b): the optimal transmit power, the PS factor, and the phase shifts vector are constructed separately. Because the real part and imaginary part of the phase shifts vector are used, we also need to convert the channel gain to its real part and imaginary part, which ensures that the channel gain and the phase shifts vector use the same sample dimension in the training stage. Similarly, the training input and output data sizes of Type 2 based on (33), (34), and (35) are $X^{2}_{1} \in \mathbb{C}^{(2+3M) \times N}$ and $Y^{2}_{1}, Y^{2}_{2} \in \mathbb{C}^{1 \times N}$, respectively. Because the phase shift is split into real and imaginary parts and the size of the phase shifts vector $q$ depends on the number of elements $M$, the training data sizes of $X^{2}_{2}$ and $Y^{2}_{3}$ are $X^{2}_{2} \in \mathbb{C}^{(2+3M) \times 2N}$ and $Y^{2}_{3} \in \mathbb{C}^{M \times 2N}$, respectively.
• Type 3 (Fig. 3c): the optimal transmit power and PS factor are constructed together and used for the training data, whereas the optimal phase shifts vector is used directly. The training input and output data sizes of Type 3 are $X^{3}_{1} \in \mathbb{C}^{(2+3M) \times N}$ and $Y^{3}_{1} \in \mathbb{C}^{2 \times N}$, respectively.
• Type 4 (Fig. 3d): the optimal transmit power and PS factor are constructed together and used for the training data, whereas the optimal phase shifts vector is constructed separately for the training stage. The training input and output data sizes of Type 4 are $X^{4}_{1} \in \mathbb{C}^{(2+3M) \times N}$, $X^{4}_{2} \in \mathbb{C}^{(2+3M) \times 2N}$, $Y^{4}_{1} \in \mathbb{C}^{2 \times N}$, and $Y^{4}_{2} \in \mathbb{C}^{M \times 2N}$.
• Type 5 (Fig. 3e): the optimal transmit power, PS factor, and phase shifts vector are constructed together and used as the training data for the training stage. The training input data size of Type 5 is $X^{5}_{1} \in \mathbb{C}^{(2+3M) \times 2N}$. According to (45), the training output data size is $Y^{5}_{1} \in \mathbb{C}^{(1+M) \times 2N}$.
Next, the DNN is trained on the training data using backpropagation. The scaled conjugate gradient algorithm is used in the training process to minimize the mean squared error (MSE). To perform backpropagation in the training stage, two activation functions are used: purelin(·) for the output layer and tansig(·) for the hidden layers. They are calculated as $\mathrm{purelin}(x) = x$ and $\mathrm{tansig}(x) = \frac{2}{1 + e^{-2x}} - 1$.
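To make the data layouts of Types 1 to 5 more concrete, the following minimal Python sketch assembles an input matrix of size $(2+3M) \times N$ and a real/imaginary split with $2N$ columns. The function names, the column ordering (real parts first, imaginary parts second), and the randomly generated channels and phase shifts are assumptions made for illustration only; the exact arrangement is defined by the equations of the paper, which are not reproduced here.

```python
import numpy as np

def build_input_matrix(h_TU, h_TE, h_TI, h_IU, h_IE):
    """Stack channel samples column-wise into a (2 + 3M) x N complex matrix.

    h_TU, h_TE have shape (N,); h_TI, h_IU, h_IE have shape (M, N).
    """
    return np.vstack([h_TU[np.newaxis, :], h_TE[np.newaxis, :], h_TI, h_IU, h_IE])

def split_real_imag(A):
    """Assumed layout: real parts in the first N columns, imaginary parts in the last N."""
    return np.hstack([A.real, A.imag])

def cscg(rng, *shape):
    """Unit-variance circularly symmetric complex Gaussian samples (illustrative)."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

rng = np.random.default_rng(0)
M, N = 4, 6  # hypothetical small sizes for the demonstration
h_TU, h_TE = cscg(rng, N), cscg(rng, N)
h_TI, h_IU, h_IE = cscg(rng, M, N), cscg(rng, M, N), cscg(rng, M, N)

# Placeholder for the AO-based optimal phase shifts (not the real solutions).
q_opt = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, (M, N)))

X_all = build_input_matrix(h_TU, h_TE, h_TI, h_IU, h_IE)  # Type 1/3 style input
X_split = split_real_imag(X_all)                          # Type 2/4/5 style input
Y_phase = split_real_imag(q_opt)                          # phase-shift output

print(X_all.shape, X_split.shape, Y_phase.shape)
```

With $M = 4$ and $N = 6$, the three shapes are $(14, 6)$, $(14, 12)$, and $(4, 12)$, matching the $(2+3M) \times N$, $(2+3M) \times 2N$, and $M \times 2N$ sizes stated above.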

2) DNN RUNNING STAGE
In the running stage, we also generate channel matrices $Z$ for the input layer according to the type of training data structure, but with $K$ samples of the channel power gain. The channel gain is generated in the same way as in the training stage. Then, the well-trained network is loaded and applied to the channel matrix $Z$. Finally, the output layer produces the predicted optimal values, which include the predicted optimal transmit power $\hat{P}$, the PS factor $\hat{\theta}$, and the real part $\mathrm{Re}(\hat{q})$ and imaginary part $\mathrm{Im}(\hat{q})$ of the predicted optimal phase shifts vector $\hat{q}$, according to the type of training data structure. Fig. 3 shows the 5 types of data and DNN structures, where the Type 1 and Type 3 structures only estimate the power $\hat{P}^*$ and the PS factor $\hat{\theta}^*$ by using a DNN, while the optimal value of the phase shift $q^*$ is calculated directly. Specifically, in the running stage of the Type 1 and Type 3 structures, the value of the phase shift is not available, so for a new channel gain input we must use Algorithm 2 to obtain the optimal $q^*$. That is, we obtain the phase shifts matrix by running Algorithm 2 while fixing the transmit power and the PS factor to the estimated power $\hat{P}^*$ and PS factor $\hat{\theta}^*$. The value of $\hat{q}^*$ estimated by a DNN can even be used as the initial value for Algorithm 2. On the other hand, the Type 2, Type 4, and Type 5 structures estimate the transmit power $\hat{P}^*$, the PS factor $\hat{\theta}^*$, and the phase shift $\hat{q}^*$ using DNN structures.
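The difference between the two families of structures in the running stage can be sketched as a small dispatch routine. This is only an illustration under stated assumptions: `predict_power_ps`, `predict_phase`, and `solve_phase` are hypothetical callables standing in for the trained DNNs and for Algorithm 2, and the dummy stand-ins at the bottom return arbitrary numbers rather than real predictions.

```python
import numpy as np

def running_stage(z, predict_power_ps, predict_phase=None, solve_phase=None):
    """Return (P_hat, theta_hat, q) for one channel sample z.

    Types 2, 4, 5: pass predict_phase, so q is rebuilt from the predicted
    real and imaginary parts.
    Types 1, 3: pass solve_phase (a stand-in for Algorithm 2), which is run
    with the predicted power and PS factor held fixed.
    """
    p_hat, theta_hat = predict_power_ps(z)
    if predict_phase is not None:
        re_q, im_q = predict_phase(z)
        q = np.asarray(re_q) + 1j * np.asarray(im_q)
    elif solve_phase is not None:
        q = solve_phase(z, p_hat, theta_hat)
    else:
        raise ValueError("either predict_phase or solve_phase is required")
    return p_hat, theta_hat, q

# Dummy stand-ins (hypothetical values, for demonstration only).
def dummy_power_ps(z):
    return 1.0, 0.5

def dummy_phase(z):
    return np.ones(3), np.zeros(3)

def dummy_solver(z, p_hat, theta_hat):
    return np.full(3, 1j)

dnn_only = running_stage(None, dummy_power_ps, predict_phase=dummy_phase)
hybrid = running_stage(None, dummy_power_ps, solve_phase=dummy_solver)
```

The hybrid path keeps an iterative solver in the loop, which is consistent with the observation that Type 1 and Type 3 trade some running time for phase shifts that are optimized rather than predicted.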

IV. SIMULATION RESULTS AND DISCUSSION
First, we set the necessary parameters for the optimization algorithm and the DNN. Then, the numerical results for the average secrecy rate (ASR) from changing the transmitter power and the number of IRS reflecting surfaces are provided. We also consider the effect on the ASR of circuit noise at the UE as well as factors affecting channel gain (such as the vertical distance between the UE and the IRS, as well as the path loss exponents). Regarding the DL-based approach, we use the DL scheme for the ASR based on changes to the required minimum harvested energy with different structures of the training data. With regard to the proposed optimization-based approach, the solution to the problem can be obtained, and it converges to the optimal value through a number of iterations. Meanwhile, the proposed DL-based approach shows the ability to approximate the response that is produced by the optimization algorithm. In our work, benchmark schemes are used, including a scheme without an IRS, a random phase shifts scheme, and the equal PS-factor scheme. The scheme without an IRS only finds the optimal resource allocation (i.e., only the optimal transmit power and PS factor). The random phase shifts scheme reuses the optimal resource allocation from the scheme without an IRS, and combines it with the random phase shifts vector to calculate the system secrecy rate. The equal PS-factor scheme uses the optimization algorithm without IRS to solve the problem, and the PS factor is fixed so that the gain of the PS factor across the ID and EH streams is equal, i.e., the PS factor is set to 0.5 (θ = 0.5).

A. THE NEURAL NETWORK CONFIGURATION AND SIMULATION PARAMETERS
In our work, we set up the system at small scale on a three-dimensional Cartesian coordinate system. The reference (center) point of the IRS is located at $w_I = [4, 0]^T$ m, as shown in Fig. 4.
Regarding the channel model, the CSCG random variable distribution was used for the channel gain of the T-U link (h TU ), the T-E link (h TE ), and the NLoS components of related communication links (h NLoS TI , h NLoS IU , h NLoS IE ). Depending on the requirements of the simulation, the required maximum transmit power (P max ) and the required minimum harvested energy (e) were specifically provided in each simulation. For the DL-based approach, an FFNN model with four layers was used, which has not only input and output layers but also two hidden layers. We set 20 neurons for each hidden layer. The other simulation parameters are shown in Table 3.
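The FFNN configuration described above (two hidden layers of 20 neurons, tansig in the hidden layers, purelin at the output, MSE as the loss) can be sketched as a forward pass in NumPy. This is a minimal illustration, not the authors' implementation: the weight initialization, the real-valued input features, and the input and output dimensions are assumptions, and training with the scaled conjugate gradient algorithm is not shown.

```python
import numpy as np

def tansig(x):
    # np.tanh(x) is mathematically identical to 2 / (1 + exp(-2x)) - 1.
    return np.tanh(x)

def purelin(x):
    # Linear (identity) activation for the output layer.
    return x

def init_ffnn(n_in, n_out, n_hidden=20, seed=0):
    """Create weights for input -> 20 -> 20 -> output (initialization is assumed)."""
    rng = np.random.default_rng(seed)
    sizes = [n_in, n_hidden, n_hidden, n_out]
    return [(0.1 * rng.standard_normal((o, i)), np.zeros((o, 1)))
            for i, o in zip(sizes[:-1], sizes[1:])]

def forward(params, X):
    """Forward pass; X has shape (n_in, number_of_samples)."""
    a = X
    for W, b in params[:-1]:
        a = tansig(W @ a + b)      # hidden layers
    W, b = params[-1]
    return purelin(W @ a + b)      # output layer

def mse(Y_pred, Y_true):
    """Mean squared error used as the training objective."""
    return float(np.mean((Y_pred - Y_true) ** 2))

# Hypothetical sizes: 14 real input features, 2 outputs (power and PS factor).
params = init_ffnn(n_in=14, n_out=2)
X_demo = np.random.default_rng(1).standard_normal((14, 5))
Y_demo = forward(params, X_demo)
print(Y_demo.shape, mse(Y_demo, np.zeros_like(Y_demo)))
```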

B. THE SECRECY RATE PERFORMANCE UNDER VARIOUS CONFIGURATIONS
In this section, system performance is compared under different settings. First, we investigate the convergence property. Then, we check the effect of the number of IRS reflecting elements on the ASR. After that, the ASR under different values of the required maximum transmitter power is investigated. Finally, we evaluate the ASR under factors affecting channel gain (such as the vertical distance between the UE and the IRS, as well as the path loss exponent of the T-U link).
Fig. 5 shows the convergence of the ASR over the iterations of our proposed scheme with an IRS. We observed that the ASR increased rapidly and reached its converged value between the first and the third iteration. Fig. 5 also shows that the ASR improves as the number of IRS reflecting elements increases. This results from using and optimizing the IRS phase shifts, which enhances the received signal at the UE and weakens the received signal at the Eave as the number of IRS reflecting elements increases. To see this clearly, in Fig. 6 we compared the ASR of the different schemes as a function of the number of IRS elements, $M$, when the required harvested energy at the UE and the transmitter power are fixed at $e = -54$ dBW and $P_{\max} = 100$ W, respectively. Fig. 6 shows that the ASR increased significantly when the number of IRS reflecting elements increased. Specifically, the ASR improvement of the proposed IRS scheme over the scheme without an IRS rose from 18.01% to 38.91% when the number of IRS reflecting elements increased from 10 to 30. This is because, as the number of IRS reflecting elements increases, the signals from the IRS become dominant at the UE and degrade at the Eave. Fig. 6 also shows that the proposed scheme outperforms the random phase shifts scheme and the scheme without an IRS, which results from using and optimizing the IRS phase shifts.
By optimizing the phase shifts, the signals reflected by the IRS can be combined with the direct signals from the T-U and T-E links to enhance the signal obtained at the UE and degrade the signal obtained at the Eave, thus strengthening the system secrecy rate compared to not using the IRS. The random phase shifts scheme performs worse than the proposed scheme, but better than the optimization scheme without an IRS, because it reuses the optimal transmit power and PS factor from the optimization scheme without the IRS, along with random phase shifts, to calculate the secrecy rate. A note on the random phase shifts scheme: although its ASR also tends to increase with the number of reflecting elements, the random phase shifts are not optimized, so its performance is not only worse than that of the optimization scheme with an IRS but can even degrade when the number of reflecting elements increases, for example, at M = 10 and M = 15, as shown in Fig. 6. The equal PS-factor scheme provides the lowest ASR because it can only optimize the secrecy rate over the transmit power while the PS factor is fixed at θ = 0.5. Note that when the number of reflecting elements increases, the ASR of the optimization scheme without an IRS remains unchanged, since the IRS is not used.
In Fig. 7, we consider the effect of the required maximum transmit power at the transmitter on the ASR of the schemes. In this case, the required maximum transmit power takes the values $P_{\max} \in \{60, 70, 80, 90, 100\}$ W, while the required minimum harvested energy is fixed at $e = -54$ dBW and the number of reflecting elements is 30. As observed in Fig. 7, again, the optimization scheme with the IRS achieves the highest ASR, while the equal PS-factor scheme achieves the lowest ASR.
In addition, although the required maximum transmit power increases, in our scenario, due to the impact of noise and channel gain, the achievable rate at the UE changes relatively little. Therefore, Fig. 7 shows that the ASR increases very little. For the slight increase in terms of ASR according to the required maximum transmit power, it is not necessary to use too much power. Therefore, in operation, we can choose the appropriate transmit power to ensure performance and not consume too many resources. Fig. 7 also shows that the ASR would be improved by reducing circuit noise at the UE. Reducing processing noise at the UE is completely achievable as science and technology develop more and more.
Let $d_v$ denote the vertical distance between the UE and the IRS, and consider UEs at four locations, $w_{U_1}$ to $w_{U_4}$ (with $w_{U_4} = [0, 7]^T$ m), as shown in Fig. 8. This also means that the vertical distance between the UE and the IRS takes the values $d_v \in \{1, 3, 5, 7\}$ m. Fig. 9 shows the ASR of the different schemes when changing the vertical distance $d_v$ between the UE and the IRS. As observed in Fig. 9, for the schemes that do not use an IRS (i.e., the optimization scheme without an IRS and the equal PS-factor scheme), the best ASR is achieved when the UE is closest to the transmitter (i.e., when $d_v = 5$ m). Conversely, the ASR decreases if the UE is farther away from the transmitter (i.e., when $d_v = 1$, 3, and 7 m). This is understandable since the channel is modeled according to the Rayleigh model, and as a result, the greater the distance between the transmitter and the UE, the more the channel is attenuated. Therefore, the signal received at the UE is reduced, and the secrecy rate decreases. Note that when the distance between the transmitter and the UE is the same (at $d_v = 3$ m and $d_v = 7$ m), the channel gain $h_{TU}$ is also the same owing to the Rayleigh fading channel model, as seen in (1), so the ASR is the same at $d_v = 3$ m and $d_v = 7$ m.
Regarding the optimization scheme with an IRS, the best ASR is achieved at $d_v = 3$ m. This shows that the closer the UE is to the IRS, the more the reflected signal from the IRS is enhanced, resulting in a stronger signal at the UE. However, as mentioned above, the signal strength at the UE also depends on the direct signal from the transmitter, and when the UE is farther from the transmitter, the direct signal from the T-U link decreases. Therefore, when the UE is too close to the IRS (for example, at $d_v = 1$ m in Fig. 9), the combination of the T-U link's direct signal and the I-U link's reflected signal is no longer optimal, resulting in the ASR decreasing at $d_v = 1$ m.
Fig. 10 shows the ASR of the different schemes according to the path loss exponent of the T-U link, $\alpha_{TU}$. Usually, the path loss exponent ranges between 1.5 and 5 [51], so we considered path loss exponent values from 1.5 to 3. In general, the ASR tends to decrease as the path loss exponent increases, and it even falls below 0 when the path loss exponent is high ($\alpha_{TU} = 3$) for a low-performance scheme like the equal PS-factor scheme. This is caused by the decrease in the T-U channel gain as the path loss exponent increases. As a result, the signal received at the UE also decreases, leading to a decrease in the secrecy rate. Fig. 10 also shows that, with the help of the IRS, the ASR of the optimization scheme with an IRS decreases more slowly than that of the other schemes. Again, the proposed scheme with an IRS outperforms the other schemes. It is noteworthy that, when the path loss exponent is small (e.g., 1.5), the difference in the channel gain value, $h_U$, between the scheme without an IRS and the scheme with an IRS is insignificant, with the other conditions unchanged. Therefore, the secrecy rates of these two schemes may be approximately equal, or the secrecy rate of the proposed scheme may even be smaller than that of the scheme without an IRS. In addition, the secrecy rate of the random phase shifts scheme in our work is calculated based on the optimal transmit power and PS factor of the scheme without an IRS and the random phase shifts vector. Therefore, the secrecy rate of the random phase shifts scheme may be greater than that of the scheme with an IRS when the path loss exponent is small. Fortunately, the path loss exponent is generally greater than 2 when obstructions impede the propagation of the electromagnetic wave [51]. Thus, Fig. 10 shows that the proposed scheme provides acceptable performance when an appropriate path loss exponent value is used (e.g., path loss exponent values such as 2 and 2.5 in common transmission environments).
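The two trends discussed above (equal transmitter-UE distances give equal average direct-link gain, and a larger path loss exponent lowers that gain) can be illustrated with a generic distance-based path loss term. The sketch below is an assumption-laden illustration: it uses the common form $\rho_0 d^{-\alpha}$ rather than the paper's equation (1), the transmitter position `W_T` is hypothetical (placed at height 5 m only so that the UE at $d_v = 5$ m is the closest), and the UE positions are assumed to be $[0, d_v]$.

```python
import math

def distance(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def large_scale_gain(d, alpha, rho0=1.0):
    """Generic path loss term rho0 * d^(-alpha); rho0 is a hypothetical reference gain."""
    return rho0 * d ** (-alpha)

W_T = (-3.0, 5.0)  # hypothetical transmitter location, not taken from the paper
ue_positions = {dv: (0.0, float(dv)) for dv in (1, 3, 5, 7)}  # assumed UE locations

# Average direct-link (T-U) gain for each vertical distance at alpha = 2.
gains = {dv: large_scale_gain(distance(W_T, pos), alpha=2.0)
         for dv, pos in ue_positions.items()}

# Gain at a fixed distance for the path loss exponents considered in Fig. 10.
gain_vs_alpha = {alpha: large_scale_gain(3.0, alpha) for alpha in (1.5, 2.0, 2.5, 3.0)}

for dv in sorted(gains):
    print(f"d_v = {dv} m: gain = {gains[dv]:.4f}")
```

Under these assumptions, the gains at $d_v = 3$ m and $d_v = 7$ m coincide and the gain at $d_v = 5$ m is the largest, which mirrors the behavior reported for the schemes without an IRS.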

C. THE DL-BASED APPROACH TO COMPUTATION TIME PERFORMANCE
We further inspect the approximation of the DL scheme on the ASR according to the required minimum harvested energy. In addition, system performance in terms of computation time under the different schemes is evaluated in this section.
After CVX finishes solving a problem, it summarizes the result in the cvx_status string variable. The CVX solver has several status levels, such as solved, unbounded, infeasible, or even failed, and many others [46]. Therefore, although the CVX solver can effectively solve the convex optimization problem, the problem may still be reported as infeasible, in which case the CVX solver cannot find the optimal solution (i.e., the cvx_status is not solved). As a result, although a large number of channel gain samples of the related communication links are generated, the optimal solution may not be found for some of them. Furthermore, in this paper, the CVX tool was executed in each iteration of the FPP-SCA iterative approach, where the solution converges to an optimal value after a number of iterations. Hence, it is very time-consuming to generate huge numbers of samples for training data. Therefore, in this paper, to benefit from the efficiency of the DL approach, we generate about 1000 samples from feasible solutions of the proposed algorithm for training data, and about 100 samples for running data.
Fig. 11 shows the average secrecy rate (ASR) of the DL-based approach according to the required minimum harvested energy when the required maximum transmit power is $P_{\max} = 100$ W and the number of reflecting elements is $M = 10$. Here, we observe the ASR for the different training data structures as well as for the cases where all the channels (DL AC) and partial channels (DL PC) are utilized, respectively. From Fig. 11, we make the following observations. First, the ASR decreases slightly as the required minimum harvested energy increases. This is because, as the required minimum harvested energy increases, the PS factor should be reduced to ensure that more energy is harvested, as shown in constraint (11b). Subsequently, the decrease in the PS factor reduces the UE's achievable rate, which results in a decrease in the ASR.
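The sample-collection step described above, in which only channel realizations with a solved status are kept, can be sketched as a simple loop. This is a hypothetical illustration: `generate_channels` and `solve` are placeholder callables (the real solver is the CVX-based AO algorithm), the status strings follow the CVX naming ("Solved", "Infeasible"), and the fake generator and solver at the bottom exist only to make the sketch runnable.

```python
import itertools

def collect_feasible(generate_channels, solve, target, max_trials):
    """Keep drawing channels until `target` solved samples are collected.

    solve(h) is assumed to return (status, solution), mirroring cvx_status.
    Samples whose status is not "Solved" are discarded.
    """
    samples = []
    trials = 0
    while len(samples) < target and trials < max_trials:
        trials += 1
        h = generate_channels()
        status, solution = solve(h)
        if status == "Solved":
            samples.append((h, solution))
    return samples, trials

# Fake stand-ins: every third "channel" is reported as infeasible.
_counter = itertools.count(1)

def fake_generate():
    return next(_counter)

def fake_solve(h):
    if h % 3 == 0:
        return "Infeasible", None
    return "Solved", {"P": float(h)}

samples, trials = collect_feasible(fake_generate, fake_solve, target=4, max_trials=100)
print(len(samples), trials)
```

Because infeasible draws are discarded, the number of solver calls exceeds the number of stored samples, which adds to the data-generation time noted above.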
Second, DL AC Type 1 and DL AC Type 3 provide a near-optimal ASR compared to the AO method. However, DL AC Type 1 performs better than DL AC Type 3, since DL AC Type 3 uses one DNN for estimating both the transmit power $\hat{P}^*$ and the PS factor $\hat{\theta}^*$, while DL AC Type 1 uses two DNNs. It is noteworthy that the optimal value of the phase shift still has to be calculated in the case of DL AC Type 1 and DL AC Type 3.
Third, DL AC Type 2, DL AC Type 4, and DL AC Type 5, where the transmit power $\hat{P}^*$, the PS factor $\hat{\theta}^*$, and the phase shift $\hat{q}^*$ are all estimated by DNNs, perform worse than DL AC Type 1 and DL AC Type 3. Among DL AC Type 2, DL AC Type 4, and DL AC Type 5, DL AC Type 2 provides the best performance, since it utilizes three separate DNNs for estimating the transmit power $\hat{P}^*$, the PS factor $\hat{\theta}^*$, and the phase shift $\hat{q}^*$, respectively. However, DL AC Type 2, DL AC Type 4, and DL AC Type 5 all perform better than the optimization scheme without an IRS and the random phase shifts scheme. Fourth, in practice, it is very difficult to obtain the channel gains associated with the Eave, $\{h_{TE}, h_{IE}\}$. In this paper, DL PC was considered as DL PC Type 1 and DL PC Type 2. Interestingly, Fig. 11 shows that DL PC achieves performance similar to that of DL AC. Consequently, the proposed DL PC Type 1 and DL PC Type 2 have practical applications, since the channel gains of the Eave are not required in advance for obtaining the transmit power $\hat{P}^*$, the PS factor $\hat{\theta}^*$, and the phase shift $\hat{q}^*$.
Finally, Fig. 11 also shows the secrecy rate of the proposed scheme compared to that of the existing IRS-aided secure transmission schemes. As observed, the proposed scheme outperforms the IRS-SWIPT without PS scheme [24], [25] and the IRS without SWIPT scheme [18], [19]. This is due to the influence of SWIPT as well as the PS factor. The IRS-SWIPT without PS scheme [24], [25] does not use the PS factor. Therefore, the secrecy rate of this scheme tends to increase slightly as the required minimum harvested energy increases, but the increase is almost negligible. It should be noted that the IRS without SWIPT scheme [18], [19] does not use the SWIPT system, and therefore it is not affected by the required minimum harvested energy. As a result, the secrecy rate of this scheme remains unchanged.
Fig. 12 shows the computation time in the running stage of the Type 1, Type 2, and Type 3 DL AC schemes compared with the optimization and benchmark schemes. The optimization scheme with an IRS takes a long time to run even though the number of channel gain samples of the related communication links is small (only 100). This is because the AO algorithm finds the solutions in an alternating manner. The equal PS-factor scheme skips some calculations related to the PS factor because the PS factor is fixed at θ = 0.5. Therefore, the computation time of this scheme is less than that of the optimization scheme without an IRS. Along with performance close to that of the proposed AO algorithm, as shown in Fig. 11, Type 1 and Type 3 clearly improve computation time compared to the proposed AO algorithm (i.e., the scheme with an IRS) in Fig. 12. This is because, when DL is applied, only Algorithm 2 needs to be run to obtain the optimal phase shifts. Type 2, Type 4, and Type 5 provide low running times. However, as observed in Fig. 11, the performance of these types is better than that of the scheme without an IRS and worse than that of the optimal scheme with an IRS.

V. CONCLUSION
In our work, an IRS-assisted secure transmission maximization scheme for SWIPT systems with a PS scheme is considered. We first aim to maximize the system secrecy rate by finding the optimal transmitter power, UE PS factor, and IRS phase shifts while satisfying the requirements of energy harvesting at the user and transmit power at the transmitter. For solving the optimization problem, we invoked an AO algorithm in which an FPP-SCA iterative algorithm and a penalty method are used to find the optimal solutions in an alternating manner. The simulation results show that the scheme helped by the IRS achieves a significant improvement in terms of ASR, compared to the scheme without an IRS. Then, we proposed a DL-based approach to improve computation performance. The comparison results showed that the DL-based approach not only provided performance similar to that of the optimization algorithm but significantly improved computation time. For future work, our work can be extended to a multiple-antenna transmitter and even to multiple PS users. In addition, with the benefits of the unmanned aerial vehicles (UAV) in significantly improving capacity, throughput and reliability, the combination of UAVs with IRS opens promising research directions. However, this also brings many challenges such as channel modeling, channel estimation, and especially new dimensions, like the UAV's location and trajectory. In addition, to meet the real-time processing requirements of large-scale heterogeneous communication systems, deep Q networks and deep deterministic policy-gradient algorithms in deep reinforcement learning are potential solutions to solve our problem. Even so, further studies on these combinations are worth pursuing as one of our future works.