Proactive Eavesdropping via Covert Pilot Spoofing Attack in Multi-Antenna Systems

Proactive eavesdropping is an effective method for governments to monitor suspicious users who are deemed to misuse communication systems for illegal activities. In this paper, considering a legitimate full-duplex (FD) eavesdropper that tries to monitor a suspicious multi-antenna system, we propose a covert pilot spoofing attack (PSA) scheme that enhances legitimate eavesdropping performance by taking the channel training phase into consideration. For the proposed covert PSA scheme, the total detection error probability and the optimal detection threshold of the suspicious source are derived as the worst case for the considered monitoring system. Given the optimal detection threshold, closed-form expressions of the effective eavesdropping rate are also derived based on the detection results at the suspicious source. Furthermore, an optimal power allocation algorithm that maximizes the effective eavesdropping rate is proposed under the covert PSA and transmission power constraints. Simulation results illustrate that the adversary's uncertainty about the channel state information (CSI) before the channel estimation process can be exploited by the legitimate eavesdropper to hide the PSA from detection. Therefore, the proposed covert PSA scheme achieves a higher effective eavesdropping rate and can effectively combat a suspicious multi-antenna system.

eavesdropping more challenging. To enhance eavesdropping performance, a proactive eavesdropping paradigm was first proposed in [15], where a full-duplex (FD) eavesdropper monitored and intervened in the communication of a pair of suspicious users via active jamming. Under the assumption that the suspicious source adopts an adaptive rate strategy to send suspicious messages, the legitimate eavesdropper can successfully overhear the intercepted information only when the wiretap channel capacity is larger than the suspicious communication rate. The effective eavesdropping rate, defined as the suspicious communication rate that satisfies this condition, was coined to evaluate eavesdropping performance. For the same system, a robust proactive eavesdropping scheme against imperfect self-interference cancellation and a two-player noncooperative game power allocation approach were proposed in [16] and [17], respectively. In [18], a multi-antenna FD eavesdropper acting as a spoofing relay was considered, and its power allocation and beamforming vector were jointly optimized to maximize the eavesdropping rate. Related works [19]-[24] extended the proactive eavesdropping paradigm to wireless powered communication networks, unmanned aerial vehicles, and suspicious relay systems. All the aforementioned works, however, assumed that the suspicious entities are equipped with a single antenna, which limits the ability of the suspicious nodes to combat a proactive eavesdropper. Legitimate eavesdropping becomes very challenging when the suspicious source is equipped with multiple antennas, since they offer spatial degrees of freedom. To eavesdrop on a multi-antenna system, the optimal jamming power and beamforming vectors were jointly designed in [25]. However, this approach becomes ineffective as the number of antennas at the suspicious source increases, since little private information leaks to the eavesdropper under security-oriented beamforming [26]. Consequently, the effective eavesdropping rate may still be zero even when the eavesdropper jams with the maximal jamming power.
Note that the above works focus on improving legitimate eavesdropping performance during the data transmission phase. In practice, legitimate monitoring performance can also be strengthened during the channel training phase, especially in multi-antenna systems, since security-oriented beamforming in such systems depends heavily on the uplink channel training process. This motivates the legitimate eavesdropper to launch active attacks (e.g., pilot attack [27], [28] and jamming attack [29]) during the training phase to contaminate channel training and alter the downlink beam pattern. As a result, the beamforming, which is based on a weighted combination of the channel state information (CSI) of the suspicious channel and the wiretap channel, is directed to both the suspicious destination and the legitimate eavesdropper [27], [28]. A pilot spoofing attack (PSA) scheme was proposed in [29] to enhance monitoring performance. However, the PSA can be detected with high probability by an energy-ratio-based detector, since the PSA strategy is exposed to the suspicious source [30]-[32]. More importantly, the suspicious source could estimate both the suspicious and wiretap channels if the PSA is correctly detected. Furthermore, by exploiting beamforming and artificial noise, the data can be securely transmitted once the wiretap channel is exposed to the suspicious source. Hence, to successfully monitor a suspicious multi-antenna system, the PSA must be hidden from detection.
Covert communication, which exploits various uncertainties to confuse a warden, can hide the existence of communication from the warden and achieve a positive covert rate. In [33], an uninformed jammer was employed to assist Alice and Bob. With the help of the uninformed jammer, Alice can communicate covertly with Bob in the presence of a watchful adversary, Willie. In [37], a full-duplex receiver was used to achieve covert wireless communication. In [38], the confusion introduced by channel uncertainty was exploited to achieve covert communication in a relay network. It has been proven that uncertainties in the jamming power [33]-[37], the receiver noise power [39], [40], and the wireless fading channel [38], [41] can be exploited to achieve a positive covert communication rate, which provides a promising way for a proactive eavesdropper to launch active attacks without being detected. Since the suspicious source, which plays the part of the warden and tries to detect whether there is a PSA during the training phase, is uncertain about or even completely unaware of the CSI, the eavesdropper has a chance to launch a PSA without being detected. However, whether a covert PSA is achievable and how much pilot power can be hidden by the channel uncertainty remain open questions. Furthermore, how much the legitimate monitoring performance can be improved by introducing a covert PSA remains to be seen. To the best of our knowledge, few works have focused on addressing these questions, which motivates us to design a covert PSA scheme to further improve monitoring performance in a suspicious multi-antenna system.

B. OUR APPROACH AND CONTRIBUTIONS
Motivated by the above analysis, in this paper we propose a covert PSA scheme to assist proactive eavesdropping, in which the PSA is hidden from detection by exploiting the uncertainty of channels during the uplink training phase. In the considered suspicious communication scenario, where the suspicious source is equipped with multiple antennas, we also make use of an FD eavesdropper to achieve proactive eavesdropping. Specifically, the legitimate eavesdropper acts as a pretender and sends the same pilot symbols as the suspicious destination during the training phase. Since the feasible covert PSA power may be less than the pilot power of the suspicious destination, so that most of the beam is still directed to the suspicious destination, an active jamming scheme is also adopted to improve proactive eavesdropping performance during the data transmission phase.
The main contributions of this work are listed as follows.
• For the first time, we propose a covert PSA scheme that enhances legitimate eavesdropping performance in a multi-antenna system by taking the uncertainty of channels at the suspicious source into consideration during the uplink training phase.
• We analyze the total detection error performance and derive the optimal detection threshold for the suspicious source. We also derive the closed-form expression of the effective eavesdropping rate under the proposed covert PSA scheme. An optimal power allocation algorithm that maximizes the effective eavesdropping rate is proposed under the covert PSA and transmission power requirements.
• Numerical results demonstrate that the uncertainty of channels during the training phase can be utilized by the legitimate eavesdropper to achieve a covert PSA. Furthermore, compared with the proactive eavesdropping scheme without PSA, the proposed scheme achieves a higher effective eavesdropping rate and can effectively combat a multi-antenna suspicious communication system.
The rest of this paper is organized as follows. Section II describes the system model. In Section III, the proposed covert PSA scheme and the performance of PSA detection at the suspicious source are analyzed. The effective eavesdropping rate analysis and optimization for the different cases in the data transmission phase are given in Section IV. Simulation results are provided in Section V. Finally, we draw conclusions in Section VI.

II. SYSTEM MODEL
We consider a proactive eavesdropping paradigm as shown in Fig. 1, in which a suspicious source (S) equipped with $N_S$ antennas sends suspicious messages to a single-antenna destination (D). A legitimate FD eavesdropper equipped with two antennas, one for eavesdropping (E) and another for jamming (J), tries to eavesdrop on the suspicious messages. We assume that all channels are mutually independent and follow quasi-static Rayleigh fading, which indicates that the channel coefficients are invariant within a time slot but change independently from one time slot to another. Specifically, the S → D and S → E channels are each modeled by a narrowband 2-D spatial model, in which the gain of the $l$th path is assumed to be a complex Gaussian random variable with zero mean and unit variance and $\mathbf{a}(\theta_l)$ represents the array steering vector. For a uniform linear array, $\mathbf{a}(\theta_l) = [\,\cdots\,]^T$ [42]. The interference channels J → D and J → E are denoted by $h_{JD}$ and $h_{JE}$, where the means of $|h_{JD}|^2$ and $|h_{JE}|^2$ are $1/\beta_{JD}$ and $1/\beta_{JE}$, respectively. It is worth noting that the self-interference (SI) channel $h_{JE}$ is also modeled by a Rayleigh distribution, since antennas E and J are isolated and the major interference comes from scattering. Furthermore, although perfect SI cancellation is difficult due to the finite resolution of the analog-to-digital converter, the SI can be partially cancelled at such an FD legitimate eavesdropper.
We assume that a time division duplex (TDD) protocol is adopted and that each time slot is divided into two phases, a channel training phase and a data transmission phase, as shown in Fig. 2. During the channel training phase, D broadcasts common pilot symbols to S for channel estimation. Meanwhile, the legitimate eavesdropper broadcasts pilot symbols that are synchronized with and identical to those of D through antenna E to contaminate the channel estimation, while antenna J keeps silent to reduce the chance of exposure. The suspicious source S, which plays the role of a warden during the channel training phase, tries to estimate the channel $\mathbf{h}_{SD}$ and to detect any PSA or jamming attack based on the received pilot symbols. We also assume that the distribution of the channels is known to S but the exact CSI is unknown during the training phase, since the channel distribution can be acquired through long-term statistics, while the exact CSI needs to be estimated instantaneously. In addition, we assume that S has knowledge of the transmission power and noise variance.
After receiving the pilot symbols, S needs to decide whether a PSA has happened. Then, according to the results of channel estimation and active attack detection, S transmits data with beamforming during the data transmission phase. Specifically, we assume that maximum-ratio-transmission (MRT) beamforming is adopted when S deems that no PSA is present, whereas oriented beamforming, in which the suspicious messages and the interference signals are directed to D and E, respectively, is adopted when the PSA is exposed to S. To effectively improve the eavesdropping performance, the legitimate eavesdropper sends interference signals through antenna J to decrease the achievable data rate at D, while decoding the suspicious communication through antenna E during the data transmission phase. In the following, we introduce the proposed proactive eavesdropping scheme for a multi-antenna system.

III. PROACTIVE EAVESDROPPING VIA COVERT PSA
A. PSA DETECTION AT SUSPICIOUS SOURCE
As described in Section II, after receiving the pilot symbols, S needs to decide whether a PSA has happened, which is a binary hypothesis testing problem. For simplicity, two hypotheses are defined, where $H_0$ represents the absence of a PSA and $H_1$ represents the presence of a PSA in the channel training phase. We characterize the detection error of the PSA by the false alarm probability ($p_{fa}$) and the missed detection probability ($p_{md}$). Specifically, $p_{fa}$ is the probability that S decides $H_1$ while $H_0$ is true; conversely, $p_{md}$ is the probability that S decides $H_0$ while $H_1$ is true. Let $p_0$ and $p_1$ denote the prior probabilities of $H_0$ and $H_1$, respectively. The total probability of detection error is defined as $\xi = p_0 p_{fa} + p_1 p_{md}$. To achieve a covert PSA, the total probability of detection error should satisfy $\xi \ge \min\{p_0, p_1\} - \varepsilon$ even under the optimal detection threshold, for any $\varepsilon > 0$ [38]. To effectively hide the PSA, we derive the optimal detection threshold of S that minimizes the total probability of detection error, which is the worst case for the proactive eavesdropper. Specifically, under hypotheses $H_0$ and $H_1$, the pilot symbol collected at S during the channel training phase can be respectively expressed as $\mathbf{y} = \sqrt{P_D}\,\mathbf{h}_{SD}\,x + \mathbf{n}$ and $\mathbf{y} = \sqrt{P_D}\,\mathbf{h}_{SD}\,x + \sqrt{P_E}\,\mathbf{h}_{SE}\,x + \mathbf{n}$, where $x$ is the pilot symbol that satisfies $\mathbb{E}[x^H x] = 1$, $P_D$ and $P_E$ respectively denote the transmission power of the pilot symbols at D and E, and $\mathbf{n} \in \mathbb{C}^{N_S \times 1}$ is the noise vector whose elements are independent and identically distributed (i.i.d.) as $\mathcal{CN}(0, \sigma_0^2)$. During the channel training phase, we assume that the instantaneous channels are uncertain but the statistical CSI under both hypotheses is known to S. To detect whether a PSA happens, an optimal detection scheme, i.e., a radiometer [37], is adopted at S to distinguish $H_0$ and $H_1$, and a test statistic $T$ is defined as $T = \frac{1}{m}\sum_{i=1}^{m} \mathbf{y}_i^H \mathbf{y}_i \underset{D_0}{\overset{D_1}{\gtrless}} \gamma$, where $m$ is the number of samples, $D_1$ and $D_0$ are the binary decisions of S that infer whether there is a PSA or not, and $\gamma$ is the test threshold.
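The radiometer decision can be illustrated with a short simulation. The following is only a minimal sketch under stated assumptions: a unit pilot $x = 1$, i.i.d. complex Gaussian channel entries (a simplification of the 2-D spatial model), and hypothetical function names that are not part of the original scheme. It averages the received energy over $m$ pilot samples and compares the result with the value that the average approaches as $m$ grows.

```python
import math
import random


def complex_gaussian(rng, variance):
    """Draw one CN(0, variance) sample."""
    s = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def received_pilot(rng, h_sd, h_se, p_d, p_e, sigma2, attack):
    """One received pilot vector at S, assuming a unit pilot x = 1."""
    y = []
    for a, b in zip(h_sd, h_se):
        v = math.sqrt(p_d) * a + complex_gaussian(rng, sigma2)
        if attack:  # H1: E superimposes the same pilot with power p_e
            v += math.sqrt(p_e) * b
        y.append(v)
    return y


def radiometer_statistic(samples):
    """Average received energy T = (1/m) * sum_i ||y_i||^2."""
    return sum(sum(abs(v) ** 2 for v in y) for y in samples) / len(samples)


def decide_attack(t_stat, gamma):
    """Decision D1 (PSA declared) when T exceeds gamma, otherwise D0."""
    return t_stat > gamma


def simulate_statistic(n_s=16, m=4000, p_d=2.0, p_e=0.5, sigma2=0.1,
                       attack=False, seed=1):
    """Return (T, large-m limit of T) for one fixed channel realization."""
    rng = random.Random(seed)
    # Assumption: i.i.d. CN(0, 1) channel entries, fixed during training.
    h_sd = [complex_gaussian(rng, 1.0) for _ in range(n_s)]
    h_se = [complex_gaussian(rng, 1.0) for _ in range(n_s)]
    samples = [received_pilot(rng, h_sd, h_se, p_d, p_e, sigma2, attack)
               for _ in range(m)]
    signal = [math.sqrt(p_d) * a + (math.sqrt(p_e) * b if attack else 0.0)
              for a, b in zip(h_sd, h_se)]
    limit = sum(abs(v) ** 2 for v in signal) + n_s * sigma2
    return radiometer_statistic(samples), limit
```

Because the noise contributes $N_S \sigma_0^2$ to the averaged energy regardless of the hypothesis, any threshold below that level triggers a false alarm in the large-sample regime, which is the behavior the analysis relies on.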
Without loss of generality, we assume that $p_0 = p_1 = 0.5$, which means that S has no prior knowledge of the PSA strategy of E. To achieve covert communication, for any $\varepsilon > 0$, the total probability of detection error $\xi = p_{fa} + p_{md} \ge 1 - \varepsilon$ needs to be ensured when $m \to \infty$. The probability of false alarm under $H_0$ is given by (5), where $\sigma_1^2 = P_D |h_{SD}|^2$ is the received pilot power at each antenna of S under $H_0$ and $\chi^2_{2mN_S}$ denotes the chi-squared distribution with $2mN_S$ degrees of freedom. The Strong Law of Large Numbers [32]-[34] gives the limit in (6). By considering $m \to \infty$ and substituting (6) into (5), the probability of false alarm under $H_0$ can be computed in closed form. Similarly, we can derive the probability of missed detection under $H_1$, where the received pilot power at each antenna of S under $H_1$ follows a generalized chi-squared distribution with the probability density function given in [38]. For the considered system, we have $k = 2$ with parameters $\tilde{\sigma}_1^2$ and $\tilde{\sigma}_2^2$. As $p_{fa}$ and $p_{md}$ in (8) and (11) are functions of $\gamma$ with $\gamma > 0$, we assume that S can set the optimal threshold $\gamma^*$ to minimize the total detection error probability. Specifically, when $\gamma < N_S \sigma_0^2$, $\xi = 1$ always holds since $p_{fa} = 1$. When $\gamma \ge N_S \sigma_0^2$, to find the optimal threshold $\gamma^*$ at S, we take the first derivative of $\xi$ with respect to $\gamma$. Note that for $\tilde{\sigma}_1^2 = \tilde{\sigma}_2^2$, we have $\partial \xi / \partial \gamma < 0$ in (15), so the optimal detection threshold is $\gamma \to +\infty$. According to the received pilot symbols and the optimal detection threshold, S can decide whether a PSA has happened.
Remark 1: According to the false alarm probability $p_{fa}$, the missed detection probability $p_{md}$, and the optimal threshold $\gamma^*$, the total probability of detection error $\xi$ is independent of the number of antennas $N_S$. In addition, the false alarm probability $p_{fa}$ does not depend on the PSA power.
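The qualitative behavior described above can be explored numerically. The sketch below is not the paper's expressions (8) and (11); it rests on an assumed large-sample model in which the test statistic converges to $N_S(\sigma_0^2 + P_D|h_{SD}|^2)$ under $H_0$ and to $N_S(\sigma_0^2 + P_D|h_{SD}|^2 + P_E|h_{SE}|^2)$ under $H_1$, with exponentially distributed channel gains of means $1/\beta_{SD}$ and $1/\beta_{SE}$. The function names are hypothetical, and the equal-rate case $P_D/\beta_{SD} = P_E/\beta_{SE}$ is deliberately left out.

```python
import math


def _rates(p_d, p_e, beta_sd, beta_se):
    """Exponential rates of P_D*|h_SD|^2 and P_E*|h_SE|^2 (assumed model)."""
    lam1 = beta_sd / p_d
    lam2 = beta_se / p_e
    if math.isclose(lam1, lam2):
        raise ValueError("equal-rate case is not covered by this sketch")
    return lam1, lam2


def detection_error(gamma, n_s, p_d, p_e, beta_sd=1.0, beta_se=1.0, sigma2=0.1):
    """Return (p_fa, p_md, xi) with xi = p_fa + p_md for threshold gamma."""
    lam1, lam2 = _rates(p_d, p_e, beta_sd, beta_se)
    t = gamma / n_s - sigma2
    if t <= 0.0:
        # Below the noise floor N_S * sigma2: always a false alarm.
        return 1.0, 0.0, 1.0
    p_fa = math.exp(-lam1 * t)
    # CDF of the sum of two independent exponentials with distinct rates.
    p_md = 1.0 - (lam2 * math.exp(-lam1 * t)
                  - lam1 * math.exp(-lam2 * t)) / (lam2 - lam1)
    return p_fa, p_md, p_fa + p_md


def optimal_threshold(n_s, p_d, p_e, beta_sd=1.0, beta_se=1.0, sigma2=0.1):
    """Return (gamma_star, xi_min) from the stationary point of xi."""
    lam1, lam2 = _rates(p_d, p_e, beta_sd, beta_se)
    t_star = math.log(lam2 / lam1) / (lam2 - lam1)
    gamma_star = n_s * (sigma2 + t_star)
    xi_min = detection_error(gamma_star, n_s, p_d, p_e,
                             beta_sd, beta_se, sigma2)[2]
    return gamma_star, xi_min
```

Under this assumed model the optimal threshold scales with $N_S$ while the minimum of $\xi$ does not, and $p_{fa}$ does not involve $P_E$, which agrees with Remark 1.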

B. PROACTIVE EAVESDROPPING DURING DATA TRANSMISSION PHASE
S may adopt different beamforming schemes to transmit suspicious messages during the data transmission phase, since different CSI is acquired depending on the detection result. We assume that MRT beamforming is adopted when S deems that no PSA is present, as MRT beamforming is robust against passive eavesdropping in massive MIMO systems [30]. When the PSA is exposed to S, two oriented beamforming vectors are designed instead, one for the suspicious messages directed to D and another for the interference signals directed to E. According to the detection results, there are four cases during the data transmission phase. Case I: S decides $H_0$ and $H_0$ is true. Case II: S decides $H_0$ but $H_1$ is true. Case III: S decides $H_1$ but $H_0$ is true. Case IV: S decides $H_1$ and $H_1$ is true. In this subsection, we discuss these four cases in detail.
Case I: During the channel training phase, S can estimate the CSI from the received pilot symbols by the least-squares approach, i.e., $\hat{\mathbf{h}} = \mathbf{y} x^H$. For this case, the channel estimate is $\hat{\mathbf{h}}_{SD} = \sqrt{P_D}\,\mathbf{h}_{SD} + \mathbf{n} x^H$, and the MRT beamforming vector is the normalized estimate $\hat{\mathbf{h}}_{SD} / \|\hat{\mathbf{h}}_{SD}\|$. In addition, to successfully eavesdrop on the suspicious messages, the legitimate eavesdropper also sends interference signals through antenna J during the data transmission phase. By considering imperfect self-interference cancellation at E, the received suspicious signals at D and E are given by (18) and (19), respectively, where $P_S$ is the data transmission power, $s$ denotes the suspicious data transmitted by S, which satisfies $\mathbb{E}[s s^H] = 1$, $\phi \in (0, 1)$ denotes the coefficient of residual SI, $P_J$ is the jamming power, $z \sim \mathcal{CN}(0, 1)$ denotes the jamming signal, and $n \sim \mathcal{CN}(0, \sigma_0^2)$ represents the received noise.
Case II: For this case, the channel estimate is contaminated by the spoofed pilot, i.e., $\hat{\mathbf{h}}_{SD} = \sqrt{P_D}\,\mathbf{h}_{SD} + \sqrt{P_E}\,\mathbf{h}_{SE} + \mathbf{n} x^H$, and the MRT beamforming vector is again the normalized estimate. The received suspicious signals at D and E are given by (22) and (23), respectively.
Case III: For this case, S raises a false alarm, and the channel estimate $\hat{\mathbf{h}}_{SD}$ is the same as in Case I. Besides $\hat{\mathbf{h}}_{SD}$, S deems that there is an extra wiretap channel although in fact there is none. We assume that S transmits interference signals in a random direction, which is independent of $\mathbf{h}_{SE}$, so that two beamforming vectors $\mathbf{w}_{31}$ and $\mathbf{w}_{32}$ are used for the suspicious messages and the interference, respectively. For this case, the received suspicious signals at D and E are given by (27) and (28), respectively, where $P_{S1} + P_{S2} = P_S$ and $a$ denotes the interference signal transmitted by S, which satisfies $\mathbb{E}[a a^H] = 1$.
Case IV: For this case, the PSA is exposed to S, and we assume that S can distinguish the pilot symbols through an extra process [31], [32]. S then obtains estimates of both $\mathbf{h}_{SD}$ and $\mathbf{h}_{SE}$ and designs the corresponding beamforming vectors. For this case, the received suspicious messages at D and E are given by (33) and (34), respectively.
Remark 2: It is worth noting that Case I is similar to the jamming scheme proposed in [15]; the main difference is that the suspicious source is equipped with multiple antennas in our case. Case II and Case III are the two possible cases of the proposed covert PSA scheme, where Case II corresponds to missed detection and Case III corresponds to false alarm. Case IV is the worst case, in which the PSA is exposed to the suspicious source.
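The effect of an undetected PSA on the MRT beam (Case II versus Case I) can be illustrated with a small Monte Carlo experiment. This is a sketch under stated assumptions rather than the paper's model: the channel entries are drawn i.i.d. $\mathcal{CN}(0, 1)$, the pilot is $x = 1$, and the helper names are hypothetical. It applies the least-squares estimate $\hat{\mathbf{h}} = \mathbf{y} x^H$, normalizes it into an MRT beam, and averages the beam gains $|\mathbf{h}^H \mathbf{w}|^2$ toward D and E.

```python
import math
import random


def complex_gaussian(rng, variance=1.0):
    """Draw one CN(0, variance) sample."""
    s = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def ls_estimate(y, x):
    """Least-squares estimate h_hat = y * conj(x) for a unit-power pilot x."""
    return [v * x.conjugate() for v in y]


def mrt_beam(h_hat):
    """MRT beamforming vector: the channel estimate scaled to unit norm."""
    norm = math.sqrt(sum(abs(v) ** 2 for v in h_hat))
    return [v / norm for v in h_hat]


def beam_gain(h, w):
    """Beam gain |h^H w|^2 toward a receiver with channel h."""
    return abs(sum(a.conjugate() * b for a, b in zip(h, w))) ** 2


def average_gains(attack, n_s=16, p_d=2.0, p_e=0.5, sigma2=0.1,
                  trials=2000, seed=7):
    """Average beam gains toward D and E, with or without the PSA."""
    rng = random.Random(seed)
    x = complex(1.0, 0.0)  # assumed pilot symbol
    gain_d = 0.0
    gain_e = 0.0
    for _ in range(trials):
        h_sd = [complex_gaussian(rng) for _ in range(n_s)]
        h_se = [complex_gaussian(rng) for _ in range(n_s)]
        y = []
        for a, b in zip(h_sd, h_se):
            v = math.sqrt(p_d) * a * x + complex_gaussian(rng, sigma2)
            if attack:  # E sends the same pilot, contaminating the estimate
                v += math.sqrt(p_e) * b * x
            y.append(v)
        w = mrt_beam(ls_estimate(y, x))
        gain_d += beam_gain(h_sd, w)
        gain_e += beam_gain(h_se, w)
    return gain_d / trials, gain_e / trials
```

Calling `average_gains(False)` and `average_gains(True)` with the same seed reuses the same channels and noise, so the difference in the gain toward E is attributable to the spoofed pilot alone.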

IV. EAVESDROPPING RATE ANALYSIS AND OPTIMIZATION
A. AVERAGE EFFECTIVE EAVESDROPPING RATE ANALYSIS
In this section, we derive the closed-form expression of effective eavesdropping rate according to the four cases and optimize the power allocation at eavesdropper for the proposed covert PSA scheme.
Case I: According to the received suspicious signals (18) and (19) during the data transmission phase, the average data rates at D and E can be respectively computed as (35) and (36). Proof: See Appendix A.
Case II: Similarly, according to (22) and (23), the average data rates at D and E in this case can be respectively computed as (37) and (38).
Case III: Furthermore, according to (27) and (28), the average data rates at D and E in this case can be respectively computed as (39) and (40).
Case IV: In addition, according to (33) and (34), the average data rates at D and E in this case can be respectively computed as (41) and (42).
Different from the existing works [14]-[18], where the legitimate eavesdropper optimizes the power allocation based on instantaneous CSI, we define the effective eavesdropping rate as a metric with which the legitimate eavesdropper can select the PSA and jamming power according to the statistical CSI so as to maximize the legitimate monitoring performance. Note that the effective eavesdropping rate also approximates the exact legitimate eavesdropping performance well. Specifically, when the PSA is hidden, the eavesdropper can successfully eavesdrop on the suspicious messages with high probability (approximately equal to 1), since some message beams are directed to the legitimate eavesdropper and the active jamming further degrades the received SINR at the suspicious destination. When the PSA is exposed to the suspicious source, the eavesdropper can hardly eavesdrop on the suspicious messages (success probability approximately equal to 0), as the artificial noise beams are pointed at the legitimate eavesdropper while the message beams are directed to the suspicious destination. The effective eavesdropping rate of the considered proactive eavesdropping system is defined on this basis.
Remark 3: The closed-form expression of the average data rate in Case I reveals the challenge of eavesdropping on a suspicious communication system equipped with multiple antennas. As the number of antennas $N_S$ increases, $R_D$ increases but $R_E$ remains unchanged; hence, even if J jams with the maximum transmission power, $R_E$ may still be less than $R_D$. In Case II, the PSA is missed and some suspicious messages are directed to E. In Case III, some of the power of S is allocated to transmitting interference in a random direction, which has a greater effect on the received signal-to-interference-plus-noise ratio (SINR) at D, since the performance of cognitive jamming improves as $P_S$ decreases. Case IV is the worst case, in which the PSA is exposed to S. In this case, it is more challenging to eavesdrop on the suspicious messages, as the message beam is directed to D while the interference beam is directed to E. It is therefore necessary to launch the pilot attack while keeping it covert to S.
Remark 4: The effective eavesdropping rate depends heavily on the power allocation between pilot and interference at the eavesdropper. Allocating more power to the pilot increases the leakage of suspicious messages but makes the attack easier to expose. Similarly, allocating more power to jamming ensures a positive eavesdropping rate with high probability, but the effective eavesdropping rate may also be degraded. It is therefore necessary to optimize the power allocation at the eavesdropper.
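The trade-off in Remark 4 can be made concrete with a toy calculation. The sketch below uses the basic definition recalled in the introduction, namely that the effective eavesdropping rate equals the suspicious rate $R_D$ when $R_E \ge R_D$ and is zero otherwise; it is not the paper's averaged expression. The signal powers, the function names, and the modelling of the residual SI power as $\phi^2 P_J |h_{JE}|^2$ are assumptions made for illustration.

```python
import math


def achievable_rate(signal_power, interference_power, sigma2):
    """Shannon rate log2(1 + SINR) in bit/s/Hz."""
    return math.log2(1.0 + signal_power / (interference_power + sigma2))


def effective_eavesdropping_rate(r_d, r_e):
    """R_D if the eavesdropper can decode (R_E >= R_D), otherwise 0."""
    return r_d if r_e >= r_d else 0.0


def rates_with_jamming(s_d, s_e, p_j, g_jd=1.0, g_je=1.0, phi=0.1, sigma2=0.1):
    """Return (R_D, R_E) for jamming power p_j.

    s_d, s_e: received signal powers at D and E (hypothetical inputs).
    g_jd, g_je: jamming channel gains |h_JD|^2 and |h_JE|^2.
    Assumption: the residual self-interference power is phi**2 * p_j * g_je.
    """
    r_d = achievable_rate(s_d, p_j * g_jd, sigma2)
    r_e = achievable_rate(s_e, phi ** 2 * p_j * g_je, sigma2)
    return r_d, r_e
```

For example, with `s_d = 8` and `s_e = 4`, no jamming leaves $R_E < R_D$ and an effective rate of zero, whereas `p_j = 1` pushes $R_D$ below $R_E$ at the cost of a lower monitored rate.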

B. OPTIMAL POWER ALLOCATION AT EAVESDROPPER
In this subsection, we optimize the transmission power of the legitimate eavesdropper to maximize the effective eavesdropping rate of the considered legitimate eavesdropping system under the covert PSA and transmission power constraints. Specifically, the optimization problem maximizes the effective eavesdropping rate over $P_E$ and $P_J$ subject to constraints C1 and C2, where C1 is the covert PSA requirement with a predetermined covertness level $\varepsilon$ and C2 is the transmission power budget of the legitimate eavesdropper.
For $R_E < R_D$, the effective eavesdropping rate is zero, so the optimal transmission power is $P_E = P_J = 0$. For $R_E \ge R_D$, by substituting (18) in (19), the optimization problem can be transformed into (45). To solve this non-convex problem, we first treat $P_E$ as a constant while optimizing the jamming power $P_J$.
Case III: $R_D = \mathbb{E}_{\mathbf{h}_{SD}, \mathbf{h}_{SE}, h_{JD}} \left\{ \log_2 \left( 1 + \frac{P_{S1} |\mathbf{h}_{SD}^H \mathbf{w}_{31}|^2}{P_{S2} |\mathbf{h}_{SD}^H \mathbf{w}_{32}|^2 + P_J |h_{JD}|^2 + \sigma_0^2} \right) \right\} \approx \cdots$
The problem (45) can be rewritten in an equivalent form with an additional constraint C3. For a given $P_E$, the constraint C3 can be transformed into C3′, which is given in (47). Proof: See Appendix B. As the interference transmitted by J simultaneously jams D and E, both $R_D$ and $R_E$ decrease monotonically as $P_J$ increases, which indicates that the optimal jamming power of J is the minimum one that satisfies $R_E \ge R_D$. Hence, the optimal jamming power $P_J^*$ during the data transmission phase is the smallest $P_J$ that satisfies C3′. After obtaining the optimal jamming power $P_J^*$, the problem of optimizing $P_E$ can be formulated as (49). With respect to $P_E$, it can be solved via a one-dimensional linear search. Algorithm 1 summarizes the procedure for solving problem (49). The complexity of the proposed algorithm depends mainly on the number of cycles of the one-dimensional search, as each cycle involves only numerical operations. If we define the complexity of the numerical operations in each cycle as $T_c$, the total computational complexity is $O(T_c P_{\max})$, as the maximum number of iterations is $P_{\max}$.
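The two-step procedure described above (fix $P_E$, take the smallest feasible $P_J$, then search over $P_E$) can be sketched as a grid search. This is an illustrative outline, not a reproduction of Algorithm 1: the three callables are hypothetical placeholders for the covertness test C1, the minimum jamming power from C3′, and the effective eavesdropping rate, and the total-power form of the budget ($P_E + P_J \le P_{\max}$) is an assumption.

```python
def search_pilot_power(p_max, step, is_covert, min_jamming_power, effective_rate):
    """One-dimensional search over the pilot power P_E.

    is_covert(p_e) -> bool: whether the covert constraint holds (placeholder).
    min_jamming_power(p_e) -> float or None: smallest P_J with R_E >= R_D,
        or None when no jamming power works (placeholder).
    effective_rate(p_e, p_j) -> float: objective value (placeholder).
    Returns (best_rate, best_p_e, best_p_j); all zeros if nothing is feasible.
    """
    best = (0.0, 0.0, 0.0)
    n_points = int(round(p_max / step))
    for i in range(n_points + 1):
        p_e = i * step
        if not is_covert(p_e):
            continue  # the PSA would be exposed at this pilot power
        p_j = min_jamming_power(p_e)
        if p_j is None or p_e + p_j > p_max:
            continue  # assumed total-power budget is violated
        rate = effective_rate(p_e, p_j)
        if rate > best[0]:
            best = (rate, p_e, p_j)
    return best
```

The number of loop iterations is $P_{\max}/\text{step} + 1$, so the running time grows linearly with the number of grid points, with a constant cost per point set by the three callables.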

V. SIMULATION RESULTS
In this section, we present numerical results to illustrate the detection performance at S under the proposed covert PSA scheme and to demonstrate the feasibility of the optimal power allocation algorithm that maximizes the effective eavesdropping rate at the legitimate eavesdropper. In our simulations, without loss of generality, we set the large-scale fading coefficients to $\beta_{SD} = \beta_{SE} = \beta_{JD} = 1$ and the noise power to $\sigma_0^2 = 0.1$ W. The pilot power at D and the power of the suspicious messages at S are equal and fixed, i.e., $P_D = P_S = 2$ W. Furthermore, we assume that half of the power at S is allocated to sending interference in Case III and Case IV, while all of the power is used to send data in Case I and Case II. Also, the power at the legitimate eavesdropper is equally allocated between antennas E and J, which transmit the pilot symbol and the interference during the training and data transmission phases, respectively.
Fig. 3 depicts the probability of false alarm $p_{fa}$, the probability of missed detection $p_{md}$, and the total error probability $\xi$ versus the detection threshold $\gamma$, where $N_S = 16$ and $P_E = 0.5$ W. In agreement with our theoretical analysis, the probability of false alarm is 1 and the probability of missed detection is 0 when the detection threshold is less than $N_S \sigma_0^2 = 1.6$. When the threshold is larger than $N_S \sigma_0^2$, the probability of false alarm decreases while the probability of missed detection increases as the threshold increases. As a result, the total error probability is also 1 when the detection threshold is less than $N_S \sigma_0^2$, but it first decreases and then increases when the threshold is larger than $N_S \sigma_0^2$, which indicates that there is an optimal detection threshold for the detector. However, $\xi > 0$ holds even under the optimal detection threshold, which means that the uncertainty of channels can be utilized by the legitimate eavesdropper to achieve a covert PSA.
Fig. 4 plots the total error probability $\xi$ versus the detection threshold $\gamma$ for different $P_E$ and $N_S$. In this figure, we first observe that the optimal detection threshold increases as $P_E$ or $N_S$ increases. We can also see that the minimal $\xi$ decreases when $P_E$ increases, which means that the PSA is easier for S to detect at larger $P_E$. The available PSA power is therefore limited, as the covert constraint normally requires $\xi > 1 - \varepsilon$ for any $\varepsilon > 0$. Although the proposed covert PSA scheme can be adopted to assist legitimate monitoring during the channel training phase, active jamming is also necessary, since $P_E$ is in general less than $P_D$ and most of the power of the suspicious message beam is directed to D. In addition, we observe that the minimal $\xi$ remains unchanged when $N_S$ increases, which indicates that the total error probability is independent of $N_S$ under the optimal detection threshold at S.
Fig. 5 plots the probability of missed detection $p_{md}$ and the probability of false alarm $p_{fa}$ versus the detection threshold $\gamma$ for different $P_E$ and $N_S$. In this figure, we can see that $p_{md}$ increases as $P_E$ or $N_S$ decreases. Hence, when S is equipped with a massive number of antennas, the PSA may be detected with high probability even if the PSA power is small. However, $p_{fa}$ decreases as $N_S$ increases but remains unchanged when $P_E$ increases, so the total detection error may still satisfy the covert constraint when the number of antennas increases.
Fig. 6 shows the rate difference $R_E - R_D$ versus the number of antennas $N_S$ for the different cases during the data transmission phase, where we set $P_E = P_J = 1$ W and the SI coefficient to $\phi = 0.1$. From the figure, we can see that in Case II the rate difference $R_E - R_D$ decreases slowly and eventually remains constant when the number of antennas at S is large enough. Furthermore, $R_E - R_D$ is greater than zero in this case, which indicates that legitimate monitoring remains effective under the proposed scheme even in a multi-antenna system, since some suspicious information beams are leaked directly to the legitimate eavesdropper and most of the SI can be effectively eliminated. For comparison, we observe that $R_E - R_D$ decreases persistently as $N_S$ increases in Case I, Case III, and Case IV, which makes legitimate monitoring invalid ($R_E - R_D < 0$) when $N_S$ is larger than a threshold. In Case III, although some power is allocated to sending interference in a random direction and the SINR is degraded more severely, there is a slight improvement compared with Case I. Case IV has the worst legitimate monitoring performance, since the PSA is exposed to S and the interference is pointed directly at E via beamforming. The figure also reveals that the PSA can effectively assist in monitoring a suspicious multi-antenna system, but, given the limited transmission power at the eavesdropper, the PSA needs to be hidden from detection.
Fig. 7 depicts the rate difference $R_E - R_D$ versus the SI coefficient $\phi$ for different $N_S$ and cases, where $P_E = P_J = 1$ W. We can see that $R_E - R_D$ decreases persistently when $\phi$ increases, since $R_E$ decreases as $\phi$ increases. When $\phi$ is larger than 0.5, legitimate monitoring becomes invalid even with the help of the covert PSA. In Case I, even for $\phi = 0$, which corresponds to perfect SI cancellation, legitimate monitoring may still be invalid as $R_E - R_D < 0$. The figure also shows that the number of antennas $N_S$ has little effect on the proposed covert PSA scheme, while the active jamming scheme proposed in [15] is severely affected by $N_S$.
Fig. 8 shows the effective eavesdropping rate versus the number of antennas $N_S$. For proactive eavesdropping without PSA, the effective eavesdropping rate first decreases slowly and then drops rapidly to 0 as the number of antennas at S increases; for the proposed covert PSA scheme, however, it increases slowly and then levels off. This is because, for proactive eavesdropping without PSA, $R_E < R_D$ even when all the power is used to transmit jamming once the number of antennas at S is larger than a threshold. In contrast, the proposed covert PSA scheme is robust against multi-antenna suspicious communication, since some suspicious message beams are directed to the legitimate eavesdropper.
Fig. 9 plots the effective eavesdropping rate versus the power allocation at the eavesdropper. In this figure, we first observe that the effective eavesdropping rate increases slowly and then drops rapidly to zero as the PSA power increases. This is because more suspicious messages leak to the eavesdropper as the PSA power increases, but the PSA may be exposed to S (i.e., $\xi^* < 1 - \varepsilon$) when the PSA power is larger than a threshold, which leads to unsuccessful eavesdropping. Furthermore, we observe that the proposed power allocation scheme consumes less power while achieving a larger effective eavesdropping rate than the maximal power allocation scheme. This is because, on the one hand, allocating more power to jamming decreases the pilot attack power, as the total power is constrained; on the other hand, active jamming also degrades the performance at E due to imperfect interference elimination.
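The observation that only a limited PSA power can be hidden may be quantified with a short calculation. The sketch below relies on an assumed large-sample detection model (exponentially distributed channel gains, $\xi = p_{fa} + p_{md}$, optimal threshold at S), under which the minimum detection error reduces to $\xi_{\min} = 1 - r^{-r/(r-1)}$ with $r = (P_D/\beta_{SD}) / (P_E/\beta_{SE})$. Both this closed form and the function names are assumptions of the sketch, not results quoted from the paper. A bisection then finds the largest $P_E$ that keeps $\xi_{\min} \ge 1 - \varepsilon$.

```python
import math


def xi_min(p_e, p_d=2.0, beta_sd=1.0, beta_se=1.0):
    """Minimum total detection error under the assumed large-sample model."""
    r = (p_d / beta_sd) / (p_e / beta_se)
    if abs(r - 1.0) < 1e-12:
        return 1.0 - math.exp(-1.0)  # limit of the expression as r -> 1
    return 1.0 - r ** (-r / (r - 1.0))


def max_covert_pilot_power(eps, p_max, p_d=2.0, beta_sd=1.0, beta_se=1.0,
                           iterations=60):
    """Largest P_E in (0, p_max] with xi_min(P_E) >= 1 - eps, by bisection.

    Uses the fact that xi_min decreases as P_E grows in this model.
    """
    target = 1.0 - eps
    if xi_min(p_max, p_d, beta_sd, beta_se) >= target:
        return p_max
    lo, hi = 0.0, p_max  # lo is feasible in the limit P_E -> 0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if xi_min(mid, p_d, beta_sd, beta_se) >= target:
            lo = mid
        else:
            hi = mid
    return lo
```

With $P_D = 2$ W and unit large-scale fading coefficients, a covertness level of $\varepsilon = 0.1$ leaves only a small fraction of $P_D$ available for the spoofed pilot in this model, which is in line with the need for additional jamming during data transmission.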

VI. CONCLUSION
In this paper, we first analyzed the total detection error performance of the suspicious source and derived closed-form expressions of the optimal detection threshold and the effective eavesdropping rate. Then, we optimized the power allocation between the PSA and the interference to maximize the effective eavesdropping rate under the covert PSA and transmission power constraints. Finally, numerical results were provided to validate the analysis. In addition, we showed that the uncertainty of channels during the training phase can be utilized by the legitimate eavesdropper to achieve a covert PSA. Furthermore, the proposed covert PSA scheme can effectively combat a multi-antenna suspicious communication system.

APPENDIX A
According to the following approximation [43], $\mathbb{E}\left\{ \log_2 \left( 1 + \frac{X}{Y} \right) \right\} \approx \log_2 \left( 1 + \frac{\mathbb{E}\{X\}}{\mathbb{E}\{Y\}} \right)$ (50), and by substituting (52) and (53) into (51), we obtain the average data rate at D as (35). The proof of (36) follows similarly.
Since the random variables $X$ and $Y$ in the approximation (50) are not required to be independent [43], the approximate expressions of the achievable rates in Cases II, III, and IV can be obtained in the same way.
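A quick Monte Carlo check shows how such a ratio-of-means approximation can be assessed numerically. The sketch below is illustrative only: it assumes the approximation has the form $\mathbb{E}\{\log_2(1 + X/Y)\} \approx \log_2(1 + \mathbb{E}\{X\}/\mathbb{E}\{Y\})$, models $X$ as the sum of $N_S$ exponential gains scaled by $P_S$ and $Y$ as an exponential jamming term plus noise, and uses hypothetical function names. These modelling choices are not taken from the paper's derivation.

```python
import math
import random


def monte_carlo_rate(n_s, p_s, p_j, sigma2, trials=20000, seed=3):
    """Empirical mean of log2(1 + X / Y) over random channel draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # X: beamforming gain modelled as a sum of n_s unit-mean exponentials.
        x = p_s * sum(rng.expovariate(1.0) for _ in range(n_s))
        # Y: jamming through a unit-mean exponential gain, plus noise power.
        y = p_j * rng.expovariate(1.0) + sigma2
        total += math.log2(1.0 + x / y)
    return total / trials


def approximate_rate(n_s, p_s, p_j, sigma2):
    """Ratio-of-means approximation log2(1 + E[X] / E[Y])."""
    return math.log2(1.0 + (p_s * n_s) / (p_j + sigma2))
```

Comparing `monte_carlo_rate(16, 2.0, 1.0, 0.1)` with `approximate_rate(16, 2.0, 1.0, 0.1)` gives a feel for the size of the gap, which in general depends on how strongly the denominator fluctuates.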

APPENDIX B
By substituting (37) and (38) into C3, we can simplify C3 as (54). By defining
$l_1 = P_D N_S \beta_{SE}^2 + P_E \beta_{SD} \beta_{SE} + \sigma_0^2 \beta_{SE}^2 \beta_{SD}$ (55)
$l_2 = P_E N_S \beta_{SD}^2 + P_D \beta_{SD} \beta_{SE} + \sigma_0^2 \beta_{SD}^2 \beta_{SE}$ (56)
(54) can be further simplified. If $l_1$ and $l_2$ satisfy $l_1 < l_2$, which means that the pilot power received from E is larger than that from D, the optimal interference power is $P_J = 0$, as most of the energy of the suspicious message beam is directed to E and there is no need to send jamming. Otherwise, the interference power should satisfy $P_J \ge \frac{\beta_{SD} \beta_{SE} \sigma_0^2 (l_1 - l_2)}{\beta_{JE} l_2 - \beta_{JD} \phi^2 l_1}$ when $\beta_{JE} l_2 - \beta_{JD} \phi^2 l_1 > 0$. In addition, if $l_1 > l_2$ and $\beta_{JE} l_2 - \beta_{JD} \phi^2 l_1 < 0$, which indicates that the pilot power received from E is less than that from D and that the SI is severe, the constraint C3 is never satisfied for any $P_J$. Hence, the constraint C3 can be transformed into C3′, which is given in (47).
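The case analysis above translates directly into a small helper. The following sketch simply evaluates $l_1$, $l_2$, and the lower bound on $P_J$ as stated in this appendix; the function name and the choice to return `None` for the infeasible case (and for the boundary where the denominator is exactly zero, which the text does not cover) are assumptions of the sketch.

```python
def min_jamming_power(p_d, p_e, n_s, beta_sd, beta_se, beta_jd, beta_je,
                      phi, sigma2):
    """Smallest jamming power satisfying C3, or None if C3 cannot be met."""
    l1 = (p_d * n_s * beta_se ** 2 + p_e * beta_sd * beta_se
          + sigma2 * beta_se ** 2 * beta_sd)
    l2 = (p_e * n_s * beta_sd ** 2 + p_d * beta_sd * beta_se
          + sigma2 * beta_sd ** 2 * beta_se)
    if l1 < l2:
        # The beam already favors E, so no jamming is needed.
        return 0.0
    denom = beta_je * l2 - beta_jd * phi ** 2 * l1
    if denom > 0.0:
        return beta_sd * beta_se * sigma2 * (l1 - l2) / denom
    # Severe self-interference: no jamming power satisfies the constraint.
    return None
```

With the simulation parameters of Section V ($P_D = 2$ W, $P_E = 0.5$ W, $N_S = 16$, unit fading coefficients, $\phi = 0.1$, $\sigma_0^2 = 0.1$ W), the helper returns a jamming power of roughly 0.23 W.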