Mining Pool Game Model and Nash Equilibrium Analysis for PoW-Based Blockchain Networks

Blockchain technology is decentralized, open, and transparent, so that everyone can participate in recording the database. It therefore has good application prospects in various industries. As the most successful application of blockchain technology, the Bitcoin system applies the Proof of Work (PoW) consensus mechanism. Under PoW, miners compete with their computing power to solve a SHA256-based cryptographic puzzle in order to gain profits. Because the puzzle is difficult, miners tend to join mining pools to obtain a stable income. Mining pools, in turn, may carry out block withholding attacks against one another, each maximizing its own income by controlling the infiltration rate it dispatches to other mining pools. In this paper, we build a game model between mining pools based on the PoW consensus algorithm and analyze its Nash equilibrium from two perspectives. We explore how the mining pools' power, the ratio of the power to be infiltrated, and the betrayal rate of dispatched miners influence a mining pool's choice of infiltration rate and its income, and we obtain the results through numerical simulations.


I. INTRODUCTION
Blockchain is a chained data structure that combines data blocks in chronological order and uses cryptography to provide a tamper-proof and forgery-proof distributed ledger. It uses a distributed consensus algorithm to generate and update data, uses cryptography to secure data transmission and access, and uses smart contracts composed of automated script code to program and operate on data [1]. Because of its advantages in reducing costs, improving security, and decentralization, blockchain technology has a wide range of application prospects, such as big data [2], smart grids [3], [4], the Internet of Things [5], and medical use [6]. Blockchain also has potentially huge application value in financial fields such as international exchange, letters of credit, equity registration, and stock exchanges. Applying blockchain technology in the financial industry can eliminate third-party intermediaries and achieve direct peer-to-peer transactions, thereby greatly reducing costs and quickly completing transaction payments [7]. Blockchain can also be combined naturally with logistics, where it helps to reduce logistics costs and improve the efficiency of supply chain management [8], [9].
The associate editor coordinating the review of this manuscript and approving it for publication was Zhibo Wang.
The consensus mechanism is an important part of blockchain technology, which determines the degree of decentralization in blockchain technology, as well as the security and efficiency of blockchain technology. It allows the nodes on the entire network to reach a consensus and create a trust-free bookkeeping mechanism on the blockchain to ensure the consistency and authenticity of each transaction on all bookkeeping nodes. As the most successful application of blockchain technology, the Bitcoin system applies a Proof of Work (PoW) consensus mechanism. The core idea of the PoW consensus mechanism is to ensure the consistency of data and the security of consensus by introducing the computing power competition to the distributed nodes.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
In the system, each node competes with its own computing power to solve a SHA256-based puzzle, that is, to find a nonce such that the double SHA256 hash of the current block's header is less than or equal to a predefined target value. Once a node finds a nonce that meets this requirement, it obtains the bookkeeping right for the current block, and the bookkeeper also receives a certain income as a reward [10], [11]. This process of obtaining rewards through bookkeeping is called ''mining'', and the nodes participating in mining are called ''miners'' [12]. Miners receive benefits in proportion to their computing power, so it is difficult for small miners to succeed when mining alone. To obtain a more stable income, miners choose to join a mining pool and mine together with other miners [13], sharing rewards according to their computing power [14].
A mining pool consists of a pool administrator and several miners. Miners use their computing power to mine and send partial proofs of work (PPoW) or full proofs of work (FPoW) to the pool to obtain a gain proportional to their computing power. A partial proof of work has no value to the Bitcoin system; it only serves as a measure of the miner's contribution of computing power. A miner who sends only partial proofs of work therefore contributes no effective computing power but still obtains part of the profit of the mining pool. This behavior is called a block withholding attack. In a mining pool, miners can perform block withholding attacks on the mining pool and share the benefits of the mining pool with other miners.
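The nonce search described above can be sketched in a few lines. This is a toy illustration only, not a Bitcoin implementation: `header_prefix` stands in for the first 76 bytes of a real 80-byte block header, and the function names and the easy target used in the usage note are hypothetical choices made for the example.

```python
import hashlib
import struct


def double_sha256(data: bytes) -> bytes:
    """Apply SHA256 twice, as PoW does for block headers."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def mine(header_prefix: bytes, target: int, max_nonce: int = 2**32):
    """Return the first nonce whose double-SHA256 hash is <= target, else None.

    header_prefix is a placeholder for the header fields that precede the
    nonce; the nonce is appended as a 4-byte little-endian integer.
    """
    for nonce in range(max_nonce):
        candidate = header_prefix + struct.pack("<I", nonce)
        # The 32-byte digest is compared to the target as a little-endian integer.
        if int.from_bytes(double_sha256(candidate), "little") <= target:
            return nonce
    return None
```

With a deliberately easy target such as `2**248` (roughly one hash in 256 succeeds), `mine(b"toy-header", 2**248)` returns quickly. Real targets are many orders of magnitude harder, which is why small miners rarely find a block alone.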
However, mining pools can also use miners to infiltrate other mining pools and conduct block withholding attacks on them in order to increase the total revenue of their own pools. For example, mining pool i sends a miner to infiltrate mining pool j, and the miner sends only partial proofs of work in mining pool j. That is, the miner does not mine effectively in mining pool j but still receives proceeds from it, and brings that income back to the original mining pool i, thereby increasing the income of pool i. This is a block withholding attack between mining pools. It can also happen that miners who infiltrate mining pool j betray: they mine faithfully in mining pool j and do not bring the revenue back to the original mining pool i. In this case the revenue is lost to pool i; from the point of view of pool i, the behavior of these miners is betrayal. How, then, can we determine whether a mining pool is attacking, and how can we identify betraying miners? We propose that miners sign a corresponding agreement when they join a mining pool. The agreement stipulates that miners in the mining pool shall not enter other mining pools, and that from the beginning to the end of a round of mining no other miners are allowed to join the mining pool. The number of miners in the mining pool is then fixed during each round of mining. Once miners of a mining pool enter other mining pools, this is regarded as an ''attack'', and the system automatically punishes the mining pool. If a miner of one mining pool enters another mining pool at the beginning of a round and has not returned by the end of the round, that is, the number of miners in the two mining pools no longer changes, this miner is regarded as a betraying miner of the original pool.
To make the research more standardized and rigorous, the mining pools studied in this paper are assumed to be closed mining pools.
There are already some research results on the mining dilemma. Lewenberg et al. mapped the miner's choice of mining pool into a cooperative game model in which a miner increases his own income by changing the mining pool he joins [15]. Tang et al. started from the mining dilemma of the PoW consensus algorithm, analyzed the existence conditions of the Nash equilibrium of the miners' strategy selection during PoW consensus, and optimized the miners' strategy selection with a zero-determinant (ZD) strategy [16]. Fan et al. combined a temporal-difference reinforcement learning algorithm with an adaptive zero-determinant strategy to deal with mutual attacks between mining pools [17]. Wang et al. used a deep learning policy gradient algorithm to study strategy choice in the iterated prisoner's dilemma and to deal with the Nash equilibrium problem of the mining dilemma [18]. Eyal analyzed the existence of the mining dilemma: at the Nash equilibrium the mining pools choose the attack strategy, yet the profit of a mining pool when it attacks is no higher than its profit when no pool attacks [19]. Chang et al. analyzed the incentive compatibility of the uncle-block attack (UBA) and identified and modelled the critical system and environmental parameters that determine the attack's impact [20]. However, no results have yet appeared that consider a reward and punishment system operated by the blockchain system together with the betrayal rate of the mining pool. The main contributions of this paper are summarized as follows.
• The mining pool game model from the perspective of system rewards and punishments is established and its Nash equilibrium is analyzed. We get the relationship between the profit of the mining pool and the reward and punishment in the Nash equilibrium.
• The mining pool game model from the perspective of block withholding attacks between mining pools is established. That is, the infiltrate rate and betrayal rate of the mining pool are considered. And the Nash equilibrium and the value of infiltrate rate under the Nash equilibrium are analyzed.
• The influence of the mining pool's computing power, the ratio of the power to be infiltrated, and the betrayal rate of dispatched miners on the mining pool's infiltration rate selection and income is explored. The results are verified through numerical simulation.
The rest of this paper is organized as follows. In Section II, a multi-pool mining game model is established from the perspective of adding a reward and punishment system to the blockchain system, and its pure-strategy and mixed-strategy Nash equilibria are analyzed. Section III takes the perspective of block withholding attacks between mining pools, that is, it considers the infiltrate rate and betrayal rate of the mining pool, and analyzes the Nash equilibrium and the value of the infiltrate rate at that equilibrium. Section IV explores, through numerical simulation, the influence of the mining pool's computing power, the ratio of the power to be infiltrated, and the betrayal rate of dispatched miners on the mining pool's infiltration rate selection and income. Section V summarizes our work and discusses future directions.

II. GAME ANALYSIS OF MINING POOLS
A. MODEL DESCRIPTION
Assume that in a blockchain system based on the PoW consensus, M mining pools are formed by miners to obtain revenue through mining. Let the mining power vector be $p = (p_1, \ldots, p_i, \ldots, p_M)$, where $p_i$ is the mining power of mining pool i. Mining pool i may send miners from its own pool to infiltrate other pools for block withholding attacks. Let the total power income in the system be R, with R = 1. When an attack occurs, a mining pool that chooses not to attack gets an extra reward $a$ ($0 \le a \le 1$) from the system, and a mining pool that chooses to attack receives a penalty $ka$ ($k \ge 1$), where k is the ratio of penalty to reward.
(1) When $m = M - 1$, pool i chooses not to attack, and mining pool i gains
(2) When $0 \le m < M - 1$: if mining pool i chooses not to attack, set $d_i \ge 0$, then the profit of mining pool i is $p_i + a - (M - m - 1)d_i$; if mining pool i chooses to attack, its profit is $p_i - ka + (M - 1)d_i$, that is:
(3) When all mining pools choose to attack each other, their
Let us take M = 2 as an example. Because here we only analyze the relationship between the system reward $a$ and the Nash equilibrium reached by the mining pools, the profit that a mining pool obtains through its infiltrating power is temporarily represented by $d$; Section III analyzes $d$ in detail. The payoffs are as follows:

1) NO MINING POOL ATTACK
When both mining pools choose N (not to attack), that is, neither pool sends miners to infiltrate the other mining pool, the profit obtained by mining pool 1 is $p_1 \cdot 1 = p_1$; in the same way, the profit of mining pool 2 is $p_2$.

2) A MINING POOL ADOPTS THE A(ATTACK) STRATEGY
If mining pool 1 chooses not to attack, it receives a reward of $a$. Mining pool 2, which chooses the attack strategy, is punished by $ka$ but shares the profits of pool 1 through its infiltrating miners, so pool 1 loses a return of $d$. The return of pool 1 is then $p_1 + a - d$, and the return of pool 2 is $p_2 - ka + d$. Similarly, if mining pool 1 chooses to attack, it earns $d$ by infiltrating miners into mining pool 2 and is punished by $ka$, while mining pool 2, which chooses not to attack, gets a reward of $a$ but loses a profit of $d$. The return of pool 1 is then $p_1 - ka + d$, and the return of pool 2 is $p_2 + a - d$.

3) TWO MINING POOL ATTACKS
When both mining pools choose to attack, each infiltrates miners into the other mining pool and each is punished by $ka$. Suppose that in the end mining pool 2 receives more revenue from mining pool 1 than mining pool 1 receives from mining pool 2. Then mining pool 1 loses $d'$ and mining pool 2 gains $d'$, so mining pool 1's income expression is $p_1 - ka - d'$ and mining pool 2's income expression is $p_2 - ka + d'$.
When mining pool 1 chooses not to attack, mining pool 2 increases its own revenue by selecting the non-attack strategy; when mining pool 1 selects the attack strategy, mining pool 2 does not significantly reduce its revenue by selecting the attack strategy. It follows that the Nash equilibrium points of mining pool 1 and mining pool 2 are (N, N) and (A, A).
When d > d > ka. When the mining pool 1 chooses not to attack, the mining pool 2 increases its own revenue by selecting the attack strategy; when the mining pool 1 selects the attack strategy, the mining pool 2 will increase its own income by selecting the attack, so the Nash equilibrium is (A, A).
When the mining pool 1 chooses not to attack, the mining pool 2 will not reduce its own revenue by selecting the non-attack strategy; when the mining pool 1 selects the attack strategy, the mining pool 2 chooses not to attack the strategy to increase its own revenue than the attack strategy. At this time, the Nash equilibrium point is (N, N).
It is analyzed that in the case of a < d < d < ka and a − d < d − ka < d − ka, the Nash equilibrium is (N, N) and (A, A). If both mining pools choose to attack each other, then the income of the two mining pools will not be high when the cooperation is selected, and the system revenue will also decrease. In order to improve system revenue, we apply the ZD strategy to the mining pool game.
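The case analysis above can be checked mechanically by enumerating the four strategy profiles of the two-pool game. The sketch below is illustrative only: the function names are hypothetical, and `d_prime` is an assumed name for the amount that mining pool 1 loses and mining pool 2 gains when both attack.

```python
from itertools import product


def payoff_table(p1, p2, a, k, d, d_prime):
    """Payoffs (pool 1, pool 2) per profile; 'N' = not attack, 'A' = attack."""
    return {
        ("N", "N"): (p1, p2),
        ("N", "A"): (p1 + a - d, p2 - k * a + d),
        ("A", "N"): (p1 - k * a + d, p2 + a - d),
        # Assumed mutual-attack payoffs: pool 1 loses d_prime, pool 2 gains it.
        ("A", "A"): (p1 - k * a - d_prime, p2 - k * a + d_prime),
    }


def pure_nash(table):
    """Return all profiles in which neither pool gains by deviating alone."""
    strategies = ("N", "A")
    equilibria = []
    for s1, s2 in product(strategies, strategies):
        u1, u2 = table[(s1, s2)]
        best1 = all(u1 >= table[(t, s2)][0] for t in strategies)
        best2 = all(u2 >= table[(s1, t)][1] for t in strategies)
        if best1 and best2:
            equilibria.append((s1, s2))
    return equilibria
```

For instance, with p1 = p2 = 0.4, a = 0.05, k = 2 and d' = 0.02, a small infiltration gain d = 0.05 (below the penalty ka = 0.1) leaves (N, N) as the only pure equilibrium, whereas a large gain d = 0.3 makes (A, A) the only one. These numbers are chosen purely for illustration.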

2) MIXED STRATEGY NASH EQUILIBRIUM
The mining pool game has a unique mixed-strategy Nash equilibrium point. Let $x$ ($0 \le x \le 1$) be the probability that mining pool 1 chooses not to attack, and $y$ ($0 \le y \le 1$) the probability that mining pool 2 chooses not to attack. The expected return of mining pool 1 from choosing not to attack is

$$E_1^N = y\,p_1 + (1-y)(p_1 + a - d), \tag{2}$$

and its expected return from choosing to attack is

$$E_1^A = y\,(p_1 - ka + d) + (1-y)(p_1 - ka - d'). \tag{3}$$

At the mixed-strategy Nash equilibrium point, the expected return of a mining pool is the same for both strategies, namely

$$E_1^N = E_1^A, \tag{4}$$

which gives

$$y = \frac{(1+k)a + d' - d}{a + d'}. \tag{5}$$

Similarly, for mining pool 2, the expected return from choosing not to attack is

$$E_2^N = x\,p_2 + (1-x)(p_2 + a - d), \tag{6}$$

and the expected return from choosing to attack is

$$E_2^A = x\,(p_2 - ka + d) + (1-x)(p_2 - ka + d'). \tag{7}$$

At the mixed-strategy Nash equilibrium point, the mining pool obtains the same expected return from both strategies:

$$E_2^N = E_2^A, \tag{8}$$

namely,

$$x = \frac{d + d' - (1+k)a}{d' - a}. \tag{9}$$

According to the expressions of $x$ and $y$, the mixed Nash equilibrium exists when both are valid probabilities, that is, $0 \le x \le 1$ and $0 \le y \le 1$.
Theorem 1: In the mixed strategy, for mining pool 1, when $d - kd' < 0$ the probability $x$ that mining pool 1 chooses not to attack decreases as $a$ increases; when $d - kd' > 0$, $x$ increases with $a$. The probability $y$ that mining pool 2 chooses not to attack always increases with $a$: the greater the value of $a$, the greater the probability that mining pool 2 chooses not to attack. This follows from $\partial x / \partial a = (d - kd')/(d' - a)^2$ and $\partial y / \partial a = (d + kd')/(a + d')^2 > 0$. The conditions under which the mixed-strategy Nash equilibrium exists are $0 \le x \le 1$ and $0 \le y \le 1$.
Remark 1: When both mining pools attack each other and the income of mining pool 1 is $p_1 - ka + d'$ while the income of mining pool 2 is $p_2 - ka - d'$, the above conclusions about $x$ and $y$ are reversed.
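The indifference argument used for the mixed-strategy equilibrium can be sketched for any two-strategy game. The code below is a minimal illustration, not the authors' implementation: the helper names are hypothetical, and the mutual-attack payoffs $p_1 - ka - d'$ and $p_2 - ka + d'$ are an assumption about the two-pool model, with `d_prime` standing for $d'$.

```python
def indifference_mix(u_nn, u_na, u_an, u_aa):
    """Probability q that the opponent plays N which makes a player indifferent.

    u_xy is the player's payoff when the player plays x and the opponent
    plays y. Solves q*u_nn + (1-q)*u_na == q*u_an + (1-q)*u_aa for q.
    """
    den = u_nn - u_na - u_an + u_aa
    if den == 0:
        return None
    return (u_aa - u_na) / den


def mixed_nash(p1, p2, a, k, d, d_prime):
    """Return (x, y), the non-attack probabilities of pools 1 and 2, or None."""
    # y makes pool 1 indifferent between N and A.
    y = indifference_mix(p1, p1 + a - d, p1 - k * a + d, p1 - k * a - d_prime)
    # x makes pool 2 indifferent between N and A.
    x = indifference_mix(p2, p2 + a - d, p2 - k * a + d, p2 - k * a + d_prime)
    if x is None or y is None or not (0 <= x <= 1 and 0 <= y <= 1):
        return None
    return x, y
```

With the illustrative values p1 = p2 = 0.4, a = 0.05, k = 2, d = 0.12 and d' = 0.02, the call `mixed_nash(0.4, 0.4, 0.05, 2, 0.12, 0.02)` yields x close to 1/3 and y close to 5/7. For d = 0.3 the computed x falls outside [0, 1], so no interior mixed equilibrium exists and the function returns `None`.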
Corollary 1: Under the mixed-strategy Nash equilibrium point, when $0 < a < a_1$ the expected return $e_1$ of mining pool 1 increases with $a$; when $a_1 < a < 1$, $e_1$ decreases with $a$, where

$$a_1 = -d' + \sqrt{d'^2 + \frac{d^2 + (1+k)dd'}{k}}.$$

Proof: Substituting Eq. (5) into Eq. (2), the expected return of mining pool 1 is

$$e_1 = p_1 + \frac{(d - ka)(a - d)}{a + d'}.$$

Differentiating $e_1$ with respect to $a$:

$$e_1' = \frac{-ka^2 - 2kd'a + dd' + kdd' + d^2}{(a + d')^2}.$$

Because $(a + d')^2 > 0$ holds, let $-ka^2 - 2kd'a + dd' + kdd' + d^2 = 0$. The axis of symmetry of this quadratic is $a = -d' < 0$, and it has exactly one root greater than zero, namely $a_1$. Therefore, when $0 < a < a_1$, $e_1' > 0$ and the expected return $e_1$ of mining pool 1 increases with $a$; when $a_1 < a < 1$, $e_1' < 0$ and $e_1$ decreases with $a$.

Corollary 2: Under the mixed-strategy Nash equilibrium point, (i) when $d/d' > 1 + k$ and $0 < a < a_2$, the expected return $e_2$ of mining pool 2 increases with $a$; when $a_2 < a < 1$, $e_2$ decreases with $a$, where

$$a_2 = d' + \sqrt{d'^2 + \frac{d^2 - (1+k)dd'}{k}}.$$

(ii) When $d/d' < 1 + k$, $d' < 0.5$ and $a_3 < a < a_4$, the expected return $e_2$ of mining pool 2 increases with $a$; when $0 < a < a_3$ or $a_4 < a < 1$, $e_2$ decreases with $a$, where

$$a_{3,4} = d' \mp \sqrt{d'^2 + \frac{d^2 - (1+k)dd'}{k}}.$$

Proof: Substituting Eq. (9) into Eq. (6), the expected return of mining pool 2 is

$$e_2 = p_2 + \frac{(d - ka)(a - d)}{a - d'}.$$

Differentiating $e_2$ with respect to $a$:

$$e_2' = \frac{-ka^2 + 2kd'a - dd' - kdd' + d^2}{(a - d')^2}.$$

Because $(a - d')^2 > 0$ always holds for $a \ne d'$, let $-ka^2 + 2kd'a - dd' - kdd' + d^2 = 0$. The axis of symmetry is $a = d' > 0$.
(1) When $d/d' > 1 + k$, the equation $-ka^2 + 2kd'a - dd' - kdd' + d^2 = 0$ has one positive root, $a_2$. Therefore, when $0 < a < a_2$, $e_2' > 0$ and the expected return $e_2$ of mining pool 2 increases with $a$; when $a_2 < a < 1$, $e_2' < 0$ and $e_2$ decreases with $a$. That is (i).
(2) When $d/d' < 1 + k$, the equation $-ka^2 + 2kd'a - dd' - kdd' + d^2 = 0$ has two solutions $a_3, a_4$ ($0 < a_3 < a_4$), and $a_4 < 2d'$, so that $d' < 0.5$ guarantees $a_4 < 1$. Therefore, when $d' < 0.5$ and $a_3 < a < a_4$, $e_2' > 0$ and the expected return $e_2$ of mining pool 2 increases with $a$; when $0 < a < a_3$ or $a_4 < a < 1$, $e_2' < 0$ and $e_2$ decreases with $a$. That is (ii).
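The turning point in Corollary 1 can be checked numerically. The sketch below rests on two assumptions that should be read as such: the symbols whose primes were lost in the quadratic are read as $d'$ (here `d_prime`), and the equilibrium return of mining pool 1 is taken as $e_1 = p_1 + (d - ka)(a - d)/(a + d')$, which follows if the mutual-attack payoff of pool 1 is $p_1 - ka - d'$. Function names are hypothetical.

```python
import math


def e1(a, p1, k, d, d_prime):
    """Assumed expected return of pool 1 at the mixed-strategy equilibrium."""
    return p1 + (d - k * a) * (a - d) / (a + d_prime)


def turning_point_a1(k, d, d_prime):
    """Positive root of -k*a^2 - 2*k*d'*a + d*d' + k*d*d' + d^2 = 0."""
    return -d_prime + math.sqrt(d_prime**2 + (d**2 + (1 + k) * d * d_prime) / k)
```

For the illustrative values k = 2, d = 0.12 and d' = 0.02, the turning point is roughly 0.086, and `e1` evaluated slightly below or above that reward is smaller than at the turning point itself, in line with the rise-then-fall behaviour the corollary describes.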

III. ANALYSIS OF MINING POOL GAME MODEL ON INFILTRATE RATE AND BETRAYAL RATE
A. MODEL INTRODUCTION
This section analyzes the ''d'' of the previous model in detail. Suppose there are M mining pools in the system, and the initial computing power of mining pool i is $p_i$. Assume that the system also contains miners who mine alone without joining any mining pool, and treat all such solo miners as a whole with computing power $p_{M+1}$, satisfying $\sum_{i=1}^{M+1} p_i = 1$; the total system revenue is then also 1. Let $a_{ij}$ ($i, j = 1, 2, \ldots, M$; $\sum_{j=1}^{M} a_{ij} = 1$; $0 \le a_{ij} < 1$; $0 < a_{ii} \le 1$) be the infiltrate rate of mining pool i to mining pool j. Then $a_{ij} \cdot p_i$ is the infiltrating mining power that mining pool i sends to mining pool j, and $a_{ii}$ is the share of computing power that mining pool i keeps for itself. When a mining pool infiltrates miners into other mining pools, some of these miners bring the profits obtained in the infiltrated pool back to the original mining pool, but others provide FPoW in the infiltrated pool and keep the income there without bringing it back. Because of this betrayal, the power of the original mining pool decreases, while the effective power of the infiltrated mining pool increases. Let $\delta_i$ ($0 \le \delta_i < 1$) be the betrayal rate of the infiltrating power of mining pool i.
In the mining process, the computing power of the mining pool i can be divided into two parts: effective mining power and attack power. Then the income of the mining pool i is the income obtained by effective mining power and the income obtained by infiltrating into other mining pools through attack. The average power income of the mining pool is defined as follows.
Definition 1 (Average Power Income): Let the average power income of the mining pool i be the ratio of the total profit of the mining pool to the total power of the mining pool, and record it as R.
Then the average power income of mining pool i at step t is denoted $R_i(t)$, and $r_i$ is the profit obtained from the system by the mining pool's effective mining power.
That is, $r_i$ is the ratio of the sum of the computing power $a_{ii} p_i$ retained by mining pool i and the power of betraying miners that infiltrated from pool j but mine faithfully in pool i, to the total effective computing power in the system.
Proof: Consider the average power income of the mining pools at step t of the game. In each round, the profit that the effective mining power of mining pool i obtains from the system is distributed equally over the actual total computing power of mining pool i, which includes the computing power infiltrated from other mining pools. Let P and C be as given in formulas (44) and (45). Define an $M \times M$ matrix $H$ on the infiltrate rate and betrayal rate; when $i = j$, $H_{ij} = 0$. Because the power of mining pool i that has infiltrated mining pool j in step t shares the profit of mining pool j at the end of step $t - 1$ of the game, we have:

$$R(t) = P + H \cdot R(t-1) + C. \tag{27}$$

Because the sum of each row of matrix $H$ is less than 1, the iteration converges as t tends to infinity:

$$\lim_{t \to \infty} R(t) = (I - H)^{-1}(P + C).$$

Consequently, when the number of iterations of the game is large enough, the final return of each mining pool is stable.
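The convergence of the recursion $R(t) = P + H \cdot R(t-1) + C$ can be illustrated with a plain fixed-point iteration. This is a minimal sketch under stated assumptions: the starting vector $R(0) = 0$, the tolerance, and the numeric values of `P`, `H` and `C` in the usage note are hypothetical and are not taken from the paper.

```python
def iterate_income(P, H, C, tol=1e-12, max_steps=10_000):
    """Iterate R(t) = P + H @ R(t-1) + C from R(0) = 0 until it stabilizes.

    P and C are lists of length M; H is an M x M list of lists with zero
    diagonal and non-negative entries whose row sums are below 1.
    Returns the stable vector and the number of steps taken.
    """
    n = len(P)
    R = [0.0] * n
    for step in range(1, max_steps + 1):
        new_R = [
            P[i] + C[i] + sum(H[i][j] * R[j] for j in range(n))
            for i in range(n)
        ]
        if max(abs(new_R[i] - R[i]) for i in range(n)) < tol:
            return new_R, step
        R = new_R
    raise RuntimeError("iteration did not converge within max_steps")
```

For example, with `P = [0.5, 0.4]`, `H = [[0.0, 0.2], [0.3, 0.0]]` and `C = [0.0, 0.0]`, the iteration settles at the solution of the linear system R = P + H R + C after a few dozen steps, and the limit does not depend on the starting vector.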

B. NASH EQUILIBRIUM ANALYSIS
The following takes M = 2 as an example to discuss the Nash equilibrium of the mining pool under various strategic options.

1) NO MINING POOL ATTACK
When both mining pools choose N (Not to attack), that is, a ij = a ji = 0(i, j = 1, 2), and the profit obtained by mining pool 1 is p 1 · 1 = p 1 , in the same way, the profit of mining pool 2 is p 2 .

2) A MINING POOL ADOPTS THE A(ATTACK) STRATEGY
When mining pool 1 chooses not to attack and mining pool 2 chooses to attack. That is a 12 = 0, a 21 > 0, δ 2 ≥ 0. a 21 p 2 δ 2 is the betrayal power of mining pool 2. p 3 is the total power of the individual mining in the system. Obviously a 33 = 1. The effective mining power profit of mining pool 1 is: The average power income of mining pool 1 is the effective mining power income plus the reward obtained from the system when mining pool 1 chooses not to attack is divided equally by the mining computing power of mining pool 1 and mining pool 2. The average power income of mining pool 1 is: The effective mining power profit of mining pool 2 is: The average power income of mining pool 2 is the effective mining power income plus the income from the potential power without betrayal minus the punishment for choosing to attack mining pool 1, which is equally divided in the power without betrayal of mining pool 2. The average power income of mining pool 2 is: Mining pool 2 will maximize its own income by controlling the infiltration rate to mining pool 1, that is, the value of a 21 (0 < a 21 < 1). Because mining pool 1 does not respond to the attack of mining pool 2, the value of a 21 when the mining pool 2 maximizes the return value is the stable state of the system. Thereby: Substitute the stable value a 21 to get the profit value of mining pool 1 and mining pool 2.
Similarly, when mining pool 2 chooses not to attack and mining pool 1 chooses to attack. That is a 12 > 0, δ 1 ≥ 0, a 21 = 0, a 12 p 1 δ 1 is the betrayal power of mining pool 1. p 3 is the total power of the individual mining in the system. The effective mining power profit of mining pool 1 is: The average power income of mining pool 1 is the effective mining power income plus the income from the potential power without betrayal minus the punishment for choosing to attack mining pool 2, which is equally divided in the power without betrayal of mining pool 1. The average power income of mining pool 1 is: The effective mining power profit of mining pool 2 is: The average power income of mining pool 2 is the effective mining power income plus the reward obtained from the system when mining pool 2 chooses not to attack is divided equally by the mining computing power of mining pool 1 and mining pool 2. The average power income of mining pool 2 is: Similarly, Mining pool 1 will maximize its own income by controlling the infiltration rate to mining pool 2, that is, the value of a 12 (0 < a 12 < 1). Because mining pool 2 does not respond to the attack of mining pool 1, the value of a 12 when the mining pool 1 maximizes the return value is the stable state of the system. Thereby: Substitute the stable value a 12 to get the profit value of mining pool 1 and mining pool 2.
Let $r_1$ and $r_2$ denote the effective mining power profits of mining pool 1 and mining pool 2. The average power income of mining pool 1 is its effective mining power income plus the income from its infiltrating power that does not betray, minus the punishment for choosing to attack mining pool 2, divided equally over the actual total power of mining pool 1 and the power infiltrated from mining pool 2. Similarly, the average power income of mining pool 2 is its effective mining power income plus the income from its infiltrating power that does not betray, minus the punishment for choosing to attack mining pool 1, divided equally over the actual total power of mining pool 2 and the power infiltrated from mining pool 1.
To solve for the values of $R_1$ and $R_2$ in the stable state, Eqs. (41) and (43) can be solved simultaneously, which gives the average power incomes as functions of $a_{12}$ and $a_{21}$:

$$R_1(a_{12}, a_{21}) = \frac{r_1 p_2 + (r_1 + r_2) a_{12} p_1 - r_2 a_{12} p_1 \delta_1 - r_1 a_{21} p_2 \delta_2}{p_1 p_2 + a_{12} p_1^2 + a_{21} p_2^2 - \delta_1 a_{12}^2 p_1^2 - \delta_2 a_{21}^2 p_2^2 + [(\delta_2 + \delta_1) a_{12} a_{21} - \delta_1 a_{12} - \delta_2 a_{21}] p_1 p_2} \tag{46}$$

$$R_2(a_{12}, a_{21}) = \frac{r_2 p_1 + (r_1 + r_2) a_{21} p_2 - r_1 a_{21} p_2 \delta_2 - r_2 a_{12} p_1 \delta_1}{p_1 p_2 + a_{12} p_1^2 + a_{21} p_2^2 - \delta_1 a_{12}^2 p_1^2 - \delta_2 a_{21}^2 p_2^2 + [(\delta_2 + \delta_1) a_{12} a_{21} - \delta_1 a_{12} - \delta_2 a_{21}] p_1 p_2} \tag{47}$$
In each round of the game, a mining pool optimizes its own revenue by controlling its own infiltration rate. When neither mining pool 1 nor mining pool 2 changes its infiltration rate to increase revenue, the stable state is a Nash equilibrium. That is, the pair $a_{12}$, $a_{21}$ satisfies:

$$\arg\max_{a_{12}} R_1(a_{12}, a_{21}) = a_{12}; \quad \arg\max_{a_{21}} R_2(a_{12}, a_{21}) = a_{21}. \tag{48}$$

Similarly, in each round of the game the mining pools optimize their revenue by controlling their infiltration rates, and the stable state in which neither mining pool changes its infiltration rate is the Nash equilibrium:

$$\arg\max_{a_{12}} R_1(a_{12}, a_{21}) = a_{12}; \quad \arg\max_{a_{21}} R_2(a_{12}, a_{21}) = a_{21}. \tag{52}$$

Therefore, in the Nash equilibrium state, the values of $a_{12}$ and $a_{21}$ satisfy:

$$\frac{\partial R_1(a_{12}, a_{21})}{\partial a_{12}} = 0; \quad \frac{\partial R_2(a_{12}, a_{21})}{\partial a_{21}} = 0. \tag{53}$$
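The equilibrium condition above can be approximated numerically by letting the two pools take turns choosing a best response on a grid of infiltration rates. The sketch below evaluates $R_1$ and $R_2$ with formulas (46) and (47) as printed. The expressions used for the effective-power shares `r1` and `r2` are an assumption: retained power plus betraying power received from the other pool, divided by the total power that is not withholding blocks. The function names, grid size and round limit are hypothetical.

```python
def incomes(a12, a21, p1, p2, delta1, delta2):
    """Average power incomes (R1, R2) following formulas (46) and (47)."""
    # Assumption: infiltrating power that does not betray withholds blocks,
    # so it is removed from the system's effective mining power.
    withheld = (1 - delta1) * a12 * p1 + (1 - delta2) * a21 * p2
    effective_total = 1 - withheld
    r1 = ((1 - a12) * p1 + delta2 * a21 * p2) / effective_total
    r2 = ((1 - a21) * p2 + delta1 * a12 * p1) / effective_total
    den = (p1 * p2 + a12 * p1**2 + a21 * p2**2
           - delta1 * a12**2 * p1**2 - delta2 * a21**2 * p2**2
           + ((delta1 + delta2) * a12 * a21 - delta1 * a12 - delta2 * a21) * p1 * p2)
    R1 = (r1 * p2 + (r1 + r2) * a12 * p1
          - r2 * a12 * p1 * delta1 - r1 * a21 * p2 * delta2) / den
    R2 = (r2 * p1 + (r1 + r2) * a21 * p2
          - r1 * a21 * p2 * delta2 - r2 * a12 * p1 * delta1) / den
    return R1, R2


def best_response_equilibrium(p1, p2, delta1, delta2, grid=200, rounds=50):
    """Alternate grid best responses; return (a12, a21, converged)."""
    rates = [i / grid for i in range(grid)]  # infiltration rates in [0, 1)
    a12 = a21 = 0.0
    for _ in range(rounds):
        new_a12 = max(rates, key=lambda a: incomes(a, a21, p1, p2, delta1, delta2)[0])
        new_a21 = max(rates, key=lambda a: incomes(new_a12, a, p1, p2, delta1, delta2)[1])
        if (new_a12, new_a21) == (a12, a21):
            # Neither pool wants to move: a mutual best response on the grid.
            return a12, a21, True
        a12, a21 = new_a12, new_a21
    return a12, a21, False
```

When no pool infiltrates, `incomes(0, 0, ...)` returns an average power income of 1 for both pools, which is the no-attack baseline. The returned flag indicates whether the alternating updates settled; if it is `False`, the final pair should not be read as an equilibrium.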

IV. SIMULATION
Through numerical simulation, this section explores the influence of the mining pool's computing power, the ratio of the power to be infiltrated, and the betrayal rate of dispatched miners on the mining pool's infiltration rate selection and income. We first consider the case in which only one mining pool attacks, assuming that mining pool 1 chooses the attack strategy and mining pool 2 chooses not to attack. Comparing the three sets of figures in Figs. 2-10, we find that they all show similar trends.
Fig. 2, Fig. 5 and Fig. 8 show the infiltrate rate $a_{12}$ at the Nash equilibrium for different $p_1$ and $p_2$. As the computing power of mining pool 2 gradually decreases, the infiltrate rate $a_{12}$ of mining pool 1 also gradually decreases. When the total mining power of pool 1 and pool 2 is close to 1, the infiltrate rate of pool 1 is larger. As the miner betrayal rate $\delta_1$ of mining pool 1 increases, the infiltrate rate $a_{12}$ of mining pool 1 also gradually increases.
Fig. 3, Fig. 6 and Fig. 9 show the average power income of pool 1 under the optimal infiltration. The profit gained by mining pool 1 when it chooses the attack strategy is higher than when it chooses not to attack, that is, choosing the attack strategy can increase its own revenue. When the total mining power of pool 1 and pool 2 is close to 1, the average power income $R_1$ of pool 1 is larger. As the miner betrayal rate $\delta_1$ of mining pool 1 increases, the average power income $R_1$ of mining pool 1 gradually decreases: the more miners betray, the more adverse the impact on the profit of mining pool 1.
Fig. 4, Fig. 7 and Fig. 10 show the average power income of pool 2 under the optimal infiltration. The average power income of mining pool 2 is generally lower than that of mining pool 1.
This shows that when the opposing mining pool chooses not to attack, attacking can increase a mining pool's own revenue and effectively reduce the revenue of the opposing mining pool. When the total mining power of pool 1 and pool 2 is close to 1, the average power income $R_2$ of pool 2 is larger. As the miner betrayal rate $\delta_1$ of mining pool 1 increases, the average power income of mining pool 2 does not change markedly, but it does increase slightly. The results show that the more miners of mining pool 1 betray, the greater the effect on the income of mining pool 2. On the whole, however, when mining pool 1 chooses to attack, its income is still higher than that of mining pool 2.
Next, we consider the case in which the two mining pools attack each other. Figs. 11-13 show the values of $a_{12}$ at the Nash equilibrium for different $p_1$ and $p_2$ when $\delta_1$ is 0.2, 0.5 and 0.8, respectively. When the total mining power of pool 1 and pool 2 is close to 1, the average power income $R_1$ of pool 1 is larger, and as $\delta_1$ increases, the value of $a_{12}$ gradually increases. Figs. 14-16 show how $a_{12}$ at the Nash equilibrium changes with $A_1$ when $\delta_1$ is 0.2, 0.5 and 0.8, respectively. Because the computing power of the mining pool varies, the same ratio of the power to be infiltrated $A_1$ corresponds to multiple optimal infiltrate rates $a_{12}$. As $A_1$ increases, the infiltrate rate $a_{12}$ of mining pool 1 gradually decreases. On the whole, as the betrayal rate $\delta_1$ increases, the threshold of $A_1$ at which the infiltrate rate $a_{12}$ drops to 0 gradually decreases, and for the same ratio of the power to be infiltrated $A_1$, the infiltrate rate $a_{12}$ decreases as the betrayal rate $\delta_1$ increases.

V. CONCLUSION
In contrast to existing papers, this paper first builds a mining pool game model based on the PoW consensus mechanism from the perspective of adding rewards and punishments to the blockchain system, and analyzes its pure-strategy and mixed-strategy Nash equilibria. It then considers block withholding attacks between mining pools: taking the infiltrate rate and betrayal rate of the mining pool into account, it builds the related models and analyzes the Nash equilibrium and the value of the infiltrate rate at that equilibrium. This discussion has not appeared in other papers. Finally, the influence of the mining pool's computing power, the ratio of the power to be infiltrated, and the betrayal rate of dispatched miners on the mining pool's infiltration rate selection and income is explored by numerical simulation. Blockchain technology has spread across many industries and has become one of the most closely watched information technologies, and the PoW consensus mechanism has always played an important role in it. Mutual attacks between mining pools have a very adverse impact on the application of blockchain technology. This article analyzes the Nash equilibrium from the perspectives of adding reward and punishment mechanisms to the blockchain system and of block withholding attacks between mining pools, and discusses the value of the infiltrate rate that maximizes the profit of the mining pool, which contributes to research on blockchain technology. However, blockchain technology integrates a variety of complex computer technologies such as encryption algorithms, P2P file transfer, and consensus mechanisms, and the scope of this article is comparatively narrow. In the future, game theory will continue to be integrated more deeply with the open problems of blockchain technology.
Based on the model studied in this paper, we will continue to study the mining dilemma in depth, hoping to provide more effective help in promoting blockchain technology.