An Energy Trace Compression Method for Differential Power Analysis Attack



I. INTRODUCTION
Cryptographic devices inevitably leak physical information, such as power consumption, electromagnetic radiation, and running time, when performing encryption or decryption operations. Side-channel attacks (SCAs) use this physical information to reveal the secret keys of cryptographic devices. According to the type of physical information exploited, the most important SCAs fall into three types: timing attacks [1], power analysis attacks [2], and electromagnetic radiation attacks [3]. Timing attacks [1] were first proposed in the field of cryptographic devices in 1996. Since then, non-intrusive SCA technology has set off a wave of research in the field of information security [4]-[8]. Power analysis attacks [2] exploit the energy consumption characteristics of cryptographic devices rather than the mathematical characteristics of cryptographic algorithms. Because of their simple operation, wide application range, and high success rates, power analysis attacks have become one of the most commonly used and effective SCA methods. Power analysis has successfully attacked various cryptographic algorithms, for example, AES [9], DES [10], and RSA [11]. It can also attack various encryption devices, such as smart cards, field-programmable gate arrays (FPGAs), microcontrollers, ASICs, and crypto SoCs [12]-[15].
The associate editor coordinating the review of this manuscript and approving it for publication was Noor Zaman.
Differential power analysis (DPA) attacks use statistical tools (also known as distinguishers) to reveal the relationship between the key and power consumption. DPA attack results are greatly affected by the signal-to-noise ratio [16] of the leaked information. To improve the success rate and efficiency of DPA attacks, it is particularly important to pre-process the power traces. This has two main purposes: one is to reduce the number of power consumption samples and the other is to decrease the amount of sample calculation. Pre-processing techniques include digital signal processing (DSP) technology [17], principal component analysis (PCA) [18], interception [19], and integration [20]. DSP methods mainly include wavelet transform denoising [21], [22], the Fourier transform [23], [24], low-pass filters [25], and so on. However, DSP technology requires knowledge of many parameters and requires that the signal and noise do not occupy the same frequency band, which places high demands on both the attacked device and the attacker. Thus, DSP technology is difficult to apply. In the literature [26], power traces were selected according to the principal components. Compared with the common correlation power attack (CPA) [27], the attack efficiency was greatly improved. Clavier [28], [29] looked for points of interest in power traces and applied collision-related techniques to recover the entire key. Meanwhile, Park et al. [30] proposed subtraction algorithm analysis on equidistant data (SAED), which extracted sensitive information using the event information of the subtraction operation in a reduction algorithm; however, an attacker still needed 256 power traces to obtain a data block. Based on the singular value decomposition, Zhou et al. [31] selected high-quality energy traces for DPA attacks, but did not reduce the amount of sample calculation.
Focusing on template attacks [32], Zhang and Zhou [33] evaluated the optimal number of interest points in a simulated scenario and provided a useful empirical formula; however, this was only applicable to Gaussian template attacks. Finally, Wang et al. [34] proposed a feature point extraction scheme based on the difference, but the experimental results still needed to be extracted manually; they also lacked objectivity and sufficient theoretical support.
Our contributions in this study are as follows. To reduce the amount of sample calculation and manual intervention, this paper studies an energy trace data preprocessing scheme for SoCs. Based on the current characteristics of complementary metal oxide semiconductor (CMOS) circuits, the position and range of the power consumption difference interval are analysed, which provides a theoretical basis for extracting the effective interval. This study then proposes a method for locating marker points based on the Hamming distance, which accurately locates the position of the power consumption data in the energy trace that has the strongest correlation with the key. A large number of experiments confirm that only a small amount of data near the marker is needed to form an effective interval, and that the attack effect is best when 10 sampling points near the marker are intercepted to form the effective interval.
The rest of the study is organized as follows. Section 2 briefly introduces the relevant research background. Section 3 analyses the power consumption difference interval. Section 4 explains how the Hamming distance classification is used to locate markers. Section 5 discusses the experimental verification of the effective interval. Finally, Section 6 concludes the study.

II. RELATED BACKGROUND

A. ENERGY MODEL
DPA attacks use an energy model as the criterion for distinguishing between right and wrong keys. The commonly used energy models are the Hamming distance (HD) model [35] and the Hamming weight (HW) model [36], both of which represent the correlation between input data (i.e., plaintext and key) and power consumption. The HD model is simple in principle, convenient to implement, and more widely used. The HD is the number of bits that differ between $v_0$ and $v_1$, and is equal to the HW of $v_0 \oplus v_1$. The HW is the number of bits with a logical value of ''1'' in a binary string. The HD model assumes that all components have the same influence on power consumption, that is, the 0 → 1 conversion and the 1 → 0 conversion consume the same power. The total number of conversions is then used to characterize the power consumption of the circuit during this period. Therefore, the energy model based on the HD is expressed as follows:
$$E = a \cdot \mathrm{HD}(v_0, v_1) + b = a \cdot \mathrm{HW}(v_0 \oplus v_1) + b \tag{1}$$
where E is the energy consumed by the circuit while the register switches from state $v_0$ to state $v_1$, a is the energy consumption ratio coefficient, and b is the power consumption and noise that are not related to the processed data.
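As a minimal illustration of the HD model, the following Python sketch computes the HW and HD of register values and evaluates the usual linear form E = a · HD(v0, v1) + b, which is assumed here from the definitions of a and b above. The function names and the example values of a and b are hypothetical and chosen only for demonstration.

```python
def hamming_weight(value: int) -> int:
    """Number of bits equal to 1 in a non-negative integer."""
    return bin(value).count("1")


def hamming_distance(v0: int, v1: int) -> int:
    """Number of differing bits, i.e. the HW of v0 XOR v1."""
    return hamming_weight(v0 ^ v1)


def hd_energy(v0: int, v1: int, a: float = 1.0, b: float = 0.0) -> float:
    """Assumed linear HD energy model: E = a * HD(v0, v1) + b."""
    return a * hamming_distance(v0, v1) + b


# Example: an 8-bit register switching from 0xB4 to 0x71.
hd_example = hamming_distance(0xB4, 0x71)
energy_example = hd_energy(0x00, 0xFF, a=0.5, b=2.0)
```

Because the model counts 0 → 1 and 1 → 0 transitions equally, swapping v0 and v1 leaves the predicted energy unchanged.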

B. STEPS OF DPA ATTACK BASED ON THE MEAN DIFFERENCE
The general method of the DPA attack requires computing the correlation between two matrices. This calculation is cumbersome when the matrices are large. To simplify it, the mean difference is used instead of the correlation coefficient. The steps of the DPA attack based on the mean difference are as follows:
• Step 5: According to each column vector in matrix H, T is divided into two subsets. The first subset contains the rows in T whose indices are equal to the indices of the 0 entries in the column vector, and the second subset contains all the remaining rows in T. Then, the two subsets are averaged separately to get two rows. Finally, the difference between the two mean rows is computed to get one row. A total of N rows are obtained for the N column vectors of H.
• Step 6: Comparing All Mean Differences. Compare the N mean difference rows obtained in step 5. The key corresponding to the row with the largest value is the key obtained by the attack, and the time corresponding to the column is the moment of maximum information leakage.
VOLUME 8, 2020
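The partition-and-compare procedure can be sketched in Python as follows. This is only an illustration under stated assumptions: T is taken to be a list of energy traces (one row per encryption), H a matrix of 0/1 hypothetical power values with one column per key hypothesis, and the "largest value" is interpreted as the largest absolute mean difference. The function names are hypothetical.

```python
def column_mean(rows):
    """Point-wise average of a non-empty list of equal-length traces."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def mean_difference_dpa(T, H):
    """Return (best_key, best_time, diffs) for traces T and 0/1 matrix H.

    Assumes every column of H contains at least one 0 and one non-zero entry.
    """
    num_keys = len(H[0])
    diffs = []
    for k in range(num_keys):
        zeros = [t for t, h in zip(T, H) if h[k] == 0]
        others = [t for t, h in zip(T, H) if h[k] != 0]
        m0, m1 = column_mean(zeros), column_mean(others)
        diffs.append([x1 - x0 for x0, x1 in zip(m0, m1)])
    # Assumption: the winning key is the one with the largest |difference|.
    best_key, best_time = max(
        ((k, j) for k in range(num_keys) for j in range(len(diffs[k]))),
        key=lambda kj: abs(diffs[kj[0]][kj[1]]),
    )
    return best_key, best_time, diffs


# Toy data: 4 traces of 3 samples; only key hypothesis 1 explains sample 1.
T_demo = [[0, 1, 0], [0, 5, 0], [0, 1, 0], [0, 5, 0]]
H_demo = [[0, 0], [0, 1], [1, 0], [1, 1]]
best_key, best_time, diffs_demo = mean_difference_dpa(T_demo, H_demo)
```

In the toy data, the partition induced by hypothesis 0 mixes both trace shapes, so its mean difference vanishes, whereas hypothesis 1 separates them cleanly.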

C. ENERGY TRACE COMPRESSION
Not all data in the energy trace are related to the hypothesized power consumption, that is, there is redundancy in the energy trace. Energy trace compression is designed to try to remove redundant data. In fact, on the premise of retaining information, reducing the amount of data in the energy trace as much as possible can greatly improve the time efficiency of the attack. A previous analysis [37] has shown that the peaks appearing in the energy traces are the most relevant points for energy analysis attacks. Based on this conclusion, two commonly used energy trace compression techniques have been proposed: one is maximum value extraction and the other is integration. The former only keeps the maximum value of the energy trace in each clock cycle, and the latter integrates the energy traces near the peaks into a value, such as summing or square summing. However, it is still impossible to know which segment of the energy trace has a strong correlation with the hypothesized power consumption value.
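The two compression techniques can be sketched in Python as follows. As a simplifying assumption, the integration variant here sums over each whole clock cycle rather than only over the samples near a peak, and the function names and the `points_per_cycle` parameter are hypothetical.

```python
def compress_max(trace, points_per_cycle):
    """Maximum value extraction: keep one maximum per clock cycle."""
    return [
        max(trace[i:i + points_per_cycle])
        for i in range(0, len(trace), points_per_cycle)
    ]


def compress_integrate(trace, points_per_cycle, square=False):
    """Integration: sum (or sum of squares) of the samples in each cycle."""
    out = []
    for i in range(0, len(trace), points_per_cycle):
        cycle = trace[i:i + points_per_cycle]
        out.append(sum(x * x for x in cycle) if square else sum(cycle))
    return out


# Toy trace with two clock cycles of three samples each.
trace_demo = [1, 3, 2, 0, -4, 1]
max_demo = compress_max(trace_demo, 3)
sum_demo = compress_integrate(trace_demo, 3)
sq_demo = compress_integrate(trace_demo, 3, square=True)
```

Both variants reduce each cycle to a single value, but neither indicates which part of the cycle actually correlates with the hypothesized power consumption, which is the limitation noted above.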

D. ATTACK SUCCESS RATE AND RELIABILITY
The attack success rate was first proposed by Standaert et al. [38]. Since then, the success rate has been widely used to evaluate the probability of key recovery in DPA attacks. The success rate is defined as the probability that the key of the attacked device is successfully recovered with a given number of power traces. Generally, it is not necessary to recover all the key bits to prove that the encryption device has failed. If an 8-bit sub-key is shown to leak sensitive information seriously, the device may already be considered to have failed [37]. Therefore, the partial success rate (PSR) is also often used. In this paper, the success rate refers to the PSR.
In addition, to evaluate the reliability of the recovered key, this paper uses the Euclidean distance fluctuation Devia. It is computed from min, submin, and E, which represent the minimum value, the next smallest value, and the average value of the Euclidean distance over all keys, respectively. The greater the difference between E-min and E-submin, the closer Devia is to 1, the more prominent the uniqueness of the minimum value, and the higher the reliability of the successful attack.
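For illustration only, the following Python sketch evaluates one expression that is consistent with this description, namely 1 − (E − submin)/(E − min). This form is an assumption made here, not the paper's stated definition; the function name is hypothetical.

```python
def devia_sketch(distances):
    """Assumed form: 1 - (E - submin) / (E - min).

    `distances` holds one Euclidean distance per key hypothesis.
    Undefined (division by zero) if the mean equals the minimum.
    """
    ordered = sorted(distances)
    smallest, second = ordered[0], ordered[1]
    mean = sum(distances) / len(distances)
    return 1.0 - (mean - second) / (mean - smallest)


# The minimum stands out clearly in the first list, less so in the second.
clear_case = devia_sketch([1.0, 4.0, 5.0, 6.0])
weak_case = devia_sketch([2.0, 3.0, 5.0, 6.0])
```

Under this assumed form, the value approaches 1 as the second-smallest distance moves towards the average, which matches the qualitative behaviour described above.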

III. POWER CONSUMPTION DIFFERENCE INTERVAL AND INTERMEDIATE VALUE SELECTION
Digital circuits are made up of logic components, including combinational components and sequential components. The output of a combinational component is used as the input of another. Such a circuit is called a multi-stage combinational circuit. When an input of a combinational circuit changes, the output does not necessarily change with it.
This phenomenon is more pronounced in multi-stage circuits, that is, in a multi-stage circuit the output is even less likely to change when an input changes. The input signal is then considered blocked. Assume that the probability of each input value being 0 or 1 is 0.5; then, the probability of the input signal of a two-input AND gate, NAND gate, OR gate, or NOR gate being blocked is 5/8, that is to say, the probability of the output of these logic gates changing due to input changes is 3/8. Therefore, for a multi-stage circuit composed of such logic gates, the input signal can usually pass only a few gates on its way to the output. For example, the probability that the output signal of the register reaches the output terminal after passing through a six-stage combinational circuit is approximately $(3/8)^6 \approx 0.003$, which is almost zero. Therefore, after the register output signal has passed through six gate stages of the combinational circuit, the signal tends to be stable and the dynamic energy consumption almost disappears.
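The 3/8 figure can be checked by exhaustive enumeration. The Python sketch below assumes that the old and the new input pair of a gate are independent and uniformly distributed, which is one reading of the 0.5 probability stated above; under that assumption it counts how often the gate output differs between the two input pairs.

```python
from fractions import Fraction
from itertools import product

GATES = {
    "AND": lambda a, b: a & b,
    "NAND": lambda a, b: 1 - (a & b),
    "OR": lambda a, b: a | b,
    "NOR": lambda a, b: 1 - (a | b),
}


def change_probability(gate):
    """Probability that the output differs between two random input pairs."""
    changes = 0
    total = 0
    for a0, b0, a1, b1 in product((0, 1), repeat=4):
        total += 1
        if gate(a0, b0) != gate(a1, b1):
            changes += 1
    return Fraction(changes, total)


probabilities = {name: change_probability(g) for name, g in GATES.items()}
# Probability of a transition surviving six such stages in a row.
six_stage = float(Fraction(3, 8) ** 6)
```

Each of the four gates gives 6 changing cases out of 16, and six consecutive stages give 729/262144, which rounds to the 0.003 quoted above.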
When a signal passes through a logic component, or even a wire, there is a delay. Within a clock cycle, because signals are blocked in this way, the power consumption waveform of the circuit generally lasts no longer than 6t (here t is assumed to be the average propagation delay of each logic gate, and the propagation delay of the wires is not considered), and the amplitude of the waveform gradually decays with time. Therefore, the closer a part of the power waveform is to the starting point, the better it reflects the circuit activity.
The energy consumption of a cryptographic device depends on the intermediate value processed during the execution of the algorithm. Choosing an appropriate intermediate value can increase the attacker's success rate. The first round of the 128-bit AES algorithm is shown in Fig. 1. The plaintext and the key are exclusive-ORed before the data register, and the combinational circuit then starts the first round of encryption. Each round consists of four transformations, called SubBytes, ShiftRows, MixColumns, and AddRoundKey (the 10th round does not perform MixColumns). According to the above discussion, the power consumption data with the strongest correlation with the key should be concentrated near the register, that is, the place marked ''Position A'' in the figure, which is why the output of the S-box is often chosen as the intermediate value.
Different input data lead to different energy consumption, that is, a different plaintext or key produces a different power consumption waveform. This phenomenon is called the correlation between energy traces and input data. In DPA attacks, the absolute value of the energy traces is meaningless; only the difference between the power consumption caused by different inputs matters. The so-called power consumption difference interval is the part of an energy trace that differs from other traces because of different input data. According to the above analysis, the length of this part usually does not exceed 6t. If the selected length is too large, more noise is introduced, which not only increases the amount of sample calculation but may also cause the attack to fail. If it is too small, the attack may fail directly because the interval does not cover the effective information leakage. Therefore, the number of sampling points $N_{diff}$ should be neither too large nor too small, and can be calculated using (3), where $f_{sample}$ is the sampling frequency of the energy trace and t is the average gate delay defined above:
$$N_{diff} = 6t \cdot f_{sample} \tag{3}$$
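As a worked example, the Python sketch below evaluates the interval length as 6t multiplied by the sampling frequency, using the values reported later in the experimental section (an average gate delay of 0.25 ns and a sampling rate of 5 Gsps). Rounding the result up to a whole number of samples is an assumption made here; the function name is hypothetical.

```python
import math


def n_diff(gate_delay_s, f_sample_hz, stages=6):
    """Number of samples covering `stages` gate delays at the given rate."""
    return stages * gate_delay_s * f_sample_hz


value = n_diff(0.25e-9, 5e9)   # 6 * 0.25 ns * 5 Gsps
rounded = math.ceil(value)     # assumed rounding to whole samples
```

The result, 7.5 samples, agrees with the value of about 8 used in the experiments.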

IV. ENERGY TRACE MARKER LOCATION
From the above discussion, it is known that the selected intermediate value should correspond to a vicinity of a certain peak in the energy trace; this peak is called an energy trace marker point. However, generally, it is not known where the peak is because there are many peaks in the energy trace.
The following approach is taken to obtain the location of the marker.
1) … (4)).
2) … Fig. 1) to obtain a hypothetical intermediate value column vector V.
3) … Finally, the rows in T are also divided into 3 categories. The first category $T_1$ contains the rows in T whose index values are equal to the HD index values of the first category. The second category $T_2$ contains the rows in T whose index values are equal to the HD index values of category 2, and the last category contains the rest. Since the energy traces in the third category have little effect on the positioning of the markers, they are not processed.
4) Average category $T_1$ and category $T_2$ separately to obtain two energy trace classification centre traces $CM_k$ under the correct key, where k = 1, 2.
5) Select a wrong key hypothesis $k_{wrong}$, and repeat steps 2, 3, and 4 to get two classification centre traces $WM_k$ under the wrong key. Subtract $WM_k$ from $CM_k$ to get two differential centre traces $DM_k = CM_k - WM_k$.
The position where the two $DM_k$ curves show their largest peak at the same time is the position where the energy trace has the strongest correlation with the intermediate value, which is the so-called marker point. After obtaining the marker, a small number of sampling points near the marker are taken to constitute an effective attack interval. Fig. 2 shows an example in which the left subgraph contains the two energy centre trace difference curves. It is clear that there are many peaks on both curves near the 6000th point. By enlarging the waveform near this point, we get the right subgraph, in which the two curves have their largest peak value at the 6110th point. Therefore, point 6110 is the marker we found.
FIGURE 2. Two difference palpitations and partial enlargement.
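The classification and differencing steps can be sketched in Python as follows. This is a simplified illustration, not the exact procedure: the HD value of each trace is assumed to be precomputed for the correct and for the wrong key hypothesis, the category thresholds `low` and `high` are hypothetical, and "largest peak at the same time" is interpreted as the sample that maximizes the smaller of the two absolute differences.

```python
def class_mean(traces, hd, keep):
    """Average the traces whose HD value satisfies `keep` (must be non-empty)."""
    rows = [t for t, h in zip(traces, hd) if keep(h)]
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def locate_marker(traces, hd_correct, hd_wrong, low=2, high=6):
    """Return (marker_index, [DM_1, DM_2]) under the assumptions above."""
    categories = (lambda h: h <= low, lambda h: h >= high)
    cm = [class_mean(traces, hd_correct, c) for c in categories]
    wm = [class_mean(traces, hd_wrong, c) for c in categories]
    dm = [[c - w for c, w in zip(cm_k, wm_k)] for cm_k, wm_k in zip(cm, wm)]
    # Assumed criterion: both difference curves must be large at the marker.
    score = [min(abs(x), abs(y)) for x, y in zip(dm[0], dm[1])]
    marker = max(range(len(score)), key=score.__getitem__)
    return marker, dm


# Toy data: 8 traces of 5 samples; only sample 3 depends on the correct HD.
hd_ok = [1, 1, 7, 7, 1, 7, 4, 4]
hd_bad = [1, 7, 1, 7, 4, 4, 1, 7]
traces_demo = [[0, 0, 0, h, 0] for h in hd_ok]
marker_demo, dm_demo = locate_marker(traces_demo, hd_ok, hd_bad)
```

In the toy data, the two difference curves are non-zero only at sample 3, with opposite signs, so that sample is returned as the marker.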

V. EXPERIMENT

A. EXPERIMENTAL ENVIRONMENT
The experimental object is an SoC chip with a cryptographic coprocessor. The coprocessor is implemented in the SMIC 130 nm process. The total area of the SoC is 124 mm², and the total power is 6.25 mW. The coprocessor runs a 128-bit AES algorithm, with an area of 0.06 mm², an average gate delay of 0.25 ns, and a power consumption of only 0.04 mW, which accounts for a very small percentage of the total power consumption. Because the coprocessor and the CPU are packaged together and cannot be isolated, and because noise interference from the same PCB is relatively large, the attack is considerably more difficult.
To reduce interference, a custom acquisition card is used in the experiment. The specific parameters of the environment are shown in Table 1. During the execution of the algorithm, the acquisition card collects the energy consumed by the SoC by measuring the voltage across a 0.1 Ω sampling resistor connected in series with the power ground wire. Fig. 3 shows the experimental platform. The acquisition is controlled by the PowerAnalysis software.

B. EXPERIMENTAL METHODS AND STEPS
The sampling frequency of the acquisition card is fixed at 5 Gsps, and the working frequency of the cryptographic device is 20 MHz. Thus, 250 points are sampled per clock cycle, and each point is quantized to 10 bits. A complete original energy trace has 15,000 sampling points in total. According to (3), $N_{diff}$ is about 8. The mean difference DPA method is used for the attack, and the attack results are evaluated using Devia and PSR. The experimental steps are as follows:

C. EXPERIMENTAL RESULTS AND ANALYSIS

1) MARKER POINTS
For each of the 10 randomly selected correct keys, we record the marker points obtained under the 255 wrong keys in sequence, count all 2550 marker points, and obtain the frequency histogram shown in Fig. 4. It can be seen that the markers are highly repeatable at positions 5610 and 6110, and that these two points together account for 83% of all occurrences. Therefore, the strongest points of information leakage can be considered to be concentrated at these two locations, and either one can be selected as a marker. To compare the attack effect of the two markers, both 5610 and 6110 are selected as marker points. The attacks are performed on the 10 keys, and the number of successful attacks is shown in Table 2. It can be seen from Table 2 that, when 10 and 200 sampling points near the marker points 5610 and 6110 are intercepted for the attack, the number of successful attacks at marker point 5610 is lower than that at marker point 6110. At the same time, Fig. 4 shows that marker 6110 occurs most frequently, 1180 times, accounting for 46% of the total. Therefore, and owing to space limitations, only the experimental results with point 6110 as the marker are given below.

2) THE ATTACK INTERVAL
In this study, 2, 5, 10, 20, 50, 100, 200, and 400 sampling points are intercepted around marker 6110 to form eight different effective attack intervals, and the attack results for all 256 keys in each interval are shown in Fig. 6, Fig. 7, and Table 3. Fig. 6 shows the attack on key 58. The horizontal axis of the figure represents the keys and the vertical axis is the Euclidean distance. The sub-graphs in the figure correspond to the 8 attack intervals, ordered from top to bottom and from left to right. The first seven sub-graphs show that the attack result key is 58 and the attack is successful. The last sub-graph shows that the attack result key is 181 and the attack fails. Among the first 7 correct results, the third Devia value is the largest, reaching 0.71. This result shows that, for key 58, the optimal attack interval is [6105: 6114]. Therefore, the optimal number of points in the valid interval is 10, including the marker point itself.
The horizontal axis of Fig. 7 represents the different intervals, and the vertical axis denotes Devia values. The maximum, minimum, and median values of Devia are given for each interval. It can be seen that intercepting 10 points around marker 6110 as the effective attack interval gives the best result, because the corresponding maximum, minimum, and median Devia values are the largest among the 8 intervals. From this, three conclusions can be drawn. First, the circuit does leak a lot of secret information at 6110. Second, the effective interval should be neither too small nor too large. If it is too small, the selected range may not cover the information leakage range of the energy trace; if it is too large, it introduces too much noise. Third, the size of the best effective interval is basically consistent with the result that $N_{diff}$ is about 8 (as calculated according to (3)). Table 3 shows the PSR, which represents the attack success rate, and the corresponding median Devia value for each interval. When the number of points in the interval is 5, 10, or 20, the attack on every sub-key is successful. As the number of points increases further, the PSR decreases. When there are more than 400 points, no attack is successful. If there are only two points, the attack success rate is only 95%. At the same time, when the number of points in the interval is 10, the median Devia value, 0.702, is the largest. Therefore, judging by both the PSR and the Devia values, it is best to intercept 10 points around marker 6110 as the effective attack interval.
Both Fig. 7 and Table 3 show that the best attack effect is achieved by selecting 10 points near 6110 as the effective attack interval. Therefore, the amount of data used for the attack from each energy trace is reduced from the 250 points of one cycle to 10 points, a reduction of 96%, thus improving the time efficiency of the DPA attack.
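Extracting the effective interval amounts to a simple slice of each trace. The Python sketch below assumes that list indices coincide with the sample numbers used in this section (so that the 10-point interval around marker 6110 is [6105: 6114]); the function name and the bounds check are additions made here for illustration.

```python
def effective_interval(trace, marker, n_points=10):
    """Return `n_points` samples around `marker`, starting n_points // 2 before it."""
    start = marker - n_points // 2
    if start < 0 or start + n_points > len(trace):
        raise ValueError("interval falls outside the trace")
    return trace[start:start + n_points]


# Stand-in trace whose sample values equal their indices.
trace_demo = list(range(15000))
interval_demo = effective_interval(trace_demo, 6110)
reduction = 1 - len(interval_demo) / 250   # relative to one 250-point cycle
```

With 10 points kept out of the 250 in one clock cycle, the reduction evaluates to 0.96.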

3) SAMPLING FREQUENCY
In this experiment, a custom-made acquisition card is used, and the sampling frequency is fixed at 5 Gsps. Can the sampling rate be reduced? To answer this, the sampling frequency is decreased to 2.5 Gsps, 1.25 Gsps, and 0.625 Gsps by equal-interval sampling. Within the best effective attack interval determined in step 4 of the experiment, the 10 selected keys are attacked again. The experimental results are shown in Fig. 8.
The horizontal axis of Fig. 8 represents the different sampling rates, and the vertical axis represents Devia values. The figure shows that, as the sampling rate is gradually reduced from 5 Gsps to 0.625 Gsps, all attacks remain successful and the Devia value changes little. For example, for key 18, the maximum value of Devia is about 0.70 and the minimum is 0.67 (the difference is 0.03, that is, only about 5%). It should be noted that when the sampling rate is reduced to 0.625 Gsps, the attack interval contains only one point, that is, the marker point itself (6110). This again shows that the circuit does leak a lot of secret information at sampling point 6110.
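Equal-interval sampling of the effective interval can be sketched in Python as follows. The sketch assumes that the retained samples are aligned with the marker, so that the marker itself always survives, and that list indices coincide with sample numbers; the alignment, the function name, and the stand-in trace are assumptions made here.

```python
def decimate_interval(trace, marker, n_points, factor):
    """Keep every `factor`-th sample of the interval, aligned on the marker."""
    start = marker - n_points // 2
    return [
        trace[i]
        for i in range(start, start + n_points)
        if (i - marker) % factor == 0
    ]


# Stand-in trace whose sample values equal their indices.
trace_demo = list(range(15000))
# Factors 1, 2, 4, 8 correspond to 5, 2.5, 1.25, and 0.625 Gsps.
kept = {f: decimate_interval(trace_demo, 6110, 10, f) for f in (1, 2, 4, 8)}
```

Under this alignment, a factor of 8 leaves only the marker point inside the 10-point interval, which corresponds to the single-point case at 0.625 Gsps described above.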

4) COMPARISON WITH OTHER METHODS
When comparing the results of DPA attacks, the type, energy model, target device, and implementation mode of the attacks should be considered. To make a fair comparison, we choose paper [39] as the comparison object, because it has the same cryptographic algorithm and energy trace compression purpose as this article. We include as much information as possible in Table 4. Kim and Ko [39] proposed a new selection method to improve power analysis attacks using principal component analysis and used the SASEBO-GII platform in their experiments, which includes a main FPGA and a control FPGA. The encryption component in the main FPGA (Xilinx Virtex-5 LX30) is completely separated from the control FPGA, minimizing the noise transmitted from the same PCB. Compared with the FPGA, the energy traces measured from the SoC are more complex and more difficult to attack. For one energy trace, [39] finally used 500 sampling points, while this paper uses 10 points near the marker, which greatly reduces the calculation cost.

VI. CONCLUSIONS
Based on the current consumption characteristics of CMOS circuits, this study discusses the location and range of the critical data needed for DPA attacks. Using the Hamming distance classification method, the power consumption data with the strongest correlation with the key are found, which solves the difficulty of manually locating the power consumption difference interval when analysing power traces. While maintaining the success rate, the energy trace compression method in this paper reduces the amount of sample calculation and the attack cost. Since the marker location method requires the correct key to be known in advance, the method in this article is suitable for designers of cryptographic devices or for people who own a device identical to the target device. In future research, we intend to apply these methods to other cryptographic algorithms and to different practical scenarios to extend their generality.