Integral Cryptanalysis of Lightweight Block Cipher PIPO

PIPO is a lightweight block cipher proposed at ICISC 2020, which has a byte-oriented structure suitable for bit-sliced implementation and allows for efficient higher-order masking implementations. In this study, we use bit-based division property techniques to construct 6-round integral distinguishers, and propose key-recovery attacks on 8 rounds of PIPO-64/128 and 10 rounds of PIPO-64/256. The data complexity of both attacks is $2^{63}$ chosen plaintexts, and the time complexities are $2^{125}$ and $2^{253.8}$, respectively. Our results complement the security analysis of PIPO, and show that the PIPO structure is resistant to recently researched cryptanalysis methods. Because only differential and linear attacks were carefully considered to determine the number of rounds of PIPO, our work, based on division property, is important for verifying the security margin.

In this study, we examine the division property for PIPO and find that the division property can propagate up to 6 rounds. Then, we construct 6-round integral distinguishers [2] based on the observations and perform key-recovery attacks on 8-round PIPO-64/128 and 10-round PIPO-64/256. The attack on 8-round PIPO-64/128 recovers a 128-bit key with $2^{63}$ chosen plaintexts and $2^{125}$ encryptions, whereas the attack on 10-round PIPO-64/256 recovers a 256-bit key with $2^{63}$ chosen plaintexts and $2^{253.8}$ encryptions.
Our results are summarized in Table 1. Integral cryptanalysis is an important tool for analyzing the security of block ciphers; however, to the best of our knowledge, no analysis of the resistance of PIPO to integral cryptanalysis has been published, even in [1]. Although our results do not weaken the security claim of full-round PIPO as presented in Table 2, they complement the security analysis by conducting attacks on the reduced-round versions.

The remainder of this paper is organized as follows. In Section II, we present the basic background and related work. Section III discusses how the PIPO structure is modeled in a form suitable for an MILP solver. In Section IV, we analyze the division properties of the PIPO structure. Section V presents the integral distinguishers and attacks on reduced rounds of PIPO. In Section VI, we present our conclusions.

VOLUME 10, 2022. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

II. PRELIMINARIES

A. SYMBOLS AND NOTATIONS
An n-bit binary vector $x \in \mathbb{F}_2^n$ is defined as $(x_{n-1}, x_{n-2}, \ldots, x_0)$, where $x_i \in \mathbb{F}_2$ for $0 \le i < n$. This can also be denoted by $x = x_{n-1} x_{n-2} \cdots x_0$. We define $x \lll i$ as the operation rotating a binary vector $x$ to the left by $i$ bits. We denote the concatenation of two binary vectors $x$ and $y$ by $x \| y$. We represent a sequence of consecutive identical bits by a single bit with a superscript. For example, the 7-bit string 1111000, or the 7-bit binary vector $(1, 1, 1, 1, 0, 0, 0)$, can be denoted by $1^4 0^3$.
Let $X$ be a multiset of n-bit vectors. We denote the output multiset of a map $f : \mathbb{F}_2^n \to \mathbb{F}_2^m$ by $f(X) := \{f(x) : x \in X\}$. For $x \in \mathbb{F}_2^n$ and $y \in \mathbb{F}_2^n$, we define $w(x) = \sum_{i=0}^{n-1} x_i$ as the Hamming weight of $x$, $x \succeq y$ as $x_i \ge y_i$ for all $0 \le i < n$, and $x \cdot y = \bigoplus_{i=0}^{n-1} x_i y_i$ as the inner product of $x$ and $y$, where $x_i y_i$ is the AND of $x_i \in \mathbb{F}_2$ and $y_i \in \mathbb{F}_2$. In addition, we define $x^y$ as the monomial $\prod_{i=0}^{n-1} x_i^{y_i}$.
Let $f$ be a Boolean function from $\mathbb{F}_2^n$ to $\mathbb{F}_2$, and let the algebraic normal form (ANF) of $f$ be $f(x) = \bigoplus_{y \in \mathbb{F}_2^n} \alpha_y x^y$ with $\alpha_y \in \mathbb{F}_2$. We define the set $\mathrm{ANF}_f$ of all the terms of $f$ as $\mathrm{ANF}_f = \{y \in \mathbb{F}_2^n \mid \alpha_y = 1\}$. Let $E : \mathbb{F}_2^k \times \mathbb{F}_2^n \to \mathbb{F}_2^n$ be a block cipher with a k-bit key and an n-bit block. $c = E_\kappa(p)$ indicates that the plaintext $p \in \mathbb{F}_2^n$ is encrypted to the ciphertext $c \in \mathbb{F}_2^n$ through the block cipher $E$ with the key $\kappa \in \mathbb{F}_2^k$. Furthermore, $Y = E_\kappa(X)$ implies that $Y$ is the (multi)set of ciphertexts to which the block cipher $E$ encrypts all plaintexts in the (multi)set $X$ with the key $\kappa \in \mathbb{F}_2^k$.
The key schedule of PIPO-64/128 splits a 128-bit master key $\kappa$ into two 64-bit parts, $\kappa = \kappa_1 \| \kappa_0$. Subsequently, the subkeys are defined as $sk_i = \kappa_{i \bmod 2} \oplus i$ for $0 \le i \le 13$. The key schedule of PIPO-64/256 splits a 256-bit master key $\kappa$ into four 64-bit parts, $\kappa = \kappa_3 \| \kappa_2 \| \kappa_1 \| \kappa_0$. Subsequently, the subkeys are defined as $sk_i = \kappa_{i \bmod 4} \oplus i$ for $0 \le i \le 17$.

The round function of PIPO consists of an S-layer $S$ for the nonlinear operation, an R-layer $R$ for the linear operation, and a Key-XOR for adding round keys (see Fig. 1). The input of the first round is the XOR of the plaintext $p$ and the whitening subkey $sk_0$: $x^0 = p \oplus sk_0$. In the first round, $S$ applies the 8-bit S-box $S_8$ to each column of $x^0$, and the output $y$ of $S$ is the concatenation of the outputs of the S-boxes. The output $z$ of $R$ is the concatenation of the left rotations of the rows of $y$. The numbers of rotated bits are 0, 7, 4, 3, 6, 5, 1, and 2 from the 0-th row to the 7-th row of $y$, respectively. The output of the first round is $x^1 = z \oplus sk_1$, which is the input of the second round. Each of the remaining rounds follows the same process: S-layer, R-layer, and Key-XOR (with $sk_i$ for $i = 2, 3, \ldots$).
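The key schedule and the R-layer described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the state is assumed to be a list of eight 8-bit rows stored as integers, the master key is assumed to be an integer whose low 64 bits are $\kappa_0$, and the function names are hypothetical. The S-layer is left out because the S-box table is not reproduced in this text.

```python
# Sketch of the PIPO key schedule and R-layer as described in the text.
# Assumptions: rows are 8-bit ints, kappa_0 is the low 64 bits of the key.

ROTATIONS = [0, 7, 4, 3, 6, 5, 1, 2]  # left-rotation amount for rows 0..7
MASK64 = (1 << 64) - 1


def rotl8(row, r):
    """Rotate an 8-bit row to the left by r bits."""
    r %= 8
    return ((row << r) | (row >> (8 - r))) & 0xFF


def r_layer(rows):
    """Apply the R-layer to a state given as a list of eight 8-bit rows."""
    return [rotl8(row, r) for row, r in zip(rows, ROTATIONS)]


def r_layer_inv(rows):
    """Invert the R-layer: a left rotation by 8 - r undoes one by r."""
    return [rotl8(row, 8 - r) for row, r in zip(rows, ROTATIONS)]


def subkeys(master_key, key_bits):
    """Derive sk_0, sk_1, ... from a 128- or 256-bit master key (an int)."""
    parts = key_bits // 64                 # 2 parts for 128 bits, 4 for 256
    last = 13 if key_bits == 128 else 17   # index of the last subkey
    kappa = [(master_key >> (64 * j)) & MASK64 for j in range(parts)]
    return [kappa[i % parts] ^ i for i in range(last + 1)]
```

The inverse R-layer is included because the attacks in Section V work with $R^{-1}(sk_7)$ rather than with the subkey itself.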

1) 8-BIT S-BOX OF PIPO
The 8-bit S-box $S_8$ of the S-layer is constructed from a 3-bit S-box $S_3$ and two 5-bit S-boxes $S_5^1$ and $S_5^2$. Fig. 2 illustrates the structure of $S_8$.
The 3-bit input $x = (x_2, x_1, x_0)$ can be updated to the output $S_3(x)$ of $S_3$ as follows:
The 5-bit input $x = (x_4, x_3, x_2, x_1, x_0)$ can be updated to the output $S_5^1(x)$ of $S_5^1$ as follows:
The 5-bit input $x = (x_4, x_3, x_2, x_1, x_0)$ can be updated to the output $S_5^2(x)$ of $S_5^2$ as follows:
Finally, for the 8-bit input $x = (x_7, \ldots, x_0)$, the output of the 8-bit S-box $S_8$ is computed as follows:
The unbalanced-bridge structure, which combines $S_3$, $S_5^1$, and $S_5^2$, provides high differential and linear branch numbers as well as efficient masking implementations [1].

C. INTEGRAL CRYPTANALYSIS
Integral cryptanalysis stemmed from the security evaluation of block cipher Square [4] and was formalized in [2]. This method uses integral distinguishers.
We denote the state of an active bit variable, on which 0 and 1 both appear, by 'a' and the state of a constant bit variable, on which the value is fixed as a constant, by 'c'. For example, if the state of the 4-bit variable $(x_3, x_2, x_1, x_0)$ is (ccaa), four 4-bit values can appear with $(x_1, x_0) = (0, 0), (0, 1), (1, 0)$, and $(1, 1)$ for a certain constant value of $(x_3, x_2)$. An integral distinguisher requires an input multiset whose state consists of active and constant bits, and exploits the fact that the XOR-sum of the corresponding output multiset is always zero at some bits.
Definition 1 (Integral Distinguisher): Let $E : \mathbb{F}_2^k \times \mathbb{F}_2^n \to \mathbb{F}_2^n$ be an r-round block cipher with a k-bit key and an n-bit block. Let $X$ and $Y = E_\kappa(X)$ be a plaintext multiset and the ciphertext multiset under a key $\kappa \in \mathbb{F}_2^k$, respectively. If there exists any index $i$ such that
$$\bigoplus_{y \in Y} y_i = 0,$$
we say that the i-th bit variable $y_i$ of the ciphertext is balanced, and call the transition from $X$ to $Y$ an r-round integral distinguisher for $E$.
Assuming that an integral distinguisher has $m$ balanced bits, the probability that a random permutation $P$ on $\mathbb{F}_2^n$ satisfies the $m$ balanced bits is $2^{-m}$. Hence, we can use such an integral distinguisher to distinguish the block cipher $E$ from $P$.
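The notion of a balanced bit can be illustrated on a toy example. The sketch below assumes a 4-bit permutation chosen only for illustration (it is not a PIPO component), and the helper names are hypothetical. It builds an input set from a pattern of active and constant bits and reports which output bit positions have a zero XOR-sum.

```python
# Toy illustration of balanced bits on a 4-bit permutation (NOT from PIPO).
from functools import reduce

TOY_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
            0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def xor_sum(values):
    """XOR of all values in the multiset."""
    return reduce(lambda acc, v: acc ^ v, values, 0)


def balanced_bits(outputs, n):
    """Indices i in 0..n-1 at which the XOR-sum of the outputs is zero."""
    total = xor_sum(outputs)
    return [i for i in range(n) if not (total >> i) & 1]


def structured_set(constant, active_mask, n):
    """All n-bit values that agree with `constant` outside `active_mask`."""
    return [x for x in range(1 << n)
            if (x & ~active_mask) == (constant & ~active_mask)]


# State (aaaa): every bit is active, so a permutation only reorders the
# 16 values and every output bit is balanced.
all_active = structured_set(0, 0b1111, 4)
print(balanced_bits([TOY_SBOX[x] for x in all_active], 4))
```

For a pattern such as (ccaa), a zero XOR-sum observed for one constant does not by itself give a distinguisher; the property has to hold for every choice of the constant part and of the key.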

D. DIVISION PROPERTY
The notion of the division property was proposed by Todo at EUROCRYPT 2015 [5] as an efficient method for constructing integral distinguishers, and was subsequently generalized to bit-based division property [7]. In this study, we focus on the conventional bit-based division property. The definition is given in Definition 2.
Definition 2 (Conventional Bit-Based Division Property [6]): Let $X$ be a multiset whose elements take values in $\mathbb{F}_2^n$, and let $K$ be a set of n-dimensional vectors $k$ whose i-th element takes 0 or 1. When the multiset $X$ has the conventional bit-based division property $D^n_K$, it satisfies the following condition:
$$\bigoplus_{x \in X} x^u = \begin{cases} \text{unknown} & \text{if there exists } k \in K \text{ such that } u \succeq k, \\ 0 & \text{otherwise.} \end{cases} \quad (1)$$
For simplicity, the conventional bit-based division property is referred to as the division property in the remainder of this paper. If $k \in K$ and $k' \in K$ satisfy $k \succeq k'$, we can remove $k$ from $K$ because $k$ does not affect condition (1). In [8], Xiang et al. defined the operation SizeReduce($K$), which removes the redundant vectors from $K$ and returns the reduced set.
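As a small illustration of the redundancy argument, the following sketch encodes bit vectors as Python integers (an assumption made for brevity) and implements the partial order, the unknown/zero test of condition (1), and a straightforward version of SizeReduce. The function names are hypothetical.

```python
def succeeds(u, k):
    """True if u >= k componentwise, with bit vectors encoded as ints."""
    return (u & k) == k


def parity_is_unknown(u, K):
    """Condition (1): the parity of x^u is unknown iff u >= k for some k in K."""
    return any(succeeds(u, k) for k in K)


def size_reduce(K):
    """Drop every k for which a different k' in K satisfies k >= k'."""
    return {k for k in K
            if not any(kp != k and succeeds(k, kp) for kp in K)}


K = {0b110, 0b100, 0b011}
reduced = size_reduce(K)   # 0b110 is redundant because 0b110 >= 0b100
```

Removing a redundant vector never changes which monomials are classified as unknown, which is why the reduced set describes the same division property.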

1) DIVISION PROPERTY PROPAGATION RULE
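Todo [7] demonstrated how the division property is propagated through copy, and, and xor. In this section, we briefly present the propagation rules. In the following rules, the notation $A \Leftarrow B$ for sets $A$, $B$ denotes $A = A \cup B$.

a: RULE 1 (copy)
Let $f : \mathbb{F}_2 \to \mathbb{F}_2^2$ be a copy function, where the input is $(x_0) \in \mathbb{F}_2$ and the output is calculated as $(x_0, x_0)$. Let $X$ and $Y$ be the input and output multisets of $f$, respectively. If $X$ has the division property $D^1_k$, then $Y$ has the division property $D^2_{K'}$ with
$$K' = \begin{cases} \{(0, 0)\} & \text{if } k = 0, \\ \{(0, 1), (1, 0)\} & \text{if } k = 1. \end{cases}$$

b: RULE 2 (and)
Let $f : \mathbb{F}_2^2 \to \mathbb{F}_2$ be an and function, where the input is $(x_1, x_0) \in \mathbb{F}_2^2$ and the output is calculated as $(x_1 \wedge x_0)$. Let $X$ and $Y$ be the input and output multisets of $f$, respectively. If $X$ has the division property $D^2_K$, then $Y$ has the division property $D^1_{K'}$, where $K'$ is computed from all $k = (k_1, k_0) \in K$ as
$$K' \Leftarrow \left\{ \left\lceil \frac{k_1 + k_0}{2} \right\rceil \right\}.$$

c: RULE 3 (xor)
Let $f : \mathbb{F}_2^2 \to \mathbb{F}_2$ be an xor function, where the input is $(x_1, x_0) \in \mathbb{F}_2^2$ and the output is calculated as $(x_1 \oplus x_0)$. Let $X$ and $Y$ be the input and output multisets of $f$, respectively. If $X$ has the division property $D^2_K$, then $Y$ has the division property $D^1_{K'}$, where $K'$ is computed from all $k = (k_1, k_0) \in K$ satisfying $k_1 + k_0 \le 1$ as
$$K' \Leftarrow \{ k_1 + k_0 \}.$$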
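The three basic rules are small enough to state as code. The sketch below assumes the standard formulation of the copy, and, and xor rules from the bit-based division property literature, applied to a single input vector; the function names are hypothetical. Each function returns the set of output vectors reachable from that input.

```python
def propagate_copy(k):
    """copy: x0 -> (x0, x0). Input bit k, output vectors (k1, k0)."""
    return {(0, 0)} if k == 0 else {(0, 1), (1, 0)}


def propagate_and(k1, k0):
    """and: (x1, x0) -> x1 AND x0. The output is ceil((k1 + k0) / 2)."""
    return {max(k1, k0)}   # for bits, max equals the ceiling above


def propagate_xor(k1, k0):
    """xor: (x1, x0) -> x1 XOR x0. No output vector when k1 = k0 = 1."""
    return {k1 + k0} if k1 + k0 <= 1 else set()


def propagate_set(rule, K):
    """Apply a two-input rule to every vector of K and take the union."""
    result = set()
    for k1, k0 in K:
        result |= rule(k1, k0)
    return result
```

Note that xor is the only rule that can make a vector disappear, which is what allows the set of unknown monomials to stay small over a few rounds.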
In addition to the above basic operations, the division property propagation through an S-box can be derived by analyzing its ANF [8].

d: RULE 4 (S-box)
Let $f : \mathbb{F}_2^n \to \mathbb{F}_2^m$ be the function of an S-box, where the input is $x \in \mathbb{F}_2^n$ and the output is $y \in \mathbb{F}_2^m$. Let $X$ and $Y$ be the input and output multisets of $f$, respectively. If $X$ has the division property $D^n_k$, then $Y$ has the division property $D^m_{DP(f, k)}$. We can calculate $DP(f, k)$ using Algorithm 1, which was introduced in [8]; its output is a set $K$ of vectors such that the output multiset has the division property $D^m_K$. As mentioned above, the redundant vectors of $K$ do not affect the division property. Therefore, Algorithm 1 considers the reduced set by applying SizeReduce($K$) in Line 8.
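One way to compute $DP(f, k)$ from the ANF is sketched below. It follows the general idea attributed to [8] (an output vector $v$ is kept when the product $y^v$ contains a monomial $x^u$ with $u \succeq k$), but it is a plain brute-force illustration rather than a transcription of Algorithm 1. Bit vectors are encoded as integers and the function names are hypothetical.

```python
def anf(truth_table, n):
    """Moebius transform: truth table (0/1 list of length 2^n) -> ANF coefficients."""
    coeffs = list(truth_table)
    for i in range(n):
        for x in range(1 << n):
            if x & (1 << i):
                coeffs[x] ^= coeffs[x ^ (1 << i)]
    return coeffs


def division_property_sbox(sbox, n, m, k):
    """Reduced set of output vectors for the input vector k of an n-to-m-bit S-box."""
    K = set()
    for v in range(1 << m):
        # Truth table of y^v, the product of the output bits selected by v.
        tt = [1 if (sbox[x] & v) == v else 0 for x in range(1 << n)]
        coeffs = anf(tt, n)
        # Keep v if y^v contains a monomial x^u with u >= k.
        if any(coeffs[u] and (u & k) == k for u in range(1 << n)):
            K.add(v)
    # SizeReduce: drop v when a different w in K satisfies v >= w.
    return {v for v in K
            if not any(w != v and (v & w) == w for w in K)}
```

For the identity map the function returns $\{k\}$ for every $k$, and for any permutation with $k = 1^n$ it returns $\{1^n\}$, which gives two easy sanity checks.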

2) DIVISION TRAIL
As shown in [8], the propagation of the division property can be regarded as a transition of vectors, from $k \in K$ of the division property $D^n_K$ to $k' \in K'$ of the division property $D^m_{K'}$. In [8], Xiang et al. defined a chain of propagation as a division trail.
Definition 3 (Division Trail [8]): Let $E$ be an iterated block cipher, and let $f_i$ denote the i-th round function of $E$. Assume that the input multiset to $E$ has an initial division property $D^n_k$, and denote the division property after r-round propagation through $f_i$ by $D^n_{K_r}$. Thus, we have the following chain of division property propagations:
$$\{k\} := K_0 \xrightarrow{f_1} K_1 \xrightarrow{f_2} \cdots \xrightarrow{f_r} K_r.$$
Moreover, for any vector $k_i^* \in K_i$ ($i \ge 1$), there must exist a vector $k_{i-1}^* \in K_{i-1}$ such that $k_{i-1}^*$ can propagate to $k_i^*$ by the division property propagation rules. Furthermore, for $(k_0, k_1, \ldots, k_r) \in K_0 \times K_1 \times \cdots \times K_r$, if $k_{i-1}$ can propagate to $k_i$ for all $i \in \{1, 2, \ldots, r\}$, then we call $(k_0, k_1, \ldots, k_r)$ an r-round division trail.

Definition 3 implies that the set of last vectors of all r-round division trails starting with $k$ is equal to $K_r$. Therefore, checking for the existence of a useful integral distinguisher after r-round encryption (i.e., obtaining $K_r$ such that there exists a unit vector $e \notin K_r$) is equivalent to finding all r-round division trails starting with $k$. Based on this observation, Xiang et al. proposed an approach for finding all division trails by constructing a linear inequality system whose feasible solutions represent all division trails.

Compared with the basic operations copy, and, and xor, various approaches can be considered to construct an MILP model $M$ for the S-box. Constructing $M$ for an S-box $f : \mathbb{F}_2^n \to \mathbb{F}_2^m$ is equivalent to converting a set of (n + m)-bit vectors into a set of linear inequalities, $M.con$. The conversion can be conducted in two ways: using the product-of-sum representation of Boolean functions [9] or using the Inequality_generator() function in the SageMath software. Each of the two conversions is detailed in the following section when constructing MILP models for the S-box of PIPO.
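For the basic operations copy, and, and xor, the linear constraints are short enough to check exhaustively. The sketch below assumes the constraints commonly used in MILP-based division property search (one equality for copy, three inequalities for and, one equality for xor) and enumerates all 0/1 assignments instead of calling a solver; the variable names are hypothetical. The feasible points should coincide with the division trails of each operation.

```python
from itertools import product


def feasible(constraint, num_vars):
    """All 0/1 assignments of num_vars variables that satisfy the constraint."""
    return {bits for bits in product((0, 1), repeat=num_vars)
            if constraint(*bits)}


# copy: a -> (b0, b1), modeled by a - b0 - b1 = 0
copy_trails = feasible(lambda a, b0, b1: a - b0 - b1 == 0, 3)

# and: (a0, a1) -> b, modeled by b >= a0, b >= a1, b <= a0 + a1
and_trails = feasible(
    lambda a0, a1, b: b - a0 >= 0 and b - a1 >= 0 and b - a0 - a1 <= 0, 3)

# xor: (a0, a1) -> b, modeled by a0 + a1 - b = 0
xor_trails = feasible(lambda a0, a1, b: a0 + a1 - b == 0, 3)
```

In an actual search these constraints would be added to a solver model once per operation and per round; the enumeration here only serves to show that they describe exactly the intended trails.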

III. MILP MODEL FOR PIPO BLOCK CIPHER
In this section, we propose three methods of constructing MILP models for the S-box $S_8$ of PIPO and compare them. Moreover, we introduce a method for exploiting the rotational symmetry of PIPO to analyze the division properties more efficiently.

A. MILP MODEL FOR S-BOX OF PIPO
We attempted to construct MILP models for the S-box $S_8$ of PIPO in three ways.

1) BY H-REPRESENTATION: $M_{\text{H-repre}}$
First, we applied Rule 4 (S-box) directly to $S_8$ and obtained the set $P_8$ of division trails for $S_8$. We then converted $P_8$ into the corresponding linear inequalities using the Inequality_generator() function in the SageMath software. Specifically, the function Inequality_generator() determines an H-representation (a set of inequalities) of the convex hull of $P_8$. We denote this model for $S_8$ by $M_{\text{H-repre}}$. Although the greedy approaches in [10], [11] can optimize $M_{\text{H-repre}}$ by computing a small number of inequalities that exactly describe $P_8$, this reduction is only possible when the original H-representation is given.

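2) BY PRODUCT-OF-SUM REPRESENTATION: $M_{\text{QM}}$
Second, we applied the conversion of [9] to $P_8$. We define the Boolean function $g : \mathbb{F}_2^{n+m} \to \mathbb{F}_2$ as
$$g(x) = \begin{cases} 1 & \text{if } x \in P_8, \\ 0 & \text{otherwise.} \end{cases}$$
This gives the product-of-sum representation of $g(x)$ as
$$g(x) = \bigwedge_{u \in \mathbb{F}_2^{n+m}} \Big( g(u) \vee \bigvee_{i=0}^{n+m-1} (x_i \oplus u_i) \Big).$$
The product-of-sum representation trivially corresponds to a set of inequalities which exactly describe $P_8$ as
$$\sum_{i=0}^{n+m-1} (-1)^{u_i} x_i + w(u) \ge 1 \quad \text{for all } u \in \mathbb{F}_2^{n+m} \text{ with } g(u) = 0.$$
Therefore, we can simplify the set of inequalities for $S_8$ by minimizing the number of terms in the product-of-sum representation. We apply the Quine-McCluskey algorithm for the minimization and obtain the MILP model $M_{\text{QM}}$ for $S_8$.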
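The correspondence between a product-of-sum representation and linear inequalities can be checked on a small set. The sketch below assumes the usual clause-to-inequality translation (one inequality per excluded point, of the form $\sum_i (-1)^{u_i} x_i + w(u) \ge 1$) and does not perform any Quine-McCluskey minimization, so it yields the unminimized system. The function names and the toy set are hypothetical.

```python
from itertools import product


def exclusion_inequality(u):
    """Return (coeffs, const) of  sum_i coeffs[i]*x_i + const >= 1,
    which is violated by the point u and by no other 0/1 point."""
    coeffs = [(-1) ** bit for bit in u]   # +1 where u_i = 0, -1 where u_i = 1
    return coeffs, sum(u)                 # const is the Hamming weight of u


def inequalities_for(P, dim):
    """One inequality for every 0/1 point of dimension dim outside P."""
    return [exclusion_inequality(u)
            for u in product((0, 1), repeat=dim) if u not in P]


def satisfies_all(x, inequalities):
    """True if the point x satisfies every inequality of the system."""
    return all(sum(c * xi for c, xi in zip(coeffs, x)) + const >= 1
               for coeffs, const in inequalities)


# Toy set: the division trails (a, b0, b1) of a single copy operation.
P = {(0, 0, 0), (1, 0, 1), (1, 1, 0)}
system = inequalities_for(P, 3)
```

Minimizing the product-of-sum form merges several of these point-wise clauses into shorter ones, which is what reduces the number of inequalities for a 16-bit set such as the one for an 8-bit S-box.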

3) BY CONSIDERING STRUCTURE OF $S_8$: $M_{\text{struct}}$
Finally, considering the structure of $S_8$, we derive an MILP model $M_{\text{struct}}$ for $S_8$ from the sets of division trails for $S_3$, $S_5^1$, and $S_5^2$, respectively. As explained in Section II-B and described in Fig. 2, $S_8$ is constructed as an unbalanced-bridge structure with $S_3$, $S_5^1$, and $S_5^2$. We obtain the corresponding MILP models for $P_3$, $P_5^1$, and $P_5^2$ by applying the Quine-McCluskey algorithm. We then combine them with the MILP models for the copy and xor operations explained in Section II-E to obtain $M_{\text{struct}}$ for $S_8$.

4) COMPARISON OF MILP MODELS FOR $S_8$
$M_{\text{H-repre}}$ and $M_{\text{QM}}$ allow accurate analysis of $S_8$. However, $M_{\text{H-repre}}$ is efficient only for S-boxes whose sizes are less than 8 bits, because the computational complexity required to obtain the linear inequalities and optimize them increases with the size of the S-box. $M_{\text{QM}}$ also does not guarantee efficiency for S-boxes of 8 bits or more, but fortunately, we obtained it for $S_8$ of PIPO in around one hour. In contrast, $M_{\text{struct}}$ does not guarantee analysis as accurate as $M_{\text{H-repre}}$ and $M_{\text{QM}}$ because it does not account for monomials cancelled through XORs in the ANF of $S_8$. For some input division properties $k$, $M_{\text{struct}}$ yields a larger unknown set of Equation (1) than $M_{\text{H-repre}}$ and $M_{\text{QM}}$. Nevertheless, we proceeded to obtain $M_{\text{struct}}$ because of its efficiency in modeling simple operations and small S-boxes. Note that the cost of modeling simple operations, such as copy and xor, is negligible. See Table 3 for a comparison of the time complexities for modeling $S_8$.

B. ROTATIONAL SYMMETRY OF PIPO
We can find an MILP model $M$ for $r$ rounds of PIPO, based on the analysis given in Section III-A. Then, we solve $M$ to construct any r-round division trail $(a^0, a^1, \ldots, a^r)$.
To obtain an integral distinguisher from the trail, we need to start the trail with the vector $k$ of the division property $D^{64}_k$ for the plaintext multiset. We can achieve this by adding the constraints $a^0_i = k_i$ for $0 \le i < 64$ to $M.con$. Rotational symmetry can be used to reduce the number of initial division properties to be considered in the search, because searching for trails starting with $k$ covers trails starting with $\tau(k), \tau^2(k), \ldots,$ or $\tau^7(k)$, where $\tau$ denotes the rotation of every row of the state by one bit, which commutes with the S-layer and the R-layer.
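The rotational symmetry can be checked numerically: rotating every row by one bit before a round gives the same result as rotating after it, up to the key addition. The sketch below is only an illustration under stated assumptions. The state is a list of eight 8-bit rows, bit $c$ of row $j$ is taken as bit $j$ of column $c$, and the S-box is a seeded random 8-bit permutation standing in for $S_8$ (the symmetry does not depend on the particular S-box). All names are hypothetical.

```python
import random

ROTATIONS = [0, 7, 4, 3, 6, 5, 1, 2]  # R-layer rotation amounts, rows 0..7


def rotl8(row, r):
    """Rotate an 8-bit row to the left by r bits."""
    r %= 8
    return ((row << r) | (row >> (8 - r))) & 0xFF


def tau(rows):
    """Rotate every row of the state by one bit."""
    return [rotl8(row, 1) for row in rows]


def r_layer(rows):
    """Rotate row j to the left by ROTATIONS[j] bits."""
    return [rotl8(row, r) for row, r in zip(rows, ROTATIONS)]


def s_layer(rows, sbox):
    """Apply the same 8-bit S-box to each of the eight columns."""
    out = [0] * 8
    for c in range(8):
        col = sum(((rows[j] >> c) & 1) << j for j in range(8))
        val = sbox[col]
        for j in range(8):
            out[j] |= ((val >> j) & 1) << c
    return out


def round_without_key(rows, sbox):
    """S-layer followed by R-layer (Key-XOR omitted)."""
    return r_layer(s_layer(rows, sbox))


rng = random.Random(0)
STAND_IN_SBOX = list(range(256))   # placeholder permutation, NOT the PIPO S-box
rng.shuffle(STAND_IN_SBOX)

state = [rng.randrange(256) for _ in range(8)]
commutes = (round_without_key(tau(state), STAND_IN_SBOX)
            == tau(round_without_key(state, STAND_IN_SBOX)))
```

Because the key addition does not affect the division property, this commutation is enough to transfer any division trail found for $k$ to its eight rotated versions.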

IV. DIVISION PROPERTY ANALYSIS WITH LINEAR TRANSFORMATIONS

A. EXTENDED INTEGRAL DISTINGUISHERS
Lambin et al. [12] presented a method for identifying more integral distinguishers. Their approach involves searching for integral distinguishers of $L_{out} \circ E \circ L_{in}$ instead of a block cipher $E : \mathbb{F}_2^k \times \mathbb{F}_2^n \to \mathbb{F}_2^n$, where $L_{in}, L_{out} \in GL_n(\mathbb{F}_2)$ and where we regard $E$ as a nonlinear permutation on $\mathbb{F}_2^n$, i.e., a block cipher with a randomly selected secret key over $\mathbb{F}_2^k$. Generally, their method finds an extended integral distinguisher, which is defined in Definition 5.

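Definition 5 ((Extended) Integral Distinguisher):
Let $E : \mathbb{F}_2^k \times \mathbb{F}_2^n \to \mathbb{F}_2^n$ be an r-round block cipher with a k-bit key and an n-bit block. Let $X$ and $Y$ be the plaintext and ciphertext multisets of $E$, respectively. For any key $\kappa \in \mathbb{F}_2^k$, if there exists a nonzero $v \in \mathbb{F}_2^n$ such that
$$\bigoplus_{y \in Y} v \cdot y = 0,$$
the transition from $X$ to $Y$ is called an r-round integral distinguisher of $E$, and $v \cdot y$ is called a balanced bit.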

B. LINEAR TRANSFORMATIONS ON INPUT AND OUTPUT
We consider $L_{in}$ only as a concatenation of eight $8 \times 8$ matrices $L^j_{in} \in GL_8(\mathbb{F}_2)$ for $0 \le j < 8$, because it is computationally infeasible to try all the $64 \times 64$ binary linear matrices. Similarly, we consider $L_{out}$ only as a concatenation of eight $8 \times 8$ matrices $L^j_{out} \in GL_8(\mathbb{F}_2)$ for $0 \le j < 8$. Each output bit of $L^j_{out} \circ S_8$ for $0 \le j < 8$ has the form $v_{out} \cdot S_8$ for the corresponding row $v_{out}$ of $L^j_{out}$. Therefore, to find integral distinguishers with $L^j_{out}$, we only need to check whether there exists $v_{out}$ such that $v_{out} \cdot S_8$ is balanced for the j-th S-box in the last round function. Considering the rotational symmetry of PIPO, we can force the initial division property $D^{64}_k$ to have a single zero bit at the least significant position of $k$. In other words, we assume that the initial division property is $D^{64}_{1^{63}0}$ and that the initial multiset is $(a \cdots ac)$, where the least significant bit is constant and the other bits are active. Under this assumption, Theorem 6 implies that $L^j_{in}$ for $1 \le j < 8$ do not change the initial division property.
Theorem 6: If the input division property is $D^n_{1^n}$, then for any invertible $f : \mathbb{F}_2^n \to \mathbb{F}_2^n$, the output division property is $D^n_{1^n}$.

Proof: Assume $f(x) = y$. According to Proposition 1 in [13], $\deg(y^u) = n$ only when $u$ is the n-bit all-one vector $1^n$. Therefore, $DP(f, 1^n) = \{1^n\}$, and the output division property is $D^n_{1^n}$.

Now, we can consider only $L^0_{in}$ with the given input division property $D^8_{1^7 0}$. Because $DP(S_8 \circ L^0_{in}, 1^7 0)$ depends only on the linear combinations of bits that become constant, we can classify the $8 \times 8$ invertible matrices into $2^8 - 1$ classes, in which each matrix instantiating $L^0_{in}$ has the same $DP(S_8 \circ L^0_{in}, 1^7 0)$. For $D^n_K$, we define $Succ(k) := \{u \in \mathbb{F}_2^n \mid u \succeq k\}$ for $k \in K$ and $Succ(K) := \bigcup_{k \in K} Succ(k)$. Let $D^{64}_{K_1}$ and $D^{64}$

In Lines 2-4, Algorithm 2 first computes the division property for the first round by considering $D^8$ Then, through the loop covering Lines 5-31, it searches for balanced bits in the r-th round output based on the division property after the first round. $U$ is the set of all $v$ such that the parity of $x^v$ is unknown for the output $x$ of the (r − 1)-th round. In Lines 7-24, we use the MILP model $M$ for $r − 2$ rounds of PIPO to collect all possible entries of $U$. In Lines 25-30, it computes $\mathrm{ANF}_{v_{out} \cdot S_8}$ for the j-th S-box in the r-th round, and checks whether $\mathrm{ANF}_{v_{out} \cdot S_8}$ contains monomials whose parities are unknown. If $\mathrm{ANF}_{v_{out} \cdot S_8}$ contains no such monomials, $v_{out} \cdot S_8$ is a balanced bit of an r-round extended integral distinguisher. Consequently, all balanced bits after $r$ rounds are stored in $S$ in the form of $(j, v_{out})$.
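The final check described above (does the ANF of $v_{out} \cdot S$ contain a monomial whose parity is unknown?) can be sketched on a toy S-box. The code below uses a 4-bit permutation chosen only for illustration, not $S_8$, and the set of unknown monomials is supplied by hand rather than by an MILP search. All function names are hypothetical.

```python
TOY_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
            0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]   # NOT a PIPO component


def anf_monomials(truth_table, n):
    """Set of exponent vectors u whose monomial x^u appears in the ANF."""
    coeffs = list(truth_table)
    for i in range(n):
        for x in range(1 << n):
            if x & (1 << i):
                coeffs[x] ^= coeffs[x ^ (1 << i)]
    return {u for u in range(1 << n) if coeffs[u]}


def component_anf(sbox, n, v):
    """ANF monomials of the Boolean function x -> v . S(x)."""
    tt = [bin(sbox[x] & v).count("1") & 1 for x in range(1 << n)]
    return anf_monomials(tt, n)


def balanced_masks(sbox, n, unknown):
    """All nonzero v such that v . S has no monomial with unknown parity."""
    return [v for v in range(1, 1 << n)
            if not (component_anf(sbox, n, v) & unknown)]


# If only the full-degree monomial is unknown, every mask is balanced,
# because the components of a permutation have degree below n.
masks = balanced_masks(TOY_SBOX, 4, {0b1111})
```

In the search itself, the unknown set comes from the division property of the (r − 1)-th round output, so a larger unknown set leaves fewer masks in the result.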

V. INTEGRAL DISTINGUISHERS AND ATTACKS

A. SEARCHING FOR DISTINGUISHERS
We searched for distinguishers in two ways, by constructing two MILP models for PIPO from the S-box models $M_{\text{struct}}$ and $M_{\text{QM}}$ obtained in Section III-A. We used the Gurobi MILP solver and performed every experiment on a platform with an AMD Ryzen Threadripper 3970X CPU at 3.7 GHz, 256 GB of RAM, and Ubuntu 20.04.1 LTS x86_64.
As a result, we found seventeen 6-round integral distinguishers for PIPO by searching with $M_{\text{QM}}$, eight of which can also be found through a search with $M_{\text{struct}}$. Due to the rotational symmetry of the PIPO structure, this implies 136 (= 17 × 8) 6-round distinguishers. Neither search approach found any integral distinguishers for more than 6 rounds of PIPO.
The 6-round integral distinguishers are split into two classes depending on the form of the constant bit information in the input. Considering rotational symmetry with $0 \le i < 8$, the distinguishers in the first class have the constant bit information: in the input. The corresponding balanced bit information in the output is one of seven: $x^6_{0,3+i} \oplus x^6_{1,2+i} \oplus x^6_{6,4+i}$, $x^6_{0,4+i} \oplus x^6_{1,3+i} \oplus x^6_{6,5+i}$, $x^6_{0,5+i} \oplus x^6_{1,4+i} \oplus x^6_{6,6+i}$,

Algorithm 2 (Lines 5-10; $R$ is the R-layer function)
5: for $j = 0, 1, \ldots, 7$ do
6:   $U \leftarrow \emptyset$   ($U$ implies the division property on the position of the j-th S-box)
7:   for $k \in K_1$ do
8:     for $v \in \mathbb{F}_2^8 \setminus \{0\}$ do
9:       if $v \in U$ then
10:        $M \leftarrow M$

Except for $x^6_{5,i}$ and $x^6_{5,7+i}$, distinguishers with the balanced bit information in $B_1$ can be found by searching with both $M_{\text{QM}}$ and $M_{\text{struct}}$.

B. KEY-RECOVERY ATTACK ON 8-ROUND PIPO-64/128
We can use four 6-round integral distinguishers, underlined in $B_1$, to mount a key-recovery attack on 8 rounds of PIPO-64/128. The distinguishers are applied from the first round to the sixth round, with the same active bits in the input and various balanced bits in the output. The plaintext is denoted as $p = (p_{63}, \ldots, p_1, p_0)$. We use $2^{63}$ plaintexts in which $p_{56}$ is fixed as a constant. In the attack, the attacker should try all $2^{64}$ possible candidates of the last subkey $sk_8$ and guess four bytes of $rk^7$, where $rk^7 = R^{-1}(sk_7)$. Table 5 lists the balanced bits in the output and the key bytes of $rk^7$ related to the distinguishers. The attack process is presented in Algorithm 3. During the attack, the 7-th round partial decryption of the Key Filtering Phase requires only the 32-bit intermediate values $y_{*,i}$ for $i \in \{0, 1, 6, 7\}$ and a 4-byte guessed key $(rk^7_7, rk^7_6, rk^7_1, rk^7_0)$, as Fig. 3 describes. We expect that the key space can be reduced by the ratio of $2^{-4}$ after the Key Filtering Phase because the $\alpha_i$ for $1 \le i \le$

PIPO-64/128 and PIPO-64/256 have the same structure except for the key schedule. The difference between the key schedules allows a 10-round attack on PIPO-64/256. In the attack, we use the same distinguishers, guess the same bits of $rk^7$ and $sk_8$ as in the attack on 8-round PIPO-64/128, and additionally guess all bits of $sk_9$ and $sk_{10}$. Therefore, the time complexity of the Key Filtering Phase is estimated as $2^{253.3} \approx 2^{63} \times 2^{64 \cdot 3} \times 3/10$ 10-round PIPO-64/256 encryptions, while the time complexity of the final exhaustive search phase is $2^{252}$. Therefore, the total time complexity of the attack is approximately $2^{253.8} \approx 2^{253.3} + 2^{252}$.
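The complexity figures for the 10-round attack can be reproduced with a few lines of arithmetic. The sketch below simply evaluates the expressions quoted above in the log domain; the variable names are hypothetical.

```python
import math

# Key Filtering Phase: 2^63 plaintexts x 2^(64*3) guessed key bits x 3/10 of
# a 10-round encryption, expressed as a base-2 logarithm.
filter_log2 = 63 + 64 * 3 + math.log2(3 / 10)

# Final exhaustive search phase, as stated in the text.
search_log2 = 252

# Total cost: sum of the two phases, converted back to a base-2 logarithm.
total_log2 = math.log2(2.0 ** filter_log2 + 2.0 ** search_log2)

print(round(filter_log2, 1), round(total_log2, 1))
```

The filtering phase dominates: it is about 2.4 times the cost of the exhaustive search, so the sum exceeds the filtering cost by only about half a bit.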

VI. CONCLUSION
In this paper, we analyzed the division property of the lightweight block cipher PIPO, proposed at ICISC 2020, based on three MILP models with different modeling times and accuracies. As a result, we found 136 6-round integral distinguishers. Among them, 120 distinguishers were derived by adding linear transformations into the S-box. We performed key-recovery attacks on 8 rounds of PIPO-64/128 and 10 rounds of PIPO-64/256 based on four of the obtained distinguishers, with time complexities of $2^{125}$ and $2^{253.8}$, respectively. Although our results do not weaken the security claim of full-round PIPO, they complement the security analysis. Moreover, we expect that our search approach can be used to find the best choice of R-layer in terms of resistance against integral attacks.