New Security Proofs and Complexity Records for Advanced Encryption Standard

Common block ciphers like AES specified by NIST or KASUMI (A5/3) of GSM are extensively utilized by billions of individuals globally to protect their privacy and maintain confidentiality in daily communications. However, these ciphers lack comprehensive security proofs against the vast majority of known attacks. Currently, security proofs are limited to differential and linear attacks for both AES and KASUMI. For instance, the consensus on the security of AES is not based on formal mathematical proofs but on intensive cryptanalysis over its reduced rounds spanning several decades. In this work, we introduce new security proofs for AES against another attack method: impossible differential (ID) attacks. We classify ID attacks as reciprocal and nonreciprocal ID attacks. We show that sharp and generic lower bounds can be imposed on the data complexities of reciprocal ID attacks on substitution permutation networks. We prove that the minimum data required for a reciprocal ID attack on AES using a conventional ID characteristic is $2^{66}$ chosen plaintexts whereas a nonreciprocal ID attack involves at least $2^{88}$ computational steps. We mount a nonreciprocal ID attack on 6-round AES for 192-bit and 256-bit keys, which requires only $2^{18}$ chosen plaintexts and outperforms the data complexity of any attack. Given its marginal time complexity, this attack does not pose a substantial threat to the security of AES. However, we have made enhancements to the integral attack on 6-round AES, thereby surpassing the longstanding record for the most efficient attack after a period of 23 years.


I. INTRODUCTION
Substitution permutation network (SPN) ciphers constitute a fundamental category of block ciphers that are widely used in modern cryptography. The Advanced Encryption Standard (AES) specified by the National Institute of Standards and Technology (NIST) [1] is an example of an SPN cipher that is extensively employed to provide confidentiality in various cryptographic protocols, such as Transport Layer Security (TLS), WiFi Protected Access (WPA), and the Signal protocol utilized in applications like WhatsApp. In this context, the cryptanalysis of SPN ciphers in generic settings plays a crucial role in comprehending the security of commonly utilized ciphers, and in evaluating their resilience against potential attacks.
The associate editor coordinating the review of this manuscript and approving it for publication was Pedro R. M. Inácio .
In contrast to the majority of previous research on cryptanalysis, which has largely focused on specific ciphers, this study adopts a more abstract and theoretical approach by examining the data complexity of reciprocal impossible differential (ID) attacks and the time complexity of nonreciprocal ID attacks on SPN ciphers in generic settings. Reciprocal attacks are those that require the same amount of data to prepare the necessary input for the attack, regardless of whether the attacker has access to the encryption oracle or the decryption oracle. To provide a precise description of reciprocal attacks, we present Definition 1, and we establish various results regarding the minimum data requirements for reciprocal ID attacks on generic SPN ciphers.
The data requirement of an attack can be considered its most critical complexity, ahead of time and memory. This is due to the fact that data collection may not always be feasible, and the attacker has no control over the throughput of the oracle producing the data. Conversely, advancements in time and memory complexities are achievable through the efficient utilization of high-speed, parallel supercomputer platforms. Therefore, low-data complexity attacks stand out as particularly noteworthy. For instance, Bouillaguet et al. explore the possibility of attacking AES with only one or two plaintext/ciphertext pairs [2]. Hence, in this work, we investigate the minimum data requirement of reciprocal ID attacks on AES.
We focus on establishing lower bounds for data and time complexities related to reciprocal and nonreciprocal ID attacks on AES respectively. Despite AES being a subject of extensive research, comprehensive security proofs are notably lacking for various attack methods. The designers of AES have provided security proofs against differential and linear attacks [3]. Subsequent efforts have aimed at refining and enhancing these security bounds [4]. However, it is important to emphasize the existing gap in security proofs against other potential attack methods. We address this gap specifically in the context of ID attacks in this work.
Cryptanalysis techniques on SPN ciphers, particularly AES, have made significant progress. One example is the class of impossible differential (ID) attacks which were introduced by Biham et al. [5] and Knudsen [6] independently. The distinguisher in an ID attack utilizes an input-output difference of an encryption function that is not generated by any key. We classify ID attacks on SPN ciphers into reciprocal and nonreciprocal attacks. Reciprocal ID attacks are identified as those that can be executed with the same data complexity in the chosen ciphertext (CC) scenario as in the chosen plaintext (CP) scenario. It is apparent that almost all ID attacks on well-known SPN ciphers are reciprocal, and as yet there is no nonreciprocal ID attack on AES. While it seems that reciprocal ID attacks are generally more efficient and faster than nonreciprocal ones, this study reveals that reciprocal ID attacks on SPN ciphers require a considerable amount of data.
Several ID attacks have been proposed for AES, all of which rely on exploiting the 4-round conventional ID characteristics as described in [25]. To date, no other ID characteristics for AES have been identified. In fact, Sun et al. have demonstrated that AES has no 5-round ID unless the specifics of the S-Box are disregarded [26]. Wang and Jin [27] have also verified this claim through the ''dependent tree'' method, although their conclusion is based on the assumption that all the round keys are independent and uniformly random. In addition, Boura and Coggia [28] have demonstrated that no 5-round ID with two active bytes exists for AES, using MILP solvers.
The distinguishing feature of ID attacks on AES is that they are all reciprocal and require extensive data. These attacks rely on a very large number of chosen plaintexts to identify all the incorrect keys in the initial and final rounds. Boura et al. have introduced bounds on data, time, and memory complexities for various generic types of block ciphers [29]. However, the bound for data complexity is notably loose. Several ID attacks on AES have different data requirements, ranging from $2^{117.5}$ CP in [30] to $2^{75.5}$ CP in [31], and $2^{92}$ CP in [32] and [33] when 4-round conventional ID characteristics are enclosed by initial and final rounds. Note that the 6-round attack in [31] has the lowest data requirement among all the ID attacks on AES.
The category of practical attacks, or attacks with low data on few rounds of AES, has gained popularity in the cryptanalysis of AES for understanding its security [2], [3], [14], [17], [34], [35], [36], [37]. The critical number of rounds at which the required data complexity jumps dramatically can be considered to be six. This is supported by findings that while there exist attacks on 5-round AES that require only 8 CP [38], attacks on 6-round AES require at least $2^{26}$ CP [34].
The square attack introduced by Daemen et al. on 6-round AES in [39] held the record for more than two decades, requiring $2^{32}$ chosen plaintexts. Although some improved versions of the square attack, such as the partial sum technique [14] and improved meet-in-the-middle attacks [10], [11], have better time complexities, their data complexities could not go below $2^{32}$ chosen plaintexts. Bar-On et al. improved the record to $2^{27.5}$ chosen plaintexts with the mixture meet-in-the-middle technique [35]. They further enhanced their analysis and achieved a data complexity of $2^{26}$ chosen plaintexts in [34].

B. OUR CONTRIBUTIONS
We present a set of parameters that can be used to identify an ID attack, and we investigate the data complexity of reciprocal ID attacks on SPN ciphers in a generic setting, using these parameters. Our analysis yields several theoretical and generic results concerning the minimum data requirement of such attacks. These results are presented in Theorem 2, Theorem 4, and Theorem 5. By offering a more extensive and rigorous comprehension of the minimum data requirements of reciprocal ID attacks, our results serve to augment the current understanding of this class of attacks.
Table note: Data is CP. *: We make a minor amendment to rectify the complexity computation in [14]. See Section VIII.
Based on the parameters of the ID characteristic, we propose a generic formula for estimating the minimum data complexity of a reciprocal ID attack on an SPN cipher in Theorem 5. This formula is independent of the sieving method employed in the attack. Specifically, we demonstrate that a reciprocal ID attack that exploits at least one structure in the encryption and decryption directions necessitates at least the cube root of $2^{n+1}/p$ data by Theorem 3, where $n$ represents the block length of the SPN cipher and $p$ denotes the probability that an input/output difference pair leads to the ID characteristic. To illustrate, we find that the minimum amount of data required for such an attack is $2^{43}$ CP for a 128-bit block length.
In order to investigate if there are reciprocal ID attacks that attain the lower bound established by Theorem 5 with respect to data complexity, we propose Definition 2. These attacks are denoted as ''reciprocal ID attacks with optimal data''. It is worth noting that the most efficient ID attacks currently known do not fall under this category. Furthermore, there is a lack of literature on reported reciprocal ID attacks on AES with optimal data, leading to an open question regarding the precision of the lower bound presented in Theorem 5.
We provide comprehensive lower bounds on either data or time complexities for all types of ID attacks on AES. We prove that any reciprocal ID attack on AES exploiting a 4-round conventional ID characteristic and containing at least one initial and one final round uses at least $2^{66}$ chosen plaintexts (CP) in Theorem 7. We observe that all the ID attacks on AES utilize 4 active bytes either in the first or in the last round, yielding only one active byte after the MC or $MC^{-1}$ operations. We prove that the data complexity is bounded by $2^{62}$ for these attacks in Theorem 8, whatever the ID characteristic is. Moreover, we prove that any nonreciprocal ID attack on AES, which exploits a conventional ID characteristic, has a time complexity of at least $2^{88}$ computational steps in Theorem 9. Consequently, we introduce security bounds against a different type of attack for the first time since the introduction of the security bounds against differential and linear attacks for AES.
To assess the degree of sharpness of the bounds established in Theorem 7 and Theorem 2, we conduct a 6-round reciprocal ID attack with $2^{66}$ chosen plaintexts. While not the most effective attack, we present this attack to demonstrate that it reaches the bound described in Theorem 2 and represents the first instance of a reciprocal ID attack with optimal data. This means that it attains the bound in Theorem 5 as well, establishing the sharpness of these theorems.
We have successfully mounted a couple of nonreciprocal ID attacks on 6-round AES with a record low data requirement of only $2^{18}$ CP, to illustrate that ID attacks do not necessarily require a lot of data. This result is particularly surprising as ID attacks are known to typically require significantly larger amounts of data. The theoretical infrastructure we have developed for the data requirements of reciprocal ID attacks in this work has enabled us to establish the frameworks for our nonreciprocal attacks with minimal data. While our attacks may not be optimal given their marginal time complexities, they are notable for their remarkable efficiency in terms of data usage. To complement them with a result of practical relevance, we improve the integral attack through the partial sum technique in [14]. Our attack is the fastest attack on 6-round AES, surpassing the prior record established over a span of 23 years. A summary of low-data complexity attacks on 6-round AES is presented in Table 1.
Consequently, we contribute to the theoretical characterization of the data requirements of reciprocal ID attacks in this work. Our other crucial contribution is to provide security bounds of certain levels for AES against both reciprocal and nonreciprocal ID attacks by utilizing our generic statements on SPN ciphers. Moreover, we mount an attack with the minimum data on 6-round AES-192 and AES-256, and another attack with the best complexity on 6-round AES.

C. ORGANIZATION
The paper is structured as follows. In Section II, we provide a concise overview of SPN ciphers and AES. Subsequently, we present the framework of our work and investigate the data complexities of the reciprocal ID attacks on SPN ciphers in Section III. We establish a lower bound on the data of the reciprocal ID attacks and a lower bound on the time complexities of nonreciprocal ID attacks on AES in Section IV. Our reciprocal ID attack with optimal data and nonreciprocal ID attack on AES are detailed in Section V and Section VI, respectively. Section VII outlines our attack with minimum data. We introduce our improvement of the integral attack in Section VIII. Finally, we conclude the paper in Section IX with a conjecture.

II. PRELIMINARIES
In this section, we give a brief description of SPN (substitution permutation network) ciphers and AES (Advanced Encryption Standard), along with the notation we follow.

A. SUBSTITUTION PERMUTATION NETWORKS
A substitution permutation network is a block cipher $E_K : GF(2)^n \to GF(2)^n$. For a fixed $k$-bit key $K$, $E_K$ is an $n$-bit permutation. Its inverse, $D_K$, is the decryption function such that $D_K(E_K(P)) = E_K(D_K(P)) = P$ for all $P \in GF(2)^n$. $E_K$ is supposed to behave like a random permutation to be a secure cipher.
The round function of an SPN cipher consists of key addition, an S-box layer, and a linear transformation. The input is added to the round key. Then, the block is divided into $n/s$ subblocks of $s$ bits each. We call each subblock a word. Subsequently, S-boxes are applied to the $n/s$ words simultaneously. Each S-box is a nonlinear permutation from $GF(2)^s$ into $GF(2)^s$. The last operation of the round function is the linear transformation. It is a multiplication by an invertible matrix in the $n \times n$ general linear group over the field $GF(2)$. There is an extra round key addition at the end of the last round.
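The round structure described above can be sketched on a toy cipher. The following is a minimal illustration, not a cipher from this paper: it assumes $n = 16$ and $s = 4$, borrows the 4-bit PRESENT S-box purely as an example of a nonlinear permutation, and uses a bit permutation as a simple special case of an invertible linear transformation over $GF(2)$. All names are hypothetical.

```python
# Toy SPN round: key addition, S-box layer, linear layer (assumed n = 16, s = 4).
N, S = 16, 4
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]  # example 4-bit permutation
INV_SBOX = [SBOX.index(x) for x in range(16)]


def sbox_layer(state, box):
    """Apply the same s-bit S-box to each of the n/s words of the state."""
    out = 0
    for w in range(N // S):
        word = (state >> (S * w)) & 0xF
        out |= box[word] << (S * w)
    return out


def perm(i):
    """Bit i moves to position 4*i mod 15; bit 15 stays in place."""
    return 15 if i == 15 else (4 * i) % 15


def linear_layer(state):
    """A bit permutation, i.e. multiplication by a permutation matrix over GF(2)."""
    out = 0
    for i in range(N):
        out |= ((state >> i) & 1) << perm(i)
    return out


def inv_linear_layer(state):
    out = 0
    for i in range(N):
        out |= ((state >> perm(i)) & 1) << i
    return out


def round_enc(state, round_key):
    return linear_layer(sbox_layer(state ^ round_key, SBOX))


def round_dec(state, round_key):
    return sbox_layer(inv_linear_layer(state), INV_SBOX) ^ round_key
```

For a fixed round key, `round_enc` is a permutation of the 16-bit states and `round_dec` is its inverse, which mirrors the relation between $E_K$ and $D_K$.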

B. ADVANCED ENCRYPTION STANDARD
The Advanced Encryption Standard (AES) is the most prominent SPN cipher, being the FIPS 197 standard [1]. Its block length is 128 bits. The key length is $k = 128$, $192$, or $256$ bits, corresponding to $r = 10$, $12$, or $14$ rounds of encryption, respectively. We give a brief description of AES. One can refer to [1] and [3] for detailed information. It is convenient to represent a round state of AES by a $4 \times 4$ matrix. There are four round functions of AES, which are given as follows. These functions are depicted in Figure 1.

1) SUBBYTES (SB)
It is the layer of S-box operations. We have $s = 8$ for AES, and we call a single S-box operation an AES S-box. It bijectively substitutes each byte by another byte according to the look-up table of the AES S-box, without changing the byte position in the matrix. There are 16 identical S-boxes.

2) SHIFTROWS (SR)
Cyclically shifts the i-th row of the state by i bytes to the left, for i = 0, 1, 2, 3.

3) MIXCOLUMNS (MC)
Multiplies each column of the input state by a fixed 4 × 4 MDS matrix.
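As an illustration of this step, the following sketch applies the MixColumns matrix of FIPS 197 to one column, with multiplication in $GF(2^8)$ modulo $x^8 + x^4 + x^3 + x + 1$. The helper names `xtime`, `gmul` and `mix_column` are chosen here for readability and are not taken from the paper.

```python
# MixColumns on a single 4-byte column (matrix and field polynomial from FIPS 197).
MC_MATRIX = [
    [2, 3, 1, 1],
    [1, 2, 3, 1],
    [1, 1, 2, 3],
    [3, 1, 1, 2],
]


def xtime(b):
    """Multiply a field element by x, reducing modulo 0x11B."""
    b <<= 1
    if b & 0x100:
        b ^= 0x11B
    return b & 0xFF


def gmul(a, b):
    """Multiply two elements of GF(2^8) by shift-and-add."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = xtime(a)
        b >>= 1
    return result


def mix_column(col):
    """Return the product of the MixColumns matrix with one column."""
    out = []
    for row in MC_MATRIX:
        acc = 0
        for coeff, byte in zip(row, col):
            acc ^= gmul(coeff, byte)
        out.append(acc)
    return out
```

For example, `mix_column([0xDB, 0x13, 0x53, 0x45])` returns `[0x8E, 0x4D, 0xA1, 0xBC]`, and a column with four equal bytes is mapped to itself because each row of the matrix sums to 1 in the field.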

4) ADDROUNDKEY (ARK )
XORs the output state of the i-th round with the i-th subkey.
At first, there is a whitening key addition with the plaintext. The MC operation is omitted in the last round. For the sake of simplicity, we also refer to the matrix multiplication of a single column as MC.
$SB^{-1}$, $SR^{-1}$ and $MC^{-1}$ are the inverses of SB, SR and MC respectively. Let the i-th subkey be $RK_i$ and let us denote the j-th column of $RK_i$ by $RK_i\{j\} = RK\{j + 4i\}$. Let $N = 6$ and $N = 8$ for the key lengths of 192 and 256 bits respectively.
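Any column $RK\{j\}$ of a subkey with $j \ge N$ is computed as

$$RK\{j\} = \begin{cases} RK\{j-N\} \oplus f(RK\{j-1\}) & \text{if } j \equiv 0 \pmod{N},\\ RK\{j-N\} \oplus g(RK\{j-1\}) & \text{if } N = 8 \text{ and } j \equiv 4 \pmod{8},\\ RK\{j-N\} \oplus RK\{j-1\} & \text{otherwise,} \end{cases}$$

where $f$ and $g$ are functions on columns that consist of S-box, cyclic shift, and round constant addition operations.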

C. NOTATION
The symbols $P$, $C$, and $K$ denote a plaintext, ciphertext, and main key, respectively, in the context of AES. The notation $\Delta X$ represents the difference between a pair of elements $X$, where $\Delta P$ and $\Delta C$ refer to the plaintext and ciphertext differences, respectively. To specify both the output of an AES function and the round number for an arbitrary output, subscripts are employed. Specifically, $SB_i$ and $MC_i$ indicate the output of the data in the i-th round of the SubBytes and MixColumns operations, respectively. Likewise, $\Delta SB_i$ and $\Delta MC_i$ refer to the output difference of a pair of data in the i-th round of the SubBytes and MixColumns operations, respectively. Equivalent notations are employed for the inverse operations. If it is necessary to specify a particular input or output of these functions, the standard notation $MC_i(X)$, $SB_i(X)$, or $MC^{-1}_i(X)$ is used. The bytes of a state are denoted by the $[\cdot]$ notation. Specifically, $X[i_1, i_2, \ldots, i_r]$ denotes the $(i_1 + 1)$-th, $(i_2 + 1)$-th, ..., $(i_r + 1)$-th bytes of the state $X$. For example, $\Delta SB^{-1}_4[0, 2]$ denotes the first and third bytes of an input difference for the SB operation in the fourth round. The index numbers of words in the $4 \times 4$ matrix are arranged as depicted in Figure 2. This matrix arranges words with indices 0, 1, 2, 3 in the first row; words with indices 4, 5, 6, 7 in the second row; words with indices 8, 9, 10, 11 in the third row; and words with indices 12, 13, 14, 15 in the last row.
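The byte indexing convention can be made concrete with a short sketch. It assumes a state is stored as a flat list of 16 bytes in the row-by-row order described above; the function names are hypothetical helpers, not notation from the paper.

```python
def byte_position(index):
    """Return (row, column) of a byte index in the 4x4 state, row by row."""
    return divmod(index, 4)


def select_bytes(state, indices):
    """Model the X[i1, ..., ir] notation on a flat 16-byte state."""
    return [state[i] for i in indices]


def difference(state_a, state_b):
    """Byte-wise XOR difference of two states."""
    return [a ^ b for a, b in zip(state_a, state_b)]


def active_bytes(delta):
    """Indices of the nonzero (active) bytes of a difference."""
    return [i for i, b in enumerate(delta) if b != 0]
```

With this layout, index 6 sits in the second row and third column, and a difference that is nonzero only in bytes 0 and 2 has `active_bytes` equal to `[0, 2]`.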

III. RECIPROCAL ID ATTACKS ON SPN CIPHERS
Any attack on a block cipher $E_K$ is an algorithm whose input is a particular set of plaintext/ciphertext pairs. The output is the secret key. In general, $E_K$ is used as an oracle to produce the data that the attack makes use of. Alternatively, it is possible to produce the data through $D_K$. Yet, the number of calls generally changes. In Definition 1, we call an attack reciprocal if it has the same data complexity when it is mounted on $D_K$ as when it is mounted on $E_K$. We show that ID attacks need not be reciprocal even though any ID characteristic is valid for the decryption function. To the best of our knowledge, all the prominent ID attacks on AES in the literature are reciprocal attacks. Therefore, these attacks can be mounted as CC (Chosen Ciphertext) attacks with the same complexities.

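Definition 1: Let the data required in a non-adaptive attack algorithm $A$ on a block cipher be produced by either $\alpha_e$ calls of the encryption oracle $E_K$ or $\alpha_d$ calls of the decryption oracle $D_K$. Then $A$ is called a reciprocal attack if it has the same data complexity in both cases, that is, if $\alpha_e = \alpha_d$.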
The literature shows that the most efficient and fastest ID attacks are reciprocal ID attacks. However, we prove that reciprocal ID attacks require a large amount of data. We introduce a lower bound for the data complexity of reciprocal ID attacks on SPN ciphers. For this, we use the following notation.
• $k_i$/$k_f$: number of independent key bits in the initial/final rounds involved in producing the input/output differences of the ID characteristic, respectively.
• $n_i$/$n_f$: number of active input/output words in the plaintext/ciphertext differences, respectively.
• $P_i$/$P_f$: the probability that a subkey in the initial/final rounds produces the input/output difference of the ID characteristic for a given input/output pair, respectively.
• $D_u$: the average number of pairs used in the attack.
• $U_i$/$U_f$: number of structures in plaintexts/ciphertexts, respectively.

A. TYPICAL ID ATTACK
A typical ID attack on an $n$-bit SPN cipher is a successful ID attack (faster than exhaustive search) that exploits one truncated ID characteristic in the middle rounds. A few rounds are added at the beginning, which we call initial rounds, and a few rounds are added at the end, which we call final rounds. Then, the attack searches for all the necessary subkey bits in the initial and final rounds in order to check if a given plaintext pair produces the input difference of the ID characteristic and its corresponding ciphertext pair produces the output difference of the ID characteristic as truncated differences. We call these subkey bits the involved bits.
We assume these bits are independent. If some of them can be computed by means of the key schedule, we skip searching for them. We use enough data to sieve all the involved subkey bits in a typical ID attack. We adopt the big-O notation for data complexities in our statements but we omit the notation $O(\cdot)$ for the sake of simplicity. Shakiba et al. introduce the definition of an ideal ID attack in terms of its complexity in [41]. They categorize an ID attack as an ideal ID attack if the dominant part of its time complexity is the number of memory accesses for sieving out the stored subkeys that are involved in producing the input/output differences of the ID characteristic. We also assume a typical ID attack is ideal.
Let an ID attack make use of the pairs $(\Delta P, \Delta C)$ where specific $n_i$ words of $\Delta P$ and $n_f$ words of $\Delta C$ are active. That is, we have nonzero differences only on these words. A structure for the inputs is a set of plaintexts whose $n_i$ words take all the values and other words are constant. Similarly, a structure for the outputs is a set of ciphertexts whose $n_f$ words take all the values and other words are constant. There are $2^{sn_i-1}(2^{sn_i} - 1)$ pairs in a structure of plaintexts. But we assume there are $2^{2sn_i-1}$ pairs for $n_i > 1$. This does not change the complexities in big-O notation. Similarly, we assume a structure for ciphertexts contains $2^{2sn_f-1}$ pairs. For a CP attack, we construct $U_i$ structures and, for each plaintext pair $\Delta P$ in a structure, check whether its ciphertext pair $\Delta C$ has exactly $n_f$ active words, only on the specific positions. If so, this $(\Delta P, \Delta C)$ is used in the attack. If a subkey guess leads to the ID characteristic in the middle rounds from $(\Delta P, \Delta C)$, this subkey is eliminated.
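The pair counts for a structure can be checked numerically. The sketch below compares the exact number of unordered pairs in a structure of $2^{sn_i}$ texts with the approximation $2^{2sn_i-1}$ used in the text; the function names are illustrative only.

```python
def pairs_exact(s, n_active):
    """Exact number of unordered pairs in a structure of 2^(s * n_active) texts."""
    size = 1 << (s * n_active)
    return size * (size - 1) // 2


def pairs_approx_log2(s, n_active):
    """log2 of the approximation 2^(2 * s * n_active - 1)."""
    return 2 * s * n_active - 1
```

For $s = 8$ and one active byte, the exact count is 32640 against the approximation $2^{15} = 32768$. For four active bytes, the two values differ only by a relative factor of $2^{-32}$, which is why the approximation is harmless for $n_i > 1$.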
A typical ID attack has the following parameters in CP scenario: The parameters of this attack will be in CC scenario. Note that we do not consider multiple ID attacks or an ID attack exploiting multiple ID characteristics simultaneously in a typical ID attack. A trivial lower bound for the time complexity is $\max\{2^{sn_i}, 2^{sn_f}\}$.
In practice, the numbers of structures are generally integers. But, we do not impose $U_i$ or $U_f$ to be integers. If the number of structures is not an integer, then not all of the elements in one of the structures are used. Let $U_i = q_i + \epsilon_i$ with $0 \le \epsilon_i < 1$, $q_i \in \mathbb{Z}$, $q_i \ge 0$, and $U_f = q_f + \epsilon_f$ with $0 \le \epsilon_f < 1$, $q_f \in \mathbb{Z}$, $q_f \ge 0$.
Remark 1: We assume $O((q_i + \epsilon_i)2^{sn_i}) \ne O(q_i 2^{sn_i})$ and $O((q_f + \epsilon_f)2^{sn_f}) \ne O(q_f 2^{sn_f})$ for nonzero $\epsilon_i$ and nonzero $\epsilon_f$ throughout the paper. Therefore, we simply assume $\epsilon_i = \epsilon_f = 0$ for $q_i \ge 4$ and $q_f \ge 4$.
We need $D_u$ pairs to eliminate all the wrong subkeys involved in either a CP attack or a CC attack. However, the number of calls of the oracle to get $D_u$ pairs may change. We assume that a sufficient number of pairs is used to eliminate all the wrong subkeys. We also assume that each subkey candidate $(K_i, K_f)$ from the initial subkey $K_i$ and the final subkey $K_f$ is eliminated with probability $P_i P_f$ by an input/output pair and hence survives all $D_u$ pairs with probability $(1 - P_i P_f)^{D_u}$. Requiring that the expected number of surviving wrong candidates, $2^{k_i+k_f}(1 - P_i P_f)^{D_u}$, drops below one and using $(1 - P_i P_f)^{D_u} \approx e^{-P_i P_f D_u}$, the minimum number of pairs in order to eliminate all the subkeys is

$$D_u = \frac{k_i + k_f}{\log_2(e)\, P_i P_f}$$

for a typical ID attack on an SPN cipher with parameters $(k_i, k_f, P_i, P_f)$, where $e$ is Euler's number.
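The number of pairs needed for sieving can be estimated numerically. The sketch below assumes the usual estimate in which the expected number of surviving wrong subkeys, $2^{k_i+k_f}(1 - P_iP_f)^{D_u}$, is required to drop below one, which gives $D_u \approx (k_i + k_f)/(\log_2(e) P_i P_f)$. The function names and the toy parameters in the usage note are assumptions made for illustration.

```python
import math


def min_pairs(k_i, k_f, p_i, p_f):
    """Pairs needed so that about one wrong subkey candidate is expected to survive."""
    return (k_i + k_f) / (math.log2(math.e) * p_i * p_f)


def expected_survivors(k_i, k_f, p_i, p_f, pairs):
    """Expected number of wrong (K_i, K_f) candidates left after `pairs` pairs."""
    return 2 ** (k_i + k_f) * (1 - p_i * p_f) ** pairs
```

With toy values $k_i = k_f = 8$ and $P_i = P_f = 2^{-4}$, `min_pairs` gives about 2839 pairs, and `expected_survivors` evaluated at that count is slightly below one.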
The elimination process may utilize several techniques such as guess and determine methods, hash tables, and early abort techniques to enhance the time complexity as proposed in [33] and [42]. However, we study the reciprocal ID attacks in a generic setting. So, it is not possible to introduce statements about time complexities since they depend on the attack algorithms. Therefore, we do not consider time or memory complexities. But, we can introduce the trivial lower bound as $2^{k_i+k_f}$ memory accesses to eliminate all the wrong keys. In this work, we focus on data complexities.
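Proposition 1: Let a typical ID attack on an SPN cipher have the numbers of structures $U_i = q_i + \epsilon_i$ and $U_f = q_f + \epsilon_f$. Then, this ID attack is reciprocal if and only if

$$\epsilon_i(1 - \epsilon_i)2^{sn_i} = \epsilon_f(1 - \epsilon_f)2^{sn_f}.$$

Proof: A typical ID attack is reciprocal if and only if its data complexities are equal in both CP and CC scenarios by Definition 1. We use all the elements in $q_i$ structures. So, we have $q_i 2^{2sn_i-1}$ pairs of the plaintexts. On the other hand, we use $\epsilon_i 2^{sn_i}$ plaintexts of the last structure. So, we can produce $\epsilon_i^2 2^{2sn_i-1}$ pairs from this structure. Together we have $q_i 2^{2sn_i-1} + \epsilon_i^2 2^{2sn_i-1}$ pairs. We check if their ciphertext pairs have exactly $n_f$ active words in the specific positions. So, the number of pairs used in the attack is

$$D_u = (q_i + \epsilon_i^2)\,2^{2sn_i-1}\,2^{sn_f-n} \quad (1)$$

which is also equal to

$$D_u = (q_f + \epsilon_f^2)\,2^{2sn_f-1}\,2^{sn_i-n} \quad (2)$$

since the same data pairs are used in both CP and CC scenarios. Organizing Equation 1 and Equation 2, we have

$$(q_i + \epsilon_i^2)\,2^{sn_i} = (q_f + \epsilon_f^2)\,2^{sn_f}. \quad (3)$$

On the other hand, the data complexity is $q_i 2^{sn_i} + \epsilon_i 2^{sn_i}$ in CP scenario and $q_f 2^{sn_f} + \epsilon_f 2^{sn_f}$ in CC scenario. We need them to be equal for the attack to be reciprocal. Substituting $q_i 2^{sn_i} - q_f 2^{sn_f}$ with $\epsilon_f 2^{sn_f} - \epsilon_i 2^{sn_i}$ in Equation 3, we obtain

$$\epsilon_i(1 - \epsilon_i)2^{sn_i} = \epsilon_f(1 - \epsilon_f)2^{sn_f}. \quad (4)$$

Conversely, if Equation 4 holds, then, substituting in Equation 3, we have

$$(q_i + \epsilon_i)\,2^{sn_i} = (q_f + \epsilon_f)\,2^{sn_f}$$

which means that data complexities are equal in both CP and CC scenarios. Note that Equation 3 is always valid in any ID attack. □
Corollary 1: In a typical reciprocal ID attack, $\epsilon_i = 0$ if and only if $\epsilon_f = 0$.
Proof: If an ID attack with the numbers of structures $U_i = q_i + \epsilon_i$ in the encryption direction and $U_f = q_f + \epsilon_f$ in the decryption direction is reciprocal and $\epsilon_i = 0$, then $\epsilon_f(1 - \epsilon_f) = 0$ by Proposition 1, which gives $\epsilon_f = 0$ since $0 \le \epsilon_f < 1$. The converse follows in the same way. □
Corollary 2 below can be considered as a useful characterization of reciprocal ID attacks which can be used in practice to identify that the ID attacks on SPN ciphers in the literature are mostly reciprocal attacks.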
Corollary 2: If both $U_i$ and $U_f$ are integers in a typical ID attack then the attack is reciprocal.
Proof: Let the numbers of structures of a typical ID attack be integers in both CP and CC scenarios. Then $\epsilon_i = \epsilon_f = 0$, so both sides of the condition in Proposition 1 vanish. This simply implies that the attack is reciprocal by Proposition 1. □
It is crucial to note that any attack with $U_i \ge 4$ and $U_f \ge 4$ is reciprocal by Corollary 2. For example, the numbers of structures used in all the well-known ID attacks on AES are much higher than 4 and are all integers [30], [31], [32], [33], [42], [43], [44], [45], [46], [47], [48]. So, all the known ID attacks on AES are reciprocal by Corollary 2. We observe that the reciprocal attacks achieve good performance in terms of time complexity. However, they require a lot of data. We can give lower bounds for the data complexities of such attacks through our statements in Section IV.
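The comparison of CP and CC data behind reciprocity can be sketched in the log domain. The sketch makes the following assumptions: a full structure yields about $2^{2sn-1}$ pairs, a pair passes the filter on the other side with probability about $2^{sn'-n}$, a fraction $\epsilon$ of a single structure yields a fraction $\epsilon^2$ of its pairs, and fractional parts are ignored once at least one full structure is used. The function names and the example parameters are hypothetical and do not correspond to any attack in the literature.

```python
def log2_data(log2_pairs, n, s, n_here, n_there):
    """log2 of the data needed to collect 2^log2_pairs useful pairs.

    n_here:  active words on the side where structures are built
    n_there: active words required on the opposite side
    """
    # Useful pairs delivered by one full structure, in log2.
    log2_pairs_per_structure = 2 * s * n_here - 1 + s * n_there - n
    log2_structures = log2_pairs - log2_pairs_per_structure
    if log2_structures < 0:
        # Less than one structure: eps * 2^(s*n_here) texts give eps^2 of the pairs.
        log2_structures /= 2
    return log2_structures + s * n_here


def is_reciprocal(log2_pairs, n, s, n_i, n_f):
    """True if the CP and CC data requirements coincide under this model."""
    cp = log2_data(log2_pairs, n, s, n_i, n_f)
    cc = log2_data(log2_pairs, n, s, n_f, n_i)
    return cp == cc
```

For instance, with $n = 128$, $s = 8$, $n_i = n_f = 4$ and $2^{30}$ pairs, both directions need $2^{95}$ texts, so the model reports a reciprocal attack. With $n_i = 16$, $n_f = 8$ and $2^{170}$ pairs, the CP direction uses a fraction of one structure ($2^{117.5}$ texts) while the CC direction uses many full structures ($2^{107}$ texts), so the two differ.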
The following theorem characterizes the reciprocal attacks when the numbers of structures are not integers.
131210 VOLUME 11, 2023 Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.
Theorem 1: Let a typical ID attack have $U_i = q_i + \epsilon_i$ and $U_f = q_f + \epsilon_f$ with $\epsilon_i \ne 0$ and $\epsilon_f \ne 0$. Then, the attack is reciprocal if and only if $n_i = n_f$, $q_i = q_f$, and $\epsilon_i = \epsilon_f$.
Proof: If we have $n_i = n_f$, $q_i = q_f$ and $\epsilon_i = \epsilon_f$, then the two sides of the condition in Proposition 1 coincide and hence the attack is reciprocal by Proposition 1. On the other hand, assume that the attack is reciprocal. If $n_i = n_f$ then $q_i + \epsilon_i = q_f + \epsilon_f$ since $2^{sn_i}(q_i + \epsilon_i) = 2^{sn_f}(q_f + \epsilon_f)$. This implies that $q_i = q_f$ and $\epsilon_i = \epsilon_f$ because $q_i$ and $q_f$ are integers and $0 \le \epsilon_i < 1$, $0 \le \epsilon_f < 1$. Assume on the contrary that $n_i \ne n_f$. Let $n_i < n_f$. If $q_i \ne 0$, we have $q_i + \epsilon_i = 2^{s(n_f-n_i)}(q_f + \epsilon_f)$ and $q_i + \epsilon_i^2 = 2^{s(n_f-n_i)}(q_f + \epsilon_f^2)$. But there is no nonzero solution for $\epsilon_f$ for these two equations since both $\epsilon_f \approx q_i 2^{s(n_i-n_f)} - q_f$ and $\epsilon_f^2 \approx q_i 2^{s(n_i-n_f)} - q_f$. If $q_i = 0$ then $q_f$ must be zero. Otherwise $\epsilon_i > 1$. When $q_f = 0$ also, we have no nonzero solutions of $\epsilon_f$ and $\epsilon_i$ again since $2^{s(n_f-n_i)} \ne 1$. In summary, if $n_i \ne n_f$ then the attack cannot be reciprocal for nonzero $\epsilon_i$ and $\epsilon_f$. □
We construct our lower bounds for the data complexities of typical ID attacks in general in Theorem 2. We introduce a precise lower bound for the data complexities of reciprocal ID attacks.
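Theorem 2: A typical reciprocal ID attack on an SPN cipher with its parameters $n_i$, $n_f$, $D_u$, $U_i$ and $U_f$ requires at least

$$\left(2^{n+1}\,U_i U_f D_u\right)^{1/3}$$

chosen plaintexts (or, equally, chosen ciphertexts) where $U_i = q_i + \epsilon_i$ and $U_f = q_f + \epsilon_f$; $0 \le \epsilon_i < 1$; $0 \le \epsilon_f < 1$, $q_i, q_f \in \mathbb{Z}$, $q_i \ge 0$, $q_f \ge 0$.
Proof: The data complexity is $U_i 2^{sn_i}$, which is $U_f 2^{sn_f}$ at the same time since the attack is reciprocal. We have

$$D_u \le U_i\,2^{2sn_i + sn_f - n - 1}.$$

Taking the logarithm,

$$2sn_i + sn_f \ge \log_2 D_u - \log_2 U_i + n + 1 \quad (5)$$

and similarly for the CC scenario

$$2sn_f + sn_i \ge \log_2 D_u - \log_2 U_f + n + 1. \quad (6)$$

Summing Inequality 5 and Inequality 6 and then dividing by 3, we obtain a lower bound for $sn_f + sn_i$:

$$sn_i + sn_f \ge \frac{2\log_2 D_u - \log_2 U_i - \log_2 U_f + 2n + 2}{3}. \quad (7)$$

On the other hand, we have $U_i 2^{sn_i} = U_f 2^{sn_f}$ since the attack is reciprocal and hence

$$sn_i - sn_f = \log_2 U_f - \log_2 U_i. \quad (8)$$

Adding Inequality 7 to Equation 8 and dividing by 2, we get a lower bound for $sn_i$:

$$sn_i \ge \frac{\log_2 D_u - 2\log_2 U_i + \log_2 U_f + n + 1}{3}. \quad (9)$$

Therefore, the logarithm of the data complexity is bounded by

$$sn_i + \log_2 U_i \ge \frac{\log_2 D_u + \log_2 U_i + \log_2 U_f + n + 1}{3}. \quad (10)$$

Taking the powers of both sides of Inequality 10, we get

$$U_i 2^{sn_i} \ge \left(2^{n+1}\,U_i U_f D_u\right)^{1/3}.$$

□
One straightforward conclusion of Theorem 2 is the following lower bound for reciprocal ID attacks. Even though it is the most generic bound, it is a loose bound.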
Corollary 3: A typical reciprocal ID attack on an SPN cipher with $U_i \ge 1$ has the data complexity of at least $2^{(n+1)/3}$ chosen plaintexts.
Proof: The data complexity is bounded below by $\left(2^{n+1}\,U_i U_f D_u\right)^{1/3}$ by Theorem 2. $U_i \ge 1 \Rightarrow U_f \ge 1$ by Theorem 1. On the other hand, $D_u \ge 1$. □
The most dominant parameters in data complexity are $P_i$ and $P_f$. So, we can simplify the lower bound as follows.
Theorem 3: A typical reciprocal ID attack on an SPN cipher with $U_i \ge 1$ has the data complexity of at least $\left(2^{n+1}(P_i P_f)^{-1}\right)^{1/3}$ chosen plaintexts.
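The cube-root bound is easy to evaluate in the log domain. The following sketch computes $\log_2$ of $(2^{n+1}(P_iP_f)^{-1})^{1/3}$; the function name and the sample probability are assumptions for illustration.

```python
def log2_reciprocal_bound(n, log2_p=0.0):
    """log2 of (2^(n+1) / p)^(1/3), where log2_p = log2(P_i * P_f) <= 0."""
    return (n + 1 - log2_p) / 3
```

For a 128-bit block and $P_iP_f = 1$, this gives 43, matching the generic $2^{43}$ figure for $n = 128$. A hypothetical $P_iP_f = 2^{-24}$ raises the bound to $2^{51}$.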
Theorem 4: A typical reciprocal ID attack on an SPN cipher with $(n_i, n_f, U_i, U_f)$ has the data complexity of at least $\sqrt{2^{sn_i+sn_f}\,U_i U_f}$ chosen plaintexts (or, equally, chosen ciphertexts).
Proof: The data complexity is at least chosen plaintexts (or, equally, chosen ciphertexts) by Theorem 2. We can write this bound as which is equal to $\sqrt{2^{sn_i+sn_f}\,U_i U_f}$. □
Let us note that the bound in Theorem 4 is the geometric mean of the data complexities in CP and CC scenarios. So, if they are equal, they also equal their geometric mean, and we have precise equality. In general, it is possible to bound the data complexity by using the number of active words in Lemma 1. This well-known result is valid for an arbitrary ID attack.
Lemma 1 ([29]): A typical ID attack on an SPN cipher with the parameters $(k_i, k_f, n_i, n_f, P_i, P_f)$ requires at least

$$2^{n+1-sn_i-sn_f}\,\frac{k_i + k_f}{\log_2(e)\, P_i P_f}$$

data in both the chosen plaintext and the chosen ciphertext scenarios.
Indeed, the data attains the bound if the numbers of structures are integers. That is, we have $D = 2^{n+1-sn_i-sn_f} D_u$ in almost all the ID attacks (e.g. [30], [31], [32], [33], [46]) since the attacks make use of plenty of structures to optimize the overall complexity in both directions. As one exceptional example, the parameters of the attack on Camellia in [49] are $sn_i = 128$, $sn_f = 56$ and $D_u = 2^{168}$. So, $U_i = 2^{-7.5}$ and $U_f = 2^{57}$. Then, the attack requires $D_u 2^{129-128-56} = 2^{57} 2^{56} = 2^{113}$ CC in the decryption oracle, but $2^{121.5}$ CP in the encryption oracle.
The bound in Lemma 1 might be insufficient for a reciprocal attack if the difference $|n_i - n_f|$ is large enough or the number of structures is less than one in one direction. We treat all the cases to have complete security proofs of SPN ciphers in terms of data requirements of reciprocal ID attacks. It is possible to eliminate the pairs from the cancellations either in the ciphertexts or in the plaintexts. Therefore, if one of $n_i$ or $n_f$ is too small, the corresponding reciprocal attack requires a large amount of data.

Theorem 5: The data complexity of a typical reciprocal ID attack on an SPN cipher having the parameters
Proof: Consider the CC scenario. The data complexity is then bounded below as claimed, which concludes the proof. □
Theorem 5 introduces an efficient bound. In fact, $\min\{n_i, n_f\} = 4$ in almost all the reciprocal attacks on AES, and the data complexities of some of them are listed in Table 2 along with the lower bounds deduced through Theorem 5. The question is whether the bound in Theorem 5 is sharp. We claim that it is, and we define the reciprocal attacks attaining this bound as the attacks with optimal data.
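As a quick numerical check, the following Python sketch evaluates the Theorem 5 bound in the form in which it is applied in the proof of Theorem 8, namely $D \ge \sqrt{2^{n+1-s\min\{n_i,n_f\}}\,(k_i+k_f)/(\log_2(e)P_iP_f)}$. This form is an assumption reconstructed from that proof, and the function and parameter names are illustrative only.

```python
import math

def theorem5_bound_log2(n, s, n_i, n_f, key_bits, log2_p):
    """Return log2 of the assumed Theorem 5 lower bound on the data D.

    n        : block size in bits
    s        : word size in bits
    n_i, n_f : active words in the plaintext / ciphertext pairs
    key_bits : k_i + k_f, the number of key bits involved
    log2_p   : log2(P_i * P_f)
    """
    # log2 of (k_i + k_f) / (log2(e) * P_i * P_f)
    log2_pairs = math.log2(key_bits) - math.log2(math.log2(math.e)) - log2_p
    # Square root of 2^(n + 1 - s*min(n_i, n_f)) times the quantity above
    return (n + 1 - s * min(n_i, n_f) + log2_pairs) / 2

# Values used in the proof of Theorem 8: n_i = 4, k_i + k_f = 64, P_i P_f = 2^-22.
bound = theorem5_bound_log2(n=128, s=8, n_i=4, n_f=4, key_bits=64, log2_p=-22)
print(f"D >= 2^{bound:.2f}")
```

With these inputs the exponent lands slightly above 62, in line with the $2^{62}$ figure quoted for attacks that use four active plaintext bytes.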
Remark 2: Theorem 2 may be seen similar to the bound given in [29], which is stated as It is plain that the maximum of n i , n f is taken in [29]   its data complexity is not more than twice the bound given in Theorem 5.That is, where ⌊•⌋ is the flooring function.
No reciprocal ID attack on AES with optimal data has been published so far. We introduce the first one in Section V. This example shows that the bound in Theorem 5 is sharp.
We characterize the reciprocal ID attacks with data attaining the bound in Theorem 5 in the following statement.
Theorem 6: Assume n i ≤ n f .The data complexity D of a typical reciprocal ID attack on an SPN cipher having the parameters log 2 (e)P i P f .
For the other direction, assume $D$ is equal to the bound in Inequality 13. The number of structures is at least $D_u / 2^{2sn_i+sn_f-n-1}$. Solving the resulting inequality with respect to $D_u$, we have $D_u \le 2^{2sn_f+sn_i-n-1}$. But this is the number of pairs in the CC scenario with only one structure or a subset of it. So, $D = \epsilon 2^{sn_f}$. If $\epsilon < 1$ then $n_i = n_f$ by Theorem 1, since the attack is reciprocal.
□ Corollary 5: Assume $D \le 2^{\max\{sn_i,sn_f\}}$ for a typical reciprocal ID attack on an SPN cipher having the parameters
Proof: Assume $D \le 2^{sn_f}$ and $n_i \le n_f$. Then $\sqrt{2^{n+1-sn_i}D_u} \ge 2^{n+1-sn_i-sn_f}D_u$ by Theorem 6. So, $D_u \le 2^{sn_i+2sn_f-n-1}$. On the other hand, $1 \le D_u$. So, $n + 1 \le sn_i + 2sn_f$. □
Corollary 5 can be used to develop a design criterion for an SPN cipher that provides security against ID attacks. The diffusion layer of a block cipher is supposed to satisfy the bound for any initial and final rounds and for any ID characteristic.
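The inequality $n + 1 \le sn_i + 2sn_f$ obtained at the end of the proof (for $n_i \le n_f$) is easy to tabulate. The Python sketch below is only an illustration: the function name is hypothetical, and treating the inequality as symmetric by sorting the two word counts is our assumption.

```python
def corollary5_condition(n, s, n_i, n_f):
    """Check n + 1 <= s*min + 2*s*max, the necessary condition from the proof."""
    low, high = sorted((n_i, n_f))
    return n + 1 <= s * low + 2 * s * high

# AES-like parameters: 128-bit block, 8-bit words, up to 16 active bytes.
feasible = [(a, b) for a in range(1, 17) for b in range(a, 17)
            if corollary5_condition(128, 8, a, b)]

print(corollary5_condition(128, 8, 4, 4))   # one active column on each side
print(feasible[0])                          # first admissible pair in lexicographic order
```

For example, with four active bytes on each side the condition fails, so under the stated hypotheses such an attack cannot stay within one structure of $2^{sn_f}$ texts.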
The straightforward conclusion of Corollary 5 is that if the number of structures is less than one, then the number of active words in both input and output cannot be arbitrarily small.
Corollary 6: Assume U i < 1 for a typical reciprocal ID attack on an SPN cipher having the parameters

IV. PROVABLE SECURITY OF AES AGAINST ID ATTACKS
Any ID attack on AES makes use of 4-round ID characteristics, and all these characteristics are identified by Grassi et al. in [25]. We derive the minimum amount of data used in a typical reciprocal ID attack on AES exploiting one of these characteristics.

Lemma 2 ([25]): If the total number of active input diagonals and active output inverse diagonal columns in any 4-round characteristic of AES is less than four, then this characteristic is an ID characteristic.
Definition 3: We call any 4-round ID characteristic described in Lemma 2 a 4-round conventional ID characteristic of AES.
All the known ID attacks on AES are reciprocal ID attacks, since their numbers of structures are integers in both the encryption and decryption directions. Moreover, they all exploit 4-round conventional ID characteristics. Indeed, there are no known ID characteristics of AES other than conventional ones. We give a lower bound for the data complexity of a reciprocal ID attack on AES exploiting one of the 4-round conventional ID characteristics.
We state the following conjecture as Claim 1. Once the conjecture is proven, Theorem 7 below extends to all the ID attacks on AES.
Claim 1: All the truncated ID characteristics of r-round AES where r ≥ 4 are conventional 4-round ID characteristics.
There are strong indications in the literature that Claim 1 is correct. Sun et al. reduce the problem of the existence of an ID for a given SPN (Substitution Permutation Network) to the problem of the existence of an ID whose input and output Hamming weights are both one. They conclude that AES has no 5-round ID unless the details of the S-box are taken into consideration [26]. Further evidence is provided by Wang and Jin in [27], who exploit the properties of the AES S-box by using the "dependent tree" method. However, their result is given under the assumption that all the round keys are uniformly random and independent. Boura and Coggia show, by using MILP solvers, that AES has no 5-round ID when the details of both the S-box and the key schedule are taken into account. However, their result is valid only if the first and the last rounds of the characteristic contain two active S-boxes in total [28].
Theorem 7 gives a powerful lower bound for the data complexities of all the known reciprocal ID attacks on AES.
Theorem 7: A typical reciprocal ID attack on AES exploiting a 4-round conventional ID characteristic has a data complexity of at least $2^{66}$ chosen plaintexts.
Proof: The number of active MC operations (those whose input difference is nonzero) before and after any conventional ID characteristic is at least two (one in the initial rounds and one in the final rounds). The total number of passive bytes in one column of the input and in one column of the output of a 4-round conventional ID characteristic is at least 4 by Lemma 2. So, $P_iP_f \le \binom{4}{2}^2 2^{-32}$. The number of key bits involved is at least 48. Hence, we have $\frac{k_i+k_f}{\log_2(e)P_iP_f} \ge 2^{31}$. If $n_i + n_f < 12$ then the data $D \ge 2^{70}$ by Lemma 1.
Let $n_i + n_f = 12$. Then, we have at least three active MC operations. If there are exactly 3 active MC operations, then there is only one active MC operation either in the input or in the output. So, $n_i \le 4$ or $n_f \le 4$, and hence $2^{129-8\cdot\min\{n_i,n_f\}} \ge 2^{97}$. On the other hand, Hence, $D \ge 2^{69}$ by Theorem 5. We need more data if $n_i \le 4$ or $n_f \le 4$ and there are more than 3 active MC operations, again by Theorem 5. So, assume $n_i > 4$ and $n_f > 4$, and hence there are at least two active MC operations in each direction. Then 64. Then, we have $2^{129-8\cdot\min\{n_i,n_f\}} \ge 2^{81}$ since $\min\{n_i, n_f\} \le 6$. Therefore, $D \ge 2^{71}$ by Theorem 5.
So, let $n_i + n_f > 12$. In this case, the number of active MC operations is at least 4. Let the numbers of active MC operations in the initial and final rounds be $m_i$ and $m_f$, respectively. If $(m_i, m_f) = (1, 3)$ then $P_iP_f \le 2^{-44}$ and $k_i + k_f \ge 80$. Hence, $D_u \ge 2^{49}$. On the other hand, $n_i \le 4$. So, $D \ge 2^{(49+96+1)/2} = 2^{71}$ in the CC scenario. The case $(m_i, m_f) = (3, 1)$ is similar; one can mount the attack in the CP scenario in this case. We need more data for $(m_i, m_f) = (1, \ge 3)$ since $P_iP_f$ decreases and $k_i + k_f$ increases, and hence $D_u$ increases.
Assume $(m_i, m_f) = (2, 2)$. Then, $P_iP_f \le \binom{4}{2}^2 2^{-64}$ and $k_i + k_f \ge 104$ for the minimum data. Hence, $D_u \ge 2^{65}$. So, we need at least $2^{129}$ pairs. One structure contains at most $2^{127}$ pairs, and all but a fraction of $2^{-64}$ of them will be discarded. That is, we need at least $2^2$ structures and so $D \ge 2^{66}$ both in the CP and in the CC scenario. The $(m_i, m_f) = (2, \ge 2)$ case requires more data in the CC scenario and the $(m_i, m_f) = (\ge 2, 2)$ case requires more data in the CP scenario. When $(m_i, m_f) = (3, 3)$, □
We prove in Theorem 7 that any reciprocal ID attack on AES exploiting a conventional 4-round ID characteristic requires at least $2^{66}$ data, whatever its steps are. The question is whether there are reciprocal ID attacks on AES with roughly this data complexity. We introduce an example in Section V. This attack is a reciprocal ID attack with optimal data, and its data complexity attains the bound in Theorem 7.
Almost all the ID attacks on AES in the literature make use of four active bytes in either the first or the last round, which produce only one active byte after the MDS multiplication. Then, we can prove the following statement for these attacks even though they do not exploit conventional ID characteristics.  6. Therefore the minimum data to eliminate all the subkeys involved is bounded below by $2^{62}$. □
All the well-known ID attacks on AES in [30], [31], [32], [33], and [46] make use of one active column of the MC operation in the first round. This column results in only one active byte. Hence all these attacks require at least $2^{62}$ CP according to Theorem 8. It does not seem possible to give a meaningful lower bound for the data complexity of a typical nonreciprocal ID attack on AES. But we can give a lower bound for the time complexity.
Theorem 9: Any typical nonreciprocal ID attack on AES exploiting one of the 4-round conventional ID characteristics has a time complexity of at least $2^{88}$ trials.
Proof: Let us show that $n_i$ or $n_f$ of a nonreciprocal attack on AES is greater than 10. Assume $n_i \le 9$. If $n_i = 9$ then one structure can produce $2^{143}$ pairs and there are at least 3 active MC operations in the first round. If there is only one active MC operation in the last round, then at most $2^{47}$ of the pairs remain and $P_iP_f \le 4^2 2^{-24} 2^{-24} = 2^{-44}$ with $D_u \ge 2^{50}$. So, we need at least 8 structures in a CP attack and many more structures in a CC attack. So, the attack will be reciprocal. Assume there are two active MC operations in the last round. Then, $P_iP_f \le 6^2 2^{-48} 2^{-32}$ and $D_u \ge 2^{83}$. Again, the number of remaining pairs in a structure is at most $2^{79}$. Hence we need at least 16 structures in a CP attack and many more in a CC attack. So, the attack cannot be nonreciprocal. Assume there are three active MC operations in the last round. Then, assume $n_f \le 10$. This implies that $P_iP_f \le 6^2 2^{-48} 2^{-48}$ and $D_u \ge 2^{91}$.
131214 VOLUME 11, 2023 Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
So, again, there are at least $2^4$ structures in both directions. Similarly, if all the MC operations are active and $n_f \le 10$ then there is only one case: $n_f = 10$. The numbers of active bytes in the ciphertext pairs in each MC operation must be 2-2-3-3 or 2-2-2-4. So, $P_iP_f \le 4^2 2^{-72} 2^{-32}$ and $D_u \ge 2^{115}$, and the attack will obviously be reciprocal. In conclusion, if $n_i = 9$ then $n_f \ge 11$. Similarly, if $n_i < 9$ then $n_f \ge 11$. So, there are at least 11 active bytes in the plaintext pairs or in the ciphertext pairs. That is, we need at least $2^{88}$ trials. □
Notice that Theorem 9 is valid for any key schedule. That is, if AES had no key schedule and all the round keys were equal, any nonreciprocal ID attack would still require at least $2^{88}$ trials. Each trial is almost one encryption or partial encryption, depending on the characteristic of the specific attack.
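The structure counts used in the proof follow from simple exponent bookkeeping, which the Python sketch below reproduces. It assumes 8-bit words, a 16-byte state, and that $m$ active MC operations in the last round correspond to at most $4m$ active ciphertext bytes; the function names are illustrative.

```python
def pairs_per_structure_log2(n_i, n_f, s=8, words=16):
    """log2 of the pairs from one CP structure that survive the ciphertext filter.

    A structure holds 2^(s*n_i) plaintexts, hence about 2^(2*s*n_i - 1) pairs.
    A pair is kept only if its s*(words - n_f) passive ciphertext bits agree.
    """
    return 2 * s * n_i - 1 - s * (words - n_f)

def structures_needed_log2(log2_d_u, n_i, n_f):
    """log2 of the number of structures needed to collect 2^log2_d_u useful pairs."""
    return log2_d_u - pairs_per_structure_log2(n_i, n_f)

one_active_mc = pairs_per_structure_log2(9, 4)    # n_i = 9, one active MC at the end
two_active_mc = pairs_per_structure_log2(9, 8)    # n_i = 9, two active MCs at the end
print(one_active_mc, structures_needed_log2(50, 9, 4))
print(two_active_mc, structures_needed_log2(83, 9, 8))
```

The two cases give $2^{47}$ and $2^{79}$ surviving pairs per structure, hence $2^3 = 8$ and $2^4 = 16$ structures for $D_u \ge 2^{50}$ and $D_u \ge 2^{83}$, respectively.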

V. A RECIPROCAL ID ATTACK WITH OPTIMAL DATA
We introduce an example of a 6-round reciprocal ID attack on AES, as depicted in Figure 3, to show that our lower bound is almost sharp. In this attack, the number of pairs needed is about $2^{65.3}$. So, take $U_i = 5 \approx 2^{2.5}$. Then, the data complexity is $D = 2^{64} 2^{2.5} = 2^{66.5}$. Each pair among the $D_u$ pairs suggests around $(6 \cdot 2^{32})^2 \approx 2^{69}$ subkeys, and we can determine all these keys by guessing four bytes from $RK_0$ and four bytes from $RK_6$. Then, we detect the wrong keys to be eliminated for each guess and for each pair. So, the time complexity is around $2^{65.5} 2^{69} = 2^{134.5}$ memory accesses, which is roughly $2^{131}$ encryptions. We can recover 64 bits of $RK_6$, namely $RK_6[0, 1, 4, 7, 10, 11, 13, 14]$. Then, we can recover the remaining 16 bytes of $RK_6$ and $RK_5$ by exhaustive search for AES-192, or we mount the attack with the same data, this time to recover the other round key bytes of $RK_6$ by switching the active and passive bytes in the ciphertext pairs. Therefore, this attack works on AES-192 and AES-256.
If we have only $2^{66}$ CP, then the number of pairs $D_u$ will be $2^{65}$. We have around $2^{35.5}$ pairs which suggest the output of the 4-round ID characteristic for a fixed value of $RK_6[0, 1, 4, 7, 10, 11, 13, 14]$. Then, the probability that a candidate for the 64-bit whitening key bytes $RK_0[0, 1, 5, 6, 10, 11, 12, 15]$ is not eliminated is $(1 - 2^{-29.5})^{2^{35.5}} \approx e^{-64} \approx 2^{-92.3}$. Hence the probability that all the 64-bit whitening keys are eliminated is $(1 - 2^{-92.3})^{2^{64}} > 1 - 2^{-28}$. That is, we expect more than $2^{64} - 2^{36}$ of the candidates for the round key $RK_6[0, 1, 4, 7, 10, 11, 13, 14]$ to be eliminated. So, we obtain 28 bits of information about $RK_6$, and the attack with $2^{66}$ CP will be faster than the exhaustive search. Therefore our attack is an ID attack with optimal data by Definition 2.
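The survival probabilities quoted above can be re-derived on a log scale. The Python sketch below uses the standard approximation $(1-p)^N \approx e^{-pN}$ and a union bound over the $2^{64}$ whitening keys; the helper name is illustrative and not taken from the paper.

```python
import math

def log2_survival(log2_p, log2_pairs):
    """log2 of (1 - p)^N under the approximation (1 - p)^N ~ exp(-p * N)."""
    expected_hits = 2.0 ** (log2_p + log2_pairs)
    return -expected_hits * math.log2(math.e)

# One whitening-key candidate tested against 2^35.5 pairs, each hitting with 2^-29.5.
key_survives = log2_survival(-29.5, 35.5)

# Union bound: probability that any of the 2^64 whitening keys survives.
some_key_survives = 64 + key_survives

print(f"single key survives: 2^{key_survives:.1f}")
print(f"some key survives:   2^{some_key_survives:.1f}")
```

The second exponent stays below $-28$, which matches the claim that a wrong value of the 64 guessed bits of $RK_6$ is rejected with probability greater than $1 - 2^{-28}$.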

VI. A NONRECIPROCAL ID ATTACK ON AES
We have shown that reciprocal ID attacks require a large amount of data. In particular, any reciprocal ID attack on AES exploiting a 4-round conventional ID characteristic requires at least $2^{66}$ data by Theorem 7. In this section, we examine whether such a high data complexity is necessary for any ID attack. To this end, we introduce a nonreciprocal ID attack on AES which requires only $2^{30}$ CP, showing that Theorem 7 does not hold for nonreciprocal ID attacks even though the characteristic exploited is a 4-round conventional ID characteristic as given in Lemma 2.
To illustrate how to mount an ID attack on AES with small data complexity, we exploit the well-known ID characteristic introduced by Biham and Keller [42]. The Biham-Keller characteristic is exploited in several ID attacks such as [31], [32], [33], and [42]. We use this characteristic to mount a typical ID attack on 6-round AES, which is depicted in Figure 4. With the parameters of this attack, the data complexity of the CP attack is $U_i 2^{8n_i} = 2^{-2} 2^{32} = 2^{30}$, whereas it is $U_f 2^{8n_f} = 2^{-50} 2^{128} = 2^{78}$ chosen ciphertexts. Clearly, the attack is not reciprocal. Indeed, $2^{32}(2^{-4} - 2^{-2}) \ne 2^{128}(2^{-100} - 2^{-50})$ and hence the attack is nonreciprocal by Proposition 1. As is easily observed, Theorem 2, Theorem 4, and Theorem 5 do not apply to this attack since it is nonreciprocal. Indeed, the lower bounds are $2^{136/3}$, $2^{54}$, and $2^{46}$ respectively, which are all higher than $D = 2^{30}$.
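The figures above can be reproduced directly from $U_i = 2^{-2}$, $n_i = 4$, $U_f = 2^{-50}$ and $n_f = 16$. The following Python sketch is a sanity check only: it computes both data complexities, their geometric mean (the Theorem 4 expression), and the two sides of the inequality that the text attributes to Proposition 1. Variable names are illustrative.

```python
from fractions import Fraction

S = 8  # AES word size in bits

def data_log2(log2_structures, active_bytes, s=S):
    """log2 of the data: (number of structures) * 2^(s * active bytes)."""
    return log2_structures + s * active_bytes

cp_log2 = data_log2(-2, 4)      # U_i = 2^-2,  n_i = 4
cc_log2 = data_log2(-50, 16)    # U_f = 2^-50, n_f = 16

# Geometric mean of the CP and CC data, computed on the log scale.
theorem4_log2 = (cp_log2 + cc_log2) / 2

# Exact evaluation of the two quantities compared in the text.
lhs = 2**32 * (Fraction(1, 2**4) - Fraction(1, 2**2))
rhs = 2**128 * (Fraction(1, 2**100) - Fraction(1, 2**50))
is_nonreciprocal = lhs != rhs

print(cp_log2, cc_log2, theorem4_log2, is_nonreciprocal)
```

The geometric mean of $2^{30}$ and $2^{78}$ is $2^{54}$, which exceeds the actual CP data of the attack, so the Theorem 4 bound indeed fails here.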

VII. A NEW ATTACK WITH MINIMUM DATA
We introduce a nonreciprocal ID attack on 6-round AES which requires only $2^{18}$ CP. This is a record in terms of the minimum data complexity.
We derive a 3-round impossible differential (ID) characteristic by loosening the Biham-Keller ID characteristic. This relaxation makes all bytes of the output difference active (see Figure 5). We can exploit it by using the property that all the bytes after the SR operation in the third round of the ID characteristic are active. So, if we add one round at the beginning and one more round at the end, we expect all the bytes of an input pair of MC in the fourth round to be active for a whitening key producing only one active byte at the end of the first round, as depicted in Figure 5. We exploit this property as the distinguisher for our ID attack. We can examine the distinguisher since the whole round key is searched in the last round.
We mount our ID attack on 6-round AES-192 and AES-256, and we exploit their key schedules to use the minimum data in our attack on 6-round AES. First of all, we extend the idea of the key bridge of the key schedule of AES-256 introduced by Dunkelman et al. [9].
Let us consider the equivalent key $MC^{-1}(RK_5)$, which is applied before the MC operation, and choose one of its inverse diagonals to eliminate. Each inverse diagonal enables us to compute the corresponding column of the output of the fourth round. If we consider the first inverse diagonal, then we can eliminate the round key bytes $MC^{-1}(RK_5)[0, 7, 10, 13]$ by computing the first row of the output difference of the fourth round and checking whether $MC_4^{-1}$ produces fewer than four active bytes in the first column. This gives a contradiction, since we expect all the bytes of an input pair of MC to be active in the fourth round. The probability of this contradiction is slightly larger than $2^{-6}$. Therefore, the probability that a candidate for $MC^{-1}(RK_5)[0, 7, 10, 13]$ is not eliminated is $(1 - 2^{-6})^{2^{13}} \approx e^{-128} \approx 2^{-184.7}$. Hence we expect all the guessed keys to be eliminated. If some key candidates are left, we repeat the attack for the second, third, and last columns of $MC_4^{-1}$ to eliminate the remaining keys.
We guess 128 bits of $RK_6$ and determine 8 bits of $RK_0$ through the key schedule and 24 bits of $RK_0$ from the data. So, we have 152 bits, and each guess is eliminated by about $2^{11}$ pairs on average. Hence the time complexity is $2^{152} 2^{11} 2^{26} = 2^{189}$ memory accesses, which is around $2^{186}$ encryptions. The memory complexity is $2^{24} 2^{13} = 2^{37}$ units, which is roughly $2^{43}$ bytes. The data complexity is $2^{18}$ CP.

B. AES-192 CASE
We need a hash table for $RK_0[0, 5, 10, 15]$, which can be prepared offline (see [42]). The table contains about $2^{10}$ keys for each plaintext pair $(P[0, 5, 10, 15], P'[0, 5, 10, 15])$ leading to only one active byte after $MC_1$, and it is sorted with respect to the plaintext pairs.
First, let us guess $RK_6$. Then, compute $MC^{-1}(RK_5)[0, 13]$ through the key schedule for AES-192, since we can compute the first two columns of $RK_5$ from $RK_6$. Furthermore, guess one byte from $MC^{-1}(RK_5)[10, 7]$ and determine the other byte through each ciphertext pair. Then, for each 144-bit guess of secret information, determine the ciphertext pairs among the $2^{35}$ pairs that lead to the impossible differential in the output. That is, check whether $MC_4^{-1}[0, 4, 8, 12]$ has at least one passive byte. The probability is about $2^{-6}$. So, there will be around $2^{29}$ ciphertext pairs for each 144-bit guess of $RK_6$ and $MC^{-1}(RK_5)[10, 7]$, which are loaded into memory for that key guess. This memory can be reused for different 144-bit guesses.

VIII. ENHANCING INTEGRAL ATTACK
We have introduced an attack on 6-round AES with the minimum data in the previous section. Its time complexity is marginal. In this section, we study improvements over the best attack on 6-round AES in terms of data, time, and memory complexities.
The integral attack using the partial sum technique in [14] has been the best attack on 6-round AES since 2000 with respect to the total complexity, which is given in [14] as $2^{44}$ encryptions. In this section, we examine the practical security of 6-round AES and improve the best attack further. We use the partial sum technique but refrain from elaborating on it extensively; see [14] for the details. In summary, we obtain better complexities in data, time, and memory.
First of all, we introduce a small correction to the attack in [14]. The complexity of $2^{44}$ encryptions is given only for recovering the four bytes of one of the reverse diagonals of the round key $RK_6$ in the sixth round. However, the attack should be repeated four times to recover the whole round key in the last round. The fifth round key can be recovered much faster in the cases of 192-bit and 256-bit key lengths. So, the overall complexity should be $2^{46}$ rather than $2^{44}$. We correct this minor fault and amend the complexity in Table 1 accordingly.
The attack in [14] uses $6 \cdot 2^{32}$ CP. In this section, we improve the attack so that it uses only $2^{32}$ CP for AES-128 and is 8 times faster. For the other key lengths, the attack is 4 times faster and uses $2^{33}$ CP.
Let an oracle encrypt $2^{32}$ CP where the first diagonal ($P[0, 5, 10, 15]$) takes all the values and the other bytes are constant, and let it publish the corresponding ciphertexts.
We evaluate the S-box operations as $8 \times 32$-bit table lookups built from multiples of $SB^{-1}(x)$, to deal with the Galois field multiplication, and obtain the four bytes of the inverse MixColumn operation in one S-box call, where $\|$ denotes concatenation.
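One way to realize such an $8 \times 32$-bit lookup is sketched below in Python. The table packs the four products of $SB^{-1}(x)$ with the inverse MixColumn coefficients into one 32-bit word. The coefficient order (0E, 09, 0D, 0B) and all function names are assumptions made for illustration; the exact layout used in the attack is not recoverable from the text.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        carry = a & 0x80
        a = (a << 1) & 0xFF
        if carry:
            a ^= 0x1B
        b >>= 1
    return result

def gf_inv(a):
    """Inverse in GF(2^8) computed as a^254, so that 0 maps to 0."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result

def rotl8(x, k):
    """Rotate a byte left by k bits."""
    return ((x << k) | (x >> (8 - k))) & 0xFF

def aes_sbox(x):
    """AES S-box: field inversion followed by the affine map."""
    b = gf_inv(x)
    return b ^ rotl8(b, 1) ^ rotl8(b, 2) ^ rotl8(b, 3) ^ rotl8(b, 4) ^ 0x63

SBOX = [aes_sbox(x) for x in range(256)]
INV_SBOX = [0] * 256
for x, y in enumerate(SBOX):
    INV_SBOX[y] = x

COEFFS = (0x0E, 0x09, 0x0D, 0x0B)  # assumed order of the inverse MixColumn column

def build_table(coeffs=COEFFS):
    """T[x] = c0*SB^-1(x) || c1*SB^-1(x) || c2*SB^-1(x) || c3*SB^-1(x)."""
    table = []
    for x in range(256):
        y = INV_SBOX[x]
        word = 0
        for c in coeffs:
            word = (word << 8) | gf_mul(c, y)
        table.append(word)
    return table

T = build_table()
print(hex(T[0x7C]))  # SB^-1(0x7C) = 1, so the word shows the coefficients themselves
```

A single lookup in `T` then replaces one inverse S-box call and four field multiplications, which is the kind of saving the paragraph above describes.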
For each inverse diagonal, guess the 32-bit key in the sixth round and one byte of $MC^{-1}(RK_5)$ from the column that this 32-bit key affects after the $SR^{-1}$ operation. So, we can compute the corresponding byte at the end of the fourth round and then check whether the sum is zero over all the $2^{32}$ ciphertexts.
We exploit the zero-sum distinguisher as in [14] by using the partial sum technique. As one improvement, we use the zero-sum property not for only one byte in each column in $MC_4$, but for all the bytes, since all the 16 bytes of $MC_4$ are balanced. Hence, our distinguisher has a false alarm probability of $2^{-128}$ instead of $2^{-32}$ for each structure, which enables us to reduce the data by up to a factor of six for AES-128.
Guessing one key for the reverse diagonal $RK_6[0, 7, 10, 13]$, around one candidate for $MC^{-1}(RK_5)[0]$ will pass the zero-sum test taken over the $2^{32}$ ciphertexts. Similarly, one candidate for $MC^{-1}(RK_5)[4]$ passes the corresponding zero-sum test.
Let us store them in a memory, say $A_1$. Similarly, compute and store the candidates for
$(RK_6[1, 4, 11, 14], RK_5[1, 5, 9, 13])$, (15)
$(RK_6[2, 5, 8, 15], RK_5[2, 6, 10, 14])$, (16)
$(RK_6[3, 6, 9, 12], RK_5[3, 7, 11, 15])$ (17)
in $A_2$, $A_3$, and $A_4$, respectively. If we repeat the attack for the elements in the sets $A_i$, $i = 1, \ldots, 4$, by using another structure of $2^{32}$ CP, we have around one element in each set. So, we recover $RK_5$ and $RK_6$ by using two structures, without exploiting the key schedule. The attack uses $2^{33}$ CP instead of $6 \cdot 2^{32}$ CP.
Preparing each set $A_i$, $i = 1, 2, 3, 4$, costs around $2^{49}$ S-box operations through the partial sum. We have $2^{32}$ vectors for each reverse diagonal. Remove the vectors that appear an even number of times, and hence around $2^{31}$ will be left. The first step of the partial sum costs $2^{31} 2^{16} \cdot 2 = 2^{48}$ S-box operations, where we search only 2 bytes of the last round key in any reverse diagonal with 2 S-box operations. Then, there are around $2^{23}$ vectors left. The second step costs $2^{23} 2^{24} \cdot 1 = 2^{47}$ S-box operations and the number of vectors left is around $2^{15}$. The third step costs $2^{15} 2^{32} \cdot 1 = 2^{47}$ S-box operations again. The last step is run with only $2^7$ vectors and hence it also costs $2^7 2^{40} \cdot 1 = 2^{47}$ S-box operations. We use $SB^{-1}$ directly in this last step. Then we can check the zero-sum condition. The total complexity is around $2^{48} + 3 \cdot 2^{47} \approx 2^{49}$ S-box operations. We repeat this partial sum technique for the 3 other reverse diagonals. So, the overall complexity is around $2^{51}$ S-box operations, which is around $2^{43}$ encryptions if we assume $2^8$ S-box operations is roughly one encryption, as in [14]. We mount the attack once more for the other structure to eliminate almost all the elements in the sets. Hence, the time complexity is $2^{44}$ encryptions.
The memory complexity is $2^3 2^{32} = 2^{35}$ bytes for loading one set among $A_i$, $i = 1, 2, 3, 4$, and $2^{37}$ bytes for loading the ciphertexts. So, we need around $2^{37}$ bytes. Note that the memory for each set $A_i$ can be reused if we construct one set and then eliminate its elements by using the ciphertexts of the second structure before constructing the other sets.
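The cost tally of the four partial-sum steps can be checked with a few lines of Python. The step list below simply transcribes the figures given above (vectors remaining, key bits guessed so far, and S-box calls per vector); the variable names are illustrative.

```python
import math

# (log2 of stored vectors, key bits guessed so far, S-box calls per vector)
STEPS = [(31, 16, 2), (23, 24, 1), (15, 32, 1), (7, 40, 1)]

def partial_sum_cost_log2(steps=STEPS):
    """log2 of the total number of S-box operations for one reverse diagonal."""
    total = sum(calls * 2 ** (vectors + key_bits)
                for vectors, key_bits, calls in steps)
    return math.log2(total)

one_diagonal = partial_sum_cost_log2()    # 2^48 + 3 * 2^47
all_diagonals = one_diagonal + 2          # four reverse diagonals
encryptions = all_diagonals - 8           # 2^8 S-box operations ~ one encryption
two_structures = encryptions + 1          # attack repeated on the second structure

print(f"{one_diagonal:.2f} {all_diagonals:.2f} {encryptions:.2f} {two_structures:.2f}")
```

The exact exponents come out a fraction above the rounded values 49, 51, 43 and 44 used in the text.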
We can further improve the attack for AES-128 by exploiting the key schedule. Only one structure is enough in this case. Sort $A_1$ by $RK_5[12]$, $RK_6[7]$, and $RK_5[0] \oplus RK_6[0]$, because we can deduce these values from $A_4$ through the key schedule for AES-128.
Then, we have only one element in $A_3$ on average for each element in $A_1$ and in $A_4$. Hence, there are around $2^8$ elements left in $A_1$ and $A_3$ for one chosen element in $A_4$.
These equations give one element on average for each element in $A_1$, in $A_3$, and in $A_4$. Then, it is possible to check the candidate with the following 5 equations.
$RK_5[9] = RK_6[8] \oplus RK_6[9]$, (29)
$RK_5[13] = RK_6[12] \oplus RK_6[13]$,
$RK_6[4] \oplus RK_5[5] = RK_6[5]$, (31)
$RK_6[14] = RK_6[13] \oplus RK_5[14]$. (33)
A candidate satisfies these equations with a probability of $2^{-40}$. We have $2^8$ candidates for each element in $A_4$ in the other sets. So, altogether, we expect only one element to survive the last five equations, since we have $2^{40}$ candidates in all the sets and the probability that one candidate satisfies the five equations is $2^{-40}$. That is, there are around two candidates passing both the zero-sum check and the equations of the key schedule. One of them is the correct key, and it can be identified by a quick search.
The complexity of constructing the sets $A_i$ is around $2^{51}$ S-box operations, which is around $2^{43}$ encryptions. The key schedule utilization phase is much faster, because we test five equations in $A_i$ for each of the $2^{40}$ candidates, and most of them are already eliminated by Equation 29. So, the key schedule utilization phase consists of $2^{40}$ tests of equations like Equation 29, which is around $2^{40}$ byte-wise XOR operations. Similarly, eliminating vectors costs $2^{33}$ S-box and $3 \cdot 2^{32}$ XOR operations, $2^{40}$ S-box and $2^{42}$ XOR operations, and $2^{40}$ S-box and $2^{42}$ XOR operations in $A_1$, $A_3$, and $A_2$, respectively. It is clear that the cost of the key schedule utilization phase is much less than $2^{40}$ encryptions.
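The key schedule test amounts to a handful of byte-wise XOR comparisons per candidate. The Python sketch below encodes the four relations that are legible above; the fifth equation is not recoverable from the text and is therefore omitted. The round keys are treated as 16-byte lists indexed as in Figure 2, and the helper names and sample values are hypothetical.

```python
# Each tuple (a, b, c) encodes the relation RK6[a] == RK6[b] ^ RK5[c].
RELATIONS = [(9, 8, 9), (13, 12, 13), (5, 4, 5), (14, 13, 14)]

def passes_key_schedule(rk5, rk6, relations=RELATIONS):
    """Return True if the candidate pair (RK5, RK6) satisfies every relation."""
    return all(rk6[a] == rk6[b] ^ rk5[c] for a, b, c in relations)

def make_consistent(rk5, rk6, relations=RELATIONS):
    """Overwrite the dependent bytes of RK6 so that all relations hold (demo only)."""
    fixed = list(rk6)
    for a, b, c in relations:
        fixed[a] = fixed[b] ^ rk5[c]
    return fixed

# Arbitrary sample bytes, not real AES round keys.
rk5 = [(17 * i + 5) % 256 for i in range(16)]
rk6_guess = [(29 * i + 11) % 256 for i in range(16)]

good = make_consistent(rk5, rk6_guess)
bad = list(good)
bad[14] ^= 0x01   # flip one bit so that the last relation fails

print(passes_key_schedule(rk5, good), passes_key_schedule(rk5, bad))
```

Each relation holds for a random candidate with probability $2^{-8}$, so testing the first relation alone already discards almost all candidates, which is why this phase is cheap compared to building the sets $A_i$.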
In total, we improve the best attack on 6-round AES-128 by a factor of 6 in data usage, by a factor of 8 in time complexity, and by a factor of 2 in memory complexity.

IX. CONCLUSION AND DISCUSSION
Our study has taken a distinct and generic approach to the security analysis of SPN ciphers against ID attacks compared to typical cryptanalysis works, which simply aim to find the best attack on a specific cipher. We have categorized ID attacks on SPN ciphers into two distinct types: reciprocal ID attacks and nonreciprocal ID attacks. Moreover, we have proved lower bounds on the data complexity of a reciprocal ID attack on an SPN cipher. We have introduced a broad theoretical framework for a comprehensive understanding of the data requirements of ID attacks on SPN ciphers.
As an illustrative application of our theoretical insights on SPN ciphers, we have used our generic statements to prove security bounds for a widely recognized cipher, namely AES, against ID attacks. In particular, we have shown that a reciprocal ID attack on AES exploiting a 4-round conventional ID characteristic requires at least $2^{66}$ CP. Our conjecture is that all 4-round ID characteristics for AES are conventional, resulting in a requirement of $2^{66}$ chosen plaintexts for any reciprocal ID attack on six or more rounds of AES. We have also introduced a reciprocal ID attack on 6-round AES with $2^{66}$ data to show that the lower bound is almost sharp. On the other hand, we have given a counterexample showing that this security bound is not valid for nonreciprocal ID attacks, by mounting a nonreciprocal ID attack on 6-round AES-192 and AES-256 that requires only $2^{18}$ chosen plaintexts. However, its time complexity is marginal. Indeed, this is not a coincidence; we have proven that any nonreciprocal ID attack on AES exploiting a 4-round conventional ID characteristic has a time complexity of at least $2^{88}$ trials. Then, as a practical application, we have improved the integral attack through the partial sum technique in [14], thereby improving on a record that stood for 23 years. Our attack is the fastest against 6-round AES. The time, data, and memory complexities are improved by factors of 4, 3, and 3 (or 8, 6, and 2 for AES-128), respectively.
We think that applying the theoretical foundation established in this study could lead to several new results regarding ID attacks on other SPN ciphers. So, similar to proving the security bounds for AES, investigating the minimal data required for reciprocal ID attacks on other noteworthy SPN ciphers by using Theorem 2, Theorem 4, and Theorem 5 is a prospect for future research. Additionally, similar findings can be obtained for Feistel networks.

FIGURE 2. Word indices in a state.

$\left(\frac{k_i+k_f}{\log_2(e)P_iP_f}\right)^{1/3} \ge 1$, which simply implies the result. □
We can directly use Corollary 3 for AES with $n = 128$.
Corollary 4: Any typical reciprocal ID attack on AES with $U_i \ge 1$ has a data complexity of at least $2^{43}$ chosen plaintexts.

... $2^{-96}$ and $k_i + k_f \ge 128$ since 128 is the minimum key length of AES. So, $D_u \ge 2^{97}$. If $n_i = n_f = 12$ then the attack is slower than the exhaustive search for any key length. If $n_f \le 11$ then we need at least $2^{97+40} = 2^{137}$ pairs, which require at least $D \ge 2^{69}$ CP. Similarly, $D \ge 2^{69}$ CC if $n_i \le 11$. The cases $(m_i, m_f) = (3, \ge 3)$ or $(m_i, m_f) = (\ge 3, 3)$ require more data. For the last case, let $(m_i, m_f) = (4, 4)$. Then, P i P f ≤ 4 −128 and $k_i + k_f \ge 128$ since 128 is the minimum key length of AES. So, $D_u \ge 2^{129}$. Again we need more than $2^{66}$ data for a successful attack.

Theorem 8: Let a reciprocal ID attack make use of only four active bytes in the plaintext pairs, and let the MC operation in the first round produce only one active byte. Then the minimum data to eliminate all the subkeys involved is bounded below by $2^{62}$.
Proof: We have $n_i = 4$, $k_i = 32$ and $P_i = 2^{-22}$. Assume $n_f \ge 4$. Then $k_f \ge 32$. Hence, $\min\{n_i, n_f\} = 4$ and then $D \ge \sqrt{2^{129-32}\,\frac{32+32}{\log_2(e)2^{-22}}} \ge 2^{62}$ by Theorem 5. If $n_f < 4$ then $\min\{n_i, n_f\} \le 3$ and hence $D \ge \sqrt{2^{129-24}\,\frac{32}{\log_2(e)2^{-22}}} > 2$

TABLE 1. Attacks on 6-round AES with minimal data. Memory is in bytes.

TABLE 2. Some examples of reciprocal ID attacks on 7-round AES with $\min\{n_i, n_f\} = 4$. The data is given in CP. The lower bound in the last column is by Theorem 5.