Classification of 4-bit S-Boxes for BOGI Permutation

Bad Output must go to Good Input (BOGI) is the primary design strategy of GIFT, a lightweight block cipher that was presented at CHES 2017. Because this strategy obviates the need to adhere to the required conditions of S-boxes when adopting bit-permutation, cryptographic designers have more S-box choices. In this paper, we classify all 4-bit S-boxes that support BOGI, called “BOGI-applicable S-boxes,” and evaluate them in terms of cryptographic strength and efficiency. First, we exhaustively show that only 2,413 Permutation-XOR-Equivalence (PXE) classes over 4-bit S-boxes are BOGI-applicable. After refining the PXE classes with respect to the differential uniformity ($\mathcal{U}$) and linearity ($\mathcal{L}$), we suggest 20 “Optimal BOGI-applicable” PXE classes that provide the best ($\mathcal{U}$, $\mathcal{L}$). Our security evaluations revealed that all optimal BOGI-applicable S-boxes fulfill the security properties considered by the designers of GIFT and that the differences between them exist in the other properties. Moreover, we explore the resistance of GIFT variants against differential and linear cryptanalysis by replacing the existing S-box with other optimal BOGI-applicable S-boxes. Based on the results, we identify the best attainable resistance with the bit-permutation of GIFT-64. Lastly, we suggest notable S-boxes that support competitive performance, jointly considering the cryptographic strength and efficiency for the GIFT-64 and GIFT-128 structures, respectively.


I. INTRODUCTION
A large number of lightweight block ciphers adopt bit-permutation due to its negligible implementation cost in hardware. Among these block ciphers, GIFT presented in [1] outperforms the others with its state-of-the-art design approach. Thus, GIFT is widely used as the main primitive in multiple candidates of the NIST lightweight cryptography standardization [2]-[6]. The main novelty of GIFT is a logic named Bad Output must go to Good Input (BOGI). This logic prevents differential and linear trails consisting of only one active S-box in each round even though the round function is composed of a bit-permutation and an S-box whose differential and linear branch numbers are 2. As a result, this simple but effective idea enhances the design strategy of PRESENT [7] and allows GIFT to become faster and lighter. However, not every S-box can support BOGI because such BOGI-applicable S-boxes need to satisfy particular conditions on their Difference Distribution Table (DDT) and Linear Approximation Table (LAT). Indeed, it has already been shown that none of the "Optimal S-boxes" discussed in [8] is BOGI-applicable. This implies that related studies that concentrate only on the optimal S-boxes [9] may not be helpful for analyzing the BOGI design strategy thoroughly.
The associate editor coordinating the review of this manuscript and approving it for publication was Tony Thomas.
Various aspects of an S-box, which is the main nonlinear component of modern SPN ciphers, have been analyzed. These aspects are mainly related to security strength and efficiency, such as classifying a set of S-boxes in terms of the security strength requirements [8], [10], [11] or implementation cost [12], and finding the optimal implementation of a given S-box [13]-[15]. Because the search space of large S-boxes is infeasible, these studies tended to concentrate on 4-bit S-boxes. Moreover, for classification purposes, an appropriate equivalence relation must be introduced to keep the analysis efficient. The well-known relations used to group the $16!$ ($\approx 2^{44.25}$) 4-bit S-boxes into equivalence classes are the Affine-Equivalence (AE), Permutation-XOR-Equivalence (PXE), and Permutation-Equivalence (PE) relations. In two independent reports [16], [17], the numbers of AE, PXE, and PE classes over 4-bit S-boxes were deduced as 302, 142,090,700 ($\approx 2^{27.08}$), and 36,325,278,240 ($\approx 2^{35.08}$), respectively.
As BOGI-applicability is preserved in a PXE class, and it is feasible to analyze the number of PXE classes exhaustively, we present an in-depth analysis of all BOGI-applicable S-boxes. Our analysis includes the security strength of the S-boxes themselves and the extent to which they affect the resistance of GIFT against differential and linear cryptanalysis (DC and LC). We partitioned the BOGI-applicable PXE classes into PE classes to enable us to additionally analyze their implementation costs. Based on our results, we suggest generalized properties of BOGI-applicable S-boxes and notable S-boxes for GIFT.

A. OUR CONTRIBUTIONS

1) CLASSIFICATION OF ALL BOGI-APPLICABLE 4-BIT S-BOXES
We search for and identify all BOGI-applicable PXE classes and deduce the total number of BOGI-applicable 4-bit S-boxes. Our search showed that only 2,413 PXE classes (186,392,448 S-boxes) are BOGI-applicable over 4-bit S-boxes. We define BOGI-applicable 4-bit S-boxes that provide the best differential uniformity and linearity as "Optimal BOGI-applicable" 4-bit S-boxes, by analogy with the optimal 4-bit S-boxes discussed in [8]. Following the definition, we provide 20 optimal BOGI-applicable PXE classes out of the entire set of PXE classes.
Using the 20 optimal BOGI-applicable PXE classes, we conduct a detailed cryptographic analysis and generalize their security properties (Observations 1-5). Our investigations revealed that all optimal BOGI-applicable S-boxes fulfill the security properties considered by the designers of GIFT and that the differences among the PXE classes exist with respect to the other cryptographic properties.
We also explore the cost of implementing the optimal BOGI-applicable S-boxes in software and hardware, respectively. This was achieved by adopting two well-known measures: Bitslice Gate Complexity (BGC) and Gate Equivalent Complexity (GEC). The BGC of optimal BOGI-applicable S-boxes ranges from 10 to 13, whereas their GEC ranges from 16 to 21 with the UMC180nm cell library. Our result shows that the (BGC, GEC) value of the GIFT S-box is (11, 16), implying that there exist S-boxes that provide more efficient implementations in software. However, as the smallest BGC = 10 always leads the S-boxes to have fixed points, we can conclude that (11, 16) is the best implementation cost without fixed points.

2) SUGGESTION OF NOTABLE S-BOXES FOR GIFT
We jointly consider the implementation cost and the resistance against DC and LC, and suggest notable S-boxes for the GIFT structures. This is accomplished by deducing the best differential and linear trails when the existing S-box of GIFT is replaced with optimal BOGI-applicable S-boxes.
We first show that an exhaustive investigation of the resistance is possible by using only 1,728 non-DDT-equivalent S-boxes. Then, we deduce the 13-round best differential and linear trails of the 1,728 GIFT-64 variants, where only the existing S-box is replaced by one of the non-DDT-equivalent S-boxes. Our results show that the maximum differential probability and correlation potential at $\log_2$ scale can be improved to (−68.4, −72) or (−70, −68) from the current (−62, −68). Although both of the most improved DC and LC resistances cause the implementation cost to increase, we identify 128 (+80 with fixed points) notable S-boxes that support competitive DC and LC resistances within the same implementation cost as the GIFT S-box.
For the GIFT-128 structure, we only consider S-boxes already demonstrated to perform competitively in the GIFT-64 structure because of the computational cost. Our results show that the maximum differential probability and correlation potential of 12-round trails of GIFT-128 variants can be improved up to (−76.4, −74) from the current (−60.4, −72).

B. ORGANIZATION
Section II defines the notations used in this paper with brief explanations of BOGI and equivalence relations over S-boxes. In Section III, all BOGI-applicable S-boxes are classified and we define optimal BOGI-applicable S-boxes. In Section IV, we scrutinize cryptographic properties of optimal BOGI-applicable S-boxes and their implementation costs. In Section V, we investigate how the BOGI-applicable S-boxes influence the security of GIFT against DC and LC, followed by several competitive S-boxes compared to the existing S-box of GIFT. Lastly, the conclusion is given in Section VI. Some of our analysis results are presented in the Appendices.

A. NOTATIONS
In this paper, all the S-boxes we consider are 4-bit bijective functions. The following notations are used throughout the paper.

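· $wt(x)$ : The Hamming weight of a binary vector $x$.
· $x \cdot y$ : The inner product of $x$ and $y$ over $\mathbb{F}_2^4$.
· $DDT_S$ : The difference distribution table of an S-box $S$. The element $DDT_S(\Delta_i, \Delta_o)$ in row $\Delta_i$ and column $\Delta_o$ is $|\{x \in \mathbb{F}_2^4 \mid S(x) \oplus S(x \oplus \Delta_i) = \Delta_o\}|$.
· $LAT_S$ : The linear approximation table of an S-box $S$. The element $LAT_S(\lambda_i, \lambda_o)$ in row $\lambda_i$ and column $\lambda_o$ is $|\{x \in \mathbb{F}_2^4 \mid \lambda_i \cdot x = \lambda_o \cdot S(x)\}| - 8$.
· $SQLAT_S$ : The squared linear approximation table of an S-box $S$.

If the reference to S-box $S$ is clear from the context, we omit $S$ from the notations.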

B. PREVENTION OF CONSECUTIVE SINGLE ACTIVE BIT TRANSITIONS
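For differential and linear cryptanalysis, the differential uniformity [18] and linearity [19] of an S-box are considered to be the most basic but significant measures. The differential uniformity and linearity of an S-box $S$ are denoted by $\mathcal{U}(S)$ and $\mathcal{L}(S)$ and defined as:
$$\mathcal{U}(S) = \max_{\Delta_i \neq 0,\ \Delta_o} DDT_S(\Delta_i, \Delta_o), \qquad \mathcal{L}(S) = \max_{\lambda_i,\ \lambda_o \neq 0} 2\,|LAT_S(\lambda_i, \lambda_o)|.$$
It has already been shown that the smallest $\mathcal{U}$ and $\mathcal{L}$ of 4-bit S-boxes are 4 and 8, respectively. Based on the above properties, "Optimal 4-bit S-boxes" are defined as follows.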
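As a small illustration of these two measures, the following Python sketch builds the DDT and LAT of the published GIFT S-box and evaluates its differential uniformity and linearity. The function names are ours, and the linearity is taken as twice the largest absolute LAT entry, an assumption chosen so that the scale matches the stated minimum of 8 for 4-bit S-boxes.

```python
# Published 4-bit GIFT S-box (GS) as a lookup table.
GIFT_SBOX = [0x1, 0xA, 0x4, 0xC, 0x6, 0xF, 0x3, 0x9,
             0x2, 0xD, 0xB, 0x7, 0x5, 0x0, 0x8, 0xE]


def parity(value):
    """XOR of all bits of value, i.e. an inner product over F_2."""
    return bin(value).count("1") & 1


def ddt(sbox):
    """DDT[di][do] = number of x with S(x) ^ S(x ^ di) == do."""
    size = len(sbox)
    table = [[0] * size for _ in range(size)]
    for di in range(size):
        for x in range(size):
            table[di][sbox[x] ^ sbox[x ^ di]] += 1
    return table


def lat(sbox):
    """LAT[li][lo] = (number of x with li.x == lo.S(x)) - 8."""
    size = len(sbox)
    return [[sum(parity(li & x) == parity(lo & sbox[x]) for x in range(size))
             - size // 2
             for lo in range(size)] for li in range(size)]


def differential_uniformity(sbox):
    """Largest DDT entry over non-zero input differences."""
    return max(max(row) for row in ddt(sbox)[1:])


def linearity(sbox):
    """Largest 2*|LAT entry| over non-zero output masks (assumed scaling)."""
    table = lat(sbox)
    return max(2 * abs(row[lo]) for row in table for lo in range(1, len(sbox)))


print(differential_uniformity(GIFT_SBOX), linearity(GIFT_SBOX))
```

Under these assumptions the GIFT S-box evaluates to a differential uniformity of 6 and a linearity of 8.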

Definition 2 [8]: A 4-bit S-box $S$ is called an Optimal S-box if it fulfills these three conditions:
· $S$ is a bijection.
· $\mathcal{U}(S) = 4$.
· $\mathcal{L}(S) = 8$.
However, adopting an optimal S-box cannot always guarantee the optimal resistance of a cipher against DC and LC, especially when a bit-permutation is used for the permutation layer. Thus, Zhang et al. additionally considered the number of non-zero entries in $DDT^1$ and $LAT^1$ [9]. These non-zero entries represent single active bit transitions.
Definition 3 [9]: $CarD1_S$ and $CarL1_S$ of an S-box $S$ denote the number of non-zero entries in $DDT^1_S$ and $LAT^1_S$, respectively.
These counts matter because the single active bit differential (linear) transitions of an S-box may allow the differential (linear) trails of bit-permutation based ciphers to have only one active S-box in each round, which gives rise to a number of efficient trails for DC and LC. Indeed, such linear trails allow multidimensional linear cryptanalysis on PRESENT up to 26 rounds out of 31 rounds.
A direct mitigation for this weakness is to use an S-box that has no single active bit transitions in either $DDT^1$ or $LAT^1$ (i.e., $(CarD1, CarL1) = (0, 0)$). However, it has been shown that nonlinear 4-bit S-boxes cannot have such $(CarD1, CarL1)$ values [11].
The indirect mitigation presented in [9] prevents short iterative trails consisting of only single active bit transitions while allowing $(CarD1, CarL1) \neq (0, 0)$. This design approach alters PRESENT and SPONGENT-88 [20] by replacing their S-box, and improves the block ciphers in terms of the resistance against DC and LC. Moreover, RECTANGLE [21] is designed with this prevention strategy, and provides relatively robust resistance in spite of adopting an S-box with $(CarD1, CarL1) = (2, 2)$. Nonetheless, this design approach cannot fundamentally solve the problem of consecutive single active bit transitions appearing in trails. To be specific, RECTANGLE allows 2-round trails that consist of only single active bit transitions.
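The counts CarD1 and CarL1 can be illustrated with a short Python sketch. It assumes that $DDT^1$ and $LAT^1$ are the 4×4 sub-tables of the DDT and LAT restricted to inputs and outputs of Hamming weight 1, which is how the single active bit transitions are described above; the helper names are ours.

```python
# Published 4-bit GIFT S-box (GS) as a lookup table.
GIFT_SBOX = [0x1, 0xA, 0x4, 0xC, 0x6, 0xF, 0x3, 0x9,
             0x2, 0xD, 0xB, 0x7, 0x5, 0x0, 0x8, 0xE]
UNIT = [1, 2, 4, 8]  # the four 4-bit vectors of Hamming weight 1


def parity(value):
    """XOR of all bits of value, i.e. an inner product over F_2."""
    return bin(value).count("1") & 1


def ddt1(sbox):
    """Assumed DDT^1: DDT restricted to weight-1 input/output differences."""
    return [[sum(sbox[x] ^ sbox[x ^ di] == do for x in range(16))
             for do in UNIT] for di in UNIT]


def lat1(sbox):
    """Assumed LAT^1: LAT restricted to weight-1 input/output masks."""
    return [[sum(parity(li & x) == parity(lo & sbox[x]) for x in range(16)) - 8
             for lo in UNIT] for li in UNIT]


def count_nonzero(table):
    """Number of non-zero entries, i.e. single active bit transitions."""
    return sum(entry != 0 for row in table for entry in row)


car_d1 = count_nonzero(ddt1(GIFT_SBOX))
car_l1 = count_nonzero(lat1(GIFT_SBOX))
print(car_d1, car_l1)
```

Rows are indexed by the input bit and columns by the output bit, both in the order of `UNIT`.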

C. BOGI
BOGI was presented as the first design approach toward the fundamental prevention of consecutive single active bit transitions [1]. The approach exploits the structure of $DDT^1$ and $LAT^1$ to achieve this fundamental prevention. It was successfully applied to the bit-permutation based block cipher GIFT. Consequently, GIFT supports both DC and LC resistance with even fewer rounds than PRESENT.
Before describing BOGI, we first introduce the PRESENT round function, on which BOGI is based. The round function consists of a substitution layer composed of the same 4-bit S-box and a 64-bit permutation layer. The permutation layer is in turn composed of four independent 16-bit permutations and a nibble-wise permutation, followed by key addition. The round function, except for the key addition, is described in Fig. 1. $P^j_{mix}$ denote the four independent 16-bit permutations and $P_{shuf}$ denotes the nibble-wise permutation, while the S-boxes in the $i$-th round are denoted by $S^i_0, S^i_1, \ldots, S^i_{15}$. Each $P^j_{mix}$ can easily be deduced to be a 16-bit mapping acting on the outputs of four S-boxes.
Although the structure of the PRESENT round function provides full diffusion in 3 rounds, $P^j_{mix}$ of PRESENT has an Achilles' heel: it allows consecutive single active bit transitions in linear trails. This is because the PRESENT S-box has single active bit linear transitions. To overcome the weakness of $P^j_{mix}$ in PRESENT, BOGI properly crafts $P^j_{mix}$ to guarantee that an output bit of a single active bit transition (Bad Output) must go to an input bit of a single non-active bit transition (Good Input). However, to enable BOGI to be used, in addition to a well-crafted $P^j_{mix}$, an S-box has to have an appropriate BGT as presented in Lemma 1.

Lemma 1 [1]: To apply BOGI to an S-box $S$, the corresponding $BGT_S$ must contain at least four all-zero rows and columns in total.
If an S-box satisfies the above condition, we call the S-box BOGI-applicable. As seen in Table 1, the PRESENT S-box (PS) is not BOGI-applicable because $BGT_{PS}$ contains only one all-zero row and one all-zero column. On the contrary, $BGT_{GS}$ of the GIFT S-box (GS) has four all-zero rows and columns in total, and thus GS is BOGI-applicable.
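The check in Lemma 1 can be sketched in Python as follows. Two points are assumptions of this sketch rather than statements of the text: the BGT is taken to be the 4×4 table whose entry is non-zero whenever the corresponding single-bit transition is non-zero in $DDT^1$ or in $LAT^1$, and the lemma is read as requiring that the numbers of all-zero rows and all-zero columns sum to at least four. The S-box tables are the published GIFT and PRESENT S-boxes; the function names are ours.

```python
GIFT_SBOX = [0x1, 0xA, 0x4, 0xC, 0x6, 0xF, 0x3, 0x9,
             0x2, 0xD, 0xB, 0x7, 0x5, 0x0, 0x8, 0xE]
PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
UNIT = [1, 2, 4, 8]  # the four 4-bit vectors of Hamming weight 1


def parity(value):
    """XOR of all bits of value, i.e. an inner product over F_2."""
    return bin(value).count("1") & 1


def bgt(sbox):
    """Assumed BGT: 1 where the single-bit transition is active in DDT^1 or LAT^1."""
    table = []
    for bit_in in UNIT:
        row = []
        for bit_out in UNIT:
            diff = sum(sbox[x] ^ sbox[x ^ bit_in] == bit_out for x in range(16))
            corr = sum(parity(bit_in & x) == parity(bit_out & sbox[x])
                       for x in range(16)) - 8
            row.append(int(diff != 0 or corr != 0))
        table.append(row)
    return table


def bogi_spectrum(sbox):
    """Return (number of all-zero rows, number of all-zero columns) of the BGT."""
    table = bgt(sbox)
    zero_rows = sum(not any(row) for row in table)
    zero_cols = sum(not any(col) for col in zip(*table))
    return zero_rows, zero_cols


def is_bogi_applicable(sbox):
    """Lemma 1 as read here: at least four all-zero rows and columns in total."""
    zero_rows, zero_cols = bogi_spectrum(sbox)
    return zero_rows + zero_cols >= 4


print(bogi_spectrum(GIFT_SBOX), is_bogi_applicable(GIFT_SBOX))
print(bogi_spectrum(PRESENT_SBOX), is_bogi_applicable(PRESENT_SBOX))
```

With this reading, the sketch reproduces the comparison made in the text: the GIFT S-box passes the check and the PRESENT S-box does not.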

D. BOGI PERMUTATION
Once an S-box is BOGI-applicable, appropriate 16-bit permutations $P^j_{mix}$ can be deduced for the S-box. Hereafter, we assume that the four structures of $P^j_{mix}$ are equal to each other. This equality is not required to apply BOGI but may be preferred for implementation and design consistency. $P^j_{mix}$ consists of a group mapping ($\rho$) and four individual mappings ($\pi_k$) as presented in Fig. 2. We assume that the group mapping is $\rho$ of GIFT (GIFT-64 and GIFT-128 have the same $P^j_{mix}$). The group mapping $\rho$ ensures that the input bits in each S-box originate from four different S-boxes in the previous round. At the same time, the bit orders of the bad (B) and good (G) outputs of each S-box in the $i$-th round are preserved after passing through $\rho$. Considering the preserved orders, the individual mappings $\pi_k$ can be chosen to map the bad outputs to good inputs in the next round. For example, $P^j_{mix}$ of GIFT adopts identity mappings for $\pi_k$ as presented in Fig. 2.
In this case, it should be noted that B cannot propagate to B. Likewise, all of the individual mappings that do not produce B-B matches are BOGI permutations for an S-box. Note that such mappings do not exist for non-BOGI-applicable S-boxes.
The number of BOGI permutations of a BOGI-applicable S-box can be deduced by the BOGI-spectrum defined in Definition 4.

Definition 4: The BOGI-spectrum $BG(S)$ of an S-box $S$ denotes a tuple $(R_0, C_0)$, where $R_0$ and $C_0$ denote the number of all-zero row vectors and column vectors in $BGT_S$, respectively.
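How the BOGI-spectrum determines the number of BOGI permutations can be sketched by brute force. The Python snippet below treats an individual mapping as a bijection from the four output bit positions to the four input bit positions, takes the all-zero rows as good inputs and the non-zero columns as bad outputs, and counts the mappings that send every bad output to a good input. This modelling, the canonical choice of positions, and the function name are our assumptions.

```python
from itertools import permutations


def count_bogi_permutations(zero_rows, zero_cols, width=4):
    """Count bijections pi on bit positions with no bad-output to bad-input match.

    zero_rows good inputs and (width - zero_cols) bad outputs are placed on the
    first positions; the count does not depend on which positions are chosen.
    """
    good_inputs = set(range(zero_rows))
    bad_outputs = set(range(width - zero_cols))
    count = 0
    for pi in permutations(range(width)):  # pi[j]: input position fed by output j
        if all(pi[j] in good_inputs for j in bad_outputs):
            count += 1
    return count


print(count_bogi_permutations(2, 2))
print(count_bogi_permutations(1, 1))
```

For a spectrum of (2, 2) the count is four, and for (1, 1) no admissible mapping exists, in line with the remark that non-BOGI-applicable S-boxes have no BOGI permutation.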

E. EQUIVALENCE RELATIONS OVER S-BOX
Various equivalence relations are used when analyzing the S-boxes. The well-studied equivalence relations are XOR Equivalence (XE), PE, PXE, Linear Equivalence (LE), and CCZ relations. In this paper, we mainly deal with XE, PE, PXE, and AE relations over 4-bit invertible S-boxes.

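Definition 5: If an S-box $S'$ can be defined from $S$ as
$$S'(x) = S(x \oplus c_{in}) \oplus c_{out}$$
for some two vectors $c_{in}$ and $c_{out}$ over $\mathbb{F}_2^4$, $S$ and $S'$ are XOR equivalent (XE).

Definition 6: If an S-box $S'$ can be defined from $S$ as
$$S'(x) = P_{out} S(P_{in}(x))$$
for some two bit-permutation matrices $P_{in}$ and $P_{out}$ over $\mathbb{F}_2^{4 \times 4}$, $S$ and $S'$ are Permutation equivalent (PE).

Definition 7 [10]: If an S-box $S'$ can be defined from $S$ as
$$S'(x) = P_{out} S(P_{in}(x \oplus c_{in})) \oplus c_{out}$$
for some two bit-permutation matrices $P_{in}$ and $P_{out}$ over $\mathbb{F}_2^{4 \times 4}$, and two vectors $c_{in}$ and $c_{out}$ over $\mathbb{F}_2^4$, $S$ and $S'$ are Permutation-XOR equivalent (PXE).

Definition 8: If an S-box $S'$ can be defined from $S$ as
$$S'(x) = L_{out} S(L_{in}(x \oplus c_{in})) \oplus c_{out}$$
for some two non-singular matrices $L_{in}$ and $L_{out}$ over $\mathbb{F}_2^{4 \times 4}$, and two vectors $c_{in}$ and $c_{out}$ over $\mathbb{F}_2^4$, $S$ and $S'$ are Affine equivalent (AE).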
The above equivalence relations enable the 4-bit S-boxes to be grouped into equivalence classes, such as the XE, PE, PXE, and AE classes. An algorithm to search all PXE classes over 4-bit S-boxes was first presented by Saarinen [10]. This algorithm was subsequently improved, and it was proved that the number of 4-bit S-boxes in every PXE class is a multiple of 384 [11]. The improved algorithm can provide a representative of each PXE class and the class sizes within several minutes.

III. CLASSIFICATION OF BOGI-APPLICABLE S-BOXES
In this section, we identify all BOGI-applicable S-boxes and classify them according to their differential uniformity and linearity. Because BOGI-applicability is invariant in a PXE class as presented in Proposition 1, we only check the BOGI-applicability of the representatives of the 142,090,700 PXE classes to investigate the BOGI-applicability of all 4-bit S-boxes.
Proposition 1 [1]: In a PXE class, BOGI-applicability is preserved. To be specific, if $S$ is BOGI-applicable, $S'(x) = P_{out} S(P_{in}(x \oplus c_{in})) \oplus c_{out}$ is also BOGI-applicable for all bit-permutation matrices $P_{in}$, $P_{out}$ over $\mathbb{F}_2^{4 \times 4}$ and vectors $c_{in}$, $c_{out}$ over $\mathbb{F}_2^4$.
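Proposition 1 can be checked empirically on a sample of PXE-equivalent S-boxes. The Python sketch below applies every pair of bit permutations, together with a few arbitrarily chosen XOR constants, to the published GIFT S-box and collects the resulting BOGI-spectra. As assumptions of this sketch, the BGT is modelled as non-zero wherever $DDT^1$ or $LAT^1$ is non-zero, and the helper names and the sampled constants are ours.

```python
from itertools import permutations

GIFT_SBOX = [0x1, 0xA, 0x4, 0xC, 0x6, 0xF, 0x3, 0x9,
             0x2, 0xD, 0xB, 0x7, 0x5, 0x0, 0x8, 0xE]
UNIT = [1, 2, 4, 8]  # the four 4-bit vectors of Hamming weight 1


def parity(value):
    """XOR of all bits of value, i.e. an inner product over F_2."""
    return bin(value).count("1") & 1


def bogi_spectrum(sbox):
    """(all-zero rows, all-zero columns) of the assumed BGT."""
    table = [[int(any(sbox[x] ^ sbox[x ^ bi] == bo for x in range(16))
                  or sum(parity(bi & x) == parity(bo & sbox[x])
                         for x in range(16)) != 8)
              for bo in UNIT] for bi in UNIT]
    return (sum(not any(row) for row in table),
            sum(not any(col) for col in zip(*table)))


def permute_bits(value, perm):
    """Bit permutation: bit i of value is moved to bit perm[i]."""
    return sum(((value // 2 ** i) % 2) * 2 ** perm[i] for i in range(4))


def pxe_variant(sbox, p_in, p_out, c_in, c_out):
    """S'(x) = P_out S(P_in(x ^ c_in)) ^ c_out."""
    return [permute_bits(sbox[permute_bits(x ^ c_in, p_in)], p_out) ^ c_out
            for x in range(16)]


spectra = {bogi_spectrum(pxe_variant(GIFT_SBOX, p_in, p_out, c_in, c_out))
           for p_in in permutations(range(4))
           for p_out in permutations(range(4))
           for c_in, c_out in [(0x0, 0x0), (0x5, 0xA), (0xF, 0x3)]}
print(spectra)
```

A single spectrum in the resulting set is what the proposition predicts: bit permutations only reorder the rows and columns of the single-bit tables, and XOR constants change at most the signs of LAT entries.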

A. DISTRIBUTION OF BOGI-APPLICABLE PXE CLASSES
Checking the BOGI-applicability of all 4-bit PXE classes required approximately 6 hours with our single-threaded program. The result yielded only 2,413 BOGI-applicable PXE classes. We briefly categorized them according to their cryptographic strength by introducing differential uniformity and linearity. Note that these cryptographic properties are also invariant in the PXE class. Table 2 provides the distribution of the BOGI-applicable PXE classes.
Because each PXE class has a distinct size, the distribution of the 4-bit S-boxes differs from that of the PXE classes. Table 3 provides the distribution of BOGI-applicable 4-bit S-boxes.
As already shown in [1], BOGI-applicable PXE classes (S-boxes) that support $\mathcal{U} \leq 4$ do not exist. Therefore, $(\mathcal{U}, \mathcal{L}) = (6, 8)$ can be concluded to be the optimal choice. Based on the optimal choice, we define optimal BOGI-applicable S-boxes.
Definition 9: A 4-bit S-box $S$ is called an optimal BOGI-applicable S-box if it fulfills these four conditions:
· $S$ is a bijection.
· $S$ is BOGI-applicable.
· $\mathcal{U}(S) = 6$.
· $\mathcal{L}(S) = 8$.
There exist only 20 optimal BOGI-applicable PXE classes. Table 4 presents details of all the optimal BOGI-applicable PXE classes. The inverse relations in Table 4 show that none of the optimal BOGI-applicable S-boxes is self-permutation-XOR equivalent (i.e., PXE to its own inverse), from which it can be concluded that they are unable to support involution (self-inverse).

B. BOGI SPECTRUM OF BOGI-APPLICABLE S-BOXES
As already mentioned in subsection II-D, the BOGI-spectrum $BG$ of a BOGI-applicable S-box gives an insight into the BOGI permutations of the S-box. In particular, the number of BOGI permutations can easily be deduced from $BG$. The distribution of $BG$ of BOGI-applicable PXE classes is presented in Table 5. These results lead to the conclusion that $BG$ of optimal BOGI-applicable 4-bit S-boxes is always (2, 2), and that the number of BOGI permutations for optimal BOGI-applicable S-boxes is always 4.
Observation 2: All optimal BOGI-applicable 4-bit S-boxes have $BG = (2, 2)$. This implies that there exist only four distinct BOGI permutations ($\pi$) for each optimal BOGI-applicable 4-bit S-box.

C. BOGI-APPLICABLE S-BOXES THAT FULFILL THE CRITERIA OF THE GIFT DESIGNERS
In this subsection, we traverse every BOGI-applicable S-box satisfying the criteria suggested in [1]. Apart from implementation considerations, the following conditions are considered by the GIFT designers.
· Condition 1 (GC1) : An S-box $S$ is BOGI-applicable.
Table 6 presents the distribution of PXE classes that fulfill all the conditions GC1-GC3. Only 363 PXE classes (43,118,592 S-boxes) satisfy all three conditions. Note that all optimal BOGI-applicable S-boxes satisfy the conditions.
Observation 3: All optimal BOGI-applicable S-boxes satisfy the conditions GC1-GC3.
This observation implies that the optimal BOGI-applicable S-boxes cannot be distinguished from one another by the design criteria previously considered by the GIFT designers. However, as we show in the following section, the differences between optimal BOGI-applicable and non-optimal BOGI-applicable S-boxes manifest themselves in the other security properties and the implementation cost.

IV. EVALUATIONS OF OPTIMAL BOGI-APPLICABLE S-BOXES
In this section, we evaluate optimal BOGI-applicable S-boxes in terms of their cryptographic strength and implementation cost in more detail than the criteria considered by the GIFT designers. Hereafter, we denote each of the 20 optimal BOGI-applicable PXE classes in Table 4 as $B_i$ with the corresponding index $i$. Although most of the cryptographic properties we evaluate are preserved in a PXE class, we sometimes partition each PXE class into the corresponding PE or XE classes for the properties that are not preserved in a PXE class. To be specific, optimal BOGI-applicable PE classes are discussed for the implementation cost in subsection IV-B, whereas the XE classes are discussed in Section V.

A. SECURITY EVALUATIONS
In this subsection, we consider security properties of the 20 optimal BOGI-applicable PXE classes additional to those that were considered by GIFT designers. Although all optimal BOGI-applicable PXE classes satisfy GIFT designers' criteria as shown in Observation 3, our extra security evaluations present some differences between the PXE classes.

1) OPTIMAL BOGI-APPLICABLE AE CLASSES
We first compute the AE classes that include optimal BOGI-applicable PXE classes and present the results in Table 7. We refer to [22] for the index of each AE class. Only four distinct AE classes include optimal BOGI-applicable PXE classes (five each). Note that BOGI-applicability is not preserved in an AE class. For example, the S-boxes that are included in the 25th AE class but not included in $B_0$, $B_2$, $B_4$, $B_6$, or $B_8$ are not BOGI-applicable.

2) DIFFERENTIAL SPECTRUM AND WALSH SPECTRUM
The differential spectrum of an S-box is related to GC2. In addition to the frequency of the differential uniformity in the DDT, the frequency of the other values in the DDT may affect the resistance against DC. Thus, the differential spectrum could be of interest. The differential spectrum $D_{spec}$ is defined as follows.
Definition 10 [23], [24]: The differential spectrum of an S-box $S : \mathbb{F}_2^n \to \mathbb{F}_2^m$ is the multiset:
$$D_{spec}(S) = \{\, DDT_S(\Delta_i, \Delta_o) \mid \Delta_i \in \mathbb{F}_2^n \setminus \{0\},\ \Delta_o \in \mathbb{F}_2^m \,\}.$$
Similarly, the Walsh spectrum of an S-box could be of interest. The Walsh spectrum was not included in the three primary conditions (GC1-GC3). However, the frequency of maximal values may also affect the resistance against LC. In consideration thereof, we evaluate the extended Walsh spectrum $|L|_{spec}$. The (extended) Walsh spectrum of a Boolean function can be generalized for an S-box as follows.
Definition 11 [23]: The Walsh spectrum of an S-box $S : \mathbb{F}_2^n \to \mathbb{F}_2^m$ is the multiset:
$$L_{spec}(S) = \Big\{ \sum_{x \in \mathbb{F}_2^n} (-1)^{\lambda_i \cdot x \,\oplus\, \lambda_o \cdot S(x)} \ \Big|\ \lambda_i \in \mathbb{F}_2^n,\ \lambda_o \in \mathbb{F}_2^m \setminus \{0\} \Big\}.$$
Moreover, the extended Walsh spectrum $|L|_{spec}(S)$ of an S-box is defined as the multiset of the absolute values in $L_{spec}(S)$.
Because $D_{spec}$ and $|L|_{spec}$ are invariant in an AE class, we only deduce them for the four AE classes in Table 7. Surprisingly, these four AE classes have the same differential and extended Walsh spectra, as stated in Observation 4.
Observation 4: All optimal BOGI-applicable S-boxes have the same differential spectrum $D_{spec}$ and extended Walsh spectrum $|L|_{spec}$.
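To make the two spectra concrete, the following Python sketch evaluates them for one optimal BOGI-applicable S-box, the published GIFT S-box. As assumptions of this sketch, the differential spectrum is collected over all non-zero input differences and all output differences, and the Walsh coefficient is taken as the signed sum $\sum_x (-1)^{\lambda_i \cdot x \oplus \lambda_o \cdot S(x)}$ over all input masks and non-zero output masks; the function names are ours.

```python
from collections import Counter

GIFT_SBOX = [0x1, 0xA, 0x4, 0xC, 0x6, 0xF, 0x3, 0x9,
             0x2, 0xD, 0xB, 0x7, 0x5, 0x0, 0x8, 0xE]


def parity(value):
    """XOR of all bits of value, i.e. an inner product over F_2."""
    return bin(value).count("1") & 1


def differential_spectrum(sbox):
    """Multiset of DDT entries over non-zero input differences."""
    spectrum = Counter()
    for di in range(1, 16):
        row = Counter(sbox[x] ^ sbox[x ^ di] for x in range(16))
        for do in range(16):
            spectrum[row[do]] += 1
    return spectrum


def extended_walsh_spectrum(sbox):
    """Multiset of |Walsh coefficients| over non-zero output masks."""
    spectrum = Counter()
    for li in range(16):
        for lo in range(1, 16):
            coeff = sum(1 - 2 * (parity(li & x) ^ parity(lo & sbox[x]))
                        for x in range(16))
            spectrum[abs(coeff)] += 1
    return spectrum


d_spec = differential_spectrum(GIFT_SBOX)
l_spec = extended_walsh_spectrum(GIFT_SBOX)
print(sorted(d_spec.items()))
print(sorted(l_spec.items()))
```

Each `Counter` maps a table value to the number of times it occurs, so the largest key of `d_spec` is the differential uniformity and the largest key of `l_spec` is the linearity.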

3) ALGEBRAIC DEGREE OF COMPONENT BOOLEAN FUNCTIONS
The algebraic degree, $\deg(f)$, of a Boolean function $f$ is the degree of the maximum term in the corresponding algebraic normal form. The algebraic degree for an S-box (vectorial Boolean function) $S$ can be generalized as follows:
$$\deg(S) = \max_{a \neq 0} \deg(S_a),$$
where $S_a = a \cdot S$. In addition to the degree of the maximum term, the following multiset could be of interest: $\deg_{spec}(S) = \{\deg(S_a) \mid a \in \mathbb{F}_2^n \setminus \{0\}\}$. Because $\deg_{spec}$ is invariant under affine equivalence, we again utilize the results in Table 7 to investigate $\deg_{spec}$ of all optimal BOGI-applicable S-boxes. The evaluations show that every optimal BOGI-applicable S-box has the same algebraic degree spectrum, as presented in Observation 5. This also implies that all optimal BOGI-applicable S-boxes have an algebraic degree of 3.
Clearly, at least two of the coordinate Boolean functions ($S_{e_i}$ for a unit vector $e_i \in \mathbb{F}_2^4$) have an algebraic degree of 3, by the same argument as Theorem 3 of [9]. This means that at least two non-zero entries are present in $LAT^1$ (i.e., $CarL1 \geq 2$), which corresponds to our results in Table 9.
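The degree spectrum can be computed from the algebraic normal form of each component function. The Python sketch below does this for the published GIFT S-box using the standard binary Möbius transform; it is only an illustration for one S-box, and the function names are ours.

```python
from collections import Counter

GIFT_SBOX = [0x1, 0xA, 0x4, 0xC, 0x6, 0xF, 0x3, 0x9,
             0x2, 0xD, 0xB, 0x7, 0x5, 0x0, 0x8, 0xE]


def parity(value):
    """XOR of all bits of value, i.e. an inner product over F_2."""
    return bin(value).count("1") & 1


def anf(truth_table):
    """Binary Moebius transform of a 4-variable truth table.

    coeffs[m] is the coefficient of the monomial whose variables are the
    bits set in m.
    """
    coeffs = list(truth_table)
    for bit in range(4):
        step = 2 ** bit
        for index in range(16):
            if index & step:
                coeffs[index] ^= coeffs[index ^ step]
    return coeffs


def degree(truth_table):
    """Algebraic degree: largest monomial size with a non-zero coefficient."""
    return max((bin(m).count("1") for m, c in enumerate(anf(truth_table)) if c),
               default=0)


def degree_spectrum(sbox):
    """Degrees of all 15 non-zero component functions a.S."""
    return Counter(degree([parity(a & sbox[x]) for x in range(16)])
                   for a in range(1, 16))


coordinate_degrees = [degree([parity(e & GIFT_SBOX[x]) for x in range(16)])
                      for e in (1, 2, 4, 8)]
print(coordinate_degrees, degree_spectrum(GIFT_SBOX))
```

Here `coordinate_degrees` lists the degrees of the four coordinate functions, from the least significant output bit to the most significant one.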

4) HAMMING WEIGHT ON THE SUB-OPTIMAL DIFFERENTIAL TRANSITION
Related to GC3, $wt(\Delta_i) + wt(\Delta_o)$ for $DDT(\Delta_i, \Delta_o) = 6$ should be considered to reduce the occurrence of sub-optimal differential transitions in a differential trail. As the Hamming weights are preserved in a PXE class, we compute the following multiset:
$$W_{D6} = \{\, wt(\Delta_i) + wt(\Delta_o) \mid DDT(\Delta_i, \Delta_o) = 6 \,\}.$$
Table 8 presents the value of $W_{D6}$ of each of the optimal BOGI-applicable PXE classes.
All optimal BOGI-applicable PXE classes have only two entries with the differential uniformity 6, as presented in Observation 4. One can notice that there exist better PXE classes than $B_0$ and $B_1$, which include the GIFT S-box and its inverse S-box, respectively. Indeed, $W_{D6}$ of $B_4$ to $B_7$ and $B_{14}$ to $B_{17}$ is $\{4, 5\}$, and $W_{D6}$ of $B_8$, $B_9$, $B_{18}$, and $B_{19}$ is $\{5, 5\}$. Thus, the GIFT S-box may not be the best choice with respect to the prevention of sub-optimal differential transitions in a trail.
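The multiset of Hamming-weight sums can be read directly off the DDT. The Python sketch below lists, for the published GIFT S-box, every transition whose DDT entry equals 6 together with the sum of the weights of its input and output differences; the function name is ours.

```python
GIFT_SBOX = [0x1, 0xA, 0x4, 0xC, 0x6, 0xF, 0x3, 0x9,
             0x2, 0xD, 0xB, 0x7, 0x5, 0x0, 0x8, 0xE]


def weight(value):
    """Hamming weight of a 4-bit vector."""
    return bin(value).count("1")


def weight_sums_at(sbox, target):
    """List (di, do, wt(di) + wt(do)) for every DDT entry equal to target."""
    found = []
    for di in range(1, 16):
        for do in range(16):
            count = sum(sbox[x] ^ sbox[x ^ di] == do for x in range(16))
            if count == target:
                found.append((di, do, weight(di) + weight(do)))
    return found


transitions = weight_sums_at(GIFT_SBOX, 6)
w_d6 = sorted(total for _, _, total in transitions)
print(transitions, w_d6)
```

For the GIFT S-box both sub-optimal transitions have a weight sum of 4, which is lower than the sums {4, 5} and {5, 5} quoted for the other classes.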

B. IMPLEMENTATION EVALUATIONS
In this subsection, we evaluate the implementation of optimal BOGI-applicable S-boxes with Peigen [14], which is based on LIGHTER [15]. We deduce both the software- and hardware-oriented implementations. This enables us to jointly consider the software and hardware efficiency of the optimal BOGI-applicable S-boxes.

1) IMPLEMENTATION SEARCHING TOOL -PEIGEN
The implementation searching tool Peigen (or LIGHTER) can find an efficient (not always the best) implementation of a given S-box within a set of invertible instructions, denoted by $B$. Such implementations are called $B$-implementations. The searching method is based on a bi-directional Dijkstra algorithm, and expands the two subgraphs until the predetermined expansion limit is reached (or until a proper stopping rule is satisfied). The expansion limit ($\lambda$ in [15] and "-l" in the corresponding tool) determines whether the obtained implementation is the best or not. We set an expansion limit that guarantees all the implementations we obtain are the best $B$-implementations. By tweaking the instruction set $B$ and the corresponding costs, one can obtain the implementations of S-boxes for different environments. For more details, refer to [14], [15].

2) OPTIMAL BOGI-APPLICABLE PE CLASSES
Because the implementation costs are not invariant under the PXE relation, we first partition each of the optimal BOGI-applicable PXE classes into the corresponding PE classes. In total, the PE classes amount to 4,608. Moreover, as an S-box and the corresponding inverse S-box have exactly the same implementation complexity because of the way Peigen searches, we only consider half of the entire number of PE classes (i.e., 2,304 PE classes).

3) SOFTWARE-ORIENTED IMPLEMENTATION
The complexity of the software implementation is mainly measured by using BGC [13]. Because BGC denotes the number of atomic operations used in the implementation, BGC directly determines the required cycle number and code size for the bit-slice implementation of an S-box. The set $B$ includes invertible instructions that are constructed from the software operations {AND, XOR, OR, NOT, ANDN}. As the invertible instructions are constructed from at most three of the software operations, $B$ includes instructions whose cost ranges from 1 to 3. We specify an expansion limit of 8, which implies each subgraph can be expanded until its size becomes 11 (= 8 + 3). All the implementation costs we obtain are smaller than 15 (= 2 × (8 + 1) − 3); thus, our obtained implementations can be proved to be the best $B$-implementations.

4) HARDWARE-ORIENTED IMPLEMENTATION
As a measure of the complexity of hardware implementation, GEC is mainly used. GEC denotes the logic size of an implementation and may be affected by the gates to be used. We restrict the logic gates to those supported by the UMC180nm cell library to construct the hardware instruction set $B$. Each of the available gates and costs can be found in [14], and the cost of the invertible instructions in $B$ ranges from 0.67 to 5. Searching tends to be more intensive than finding the software implementation because $B$ becomes bigger. We apply an expansion limit of 13, which means each subgraph can be expanded until its size becomes 18 (= 13 + 5). As the implementation costs we obtain are smaller than 22.34 (= 2 × (13 + 0.67) − 5), all the implementations are proved to be the best $B$-implementations.
The (BGC, GEC) of the GIFT S-box is (11, 16), and 18 PE classes provide the same implementation cost. Table 10 shows the best implementation costs of the 4,608 optimal BOGI-applicable PE classes. The best options for BGC and GEC are 10 and 16, respectively. However, both of the minimum costs cannot be provided at the same time. Thus, two options would be the best for (BGC, GEC): (11, 16) and (10, 16.33).

5) TRADE-OFF BETWEEN SOFTWARE AND HARDWARE-ORIENTED IMPLEMENTATIONS
Considering that the smallest value of GEC is only possible with the option (11, 16), which is the (BGC, GEC) of the GIFT S-box, this S-box is indeed the best option in hardware-oriented designs. Moreover, our analysis shows that all the S-boxes whose BGC is 10 have at least one fixed point, while the GIFT S-box does not have any fixed points. A fixed point may cause the entire block cipher to be vulnerable to Invariant Attacks [25]-[27]. Although the weakness can be mitigated by using proper round constants as presented in [28], providing an appropriate round constant could burden designers even further. As a result, we conclude that (11, 16) is the best cost for implementing an optimal BOGI-applicable S-box.
Appendix VI lists the available implementation options in each optimal BOGI-applicable PXE class. The best options are supported only by $B_0$, $B_1$, $B_2$, and $B_3$.

V. NOTABLE S-BOXES FOR GIFT
In this section, we investigate competitive S-boxes compared to the existing S-box for GIFT. To do so, we check the probability of the best differential/linear trails replacing the existing S-box while fixing the diffusion layer as the bit-permutation of GIFT. We apply all optimal BOGI-applicable S-boxes that are available for the bit-permutation of GIFT-64. For GIFT-128, we only consider the promising S-boxes in GIFT-64 instead of all the S-boxes.
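Before starting this section, we define $(DR_i, LR_i)_{\gamma}$ in order to measure the resistance: $DR_i$ and $LR_i$ denote the maximum differential probability and the maximum correlation potential, at $\log_2$ scale, of the $i$-round trails of a cipher with a $\gamma$-bit block. Based on the results of $(DR_i, LR_i)_{\gamma}$, the minimum required number of rounds $r_{min}$ for the resistance against DC and LC can also be obtained. $r_{min}$ is defined as follows:
$$r_{min} = \min\{\, i \mid DR_i \leq -\gamma \ \text{and}\ LR_i \leq -\gamma \,\}.$$
Note that if fewer rounds than $r_{min}$ were used, the corresponding cipher would obviously allow DC or LC.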
The best differential/linear trails of GIFT-64 are investigated in [29]. The result showed that r min of GIFT-64 is 14.

A. OPTIMAL BOGI-APPLICABLE XE CLASSES
Because the trail search is computationally intensive, we decrease the search space by introducing the XE relation. S-boxes that are included in an XE class have the same $(DR_i, LR_i)_{\gamma}$ because the corresponding $DDT_S$ and $SQLAT_S$ are invariant in an XE class. Moreover, $SQLAT_S$ can be deduced from $DDT_S$ with the bijective Walsh transform. As a result, only XE classes whose DDTs are distinct need to be considered for trail searching. According to our results, 10,368 XE classes are included in the optimal BOGI-applicable PXE classes, and all optimal BOGI-applicable XE classes have distinct DDTs, and thus distinct SQLATs as well. However, some of the XE classes cannot interplay with the original bit-permutation of GIFT (i.e., a B-B match occurs). This is because the mapping $\pi_k$ of GIFT is the identity. Among all the XE classes, only 1,728 XE classes can interplay with the GIFT bit-permutation. Let $XE_I$ classes denote the corresponding XE classes whose DDT can adopt the identity as the BOGI permutation. Now, we can consider only the 1,728 $XE_I$ classes for every variant of GIFT-64. It should be noted that the considered number is the same for GIFT-128 variants.
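The link between the DDT and the squared linear table can be illustrated numerically. The Python sketch below applies a two-dimensional Walsh transform to the DDT of the published GIFT S-box and compares the result with the squared Walsh coefficients, which equal $(2 \cdot LAT_S)^2$ when the LAT is defined as a count minus 8. The normalisation is an assumption of this sketch, since the exact scaling of SQLAT is not reproduced here; the sketch also checks that an XOR-equivalent S-box, built with arbitrarily chosen constants, has the same DDT.

```python
GIFT_SBOX = [0x1, 0xA, 0x4, 0xC, 0x6, 0xF, 0x3, 0x9,
             0x2, 0xD, 0xB, 0x7, 0x5, 0x0, 0x8, 0xE]


def parity(value):
    """XOR of all bits of value, i.e. an inner product over F_2."""
    return bin(value).count("1") & 1


def ddt(sbox):
    """DDT[di][do] = number of x with S(x) ^ S(x ^ di) == do."""
    table = [[0] * 16 for _ in range(16)]
    for di in range(16):
        for x in range(16):
            table[di][sbox[x] ^ sbox[x ^ di]] += 1
    return table


def lat(sbox):
    """LAT[li][lo] = (number of x with li.x == lo.S(x)) - 8."""
    return [[sum(parity(li & x) == parity(lo & sbox[x]) for x in range(16)) - 8
             for lo in range(16)] for li in range(16)]


def walsh_of_ddt(table, li, lo):
    """Sum over (di, do) of (-1)^(li.di ^ lo.do) * DDT[di][do]."""
    return sum((1 - 2 * (parity(li & di) ^ parity(lo & do))) * table[di][do]
               for di in range(16) for do in range(16))


gift_ddt = ddt(GIFT_SBOX)
gift_lat = lat(GIFT_SBOX)
matches = all(walsh_of_ddt(gift_ddt, li, lo) == (2 * gift_lat[li][lo]) ** 2
              for li in range(16) for lo in range(16))

# An XE variant S'(x) = S(x ^ c_in) ^ c_out with arbitrarily chosen constants.
xe_variant = [GIFT_SBOX[x ^ 0x9] ^ 0x6 for x in range(16)]
same_ddt = ddt(xe_variant) == gift_ddt
print(matches, same_ddt)
```

Because the transform is invertible, S-boxes with identical DDTs also share their squared linear tables, which is why distinct DDTs suffice to enumerate the cases for the trail search.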

B. RESISTANCE OF GIFT-64 VARIANTS AGAINST DC AND LC
As mentioned above, r min of GIFT-64 is 14. Thus, we focus on 13-round trails to check whether a smaller r min ≤ 13 is possible with other BOGI-applicable S-boxes. We searched for the best trails with the Branch&Bound technique presented in [31], which is relatively fast for bit-permutation-based SPN ciphers. Table 11 shows (DR 13 , LR 13 ) 64 obtained with the 1,728 XE I classes. One can easily see that 192 XE I classes provide both DR 13 ≤ −64 and LR 13 ≤ −64 despite using only 13 rounds (i.e., r min = 13). Since there is a trade-off between DR 13 and LR 13 , one can choose the two best options as (−68.4, −72) and (−70, −68).
Appendix B shows the values of (DR 13 , LR 13 ) 64 that each PXE class can support. Only B 4,5,6,7,12,13,16,17 have r min = 13, and among them the two best options are provided by B 4,5,12,13 . Because the bit-permutation of GIFT-64 is not involutory, an S-box S and its inverse S −1 do not always have the same (DR 13 , LR 13 ) 64 . However, despite this asymmetry, a PXE class and its inverse PXE class have the same results of (DR 13 , LR 13 ) 64 on the whole.
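The trade-off between DR 13 and LR 13 can be made concrete with a small sketch that keeps only the pairs reaching the 64-bit threshold and then discards every pair that another pair beats in both coordinates. The input list contains only the (DR 13 , LR 13 ) 64 pairs quoted in this section, not the full contents of Table 11, and the helper names are hypothetical.

```python
def dominates(p, q):
    """True if p is at least as resistant as q in both DR and LR and p != q.

    Values are log2 probabilities, so lower means more resistant.
    """
    return p[0] <= q[0] and p[1] <= q[1] and p != q


def best_options(pairs, gamma):
    """Pairs with DR, LR <= -gamma that no other qualifying pair dominates."""
    ok = [p for p in pairs if p[0] <= -gamma and p[1] <= -gamma]
    return sorted(p for p in ok if not any(dominates(q, p) for q in ok))


# (DR_13, LR_13)_64 pairs quoted in this section only.
quoted = [(-62.0, -68.0), (-62.8, -70.0), (-68.4, -72.0), (-70.0, -68.0)]
print(best_options(quoted, 64))
```

Neither (−68.4, −72) nor (−70, −68) dominates the other, which is why both remain as candidate choices.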

C. XE-PE INTERSECTION IN A PXE CLASS
In this subsection, we introduce the XE-PE intersection in a PXE class. When selecting the best S-boxes in an optimal BOGI-applicable PXE class, this intersection allows (DR i , LR i ) γ and (BGC, GEC) to be considered independently.
According to Proposition 2, a non-empty intersection of an XE class and a PE class exists as long as they are included in the same PXE class. Thus, BOGI-applicable S-boxes can always be selected with any available combination of (DR i , LR i ) γ and (BGC, GEC).

Proposition 2: There always exist S-boxes included both in a given XE class and PE class as long as the XE class and PE class are included in the same PXE class. We denote the non-empty intersection as XE-PE intersection.
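Proof: Assume that there exist a non-empty PE class $\mathcal{P}$ and an XE class $\mathcal{X}$ in a PXE class $\mathcal{PX}$ such that $\mathcal{P} \cap \mathcal{X} = \emptyset$. Take two S-boxes $S_P \in \mathcal{P}$ and $S_X \in \mathcal{X}$. Because $S_P, S_X \in \mathcal{PX}$, it follows that $S_X(x) = P_{out} S_P(P_{in}(x \oplus c_{in})) \oplus c_{out}$ for some bit-permutation matrices $P_{in}$ and $P_{out}$ over $\mathbb{F}_2^{4 \times 4}$, and constants $c_{in}$ and $c_{out}$ over $\mathbb{F}_2^4$. Let $S'_X$ be an S-box satisfying $S_X(x) = S'_X(x \oplus c_{in}) \oplus c_{out}$. Because $S'_X(x \oplus c_{in}) \oplus c_{out} = P_{out} S_P(P_{in}(x \oplus c_{in})) \oplus c_{out}$, and thus $S'_X(y) = P_{out} S_P(P_{in}(y))$ where $y = x \oplus c_{in}$, it follows that $S'_X \in \mathcal{X}$ and $S'_X \in \mathcal{P}$. This contradicts the assumption that $\mathcal{P} \cap \mathcal{X} = \emptyset$.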
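The construction in the proof can be replayed on a concrete example. The sketch below builds S_X from S_P with arbitrarily chosen bit-permutations and constants, then forms the S-box with the constants stripped and checks that it is XOR-equivalent to S_X while being permutation-equivalent to S_P by construction. The starting table (the GIFT S-box as commonly quoted), the chosen permutations and constants, and all names are illustrative assumptions.

```python
def permute_bits(x, perm):
    """Apply a 4-bit bit-permutation: bit i of x moves to position perm[i]."""
    return sum(((x >> i) & 1) << perm[i] for i in range(4))


# Any bijective 4-bit S-box works here; this table is an assumption.
S_P = [0x1, 0xA, 0x4, 0xC, 0x6, 0xF, 0x3, 0x9,
       0x2, 0xD, 0xB, 0x7, 0x5, 0x0, 0x8, 0xE]

# Arbitrary illustrative choices of P_in, P_out, c_in, c_out.
P_IN, P_OUT = [1, 2, 3, 0], [2, 0, 3, 1]
C_IN, C_OUT = 0x9, 0x6

# S_X(x) = P_out S_P(P_in(x ^ c_in)) ^ c_out: a member of the same PXE class.
S_X = [permute_bits(S_P[permute_bits(x ^ C_IN, P_IN)], P_OUT) ^ C_OUT
       for x in range(16)]

# S'_X(y) = P_out S_P(P_in(y)): permutation-equivalent to S_P by construction.
S_X_PRIME = [permute_bits(S_P[permute_bits(y, P_IN)], P_OUT) for y in range(16)]

# XOR-equivalence with S_X: S_X(x) = S'_X(x ^ c_in) ^ c_out for every x.
print(all(S_X[x] == S_X_PRIME[x ^ C_IN] ^ C_OUT for x in range(16)))
```

The check succeeds for any choice of permutations and constants, which mirrors the claim that the intersection is never empty.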
The above S-boxes support (DR 13
Although the cost of implementation in software increases slightly, these S-boxes can provide the best resistance against DC and LC, and thus the required number of rounds may be reduced relative to the current number of rounds. Because the number of rounds mainly affects the latency and throughput of the block cipher, designers may choose these alternatives when speed, rather than implementation cost, is the main consideration.

3) S-BOX PROVIDING THE SAME PERFORMANCE AS THAT OF EXISTING GIFT-64 STRUCTURE
The results in Table 12 indicate a total of 128 (= 2 × 16 × 4) XE I -PE intersections in which all the S-boxes provide the same performance as that of the current S-box in GIFT-64. Each of the intersections consists of a single S-box, and thus the total number of these S-boxes is 128. Among them, the 96 S-boxes without fixed points are presented in Appendix C.
However, we do not insist that the GIFT designers should have used those better S-boxes. As presented in Table 9, CarL1 of B 2,3 is worse than that of B 0,1 . Thus, although (DR 13 , LR 13 ) 64 is improved to (−62.8, −70.0) from the original (−62.0, −68), thorough analyses have to be conducted in order to ''well-replace'' the GIFT S-box.

E. NOTABLE S-BOXES FOR GIFT-128 STRUCTURE
Searching all 1,728 XE I classes is still infeasible for larger block ciphers such as GIFT-128 because trail searching tends to take much more time than for 64-bit block ciphers. However, by considering only the S-boxes whose performance is not worse than that of the current S-box in the GIFT-64 structure, we can deduce the corresponding (DR i , LR i ) 128 of GIFT-128 variants with 48 XE I classes (16 for the same performance and 32 for better performance).
For the GIFT-128 structure, we investigate 12-round trails because searching longer trails requires significant time (each search for the XE I classes takes from 5 min to 13 days on a personal computer). (DR 12 , LR 12 ) 128 of GIFT-128 is (−60.4, −72), which implies that GIFT-128 requires more than 24 rounds to become resistant against DC.
The appendices present (DR 12 , LR 12 ) 128 of the GIFT-128 variants we consider. The results show that (DR 12 , LR 12 ) 128 can be improved up to (−76.4, −74.0). Unlike (DR 13 , LR 13 ) 64 , a PXE class and its inverse PXE class have distinct results of (DR 12 , LR 12 ) 128 . Moreover, S-boxes that provide better performance in the GIFT-64 structure do not always guarantee better performance in GIFT-128. This implies that choosing a dedicated S-box for each version of GIFT may guarantee more promising performance.

F. EXTENSION TO OTHER BOGI-BASED BLOCK CIPHERS
The results of all our analyses except for (DR i , LR i ) γ can be reused for other BOGI-based block ciphers. However, as we show in Observation 6, only 10,368 BOGI-applicable XE classes can be considered rather than all the S-boxes. Moreover, the number can decrease again to 1,728 after determining the structure of P j mix . Therefore, our findings are expected to help designers analyze (DR i , LR i ) γ in their structures.

VI. CONCLUSION
In this paper, we conducted an exhaustive search for 4-bit BOGI-applicable S-boxes. By classifying the PXE classes with respect to their differential uniformity and linearity, we suggested 20 optimal BOGI-applicable PXE classes. We evaluated these PXE classes and presented their generalized properties, which have not been analyzed before. Moreover, by partitioning the PXE classes into PE and XE classes, we explored their implementation cost and resistance against single-trail differential and linear cryptanalysis. Based on our investigations, we suggested notable S-boxes for both versions of GIFT. Although we concentrated only on GIFT, we expect our study to form the basis for extensions to other BOGI-based ciphers in the future.
• The results in boldface are the best results for software or hardware implementations in each PXE class.

APPENDIX B AVAILABLE RESISTANCES (DR 13 , LR 13 ) 64 OF OPTIMAL BOGI-APPLICABLE PXE CLASSES
• GIFT S-box has the underlined result (−62, −68).
• The results in boldface are the best resistances against DC or LC in each PXE class.
• The results that reduce the minimum required number of rounds r min to 13 are highlighted.

APPENDIX C S-BOX OF WHICH THE PERFORMANCE EQUALS THAT OF THE EXISTING S-BOX IN GIFT-64 STRUCTURE
The following S-boxes can provide the same (DR 13 , LR 13 ) 64 and (BGC, GEC) as the GIFT S-box. Moreover, all of the security properties we consider in this study are equivalent to those of the GIFT S-box. In other words, the following 96 BOGI-applicable S-boxes provide: