New Genetic Operators for Searching for S-Boxes with Low Boomerang Uniformity

The boomerang uniformity measures the resistance of block ciphers to boomerang attacks and has become an essential criterion of the substitution box (S-box). However, the S-boxes created by the Feistel structure have poor boomerang uniformity. A genetic algorithm is introduced to improve the properties of the S-boxes created by the Feistel structure. New genetic operators are designed for the genetic algorithm to improve its search ability. The new genetic algorithm searches for bijective S-boxes with low differential uniformity, high nonlinearity, and low boomerang uniformity. The experimental results show that the new genetic algorithm dramatically improves the properties of the S-boxes created by the Feistel structure and obtains some 8 × 8 S-boxes with excellent performance. We compare the S-boxes generated by the new genetic algorithm with those generated by the traditional genetic algorithm. The comparison shows that the S-boxes generated by the new genetic algorithm have better properties than those generated by the traditional genetic algorithm, which demonstrates the effectiveness and superiority of the new genetic algorithm in generating S-boxes.


I. INTRODUCTION
The boomerang attack [1] is a variant of the differential attack. For ciphers in which the probabilities of the differential characteristics decrease exponentially as the number of rounds grows, the boomerang attack can concatenate two short characteristics to form a longer characteristic with a better probability. In a boomerang attack, the cipher $E$ is split into two shorter parts $E_0$ and $E_1$. Assume that $p$ is the probability of the differential characteristic $(\alpha, \beta)$ for $E_0$, and $q$ is the probability of the differential characteristic $(\gamma, \delta)$ for $E_1$. Then the probability of the boomerang distinguisher is $p^2 q^2$. The boomerang attack is an effective cryptanalysis tool, which has been successfully applied to famous block ciphers such as AES, IDEA and SHACAL-1 [2]-[5].
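As a small worked illustration of why concatenating two short characteristics can pay off, recall that the boomerang distinguisher succeeds with probability $p^2 q^2$. The numeric values below are assumptions chosen only for the example and are not taken from any particular cipher.

```latex
p = q = 2^{-8}
\quad\Longrightarrow\quad
p^{2} q^{2} = 2^{-16} \cdot 2^{-16} = 2^{-32}
```

Under these assumed values, the boomerang distinguisher compares favorably with a single characteristic over the whole cipher whenever the best such characteristic has probability well below $2^{-32}$.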
The boomerang connectivity table (BCT) [6] provides a unified representation for boomerang-style attacks and has become a new tool for evaluating substitution boxes (S-boxes), since it gives a more accurate probability of generating a right quartet in boomerang-style attacks. The boomerang uniformity [7] is the maximum value in the BCT over all nonzero input differences and nonzero output differences; it measures the resistance of an S-box to a boomerang attack.
S-boxes are crucial nonlinear building blocks providing confusion in modern block ciphers. The emergence of cryptographic attacks has led to the development of criteria for resisting such attacks. Existing attacks require S-boxes to meet some cryptographic properties, including bijectivity, low differential uniformity [8], and high nonlinearity [9]. With the development of boomerang attacks, boomerang uniformity has become a new essential criterion for the S-box, which has attracted the interest of researchers.
Boura and Canteaut [7] completely characterized the BCT of all differentially 4-uniform permutations of 4 bits and then studied these objects for inverse functions and quadratic permutations. Their work provided the first examples of differentially 4-uniform S-boxes optimal against boomerang attacks for an even number of variables. The boomerang uniformities of some specific permutations were studied in [10], and a class of permutation polynomials over $\mathbb{F}_{2^n}$ with 4-uniform BCT was obtained. Mesnager et al. [11] focused their research on the boomerang uniformity of quadratic permutations in even dimensions. A new family of optimal S-boxes was found by generalizing previous results on quadratic permutations with optimal BCT. Calderini and Villa [12] further studied the boomerang uniformity of some non-quadratic differentially 4-uniform functions. Wang et al. [13] studied the boomerang uniformity of all normalized permutation polynomials of degree up to six over an arbitrary finite field $\mathbb{F}_q$ by using the resultant elimination method. Li et al. [14] presented infinite families of permutations of $\mathbb{F}_{2^{2n}}$ for a positive odd integer $n$, which have the best-known nonlinearity and boomerang uniformity 4.
In addition to mathematical methods, intelligent methods have also been used to create S-boxes in recent years. Reinforcement learning, with the task expressed as a Markov decision process, was used to train an agent to generate S-boxes that can effectively resist side-channel attacks [15]. A heuristic evolution strategy improved the initial S-boxes created by a modular operation [16]. The S-box construction time was reduced in [17] by maximizing the nonlinearity under constraints with a random-restart hill-climbing algorithm. S-boxes based on chaos were designed in [18]. The combination of chaos methods and intelligent algorithms was also used to generate S-boxes. An artificial bee colony algorithm was used to optimize the S-boxes generated by a chaotic sequence [19]. A β-hill climbing search was applied to improve the S-boxes based on a chaotic map [20]. As an intelligent algorithm simulating the evolution of nature, the genetic algorithm provides a practical solution to combinatorial optimization problems that are difficult to handle by traditional methods, and it offers a new approach to complex problems in cryptography.
Genetic algorithms have been increasingly used to generate S-boxes with good performance in recent years. The traditional genetic algorithm was used to generate S-boxes with good values of the confusion coefficient in order to improve their side-channel resistance [21]. A method based on chaos and the genetic algorithm was proposed in [22] for designing an S-box. The full use of the traits of the chaotic map and the evolution process makes it possible to obtain a stronger S-box. A genetic algorithm working in a reversed way was proposed in [23], which can rapidly and repeatedly generate a large number of strong bijective S-boxes. Several genetic algorithms and problem sizes were explored in [24] to find functions having differential uniformity equal to 6. In addition, simulated annealing and a genetic algorithm were used to optimize the design of symmetric-key primitives in [24].
S-boxes constructed by the Feistel structure have the advantage of low hardware implementation cost [26]; however, they have high boomerang uniformities. In this paper, a new genetic algorithm is introduced to improve the properties of the S-boxes created by the Feistel structure. The new genetic algorithm generates 8 × 8 bijective S-boxes with low differential uniformity, high nonlinearity, and low boomerang uniformity. This is the first time a meta-heuristic algorithm has been used to search for S-boxes with low boomerang uniformity. A new crossover operator and a new mutation operator are proposed to improve the performance of the genetic algorithm. By making full use of gene exchange and gene mutation, the new genetic algorithm dramatically improves the properties of the S-boxes created by the Feistel structure. In addition, we compare the S-boxes generated by the new genetic algorithm with those generated by the traditional genetic algorithm. The comparison shows that the S-boxes generated by our new genetic algorithm have better properties than those generated by the traditional genetic algorithm. The experimental results show the effectiveness and superiority of our new genetic algorithm.
This paper is organized as follows. Section II gives some preliminaries on necessary concepts. Section III describes our new genetic algorithm and the traditional genetic algorithm. Section IV presents the experimental parameters and the results of this paper, which are then compared and analyzed. Finally, Section V concludes this paper.

II. PRELIMINARIES
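A bijective $n \times n$ S-box is a permutation on $\mathbb{F}_2^n$. Mathematically, an S-box is a vectorial Boolean function $F: \mathbb{F}_2^n \to \mathbb{F}_2^n$, which can be defined as a vector $F = (f_1, f_2, \ldots, f_n)$. The Boolean functions $f_i: \mathbb{F}_2^n \to \mathbb{F}_2$, $i \in \{1, 2, \ldots, n\}$, are called the coordinate functions of $F$. The component functions of an $n \times n$ function $F$ are all the linear combinations of the coordinate functions whose coefficients are not all zero.

Definition 1 (Differential uniformity [8]): Let $F: \mathbb{F}_2^n \to \mathbb{F}_2^n$ be an $n \times n$ vectorial Boolean function. The derivative of the S-box $F$ with regard to a vector $a \in \mathbb{F}_2^n$ is $D_a F(x) = F(x) \oplus F(x \oplus a)$, and the difference distribution table (DDT) of $F$ is

$$\mathrm{DDT}_F(a, b) = \#\{x \in \mathbb{F}_2^n : F(x) \oplus F(x \oplus a) = b\}.$$

The symbol $\#$ here represents the number of solutions in the set. $F$ is called differentially $\delta_F$-uniform, where $\delta_F$ is the maximum value of $\mathrm{DDT}_F(a, b)$ over every nonzero $a \in \mathbb{F}_2^n$ and every $b \in \mathbb{F}_2^n$, i.e.,

$$\delta_F = \max_{a \in \mathbb{F}_2^n \setminus \{0\},\ b \in \mathbb{F}_2^n} \mathrm{DDT}_F(a, b).$$

Definition 2 (Nonlinearity and linearity [9]): Let $F: \mathbb{F}_2^n \to \mathbb{F}_2^n$ be an $n \times n$ vectorial Boolean function. The nonlinearity of an S-box $F$ is defined as the minimum Hamming distance between all nonzero component functions of $F$ and all $n$-variable affine Boolean functions, which can be represented by the Walsh spectrum,

$$N_F = 2^{n-1} - \frac{1}{2} \max_{a \in \mathbb{F}_2^n,\ b \in \mathbb{F}_2^n \setminus \{0\}} |W_F(a, b)|.$$

The Walsh spectrum of an $n \times n$ function $F$ with respect to two vectors $a, b \in \mathbb{F}_2^n$ is

$$W_F(a, b) = \sum_{x \in \mathbb{F}_2^n} (-1)^{b \cdot F(x) \oplus a \cdot x},$$

where $b \cdot F$ for all $b \in \mathbb{F}_2^n$ with $b \neq 0$ are the component functions and the symbol $\cdot$ is the inner product over $\mathbb{F}_2$. The linear approximation table (LAT) of $F$ is

$$\mathrm{LAT}_F(a, b) = \#\{x \in \mathbb{F}_2^n : a \cdot x = b \cdot F(x)\} - 2^{n-1} = \frac{1}{2} W_F(a, b).$$

The linearity of an S-box $F$ is defined as

$$L_F = \max_{a \in \mathbb{F}_2^n,\ b \in \mathbb{F}_2^n \setminus \{0\}} |W_F(a, b)|,$$

so that $N_F = 2^{n-1} - L_F / 2$.

Definition 3 (Boomerang uniformity [7]): Let $F: \mathbb{F}_2^n \to \mathbb{F}_2^n$ be an $n \times n$ invertible vectorial Boolean function. For input difference $a \in \mathbb{F}_2^n$ and output difference $b \in \mathbb{F}_2^n$, the entries of the boomerang connectivity table (BCT) are defined as

$$\mathrm{BCT}_F(a, b) = \#\{x \in \mathbb{F}_2^n : F^{-1}(F(x) \oplus b) \oplus F^{-1}(F(x \oplus a) \oplus b) = a\}.$$

The boomerang uniformity of $F$ is the maximum BCT entry over all nonzero $a$ and nonzero $b$,

$$\beta_F = \max_{a, b \in \mathbb{F}_2^n \setminus \{0\}} \mathrm{BCT}_F(a, b).$$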
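The three criteria can be computed directly from a lookup table. The following is a minimal brute-force sketch, assuming the S-box is given as a Python list of length 2^n and that nonlinearity is 2^(n-1) minus half the largest absolute Walsh value over nonzero output masks. The function names are illustrative and not taken from the paper, and the loops are written for clarity rather than speed.

```python
def ddt(sbox):
    """Difference distribution table: table[a][b] = #{x : S(x) ^ S(x ^ a) == b}."""
    size = len(sbox)
    table = [[0] * size for _ in range(size)]
    for a in range(size):
        for x in range(size):
            table[a][sbox[x] ^ sbox[x ^ a]] += 1
    return table


def differential_uniformity(sbox):
    """Largest DDT entry over nonzero input differences a."""
    return max(max(row) for row in ddt(sbox)[1:])


def walsh(sbox, a, b):
    """Walsh value: sum over x of (-1)^(b.S(x) xor a.x)."""
    total = 0
    for x in range(len(sbox)):
        parity = bin((b & sbox[x]) ^ (a & x)).count("1") & 1
        total += 1 - 2 * parity
    return total


def nonlinearity(sbox):
    """2^(n-1) minus half the largest |Walsh value| over nonzero output masks b."""
    size = len(sbox)
    worst = max(abs(walsh(sbox, a, b))
                for a in range(size) for b in range(1, size))
    return size // 2 - worst // 2


def bct(sbox):
    """Boomerang connectivity table of a bijective S-box."""
    size = len(sbox)
    inv = [0] * size
    for x, y in enumerate(sbox):
        inv[y] = x
    table = [[0] * size for _ in range(size)]
    for a in range(size):
        for b in range(size):
            table[a][b] = sum(
                1 for x in range(size)
                if inv[sbox[x] ^ b] ^ inv[sbox[x ^ a] ^ b] == a)
    return table


def boomerang_uniformity(sbox):
    """Largest BCT entry over nonzero a and nonzero b."""
    return max(max(row[1:]) for row in bct(sbox)[1:])


# The 4-bit PRESENT S-box, used here only as a small, well-known test input.
PRESENT = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
           0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
```

For an 8 × 8 S-box the nonlinearity loop visits about 2^24 (mask, input) combinations, so a practical search would replace it with a fast Walsh-Hadamard transform; that optimization is left out to keep the correspondence with the definitions visible.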

III. GENETIC ALGORITHMS
The genetic algorithm [27] is a computational model that simulates the evolution process of nature and has been successfully applied to various optimization problems. In recent years, many researchers have also applied genetic algorithms to the design of block cipher primitives. The genetic algorithm is based on Darwinian natural selection and Mendelian genetics. Selection makes high-quality individuals more likely to survive and thus improves the quality of individuals in the population. Mendelian genetics provides a theoretical basis for the population to produce new individuals. The crossover operator recombines the genes of two parent individuals to generate two new individuals, which is the primary way to generate new individuals. The mutation operator generates new individuals by changing the genes at specific loci. Because genetic operators are the primary way of generating new individuals, they have a significant impact on the performance of the genetic algorithm. Traditional genetic operators are universal, but they cannot guarantee that the new individuals are better. We design new genetic operators for the genetic algorithm to produce better individuals in the process of evolution.
Algorithm 1 depicts the framework of our genetic algorithm. In Algorithm 1, the size of the population $P$ is $N$. Individuals in the initial population are created by an unbalanced Feistel structure. $r_p$ is a randomly generated probability. The parents in the population perform crossover with probability $p_c$. The probability of mutation is $p_m$. In our work, the termination condition of the genetic algorithm is that the maximum number MAX of generations is reached. $C_F$ is the fitness function that calculates the fitness value $f_p$ of an individual $p$. Next, the components of the genetic algorithm are introduced in detail.

Algorithm 1 The Framework of Our Genetic Algorithm
  for each p ∈ P do
    p ← Unbalanced Feistel structure;
    f_p ← C_F(p);
  end for
  g ← 0; // Number of iterations
  ...
    k individuals are randomly selected;
    Two individuals with the lowest fitness values are copied into the new population;
  end for
  // The process of crossover
  for ...
    if r_p < p_c then
      (p, q) ← randomly select two individuals from the population;
      (p, q) ← Crossover operator (p, q);
  ...
  // The process of mutation
  ...
    if r_p < p_m then
      p ← The i-th individual in P;
      p ← Mutation operator (p);
      f_p ← C_F(p);
  ...

The form of permutation encoding is intuitively more suitable for representing S-boxes. In this representation, the bijectivity property is automatically satisfied. An $n \times n$ S-box is represented as an array of $2^n$ integers with elements in the range $[0, 2^n - 1]$. Each value occurs exactly once in the array and represents one entry of the S-box lookup table.
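The overall loop of Algorithm 1 can be sketched as follows. This is a skeleton under stated assumptions rather than the authors' implementation: the operators are passed in as callables (`fitness`, `select`, `crossover`, `mutate` are hypothetical names), the number of crossover attempts per generation is assumed to be half the population size, and fitness values are recomputed once per generation. Individuals use the permutation encoding described above.

```python
import random


def run_ga(population, fitness, select, crossover, mutate,
           p_c, p_m, max_gen, rng=None):
    """Minimize `fitness` over permutation-encoded individuals.

    population : list of permutations (lists of distinct integers)
    select     : (pop, fit, rng) -> new population of the same size
    crossover  : (p, q, rng) -> (p_new, q_new)
    mutate     : (p, rng) -> p_new
    """
    if rng is None:
        rng = random.Random()
    pop = [list(p) for p in population]
    fit = [fitness(p) for p in pop]
    for _ in range(max_gen):
        # Selection: build the next population from the current one.
        pop = select(pop, fit, rng)
        # Crossover: each attempt fires with probability p_c on a random pair.
        # (The number of attempts per generation is an assumption.)
        for _ in range(len(pop) // 2):
            if rng.random() < p_c:
                i, j = rng.sample(range(len(pop)), 2)
                pop[i], pop[j] = crossover(pop[i], pop[j], rng)
        # Mutation: the i-th individual mutates with probability p_m.
        for i in range(len(pop)):
            if rng.random() < p_m:
                pop[i] = mutate(pop[i], rng)
        fit = [fitness(p) for p in pop]
    best = min(range(len(pop)), key=lambda i: fit[i])
    return pop[best], fit[best]
```

Passing the operators as arguments keeps the loop independent of whether the traditional or the new operators are plugged in, which mirrors the comparison made later in the paper.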
b: Initial Population
Individuals in the initial population are created by an unbalanced Feistel structure. We extend the method in [26] to generate 8 × 8 S-boxes. Let $f$ be a seven-variable nonlinear Boolean function.

c: Fitness Function
The design of the fitness function follows the criteria used to evaluate an S-box. The properties considered in this paper are differential uniformity, nonlinearity, and boomerang uniformity. Our fitness function has three terms. The first term $\delta_F$ and the third term $\beta_F$ are the differential uniformity and the boomerang uniformity, respectively, and both should be as low as possible. The nonlinearity, however, should be as high as possible. For consistency, the second term in the fitness function is therefore the linearity $L_F$, so that all three terms are minimized. Algorithm 2 gives the calculation process of the fitness function.
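A minimal sketch of a three-term fitness of this kind is given below. Two points are assumptions and not taken from the paper: the terms are combined as an unweighted sum, and linearity is taken to be the largest absolute Walsh value, so that it relates to nonlinearity by L = 2^n - 2N. The function names are illustrative.

```python
def linearity_from_nonlinearity(nonlinearity, n):
    """Convert nonlinearity N to linearity L, assuming N = 2^(n-1) - L/2."""
    return 2 ** n - 2 * nonlinearity


def fitness(delta, nonlinearity, beta, n=8):
    """Assumed unweighted sum of differential uniformity, linearity and
    boomerang uniformity; smaller is better for every term."""
    return delta + linearity_from_nonlinearity(nonlinearity, n) + beta
```

Under these assumptions an 8 × 8 S-box with properties (6, 108, 10) scores 6 + 40 + 10 = 56, while one with (12, 98, 20) scores 12 + 60 + 20 = 92, so the first is preferred by a minimizing search.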

d: Selection Operator
The k-tournament selection [28] is suitable for minimization problems. First, $k$ individuals are randomly selected from the population $P$. Then the two individuals with the smallest fitness values among them are copied into the new population. This process is repeated until the size of the new population reaches $N$.
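The selection step can be sketched as below. The function name and signature are illustrative; sampling the k contestants without replacement and truncating the last pair when N is odd are assumptions, since the text does not specify them.

```python
import random


def tournament_selection(population, fitness_values, k, rng=None):
    """k-tournament selection for minimization.

    Repeatedly draws k distinct individuals at random and copies the two
    with the smallest fitness into the new population until it has the
    same size as the old one.
    """
    if rng is None:
        rng = random.Random()
    n = len(population)
    new_population = []
    while len(new_population) < n:
        contestants = rng.sample(range(n), k)
        contestants.sort(key=lambda i: fitness_values[i])
        for i in contestants[:2]:
            if len(new_population) < n:
                new_population.append(list(population[i]))
    return new_population
```

With k equal to the population size every tournament contains all individuals, so the new population consists only of copies of the two fittest; smaller k, such as the value 3 used in the experiments, keeps more diversity.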

a: New Crossover Operator
In order to improve the performance of the genetic algorithm, we design a new crossover operator. The fitness function considers three properties of an S-box: differential uniformity, linearity, and boomerang uniformity. The smaller their values, the better. In each iteration, the new crossover operator takes advantage of gene exchange to reduce the values of the three properties.
The new crossover operator is described in Algorithm 3. First, randomly select two individuals $p$ and $q$ from the population as the two parents. Let $p = (p_0, \ldots, p_{2^n-1})$ and $q = (q_0, \ldots, q_{2^n-1})$. The crossover processes performed on $p$ and $q$ are similar, so we take individual $p$ as an example. Find the input-output difference pairs $(a, b)$ that satisfy $\mathrm{DDT}_p(a, b) = \delta_p$ in the difference distribution table. For each pair $(a, b)$, find an element $p_i$, $i \in [0, 2^n - 1]$, that contributes to $\mathrm{DDT}_p(a, b)$, and exchange $p_i$ and $p_j$, where $p_j = q_i$, to obtain a new individual $p'$. If $\delta_{p'} \le \delta_p$, $L_{p'} \le L_p$, $\beta_{p'} \le \beta_p$, $\mathrm{DDT}_{p'}(a, b) \le \mathrm{DDT}_p(a, b)$, and the exchange produces no new DDT entry equal to $\delta_p$, replace $p$ with $p'$. If $\mathrm{DDT}_{p'}(a, b) < \mathrm{DDT}_p(a, b)$, find the next input-output difference pair $(a, b)$ satisfying $\mathrm{DDT}_p(a, b) = \delta_p$ and repeat the process. If $\mathrm{DDT}_{p'}(a, b) = \mathrm{DDT}_p(a, b)$, find the other elements contributing to $\mathrm{DDT}_p(a, b)$ in $p$ and perform the same operation to reduce $\mathrm{DDT}_p(a, b)$. The process of reducing the boomerang uniformity is similar to that of reducing the differential uniformity. When reducing the linearity, two cases should be considered.

The partially mapped crossover (PMX crossover) [29] is the traditional crossover operator we use. Randomly select two individuals $p$ and $q$ from the population as the two parents. Let $p = (p_0, \ldots, p_{2^n-1})$ and $q = (q_0, \ldots, q_{2^n-1})$. Randomly select two positions $(c_1, c_2)$, $c_1, c_2 \in [0, 2^n - 1]$, and exchange the gene fragments of the two parents between $c_1$ and $c_2$. Check the elements in the uncrossed gene segments, and replace every element that duplicates one in the exchanged fragment according to the mapping defined by the two exchanged fragments, so that each offspring remains a permutation.
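Two building blocks of the operators above can be sketched in a few lines. `exchange_gene` performs only the single swap used by the new crossover operator (making position i of p agree with q while keeping p a permutation); the acceptance test on the three properties is deliberately omitted. `pmx` is a standard partially mapped crossover. Both function names are illustrative, and treating the cut positions c1 and c2 as inclusive is an assumption.

```python
def exchange_gene(p, q, i):
    """Swap p[i] with the element p[j] that equals q[i].

    The result is still a permutation and agrees with q at position i.
    """
    child = list(p)
    j = child.index(q[i])
    child[i], child[j] = child[j], child[i]
    return child


def pmx(p, q, c1, c2):
    """Partially mapped crossover on two permutations of equal length.

    The fragments between positions c1 and c2 (inclusive) are exchanged;
    conflicts outside the fragment are repaired through the mapping
    defined by the two fragments.
    """
    lo, hi = sorted((c1, c2))

    def make_child(receiver, donor):
        child = list(receiver)
        child[lo:hi + 1] = donor[lo:hi + 1]
        # Map each value brought in by the donor to the value it displaced.
        mapping = {donor[k]: receiver[k] for k in range(lo, hi + 1)}
        for k in list(range(lo)) + list(range(hi + 1, len(child))):
            value = receiver[k]
            # Follow the mapping until the value no longer collides with
            # the donor fragment.
            while value in mapping:
                value = mapping[value]
            child[k] = value
        return child

    return make_child(p, q), make_child(q, p)
```

For example, with p = [0, 1, 2, 3, 4, 5, 6, 7], q = [3, 7, 5, 1, 6, 0, 2, 4] and cut positions 2 and 4, `pmx` returns [0, 3, 5, 1, 6, 2, 4, 7] and [1, 7, 2, 3, 4, 0, 5, 6], both of which are again permutations.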

IV. EXPERIMENTAL SETUP AND RESULTS
This paper uses the new genetic algorithm and the traditional genetic algorithm to search for 8 × 8 S-boxes with low differential uniformity, high nonlinearity, and low boomerang uniformity.

A. EXPERIMENTAL SETUP
We run 30 experiments each for the traditional genetic algorithm and the new genetic algorithm. Except for the genetic operators, the parameters of the two genetic algorithms are the same. The parameter values are determined based on experience and experimental feedback. The population size $N$ is 256. The tournament size $k$ is set to 3. Different crossover and mutation probabilities have no significant impact on the search of the traditional genetic algorithm. For our new genetic algorithm, the higher the crossover probability and the mutation probability, the better. Therefore, we set these two parameters to relatively large values: crossover probability $p_c = 0.9$ and mutation probability $p_m = 0.1$. The maximum number of iterations is determined by observing the experimental output and is set to MAX = 400.

B. EXPERIMENTAL RESULTS AND ANALYSIS
In our work, in addition to the differential uniformity and nonlinearity, we also consider the boomerang uniformity. Table 3 shows the distribution of these three properties in the initial population and the final population. The entry 16#13 in Table 3 means that 13 individuals in the initial population have differential uniformity $\delta_p = 16$. As can be seen from Table 3, for the initial population created by the unbalanced Feistel structure, the differential uniformity, nonlinearity, and boomerang uniformity are concentrated at 64, 64, and 256, respectively. The best differential uniformity, nonlinearity, and boomerang uniformity in the initial population are 16, 96 and 52, respectively. The best S-box created by the unbalanced Feistel structure is given in Table 1. It can be seen that the properties of the initial population created by the unbalanced Feistel structure are not ideal, especially the boomerang uniformity. At the end of the iteration, the differential uniformity, nonlinearity, and boomerang uniformity of the population obtained by our new genetic algorithm are concentrated at 12, 94, and 20, respectively. At this point, the best values of differential uniformity, nonlinearity, and boomerang uniformity in the population are 6, 108, and 10. In the final population, 119 individuals have $\delta_p \le 10$, 47 have $N_p \ge 96$, and 9 have $\beta_p \le 16$. According to the comparison in Table 3, on the whole, the new genetic algorithm improves the properties of the S-boxes created by the unbalanced Feistel structure.
In Table 4, S-box1 to S-box4 are generated by our new genetic algorithm, and S-box5 is generated by the traditional genetic algorithm. It can be seen from Table 4 that S-box1 and S-box2 generated by our new genetic algorithm have the best cryptographic properties: the lowest differential uniformity 6, the highest nonlinearity 108, and the lowest boomerang uniformity 10. Table 2 shows the lookup table of S-box1. Table 5 compares the cryptographic properties of S-boxes generated in different ways. The values for the random S-box are the expected values of differential uniformity, nonlinearity, and boomerang uniformity given in [33], [34]. As can be seen from Table 5, the S-box generated by the new genetic operators has better properties than the random S-box. Moreover, the S-box generated by the new genetic operators is comparable with those generated by other methods.

S-box        Method                          Differential uniformity   Nonlinearity   Boomerang uniformity
             New genetic algorithm           6                         108            10
[21]         Traditional genetic algorithm   12                        98             20
[23]         Reversed genetic algorithm      6                         112            6
[31]         Tweaking                        6                         106            14
[32]         Gradient descent method         8                         104            16
[15]         Reinforcement learning          12                        98             20
[33], [34]   Random                          11.34                     92.7           20.2

In summary, the new genetic algorithm has successfully improved the properties of the S-boxes created by the unbalanced Feistel structure. Moreover, the S-boxes generated by the new genetic algorithm have better properties than those generated by the traditional genetic algorithm, demonstrating the effectiveness and superiority of the new genetic algorithm in searching for S-boxes.

V. CONCLUSIONS
In this paper, a genetic algorithm is used to improve the properties of the S-boxes created by the Feistel structure. New genetic operators are designed for the genetic algorithm to generate 8 × 8 S-boxes with low differential uniformity, high nonlinearity, and low boomerang uniformity. It is the first time that a genetic algorithm has been used to improve the boomerang uniformity of the S-box. Experimental results show that the new genetic algorithm successfully improves the properties of the S-boxes created by the Feistel structure. The S-boxes generated by the new genetic algorithm have better properties than those generated by the traditional genetic algorithm, which shows the effectiveness and superiority of the new genetic algorithm in generating S-boxes. In the future, genetic algorithms can be used to generate S-boxes of different sizes. In addition, other new genetic operators can be designed to generate S-boxes with better performance.