On the Design of Chaos-Based S-boxes

Substitution boxes (S-boxes) are critical nonlinear elements for achieving cryptanalytic resistance in modern block and stream ciphers. Given their importance, a rich variety of S-box construction strategies exists. In this paper, S-boxes generated by using chaotic functions (CF) are analyzed to measure their actual resistance to linear cryptanalysis. Papers proposing such S-boxes typically emphasize the average nonlinearity of the S-box coordinates only, ignoring the rest of the S-box components in the process. Thus, the majority of those studies should be re-evaluated, and integrating such S-boxes in a given cryptosystem should be done with considerable caution. Furthermore, we show that, in the context of the nonlinearity optimization problem, the profit of using chaos structures is negligible. By using two heuristic methods and starting from pseudo-random S-boxes, we repeatedly reached S-boxes that significantly outperform all previously published CF-based S-boxes in the cryptographic terms those papers use for comparison. Moreover, we have linked the multi-armed bandit problem to the problem of maximizing the average coordinate nonlinearity value of an S-box, which further allowed us to reach near-optimal average coordinate nonlinearity values significantly greater than those known in the literature.


I. INTRODUCTION
The cryptographic properties of vectorial Boolean functions, or S-boxes, have been thoroughly examined, resulting in a rich list of desirable parameters an S-box should have in order to guarantee acceptable resistance to sophisticated cryptographic attacks such as linear cryptanalysis [1] [2], differential cryptanalysis [3], the boomerang attack [4] or the interpolation attack [5]. Furthermore, S-boxes are widely used in modern cryptographic algorithms like AES [6], Whirlpool [7], Camellia [8] and many others.
Among the rich variety of proposed methods for S-box generation, we mainly focus on S-box constructions benefiting from the study of chaos, to further analyze their actual resistance to linear cryptanalysis.
In Section II we introduce the definitions of some basic cryptographic characteristics used to measure the cryptographic strength of a given S-box.
In Section III we show that the actual nonlinearity value, or NL, of the majority of published chaotic function-based (CF-based) S-boxes differs from the average nonlinearity value originally announced. This discrepancy stems from the fact that those papers consider only the average nonlinearity of the S-box coordinates, or ACNV, ignoring the rest of the S-box components in the process. In Section IV, we propose an algorithm that produces S-boxes significantly outperforming all previously published ones in terms of ACNV. During our experiments, we repeatedly reached S-boxes with an ACNV of 114. We want to emphasize that an ACNV greater than 112.0, to the best of our knowledge, has never been achieved in the literature.
In Section V, we demonstrate the efficiency of the proposed algorithm by optimizing the ACNV of some popular S-boxes. Thus, we show that the choice of the starting state of the optimization routines is of negligible importance. Having this in mind, the competitiveness of S-boxes generated by exploiting chaos structures, at least in the context of the S-box nonlinearity optimization problem, is arguable. The same observation was made in [9].
Then, in Section VI, we translate the S-box ACNV optimization problem to the multi-armed bandit problem, which allows us to further improve our results by reaching an ACNV of 114.5, a value significantly larger than those known in the literature.

II. PRELIMINARIES
Let $B = \{0, 1\}$. A Boolean function $f(x)$ of $n$ variables $x_1, \cdots, x_n$ is a mapping $f : B^n \to B$ from $n$ binary inputs $x = (x_1, x_2, \cdots, x_n) \in B^n$ to one binary output.
We define each column of $S_{LUT}$ as a coordinate of $S$. All linear combinations of coordinates of $S$ are called components of $S$.
Definition 4 (Linear Approximation Table): The linear approximation table of an S-box $S(n, m)$, denoted by $S_{LAT}$, is a $(2^n \times 2^m)$ table whose entries are given by
$$S_{LAT}[X, Y] = 2^{n-1} - d_H(X, Y),$$
where $Y$ is a linear combination of the coordinates of the current S-box, $X$ is the corresponding linear function of length $n$, and $d_H(X, Y)$ denotes the Hamming distance between $X$ and $Y$.
The linear approximation table of a given S-box S(n,n) reveals the actual correlation between the components of S and all linear Boolean functions sharing the same dimension n.
Definition 5 (S-box Nonlinearity): The nonlinearity of an S-box $S(n, m)$ is defined as $S_{NL} = 2^{n-1} - \max(\{|w_i|\})$, where $\{w_i\}$ is the set of all elements in the LAT, excluding the first row and the first column.
Lower values of nonlinearity could be exploited by the family of linear cryptanalysis attacks. Having this in mind, a higher nonlinearity value is a desirable S-box property.
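Definitions 4 and 5 can be illustrated with a minimal brute-force sketch. It assumes an $(n, n)$ S-box given as a list of integers, LAT entries of the form $2^{n-1} - d_H(X, Y)$ (the reading consistent with Definition 5), and input and output masks applied as bitwise dot products. The function names `parity`, `lat` and `nonlinearity` are illustrative choices and are not taken from the paper.

```python
def parity(v: int) -> int:
    """Parity (XOR of all bits) of a non-negative integer."""
    return bin(v).count("1") & 1


def lat(sbox, n):
    """Linear approximation table with entries 2^(n-1) - d_H(X, Y).

    Row a    : linear function X(x) = a . x        (input mask a)
    Column b : component     Y(x) = b . S(x)       (output mask b)
    """
    size = 1 << n
    half = size >> 1
    table = []
    for a in range(size):
        row = []
        for b in range(size):
            # Hamming distance between the truth tables of X and Y.
            dist = sum(parity(a & x) != parity(b & sbox[x]) for x in range(size))
            row.append(half - dist)
        table.append(row)
    return table


def nonlinearity(sbox, n):
    """S_NL = 2^(n-1) - max |w|, skipping the first row and first column."""
    size = 1 << n
    table = lat(sbox, n)
    worst = max(abs(table[a][b]) for a in range(1, size) for b in range(1, size))
    return (size >> 1) - worst


identity = list(range(8))            # a linear (3, 3) S-box
print(nonlinearity(identity, 3))     # a linear S-box has nonlinearity 0
```

The brute-force table costs $2^{3n}$ parity evaluations, which is acceptable for small $n$; a fast Walsh-Hadamard transform would be the usual choice for $(8, 8)$ S-boxes.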
Each S-box is uniquely defined by its LUT. Therefore, if we interpret each row of the LUT as a decimal number, we obtain a unique decimal representation of the S-box, denoted by DLUT.
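The relation between the LUT and the DLUT can be sketched as follows. The sketch assumes that each row is read with the most significant bit first, so that coordinate 1 is the leftmost column; the helper names and the small $(3, 3)$ S-box are hypothetical and serve only as an illustration.

```python
def dlut_to_lut(dlut, n):
    """Expand each decimal entry into a row of n bits, leftmost bit first."""
    return [[(v >> (n - 1 - j)) & 1 for j in range(n)] for v in dlut]


def lut_to_dlut(lut):
    """Read each row of bits as one decimal number."""
    return [int("".join(str(bit) for bit in row), 2) for row in lut]


def coordinate(dlut, n, i):
    """Truth table of the i-th coordinate: column i of the LUT (1-based, from the left)."""
    return [(v >> (n - i)) & 1 for v in dlut]


toy = [3, 6, 0, 5, 7, 1, 4, 2]       # hypothetical bijective (3, 3) S-box
rows = dlut_to_lut(toy, 3)
print(rows[0])                       # row for the entry 3, i.e. [0, 1, 1]
print(coordinate(toy, 3, 1))         # first (leftmost) coordinate
```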

III. CHAOS-BASED S-BOX CONSTRUCTIONS
The methods involved in CF S-box constructions are manifold. Examples include a chaos function combined with the travelling salesman problem [10], chaotic substitution box design [11], a 1D chaotic map combined with β-Hill climbing [12], a chaotic map combined with sine-cosine optimization [13], a chaotic system with multiple attractors [14], a chaotic map combined with heuristics [15], a one-dimensional discrete chaotic map [16], hyperchaotic systems [17] [18], spatiotemporal chaotic dynamics [19], a chaotic map combined with genetic algorithms [20], chaotic logistic maps combined with bacterial foraging optimization [21] and many others (see Table 6). Usually, the best candidate of each method is further compared to others in terms of important cryptographic properties like nonlinearity, differential uniformity [22] and the strict avalanche criterion (SAC) [23]. The majority of authors emphasize the ACNV of their best candidate. In Table 1 the coordinate nonlinearities of several S-box candidates achieved by some CF-based methods are presented. A more detailed overview is given in Table 6.
The actual nonlinearity of an S-box is the minimum nonlinearity over all the components of the S-box. For example, let us take an arbitrary S-box $F(5, 5)$. Each column of $F_{LAT}$ is determined by some linear combination of coordinates of $F$, sorted lexicographically, from left to right, by the binary representation of the column index, zero-padded to 5 bits. Let $F_{LAT}[i]$ denote the $i$-th column of $F_{LAT}$. Then, for example, the $F_{LAT}[11]$ column holds the nonlinear characteristics of the Boolean function $f_1 \oplus f_3 \oplus f_4$, while $F_{LAT}[4]$ holds the nonlinear characteristics of the Boolean function $f_3$. In Figure 1 the coordinate decomposition of $F_{LAT}$ is visualized. Each coordinate is associated with a distinct color. The number of segments in each column corresponds to the number of terms in the respective linear combination of coordinates. Since $F_{LAT}[0]$ is the trivial linear combination (all coefficients are equal to zero), we leave the first column of Figure 1 colorless. For technical reasons and better illustration, the coordinate decomposition example is based on a $(5, 5)$ S-box. However, it is applicable to S-boxes of any dimension.
As stated in Definition 5, we seek the maximum absolute value $v$ of all the elements in the LAT of an S-box $S(n, n)$ to find the nonlinearity of $S$, i.e. $S_{NL} = 2^{n-1} - v$. In Table 2 the actual nonlinearity of each S-box from Table 1 is calculated. The deviations observed are due to the fact that the designers consider the nonlinearity values of coordinates only (the non-segmented columns in the $(8, 8)$ coordinate decomposition).
In the context of block ciphers, a low S-box nonlinearity value is associated with weak resistance of the cipher to linear cryptanalysis [24] [2] [25].
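The gap between the ACNV and the actual nonlinearity can be reproduced on a small scale. The sketch below assumes a bijective S-box given as a list of integers and uses a fast Walsh-Hadamard transform to evaluate each component. The $(3, 3)$ S-box `toy` is a hypothetical example constructed for this illustration (it does not come from Table 1): its three coordinates are $x_1 x_2 \oplus x_3$, the same function XORed with $x_1$, and $x_2 \oplus x_1 x_2 \oplus x_1 x_3$, so that two coordinates sum to the linear function $x_1$.

```python
def walsh_spectrum(tt):
    """Fast Walsh-Hadamard transform of a 0/1 truth table of length 2^n."""
    w = [1 - 2 * bit for bit in tt]
    h = 1
    while h < len(w):
        for start in range(0, len(w), 2 * h):
            for k in range(start, start + h):
                w[k], w[k + h] = w[k] + w[k + h], w[k] - w[k + h]
        h *= 2
    return w


def bf_nonlinearity(tt):
    """Distance of a Boolean function to the nearest affine function."""
    return (len(tt) - max(abs(v) for v in walsh_spectrum(tt))) // 2


def component(sbox, mask):
    """Truth table of the linear combination of coordinates selected by mask."""
    return [bin(mask & y).count("1") & 1 for y in sbox]


def acnv(sbox, n):
    """Average nonlinearity of the n coordinates (single-bit masks only)."""
    return sum(bf_nonlinearity(component(sbox, 1 << j)) for j in range(n)) / n


def sbox_nonlinearity(sbox, n):
    """Minimum nonlinearity over all 2^n - 1 nonzero components."""
    return min(bf_nonlinearity(component(sbox, m)) for m in range(1, 1 << n))


toy = [0, 6, 1, 7, 2, 5, 4, 3]       # hypothetical (3, 3) S-box, see the lead-in
print(acnv(toy, 3))                  # every coordinate is nonlinear
print(sbox_nonlinearity(toy, 3))     # but one component is linear
```

For this toy S-box every coordinate reaches the best value available to balanced functions of three variables, while the S-box as a whole offers no nonlinearity at all, which is the kind of deviation reported in Table 2.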

IV. ALTERNATIVE CONSTRUCTION
As we have shown in the previous section, the average value of the nonlinearities of the coordinates of a given S-box $S$ does not correspond to the actual nonlinearity of $S$. However, from the designer's perspective, if a higher value of ACNV is desirable, a new heuristic construction is suggested.

V. RESULTS PART I
By using a magnitude of 10, we repeatedly generated S-boxes with high coordinate nonlinearities. During our experiments, we tried various magnitude values. However, larger and smaller values of the magnitude are, respectively, too aggressive or too tolerant toward the largest elements of the S-box LAT.
In Figure 7 the DLUT, in hexadecimal format, of an optimized S-box $S_c(8, 8)$ is presented. The first row and column of the table correspond respectively to the first and second half of the input in hexadecimal format. For example, the input 11110101, equal to f5, is transformed by $S_c$ to 5d. The characteristics of $S_c$ are summarized in Tables 3 and 4.
In [27], Table 5, a summary of the CF-based S-box constructions found in the literature is presented (an updated version can be found in Table 6). We significantly outperform all of them in terms of ACNV and SAC, reaching the optimal SAC value of 0.5.
We further launched the algorithm on some popular $(8, 8)$ S-box constructions. However, because of the nondeterministic nature of the optimization process, it is difficult to match a given input S-box $S_{start}$, which is to be optimized, with the final optimized S-box $S_{end}$. To achieve such matching, we have restricted the algorithm from changing the first 16 elements of $S_{start}$. This allows us to further demonstrate the flexibility of the optimization process. Furthermore, since the first 16 elements of $S_{start}$ and $S_{end}$ are always shared, $S_{end}$ can be successfully matched to $S_{start}$.
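The restriction on the first 16 elements can be sketched as a move generator that only ever exchanges entries outside a frozen prefix. This is an illustrative assumption about how such a restriction might be enforced, not the paper's implementation; the name `restricted_swap` and its parameters are hypothetical.

```python
import random


def restricted_swap(dlut, frozen=16, rng=random):
    """Return a copy of dlut with two entries exchanged.

    The first `frozen` positions are never selected, so the optimized
    S-box always shares this prefix with the starting S-box.
    """
    i, j = rng.sample(range(frozen, len(dlut)), 2)
    out = list(dlut)
    out[i], out[j] = out[j], out[i]
    return out


rng = random.Random(2020)
s_start = list(range(256))
rng.shuffle(s_start)                          # stand-in for a real (8, 8) S-box
s_next = restricted_swap(s_start, frozen=16, rng=rng)
print(s_next[:16] == s_start[:16])            # the shared prefix is preserved
```

Because every move is a transposition, the S-box stays bijective, and the untouched prefix serves as a fingerprint linking the optimized S-box to its origin.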
In Figures 3 and 4, versions of the Rijndael [6] and Whirlpool [7] S-boxes optimized by Algorithm 1 are presented. The colored cells represent those elements of the corresponding S-box that were not modified during the optimization process. Furthermore, in Figures 5 and 6, the optimized versions of the Fantomas [28] and Skipjack [29] S-boxes are given.
All of the aforementioned S-boxes are optimized to the ACNV of 114.0. Algorithm 1 was implemented with the built-in tools provided by the open-source mathematical software system SageMath [30].
Each bijective S-box $S(n, n)$ can be represented as a collection of $n$ bandits, such that each bandit uniquely corresponds to one of the $n$ coordinates of $S$. The arms of each bandit could be associated with the operation of applying a single transposition in $S$, while the profit of our action could be measured with the fitness function presented in Algorithm 1.
Associating each one of the possible $\binom{n}{2}$ transpositions of elements of $S_{DLUT}$ with a distinct arm in each bandit is a trivial and non-working model: in the end, the bandits would be indistinguishable. Having this in mind, the following model is constructed:
• Property I: Since each bandit uniquely corresponds to some coordinate of $S$, each bandit arm is restricted to initiate a transposition of two bits inside a column of $S_{LUT}$ only (instead of a transposition of any two elements in $S_{DLUT}$).
• Property II: To keep the bijective property of $S$, in case an arm of some bandit is activated, the set of all distinct $\binom{2^n}{2}$ bit transpositions in a given coordinate of $S_{LUT}$ is restricted to a subset of transpositions with a size of $2^{n-1}$.
The restriction introduced in Property II is motivated by the following observations:
1) Existence: If $b_1 b_2 \cdots b_i \cdots b_n$ is a row from $S_{LUT}$, flipping the bit $b_i$ will result in some other row $R$ of $S_{LUT}$. Otherwise, if $R$ is not among the rows of $S_{LUT}$, $S$ is not surjective, therefore not bijective, which contradicts our initial choice of $S$.
2) One-to-one Mapping: If $b_1 b_2 \cdots b_i \cdots b_n$ is a row from $S_{LUT}$, flipping the bit $b_i$ will result in only one row $R$ of $S_{LUT}$. Otherwise, $S$ is not injective, therefore not bijective, which contradicts our initial choice of $S$.
3) Search space: The total number of distinct bit sequences of the form $b_1 b_2 \cdots b_{i-1} b_{i+1} \cdots b_n$ is $2^{n-1}$.
Let us denote by $B_i$ the bandit that corresponds to the $i$-th coordinate of $S$. Each bandit consists of $2^{n-1}$ distinct arms, such that each arm of $B_i$ corresponds to a distinct value of $b_1 b_2 \cdots b_{i-1} b_{i+1} \cdots b_n$. Activating an arm of $B_i$ results in interchanging two rows of $S_{LUT}$ that differ only in bit position $i$.
For example, let us consider an S-box $X(4, 4)$, with $X_{DLUT} = [15, 14, 9, 2, 11, 3, 12, 4, 1, 13, 7, 8, 6, 10, 5, 0]$. $X$ is a bijective S-box with dimension 4. Therefore, we can transform $X$ into an 8-armed 4-bandit problem. In Figure 8, a visual interpretation of the bandit transformation of $X$ is shown. Each row corresponds to a distinct bandit, while each pair of cells inside a given row, sharing the same color, corresponds to an arm of the given bandit. The x-axis represents the indexes of the elements of $X_{DLUT}$ (starting from 1).
As an illustration, if we activate the white arm of bandit 1, we interchange the elements of $X_{DLUT}$ with indexes 14 and 4, i.e. 10 and 2. Their respective binary representations (zero-padded to 4 bits) are 1010 and 0010 (they differ only in bit position 1).
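The arm activation can be sketched in a few lines. The sketch assumes that bit position 1 is the leftmost bit, as in the example above, and encodes an arm as the integer formed by the remaining $n - 1$ bits, counted from 0 (an illustrative encoding; the paper does not fix one). The function name `activate` is hypothetical.

```python
def activate(dlut, n, bandit, arm):
    """Exchange the two entries that differ only in bit position `bandit`.

    bandit : 1-based bit position, counted from the left
    arm    : integer in [0, 2^(n-1)) holding the other n-1 bits
    """
    shift = n - bandit                      # same position, counted from the right
    low = arm & ((1 << shift) - 1)          # bits to the right of the flipped bit
    high = arm >> shift                     # bits to the left of the flipped bit
    row0 = (high << (shift + 1)) | low      # flipped bit set to 0
    row1 = row0 | (1 << shift)              # flipped bit set to 1
    out = list(dlut)
    p, q = out.index(row0), out.index(row1)
    out[p], out[q] = out[q], out[p]
    return out


X = [15, 14, 9, 2, 11, 3, 12, 4, 1, 13, 7, 8, 6, 10, 5, 0]
Y = activate(X, 4, bandit=1, arm=0b010)     # exchanges the entries 0010 and 1010
print(Y[3], Y[13])                          # the entries at 1-based indexes 4 and 14
```

Since the two exchanged entries agree in every bit except the first, the remaining three columns of the LUT are left untouched, and the S-box stays bijective because the move is a transposition.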
The profit (if any) of activating an arm of a bandit $B_i$ is measured by the same function $E$ presented in Algorithm 1.
The transformation of the $(n, n)$ bijective S-box ACNV optimization problem to the $2^{n-1}$-armed $n$-bandit problem allows us to focus on the optimization of the nonlinearity of single coordinates. Furthermore, by design, activating an arm of a given bandit does not affect the states of the other bandits. Having this in mind, Algorithm 2 is proposed.

VII. RESULTS PART II
Our implementation of Algorithm 2 is based on a simple strategy Λ: we always choose a bandit that possesses the lowest nonlinearity. In case there are several bandits sharing the lowest value of nonlinearity, we choose one of them at random.

Algorithm 2
1: s ← R(n)  ▷ the function R(n) generates a pseudo-random bijective S-box $S(n, n)$
2: Ω ← MODEL(s)  ▷ we transform the S-box s to a $2^{n-1}$-armed $n$-bandit problem
3: repeat
4:   bandit ← random(1, n, Λ)  ▷ we choose a random bandit from $[1, n]$, based on some profit-maximizing strategy Λ
5:   arm ← random(1, $2^{n-1}$)  ▷ we choose a random arm from $[1, 2^{n-1}]$
6:   oldBandit ← E(bandit)
7:   ACTIVATE(bandit, arm)
8:   if E(bandit) < E(oldBandit) then
9:     ▷ we update the model
10: ...

We launched Algorithm 2 as a stand-alone optimization routine, starting from pseudo-randomly generated S-boxes, and in almost all instances we reached S-boxes with an average coordinate nonlinearity value of 112. However, when we initialized Algorithm 2 with S-boxes that had already been optimized by Algorithm 1, we reached an average coordinate nonlinearity value of 114.5. An example of such an S-box is given in Figure 9. The corresponding nonlinearity by coordinates is given in Table 5.
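The loop of Algorithm 2 with the strategy Λ can be sketched as follows. Several parts are assumptions made for this illustration: the cost function `cost` is a stand-in for $E$ (it ranks a coordinate by its largest absolute Walsh value and by how often that value occurs) and is not the fitness function of Algorithm 1; a move is kept only when this cost strictly decreases; arms are numbered from 0; and the search runs for a fixed number of steps. All function names are hypothetical.

```python
import random


def walsh_spectrum(tt):
    """Fast Walsh-Hadamard transform of a 0/1 truth table."""
    w = [1 - 2 * bit for bit in tt]
    h = 1
    while h < len(w):
        for start in range(0, len(w), 2 * h):
            for k in range(start, start + h):
                w[k], w[k + h] = w[k] + w[k + h], w[k] - w[k + h]
        h *= 2
    return w


def cost(dlut, n, i):
    """Stand-in for E: (largest |Walsh value|, its multiplicity) of coordinate i."""
    column = [(v >> (n - i)) & 1 for v in dlut]
    spec = [abs(v) for v in walsh_spectrum(column)]
    peak = max(spec)
    return peak, spec.count(peak)


def coordinate_nl(dlut, n, i):
    return ((1 << n) - cost(dlut, n, i)[0]) // 2


def acnv(dlut, n):
    return sum(coordinate_nl(dlut, n, i) for i in range(1, n + 1)) / n


def activate(dlut, n, bandit, arm):
    """Exchange the two entries that differ only in bit `bandit` (1-based, from the left)."""
    shift = n - bandit
    low, high = arm & ((1 << shift) - 1), arm >> shift
    row0 = (high << (shift + 1)) | low
    row1 = row0 | (1 << shift)
    out = list(dlut)
    p, q = out.index(row0), out.index(row1)
    out[p], out[q] = out[q], out[p]
    return out


def bandit_search(n, steps, seed=0):
    rng = random.Random(seed)
    start = list(range(1 << n))
    rng.shuffle(start)                        # pseudo-random bijective S-box
    s = list(start)
    for _ in range(steps):
        scores = [coordinate_nl(s, n, i) for i in range(1, n + 1)]
        lowest = min(scores)
        weakest = [i for i, v in enumerate(scores, start=1) if v == lowest]
        bandit = rng.choice(weakest)          # strategy Λ: weakest coordinate, ties at random
        arm = rng.randrange(1 << (n - 1))     # random arm of that bandit
        candidate = activate(s, n, bandit, arm)
        if cost(candidate, n, bandit) < cost(s, n, bandit):
            s = candidate                     # keep the move only if it pays off
    return start, s


start, end = bandit_search(4, 300, seed=1)
print(acnv(start, 4), acnv(end, 4))
```

Because an activated arm changes a single coordinate and is accepted only when the largest Walsh value of that coordinate does not grow, the ACNV never decreases during the search.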
In Table 6, an extended S-box comparison between the state-of-the-art methods is given. The entries are sorted, in increasing order, by ACNV (the last column).

VIII. CONCLUSION AND FUTURE WORK
CF-based S-box construction is a relatively new and interesting technique, which interconnects the tools provided by various academic disciplines with the problem of finding secure cryptographic primitives.
In this paper, we analyzed the actual linear cryptanalysis resistance of CF-based S-boxes, which differs from the average nonlinearity value announced by a great number of papers. Integrating such S-boxes in a cryptosystem should be done with considerable caution. For example, if we replace the Rijndael S-box in AES [6] with some CF-based S-box with a higher ACNV but lower overall nonlinearity, the resulting modified block cipher will be significantly weaker in terms of resistance to linear cryptanalysis. Furthermore, we show that the value of exploiting chaos structures, in the context of the nonlinearity optimization problem, is arguable. Thus, the benefits of using chaos structures in the design of S-boxes are unclear and yet to be determined. However, as stated in [82], chaos-based designs may be an alternative against application attacks, such as side-channel analysis.
Nevertheless, from the designer's perspective, if the overall nonlinearity value of an S-box $S$ is of negligible importance compared to the average nonlinearity value of all coordinates of $S$, two novel S-box constructions are suggested.
While Algorithm 1 yields better results than Algorithm 2, the latter could be used as an extension of Algorithm 1 to further improve the parameters of the resulting S-box. The methods presented in this paper significantly outperform all other state-of-the-art methods for designing S-boxes with high ACNV.
The linkage of the n-armed bandit problem to the problem of finding such S-boxes opens an interesting area of future research: the investigation of how other state-of-the-art methods, such as the concept of fuzzy graphs [83] [84], stochastic optimization techniques [85] [86] [87], or exploration-exploitation algorithms [88] [89] [90], could be exploited to further maximize the ACNV of a given S-box.
An interesting open question is to what extent the ACNV of an $(8, 8)$ bijective S-box could be optimized. As summarized in [91], the maximal nonlinearity value achieved in balanced Boolean functions with 8 variables is 116. Therefore, if an ACNV greater than 116.0 is ever found for an $(8, 8)$ bijective S-box, at least one of its eight coordinates will possess a nonlinearity value of 118, which would finally answer the long-standing problem of the maximum possible nonlinearity value of eight-variable balanced Boolean functions. Furthermore, as shown in [92], the upper bound for eight-variable balanced Boolean functions is less than 120. Thus, the maximum theoretically possible ACNV of $(8, 8)$ bijective S-boxes is less than or equal to 118.0, but most probably, considering the academic skepticism that eight-variable balanced Boolean functions with nonlinearity value 118 really exist, less than or equal to 116.0.

MIROSLAV M. DIMITROV was born in Yambol, Bulgaria in 1985. He received the B.S. degree in Informatics and M.S. degree in Information Security, both from the Faculty of Mathematics and Informatics, Sofia University. He is currently pursuing the Ph.D. degree in Informatics at Bulgarian Academy of Sciences. His research interests include cryptology, algorithms, and sequences.