Evolutionary Algorithm-based images, humanly indistinguishable and adversarial against Convolutional Neural Networks: efficiency and filter robustness

Convolutional neural networks (CNNs) have become one of the most important tools for image classification, yet many models are susceptible to adversarial attacks that lead them to misclassify. In previous works, we developed an EA-based black-box attack that creates adversarial images for the target scenario fulfilling two criteria: the CNN should classify the adversarial image in the target category with a confidence ≥ 0.95, and a human should not notice any difference between the adversarial and original images. Thanks to extensive experiments performed with the CNN C = VGG-16 trained on the CIFAR-10 dataset to classify images according to 10 categories, this paper, which substantially enhances most aspects of [1], addresses four issues. (1) From a pure EA point of view, we highlight the conceptual originality of our algorithm EA^{target,C}_d versus the classical EA approach; the resulting competitive advantage is assessed experimentally on the image classification task. (2) We then measure the intrinsic performance of the EA-based attack for an extensive series of ancestor images. (3) We challenge the filter resistance of the adversarial images created by the EA against five well-known filters. (4) We proceed to the creation of natively filter-resistant adversarial images that fool humans, CNNs, and CNNs composed with filters.


I. INTRODUCTION
In 2012, Krizhevsky et al. [2] presented the outstanding performance of convolutional neural networks (CNNs) on a very difficult image classification task [3]. Since then, as computing capacity has increased, CNNs have become the main driver of many computer vision applications, including image classification [4]-[7], facial recognition [8], [9], malware detection [10]-[12], email spam filters [13], speech recognition [14], [15], robotics [16], and self-driving cars [17], [18]. Despite their increasing power and the variety of applications, CNNs are susceptible to deception. In the context of image classification, by analogy with Trompe-l'oeil, which challenges human visual perception, a CNN can be led to misclassify objects in an image. The generic attack to create such specially crafted adversarial images consists of adding some appropriate noise to a legitimate input, leading the network to label the new input in a different category than expected [13], [19], [20].
White-box and black-box attacks differ according to the level of knowledge about the targeted CNN at the attacker's disposal. In the former case, the attacker has complete knowledge of the CNN model, its design, and its parameters; in the latter, the attacker's knowledge is essentially limited to the size of the images handled by the CNN and to the classification values it outputs for ad-hoc queries (without any information about how these values are obtained). Starting from an original image labeled by a CNN as representing an object of a specific category, methods that create adversarial images may adopt different scenarios. For instance, a targeted attack creates an adversarial image that the CNN misclassifies as belonging to a particular class fixed a priori, different from the original one. A different scenario is addressed by untargeted attacks, which only require the CNN to misclassify the adversarial image as belonging to any class whatsoever, provided that this class differs from the original one.
Although efficient against a CNN, the perturbations added to an original image to create an adversarial image may be highly noticeable to a human eye, as illustrated in Fig. 1 (a), (b), (c), and (d). Our evolutionary algorithm-based black-box, targeted attack EA^{target,C}_d (introduced in [21], [22], see also [23]) differs from existing techniques in this respect. Not only does our evolutionary algorithm (EA) efficiently produce adversarial images that deceive the targeted CNN model C with high confidence, but the perturbations it adds to the original image are not perceptible to the human eye, as shown in Fig. 1e (original image in the first row, our adversarial image in the second row). In Fig. 1, the first row shows the original images, and the second row the adversarial images with their respective class labels, created by (a) the One-Pixel attack [24], (b) the Few-Pixels attack [25], (c) Fooling Transfer Net (FTN) [26], (d) Scratch that! [27], and (e) our EA-based attack [21], [22].
The purpose of this study, which substantially enhances most aspects of [1], is to address four issues: (1) the conceptual originality and competitive advantage of our algorithm EA^{target,C}_d, (2) the intrinsic performance of this EA-based attack, (3) the filter resistance of the adversarial images created by the EA, and (4) the creation of natively filter-resistant adversarial images. Before being more specific, let us point out that all experiments in this paper are performed with distance d = L_2 for the CNN C = VGG-16 [6], [28] trained on the CIFAR-10 [29] dataset to classify images according to 10 categories, and mainly address the target scenario, but also, to a lesser extent, the untargeted scenario (see Section II).
Issue (1) (see Section III) relates to a series of conceptual differences, from a "pure" evolutionary algorithm point of view (hence independent, to some extent, of the task to perform), between our adapted version and the classical EA approach [30]. To assess the practical impact of these conceptual differences, we analyzed their respective performances in creating adversarial images under a demanding definition of a successful attack. Indeed, for the (c_a, c_t) target scenario performed on an ancestor A classified by VGG-16 in c_a, we require these algorithms to create, in fewer than 7000 generations, an adversarial image D classified by VGG-16 as belonging to c_t ≠ c_a with a c_t-label value ≥ 0.95, while remaining so close to A that a human would not notice any difference between the adversarial image and the ancestor. The first outcome of this study is that our "adapted_EA" significantly outperforms the "classic_EA" approach at creating such adversarial images in all considered cases, since it requires between 8% and 25% fewer generations to terminate successfully.
We then address issue (2) by a thorough and extended efficiency study of our EA-based attack with the "adapted_EA" version (see Section IV). In a first series of experiments with one ancestor per category of CIFAR-10, we performed 10 independent runs per ancestor per target category, leading to a total of 900 attacks. The algorithm EA^{target,VGG-16}_{L2} achieves a 100% success rate (all ancestor/target category combinations are achieved for at least one of the 10 runs performed on each ancestor), requiring between 290 and 2793 generations on average, depending on the (c_a, c_t) target scenario. To better assess the importance of the choice of the ancestor in a given category c_a, and the impact of the seed value used for a specific run, we extended these experiments. In a second series of experiments, we randomly selected 50 distinct ancestors for each of the 10 categories of CIFAR-10 and altogether ran 4500 attacks for the target scenario. In this case, our algorithm achieved a success rate of 98%, requiring between 461 and 1717 generations on average. Moreover, both series of experiments show that a run of EA^{target,VGG-16}_{L2} has a probability of more than 96% (actually 96.56% for the former series, and 98.06% for the latter) of terminating successfully, and of creating images that fool both humans and VGG-16 trained on CIFAR-10, despite our demanding requirements for a successful termination.
Issues (3) and (4) (addressed in Sections V and VI, respectively) deserve to be put into the following broader perspective. Let A be an image classified by a CNN C in some category c_a, and let D be an adversarial image that, for the target scenario, C classifies in a distinct category c_t (at this stage, the type of attack that leads to D does not matter). One now considers a function F that acts on such images to create images F(A) and F(D) of the size handled by the CNN (which coincides with the common size of A and D in the present case). How does the CNN classify these new images? Does F(D) remain adversarial, or does the composition C ∘ F (which consists in putting F ahead of C) protect C against the attack? If the latter holds, can one adapt the attack to create images that fool not only C, but also the F-enhanced CNN C ∘ F? If yes, would such images, adversarial for C ∘ F, be adversarial as well for C ∘ G with G ≠ F, and hence have the capability to fool the same CNN C enhanced by other functions G?
Among the different meaningful functions F one could think of in this context, we undertake this study for filters. Indeed, used daily in image processing, filters substantially impact the visual appearance of images for a human eye on the one hand, and potentially affect the classification process of a trained CNN on the other. It is therefore tempting to check whether adding filters may prevent CNNs from misclassification, or reduce this risk to some extent, when facing an adversarial image. Additionally, one may also want to evaluate the quality of adversarial images by their capacity to mimic the ancestor image's behavior when exposed to filters.
For reasons given in Section V, in which issue (3) is discussed, we select five filters, namely the inverse filter (F_1), the Gaussian blur filter (F_2), the median filter (F_3), the unsharp mask filter (F_4), and the combination F_5 of the last two filters. With each of them, we filter the ancestors A_a and the adversarial images D_{a,t}(A_a) created by the EA^{target,VGG-16}_{L2} algorithm in Section IV. VGG-16 is then challenged with the filtered images. The values of a series of specifically designed indicators lead to two conclusions. On the one hand, the inverse and unsharp mask filters are significantly inefficient against our EA: for instance, 95% of the adversarial images filtered by F_4 remain adversarial for the target scenario, and 95% remain adversarial for the untargeted scenario (in a relaxed sense made precise in that section). A contrario, the other filters, especially the combination F_5, render our EA-based attack less effective for both the target and the untargeted scenarios.
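For concreteness, the five filters can be approximated with small numpy-only implementations. This is an illustrative sketch: the kernel sizes, the unsharp amount, the interpretation of the inverse filter as pixel negation, and the composition order of F_5 are assumptions, since the actual parameters are fixed in Section V.

```python
import numpy as np

def convolve3x3(img, kernel):
    """Same-size 3x3 convolution with edge replication, per channel."""
    pad = np.pad(img, ((1, 1), (1, 1), (0, 0)), mode="edge").astype(float)
    out = np.zeros(img.shape, dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * pad[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return np.clip(out, 0, 255).astype(np.uint8)

# Normalized 3x3 Gaussian kernel (illustrative size).
GAUSS_3x3 = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], dtype=float) / 16.0

def inverse_filter(img):                    # F_1: pixel-wise negative (assumption)
    return 255 - img

def gaussian_blur(img):                     # F_2
    return convolve3x3(img, GAUSS_3x3)

def median_filter(img):                     # F_3: 3x3 median, per channel
    pad = np.pad(img, ((1, 1), (1, 1), (0, 0)), mode="edge")
    stack = [pad[dy:dy + img.shape[0], dx:dx + img.shape[1]]
             for dy in range(3) for dx in range(3)]
    return np.median(np.stack(stack), axis=0).astype(np.uint8)

def unsharp_mask(img, amount=1.0):          # F_4: add back the detail layer
    blur = gaussian_blur(img).astype(float)
    out = img.astype(float) + amount * (img.astype(float) - blur)
    return np.clip(out, 0, 255).astype(np.uint8)

def f5(img):                                # F_5: F_3 then F_4 (order assumed)
    return unsharp_mask(median_filter(img))
```

Applying any of these to a 32 × 32 × 3 ancestor or adversarial image yields the filtered input F(A) or F(D) that the composition C ∘ F would classify.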
This led us to address the final issue (4). For a filter F, we conceived a filter-enhanced F-fitness function (see Section VI), and the corresponding algorithm EA^{target,VGG-16}_{L2,F}, obtained from EA^{target,VGG-16}_{L2} by updating the fitness function accordingly. For the reasons given in Section VI, we select F = F_5, and allocate to EA^{target,VGG-16}_{L2,F5} the task of creating adversarial images that are moreover natively immune against filter F_5. In other words, these adversarial images simultaneously fool C and C ∘ F_5 for C = VGG-16 in the target scenario (still with the demanding target label value ≥ 0.95), while remaining so close to the ancestor that no human eye would notice any difference. We performed experiments similar to those of issue (2). The first series of 900 attacks (one ancestor per ancestor category, 10 independent runs for each (c_a(A_a), c_t) scenario) shows that EA^{target,VGG-16}_{L2,F5} achieves a success rate of 96.66% (three combinations were not achieved), and that the probability that it terminates successfully for a given run is 95.77%, requiring between 798 and 2746 generations on average for the successful (c_a(A_a), c_t) combinations considered. In a second series of 4500 attacks performed with 50 different ancestors per category, EA^{target,VGG-16}_{L2,F5} showed a success rate of 88%, requiring between 1250 and 2404 generations on average.
We complete study (4) by exploring whether an adversarial image, constructed by EA^{target,VGG-16}_{L2,F5} to fool both C and C ∘ F_5, would also be adversarial against C ∘ F_k for the other filters F_1, F_2, F_3, and F_4, for C = VGG-16. Our study shows that this is the case for F_3 and F_4, with (depending on the target or untargeted scenario) between 83% and 89% of the images remaining adversarial against these filters. For the untargeted scenario, 56% of these images were also adversarial against F_1, while this percentage dropped to 23% for F_2. Therefore, the EA^{target,VGG-16}_{L2,F5} attack, designed to be robust against C and C ∘ F_5 for C = VGG-16, is also robust to a significant extent against all individual filters for the untargeted scenario.
Section VII summarizes the conclusions of this case study, and provides a series of research directions.

II. THE TARGET SCENARIO ON VGG-16 TRAINED ON CIFAR-10
Although applicable to any CNN trained at image classification on any dataset, we instantiate our approach on the concrete case of VGG-16 [6] trained on CIFAR-10 [29].

A. VGG-16 TRAINED ON CIFAR-10
The CIFAR-10 dataset encompasses 50,000 training images and 10,000 test images of size 32 × 32 × 3, meaning that each image has a width and height of 32 pixels, and each pixel has a color resulting from three RGB values. Once trained, VGG-16 sorts images according to the 10 categories c_i of CIFAR-10 listed in the 2nd row of Table 1, composed of 4 "Object" categories (c_1, c_2, c_9, c_10) and 6 "Animal" categories (c_i for 3 ≤ i ≤ 8). The 4th row of Table 1 displays the original ancestor images A_a in the categories c_a (and their respective c_a-label values, see below) used throughout this paper, and the 3rd row gives their reference numbers in the test set of CIFAR-10.
In practice, an input image I given to VGG-16 trained on CIFAR-10 is processed through 16 layers to produce a classification output vector o_I = (o_I[c_1], ..., o_I[c_10]) of label values: the higher the value o_I[c_k], the higher the confidence that I represents an object of category c_k.

TABLE 1: For 1 ≤ a ≤ 10, the image A_a (and its reference number n_o in the test set of CIFAR-10) classified by VGG-16 in the category c_a, with its corresponding c_a-label value. These images are used as ancestors in most of our experiments.

a    | 1     | 2   | 3    | 4   | 5    | 6   | 7    | 8     | 9    | 10
c_a  | plane | car | bird | cat | deer | dog | frog | horse | ship | truck
n_o  | 281   | 82  | 67   | 91  | 455  | 16  | 29   | 17    | 1    |
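In the black-box setting, the attacker only sees such an output vector for each query. A self-contained numpy sketch of reading it (the logits below are invented for the example, not actual VGG-16 outputs):

```python
import numpy as np

CATEGORIES = ["plane", "car", "bird", "cat", "deer",
              "dog", "frog", "horse", "ship", "truck"]

def softmax(logits):
    z = logits - logits.max()          # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def classify(output_vector):
    """Return (category, label value) for the largest entry of o_I."""
    k = int(np.argmax(output_vector))
    return CATEGORIES[k], float(output_vector[k])

# Illustrative logits for some image I; only the resulting output
# vector o_I is visible to a black-box attacker.
o_I = softmax(np.array([0.1, 0.0, 3.2, 0.4, 0.0, 0.2, 0.0, 0.1, 0.0, 0.0]))
category, value = classify(o_I)        # here: "bird", with its label value
```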

B. TARGETED AND UNTARGETED SCENARIOS
The target scenario consists of first choosing two different categories c_t ≠ c_a among the 10 categories of CIFAR-10. Then, one is given an ancestor image A labeled by VGG-16 as belonging to c_a. Finally, one constructs an adversarial image D, classified by VGG-16 as belonging to c_t, although D remains so close to A that a human would likely classify D as belonging to c_a, or even be unable to distinguish D from A. The classification threshold value is set at τ = 0.95, meaning that such a D has achieved its purpose if o_D[c_t] ≥ 0.95. We shall also encounter in Section V the slightly different untargeted scenario. In this case, an adversarial image D is still required to be similar to A for a human eye, but one only requires that VGG-16 classifies D as belonging to a category c ≠ c_a, in the limited sense that the label value of c outputted by VGG-16 for D is the largest among all label values, and is strictly larger than the label value of c_a. In particular, an image adversarial for the target scenario is also adversarial for the untargeted scenario, but the converse may not hold.
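The two success criteria can be stated as simple predicates on the output vector o_D (a minimal sketch following the definitions above; `t` and `a` are the indices of c_t and c_a):

```python
import numpy as np

TAU = 0.95  # classification threshold for the target scenario

def is_targeted_adversarial(o_D, t):
    """Target scenario: the c_t-label value must reach the threshold tau."""
    return o_D[t] >= TAU

def is_untargeted_adversarial(o_D, a):
    """Untargeted scenario: some category c != c_a carries the strictly
    largest label value (argmax is taken as unique here)."""
    return int(np.argmax(o_D)) != a
```

Since the label values sum to 1, o_D[t] ≥ 0.95 with t ≠ a forces c_t to carry the largest label value, which is why a targeted success implies an untargeted one.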

III. "ADAPTED_EA" VERSUS "CLASSIC_EA"
Our evolutionary algorithm EA^{target,C}_d (see [1], [21]-[23], [31]) is a black-box, targeted attack that constructs adversarial images against a CNN C in the sense sketched in Subsection II-B, where d is a metric assessing the proximity, for a human eye, between the evolved images and the original image.
In this section, we show, from a "pure" evolutionary algorithm point of view, that EA^{target,C}_d presents a series of important and substantial differences compared to the approach classically adopted for EAs performing similar tasks [30], and we show that these differences lead to a comparative advantage in terms of performance. First, we examine these differences from a conceptual point of view, that is, independently of any specific task. For simplicity, we refer to our version as "adapted_EA" and to the classical version as "classic_EA". We then compare the performances of these algorithms on the task of fooling VGG-16 trained on CIFAR-10 for image recognition in the target scenario. In other words, the algorithms are given the task of evolving an ancestor image A into an adversarial image D fulfilling the conditions described in Subsection II-B. We specify the parameters of the EAs, and run the algorithms for four different ancestor/target combinations.
All experiments in this paper were implemented in Python 3.7 with the NumPy [32] library. For the filter experiments in Sections V and VI, we used the OpenCV implementation library [33]. Keras [34] was used to load and run the VGG-16 [6] model. The experiments were performed on nodes with NVIDIA Tesla V100 GPGPUs of the IRIS HPC Cluster at the University of Luxembourg [35].

A. CONCEPTUAL DIFFERENCES BETWEEN "ADAPTED_EA" AND "CLASSIC_EA"
To illustrate the differences between our version ("adapted_EA") and the classic version ("classic_EA", as described in [30]) of an EA, let us provide their respective algorithmic pseudocodes. We assume that both have a fixed population size, which remains constant from generation to generation. For both, we set the initial population as made of identical copies of the considered ancestor. Based on our experiments, we considered a population size of 160 as the best trade-off in terms of speed and accuracy. The main difference between "classic_EA" (as described in Algorithm 1) and our version (as described in Algorithm 2) lies in the process of selection, recombination, and mutation. In "classic_EA", the best 10-20% of the population are selected as elites (hence between 16 and 32 individuals), and new offspring are generated from these elites by recombination and mutation. Then the last 10-20% (idem) of the population are eliminated, so that only these 10-20% are updated at each generation. In our version, however, the number of elites is set to the first 10 individuals; the algorithm then modifies the whole rest of the population (150 individuals) by eliminating, mutating, and recombining with elites, and does so from the first generation on.
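The adapted update scheme can be sketched in numpy. This is an illustrative sketch only: the `crossover` and `mutate` operators below are simplified stand-ins, not the actual operators of [21].

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # one seed controls all randomness
POP_SIZE, N_ELITES = 160, 10          # sizes used in Section III

def crossover(x, y):
    """Stand-in recombination: pixel-wise mix of two parents."""
    mask = rng.random(x.shape) < 0.5
    return np.where(mask, x, y)

def mutate(x, scale=3):
    """Perturb a random subset of pixels by at most +/-3 (Sec. III-B)."""
    noise = rng.integers(-scale, scale + 1, size=x.shape)
    pick = rng.random(x.shape) < 0.05
    return np.clip(x.astype(int) + pick * noise, 0, 255).astype(np.uint8)

def one_generation_adapted(population, fitness):
    """One 'adapted_EA' generation: pass the 10 elites unchanged and
    rebuild the remaining 150 individuals by recombination with a
    random elite followed by mutation."""
    order = np.argsort(fitness)[::-1]             # best individuals first
    pop = population[order]
    elites = pop[:N_ELITES]
    offspring = [mutate(crossover(ind, elites[rng.integers(N_ELITES)]))
                 for ind in pop[N_ELITES:]]
    return np.concatenate([elites, np.stack(offspring)])
```

By contrast, a classic_EA generation would leave 80-90% of the population untouched and regenerate only the worst 10-20% from the elites.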

B. THE EA PARAMETERS
The task on which we evaluate the performance of both approaches is the construction of adversarial images for CNNs. Although our algorithm EA^{target,C}_d is efficient for a series of CNNs, here we make our point for the instantiation EA^{target,VGG-16}_{L2} of this algorithm (Algorithm 2) and for its classical EA version (Algorithm 1), with C = VGG-16 trained on CIFAR-10 and the metric d = L_2. Starting from a common ancestor image A of size 32 × 32 × 3 labeled by VGG-16 as belonging to c_a, and from a target category c_t ≠ c_a, the specific parameters and choices of the algorithms are as follows.
Population initialization. Both algorithms start the search with the same initial population, made of 160 identical replicas of the ancestor image A.
Evaluation - Fitness Function. This operation is performed on each individual image ind of a given generation g_p via the fitness function fit_{L2}(ind, g_p), which assesses a dual goal: the evolution of ind towards the target category c_t, and its proximity to the ancestor A, measured using the L_2-norm:

fit_{L2}(ind, g_p) = A(g_p, ind) · log_10(o_ind[c_t]) − B(g_p, ind) · L_2(ind, A),

where the quantities A(g_p, ind), B(g_p, ind) ≥ 0 weight and balance the dual goal. The L_2-norm is used to calculate the difference between the pixel values of the ancestor and the considered image ind:

L_2(ind, A) = sqrt( Σ_j (ind[p_j] − A[p_j])² ),

where p_j is the pixel in the j-th position, and 0 ≤ ind[p_j], A[p_j] ≤ 255 are the corresponding pixel values of the images ind and A. Concretely, for any generation g_p, one sets B(g_p, ind) = 10^{-5}. The value of A(g_p, ind) depends on o_ind[c_t] (note that log_10 o_ind[c_t] ≤ 0).
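A minimal sketch of this evaluation, assuming the weighted two-term form suggested by the description above (a log_10 o_ind[c_t] reward weighted by A(g_p, ind), and the L_2 distance weighted by B(g_p, ind) = 10^{-5}); the exact schedule of A(g_p, ind) is not reproduced here, so a constant `A_weight` stands in for it:

```python
import numpy as np

B = 1e-5  # B(g_p, ind), constant across generations

def l2_distance(ind, ancestor):
    """Pixel-wise L2 distance between an individual and the ancestor."""
    diff = ind.astype(float) - ancestor.astype(float)
    return float(np.sqrt(np.sum(diff ** 2)))

def fitness_l2(ind, ancestor, o_ind, t, A_weight=1.0):
    """Dual-goal fitness: the first term (<= 0) drives o_ind[c_t]
    towards 1, the second penalizes the distance to the ancestor."""
    eps = 1e-30                      # guard against log10(0)
    return A_weight * np.log10(o_ind[t] + eps) - B * l2_distance(ind, ancestor)
```

An individual identical to the ancestor and already classified in c_t with label value 1 would reach the maximal fitness 0.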
Selection, Recombination, Mutation. The fitness function of each individual in the population is computed (starting with the first generation made of the initial population). Mutations and cross-overs are those described in [21] (they remain similar to some extent to those of [31]).
Each run of the EA has a specific seed that controls the randomness of the mutation and crossover operations. The same seed values were applied to adapted_EA and classic_EA to ensure a fair comparison. For the mutation, the seed determines the location of the pixels and the magnitude (within a range defined below) of the modifications they undergo. For the crossover, the seed determines which individuals form pairs, as well as the location and size of the interchanged regions.
Pixel values were modified in the range of ±3 in both EA versions used here. The algorithms used the same parameters and techniques for the mutation and crossover operations. These operations lead to 160 descendant images composing the individuals of the new generation subject to the next round of evaluation.
Termination condition. For each version of the EA, the loop is repeated until a descendant image is created, in fewer than 7000 generations (this maximum number of iterations is a reasonable trade-off, based on our experiments), that is classified in the target category c_t with a probability ≥ 0.95, while remaining so close to A that a human would not notice any difference between it and A (and a fortiori would still classify this descendant image as belonging to the original category c_a). This defines a successful termination, in which case one writes D_{a,t}(A) for the adversarial image resulting from EA^{target,VGG-16}_{L2} (Algorithm 2) run on A, and D^{classic}_{a,t}(A) for the result of the classic version (Algorithm 1) of the EA, also run on A. Otherwise, the algorithm terminates without success.
Therefore, the algorithms terminate after at most 7000 generations, regardless of whether they have succeeded in creating such an adversarial image.
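The termination condition can be sketched as the following top-level loop (a minimal sketch; `next_generation` and `label_value` are hypothetical stand-ins for the EA update step and the black-box CNN query, respectively):

```python
MAX_GENERATIONS = 7000   # cap on the number of generations
TAU = 0.95               # target-label threshold for success

def run_ea(ancestor, t, next_generation, label_value):
    """Evolve copies of `ancestor` until some descendant reaches a
    c_t-label value >= TAU, or give up after MAX_GENERATIONS."""
    population = [ancestor.copy() for _ in range(160)]
    for generation in range(1, MAX_GENERATIONS + 1):
        population = next_generation(population)
        for ind in population:
            if label_value(ind, t) >= TAU:
                return ind, generation   # successful termination
    return None, MAX_GENERATIONS         # termination without success
```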

C. EXPERIMENTAL COMPARISON OF "ADAPTED_EA" WITH "CLASSIC_EA"
We experimentally compared the efficiency of both versions of the EA for four ancestor/target pairs of categories: Animal/Animal, Object/Object, Animal/Object, and Object/Animal. Concretely, the Animal ancestor categories are bird and dog, with image A_3 as ancestor for the bird category c_3, and A_6 as ancestor for the dog category c_6, both taken from Table 1. Similarly, the Object ancestor categories are plane and ship, with image A_1 as ancestor for the plane category c_1, and image A_9 as ancestor for the ship category c_9.
With these ancestors, we performed 10 independent runs of the algorithms for each of the following combinations: the bird/cat pair (Animal/Animal), plane/truck pair (Object/Object), dog/car pair (Animal/Object), and ship/horse pair (Object/Animal).
Performance comparison. In all cases, the 10 independent runs of each algorithm succeeded in (far) fewer than 7000 generations. Table 2 lists the minimum (min_gen), maximum (max_gen), and mean (mean_gen) numbers of generations obtained over the 10 independent runs of each algorithm. The convergence graphs, plotted in Figure 2, show the convergence speed of both algorithms for all cases. The horizontal axis of these graphs is the number of generations, and the vertical axis is the average log probability of the target category over these 10 independent runs. Results and Discussion. As can be seen in Table 2, "adapted_EA" outperforms "classic_EA" in all cases: the former requires fewer generations than the latter to obtain adversarial images with a confidence ≥ 0.95. Figure 2 confirms that "adapted_EA" converges faster than "classic_EA". The graphs indicate that both algorithms apparently spend most of their generations finding the correct regions and/or pixels to modify; once done, their learning curves accelerate drastically, still with "adapted_EA" leading the race against "classic_EA".
Although both algorithms start the search with the same 160 identical images, their respective performances differ substantially, as a consequence of their distinct population-updating processes. Indeed, "adapted_EA" starts these updates for the whole population, except for the elite individuals passed unchanged to the next generation, and does so right after the first generation, whereas "classic_EA" only updates 20% of its population in each generation. Changing only 32 individuals, as opposed to 150, makes the classic version much slower than its adapted competitor. These results not only legitimize the choices made in our earlier work ([1], [21]-[23], [31]), but also provide some evidence that, for similar exploration problems starting from a population made of identical individuals (hence, not only for the construction of images that are adversarial for a CNN), the generic selection and mutation process adopted in "adapted_EA" (Algorithm 2) shortens the learning period of the algorithm and enhances the convergence speed.
We complete this comparative analysis by assessing the potential differences in behavior between the adversarial images created by each version of the EA. To this purpose, we computed the Kullback-Leibler divergence [36] between the probability densities derived from the normalized histograms of the pixel modifications induced by each of them. In all cases, the values (averaged over the ten independent runs) of the Kullback-Leibler divergence were negligible (they vary between 2.24e-04 and 5.17e-03), indicating that the distributions of the noise created by the two versions of the EA are statistically very close. Hence, while the specific pixel modifications introduced by each version of the EA differ, their overall distributions are nearly indistinguishable, and in any case these modifications are not perceptible by a human.
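The comparison can be reproduced in outline as follows (a sketch assuming the histograms are computed over the raw pixel modifications D − A; the bin choices are illustrative):

```python
import numpy as np

def noise_histogram(adversarial, ancestor, bins=21):
    """Normalized histogram of the pixel modifications D - A."""
    noise = adversarial.astype(int) - ancestor.astype(int)
    hist, _ = np.histogram(noise, bins=bins, range=(-10.5, 10.5))
    return hist / hist.sum()

def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence KL(p || q), with smoothing to avoid
    division by zero on empty bins."""
    p = p + eps
    q = q + eps
    return float(np.sum(p * np.log(p / q)))
```

Feeding the two versions' noise histograms to `kl_divergence` yields a value near 0 when the noise distributions are close, and a strictly positive value otherwise.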

IV. THE ADVERSARIAL IMAGES OBTAINED BY EA^{target,VGG-16}_{L2}
As a result of Section III, from now on we only consider the "adapted_EA" version of our evolutionary algorithm, namely EA^{target,VGG-16}_{L2}. For an ancestor A_a in a category c_a, and the target scenario for category c_t, one defines D_{a,t}(A_a) = EA^{target,VGG-16}_{L2}(A_a, c_t), provided that the algorithm terminates successfully. One writes more simply D_{a,t}, or even D_t, if there is no ambiguity about the choice of the ancestor A_a in category c_a (mutatis mutandis in Sections V and VI).

A. WITH ONE ANCESTOR PER CATEGORY
From Table 1, we pick the ancestor image A_a in category c_a and perform 10 independent runs (with random seed values) of EA^{target,VGG-16}_{L2} for each of the nine possible target categories c_t ≠ c_a.
An example of the quality of the obtained adversarial images is given by the comparison between the dog ancestor A_6 of Table 1 and its corresponding 9 evolved adversarial images D_t, with t ≠ 6 (obtained after the first of the 10 independent runs of the EA), as shown in Figure 3. More generally, Figure 11 (Appendix A) contains the adversarial images obtained by the first successful run out of the ten independent runs of EA^{target,VGG-16}_{L2} for each of the ancestor images in Table 1, and Table 5 (Appendix A) gives their respective label values. This example illustrates that, by slightly changing many pixels instead of heavily changing a few pixels, our approach enhances the indistinguishability between the adversarial image and the ancestor image. In particular, our method differs substantially from [24], [25], [27], where a small fraction of pixels is changed, but at the cost of being noticeable for a human without difficulty (see Figure 1, Section I).
For the ancestor image A_a (from Table 1) in category c_a specified in its a-th row, the t-th column of Figure 4 gives the average number of generations required by EA^{target,VGG-16}_{L2} to terminate, computed over 10 independent runs. In four ancestor/target combinations, this number is followed by a symbol (†x) or (†x, ‡y). These symbols indicate that the algorithm did not achieve the τ = 0.95 threshold value within 7000 generations for x of the 10 runs, and therefore terminated without success for the corresponding seed values. The c_t-label values of the corresponding best descendant images remained stuck at a local optimum < 0.95, whose quality is also indicated by the symbol. In the case of the symbol (†x), this local optimum was quite close to 0.95 (not less than 0.9370, actually); we call quasi-adversarial the corresponding images produced by EA^{target,VGG-16}_{L2}. In the case of the symbol (†x, ‡y), the complementary number y specifies the number of runs, among the x unsuccessful runs, for which the local optimum remained very low (between circa 10^{-4} and 10^{-5}).
For each 1 ≤ a ≤ 10, the "Row Average" value, displayed in the rightmost column of the a-th row, indicates the average number of generations required to perform our attack on ancestor A_a in category c_a over all c_t ≠ c_a (mutatis mutandis for the "Column Average" value displayed in the bottom row of the t-th column). Our EA showed a success rate of 100%, since all possible target categories were achieved with at least one of the ten runs for the considered ancestors. Still, some attacks are easier than others. The ancestor image for which EA^{target,VGG-16}_{L2} requires the least effort in general is the horse ancestor image A_8, and bird (c_3) is the easiest target category regardless of the ancestor category (with the considered ancestor images at least). At the other end of the scale are the deer ancestor image A_5 and the bird ancestor image A_3, for which EA^{target,VGG-16}_{L2} requires the largest amount of effort in general, while dog (c_6), truck (c_10) and ship (c_9) are the hardest target categories. These correspond precisely to the categories (and the ancestors) for which some runs of EA^{target,VGG-16}_{L2} terminated without having created an appropriate adversarial image within 7000 generations. Indeed, out of the altogether 900 attacks (10 runs for each of the 90 ancestor/target combinations) performed by EA^{target,VGG-16}_{L2}, Figure 4 shows that only 31 did not succeed. It is worth noting the homogeneity of the quality of the rare unsuccessful cases. For such an unsuccessful (c_a, c_t) combination, either the local optimum is close to the τ = 0.95 value for all failed runs (this occurs for the nine unsuccessful runs of the (bird (A_3), dog) combination), or it is very far from this threshold value for all failed runs (this occurs for the 22 unsuccessful runs with the deer (A_5) ancestor for the car, ship, and truck targets).
Therefore, as a consequence of this study with one ancestor A_a per category c_a, our experiments show that the probability that EA^{target,VGG-16}_{L2} terminates successfully for a given run is 96.56%, and that its termination requires between 290 and 2793 generations on average.

B. WITH 50 DISTINCT ANCESTORS PER CATEGORY
To further evaluate the efficiency of our attack beyond the case of a single ancestor A_a per category c_a, as described in Subsection IV-A, and to assess the importance of the specific ancestor chosen in a given category, we considered 50 distinct images taken randomly (from the CIFAR-10 test set) in each of the 10 categories c_a. Unlike the 10 independent runs per ancestor of Subsection IV-A, we considered that running EA^{target,VGG-16}_{L2} with one single run per ancestor was enough to make our point. So, in total, we performed 50 × 10 × 9 = 4500 attacks with EA^{target,VGG-16}_{L2}. Figure 5, which summarizes the outcome of this experiment, is to be interpreted in a similar way as Figure 4, with the difference that the averages are computed over the 50 ancestors per category c_a. Note also that the (†x) and (†x, ‡y) symbols added to some cell values for a given (c_a, c_t) scenario have a different interpretation in Figure 5 compared to Figure 4, since they apply globally to different ancestors here, as opposed to applying to different runs performed on the same ancestor in Figure 4.
Performance differs again from one category to another. The ancestor categories for which EA^{target,VGG-16}_{L2} requires the least effort in general are the frog, cat, and dog categories. In addition, EA^{target,VGG-16}_{L2} reaches the target categories bird and deer fairly fast, regardless of the ancestor category. Conversely, the ancestor categories car and horse are those for which EA^{target,VGG-16}_{L2} requires the largest amount of effort in general, while car and truck are the hardest target categories.
In this context, the comparison of these results with those of Figure 4 shows how relevant the specific ancestor image chosen in a given category c_a is to the performance of EA^{target,VGG-16}_{L2}. Indeed, while, for instance, the specific ancestor A_8 in the horse category was optimal in a sense (reaching all possible target categories in 290 generations on average), this property did not extend to the horse category as a whole, as just seen. A contrario, for instance, while the combination (deer, truck) with the ancestor A_5 in the deer category was (at 6939 generations on average) the toughest to achieve among all trials of Subsection IV-A, it is reasonably easy to achieve in general (1139 generations on average) with the 50 ancestors chosen for our experiment.
Finally, out of the 4500 trials performed by EA^{target,VGG-16}_{L2}, only 87 did not terminate successfully. Therefore, this experiment provides heuristic evidence that one run of EA^{target,VGG-16}_{L2} has a probability of 98.06% of terminating successfully. To better assess the strength of the failed cases, we re-ran each of the 87 unsuccessful cases 10 times with different seed values: 28 of them succeeded in fewer than 10 runs, while 59 did not. This result, together with the fact that our algorithm required between 461 and 1717 generations on average in this case, and compared with the outcome of the similar experiments performed in Subsection IV-A with other ancestors, further supports the impact of the specific ancestor A_a taken in a given category c_a, and of the seed value used to run the EA. It also shows that the success rate of our attack, namely the capacity of EA^{target,VGG-16}_{L2} to terminate successfully in at least one of ten runs, is ≥ 98.68%.

V. ROBUSTNESS OF THE ADVERSARIAL IMAGES AGAINST FILTERS
For the reasons given in the introduction to this paper (Section I), the study undertaken in this section essentially amounts to checking whether adding filters may prevent VGG-16 from misclassifying, or may reduce this risk to some extent, when facing an adversarial image created by EA^{target,VGG-16}_{L2}.

A. SELECTION OF FILTERS
Although many filters exist, we focus on the following four, which have a significant impact on images [37, Chapters 7 and 8].
The inverse filter F_1 replaces all colours by their complementary colours. This operation is performed pixel by pixel, by subtracting the RGB value of the pixel from the RGB value (255, 255, 255) of white.
The Gaussian blur filter F_2 uses a Gaussian distribution to calculate the kernel, G(x, y) = (1 / (2πσ²)) e^{−(x² + y²) / (2σ²)}, where x is the distance from the origin on the x-axis, y is the distance from the origin on the y-axis, and σ is the standard deviation of the Gaussian distribution. By design, the process gives the highest weight to the pixel at the centre, and blurs the surrounding pixels with an impact that decreases as one moves away from the centre.
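The kernel of F_2 can be computed directly from this formula. The sketch below samples G(x, y) on a small grid (the 3 × 3 size and σ = 1 are illustrative assumptions) and confirms that the centre pixel receives the largest weight:

```python
import math

def gaussian_kernel(size=3, sigma=1.0):
    """Sample G(x, y) = exp(-(x^2 + y^2) / (2 sigma^2)) / (2 pi sigma^2) on a
    size x size grid centred at the origin, then renormalise so the weights sum
    to 1 (so that the blur preserves overall brightness)."""
    half = size // 2
    k = [[math.exp(-(x * x + y * y) / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)
          for x in range(-half, half + 1)] for y in range(-half, half + 1)]
    s = sum(map(sum, k))
    return [[v / s for v in row] for row in k]

k = gaussian_kernel(3, 1.0)
# The centre pixel receives the largest weight, as stated above.
assert k[1][1] == max(max(row) for row in k)
```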
The median filter F_3 is used to reduce noise and artifacts in a picture. Although under some conditions it can reduce noise while preserving edges, this does not really occur for small images such as those considered here. In general, one selects a pixel and replaces its value by the median of all the surrounding pixels.
The unsharp mask filter F_4 enhances the sharpness and contrast of images. The unsharp-masked image is obtained by blurring a copy of the image with a Gaussian blur; the blurred copy is then weighted and subtracted from the original image.
Any filter F, or any combination of filters F_{i_1}, F_{i_2}, …, F_{i_k} operating successively (in that order) on an image I, creates a filtered image F(I) or F_{i_k} ∘ ⋯ ∘ F_{i_2} ∘ F_{i_1}(I).
We make use of these four filters F_1, F_2, F_3, and F_4, either individually or as the combination F_5 = F_3 ∘ F_4. The reason for the choice of the latter is that F_4 amplifies and highlights detail, while F_3 removes noise from an image without removing detail. Therefore, a combination of these filters can remove the noise created by the EA while maintaining a high level of detail. Moreover, because the computations are performed on images of size 32 × 32, we take a filter size f = 1 for F_1 and f = 3 for the others.
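The five filters can be sketched as follows. These are minimal pure-Python, per-channel versions operating on a 2-D list of 0-255 values (production code would use an image library); the σ and sharpening amount are illustrative assumptions, while the f = 3 window matches the filter size stated above:

```python
import math, statistics

def invert(ch):                                   # F1: complement each value
    return [[255 - v for v in row] for row in ch]

def _window(ch, i, j, f=3):                       # f x f neighbourhood, edge-clamped
    h, w = len(ch), len(ch[0])
    return [ch[min(max(i + di, 0), h - 1)][min(max(j + dj, 0), w - 1)]
            for di in range(-(f // 2), f // 2 + 1)
            for dj in range(-(f // 2), f // 2 + 1)]

def gaussian_blur(ch, sigma=1.0, f=3):            # F2: Gaussian-weighted average
    half = f // 2
    k = [math.exp(-(dx * dx + dy * dy) / (2 * sigma ** 2))
         for dy in range(-half, half + 1) for dx in range(-half, half + 1)]
    s = sum(k)
    return [[round(sum(wt * v for wt, v in zip(k, _window(ch, i, j, f))) / s)
             for j in range(len(ch[0]))] for i in range(len(ch))]

def median_filter(ch, f=3):                       # F3: median of the neighbourhood
    return [[int(statistics.median(_window(ch, i, j, f)))
             for j in range(len(ch[0]))] for i in range(len(ch))]

def unsharp_mask(ch, amount=1.0):                 # F4: original + amount*(original - blurred)
    blurred = gaussian_blur(ch)
    return [[max(0, min(255, round(v + amount * (v - b))))
             for v, b in zip(row, brow)] for row, brow in zip(ch, blurred)]

def f5(ch):                                       # F5 = F3 o F4: sharpen first, then denoise
    return median_filter(unsharp_mask(ch))

channel = [[(3 * i + 5 * j) % 256 for j in range(32)] for i in range(32)]
assert invert(invert(channel)) == channel          # F1 is an involution
assert len(f5(channel)) == 32 and len(f5(channel)[0]) == 32
```

Note how f5 applies F_4 before F_3, mirroring the composition order F_5 = F_3 ∘ F_4 defined above.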
For each F = F_k, 1 ≤ k ≤ 5, we then challenge VGG-16 with the 100 filtered images F(A_a) and F(D_{a,t}(A_a)).

B. INDICATORS ADDRESSING THE ROBUSTNESS OF FILTERED ADVERSARIALS
Filters differ substantially in their individual capacity to preserve or remove the adversarial component of the filtered F(D_{a,t}(A_a)). Additionally, it may also happen that VGG-16 classifies F(A_a) in a category different from the ancestor category c_a. Since in this section (and the next one) we consider that the classification of an image in a given category c means that the label value given by VGG-16 for c is the largest among all possible categories, we relax the formulation of the target scenario accordingly: in this context, one no longer requires a target label value exceeding the threshold value of 0.95, but only asks that it be the largest one. The formulation of the untargeted scenario in the filtered context, made precise below in this subsection, requires paying attention to the potential difference between the categories c_a and c_{F(A_a)}.
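The relaxed classification rule is just an argmax over the label values; the sketch below illustrates it with made-up label values:

```python
# In Sections V and VI, an image is "classified in category c" when c has the
# largest of the 10 label values returned by VGG-16 (argmax), rather than a
# value exceeding the 0.95 threshold. The values below are made up to illustrate.
label_values = [0.05, 0.62, 0.08, 0.02, 0.03, 0.10, 0.02, 0.04, 0.02, 0.02]
predicted = max(range(len(label_values)), key=label_values.__getitem__)
assert predicted == 1                  # category index 1 wins...
assert label_values[predicted] < 0.95  # ...even though its value is below 0.95
```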
The following indicators quantitatively assess the aforementioned issues for each filter F_k, with the considered ancestors and adversarial images. These indicators take integer values, and we specify their theoretical bounds (which clearly depend on the number, 10, of ancestors, and on the number, 9, of target categories considered in this study).

For each 1 ≤ a ≤ 10, we first define ρ_k(A_a) as the number of target categories c_t such that VGG-16 classifies F_k(D_{a,t}(A_a)) (including potentially D_{a,a}(A_a) = A_a) back to the ancestor category c_a, and its sum Σ_k = Σ_{a=1}^{10} ρ_k(A_a), so that 0 ≤ ρ_k(A_a) ≤ 10 and 0 ≤ Σ_k ≤ 100. Of interest for the target scenario is τ_k(A_a), the number of t ≠ a for which F_k(D_{a,t}(A_a)) is classified as belonging to c_t (namely those that "really succeed"), and its sum T_k = Σ_{a=1}^{10} τ_k(A_a), so that 0 ≤ τ_k(A_a) ≤ 9 and 0 ≤ T_k ≤ 90. Finally, we consider τ′_k(A_a) to assess the untargeted scenario: τ′_k(A_a) counts the number of t ≠ a for which F_k(D_{a,t}(A_a)) is classified as belonging to some c ≠ c_{F_k(A_a)}, and its sum T′_k = Σ_{a=1}^{10} τ′_k(A_a), with the same bounds as for τ_k(A_a) and T_k. Observe en passant that the inequality T_k ≤ T′_k may theoretically not hold (as opposed to what happens in the absence of any filter, where the corresponding inequality necessarily holds). The reason is that one considers c_t ≠ c_a for the left-hand side of the inequality, and c ≠ c_{F_k(A_a)} for the right-hand side. Since the categories c_a and c_{F_k(A_a)} may differ, the set whose cardinality is T_k need not be included in the set whose cardinality is T′_k.
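The per-ancestor indicators can be sketched as follows; `classify` is a hypothetical stand-in for VGG-16's argmax label, `filter_k` for F_k, and `adversarials` maps each target category c_t ≠ c_a to the descendant D_{a,t}(A_a):

```python
def indicators(classify, filter_k, ancestor, c_a, adversarials):
    """Return (rho_k(A_a), tau_k(A_a), tau'_k(A_a)) for one ancestor A_a."""
    c_filtered_ancestor = classify(filter_k(ancestor))   # c_{F_k(A_a)}
    rho = int(c_filtered_ancestor == c_a)                # the filtered ancestor itself
    tau = tau_prime = 0
    for c_t, descendant in adversarials.items():
        c = classify(filter_k(descendant))
        rho += int(c == c_a)           # filtered image back in the ancestor category
        tau += int(c == c_t)           # target scenario survives the filter
        tau_prime += int(c != c_filtered_ancestor)       # untargeted scenario
    return rho, tau, tau_prime

# Toy usage: 3 categories, identity filter, classifier given by a lookup table.
labels = {"A": 0, "D1": 1, "D2": 0}
rho, tau, tau_prime = indicators(labels.get, lambda im: im, "A", 0, {1: "D1", 2: "D2"})
assert (rho, tau, tau_prime) == (2, 1, 1)
```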

C. ROBUSTNESS ANALYSIS OF THE ADVERSARIAL D_{a,t}(A_a) AGAINST FILTERS
Let us now proceed to the analysis of Table 3, which provides these quantities resulting from, and summarizing, Tables 6 to 10 (Appendix A). Looking at Σ_k shows that, although all filters F_1, …, F_5 bring some filtered images back to c_a, the unsharp mask (F_4) and inverse (F_1) filters are less efficient in this regard. In contrast, the three other filters bring a majority of the filtered images back to c_a. The median filter (F_3) and, foremost, the combination (F_5) of the unsharp mask and median filters are highly effective, since more than 80% of all filtered images are classified back to c_a. The three filters F_2, F_3, and F_5 are also those that bring all filtered images back to c_a for 5 (in the case of F_2), 6 (in the case of F_3) and 7 (in the case of F_5) ancestors, including a fortiori the filtered ancestor.
Consistently, the consideration of T_k and of T′_k shows that EA^{target,VGG-16}_{L2} resists highly efficiently against the unsharp mask filter F_4, as 95% (86 out of 90) of the filtered images remain adversarial for the target scenario (with target label values no less than 0.5505; see Table 9), and 95% (86 out of 90) of the filtered images are adversarial for the untargeted scenario. Our EA is also significantly efficient against the inverse filter F_1, as 17% (16/90) of the filtered images remain adversarial for the target scenario (with target label values ≥ 0.4415; see Table 6), and 53% (48/90) are adversarial for the untargeted scenario.
On the other hand, the Gaussian blur (F_2), the median (F_3), and the combined median and unsharp mask (F_5) filters are effective to a far larger extent against EA^{target,VGG-16}_{L2}, with F_3 and F_5 being particularly efficient at removing the adversarial property of the descendant images. Indeed, only three filtered adversarial images (hence 3% of all filtered images) remain adversarial for the target scenario for each of these two filters (with target label values ≥ 0.4978 for F_3 and ≥ 0.8131 for F_5; see Tables 8 and 10). For the untargeted scenario, finally, the proportion of filtered images that remain adversarial drops to 13% (12/90) for F_3, and to 12% (11/90) for F_5.
This study shows that the inverse (F_1) and unsharp mask (F_4) filters are largely inefficient against our EA, but that the Gaussian blur (F_2), and foremost the median (F_3) and the combination F_5 = F_3 ∘ F_4 of the unsharp mask and median filters, render our EA-based attack significantly less effective for both the targeted and untargeted scenarios, at least with the ancestor images considered.

VI. THE FILTER-ENHANCED F -FITNESS FUNCTION
The results of the previous section lead to the conception of a new fitness function that natively forces the EA to create adversarial images that remain adversarial (in the sense of Subsection II-B) once filtered. For a filter F, the filter-enhanced F-fitness function is obtained as a variant of the fitness function defined in Equation (2), updated so that it takes into account the classification by VGG-16 of both a descendant image and its filtered version. The termination and termination-with-success criteria are the same as in Subsection III-B.
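Since Equation (2) is not reproduced here, the sketch below only illustrates the idea: `base_fitness` is a hypothetical stand-in for the original fitness, `target_label` for the c_t-label value returned by VGG-16, and taking the minimum of the two scores is one natural (assumed, not the paper's exact) way of rewarding a descendant only insofar as both the image and its filtered version score well.

```python
def f_fitness(D, F, base_fitness):
    # Joint optimisation of D and F(D): take the worse of the two scores,
    # so that neither the image nor its filtered version can lag behind.
    return min(base_fitness(D), base_fitness(F(D)))

def success(D, F, target_label, tau=0.95):
    # Termination with success: both label values reach the threshold tau.
    return target_label(D) >= tau and target_label(F(D)) >= tau

# Toy usage: a "descendant" is just a number, and the "filter" halves it.
assert success(0.96, lambda d: d, lambda d: d)            # identity filter: both pass
assert not success(0.96, lambda d: 0.5 * d, lambda d: d)  # filtered score 0.48 < 0.95
assert f_fitness(0.8, lambda d: 0.5 * d, lambda d: d) == 0.4
```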
Since F_5 = F_3 ∘ F_4 is not only highly efficient against EA^{target,VGG-16}_{L2}, but is also the filter that reverts the largest proportion (89%) of images D_{a,t}(A_a) back to c_a, we limit this study to this case.

A. WITH ONE ANCESTOR PER CATEGORY
For 1 ≤ a ≤ 10, one performs 10 independent runs of EA^{target,VGG-16}_{L2,F_5} on the ancestor A_a in category c_a given in Table 1, each run being limited to 7000 generations. By construction, a successfully created image and its F_5-filtered version are both classified by VGG-16 as belonging to the target category c_t with a label value ≥ 0.95, while remaining so close to A_a to a human eye that no one would notice any difference. Figure 7 pictures the adversarial images D^{F_5}_{6,t}(A_6) obtained in this way for the dog ancestor A_6 (all first runs succeeded for the dog ancestor).
For the ancestor image A_a (taken from Table 1) in the category c_a specified in its a-th row, the cell in the t-th column of the a-th row of Figure 8 gives the average number of generations required by EA^{target,VGG-16}_{L2,F_5} to terminate, computed over 10 independent runs. With a terminology adapted from the one used in Figure 4, this number is followed by a symbol (x, ‡y, †z) in 5 of the 90 cells. The occurrence of this symbol means that the algorithm did not terminate successfully for x out of the 10 runs (i.e., the average value = 7000 if x = 10). Not succeeding means that the c_t-label value of the best-performing descendant image D, or of the filtered image F_5(D), is stuck at a local optimum < 0.95. The symbols ‡y and †z measure the quality of these local optima: ‡y (respectively, †z) counts the number of runs among the x unsuccessful ones for which the local optimum for the descendant D (respectively, F_5(D)) remained very low (between 10^{-3} and 10^{-6}).
Of the 900 performed runs, 38 did not terminate successfully, and 3 of the 90 possible ancestor/target scenarios were not achieved, namely the pairs (plane (A_1), deer), (bird (A_3), car), and (horse (A_8), ship). Therefore, the experiments show a success rate of EA^{target,VGG-16}_{L2,F_5} of 96.66%, and a probability that the algorithm terminates successfully for a given run of 95.77%.
Comparing Figure 8 to Figure 4, when all 10 runs terminate successfully for both EA^{target,VGG-16}_{L2} and EA^{target,VGG-16}_{L2,F_5} for an (ancestor (A_a), target) pair (83 cases altogether), the latter algorithm usually requires more generations than the former on average (with three notable exceptions, namely the (ship (A_9), deer), (ship (A_9), dog) and (truck (A_{10}), cat) pairs, for which it needs 10%, 18% and 13% fewer generations, respectively). The fact that, for the 80 remaining pairs, EA^{target,VGG-16}_{L2,F_5} requires between 1.12 and 3.87 times more generations (depending on the pair considered) than EA^{target,VGG-16}_{L2} to terminate successfully is not surprising, since there are now 3 rather than 2 criteria to fulfill.

B. WITH 50 DISTINCT ANCESTORS PER CATEGORY
For the sake of completeness, we performed the same experiments as in Subsection IV-B with the same 500 ancestor images (50 ancestor images per ancestor category), but with EA^{target,VGG-16}_{L2,F_5}. Figure 9 shows the outcome. Of the 4500 attacks, 543 were unsuccessful; hence, EA^{target,VGG-16}_{L2,F_5} terminated successfully in 3957 of these runs (87.93%). Comparing Figure 4 with Figure 8 and Figure 5 with Figure 9 shows that EA^{target,VGG-16}_{L2,F_5} generally requires more generations than EA^{target,VGG-16}_{L2} to construct adversarial images, which is to be expected since EA^{target,VGG-16}_{L2,F_5} must satisfy not two, but three conditions.

C. ROBUSTNESS OF D^{F_5}_{a,t}(A_a) AGAINST VGG-16 ∘ F_k FOR ALL FILTERS
Using again the images of Figure 12 (Appendix B), obtained as described in Subsection VI-A, the ancestors A_a and the corresponding adversarial images D^{F_5}_{a,t}(A_a) were then tested against all five filters of Subsection V-A. Figure 10 shows the outcome of this process for the dog ancestor A_6 and the adversarial images D^{F_5}_{6,t}(A_6).
These filtered images are given to VGG-16 for classification (see Appendix B, Table 11 for F_5, and Table 12 for F_1, F_2, F_3, and F_4, with D^{F_5}_{a,a}(A_a) = A_a to ease the notations). Table 4 is obtained in a similar way as Table 3. Note that the upper bounds of the indicators are impacted by the fact that three combinations (c_a(A_a), c_t) were not achieved. Indeed, one has 0 ≤ ρ^{F_5}_k(A_a) ≤ 9 for a = 1, 3, 8, and 0 ≤ ρ^{F_5}_k(A_a) ≤ 10 otherwise. One writes δ^{F_5}_k(A_a) = 1 if the filtered ancestor and all filtered adversarial images are classified back to the ancestor category whenever possible. Consistently, one has 0 ≤ Σ^{F_5}_k = Σ_{a=1}^{10} ρ^{F_5}_k(A_a) ≤ 97.

Mutatis mutandis for the indicators τ^{F_5}_k, T^{F_5}_k, τ′^{F_5}_k and T′^{F_5}_k, whose upper bounds are lowered accordingly.
Table 4 clearly shows that the produced images are not only adversarial for F_5, but also, to a large extent, for F_3 and F_4, both for the target scenario (88% and 84%, respectively) and for the untargeted scenario (89% and 88%, respectively). Additionally, 56% of these images were efficient against F_1 for the untargeted scenario, while this percentage dropped to 23% with F_2.
This study shows that the EA^{target,VGG-16}_{L2,F_5} attack, designed to be robust against F_5, is also robust to a significant extent against all the individual filters considered for the untargeted scenario, and that the Gaussian blur filter (F_2) is the most efficient at removing the adversarial character of the constructed images.
TABLE 4: For each filter F_k and each ancestor A_a, each cell displays a pair of indicator values in the 1st row, and (τ^{F_5}_k(A_a), τ′^{F_5}_k(A_a)) in the 2nd row. The last two rows give the sums over all possible a of these quantities.

VII. CONCLUSION
This study, which substantially complements our previous works [1], [21]-[23], [31], successfully addresses the four issues raised in the introduction. First, we showed that the conceptual originality of our generic evolutionary algorithm leads to a competitive advantage in terms of performance over the classical EA approach. Then, an extensive experimental study established the intrinsic efficiency of our algorithm EA^{target,VGG-16}_{L2} at constructing adversarial images for the target scenario against VGG-16 with images from CIFAR-10. We then challenged the adversarial images obtained against a series of filters, and finally designed a variant EA^{target,VGG-16}_{L2,F} of the EA, specifically conceived to fool both VGG-16 and VGG-16 composed with a filter F, and demonstrated the efficiency of the produced adversarial images not only against the specific chosen filter, but also against other filters.
The results of this paper lead to a series of additional studies. First, from a pure EA point of view, we intend to look for methods to accelerate our algorithm, including early warnings that indicate a high probability of failure for a given run. In this line of thought, and more specifically for the construction of adversarial images, other efficiency improvements will be studied, such as restricting the zones on which the EA should focus its noise creation, or searching for optimized paths between c_a and c_t for a given ancestor via auxiliary categories. Second, since the small 32 × 32 images in this study are naturally grainy, we intend to apply our attack to larger images, not only those of ImageNet [3], but foremost high-resolution images arising from different horizons (e.g., satellite, medical, or artistic images), which may lead to combinations C ∘ F for functions F that are no longer filters. Finally, our EA-based attack can potentially be extended to domains beyond the computer vision applications mentioned in the first paragraph of the Introduction (natural language processing, speech recognition, etc.).

APPENDIX A
FIGURE 11: For 1 ≤ a ≤ 10, the image on the diagonal of the a-th row is the ancestor A_a (recovered from Table 1), classified by VGG-16 as belonging to category c_a, and the picture in the t-th column, with t ≠ a, is the adversarial picture D_{a,t}(A_a) = EA^{target,VGG-16}_{L2}(A_a, c_t), classified by VGG-16 as belonging to c_t, obtained after the first of the 10 independent runs.