Adversarial Attacks and Batch Normalization: A Batch Statistics Perspective

Batch Normalization (BatchNorm) is an effective architectural component in deep learning models that helps to improve model performance and speed up training. However, it has also been found to increase the vulnerability of models to adversarial attacks. In this study, we investigate the mechanism behind this vulnerability and take first steps towards a solution called RobustNorm. We observe that adversarial inputs tend to shift the distributions of the output of the BatchNorm layer, which makes the train-time statistics inaccurate and increases vulnerability. Through a series of experiments on various architectures and datasets, we confirm our hypothesis. We also demonstrate the effectiveness of RobustNorm in improving the robustness of models under adversarial perturbation while maintaining the benefits of BatchNorm.

The associate editor coordinating the review of this manuscript and approving it for publication was Prakasam Periasamy.
A plethora of research has been dedicated to understanding the reasons for this intriguing behavior of neural networks [15], [25], [28], [34], [37], [64], [66]. These explanations include inherent linearity of non-linear deep models [28], excessive invariance [37], sensitivity to input distribution [15], non-robust yet highly predictive features in the input space [34], etc. However, Galloway et al. [23] studied this problem from an architectural angle. They empirically showed that BatchNorm also contributes to this vulnerability. In this study, we investigate the mechanism behind this vulnerability and take first steps towards a solution called RobustNorm.
BatchNorm [36] was designed to reduce the internal covariate shift by normalizing the input of each layer with batch statistics (mean and variance). However, batch statistics are only available during training since their calculation needs batched inputs. To circumvent this, BatchNorm keeps an estimate of population statistics during training and utilizes these estimated statistics during inference. In standard training, population statistics are estimated from clean inputs. We hypothesized that the inherent distribution shift present in adversarial inputs makes these estimated statistics inaccurate as they are estimated for clean images.
To validate our hypothesis, we conducted a series of experiments using various architectures, diverse attack methods, and various levels of perturbation. Specifically, we replaced the train-time estimated statistics with validation-set statistics calculated using adversarial inputs. Our experiments on a range of datasets and architectures demonstrated a significant improvement in robustness. These results support our observation that the use of incorrect batch statistics can affect the vulnerability of models to adversarial attacks. We further explored the generalizability of this trend under various conditions, including different architectures, perturbation levels, and adversarial training, to confirm the validity of our hypothesis.
Figure caption: The input distribution is a good approximation of the ideal distribution for clean images, but the distribution gets shifted when adversarial noise is added to the input image. This shift invalidates the implicit assumption of BatchNorm that the train and validation data will be from the same distribution. This makes population statistics estimated during training (with clean images) inaccurate and contributes to the adversarial vulnerability of neural networks.
Based on our observations, we propose that a normalization approach that does not rely on the estimation of population statistics may improve the robustness of models. To address this, we propose an improved variant of BatchNorm called RobustNorm. Our experiments demonstrate that RobustNorm performs well in terms of robustness while retaining the other benefits of BatchNorm.
We also extended our hypothesis for transfer learning. In transfer learning, models trained on the source dataset (pre-trained models) are often used as feature extractors. A classifier is then trained on these features extracted for the target dataset. The feature extractor uses batch statistics that are estimated from the source dataset. We show that by updating batch statistics on the target dataset, we can achieve significant accuracy gains. We also observed that the accuracy increase depends on the similarity between the source and target dataset. Concisely, our contributions are as follows.
• We investigated how BatchNorm causes adversarial vulnerability in deep models by formulating the shift hypothesis and conducting a diverse set of experiments to validate this effect.
• Based on our observations, we took first steps towards a more robust normalization approach.
• We showed that our BatchNorm explanation can be extended beyond adversarial attacks by illustrating results on transfer learning.

II. RELATED WORK
Since the inception of adversarial attacks for neural networks [66], many explanations have been proposed to understand the adversarial vulnerability of neural networks. Szegedy et al. [66] linked adversarial vulnerability to blind spots in the discontinuous classification boundary of the neural network, whereas Goodfellow et al. [28] attributed it to the local linearity of neural networks. Recent works have connected adversarial vulnerability with many factors such as random noise [19], [22], spurious correlations learned by neural networks [34], insufficient data [61], high dimensions of input data [20], [26], and distributional shift [15], [37]. Our work is different in the sense that we do not try to understand the broader phenomenon but rather the contribution of BatchNorm. Our work is related to Galloway et al. [23], who empirically showed that BatchNorm is one cause of the adversarial vulnerability of neural networks. However, our focus is on understanding how BatchNorm causes this.
Recent works have explored the effect of BatchNorm and estimated statistics on the adversarial vulnerability of deep models. To start with, Benz et al. [5] investigated the contribution of Non-Robust Features (NRFs) in increasing the performance of models. They show that BatchNorm's use of NRFs is the predominant reason for the improvement in the performance of deep models. A second line of work has utilized a hypothesis similar to ours for different purposes. Xie and Yuille [72] proposed a two-domain hypothesis for clean and adversarial images and showed that it could be leveraged to improve adversarial training. Xie et al. [75] showed that by using different batch statistics for clean and adversarial images during training, it is possible to achieve state-of-the-art ImageNet results without any extra data. Schneider et al. [62] illustrated that a model can achieve better robustness against many common corruptions by replacing statistics estimated from clean images with the statistics estimated from corrupted images.
BatchNorm [36] was introduced to reduce the internal covariate shift of deep models. It improved the stability and optimization of neural networks. Since then, many different variants of BatchNorm have also been proposed. Each BatchNorm variant intends to solve a particular problem of the original formulation. LayerNorm [4] solves the problem of fixed batch size training, making it useful in sequence models; BatchReNorm [35] and GroupNorm [70] ease the problem of small-batch training, making them suitable for tasks like detection or segmentation; and InstanceNorm [67] reduces intra-batch dependency, making it applicable in style transfer.

III. PRELIMINARIES
We consider a supervised classification task for data $\{x_i, y_i\}_{i=1}^{n}$. Our goal is to learn a feature extractor $f = F(x)$ and a classifier $C$ such that $y = C(f, w)$. In adversarial settings, the objective of the adversary is to add a small additive perturbation $\delta \in \mathbb{R}^n$ to a clean image $x$, giving $x_{adv} = x + \delta$, while satisfying the following constraints. First, the adversarial image should respect a perturbation budget $\epsilon$: $\|x_{adv} - x\|_p \le \epsilon$. Second, the adversarial image $x_{adv}$ should look visually similar to the original image $x$. Third, the trained model should label it incorrectly, i.e., $C(F(x_{adv})) \neq y$.

A. ROBUSTNESS EVALUATION
Adversarial accuracy (or adversarial robustness) is the accuracy of a model on an adversarially perturbed test set. Generally, we evaluate the robustness of a model with the PGD-20 attack with α = 2/255 and the reported ϵ value, following standard practice [51]. However, adversarial examples can be generated in many different ways. To make our results more rigorous, we also used a diverse set of adversarial attack methods. We used different variants of gradient-based attacks: Fast Gradient Sign Method (FGSM) [28], Basic Iterative Method (BIM) [44], Projected Gradient Descent (PGD) [51], Momentum Iterative fast gradient sign Method (MIM) [17], and the Carlini-Wagner attack (CW) [8]. For the adversarial attack budget (ϵ), we use a scale of 0-255 for color images (CIFAR and ImageNet) and 0-1 for black and white images (MNIST). The scales differ because the datasets use different value ranges.
For the PGD attack, we report results for two versions: one with 20 iterations and one with 100 iterations. We also use a parameter-free and reliable attack called Auto-PGD [13] or APGD-CE. To show that our results are not affected by gradient masking, we also used a query-based blackbox attack called Square attack [1]. The Square attack does not use model information (e.g., gradients) and is therefore immune to problems like gradient masking. We set the number of queries to 5000 for all of our evaluations.
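To make the PGD-20 setting concrete, the following is a minimal sketch of an L∞ PGD attack in pure Python. It is not the evaluation code used in this work: the toy logistic-regression model, the names `predict`, `loss_gradient` and `pgd_linf`, the [0, 1] pixel range, and the absence of a random start are all assumptions made for illustration. Only the step size α = 2/255 and the 20 iterations follow the text.

```python
import math


def predict(x, w, b):
    """Probability of class 1 under a toy logistic-regression model (assumed)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def loss_gradient(x, y, w, b):
    """Gradient of the cross-entropy loss with respect to the input x."""
    p = predict(x, w, b)
    return [(p - y) * wi for wi in w]


def sign(v):
    return (v > 0) - (v < 0)


def pgd_linf(x, y, w, b, eps=8 / 255, alpha=2 / 255, steps=20):
    """Iteratively ascend the loss and project back onto the eps-ball around x."""
    x_adv = list(x)
    for _ in range(steps):
        grad = loss_gradient(x_adv, y, w, b)
        for i, g in enumerate(grad):
            stepped = x_adv[i] + alpha * sign(g)
            # Project onto the L-infinity ball of radius eps around the clean input.
            stepped = min(max(stepped, x[i] - eps), x[i] + eps)
            # Keep the result inside the assumed valid pixel range [0, 1].
            x_adv[i] = min(max(stepped, 0.0), 1.0)
    return x_adv


# Hypothetical three-pixel "image" with true label 1.
x_clean = [0.5, 0.2, 0.8]
weights, bias = [1.0, -2.0, 0.5], 0.1
x_attacked = pgd_linf(x_clean, 1, weights, bias)
```

In this sketch the perturbation never leaves the ϵ-ball, and the model's confidence in the true label drops after the attack. For a deep network the analytic gradient would be replaced by backpropagation.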

B. PURPOSE OF THE WORK
Following [9], here we describe the purpose of our work. The intention of this work is neither to propose a new defense mechanism like [51] nor a broader explanation for the adversarial phenomenon like [34]. Instead, our purpose is to understand the contribution of a specific component of CNNs to this vulnerability. For this reason, we have used ϵ values smaller than those commonly used for some of our experiments. However, we also report results on a broad range of ϵ.

C. TRANSFER LEARNING
For transfer learning, we considered that the feature extractor f = F(x) is already trained on the source dataset (ImageNet) and we want to learn a classifier C for the target dataset on top of it.

IV. HOW DOES BatchNorm CAUSE ADVERSARIAL VULNERABILITY
BatchNorm estimates the population statistics during training by using moving averages of batch statistics. These estimated values are used during inference. However, one inherent assumption of this process is that training and inference data come from the same underlying distribution. Adversarial attacks introduce a targeted shift in the distribution of input data. This shift breaks this assumption. In the following sections, we explain how BatchNorm works, and empirically demonstrate our hypothesis through various experiments.

A. HOW BatchNorm WORKS
In this section, we briefly explain the working principle of BatchNorm. BatchNorm uses batch statistics during training and estimated population statistics during inference.

1) BATCH STATISTICS
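Consider a mini-batch $B$ of size $M$, containing samples $x_i$ for $i = 1, 2, \dots, M$. In the training procedure, the normalized feature maps $\hat{x}_i$ are computed as:
$$\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2}},$$
where the batch statistics are the sample mean $\mu_B$ and sample variance $\sigma_B^2$ computed over the batch $B$ as:
$$\mu_B = \frac{1}{M}\sum_{i=1}^{M} x_i, \qquad \sigma_B^2 = \frac{1}{M}\sum_{i=1}^{M} (x_i - \mu_B)^2.$$
Besides, a pair of values $\gamma$, $\beta$ are used to scale and shift the normalized value $\hat{x}_i$ as:
$$z_i = \gamma \hat{x}_i + \beta.$$
For the sake of simplicity, we will omit this in our future discussions.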
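As a minimal sketch of the training-time computation described above, the following pure-Python snippet normalizes a single channel of a mini-batch with its own batch statistics. The function names and the small stability constant `eps` are illustrative assumptions and are not taken from the text.

```python
import math
from statistics import fmean


def batch_statistics(batch):
    """Sample mean and (biased) sample variance of one channel of a mini-batch."""
    mu = fmean(batch)
    var = fmean([(x - mu) ** 2 for x in batch])
    return mu, var


def batch_norm_train(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize with batch statistics, then scale by gamma and shift by beta.

    eps is an assumed small constant that avoids division by zero.
    """
    mu, var = batch_statistics(batch)
    return [gamma * (x - mu) / math.sqrt(var + eps) + beta for x in batch]


# Hypothetical activations of one channel for a batch of four samples.
activations = [1.0, 2.0, 3.0, 4.0]
normalized = batch_norm_train(activations)
```

With the default γ = 1 and β = 0, the normalized values have zero mean and a variance very close to one.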

2) ESTIMATED POPULATION STATISTICS
During inference, it is not possible to calculate batch statistics ($\mu_B$ and $\sigma_B^2$) as only one sample is available. To circumvent this problem, BatchNorm needs an estimate of population statistics. This estimate of population statistics is computed by maintaining the moving averages of the batch statistics during training. Formally, the moving averages (also called tracking) of the mean and variance are computed as follows:
$$\hat{\mu}_P \leftarrow (1 - \tau)\,\hat{\mu}_P + \tau\,\mu_B, \qquad \hat{\sigma}_P^2 \leftarrow (1 - \tau)\,\hat{\sigma}_P^2 + \tau\,\sigma_B^2,$$
where $\hat{\mu}_P$ is the estimated population mean, $\hat{\sigma}_P^2$ is the estimated population variance, and $\tau$ is a hyper-parameter that weighs the previous moving average against the current batch statistics. These population statistics are used during inference as:
$$\hat{x}_i = \frac{x_i - \hat{\mu}_P}{\sqrt{\hat{\sigma}_P^2}}.$$
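The tracking procedure can be sketched as follows for a single channel. This is an illustration under stated assumptions, not the authors' implementation: the class name `TrackedNorm`, the initial values, the constant `eps`, and the convention that τ weights the current batch (with 1 − τ on the previous average) are all choices made here.

```python
import math
from statistics import fmean


class TrackedNorm:
    """Single-channel normalizer that tracks estimated population statistics."""

    def __init__(self, tau=0.1, eps=1e-5):
        self.tau = tau    # weight given to the current batch (assumed convention)
        self.eps = eps    # assumed numerical-stability constant
        self.mean = 0.0   # estimated population mean
        self.var = 1.0    # estimated population variance

    def update(self, batch):
        """Fold the statistics of one training batch into the moving averages."""
        mu = fmean(batch)
        var = fmean([(x - mu) ** 2 for x in batch])
        self.mean = (1 - self.tau) * self.mean + self.tau * mu
        self.var = (1 - self.tau) * self.var + self.tau * var

    def infer(self, x):
        """Normalize a single inference sample with the tracked statistics."""
        return (x - self.mean) / math.sqrt(self.var + self.eps)


norm = TrackedNorm(tau=0.1)
for _ in range(200):                   # repeated "training" batches
    norm.update([1.0, 2.0, 3.0, 4.0])
```

After enough updates on batches from the same distribution, the tracked values converge to that distribution's statistics, which is why a single inference sample can be normalized without a batch.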

B. DEVIL IS IN THE ESTIMATED STATISTICS
During inference, BatchNorm normalizes the input with $\hat{\mu}_P$ and $\hat{\sigma}_P^2$. These statistics are estimated on clean inputs during training. The adversarial attack introduces a shift in the input and makes the estimated $\hat{\mu}_P$ and $\hat{\sigma}_P^2$ inaccurate. We show a hypothetical depiction of this idea in Figure 1.
To show this difference, we forward propagated all the validation set samples of CIFAR10 under a PGD adversarial attack and calculated the batch statistics ($\mu_B$, $\sigma_B^2$) of each channel of a trained ResNet20. Their difference from the estimated population statistics ($\hat{\mu}_P$ and $\hat{\sigma}_P^2$) is shown in Figure 2, where the x-axis represents channels and the y-axis represents the difference for validation batches. The figure shows that the estimated population statistics do not match the batch statistics under the PGD attack.
Recent works [15], [38] have also highlighted the link between the shift in the input data distribution and robustness. The same observation has also been used to augment adversarial examples to get SOTA results for image recognition [73]. Based on these observations, we made the following hypothesis:
Hypothesis: BatchNorm's population statistics ($\hat{\mu}_P$ and $\hat{\sigma}_P^2$) are estimated from $(x, y) \sim P$, and an implicit assumption is that inference images will also come from the same distribution. However, the addition of adversarial noise $\delta$ to clean images shifts this distribution. This shift makes the population statistics inaccurate. The use of these incorrect statistics makes a neural network with BatchNorm more vulnerable to adversarial inputs.
To empirically validate this hypothesis, we conducted multiple experiments in different settings. We describe these experiments in the following sections.

1) EXPERIMENT 1 -REPLACING TRAIN-TIME ESTIMATED STATISTICS WITH VALIDATION-SET BATCH STATISTICS
One way to verify our hypothesis is to use an adversarially perturbed validation set to calculate the batch statistics instead of train-time estimated population statistics. This means replacing $\hat{\mu}_P$, $\hat{\sigma}_P^2$ with the batch statistics $\mu_B$, $\sigma_B^2$ computed over $\{x^v_1, \dots, x^v_M\}$, a mini-batch of the perturbed validation set. Note that this is only possible for a large enough validation set. Our only purpose here is to show the validity of our hypothesis.
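The idea behind this experiment can be illustrated with a toy, single-channel sketch. It does not reproduce the experiment: a constant shift of 0.5 is used here as a crude, hypothetical stand-in for the effect of an adversarial perturbation on activations, and all names are assumptions.

```python
import math
import random
from statistics import fmean


def mean_and_var(values):
    mu = fmean(values)
    return mu, fmean([(v - mu) ** 2 for v in values])


def normalize(values, mu, var, eps=1e-5):
    return [(v - mu) / math.sqrt(var + eps) for v in values]


random.seed(0)
clean = [random.gauss(0.0, 1.0) for _ in range(1000)]
# Hypothetical stand-in for an attack: a constant shift of the activations.
shifted = [v + 0.5 for v in clean]

pop_mu, pop_var = mean_and_var(clean)        # plays the role of train-time statistics
batch_mu, batch_var = mean_and_var(shifted)  # statistics of the shifted batch itself

with_population = normalize(shifted, pop_mu, pop_var)
with_batch = normalize(shifted, batch_mu, batch_var)

residual_population = fmean(with_population)  # stays clearly away from zero
residual_batch = fmean(with_batch)            # approximately zero
```

Normalizing the shifted batch with statistics estimated on clean data leaves a residual offset in the output, whereas normalizing with the batch's own statistics removes it. Real adversarial perturbations are input-dependent rather than constant, so this only conveys the intuition.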
We report results for five different datasets and five different adversarial attacks in Table 1. For MNIST and Fashion-MNIST, we use an 8-layer ResNet. We trained these two models for 50 epochs with a learning rate of 0.1. The learning rate is decreased by 10× at the 30-th epoch. For CIFAR10 and CIFAR100 experiments, we use a ResNet20. We train it for 150 epochs. The default learning rate of 0.1 is decreased at 80-th and 120-th epochs. For ImageNet, we use a ResNet18 and trained it for 100 epochs. The default learning rate of 0.1 is decreased at 30-th, 60-th, and 90-th epochs. We follow Section III for robustness evaluation.
The clean accuracy decreases when we use BatchNorm with batch statistics, so we may expect a similar decrease in adversarial accuracy. However, we observe an increase. For instance, on MNIST, we get 7% BIM adversarial accuracy with population statistics, but replacing them with batch statistics from the validation set increases this to 69%. A similar effect is also visible across all the attacks, datasets, and training modes.

2) EXPERIMENT 2 -EVALUATION ON DIVERSE ATTACKS, ARCHITECTURES AND PERTURBATION LEVELS
The previous experiment uses ResNet architectures. To make this evaluation more rigorous, we repeated the previous experiments (i.e., replacing estimated statistics with batch statistics) on diverse sets of architectures, more challenging adversarial attacks, and larger perturbation budgets.
For this experiment, we use three different types of architecture. First, we use the standard ResNet [31] with three different depths: 20, 38, and 50. Second, we use WideResNet [77] with a depth of 16 and a width of 10. WideResNets are similar to ResNets in general architecture, but they have a larger capacity as the number of channels is increased significantly. Third, we also use VGG [65] with depths of 11 and 16. VGG is significantly different from ResNet as it does not use skip connections.
For robustness evaluation, we use three attacks: PGD-20, PGD-100, and APGD-CE. PGD-20 and PGD-100 use 20 and 100 iterations, respectively, to find adversarial examples. APGD-CE is a more reliable attack and is parameter-free. All of these attacks are generated with four perturbation (ϵ) levels: 1, 2, 4, 8.
The results are shown in Table 2. The P column stands for population statistics, i.e., the standard BatchNorm layer, and the B column stands for using batch statistics instead of population statistics. The robustness of models using batch statistics is higher at all perturbation levels. For instance, the PGD-20 robustness of a ResNet50 with ϵ = 2 is 14.03% when the model uses population statistics, but it increases to 22.93% with batch statistics. The same model has an APGD-CE robustness of 0.77% at the same perturbation level with population statistics. This robustness increases to 6.22% when the same model uses batch statistics. These results show that our hypothesis holds across different architectures, adversarial attacks, and perturbation levels.

3) EXPERIMENT 3 -ROBUSTNESS OF VARIANTS OF BATCHNORM
Based on different intuitions and insights, many alternatives to BatchNorm have been introduced. Some of these variants do not require estimation of population statistics, e.g., layer normalization [4] and Fixup Initialization [33]. Our hypothesis suggests that the adversarial accuracy of these variants should be higher than that of BatchNorm. To test this, we performed experiments with three different variants of BatchNorm for CIFAR10. The results are shown in Table 3. The clean accuracy of these alternatives is lower than that of BatchNorm, so we should expect a similar drop in adversarial accuracy. On the contrary, adversarial accuracy increases. For instance, if we train a neural network with LayerNorm, clean accuracy decreases from 92.1% to 89.4%. However, adversarial robustness for PGD attacks increases significantly from 23.1% to 30.1%. This increase in robustness shows the role of estimated statistics in the vulnerability of CNNs.

4) EXPERIMENT 4 -ADVERSARIAL TRAINING
Adversarial training leverages adversarially perturbed examples to train a neural network. An adversarially trained BatchNorm layer estimates population statistics with adversarial examples. Therefore, we should expect better adversarial accuracy, which has already been shown [51]. We should also expect a smaller gap between using population statistics and input batch statistics (the setup we have used in Experiment 1). This indeed is true and adversarial training bridges the gap between estimated population statistics and validation set batch statistics based BatchNorm, as shown in Table 4. For instance, on the CIFAR10 dataset, the gap between BatchNorm with validation set batch statistics and population statistics is 100% for regular training, but it shrinks to 30% for adversarial training. These results show the importance of the reliability of train-time estimated population statistics and their effect on the adversarial performance of a neural network.

5) EXPERIMENT 5 -GRADIENT MASKING AND BLACKBOX ATTACK
Recently, [3] showed that obfuscated gradients may give a false sense of better robustness. To show that our experiments do not suffer from this problem, we repeated Experiment IV-B2 on a query-based blackbox attack called Square attack [1]. This attack utilizes a random search to find adversarial examples. Since this attack does not use any gradient information, it cannot have the gradient obfuscation issue.
We set the number of queries to 5000 as used by [13]. The robustness of six different models at four different perturbation levels (ϵ ∈ {1, 2, 4, 8}) is reported in Figure 3. Results for each model are reported with population statistics (P) and batch statistics (B). Replacing population statistics with batch statistics significantly improves Square-attack robustness. These results show that our experiments do not suffer from gradient obfuscation problems.
FIGURE 3. Robustness of six different models against the query-based blackbox Square attack [1]. Robustness of each model is evaluated with batch statistics ($\mu_B$, $\sigma_B^2$) and population statistics ($\hat{\mu}_P$, $\hat{\sigma}_P^2$). Robustness of models using batch statistics is much better than that of models using estimated population statistics.
VOLUME 11, 2023

C. SHIFT HYPOTHESIS AND TRANSFER LEARNING
To further test our "Shift Hypothesis", we turn our attention to transfer learning. In transfer learning, we want to transfer the learning of the model trained on a large source dataset to a similar but smaller target dataset. There are many ways to do transfer learning: for instance, using a pre-trained model as a feature extractor and training a classifier on top of it, fine-tuning the last few layers of the pre-trained model on the target dataset, or fine-tuning the whole model on the target dataset [16], [40]. The first configuration is of particular interest to us. In this configuration, a model trained on the source dataset is not updated on the target dataset and uses BatchNorm statistics estimated from the source dataset for feature extraction. We hypothesize that updating BatchNorm's estimate of population statistics on the target dataset can help achieve better accuracy.
To test this, we used the following 10 transfer learning datasets: Birdsnap [30], Stanford Cars [41], Describable Textures Dataset (DTD) [11], CIFAR10 and CIFAR100 [43], UCSD Birds [68], Oxford Flowers [55], Oxford-IIIT Pets [57], Caltech101 [21], and Caltech256 [29]. We used the PyTorch pre-trained ResNet50 as a feature extractor and trained a classifier on top of it without any bells and whistles. To update the BatchNorm layers, we reinitialized the running mean and variance of each BatchNorm layer and recalculated them from the target dataset. We also retrained the weights and biases of BatchNorm since they depend on the sample mean and variance.
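The statistics update can be sketched for a single channel as follows. This is only an illustration under stated assumptions: the class `ChannelStats`, the helper `reestimate`, the reset values, and the use of a cumulative (equal-weight) average over target batches are choices made here, since the text only states that the running mean and variance are reinitialized and recalculated on the target dataset.

```python
from statistics import fmean


class ChannelStats:
    """Running mean and variance of one BatchNorm channel."""

    def __init__(self, mean=0.0, var=1.0):
        self.mean = mean
        self.var = var
        self.num_batches = 0

    def reset(self):
        """Discard the source-dataset estimates."""
        self.mean, self.var, self.num_batches = 0.0, 1.0, 0

    def update(self, batch):
        """Cumulative average of per-batch statistics since the last reset."""
        mu = fmean(batch)
        var = fmean([(x - mu) ** 2 for x in batch])
        self.num_batches += 1
        weight = 1.0 / self.num_batches
        self.mean += weight * (mu - self.mean)
        self.var += weight * (var - self.var)


def reestimate(stats, target_batches):
    """Re-initialize the statistics and recompute them from target-dataset batches."""
    stats.reset()
    for batch in target_batches:
        stats.update(batch)
    return stats


source_stats = ChannelStats(mean=0.0, var=1.0)  # pretend these came from the source data
target_batches = [[4.0, 6.0], [5.0, 7.0]]       # hypothetical target activations
reestimate(source_stats, target_batches)
```

Averaging per-batch variances is a simplification and is not identical to the variance of the whole target dataset. The learnable scale and shift parameters, which the text says were also retrained, are outside the scope of this sketch.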
We report results in Table and Figure 4. The table shows test accuracy, and the figure shows the training convergence for these datasets. These results show that we can achieve significant improvement in both accuracy and convergence speed by updating BatchNorm on target datasets. For instance, on the Stanford Cars dataset, the absolute improvement in accuracy is 12%, and on Birdsnap, it is 11.2%. However, an even more interesting observation is the relation between the accuracy gain and the target dataset's similarity to ImageNet. For instance, UCSD-Birds and Birdsnap are both bird datasets. However, as noted on the UCSD-Birds dataset website, many of its images overlap with ImageNet. This similarity affects the accuracy gain obtained by updating BatchNorm, i.e., the improvement for Birdsnap is 10% compared to 2% for UCSD-Birds. Similarly, CIFAR10 and CIFAR100 consist of tiny images of size 32 compared to ImageNet's size of 224, and the absolute gains in accuracy are 19% and 23%, respectively. Note that updating BatchNorm increases accuracy in every case.

V. TOWARDS A ROBUST NORMALIZATION
In Section IV, we empirically show that train-time estimated population statistics of BatchNorm make a deep model more vulnerable to adversarial distribution shift. A straightforward solution, as shown in the experiments (Table 1), is to use the batch statistics calculated from the inference inputs. However, during training, activations are normalized by the statistics estimated from the large batch. This introduces an intra-batch dependency [35] for BatchNorm. This issue makes BatchNorm dependent on the moving average estimated for inference. In the experiments of the last section, we used a batch size of 128 (same as the training batch size) to validate our hypothesis. However, if we use a small inference batch size to calculate statistics, BatchNorm performance drops to zero, as shown in Figure 7. This decrease in performance illustrates that we cannot use batch statistics as a solution to this problem.
Caption (transfer learning results): We used a ResNet50 pre-trained on ImageNet as a feature extractor and trained a classifier for the target dataset. For the second case, we also updated BatchNorm statistics on the target dataset.
Recent work on robustness has also shown a connection between the removal of outliers in activations and robustness [18], [74]. Based on these heuristics, min-max normalization becomes a good candidate since it re-scales the input (controlling exploding activations), only requires maximum and minimum values, which are not dependent on the distribution, and can remove outliers. We keep using the mean, considering the importance of centering the data [60]. We define a simplified version of our RobustNorm as:
$$\hat{x}_i = \frac{x_i - \mu_B}{r_B},$$
where $x_i$ is the $i$-th example of batch $B$, the range is $r_B = u_B - l_B$, the maximum is $u_B = \max_{1 \le i \le M}(x_i)$, and the minimum is $l_B = \min_{1 \le i \le M}(x_i)$. From the von Szőkefalvi Nagy inequality ($r_B^2 \le 2n\sigma_B^2$, where $n$ is the number of samples used to estimate the range), we can say that the range suppresses activations with higher intensity than the variance. BatchNorm uses a linear transform to project activations to an appropriate range. However, in our case, the range makes it harder to learn this projection at the start of learning. To make the control more flexible, we introduced a new hyper-parameter, the norm power ($p$). Finally, we define Robust Normalization (RobustNorm or RN) as follows:
$$\hat{x}_i = \frac{x_i - \mu_B}{(r_B)^p}.$$

VI. EXPERIMENTS

A. EXPERIMENTAL SETUP
We have used two architectures: ResNet [31] with 20, 38, and 50 layers and VGG [65] with 11 and 16 layers. We have used five different datasets for robustness evaluations: MNIST [45], Fashion-MNIST [71], CIFAR10, CIFAR100 [43], and ImageNet [14]. We have always used a learning rate of 0.1 except for no-normalization scenarios, where convergence is not possible with higher learning rates. In that case, we have used a learning rate of 0.01. We decrease the learning rate by a factor of ten at the 80th and 120th epochs for CIFAR10 and CIFAR100; at the 30th epoch for MNIST and Fashion-MNIST; and at the 30th, 60th, and 90th epochs for ImageNet.

B. EVALUATION ON CIFAR
We evaluated the robustness of RobustNorm on two different datasets. RobustNorm's accuracy is higher in the presence of adversarial attacks (Figure 6). Specifically, RobustNorm increases the adversarial accuracy of ResNet20 from 22% to 70% for CIFAR10 (note that the epsilon value is 1/255).
All the results are shown in Figure 6.

C. EVALUATION ON CIFAR10 WITH DIFFERENT ARCHITECTURES AND ATTACKS
To further verify the effectiveness of RobustNorm, we trained two models, one with BatchNorm and one with RobustNorm. Both models were trained for 100 epochs with an initial learning rate of 0.1 and cosine annealing learning rate decay [47]. For RobustNorm, we use p = 0.2. We evaluated both of these models on four different attacks: PGD-20, PGD-100, APGD-CE [13], and Square attack [1]. We used four different perturbation levels, i.e., ϵ ∈ {1, 2, 4, 8}, for robustness evaluation. The results are shown in Table 6. RobustNorm performs better than BatchNorm against a diverse set of attacks and perturbation levels.

D. EVALUATION ON ImageNet
To test the effectiveness of RobustNorm at scale, we performed experiments for RobustNorm on ImageNet. Results are shown in Table 7. RobustNorm beats BatchNorm for all the attacks by a wide margin. Note that we have not used any fine-tuning for hyper-parameter p.

E. EVALUATION ON SMALLER BATCH SIZES
RobustNorm performs better than BatchNorm when we do not use batch statistics, as shown in Figure 7. However, it still suffers some loss of accuracy. Since the mean (µ) is a distribution statistic, we use its estimate calculated during training. This improves the performance of RobustNorm for small batch sizes, as shown in Figure 7. To understand the effect of $\hat{\mu}_P$ on the adversarial accuracy of RobustNorm, we perform experiments with varying values of ϵ. As shown in Figure 7, the adversarial accuracy of RobustNorm with $\hat{\mu}_P$ is comparable to that of RobustNorm, while also giving consistent performance for small inference batches.

F. ABLATION STUDIES
In this section, we have validated and explored different properties of RobustNorm.

1) ANALYSIS OF POWER HYPERPARAMETER
RobustNorm introduces a new hyperparameter, the power $p$ of the range $r_B$. We found that p = 0.2 gives faster convergence (see Figure 8, where red shows p = 0.2) and generalizes across datasets. Therefore, we have used it for all our experiments. Later, we observed that faster convergence does not necessarily mean better adversarial robustness (see Figure 9). For instance, RobustNorm with p = 0.2 performs worse in terms of adversarial accuracy when compared to other values. Similarly, p = 0.05 has better adversarial robustness in RobustNorm when using the population mean. Therefore, there is room for further improvement by tuning this hyperparameter.
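To make the role of p concrete, the following single-channel sketch centers a batch by its mean and divides by the batch range raised to the power p. It is a minimal illustration rather than the implementation used in the experiments: the function name `robust_norm`, the guard constant `eps`, and the example activations are assumptions, and the learnable scale and shift are omitted.

```python
from statistics import fmean


def robust_norm(batch, p=0.2, eps=1e-5):
    """Center by the batch mean and divide by the batch range raised to p.

    eps is an assumed small constant guarding against a zero range.
    """
    mu = fmean(batch)
    value_range = max(batch) - min(batch)
    scale = (value_range + eps) ** p
    return [(x - mu) / scale for x in batch]


# Hypothetical activations of one channel, including a mild outlier.
activations = [1.0, 2.0, 3.0, 5.0]
for power in (0.05, 0.2, 1.0):
    out = robust_norm(activations, p=power)
    print(power, [round(v, 3) for v in out])
```

With p = 1 the output spans a range of about one, as in plain min-max scaling. Smaller values of p rescale the activations less aggressively, so the spread of the output grows as p decreases.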

2) SCALABILITY TO DIFFERENT ARCHITECTURES
We also evaluate RobustNorm to show its scalability on different neural network architectures and depths. We choose ResNet and VGG architectures, as a wide variety of neural networks evolved from these networks. In addition, VGG was designed before BatchNorm, so it is also interesting to see its performance under different normalizations. To show the scalability of RobustNorm for different depths, we choose two commonly used depths of ResNet (38 and 50) and VGG (11 and 16). Results for the experiments on these architectures for CIFAR10 are shown in Table 8. RobustNorm outperforms BatchNorm by wide margins in all of these networks. For instance, RobustNorm has a margin of 50% with ResNet38, 31% with ResNet50, 15% with VGG11, and 28% with VGG16 when the input has BIM adversarial noise. Similar trends are also visible under different attacks.

VII. CONCLUSION
In this paper, we have investigated the role of BatchNorm in the adversarial vulnerability of convolutional neural networks. We observed that BatchNorm estimates population statistics from natural images during training, and the addition of adversarial noise introduces a targeted distribution shift in the input during inference. We hypothesized that this shift makes train-time estimated statistics inaccurate, thereby contributing to the adversarial vulnerability of BatchNorm. We validated our hypothesis by showing adversarial accuracy differences between statistics calculated from perturbed validation-set batches and train-time estimated statistics. We also showed that normalizations that do not require these train-time estimated values perform better compared with BatchNorm. Based on these insights, we proposed a variant of BatchNorm, which increases the robustness while keeping other benefits of BatchNorm. We have also extended our hypothesis to transfer learning, where we showed that it is possible to get a significant accuracy gain by updating batch statistics on the target dataset.