Adversarial Attacks and Defenses in Fault Detection and Diagnosis: A Comprehensive Benchmark on the Tennessee Eastman Process

Integrating machine learning into Automated Control Systems (ACS) enhances decision-making in industrial process management. One of the limitations to the widespread adoption of these technologies in industry is the vulnerability of neural networks to adversarial attacks. This study explores the threats in deploying deep learning models for Fault Detection and Diagnosis (FDD) in ACS using the Tennessee Eastman Process dataset. By evaluating three neural networks with different architectures, we subject them to six types of adversarial attacks and explore five different defense methods. Our results highlight the strong vulnerability of models to adversarial samples and the varying effectiveness of defense strategies. We also propose a new defense strategy based on combining adversarial training and data quantization. This research contributes several insights into securing machine learning within ACS, ensuring robust FDD in industrial processes.


Introduction
Automated Control Systems operate with a variety of digital and analog signals received from sensors and control mechanisms. An example is a chemical plant where a set of sensors reflects the condition of an industrial process. A common task is Fault Detection and Diagnosis (FDD), where one needs to predict and/or classify a failure based on sensor data. Such methods play a pivotal role in monitoring diverse industrial processes, ranging from chemical processes to electromechanical drive systems. The classification proposed by [1] categorizes these methods into three groups: those based on expert knowledge, mathematical models, and data-driven approaches. The last group includes various machine learning approaches, including neural networks [2][3][4]. Machine learning algorithms outperform traditional rule-based methods and are becoming more widespread in the area [5]. Recent studies demonstrate the success of various neural network architectures for FDD: Multi-Layer Perceptrons, Recurrent Neural Networks, and Convolutional Neural Networks [6,7].
However, another challenge appears: modern neural networks are vulnerable to adversarial attacks [8]. The idea is that the attacker changes the input data slightly, so that the change goes unnoticed, but the prediction of the FDD model becomes incorrect. Such attacks are modeled in the literature [9,10], but it remains unclear whether there exist defense strategies that are effective against a wide range of attacks. To address this challenge, we benchmark attacks and defenses on the Tennessee Eastman Process (TEP) dataset [11], where the task is FDD in a chemical process. We consider three different deep learning models: a Multi-Layer Perceptron (MLP), a model based on Gated Recurrent Units (GRU), and a Temporal Convolutional Network (TCN). We subject these models to six different types of adversarial attacks and explore five defense methods. We analyze the success of these protective measures. Then, we propose a novel protection strategy that combines several defense methods.
The contributions of this work are as follows:
• We benchmark popular attack and defense methods on the TEP dataset, which shows that existing universal defense methods greatly reduce model quality on original data.
• To address this issue, we suggest a new defense approach based on adversarial training and data quantization and demonstrate its effectiveness on average across various attacks.
• We discuss the benchmark results and conclude that autoencoders have the potential to be a universal defense method but need more research.
The rest of the paper is organized as follows. Section 2 gives a review of attack and defense approaches in the area. Section 3 describes the attack and defense methods used in our work. In Section 4 we describe the experiments, and we discuss them in Section 5. Finally, Section 6 concludes the paper.

Review
The operating principles of modern machine learning methods contain vulnerabilities that can be used to carry out various attacks on models. An attack developed in one domain usually has to be adapted and modified before it can be applied in another. Major research into machine learning attack vectors began around 2014 [12]. Over the past five years, this area has advanced considerably, and many different attack variants have been developed.

Classification of attacks on machine learning models
Attacks on machine learning models are typically categorized into several types, which vary based on the capabilities of the attacker in relation to the target and its characteristics throughout the model's lifecycle. These attack types include evasion attacks [13][14][15], poison attacks [16][17][18] and exploratory attacks [19][20][21]. Some of these attack categories have further subgroups. In addition, attacks are divided into three groups based on the level of information available about the model's architecture and access to its internal parameters and/or requests: white-box, black-box, and gray-box attacks.
In evasion attacks (often called adversarial examples) [13][14][15], an attacker interacts with a trained machine learning model and manipulates its behavior by perturbing input samples during testing. The term "evasion" implies that the attacker not only aims to cause the model to behave incorrectly but also seeks to evade detection by both human and automated defense mechanisms.
The term "poison attack" [16][17][18] is used broadly in the literature. Typically, it refers to injecting poisoned samples into the training dataset with the aim of distorting the training process (so-called data poison attacks). Exploratory attacks involve sending queries to the model to understand its principles of operation. Such attacks can pursue various goals: stealing the model [22,23], conducting membership inference attacks [19][20][21], and others. This article focuses on evasion attacks, as they pose the greatest threat: they do not require an insider attacker and directly impact the model's predictions.
In white-box attacks [24,25], the attacker possesses complete information, enabling them to execute any operations on their instance of the deployed model (e.g., obtaining gradients, accessing output data from any layer) to construct perturbed samples. If defense mechanisms are present, they are also exposed to attack by the adversary.
Black-box attacks [26][27][28] assume that the attacker can only make a limited number of requests (L, where 1 ≤ L < ∞) to the deployed model. The model provides the attacker with predictions such as a class label or probability, a semantic segmentation map, etc.
Gray-box (or semi-white-box) attacks [29,30] represent an intermediate state between white-box and black-box scenarios. They impose certain restrictions under which the attacker has some, but incomplete, information about the learning model or process.

Most common methods of attack and defense
In the literature, there are numerous articles that explore issues akin to those addressed in this study. However, these methods are either examined on different datasets or pertain to different domain areas, so they need to be adapted for our purposes. Notably, many initial attacks were devised for image analysis models, and not all methods have been fully tailored to the domain area under our investigation. Therefore, in this review, we also consider these aspects. Articles such as [31][32][33][34][35] delve into the impact of attacks on image datasets such as CIFAR-10, CIFAR-100, and MNIST and on models such as ResNet-20. These articles also discuss various protective measures. Key attack methods include L-BFGS, FGSM, PGD, C&W, and DeepFool. While identifying the most prevalent defense methods can be challenging, several key approaches emerge, notably Defense-GAN and Adversarial Training. Moreover, articles such as [9,10,36-40] discuss attacks and defense methods on datasets like TEP and/or similar domain areas such as CARLA, Electra, SWaT, BATADAL, and WADI. Many attack and defense methods in these articles share operational principles with those used in computer vision.
In white-box adversarial attacks, access to gradients is a primary tool. Attackers exploit gradients by calculating the gradient of the loss function with respect to the input. Then, they perturb the input in the direction of the gradient to maximize the loss. The mathematical details of the attack algorithms used in this article are outlined in the next section (Section 3).
The Fast Gradient Sign Method (FGSM), proposed by [41], generates adversarial examples with a single gradient step. It updates the input in the direction of increasing loss, using a small multiple of the sign of the gradient. While FGSM is fast, its success rate for adversarial examples is low.
To improve the success rate of FGSM, [42] introduced the Projected Gradient Descent (PGD) method. Unlike FGSM, PGD takes multiple smaller steps in the gradient direction and clips the result to a specified bound. Although PGD is more effective than FGSM in finding adversarial examples closer to the model's decision boundary, it is computationally more expensive because it requires multiple iterations.
For untargeted attacks, [43] proposed the DeepFool method, optimized for the $L_2$ distance metric. It assumes the linearity of the decision boundary in neural networks and finds the minimum adversarial perturbation needed to fool the classifier. DeepFool iteratively identifies the direction that maximally changes the current prediction of the neural network and takes a small step in that direction until it finds a true adversarial example.
A more sophisticated white-box attack, the C&W attack by [44], is applicable under various distance metrics: $L_0$, $L_2$, and $L_\infty$. This attack optimizes a loss function that considers the distance between the original input and the adversarial example, along with the classifier's prediction confidence. The optimization includes a constraint on the perturbation size, making the resulting adversarial example more realistic and challenging to detect.
Adversarial training [31,45,46] adds adversarial examples to the training set so that the model learns to classify them correctly. Despite some problems, adversarial training remains one of the most effective methods for protecting against adversarial attacks and is widely used in practice to improve the security and reliability of neural networks.

Interactions among defense methods
While the literature offers many different methods for defending machine learning models, there are few studies that explore building models combining multiple defense methods. In [47], the authors examine the possibility of combining the most popular defense methods against evasion and poisoning attacks. The research concludes that many methods are incompatible at the level of algorithmic ideas, and demonstrates this through practical examples. Therefore, constructing models that combine defense methods is a complex and underexplored task.

Summary of the review
Since the main objective of the article is to create a benchmark, we pay particular attention to the most common methods described in the literature. This research focuses on analyzing vulnerabilities and implementing protection strategies within ACS. The following section elaborates on the mathematical aspects of the methods employed in this research.

Fault diagnosis methods
Fault detection and diagnosis (FDD) methods are widely used in monitoring industrial processes, such as chemical processes [48] and electromechanical drive systems [49]. The authors of [1] divide FDD methods into three groups: data-driven, model-based, and knowledge-based approaches. In our work, we investigate the properties of data-driven methods.
The data-driven FDD problem is formulated as follows. Let there be a sequence of observations $X_1, \dots, X_n$, where $X_t \in \mathbb{R}^d$ are the sensor values at time $t$. Thus, $X_1, \dots, X_n$ form a multivariate time series. Also, let there be a sequence of labels $y_1, \dots, y_n$, where $y_t \in \{0, 1\}^m$ defines the type of fault at time $t$. If $\arg\max(y_t) = 0$, the process is in the normal state; otherwise $\arg\max(y_t)$ determines the fault number. Then, for a sliding window of width $k$, we need to find a function $f: \mathbb{R}^{d \times k} \to \mathbb{R}^m$ that minimizes
$$\sum_{t=k}^{n} l\big(f(X_{t-k+1}, \dots, X_t),\, y_t\big),$$
where $l$ is some loss function, most commonly cross-entropy, also known as Log Loss. The function $f$ can be found using machine learning methods.
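The sliding-window formulation can be illustrated with a short NumPy sketch. The function name `sliding_windows`, the toy data, and the choice to store each window as a `(k, d)` array with integer labels are assumptions made for illustration; they are not taken from the authors' code.

```python
import numpy as np

def sliding_windows(X, y, k):
    """Build (window, label) pairs from a multivariate time series.

    X: array of shape (n, d) with sensor values over time.
    y: array of shape (n,) with integer fault labels (0 = normal state).
    k: sliding window width.
    Returns windows of shape (n - k + 1, k, d) and, for each window,
    the label at its last time step.
    """
    n = X.shape[0]
    if k > n:
        raise ValueError("window width k must not exceed series length n")
    windows = np.stack([X[t - k + 1 : t + 1] for t in range(k - 1, n)])
    labels = y[k - 1 :]
    return windows, labels

# Toy series: n = 6 time steps, d = 2 sensors (illustrative values only).
X = np.arange(12, dtype=float).reshape(6, 2)
y = np.array([0, 0, 0, 1, 1, 1])
windows, labels = sliding_windows(X, y, k=3)
```

Each window is paired with the label of its most recent time step, which matches the loss term $l(f(X_{t-k+1}, \dots, X_t), y_t)$.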
In recent years, many deep learning methods based on different neural network architectures have been proposed to solve the FDD problem. The simplest one is the MLP, which was applied to FDD in [6,50-52]. The multivariate time series is converted to a vector of concatenated observations and then processed by the MLP to predict the process state. TCN is another popular architecture for FDD [53][54][55]. TCN is a modification of a 1D convolutional network with causal and dilated convolutions [56], which helps to process sequential data with long-term dependencies. In addition, GRU is a type of recurrent neural network that shows state-of-the-art FDD results on many datasets, including the Tennessee Eastman Process [7,57].

Adversarial attacks
During the attack, an adversarial sample $X' = X + N$ is created, where $X = (X_{t-k+1}, \dots, X_t) \in \mathbb{R}^{d \times k}$ is the original window and $N \in \mathbb{R}^{d \times k}$ is a perturbation matrix. The strength of an attack is defined by the maximal shift $\epsilon$ as follows:
$$\max_{i,j} |N_{ij}| \le \epsilon.$$
When choosing types of attacks, we proceeded from the assumption that the attacker has access either only to input and output data or to all information about the data and model architecture. Two black-box attacks (Random noise, FGSM distillation) and four white-box attacks (FGSM, PGD, DeepFool, Carlini and Wagner) were implemented.

Random Noise
Random noise is the simplest black-box attack, based on adding random values to the input data:
$$x' = x + \epsilon \cdot z,$$
where $\epsilon$ limits the magnitude of the noise values and each element of $z$ follows a Bernoulli distribution with parameter $p = 0.5$ on the sample space $\{-1, 1\}$.
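A minimal sketch of the random noise attack is given below. The function name `random_noise_attack`, the seed, and the input shape (a window of 32 time steps by 52 sensors) are illustrative assumptions.

```python
import numpy as np

def random_noise_attack(x, eps, rng):
    """Shift every input value by +eps or -eps with equal probability."""
    z = rng.choice([-1.0, 1.0], size=x.shape)
    return x + eps * z

rng = np.random.default_rng(0)
x = np.zeros((32, 52))  # one standardized window (illustrative)
x_adv = random_noise_attack(x, eps=0.1, rng=rng)
```

Every element is shifted by exactly ϵ in absolute value, so the perturbation always sits on the boundary of the allowed range.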

Fast Gradient Sign Method (FGSM)
FGSM [41] is a white-box attack based on the gradient of the loss function calculated with respect to the input data. The signs of the obtained gradient vector indicate the direction in which the input data should be changed to increase the probability of model error. The attack consists of shifting each value of the data by a step of size $\epsilon$, with a sign corresponding to the gradient:
$$x' = x + \epsilon \cdot \mathrm{sign}\big(\nabla_x l(f(x), y)\big).$$
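To make the FGSM step concrete, the sketch below applies it to a linear softmax classifier, for which the input gradient of the cross-entropy loss has the closed form $W^\top (p - y)$. The linear model, its random weights, and all function names are assumptions chosen so that the example runs without a deep learning framework; the models in this study are neural networks whose gradients would come from automatic differentiation.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def loss_and_input_grad(W, b, x, y_onehot):
    """Cross-entropy loss of a linear softmax model and its gradient w.r.t. x."""
    p = softmax(W @ x + b)
    loss = -np.sum(y_onehot * np.log(p + 1e-12))
    grad_x = W.T @ (p - y_onehot)
    return loss, grad_x

def fgsm(W, b, x, y_onehot, eps):
    """Single-step attack: move each input value by eps along the gradient sign."""
    _, grad_x = loss_and_input_grad(W, b, x, y_onehot)
    return x + eps * np.sign(grad_x)

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 8))   # 3 classes, 8 input features (illustrative)
b = np.zeros(3)
x = rng.normal(size=8)
y = np.eye(3)[0]              # one-hot label for class 0

x_adv = fgsm(W, b, x, y, eps=0.1)
loss_clean, _ = loss_and_input_grad(W, b, x, y)
loss_adv, _ = loss_and_input_grad(W, b, x_adv, y)
```

For this convex toy model the loss cannot decrease after the step; for a neural network the same step is only a first-order heuristic.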

FGSM Distillation
Distillation can be used to create a black-box adversarial attack, as proposed in [58]. Based on the input and output data of the model, a neural network classifier with an arbitrary architecture can be trained. Adversarial samples are obtained by attacking the resulting model with any white-box attack. In our study, we used the MLP architecture and the FGSM attack.
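The surrogate idea can be sketched as follows. The function `black_box_predict` is a hypothetical stand-in for the deployed FDD model, and the surrogate here is a linear softmax classifier rather than the MLP used in the study, so that the example stays self-contained; the learning rate, epoch count, and data are likewise assumptions.

```python
import numpy as np

def softmax_rows(Z):
    E = np.exp(Z - Z.max(axis=1, keepdims=True))
    return E / E.sum(axis=1, keepdims=True)

def black_box_predict(X):
    """Hypothetical deployed model: the attacker only sees its output labels."""
    return (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

def fit_surrogate(X, labels, n_classes, lr=0.5, n_epochs=200):
    """Train a linear softmax surrogate on (input, black-box output) pairs."""
    Y = np.eye(n_classes)[labels]
    W = np.zeros((n_classes, X.shape[1]))
    b = np.zeros(n_classes)
    for _ in range(n_epochs):
        P = softmax_rows(X @ W.T + b)
        W -= lr * (P - Y).T @ X / len(X)
        b -= lr * (P - Y).mean(axis=0)
    return W, b

def fgsm_on_surrogate(W, b, X, labels, eps):
    """White-box FGSM computed on the surrogate, not on the black box."""
    Y = np.eye(W.shape[0])[labels]
    P = softmax_rows(X @ W.T + b)
    return X + eps * np.sign((P - Y) @ W)

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
labels = black_box_predict(X)            # queried from the black box
W, b = fit_surrogate(X, labels, n_classes=2)
X_adv = fgsm_on_surrogate(W, b, X, labels, eps=0.3)

acc_clean = (black_box_predict(X) == labels).mean()
acc_adv = (black_box_predict(X_adv) == labels).mean()
```

The adversarial samples are built without any access to the internals of `black_box_predict`, yet they transfer to it because the surrogate approximates its decision boundary.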

Projected Gradient Descent (PGD)
PGD [42] is an iterative modification of the FGSM white-box attack method. The main difference is that the data shift is done in several steps. After each step, the gradient signs are recalculated:
$$x'_{i+1} = \mathrm{Clip}_{x,\epsilon}\big\{x'_i + \alpha \cdot \mathrm{sign}\big(\nabla_x l(f(x'_i), y)\big)\big\},$$
where $x'_i$ denotes the input data as changed by the previous iterations, $\mathrm{Clip}\{\}$ limits the resulting data shift to no more than $\epsilon$, and $\alpha$ denotes the shift step size at each iteration.
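The iteration can be sketched on the same kind of toy model: a linear softmax classifier with a closed-form input gradient. The model, the step size `alpha`, the number of steps, and the function names are illustrative assumptions rather than the settings used in the experiments.

```python
import numpy as np

def input_grad(W, b, x, y_onehot):
    """Gradient of the cross-entropy loss of a linear softmax model w.r.t. x."""
    z = W @ x + b
    p = np.exp(z - z.max())
    p /= p.sum()
    return W.T @ (p - y_onehot)

def pgd(W, b, x, y_onehot, eps, alpha, n_steps):
    """Repeat small signed-gradient steps, clipping the total shift to eps."""
    x_adv = x.copy()
    for _ in range(n_steps):
        x_adv = x_adv + alpha * np.sign(input_grad(W, b, x_adv, y_onehot))
        x_adv = np.clip(x_adv, x - eps, x + eps)  # Clip_{x, eps}
    return x_adv

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 8))
b = np.zeros(3)
x = rng.normal(size=8)
y = np.eye(3)[0]

x_adv = pgd(W, b, x, y, eps=0.1, alpha=0.02, n_steps=10)
# With a single step of size eps, PGD reduces to FGSM.
x_one_step = pgd(W, b, x, y, eps=0.1, alpha=0.1, n_steps=1)
```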

DeepFool
DeepFool [43] is a white-box attack which minimizes the difference between the elements of the output vector f (x) that correspond to the correct and incorrect fault type.Among all possible incorrect types, the closest in absolute value of the difference is selected.Minimization occurs in several steps, each defined as follows: where false is the value of the output vector corresponding to the nearest incorrect fault type, which is selected independently at each step.After each iteration, the total adversarial vector x ′ i+1 is limited by ϵ value.

Carlini and Wagner (C&W)
C&W [44] is a white-box attack that minimizes the sum of the shift value over the distance metric $D$ and the value of some auxiliary function $g$. The function $g$ takes non-positive values in case of incorrect classification. This optimization problem can be represented as:
$$\min_{\eta} \; D(x, x + \eta) + g(x + \eta),$$
where $g(x) = \mathrm{ReLU}(\arg\max(y) - \arg\max(f(x)))$ and $D$ is the Chebyshev distance. Minimization is performed by stochastic gradient descent or its analogues. For comparison with other attack methods, we constrain $\eta$ according to the selected $\epsilon$ value.

Defense methods
Another goal of the study was to find out how defense methods behave under attacks of different strengths and for different neural network architectures. The five most popular strategies were implemented: Adversarial training, Autoencoder, Quantization, Regularization, and Distillation. We also propose to protect models with a combination of defense methods.

Adversarial Training
The adversarial training method [41] consists of adding adversarial samples to the training set. The training loss function is given as follows:
$$L = l(f(x), y) + \lambda \cdot l(f(x'), y),$$
where $x'$ is an adversarial sample and $\lambda$ is the adversarial training coefficient.

Defensive Autoencoder
An autoencoder can be used to reconstruct attacked data, as proposed in [59]. During its training, the following loss function is minimized:
$$L_{AE} = \| x - x_{AE} \|_2^2,$$
where $x_{AE} = \mathrm{autoencoder}(x + \varepsilon)$ is the reconstructed data and $\varepsilon$ is added noise.

Data Quantization
Quantization is a preprocessing method that converts continuous values into a set of discrete values on a uniform grid [60]. This approach reduces the quality of the input data but can neutralize the impact of adversarial attacks. The fault diagnosis model must be retrained on quantized data.
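A uniform-grid quantizer can be sketched in a few lines. The grid range `[-3, 3]` is an assumption that is plausible for standardized data but is not stated in the text; the parameter `n_bits` corresponds to a grid of $2^n$ discrete values.

```python
import numpy as np

def quantize(x, n_bits, lo=-3.0, hi=3.0):
    """Snap values to a uniform grid of 2**n_bits levels on [lo, hi].

    Values outside [lo, hi] are clipped to the nearest end of the grid.
    """
    levels = 2 ** n_bits
    step = (hi - lo) / (levels - 1)
    idx = np.round((np.clip(x, lo, hi) - lo) / step)
    return lo + idx * step

x = np.array([-5.0, -0.07, 0.0, 0.05, 1.234, 4.0])
xq = quantize(x, n_bits=5)

# A perturbation smaller than half a grid step can be removed entirely
# when the clean value lies close enough to a grid point.
clean = np.array([-3.0])
same_cell = quantize(clean + 0.01, n_bits=5)
```

A coarser grid absorbs larger perturbations but discards more of the original signal, which is the trade-off the method has to balance.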

Gradient Regularization
The fault diagnosis model can be protected by training using gradient regularization [61] of the loss function over the input data: h is a quantization step and λ is a regularization coefficient.

Defensive Distillation
The distillation defense method [62] refers to the process of creating a copy of the original neural network model that is more resistant to adversarial attacks. The original neural network is called the teacher, and the new neural network is called the student. When training the student, so-called soft labels are used, which are obtained using the activation function $\mathrm{softmax}(x, T)$ on the last layer of the teacher:
$$\mathrm{softmax}(x, T)_i = \frac{\exp(x_i / T)}{\sum_j \exp(x_j / T)},$$
where $T$ is a temperature constant. As $T \to 0$ the function converges to a one-hot maximum, and as $T \to \infty$ it converges to a uniform distribution.
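The effect of the temperature can be checked numerically. The logits below are arbitrary illustrative values, and `softmax_t` is a hypothetical helper name.

```python
import numpy as np

def softmax_t(logits, T):
    """Softmax with temperature T > 0, computed in a numerically stable way."""
    z = np.asarray(logits, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.0])
p_sharp = softmax_t(logits, T=0.01)    # close to a one-hot vector
p_plain = softmax_t(logits, T=1.0)     # ordinary softmax
p_smooth = softmax_t(logits, T=100.0)  # close to uniform
```

High temperatures flatten the teacher's output, so the student is trained on labels that carry information about class similarity rather than only the winning class.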

Adversarial Training on Quantized Data
In recent years, a lot of research has been carried out to develop new defense methods against adversarial attacks. New ideas emerge that are superior to previous approaches under certain conditions. However, there is still no ideal defense method capable of protecting against all types of threats. The vulnerability of protected neural networks is reduced only under certain types of adversarial attacks; in other cases, the accuracy of the models drops noticeably.
In this paper, we propose to use a combination of adversarial training and data quantization. As was shown in [60], quantization makes it possible to clean the input of adversarial perturbations because discrete values are aligned to a grid. However, the size of the grid (quantization frequency) plays an important role in this type of protection. If the grid is too coarse, it reduces the quality of fault diagnosis; if the grid is too fine, only a fraction of the data can be effectively recovered. On the other hand, adversarial training provides high model robustness but reduces the quality of diagnosis. This happens because, during training, the data contains many adversarial examples that degrade the model's ability to generalize important dependencies in the data that help diagnose faults. Thus, at high values of ϵ in adversarial training, the quality of the model drops significantly, while at low values it does not provide a sufficient level of protection.
We propose to use adversarial training on the data after quantization. During training, we attack the data with an adversarial attack such as FGSM. We then quantize this data and feed it to the model as a training set. Quantization reduces the strength of the attack, which in turn allows the model to generalize better during adversarial training. As a result, quantization helps the model to achieve better quality in adversarial training.
An additional advantage of this approach is that it does not require a separate model, as is the case with the distillation method or the autoencoder. It is also quite efficient in terms of computational time and memory, since quantization takes linear time and requires no additional memory, while adversarial training has the same complexity as training a model on the original data and also requires no additional memory.
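The training procedure can be sketched end to end on a toy problem: each step attacks the batch with FGSM, quantizes the clean and attacked samples, and takes a gradient step on the result. The linear softmax model, the grid range `[-3, 3]`, the learning rate, and the use of equal amounts of clean and attacked data in every step are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def softmax_rows(Z):
    E = np.exp(Z - Z.max(axis=1, keepdims=True))
    return E / E.sum(axis=1, keepdims=True)

def quantize(X, n_bits, lo=-3.0, hi=3.0):
    # Uniform grid with 2**n_bits levels on [lo, hi]; the range is an assumption.
    step = (hi - lo) / (2 ** n_bits - 1)
    return lo + np.round((np.clip(X, lo, hi) - lo) / step) * step

def fgsm_batch(W, b, X, Y, eps):
    # Closed-form input gradient of cross-entropy for a linear softmax model.
    P = softmax_rows(X @ W.T + b)
    return X + eps * np.sign((P - Y) @ W)

def train_step(W, b, X, Y, eps, n_bits, lr):
    """One step of adversarial training on quantized data."""
    X_adv = fgsm_batch(W, b, X, Y, eps)                      # 1) attack
    X_train = quantize(np.concatenate([X, X_adv]), n_bits)   # 2) quantize
    Y_train = np.concatenate([Y, Y])
    P = softmax_rows(X_train @ W.T + b)                      # 3) gradient step
    W = W - lr * (P - Y_train).T @ X_train / len(X_train)
    b = b - lr * (P - Y_train).mean(axis=0)
    return W, b, X_train

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 4))
Y = np.eye(2)[(X[:, 0] > 0).astype(int)]   # toy labels
W, b = np.zeros((2, 4)), np.zeros(2)
for _ in range(50):
    W, b, X_train = train_step(W, b, X, Y, eps=0.1, n_bits=5, lr=0.5)
```

No second model is involved: the only additions to ordinary training are the attack and an element-wise rounding of the inputs.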

Dataset
The Tennessee Eastman Process is a very popular dataset for benchmarking fault detection and diagnosis methods. It describes the operation of a chemical production line, where the process smoothly transitions from a normal state to a faulty one. In our study, we used a version of the TEP extended by Reinartz et al. [48] that contains significantly more sensor data than the original (5.2 GB vs 58 MB). This version includes 100 simulation runs for each of the 28 fault types. Each run consists of 52 sensor values for 2000 timestamps, and thus the input samples are matrices $X \in \mathbb{R}^{k \times 52}$, where $k$ is the sliding window size. All data in our experiments were standardized by removing the mean and scaling to unit variance.
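Standardization per sensor can be sketched as below. Computing the statistics on the training split and reusing them for the test split is an assumption (the text does not say which split was used), and the synthetic data only mimics the shape of one TEP run.

```python
import numpy as np

def standardize(train, test):
    """Remove the per-sensor mean and scale to unit variance.

    Statistics come from `train` only and are reused for `test`
    (an assumption; the split used for the statistics is not stated).
    """
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    std = np.where(std == 0, 1.0, std)  # guard against constant sensors
    return (train - mean) / std, (test - mean) / std

rng = np.random.default_rng(0)
train = rng.normal(loc=5.0, scale=2.0, size=(2000, 52))  # 2000 timestamps, 52 sensors
test = rng.normal(loc=5.0, scale=2.0, size=(2000, 52))
train_s, test_s = standardize(train, test)
```

Because every sensor ends up with unit variance, a single ϵ value has a comparable meaning across all 52 inputs.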

Experiments
In our study, we wanted to find out how adversarial attacks affect FDD models based on neural networks with different architectures and what defense methods can be used. To analyze the impact of adversarial attacks on fault diagnosis models, the accuracy metric was chosen. This metric reflects well the changes in the quality of models when the data is attacked.
The description of our experiments is divided into four subsections. The FDD models subsection describes the training process of neural networks with different architectures. The next subsection shows how the accuracy of the models changes under different types of attack. Then, various methods for protecting models and their properties are shown. Finally, based on the experimental results, we propose and evaluate an approach that combines two defense methods. All final results can be found in Fig. 7 and Tables 6-8.

FDD models
For our experiments, we used three neural network models with different architectures. To make the models differ from each other more, they contain different numbers of parameters and were trained for different numbers of epochs. The first model is a multilayer perceptron (MLP) consisting of two linear layers and containing 3 452 949 parameters. The second one is based on gated recurrent units (GRU) and contains 204 565 parameters. We also used a temporal convolutional network (TCN) with 151 935 parameters. Data were standardized to unit standard deviation. The sliding window size was 32, which is a compromise between the accuracy of the models and the duration of the experiments. The models were trained on the TEP dataset for 20, 5 and 10 epochs for MLP, GRU and TCN respectively. The accuracy metrics for fault diagnosis on non-attacked data are presented in Table 1. The combinations of the number of parameters and training epochs were selected on the validation set.
Table 1: Accuracy of unprotected models on normal data.

Model    Accuracy
MLP      0.8873 ± 0.0002
GRU      0.9067 ± 0.0041
TCN      0.8985 ± 0.0097

The selected neural network architectures showed similar accuracy and can effectively solve the fault diagnosis task.

Attacks on FDD models
At the next stage, unprotected models were attacked by six types of attacks with different ϵ. For ϵ values, 20 points were selected with a step of 0.015, covering the range from 0.015 to 0.3. We consider this range to be reasonable given that the data is scaled to unit variance and the attack should not be detected by either human or automated defense mechanisms. Fig. 1 shows how the model's accuracy degrades depending on the type and strength of the attack. It decreases significantly even with small shifts in the attacked data, for ϵ values less than 0.05.
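The grid of attack strengths can be written out explicitly. Twenty values with a step of 0.015 end at 0.3 only if the grid starts at 0.015 rather than at 0; that starting point is an assumption, chosen because it agrees with the range quoted for the adversarial training experiments.

```python
import numpy as np

step = 0.015
# 20 attack strengths: 0.015, 0.030, ..., 0.300
eps_grid = step * np.arange(1, 21)

# Including eps = 0 (no attack) as a reference point would give 21 values.
eps_with_zero = step * np.arange(0, 21)
```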
To cause potential harm, an attacker does not always need to have access to model architectures and use white-box attacks. Experiments have shown that to create a strong adversarial attack, it is enough to have access to the input and output data of the FDD system. This data can be used to train a neural network of arbitrary architecture, on the basis of which adversarial samples are then created. In our study, the FGSM distillation attack affected the accuracy of the models to a similar degree as the white-box attacks. This type of black-box attack seems to be the easiest to carry out and potentially the most dangerous.

Protection of FDD models
All three neural network architectures have proven to be highly vulnerable to adversarial attacks and require protection methods. The defense methods studied in our research have many variations and tunable parameters. It is not possible to conduct experiments for all combinations of settings and models in a reasonable period of time. Therefore, we took only the TCN model, which has the fastest inference, to select suitable settings for the defense methods. Experiments conducted for each type of protection are described in the following subsections. After tuning, the defense methods were applied to all FDD models, and the final results are presented in Fig. 7 and Tables 6-8.

Adversarial Training
In our study, we used equal amounts of normal and attacked data for the adversarial training method. Experiments have shown a strong dependence of the model robustness on the set of adversarial samples used during training. For example, a model trained on attacked data with ϵ = 0.1 is not protected from attacks with ϵ values of 0.05 and 0.2. Fig. 2 shows changes in the TCN model's accuracy after adversarial training with different options. The first one is training with FGSM adversarial samples and a fixed ϵ value equal to 0.1. Then the number of ϵ values was expanded to a set ranging from 0.015 to 0.3 (ϵ values were randomly selected from the set for every data sample). The same measurements were made for training with PGD adversarial samples.
Adding more varied perturbed data to the training process increases the average robustness of the model to adversarial attacks, but the quality on normal data decreases. Table 2 shows the accuracy of the TCN model on non-attacked data before and after adversarial training. Training with PGD adversarial samples showed better robustness against all attack types but worse quality on non-attacked data. We used this setting for the final comparison of all defense methods.

Defensive Autoencoders
During the experiments, we trained a simple autoencoder with linear layers in the encoder and decoder parts. There are two options for using it in conjunction with the models: the model can be trained on the original data or on the autoencoder output. Both approaches are shown in Fig. 3 using the TCN model as an example. Experiments have shown that, when using an autoencoder, a model trained on its output shows better quality and robustness to adversarial attacks. This setting was chosen for the final comparison for all models. Accuracy metrics on non-attacked data can be seen in Table 3. The accuracy is significantly lower than that of the unprotected model, but there is an opportunity to experiment with advanced autoencoder architectures in further research.

Data Quantization
For the final comparison, the quantization setting with n = 5 was chosen as a trade-off between the model's robustness to attacks and the accuracy on normal data.

Gradient Regularization
The tuning parameters of the regularization method did not show a significant impact on the effectiveness of the protection. Fig. 5 (a) shows the change in the accuracy of the protected TCN model after all types of attacks. The quantization step parameter h and the regularization coefficient λ were equal to 0.001 and 1 respectively. Regularization turned out to be useful only for small ϵ values. However, it noticeably improves robustness against random noise.

Defensive Distillation
Distillation is a gradient masking technique that protects models against gradient-based adversarial attacks. Changing the temperature constant T does not significantly affect the effectiveness against other types of attacks. Fig. 5 (b) shows the accuracy of the TCN model protected by the distillation defense method with parameter T = 100. The results confirm good protection against the gradient-based FGSM and PGD adversarial attacks. However, other threats remain relevant when using this protection method.

Adversarial Training on Quantized Data
The experimental results showed that various defense methods can be effective against some types of attack and not against others. This fact suggests the idea of using several defense approaches together. In our study, we used a combination of the adversarial training and quantization defense methods.
For the adversarial training setting, we chose training on FGSM samples with ϵ = 0.1 (Fig. 2 (a)). We combined it with quantization using $2^5$ discrete values (Fig. 4 (a)). The results of this combination significantly exceed the effectiveness of these methods separately (Fig. 6 (a)). Moreover, we tried to change the quantization defense setting by increasing the number of discrete values to $2^8$. Its combination with FGSM adversarial training is shown in Fig. 6 (b). The quantization defense with $2^8$ discrete values is more vulnerable to adversarial attacks than the one with $2^5$ values. However, its combination with adversarial training gives partially better results, especially on normal, non-attacked data (Table 5).

Discussion
Our experiments confirmed the vulnerability of fault diagnosis models based on different neural networks to adversarial attacks. We implemented six types of attacks, and all of them lead to a significant decrease in the accuracy of FDD methods. A good defense method should be effective against any type of adversarial attack. At the same time, the accuracy of defended models should not drop significantly on normal, non-attacked data. The ϵ parameter, which limits the maximum shift in the attacked data, is common to all types of attacks. The choice of the ϵ value range when creating protection for models depends on many factors (such as additional systems for detecting adversarial attacks) and is a subject of discussion. In our work, we investigated five types of defense methods against adversarial attacks with ϵ values in the range (0, 0.3). We also proposed a combination of the adversarial training and quantization defense methods.
Adversarial training with PGD samples and defense by autoencoder can be considered universal methods against adversarial attacks over a wide range of ϵ values. The disadvantage of these approaches is a significant decrease in accuracy on normal, non-attacked data. Adversarial training can be done against an attack with a specific ϵ value without losing accuracy on normal data, but it will be ineffective against attacks with other ϵ values. Adding more variety to adversarial samples degrades the overall accuracy of the model.
The accuracy of the model protected by the autoencoder decreased noticeably on non-attacked data but remained stable under most types of attacks. This approach seems to have great potential and requires further research with different autoencoder architectures. Additionally, the vulnerability of the autoencoders themselves should be studied. The disadvantage of this method is the need for additional computing resources.
Other methods, such as quantization, regularization and distillation, have shown high protection against some types of attacks and poor results against others. To address the limitations of individual defense methods, we explored the possibility of combining them. In our study, we combined FGSM adversarial training and the quantization defense method. This approach provides good protection against most types of attacks (except for PGD adversarial examples with large ϵ values) with small losses in quality. It is computationally efficient and does not require additional memory. Other combinations of defense methods can be explored in further research.

Conclusion
This study confirmed that adversarial attacks can greatly reduce the quality of FDD models. Such attacks can be quite feasible if attackers have access to the data exchange system. Therefore, it is important to know the robustness of models used in real systems to adversarial samples. There are many types of attacks and only a few universal defense methods capable of protecting against all of them simultaneously. Moreover, universal defense methods significantly reduce the accuracy of the models on normal, non-attacked data. Here it seems that autoencoders can be improved in further research through the use of more advanced architectures. Many defense methods protect against certain types of adversarial attacks with good efficiency. Such methods can be combined into one more powerful defense system, as in our experiment with adversarial training and quantization.

Figure 1: Accuracy drop of unprotected models under six different types of attacks depending on the strength of an attack ϵ.

Table 2: Accuracy of the TCN model on non-attacked data before and after adversarial training.

Figure 2: Accuracy of the TCN model protected by adversarial training with different settings: a) training on FGSM samples with fixed ϵ = 0.1; b) training on FGSM samples with a set of ϵ values from the range (0, 0.3); c) training on PGD samples with fixed ϵ = 0.1; d) training on PGD samples with a set of ϵ values from the range (0, 0.3).

Figure 3: Accuracy of the TCN model protected by autoencoder: a) model was trained on the original data; b) model was trained on the data obtained at the output of the autoencoder.

Figure 4: Accuracy of the TCN model protected by quantization: a) model is under FGSM attack and n indicates the number of discrete values during the quantization process ($2^n$); b) model is protected by quantization with n = 5 under six types of attacks.

Figure 5: Accuracy of the TCN model protected by: a) regularization defense method; b) distillation defense method.

Table 3: Accuracy of the TCN model protected by autoencoder on normal data.

Table 4 shows the accuracy of the TCN model protected by quantization with different numbers of discrete values on non-attacked data.

Table 4: Accuracy of the TCN model protected by quantization with different parameter n on normal data.

Table 5: Accuracy of the TCN model protected by combination of adversarial training and quantization defense method on normal data.

Table 6: Accuracy of protected and unprotected MLP model after adversarial attacks with different ϵ values.