A Modular Ice Cream Factory Dataset on Anomalies in Sensors to Support Machine Learning Research in Manufacturing Systems

A small deviation in manufacturing systems can cause huge economic losses, and all components and sensors in the system must be continuously monitored to provide an immediate response. The usual industrial practice is rather simplistic, based on brute-force checking of a limited set of parameters, often with pessimistic pre-defined bounds. The usage of appropriate machine learning techniques can be very valuable in this context to narrow down the set of parameters to monitor, define more refined bounds, and forecast impending issues. One of the factors hampering progress in this field is the lack of datasets that can realistically mimic the behavior of manufacturing systems. In this paper, we propose a new dataset called MIDAS (Modular Ice cream factory Dataset on Anomalies in Sensors) to support machine learning research on analog sensor data. MIDAS is created using a modular manufacturing simulation environment that simulates the ice cream-making process. Using MIDAS, we evaluated four different supervised machine learning algorithms (Logistic Regression, Decision Tree, Random Forest, and Multilayer Perceptron) for two different problems: anomaly detection and anomaly classification. The results showed that the Multilayer Perceptron is the most suitable algorithm with respect to model accuracy and execution time. We have made the dataset and the code for the experiments publicly available, to enable interested researchers to enhance the state of the art by conducting further studies.


I. INTRODUCTION
In a world where most commodities are mass-produced by companies using automated manufacturing systems, the quality of those systems is of vital importance. Even a small deviation in parts of the system could potentially result in bad or malfunctioning products, leading to customer dissatisfaction, environmental impacts, or huge economic losses to the industry. This is the main reason why all components and sensors in the system have to be continuously monitored to identify anomalies, and prompt remedial actions should be taken if something goes wrong. (The associate editor coordinating the review of this manuscript and approving it for publication was Sotirios Goudos.)
In a generic sense, an 'anomaly' is a deviation from expected behavior, and can occur for different reasons, including faults in the system or its configuration, or due to unanticipated external interference. Such interference, or even some system or configuration faults, belong to the realm of cybersecurity threats if the root cause is an act of bad intention. Regardless of the cause, the consequences of anomalies must be kept at acceptable levels.
In the context of this paper, anomalies are data points or patterns in the data that deviate from normal behavior. Anomalies might be induced in the data for a variety of reasons, such as malicious activity or system failure [1]. The goal of anomaly detection is to find those data points based on the knowledge gained from previous observations. There are two different problems to solve when detecting anomalous behavior: Anomaly Detection (AD) and Anomaly Classification (AC). AD is the process of detecting whether a behavior is deviating from the normal one, and AC is the process of indicating which type of anomaly is happening when more than one type is possible.
One of the widely used methods for AD and AC is Machine Learning (ML). The ability of ML techniques to build a model automatically from the given training data makes them an ideal candidate for solving AD and AC problems. Over the years, many ML algorithms have been used for this purpose with considerable success [1], [2], [3]. Some of these algorithms are Multilayer Perceptron [4], Decision Tree [5], Support Vector Machine [6], [7], and Random Forest [4], [5].
Though there exists a plethora of ML techniques and a multitude of research publications describing their applications in certain contexts, often the designers of industrial systems are overwhelmed and perplexed by the difficult question of how to choose a specific method that is relevant for their problem at hand. In general, the applicability of ML methods is dependent on multiple characteristics of the data which necessitates a two-dimensional approach for answering the above challenge: a) a relevant dataset that captures the context of the application and b) a proof of applicability of selected ML techniques on the given dataset. These two dimensions are interrelated and must be solved together.
In this paper, we address the above-mentioned dimensions in the context of modular manufacturing systems. Specifically, we present our research on the development of a dataset called MIDAS (Modular Ice cream factory Dataset on Anomalies in Sensors) that realistically represents a manufacturing process, and the application of four different ML algorithms (Logistic Regression, Decision Tree, Random Forest, and Multilayer Perceptron) on that dataset. MIDAS is created using a modular manufacturing simulation environment, where an ice cream-making process was simulated. The simulation environment was created by ABB 1 and Mälardalen University 2 as part of the European project Intelligent Secure Trustable Things (InSecTT) 3 [8].
The contributions of this paper can be summarized as follows:
• A new dataset to support ML in modular manufacturing systems is generated and made publicly available.
• The dataset generation procedure is explained in detail and the dataset analysis is provided.
1 https://global.abb/group/en 2 www.mdu.se 3 www.insectt.eu/
• Four different supervised ML algorithms are evaluated for anomaly detection and classification using our dataset.
• A comparative analysis of the evaluated ML algorithms is presented; the algorithms are compared in terms of different classification performance metrics and time consumption.
The generated dataset, as well as the code for the machine learning experiments, is publicly available on GitHub. 4
The rest of the paper is organized as follows. Section II presents details on the dataset generation, including the description of the system context, the anomaly injection process, dataset generation, validation and analysis, its benefits for the ML research community, and a comparison with some related datasets. Section III presents the application of supervised machine learning algorithms for anomaly detection and classification on the generated dataset, including the results. Finally, a summary of our findings and plans for future work are given in Section IV.

II. MIDAS: MODULAR ICE CREAM FACTORY DATASET ON ANOMALIES IN SENSORS
The focus of this section is the dataset generation process and characteristics of the dataset.

A. CAPTURING THE CONTEXT OF THE SYSTEM
We have used a simulation environment to capture the context of the system. The system that is used to generate the data consists of simulated sensors and actuators, part of a process automation system, built using the modular automation design strategy [9], [10], [11]. In modular automation, the production process contains a set of autonomous modules, specialized in performing certain tasks, with well-defined logical and physical interfaces, allowing easy combination of modules into complex production processes [12]. The purpose of this specific simulation environment is to achieve overall realistic system behavior according to the modular automation design strategy and to enable easy reconfiguration. The simulation environment was previously presented in [8].
An example of the use of the simulation environment is the modular ice cream factory that is used in this paper. The simulation engine is configured to simulate the behavior of six separate modules (FIGURE 1): a mixer, a pasteurizer, a homogenizer, a module for ageing and cooling of the mixture, a module handling dynamic freezing that whips air into the ice cream mixture while refrigerating it, and a packaging module. The different modules are jointly simulated and physically interconnected, allowing complex material flow between the modules, but the modules are independently controlled, using industrial controllers. Sensor and actuator signals are exchanged with controllers using the Message Queuing Telemetry Transport (MQTT) protocol [13]. Synchronization of the overall process is performed using high-level recipe orchestration, utilizing Open Platform Communications Unified Architecture (OPC UA) [14] client/server communication. For each of the modules, a separate module controller is implemented, containing the low-level control logic for each module, as well as high-level commands, e.g., StartPasteurize, EmptyModule, etc. Different types of data can be extracted from the simulator, including sensor/actuator logs, network data logs from the different interfaces, and message logs from the MQTT broker.
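To make the signal exchange concrete, the following minimal Python sketch shows how one sensor reading could be serialized into a JSON payload and decoded again, as it might travel between a module and its controller. The message schema (field names such as module, sensor, and unit) is our own illustration, not the simulator's documented format.

```python
import json
import time

def make_sensor_message(module, sensor, value, unit):
    """Serialize one sensor reading into a JSON payload (hypothetical schema)."""
    return json.dumps({
        "module": module,        # e.g. "Pasteurizer" (illustrative name)
        "sensor": sensor,        # e.g. "temperature"
        "value": round(value, 4),
        "unit": unit,            # SI units, as reported by the simulator
        "timestamp": time.time(),
    })

def parse_sensor_message(payload):
    """Decode a payload back into a dict, as a module controller would."""
    return json.loads(payload)

msg = make_sensor_message("Pasteurizer", "temperature", 345.0127, "K")
reading = parse_sensor_message(msg)
```

In the real environment such payloads would be published to and consumed from topics on the MQTT broker; the topic layout and field names above are assumptions for illustration only.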
The description of the modules to simulate and their respective Input/Output (I/O) interfaces is done using configuration files, which include physical properties of the modules, analog and digital signals, sensor precision, and material flow interconnections. The simulation process is defined in an XML recipe file. All values are reported in the International System of Units (SI) [15]. A simple graphical user interface provides a visualization of the currently simulated process (FIGURE 2). It contains visual representations of the modules, their interconnections, and the current values of the parameters. The implementation is done using the C# programming language, graphical components are created using Windows Presentation Foundation (WPF) and Extensible Application Markup Language (XAML), and Mosquitto (mosquitto.org) is used as the MQTT broker.
The simulated process, which is visualized in FIGURE 2, has the following stages:
1) The mixing module (Mixer) is filled with the content until the level reaches a specific value.
2) The content in Mixer is mixed.
3) The mixture is transferred from the Mixer to the pasteurizing module (Pasteurizer).
4) The pasteurization process is performed in Pasteurizer (first heating to 345 K, keeping the temperature for a while, and then cooling to 278 K).
5) The ice cream mixture passes through the homogenization module (Homogenizer), reducing and equalizing the fat globule size.
6) The mixture is cooled and stored in the ageing and cooling module (AgeingCooling). The ageing time is reduced in the simulation as compared to a real-world scenario, where ageing would ideally take at least 12 h. The cooling set-point is 277 K.
7) The mixture is transferred to the dynamic freezing and flavoring module (DynamicFreezer), where liquid and solid flavoring is added while the mixture is whipped during freezing, increasing the air content of the mixture. The temperature set-point is 267 K.
8) In the freezing and packaging module (Hardening), the ice cream is packaged in batches of cones or plastic cups and then hardened by freezing to 243 K.
9) One run is completed and the process starts again from stage 1).
Each of the provided modules has an analog sensor for the level of the content inside the tank and/or a sensor for the tank's temperature, except the Packaging module, which has no analog sensors and contains the actual ice creams that are produced. FIGURE 3 shows how the values for level (FIGURE 3a) and temperature (FIGURE 3b) in the tank evolve for all the modules, together with the ice cream production (FIGURE 3c), during one run without anomalies.

B. PROBLEM FORMULATION
A malfunction in any of the sensors can cause problems in the manufacturing process and seriously affect the resulting product. For example, FIGURE 4 shows how the values for level and temperature evolve when the temperature sensor inside Pasteurizer freezes at one point. We can observe in FIGURE 4b that the value of the temperature sensor stops decreasing and gets stuck (orange line), despite the fact that the actual temperature of the tank decreases (dashed orange line). As a consequence, the level in Pasteurizer never decreases (FIGURE 4a), the entire process gets stuck, and no ice cream is produced (FIGURE 4c). The goal of this work is to simulate different types of malfunctions in analog sensors and capture the data, as a basis for ML research that can help to detect those situations and provide a prompt reaction to reduce the damage.

C. ANOMALY INJECTION
The simulation environment presented in II-A was extended with functionality that enables the injection of anomalies that can occur during the production process itself. Access to this functionality is provided through the graphical user interface, and the user can use the visualization window to observe how the anomaly influences the entire process. The user can inject false data into the signal values of different analog sensors. From the moment of anomaly injection (t_n), the actual sensor value (v) will be replaced by a modified value (v*). Let i be the current time point, n the time point of anomaly injection, and m the point until which the anomaly persists. The values can be modified in the following ways:
• Freeze value: the actual sensor value at the moment of anomaly injection is reported at all following time points, i.e., v*_i = v_n for n ≤ i ≤ m.
• Step change: the actual value is shifted to a value greater or lower than the actual one, depending on the step s, i.e., v*_i = v_i + s.
• Ramp change: the actual sensor value is gradually increased or decreased by an increment c per time point, until the total difference (d) reaches the maximum (s, the same as in the step function), i.e., v*_i = v_i + d_i with d_i = min(c · (i − n), s) for an increasing ramp.
The different types of anomalies that can be injected are visualized in FIGURE 5.
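The three modification types described above can be sketched in Python as follows; this is a simplified reconstruction based on the descriptions in this section, not the simulator's actual implementation (function names and list-based signals are our own choices).

```python
def inject_freeze(values, n, m):
    """Freeze value: from injection point n until m, report the value at time n."""
    out = list(values)
    for i in range(n, min(m, len(out))):
        out[i] = values[n]
    return out

def inject_step(values, n, m, s):
    """Step change: shift the actual value by a constant step s (positive or negative)."""
    out = list(values)
    for i in range(n, min(m, len(out))):
        out[i] = values[i] + s
    return out

def inject_ramp(values, n, m, s, c):
    """Ramp change: drift away by increment c per time point, capped at total difference s."""
    out = list(values)
    for i in range(n, min(m, len(out))):
        d = min(c * (i - n), s) if s >= 0 else max(c * (i - n), s)
        out[i] = values[i] + d
    return out
```

For example, `inject_step([1.0, 2.0, 3.0, 4.0], 1, 4, 0.5)` shifts every reading from time point 1 onward by 0.5, while the ramp variant reaches the same offset only gradually.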
Anomaly injection is implemented by inserting modified sensor values directly into the MQTT queue. This means that the ''physical'' state of the simulation is not affected, but the controllers will read the wrong values, which will cause changes in the behavior. In this way, we simulate different scenarios, from malfunctioning sensors to external intrusions.

D. DATASET GENERATION
The system behavior was recorded during 1000 runs produced using the same recipe, out of which 258 were executed without anomalies. Anomalous behavior was simulated by injecting different types of anomalies into the signals of analog sensors. Pasteurizer, AgeingCooling, and DynamicFreezer have analog sensors for the level of the content inside the tank and the tank's temperature, while Hardening has only the temperature sensor and Mixer has only the level sensor. The values of these sensors were modified using the three options provided by the anomaly injection functionality, resulting in 742 anomalous runs. Each analog sensor has an attribute that defines the sensor's precision (p). During normal behavior, sensor values are generated according to a uniform distribution defined in the range from (v − p) to (v + p), where p is the precision of the sensor (0.0025 m for a level sensor and 0.025 K for a temperature sensor). Values of different sensors were modified during different stages of the simulated process, with different values of the parameters for the specific anomaly that is injected. The anomaly injection occurs at a randomly selected moment when the sensor value changes, with either an increasing or a decreasing trend. TABLE 1 lists the minimum and maximum values that the sensors can take during one normal run (Min Val. and Max Val.), which are used as a reference for the anomaly injection parameters. Additionally, for each sensor, p is used to decide how significant the anomaly will be. The maximum and minimum values that the anomalies can take can be found in the last two columns of TABLE 1. The ramp increment c is calculated as the division of s by a random integer drawn from a uniform distribution over [5, 30].
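The sampling rules just described can be illustrated with a small Python sketch, using the stated precision values and the [5, 30] range for the ramp increment; this is an approximation of the described procedure, not the simulator's code.

```python
import random

# Precision values from the dataset description (assumed symmetric around v).
LEVEL_PRECISION_M = 0.0025
TEMP_PRECISION_K = 0.025

def report_with_precision(v, p, rng=random):
    """Normal behavior: a reading drawn uniformly from [v - p, v + p]."""
    return rng.uniform(v - p, v + p)

def ramp_increment(s, rng=random):
    """Ramp parameter c: s divided by a random integer uniform in [5, 30]."""
    return s / rng.randint(5, 30)
```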
The generated dataset has 60 columns: the ordinal number of the instance within one run, 13 parameters for the Mixer module, 8 parameters for the Pasteurizer module, 4 parameters for the Homogenizer module, 7 parameters for the AgeingCooling module, 16 parameters for the DynamicFreezing module, 6 parameters for the Hardening module, a time stamp, the run id, and 3 columns referring to anomaly injection (anomaly type, sensor where the anomaly was injected, and actual sensor value). It has 36,124,859 instances: 17,942,215 (49.67%) instances that represent normal behavior, and 18,182,644 (50.33%) instances that contain anomalies. FIGURE 6 presents the percentages of instances per class.
The dataset is publicly available on GitHub 6 as 1000 CSV files, one file for each run. Each file name contains the run id and type (Normal, Freeze, Ramp, or Step). The dataset is divided into training and testing data: runs from 1 to 600 are used as training data, while runs from 601 to 1000 are used as testing data. 6 https://github.com/vujicictijana/MIDAS/tree/main/Dataset
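A minimal Python sketch for reproducing this split could look as follows; it assumes the run id is the only numeric part of each file name (which holds for the type names Normal, Freeze, Ramp, and Step), but the exact naming scheme should be verified against the published files.

```python
from pathlib import Path

def split_runs(dataset_dir):
    """Partition the per-run CSV files into training (runs 1-600)
    and testing (runs 601-1000) sets, based on the run id in the file name."""
    train, test = [], []
    for path in sorted(Path(dataset_dir).glob("*.csv")):
        # Assumption: the digits in the stem are exactly the run id.
        run_id = int("".join(ch for ch in path.stem if ch.isdigit()))
        (train if run_id <= 600 else test).append(path)
    return train, test
```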

E. DATASET VALIDATION AND ANALYSIS
In order to check the validity of the generated data, the code for anomaly injection was peer reviewed by industrial partners. A pilot dataset was generated and all runs were manually checked to verify that everything is well connected and that the anomalies affect the process the way they should. The final validation was performed through the dataset analysis that is presented in this subsection. As mentioned in subsection II-D, the dataset is composed of 1000 runs, where 258 have normal behavior and 742 have anomalous behavior (with 3 different anomalies: step change, ramp change, and freeze value). The distribution of runs can be seen in FIGURE 7a. All types of anomalies were injected into sensors belonging to different modules, and the distribution per module can be found in FIGURE 7b. These runs give a total of 36,124,859 instances, and FIGURE 6 shows the distribution between the classes.
If we look at all the different values taken by the variables into which the anomalies were injected (FIGURE 8), the first thing we can observe is that the instances of the Step and Ramp classes take slightly higher values than those of the Freeze and Normal classes. This is completely expected, since for the Step and Ramp classes the anomaly changes the actual value by increasing it. This difference is even more visible in the temperature sensors, since the anomaly is larger, especially for the temperature in AgeingCooling (FIGURE 8f), DynamicFreezer (FIGURE 8g), and Hardening (FIGURE 8h). There is an exception with the Freeze class values. If we look at the sensor that measures the temperature in Hardening (FIGURE 8h), we can see that the Freeze class can reach lower values than the other classes. Additionally, Freeze class values can be higher than Normal class values, while still being smaller than those in the Step or Ramp class. This happens because the batches of ice cream contain around 1000 cones. In some cases the number of cones is smaller and the temperature of the tank drops below the usual minimum, since the freezing power is then larger than needed in that specific scenario.
Although FIGURE 8 shows all the values, it does not show how they are distributed, because many circles lie on similar spots. For this reason, the distributions are additionally shown in FIGURE 9. From this figure, it is also observable that the distributions of the instances in the Normal, Step, and Ramp classes have the same shape (the small differences are due to the difference in the number of cases), while the distribution of the values in the Freeze class differs from them. Since for the Freeze class the sensors get stuck at different points, we can actually see a different peak for each of them. We can also observe that, for the level sensor of all modules, most of the data is around 0, due to the fact that the modules are empty most of the time, except DynamicFreezer, which occupies half of the process. An interesting behavior occurs for the level sensor in the DynamicFreezer module: we can observe how the level decreases in steps to produce the final product in the Hardening module (see FIGURE 3 for a better understanding).

F. BENEFITS OF THE PROPOSED DATASET
The proposed dataset brings opportunities for different research areas within Machine Learning (ML) in modular manufacturing systems. Different supervised, semi-supervised, and unsupervised ML techniques can be applied, including:
• Traditional ML algorithms
• Time series-based ML algorithms
• Graph-based ML algorithms
• Distributed and Federated ML algorithms, since the modules can be seen as independent systems.
There are many problems that can be addressed using this dataset. First, there are different ML classification problems that can be solved (TABLE 2), such as:
• Anomaly Detection: The goal is to detect whether there is an anomaly or not (binary classification).
• Anomaly Classification: The goal is to detect if there is an anomaly and identify the type of anomaly (multi-class classification).
• Sensor Classification: The goal is to detect which sensor has an anomaly, if the anomaly exists.
• Anomaly/Sensor Classification: The goal is to detect whether a certain sensor within a module has an anomaly or not, and if that is the case, to classify the anomaly.
In addition, different ML regression problems can be addressed, such as prediction of sensor values for different sensors and production forecasting.

G. RELATED DATASETS
There are some existing datasets that focus on a related field called predictive maintenance [16], [17]. These datasets address topics such as robotics, 7 gearbox fault detection, 8 or air pressure systems for Scania trucks. 9 Some datasets focus on faults in the product itself, such as steel plates. 10 Four datasets have some similarity to the one proposed in this paper. The dataset proposed in the PHM data challenge in 2018 11 has similar characteristics, but focuses on a different manufacturing process. The others are the mill dataset, 12 the machine temperature system failure dataset, and the ambient temperature system failure dataset, where the last two are part of the Numenta Anomaly Benchmark [18] and can be found in the ''real known cause'' folder. 13 The main difference between the MIDAS dataset and the last three is that the mentioned datasets focus either on one machine or on one sensor, while MIDAS has multiple sensors on multiple machines in a manufacturing system.

III. MACHINE LEARNING ALGORITHMS ON MIDAS DATASET
The focus of this section is the application of supervised machine learning algorithms for anomaly detection and classification on the MIDAS dataset.

A. DATASET PRE-PROCESSING
During the experiments presented in this paper, all modules' parameters were used as features (54 features in total). The output classes are Normal, Freeze, Ramp, and Step. Those values were encoded to numbers from 0 to 3, in the given order.
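A possible encoding, assuming the class order Normal, Freeze, Ramp, Step as the run types are listed in Section II-D, is sketched below; the helper also shows how the multi-class labels can be collapsed to binary labels for the anomaly detection task. The mapping should be verified against the published code.

```python
# Assumed encoding order (Normal, Freeze, Ramp, Step).
CLASS_TO_CODE = {"Normal": 0, "Freeze": 1, "Ramp": 2, "Step": 3}

def encode_labels(class_names, binary=False):
    """Encode class names as integers 0-3; with binary=True, collapse all
    anomaly types to 1 for the anomaly detection problem."""
    codes = [CLASS_TO_CODE[name] for name in class_names]
    if binary:
        codes = [min(code, 1) for code in codes]
    return codes
```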

B. MACHINE LEARNING ALGORITHMS
Machine Learning (ML) is a sub-area of artificial intelligence, where algorithms have the ability to learn from experience, adapt to new circumstances and detect patterns [19].
There are different types of ML algorithms, and they can be divided into three main categories: Supervised Learning (SL), Unsupervised Learning, and Reinforcement Learning. This paper is focused on SL algorithms. The goal of SL algorithms is to map input values to a specific output, which is given for training purposes. There are many SL algorithms, and the following four have been used in this paper:
• Logistic Regression (LR) [19], [20], [21] - Logistic Regression is a classification method based on Linear Regression. LR creates a hyperplane that separates two classes and transforms the output value using a sigmoid function, which changes the output range from R to [0, 1]. This new value estimates the probability of belonging to one class. LR is intended for binary classification, but can be applied to multi-class classification problems using the One-vs-All method.
• Multilayer Perceptron (MLP) [22], [23], [24], [25] - MLP is one of the families of feed-forward artificial neural networks. Artificial neurons are organized into layers (input, one or more hidden, and output), where the output of one layer is used as input to the next layer. The layers are interconnected with unidirectional connections that have certain parameters, also called weights. These parameters are adapted during the training process.
• Decision Tree (DT) [19], [26], [27] - DT is a classification algorithm that evaluates and groups different instances of the dataset depending on the values of the features. One feature is selected and the instances are separated into groups based on the values of that feature. Then, a different feature is selected and additional separations are made. This process is repeated until the termination criteria are satisfied.
• Random Forest (RF) [28], [29], [30] - RF is an ensemble learning algorithm, where different ML algorithms are combined. In this case, RF combines x different DTs. Each DT gives a prediction and, in the end, the RF makes the final prediction considering all the knowledge provided by the independent DTs. Usually, a simple voting method is used.
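Two of the mechanisms mentioned above, the sigmoid transformation in LR and the majority vote in RF, can be illustrated in a few lines of Python (a didactic sketch, not the sklearn implementations used in our experiments):

```python
import math
from collections import Counter

def sigmoid(z):
    """LR's squashing function: maps the linear output from R to (0, 1),
    interpreted as the probability of belonging to the positive class."""
    return 1.0 / (1.0 + math.exp(-z))

def forest_vote(tree_predictions):
    """RF's final decision: a simple majority vote over the per-tree predictions."""
    return Counter(tree_predictions).most_common(1)[0][0]
```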

C. EXPERIMENTAL SETUP
In this paper, two different problems were addressed: Anomaly Detection (AD) and Anomaly Classification (AC). All experiments were performed on a MacBook Pro 2019 with a 2.6 GHz Intel Core i7. The Python programming language and the sklearn 14 [31] ML library were used to implement the algorithms. The code is publicly available on GitHub. 15 The experiments were performed with 500-100-400 runs for training, validation, and testing, respectively. The validation set was used to determine the values of the hyper-parameters for each algorithm. The final hyper-parameters used for the ML algorithms are:
• LR: the maximum number of iterations was set to 1000.
• MLP: different numbers of neurons in the hidden layer (varying from 5 to 50 with an increment of 5) were tested, and the best results were achieved with 35 neurons for AD and 40 neurons for AC. The maximum number of iterations was set to 2000, and the rectified linear unit (ReLU) activation function was used.
• DT: The default parameters were used.
• RF: different numbers of DTs were tested, varying from 5 to 50 with an increment of 5, and the best results were achieved with 20 DTs for AD and 25 DTs for AC.
The performance of the ML algorithms was measured using different metrics [32]: Accuracy, Recall, Precision, F-Measure, and the confusion matrix.
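For the binary AD case, these metrics can be computed directly from the confusion-matrix counts. The following self-contained Python sketch writes out the definitions (equivalent in spirit to the sklearn.metrics functions we used, shown here for clarity):

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, Precision, Recall, and F-Measure for the anomaly class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    acc = (tp + tn) / len(y_true)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return {"accuracy": acc, "precision": prec, "recall": rec, "f_measure": f1}
```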

D. EXPERIMENTS FOR ANOMALY DETECTION
The results for AD can be found in Table 3. We can observe that MLP obtained the best results, with 82.36% global accuracy, followed by RF with 7 percentage points lower accuracy. LR obtained the worst results among all the tested algorithms, with 64.60% global accuracy. Additionally, it can be observed that precision and recall are similar between the different classes, which means that one class is not better classified than the other. Furthermore, we have applied the algorithms to four new runs: one Normal (runID: 992), one with a Freeze anomaly (runID: 802), one with a Ramp anomaly (runID: 631), and one with a Step anomaly (runID: 640). The results can be found in FIGURE 10. It is important to mention that we show 1 out of every 50 predictions so that some information remains visible in the image. The following observations can be made:
• Normal scenario: LR performs better than the other algorithms; even though it has some areas where an anomaly is predicted, these areas are smaller than for the other algorithms.
• Freeze scenario: MLP performs the best, even though it detects the anomaly with a delay.
• Ramp scenario: MLP, RF, and LR detect the anomaly quite rapidly. On the other hand, if we take a look at DT, the detection of the anomaly is not good, since most of the points are classified as normal behavior.
• Step scenario: The anomaly is detected at the exact moment when it happens. However, there are some areas in the middle of the normal behavior and in the middle of the anomalous behavior where all the algorithms give the opposite output.

E. EXPERIMENTS FOR ANOMALY CLASSIFICATION
The results for AC are shown in Table 4. The results are similar to those of AD, with a 10 percentage point reduction in the global accuracy. MLP had the strongest performance. If we take a closer look at the confusion matrices (FIGURE 11), it can be observed that DT, RF, and MLP have an accuracy higher than a random classifier (which would be 25% for each class). In the case of LR, there is an issue with the Step and Ramp classes, where the accuracy is not better than a random classifier. Furthermore, we can see that the Normal, Ramp, and Step classes are confused half of the time, which is why the accuracy on Ramp and Step is around or below 33% for LR, DT, and RF. The accuracy for the Normal class is higher since the dataset is imbalanced and it is the class with the most instances. On the other hand, the Freeze class is mostly confused with the Normal class. This makes sense, as indicated in Section II-E, due to the similarity of the classes. Additionally, one module may not be affected by an anomaly, but the instance will still be marked as anomalous if there was an anomaly caused by a different module in a previous instance. Lastly, we can observe that MLP can more or less distinguish every class with an accuracy of 40% or more. However, there is still some overlap between the Step and Ramp classes, and also between the Freeze and Normal classes, but only when the true class is Freeze. Similarly to AD, we applied the algorithms to the same four runs from the testing set. FIGURE 12 shows how the different ML methods perform. The following observations can be made:
• Normal scenario: LR seems to perform a better classification. However, LR predicts almost everything as the Normal class. MLP and DT perform a good classification, except at the beginning for DT and at the end for both, where the Step class is predicted.
It is interesting to mention that there is a short time when DT, RF, and MLP predict that the class is Freeze and they do it at the same intervals towards the end of the run.
• Freeze scenario: MLP makes the best classification, even though it takes time to detect the anomaly.
• Ramp scenario: All the algorithms detect the anomaly, but fail to identify which one it is. The Step anomaly, which is quite close to Ramp in terms of behavior, is the most frequently predicted class.
• Step scenario: As in anomaly detection, DT, RF, and MLP manage to correctly classify the anomaly, and they classify Step as the correct class most of the time. Some wrong predictions happen toward the end of the run for these three methods. LR is not able to detect the anomaly.

F. EXECUTION TIME ANALYSIS
Lastly, we have measured the execution time needed for training (in seconds) and for testing of one single case (in milliseconds) for all algorithms (Table 5). DT is the fastest algorithm for training and testing, independently of the problem. In AD, DT is two times faster than LR and three times faster than RF and MLP for training. When it comes to testing, it is 10 times faster than RF and similar to LR and MLP.
In AC, DT does not increase its time much with respect to AD, while the other algorithms increased their training time; MLP increased its training time by 16 times. When it comes to testing, the times are similar to those in AD: LR is two times faster than DT, DT is two times faster than MLP, and MLP is five times faster than RF.
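Such timing measurements can be reproduced with a simple wrapper like the following sketch; the timer placement is our own choice, and any model exposing fit and predict (such as the sklearn estimators used here) would fit the interface.

```python
import time

def time_fit_predict(model, X_train, y_train, x_single):
    """Return (training time in seconds, single-case inference time in ms)."""
    t0 = time.perf_counter()
    model.fit(X_train, y_train)
    train_s = time.perf_counter() - t0

    t0 = time.perf_counter()
    model.predict([x_single])
    test_ms = (time.perf_counter() - t0) * 1000.0
    return train_s, test_ms
```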

IV. CONCLUSION AND FUTURE WORK
This paper proposes a new synthetic open dataset for ML research in manufacturing systems, called the Modular Ice cream factory Dataset on Anomalies in Sensors (MIDAS). More specifically, MIDAS is created using a simulation environment that simulates the ice cream-making process. MIDAS is composed of 1000 runs: 258 simulating normal behavior and 742 with anomalous behavior. To verify the validity of the dataset, an analysis was performed, showing how different anomalies can cause wrong behavior in the system, which demonstrates the importance of anomaly detection.
Four different ML algorithms were tested for anomaly detection and anomaly classification on the MIDAS dataset. The results achieved by the ML algorithms showed that MLP is the best algorithm to detect and classify these anomalies. However, this performance could potentially be improved by a time-series ML algorithm.
As future work, we plan to implement additional types of anomalies and perform anomaly injection for digital sensors as well, with the goal of creating a more comprehensive dataset. In addition, we want to investigate the performance of time series ML algorithms on the generated dataset. The final goal is to integrate the best-performing ML algorithms into the simulator and provide reliable anomaly detection functionality.

ACKNOWLEDGMENT
A. Ununger, A. Sasikumar, and G. Ninsiima made technical contributions to the implementation, together with students in the distributed software development course with Mälardalen University, in fall 2021.
The document reflects only the author's view and the Commission is not responsible for any use that may be made of the information it contains.
(Tijana Markovic and Miguel Leon contributed equally to this work.)