Introduction
The proliferation of the Internet of Things (IoT) and the deployment of massive numbers of IoT devices in cyber-physical infrastructure systems such as Smart Factories [1], [2], Smart Grids [3], Smart Logistics [4] and others have brought forward an increasing number of cyber-security [5] and property management challenges [6]. For example, Smart Factory and Smart Logistics operations include asset management, intelligent manufacturing, performance optimization and monitoring, planning, and human-machine interaction, but none of these takes into account full cyber-security protection or data management at Industrial IoT scale [7], [8]. Therefore, handling the data integrity and behaviour of massive numbers of IoT devices in real-time industrial IoT operation and management requires novel approaches. In recent research, these challenges are mainly addressed via various machine-learning (ML) and deep-learning (DL) techniques [9]–[11]. The ability of ML/DL algorithms to process massive data sets while extracting useful features allows them to quickly identify anomalies and prevent breakdowns, which potentially has a broad application space in cyber-physical infrastructures [12]–[14].
With the introduction of the
In this paper, we propose to augment the 3GPP mobile cellular architecture with additional enhancements that provide support for a network-wide anomaly detection (AD) service. Our target is a generic AD CIoT service which can be tailored to applications ranging from identifying malfunctioning devices to threat detection for secure CIoT. The proposed hierarchical AD architecture embeds anomaly detection modules (ADMs) both at the IoT devices (ADM-EDGE) and in the mobile core network (ADM-FOG). The ADM modules are based on both shallow and deep autoencoders (AE) whose complexity is matched to both the edge and the fog deployment, balancing between the system responsiveness and accuracy. The distinguishing feature of our work is that the proposed AD enhancement of the CIoT architecture, including both ADM-EDGE and ADM-FOG modules, is implemented and deployed in a real-world CIoT network based on the 3GPP NB-IoT standard and demonstrated in the context of Smart Logistics. Moreover, we custom-designed a novel NB-IoT device platform for a Smart Logistics use case, where NB-IoT devices are connected to shipping containers in a factory supply chain, in order to collect data, deploy and test the ADM-EDGE module.
The paper is organized as follows. In Sec. II, we provide technical background, review the related work and present the contributions of this paper. The proposed solution for DL-based anomaly detection in CIoT is presented in detail in Sec. III. In Sec. IV, we describe system integration, data generation and provide numerical results from real-world experiments. The paper is concluded in Sec. V.
Background
In this work, we augment the CIoT architecture with anomaly detection capabilities at the IoT devices (edge) and the mobile core network servers (fog). Before going into details, we first provide the technical background needed to understand the proposed system architecture and functionalities.
A. 3GPP Cellular IoT Architecture
We start by describing the current state-of-the-art CIoT architecture, focusing primarily on the 3GPP NB-IoT technology [16], [17]. NB-IoT is a new CIoT technology that can be seamlessly integrated into the existing 3GPP 4G/5G architecture, coexisting in the radio access network with the current 3GPP 4G LTE and the emerging 3GPP 5G NR technology, and using the same evolved packet core (EPC) network functionalities [23]. Focusing on the current 3GPP 4G LTE architecture, the relevant 3GPP CIoT architecture elements are illustrated in Fig. 1. CIoT user equipment (CIoT UE), which is the formal name for an NB-IoT device, connects to the network via a neighbouring base station or eNodeB (eNB), which is the main element of the Evolved Universal Terrestrial Radio Access Network (E-UTRAN). NB-IoT downlink/uplink resources are allocated either within the 4G LTE band (in-band deployment), at its edge (guard-band deployment), or as a separate channel (standalone deployment). After the eNB, both user-plane (i.e., user data packets) and control-plane (i.e., signalling messages) information is processed at a CIoT Serving Gateway Node (C-SGN), which covers the functionalities of both the control-plane Mobility Management Entity (MME) and the user-plane Serving Gateway (SGW). User-plane data further flows through a Packet Gateway (PGW) to the IoT platform, which forwards data via the Internet to external network application servers [22].
Two options for data transfer between the CIoT UE and the IoT platform are envisioned. The first one (mandatory) uses signalling radio bearers to transmit user data, thus avoiding the establishment of data radio bearers for energy efficiency. From the eNB, data is routed either along a control-plane path via an EPC element called the Service Capability Exposure Function (SCEF) for non-IP data, or along a user-plane path via the C-SGN and PGW for both IP and non-IP data. The second one (optional) establishes a data radio bearer to send IP/non-IP data via an eNB/C-SGN/PGW user-plane path to the IoT platform. Herein, we assume that UDP-encapsulated IP data from the CIoT UE device traverses the path following the latter approach, which will impact the deployment choices for the proposed anomaly detection enhancement strategy described in Sec. III.
B. Machine Learning for Anomaly Detection at the Edge
Security challenges and threats in industrial IoT networks call for innovative applications of ML/DL techniques for IoT security. More specifically, these techniques can be employed for authentication and access control, anomaly and intrusion detection, malware analysis, and distributed denial-of-service (DDoS) attack detection and mitigation [24], [25]. The main challenges of implementing ML/DL models at the edge are scalability issues and the resource limitations of IoT edge platforms [13]. Depending on the ML algorithm being run on the edge node, the size of the ML model can go as low as a few kilobytes. Also, the memory capacity and computational power requirements depend heavily on whether the models are trained at the edge or pre-trained models are used.
Besides the sensor readouts, which are the primary source of data for ML/DL at the edge, an IoT module itself can provide a host of useful insights about the network and wireless link conditions, a feature we also exploit in our edge device design described in Sec. III-B. The amount of useful data that can be extracted from the IoT module generally exceeds the capacity of the wireless communication channel. However, this kind of metadata can be used to feed a locally run ML algorithm for anomaly detection, or be aggregated and sent periodically to the core network fog gateway for further analysis.
In this work, to perform AD, we apply shallow and deep autoencoders (AE) trained using deep learning algorithms. An AE is a neural network that learns a latent lower-dimensional representation of the training data by reproducing its inputs at the output layer, through latent variables in the hidden layers, with the smallest possible error. The error function captures the differences between the values at the input and output layers. This so-called reconstruction error is used as the outlier score in the anomaly detection process. The proposed AD architecture is hierarchical, as it comprises AD models running at different levels within a CIoT system (both IoT edge devices and the core network fog gateway), where more powerful higher-level models are activated if the decisions of lower-level models have low confidence scores (see Sec. III-C for details).
C. Related Work
Recent research efforts in the area of ML methods for anomaly detection at edge IoT devices have focused on the efficient utilization of limited computational resources at the edge. It is well known that the training process for most deep learning-based AI models is highly resource-intensive, usually requiring dedicated hardware resources (e.g., GPU, FPGA) [26]. Resource-aware edge AI model designs have been considered in a different line of research. The AutoML idea [27] and the Neural Architecture Search techniques [28] have been used to devise resource-efficient edge AI models tailored to the hardware resource constraints of both the underlying edge devices and network servers. Important research advances were also made regarding the tailored design of DL architectures for resource-constrained devices: Zhang et al. proposed an extremely efficient convolutional neural network (CNN) for mobile devices and Nikouei et al. introduced a lightweight CNN that can run on edge devices [29].
A number of proposals using distributed ML/DL for security in Industrial IoT have recently been considered [30]. In DIoT, a recurrent neural network (RNN) is trained for each device type present in the IoT network to learn a normal communication profile. A federated (distributed) learning scheme is employed to learn device-type specific RNNs [31]. Wang et al. proposed a control algorithm that determines the best trade-off between local update and global parameter aggregation in data-partitioned federated learning models trained using gradient-descent algorithms [32]. Ferdowsi and Saad proposed a distributed privacy-preserving IoT intrusion detection system based on federated generative adversarial networks. In the proposed decentralized architecture, every IoT device monitors its own data as well as the data of neighbouring IoT devices to detect internal and external attacks [33]. Meidan et al. proposed N-BaIoT – a method for detecting IoT botnet attacks based on deep autoencoders. For each device present in an IoT network, a deep autoencoder is trained on features extracted from normal traffic data [34]. Bezerra et al. proposed IoTDS – a distributed method for detecting IoT botnet attacks based on light-weight one-class classification models [35]. Rathore and Park created a decentralized attack detection framework for IoT networks based on semi-supervised learning employing extreme learning machines and fuzzy C-means algorithms [36]. Doshi et al. employed various machine learning algorithms (k-nearest neighbor, support vector machines, decision trees and neural networks) to detect DDoS attack traffic in consumer IoT devices [37]. Pajouh et al. proposed a malware detection approach for IoT based on deep RNNs [38]. The approach to anomaly detection presented in [39] implements autoencoders at each edge device, with the edge devices orchestrated by a central server via a federated learning model.
In [40], the authors show that Random Forest, Multilayer Perceptron, and Discriminant Analysis models can viably save time and energy on the edge device during data transmission, while K-Nearest Neighbors, although reliable in terms of prediction accuracy, is resource-inefficient in their studies.
D. Contributions
We now summarize the main contributions of the paper. We propose an approach to embed anomaly detection capabilities in the Cellular IoT architecture, providing for combined threat detection both at the IoT devices (edge) and in the mobile core network servers (fog). The corresponding architecture design is motivated by and well-suited for Smart Logistics. The proposed edge-based ADM-EDGE and fog-based ADM-FOG modules can balance between responsiveness and accuracy by employing both shallow and deep autoencoder (AE) based learning modules whose complexity is matched to both edge and fog deployment. We carry out implementation, integration, and evaluation of an end-to-end testbed according to the proposed architecture. This includes: 1) real IoT data generation and emulation of a real-world Smart Logistics scenario; 2) fabrication and configuration of the relevant edge and fog hardware and infrastructure; 3) development and implementation of a software library for edge and fog-based anomaly detection; and 4) evaluation of the developed anomaly detectors on the generated data and quantification of the trade-offs between detection performance and response time. For the latter contribution, we explicitly quantify the trade-offs that take into account the limited computational and storage budget at the edge devices, and the communication and processing costs due to processing larger amounts of data at the fog for improved AD performance.
DL-Based Anomaly Detection in 3GPP NB-IoT
In this section, we describe in detail the design and system architecture of the proposed AD support for the 3GPP NB-IoT mobile cellular network.
A. System Model and Architecture
We augment the 3GPP CIoT system architecture with support for CIoT device anomaly detection. The augmented architecture is illustrated in Fig. 1 and introduces two additional ADMs: one placed at the edge CIoT UE (ADM-EDGE) and another placed at the fog gateway (ADM-FOG). The architecture represents a generic CIoT enhancement for anomaly detection, although in this work, we specialize it to the domain of Smart Logistics. This includes managing supply of items from various origin points delivered to warehouses in manufacturing plants (Fig. 1). Items being delivered are packed into containers, each of which has an NB-IoT device attached. For this purpose, we designed an entirely new NB-IoT UE device, and deployed suitable ADM-EDGE and ADM-FOG modules at both NB-IoT UEs and the FGW server within the mobile core network.
1) ADM-EDGE
As described below, NB-IoT devices collect various information such as acceleration and GPS coordinates. This sensory information can be used to detect anomalies such as physical tampering of items, container mishandling such as overturning, delays, routing problems, incidents with the delivery vehicles, etc. We assume each NB-IoT device possesses two types of sensors: i) sensor S1 with low sampling rate
Due to limited memory capacity and processing power, ADM-EDGE, which is integrated into the NB-IoT device firmware, requires a restrictive design. ADM-EDGE consists of a pre-trained autoencoder that detects anomalies in individual data points. At the input, ADM-EDGE processes a single data point that consists of a single S1 and S2 value. As illustrated in Fig. 2, we assume ADM-EDGE is triggered synchronously with the low-rate sensor S1 outputs
2) ADM-FOG
NB-IoT devices connect to a mobile network and transfer data via the nearest base station. Each ADM-EDGE data point is forwarded to the FGW, together with the ADM-EDGE confidence score evaluated from the last available data point. The communication delay incurred by the NB-IoT network connection may vary from the order of tens of milliseconds to several tens of seconds, depending on the NB-IoT device radio conditions and network load [41]. The FGW server runs an instance of ADM-FOG, relying on higher memory capacity and processing power. Thus, ADM-FOG uses a more powerful autoencoder that processes multivariate timeseries through several hidden layers. At ADM-FOG, a larger input is considered which is formed by concatenating the last
To summarize, the above AD-augmented CIoT architecture features several important properties: 1) ADM-EDGE at the NB-IoT node immediately detects an anomaly over a single data point which may result in extremely fast response time (order of milliseconds); 2) ADM-FOG collects timeseries of specific lengths matched to the more powerful AE design through a communication channel that can be a bottleneck and cause unpredictable delays (order of seconds); 3) Only ADM-EDGE has access to raw data (note that sending the raw data to ADM-FOG would be inefficient due to a low-rate NB-IoT connection and energy-constrained NB-IoT devices), while ADM-FOG gets access to aggregated data; 4) ADM-FOG applies deep learning analyses over the longer timeseries of data points using more powerful AE design with more hidden layers, requiring higher processing power and memory capacity unavailable at the edge; 5) In the worst case scenario, the final anomaly detection decision at the system level is obtained within the time frame of several seconds. It is worth noting that this response time meets the requirements and is well-aligned with the targeted Smart Logistics applications [42].
B. NB-IoT Edge Device Design
We designed the NB-IoT edge device illustrated in Fig. 3 having in mind the specific requirements of a Smart Logistics environment: tracking and monitoring the vibration of the shipping containers. Here, we reflect on the most important features supported by our device.
1) Cellular Connectivity
To fulfill the requirement for ubiquitous connectivity while keeping the power consumption of the battery-powered device low, we utilize a BG96 cellular module from Quectel, which supports NB-IoT and LTE-M, the state-of-the-art 3GPP CIoT communication standards that will be further evolved in 5G standardization [44]. In addition, EGPRS is supported to ensure connectivity in areas where an LTE carrier might not be available. Finally, the integrated GNSS module provides the geolocation information which is essential to the asset tracking task in the logistics use case. The intention is to use NB-IoT as the primary means of communication due to its desirable properties, namely energy efficiency combined with extended coverage [43]. However, on occasions when it is necessary to transfer larger amounts of data (e.g., a new firmware image), LTE-M is a more efficient solution. The architecture of our edge node provides flexibility which allows us to adapt the throughput of the communication module according to the needs of the application.
2) On-Board Sensors
Apart from the localization data provided by the GNSS module, on-board environmental sensors are used to measure parameters relevant to the logistics use case. The 6-axis Inertial Measurement Unit (IMU) provides information about the vibrations and the magnetic field along X, Y and Z axes relative to the chip position. An additional set of sensors is used to measure the atmospheric conditions such as air temperature, pressure and humidity.
The designed platform provides additional metadata that could be used as inputs to ADM-EDGE. For example, the cellular modem is capable of providing the standard set of radio condition metrics (SNR, RSSI, RSRP, etc.). In addition, our design includes the on-board current measuring circuitry that allows the micro-controller unit (MCU) to acquire precise measurements of power consumption by the BG96 module.
3) The MCU Features and Capabilities
The main MCU inside the edge node is a low-power 32-bit ARM Cortex-M0+ with 256 KB of FLASH and 32 KB of SRAM, operating at 16 MHz. The MCU resources are sufficient to efficiently control the rest of the circuitry while maintaining low power consumption, especially in the sleep mode. However, the absence of an operating system as well as the hardware constraints limit the usage of ML tools to lightweight models that are fully customized and optimized for a given application. Finally, an external FLASH memory module enables data logging over the intervals when there is no connectivity, and it is used to store the firmware images during over-the-air updates.
4) Security
In an industrial setup, security is of critical importance. Therefore, we use a hardware crypto element which enables offloading the computationally expensive asymmetric cryptographic algorithms (elliptic-curve cryptography and RSA) from the resource-constrained MCU [45]. Tamper-resistant memory within the crypto chip is used to store security credentials, keeping the firmware on the host MCU oblivious of sensitive information such as the encryption keys and certificates.
C. Anomaly Detection Using ADM-EDGE and ADM-FOG
1) Autoencoder Inference and Training
ADM-EDGE and ADM-FOG detect anomalies using autoencoders. An autoencoder is a feed-forward neural network trained to replicate input values at the output layer in order to obtain latent data representations in hidden layers. The number of neurons in the output layer of the autoencoder is equal to the number of neurons in the input layer, and both quantities are equal to the number of features in the training dataset (the
Autoencoders typically have a symmetric architecture with an odd number of layers as shown in Figure 4. The first
autoencoder architecture with 1 hidden layer in which the middle hidden layer contains n/2 nodes, where n is the total number of features,
autoencoder architecture with 3 hidden layers containing sequentially n/2, n/4 and n/2 nodes, and
autoencoder architecture with 5 hidden layers containing sequentially 3n/4, n/2, n/4, n/2 and 3n/4 nodes.
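To make the three layer layouts concrete, the following minimal Python sketch computes the hidden-layer widths for a given number of features n. The function names are hypothetical, and the use of floor division (with at least one node per layer) when n is not divisible by 4 is an assumption of this sketch, not something stated in the text.

```python
# Hidden-layer widths as fractions (numerator, denominator) of the
# number of input features n, for the three symmetric architectures.
FRACTIONS = {
    1: [(1, 2)],
    3: [(1, 2), (1, 4), (1, 2)],
    5: [(3, 4), (1, 2), (1, 4), (1, 2), (3, 4)],
}


def hidden_layer_sizes(n, hidden_layers):
    """Return the list of hidden-layer widths for n input features."""
    if hidden_layers not in FRACTIONS:
        raise ValueError("supported architectures have 1, 3 or 5 hidden layers")
    # Floor division is an assumption for n not divisible by 4;
    # every layer keeps at least one node.
    return [max(1, (n * num) // den) for num, den in FRACTIONS[hidden_layers]]


def full_layout(n, hidden_layers):
    """Input layer, hidden layers and output layer (input and output have n nodes)."""
    return [n] + hidden_layer_sizes(n, hidden_layers) + [n]
```

For example, with n = 8 features the 3-hidden-layer layout is 8, 4, 2, 4, 8.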
As for any feed-forward neural network, there is a directed weighted link from each node in the \begin{align*} x_{i}^{(l)}=&W_{i}^{(l)} \cdot y^{(l - 1)} + b_{i}^{(l)} \\ y_{i}^{(l)}=&\sigma (x_{i}^{(l)}) \\ y_{i}^{(1) }=&v_{i},\tag{1}\end{align*}
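The layer-by-layer computation in (1) can be sketched in a few lines of plain Python. This is an illustrative sketch only: the function name `forward` is hypothetical, and the choice of `tanh` as the default activation is an assumption, since any activation function can be passed in.

```python
import math


def forward(v, weights, biases, sigma=math.tanh):
    """Propagate the input vector v through the network following Eq. (1).

    weights[k] is the weight matrix (list of rows) feeding layer k + 2,
    biases[k] is the corresponding bias vector.
    """
    y = list(v)  # y^(1) = v: the first layer simply holds the input
    for W, b in zip(weights, biases):
        # x_i^(l) = W_i^(l) . y^(l-1) + b_i^(l)
        x = [sum(w * y_j for w, y_j in zip(row, y)) + b_i
             for row, b_i in zip(W, b)]
        # y_i^(l) = sigma(x_i^(l))
        y = [sigma(x_i) for x_i in x]
    return y
```

The output of the last layer is the reconstruction of the input, which has the same length as v whenever the last weight matrix has as many rows as there are features.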
The weights of autoencoder links and biases associated to autoencoder nodes are learnt on a training dataset \begin{equation*} E(T, \theta) = \frac {1}{|T|}\sum _{v \in T} \text {Err}(v, \theta),\tag{2}\end{equation*}
\begin{align*} \text {Err}(v, \theta)=&\sum _{i = 1}^{f} (v_{i} - \hat {v}_{i})^{2} \\ \hat {v}=&\mathcal {A}(v, \theta),\tag{3}\end{align*}
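As a worked illustration of (2) and (3), the sketch below computes the reconstruction error of a single data point and its average over a dataset. The helper names are hypothetical, and `reconstruct` stands for any callable that maps a data point to its autoencoder output.

```python
def reconstruction_error(v, v_hat):
    """Err in Eq. (3): sum of squared differences between input and output."""
    return sum((a - b) ** 2 for a, b in zip(v, v_hat))


def mean_error(dataset, reconstruct):
    """E in Eq. (2): reconstruction error averaged over all points of the dataset."""
    return sum(reconstruction_error(v, reconstruct(v)) for v in dataset) / len(dataset)
```

For instance, a model that shifts every feature of a two-feature point by 1 has a reconstruction error of 2 for every point, and hence a mean error of 2.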
To minimize MSE of ADM-EDGE and ADM-FOG autoencoders we use the Adam optimization algorithm [46]. Adam belongs to the class of iterative gradient descent (GD) optimization algorithms which minimize \begin{equation*} \theta _{t} = \theta _{t - 1} - \eta \frac {\hat {m_{t}}}{\sqrt {\hat {v_{t}}} + \epsilon },\tag{4}\end{equation*}
\begin{align*} \hat {m_{t}}=&m_{t} \: / \: (1 - \beta _{1}^{t}) \\ \hat {v_{t}}=&v_{t} \: / \: (1 - \beta _{2}^{t}) \\ m_{t}=&\beta _{1} m_{t - 1} + (1 - \beta _{1}) g_{t} \\ v_{t}=&\beta _{2} v_{t - 1} + (1 - \beta _{2}) g_{t}^{2} \\ g_{t}=&\nabla _{\theta }E(T^{b}, \theta _{t - 1}).\tag{5}\end{align*}
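A single Adam update following (4) and (5) can be sketched as below. The function name is hypothetical, and the default values of the learning rate and of the decay and stabilization constants are the commonly used Adam defaults, taken here as an assumption rather than as the values used in this work.

```python
import math


def adam_step(theta, grad, m, v, t, eta=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update of the parameter vector theta at iteration t (t >= 1).

    grad is the gradient of the batch error at the previous parameters;
    m and v are the running first and second moment estimates.
    """
    # Moment estimates, Eq. (5)
    m = [beta1 * m_i + (1 - beta1) * g for m_i, g in zip(m, grad)]
    v = [beta2 * v_i + (1 - beta2) * g * g for v_i, g in zip(v, grad)]
    # Bias-corrected estimates, Eq. (5)
    m_hat = [m_i / (1 - beta1 ** t) for m_i in m]
    v_hat = [v_i / (1 - beta2 ** t) for v_i in v]
    # Parameter update, Eq. (4)
    theta = [th - eta * mh / (math.sqrt(vh) + eps)
             for th, mh, vh in zip(theta, m_hat, v_hat)]
    return theta, m, v
```

Note that in the very first iteration the bias correction makes the step size approximately equal to the learning rate, independently of the gradient magnitude.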
2) Anomaly Detection Based on Autoencoders
Let us assume that the device behaviour is described by a feature vector \begin{equation*} e = \max _{v \in D} \text {Err}(v, \theta).\tag{6}\end{equation*}
ADM-EDGE and ADM-FOG autoencoders identify anomalies according to the previously described rule. For each anomaly detection decision, the confidence score \begin{equation*} C(y) = S(\text {Err}(y,\theta) - e),\tag{7}\end{equation*}
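The threshold in (6) and the confidence score in (7) can be illustrated with the short Python sketch below. It assumes that S is the logistic sigmoid and that a data point is flagged as anomalous when its reconstruction error exceeds e; both assumptions are inferred from the score ranges discussed in this section, and all function names are hypothetical.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function (assumed form of S)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def threshold(training_errors):
    """e in Eq. (6): the largest reconstruction error seen on the training data."""
    return max(training_errors)


def confidence(error, e):
    """C in Eq. (7): sigmoid of the distance between the error and the threshold."""
    return sigmoid(error - e)


def is_anomaly(error, e):
    """Assumed decision rule: anomalous if the error exceeds the threshold."""
    return error > e
```

Under these assumptions, a point whose error equals the threshold receives a score of exactly 0.5, and larger errors push the score towards 1.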
The important property of the confidence score function is that non-anomalous data points have scores in the range (0, 0.5], whereas anomalous data points exhibit higher scores that belong to the interval (0.5, 1). In other words, confidence scores close to 0 indicate non-anomalous data points, while values close to 1 signify anomalies. Thus, confidence scores for non-anomalous data points after making decision are further transformed into
Due to low computational power and small memory capacity, it is practically infeasible to train the ADM-EDGE autoencoder directly on the edge node device: 1) a large number of data points has to be stored at the device to train a model exhibiting an acceptable level of accuracy, 2) the training of autoencoders is a computationally intensive optimization process usually performed in a large number of iterative steps, and 3) low computational power prevents any serious model validation and tuning of model hyper-parameters. Consequently, we adopt a scheme in which ADM-EDGE autoencoders are trained offline and an inference engine for feed-forward neural networks is directly integrated into the firmware of the edge node device, enabling autoencoder-based anomaly detection on pretrained models. The edge node device also does not provide storage for sensory readings. This means that it is also not feasible for ADM-EDGE autoencoders to detect anomalies in timeseries data. Thus, ADM-EDGE autoencoders perform anomaly detection considering individual data points (the last values of sensory readings). In our future work, we will also consider online training of ADM-EDGE autoencoders on edge node devices with higher computational and storage capabilities. In contrast to the lightweight ADM-EDGE autoencoders, ADM-FOG autoencoders process multivariate timeseries constructed using the sliding window approach in an arbitrary number of hidden layers.
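The sliding window construction used to form ADM-FOG inputs can be sketched as follows. This is a minimal illustration in which the function name and the window length parameter `w` are hypothetical: each input vector is obtained by concatenating w consecutive data points.

```python
def sliding_windows(points, w):
    """Concatenate every run of w consecutive data points into one input vector."""
    if w < 1:
        raise ValueError("window length must be at least 1")
    return [
        [x for p in points[i:i + w] for x in p]  # flatten the w points in order
        for i in range(len(points) - w + 1)
    ]
```

With data points of n features, every resulting vector has w times n entries, and a sequence shorter than w yields no window at all.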
The inference engine for ADM-EDGE autoencoders is realized in C as a standalone, self-contained module without any external dependencies on third-party libraries. This C module is directly integrated into the firmware of an edge node device. To train ADM-EDGE and ADM-FOG autoencoders, we have developed a Python module based on the TensorFlow deep learning library [50]. This module builds an autoencoder as a TensorFlow sequential neural network model for a given specification of the autoencoder structure (the number of hidden layers and the number of nodes per hidden layer) and determines the autoencoder weights and biases using the previously described Adam algorithm for a given number of epochs (by default 100) and batch size (by default 16). Before training, data points in the input training dataset are normalized such that each feature has zero mean and unit variance. The training of both ADM-EDGE and ADM-FOG autoencoders is performed on a fog gateway. In the case of ADM-EDGE autoencoders, the structure of the trained autoencoder AD model, its weights and the data normalization parameters are exported as C declarations to a header file. The exported header file is included by the C module realizing the inference engine for ADM-EDGE autoencoders prior to its integration into the firmware. The inference engine for ADM-FOG autoencoders is implemented in Python relying on the TensorFlow library.
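The normalization and export steps can be illustrated with the following Python sketch. It is not the software library developed in this work: all function names, the C identifiers (`ADM_LAYER_SIZES`, `ADM_MEAN`, `ADM_STD`, `ADM_W0`, ...) and the row-major weight layout are hypothetical choices made for illustration, and the model parameters are passed in as plain Python lists.

```python
import statistics


def fit_normalizer(dataset):
    """Per-feature mean and standard deviation of the training data."""
    columns = list(zip(*dataset))
    means = [statistics.fmean(c) for c in columns]
    # Population standard deviation; constant features fall back to 1.0
    # to avoid division by zero (an assumption of this sketch).
    stds = [statistics.pstdev(c) or 1.0 for c in columns]
    return means, stds


def normalize(v, means, stds):
    """Scale a data point so that each feature has zero mean and unit variance."""
    return [(x - m) / s for x, m, s in zip(v, means, stds)]


def c_float_array(name, values):
    """Render a list of floats as a C array declaration."""
    body = ", ".join(f"{x:.8e}f" for x in values)
    return f"static const float {name}[{len(values)}] = {{ {body} }};"


def export_header(layer_sizes, weights, biases, means, stds):
    """Render model structure, parameters and normalization data as a C header."""
    lines = ["#ifndef ADM_EDGE_MODEL_H", "#define ADM_EDGE_MODEL_H", ""]
    sizes = ", ".join(str(s) for s in layer_sizes)
    lines.append(f"static const int ADM_LAYER_SIZES[{len(layer_sizes)}] = {{ {sizes} }};")
    lines.append(c_float_array("ADM_MEAN", means))
    lines.append(c_float_array("ADM_STD", stds))
    for k, (W, b) in enumerate(zip(weights, biases)):
        flat = [w for row in W for w in row]  # row-major flattening
        lines.append(c_float_array(f"ADM_W{k}", flat))
        lines.append(c_float_array(f"ADM_B{k}", b))
    lines += ["", "#endif /* ADM_EDGE_MODEL_H */"]
    return "\n".join(lines) + "\n"
```

Exporting constants in this way lets the parameters be placed in read-only memory by the compiler, so that no model parsing is needed at run time on the device.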
Decisions made by ADM-EDGE lightweight autoencoders are re-evaluated by ADM-FOG autoencoders in case of low confidence scores. The default value of the threshold is set to
System Integration, Data Generation and Numerical Results
A. System Integration
To integrate the system, collect real-world data and perform testing and evaluation, the CIoT UE is connected to the FGW via a mobile operator macro-cellular NB-IoT eNB. The CIoT UE runs the ADM-EDGE software module and periodically sends data points to the FGW encapsulated into UDP packets. Within the mobile operator core network, a general-purpose server is set up and connected to the PGW gateway. The ADM-FOG software module within the server accepts the UDP packets sent by the CIoT UE. The server provides sufficient resources to run the ADM-FOG module, so in the sequel, we focus on the ADM-EDGE module deployment on the CIoT UE device.
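As an illustration of how a data point and its confidence score could be carried in a UDP payload, consider the Python sketch below. The payload layout (a 32-bit sequence number, a 32-bit float confidence score and a run of 32-bit float features, little-endian) is purely hypothetical and is not the format used in the deployed system; the function names are likewise assumptions.

```python
import socket
import struct

HEADER = "<If"  # hypothetical header: sequence number (uint32), confidence (float32)


def pack_data_point(seq, confidence, features):
    """Serialize one data point into a compact binary payload."""
    return struct.pack(f"{HEADER}{len(features)}f", seq, confidence, *features)


def unpack_data_point(payload):
    """Recover the sequence number, confidence score and features from a payload."""
    n = (len(payload) - struct.calcsize(HEADER)) // struct.calcsize("<f")
    seq, confidence, *features = struct.unpack(f"{HEADER}{n}f", payload)
    return seq, confidence, features


def send_data_point(payload, host, port):
    """Send the payload as a single UDP datagram to the given host and port."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
```

A fixed binary layout of this kind keeps the payload small, which matters on a low-rate NB-IoT link, at the cost of both ends having to agree on the format in advance.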
Table 1 reports the memory footprint of an ADM-EDGE model with one hidden layer, which serves as an estimate of its storage budget. One can note that the ADM-EDGE footprint is a small fraction of that of the standard NB-IoT device firmware needed for basic device sensing, processing and communication functionality. The sizes of the exported TensorFlow and TensorFlow Lite models are also given for reference. Table 2 compares the ADM-EDGE memory resource utilization for autoencoders with 1, 3 and 5 hidden layers (as in the previous table, the sizes of the exported TensorFlow and TensorFlow Lite models are given for reference). It can be observed that additional hidden layers do not significantly increase the memory footprint of the ADM-EDGE module within the firmware of the edge node device.
The computational budget of ADM-EDGE devices is estimated by the available number of operations per second. The ARM Cortex-M0+ CPU of the ADM-EDGE device has a two-stage pipeline, and most instructions execute within 1 clock cycle (some take 2 and a few take more than 2 clock cycles). The following holds for the CPU: peak throughput = peak IPC × f = 1 × 16 MHz = 16 MOps/s, where peak IPC (instructions per cycle) = 1 for the ARM Cortex-M0+ architecture. Accordingly, the peak computational budget is up to 16 MOps/s.
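To relate this peak budget to the cost of one autoencoder inference, the following back-of-the-envelope Python sketch counts the multiply-accumulate (MAC) operations of a fully connected network and derives an optimistic lower bound on inference time. The function names and the `ops_per_mac` parameter are hypothetical; in particular, treating one MAC as a single operation is an idealization, since the actual instruction count per MAC depends on the number format and the compiler.

```python
CLOCK_HZ = 16_000_000  # MCU clock frequency (16 MHz)
PEAK_IPC = 1           # peak instructions per cycle


def peak_ops_per_second(clock_hz=CLOCK_HZ, ipc=PEAK_IPC):
    """Peak throughput = peak IPC * clock frequency."""
    return clock_hz * ipc


def mac_count(layer_sizes):
    """Multiply-accumulate operations of one forward pass through dense layers."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))


def min_inference_time_s(layer_sizes, ops_per_mac=1):
    """Optimistic lower bound on inference time at peak throughput."""
    return mac_count(layer_sizes) * ops_per_mac / peak_ops_per_second()
```

For a layout of 8, 4 and 8 nodes, this gives 64 MACs, that is, a lower bound of 4 microseconds per inference under the idealized one-operation-per-MAC assumption.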
B. Data Generation
To generate the dataset (elaborated in Section IV-C), we used the NB-IoT edge nodes described in Section III-B. We created a setup in which an edge node was attached to a box-shaped container inside a transport vehicle moving through the city of Novi Sad. The device was initially connected to the NB-IoT network and maintained uninterrupted connectivity along the entire route. We collected the positioning data from the GNSS module (timestamp, latitude, longitude, altitude, speed and number of satellites in range), as well as the outputs of the IMU (acceleration and magnetic field along the 3 spatial axes). The time resolution (sampling period) of the GNSS samples was
C. Numerical Results
ADM-EDGE and ADM-FOG autoencoders are evaluated using two independent real-world datasets. The first dataset reflects the behaviour of the edge node device under normal driving conditions without large disturbances. This dataset contains 12678 data points and it is used to train ADM-EDGE and ADM-FOG autoencoders. The trained autoencoders are tested on the second dataset. The test dataset has 1571 data points with 42 intentionally caused anomalous events induced by shaking and overturning the container with the attached device. Since the edge node records both location-based features (GPS longitude and latitude) and IMU-based features, we can distinguish two types of anomalous events: location-based anomalies (large deviations from learned trajectories) and behaviour-based anomalies (large deviations from learned IMU signals). Our test dataset does not contain any location-based anomalies.
The accuracies of ADM-EDGE and ADM-FOG autoencoders are assessed by computing the following basic measures:
TP (true positives) – the number of correctly identified anomalous events,
FP (false positives) – the number of times an autoencoder indicated a non-existing anomalous event, and
FN (false negatives) – the number of times an autoencoder failed to indicate an existing anomalous event.
From \begin{align*} P=&\frac {TP}{TP + FP} \tag{8}\\ R=&\frac {TP}{TP + FN}\tag{9}\end{align*}
When comparing different anomaly detection models it is useful to have a single overall score reflecting their performances. For this purpose we use the \begin{equation*} F_{1} = \frac {2 \cdot P \cdot R}{P + R}.\tag{10}\end{equation*}
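The measures in (8)–(10) follow directly from the three counts, as in the short Python sketch below. The function names are hypothetical, and returning 0 when a denominator is zero is a convention adopted for this sketch.

```python
def precision(tp, fp):
    """P in Eq. (8): fraction of reported anomalies that are real."""
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp, fn):
    """R in Eq. (9): fraction of real anomalies that are reported."""
    return tp / (tp + fn) if tp + fn else 0.0


def f1_score(tp, fp, fn):
    """F1 in Eq. (10): harmonic mean of precision and recall."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r) if p + r else 0.0
```

For example, a detector with 8 true positives, 2 false positives and 2 false negatives has precision, recall and F1 all equal to 0.8.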
In our experimental evaluation, we examine three ADM-EDGE anomaly detection models (with 1, 3 and 5 hidden layers), 19 ADM-FOG models with three hidden layers (sequentially containing
The evaluation metrics for a particular model are estimated by averaging results individually obtained from all autoencoders in the corresponding ensemble. Additionally, two variants of each model are examined: a model trained without location-based features (NO-GPS case) and a model trained on all features (GPS case).
The performance of ADM-EDGE and ADM-FOG autoencoders is compared to five baseline anomaly detection methods that are not based on deep learning algorithms:
SVM – anomaly detection using one-class support vector machines,
ABOD – angle-based outlier detection,
KNN – anomaly detection based on the K-nearest neighbours algorithm (K = 10),
PCA – anomaly detection based on principal component analysis, and
HBOD – histogram-based outlier detection.
The results of the evaluation of ADM-EDGE autoencoders in both variants (GPS and NO-GPS case) are summarized in Table 3. The table also shows precision, recall and
The results presented in Table 3 also show that ADM-EDGE autoencoders perform better than the baseline anomaly detection methods: all ADM-EDGE autoencoders have higher
In the second experiment we examine the performance of ADM-FOG autoencoders with 3 and 5 hidden layers. The obtained F1 scores are presented in Figure 6 for the GPS case and in Figure 7 for the NO-GPS case. It can be seen that in both cases ADM-FOG autoencoders with 3 hidden layers achieve
The results shown in Figures 6 and 7 allow us to explicitly quantify trade-offs between performance of anomaly detection and response time, with respect to whether the decision on the presence of anomalies is carried out at the edge or at the fog. For this, note that the response time of ADM-EDGE corresponds approximately to one sampling period
In the last experiment, we compare ADM-FOG autoencoders to baseline anomaly detection methods trained and tested on timeseries. The obtained \begin{equation*} \text {ADM-FOG} \succ \text {ABOD} \succ \text {KNN} \succ \text {PCA} \succ \text {SVM} \succ \text {HBOD},\end{equation*}
\begin{equation*} \text {ADM-FOG} \succ \text {ABOD} \succ \text {KNN} \succ \text {PCA} \succ \text {HBOD} \succ \text {SVM}.\end{equation*}
Our experimental evaluation shows that both ADM-EDGE and ADM-FOG autoencoders perform better than all examined baseline methods. Therefore, it can be concluded that autoencoders are an adequate choice to enhance the proposed CIoT architecture with unsupervised anomaly detection capabilities at both edge and fog layer.
Conclusion
In this paper, we presented the design, implementation, real-world deployment and evaluation of a novel anomaly detection architecture for Cellular IoT networks tailored for the Smart Logistics use case. We demonstrate and quantify major system-design trade-offs between responsiveness and accuracy with respect to the position (i.e., edge or fog) within the Cellular IoT network where anomaly detection is performed. Through a real-world deployment study, we emphasize that autoencoders represent a suitable choice for ML anomaly detection at the edge.
The results reported in this paper are based on a small-scale real-world trial. In our future work, the trial will be extended to a large number (approximately 50) of container-carrying vehicles in a realistic nation-wide Smart Logistics use case. We will also explore the possibility of performing the training process at the edge devices, and investigate opportunities to integrate advanced distributed learning concepts such as federated learning into the proposed Cellular IoT architecture.