A Linear Systems Perspective on Intrusion Detection for Routing in Reconfigurable Wireless Networks


The dynamic and open nature of Reconfigurable Wireless Networks (RWN) makes them vulnerable to routing attacks, where a malicious node could alter routing information to ...

Abstract:

Reconfigurable wireless networks, such as ad hoc or wireless sensor networks, do not rely on fixed infrastructure. Nodes must cooperate in the multi-hop routing process. This dynamic and open nature makes reconfigurable networks vulnerable to routing attacks that could significantly degrade network performance. Intrusion detection systems consist of a set of techniques designed to identify hostile behavior. In the literature, there are several approaches for intrusion detection in reconfigurable network routing, such as collaborative, statistical, or machine learning-based techniques. In this paper, we introduce a new approach to intrusion detection for reconfigurable network routing based on linear systems theory. Using this approach, we can discriminate routing attacks by considering the system's z-plane poles. The z-plane can be thought of as a two-dimensional feature space that arises naturally. It is independent of the number of network attack detection metrics and does not require extra dimensionality reduction. Two different host-based intrusion detection techniques, inspired by this new linear systems perspective, are presented and analyzed through a case study. The case study considers the effects of attack severity and node mobility on the attack detection performance. High attack detection accuracy was obtained without increasing packet overhead for both techniques by analyzing locally available information.
Published in: IEEE Access ( Volume: 7)
Page(s): 60486 - 60500
Date of Publication: 09 May 2019
Electronic ISSN: 2169-3536

SECTION I.

Introduction

Reconfigurable wireless networks (RWN), such as ad hoc and Wireless Sensor Networks (WSN), are decentralized, flexible, highly dynamic, independent of any fixed infrastructure and capable of self-management, [1]. Routing is a fundamental operation that is needed in these networks in order to deliver information among nodes. Routing protocols in RWN were designed assuming a safe and cooperative network environment, which is not always the case. Some routing protocol vulnerabilities can be exploited by a malicious node or group of nodes to affect network performance and consume resources (e.g., energy, bandwidth). There are different types of network attacks against reconfigurable routing protocols, e.g., the flooding attack, [2], selective forwarding attack, [3], black hole attack, [4] and wormhole attack, [5].

Preventive measures, such as authentication mechanisms or secure routing, could be taken to protect RWN from routing attacks. It is worth mentioning that those preventive measures are not sufficient to fully protect the network from inside attackers. Intrusion Detection Systems (IDS) represent a protective measure because they are a set of techniques designed to identify malicious activities that could compromise network security. Several IDS approaches, [6], such as statistical, [7], collaborative, [8], or machine learning techniques, [9], have been taken in the literature to address routing attack detection for RWN. Thus, the development of tools for intrusion detection is important, and more interest is drawn towards tools that can potentially control and act based on evidence of the presence of an attack.

In this paper, we present two new approaches for IDS in RWN’s routing. These approaches are based on linear systems theory, which allows us to detect intruders by monitoring the behavior of nodes in the network. At the same time this tool is powerful for protection and control of the network with actions to isolate or avoid the attack.

Malicious nodes behave differently than the rest of the nodes in the network; if we model each network node as a linear system, those differences in dynamic behavior should be reflected on the system representation for those nodes. A linear system can be defined in the frequency domain by the z-domain zeros and poles of the transfer function. System poles are always located on a two dimensional orthogonal space which is independent of the number of network metrics considered as input and output signals (e.g., number of received packets, bandwidth usage per link). This property can be used as a dimensionality reduction step that arises naturally when representing a node as a linear system. Depending on the pole locations, a system could behave as an under damped, over damped, critically damped or an unstable system. This property of the pole locations could be thought of as being part of a classification task that can be exploited to identify routing attacks. Hence, our goal is to find a routing attack-sensitive system representation of a node. Without loss of generality we introduce the general equations for system models and we form a case study with one input and one output to analyze and show feasibility of the two IDS techniques proposed.

The rest of this document is organized as follows. Section II provides background on routing attacks and IDS for RWN, together with a concise literature review. In Section III, two different techniques to identify routing attacks in RWN are presented. Section IV presents simulation results as a case study, where performance metrics are presented in terms of detection rates for different attack severity and node speed values. The implementation of both techniques is also addressed in this section. Finally, Section V covers the conclusions of this work.

SECTION II.

Background

In this section we present a general overview on protocols and attacks for routing in RWN. Literature review on IDS for routing in RWN is briefly addressed. Finally, as background work, we discuss the use of linear systems theory for network attack detection in the literature.

A. Routing Attacks

Data routing is one of the most essential functions of a network. For the RWN case, routing represents an especially hard challenge due to the open and highly dynamic nature of the network. The problem gets even harder because some nodes could have severe energy, computing and bandwidth restrictions. Routing protocols for RWN can be classified into proactive and reactive protocols. Proactive protocols are those in which nodes periodically exchange routing information and changes in topology with their respective neighbors, as in the case of the Optimized Link State Routing (OLSR) protocol, [10]. Reactive protocols are those in which routes are generated only when there is a need to communicate a message between two nodes, such as Ad Hoc on Demand Distance Vector (AODV), [11]. In general, any routing protocol for RWN must perform three tasks: route discovery, route maintenance and data forwarding. During route discovery, nodes share information about their links with their respective neighbors, either in a proactive or reactive manner. With updated routing information, nodes cooperate to forward data packets from origin to destination, usually through multiple hops. Route maintenance refers to the actions performed by nodes when routes change due to node mobility or channel impairments.

Many routing protocols were designed assuming a cooperative environment. However, if there are one or more malicious nodes, they could launch an attack on the routing protocol by violating its rules, either in the route discovery phase, the route maintenance phase or during packet forwarding. Any of those hostile actions could considerably affect network performance.

There are different attack classes that can be used against the reconfigurable routing protocols, [12], some of which are:

  1. Flooding Attack. There are two possible implementations of this attack, Route Request (RREQ) flooding and data flooding attack. RREQ flooding occurs when a malicious node sends a large number of RREQ messages in a short period of time in order to deplete network resources, such as bandwidth or energy. During a data flooding attack, the malicious node sends bogus traffic to consume network bandwidth and energy.

  2. Selective Forwarding. A malicious node behaves properly during route discovery and route maintenance, but selectively drops data packets in order to degrade network performance.

  3. Black Hole Attack. A malicious node sends fake routing information so that each neighboring node calculates an optimal route to a node of the attacker’s interest, attracting and controlling network traffic. After monopolizing network traffic, the attacker could then analyze the content of the collected data packets or simply discard the data packets.

  4. Wormhole Attack. This attack requires a pair of malicious nodes colluding to re-transmit packets from one network location to another using a private network, gaining control on network traffic.

Secure routing protocols usually rely on encrypting routing information to prevent any modification or misuse of it, [13], [14]. This approach prevents some attacks at the cost of increased packet overhead, but the lack of a central authority in charge of security poses a challenge in security certificate management. Additionally, some routing attacks, such as the selective forwarding or wormhole attack, could be launched against the network by an inside attacker despite routing information encryption. Complementary techniques, such as IDS, are necessary to fully protect RWN.

B. Intrusion Detection Systems

An Intrusion Detection System (IDS) is a defense mechanism capable of detecting hostile activities that could compromise network security. IDS are an alternative to protect the routing process in RWN, [15]–[17]. In order to carry out the detection, the network is constantly monitored in search of known malicious behaviors or anomalous behaviors. IDS can be classified as host-based, if the detection process is performed by each node, or network-based, if the detection is performed by a central entity observing larger traffic flows, such as a base station or a cluster head node.

Different collaborative, [18], and trust-based IDS have been proposed in the literature to detect routing attacks in RWN, [19]–[21]. There is generally a trust metric that nodes in the network obtain for their neighbors. Then, they share those measurements to reach a consensus about a particular node's behavior, identifying attacker nodes in this way. This approach usually has good detection rates, but reaching consensus between nodes increases overhead.

Statistical techniques such as the ones presented in [22]–[24], and [25], tend to have very good attack detection rates for particular scenario conditions. Statistical metrics of network parameters change over time because of the dynamic nature of the RWN. Spectral information of relevant network parameters could be used in conjunction with statistical techniques to make intrusion detection robust to dynamic changes in the network, [26].

Support Vector Machines (SVM) are a popular machine learning approach for IDS in RWN, [27], [28], because they are robust to the network's dynamic behavior. SVM are good for performing a classification task in a high dimensional feature space with relatively little training data. Additionally, SVM can achieve very high attack detection accuracy, up to 99.98%, [29], and have a relatively low computational cost compared to other machine learning techniques.

C. Network Security Based on Linear Systems Theory

In the literature we identified two network security references based on linear systems theory. In the first reference, [30], the authors modeled the network behavior as a Multiple Input Single Output (MISO) linear system. They used the model obtained to detect unauthorized probe and Denial of Service (DoS) attacks in a centralized network. They considered as model inputs traffic parameters such as the number of TCP packets received and the number of UDP packets received, among other similar features. The system output models the network state and is used to detect a particular threat.

In the second reference, [31], the authors obtained a state-space feedback model and a Proportional Integral controller (PI) to detect and delay the spread of worms and viruses on a network. They were monitoring the number of connections per unit time and controlling the traffic in two queues, a safe queue and a suspicious queue to slow down the spread velocity of the worm.

In this paper, we analyze RWN routing security based on the theory of Discrete-Time Linear Time Invariant Systems (DT-LTIS), [32], [33]. We know that routing attacks degrade RWN’s performance. We consider a node as a linear system whose output signal is a performance metric. This metric is defined in such a way that performance degradation corresponds to an output signal increase. As system input signals, we consider different network metrics locally available to a network node. Those considered metrics are sensitive to network performance degradation. The linear systems approach and the input and output signals will allow us to analyze routing attacks in terms of causes and effects from the local perspective of a single network node.

SECTION III.

Pole Location Based Intrusion Detection

In this section, we present two host-based intrusion detection techniques for RWN. Our goal for each technique is to find a linear system representation of a node. Poles of this system representation must be sensitive to routing attacks.

We begin by defining notation and some basic concepts. Then we discuss a black box technique to find an attack-sensitive system representation of a node. Finally a root locus approach is presented as an alternative to the black box technique.

A. Basic Definitions

Consider a reconfigurable network as a dynamic directed graph \mathcal {G_{\tau }} whose nodes can leave or join the network at any time. New links can be established or lost due to different network phenomena such as node mobility or channel impairments. The topology of \mathcal {G_{\tau }} can be described by the nodes and their respective links at a given time instant \tau . Formally, a network composed by a total number of S_{\tau } nodes is defined as \mathcal {G_{\tau }} = (\mathcal {V_{\tau }}, \mathcal {L_{\tau }}) ; where \mathcal {V_{\tau }}=\{v_{i}: i = 1,2,\ldots, S_{\tau }\} is the set of nodes in the network at instant \tau , and \mathcal {L}_{\tau }=\{ l_{ij}=(v_{i},v_{j}): v_{i},v_{j} \in \mathcal {V_{\tau }} \} is the set of ordered pairs representing the network links at the same instant \tau . Link direction is indicated by the sub-indices order, e.g., l_{ij} refers to the link that goes from v_{i} to v_{j} . If for any pair of nodes, v_{i} , v_{j} , the link l_{ij} \notin \mathcal {L_{\tau }} , then l_{ij} does not exist. Figure 1 (a) shows an example of a network represented by the described notation.

FIGURE 1. (a) Topology of some reconfigurable network at instant \tau . (b) Intrusion detection performed by node v_{i} on node v_{j} through information of the link l_{ji} .

Without loss of generality, we will focus on node v_{i} to explain the proposed intrusion detection techniques. The set of neighboring nodes to v_{i} at instant \tau is defined as \mathcal {N}_{i} = \{ v_{j}: l_{ji} \in \mathcal {L_{\tau }} \} , \mathcal {N}_{i} \subset \mathcal {V}_{\tau } . Each node v_{i} \in \mathcal {V}_{\tau } , will be running the intrusion detection system, IDS_{ij} , for each neighbor v_{j} \in \mathcal {N}_{i} . If all the nodes in the network are independently performing intrusion detection for each neighbor, any malicious node could be identified and isolated by its one hop neighbors. Figure 1 (b) shows the intrusion detection performed by node v_{i} for its j -th neighbor, v_{j} .
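The notation above can be sketched in a few lines of Python. This is only an illustration under assumed names: the node labels, the `neighbors` helper and the example link set are hypothetical and do not come from the paper. Links are stored as ordered pairs (source, destination), so the neighbor set \mathcal {N}_{i} collects every node with a link pointing into v_{i} .

```python
from typing import Set, Tuple

# An ordered pair (source, destination): l_ij is the link from v_i to v_j.
Link = Tuple[str, str]


def neighbors(node: str, links: Set[Link]) -> Set[str]:
    """Return N_i = {v_j : l_ji in L_tau}, the nodes with a link into `node`."""
    return {src for (src, dst) in links if dst == node}


# Hypothetical snapshot of the topology at one instant tau.
nodes = {"v1", "v2", "v3", "v4"}
links = {("v2", "v1"), ("v3", "v1"), ("v1", "v2"), ("v4", "v3")}

# v1 would run one detector IDS_1j for each neighbor v_j in N_1.
ids_instances = {("v1", vj) for vj in neighbors("v1", links)}
```

In this snapshot v1 monitors v2 and v3, while v4 has no incoming links and therefore monitors no one.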

A malicious node or group of nodes could perpetrate a routing attack in order to affect network performance (e.g., flooding attack, wormhole attack, black hole attack, selective forwarding attack). The set of the M possible routing attacks that could be launched against the network is defined as \Omega _{\mathcal {A}} = \{\omega _{g}: g = 1,2,\ldots, M \} . Each \omega _{g} \in \Omega _{\mathcal {A}} has an associated attack severity defined in a finite interval, \psi _{g} \in [\psi _{g}^{\min }, \psi _{g}^{\max }] . In general, greater values of the attack severity, \psi _{g} , correspond to a greater impact on network performance degradation. The number, class and severity of routing attacks could vary with time; as a consequence, the attack impact on network performance also varies with time.

The set of L local performance metrics (e.g., point to point delay, link throughput, etc.) that node v_{i} could monitor to detect routing attacks is defined as \mathcal {P} = \{ \pi _{a}: a = 1,2,\ldots,L\} . Each routing attack, \omega _{g} \in \Omega _{\mathcal {A}} , affects at least one local performance metric, \pi _{a} \in \mathcal {P} , and each \pi _{a} \in \mathcal {P} could be affected by one or more routing attacks. It is worth mentioning that network performance degradation does not always correspond to an attack condition. The local performance metric \pi _{a} could be degraded by the g -th routing attack, \omega _{g} , but it could also be affected by different phenomena inherent to the reconfigurable nature of the network, such as node mobility, sleeping nodes, interference, among other causes.

Node v_{i} can use additional information, metrics obtained from incoming packets and the neighbor links, so that IDS_{ij} can detect each routing attack, \omega _{g}\in \Omega _{\mathcal {A}} . Some of these metrics are sensitive to degradation of network performance due to an attack, and others due to network phenomena different from the routing attacks. From the computational point of view, these metrics can be thought of as features used to detect routing attacks. The set of metrics or features locally available for each node v_{i}\in \mathcal {V}_{\tau } is defined as, X = X_{\mathcal {A}} \cup X_{\mathcal {N}} . This set of local features, X , is composed by the union of two subsets. Subset X_{\mathcal {A}} , composed of a total of A different features where performance metric, \pi _{a} , is sensitive to degradation due to an attack, X_{\mathcal {A}} = \{\chi _{\mathcal {A}b}: b = 1,2,\ldots,A\} ; and subset X_{\mathcal {N}} , composed of N auxiliary features used to discard other causes for degradation of \pi _{a} , X_{\mathcal {N}} = \{\chi _{\mathcal {N}c}: c = 1,2,\ldots,N\} .

In order to detect the g -th routing attack, \omega _{g} , each node v_{i} could be monitoring the a -th local performance metric, \pi _{a} , some of the features sensitive to routing attacks from set X_{\mathcal {A}} , and some of the features non sensitive to routing attacks from the set X_{\mathcal {N}} . The subset of \lambda _{a} features sensitive to routing attacks, X_{\mathcal {A}g}\subset X_{\mathcal {A}} , used to detect the g -th routing attack is defined as, X_{\mathcal {A}g} = \{\chi _{\mathcal {A}b} \in X_{\mathcal {A}}: b = 1,2,\ldots,\lambda _{a}\} , |X_{\mathcal {A}g}|=\lambda _{a}\leq A . The subset of \lambda _{n} auxiliary features used to detect the g -th routing attack, X_{\mathcal {N}g}\subset X_{\mathcal {N}} , is defined as, X_{\mathcal {N}g} = \{\chi _{\mathcal {N}c} \in X_{\mathcal {N}}: c = 1,2,\ldots,\lambda _{n}\} , |X_{\mathcal {N}g}|=\lambda _{n}\leq N . In the following paragraphs, we introduce an example using this notation.

1) An Example

Consider that as part of IDS_{ij} , node v_{i} wants to detect the g -th routing attack, \omega _{g} . For this particular example, let \omega _{g} be a selective forwarding attack. In a selective forwarding attack, the malicious node discards a fraction of the received data packets with a probability p_{D} , instead of forwarding them to their destination. We can then define the attack severity, \psi _{g} , in terms of the probability p_{D} . This means that the minimum attack severity corresponds to the case where the attacker does not discard any packet, \psi _{g}^{\min }=0 ; and the maximum attack severity occurs when the malicious node discards all the data packets, \psi _{g}^{\max }=1 . Therefore, attack severity is defined as, \psi _{g}=p_{D} \in [{0, 1}] .
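The severity definition in this example can be illustrated with a small simulation. The sketch below assumes that each packet is dropped independently with probability p_{D} , which the text does not state explicitly; the function name `forward_packets` is hypothetical.

```python
import random


def forward_packets(n_packets: int, p_drop: float, rng: random.Random) -> int:
    """Return how many of n_packets a node forwards.

    Assumption for illustration: each packet is dropped independently with
    probability p_drop, which plays the role of the attack severity psi_g.
    """
    if not 0.0 <= p_drop <= 1.0:
        raise ValueError("p_drop must lie in [0, 1]")
    # rng.random() is in [0, 1), so p_drop = 0 forwards every packet
    # and p_drop = 1 forwards none.
    return sum(1 for _ in range(n_packets) if rng.random() >= p_drop)
```

With `p_drop = 0.0` the node behaves like a legitimate forwarder, and with `p_drop = 1.0` it discards every data packet, matching the two ends of the severity interval.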

The selective forwarding attack, \omega _{g} , will have an impact on the network throughput. Node v_{i} could measure the link’s l_{ji} throughput, as local performance metric \pi _{a} , sensitive to \omega _{g} . Note that l_{ji} throughput can be degraded because node v_{j} maliciously drops data packets or by some other cause such as frame collision due to channel congestion. Node v_{i} could use the number of packets received from v_{j} to detect the attack as \chi _{\mathcal {A}1} . A low number of received packets from v_{j} could be an indication of packets being dropped. In this particular case, X_{\mathcal {A}g} = \{\chi _{\mathcal {A}1}\} , \lambda _{a} =1 . Other features like the number of collided frames, \chi _{\mathcal {N}1} , could be used to discard frame collision due to channel congestion; the number of routes involving node v_{j} , \chi _{\mathcal {N}2} , could be an indication of alternative routes, X_{\mathcal {N}g}=\{\chi _{\mathcal {N}1},\chi _{\mathcal {N}2}\} , \lambda _{n} =2 .

Note that as long as link l_{ji} is active, node v_{i} will be able to take measurements on each performance metric and feature of interest, \pi _{a} \in \mathcal {P} , \chi _{\mathcal {A}b} \in X_{\mathcal {A}g} , \chi _{\mathcal {N}c}\in X_{\mathcal {N}g} , at any given instant \tau . If those measurements are taken at regular intervals with a fixed sampling period, T , we will get a collection of measurements at each instant \tau = kT, k=0,1,\ldots . Those collections of indexed measurements constitute a time series for the performance metrics and features of interest, \pi _{a}(kT) , \chi _{\mathcal {A}b}(kT) , \chi _{\mathcal {N}c}(kT) . To simplify notation, we will omit the indication of the sampling period, T , when referring to time series, \pi _{a}(k) , \chi _{\mathcal {A}b}(k) , \chi _{\mathcal {N}c}(k) .
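One way to hold these time series on a node is a fixed-length buffer per series, updated once per sampling instant k . The sketch below is an assumption-laden illustration: the class `MetricWindow`, the series names and the window length `d` are hypothetical choices, not part of the paper.

```python
from collections import deque
from typing import Deque, Dict, Iterable, List


class MetricWindow:
    """Hold the most recent d samples of each monitored time series."""

    def __init__(self, names: Iterable[str], d: int) -> None:
        self.d = d
        self.series: Dict[str, Deque[float]] = {
            name: deque(maxlen=d) for name in names
        }

    def record(self, sample: Dict[str, float]) -> None:
        """Append the measurements taken at one sampling instant k."""
        for name, buffer in self.series.items():
            buffer.append(sample[name])

    def is_full(self) -> bool:
        """True once every series holds d samples."""
        return all(len(buffer) == self.d for buffer in self.series.values())

    def values(self, name: str) -> List[float]:
        """Return the buffered samples of one series, oldest first."""
        return list(self.series[name])


# Hypothetical series names for the selective forwarding example.
window = MetricWindow(["pi_a", "chi_A1", "chi_N1", "chi_N2"], d=3)
for k in range(5):
    window.record(
        {"pi_a": 10.0 - k, "chi_A1": 8.0 - k, "chi_N1": 0.0, "chi_N2": 2.0}
    )
```

Because each `deque` has a maximum length, older samples are discarded automatically as new ones arrive.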

So far, we have defined reconfigurable networks, routing attacks, the set of local performance metrics affected by those attacks and the set of features to perform intrusion detection. The rest of this section introduces the path to achieve our goal of finding a routing attack-sensitive system representation of a node, v_{i} , for each one of its neighbors, v_{j} \in \mathcal {N}_{i} .

B. A Black Box Approach

As a first approach for IDS_{ij} , we consider a system identification technique. We will focus on detecting the g -th routing attack, \omega _{g} \in \Omega _{\mathcal {A}} ; this technique could later be extended to the remaining routing attacks.

Node v_{i} will be considered as a DT-LTI system. Note that this assumption of linearity and time invariance is only valid for a small time window around a given instant, \tau , due to the highly dynamic nature of network topology and traffic. The system inputs are defined as a vector, \boldsymbol {\chi }(k) , composed by the time series \chi _{\mathcal {A}b}(k) , \chi _{\mathcal {N}c}(k) ; \boldsymbol {\chi }(k) = [\chi _{\mathcal {A}1}(k) {\dots }\chi _{\mathcal {A}\lambda _{a}}(k) \chi _{\mathcal {N}1}(k) {\dots }\chi _{\mathcal {N}\lambda _{n}}(k)]^{\intercal } . System output is defined in terms of the a -th performance time series as, \boldsymbol {y}_{a}(k) = \pi _{a}(k) . Figure 2 shows the system representation that would be used to detect the g -th routing attack, \omega _{g} . The dynamic behavior of this system representation will be estimated by the black box method.

FIGURE 2. Proposed system model for the black box based IDS_{ij} using the time vector \boldsymbol {\chi }(k) as system input and \pi _{a}(k) as system output.

After defining the system inputs and outputs, the next step consists in modeling the dynamical behavior of the chosen system in discrete time. Parametric modeling relies on a previously known model structure. This model structure can come from discretization of a set of differential equations modeling the physical principles of the system of interest. Since for a node v_{i} , we do not have such knowledge of physical principles, we consider polynomial model structures, commonly used to describe input-output relationships of black box systems. Black box systems are those systems whose dynamic behaviour is unknown. Several polynomial models, such as Box-Jenkins (BJ) or Auto Regressive with eXogenous input (ARX) allow us to model system and perturbations dynamics from input-output data, [34]. Because in this particular case we are only interested in system dynamics and not in modeling system perturbations, we will start from a simpler model that does not consider perturbations, the Output Error (OE) model structure, defined as, \begin{equation*} \boldsymbol {y}(k)=\sum _{e=1}^{p}\boldsymbol {A}^{(e)}\boldsymbol {y}(k-e)+\sum _{f=0}^{q}\boldsymbol {B}^{(f)}\boldsymbol {\chi }(k-f)+\boldsymbol {\varepsilon }(k),\tag{1}\end{equation*}

where \boldsymbol {\chi }(k)\in \mathbb {R}^{n} is the input signal vector, \boldsymbol {y}(k)\in \mathbb {R}^{m} is the output signal vector, \boldsymbol {A}^{(e)}\in \mathbb {R}^{m\times m} and \boldsymbol {B}^{(f)}\in \mathbb {R}^{m\times n} are the model parameter matrices. Note that the OE model structure has the form of difference equations. With a previous knowledge of the number of system poles, p , and the number of zeros, q , for q \leq p , we can obtain a set of difference equations that describe the input-output dynamics of the system in the time domain. The OE model is linear in the parameters. In general, the number of parameters of the OE model is equal to mpm+mqn . Given the model with parameters \hat {\boldsymbol {\theta }}=\{\hat {\boldsymbol {A}}^{(e)},\hat {\boldsymbol {B}}^{(f)}\} , we can estimate the system output as follows, \begin{equation*} \hat {\boldsymbol {y}}(k|\hat {\boldsymbol {\theta }})=\sum _{e=1}^{p}\hat {\boldsymbol {A}}^{(e)}\boldsymbol {y}(k-e)+\sum _{f=0}^{q}\hat {\boldsymbol {B}}^{(f)}\boldsymbol {\chi }(k-f).\tag{2}\end{equation*}

The difference, \boldsymbol {\varepsilon }(k)\in \mathbb {R}^{m} , between the estimated output \hat {\boldsymbol {y}}(k) in (2) and the actual output \boldsymbol {y}(k) in (1) is called the estimation error, \begin{equation*} \boldsymbol {\varepsilon }(k)=\boldsymbol {y}(k)-\hat {\boldsymbol {y}}(k|\hat {\boldsymbol {\theta }}).\tag{3}\end{equation*}


The set of parameters \hat {\boldsymbol {\theta }} should be chosen in such a way that they minimize the estimation error \boldsymbol {\varepsilon }(k) . Suppose we have a total of d delayed measurements of \boldsymbol {y}(k) and \boldsymbol {\chi }(k) . In order to estimate all model parameters, \hat {\boldsymbol {\theta }} , the number of measurements, d , must satisfy the condition d \geq mpm+mqn . Rearranging Equation (1), we obtain, \begin{equation*} \boldsymbol {y}=\boldsymbol {X}\boldsymbol {\theta },\tag{4}\end{equation*}

where, \boldsymbol {y}\in \mathbb {R}^{d} is a vector formed from the delayed measurements of \boldsymbol {y}(k) , \boldsymbol {X}\in \mathbb {R}^{d\times (mpm+mqn)} is a matrix formed from delayed versions of \boldsymbol {y}(k-e) and \boldsymbol {\chi }(k-f) , and \boldsymbol {\theta } are the model parameters, \hat {\boldsymbol {\theta }}\in \mathbb {R}^{(mpm+mqn)} , arranged as a vector. A method to estimate the optimal values of parameters \hat {\boldsymbol {\theta }} is least squares regression, with \begin{equation*} \hat {\boldsymbol {\theta }}=(\boldsymbol {X}^{\intercal }\boldsymbol {X})^{-1}\boldsymbol {X}^{\intercal }\boldsymbol {y}.\tag{5}\end{equation*}
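For the single-input, single-output case ( m=n=1 ), the least squares fit can be sketched with NumPy as follows. This is a minimal illustration, not the authors' implementation: the function name `fit_difference_model` is hypothetical, and `numpy.linalg.lstsq` is used in place of the explicit inverse in (5) because it solves the same least squares problem without forming the inverse.

```python
import numpy as np


def fit_difference_model(y, chi, p, q):
    """Least squares fit of y(k) = sum_e a_e y(k-e) + sum_f b_f chi(k-f).

    Scalar (one input, one output) sketch of the fit described in the text.
    Returns (a, b) with a = [a_1, ..., a_p] and b = [b_0, ..., b_q].
    """
    y = np.asarray(y, dtype=float)
    chi = np.asarray(chi, dtype=float)
    start = max(p, q)  # first index with all delayed samples available
    rows = []
    for k in range(start, len(y)):
        past_y = [y[k - e] for e in range(1, p + 1)]
        past_chi = [chi[k - f] for f in range(0, q + 1)]
        rows.append(past_y + past_chi)
    X = np.array(rows)       # regression matrix of delayed samples
    target = y[start:]       # outputs to be explained by each row
    # Same minimizer as (X^T X)^{-1} X^T y when X has full column rank.
    theta, *_ = np.linalg.lstsq(X, target, rcond=None)
    return theta[:p], theta[p:]
```

On noise-free data generated by a known difference equation, the fit should recover the generating coefficients up to numerical precision, which is a convenient sanity check before applying it to measured series.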

Once we have a model in the time domain, we can obtain its frequency domain representation by using the \mathcal {Z} -transform and use the system pole locations to identify the regions on the z-plane corresponding to a node under a routing attack and the regions corresponding to a non-attack condition, given a set of labeled data. After having identified the decision regions that allow detection of a routing attack, we can implement them on an individual node. To detect routing attacks in real time, each node, v_{i}\in \mathcal {V}_{\tau } , will be continuously fitting the OE model, given the previous d measurements of the time series in a sliding time window fashion. Then, v_{i} needs to estimate the instantaneous system poles in the z domain, and compare those pole locations to the obtained decision regions, for each attack, \omega _{g}\in \Omega _{\mathcal {A}} , and each neighboring node, v_{j} \in \mathcal {N}_{i} . The training and online detection stages of the black box method are represented in Fig. 3.
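For a scalar model, the pole computation can be sketched as below. Taking the \mathcal {Z} -transform of y(k)=\sum _{e} a_{e}y(k-e)+\ldots gives the denominator polynomial z^{p}-a_{1}z^{p-1}-\ldots -a_{p} , whose roots are the poles. The decision rule in `in_attack_region` is purely a placeholder assumption (a disk around the origin); the actual decision regions come from the training stage and are not specified at this point in the text.

```python
import numpy as np


def model_poles(a):
    """Poles of y(k) = sum_e a_e y(k-e) + (input terms).

    They are the roots of z^p - a_1 z^(p-1) - ... - a_p.
    """
    coeffs = np.concatenate(([1.0], -np.asarray(a, dtype=float)))
    return np.roots(coeffs)


def in_attack_region(poles, radius_threshold):
    """Hypothetical decision rule, for illustration only.

    Flags an attack when any pole lies farther from the origin than
    radius_threshold. Real decision regions would be learned from
    labeled data during training.
    """
    return bool(np.any(np.abs(poles) > radius_threshold))


poles = model_poles([0.5, -0.2])  # roots of z^2 - 0.5 z + 0.2
```

In online use, a node would refit the model on each new window of d samples, recompute the poles, and evaluate the learned regions in place of the placeholder rule.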

FIGURE 3. Black box based IDS_{ij} . During the training phase we obtain the decision regions to detect the g -th routing attack, \omega _{g} . For online intrusion detection, we obtain the instantaneous system poles and compare them to the decision regions.

C. A Root Locus Approach

In this subsection we will take a different approach to achieve our goal. We start from the desired behavior of the instantaneous system poles on the z-plane. Similarly to the previous case, we will focus on the g -th routing attack and we will consider each node v_{i} as a DT-LTI system for a small time window around a given instant, \tau . We need to define the proper input and output signals that lead to a system representation whose poles tend to be near the origin of the z-plane in absence of the attack, z_{\mathcal {N}}^{min} = 0 . Additionally, in the presence of an attack, \omega _{g} , we want the system poles to move closer to an arbitrary location in the z-plane, z_{\mathcal {A}}^{max} = r\cos \theta + jr\sin \theta , as attack severity, \psi _{g} , increases. Do not confuse angle \theta with the parameter vector \boldsymbol {\theta } in (4). Given the fact that complex poles have conjugates, we will have additional poles at \overline {z}_{\mathcal {N}}^{min} = 0 , and \overline {z}_{\mathcal {A}}^{max} = r\cos \theta - jr\sin \theta . This implies that the system representation that we are looking for is of second order. Figure 4 (a) shows the desired system representation of v_{i} to be sensitive to the g -th routing attack, \omega _{g} . We need to find the system input and output signals so the system poles on the z-plane follow the trajectories shown in Fig. 4 (b) as attack severity, \psi _{g} , increases. Later in this section, we will cover an optimization strategy to obtain the value of parameter r .
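As a worked step using only the symbols r and \theta defined above, multiplying out the factors of the conjugate pair reached at maximum attack severity gives the second-order polynomial whose roots are z_{\mathcal {A}}^{max} and \overline {z}_{\mathcal {A}}^{max} , \begin{equation*} (z-z_{\mathcal {A}}^{max})(z-\overline {z}_{\mathcal {A}}^{max}) = z^{2} - 2r\cos \theta \, z + r^{2}, \end{equation*} while the double pole at the origin for the no-attack condition corresponds to the polynomial z^{2} .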

FIGURE 4. (a) Proposed root locus based IDS_{ij} ; proper system inputs and outputs must be found. (b) Desired pole trajectories as \psi _{g} increases.

In order for IDS_{ij} to be sensitive to the g -th routing attack, \omega _{g} , and not to other factors, we need to distinguish the time series that are sensitive to routing attacks, \chi _{\mathcal {A}b}(k) , from the time series that are not, \chi _{\mathcal {N}c}(k) . This distinction can be achieved by stating our problem in the time domain as a multivariate linear regression problem, \begin{equation*} \pi _{a}(k) = \sum _{b=1}^{\lambda _{a}}\alpha _{b}\chi _{\mathcal {A}b}(k) + \sum _{c=1}^{\lambda _{n}}\beta _{c}\chi _{\mathcal {N}c}(k) + \gamma (k).\tag{6}\end{equation*}


Now, assume that the desired system representation of v_{i} , used for IDS_{ij} , has a matrix of transfer functions, \boldsymbol {H}(z) , relating the system output to the system input signals, \boldsymbol {U}(z) , arranged as a vector, Y_{a}(z) = \boldsymbol {H}(z)\boldsymbol {U}(z) . This transfer function matrix could be thought of as, \boldsymbol {H}(z) = \left [{\frac {P_{1}(z)}{Q(z)} {\dots }\frac {P_{n}(z)}{Q(z)}}\right] . Note that the poles of \boldsymbol {H}(z) are defined by the roots of Q(z) ; therefore, we will focus on Q(z) . We want to find a polynomial Q(z) , whose roots follow the trajectories shown in Fig. 4 (b) when attack severity, \psi _{g} , increases. Root locus is a technique used in control theory for visualizing how the poles of a feedback system change as we vary the feedback gain parameter, \varphi , from zero to infinity. We will state the problem of finding Q(z) as a root locus problem by thinking about Q(z) as a feedback system, i.e., \begin{equation*} Q(z) = 1 + \varphi \frac {R(z)}{S(z)}.\tag{7}\end{equation*}


Suppose that the number of roots of R(z) is equal to the number of roots of S(z) . In that case, the poles of the feedback system, Q(z) , follow trajectories that go from the roots of S(z) to the roots of R(z) as the feedback gain \varphi varies from zero to infinity [35], [36].

For our system poles to have the desired behavior, S(z) must have two roots at the origin, \{z_{\mathcal {N}}^{min},\overline {z}_{\mathcal {N}}^{min}\} , and R(z) must have two roots at \{z_{\mathcal {A}}^{max},\overline {z}_{\mathcal {A}}^{max}\} , so \begin{align*} S(z)=&z^{2},\tag{8}\\ R(z)=&(z-r\cos \theta -jr\sin \theta)(z-r\cos \theta +jr\sin \theta) \\=&z^{2} - 2zr\cos \theta + r^{2}.\tag{9}\end{align*}


The roots of Q(z) must be sensitive to routing attacks. With that in mind, we define the feedback gain parameter, \varphi , in terms of the parameters \alpha _{b} from (6). The model in (6) should be defined in such a way that, under non attack conditions, the value of the b -th parameter, \alpha _{b} , is lower than its value when an attack is present in the network, and \alpha _{b} should increase as the attack severity, \psi _{g} , increases. We could define \varphi as the sum of the \alpha _{b} , i.e., \varphi =\sum _{b=1}^{\lambda _{a}}\alpha _{b} , so that an increment in the value of any \alpha _{b} corresponds to an increment in the feedback gain, \varphi . However, since the values of \alpha _{b} are not expected to range from zero to infinity, we introduce an extra parameter, \eta , that helps to increase the distance between the roots of Q(z) under a non attack condition, \{z_{\mathcal {N}},\overline {z}_{\mathcal {N}}\} , and the roots of Q(z) when the network is being attacked, \{z_{\mathcal {A}},\overline {z}_{\mathcal {A}}\} . Therefore, the feedback gain parameter is defined as \varphi = \sum _{b=1}^{\lambda _{a}}\alpha _{b}^{\eta } . Later in this section, we cover an optimization strategy to obtain the value of the parameter \eta .

Substituting R(z) , S(z) and \varphi in (7), we obtain the polynomial Q(z) that leads to the desired pole behavior of \boldsymbol {H}(z) , as \begin{align*} Q(z)=&1+\left ({\sum _{b=1}^{\lambda _{a}}\alpha _{b}^{\eta }}\right)\frac {(z^{2} - 2zr\cos \theta + r^{2})}{z^{2}} \\=&1 +\sum _{b=1}^{\lambda _{a}}\alpha _{b}^{\eta } - z^{-1}\left ({2r\cos \theta \sum _{b=1}^{\lambda _{a}}\alpha _{b}^{\eta }}\right) \\&+ z^{-2}\left ({r^{2}\sum _{b=1}^{\lambda _{a}}\alpha _{b}^{\eta }}\right).\tag{10}\end{align*}
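As an illustrative check of (10), the following minimal sketch (not part of the original work; the function name `q_poles` and the sample values of r and \theta are assumptions) computes the roots of z^{2}Q(z) = (1+\varphi)z^{2} - 2\varphi r\cos \theta \, z + \varphi r^{2} for increasing values of the gain \varphi . It only shows numerically that the roots leave the origin and approach z_{\mathcal {A}}^{max} as \varphi grows.

```python
import cmath
import math


def q_poles(phi, r, theta):
    """Roots of z^2 * Q(z) = (1 + phi) z^2 - 2 phi r cos(theta) z + phi r^2."""
    a = 1.0 + phi
    b = -2.0 * phi * r * math.cos(theta)
    c = phi * r * r
    disc = cmath.sqrt(b * b - 4.0 * a * c)  # negative argument for phi > 0
    return (-b + disc) / (2.0 * a), (-b - disc) / (2.0 * a)


if __name__ == "__main__":
    r, theta = 1.0, math.pi / 4  # illustrative values only
    for phi in (0.0, 0.1, 1.0, 10.0, 1e3):
        z1, _ = q_poles(phi, r, theta)
        print(f"phi={phi:g}: pole={z1:.4f}, |z|={abs(z1):.4f}")
```

For \varphi > 0 the discriminant equals -4\varphi r^{2}(1+\varphi \sin ^{2}\theta) , which is negative, so the two roots form a conjugate pair whose modulus is r\sqrt {\varphi /(1+\varphi)} ; this modulus grows monotonically from 0 toward r .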


The next step consists in defining the input signals, u_{\mathcal {A}b}(k) , u_{\mathcal {N}c}(k) , and output signal, y_{a}(k) , that define the system with the transfer function matrix \boldsymbol {H}(z)=\left [{\frac {P_{1}(z)}{Q(z)} \ldots \frac {P_{n}(z)}{Q(z)}}\right] . We have already defined Q(z) . The polynomials P_{1}(z), {\dots }, P_{n}(z) , are not fully defined at this point because they have no effect on the poles of \boldsymbol {H}(z) . We propose the following definitions for u_{\mathcal {A}b}(k) , u_{\mathcal {N}c}(k) , and y_{a}(k) , \begin{align*} u_{\mathcal {A}b}(k)=&\chi _{\mathcal {A}b}(k) + \alpha _{b}^{\eta -1}y_{a}(k) \\&-\, 2r\cos \theta \alpha _{b}^{\eta -1}y_{a}(k-1) \\&+\,r^{2}\alpha _{b}^{\eta -1}y_{a}(k-2),\tag{11}\\ u_{\mathcal {N}c}(k)=&\chi _{\mathcal {N}c}(k),\tag{12}\\ y_{a}(k)=&\pi _{a}(k) - \gamma (k),\tag{13}\end{align*}

where a=1,\ldots,L , b=1,\ldots,\lambda _{a} and c=1,\ldots,\lambda _{n} .

The proof that the proposed definitions for u_{\mathcal {A}b}(k) , u_{\mathcal {N}c}(k) , y_{a}(k) , lead to the desired dynamic behavior of \boldsymbol {H}(z) is presented in the Appendix.

Summarizing, we have defined input signals, u_{\mathcal {A}b}(k) , u_{\mathcal {N}c}(k) , and output signal, y_{a}(k) , of our system representation of a node v_{i} , with transfer function matrix, \boldsymbol {H}(z) . The poles on the z-plane of \boldsymbol {H}(z) move on the trajectories shown in Fig. 4 (b) as the attack severity, \psi _{g} , of the g -th routing attack, \omega _{g} , increases. Note that parameters r and \eta are not defined yet.

In order to define the parameters r and \eta , we analyze their respective effects on the poles of \boldsymbol {H}(z) . The parameter r limits the distance that the poles can travel on the z-plane: greater values of r allow the poles to move a greater distance from the origin. The pole movement is affected not only by r , but also by the variation in the values of the parameters \alpha _{b} when an attack with severity \psi _{g} is present in the network. The parameter \eta is used to increase the movement range of the poles inside the limits defined by the parameter r .

Given concrete values for r and \eta , we could define a probability density function (pdf) for the modulus of the pole locations when the network is not being attacked, |z_{\mathcal {N}}|=|\overline {z}_{\mathcal {N}}| , as f_{\mathcal {N}}(|z_{\mathcal {N}}|) . Similarly, we define the pdf for the modulus of the pole locations, |z_{\mathcal {A}}|=|\overline {z}_{\mathcal {A}}| , when there is an attack, \omega _{g} , with severity, \psi _{g} , as f_{\mathcal {A}}(|z_{\mathcal {A}}|) . Then, we could use decision theory to define a threshold, th_{g} , to detect the g -th routing attack, \omega _{g} . If the modulus of the poles of \boldsymbol {H}(z) is greater than the threshold, th_{g} , then IDS_{ij} decides that an attack \omega _{g} is being perpetrated by the j -th neighboring node, v_{j} , of node v_{i} .

The probability of error, P(\epsilon) , represents the probability that IDS_{ij} misclassifies an attack condition as a non attack condition and vice versa, given a decision threshold, th_{g} . Both the decision threshold, th_{g} , and the probability of error, P(\epsilon) , depend on the selected values of the parameters r and \eta . We state P(\epsilon) as a function of r and \eta , P(\epsilon)=f(r,\eta) . We then impose two restrictions. First, the expected value of the pole modulus in the absence of an attack is restricted to a small value, \zeta \approx 0 , i.e., E[|z_{\mathcal {N}}|]\leq \zeta . Second, the expected value of the pole modulus during an attack condition should be greater than \zeta and smaller than an arbitrary value, \xi , i.e., \zeta < E[|z_{\mathcal {A}}|]< \xi . Our final step is to select the values of r and \eta that minimize the probability of error, P(\epsilon) , \begin{align*}&\underset {r, \eta }{\text {minimize}} ~P(\epsilon) = f(r,\eta), \\&\text {subject to}~E[|z_{\mathcal {N}}|]\leq \zeta, \\&\hphantom {\text {subject to}~} \zeta < E[|z_{\mathcal {A}}|]< \xi.\tag{14}\end{align*}


After finding the optimal values of r and \eta , we can obtain the corresponding decision threshold, th_{g} , that will help us to decide if the network performance is being degraded by the g -th routing attack, \omega _{g} .
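One simple way to approach the constrained problem in (14) is an exhaustive search over a finite grid of candidate values. The sketch below is only an illustration under stated assumptions: `grid_search` and the callable `evaluate` are hypothetical names, and `evaluate(r, eta)` is assumed to return the estimated P(\epsilon) together with the two expected pole moduli obtained from training data.

```python
def grid_search(evaluate, r_values, eta_values, zeta, xi):
    """Exhaustive search for the (r, eta) pair minimising P(error) under (14).

    `evaluate(r, eta)` is a hypothetical user-supplied callable returning
    (p_error, mean_modulus_no_attack, mean_modulus_attack).
    Returns (best_r, best_eta, best_p_error), or None if no pair is feasible.
    """
    best = None
    for r in r_values:
        for eta in eta_values:
            p_err, mean_n, mean_a = evaluate(r, eta)
            # Constraints of (14): E[|z_N|] <= zeta and zeta < E[|z_A|] < xi.
            feasible = mean_n <= zeta and zeta < mean_a < xi
            if feasible and (best is None or p_err < best[2]):
                best = (r, eta, p_err)
    return best
```

Returning `None` when no candidate satisfies the restrictions makes it explicit that \zeta or \xi may have to be relaxed, or the grid enlarged, before a solution exists.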

The number of parameters of the model in (6) is equal to the number of input signals of the system, n=\lambda _{a}+\lambda _{n}\leq A+N . In order to detect routing attacks in real time, node v_{i} has to find the parameters, \alpha _{b} and \beta _{c} , given the previous d delayed measurements of the time series, \pi _{a}(k) , \chi _{\mathcal {A}b}(k) and \chi _{\mathcal {N}c}(k) . The number of delayed measurements, d , must be at least the number of parameters, d\geq n . As in the black box case, the parameter values can be obtained by the least squares method by rearranging the model in (6) in the form of (5). After getting the instantaneous values of the model parameters, node v_{i} obtains the instantaneous system poles by applying the \mathcal {Z} -transform to the model in (6) and compares the modulus of the obtained poles to the decision threshold, th_{g} . Each node, v_{i} \in \mathcal {V}_{\tau } , repeats this process for each neighboring node, v_{j} \in \mathcal {N}_{i} , and for each routing attack, \omega _{g} \in \Omega _{g} . We summarize the proposed root locus based IDS_{ij} to detect the g -th routing attack, \omega _{g} , in Fig. 5.

FIGURE 5. Root locus based IDS_{ij} . During the training stage we obtain the optimal parameters r , \eta , th_{g} , that minimize the classification error probability. For online detection we obtain the instantaneous system poles and compare them to the optimal threshold, th_{g} .

SECTION IV.

A Case Study

In this section we analyze a case of a routing attack in a sensor network, and we compare the two proposed IDS techniques and demonstrate their feasibility. We first analyze the effects of attack severity for a static network, and then we analyze the case where all the nodes in the network are moving for a given attack severity.

A. Scenario and Simulator Description

Simulations were performed to test the proposed IDS. Only one attack was considered in those simulations, the RREQ flooding attack. Note that this type of attack can be launched against any reactive routing protocol that relies on route request (RREQ) messages. In order to analyze the impact of attack severity on attack detection rates, the simulated scenarios were randomly initialized without varying any simulation parameter except the attack severity. Attack severity was different for each simulated scenario, \psi _{g}=\{0, 0.1, 0.3, 0.5, 0.7\} . For the mobility study, we fixed the attack severity value, \psi _{g} = 0.1 , and considered maximum node speeds from the set {2, 3, 4, 5} m/s. The total simulated time per simulation was 180 s with a simulation period of T = 0.05 s for both analyzed techniques. Time series for input and output signals were obtained for the same period T . In order to have both non attack and attack data, there was no attack in the network during the first 90 s of simulation; the RREQ flooding attack started after 90 s. Simulation parameters are summarized in Table 1.

TABLE 1 Simulation Parameters

In order to make a fair comparison, the same data were used for the analysis of both techniques, for the static and mobile cases, and the same system order was considered for all cases and models. For similar reasons, the same input and output signals were used for the black box and the root locus methods in order to construct the \omega _{g} detector in IDS_{ij} . The considered performance metric, \pi _{a} , affected by \omega _{g}= \text {RREQ} flooding, was the number of received header bits from l_{ji} during the last simulation period, T . We consider the header bits in the transport layer, network layer, data link layer and physical layer. Control packets were considered as header bits as well. Assume that the attacker node is v_{j} ; then the attack severity, \psi _{g} , is defined as v_{j} ’s RREQ bit rate normalized to the maximum channel capacity over all of v_{j} ’s links, \max (\mathcal {C}_{ji}:v_{i}\in \mathcal {N}_{j}) , where \mathcal {C}_{ji} is the channel capacity of l_{ji} . Channel capacity is affected at each simulation period by the noise floor and by interference due to transmissions of neighboring nodes; values of \mathcal {C}_{ji} were centered around 300 kbits/s. The minimum attack severity occurs when the attacker is not consuming any bandwidth with bogus RREQ messages, \psi _{g}^{\min }=0 . The maximum possible attack severity occurs when the attacker is using as much channel capacity as possible for the attack, \psi _{g}^{\max }< 1 . This is because the channel capacity, \mathcal {C}_{ji} , is a fundamental limit; therefore \psi _{g}\in [0,1) . The set of features of interest sensitive to \omega _{g} is X_{\mathcal {A}g} = \{\chi _{\mathcal {A}1}\} , where \chi _{\mathcal {A}1} is the total number of bits received through l_{ji} during the last simulation period, T . No auxiliary features were considered to detect \omega _{g} , so X_{\mathcal {N}g} = \emptyset is the empty set.
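The definition of attack severity given above can be written as a short calculation. The sketch below is illustrative only: the function name `attack_severity` and the numeric rates in the usage note are assumptions, not values taken from the simulations.

```python
def attack_severity(rreq_bit_rate, link_capacities):
    """psi_g: attacker's RREQ bit rate normalised to the maximum channel
    capacity over the attacker's links (all rates in bits/s)."""
    c_max = max(link_capacities)
    if c_max <= 0:
        raise ValueError("channel capacity must be positive")
    return rreq_bit_rate / c_max
```

For example, with hypothetical link capacities of 280, 300 and 310 kbits/s and an RREQ rate of 31 kbits/s, `attack_severity(31e3, [280e3, 300e3, 310e3])` gives a severity of about 0.1.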

1) Simulator Description

The event-driven simulation tool used for this work was developed in the Python programming language. The simulator is composed of the modules represented as boxes in Fig. 6. The simulation module controls the traffic model and mobility model for each simulated node. It is also in charge of saving relevant network metrics (e.g., node positions, average link duration, network throughput) and simulating channel phenomena (e.g., propagation, noise, interference). Each simulated node is composed of transport layer, network layer, data link layer and physical layer modules. The transport layer receives the data traffic profile from the simulation module and controls a UDP or TCP session. The network layer is in charge of routing data packets following the selected routing protocol. The data link layer controls the packet queue and simulates the exponential backoff process and the frame collisions of the Medium Access Control protocol. The wireless link communicates with the wireless channel module to receive and broadcast information to neighboring nodes. The node performance metrics module collects relevant metrics from each layer of the communication stack at each simulation period. Those metrics are represented as time series in the IDS module, which implements a given intrusion detection system (black box or root locus).

FIGURE 6. Block diagram of the simulated node functionality and its relationship with the simulation scenario.

B. Black Box Method Results

1) Static Network Results

The results for the black box method in the static network case start by dividing in two the time series of \pi _{a}(k) and \chi _{\mathcal {A}1}(k) for each simulation. The time series were collected from a neighboring node of the attacker. The first 90 s of the time series correspond to non attack condition data, and the last 90 s correspond to the time series observed during the attack. After this data partition, we fitted the OE model in (1) to the time series, \pi _{a}(k) , \chi _{\mathcal {A}1}(k) , in a sliding window fashion considering the previous d delayed measurements. For this particular case, the OE model in (1) reduces to \begin{align*} \pi _{a}(k) = \sum _{e=1}^{2}A^{(e)}\pi _{a}(k-e) + \sum _{f=0}^{2}B^{(f)}\chi _{\mathcal {A}1}(k-f) + \varepsilon (k).\tag{15}\end{align*}


Given that we only have one input signal and one output signal, A^{(e)} and B^{(f)} are scalars relating the delayed samples of the input and output signals. After fitting the OE model, we take the \mathcal {Z} -transform of the obtained model and estimate the system poles. This process is repeated for all the collected data for each attack severity \psi _{g} . Fig. 7 (a) shows the poles obtained for different values of attack severity, \psi _{g} , when considering the previous d=15 delayed measurements of the time series for the OE model fitting. The blue poles represent the non routing attack condition and form three clusters. Note that as the attack severity increases, the poles tend to cluster farther from the origin. Similarly, Fig. 7 (b) shows the system poles for the different attack severity cases for a sliding window size d=50 . Note that as we increase d , the same behavior as in the previous case is observed, but the pole clusters tend to be less dispersed.
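To make the pole estimation step concrete: taking the \mathcal {Z} -transform of (15) gives a denominator 1 - A^{(1)}z^{-1} - A^{(2)}z^{-2} , so the poles are the roots of z^{2} - A^{(1)}z - A^{(2)} = 0 and do not depend on the B^{(f)} coefficients. The following minimal sketch assumes the two autoregressive coefficients have already been fitted; the function name `oe_poles` and the sample coefficients are illustrative assumptions.

```python
import cmath


def oe_poles(a1, a2):
    """Poles of the second-order model (15): roots of z^2 - a1*z - a2 = 0.

    a1 and a2 stand for the fitted scalars A^(1) and A^(2).
    """
    disc = cmath.sqrt(a1 * a1 + 4.0 * a2)
    return (a1 + disc) / 2.0, (a1 - disc) / 2.0


if __name__ == "__main__":
    # Illustrative coefficients only, not values from the simulations.
    p1, p2 = oe_poles(1.0, -0.5)
    print(p1, p2, abs(p1))
```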

FIGURE 7. (a) System poles obtained for the black box based \omega _{g} detector, d = 15. (b) System poles obtained for the black box based \omega _{g} detector, d = 50.

Decision regions for the \omega _{g} detector in IDS_{ij} were obtained by the Maximum Likelihood (ML) decision rule. A Gaussian mixture distribution, f_{\mathcal {N}}(Re(z_{\mathcal {N}}), Im(z_{\mathcal {N}})) , was fitted to the real and imaginary components of the poles for the three pole clusters of the non attack condition. Similarly, all the poles obtained for the different attack severity values, \psi _{g} = \{0.1,0.3,0.5,0.7\} , were considered together and a Gaussian mixture distribution, f_{\mathcal {A}}(Re(z_{\mathcal {A}}), Im(z_{\mathcal {A}})) , was fitted to the real and imaginary components of the poles corresponding to an attack condition, i.e., \begin{align*} f_{\mathcal {N}}(Re(z_{\mathcal {N}}), Im(z_{\mathcal {N}}))=&\sum _{w=1}^{3}\rho _{w}\mathcal {N}(\boldsymbol {\mu }_{w}, \boldsymbol {\Sigma }_{w}),\tag{16}\\ f_{\mathcal {A}}(Re(z_{\mathcal {A}}), Im(z_{\mathcal {A}}))=&\sum _{w=1}^{2}\rho _{w}\mathcal {N}(\boldsymbol {\mu }_{w}, \boldsymbol {\Sigma }_{w}),\tag{17}\end{align*}

where \boldsymbol {\mu }_{w} is the mean vector, \boldsymbol {\Sigma }_{w} is the covariance matrix of the w -th pole cluster and \rho _{w} is a normalization factor.
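The ML rule over the mixtures in (16) and (17) amounts to evaluating both densities at a pole and picking the larger one. The sketch below illustrates this for already fitted mixtures; the function names and the component values used in the usage note are assumptions, and fitting the mixtures themselves is outside its scope.

```python
import math


def gauss2d(x, y, mean, cov):
    """Bivariate normal density; cov = ((sxx, sxy), (sxy, syy))."""
    (sxx, sxy), (_, syy) = cov
    det = sxx * syy - sxy * sxy
    dx, dy = x - mean[0], y - mean[1]
    # Quadratic form d^T cov^{-1} d using the closed-form 2x2 inverse.
    quad = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))


def mixture_pdf(z, components):
    """components: iterable of (rho_w, mu_w, Sigma_w), as in (16) and (17)."""
    return sum(w * gauss2d(z.real, z.imag, m, c) for w, m, c in components)


def ml_decide(z, normal_mix, attack_mix):
    """True when the pole z is more likely under the attack mixture."""
    return mixture_pdf(z, attack_mix) > mixture_pdf(z, normal_mix)
```

As a usage example with made-up single-component mixtures, `ml_decide(0.45 + 0.5j, [(1.0, (0.0, 0.0), ((0.01, 0.0), (0.0, 0.01)))], [(1.0, (0.5, 0.5), ((0.01, 0.0), (0.0, 0.01)))])` flags the pole as an attack because it lies much closer to the attack cluster.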

Figure 8 (a) shows the decision regions obtained when the d = 15 previous measurements were considered for fitting the OE model. Note that the decision regions for the attack condition become narrower in Fig. 8 (b), where d = 50 . The same analysis was carried out for d = 5 and d = 100 .

FIGURE 8. (a) Decision regions obtained for the black box based \omega _{g} detector, d = 15. (b) Decision regions obtained for the black box based \omega _{g} detector, d = 50.

Table 2 shows the detection accuracy, defined as \text {DA}=100(1-P(\epsilon)) , the false positive rate (FP), defined as the percentage of non attack measurements misclassified as an attack condition, and the false negative rate (FN), defined as the percentage of attack measurements classified as a non attack condition. In general, the number of FN is greater than the number of FP because the pole locations for the attack condition are more spread out than the system poles when there is no attack in the network. Detection accuracy improves significantly as d increases: for d = 5 , DA = 61.07%; for d = 15 , DA = 87.57%; for d = 50 , DA = 99.47%; and for d = 100 , DA > 99.99%. Note that the column corresponding to d=1 is empty for the black box technique because the minimum number of delayed samples of the signals needed to fit the OE model is mpm + mqn . For this particular case, fitting the model requires d \geq 4 .
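The three metrics defined above can be computed from per-sample ground truth and detector decisions as in the following sketch. It is an illustration only: the function name is hypothetical, and P(\epsilon) is taken here as the fraction of all samples that are misclassified, which is an assumption about how the empirical error is counted.

```python
def detection_metrics(labels, decisions):
    """labels, decisions: equal-length sequences of booleans, True = attack.

    Returns (DA, FP, FN) in percent. P(error) is taken as the fraction of
    all samples that are misclassified (an assumption of this sketch).
    """
    pairs = list(zip(labels, decisions))
    n_normal = sum(1 for y, _ in pairs if not y)
    n_attack = sum(1 for y, _ in pairs if y)
    fp = sum(1 for y, d in pairs if not y and d)  # normal flagged as attack
    fn = sum(1 for y, d in pairs if y and not d)  # attack missed
    da = 100.0 * (1.0 - (fp + fn) / len(pairs))
    fp_rate = 100.0 * fp / n_normal if n_normal else 0.0
    fn_rate = 100.0 * fn / n_attack if n_attack else 0.0
    return da, fp_rate, fn_rate
```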

TABLE 2 Empirical Detection Accuracy (DA), False Positives (FP) and False Negatives (FN) for the Black Box (BB) and Root Locus (RL) Based IDS Techniques

2) Mobility Case Results

The results of attack detection for the mobility case were obtained following the same methodology as for the static network case. We fixed the attack severity, \psi _{g} = 0.1 , and varied the maximum node speed. For each simulation, we obtained the data series, \pi _{a}(k), \chi _{\mathcal {A}1}(k) , from a neighboring node of the attacker at the moment of the attack. This allows us to analyze attack and attack free data to obtain the attack and non attack clusters of poles. We selected a time window d = 50 to analyze the data because it was the smallest value of the parameter d for which we obtained DA > 99% for the static network. The obtained pole clusters behave similarly to those of the static network case, but detection accuracy deteriorates drastically with speed, as can be seen from Table 3. For a maximum node speed of 2 m/s, DA = 69.86%; for a speed of 3 m/s, DA = 59.48%; and the trend continues. This reduction in detection performance occurs because the system poles tend to be more dispersed for a short period of time during new link establishments or link losses, which increases the number of misclassified poles. It is worth mentioning that increasing the window length, d , has little effect on detection accuracy. As a possible solution to this problem, we restart the time series, \pi _{a}(k) , \chi _{\mathcal {A}1}(k) , every time a link is lost or created. With this strategy we were able to obtain detection performance metrics similar to those of the static network case: for a speed of 2 m/s, DA = 97.30%; for 3 m/s, DA = 92.02%; for 4 m/s, DA = 91.98%; and for 5 m/s, DA = 89.25%. Note from Table 3 that the average link duration (ALD), measured in simulation periods, is long enough to detect a potential attacker even in high mobility scenarios. For a maximum node speed of 5 m/s, ALD = 154, which is considerably greater than d = 50 . This opens the possibility of using the black box intrusion detection technique in scenarios with greater node speeds and shorter link lifetimes.
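One possible reading of the restart strategy described above is a per-link sliding window that is cleared whenever the link is created or lost, so that model fitting only resumes once d fresh samples are available. The class below is a hypothetical sketch of that bookkeeping, not the simulator's actual implementation; the name `LinkWindow` and its methods are assumptions.

```python
from collections import deque


class LinkWindow:
    """Sliding window of the last d samples of (pi_a, chi_A1) for one link."""

    def __init__(self, d):
        self.d = d
        self.samples = deque(maxlen=d)  # oldest sample is dropped automatically

    def reset(self):
        """Discard the history, e.g. when the link is created or lost."""
        self.samples.clear()

    def push(self, pi_a, chi_a1):
        """Store the measurements of the latest simulation period."""
        self.samples.append((pi_a, chi_a1))

    def ready(self):
        """True once d samples have accumulated since the last reset."""
        return len(self.samples) == self.d
```

With d = 50 and an average link duration of 154 simulation periods, such a window would be full for roughly two thirds of a typical link lifetime.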

TABLE 3 Empirical Detection Accuracy (DA), False Positives (FP), False Negatives (FN) and Average Link Duration (ALD) for the Black Box (BB) and Root Locus (RL) Based IDS Techniques

C. Root Locus Method Results

1) Static Network Results

For the root locus based detector in IDS_{ij} , Equation (6) reduces to \begin{equation*} \pi _{a}(k) = \alpha _{1}\chi _{\mathcal {A}1}(k),\tag{18}\end{equation*}

where \alpha _{1} is a scalar that represents the proportion of header bits relative to the total number of bits received from link l_{ji} . As in the black box case, for the root locus approach we consider the header bits in the transport layer, the network layer, the data link layer and the physical layer; control packets were considered as header bits as well.

The input and output signals were defined as \begin{align*} u_{\mathcal {A}1}(k)=&\alpha _{1}^{\eta -1}y_{a}(k) - 2r\cos \theta \alpha _{1}^{\eta -1}y_{a}(k-1) \\&+\, r^{2}\alpha _{1}^{\eta -1}y_{a}(k-2) + \chi _{\mathcal {A}1}(k),\tag{19}\\ y_{a}(k)=&\pi _{a}(k).\tag{20}\end{align*}
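For this single-feature case, the online decision can be sketched in a few lines. The sketch assumes that \alpha _{1} is estimated by least squares on (18) over the last d samples and that the poles are the roots of (10) with \varphi = \alpha _{1}^{\eta } ; for \varphi > 0 those roots are a conjugate pair with product \varphi r^{2}/(1+\varphi) , so their modulus is r\sqrt {\varphi /(1+\varphi)} and does not depend on \theta . Function names and the sample traffic values in the usage note are hypothetical.

```python
import math


def estimate_alpha(pi_a, chi_a1):
    """Least-squares slope of pi_a(k) = alpha_1 * chi_A1(k) over a window."""
    den = sum(x * x for x in chi_a1)
    if den == 0.0:
        return 0.0  # no traffic observed in the window
    return sum(x * y for x, y in zip(chi_a1, pi_a)) / den


def pole_modulus(alpha, eta, r):
    """|z| of the roots of (10) with phi = alpha**eta (single feature)."""
    phi = alpha ** eta
    return r * math.sqrt(phi / (1.0 + phi))


def is_attack(pi_a, chi_a1, eta, r, th_g):
    """Compare the instantaneous pole modulus with the threshold th_g."""
    return pole_modulus(estimate_alpha(pi_a, chi_a1), eta, r) > th_g
```

For instance, a window in which 20% of the received bits are header bits, such as `is_attack([200, 400, 300], [1000, 2000, 1500], eta=6, r=1.0, th_g=0.02018)`, yields a pole modulus of about 0.008 and is not flagged, whereas a header proportion of 90% yields a modulus of about 0.59 and is flagged.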


In order to obtain the optimal parameters, \eta and r , that minimize the probability of classification error, P(\epsilon) , we computed P(\epsilon) for different values of \eta , \eta =1,2,\ldots,15 , and r , r=1,2,\ldots,15 .

For each pair of values \eta , r , we fitted the model in (18) given the previous d delayed samples of the time series, u_{\mathcal {A}1}(k) , y_{a}(k) . We obtained the corresponding pole locations for each \eta , r pair, and then fitted a Gaussian distribution to the modulus of the poles for the non attack condition, f_{\mathcal {N}}(|z_{\mathcal {N}}|) . For the attack condition we fitted a Gaussian distribution, f_{\mathcal {A}}(|z_{\mathcal {A}}|) , considering \psi _{g}=0.1 , because the system poles for \psi _{g}=0.1 are closest to the non attack poles z_{\mathcal {N}} and therefore the greatest classification error is obtained for the smallest considered value of \psi _{g} . With the pole distributions, f_{\mathcal {N}}(|z_{\mathcal {N}}|) , f_{\mathcal {A}}(|z_{\mathcal {A}}|) , we obtained the corresponding P(\epsilon) . At this point, we have a value of P(\epsilon) for each \eta , r pair. We obtained the optimal values, \eta =6 , r=1 , by solving the optimization problem in (14) by exhaustive search. To solve for the optimal parameters \eta and r , we set the restriction parameter \zeta =0.02 because it was the smallest value for which we were able to find a solution. Given that for the black box method the expected values of the pole locations were inside the unit circle, we selected the parameter \xi =1 to make a fair comparison of both techniques.
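The step from two fitted Gaussians to a threshold and an error probability can be illustrated as follows. This is a sketch under explicit assumptions, not the procedure used to produce the reported numbers: it assumes equal prior probabilities for the two conditions, assumes the attack mean is larger than the non attack mean, and finds the threshold by a simple scan between the two means. The function name and the sample moduli in the test values are made up.

```python
from statistics import NormalDist


def threshold_and_error(mods_normal, mods_attack, steps=1000):
    """Fit one Gaussian per class to pole moduli, then scan for the threshold
    between the two means that minimises P(error) under equal priors
    (equal priors are an assumption of this sketch)."""
    f_n = NormalDist.from_samples(mods_normal)
    f_a = NormalDist.from_samples(mods_attack)
    lo, hi = f_n.mean, f_a.mean
    best_th, best_err = lo, 1.0
    for i in range(steps + 1):
        th = lo + (hi - lo) * i / steps
        # False positive: non attack modulus above th.
        # False negative: attack modulus below th.
        err = 0.5 * (1.0 - f_n.cdf(th)) + 0.5 * f_a.cdf(th)
        if err < best_err:
            best_th, best_err = th, err
    return best_th, best_err
```

`NormalDist.from_samples` needs at least two samples per class, and `cdf` raises an error for a zero-variance fit, so degenerate training sets would need separate handling.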

Figure 9 (a) shows the system poles for the optimal values of r , \eta for the different attack severity values, \psi _{g} , and d = 15 . The case for d = 50 is shown in Fig. 9 (b). Note that there is a slight reduction in pole dispersion for the d = 50 case, compared to the d = 15 case, although not as noticeable as for the black box method.

FIGURE 9. (a) System poles obtained for the root locus based \omega _{g} detector, d = 15. (b) System poles obtained for the root locus based \omega _{g} detector, d = 50.

Figure 10 (a) shows the decision regions for the detector in IDS_{ij} for d = 15 ; the decision threshold is th_{g} = 0.02018 in this case. The case where d = 50 is illustrated in Fig. 10 (b), where the decision threshold is th_{g} = 0.01781 . In both cases, if the modulus of the poles is below the corresponding decision threshold, |z|< th_{g} , there is no attack coming from v_{j} . Note that this decision rule leads to a circular decision region for the non attack condition. Detection accuracy is defined in the same way as for the black box method, DA = 100(1-P(\epsilon)) , as are the false positive (FP) and false negative (FN) rates. The obtained detection accuracies for different attack severity values are presented in Table 2. For the root locus results with d = 5 , d = 15 , d = 50 and d = 100 we did not encounter any classification error in the simulated data, so the empirical detection accuracy is given as a lower bound assuming one error in the analyzed data set. Note that for the minimum time window length of d = 1 , the detection accuracy is DA = 99.78%, which indicates that the root locus method is robust to the amount of data.

FIGURE 10. (a) Decision regions obtained for the root locus based \omega _{g} detector, d = 15. (b) Decision regions obtained for the root locus based \omega _{g} detector, d = 50.

2) Mobility Case Results

As in the black box results for mobility, we analyzed the time series \pi _{a}(k) , \chi _{\mathcal {A}1}(k) , obtained from a neighboring node of the attacker at the time when the attack starts. We then obtained the pole clusters for the non attack and the attack conditions for a time window length d=1 . With the restriction values \zeta =0.02 , \xi =1 used for the static network case, we were not able to find values of the parameters \eta , r that satisfy the restrictions in (14). We therefore relaxed the restriction value to \xi =2 and enlarged the search space to \eta = 1,2,\ldots,50 , r=1,2,\ldots,50 , obtaining the optimal parameters \eta =39 , r=1 . Table 3 shows the detection performance metrics obtained for the mobility case. The relaxation of the restrictions and the larger search space allow us to improve detection accuracy, even for a minimal time window length d=1 , in comparison with the static network results. Note that enlarging the search space has a small impact on the computational cost of the training stage of the root locus detection method but has no impact on online detection.

D. Discussion

Thinking about a network node v_{i} as a linear system allows us to reason about causes (input signals) and effects (output signals) to detect routing attacks. The intuition gained from the obtained linear models could be useful not only to detect routing attacks, but also to provide valuable network insights from the local perspective of a node. The local network phenomena experienced by each node contribute to the network phenomena observed at a global scale.

As mentioned in Section III-C, and in (27) in the Appendix, all the terms of the transfer function matrix \boldsymbol {H}(z) share the same polynomial, Q(z) , in the denominator. This means that, independently of the number of input signals considered, the system poles are the same for each individual transfer function inside \boldsymbol {H}(z) , which implies that we always work on a two dimensional feature space, the z-plane, when performing routing attack detection. This two dimensional feature space allows us to visually represent the data regardless of the number of input signals. This two dimensional representation does not incur the information loss that could result from dimensionality reduction techniques such as Principal Component Analysis (PCA).

Although we obtained good detection accuracy for both proposed techniques, we next compare them from the implementability point of view to help decide which technique is more feasible.

1) On the Implementability of the Black Box Method

Although it seems natural to use black box system identification techniques to model the unknown dynamic behavior of a node, v_{i} , in practice we could encounter some implementability issues. The first issue is the computational cost of fitting the OE model. In general, the number of parameters of the model is mpm + mqn . For the particular case of the model shown in Fig. 2, we consider one output signal and an arbitrary number of input signals: \boldsymbol {y}(k)\in \mathbb {R}^{1} , so the number of output signals is m=1 , and \boldsymbol {\chi }(k)\in \mathbb {R}^{\lambda _{a}+\lambda _{n}} , so the number of input signals is n=\lambda _{a}+\lambda _{n}\leq A+N . The bound holds because the sets X_{\mathcal {A}g} and X_{\mathcal {N}g} could contain some or all of the features available locally at node v_{i} to detect the routing attack. For the case of interest in this paper, the number of OE model parameters reduces to p + qn , where p and q remain unknown. Note that the matrix \boldsymbol {X} in (4) has dimensions d \times (p + qn) . This implies that to obtain the model parameters, \hat {\boldsymbol {\theta }} , we need to find the inverse of the matrix \boldsymbol {X}^{\intercal }\boldsymbol {X} , whose dimensions are (p + qn) \times (p + qn) . As the number of input signals, n , and the system order increase, the computational cost of fitting the model increases, because p and q grow with the system order. In addition, after finding the parameters \hat {\boldsymbol {\theta }} , we need to find the \mathcal {Z} -transform of the difference equations of the OE model and then its poles. Finding the poles implies finding the roots of a degree p polynomial. If p>2 , in practice we need a numerical method to solve for those roots. Most of the computational cost of the black box method originates in the OE model fitting. The OE model could be efficiently fitted by using matrix factorization techniques such as LU decomposition or Cholesky decomposition. Despite the use of matrix factorization techniques, the computational cost of this approach could still be too high for a single low power node in a sensor network. This limits the range of application of the black box intrusion detection method to vehicular ad hoc networks, unmanned aerial ad hoc networks, or other ad hoc networks whose nodes have sufficient computing capabilities.
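The two costly steps described above, solving the least squares problem through a matrix factorization and then finding the roots of a degree p polynomial, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the synthetic regressor matrix, and the chosen sizes ( d = 50 , p = q = 2 , n = 3 ) are assumptions, and the sketch assumes that \boldsymbol {X}^{\intercal }\boldsymbol {X} is positive definite so that a Cholesky factor exists.

```python
import numpy as np

def fit_least_squares_cholesky(X, y):
    """Solve the normal equations (X^T X) theta = X^T y via a Cholesky factor."""
    G = X.T @ X                       # Gram matrix, (p + qn) x (p + qn)
    L = np.linalg.cholesky(G)         # G = L L^T; fails if G is not positive definite
    # np.linalg.solve is a general solver; a dedicated triangular solver
    # would exploit the structure of L and be cheaper.
    w = np.linalg.solve(L, X.T @ y)   # forward step:  L w = X^T y
    return np.linalg.solve(L.T, w)    # backward step: L^T theta = w

def poles_from_denominator(a):
    """Numerical roots of z^p + a[0] z^(p-1) + ... + a[p-1]."""
    return np.roots(np.concatenate(([1.0], a)))

# Synthetic data only (hypothetical sizes); no network measurements are used.
rng = np.random.default_rng(0)
d, p, q, n = 50, 2, 2, 3
X = rng.standard_normal((d, p + q * n))
theta_true = rng.standard_normal(p + q * n)
y = X @ theta_true                    # noise-free output for illustration
theta_hat = fit_least_squares_cholesky(X, y)

# Example denominator z^2 - z + 0.5 (an arbitrary illustrative polynomial).
example_poles = poles_from_denominator(np.array([-1.0, 0.5]))
```

The factorization avoids forming the explicit inverse of \boldsymbol {X}^{\intercal }\boldsymbol {X} , but the Gram matrix still grows with p + qn , which is the cost the paragraph above refers to.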

A second issue with the black box approach is that we have no prior knowledge of the right model order for a node. We could face under-fitting or over-fitting problems, so in general we need to try different values of p and q and decide which values lead to system representations that are sensitive to routing attacks.

Finally, we intuitively understand that the potential attacker node, v_{j} , behaves differently from any other node, v_{i} . But for those differences in dynamic behavior to be reflected by the system poles of IDS_{ij} on the z-plane, we need to take into account a considerable number of delayed measurements, d , for the OE model fitting. Evidence from simulations shows that for d = 50 , a detection accuracy of 99.47% could be obtained for a static network. Better results could be obtained with a larger number of delayed measurements of the input and output signals, at the expense of increasing the computational cost of the OE model fitting. For the mobile network, we need to restart the time series for the input and output signals. In most cases, the link lifetime is long enough to fit the OE model.

2) Comparison of the Root Locus and the Black Box Approaches

Note that the root locus approach has fewer parameters than the black box model for the same case of one output signal and an arbitrary number of input signals. There are n parameters, where n is the number of input signals, for the root locus method and p + qn parameters for the black box model. Since the black box technique needs p\geq 1 and q\geq 1 to model the dynamic behaviour of the system, p + qn > n always holds. A lower number of parameters in the model reduces the dimensions of the matrices used to estimate the parameter values by the least squares method. The dimensions of the matrix \boldsymbol {X}^{\intercal }\boldsymbol {X} are n \times n for the root locus approach and (p+qn)\times (p+qn) for the black box technique. This reduction in dimensions results in a significant reduction of the computational cost of finding the inverse matrix in (5) for estimating the model parameters. For the root locus approach, we know that \boldsymbol {H}(z) is of order two. With a second order system, finding the system poles only requires finding the roots of a second degree polynomial, which has a closed-form solution. This implies another potential computational cost reduction when compared to the black box method with p>2 , because we do not need a numerical solution to find the instantaneous system poles.
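To make the size difference concrete, the short sketch below counts the parameters of both models and estimates the relative cost of inverting \boldsymbol {X}^{\intercal }\boldsymbol {X} . The function names and the example values ( n = 4 , p = q = 3 ) are illustrative assumptions, and the cubic cost model is the standard estimate for dense matrix inversion rather than a figure taken from this paper.

```python
def parameter_counts(n, p, q):
    """Return (black box, root locus) parameter counts for one output signal."""
    black_box = p + q * n   # OE model with orders p and q and n inputs
    root_locus = n          # one parameter per input signal
    return black_box, root_locus

def relative_inversion_cost(n, p, q):
    """Rough cost ratio of inverting X^T X, assuming cubic cost in its dimension."""
    black_box, root_locus = parameter_counts(n, p, q)
    return (black_box / root_locus) ** 3

# Hypothetical example: four input signals and a third order black box model.
bb, rl = parameter_counts(n=4, p=3, q=3)
ratio = relative_inversion_cost(n=4, p=3, q=3)
print(f"black box: {bb}x{bb}, root locus: {rl}x{rl}, cost ratio ~ {ratio:.1f}")
```

Under these assumptions the black box Gram matrix is 15 \times 15 against 4 \times 4 for the root locus approach, so the gap widens quickly as p and q grow.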

Additionally, in (6), we made the system poles sensitive to routing attacks and not to other causes by distinguishing the time series of features sensitive to the g -th routing attack, \chi _{\mathcal {A}b}(k) , from the rest of the time series, \chi _{\mathcal {N}c}(k) . By properly selecting the features in the sets \chi _{\mathcal {A}g} , \chi _{\mathcal {N}g} , we separate the expected value of the modulus of the poles when there is no attack, E[|z_{\mathcal {N}}|] , from the expected value of the modulus of the poles caused by the g -th routing attack, E[|z_{\mathcal {A}}|] . Finally, we stated the problem of finding the proper parameters r and \eta in the root locus approach as a constrained optimization problem, which helped us find the input signal definitions that lead to the minimum decision error when IDS_{ij} detects \omega _{g} . This leads to a method that is robust to the amount of data used for the model fitting in (6), as can be observed from the detection accuracy results in Table 2 and Table 3. The less data is needed to fit the model, the more affordable the method is to implement, even in low power devices such as sensor nodes.

3) On the Attacker’s Position and Network Impact

An important aspect to consider for further research is the attacker’s position. Depending on the physical location of the malicious node, the impact of a routing attack on network performance could vary. These variations arise because nodes near the center of the network tend to have more neighbors than nodes close to the outskirts of the network, and therefore have access to a larger portion of data traffic. This phenomenon could also affect attack detection rates: the more severe the impact on network performance, the easier the attack is to detect.

SECTION V.

Conclusions

In this work, we proposed two different IDSs for routing in RWN, both based on the perspective of considering a network node as a linear system. This new perspective allows us to gain some intuitive understanding of the problem. Additionally, by using the system poles on the z-plane as the feature space for attack detection, we can represent all the relevant information in two dimensions. This two-dimensional feature space is guaranteed to be independent of the number of input and output signals considered as relevant network metrics for a given attack detection. Good detection accuracy was obtained for both attack detection techniques. For scenarios more elaborate than the simple case presented in Section IV, we need to consider additional inputs. The root locus approach is more robust to mobility and has a lower computational cost than the black box method, and is hence more feasible for low power devices. The black box technique could be implemented in nodes with sufficient computing capabilities, such as nodes in vehicular ad hoc networks and in unmanned aerial ad hoc networks. Determining how many input signals are appropriate, and which specific ones lead to better detection accuracy, remains an open challenge.

Appendix

System Inputs and Outputs Derivation

In this appendix, we show that a system such as the one in Fig. 4 (a), with input signals u_{\mathcal {A}b}(k) , u_{\mathcal {N}c}(k) and output signal y_{a}(k) , defined respectively as in (11), (12), and (13), will have a transfer function matrix \boldsymbol {H}(z)=\left [{\frac {P_{1}(z)}{Q(z)} {\dots }\frac {P_{n}(z)}{Q(z)}}\right] , where Q(z) is defined in (10).

We begin by substituting y_{a}(k) in (13), into (6) to get \begin{align*} \pi _{a}(k) - \gamma (k)=&\sum _{b=1}^{\lambda _{a}}\alpha _{b}\chi _{\mathcal {A}b}(k) + \sum _{c=1}^{\lambda _{n}}\beta _{c}\chi _{\mathcal {N}c}(k), \\ y_{a}(k)=&\sum _{b=1}^{\lambda _{a}}\alpha _{b}\chi _{\mathcal {A}b}(k) + \sum _{c=1}^{\lambda _{n}}\beta _{c}\chi _{\mathcal {N}c}(k).\tag{21}\end{align*}


We solve for \chi _{\mathcal {A}b}(k) and \chi _{\mathcal {N}c}(k) in (11) and (12), respectively, and substitute them in (21), to get \begin{align*} y_{a}(k)=&\sum _{b=1}^{\lambda _{a}}\alpha _{b}\Biggl [{u_{\mathcal {A}b}(k)-\alpha _{b}^{\eta -1}y_{a}(k) } \\&{ + 2r\alpha _{b}^{\eta -1}\cos \theta \, y_{a}(k-1)-r^{2}\alpha _{b}^{\eta -1}y_{a}(k-2)}\Biggr] \\&+ \sum _{c=1}^{\lambda _{n}}\beta _{c}u_{\mathcal {N}c}(k).\tag{22}\end{align*}


Grouping all the terms containing y_{a}(k) , y_{a}(k-1) and y_{a}(k-2) on the left hand side, and then obtaining the \mathcal {Z} -transform, we get \begin{align*}&\hspace {-.9pc}Y_{a}(z) \!+\! Y_{a}(z)\sum _{b=1}^{\lambda _{a}}\alpha _{b}^{\eta } -2z^{-1}Y_{a}(z)r\cos \theta \sum _{b=1}^{\lambda _{a}}\alpha _{b}^{\eta } \\&+\, z^{-2}Y_{a}(z)r^{2}\!\sum _{b=1}^{\lambda _{a}}\alpha _{b}^{\eta } \!=\! \sum _{b=1}^{\lambda _{a}}\alpha _{b}U_{\mathcal {A}b}(z) \!+\! \sum _{c=1}^{\lambda _{n}}\beta _{c}U_{\mathcal {N}c}(z),\tag{23}\end{align*}

factoring out Y_{a}(z) , we get \begin{align*}&\hspace {-.5pc}Y_{a}(z) \left [{ 1 + \sum _{b=1}^{\lambda _{a}}\alpha _{b}^{\eta } -z^{-1}\left({2r\cos \theta \sum _{b=1}^{\lambda _{a}}\alpha _{b}^{\eta }}\right) + z^{-2}\left({r^{2}\sum _{b=1}^{\lambda _{a}}\alpha _{b}^{\eta }}\right)}\right] \\& \qquad \qquad \qquad \qquad {= \sum _{b=1}^{\lambda _{a}}\alpha _{b}U_{\mathcal {A}b}(z) + \sum _{c=1}^{\lambda _{n}}\beta _{c}U_{\mathcal {N}c}(z).} \tag{24}\end{align*}

Note that the expression inside the brackets in (24) is equal to Q(z) , i.e., \begin{equation*} Y_{a}(z) [Q(z)] = \sum _{b=1}^{\lambda _{a}}\alpha _{b}U_{\mathcal {A}b}(z) + \sum _{c=1}^{\lambda _{n}}\beta _{c}U_{\mathcal {N}c}(z).\tag{25}\end{equation*}


Dividing both sides by Q(z) , we have \begin{equation*} Y_{a}(z) = \frac {\displaystyle \sum _{b=1}^{\lambda _{a}}\alpha _{b}U_{\mathcal {A}b}(z) + \sum _{c=1}^{\lambda _{n}}\beta _{c}U_{\mathcal {N}c}(z)}{Q(z)},\tag{26}\end{equation*}

which in matrix form is \begin{align*} Y_{a}(z)=&\boldsymbol {H}(z)\boldsymbol {U}(z) \\=&\left [{\dfrac {\alpha _{1}}{Q(z)} {\dots }\dfrac {\alpha _{\lambda _{a}}}{Q(z)}~\dfrac {\beta _{1}}{Q(z)} {\dots }\dfrac {\beta _{\lambda _{n}}}{Q(z)}}\right] \begin{bmatrix} U_{\mathcal {A}1}(z) \\ \vdots \\ U_{\mathcal {A}\lambda _{a}}(z) \\ U_{\mathcal {N}1}(z) \\ \vdots \\ U_{\mathcal {N}\lambda _{n}}(z) \end{bmatrix}.\tag{27}\end{align*}

The poles of \boldsymbol {H}(z) are the roots of the previously defined polynomial Q(z) ; therefore, the \omega _{g} detector in IDS_{ij} will have the dynamic behaviour shown in Fig. 4 (b).
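As an illustration of why a second order Q(z) needs no numerical root finder, the sketch below computes its two poles with the quadratic formula. It takes Q(z) to be the bracketed expression in (24) and multiplies it by z^{2} to obtain (1+S)z^{2} - 2rS\cos \theta \, z + r^{2}S , with S=\sum _{b}\alpha _{b}^{\eta } . The function name and the numeric values of \alpha _{b} , \eta , r and \theta are assumptions chosen only for the example, and positive \alpha _{b} are assumed so that S>0 .

```python
import cmath
import math

def q_poles(alphas, eta, r, theta):
    """Closed-form roots of (1 + S) z^2 - 2 r S cos(theta) z + r^2 S,
    where S = sum(alpha_b ** eta); assumes positive alpha_b."""
    S = sum(a ** eta for a in alphas)
    a2 = 1.0 + S
    a1 = -2.0 * r * math.cos(theta) * S
    a0 = r * r * S
    root = cmath.sqrt(a1 * a1 - 4.0 * a2 * a0)   # complex square root of the discriminant
    return (-a1 + root) / (2.0 * a2), (-a1 - root) / (2.0 * a2)

# Hypothetical values, not taken from the paper's case study.
alphas, eta, r, theta = [0.5, 0.5], 1.0, 0.9, math.pi / 4
z1, z2 = q_poles(alphas, eta, r, theta)
print(abs(z1), abs(z2))
```

For S>0 and r \neq 0 the discriminant equals -4r^{2}S(S\sin ^{2}\theta + 1) , which is negative, so the poles form a complex conjugate pair whose modulus is r\sqrt {S/(1+S)} . Under these assumptions the pole modulus depends on the inputs only through S .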
