Mitigating Traffic Remapping Attacks in Autonomous Multi-hop Wireless Networks

Multi-hop wireless networks with autonomous nodes are susceptible to selfish traffic remapping attacks (TRAs). Nodes launching TRAs leverage the underlying channel access function to receive unduly high quality of service (QoS) for packet flows traversing source-to-destination routes. TRAs are easy to execute, impossible to prevent, difficult to detect, and harmful to the QoS of honest nodes. Recognizing the need for providing QoS security, we use a novel network-oriented QoS metric to propose a self-enforcing game-theoretic mitigation approach. By switching between TRA and honest behavior, selfish nodes engage in a noncooperative multistage game in pursuit of high QoS. We analyze feasible node strategies and design a distributed signaling mechanism called DISTRESS, under which, given certain conditions, the game produces a desirable outcome: after an upper-bounded play time, honesty tends to become a selfish node's best-reply behavior, while yielding acceptable QoS to most or all nodes. We verify these findings by Monte Carlo and ns-3 simulations of static and mobile nodes.


I. INTRODUCTION
MULTIHOP wireless transmission is the underlying principle of various types of wireless networks: sensor, mesh, ad hoc, cellular (using cooperative user-to-base-station relays), vehicular, opportunistic (delay tolerant) and Internet of Things (IoT) segments. In this paper we focus on wireless networks with autonomous nodes, referred to as multi-hop autonomous wireless networks (MAWiNs), which are an active research field exploring the self-organizing network concept [1] and its numerous embodiments, such as autonomous mobile mesh networks [2], [3], autonomous vehicular networks [4], autonomous sensor networks [5], and flying ad hoc networks [6]. In large-scale IoT systems, multihop cooperative relaying has been observed to improve the throughput, reliability, and energy efficiency of the data exchange between end-user devices (e.g., smart meters) and gateway nodes (e.g., data aggregation points) [7]-[11].
Besides classical threats to wireless transmission, MAWiNs face unique security threats due to node autonomy. Sustained operation of such networks relies on the nodes' benevolent compliance with cooperative protocols, such as fair channel access, transit packet forwarding on behalf of out-of-range source and destination nodes, and route discovery. Nodal cooperation entails certain costs in terms of energy expenditure and quality of service (QoS) received by source (locally generated) traffic: fair channel access requires deferment of one's own packet transmission, which causes delays; transit packet forwarding consumes energy and bandwidth, which diminishes source traffic throughput; and route discovery introduces communication overhead and processing burden. Autonomous nodes thus tend to exhibit rational (selfish) behavior, seeking a favorable tradeoff between received QoS and incurred costs. This makes MAWiNs susceptible to selfish attacks that abuse the employed network protocols to the attacker's benefit.
Selfish attacks can be categorized as "supply-side" (attempts to reduce costs of performing network services for other nodes) and "demand-side" (attempts to acquire undue network resources through aggressive competition). The former are common at the Internet layer, e.g., refusing to forward transit packets saves energy and bandwidth for source traffic, and falsifying route advertisements may deactivate incident routes and thus relieve a node from transit traffic. In response, defense mechanisms need to be deployed, such as secure routing protocols [12], intrusion detection/prevention systems [13], credit-based schemes [14], and trust management frameworks [15].
Known "demand-side" selfish attacks range from the link layer to the transport layer and usually consist in manipulation of sensitive protocol parameters, e.g., the contention window of IEEE 802.11 [16] or congestion control settings of TCP [17]. They have been countered by a number of detection- or prevention-type mechanisms [18] and often investigated using game theory [19], e.g., for incentivizing selfish IoT devices to participate in cooperative communication [20], [21]. A less known variety of selfish attacks, called QoS abuse [22] or traffic remapping attack (TRA) [23], emerges in environments supporting traffic class-based QoS differentiation to enforce user-network QoS contracts such as Service Level Agreements (SLAs) [24]. By falsely assigning traffic to classes, an attacker node abuses provisioned QoS policies and can receive a higher QoS level for its source traffic at the cost of honest nodes' source traffic. MAWiNs are a perfect setting for TRAs, especially if they offer QoS differentiation as part of the network's mission rather than a metered service, which makes launching a TRA costless, whereas their inherent lack of node accountability reduces the risk of detection and punishment. Thus on top of classical problems with network resource deficiency and/or mismanagement, threats to contractual QoS can arise from QoS abuse and should be addressed by a new class of defense mechanisms that can be collectively termed QoS security. Their task is to protect information security, in the sense of enforcing QoS guaranteed to honest nodes, in the presence of QoS abuse.
In wireless networks using IEEE 802.11, TRAs can exploit the enhanced distributed channel access (EDCA) function. EDCA defines four access categories (ACs), each with its own parameters controlling the priority and duration of medium access [16].
Packets are mapped to ACs based on the Differentiated Services Code Point (DSCP) in their IP header, which reflects the traffic's Class of Service (CoS) [25]. For simplicity assume that DSCP can be either expedited forwarding (EF) or best effort (BE). The CoS-to-DSCP mapping is implemented by Internet layer packet mangling software, such as Linux iptables. TRAs can be easily executed by setting DSCP = EF in source traffic whose CoS maps to BE and DSCP = BE in forwarded (transit) traffic whose CoS maps to EF, cf. Fig. 1. In contrast, AC parameter modification requires tampering with the DSCP-to-AC mapping embedded in wireless card drivers. Furthermore, TRAs are difficult to detect: determining if higher-layer traffic matches its DSCP designation requires deep packet inspection [26] and global knowledge of the DSCP assignment policy.
In single-hop settings, TRAs have been shown to drastically reduce the throughput of honest nodes unless the latter employ a carefully designed MAC-layer discouragement scheme [26]. The multi-hop nature of MAWiNs poses a number of additional challenges for a defense scheme:
• a selfish node can both promote its source traffic and demote transit traffic,
• a locally performed TRA has an end-to-end impact: once assigned a false QoS designation, a packet retains it further down the route,
• the impact of TRA is unclear ex ante, due to the complex interplay of multiple layers: PHY (hidden nodes), MAC (channel contention), and transport (flow control); this interplay also blurs QoS perception and rules out straightforward detection-based countermeasures,
• the absence of single-broadcast hearability rules out simple punitive measures such as threats of jamming,
• heuristic end-to-end countermeasures against TRAs are moderately effective [23]; in particular, countermeasures that work well in single-hop settings, such as ACK dropping, fail in multi-hop settings [27].
Being rationally motivated, easy to execute, impossible to prevent, difficult to detect, and harmful to honest nodes, TRAs call for incentive-based defense. Unfortunately, the lack of a central authority rules out common approaches based on reputation building or Stackelberg games [7], [8], [15], hence a novel game-theoretic methodology is also needed. In [28], we presented an early formulation of the multihop TRA problem and preliminary insights into its impact upon the nodes' cost metric. A multistage game-type TRA mitigation scheme was proposed, whose convergence and alignment with nodes' rationality were only supported by numerical and Monte Carlo simulation arguments. Building on a more rigorous network model we offer herein a provably convergent, rational, and effective TRA mitigation scheme in the spirit of "brinkmanship game theory" [29]: credible threats force nodes to toggle between TRA and honest behavior, and so engage in a noncooperative game in pursuit of high QoS. Although our analytical results apply to networks with static nodes, simulations show the scheme is also effective with node mobility. Our main contributions are:
1) Based on a MAWiN model with a static topology and traffic flows, we formally define plausible opportunistic TRAs, and discuss their motivation and impact.
2) We develop a heuristic end-to-end QoS metric that only uses information about the network topology and traffic flows, and verify it by simulation and comparison with alternative heuristics.
3) We design a distributed DISTRESS mechanism to signal the threat of service suspension due to ongoing TRAs. DISTRESS requires little data analysis and inter-node synchronization, and need not distinguish between TRA and objectively harsh traffic conditions.
4) Using the developed QoS metric as a payoff function, we analyze the game arising among ill-behaved nodes under DISTRESS and state conditions of its desirable outcome.
We show that after an upper-bounded play time, honesty tends to become an ill-behaved node's best-reply behavior, keeping QoS acceptable to most or all nodes. These findings are verified by extensive Monte Carlo and time-true simulations of static and mobile nodes.
The remainder of the paper is organized as follows. In Section II we outline related work on QoS abuse in wireless networks and highlight the unsolved problems which justify our research, including the need for a macroscopic MAWiN network model. In Section III-A we formulate a topology and traffic flow model, next used in Section III-B to formalize the notion of TRA. In Section III-C we develop a MAWiN performance model and propose an end-to-end QoS metric; the latter is shown in Section III-D to yield quantitative insight into the motivation and impact of TRAs. In Sections IV-A and IV-B, respectively, we describe the one-shot TRA game arising among ill-behaved nodes, and propose a model of multistage play to discourage TRAs. "Good" multistage strategies are analyzed in Section IV-C and validated by simulations in Section V. Section VI concludes the paper.

II. RELATED WORK
Selfish attacks in MAWiNs have mostly been studied at the Internet layer. The main attack under consideration has been packet dropping, also referred to as forwarding/relaying misbehavior. This attack can be considered as launched either on all packets (full dropping) or only on selected packets (partial dropping). In the latter case the dropping can be either probabilistic or deterministic (e.g., may specifically target some packet types such as routing control packets, or some source-to-destination routes).
The packet dropping attack has been widely analyzed. Due to node autonomy and lack of any administrative control only "soft" countermeasures are possible. Some proposals involve micropayment (credit) schemes, where a virtual currency is earned for relaying and next used to buy similar services [14]. Others have focused on explicit identification of attackers. This can be done passively, e.g., through a watchdog mechanism where nodes promiscuously listen to the channel and observe offending behavior [30], or actively, e.g., using additional end-to-end acknowledgments to determine which routes contain packet dropping attackers [31]. Attacker identification can be enhanced through complex audit- and reputation-based schemes where a node derives a reputation score of any other node from first-hand (watchdog-based) experience, and possibly from reputation scores calculated by third-party nodes; low-reputation nodes are identified as attackers [32].
The main response to Internet layer attacks has been of a reciprocation nature, i.e., restricted forwarding of attackers' source packets. This gives rise to numerous analyses of the underlying forwarding game. Various strategies (e.g., tit-for-tat) have been considered to enforce honest packet forwarding; see [33] for a systematic treatment. A complementary response is to route traffic around attackers. However, this is in fact beneficial to them, as they can expend less energy and bandwidth on forwarding [34].
At the link layer, most attacks have found the IEEE 802.11 channel access function [16] an easy target. Numerous studies have shown that launching a backoff attack, i.e., changing the transmission deferment parameters (such as idle carrier sensing or backoff times), yields the attacker a considerable improvement in throughput and access delay at the cost of honest nodes [35], [36]. Link layer attacks have mostly been studied in a single-hop setting, which is not surprising given their local-scope nature.
QoS differentiation opens new vulnerabilities to attacks referred to in the literature as "QoS abuse" [37], [38] or "class hijacking" [39], [40]. In this area, researchers have also studied selfishness in forwarding multimedia streams [41], nodes misreporting channel request parameters [42], and various cooperative forwarding strategies [43]. Campus network designers have long foreseen that too much high-priority traffic may overwhelm the available bandwidth and/or switch capacity [22]; in infrastructure-based networks under administrative supervision an obvious solution is to allow traffic marking with CoS/DSCP only at the network edge, subject to valid traffic contracts, rather than at users' premises. However, QoS abuse, exemplified by TRAs, is much harder to defend against in ad hoc networks, which lack a well-defined user-to-network interface.
TRAs are tied to the link layer: despite being launched at the Internet layer (through modifying DSCP), they exploit the underlying channel access prioritization. Local-scope TRAs were considered in [26] and acknowledged as a threat to transmission opportunity sharing protocols [44]. Multihop settings are also vulnerable [45]; [46] discusses attacks in IEEE 802.11s mesh networks, [27] in two-hop relay networks, while [23] provides an overview of TRAs in ad hoc networks along with a discussion of attack detection and defense measures. Additionally, [47] studies a practical relaying scenario where users may execute TRAs.
From a detection viewpoint, TRAs are more challenging in multi-hop settings than in single-hop ones, because it is not always clear how local-scope manipulation of per-traffic class handling translates into end-to-end per-flow or per-packet performance. Link layer attacks such as TRAs consist in aggressive competition for a limited resource (the radio channel), bringing more benefit to the attacker and more harm to the honest nodes than do Internet layer attacks, where the benefit or harm is less pronounced. Single-hop settings are also easier to defend: when a traffic remapping attacker has been identified, it can be punished by neighbor honest nodes via responding in kind, e.g., increased transmission rate or jamming [48]. In a multi-hop setting, however, punishment of TRAs may prove ineffective if known local-scope defense mechanisms are directly mimicked. Thus studies of selfish link layer attacks in multi-hop wireless networks leave many insights to be gained.
In this paper we are interested in MAWiN-oriented defense mechanisms justifiable by noncooperative game-theoretic considerations, i.e., rendering TRAs non-beneficial for attackers in terms of perceived QoS. This requires simple performance models of MAWiNs under TRAs that yield closed-form solutions and thus handy payoff functions for arising games. Few such models are known, none of them able to capture on a macroscopic level the complex interplay of channel access queuing and contention, EDCA prioritization, node mobility, and intra-flow competition due to multi-hop forwarding in the presence of hidden nodes (where packet transmissions from one node compete with those from up- and downstream nodes one or two hops away). Existing models are usually limited to chain topologies [49], [50] or tied to specific analytical models of high complexity [51]; they do not consider traffic differentiation.
III. NETWORK MODEL
In this section we formalize the network and TRA description, and develop an end-to-end performance model to quantify TRA motivation and impact. A summary of the notation used in the paper is presented in Table I.

A. Topology, Routes, and Flows
Table I. Summary of notation
  $CH_i(r, ac)$ : Set of h-flows competing with outgoing h-flow $(i, r, hac_i(r, ac))$
  $s_r, d_r$ : Source and destination nodes of route $r$
  $\Delta$ : Set of in-distress nodes
  $\Delta^*$ : Set of in-exposure nodes
  $E_{R_F}(M)$ : Set of nodes whose source traffic is forwarded by nodes in $M$
  $pred_{r,i}$ : Predecessor (previous-hop node) to node $i$ on route $r$
  $P_{r,i}$ : Set of nodes that precede or coincide with node $i$ on route $r$
  $\sigma$ : An action selection rule in the TRA game
  $succ_{r,i}$ : Successor (next-hop node) to node $i$ on route $r$
  $r, \|r\|$ : An end-to-end route and its hop length
  $(r, ac)$ : An e2e-flow of intrinsic AC $ac$ following route $r$
  $rank_i(r, ac)$ : Performance metric for e2e-flow $(r, ac)$ at node $i$
  $R$ : Set of all end-to-end routes in the network
  $R_F$ : Forwarding relationship
  $R_F^*$ : Forward-reliance relationship
  $T = \langle N, L \rangle$ : Network topology graph (set of nodes, set of links representing node hearability)

A static MAWiN topology is represented by a directed graph $T = \langle N, L \rangle$, where $N$ is the set of nodes, $L \subset N \times N$, and $(i, j) \in L$ iff $i \neq j$ and $j$ is in the hearability range of $i$. Let $N^*$ be the set of all directed acyclic routes in $T$ and $R \subseteq N^*$ be the set of end-to-end routes in $T$ as determined by the routing algorithm in use. Each route $r \in R$ is represented as a sequence of nodes $r = (i_1, \ldots, i_m)$ such that $i_1, \ldots, i_m$ are all distinct and $(i_{m'}, i_{m'+1}) \in L$ for all $m' = 1, \ldots, m-1$; $i_1$ and $i_m$ are the source and destination nodes of $r$, denoted $s_r$ and $d_r$, $i_2, \ldots, i_{m-1}$ are the transit nodes, and $\|r\| = m - 1$ is the hop length of $r$. We write $i \in r$ if $r$ involves node $i$; for $i, j \in r$ write $i <_r j$ ($i \leq_r j$) if $i$ precedes (precedes or coincides with) $j$ on $r$. For $i \in r \setminus \{d_r\}$ denote by $succ_{r,i}$ the immediate successor of $i$ on $r$, and for $i \in r \setminus \{s_r\}$ define $pred_{r,i}$ as the immediate predecessor of $i$ on $r$; for uniformity, $pred_{r,s_r}$ is defined as $s_r$. (In the parlance of wireless multi-hop networks, a node $i'$ is said to be hidden from $i$ if $(i, i') \notin L$ and $\exists j \in N, r, r' \in R : j = succ_{r,i} = succ_{r',i'}$.)
Denote by P r,i = {j|j ≤ r i} the set of nodes that precede or coincide with i on r.
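The route notation above can be illustrated with a minimal Python sketch. The function names (is_route, hop_length, succ, pred, preceding, hidden_from) are hypothetical helpers chosen for illustration, routes are assumed to be tuples of integer node identifiers, and the extra condition that a node is not hidden from itself is an assumption added here.

```python
from typing import Iterable, Set, Tuple

Route = Tuple[int, ...]   # (i_1, ..., i_m), all nodes distinct
Link = Tuple[int, int]    # (i, j): j is in the hearability range of i


def is_route(r: Route, links: Set[Link]) -> bool:
    """All nodes distinct and every consecutive pair is a link in L."""
    return len(set(r)) == len(r) and all(hop in links for hop in zip(r, r[1:]))


def hop_length(r: Route) -> int:
    """||r|| = m - 1."""
    return len(r) - 1


def succ(r: Route, i: int) -> int:
    """Immediate successor of i on r (undefined for the destination)."""
    k = r.index(i)
    if k == len(r) - 1:
        raise ValueError("the destination node has no successor")
    return r[k + 1]


def pred(r: Route, i: int) -> int:
    """Immediate predecessor of i on r; pred of the source is the source."""
    k = r.index(i)
    return r[0] if k == 0 else r[k - 1]


def preceding(r: Route, i: int) -> Set[int]:
    """P_{r,i}: nodes that precede or coincide with i on r."""
    return set(r[: r.index(i) + 1])


def hidden_from(i: int, routes: Iterable[Route], links: Set[Link]) -> Set[int]:
    """Nodes i' out of i's range that share a next-hop node with i."""
    senders_to = {}
    for r in routes:
        for a, b in zip(r, r[1:]):
            senders_to.setdefault(b, set()).add(a)
    hidden = set()
    for senders in senders_to.values():
        if i in senders:
            # Assumption: i' != i, i.e., a node is not hidden from itself.
            hidden |= {k for k in senders if k != i and (i, k) not in links}
    return hidden
```

In a three-node chain 1-2-3 with routes (1, 2) and (3, 2), node 3 is hidden from node 1 because both transmit to node 2 without hearing each other.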
We model network traffic as composed of end-to-end (e2e-) flows, each of which is a collection of packets of the same CoS ∈ {EF, BE} and moving along the same route. The corresponding link layer frames are handled by EDCA according to assigned ACs, contained in the AC fields of their headers. For ease of presentation we restrict the used ACs to VO (real-time traffic such as voice/video) and BE (best-effort traffic), with VO having (statistical) priority over BE at the link layer. Since packet mangling amounts to a CoS-to-AC mapping, we define a function mang : {EF, BE} → {VO, BE} such that mang(EF) = VO and mang(BE) = BE. An e2e-flow of a given CoS is represented as (r, ac), where r ∈ R is its route and ac = mang(CoS) is its intrinsic AC as returned at s_r. Let F ⊆ R × {VO, BE} be the (quasi-static) set of e2e-flows offered by MAWiN users. Presumably, only nodes generating their own source traffic are interested in staying connected, therefore we assume that at least one e2e-flow is offered at each node, i.e.,

$$\forall i \in N \;\; \exists (r, ac) \in F : s_r = i. \quad (1)$$

For further analysis, it is necessary to formally state the fact that a source of a traffic flow is reliant on a set of nodes for forwarding its packets.
Assume that a removal from the network of a forwarding node on r causes s r to be removed as well. Then (i, j) ∈ R * F expresses node i's forward-reliance on node j in that a removal of j ultimately causes a removal of i.
, the first inclusion due to (1).
We use the notion of hop (h-) flows as the granulation level at which incoming traffic is recognized at a next-hop node. Packets of e2e-flow (r, ac) forwarded by j = pred_{r,i}, whose AC fields contain hac ∈ {VO, BE}, are recognized at node i ∈ r as an h-flow (j, r, hac). (Possibly hac ≠ ac, because AC fields can be modified hop-by-hop.) By convention, let e2e-flow (r, ac) be recognized at s_r as h-flow (s_r, r, ac). For example, if node 3 in Fig. 2 changes the AC fields of incoming packets then e2e-flow #1, designated as (
Autonomous operation of node i is expressed as a function map_i : H → {VO, BE} according to which it sets ACs of h-flows. For an incoming h-flow (j, r, hac), where j = pred_{r,i} and i ∈ r \ {s_r, d_r}, the new AC field forwarded by i further along r is given by map_i(j, r, hac).

B. Attack Model
We consider an attacker to be a selfish node which aims to receive a higher QoS level for its source traffic by performing a TRA, i.e., changing the traffic class of incoming transit or source flows. The Internet layer's packet mangling functionality is used as explained in Section I (Fig. 1) to interfere with the default CoS-to-AC mapping and modify the IP header fields of incoming packets, which are transmitted further along their path with a different MAC-layer priority. Except for the path's destination node, any node can potentially become an attacker. We assume that:
• a MAWiN node is capable of assessing received QoS,
• in terms of received QoS, a TRA can be beneficial to the attacker and harmful to other nodes,
• a TRA can be performed at no expense and at no risk of detection or administrative punishment, and
• a subset of nodes are selfish and ready to become attackers.
A TRA can be formally described as follows.
Definition 2. A traffic remapping attack (TRA) that a node i ∈ r \ {d_r} launches upon an incoming h-flow (j, r, hac), where j = pred_{r,i}, consists in changing its AC, i.e., configuring map_i(j, r, hac) ≠ hac.
Such a definition captures the fact that the setting of AC fields under a TRA is both protocol compliant (the use of map j is feasible) and ill-willed (inconsistent with mang(·)).
We consider the behavior of a node which does not perform a TRA to be honest, whereas attackers can either upgrade or downgrade an incoming h-flow's AC. We formally define these behaviors as follows.
We adopt a simple model of an attacker: it will launch a TRA on all its source and transit flows provided that the former can be upgraded and the latter downgraded. In our model, each attacker is assumed to be plausible opportunistic. Let A ⊆ N denote the set of attackers. Given A, the new AC field hac_i(r, ac) is derived as:

In the example of Fig. 2:
• e2e-flow #3 with ac = VO has an attacker source which, however, does not launch a TRA− due to the plausibility constraints,
• likewise, e2e-flows #3, #7, and #9 have an attacker destination which behaves honestly due to the plausibility constraints,
• e2e-flow #6 is not attacked at an attacker transit node 3, which could only have launched a TRA+,
• e2e-flow #1 with ac = VO encounters three attacker transit nodes, of which the first launches a TRA−, hence the second and third no longer have to,
• e2e-flows #2 and #8 experience a TRA+ at their source nodes and a TRA− at node 5; this is the maximum number of attacks an e2e-flow can experience.
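The hop-by-hop evolution of the AC field under a given attacker set can be sketched as follows. This is only one reading of the attacker model described above, under the assumption that an attacker upgrades its own BE source traffic to VO (TRA+), downgrades VO transit traffic to BE (TRA−), and otherwise leaves the AC field unchanged; the function name hac_along_route is hypothetical.

```python
VO, BE = "VO", "BE"


def mang(cos: str) -> str:
    """Default CoS-to-AC mapping: EF -> VO, BE -> BE."""
    return {"EF": VO, "BE": BE}[cos]


def hac_along_route(route, ac, attackers):
    """Return {node: AC field it transmits} for all non-destination nodes.

    Assumed attacker behavior (plausible opportunistic):
      * an attacker source upgrades BE source traffic to VO (TRA+),
      * an attacker transit node downgrades VO transit traffic to BE (TRA-),
      * honest nodes, and attackers with nothing to gain, forward unchanged.
    """
    out = {}
    current = ac
    for position, node in enumerate(route[:-1]):  # the destination never remaps
        if node in attackers:
            if position == 0 and current == BE:
                current = VO      # TRA+ on own source traffic
            elif position > 0 and current == VO:
                current = BE      # TRA- on transit traffic
        out[node] = current
    return out
```

Under these assumptions a VO flow crossing three attacker transit nodes is downgraded only once, and a BE flow can experience at most two attacks (a TRA+ at its source followed by a TRA− at the first attacker transit node), in line with the examples listed above.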

C. End-to-End Performance Model
Evaluation of the impact of and countermeasures against TRAs requires an analytical performance model of a MAWiN in the presence of TRAs that can deal with arbitrary topologies T and flow sets F, and thus obviate the need for time-consuming and setting-specific full-stack simulations. Motivated by the deficit of such models in existing literature, cf. Section II, and building on the models of Section III-A and Section III-B, we have developed an approximate rank-based model, described below. Each h-flow is assigned a rank depending on the number and priorities of h-flows it has to compete with for channel access; the collection of ranks of all h-flows constituting a given e2e-flow is then translated into an informative end-to-end QoS metric.
Each node i has a set of outgoing h-flows. For an outgoing h-flow (i, r, hac_i(r, ac)), the set CH_i(r, ac) of competing h-flows consists of:
(a) other outgoing h-flows at i, which compete via the local transmission queue,
(b) outgoing h-flows at nodes in the hearability range of i, which compete via CSMA/CA, and
(c) outgoing h-flows at nodes hidden from i, which compete via exclusive-OR reception at succ_{r,i}.
For the above outgoing h-flow, the pair [hac, CH]_i(r, ac), shorthand for (hac_i(r, ac), CH_i(r, ac)), determines per-hop performance at node i. We propose a per-hop performance metric rank_i(r, ac) reflecting that an h-flow is better off at a node if it is VO, and competes with fewer and preferably BE h-flows. Accordingly, the metric should rank [hac, vo, be]_i(r, ac) vectors, where vo_i(r, ac) and be_i(r, ac) represent the number of VO and BE h-flows in CH_i(r, ac).
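Counting the competitors of an outgoing h-flow can be sketched as below. The data layout (a dictionary from node to its set of outgoing h-flows) and the function name competing_counts are assumptions made for illustration; in particular, the sketch assumes that every outgoing h-flow at a neighboring or hidden node counts as a competitor, which may be coarser than the exact set definition used by the model.

```python
from typing import Dict, Iterable, Set, Tuple

HFlow = Tuple[int, str, str]  # (sending node, route id, AC field "VO" or "BE")


def competing_counts(
    own: HFlow,
    outgoing: Dict[int, Set[HFlow]],
    in_range: Iterable[int],
    hidden: Iterable[int],
) -> Tuple[int, int]:
    """Return (vo, be): numbers of VO and BE h-flows competing with `own`.

    outgoing -- node -> set of h-flows that node transmits
    in_range -- nodes in the hearability range of own's sender
    hidden   -- nodes hidden from own's sender
    """
    sender = own[0]
    # (a) other outgoing h-flows at the same node (local transmission queue)
    competitors = set(outgoing.get(sender, set())) - {own}
    # (b) h-flows at nodes in range (CSMA/CA) and (c) at hidden nodes
    for node in set(in_range) | set(hidden):
        competitors |= outgoing.get(node, set())
    vo = sum(1 for flow in competitors if flow[2] == "VO")
    be = len(competitors) - vo
    return vo, be
```

The resulting (vo, be) pair, together with the h-flow's own AC field, forms the [hac, vo, be] vector that the per-hop metric is meant to rank.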
To evaluate rank(·), we used the Markovian model of EDCA [52] to calculate the normalized per-hop saturation throughput $S_i(r, ac)$ of e2e-flow (r, ac) at node i for various $hac = hac_i(r, ac) \in \{VO, BE\}$, $vo = vo_i(r, ac) \in \{0, \ldots, 10\}$, and $be = be_i(r, ac) \in \{0, \ldots, 10\}$. For the resulting 14,520 pairs of (hac, vo, be) vectors, rank(·) represents a good fit if, for a high percentage of the pairs, the vector with the smaller rank is also the one with the larger saturation throughput. (Obviously, a small rank is desirable.) A heuristic metric is

$$rank_i(r, ac) = \alpha \cdot 1(hac_i(r, ac) = BE) + \beta \cdot vo_i(r, ac) + be_i(r, ac),$$

where 1(·) is the indicator function, and the best fit (99.13% of the pairs) occurs at α = 40 and β = 10. The preferences of h-flows are reflected: vo has more impact upon $rank_i(r, ac)$ than does be (because β > 1), and there is distinct separation between hac = VO and hac = BE (because α > β). For any A ⊆ N, rank(·) induces a heuristic e2e-flow cost metric we call fcost, additive for VO traffic delay and bottleneck-type for BE traffic throughput, defined as:

$$fcost_{(r,ac)}(A) = \begin{cases} \sum_{i \in r \setminus \{d_r\}} rank_i(r, ac), & ac = VO, \\ \max_{i \in r \setminus \{d_r\}} rank_i(r, ac), & ac = BE, \end{cases} \quad (7)$$

where hac is given by (2) and the notation $fcost_{(r,ac)}(A)$ is meaningful, because hac depends on A. From (7), a nodal cost metric cost can be derived as a weighted sum

$$cost_i(A) = \sum_{(r,ac) \in F} \gamma_{r,i} \, fcost_{(r,ac)}(A), \quad (8)$$

where $\gamma_{r,i} \geq 0$ and $\sum_r \gamma_{r,i} = 1$. The status of an attacker (honest) node whose cost relative to the A = ∅ case has increased is lose (mind), otherwise it is don't lose (don't mind).
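A small numerical sketch of the rank-based metrics follows. It assumes the linear form rank = α·1(hac = BE) + β·vo + be with the reported best-fit values α = 40 and β = 10, a sum over hops for VO flows, a maximum over hops for BE flows, and a weighted sum for the nodal cost; the function and parameter names are hypothetical.

```python
ALPHA, BETA = 40, 10  # best-fit values reported in the text


def rank(hac: str, vo: int, be: int, alpha: int = ALPHA, beta: int = BETA) -> int:
    """Per-hop rank (assumed linear form); smaller is better."""
    return alpha * (hac == "BE") + beta * vo + be


def fcost(ac: str, per_hop_ranks) -> int:
    """e2e-flow cost: additive for VO (delay), bottleneck-type for BE (throughput)."""
    ranks = list(per_hop_ranks)
    return sum(ranks) if ac == "VO" else max(ranks)


def cost(flow_costs: dict, weights: dict) -> float:
    """Nodal cost as a weighted sum of e2e-flow costs; weights must sum to 1."""
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[r] * flow_costs[r] for r in weights)


# Worked example: a VO h-flow competing with 2 VO and 3 BE h-flows
# gets rank 0 + 10*2 + 3 = 23; an uncontended BE h-flow gets rank 40.
example_ranks = [rank("VO", 2, 3), rank("BE", 0, 0)]
```

With these two per-hop ranks, an intrinsically VO flow would accumulate a cost of 63, whereas an intrinsically BE flow would be charged only its bottleneck rank of 40.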
To validate the rank-based model, we implemented the network topology and flow set of Fig. 2 in the ns-3 simulator, assuming error-free radio channels, static routing, and constant bit-rate saturation-level UDP traffic of 1500 B packets. Each simulation run lasted 200 s with an additional 50 s warm-up time and was repeated five times. Nodes were classified as mind or lose if the throughput of their BE flows dropped by 5% or more, or if the per-hop delay of their VO flows increased by more than 20 ms and exceeded 100 ms. We assessed congruity, defined as the proportion of nodes whose status (mind, lose, don't mind, or don't lose) upon TRAs launched by a random attacker set agrees between the simulation and the rank-based model. Fig. 3 presents the cumulative distribution function (CDF) of congruity obtained after simulating all the 256 possible attacker sets, producing a mean congruity of 0.89.
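The status classification and the congruity measure can be sketched as below. The sketch assumes that each node is summarized by a single BE throughput value and a single VO per-hop delay value before and after the attack, which simplifies the per-flow measurements described above; all function names are hypothetical.

```python
def degraded(be_tput_before, be_tput_after, vo_delay_before_ms, vo_delay_after_ms):
    """Apply the thresholds from the text to one node's measurements."""
    be_drop = (
        be_tput_before > 0
        and (be_tput_before - be_tput_after) / be_tput_before >= 0.05
    )
    vo_worse = (
        vo_delay_after_ms - vo_delay_before_ms > 20 and vo_delay_after_ms > 100
    )
    return be_drop or vo_worse


def status(is_attacker: bool, is_degraded: bool) -> str:
    """Attackers either lose or don't lose; honest nodes mind or don't mind."""
    if is_attacker:
        return "lose" if is_degraded else "don't lose"
    return "mind" if is_degraded else "don't mind"


def congruity(sim_status: dict, model_status: dict) -> float:
    """Proportion of nodes whose status agrees between simulation and model."""
    agree = sum(sim_status[n] == model_status[n] for n in sim_status)
    return agree / len(sim_status)
```

For instance, if the simulation and the model agree on the status of three nodes out of four, the congruity for that attacker set is 0.75.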
For comparison consider a heuristic inspired by the |N|-person Prisoners' Dilemma (PD) game [53]: if the number of attackers exceeds (does not exceed) a certain threshold then all the attacker nodes' status is guessed as lose (don't lose) and the honest nodes' as don't mind (mind). The corresponding CDFs depicted in Fig. 3 for the threshold varying from 0 to |N| produce mean values between 0.49 and 0.54, not far from a fair coin toss. As another baseline, an unrealistic "informed gambler", who knew an attacker (honest) node's statistical chance of acquiring a lose (mind) status under a random attacker set, might guess the node's status for a given A by tossing an appropriately biased coin. Congruity would then be measured by the expected number of guesses that match the simulation. The corresponding CDF depicted in Fig. 3 produces a mean value of 0.82, inferior to our model's. We conclude that the rank-based model is a reasonably good predictor of the impact of TRAs in MAWiNs with saturation-level traffic.
It will be convenient to identify nodes directly impacted by a given attacker set.
Definition 5. A node whose status is lose or mind is said to be in distress. Let ∆(A) be the set of such nodes in the presence of the attacker set A ⊆ N .
Hence, the set of in-distress nodes contains nodes whose costs have increased in comparison to the A = ∅ case:

$$\Delta(A) = \{ i \in N : cost_i(A) > cost_i(\emptyset) \}.$$

D. Attack Incentives and Impact
In realistic MAWiN settings, TRAs pose a threat whose credibility (i.e., incentives to launch) and seriousness (i.e., harmful impact upon honest nodes) we now quantify. We ask if, regardless of the currently ongoing TRAs, an honest node turning attacker perceives a QoS improvement and causes some other nodes to perceive a QoS degradation. Neither of these effects is certain, as it depends on a node's position in the network topology. Referring to Fig. 2, suppose that A = ∅ and node 3 turns attacker. Since flow #3 is VO, the attack amounts to a TRA− upon flows #1 and #4. While these two flows suffer, for all the remaining flows the contention softens and their QoS improves. Hence, the TRA is harmless for other nodes. Consider an alternative scenario where it is node 5 that turns attacker. Its source traffic (flow #5) now enjoys elevated priority when forwarded at nodes 5, 4, and 7, but experiences increased contention from itself at node 5 (being forwarded by node 4 and via exclusive-OR reception from node 5 at node 4) and at node 4 (being forwarded by nodes 5 and 7), the likely net effect of which is a QoS degradation.
To investigate the above effects and their scaling with the network size |N|, both numerical calculation using the cost metric (8) and ns-3 simulations were conducted. In the numerical calculation we assumed |N| ≤ 200, uniformly distributed route hop lengths with ||r||_min = 1 and ||r||_max = 3 or 5, and |A| = 1, 2, 5, 10, or 20. For a fixed parameter configuration, the results were averaged over 10,000 instances of random network topologies, e2e-flow routes, and attacker sets. The transit nodes for a route were chosen at random in geographical proximity to the source node. We examined the scaling with |N| of:
• the incentives to launch a TRA, measured as the percentage of attacker nodes whose cost metric did not worsen when turning attacker, i.e., nodes i ∈ A satisfying cost_i(A) ≤ cost_i(A \ {i}), and
• the harmful impact of a TRA launched in the presence of a number of ongoing TRAs, measured as |∆(A)|/|A|, i.e., the average number of in-distress nodes per attacker node.
The results depicted in Fig. 4 indicate that the threat of TRAs is not limited to small-size networks. The incentives for a TRA remain 100% for |A| = 1 and decrease with |A|, but slightly increase with |N| on account of more dispersed attacker nodes. They also slightly increase with the route hop length. A similar scaling, insensitive to route hop lengths, is visible for the harmful impact of a TRA.
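The two measures above can be computed from any nodal cost function, as in the following sketch. The callable cost(i, A) stands in for the cost metric (8); the toy cost function used in the example is purely illustrative and is not derived from the rank-based model.

```python
def in_distress(nodes, cost, attackers):
    """Nodes whose cost exceeds their cost in the all-honest case."""
    return {i for i in nodes if cost(i, attackers) > cost(i, frozenset())}


def incentive_share(cost, attackers):
    """Percentage of attackers whose cost did not worsen by turning attacker."""
    attackers = frozenset(attackers)
    content = [i for i in attackers if cost(i, attackers) <= cost(i, attackers - {i})]
    return 100.0 * len(content) / len(attackers)


def impact(nodes, cost, attackers):
    """Average number of in-distress nodes per attacker node."""
    attackers = frozenset(attackers)
    return len(in_distress(nodes, cost, attackers)) / len(attackers)


def toy_cost(i, attackers):
    """Hypothetical cost: every attacker hurts everyone a little, but helps itself."""
    return 10 + len(attackers) - 2 * (i in attackers)


nodes = {1, 2, 3, 4}
A = frozenset({1, 2})
distressed = in_distress(nodes, toy_cost, A)   # the honest nodes 3 and 4
```

In this toy setting both attackers benefit from attacking (incentive share of 100%), while each attacker puts one honest node in distress on average.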
For the ns-3 simulations we assumed square grid topologies with |N| = 9, 16, and 25. Each node was the source of an e2e-flow following a minimum-hop route to a randomly chosen destination, and half of the flows were VO (other settings are described in Section V). As before, we used end-to-end throughput and delay as QoS metrics for BE and VO flows, respectively, and the same rules of deciding node status. The results are marked in Fig. 4 with dotted lines. The incentives for TRAs are somewhat lower now, reflecting the fact that nodes located at the edge of the square grid have no transit traffic to downgrade. Meanwhile, in-distress nodes are more numerous, reflecting inter-flow competition effects unaccounted for by the cost metric. Nonetheless, these results confirm that the incentives for and harmful impact of TRAs are significant for large |N|.

IV. TRA GAME DESCRIPTION
A selfish node performs a TRA whenever this improves its cost, anticipating similar conduct of other selfish nodes. This gives rise to a noncooperative TRA game, whose one-shot and multistage variants we now describe. We propose to mitigate TRAs by introducing a distributed exposure signaling mechanism called DISTRESS, under which "good" multistage strategies lead to few nodes performing TRAs when the game terminates.

A. One-Shot TRA Game
In the noncooperative one-shot TRA game, the nodes are players, map_i ∈ {TRA, honest} is node i's action, and cost is the (negative) payoff function (i.e., small costs are pursued). A given action profile (map_i, i ∈ N) is equivalent to the set A ⊆ N of plausible opportunistic attackers (nodes launching a TRA). Using a ⟨players, action space, payoffs⟩ representation, the game is defined as
⟨N, {TRA, honest}^|N|, (cost_i, i ∈ N)⟩. (9)
Some interesting action profiles are ∅ (all-honest) and N (all-TRA). In the latter, any e2e-flow (r, BE) experiences a TRA+ at s_r, and any e2e-flow (r, VO) experiences a TRA− at the first encountered node in r \ {s_r, d_r}.
Contrary to the intuition that it is always beneficial to upgrade source traffic and downgrade competing transit traffic, the TRA game is not an |N|-person PD. Specifically, due to the complexities of mechanisms determining MAWiN performance, the TRA action does not dominate honest, nor is it necessarily harmful to honest nodes. Moreover, A = ∅ need not be Pareto superior to A = N; in fact, for some traffic patterns, the reverse is true [28].
The following definition identifies nodes indirectly impacted by a given attacker set. Hence, the set of in-exposure nodes contains nodes which are forward-reliant on nodes currently in distress: ∆*(A) = E_{R*_F}(∆(A)). The rationale behind Definition 6 is the following. Nodes in distress perceive unsatisfactory QoS and so lose incentives to continue packet forwarding services. Thus they pose a credible threat of imminent service suspension, to which they themselves are exposed along with the source nodes whose traffic they forward. In the case of service suspension by a node in distress, each of these source nodes registers an infinite e2e-flow cost (8), and also loses incentives to continue packet forwarding services. By recursion, all nodes forward-reliant on nodes in ∆(A), i.e., ∆*(A), are similarly exposed. TRAs can be mitigated by leveraging node exposure, much as an immune response is triggered by a foreign toxin. Namely, even a small attacker set creates a ripple effect across the network, causing exposure in a much larger set of nodes than those in distress. Instead of suspending service, nodes in exposure start playing the TRA game, occasionally selecting TRA rather than honest, which may cause exposure in the initial attackers. If exposure, i.e., the threat of imminent service suspension, is reflected in the game payoffs as a large enough (say infinite) cost, such play brings most or all of the attackers back to honesty, which the nodes in distress alone may be too few to achieve. A rigorous argument is given in Section IV-C.
For various |N |, we examined the size of the "immune response" triggered by a TRA, measured as |∆ * (A)|/|A|, the average in-exposure nodes per attacker node. Using cost metric-based calculation we verified that it remains nontrivial and does not distinctly decrease at least up to |N | = 200 (especially under longer routes), cf. Fig. 5. In the ns-3 simulations of square grid topologies up to |N | = 25, ∆ * (A) was inferred by keeping track of the source nodes of flows currently forwarded by each node. The results (marked with dotted lines) show a nontrivial size of the "immune response", which even scales with |N |.
To incorporate exposure into the game payoffs we redefine nodal costs (8) and the TRA game (9) as
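One form of the redefinition that is consistent with the surrounding discussion, in which exposure is reflected in the payoffs as an infinite cost, is sketched below. The overlined symbol and the exact layout are assumptions made here for illustration; they are not confirmed to match the original equations (10) and (11).

```latex
\overline{cost}_i(A) =
\begin{cases}
\infty, & i \in \Delta^{*}(A), \\
cost_i(A), & \text{otherwise},
\end{cases}
\qquad
\bigl\langle N,\ \{TRA, honest\}^{|N|},\ (\overline{cost}_i,\ i \in N) \bigr\rangle .
```

Under this reading, a node in exposure always prefers any action profile that removes it from ∆*(A), which is the lever the multistage play relies on.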

B. Multistage Play under the DISTRESS Mechanism
Mitigation of TRAs using the above approach requires that nodes signal exposure to one another and can toggle between TRA and honest in response to other nodes' play, as modeled by a multistage game. Let the TRA game (9) be played in stages k = 1, 2, . . ., and let A(k) ⊆ N be the set of attackers in stage k. Each stage k is assumed long enough for each node i to produce an accurate estimate of cost_i(A(k)) and signal exposure throughout the network if needed, hence also to determine its redefined nodal cost (10) under A(k).
Assume that there are no attackers prior to stage 1, i.e., A(0) = ∅. In stage 1, a set I ⊂ N of ill-behaved (selfish) nodes spontaneously select TRA, i.e., A(1) = I; the other nodes are further called well-behaved. The stage-1 TRAs may bring about distress in some (possibly also ill-behaved) nodes, i.e., induce the set ∆(A(1)) = ∆(I). At the end of a generic stage k − 1, each node i estimates its current cost metric and, if it finds itself in distress, i.e., i ∈ ∆(A(k − 1)), marks itself as in-exposure and sends a DISTRESS(i) flag to all nodes whose source traffic it forwards; that is, copies of the flag are sent to all nodes of E_{R_F}({i}), including node i itself.
(The flag is also timestamped to avoid confusion with exposure signaling in another stage.) Having received DISTRESS(j) and checked that (i, j) ∈ R_F, node i ignores the flag if it is already in exposure; otherwise it marks itself as in-exposure and sends DISTRESS(i) to E_{R_F}({i}) as above. It is easy to see that if M is the set of nodes so far marked as in-exposure then the process stops when E_{R_F}(M) = M. Thus, assuming that each flag is issued and reliably delivered to the intended recipient within one stage, at the end of stage k we have M = ∆*(A(k − 1)). This mechanism, specified as Algorithm 1, will be termed DISTRESS (DIstributed Signaling of TRaffic Exposure to Service Suspension). It is vital that the threat signified by signaled exposure be credible. Therefore a received DISTRESS(j) flag, where j ∈ r \ {s_r}, should imply to s_r that j is indeed a transit node on r and not an off-route one that does not pose a threat. This is granted if a source-routing protocol such as DSR [54] is employed, in which r is known to s_r. Otherwise, DISTRESS(j) can be appended to forwarded packets and at d_r returned to s_r through a (trusted) end-to-end feedback connection. Fig. 6 provides an illustration. Here, R*_F = {(1, 3), (1, 4), (1, 5), (1, 6), (1, 7), (1, 8), (2, 4), (2, 5), (2, 6), (2, 7), (2, 8), (3, 6), (5, 4), (5, 7)}.
Following a TRA at node 3 in stage k − 1, node 4 sends DISTRESS(4) to s_#1 = 1 via d_#1 = 10 and to s_#5 = 5 via d_#5 = 6; subsequently, node 5 sends DISTRESS(5) to s_#2 = 2 via d_#2 = 9 and to s_#1 = 1 via d_#1 = 10; the latter flag is ignored by already in-exposure node 1 (to minimize the communication overhead, it could have been suppressed at node 10). At the end of the present stage, ∆*(A(k − 1)) = {1, 2, 4, 5}.
We remark that the DISTRESS mechanism is lightweight in terms of the required synchronization and communication overhead (roughly O(|R|) per node in the worst case). Exposure signaling is triggered asynchronously by nodes in distress based on local QoS perception. The cause of the distress, either a TRA or temporarily harsh traffic conditions (e.g., transmission impairments, frequent collisions, or buffer overflow), does not influence a node's behavior. Drawing such a distinction is a troublesome aspect of many known misbehavior mitigation schemes [26], [55], because responding to distress caused by exogenous factors may result in punishment of honest nodes. In our solution, exposure signaling can only encourage a node to select honest, and therefore does not affect the behavior of an already honest node unintentionally causing distress. By processing a received DISTRESS flag as described above, such a node simply acknowledges an objectively existing threat of imminent service suspension and no punishment occurs. Importantly, exposure signaling is costless, hence performed without incentive calculation, while fake signaling despite satisfactory QoS perception is not beneficial.
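The propagation of exposure in the Fig. 6 example can be sketched in a few lines of Python. This is a minimal illustration, not the paper's Algorithm 1: it works directly on the closed relation R*_F quoted above (so some nodes are reached in one step rather than through an intermediate flag), and the names `R_F_STAR` and `in_exposure` are hypothetical.

```python
from collections import deque

# Transitively closed forward-reliance relation quoted for Fig. 6:
# a pair (i, j) means that node i is forward-reliant on node j.
R_F_STAR = {(1, 3), (1, 4), (1, 5), (1, 6), (1, 7), (1, 8),
            (2, 4), (2, 5), (2, 6), (2, 7), (2, 8),
            (3, 6), (5, 4), (5, 7)}


def in_exposure(in_distress, relation):
    """Return the in-distress nodes plus every node forward-reliant on a marked node."""
    marked = set(in_distress)       # nodes in distress mark themselves first
    queue = deque(marked)
    while queue:
        j = queue.popleft()         # node j issues DISTRESS(j)
        for (i, target) in relation:
            if target == j and i not in marked:
                marked.add(i)       # recipient i marks itself and re-signals
                queue.append(i)
    return marked                   # fixed point: no unmarked recipient remains


# Node 4 is the only node in distress after the TRA at node 3.
exposed = in_exposure({4}, R_F_STAR)
print(sorted(exposed))
```

The loop stops exactly when no unmarked node is forward-reliant on a marked one, which mirrors the stopping condition stated for the set M.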
The above considerations also indirectly imply that the effectiveness of the DISTRESS mechanism is unaffected by the network traffic volume, in particular background traffic competing with the e2e-flows: under light traffic conditions no node finds itself in distress and the mechanism is not triggered; otherwise DISTRESS signaling simply encourages honest behavior.

C. "Good" Multistage Strategies
A multistage strategy prescribes to a node which of the two actions (TRA or honest) to select in each stage, based on the current history of play. To specify a desirable course of play under DISTRESS we use two auxiliary definitions. The first describes flows whose source nodes are not forward-reliant on in-distress nodes and so are expected to survive a service suspension. The second describes a node that cannot benefit from selecting a different action, given the other nodes' actions.
Definition 7. An e2e-flow is called survivable if its source node is not in exposure. Let F * (A) be the set of such flows in the presence of the attacker set A ⊆ N .
Definition 8. A node is called (weakly) best-reply if it cannot unilaterally improve its nodal cost (10) by selecting a different action. Let Γ cost (A) be the set of such nodes in the presence of the attacker set A ⊆ N .
Note that if Γ cost (A) = N then A constitutes a (weak) Nash equilibrium (NE) [53] of the one-shot TRA game (11).
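Definition 8 and the Nash equilibrium remark can be illustrated with a short sketch. The cost function `toy_cost` below is purely hypothetical (it is not the cost metric (8) or (10)) and serves only to exercise the best-reply check; the function names are likewise assumptions.

```python
def best_reply_nodes(attackers, nodes, cost):
    """Nodes that cannot lower their own cost by unilaterally switching action."""
    attackers = frozenset(attackers)
    result = set()
    for i in nodes:
        # Toggle node i's action only: TRA -> honest or honest -> TRA.
        deviated = attackers ^ frozenset({i})
        if cost(i, attackers) <= cost(i, deviated):
            result.add(i)           # weakly best-reply: no strict improvement
    return result


def is_weak_nash(attackers, nodes, cost):
    """The attacker set is a weak NE if every node is (weakly) best-reply."""
    return best_reply_nodes(attackers, nodes, cost) == set(nodes)


def toy_cost(i, attackers):
    # Hypothetical cost: every attacker adds one unit of cost to everyone,
    # while an attacker itself gains two units.
    return len(attackers) - (2 if i in attackers else 0)


NODES = {1, 2, 3}
print(best_reply_nodes({1}, NODES, toy_cost))
print(is_weak_nash(NODES, NODES, toy_cost))
```

With this toy cost, attacking is always individually preferable, so only the all-TRA profile is a weak NE; the paper stresses that the actual TRA game need not have this structure.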
We now formulate the following postulates:
• Opt-out: while ill-behaved nodes may occasionally select TRA, well-behaved nodes select honest at all times; that is, being non-selfish, they refuse to play the game. Thus A(k) ⊆ I for all k.
• Termination: after k_0 stages the game terminates and no node thereafter changes its action. That is, A(k) = A(∞) for all k ≥ k_0, where k_0 is finite, known in advance, and preferably small.
• Rationality: ill-behaved nodes tend to select (weakly) best-reply actions; ideally, I ⊆ Γ_cost(A(∞)), i.e., A(∞) is a weak NE of the one-shot game restricted to the ill-behaved nodes. A suitable quantitative measure is the fraction of ill-behaved nodes that are best-reply, i.e., |Γ_cost(A(∞))|/|I|.
• Efficiency: ill-behaved nodes eventually cause one another no distress; ideally, ∆(A(∞)) ∩ I = ∅. A suitable quantitative measure is the fraction of ill-behaved nodes that are not in distress, i.e., |I \ ∆(A(∞))|/|I|.
• Defensibility: well-behaved nodes are eventually defended against distress caused by ill-behaved nodes' TRAs; ideally, ∆(A(∞)) ⊆ I. A suitable quantitative measure is the fraction of well-behaved nodes that are not in distress, i.e., |(N \ I) \ ∆(A(∞))|/|N \ I|.
• Survivability: eventually, few e2e-flows rely upon forwarding by nodes in exposure. A suitable quantitative measure of survivable network throughput is the fraction of survivable flows, i.e., |F*(A(∞))|/|F|.
To satisfy these postulates, a multistage strategy has to employ well-designed action selection and participation rules. The DISTRESS mechanism enables more informed rules by enriching the history of the play: apart from recent actions, a node may recall in-distress conditions and received DISTRESS flags (i.e., in-exposure conditions) in recent stages. We confine a node's memory to the last two stages; the simulations in Section V indicate that it can produce satisfactory quantitative measures related to the above postulates.
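The four quantitative measures named in the postulates reduce to simple set arithmetic. The sketch below is illustrative only: the function and argument names are hypothetical, the input sets are made-up example data, and the rationality measure is computed over ill-behaved nodes only (an assumption in line with the phrase "fraction of ill-behaved nodes that are best-reply").

```python
from fractions import Fraction


def postulate_measures(nodes, ill_behaved, in_distress, best_reply,
                       flow_sources, in_exposure):
    """Return the four after-game measures as exact fractions.

    flow_sources maps a (hypothetical) flow identifier to its source node.
    """
    nodes, ill = set(nodes), set(ill_behaved)
    well = nodes - ill
    # A flow is survivable if its source node is not in exposure (Definition 7).
    survivable = [f for f, src in flow_sources.items() if src not in in_exposure]
    return {
        "rationality": Fraction(len(ill & set(best_reply)), len(ill)),
        "efficiency": Fraction(len(ill - set(in_distress)), len(ill)),
        "defensibility": Fraction(len(well - set(in_distress)), len(well)),
        "survivability": Fraction(len(survivable), len(flow_sources)),
    }


# Made-up after-game state of a six-node network.
measures = postulate_measures(
    nodes=range(1, 7), ill_behaved={1, 2, 3}, in_distress={3, 4},
    best_reply={1, 2, 5}, flow_sources={"f1": 1, "f2": 2, "f3": 3, "f4": 4},
    in_exposure={3, 4})
print(measures)
```

The sketch assumes that both I and N \ I are nonempty; otherwise the corresponding fractions are undefined.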
We allow node i selecting an action for stage k + 1 to recall its membership in the sets of current attackers, current in-distress nodes, and recent in-exposure nodes, i.e., A(k), ∆(A(k)), and ∆*(A(k − 1)), without knowledge of the entire sets, which the DISTRESS mechanism does not guarantee. Subject to this restriction, a wide class of feasible action selection rules can be expressed as follows:
A(k + 1) = σ_{A(k), ∆(A(k)), ∆*(A(k − 1))}, (12)
where σ_{X,Y,Z} = ⋃_{(x,y,z) ∈ Φ} X^x ∩ Y^y ∩ Z^z, Φ ⊆ {−1, 1}^3, X^1 = X, and X^{−1} = N \ X. Determined by the index set Φ, there are 2^8 = 256 distinct action selection rules. Note that Φ = ∅ and Φ = {−1, 1}^3 correspond to "persistent honest" and "persistent TRA" strategies, respectively, and that (12) subsumes action selection rules with a reduced set of arguments, e.g., Φ = {(−1, y, z), (1, y, z) | (y, z) ∈ Φ′}, where Φ′ ⊆ {−1, 1}^2, corresponds to an action selection rule that does not explicitly condition A(k + 1) on A(k), i.e., of the form A(k + 1) = σ_{∆(A(k)), ∆*(A(k − 1))}. Participation in the game can change stage by stage as governed by some in-game condition; let G(k) be the set of in-game nodes in stage k that select their action according to (12), whereas the rest, i.e., G^{−1}(k), retain the previous-stage action. In line with the opt-out postulate, action selection rules and in-game conditions only apply to ill-behaved nodes, i.e., G(k) ⊆ I, which we do not reflect in the ensuing formulae to keep them simple. The dynamics (12) become:
A(k + 1) = (G(k + 1) ∩ Σ) ∪ (G^{−1}(k + 1) ∩ A(k)), (13)
where Σ = σ_{A(k), ∆(A(k)), ∆*(A(k − 1))}. If (13) is a deterministic finite-order recurrence then the sequence (A(k), k = 0, 1, 2, . . .) eventually becomes periodic and detection of this may terminate the game. It is enough to formulate the in-game condition for a given node so that it only depends on the node's recent membership in A, ∆(A), and ∆*(A). Specifically, let h_i(k) = (1_{i ∈ A(k − 1)}, 1_{i ∈ A(k)}) be node i's membership history with respect to A(k − 1) and A(k).
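The family of action selection rules indexed by Φ can be sketched as follows, assuming the reading that σ is a union over (x, y, z) ∈ Φ of intersections in which exponent 1 keeps a set and exponent −1 takes its complement in N. The function names and the example sets are hypothetical.

```python
from itertools import product


def signed(s, sign, universe):
    """Exponent 1 keeps the set; exponent -1 takes its complement in the universe."""
    return set(s) if sign == 1 else set(universe) - set(s)


def sigma(phi, X, Y, Z, universe):
    """Union over (x, y, z) in phi of X^x & Y^y & Z^z."""
    result = set()
    for (x, y, z) in phi:
        result |= (signed(X, x, universe)
                   & signed(Y, y, universe)
                   & signed(Z, z, universe))
    return result


TRIPLES = list(product((-1, 1), repeat=3))   # the 8 elements of {-1, 1}^3
NUM_RULES = 2 ** len(TRIPLES)                # one rule per subset of the triples

# Made-up stage data: attackers, in-distress nodes, in-exposure nodes.
N = {1, 2, 3, 4, 5}
A, D, E = {1, 2}, {2, 3}, {3, 4}

persistent_honest = sigma([], A, D, E, N)
persistent_tra = sigma(TRIPLES, A, D, E, N)
# Index set that ignores X and Y and requires non-membership in Z.
not_exposed = sigma([t for t in TRIPLES if t[2] == -1], A, D, E, N)
print(NUM_RULES, persistent_honest, persistent_tra, not_exposed)
```

Each node needs only its own membership bits to evaluate such a rule, since a node belongs to the result exactly when its own sign triple is in Φ.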
We formulate the following in-game condition for node i in stage k + 1:
i ∈ G(k + 1) if and only if h_i(k) ≠ h_i(k − c) for all c = 1, . . . , c_max. (14)
That is, a node is out-of-game if its membership history has repeated itself within the recent c_max stages (since well-behaved nodes stay honest at all times, they formally become out-of-game as of stage k = 1). We now show that under certain conditions the game is guaranteed to terminate. Proposition 1. If c_max ≥ 4 then termination is guaranteed with A(k) = A(∞) and G(k) = ∅ for all k ≥ 8.
Proof. Observe that out-of-game nodes repeat previous-stage actions, thus if i ∉ G(k) and i ∉ G(k + 1) then i ∉ G(k′) for all k′ ≥ k. This is due to h_i(k′) = h_i(k′ + 1) being true for all k′ ≥ k, in violation of (14) with c = 1.
It turns out that under certain conditions the above specification includes multistage strategies exhibiting ideal rationality and defensibility. We will need two more definitions. First, we define a nodal cost function under which the benefit of an attacker always causes distress in some other node. Second, we define a flow set such that a service suspension at any node threatens every flow's survival. In general, all-honest dominance and full forward-reliance are not guaranteed; obviously, the latter is impossible if R = {r | (r, ac) ∈ F} contains single-hop routes. However, if both features are present in a given network, then we can show that there exists at least one action selection rule which is ideally efficient (no ill-behaved nodes cause distress), ideally defensible (all well-behaved nodes are defended against distress), and ideally rational (all ill-behaved nodes select best-reply strategies). Proposition 2 states conditions for the existence of ideally rational, efficient, and defensible action selection rules. Simulations show that these conditions are often satisfied in randomly generated MAWiN instances; otherwise, "good" rules (12) nevertheless exist that ensure satisfactory characteristics across various MAWiN topologies and flow sets F, cf. Section V. These rules can be adopted a priori, relieving nodes from seeking "good" rules for a specific topology and flow set, which they are typically unaware of.
Note that ∆(A) = ∅ implies that A is a weak NE of (11), but the converse is not true; in fact, simulations show that a vast majority of Nash equilibria A feature ∆(A) ≠ ∅.

V. SIMULATIONS
Simulations involved network topologies with both static and mobile nodes, respectively using the Monte Carlo method based on the network model of Section III and the ns-3 simulator implementing a full MAWiN protocol stack. The goal of the Monte Carlo simulations was to assess the rationality, efficiency, defensibility, and survivability measures in realistic settings, when the two assumptions of Proposition 2 (all-honest dominance and full forward-reliance) were not necessarily satisfied. The ns-3 simulations were carried out to investigate the impact of fast-changing MAWiN topology and flow routes on the effectiveness of the DISTRESS mechanism.

A. Static Nodes
For the analysis of static topologies, we implemented the network model of Section III (including network topology, routing, flow configuration, attack behavior, and rank-based estimation of flow cost) as well as the multi-stage TRA game and DISTRESS mechanism of Section IV in a Monte Carlo simulator. Each simulation run consisted of a stage-by-stage play of the TRA game, with nodes deciding their action (TRA or honest) in each stage. A total of 1000 MAWiN instances were generated with |N| = |F| = 10 and e2e-flow routes of uniformly distributed hop lengths with r_min = 1 or 2 and r_max = 5. Full forward-reliance occurred in 6.5% and 55.2% of the MAWiN instances with r_min = 1 and r_min = 2, respectively. We observed that:
• All-honest dominance was never violated (violations are in general possible but unlikely, e.g., when traffic flows do not impact one another's performance).
• For r_min = 1 and r_min = 2, weak Nash equilibria amounted to 40.2% and 94% of all action profiles A ⊆ N, respectively; however, those with ∆(A) = ∅ were a rarity, amounting to 0.1% and 0.3%, respectively.
For each MAWiN instance, 100 independent multistage game runs were conducted with I ⊂ N chosen at random subject to ∆(I) ≠ ∅. All 256 feasible action selection rules were tried. It was observed that:
• With r_min = 1 and r_min = 2, respectively 44 and 54 action selection rules (including the trivial "persistent honest") ensured (i) and (ii) of Proposition 2 even in the absence of full forward-reliance.
• Conditioning A(k + 1) on A(k) besides ∆(A(k)) and ∆*(A(k − 1)) did not extend the interesting range of key after-game (asymptotic) characteristics.
• On the other hand, conditioning A(k + 1) on A(k) alone, i.e., disregarding information provided by the DISTRESS mechanism, produced about the worst measures of rationality and defensibility.
To explain the first observation consider, e.g., A(k + 1) = I ∩ ∆(A(k)).
It must be that A(∞) = ∆(A(∞)), but A(∞) = ∆(A(∞)) ≠ ∅ is possible only if eventually all the attacker nodes and none of the honest ones experience distress, a highly improbable situation. The latter two observations are illustrated in Fig. 7, where each dot represents a pair of after-game (rationality, defensibility) measures for a given action selection rule, averaged over the generated MAWiN instances and game runs. The outer contour encompasses all action selection rules (12) and the inner one only rules of the form A(k + 1) = σ_{∆(A(k)), ∆*(A(k − 1))}. The latter captures most of the Pareto front including the best rationality and defensibility measures. Rules of the form A(k + 1) = σ_{A(k)} correspond to the two lower corners of the outer contour, the farthest from the Pareto front.
For more detailed analysis, several representative action selection rules have been chosen. They are listed below in the order of diminishing survivability and tendency to launch TRAs, and improving efficiency and defensibility:
(a) A(k + 1) = N ("persistent TRA"),
(b) A(k + 1) = ∆(A(k)) ∩ ∆*(A(k − 1)),
(c) A(k + 1) = ∆(A(k)) ⊕ ∆*(A(k − 1)), where ⊕ denotes disjunctive union,
(d) A(k + 1) = (N \ ∆*(A(k − 1))) ∪ ∆(A(k)),
(e) A(k + 1) = N \ ∆*(A(k − 1)),
(f) A(k + 1) = ∆(A(k)) \ ∆*(A(k − 1)).
These rules were observed to differ visibly in the speed of convergence to the after-game characteristics; example stage-by-stage trajectories for rules (e) and (f) are shown in Fig. 8. Table II presents relevant after-game characteristics, averaged as above. One notes in particular that:
• Rule (e) is the most likely to be adopted by rational ill-behaved nodes, as it leads to highly efficient Nash equilibria in the vast majority of game runs. It also produces good defensibility and survivability.
• On the other hand, rule (f) produces ideal efficiency and survivability, but is far from rational, thus likely to be dismissed by ill-behaved nodes.
• Rule (a) ("persistent TRA") is moderately rational, but, due to the DISTRESS mechanism, very inefficient. This latter fact is fortunate, as rule (a) produces disastrous defensibility and survivability.
• Rules adversely affecting defensibility and survivability, such as (a) through (c), are not very rational and so unattractive to ill-behaved nodes. Rule (d) is moderately attractive, but from rationality and efficiency viewpoints is Pareto dominated by rule (e).
Hence, under DISTRESS, ill-behaved nodes are likely to follow rule (e), in which in-exposure nodes cannot be attackers.
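The six representative rules are plain set expressions and can be written down directly. The following sketch uses Python sets, with D standing for ∆(A(k)) and E for ∆*(A(k − 1)); the dictionary name `RULES` and the sample sets are hypothetical, and the opt-out restriction to ill-behaved nodes is left out for brevity.

```python
# Each rule maps (N, D, E) to the next attacker set A(k + 1),
# where D = in-distress nodes in stage k and E = in-exposure nodes from stage k - 1.
RULES = {
    "a": lambda N, D, E: set(N),          # persistent TRA
    "b": lambda N, D, E: D & E,           # in distress and in exposure
    "c": lambda N, D, E: D ^ E,           # disjunctive union (symmetric difference)
    "d": lambda N, D, E: (N - E) | D,     # not in exposure, or in distress
    "e": lambda N, D, E: N - E,           # not in exposure
    "f": lambda N, D, E: D - E,           # in distress but not in exposure
}

# Made-up stage data for a six-node network.
N = {1, 2, 3, 4, 5, 6}
D = {2, 3}
E = {3, 4, 5}

next_attackers = {name: rule(N, D, E) for name, rule in RULES.items()}
for name in sorted(next_attackers):
    print(name, sorted(next_attackers[name]))
```

Under rule (e), no node that was in exposure in the previous stage attacks in the next one, which is the property highlighted in the closing remark above.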

B. Mobile Nodes
To study the impact of node mobility, we implemented the multi-stage play of Section IV-C under rule (e) in the ns-3 simulator, with the settings listed in Table III. Each node was the source of an e2e-flow, half of them of high priority. We considered two topologies for initial node placement: grid (all nodes arranged on a square grid) and random (all nodes uniformly distributed in the area). The adopted high modulation and coding scheme limits the nodes' transmission range to approximately 10 m. For a more realistic IEEE 802.11 range of 100 m, the maximum evaluated velocity would scale to 14 m/s (50 km/h). In summary, the chosen settings created conditions under which TRAs are likely to negatively impact the performance of honest nodes. With node mobility, ∆(A) and ∆*(A) can change stage by stage although A remains the same, due to changing MAWiN topology and flow routes. Except for quasi-static environments, where the typical TRA game duration (on the order of a few stages) fits between successive route changes, this may complicate ill-behaved nodes' strategic behavior. For preliminary insight we assumed that they nevertheless stick to rule (e); in particular, they calculate cost_i(∅) only once for distress perception, at game start, and maintain the current action (TRA or honest) indefinitely when the out-of-game condition is satisfied. The key question was whether the DISTRESS mechanism would still be able to restrain TRAs. As an evaluation metric for the DISTRESS mechanism, we used the percentage of after-game attackers, which reflects how many of the network's ill-behaved nodes remain attackers after the game has terminated. Using this metric, we compared the performance of a MAWiN under the DISTRESS mechanism with a baseline MAWiN, where nodes do not fear service suspension and remain attackers if such behavior improves the QoS metrics (8) of their source flows.
In Fig. 9 each point is an average of 100 independent simulation runs with half the nodes being ill-behaved (|I| = |N |/2). Simulations confirmed that even for the maximum node velocity the adopted settings created a quasi-static environment -route changes during the TRA game were occasional or none and the play closely resembled that of Section IV-C. Hence, irrespective of node velocity and initial placement, after-game attackers were indeed distinctly fewer than for the baseline MAWiN with the DISTRESS mechanism disengaged; they were mostly limited to ill-behaved sources of single-hop flows, which, having no forwarding services to rely upon, remained unaffected by the DISTRESS mechanism.
To evaluate our proposed mechanism against a varying attack intensity, we note that Definitions 2-4 do not leave room for any gradation of TRA intensity: a node either behaves honestly or executes a plausible opportunistic TRA. Therefore, overall attack intensity in the network is sufficiently reflected by the percentage of ill-behaved nodes in N. Fig. 10 presents the respective simulation results for a high-mobility (1.4 m/s) scenario of the grid topology. DISTRESS is uniformly able to incentivize honest behavior in all (or almost all) of those ill-behaved nodes which are the sources of multi-hop flows. As before, only the sources of single-hop flows remain attackers.
Finally, even though a node's placement in the network topology does influence the potential benefits of becoming an attacker (as is visible in Fig. 2), one suspects that the effectiveness of the DISTRESS mechanism does not depend on the placement of attackers. This is because DISTRESS signaling is network-wide and affects the cost metric (11) of each node regardless of its location. Fig. 11 shows that indeed the initial placement (interior, edge, or corner) of ill-behaved nodes in a grid-topology MAWiN does not impact the percentage of after-game attackers.

VI. CONCLUSION
A traffic remapping attack (TRA) is hard to defend against in MAWiNs due to their multi-hop topology, node autonomy, and complex interplay of factors affecting end-to-end performance. We have proposed a systematic game-theoretic approach to TRA mitigation. The adopted model of a MAWiN under plausible opportunistic TRAs allows us to define a noncooperative multistage TRA game arising among selfish nodes, in which the payoff function is provided by a novel network-oriented end-to-end QoS metric. We have augmented this function to reflect the threat of forwarding service suspension due to ongoing TRAs, as disseminated by a robust and low-cost distributed signaling mechanism called DISTRESS.
Our work distinguishes itself by proposing the first distributed self-enforcing mitigation approach for TRAs in MAWiNs. Existing solutions are not directly comparable: they counteract other types of attacks, rely on attack detection, require centralized control, or have been designed for single-hop networks and cannot cope with the multi-hop nature of TRAs. However, an advantage of the proposed framework is that by analyzing all feasible action selection rules (12), it enables an exhaustive search of a wide class of selfish nodes' multistage strategies. Therefore, optimum rules can be found according to the selected criteria, so that comparisons with particular solutions, existing or to be found in the future, are less relevant.
Fig. 11. Percentage of after-game attackers depending on initial ill-behaved node placement in the grid topology with high mobility.
Our framework also enables a precise statement of postulates regarding a desirable game outcome: opt-out (well-behaved nodes need not play), termination (finite game duration is guaranteed), rationality (ill-behaved nodes select best-reply behavior), defensibility and efficiency (well- and ill-behaved nodes receive satisfactory QoS), and survivability (little traffic is threatened by forwarding service suspension). We have argued that ill-behaved nodes are likely to use a strategy which, under certain assumptions regarding MAWiN topology and traffic flows, guarantees that these postulates are satisfied. The game outcome remains desirable even for a broader class of static-topology MAWiNs, as demonstrated by Monte Carlo simulations; time-true simulations using ns-3 extend this conclusion to networks with mobile nodes.
Although it has been analyzed assuming fixed e2e-flow routes, the DISTRESS mechanism can work with alternate routing as well. Provided that a node is aware of all currently available routes for the originated e2e-flow, it only needs to mark itself as in-exposure upon reception of a DISTRESS flag from at least one node on each of these routes. Under dynamic routing, the forward-reliance relationship may be time-varying; a DISTRESS flag received from a node on a given route can then be interpreted by a source node as a noncommittal signal to propagate DISTRESS flags further, depending on the projected route stability.
Finally, mechanisms providing QoS security, similar to DISTRESS, might produce viable game-theoretic defense against QoS abuse in other distributed settings offering QoS differentiation; this is left for future research.