A Complex Networks Approach for Reliability Evaluation of Swarm Systems Under Malicious Attacks

The reliability of swarm systems needs to be investigated because of their self-adaption and self-organized features. Previous studies mainly focused on the reliability of a single agent, whereas for swarm systems, the ability to maintain their overall connection and function under adverse conditions is more important. It remains challenging on how to evaluate the reliability of swarm systems at the system level. In this paper, we present a reliability evaluation method for swarm systems by characterizing the behavior of the whole system using the method of temporal network analysis. A novel comprehensive reliability metric, i.e. cooperation reliability (CR), is proposed, considering both system integrity and motion consensus. Meanwhile, by identifying critical individuals in swarm systems, we design different malicious attack strategies. It is found that the malicious attacks perform much more harmfully to the reliability of swarm systems than noise and random attack. Moreover, we find that the reliability of a swarm system is sensitive to the swarm density in our framework due to the dynamical interaction of the system. Our findings may shed light on understanding the complicated behaviors of swarm systems under attack and designing a more robust swarm system.


I. INTRODUCTION
Applications of swarm systems are emerging in various fields, showing clearly the importance of swarm systems. For instance, several UAV swarms, such as Gremlins, Coyote drones, and Perdix, are designed for military tasks like distributed reconnaissance, cooperative attack, and saturation attack; In the civilian field, swarm systems are being applied to distributed sensing, emergency rescue etc. These engineering applications take advantage of the superiority of swarm systems in information sharing, collaboration, and robustness, and require the swarm systems to be reliable. Beginning with the observation of behaviors of biological flocks, researchers have been studying swarm systems for decades. Researchers studied the swarm systems in nature [1]- [10] thoroughly and designed a number of artificial swarm The associate editor coordinating the review of this manuscript and approving it for publication was Cristian Zambelli . systems [11]- [15] to achieve robustness, flexibility, scalability, and more importantly, superior capabilities of the whole system compared with single agents [13], [14]. However, under perturbations like noise or malicious attacks, swarm systems may become unstable and even collapse. Thus, the consensus of the system is difficult to maintain [16]- [21], creating security challenges [22], [23] and reliability concerns [24], [25].
Initially, researchers mainly focused on reliability of a single agent. Winfield et al. [24] proposed three different tentative approaches [25] to evaluate the reliability of a robot swarm. These approaches tried to evaluate the reliability of swarm system by the function of reliabilities of all the agents in the system, but ignored the system's self-adaption and selforganized features, since they were developed on an arbitrary assumption, that the whole system breaks down only if all of its individuals break down. Due to the dynamical change of system topology, usually the reliability function between VOLUME 8, 2020 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ the swarm system and the single agent is absolutely different, especially under attack situation. The above problems constitute a major challenge as how to evaluate the overall reliability of swarm systems at the system level.
To study the topology and system-level ability of swarm systems, researchers employed complex network theory. They regard a swarm system as a network composed by the individuals (nodes) and the interactions among them (links), and analyze the network by statistical mechanics, graph theory, control theory etc. Using this methodology, Komareji and Bouffanais [18] found that the networks of their swarm model based on the k-nearest neighbor rules exhibit small-world effect and possess Poissonian-like indegree distribution, differing from the power law characteristic of scale-free networks. Moreover, researchers of swarm systems have gained extraordinary results in consensus problem [17], [26], information flow [26], [27], controllability [18], and resilience [18]- [21], [28]. In this paper, we expect that network-based methodology can help us to grasp the big picture of the whole system and develop a overall reliability evaluation approach for swarm systems.
The vulnerability of swarm systems is another compelling problem to study due to the network feature of swarm systems. Albert et al. [29] found that attacking highly connected nodes in a scale-free network can cause more significant damage to the topology than attacking those less connected ones, which revealed the attack vulnerability of scale-free networks. Jianwei Wang et al. proposed a cascading model to explain the attack vulnerability of scale-free networks [30]. Propagation of cascading failures [31], [32] and restoration from cascading failure [33]- [36] has been studied. After [29], a series of studies about attack tolerance or vulnerability of static networks [37]- [41] and temporal networks [42]- [47] emerged. Metrics like degree, closeness, betweenness etc. has been practiced to identify the significance of nodes [48], [49]. Nevertheless, faced with malicious attacks, whether the temporal network of a swarm system exhibits vulnerability just like scale-free networks do is uncertain.
In this paper, we model swarm systems based on the Vicsek model [16], where the function of the particles is ordered motion. We construct temporal networks for our swarm systems and define the system-level reliability of a swarm system as the ability to maintain swarm behavior, i.e. cooperation reliability (CR), which is a combination of system integrity and motion consensus. This is realistic for real situation. For example, UAV swarm with reconnaissance mission needs to transfer and exchange information with each other based on clusters in the system, where connection guarantees the interaction ability. Swarm also needs motion consensus in a self-adaptive formation flight for the ease of system control. To measure the cooperation reliability, the swarm all-terminal reliability (SATR) is defined to evaluate the integrity of the system and the consensus metric is introduced to evaluate the motion consensus of all the individuals. Moreover, using identification methods of vital nodes in complex networks, we design three attack strategies, including random attack and two malicious attack strategies, and study their influences on the reliability of our swarm systems. Specifically, we do not consider the reliability of individual particles, i.e. we assume that every single particle's reliability equals to 1. In all, our main contributions can be summarized as: • A novel reliability metric is proposed, combining system integrity and motion consensus.
• We analyze the temporal networks of swarm systems based on percolation theory, considering the selfadaption and self-organized features of the network.
• We also study and compare the reliabilities and behaviors of swarm systems under different attack strategies. The rest paper is organized as follows. We introduce the models in section II, including Vicsek model and temporal networks. Reliability metrics are proposed in section III and attack strategies are designed in section IV. We present and discuss our results in section V. Finally, we summarize this paper in section VI.

II. MODELS
A. VICSEK MODEL [16] In this section, we describe some basic principles of the Vicsek model briefly and explain how we employ this model in this paper.
The Vicsek model consists of a square plane with periodic boundary conditions and N self-driven particles. These particles move continuously in the plane and their positions and directions of motion are updated at each time step. A particle's motion direction is calculated by averaging the directions of all the neighbors and then adding an external noise.
At a certain time step t, the neighbors of a particle i are all the nearby particles that are within a distance of r, represented by a set of particles P i (t), where d ij denotes the distance from particle i to particle j.
Parameter r can be considered as the maximum communication range among particles. The direction of particle i is updated by the expression where θ i (t + 1) is the direction of particle i at step t + 1, and ∆θ 1 , denoting noise, is randomly chosen from the interval [−η/2, η/2]. In (2), θ (t) r represents the average of directions of all the neighbors of the given particle. If θ j (t) represents the direction of particle j, then θ(t) r is subject to the expression where N i (t) represents the number of the neighbors of particle i. Furthermore, (x i (t), y i (t)) represents the site of particle i, which is updated by the expression where ν is the speed of all the particles.

B. TEMPORAL NETWORKS CONSTRUCTION
Represented by G(t) = (V , E(t)), the temporal network of a swarm system based on the Vicsek model is an unweighted and undirected graph at each time step. Node V i in the networks represents particle i of the swarm system. E(t) represents the interactions (links) among particles. A link fails when the distance between the two particles is larger than r. Here, the distance between particles means the Euclidean distance.
We show how the networks of swarm systems evolve in Fig. 1, with a case of low density and a case of high density. Initially, particles of the swarm system with low density are scattered in the field and the clusters are fragmented and dispersive ( Fig. 1(a)). Gradually, those particles flock and construct large clusters as time passes ( Fig. 1(b) and 1(c)). Whereas, in the other case, the swarm system's network maintains fully connected from the beginning to the end because of high swarm density.

III. RELIABILITY METRICS DEFINITION
The reliability of a system is the probability that it maintains its required function without failure under required work conditions for a given period of time [50]. For different swarm systems, their functions can be various: animal swarms need to migrate, hunt, or evade predators etc; artificial swarm systems can be employed to search and rescue victims, monitor or attack enemies etc. No matter what the specific function is, the swarm systems need to maintain swarm behavior. Here, we propose a method that describes the commonality of variant swarm systems. According to the definition of reliability referred in the beginning, we consider the swarm behavior as the required function of our swarm systems and the noise or malicious attacks as the potential required work conditions. The reliability of a swarm system is thus defined as the ability to maintain swarm behavior under required work conditions like noise or malicious attacks for a given period of time. Different from traditional reliability of electronic or mechanical systems, swarm reliability evaluation requires taking a full consideration on its highly dynamical, interdependent and resilient properties, revealing the profound relationship between swarm reliability and its properties. The failure of a single particle or a part of system may not collapse the system [18]. Therefore, swarm reliability evaluation needs to focus on the behavior of the whole system. VOLUME 8, 2020 To measure the ability of a swarm system to maintain swarm behavior, we consider two aspects, system integrity and motion consensus. We believe that integrity and consensus are the two indispensable properties of swarm behavior. There are two reasons behind this thought. Firstly, the integrity (or wholeness) of a swarm system reflects the connection of its network, which is critical to the information sharing and the decision making. Secondly, the consensus ensures that the individuals in the swarm system act cooperatively. Our results reveal that under harsh conditions like strong noise or malicious attacks, neither of these two properties can guarantee the reliable operation of the system. In other words, in some cases, there exists gathered but disordered swarms or ordered but fragmented swarms. To evaluate the reliability of swarm systems, both integrity and consensus need to be considered.
Network methods are effective in studying the integrity of swarm systems. For instance, Komareji and Bouffanais [18] used the size of giant strongly connected component (GSCC) and the number of strongly connected components (SCC) as two integrity metrics to analyze the resilience of consensus in their model. We define swarm all-terminal reliability (SATR), which is inspired by the all-terminal reliability of networks, to evaluate the integrity of a swarm system in this paper. Compared with GSCC, the SATR describes the overall feature of the system rather than feature of the largest cluster. Given that every single particle's reliability equals to 1, we define SATR of swarm systems as follow.
Definition 1: Swarm All-terminal Reliability SATR(t), the probability that each pair of particles are connected by at least one path. At time step t, if the swarm system is composed of N particles and m(t) clusters, where each cluster contains N 1 , N 2 , . . . , N m(t) particles respectively, the SATR(t) of this swarm system is obtained from the expression where N j 2 is the 2-combination of cluster j and N 2 is the 2-combination of the whole system. Note that SATR is very different from the traditional ATR of a static network. Traditional ATR is probability of each pair of nodes being connected given the probability of failure for each link. However, SATR emphasizes the number of clusters in the system. Cluster size distribution is significant in percolation theory [51], [52], which captures the features of phase transition. Fewer clusters means a more integrated of the system and vice versa. The cluster statistics is widely used in traditional swarm system analyses [53]- [55]. SATR is the overall characterization of cluster statistics. When the system only has one cluster, i.e., m(t) = 1, SATR(t) reaches its maximal value 1. When all particles disconnect from the system, i.e., m(t) = N , SATR(t) = 0. Thus, SATR is more suitable and objective to describe the swarm system. On the other hand, we use the order parameter that Vicsek et al. used in [16] as consensus metric to measure the motion consensus. Definition 2: Consensus metric V a (t), the average velocity of the whole swarm system [16], where N is the number of particles in the system and ν i is the normalized velocity of particle i. V a (t) is approximately 0 when the directions of all the particles are totally unordered. When they move toward the same direction, V a (t) = 1.
To evaluate the reliability of swarm systems comprehensively, we define the cooperation reliability (CR) as the reliability metric as follow.
Definition 3: Cooperation Reliability CR(t), the ability of a swarm system to maintain swarm behavior at the time step t. We calculate CR(t) by the expression Fig. 2 shows how the cooperation reliability converges as the system evolving. The stronger noise, the lower steady value of the cooperation reliability.

IV. ATTACK STRATEGIES
To simulate the potential external attacks on swarm systems, different attack strategies are considered. They are expected to destroy the swarm behavior of swarm systems as efficiently as possible. Let p denote the proportion of the attacked particles, then the number of the attacked particles is denoted by N × p. The attacks on particles are adopted by disturbing their moving directions. The direction of an attacked particle, particle i, is updated by the expression where ∼ θ i (t +1) denotes the direction under attack and θ i (t +1) is the direction obtained from (2). Similar to angle ∆θ 1 , ∆θ 2 is also a uniform distribution random number within the given interval and it denotes the external attack.
Then the location of the attacked particle i is updated by Notably, the noise and the attacks both influence the swarm behavior by disturbing the moving directions of particles, but the differences between them are remarkable. (i) Firstly, the noise is added on all the particles in the system, whereas the attacks only influence a part of particles. (ii) Secondly, usually the strength and effect of noise is relatively mild, whereas the attacks are much stronger. In our study, the attacks on a particle can be intuitively interpreted as 2π noise on this particle. No matter which attack strategy we employ, the way to update the directions of the attacked particles are the same, that is to update the directions according to (8). The difference lies in the selections of the attacked particles. Next, we will introduce three different attack strategies based on different methods of selection.

A. RANDOM ATTACK
In random attack (RA), the attacked particles are randomly chosen from all the particles, ignoring their positions or any other properties. This strategy is designed as a comparison of the malicious attacks presented subsequently. Moreover, the random attack can simulate the failures of individuals in the swarm system. In robot swarm systems, for example, failures of communication module or motion module can cause such errors of direction updating we designed by (8).

B. MALICIOUS ATTACKS
To break down the swarm systems efficiently, the malicious attacks target those vital particles, using vital nodes identification methods of complex networks. We rank all the particles decreasingly according to a certain temporal network property and then select the top N × p particles to attack. We define two specific malicious strategies using different temporal metrics as follow.

1) TEMPORAL DEGREE ATTACK
For a certain node V i , its degree is the number of other nodes that directly connects to it [48]. Nodes connected to more nodes are more vital in some cases since they are able to affect more nodes. Since all the particles are moving, the degree of each particle is time-varying. Therefore, we define the temporal degree D i (t) as the degree of node V i at a certain time step t. Under this strategy, the particles are ranked by their temporal degrees. We call this strategy degree attack (DA) for short in the rest of this paper.

2) TEMPORAL CLOSENESS CENTRALITY ATTACK
The closeness centrality of networks tries to identify the particles that are the closest to the information flow. The particles with higher closeness centralities are closer to the center of a network. Since the networks of swarm systems can be fragmented, we adopt the definition of closeness centrality in (10) [48], which is applicable to both fully connected networks and fragmented networks. For the node V i of a network with N nodes, the closeness centrality is calculated by the expression where l ij denotes the distance between V i and V j . Here, the distance represents the length of the shortest path between these two nodes. Particularly, if node V i and node V j are not connected, the distance between them equals to infinity. The temporal closeness centrality CC i (t) represents the closeness centrality of node V i at a certain time step t. The strategy that attacks particles in terms of their closeness centralities is called closeness attack (CA) for short.

V. RESULTS AND DISCUSSION
We investigate the evolvement and reliability of swarm systems under various conditions. The results show how the noise and different attack strategies affect the integrity and the consensus of swarm systems. The average of the cooperation reliability, CR , is obtained by where SATR denotes the average of SATR and V a denotes the average of V a over the 100 realizations.

A. NOISE EFFECT
The effects of noise on the motion consensus of swarm systems have been studied by Vicsek et al. [16] and Czirók and Vicsek [56]. Nevertheless, how noise influences the integrity of a swarm system is unclear. Komareji and Bouffanais [18] studied the evolvement of the size of GSCC and the number of SCC, yet they only presented the result of a fixed noise. Fig. 3 shows the behaviors of particles under various noises. Firstly, from the perspective of single particles, the trajectories of particles under stronger noise are more winding. As a result, particles under stronger noise travel less distance by the same steps. Particularly, the particles under 2π rad noise (the highest noise) just linger in small regions around their original locations, because the noise is so strong. Secondly, from the perspective of the entire system, the swarm system under stronger noise is more fragmented, indicating lower SATR, and trajectories of all the particles are more disordered, namely lower V a . Also, under stronger noise, the particles are more scattered (Fig. 3(a)-3(c)), resulting from the disturbance of the noise on the swarm behavior.
To investigate the effects of noise quantitatively, we calculate SATR, V a , and CR of swarm systems with various densities under various levels of noise. We find that the stronger the noise, the lower SATR, V a , and CR (Fig. 4). The reason is that under noise, the swarm systems are easier to break apart.
Moreover, Fig. 4 shows that under the same noise, the swarm systems with higher densities always exhibit  higher SATR, V a , and CR, which means the swarm systems with higher densities are more reliable. The fact is, when the density is higher, the particles are more likely to connect with each other. The density also influences SATR. When the density is high, SATR anti-intuitively increases as strong noise increases (Fig. 4(a)). (i) When the noise is weak, the system still maintains swarm behavior. With the strengthening of the noise, the networks are fragmented more severely, so SATR reduces. (ii) However, when the noise is strong enough, the swarm behavior vanishes and the particles are scattered. Given that the density is very high, the more scattered distribution of the particles makes it easier for the particles to connect with each other, so SATR anti-intuitively increases. Yet the particles do not achieve motion consensus in such case. More visually, the network of a high-density swarm system under strong noise is similar to the network in Fig. 1(d). This finding reveals the disadvantage of merely SATR in evaluating the reliability of swarm systems, that  it cannot tell whether the particles flock or just gather disorderly. Moreover, when the noise η increases from 0 to 0.1π, the consensus metric V a just decays a little, whereas SATR declines much more significantly (Fig. 4(a) and 4(b)), which indicates that the consensus metric V a is not enough to evaluate the reliability of swarm systems either.

B. EFFECTS OF ATTACK STRATEGIES 1) RANDOM ATTACK
The reliability of swarm systems under random attack with various failure proportion p and density ρ has been studied.
Normally, SATR decreases as the failure proportion p grows ( Fig. 5(a)). The consensus metric and the cooperation reliability monotonically decrease as p increases (Fig. 5(b) and 5(c)).

2) MALICIOUS ATTACKS
Malicious attacks are expected to break down the swarm systems efficiently. To investigate their impacts, we first observe the networks of the swarm systems under malicious attacks and the trajectories of the particles in Fig. 6.
It shows that the swarm systems under malicious attacks are more fragmented and the motions of their particles are VOLUME 8, 2020  more disordered, compared with the swarm systems only under noise in Fig. 3(a) and 3(d). However, we also notice that the attacked particles usually gather together without motion consensus, as shown in Fig. 6 by their trajectories. We call this phenomenon the disordered gathering of attacked particles.
As for reliability of swarm systems, Fig. 7 shows that SATR, V a , and CR suffer serious damages under the malicious attacks, even when the attack proportion is small. Whereas, due to the disordered gathering of attacked particles, increasing the attack proportion cannot enlarge the much damage on SATR. It can even be counterproductive when the density is too high (Fig. 7(a)).
Furthermore, trends of these metrics with different densities exhibit subtle diversities. The SATR, V a , and CR decrease more significantly in the swarm systems with lower densities, indicating that the swarm systems with lower densities are more vulnerable. Fig. 8 presents the comparison of three attack strategies with various attack proportions. The CR of swarm systems under malicious attacks are lower than that under the random attack. This diversity shows that (1) the malicious attacks are more efficient to break down the swarm systems than random attack and (2) some particles are more critical to the whole system than others, though there is no difference in their individual functions or performances. Moreover, between two malicious strategies, the closeness attack is more efficient than the degree attack, suggesting that the particles with high closeness centralities are more critical than the particles with high degrees. To compare the effects of these three attack strategies further, we investigate SATR, V a , and CR with a fixed attack proportion p = 0.1, where we notice the highest difference among effects of these three attack strategies. Fig. 9 reveals that no matter for SATR, V a , or CR, malicious attacks perform better attack effect than random attack. The particles with high degree or high closeness centrality are able to influence more particles. Thus, they are more important for the swarm systems to maintain swarm behavior. Meanwhile, within the framework of Vicsek model, the superiority of closeness attack over degree attack reveals that the importance of individuals is not only related to the number of neighbors (or degrees), but more importantly, the average distance to the other individuals, i.e. the location in the whole system.

VI. CONCLUSION
In this paper, we propose a novel reliability evaluation method for swarm systems by characterizing the system behavior, considering system integrity and motion consensus jointly. Compared with previous reliability metrics of swarm systems, the cooperation reliability emerges from the interaction among agents, reflecting the self-adaption and self-organized features of swarm behavior. This method is widely applicable for various swarm systems since it describes the commonality of different swarm systems. Moreover, to demonstrate this method, we study the reliability of swarm systems based on the Vicsek model under adverse work conditions including noise, random attack, and malicious attacks. The malicious attacks are designed by vital nodes identification in complex network. Simulation results show that the cooperation reliability comprehensively reflects the swarm system's ability to maintain its function under different work conditions. Also found is the vulnerability of swarm systems under malicious attacks, which indicates that misbehavior of different individuals influences behavior of the whole system differently, due to diverse locations in the system or connections to other individuals. This finding can be helpful for developing attack or protection strategies of swarm systems.
Yet, the malicious attack strategies designed in this paper have some shortcomings, since it is difficult for attackers to obtain the time-varying topology information of the swarm networks instantly. We will study the structural formation pattern of dynamic swarm networks in the future, try to uncover the relevance between failures and the structure of a dynamic swarm system in terms of topology changes and timing sequence, and further design a more robust swarm system. GUANGHAN BAI received the B.Sc. and M.Sc. degrees from the National University of Defense Technology, Changsha, China, and the Ph.D. degree in mechanical engineering from the University of Alberta, Edmonton, AB, Canada, in 2016.
He is currently a Lecturer with the Laboratory of Science and Technology on Integrated Logistics Support, National University of Defense Technology. His research interests include network reliability and system resilience.
YI YANG was born in 1978. She received the Ph.D. degree from the Nanjing University of Science and Technology, in 2008.
She was engaged in postdoctoral research with the School of Reliability and Systems Engineering, Beihang University (BUAA), where she has been a Professor, since 2019. Her main research interests include reliability analysis and design, traffic networks, control science and engineering, CPS, and belief reliability. VOLUME 8, 2020