Hiding From Centrality Measures: A Stackelberg Game Perspective

Centrality measures can rank nodes in a social network according to their importance. However, in many cases, a node may want to avoid being highly ranked by such measures, e.g., as is the case with terrorist networks. In this work, we study a confrontation between the seeker—the party analyzing a social network using centrality measures—and the evader—a node attempting to decrease its ranking according to such measures. We analyze the possible outcomes of modifying, i.e., adding or removing, a single edge by the evader, showing that even without complete knowledge about the network, the effects of the modification on the evader's ranking can often be predicted. We study the computational complexity of finding a set of modifications that reduce the evader's centrality ranking in an optimal way, proving that these decision problems are NP-complete. Moreover, we provide a 2-approximation for the degree centrality, and logarithmic approximation boundaries for the closeness and betweenness centralities. Finally, we define and investigate a Stackelberg game between the seeker and the evader, providing a Mixed Integer Linear Programming formulation of finding an equilibrium. Altogether, we provide a thorough analysis of the strategic aspects of hiding from centrality measures in social networks.


I. INTRODUCTION
Ever since the dawn of the Internet Age, a rapidly growing amount of information about our daily lives has been uploaded to the Web. A plethora of this data, such as our conversations, our likes and dislikes, and even our relationships, can be represented using network structures. Simultaneously, we can observe the development of an increasing number of social network analysis tools and techniques capable of inferring various information from the data publicly available online. This raises a privacy-related concern, as members of social networks are no longer able to keep their sensitive information private.
Centrality measures are among the most widely used social network analysis tools [1], [2]. A centrality measure is an algorithm that estimates the relative importance of nodes in a network. In other words, with the use of centrality measures, it is possible to identify the key players in a network, where the exact notion of importance depends on the centrality measure of choice. However, there exist situations in which such key players might not want to be identified. We have already mentioned the issues pertaining to the privacy of Internet users. At first glance, the fact that an average social media user might prefer to evade analysis performed with centrality measures may seem unimportant. However, when we consider the situation of opposition bloggers in authoritarian regimes, the consequences of being identified as the most important node in the network may be much more dire. These circumstances allow us to sympathize with a member of the social network who wishes to avoid being detected by a centrality measure. Nevertheless, one can present a scenario in which an ability to evade centrality measures presents a grave danger to public safety. In particular, centrality analysis is often used to pinpoint the leaders of criminal [3] and terrorist [4] organizations. In such situations we would like to diminish the probability that the actual ringleader of the network avoids detection.
Given this ethical asymmetry of possible real-life scenarios, in our analysis we consider an abstract model of hiding from centrality measures in a social network. More specifically, we consider a confrontation between the seeker, a third party who uses centrality measures to pinpoint the most important nodes in the network, and the evader, a member of the social network who wishes to avoid being detected by the seeker. In this work, we assume that the seeker and the evader are aware of each other's existence (see Section V-A for the comparison with previous works on hiding from centrality measures where this assumption was not in place). This turns our model into a game-theoretic setting, where both players try to achieve their goals by applying carefully selected strategies. The set of strategies of the seeker consists of centrality measures that can be used to find the most important nodes in the network. On the other hand, the strategies of the evader take the form of adding or removing some of the edges to mislead the centrality analysis performed by the seeker.
To be more precise, in our work we consider a number of research questions. First, assuming limited knowledge about the network structure, is it possible to predict the effect of adding or removing a given edge on the evader's ranking? We answer this question theoretically in Section IV and empirically in Section VII-B. Second, is it possible for the evader to find an optimal way of hiding? We resolve this issue in Section V by analyzing the computational complexity of the problem faced by the evader. Third, what is the outcome of the confrontation between the seeker and the evader? To address this question, we formally define a Stackelberg game between the seeker and the evader in Section VI and study its equilibria in Section VII-C. Altogether, our work presents an exploration of the strategic aspects of hiding from centrality measures in a social network.
The motivation for our work is twofold. From the perspective of the seeker, the results of our work could give law enforcement agencies a fresh insight into the possible ways in which members of criminal and terrorist organizations avoid detection. This insight is particularly crucial, as centrality measures are one of the key tools for analyzing covert networks. From the perspective of the evader, our work could be of use to the members of communities that are discriminated against, such as specific ethnic groups in authoritarian regimes. Careful rewiring of the social network structure might help the leaders of such communities elude harsh repressions.
A preliminary version of this work was published in the Proceedings of the 20th Conference on Autonomous Agents and Multi-Agent Systems (AAMAS 2021) [5]. The new results added in this extended version are:
- Theoretical analysis of whether the addition or removal of a given edge can increase or decrease the ranking of the evader, and what can be the magnitude of this change (Section IV).
- The approximation version of the problem and the analysis of its computational complexity for different centrality measures (Section V-B).
- A formal proof that the MIQP and MILP formulations of the seeker-evader game equilibrium are equivalent (Section VI-B).
- An empirical analysis of the sensitivity of centrality measures (Section VII-B).
- Experiments with the seeker-evader game in the Ambassador network, as well as Newman and Prüfer random network generation models (Section VII-C).

II. RELATED WORK
Our article is part of the literature on using edge perturbations to manipulate network properties. The purpose of these interventions could be minimizing the average distance between nodes [6], promoting health-related behaviors [7], and reducing the number of small dense subgroups [8]. The body of literature that is most relevant to our setting considers the task of strategically manipulating centrality measures. Some works focus on the problem of adding edges to increase the closeness centrality of a given node [9], or to increase multiple centralities at the same time [10].
In this work, we attempt not to increase, but rather to decrease the value of centrality, in the hope of hiding a selected node. This problem was considered both for the centrality value of a selected evader [11] and for the ranking position of a group of the network's leaders [12], [13], [14]. Another facet of the problem that was analyzed in the literature is the axiomatic characterization of the centrality measures that are resilient to manipulation [15]. While most works consider a standard network structure, some examine networks that consist of multiple layers [16], or temporal networks where edges exist only at specific moments [17].
A related body of literature concerns itself with hiding from other types of social network analysis tools. These types of evasion techniques are often motivated by the need for privacy protection [18]. Social media users who do not wish some of their undisclosed relationships to be uncovered might be interested in heuristic solutions designed to mislead link prediction algorithms [19], [20], [21]. Others might want to counter the analysis performed using node similarity measures [22]. A group of people might wish to avoid being identified as a closely-knit faction by community detection algorithms [11], [23], [24]. Yet another class of techniques allows hiding the identity of the source of network diffusion from source detection algorithms [25]. Some techniques have also been proposed to prevent the inference of an edge type in signed networks where each relation is tagged as either positive or negative [26].
In an even wider perspective, our study is a part of the literature on adversarial attack and defense in networks [27], [28]. Many of the works are focused on attacking machine learning methods processing the network data, either by manipulating the data they are trained on (poisoning attack) [29], [30], [31] or by manipulating the input to an already trained algorithm (evasion attack) [32], [33], [34]. Another example of such an adversarial setting is a confrontation between an attacker trying to spread a diffusion process in a network and a defender trying to stop it [35].

III. PRELIMINARIES
In this section, we present the basic network notation and concepts that will be used throughout the article. For the convenience of the reader, Table I provides a summary of the notation used in the article.

A. Basic Network Notation
Let $G = (V, E)$ denote a network, where $V = \{v_1, \ldots, v_n\}$ is the set of $n$ nodes and $E \subseteq V \times V$ is the set of edges. We denote by $(v_i, v_j)$ an edge between the nodes $v_i$ and $v_j$. In this work we focus on undirected networks, i.e., we do not discern between edges $(v_i, v_j)$ and $(v_j, v_i)$. We also assume that networks do not contain self-loops, i.e., $\forall_{v_i \in V}\ (v_i, v_i) \notin E$. We denote by $N_G(v_i)$ the set of neighbors of $v_i$ in $G$, i.e., $N_G(v_i) = \{v_j \in V : (v_i, v_j) \in E\}$. To make the notation more readable, we will often omit the network itself from the notation whenever it is clear from the context, e.g., by writing $N(v_i)$ instead of $N_G(v_i)$. This applies not only to the notation presented thus far, but rather to all notation in this article.
A path in $(V, E)$ is an ordered sequence of nodes, $p = \langle v_{i_1}, \ldots, v_{i_k} \rangle$, in which every two consecutive nodes are connected by an edge in $E$. The length of a path is equal to the number of edges therein. For any pair of nodes, $v_i, v_j \in V$, we denote by $\Pi(v_i, v_j)$ the set of all shortest paths between $v_i$ and $v_j$, and we denote by $d(v_i, v_j)$ the distance between $v_i$ and $v_j$, i.e., the length of a shortest path between $v_i$ and $v_j$.
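We will often focus on a particular node $v^\dagger \in V$, called the evader. Let $\overline{V}$ denote the set of all nodes other than $v^\dagger$, i.e., $\overline{V} = V \setminus \{v^\dagger\}$. Furthermore, let $\widetilde{V}$ denote the set of all nodes other than $v^\dagger$ and the neighbors of $v^\dagger$, i.e., $\widetilde{V} = V \setminus (\{v^\dagger\} \cup N(v^\dagger))$. We divide the edges into three classes relative to the evader. The first class, $\zeta^\dagger_1$, consists of the edges incident with the evader, i.e., $\zeta^\dagger_1 = \{v^\dagger\} \times \overline{V}$. The second class, $\zeta^\dagger_2$, consists of the edges between the neighbors of the evader, i.e., $\zeta^\dagger_2 = N(v^\dagger) \times N(v^\dagger)$. The third class, $\zeta^\dagger_3$, consists of every remaining edge, i.e., $\zeta^\dagger_3 = \widetilde{V} \times \overline{V}$. Notice that although the elements of each such class will be referred to as "edges", they may or may not be present in any given network. This is unlike the elements of $E$, which are the edges that are present in the network $G = (V, E)$.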

B. Centrality Measures
A centrality measure is a function, $c(G, v_i)$, that expresses the relative importance of any given node $v_i$ in the network $G$ [1]. In this work we consider four fundamental centrality measures, namely degree, closeness, betweenness, and eigenvector.
Degree centrality [36] quantifies the importance of a node based on the number of its neighbors. Formally, the normalized degree centrality of a node $v_i \in V$ in a network $G$ is:
$$c_{degr}(G, v_i) = \frac{|N_G(v_i)|}{n - 1}.$$
Closeness centrality [37] assigns the importance of a node based on an average distance to all other nodes. Formally, the normalized closeness centrality of a node $v_i \in V$ is:
$$c_{clos}(G, v_i) = \frac{n - 1}{\sum_{v_j \in V} d(v_i, v_j)}.$$
Betweenness centrality [38], [39] measures the importance of a given node in the context of network flow. The normalized betweenness centrality of a node $v_i \in V$ is:
$$c_{betw}(G, v_i) = \frac{2}{(n - 1)(n - 2)} \sum_{\{v_j, v_k\} \subseteq V \setminus \{v_i\}} \frac{|\{p \in \Pi(v_j, v_k) : v_i \in p\}|}{|\Pi(v_j, v_k)|}.$$
Eigenvector centrality [40] quantifies the importance of a given node based on the importance of its neighbors. Formally, the eigenvector centrality of a node $v_i$ is:
$$c_{eig}(G, v_i) = \chi^*_i,$$
where $\chi^*$ is the eigenvector corresponding to the largest eigenvalue of the adjacency matrix of the network $G$.
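The following is a minimal sketch of how the normalized degree, closeness, and betweenness centralities can be computed for a small connected, undirected network using only breadth-first search. The function names and the adjacency-dictionary representation are assumptions made for illustration; eigenvector centrality is omitted, and the pairwise betweenness loop is meant for readability rather than efficiency.

```python
from collections import deque
from itertools import combinations


def bfs(adj, source):
    """Return hop distances from source and the number of shortest paths to each node."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]
    return dist, sigma


def centralities(adj):
    """Normalized degree, closeness, and betweenness (assumes a connected graph, n >= 3)."""
    nodes = list(adj)
    n = len(nodes)
    info = {v: bfs(adj, v) for v in nodes}
    degree = {v: len(adj[v]) / (n - 1) for v in nodes}
    closeness = {v: (n - 1) / sum(info[v][0].values()) for v in nodes}
    betweenness = {}
    for v in nodes:
        dist_v, sigma_v = info[v]
        total = 0.0
        # Sum over unordered pairs of other nodes the fraction of shortest paths through v.
        for s, t in combinations([u for u in nodes if u != v], 2):
            dist_s, sigma_s = info[s]
            if dist_s[v] + dist_v[t] == dist_s[t]:
                total += sigma_s[v] * sigma_v[t] / sigma_s[t]
        betweenness[v] = 2 * total / ((n - 1) * (n - 2))
    return degree, closeness, betweenness


# Toy example: a star with center 0 and four leaves.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
degree, closeness, betweenness = centralities(star)
```

In the star example the center attains the maximum value of every measure, while each leaf has zero betweenness.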

C. Influence Models
The propagation of influence in a network can be described in terms of node activation. At the beginning of the process only a selected set of nodes (known as the seed set) is activated. Inactive nodes can become activated when they are sufficiently influenced by their neighbors. Assume that the process consists of discrete rounds. We then denote by $I(t) \subseteq V$ the set of active nodes in round $t$, where $I(1)$ is the seed set. The influence model under consideration determines the exact conditions of a node becoming active. In this work we consider two models of influence: independent cascade and linear threshold.
In the independent cascade [41] model, every pair of nodes is assigned an activation probability, $p : V \times V \to [0, 1]$. In every round $t > 1$ every node $v_i \in V$ that became active in round $t - 1$ activates each inactive neighbor $v_j \in N(v_i) \setminus I(t - 1)$ with probability $p(v_i, v_j)$. The process ends when there are no newly activated nodes, i.e., $I(t) = I(t - 1)$.
In the linear threshold [42] model, every node $v_i \in V$ is assigned a threshold value $t_{v_i}$ sampled from the set $\{0, \ldots, |N(v_i)|\}$ according to some probability distribution. In every round $t > 1$, every inactive node $v_i$ becomes active if $|N(v_i) \cap I(t - 1)| \geq t_{v_i}$. The process ends when there are no newly activated nodes, i.e., when $I(t) = I(t - 1)$.
In either model, the influence of a node, $v_i$, on another node, $v_j$, is denoted by $\iota(G, v_i, v_j)$ and is defined as the probability that $v_j$ gets activated given the seed set $\{v_i\}$. We assume that $\iota(G, v_i, v_i) = 0$ for all $v_i \in V$. We define the influence of $v_i$ over the entire network as $\iota(G, v_i) = \sum_{v_j \in V} \iota(G, v_i, v_j)$. When referring to the influence of a given node, we mean the influence over the entire network.
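Since the influence of a node over the entire network is a sum of activation probabilities, it equals the expected number of nodes activated besides the seed, and can be estimated by simulation. The sketch below does this for the independent cascade model; the function names, the number of runs, and the fixed random seed are assumptions made for illustration.

```python
import random


def simulate_cascade(adj, prob, seed_node, rng):
    """Run one independent cascade and return the activated nodes other than the seed."""
    active = {seed_node}
    frontier = [seed_node]  # nodes activated in the previous round
    while frontier:
        newly_active = []
        for u in frontier:
            for w in adj[u]:
                # Each newly active node gets one chance to activate each inactive neighbor.
                if w not in active and rng.random() < prob(u, w):
                    active.add(w)
                    newly_active.append(w)
        frontier = newly_active
    return active - {seed_node}


def estimate_influence(adj, prob, seed_node, runs=1000, seed=0):
    """Monte Carlo estimate of the expected number of nodes activated by seed_node."""
    rng = random.Random(seed)
    total = sum(len(simulate_cascade(adj, prob, seed_node, rng)) for _ in range(runs))
    return total / runs


# Toy example: a path 0 - 1 - 2 - 3 with uniform activation probabilities.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
always = estimate_influence(path, lambda u, w: 1.0, seed_node=0, runs=10)
never = estimate_influence(path, lambda u, w: 0.0, seed_node=0, runs=10)
half = estimate_influence(path, lambda u, w: 0.5, seed_node=0)
```

With probability 1 the cascade always reaches the three other nodes, with probability 0 it never spreads, and intermediate probabilities give values in between.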

IV. POSSIBLE CHANGES IN CENTRALITY RANKING
We first focus on the question of how adding a specific edge to the network, or removing a specific edge from it, can affect the ranking of a given node $v^\dagger \in V$ according to a centrality measure $c$. Can the centrality ranking both increase and decrease after a given network modification? And what about the magnitude of this change: can it be arbitrarily large, or is it strictly limited? Importantly, we focus on the ranking of $v^\dagger$ according to centrality $c$, rather than on the centrality value of $v^\dagger$ according to $c$. The ranking is the position of $v^\dagger$ in the list of all nodes, sorted according to their centrality values. We assume that nodes with the same centrality value have the same ranking.
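As a small illustration of the difference between centrality values and rankings, the sketch below ranks nodes so that equal values share a position. It assumes that a node's position is one plus the number of nodes with a strictly greater value, which is one possible tie-handling convention; the helper names and the toy network are hypothetical.

```python
def ranking(scores):
    """Map each node to 1 + the number of nodes with a strictly greater score."""
    return {v: 1 + sum(1 for u in scores if scores[u] > scores[v]) for v in scores}


def degree_scores(nodes, edges):
    """Unnormalized degree of every node; normalization does not change the ranking."""
    deg = {v: 0 for v in nodes}
    for u, w in edges:
        deg[u] += 1
        deg[w] += 1
    return deg


# Toy example: a 4-clique containing the evader "e" and its neighbors "a", "b", "c".
nodes = ["e", "a", "b", "c"]
edges = [("e", "a"), ("e", "b"), ("e", "c"), ("a", "b"), ("a", "c"), ("b", "c")]

before = ranking(degree_scores(nodes, edges))["e"]
# Removing one edge incident with the evader lowers its degree below that of "b" and "c".
after = ranking(degree_scores(nodes, [e for e in edges if e != ("e", "a")]))["e"]
```

Here the evader moves from a shared first position to third, even though its degree drops by only one.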
In our analysis we divide the edges that can be added to or removed from the network into three classes: edges incident with the evader $\zeta^\dagger_1$, edges between the neighbors of the evader $\zeta^\dagger_2$, and the remaining edges $\zeta^\dagger_3$ (all three classes are formally defined in Section III-A). The reason for this division is that an evader who would like to strategically control their centrality ranking has varying levels of control over different edges. The evader can exercise the greatest control over the edges incident with the evader, i.e., the edges from the class $\zeta^\dagger_1$. The addition of this type of edge can be interpreted as making a telephone call to someone, while the removal of this type of edge can represent removing someone from a list of friends on a social media platform. We can assume that the evader has a smaller amount of control over edges between their neighbors, i.e., the edges from the class $\zeta^\dagger_2$. The addition of this type of edge can be interpreted as introducing two friends to each other, while the removal of this type of edge can represent asking two associates to cease contact with each other. Finally, we can assume that the evader has the least amount of control over edges outside of their direct network vicinity, i.e., the edges from the class $\zeta^\dagger_3$. The addition of this type of edge can be interpreted as inviting two strangers to the same event, while the removal of this type of edge can represent deleting data about a certain connection from a database.
We can state the question that we intend to investigate as follows: Given a centrality measure $c$, a network $G = (V, E)$, an evader $v^\dagger \in V$, and a class of edges $\zeta^\dagger \in \{\zeta^\dagger_1, \zeta^\dagger_2, \zeta^\dagger_3\}$, can adding an edge $e \in \zeta^\dagger$ to the network, or removing it from the network, increase or decrease the ranking of $v^\dagger$ according to $c$? Our findings on this matter are summarized in Table II. Due to space limitations, the proofs of our results can be found in Section S1 of the supplementary materials, which can be found on the Computer Society Digital Library at http://doi.ieeecomputersociety.org/10.1109/TKDE.2023.3267854.

V. COMPUTATIONAL COMPLEXITY ANALYSIS
Having analyzed the possible outcomes of adding or removing a single edge from the network, we now analyze the computational complexity of the problem of selecting the best subset of edges to hide the evader from centrality measures. Table III summarizes our results.

A. Decision Version of the Problem
We first formally define the computational problem faced by the evader who can perform only local changes.

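Definition 1 (Local Hiding): This problem is defined by a tuple $(G, v^\dagger, b, c, \widehat{A}, \widehat{R}, \delta)$, where $G = (V, E)$ is a network, $v^\dagger \in V$ is the evader, $b \in \mathbb{N}$ is a budget specifying the maximum number of edges that can be added or removed, $c$ is a centrality measure, $\widehat{A} \subseteq N(v^\dagger) \times N(v^\dagger)$ is the set of edges allowed to be added, $\widehat{R} \subseteq \{v^\dagger\} \times N(v^\dagger)$ is the set of edges allowed to be removed, and $\delta \in \mathbb{N}$ is the safety margin. The goal is to identify a set of edges to be added, $A^* \subseteq \widehat{A}$, and a set of edges to be removed, $R^* \subseteq \widehat{R}$, such that $|A^*| + |R^*| \leq b$ and the resulting network $(V, (E \cup A^*) \setminus R^*)$ contains at least $\delta$ nodes with centrality $c$ greater than that of the evader.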
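To make the definition concrete, the following sketch decides a Local Hiding instance for the degree centrality by exhaustively trying every combination of allowed modifications within the budget. It runs in exponential time and is only meant for tiny instances; the function names and the edge representation are assumptions made for illustration, not an algorithm from this article.

```python
from itertools import combinations


def degrees(nodes, edges):
    """Degree of every node, with edges given as 2-element frozensets."""
    deg = {v: 0 for v in nodes}
    for edge in edges:
        for v in edge:
            deg[v] += 1
    return deg


def local_hiding(nodes, edges, evader, budget, addable, removable, delta):
    """Return (added, removed) edge sets solving the instance, or None if none exists."""
    edges = {frozenset(e) for e in edges}
    candidates = [("add", frozenset(e)) for e in addable if frozenset(e) not in edges]
    candidates += [("remove", frozenset(e)) for e in removable if frozenset(e) in edges]
    for size in range(budget + 1):  # smaller modification sets are tried first
        for choice in combinations(candidates, size):
            added = {e for op, e in choice if op == "add"}
            removed = {e for op, e in choice if op == "remove"}
            deg = degrees(nodes, (edges | added) - removed)
            # Safety margin: at least delta nodes must be strictly more central.
            if sum(1 for v in nodes if deg[v] > deg[evader]) >= delta:
                return added, removed
    return None


# Toy example: the evader "e" is the center of a star with leaves "a", "b", "c".
nodes = ["e", "a", "b", "c"]
edges = [("e", "a"), ("e", "b"), ("e", "c")]
addable = [("a", "b"), ("a", "c"), ("b", "c")]    # edges between the evader's neighbors
removable = [("e", "a"), ("e", "b"), ("e", "c")]  # edges incident with the evader

too_small = local_hiding(nodes, edges, "e", 2, addable, removable, delta=1)
solution = local_hiding(nodes, edges, "e", 3, addable, removable, delta=1)
```

In this toy instance no two modifications suffice, whereas three do (for example, removing one incident edge and connecting one remaining neighbor to the two other neighbors).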
As can be seen from the definition, we focus on two kinds of network modifications: removing edges incident with the evader, and adding edges between the neighbors of the evader. This choice is informed by the results presented in Section IV. As can be seen in Table II, when it comes to the edges incident with the evader, i.e., edges belonging to the class $\zeta^\dagger_1$, only the removal operation can decrease the evader's ranking according to all three centrality measures (notice how the addition cannot affect the degree centrality in a beneficial way). Similarly, when considering edges between the evader's neighbors, i.e., edges belonging to the class $\zeta^\dagger_2$, only the addition of such edges has a chance of making the evader more hidden from all three measures (the removal can hide the evader from neither the degree nor the closeness centrality). We could also consider adding edges outside of the direct network vicinity of the evader, i.e., edges belonging to the class $\zeta^\dagger_3$, as this operation can also result in decreasing the ranking of the evader according to all three centrality measures. However, as discussed in Section IV, the evader typically has the least amount of control over such edges. Hence, for the sake of realism of the problem, we focus on the modifications of edges belonging to the first two classes.
Let us now comment on the practical aspects of executing network modifications as part of the evader's strategy. In most cases, the evader may remove edges that they are a part of relatively easily, e.g., by ceasing contact with a specific acquaintance. On the other hand, the addition of edges between neighbors might be more demanding. That said, forming connections with friends of friends (i.e., triadic closure) is one of the driving mechanisms of social network formation [43], [44]. What is more, research has shown that two-thirds of Facebook users are willing to accept friend requests from complete strangers [45], suggesting they might be even more likely to accept invitations from friends of friends. Nevertheless, to accommodate situations in which some of the edge additions or removals are impossible to implement, we introduce sets $\widehat{A}$ and $\widehat{R}$ that precisely designate which network changes the evader is able to perform.
We now discuss the key differences between the above problem of Local Hiding and the problem of Disguising Centrality studied by Waniek et al. [11]. First, instead of seeking the optimal way of decreasing the value of the evader's centrality (which may not provide sufficient cover, especially if they are still ranked among the top nodes in the network), we want the position of the evader in the centrality-based ranking of all nodes to drop below $\delta$. Second, we assume that the evader is only capable of rewiring edges within their network neighborhood. This assumption holds in many realistic settings, e.g., the evader is able to disconnect from any of their friends, or even ask two of them to befriend one another, but is unable to connect to a complete stranger at will, or ask two strangers to befriend or unfriend one another.
We also comment on the key differences between our Local Hiding problem and the problem of Hiding Leaders studied by Waniek et al. [12], [14] in the context of constructing covert networks. First, the authors divide the nodes into leaders and followers, where the changes in the network are allowed only among the followers. Second, they only allow edges to be added among the followers, meaning that no edge can be removed from the network.
Below we present the proof of one of our results. Due to space limitations, the remaining proofs can be found in Section S2 of the supplementary materials, available online.
Theorem 1: The problem of Local Hiding is NP-complete given the degree centrality.
Proof: The problem is trivially in NP, since after the addition of a given set of edges $A^*$ and the removal of a given set of edges $R^*$ it is possible to compute the degree centrality of all nodes in polynomial time.
Fig. 1. An example of the construction used in the proof of Theorem 1 for $k = 3$. Some edges are printed grey for better readability. Green dotted lines correspond to the edges allowed to be added.
Next, we prove that the problem is NP-hard. To this end, we give a reduction from the NP-complete problem of Finding k-Clique, where the goal is to determine whether there exist $k$ nodes in $G$ that form a clique.
Given an instance of the problem of Finding k-Clique, defined by $k \in \mathbb{N}$ and a network $G = (V, E)$, let us construct a network, $H = (V', E')$, as follows (an example of this construction is presented in Fig. 1):
From the definition of the problem we know that the edges to be added to $H$ must be chosen from $E$, i.e., from the network in the Finding k-Clique problem. Out of these edges, we need to choose a subset, $A^* \subseteq E$, as a solution to the Local Hiding problem (as $\widehat{R} = \emptyset$, we are not allowed to remove any edges). In what follows, we will show a correspondence between a solution to the constructed instance of the Local Hiding problem and a solution to the given instance of the Finding k-Clique problem.
First, note that $v^\dagger$ has the highest degree in $H$, which is $n + k - 2$. Thus, in order for $A^*$ to be a solution to the constructed Local Hiding problem instance, the addition of $A^*$ to $H$ must increase the degree of at least $k$ nodes in $V$ such that each of them has a degree of at least $n + k - 1$ (notice that the addition of $A^*$ only increases the degrees of nodes in $V$, since we already established that $A^* \subseteq E$). Since the degree of every node $v_i$ in $H$ equals $n$ (because of the way $H$ is constructed), in order to increase the degree of $k$ such nodes to $n + k - 1$, each of them must be an end of at least $k - 1$ edges in $A^*$.
Assume that there exists a solution $V^*$ to the given instance of the Finding k-Clique problem, i.e., a subset $V^* \subseteq V$ of size $k$ forming a clique in $G$. We will show that $V^* \times V^*$ is a solution to the constructed instance of the Local Hiding problem. Since the nodes in $V^*$ form a clique in $G$, we have that $V^* \times V^* \subseteq E$. Since $\widehat{A} = E$, we also have that $V^* \times V^* \subseteq \widehat{A}$. Finally, since the nodes in $V^*$ form a clique in $G$, the addition of $V^* \times V^*$ to $H$ increases the degree of each of the $k$ nodes in $V^*$ by exactly $k - 1$. We showed that if there exists a solution to the given instance of the Finding k-Clique problem, then there also exists a solution to the constructed instance of the Local Hiding problem.
Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
Assume that there exists a solution to the constructed instance of the Local Hiding problem, i.e., $A^* \subseteq \widehat{A}$ the addition of which to $H$ increases the degree of at least $k$ nodes in $V$ by at least $k - 1$. However, since the budget is $b = \frac{k(k-1)}{2}$, the only possible choice of $A^*$ is the one that increases the degree of exactly $k$ nodes in $V$ by exactly $k - 1$ each. Hence, the edges in $A^*$ induce a clique of size $k$ in $\widehat{A}$. However, since $\widehat{A} = E$, the same edges also induce a clique of size $k$ in $G$. We showed that if there exists a solution to the constructed instance of the Local Hiding problem, then there also exists a solution to the given instance of the Finding k-Clique problem.
We proved that a solution to the given instance of the Finding k-Clique problem exists if and only if there exists a solution to the constructed instance of the Local Hiding problem.

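B. Approximation Version of the Problem
Having discussed the decision version of the Local Hiding problem, let us now define its approximation version.
Definition 2 (Minimum Local Hiding): This problem is defined by a tuple $(G, v^\dagger, c, \widehat{A}, \widehat{R}, \delta)$, where $G = (V, E)$ is a network, $v^\dagger \in V$ is the evader, $c$ is a centrality measure, $\widehat{A} \subseteq N(v^\dagger) \times N(v^\dagger)$ is the set of edges allowed to be added, $\widehat{R} \subseteq \{v^\dagger\} \times N(v^\dagger)$ is the set of edges allowed to be removed, and $\delta \in \mathbb{N}$ is the safety margin. The goal is to identify a set of edges to be added, $A^* \subseteq \widehat{A}$, and a set of edges to be removed, $R^* \subseteq \widehat{R}$, such that $|A^*| + |R^*|$ is minimal and the resulting network $(V, (E \cup A^*) \setminus R^*)$ contains at least $\delta$ nodes with centrality $c$ greater than that of the evader.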
Notice that while in the Local Hiding problem we asked whether or not there exists a solution within a certain budget, in the Minimum Local Hiding problem we are looking for a solution that is as small as possible (hence, when approximating, we are accepting solutions that are not of the optimal size). This key difference results in a distinct way of analyzing this class of problems, as we will see in the proofs below.
Due to space limitations, the proofs of our results can be found in Section S2 of the supplementary materials, available online.

VI. THE SEEKER-EVADER GAME
Having analyzed the computational complexity of the problem faced by the evader, we now move to defining the confrontation between the seeker and the evader as a game.

A. The Game Definition
The game takes place between two players: the seeker, who is a party analyzing a social network, and the evader, who is one of the nodes of the social network analyzed by the seeker. The seeker uses a centrality measure to identify the most important node of the social network, while the evader wishes to avoid being pinpointed as the most important node. We model this confrontation as a Stackelberg game [46]. A Stackelberg game is a game between two players, a leader and a follower. The leader moves first, selecting one of their strategies. This move is observed by the follower, who then selects one of their strategies as a response. In our case, the leader is the seeker, whose set of strategies $C_S$ consists of the centrality measures that can be used to analyze the network. The follower is the evader, who observes the centrality measure used by the seeker and selects a strategy from the set $\Xi_E$. Each strategy of the evader consists of removing some edges from the network and adding some edges to the network.
We now discuss the utility functions of both players, starting with that of the evader. In the theoretical analysis presented so far we focused our attention on the problem of lowering the centrality ranking of the evader. Here, we introduce another factor that can motivate the evader, i.e., their influence over the network, measured using one of the models presented in Section III-C.
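The leader-follower order of moves can be illustrated with a few lines of backward induction over pure strategies. The sketch assumes that the seeker simply tries to minimize the evader's utility (the zero-sum assumption made later in this section) and commits to a single centrality measure; the function name, strategy labels, and utility values are hypothetical.

```python
def stackelberg_pure(seeker_strategies, evader_strategies, evader_utility):
    """Return (centrality, response, value) for a seeker committing to a pure strategy."""
    best = None
    for c in seeker_strategies:
        # The evader observes c and best-responds to it.
        response = max(evader_strategies, key=lambda xi: evader_utility(c, xi))
        value = evader_utility(c, response)
        # The seeker prefers the commitment that leaves the evader the least utility.
        if best is None or value < best[2]:
            best = (c, response, value)
    return best


# Hypothetical utilities of the evader for two centralities and two rewiring strategies.
table = {
    ("degree", "keep"): 0.1, ("degree", "cut"): 0.8,
    ("closeness", "keep"): 0.2, ("closeness", "cut"): 0.5,
}
outcome = stackelberg_pure(
    ["degree", "closeness"], ["keep", "cut"], lambda c, xi: table[(c, xi)]
)
```

In this toy table the evader answers either centrality with the "cut" strategy, so the seeker commits to closeness, which limits the evader to a utility of 0.5.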
Let $c \in C_S$ be the strategy selected by the seeker, and let $\xi \in \Xi_E$ be the strategy selected by the evader. We define the utility of the evader, $U_e(\varphi, c, \xi)$, in terms of the following components:
- $U^R_e(c, \xi)$ is the evader's utility coming from their ranking position according to the centrality measure $c$ selected by the seeker, in the network resulting from introducing network modification $\xi$,
- $U^I_e(\xi)$ is the evader's utility coming from the change in their influence over the network after introducing network modification $\xi$,
- $\varphi \in \Phi$ is the type of the evader (with $m$ being the number of types), determining whether the evader is more focused on their ranking position or their influence.
Next, we discuss the formulas of $U^R_e(c, \xi)$ and $U^I_e(\xi)$. Fig. 2 presents the plots of both functions.
The evader's utility based on the centrality ranking, $U^R_e(c, \xi)$, is defined in terms of Euler's number $e$, the evader's position $\rho(c, \xi)$ in the ranking of $c$ after executing network modification $\xi$, the curve steepness $k$, the inflection point $d$, and two constants $\alpha$ and $\beta$ with $\alpha = 1 - 2\beta$. In our simulations we use the values of $d = 5$ and $k = \frac{3}{d}$. Notice that the function defined this way has a number of desirable properties. First, if the evader is ranked first, i.e., $\rho(c, \xi) = 1$, their utility is equal to zero. Second, as the evader becomes more hidden their utility increases, i.e., $U^R_e(c, \xi)$ increases with $\rho(c, \xi)$. Third, the function is convex for $\rho(c, \xi) \leq d$, i.e., the marginal gain in utility increases with the evader's ranking, until the evader reaches position $d$. Fourth, the function is concave for $\rho(c, \xi) \geq d$, i.e., further decreasing the evader's ranking beyond position $d$ has diminishing returns.
The evader's utility based on the influence, $U^I_e(\xi)$, is defined as a function of $\psi(\xi)$, the relative change in the evader's influence after executing the strategy $\xi$, i.e., $\psi(\xi) = \frac{\iota(\xi) - \iota_0}{\iota_0}$, with $\iota_0$ and $\iota(\xi)$ denoting the evader's influence before and after executing the strategy $\xi$, respectively.
Let us now comment on the properties of the function defined this way. First, $U^I_e(\xi)$ is concave for $\psi(\xi) \leq 0$, i.e., the marginal loss in utility grows with the loss in influence. Intuitively, this can be interpreted as the evader not minding a negligible loss of influence, but strongly opposing a significant decrease. Second, throughout its domain the value of $U^I_e(\xi)$ has a similar order of magnitude to the value of $U^R_e(c, \xi)$, meaning that the aggregated utility of the evader $U_e(\varphi, c, \xi)$ is not dominated by either of those two utilities.
We now describe the utility function of the seeker. The seeker-evader game is a zero-sum game. Hence, the goal of the seeker is to minimize the total utility of the evader, i.e., the utility of the seeker is $U_s(\varphi, c, \xi) = -U_e(\varphi, c, \xi)$. We assume that the utility functions of both players and the distribution of the evader types are common knowledge, while the actual type of the evader is unknown to the seeker.
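The properties listed for the ranking-based utility (zero when the evader is ranked first, increasing, convex up to position $d$ and concave beyond it) are satisfied by a shifted and rescaled logistic curve. The sketch below is one such function, given purely as an assumption for illustration: the specific form, the choice of $\beta$ as the logistic value at position 1, and the function name are hypothetical, while the defaults $d = 5$ and $k = 3/d$ follow the simulation settings mentioned above.

```python
import math


def ranking_utility(rho, d=5.0, k=None):
    """Assumed logistic-shaped utility of holding ranking position rho (1 = most central)."""
    if k is None:
        k = 3.0 / d  # curve steepness

    def logistic(x):
        return 1.0 / (1.0 + math.exp(-k * (x - d)))  # inflection point at x = d

    beta = logistic(1)        # shift chosen so that the utility at position 1 is zero
    alpha = 1.0 - 2.0 * beta  # rescaling constant, matching alpha = 1 - 2 * beta
    return (logistic(rho) - beta) / alpha


values = [ranking_utility(rho) for rho in range(1, 12)]
```

Under these assumptions the utility is 0 at position 1, reaches 1 at position $2d - 1$, and keeps growing slowly afterwards.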

B. Finding the Optimal Strategies
We now formulate the problem of identifying the optimal strategies of the seeker and the evader as a mixed-integer quadratic program (MIQP). As a reminder, the set of strategies of the seeker consists of centrality measures that can be used to analyze the network, while the set of strategies of the evader consists of different ways of rewiring the network. Let $t(\varphi)$ be the probability that the evader type is $\varphi$. Since the seeker knows the distribution of the evader's types, but not the actual type, they are likely to use a mixed strategy, trying to optimize the choice of centrality based on different types of the evader they might be facing. Let $p(c)$ be the probability that the seeker plays pure strategy $c \in C_S$. Moreover, let $q(\varphi, \xi)$ be the probability that an evader of type $\varphi$ plays pure strategy $\xi \in \Xi_E$. Notice that since the evader observes the strategy chosen by the seeker and moves second, they can restrict their choice to pure strategies, i.e., we have that $\forall_{\varphi \in \Phi} \forall_{\xi \in \Xi_E}\ q(\varphi, \xi) \in \{0, 1\}$. The problem of finding the optimal strategies can now be formulated as follows.
In order to solve the problem efficiently, we linearize it based on the procedure described by Paruchuri et al. [47]. The main idea of the procedure is to introduce the variable z(φ, c, ξ) = p(c)q(φ, ξ). The problem can then be formulated as a mixed-integer linear program (MILP) as follows.
Definition 4 (MILP formulation):
The mixed-integer linear program finding the optimal strategies is:
The formal proof that the linearized version of the formulation indeed describes the same problem as the MIQP formulation can be found in Section S3 of the supplementary materials, available online.
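To make the role of the substitution more concrete, the following identities are a short worked illustration and not a restatement of Definition 4. They assume only that p is a probability distribution over C^S and that each q(φ, ·) selects exactly one pure strategy; under these assumptions the bilinear terms of the objective become linear in z, and the marginals of z recover p and q.

```latex
\begin{align*}
\sum_{\xi \in \Xi^E} z(\varphi, c, \xi)
  &= p(c) \sum_{\xi \in \Xi^E} q(\varphi, \xi) = p(c), \\
\sum_{c \in C^S} z(\varphi, c, \xi)
  &= q(\varphi, \xi) \sum_{c \in C^S} p(c) = q(\varphi, \xi), \\
\sum_{\varphi \in \Phi} t(\varphi) \sum_{c \in C^S} \sum_{\xi \in \Xi^E}
  p(c)\, q(\varphi, \xi)\, U_e(\varphi, c, \xi)
  &= \sum_{\varphi \in \Phi} t(\varphi) \sum_{c \in C^S} \sum_{\xi \in \Xi^E}
  z(\varphi, c, \xi)\, U_e(\varphi, c, \xi).
\end{align*}
```

Since the first marginal does not depend on φ, a linearized program of this kind typically also requires the sums over ξ of z(φ, c, ξ) to coincide for all evader types, which ties the variables back to a single seeker strategy p.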

VII. EMPIRICAL ANALYSIS
In this section we present the results of our simulations. We first detail the network datasets and random network generation models that we use. Then, we describe the results of our experiments with the sensitivity of centrality measures. Finally, we present the experimental analysis of the seeker-evader game.

A. Network Datasets and Models
In our experiments we consider the following real-life network datasets (their characteristics are presented in Table IV):
• WTC [48]: the network of terrorists responsible for the 9/11 attacks in 2001.
• Madrid [49]: the network of terrorists responsible for the 2004 Madrid train bombings.
• Bali [50]: the network of terrorists responsible for the 2002 Bali bombing.
• Ambassador [51]: the network of terrorists responsible for the bombing of the Philippine ambassador's residence in Jakarta in 2000.
• Facebook [52]: the ego network of a student of one of the American colleges.

TABLE IV CHARACTERISTICS OF THE REAL-LIFE DATASETS CONSIDERED IN OUR SIMULATIONS

We also consider the following random network models:
• Barabási-Albert networks [53]: preferential attachment networks with scale-free degree distribution. In our experiments we add 5 links with each new node (which results in the expected degree of 10) and we set the size of the initial clique to 5.
• Erdős-Rényi networks [54]: networks with the structure of a random graph. In our experiments we set the expected average degree to 10.
• Watts-Strogatz networks [55]: networks exhibiting the small world property. In our experiments we set the expected average degree to 10 and the probability of rewiring to 1/4.
• Newman networks [56]: networks with the scale-free structure, but without the preferential attachment property, generated using the configuration model. In our experiments we set the configuration model parameter to 2.3.
• Prüfer networks [57]: random trees generated using Prüfer sequences. We use sequences where each element is chosen uniformly at random from the set {1, ..., n}.
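As an illustration of the last model, the sketch below decodes a Prüfer sequence into a tree using the standard decoding procedure. It is a minimal example under the assumption that nodes are labeled 1, ..., n; the function names are hypothetical and the code is not taken from the implementation used in the experiments.

```python
import heapq
import random


def prufer_to_tree(sequence, n):
    """Decode a Prüfer sequence over labels 1..n into the n - 1 edges of a tree."""
    if len(sequence) != n - 2:
        raise ValueError("a Prüfer sequence for n nodes has length n - 2")
    # Each node's degree is one plus its number of occurrences in the sequence.
    degree = {v: 1 for v in range(1, n + 1)}
    for v in sequence:
        degree[v] += 1
    leaves = [v for v in degree if degree[v] == 1]
    heapq.heapify(leaves)
    edges = []
    for v in sequence:
        leaf = heapq.heappop(leaves)  # smallest remaining leaf
        edges.append((leaf, v))
        degree[v] -= 1
        if degree[v] == 1:
            heapq.heappush(leaves, v)
    # Exactly two leaves remain; they form the final edge.
    edges.append((heapq.heappop(leaves), heapq.heappop(leaves)))
    return edges


def random_prufer_tree(n, rng=random):
    """Draw each sequence element uniformly at random from {1, ..., n} and decode it."""
    sequence = [rng.randint(1, n) for _ in range(n - 2)]
    return prufer_to_tree(sequence, n)
```

Because every sequence of length n − 2 corresponds to exactly one labeled tree, sampling the elements uniformly yields a uniformly random labeled tree on n nodes.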

B. Sensitivity of Centrality Measures
In Section IV we investigated what changes in ranking are possible after addition or removal of an edge belonging to one of the three classes: edges incident with the evader ζ^†_1, edges between the neighbors of the evader ζ^†_2, and other edges ζ^†_3. However, even though our analysis resolved whether the ranking change can happen or not, it remains unclear how probable it is to happen. To resolve this issue, we now perform empirical analysis.
The networks that we consider are described in Section S4 of the supplementary materials, available online. We use the real-life networks as they are, whereas for each of the random models we generate 1,000 networks with 100 nodes. For each network under consideration G = (V, E) (whether real-life or random) we compute the initial rankings of degree, closeness, betweenness, and eigenvector centralities. For each pair of nodes v, w ∈ V we consider a network G′ resulting from either adding (v, w) to G (in case (v, w) ∉ E), or removing (v, w) from G (in case (v, w) ∈ E). We then compute the rankings of all centralities in G′ and for every node in the network we record how its ranking positions changed as a result of adding or removing (v, w) (notice that for every node in the network the edge (v, w) belongs to one of the classes ζ^†_1, ζ^†_2, or ζ^†_3). Some of the results of our simulations are presented in Fig. 3; the remaining results can be found in Figs. S7 and S8 in the supplementary materials, available online. As can be seen from the figures, in the vast majority of cases we are able to predict whether the ranking will increase or decrease with high certainty. For example, the removal of an edge belonging to the class ζ^†_1 almost always results in a decrease in ranking, while the addition of such an edge almost always results in an increase. For edges belonging to classes ζ^†_2 and ζ^†_3 the possibility that the ranking will not change at all becomes significant (indeed, it is often the most probable outcome). However, if we disregard network modifications that do not affect centrality rankings, either the decrease or the increase in ranking is much more probable than its counterpart in most cases. Hence, even without the knowledge necessary to compute the centrality ranking, e.g., information about the structure of the entire network, we can usually predict how a given network modification will affect the centrality rankings.
Fig. 3. Percentage of the edge modifications that result in a given change in the evader's centrality ranking. Results for random networks are presented as an average over 1000 networks with 100 nodes and the average degree of 10 generated using each model. Labels present values rounded to the nearest percent; values below 0.5% have been omitted for readability.
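The experimental loop described above can be sketched for degree centrality as follows. This is a minimal illustration rather than the code used in the experiments: the network is assumed to be a dictionary mapping each node to the set of its neighbors, the ranking position is assumed to be one plus the number of nodes with strictly greater degree (the tie-breaking convention is an assumption), and all function names are hypothetical.

```python
def edge_class(adj, evader, v, w):
    """Classify (v, w) relative to the evader: 1 = incident with the evader,
    2 = between two neighbors of the evader, 3 = any other edge."""
    if evader in (v, w):
        return 1
    if v in adj[evader] and w in adj[evader]:
        return 2
    return 3


def degree_rank(adj, node):
    """Assumed convention: position 1 is the most central node; ties share a position."""
    return 1 + sum(1 for u in adj if len(adj[u]) > len(adj[node]))


def toggle_edge(adj, v, w):
    """Return a copy of the network with (v, w) removed if present, added otherwise."""
    new_adj = {u: set(nbrs) for u, nbrs in adj.items()}
    if w in new_adj[v]:
        new_adj[v].discard(w)
        new_adj[w].discard(v)
    else:
        new_adj[v].add(w)
        new_adj[w].add(v)
    return new_adj


def rank_positions(adj, evader, v, w):
    """Evader's degree-ranking position before and after toggling (v, w)."""
    return degree_rank(adj, evader), degree_rank(toggle_edge(adj, v, w), evader)
```

A larger position number after the modification corresponds to a decrease in ranking, i.e., the evader becomes better hidden. The same loop applies to closeness, betweenness, and eigenvector centrality once `degree_rank` is replaced by the corresponding ranking function.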
There remains a question about the magnitude of the ranking change, i.e., even if we can predict whether the centrality ranking of the evader will increase or decrease, can we predict the number of positions by which it will change? Fig. 4 presents some of our results regarding the magnitude of the ranking change; the remaining results can be found in Fig. S9 in the supplementary materials, available online. As can be seen from the figure, adding or removing edges belonging to the class ζ^†_1, i.e., edges incident with the evader, not only gives the greatest chance of predicting whether the ranking will increase or decrease, but also

C. Seeker-Evader Game Experiments
In this section we present an empirical analysis of the seeker-evader game. We first describe the experimental procedure, before presenting results for real-life and random networks.
For a given network G = (V, E) we first select the evader v^† ∈ V as the node with the highest average position in rankings generated by the four centrality measures considered in this study, i.e., degree, closeness, betweenness, and eigenvector centralities. We then generate strategies of the evader. The exact set of strategies under consideration depends on whether we consider real-life or random networks and is described in detail below.
For each evader strategy, we then execute it on the original network and compute the evader's ranking positions according to all four centrality measures, as well as the evader's influence according to the independent cascade model with the activation probability p = 0.1, and the linear threshold model with the uniform distribution of thresholds. We consider the set of evader's types Φ = {0.25, 0.50, 0.75}. For each evader type, we compute the value of U_e using the formula presented in Section VI-A, with the influence value of the evader being taken as an average over the two influence models mentioned above.
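The influence computation under the independent cascade model can be approximated by Monte Carlo simulation, as in the hedged sketch below. It assumes that the evader is the only seed and that influence is measured as the expected number of activated nodes including the seed; neither assumption is confirmed by the text above, and the function names and the number of runs are hypothetical.

```python
import random
from collections import deque


def independent_cascade(adj, seed, p, rng):
    """Run one independent cascade from a single seed; return the set of active nodes."""
    active = {seed}
    frontier = deque([seed])
    while frontier:
        u = frontier.popleft()
        # A newly activated node gets one chance to activate each inactive neighbor.
        for v in adj[u]:
            if v not in active and rng.random() < p:
                active.add(v)
                frontier.append(v)
    return active


def estimate_influence(adj, seed, p=0.1, runs=1000, rng=None):
    """Average number of activated nodes (seed included) over repeated cascades."""
    rng = rng or random.Random()
    total = sum(len(independent_cascade(adj, seed, p, rng)) for _ in range(runs))
    return total / runs
```

With p = 1 the estimate equals the size of the seed's connected component, and with p = 0 it equals one, which gives two quick sanity checks for any implementation of this kind.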
Finally, we compute the equilibrium of the seeker-evader game by using the MILP formulation presented in Section VI-B.
To this end, we utilize the PuLP library version 2.6.0 in Python version 3.7.9.
We now empirically analyze the seeker-evader game in real-life networks. Given the small size of these networks, we are able to generate all possible strategies of the evader with the budget of at most b = 4 (for WTC and Madrid datasets) or at most b = 3 (for the other datasets). In other words, we generate all k-subsets of edges between the evader and their neighbors R (that can be removed from the network) and non-edges between the neighbors of the evader Â (that can be added to the network) for k ≤ b. To speed up the MILP computation, we remove all evader strategies that are dominated by another strategy, i.e., we remove a strategy if another strategy is at least as good according to all four centrality measures and both influence measures. The number of remaining (undominated) strategies is presented in the last column of Table IV. Notice that increasing the size of the network causes the growth of the set of potential evader strategies, but does not affect the effectiveness of the equilibria computation once the effective strategies against each centrality measure have been identified. Here, we consider smaller networks to be able to exhaustively search the space of all strategies. Below, we use MILP to compute equilibria in networks with up to 100,000 nodes while focusing on a set of particularly effective evader strategies.
First, we investigate the utility of the evader U_e depending on the composition of the strategy, i.e., the number of removed edges from R, and the number of added edges from Â. Fig. S10 in the supplementary materials, available online, presents the value of U_e with the ranking of the evader taken as the average ranking over the four centrality measures. As can be seen from the figure, the greatest utility of the evader is consistently achieved for strategies that focus on edge removal, as opposed to edge addition. What is more, greater utility can be achieved by the evaders focused on their centrality ranking (greater values of φ), rather than by the evaders focused on their influence (smaller values of φ).
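The pruning of dominated strategies described above can be sketched as a simple Pareto filter. The example assumes that each strategy is summarized by a tuple of six scores (four ranking positions and two influence values) oriented so that larger is better for the evader. It also assumes that a strategy must be strictly better in at least one score to dominate another, and that only one of several strategies with identical scores is kept; these conventions and the function names are hypothetical.

```python
def dominates(a, b):
    """True if a is at least as good as b in every score and strictly better in one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def undominated(scores):
    """Filter a dict {strategy: score tuple} down to the undominated strategies."""
    kept = {}
    for name, vec in scores.items():
        if any(dominates(other, vec) for other in scores.values()):
            continue  # some other strategy is better for the evader
        if vec in kept.values():
            continue  # keep a single representative of identical score vectors
        kept[name] = vec
    return kept
```

The filter compares every pair of strategies, so its cost is quadratic in the number of strategies. Because a dominated strategy is never a strictly better response for any evader type, removing it leaves the achievable utilities unchanged while shrinking the program.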
The results regarding the equilibria of the seeker-evader game are presented in Fig. S11 in the supplementary materials, available online. As can be seen, in most networks the mixed strategy of the seeker involves almost exclusively using a particular centrality (the only exception being the WTC network). However, the exact centrality used by the seeker strongly depends on the network under consideration. Similarly, the evader usually uses the same strategy in a given network, no matter their type. The exact strategy choice depends on the network, although a strong preference for the strategies focused on edge removal can be observed. As for the utility of the evader, we can see that evaders with greater values of φ, i.e., evaders more focused on the centrality ranking rather than influence value, are able to achieve better expected utility.
We now move to the empirical analysis of the seeker-evader game in random networks. In our simulations we generate networks with 100,000 nodes. Given the significant size of the networks, we are unable to generate all possible strategies of the evader. Instead, we consider the repeated use of the hiding heuristic ROAM (Remove One, Add Many) proposed by Waniek et al. [11]. A single execution of ROAM with budget k (which we will denote ROAM(k)) consists of removing the connection between the evader v^† and their neighbor with the greatest degree v^*, followed by connecting v^* to k − 1 other neighbors of v^† with the lowest degrees. In our experiments with random networks we assume the total hiding budget of b = 12, and we consider evader strategies consisting of repeatedly executing one of the strategies ROAM(1), ROAM(2), ROAM(3), or ROAM(4). In other words, we consider the following four evader strategies:
• executing ROAM(1) twelve times, which in total removes twelve edges from the network,
• executing ROAM(2) six times, which in total removes six edges from the network and adds six edges to the network,
• executing ROAM(3) four times, which in total removes four edges from the network and adds eight edges to the network,
• executing ROAM(4) three times, which in total removes three edges from the network and adds nine edges to the network.
Fig. 5 presents the results regarding the utility of the evader with the ranking of the evader taken as the average ranking over the four centrality measures. As can be seen, running the ROAM(k) heuristic with greater values of k (i.e., focusing on edge addition, as opposed to edge removal) is slightly detrimental to the utility corresponding to centrality ranking, but it significantly improves the utility corresponding to the influence value. As for the evader types, we observe similar results to those in the real-life networks, with the evaders focused on their centrality ranking (greater values of φ) attaining greater utility values than the evaders focused on the influence (smaller values of φ). Altogether, the utility of the evader seems to be driven by their desire to maintain the influence over the network, as in terms of hiding from centrality measures, the considered strategies offer comparable performance. This is consistent with our findings regarding the equilibria of the game.
Fig. 6 presents the results pertaining to the equilibria of the seeker-evader game in random networks. As can be seen, the centrality measure used by the seeker varies significantly between the network types. As for the evader's strategy, ROAM(4) is the most commonly used, although for all network structures other ways of hiding are also in use. As for the utility of the evader, we can observe that evaders who are more focused on the centrality ranking rather than influence value achieve greater expected utility.
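The ROAM(k) heuristic described above can be sketched as follows. This is a minimal illustration based only on the description in this section, not the reference implementation of Waniek et al. [11]. It assumes that the network is a dictionary mapping each node to the set of its neighbors, that ties in degree are broken by node label, and that neighbors already linked to v^* are skipped when adding edges; all three are assumptions, and the function names are hypothetical.

```python
def roam(adj, evader, k):
    """Execute ROAM(k) once, modifying adj in place.

    Returns the removed edge and the list of added edges."""
    if not adj[evader]:
        return None, []
    # v*: the evader's neighbor with the greatest degree (smallest label on ties).
    v_star = max(sorted(adj[evader]), key=lambda v: len(adj[v]))
    adj[evader].discard(v_star)
    adj[v_star].discard(evader)
    # Remaining neighbors of the evader not yet linked to v*, lowest degree first.
    candidates = sorted(
        (v for v in adj[evader] if v not in adj[v_star]),
        key=lambda v: (len(adj[v]), v),
    )
    added = []
    for v in candidates[: k - 1]:
        adj[v_star].add(v)
        adj[v].add(v_star)
        added.append((v_star, v))
    return (evader, v_star), added


def repeat_roam(adj, evader, k, budget):
    """Spend a total budget by executing ROAM(k) budget // k times."""
    return [roam(adj, evader, k) for _ in range(budget // k)]
```

For example, `repeat_roam(adj, evader, 3, 12)` corresponds to executing ROAM(3) four times, removing four edges and adding up to eight. Fewer edges are added when the evader has too few eligible neighbors.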

VIII. CONCLUSION
In this work, we analyzed the problem of strategically decreasing the evader's centrality ranking in a social network by performing local edge perturbations. First, we investigated what ranking changes are possible after adding or removing a single edge, depending on whether that edge is incident with the evader, between two of the evader's friends, or outside the evader's local neighborhood. In the case of degree centrality it is usually easy to predict both the direction of the ranking change (i.e., whether the ranking increases or decreases) and its magnitude (i.e., by how many positions the ranking changes). However, in the case of closeness and betweenness centrality measures, it is most often impossible to make such predictions. Second, we analyzed the computational complexity of the problem faced by the evader when adding or removing multiple edges, rather than a single one, to reduce the evader's ranking. We found that identifying the best possible way of hiding from a given centrality measure is most probably computationally intractable (i.e., the corresponding decision problems are NP-complete), and that the optimal solution is usually difficult to approximate (although we were able to identify a 2-approximation algorithm for the degree centrality). Third, we modeled the confrontation between the seeker and the evader as a Stackelberg game. We not only defined the strategies and the utility functions of both players, but we also showed a mixed-integer linear programming formulation of identifying an equilibrium. Fourth, we used simulations on real-life and randomly generated networks to study the effect of a single edge addition or removal on the evader's ranking. We found that even if it is impossible to predict said effect with absolute certainty, based on our results it is possible to make an educated guess in the vast majority of cases. Finally, we performed an empirical analysis of the seeker-evader game on networks. We found that while the exact strategies used by both players vary significantly between settings, the evader usually favors strategies including edge removal. Moreover, evaders who are more focused on their centrality ranking, as opposed to their influence value, can generally achieve greater utility. Altogether, our study provides a broad analysis of the strategic aspects of hiding from centrality measures in social networks.
Our work can be extended in a number of ways. First, in this study, we focus on the case of a single evader. However, one could consider a setting with multiple evaders, either cooperating with each other (i.e., wishing to hide as a group) or confronting each other (i.e., working towards exposing their opponents to the seeker while at the same time remaining hidden), with each variant of the setting posing unique challenges. In the case of cooperating evaders, the strategy space available to them would grow significantly, potentially requiring new computational methods to find effective sets of network modifications, while the seeker might apply group centrality measures rather than the standard tools considered in this work. On the other hand, the case of adversarial evaders would greatly complicate the computation of equilibria, changing the decision of the evader from simply selecting the best response to taking into consideration the strategic incentives of all other evaders. Second, we considered the most popular centrality measures, as they are most widely implemented in actual software used for network analysis. However, one could study a broader portfolio of centrality measures, e.g., those based on game theory [58]. Alternatively, one could consider a similar setting in a different network class, e.g., in temporal networks [59] or multilayer networks [60]. Finally, equivalents of the seeker-evader game presented in this work can be developed for other social network analysis tools that already have hiding tools against an unsuspecting seeker. Potential candidates include link prediction algorithms [19], [20], [21], node similarity measures [22], community detection algorithms [11], [23], [24], and source detection algorithms [25].

Fig. 2. The evader's utility functions given d = 20 and k = 3 d. The dashed blue lines represent the inflection points of both functions.

Fig. 4. Magnitude of the change in the evader's centrality ranking. The first row presents results for real-life networks. Results in the second row are presented as an average over 1000 networks with 100 nodes and the average degree of 10 generated using each model. Error bars (very narrow in some cases) correspond to 95% confidence intervals. Scales in each row are fixed for easier comparison.

Fig. 5. Utility of the evader in random networks. Each column corresponds to a different network generation model. The first row presents the values of the utility based on centrality ranking U^R_e(c, ξ) and the utility based on the influence U^I_e(ξ). The second row presents the values of the aggregated utility U_e(φ, c, ξ) for different types of evaders. In each plot the x-axis corresponds to the heuristic used by the evader, while the y-axis corresponds to the utility value. The results are presented as an average over 100 networks with 100,000 nodes and the average degree of 10 generated for each model. The colored areas (very narrow) represent 95% confidence intervals.

Fig. 6. The seeker-evader game equilibria in random networks. Each column corresponds to a different network generation model. Each pie chart in the first row presents the mixed strategy selected by the seeker in an equilibrium of the seeker-evader game. Each group of bars in the second row presents the strategy of the evader, with each bar corresponding to a different type of the evader. Each group of bars in the third row presents the expected utility of the evader, with each bar corresponding to a different type of the evader. The results are presented as an average over 100 networks with 100,000 nodes and the average degree of 10 generated for each model. The error bars (very narrow in some cases) represent 95% confidence intervals.

TABLE I SUMMARY OF THE NOTATION USED IN THE ARTICLE

TABLE II SUMMARY OF OUR RESULTS CONCERNING POSSIBLE RANKING CHANGES

TABLE III THE SUMMARY OF OUR COMPUTATIONAL COMPLEXITY RESULTS