Finite-Time Distributed Algorithms for Verifying and Ensuring Strong Connectivity of Directed Networks

The strong connectivity of the directed graph associated with a communication network topology is crucial for ensuring the convergence of many distributed estimation/control/optimization algorithms. However, the assumption that the network is strongly connected may not always be satisfied in practice. In addition, information on the overall network topology is often not available, e.g., due to privacy concerns or geographical constraints, which calls for a distributed algorithm. This paper aims to fill a crucial gap in the literature: the absence of a fully distributed algorithm to verify and ensure, in finite time, the strong connectivity of a directed network. Specifically, inspired by the maximum consensus algorithm, we propose distributed algorithms that enable individual nodes in a networked system to verify the strong connectivity of a directed graph and further, if necessary, add a minimum number of new links to ensure the directed graph's strong connectivity. The proposed distributed algorithms are implemented without requiring information on the overall network topology and are scalable, as they only require finite storage and converge in a finite number of steps. Furthermore, the algorithms preserve privacy with respect to the overall network topology. Finally, the proposed distributed algorithms are demonstrated and evaluated via numerical results.


I. INTRODUCTION
A. Motivation and Literature Review
Distributed algorithms play an important role in estimation [2], [3], optimization [4], [5], and control [6], [7], [8], [9] of networked systems. In contrast to centralized algorithms, where all computations are performed at a control center, the computations in distributed algorithms are performed locally at the individual systems, which exchange information with a number of neighboring systems via a communication network. As a result, distributed algorithms have several potential advantages, such as scalability with respect to the system's size, robustness with respect to the failure of an individual system, and preservation of data privacy. Strong connectedness of the graph associated with the communication network topology of distributed systems is a crucial requirement for ensuring the convergence of the above-mentioned distributed algorithms. Most of the work on distributed estimation, optimization, and control algorithms assumes that the communication network topology is strongly connected. However, in practice the communication network topology of a networked system may not always be strongly connected. Therefore, it is important to first verify and further ensure (e.g., by adding new links) the strong connectivity of a given communication network topology before executing any distributed estimation/optimization/control algorithm. More importantly, the procedure for verifying and ensuring strong connectivity of a communication network topology also needs to be performed in a distributed manner, since the overall network topology is often not available due to privacy concerns or geographical constraints, and in order to comply with the distributed nature of the algorithms that will be deployed in the networked systems.
Motivated by the above fundamental yet crucial issue, this paper focuses on the problem of distributively verifying and ensuring the strong connectivity of a directed graph. The communication of many real-world distributed systems is unidirectional, so that the overall communication network topology can be modeled as a directed graph. For example, in a broadcast-based communication scheme or publish-subscribe protocol (as can be found in the Robot Operating System for robotic systems [10] and the Open Field Message Bus for smart grids [11]), the receiver/subscriber can decide to use only a portion of all the broadcast/published information according to its selected preferences or to limit the computational and/or communication cost. Other examples of unidirectional communication include connectivity in social networks such as Twitter [12] and wireless networks using directional antennae [13].
The problem of verifying whether a directed graph (digraph) is strongly connected can be translated into the problem of computing the strongly connected components of the given digraph. Existing algorithms for this computation include the Tarjan [14], [15], Kosaraju-Sharir [16], and Gabow [17], [18] algorithms, which are based on a depth-first-search approach, as well as the relation-transitive-closure-based Warshall algorithm [19]. On the other hand, the problem of ensuring strong connectivity of a directed graph is often described as the strong connectivity augmentation problem. The study of the augmentation problem was initiated by the work in [14], [15], followed by subsequent research in [20], [21], emphasizing that the problem is solvable in polynomial time. Note that while the problem of ensuring strong connectivity is equivalent to constructing a $k$-edge-connected topology with $k = 1$, various approaches for $k \geq 2$ in undirected graph topologies have also been gaining interest to ensure robustness of the communication network, see for example [22], [23].
Despite the aforementioned approaches for verifying and ensuring that a digraph is strongly connected, most of the solutions focus on centralized or parallel computation and rely on the assumption that information on the overall network topology is available or known beforehand. Fully distributed approaches (i.e., approaches that do not require knowledge of the overall network topology) are still limited in the literature, with notable examples presented in [9], [24]. The distributed algorithms in [9], [24] focus on verifying strong connectivity of a digraph after link removals. However, these algorithms still require the initial graph before link removal to be strongly connected.

B. Statement of Contributions
The main contributions of this paper are twofold. First, we propose distributed algorithms for verifying strong connectivity of a directed graph. The proposed algorithms are inspired by the maximum consensus algorithm [25], [26]. Our second contribution is a set of distributed algorithms that turn a non-strongly connected digraph into a strongly connected one by adding a minimum number of new links. This is achieved by first developing distributed link addition algorithms, together with their optimality gap, to ensure strong connectivity of a directed graph. A distributed method is then developed to check if the number of added links is minimum and further, if necessary, to compute a new set of a minimum number of links that makes the digraph strongly connected. In addition to being fully distributed and not requiring information on the overall network topology, the proposed distributed algorithms are also scalable, as they only require finite storage and converge in a finite number of time steps. The completion in a finite number of steps allows the proposed algorithms to be easily implemented before executing any distributed estimation/control/optimization algorithm whose convergence requires strong connectedness of the underlying communication network. Furthermore, the distributed algorithms are also able to preserve privacy with respect to the global network topology.
Finally, in comparison to the preliminary version of our work on this problem [1], this paper considers the link augmentation problem not only for weakly connected digraphs but also for disconnected digraphs. The distributed algorithms in this paper also ensure strong connectivity of a digraph with a minimum number of link additions. In addition, this paper includes all the proofs omitted in the preliminary version together with extensive simulations.

C. Organization
The remainder of this paper is organized as follows. In Section II, we review the basic notions from graph theory and provide the problem settings. Section III presents the distributed algorithm to verify whether a given directed network is strongly connected. The distributed algorithm to estimate the strongly connected components of a digraph is then presented in Section IV. Section V presents the distributed algorithm to strongly connect a directed graph. Numerical results are presented in Section VI, followed by concluding remarks in Section VII. All the proofs of the theorems, propositions, and lemmas are presented in the Appendix. Illustrative examples describing the procedure of the proposed algorithms are included as supplementary material.

II. PROBLEM FORMULATION
In this section, we recall some basic notions of the fundamental theories such as graph theory and maximum consensus algorithm. Then, we define the problem settings within this paper.

A. Notation and Graph Theory
Information exchange between nodes in a network can be modeled by means of a directed graph (digraph). A directed graph is denoted by $G = (V, E)$ with a set of nodes $V = \{1, 2, \ldots, n\}$ and a set of edges (links) $E \subseteq V \times V$. The existence of an edge $(i, j) \in E$ denotes that node $j$ can obtain information from node $i$, or that node $i$ is accessible to node $j$. Here, node $i$ is said to be an in-neighbor of node $j$, while node $j$ is an out-neighbor of node $i$. Within this paper, the set of all in-neighbors of node $i$ is denoted by $N_i^{in} = \{j \in V \mid (j, i) \in E\}$, while $N_i^{out} = \{j \in V \mid (i, j) \in E\}$ denotes the set of all out-neighbors of node $i$. Let the set $K$ consist of all 2-element subsets of $V$; then the edge set $E^C := K \setminus E$ denotes all possible edges that are not present in $G$.
A path is a sequence of nodes $(i_1, i_2, \ldots, i_p)$, $p > 1$, such that $i_j$ is an in-neighbor of $i_{j+1}$ for $j = 1, \ldots, p-1$. An elementary path is a path in which no node appears more than once. A path is closed if $i_p = i_1$. A cycle is a closed path such that $i_1, i_2, \ldots, i_{p-1}$ are all distinct. A graph is acyclic if it has no cycles. A graph is said to be strongly connected if there is a path between any pair of distinct nodes, and it is called weakly connected if the graph obtained by adding an edge $(j, i)$ for every existing edge $(i, j) \in E$ in the original graph is strongly connected. A strongly connected component of a directed graph $G$ is a subgraph of $G$ that is strongly connected and maximal, in the sense that no additional edges or vertices from $G$ can be included in the subgraph without breaking its property of being strongly connected.
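To make these definitions concrete, the following is a minimal centralized sketch that checks strong and weak connectivity directly from an edge list. It is only a reference for the definitions above, not one of the distributed algorithms of this paper: it assumes full knowledge of the edge set, and the function names (`reachable_from`, `is_strongly_connected`, `is_weakly_connected`) are hypothetical.

```python
from collections import deque

def reachable_from(start, out_neighbors):
    """Return every node reachable from `start` along directed edges (BFS)."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in out_neighbors[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen

def is_strongly_connected(nodes, edges):
    """True if there is a directed path between every ordered pair of nodes."""
    out_neighbors = {u: set() for u in nodes}
    for i, j in edges:
        out_neighbors[i].add(j)
    everyone = set(nodes)
    return all(reachable_from(u, out_neighbors) == everyone for u in nodes)

def is_weakly_connected(nodes, edges):
    """True if the graph is strongly connected once every edge is mirrored."""
    mirrored = set(edges) | {(j, i) for i, j in edges}
    return is_strongly_connected(nodes, mirrored)

nodes = [1, 2, 3]
chain = [(1, 2), (2, 3)]          # 1 -> 2 -> 3
cycle = [(1, 2), (2, 3), (3, 1)]  # 1 -> 2 -> 3 -> 1
print(is_strongly_connected(nodes, chain), is_weakly_connected(nodes, chain))
print(is_strongly_connected(nodes, cycle))
```

The chain is weakly but not strongly connected, whereas the cycle is strongly connected.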
Within this paper, let $\mathbb{R}$ be the set of real numbers and $\mathbb{Z}_{\geq 0}$ be the set of non-negative integers. By $\mathbf{1}_n \in \mathbb{R}^n$ and $\mathbf{0}_n \in \mathbb{R}^n$, we denote the all-ones vector and the all-zeros vector of dimension $n$, respectively. For a given set $N$, $|N|$ denotes the number of elements in this set. Vectors are denoted by boldface letters and matrices are denoted by capital letters in boldface. Finally, the state associated with node $i \in V$ is indicated by a subscript; for example, the state $\mathbf{a} \in \mathbb{R}^b$, $b > 1$, of node $i$ is written as $\mathbf{a}_i$, and the $j$-th element of the vector $\mathbf{a}_i$ (with $j \leq b$) is denoted by $a_{i,j}$.

B. Max-Consensus Algorithm
Consider a directed graph $G = (V, E)$ with $n$ nodes and let us assign a state $y_i[t] \in \mathbb{R}$ to each node $i \in V$. The max-consensus algorithm allows all nodes to distributively compute the maximum value of the initial conditions $y_i[0]$ over all $i \in V$. Specifically, each node executes the following update rule [25]:
$$y_i[t+1] = \max_{j \in N_i^{in} \cup \{i\}} y_j[t], \tag{1}$$
where $t$ denotes the $t$-th communication event.
Definition 1 (Max-Consensus [25]): Given a directed graph $G = (V, E)$, an initial state $y_i[0]$ for each node $i \in V$, and the update law (1), max-consensus is said to be achieved if $\exists\, l \in \mathbb{Z}_{\geq 0}$ such that
$$y_i[t] = y_j[t] = \max\{y_1[0], \ldots, y_n[0]\}, \quad \forall t \geq l, \ \forall i, j \in V. \tag{2}$$
If (2) holds for all possible $y_i[0]$, we say that strong max-consensus is achieved. If (2) only holds for a subset of all possible $y_i[0]$, weak max-consensus is achieved.
Next, we recall the following results.
Lemma 1 (Max-Consensus [25]): Let $G$ be a directed graph representing the communication topology of $n$ nodes.
1) Strong max-consensus: Given any initial value of $y_i[0]$, the necessary and sufficient condition for strong max-consensus is that there exists a path between any pair of nodes in $G$, i.e., the digraph $G$ is strongly connected.
2) Weak max-consensus: Partition the nodes based on the initial values $y_i[0]$ as $V_m := \{i \in V \mid y_i[0] = \max_{k \in V} y_k[0]\}$ and $V_o := V \setminus V_m$. Then, the necessary and sufficient condition for weak max-consensus is that for any node $j \in V_o$, there exists a path ending in $j$ and starting in a node $k \in V_m$.
3) Convergence speed: The required number of communication instants is the maximum of the shortest path lengths between any pair of nodes in $G$, i.e., $n - 1$ in the worst case.
It will be demonstrated throughout the paper that the max-consensus algorithm serves as a unified framework to solve our problem.
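The behavior described in Lemma 1 can be illustrated with a short simulation. The sketch below assumes the standard synchronous form of the max-consensus update, in which every node replaces its value by the maximum over itself and its in-neighbors. The function name `max_consensus` and the two example graphs are illustrative assumptions, not taken from the paper.

```python
def max_consensus(in_neighbors, y0, steps):
    """Run `steps` synchronous max-consensus rounds.

    in_neighbors[i] is the set of nodes whose values node i can read.
    """
    y = dict(y0)
    for _ in range(steps):
        # Every node takes the max over its own value and its in-neighbors' values.
        y = {i: max([y[i]] + [y[j] for j in in_neighbors[i]]) for i in y}
    return y

# Directed cycle 1 -> 2 -> 3 -> 1 (strongly connected).
cycle_in = {1: {3}, 2: {1}, 3: {2}}
# Directed chain 1 -> 2 -> 3 (weakly connected only).
chain_in = {1: set(), 2: {1}, 3: {2}}

on_cycle = max_consensus(cycle_in, {1: 5.0, 2: 1.0, 3: 2.0}, steps=2)
on_chain = max_consensus(chain_in, {1: 1.0, 2: 5.0, 3: 2.0}, steps=2)
print(on_cycle)
print(on_chain)
```

On the cycle all nodes agree on the maximum after $n - 1 = 2$ rounds. On the chain, node 1 never learns the maximum held by node 2, since no path leads from node 2 to node 1.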

C. Problem Settings
Consider a network consisting of $n$ nodes whose connections are given by a directed graph $G^0 = \{V, E\}$, which also represents the communication network topology between the nodes. We make the following assumptions in the remainder of the paper:
Assumption 1: Assume that
1) The information on the overall network topology $G^0$ is not available and each node $i$ only knows $N_i^{in}$, $N_i^{out}$, and $n$.
2) Each node is equipped with its own computational resources and is assigned a unique identifier which can be mapped to its vertex number, i.e., $i \in \{1, \ldots, n\}$.
Note that the unique identifier is a standard assumption commonly used in designing distributed algorithms and can be realized, e.g., by using the MAC address, see for example [3], [7]. In addition to Assumption 1, it is also assumed that the communication between nodes occurs in a synchronous manner. Furthermore, we consider the discrete-time case, where communication instants may be defined either by a clock or by the occurrence of external events. This can be realized, e.g., by giving the nodes access to a global/universal time and by predetermining the execution timing and interval.
The objective of this paper is to develop distributed algorithms, under Assumption 1, for solving the following problems:
Problem 1 (Connectivity Verification): Verify in a distributed manner if the directed graph $G^0$ is strongly connected.
Problem 2 (Connectivity Augmentation): For a directed graph $G^0$, add a minimum number of additional edges $\Delta E^+ \subseteq E^C$ in a distributed manner to ensure that the resulting graph $G^* = \{V, E \cup \Delta E^+\}$ is strongly connected, i.e., solve the following optimization problem:
$$\min_{\Delta E^+ \subseteq E^C} |\Delta E^+| \quad \text{s.t.} \quad G^* = \{V, E \cup \Delta E^+\} \text{ is strongly connected.} \tag{3}$$
For the sake of readability, the notations used in this paper are summarized in Table I. Each notation will be described in more detail when it is first used in the discussion.

III. DISTRIBUTED VERIFICATION OF A DIRECTED GRAPH'S STRONG CONNECTIVITY
In this section, we present a distributed algorithm to verify whether a given network is strongly connected. Here, for each node $i \in V$, we introduce the state $\mathbf{x}_i[t] \in \mathbb{R}^n$ for checking if node $i$ is reachable from any other node and the state $f_i[t] \in \mathbb{R}$ for locally verifying if the graph $G^0$ is strongly connected. Within this paper, we refer to $t \in \mathbb{Z}_{\geq 0}$ as the $t$-th communication event. To this end, each node updates each row $j \in V$ of its state $\mathbf{x}_i[t]$, i.e., $x_{i,j}[t]$, for $n$ iterations according to the following max-consensus protocol
$$x_{i,j}[t+1] = \max_{k \in N_i^{in} \cup \{i\}} x_{k,j}[t], \tag{4}$$
whose initial condition is chosen as
$$x_{i,j}[0] = \begin{cases} 1, & \text{if } j = i, \\ 0, & \text{otherwise.} \end{cases} \tag{5}$$
Given the initialization in (5), this approach allows each individual node to estimate the existence of paths from all other nodes to itself: the value $x_{i,j}[n] = 1$ for any $i \neq j$ implies that there exists a path from node $j$ to node $i$, while the value $x_{i,j}[n] = 0$ signals the absence of such a path [9]. The $n$ iterations are selected to ensure that $\mathbf{x}_i$ reaches its steady state.
The following result establishes the relationship between the value of $\mathbf{x}_i[n]$ and the strong connectivity of the directed graph $G^0$.
Theorem 1: Given a digraph $G^0$ in which each node executes (4) for $n$ iterations with initial values given in (5), the graph $G^0$ is strongly connected if and only if $\mathbf{x}_i[n] = \mathbf{1}_n$ for all $i \in V$.
As a last step, each node needs to verify locally whether $\mathbf{x}_j[n] = \mathbf{1}_n$ holds for all $j \in V$. To this end, each node updates its state $f_i[t]$ for $n$ iterations according to
$$f_i[t+1] = \max_{j \in N_i^{in} \cup \{i\}} f_j[t], \tag{6}$$
whose initial value is chosen as
$$f_i[0] = \begin{cases} 0, & \text{if } \mathbf{x}_i[n] = \mathbf{1}_n, \\ 1, & \text{otherwise.} \end{cases} \tag{7}$$
Each node can then independently verify the strong connectivity of the digraph $G^0$ by observing its own value of $f_i[n]$, as shown in the following theorem.
Theorem 2: Given a digraph $G^0$ in which each node executes in sequence the update rules (4) and (6) for $n$ iterations each, with initial values as in (5) and (7), respectively, the graph $G^0$ is strongly connected if and only if $f_i[n] = 0$ for any $i \in V$.
The pseudocode of the distributed verification algorithm for solving Problem 1 is summarized in Algorithm 1.
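The two-phase procedure of Algorithm 1 can be sketched as a synchronous simulation. The code below is an illustrative sketch under stated assumptions, not the authors' implementation: it assumes that each node starts by marking only itself in its reachability vector and that $f_i[0]$ is set to 0 exactly when node $i$ has heard from all $n$ nodes. The helper `spread_max` and the function `verify_strong_connectivity` are hypothetical names, and the whole network is simulated in one process purely for illustration.

```python
def spread_max(state, in_neighbors, rounds):
    """Entry-wise max-consensus: each node keeps, per entry, the maximum
    over its own value and the values held by its in-neighbors."""
    for _ in range(rounds):
        state = {
            i: {key: max([row[key]] + [state[k][key] for k in in_neighbors[i]])
                for key in row}
            for i, row in state.items()
        }
    return state

def verify_strong_connectivity(in_neighbors):
    """Return f_i[n] for every node: 0 means 'strongly connected'."""
    nodes = sorted(in_neighbors)
    n = len(nodes)
    # Phase 1: x_{i,j}[n] = 1 exactly when node j can reach node i.
    x0 = {i: {j: int(i == j) for j in nodes} for i in nodes}
    x = spread_max(x0, in_neighbors, n)
    # Phase 2: a node raises a flag (1) if it did not hear from everyone,
    # and the flag is propagated by another max-consensus run.
    f0 = {i: {"f": 0 if all(x[i].values()) else 1} for i in nodes}
    f = spread_max(f0, in_neighbors, n)
    return {i: f[i]["f"] for i in nodes}

cycle_in = {1: {3}, 2: {1}, 3: {2}}    # 1 -> 2 -> 3 -> 1
chain_in = {1: set(), 2: {1}, 3: {2}}  # 1 -> 2 -> 3
print(verify_strong_connectivity(cycle_in))
print(verify_strong_connectivity(chain_in))
```

In the chain example, node 3 hears from every node and would by itself conclude $f_3[0] = 0$; the second consensus phase is what lets it learn that nodes 1 and 2 raised a flag.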
Remark 2 (Privacy Preservation): From the information retrieved through Algorithm 1, each node only knows the existence of paths from other nodes to itself (state $\mathbf{x}_i$) and the general notion of the strong connectivity of the graph $G^0$ (state $f_i$). Thus, Algorithm 1 does not reveal the overall network topology.
Now, assume that after running Algorithm 1 all nodes verify that the graph $G^0$ is not strongly connected, i.e., $G^0$ is either a weakly connected or a disconnected digraph. A distributed algorithm is then needed to add new edges to $G^0$ so that the resulting graph becomes strongly connected. The problem can be reduced to a simpler one by converting $G^0$ into a directed acyclic graph $\hat{G}^0$ which contains one node for each strongly connected component (SCC) of $G^0$. A node in $\hat{G}^0$ with no entering edge is called a source, and a node with no exiting edge is called a sink. The new edges that strongly connect $\hat{G}^0$ can then be selected by connecting the existing sinks to the sources following a certain ordering, as shown in [14], [15]. However, the computation of this solution is in general centralized and requires information on the overall network topology. In the following sections, given a non-strongly connected digraph, we propose distributed algorithms which first estimate the strongly connected component that each node belongs to (Section IV) and then distributively add new links to make the digraph strongly connected (Section V).

IV. DISTRIBUTED ESTIMATION OF SCC
In the following, inspired by the max-consensus algorithm we propose distributive approaches for estimating the strongly connected component (SCC) of a directed graph. First, let us introduce the following definitions on different types of SCC.
Definition 2 (source-scc): A source strongly connected component is a strongly connected component with no entering edges and one or more exiting edges.
Definition 3 (sink-scc): A sink strongly connected component is a strongly connected component with no exiting edges and one or more entering edges.

Algorithm 1: Distributed Algorithm for Solving Problem 1
Input: network size $n$, in-neighbor set $N_i^{in}$
Output: verification if $G^0$ is strongly connected
1: initialize each row of $\mathbf{x}_i[0]$ as in (5)
2: for each $j$-th row of $\mathbf{x}_i$ ($j \in \{1, \ldots, n\}$), execute max-consensus update law (4) for $n$ iterations
3: assign $f_i[0]$ as in (7)
4: execute max-consensus update law (6) for $n$ iterations
5: node $i$ knows that graph $G^0$ is strongly connected when $f_i[n] = 0$ and not strongly connected when $f_i[n] = 1$

An illustration of source-sccs, sink-sccs, and isolated-sccs is shown in Fig. 1. Note that an SCC which is neither a sink-scc, a source-scc, nor an isolated-scc can also exist within a directed graph (called a non-assigned SCC), e.g., nodes 9 and 10 in Fig. 1.
The proposed distributed algorithms allow each node $i \in V$ to estimate the following: 1) the existence of paths from other nodes to itself; 2) the SCC that it belongs to, namely the set $C_i$; 3) the existence of entering or exiting edges of its own SCC; and 4) whether its own SCC is a source-scc, sink-scc, isolated-scc, or neither of these.
To that end, for each node $i \in V$, let us assign states

A. Estimation of Paths and SCCs
As the first step, each node updates its state $\mathbf{x}_i[t]$ for $n$ iterations according to the update rule (4) whose initial condition is chosen as in (5). Next, let us define the information number of node $i$, denoted by $z_i$, as the number of nodes that can reach node $i$, including node $i$ itself. Noting that the existence of a path from node $j$ to $i$ is indicated by the value $x_{i,j}[n] = 1$, node $i$'s information number is then equal to
$$z_i = \sum_{j \in V} x_{i,j}[n].$$
In order to estimate the information number of the other nodes which can reach node $i$, each node updates for $n$ iterations each row $j \in V$ of its own state $\mathbf{c}_i[t]$, i.e., $c_{i,j}[t]$, according to the following rule
$$c_{i,j}[t+1] = \max_{k \in N_i^{in} \cup \{i\}} c_{k,j}[t], \tag{8}$$
whose initial condition is chosen as
$$c_{i,j}[0] = \begin{cases} z_i, & \text{if } j = i, \\ 0, & \text{otherwise.} \end{cases} \tag{9}$$
After $n$ iterations, the information number of every node $j$ that can reach node $i$ is given by the entry $c_{i,j}[n]$. We then have the following results on the information number:
Lemma 2: If node $i$ is reachable from node $j$ (i.e., $c_{i,j}[n] > 0$) and nodes $i$ and $j$ have the same information number (i.e., $c_{i,j}[n] = z_i$), then nodes $i$ and $j$ belong to the same SCC (i.e., they are mutually reachable).
Lemma 3: For each node $i$, the other nodes in the set $P_i$ have a smaller (positive) information number compared to node $i$ (equivalently, to any node in $C_i$). Specifically, the information number of node $i$ satisfies $z_i \geq |C_i| + \max_{j \in P_i} z_j$.
As a direct result of Lemma 3, it is clear that among all the entries of $\mathbf{c}_i[n]$, the $i$-th element $c_{i,i}[n] = z_i$ is always the largest. Additionally, from Lemma 2, node $i$ can estimate its own SCC, i.e., the set $C_i$, by identifying all nodes which have the same information number as itself, namely
$$C_i = \{j \in V \mid c_{i,j}[n] = z_i\}. \tag{10}$$
Furthermore, each node $i$ can estimate the set $P_i$ by collecting all nodes which have a lower (positive) information number than itself, that is
$$P_i = \{j \in V \mid z_i > c_{i,j}[n] > 0\}. \tag{11}$$
Here, $c_{i,j}[n] = 0$ represents the case where node $j$'s information is inaccessible to $i$. Note that node $i$'s local estimates of $C_i$ and $P_i$ are identical to those of all other nodes which belong to the same SCC (i.e., $C_j = C_i$ and $P_j = P_i$ for all $j \in C_i$).
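The estimation of $C_i$ and $P_i$ from information numbers can be sketched as follows. This is a minimal simulation under the assumption that each node initializes its own entry of $\mathbf{c}_i$ with $z_i$ and all other entries with zero, and that both consensus phases use the same entry-wise max update; `spread_max` and `estimate_scc` are hypothetical names.

```python
def spread_max(state, in_neighbors, rounds):
    """Entry-wise max-consensus over each node and its in-neighbors."""
    for _ in range(rounds):
        state = {
            i: {key: max([row[key]] + [state[k][key] for k in in_neighbors[i]])
                for key in row}
            for i, row in state.items()
        }
    return state

def estimate_scc(in_neighbors):
    """Return (z, C, P): information numbers, own SCC, and upstream nodes."""
    nodes = sorted(in_neighbors)
    n = len(nodes)
    # Reachability phase: x[i][j] = 1 when node j can reach node i.
    x = spread_max({i: {j: int(i == j) for j in nodes} for i in nodes},
                   in_neighbors, n)
    # Information number: how many nodes can reach node i (itself included).
    z = {i: sum(x[i].values()) for i in nodes}
    # Second phase: spread each node's information number downstream.
    c = spread_max({i: {j: z[i] if i == j else 0 for j in nodes} for i in nodes},
                   in_neighbors, n)
    # Same information number -> same SCC; smaller positive number -> upstream.
    C = {i: {j for j in nodes if c[i][j] == z[i]} for i in nodes}
    P = {i: {j for j in nodes if z[i] > c[i][j] > 0} for i in nodes}
    return z, C, P

# Edges: 1 <-> 2, 2 -> 3, 3 <-> 4, so the SCCs are {1, 2} and {3, 4}.
in_neighbors = {1: {2}, 2: {1}, 3: {2, 4}, 4: {3}}
z, C, P = estimate_scc(in_neighbors)
print(z, C, P)
```

In this example nodes 3 and 4 obtain $z_3 = z_4 = 4$ while nodes 1 and 2 obtain $z_1 = z_2 = 2$, which is consistent with the inequality in Lemma 3 ($4 \geq 2 + 2$).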
It is easy to observe that the only SCC of a strongly connected graph is the graph itself. In fact, using this observation we can develop an alternative distributed algorithm to solve Problem 1, in which each node distributively checks the membership of its own SCC and verifies if it comprises all nodes, i.e., $V$, as shown in the following corollary.
Corollary 1: Given a digraph $G^0$ in which each node executes in sequence the update laws (4) and (8) for $n$ iterations each, with initial conditions given in (5) and (9), respectively, $G^0$ is strongly connected if and only if $C_i = V$ for any $i \in V$.
The pseudocode of the alternative distributed verification algorithm for solving Problem 1 is summarized in Algorithm 2.

Algorithm 2: Alternative Distributed Algorithm for Solving Problem 1
Input: directed graph $G^0$, network size $n$, in-neighbor set $N_i^{in}$
Output: verification whether graph $G^0$ is strongly connected
1: initialize each row of $\mathbf{x}_i[0]$ as in (5)

B. Determination of Sink-Scc, Source-Scc, and Isolated-Scc
Using Algorithm 2, node i can estimate the existence of paths from other nodes to itself and the SCC that it belongs to, namely the set C i . In order to provide an effective strong connectivity augmentation which will be described later, it is important that each node is also able to characterize whether its own SCC is a source-scc, sink-scc, or isolated-scc. For this purpose, each node needs to identify the existence of entering or exiting edges of its own SCC. To that end, we introduce the following lemma.
Lemma 4: An SCC has no entering edges if and only if $P_i = \emptyset$ for every node $i$ in its membership.
With the estimated value of $P_i$, each node $i$ can determine the absence of an entering edge to its own SCC (i.e., the set $C_i$) based on Lemma 4, namely when $P_i = \emptyset$.
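On the other hand, in order to verify if there exists an edge from a node $i$ in $C_i$ to any node $j \notin C_i$, each node updates for $n$ iterations each row $j \in V$ of its state $\mathbf{o}_i[t]$, i.e., $o_{i,j}[t]$, according to
$$o_{i,j}[t+1] = \max_{k \in N_i^{in} \cup \{i\}} o_{k,j}[t], \tag{12}$$
whose initial condition is chosen as
$$o_{i,j}[0] = \begin{cases} 1, & \text{if } j = i \text{ and } N_i^{out} \setminus C_i \neq \emptyset, \\ 0, & \text{otherwise.} \end{cases} \tag{13}$$
In other words, the state $\mathbf{o}_i[n]$ collects the information from all nodes $k \in P_i \cup C_i$ on whether there exists an edge from node $k$ to any node outside of its set $C_k$.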
We can then establish the following result which allows each node to distributively characterize its own SCC.
Proposition 1: Given a digraph $G^0$ in which each node executes in sequence the update rules (4), (8), and (12) for $n$ iterations each, with initial values given in (5), (9), and (13), respectively, node $i$ can determine the following to characterize its own SCC (i.e., the set $C_i$):
1) The set $C_i$ is a source-scc if and only if $P_i = \emptyset$ and there exists a node $j \in C_i$ with $o_{i,j}[n] = 1$.
2) The set $C_i$ is a sink-scc if and only if $P_i \neq \emptyset$ and $o_{i,j}[n] = 0$, $\forall j \in C_i$.
3) The set $C_i$ is an isolated-scc if and only if $P_i = \emptyset$ and $o_{i,j}[n] = 0$, $\forall j \in C_i$.
Note that a non-assigned SCC falls outside of conditions 1)-3) in Proposition 1, i.e., $P_i \neq \emptyset$ and there exists a node $j \in C_i$ with $o_{i,j}[n] = 1$. The pseudocode for the proposed distributed estimation and characterization of SCCs is presented in Algorithm 3.
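The classification in Proposition 1 can be sketched end to end as follows. This is an illustrative simulation under stated assumptions: the exiting-edge flag of node $i$ is assumed to be initialized to 1 exactly when node $i$ has an out-neighbor outside $C_i$, and all three phases use the same entry-wise max update. The names `spread_max` and `characterize_scc` and the example graph are hypothetical.

```python
def spread_max(state, in_neighbors, rounds):
    """Entry-wise max-consensus over each node and its in-neighbors."""
    for _ in range(rounds):
        state = {
            i: {key: max([row[key]] + [state[k][key] for k in in_neighbors[i]])
                for key in row}
            for i, row in state.items()
        }
    return state

def characterize_scc(in_neighbors, out_neighbors):
    """Label each node's SCC as source, sink, isolated, or non-assigned."""
    nodes = sorted(in_neighbors)
    n = len(nodes)
    x = spread_max({i: {j: int(i == j) for j in nodes} for i in nodes},
                   in_neighbors, n)
    z = {i: sum(x[i].values()) for i in nodes}
    c = spread_max({i: {j: z[i] if i == j else 0 for j in nodes} for i in nodes},
                   in_neighbors, n)
    C = {i: {j for j in nodes if c[i][j] == z[i]} for i in nodes}
    P = {i: {j for j in nodes if z[i] > c[i][j] > 0} for i in nodes}
    # Assumed initialization: node i flags itself when one of its
    # out-neighbors lies outside its own SCC.
    o0 = {i: {j: int(i == j and bool(out_neighbors[i] - C[i])) for j in nodes}
          for i in nodes}
    o = spread_max(o0, in_neighbors, n)
    kinds = {}
    for i in nodes:
        entering = bool(P[i])                          # Lemma 4
        exiting = any(o[i][j] == 1 for j in C[i])
        if not entering and exiting:
            kinds[i] = "source"
        elif entering and not exiting:
            kinds[i] = "sink"
        elif not entering and not exiting:
            kinds[i] = "isolated"
        else:
            kinds[i] = "non-assigned"
    return kinds

# Edges: 1 <-> 2, 2 -> 3, 3 -> 4; node 5 has no edges.
in_neighbors = {1: {2}, 2: {1}, 3: {2}, 4: {3}, 5: set()}
out_neighbors = {1: {2}, 2: {1, 3}, 3: {4}, 4: set(), 5: set()}
kinds = characterize_scc(in_neighbors, out_neighbors)
print(kinds)
```

Node 1 has no out-neighbor outside $\{1, 2\}$ itself, but it still learns that its SCC has an exiting edge because the flag raised by node 2 reaches it through the consensus run.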
Remark 3 (Computational Complexity): Algorithms 2 and 3 finish in $2n$ and $3n$ iterations, respectively. Thus, the computational complexity of both algorithms is $O(n)$.
In this section, we focus our discussion on the distributed strategies to solve Problem 2. We first propose a distributed algorithm, together with its optimality gap, that adds new edges to a non-strongly connected directed graph $G^0$ so that the resulting graph becomes strongly connected. Then, inspired by the centralized approach in [14], [15], we propose an algorithm to verify in a distributed manner whether the number of added edges is minimum and, if it is not, to provide a solution to the minimum link addition problem. All the computations are performed in a distributed manner and without requiring information on the overall network topology $G^0$. We start by introducing the following additional assumption.
Assumption 2: Each node can establish a communication link to any node in G 0 .
This assumption can be satisfied for the publish-subscribe protocol as found in the Open Field Message Bus and in social networks such as Twitter, where a node can request a connection to any other node.
In order to simplify the discussion and presentation of the proposed algorithms, in the remainder of the section each sink-scc, source-scc, and isolated-scc is represented by a single node which is a member of its own SCC. To this end, let us denote by $G^m$ the resulting graph after the $m$-th iteration of link addition. Let us define $V^m_{sour}$, $V^m_{sink}$, and $V^m_{isol}$ as the sets of representative nodes of the source-sccs, sink-sccs, and isolated-sccs in $G^m$, respectively. Furthermore, let $S^m_j$ denote the set of all source-scc representative nodes accessible to the representative node $j \in V^m_{sink}$. A condensed graph representation of a digraph $G^m$ is then $\mathcal{G}^m = (\mathcal{V}^m, \mathcal{E}^m)$, where $\mathcal{V}^m = \{V^m_{sour} \cup V^m_{sink} \cup V^m_{isol}\} \subseteq V$ and $(i, j) \in \mathcal{E}^m$ denotes the existence of a path from node $i$ to node $j$ in the original graph $G^m$. Note that all nodes within non-assigned SCCs, together with the non-representative nodes within source-sccs, sink-sccs, and isolated-sccs, have no special role during the distributed link addition other than passing information; hence they are omitted from the condensed graph representation. The representative nodes can be selected by following a predefined rule, e.g., the node with the highest vertex (ID) number in each SCC is selected as the representative node. Alternatively, the nodes within the same SCC can locally coordinate over a certain decision variable; e.g., to select the node with the largest number of out-neighbors, each node can share its own $|N_i^{out}|$ and execute a max-consensus algorithm. For the above two examples, the selection of representative nodes takes no more than $n$ iterations.
Moreover, we consider the representative nodes after each link addition, i.e., $\mathcal{V}^m$, to be selected within $\mathcal{V}^0$. To be precise, the selection of the representative node ensures that $V^m_{isol}$ are maintained. Additionally, let us denote by $d^m$ the number of disjoint subgraphs within $\mathcal{G}^m$. An example of this condensed graph is illustrated in Fig. 2. Note that the condensed graph information is introduced only to facilitate the discussion and is not necessarily known by each node in the original graph.

Algorithm 3: Distributed Estimation and Characterization of SCC
Input: directed graph $G^0$, network size $n$, neighbor sets $N_i^{in}$ and $N_i^{out}$
Output: node $i$'s associated SCC
1: step 1-4 in Algorithm 2
2: estimate $C_i$ and $P_i$ by (10) and (11)

A. Distributed Link Addition Algorithm
Here, we present the algorithm to strongly connect $G^0$ by utilizing the estimated SCCs obtained in the previous section. Recall that each node can use Algorithm 3 to estimate whether its own SCC is a source-scc, sink-scc, isolated-scc, or neither of these. Let us further assume that the procedure to select representative nodes for all SCCs has been established, so that we can present the discussion in terms of the condensed graph $\mathcal{G}^m$. The proposed algorithm relies on an approach in which each node $i \in V^m_{sour}$ broadcasts its information to the rest of the network and each node $j \in V^m_{sink}$ accordingly collects this information. This information broadcasting enables each sink-scc representative $j$ to obtain information about all the accessible source-scc representatives $S^m_j \subseteq V^m_{sour}$. The broadcast of information can be distributively realized via another max-consensus update law which takes as many as $n$ time steps, that is, by introducing a state $\mathbf{s}_i[t] \in \mathbb{R}^n$ and initializing its elements as $s_{i,i}[0] = 1$ if $i \in V^0_{sour}$ and $s_{i,j}[0] = 0$ for $j \neq i$.

1) Distributed Algorithm for Weakly Connected Graph:
We first consider the case where the non-strongly connected digraph is a weakly connected digraph which has no isolated-sccs, i.e., $V^0_{isol} = \emptyset$. Before proceeding, we introduce the following lemma.
Lemma 5: Given a weakly connected graph $G^0$, adding edges $(j, i)$ from each node $j \in V^0_{sink}$ to all reachable nodes $i \in S^0_j$ results in a strongly connected graph.
The above lemma provides a one-step strategy to strongly connect a weakly connected digraph, namely by adding a set of edges from each $j \in V^0_{sink}$ to all reachable $i \in S^0_j$. The pseudocode of the proposed algorithm is given in Algorithm 4. Next, let $\Delta^*$ denote the optimality gap between the number of edges added by Algorithm 4, denoted by $|\Delta E^+|$, and the minimum number of links required to strongly connect the graph. We then have the following main result.
Theorem 3: Given a weakly connected digraph $G^0 = \{V, E\}$, Algorithm 4 results in a strongly connected graph $G^m = \{V, E \cup \Delta E^+\}$. Furthermore, Algorithm 4 will finish in $5n$ iterations with one link-addition step ($m = 1$), whose optimality gap $\Delta^*$ is equal to
$$\Delta^* = |\Delta E^+| - \max\{|V^0_{sour}|, |V^0_{sink}|\}, \tag{14}$$
where $|\Delta E^+| = \sum_{i \in V^0_{sink}} |S^0_i| \leq |V^0_{sour}||V^0_{sink}|$.
Note that the resulting $|\Delta E^+|$ also denotes the total number of elementary paths from any pair source-scc to sink-scc that

Fig. 2. A condensed graph representation $\mathcal{G}^0$ for the digraph in Fig. 1. The nodes within non-assigned SCCs and the non-representative nodes are omitted in the graph $\mathcal{G}^0$. The graph $\mathcal{G}^0$ is composed of 5 disjoint subgraphs, i.e., $d^0 = 5$, which are $\{1, 2, 6, 7, 8\}$, $\{11, 15\}$, $\{18\}$, $\{19\}$, and $\{20\}$.

Algorithm 4: Distributed Algorithm to Strongly Connect A Weakly Connected Digraph
Input: weakly connected graph $G^0$, network size $n$, neighbor set $N^{in}$

Corollary 2: For a weakly connected digraph $G^0$ with a single source-scc ($|V^0_{sour}| = 1$) or a single sink-scc ($|V^0_{sink}| = 1$), Algorithm 4 yields an optimal solution with minimum link addition.
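The one-step strategy of Lemma 5 can be sketched as follows. The code assumes that the source-scc and sink-scc representatives are already known (in the paper they are obtained via Algorithm 3 and the representative selection), simulates the source broadcast with an entry-wise max-consensus, and lets every sink representative add a link to each source representative it has heard from. The names `spread_max`, `sink_to_source_links`, and `is_strongly_connected`, as well as the example graph, are hypothetical, and the final connectivity check is a centralized validation used only for this illustration.

```python
def spread_max(state, in_neighbors, rounds):
    """Entry-wise max-consensus over each node and its in-neighbors."""
    for _ in range(rounds):
        state = {
            i: {key: max([row[key]] + [state[k][key] for k in in_neighbors[i]])
                for key in row}
            for i, row in state.items()
        }
    return state

def sink_to_source_links(in_neighbors, sources, sinks):
    """Return the new edges (j, i) from each sink j to every source i it hears."""
    nodes = sorted(in_neighbors)
    # Only source representatives mark their own entry before broadcasting.
    s0 = {i: {j: int(i == j and i in sources) for j in nodes} for i in nodes}
    s = spread_max(s0, in_neighbors, len(nodes))
    return {(j, i) for j in sinks for i in sources if s[j][i] == 1}

def is_strongly_connected(in_neighbors):
    """Validation helper: every node must hear from every other node."""
    nodes = sorted(in_neighbors)
    x = spread_max({i: {j: int(i == j) for j in nodes} for i in nodes},
                   in_neighbors, len(nodes))
    return all(all(row.values()) for row in x.values())

# Edges: 1 -> 3, 2 -> 3, 3 -> 4, 3 -> 5 (two sources, two sinks).
in_neighbors = {1: set(), 2: set(), 3: {1, 2}, 4: {3}, 5: {3}}
new_links = sink_to_source_links(in_neighbors, sources={1, 2}, sinks={4, 5})

# An added edge (j, i) makes j an in-neighbor of i.
augmented = {i: set(nb) for i, nb in in_neighbors.items()}
for j, i in new_links:
    augmented[i].add(j)
print(sorted(new_links), is_strongly_connected(augmented))
```

In this example four links are added, which equals $|V^0_{sour}||V^0_{sink}|$, whereas two links (for instance from node 4 to node 1 and from node 5 to node 2) would already suffice; this illustrates why the result is not always minimal when there are several sources and several sinks.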
2) Distributed Algorithm for Disconnected Digraph:
Next, we present a distributed algorithm to strongly connect $G^0$, given that $G^0$ is a disconnected graph which separates groups of nodes into several disjoint subgraphs, i.e., $d^0 > 1$. The proposed distributed link addition algorithm comprises two main steps (extending the ideas in Algorithm 4), namely strongly connecting each weakly connected subgraph and connecting all disconnected subgraphs. Specifically, each link-addition step adds the following new links: (i) from each $i \in V^m_{sink}$ to all $j \in S^m_i$ and (ii) from each $i \in V^m_{isol}$ to a random node $j \notin C_i$. The pseudocode of the distributed algorithm is given in Algorithm 5 and its performance is summarized in the following theorem.
Theorem 4: Given a disconnected digraph $G^0 = \{V, E\}$, Algorithm 5 results in a strongly connected graph $G^m = \{V, E \cup \Delta E^+\}$ by adding at most $\left(2d^0 + \sum_{i \in V^0_{sink}} |S^0_i|\right)$ new edges. Furthermore, Algorithm 5 finishes in $3n + 5nm$ iterations with the worst case $m = 2\lceil \log_2 d^0 \rceil$, and its optimality gap $\Delta^*$ is upper-bounded by
$$\Delta^* \le 2d^0 + \sum_{i \in V^0_{sink}} |S^0_i| - \left(\max\{|V^0_{sour}|, |V^0_{sink}|\} + |V^0_{isol}|\right). \tag{15}$$

Remark 6: Note that for a weakly connected digraph, the link addition procedure in Algorithm 5 is identical to that of Algorithm 4, i.e., $m = 1$. However, Algorithm 5 introduces an additional $3n$ iterations for strong connectivity verification (Algorithm 3 in line 12), and thus finishes in $8n$ iterations.
Remark 7 (Alternative Algorithm): An earlier version of the algorithm is presented in [1], which does not need to broadcast the source information. However, it may result in a longer computation time, as its computational complexity is $O(n^2)$.
Remark 8: Analogous to Remark 5, given a weakly connected graph $G^0$, Algorithms 4 and 5 can be executed with only the information of an upper bound on the number of nodes, $\bar{n} \ge n$, by modifying step 2 into checking whether $C_i$ reflects an isolated-scc, namely property 3) in Proposition 1. Moreover, the exact number of nodes, i.e., $n$, can be inferred at the end of Algorithm 5 as $n = |C_i|$.

B. Verifying and Enforcing Minimum Link Addition
In the previous subsection, we have presented distributed link addition algorithms to ensure a strongly connected graph. However, as summarized in Theorems 3 and 4, the resulting number of added links is not always guaranteed to be minimum. In the following, we present the procedure to verify whether the number of added links is minimum, and additionally to compute a new set of edges that ensures minimum link augmentation by first removing the previously augmented edges $\Delta E^+$. The computation is conducted by a single node called a virtual leader. The virtual leader can be selected among any node $i \in V^0_{sink} \cup V^0_{isol}$ where $(i, j) \in \Delta E^+$ for some node $j$.
The verification of minimum link addition is conducted once the execution of Algorithm 5 is finished. The strong connectivity of the graph is required in order to collect the information for the minimum link verification algorithm as well as to solve the minimum link augmentation problem. A solution to the minimum link augmentation problem itself is presented in [14], [15], where $\max\{|V^0_{sour}|, |V^0_{sink}|\} + |V^0_{isol}|$ links can be added once the information on $V^0_{isol}$, $V^0_{sour}$, $V^0_{sink}$, and $S^0_i, \forall i \in V^0_{sink}$ is known. Adapting the approach in [14], [15] to our current setup, the virtual leader needs to collect the following information: 1) original sinks $V^0_{sink}$; 2) reachable sources $S^0_i$ for each $i \in V^0_{sink}$; and 3) number of added links $|\Delta E^+|$. Note that the set $V^0_{sour}$ can be reconstructed from $\bigcup_{i \in V^0_{sink}} S^0_i$. This information can be obtained by having all nodes $i \in V^0_{sink} \cup V^0_{isol}$ broadcast their own information and the number of links which they added. Using the above information, the virtual leader can then verify whether the number of added links $|\Delta E^+|$ is minimum. If the number of added links is not minimum, the virtual leader then constructs a new set $\Delta E^+$ which ensures the minimal link augmentation.
The procedure to compute the minimum link augmenting set, as shown in [14], [15], requires an index $p$ and orderings $v(1), \ldots, v(|V^0_{sour}| + |V^0_{isol}|)$ and $w(1), \ldots, w(|V^0_{sink}|)$. The ordering $w$ contains all nodes in $V^0_{sink}$, while the ordering $v$ contains a combination of $V^0_{sour}$ and $V^0_{isol}$. The index $p$ and the orderings need to ensure the following properties: 1) there is a path from $v(i)$ to $w(i)$ for $1 \le i \le p$; 2) for each source $v(i)$, $p + 1 \le i \le |V^0_{sour}|$, there is a path from $v(i)$ to some $w(j)$, $1 \le j \le p$; and 3) for each sink $w(j)$, $p + 1 \le j \le |V^0_{sink}|$, there is a path from some $v(i)$, $1 \le i \le p$, to $w(j)$. Additionally, the ordering $v(|V^0_{sour}| + 1), \ldots, v(|V^0_{sour}| + |V^0_{isol}|)$ contains all nodes from $V^0_{isol}$. Given the existing information in the virtual leader, the ordering can be constructed by following the steps in Algorithm 6.

Lemma 6: Given a list of pairings of sinks and their reachable sources, Algorithm 6 finds an index $p$ and orderings $v$ and $w$ satisfying Properties 1-3.
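Properties 1-3 can be checked mechanically for a candidate index $p$ and orderings $v$, $w$. The following is a minimal sketch of such a checker, not Algorithm 6 itself; the function name `check_ordering`, the 0-based lists standing in for the 1-based orderings, and the `reaches` predicate (true when a path exists in $G^0$) are assumptions made for illustration.

```python
def check_ordering(v, w, p, n_sour, reaches):
    """Check Properties 1-3 for orderings v (sources, then isolated nodes)
    and w (sinks), stored as 0-based lists.

    n_sour  : number of sources at the front of v
    reaches : hypothetical predicate, reaches(a, b) is True if G^0 has a path a -> b
    """
    # Property 1: v(i) reaches w(i) for 1 <= i <= p.
    prop1 = all(reaches(v[i], w[i]) for i in range(p))
    # Property 2: every remaining source reaches some w(j) with 1 <= j <= p.
    prop2 = all(
        any(reaches(v[i], w[j]) for j in range(p)) for i in range(p, n_sour)
    )
    # Property 3: every remaining sink is reached from some v(i) with 1 <= i <= p.
    prop3 = all(
        any(reaches(v[i], w[j]) for i in range(p)) for j in range(p, len(w))
    )
    return prop1 and prop2 and prop3


# Toy reachability: source 1 reaches sinks 3 and 4, source 2 reaches sink 4.
paths = {(1, 3), (1, 4), (2, 4)}
reaches = lambda a, b: (a, b) in paths

print(check_ordering([1, 2], [3, 4], p=2, n_sour=2, reaches=reaches))  # True
print(check_ordering([1, 2], [3, 4], p=1, n_sour=2, reaches=reaches))  # False
print(check_ordering([1, 2], [4, 3], p=1, n_sour=2, reaches=reaches))  # True
```

The second call fails Property 2 because source 2 cannot reach $w(1) = 3$; swapping the sink order in the third call repairs it.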
Once the index $p$ and the orderings $v$ and $w$ are known, the augmenting set can be constructed as a combination of the edges shown in (16). Note that the links formulated in (16) are modified from the original formulation in [14], [15] to circumvent the need to flip the direction of the graph for the case of $|V^0_{sour}| > |V^0_{sink}|$. The information about the augmenting set can then be distributed to all nodes to reconfigure the new edges, that is, by locally removing the existing $\Delta E^+$ and replacing it with the new one.
The complete pseudo-code of the algorithm for verifying and enforcing minimum link addition is presented in Algorithm 7. The results can be formally stated in the following theorem.
Theorem 5: Consider a disconnected digraph $G^0$. Given an index $p$ and orderings $v$ and $w$ satisfying Properties 1-3, the set of edges $\Delta E^+$ in (16) makes the resulting graph $G^* = \{V, E \cup \Delta E^+\}$ strongly connected. In addition, the number of links added is $|\Delta E^+| = \max\{|V^0_{sour}|, |V^0_{sink}|\} + |V^0_{isol}|$.

Remark 9 (Privacy Preservation):
In addition to the existing information as stated in Remark 4, by executing Algorithm 4, 5, or 7, each node can also retrieve the information of $V^m_{sour}$, $V^0_{sink}$, and $V^0_{isol}$ from the broadcasted information. While this information indicates the general existence of paths between source-sccs and sink-sccs, it is still not sufficient for any node to reconstruct the overall network topology, thus preserving the privacy.

VI. NUMERICAL SIMULATION
In this section, we provide numerical simulations where we test Algorithm 7, as it covers all the main functionalities presented in Algorithms 1-6. The distributed computation of Algorithm 7, including the information exchange, is simulated on a single PC using the Python programming language. The source code is available at https://github.com/TUNI-IINES/dist-strong-connectivity. For the simulations we consider three different graphs, namely $G_A$, $G_B$, and $G_L$. The graph $G_A = (V_A, E_A)$ consists of 5 disjoint subgraphs and is shown in Fig. 1. The graph $G_B = (V_B, E_B)$ is a weakly connected graph consisting of 10 nodes, as depicted in Fig. 4. Finally, digraph $G_L$ is a disconnected graph of 50 nodes, as shown in Fig. 3. The detailed parameters for each graph and the theoretical bounds presented in Theorems 4 and 5 are summarized in Table II.
Since the distributed link addition for the weakly connected graph provides a unique solution, it is sufficient to run a single numerical simulation for $G_B$. Furthermore, we conduct 400 and 2500 simulations for $G_A$ and $G_L$, respectively, in order to ensure that sufficient samples ($n^2$) are collected for verifying our theoretical results in Theorems 4 and 5, as some new links are selected randomly.
All the results of the numerical simulations show that the distributed link addition algorithms result in a strongly connected digraph and, if the number of added links is not minimum, the

A. Weakly Connected Graph G B With 10 Nodes
The results for graph $G_B$ are illustrated in Fig. 4. The algorithm finishes in $11n$ time steps, where it first introduces 5 new edges (Fig. 4(a)) to strongly connect graph $G_B$ before minimum link addition is enforced with 3 new edges (Fig. 4(b)). Note that for the weakly connected graph, the link addition procedure in Algorithm 5 is identical to the one in Algorithm 4, which ensures strong connectivity in $5n$ iterations ($m = 1$). In addition, Theorem 3 guarantees a smaller optimality gap with $\Delta^* = 2$, which is aligned with the observation shown in Fig. 4.

Algorithm 7 (excerpt):
      construct Tarjan's ordering as in Algorithm 6
      construct minimum link augmentation as in (16)
      broadcast the optimal links to reforge new $\Delta E^+$
    else
      broadcast that the links are already optimal
    end if
  end if
  save broadcasted information and forward it for $n$ iterations
  process the information, re-establish new links if previously not optimal

B. Disconnected Graph G A With 20 Nodes
An example of the results for graph $G_A$ is illustrated in Fig. 5, while the results for $n^2 = 400$ repetitions are summarized in Fig. 6. For all the repetitions, the data show that the algorithm finishes in $21n$ time steps, which is equivalent to link addition with $m = 3$ steps. The number of added links for all the repetitions is between 12 and 14 new edges. Both the number of iterations and the number of augmented links are within the expected bounds shown in Table II.

C. Disconnected Graph G L With 50 Nodes
Finally, an example of the results for graph $G_L$ is illustrated in Fig. 7, while the results for $n^2 = 2500$ repetitions are summarized in Fig. 8. The data are divided into two groups, with the majority (1362 results) finishing in $26n$ time steps and the remainder (1138 results) finishing in $21n$ time steps, which are equivalent to link addition with $m = 4$ and $m = 3$ steps, respectively. The number of added links for all the results is between 46 and 50 new edges. Both the number of iterations and the number of augmented links are within the expected bounds shown in Table II. The results verify the theoretical bounds given in Theorems 4 and 5.
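The reported time steps can be cross-checked against the iteration count $3n + 5nm$ of Theorem 4. The sketch below does this under one assumption that is inferred rather than stated here: that the verification and enforcement stage of Algorithm 7 adds $3n$ iterations, as suggested by the difference between the $11n$ total for $G_B$ and the $8n$ of Remark 6. The function names and the `enforcement_factor` parameter are hypothetical.

```python
import math


def algorithm5_iterations(n, m):
    """Iteration count of Algorithm 5 from Theorem 4: 3n + 5nm."""
    return 3 * n + 5 * n * m


def total_iterations(n, m, enforcement_factor=3):
    """Algorithm 5 plus the minimum-link stage.

    enforcement_factor=3 is an assumption inferred from 11n - 8n for G_B.
    """
    return algorithm5_iterations(n, m) + enforcement_factor * n


def worst_case_steps(d0):
    """Worst-case number of link-addition steps for d0 > 1 disjoint subgraphs."""
    return 2 * math.ceil(math.log2(d0))


print(total_iterations(10, 1) // 10)  # 11 -> G_B finishes in 11n
print(total_iterations(20, 3) // 20)  # 21 -> G_A finishes in 21n
print(total_iterations(50, 4) // 50)  # 26 -> slower group of G_L runs
print(total_iterations(50, 3) // 50)  # 21 -> faster group of G_L runs
print(worst_case_steps(5))            # 6  -> observed m = 3 for G_A is within the bound
```

Under this assumption, all four reported totals ($11n$, $21n$, $26n$, $21n$) are reproduced from the stated values of $m$.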

VII. CONCLUSION
This paper proposes distributed and finite-time algorithms to verify the strong connectivity of a directed graph and to make a directed graph strongly connected with a minimum number of link additions. The strategy is inspired by the maximum consensus algorithm, which is known to have finite computation time. The proposed strategies provide the solutions without requiring knowledge of the overall network topology and further preserve privacy in terms of the overall network's topology. Strong connectivity is a graph property that is commonly assumed or required in many distributed systems and is crucial in guaranteeing convergence of many distributed estimation/optimization/control algorithms. Hence, the proposed distributed strategy has broad applications.
Future work will aim towards an asynchronous implementation of the proposed algorithms and towards relaxing the assumption that an upper bound on the number of nodes is known. In addition, several application-specific use cases will be considered, e.g., minimizing the network's end-to-end delay or a case where a communication link can only be established with nodes within a certain communication range.

A. Proof of Theorem 1
We start by showing the necessity $(\Rightarrow)$. From Lemma 1, since the graph $G^0$ is strongly connected, each element of $\mathbf{x}_i$, namely $x_{i,j}$, converges to $\max_i x_{i,j}[0] = 1$ (strong max-consensus) for all $i, j \in V$ within at most $n - 1$ iterations. Thus, $\mathbf{x}_i[n] = \mathbf{1}_n$ is fulfilled for all $i \in V$. Next, we show the sufficiency $(\Leftarrow)$ by contradiction. We first assume that graph $G^0$ is not strongly connected, i.e., there exists no path from a certain node $i$ to $j$. However, as we have $x_{i,j}[n] = 1$ under update law (4) for every $j$-th row of $\mathbf{x}_i[n]$ and for all nodes $i$ in the network, there exists a path from any node $j$ to any node $i$. Hence the graph $G^0$ is strongly connected, which contradicts the assumption.
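The argument above can be illustrated with a small synchronous simulation. Update law (4) is not reproduced in this part of the paper, so the sketch assumes a standard max-consensus form: node $i$ initializes $x_{i,i}[0] = 1$ and all other entries to 0, and each iteration takes the element-wise maximum over its own vector and those of its in-neighbors. The function name `verify_strong_connectivity` is hypothetical.

```python
def verify_strong_connectivity(nodes, edges):
    """Simulate n synchronous max-consensus rounds and test x_i[n] == 1_n.

    Assumed update (not taken verbatim from the paper):
        x_i[t+1] = elementwise max of x_i[t] and x_j[t] for every in-neighbor j.
    """
    n = len(nodes)
    index = {u: k for k, u in enumerate(nodes)}
    in_nbrs = {u: [] for u in nodes}
    for u, v in edges:  # edge (u, v): v receives information from u
        in_nbrs[v].append(u)
    # x[u][k] == 1 means the information of node nodes[k] has reached u.
    x = {u: [1 if index[u] == k else 0 for k in range(n)] for u in nodes}
    for _ in range(n):
        # The new state is built entirely from the old one (synchronous update).
        x = {
            u: [max([x[u][k]] + [x[j][k] for j in in_nbrs[u]]) for k in range(n)]
            for u in nodes
        }
    return all(all(val == 1 for val in x[u]) for u in nodes)


print(verify_strong_connectivity([1, 2, 3], [(1, 2), (2, 3), (3, 1)]))  # True
print(verify_strong_connectivity([1, 2, 3], [(1, 2), (2, 3)]))          # False
```

In the second call node 1 has no in-neighbor, so its vector never changes and the all-ones condition fails, matching the sufficiency direction of the proof.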

B. Proof of Theorem 2
Let us divide all nodes into the sets $V_0 := \{\forall i \in V \mid f_i[0] = 0\}$ and $V_1 := \{\forall i \in V \mid f_i[0] = 1\}$. Then, we can rewrite Theorem 1 as: graph $G^0$ is strongly connected if and only if $V_0 = V$ and $V_1 = \emptyset$. For a non-strongly connected graph $G^0$, under update law (6), the value of $f_i$ converges to $\max_i f_i[0] = 1, \forall i \in V$ (weak maximum consensus) if for any node $i \in V_0$ there exists a path ending in $i$ and starting in some $j \in V_1$ [25]. Note that this condition is satisfied, as any node $i \in V_0$ is reachable from all nodes. This ensures that $f_i[n] = f_j[n], \forall i, j \in V$.

C. Proof of Lemma 2
As there exists a path between any two distinct nodes within an SCC, all information from one node can reach the other, which results in an equal information number.

D. Proof of Lemma 3
Node $i$ can be reached by all nodes in $P_i$ as well as by its own SCC, i.e., $C_i$, which ensures a higher information number than that of all nodes in $P_i$. Hence, node $i$'s information number is lower-bounded by $\max_{j \in P_i} \left(|C_i| + z_j\right)$, noting that node $i$'s SCC can have multiple entering edges.

E. Proof of Corollary 1
We start by showing the necessity $(\Rightarrow)$. Since the graph $G^0$ is strongly connected, Theorem 1 ensures that $\mathbf{x}_i[n] = \mathbf{1}_n$ for all $i \in V$. Hence, the information number of every node $i$ is equal to $n$, initializing $c_{i,i}[0] = n$. Strong connectivity of $G^0$ and update law (8) ensure, by the maximum consensus protocol [25], that each element of $\mathbf{c}_i$, namely $c_{i,j}$, converges to $\max_i c_{i,j}[0] = n$ for all $i, j \in V$. Thus, $\mathbf{c}_i[n] = n\mathbf{1}_n$. Note that with (10), the above condition is equivalent to each node $i$ ending up with $C_i = V$ (alternatively $|C_i| = |V| = n$) for all $i \in V$. The sufficiency $(\Leftarrow)$ by contradiction follows arguments similar to those of Theorem 1.

F. Proof of Lemma 4
We prove the statement by contradiction. Assume a given node $i$ whose SCC (i.e., the set $C_i$) has no entering edge and $P_i \neq \emptyset$. The fact that $P_i \neq \emptyset$ implies that there exists at least one node outside of $C_i$ which can reach node $i \in C_i$. Hence, there exists an entering edge to its own SCC, which contradicts the original assumption.

G. Proof of Proposition 1
The three statements follow directly from Definitions 2-4 as results of update rules (4), (8), and (12). The condition $P_i \neq \emptyset$ denotes that there exists at least one node in $P_i$ that can reach a node in $C_i$, hence the existence of at least one entering edge to node $i$'s SCC. Conversely, the absence of entering edges is denoted by $P_i = \emptyset$. The existence of at least one exiting edge is denoted by $o_{i,j}[n] = 1$ for some node $j \in C_i$, while the absence of exiting edges is denoted by $o_{i,j}[n] = 0, \forall j \in C_i$.

H. Proof of Lemma 5
Each new edge $(j, i)$ creates a cycle containing all nodes within the elementary path from $i \in S^0_j \subseteq V^0_{sour}$ to $j \in V^0_{sink}$, merging the corresponding SCCs into a single SCC. As this occurs simultaneously for all sink-sccs towards all existing source-sccs, it ensures that there exists a path from node $j$ to node $i$ for every original edge $(i, j) \in E$. Hence, by the definition of a weakly connected graph, the resulting graph is strongly connected.

I. Proof of Theorem 3
Algorithm 4 implements the step described in Lemma 5, and hence strongly connects the whole graph.
Upper bound of the added links: The link addition procedure in step 12 introduces $|S^0_i| \le |V^0_{sour}|$ new links for each $i \in V^0_{sink}$. Thus, the number of added links is $\sum_{i \in V^0_{sink}} |S^0_i|$ and is upper-bounded by $|V^0_{sour}||V^0_{sink}|$. Computational complexity: Algorithm 3 in step 1 runs in $3n$ iterations, while the link addition step (steps 3-13) requires $2n$ iterations due to the selection of representative nodes and the information broadcast. Hence, the execution of Algorithm 4 requires a total of $5n$ iterations.
Optimality gap: The minimum number of edges that must be added to strongly connect a weakly connected digraph is equal to $\max\{|V^0_{sour}|, |V^0_{sink}|\}$, see [14], [15]. This stems from the fact that we need to introduce at least one exiting edge on each sink-scc and at least one entering edge on each source-scc. Then, the number of new edges added through Algorithm 4 satisfies $|\Delta E^+| \ge \max\{|V^0_{sour}|, |V^0_{sink}|\}$. Thus, the optimality gap can be calculated as $\Delta^* = |\Delta E^+| - \max\{|V^0_{sour}|, |V^0_{sink}|\}$. As the number of added links is $\sum_{i \in V^0_{sink}} |S^0_i|$, the optimality gap is $\Delta^* = \sum_{i \in V^0_{sink}} |S^0_i| - \max\{|V^0_{sour}|, |V^0_{sink}|\}$.
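As a worked instance of this formula, consider the $G_B$ experiment of Section VI-A, where Algorithm 4 adds 5 edges and the enforced minimum is 3 edges. Reading the reported minimum as $\max\{|V^0_{sour}|, |V^0_{sink}|\} = 3$ (an interpretation that relies on $G_B$ being weakly connected, so that no isolated-scc contributes), the gap evaluates as follows:

```latex
\Delta^* = |\Delta E^+| - \max\{|V^0_{sour}|, |V^0_{sink}|\} = 5 - 3 = 2
```

This agrees with the value $\Delta^* = 2$ reported for $G_B$.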
By property (2) there is a path from $v(i)$, $p + 1 \le i \le \min\{|V^0_{sour}|, |V^0_{sink}|\}$, to some vertex $w(j)$, $1 \le j \le p$, and hence to all vertices on $C$. Then, from property (3) and the addition of edge $(w(i), v(i))$ for $p + 1 \le i \le \min\{|V^0_{sour}|, |V^0_{sink}|\}$, there is a path from every node in the cycle $C$ to each $v(i)$, $p + 1 \le i \le \min\{|V^0_{sour}|, |V^0_{sink}|\}$. A similar argument shows that there is a directed path from the nodes in the cycle $C$ to each $w(i)$, $p + 1 \le i \le \min\{|V^0_{sour}|, |V^0_{sink}|\}$, and from each $w(i)$, $p + 1 \le i \le \min\{|V^0_{sour}|, |V^0_{sink}|\}$, to the nodes in cycle $C$.
To this end, the set $(w(i), v(i))$ for $p + 1 \le i \le \min\{|V^0_{sour}|, |V^0_{sink}|\}$ introduces $\min\{|V^0_{sour}|, |V^0_{sink}|\} - p$ new edges. In addition to the $p + \big||V^0_{sour}| - |V^0_{sink}|\big|$ edges introduced previously, the total number of augmented edges in $\Delta E^+$ is $\min\{|V^0_{sour}|, |V^0_{sink}|\} + \big||V^0_{sour}| - |V^0_{sink}|\big| = \max\{|V^0_{sour}|, |V^0_{sink}|\}$. Note that for the case of $|V^0_{isol}| > 0$, the modification in (16) adds $|V^0_{isol}|$ new edges by chaining the nodes $v(|V^0_{sour}| + 1), \ldots, v(|V^0_{sour}| + |V^0_{isol}|)$ into the directed cycle $C$. The rest of the proof follows the previous discussion for $|V^0_{isol}| = 0$, which ensures that all nodes are mutually reachable. The additional modification for connecting isolated-sccs results in a total number of augmented edges in $\Delta E^+$ of $\max\{|V^0_{sour}|, |V^0_{sink}|\} + |V^0_{isol}|$.