buTCS: An Optimized Algorithm for Estimating the Size of Transitive Closure

Given a directed graph and a node <inline-formula> <tex-math notation="LaTeX">$v$ </tex-math></inline-formula>, the transitive closure (<italic>TC</italic>) of <inline-formula> <tex-math notation="LaTeX">$v$ </tex-math></inline-formula> is the set of nodes that <inline-formula> <tex-math notation="LaTeX">$v$ </tex-math></inline-formula> can reach in the graph. <italic>TC</italic> size is important in many applications, but computing it exactly is expensive in both time and space, which makes exact computation impractical when the <italic>TC</italic> size must be obtained quickly for large graphs. Considering that existing approaches are either inefficient or inaccurate, we propose an algorithm, namely <italic>buTCS</italic>, to make more accurate estimations efficiently. Our approach works in linear time and space. The basic idea is to compute a node’s <italic>TC</italic> size based on that of its out-neighbors and then perform a top-down verification. We further propose two optimizations to improve the estimation accuracy. Experimental results on 13 real datasets show that <italic>buTCS</italic> estimates <italic>TC</italic> size more accurately and efficiently than existing approaches.


I. INTRODUCTION
Given a directed graph G = (V, E), where V is the set of nodes and E is the set of edges, the transitive closure (TC) of a node u is defined as the set of all nodes that u can reach in G. Obtaining the TC size of every node in a directed graph is a fundamental graph problem [1]-[18], [21], [23], [24] that is extensively used in many applications. For example, to efficiently compute the transitive reduction [23], [24], we need to know which node has the larger TC size, so as to determine a better processing order over all nodes and improve efficiency. For another example, when constructing 2-hop labels [16], [17], if we can quickly know the TC sizes of all nodes in both the forward and backward directions, we can obtain a better node processing order and thus efficiently produce 2-hop labels of smaller size.
However, the cost of TC size computation is high in both time and space. Calculating the accurate TC size using fast binary matrix multiplication (see, e.g., [19], [20] for background) has time complexity O(|V|^2.38) and space complexity O(|V|^2), and therefore cannot scale to large graphs; in particular, this method is not applicable when the TC sizes of a large graph are needed quickly.
The associate editor coordinating the review of this manuscript and approving it for publication was Fu Lee Wang .
Considering this problem, researchers have proposed several approaches to estimate the size of TC [21]-[23]. These approaches usually have smaller time and space complexities and can therefore process large graphs efficiently. However, they still suffer from poor accuracy or efficiency. In [21], Cohen proposed an algorithm that runs in O(k(|V| + |E|)) time and O(k|V|) space to obtain a TC size estimation for every node by performing k random permutations. The problem of this approach is that its accuracy depends on the value of k: the larger k is, the more accurate the estimation, but a large k results in inefficiency. In [22], Zhu et al. proposed heuristics to estimate the lower and upper bounds of TC size. However, both bounds are far from accurate. The most recent algorithm, linearTC, was proposed by Zhou et al. to estimate every node's TC size with time complexity O(|V| + |E|) [23], [24]. Specifically, to estimate the TC size of a node u, it first finds, among u's out-neighbors, the node v that has the maximum estimated TC size. Then, it takes the sum of v's estimated TC size and the number of u's out-neighbors as u's estimated TC size. This approach works efficiently but suffers from two problems. First, as the difference between the TC sizes of u and v may be huge, the estimated TC size of u, which is computed from v's estimated TC size, may be far from accurate. Second, as v's estimated TC size may be greater than the accurate TC size of u, u's estimated TC size could be much larger than its accurate TC size.
Considering the above problems, we propose an optimized approach for TC size estimation. The basic idea is similar to that of linearTC, i.e., u's TC size is estimated based on that of its out-neighbors, but we make improvements in two aspects. First, we propose a novel spanning tree, based on which we can find the accurate TC sizes of many important nodes. In this way, we can reduce the difference between the TC sizes of a node u and its out-neighbor v, such that the estimated TC size of u becomes more accurate. Second, we propose a verification operation to improve the estimation accuracy when the estimated TC size is greater than the accurate value. Our contributions are as follows.
1) We propose a two-step TC size estimation strategy. In the first step, we compute the initial estimated TC size for each node. In the second step, we perform a top-down verification to improve the accuracy of the estimated values.
2) We propose two heuristics for TC size estimation, which make estimations based on a spanning tree. We further propose a novel spanning tree based on which we identify important nodes and obtain their accurate TC sizes, so as to make the overall estimation more accurate, and we propose to use two spanning trees to increase the number of nodes for which the accurate TC size can be obtained.
3) We conduct extensive experimental studies on real datasets. The experimental results show that our approach efficiently obtains more accurate results and scales to large graphs.
The remainder of this paper is organized as follows. In Section II, we discuss the preliminaries and related work. In Section III, we present the general idea of our approach to TC size estimation and the algorithms. We report the experimental results in Section IV and conclude our work in Section V.

II. BACKGROUND AND RELATED WORK

A. PRELIMINARIES
We model a graph as a directed graph and focus on its directed acyclic graph (DAG) representation, since the size of each strongly connected component (SCC) can be computed easily. Hereafter, we use G = (V, E) to denote a DAG, where V is the set of nodes and E is the set of edges, and focus on TC size estimation on G. We use o*_G(v) to denote the set of nodes that v can reach in G, which is also called the transitive closure of v, and o*_T(v) to denote v's descendants w.r.t. a tree T. Given a DAG G = (V, E), we use X = {1, 2, . . . , |V|} to denote a topological order (topo-order) of G, which can be obtained by a topological sorting on G. A topological sorting of G is a mapping t : V → X such that t(u) < t(v) holds for every edge (u, v) ∈ E. We use t_v to denote v's topo-order and I_v to denote node v's topo-order interval. We use Ñ(v) to denote v's estimated TC size. A topo-order X of G can be obtained in linear time O(|V| + |E|) [25]. We show important notations in Table 1 for ease of reference.
Problem Statement: Given a DAG G, return the estimated TC size of all nodes.

B. RELATED WORK
Existing works can be divided into two categories: (1) TC size computation [18], [20], which computes the accurate TC size. (2) TC size estimation [21]- [24], which computes an approximate TC size. Compared with the first category, the second category usually works much more efficiently and can scale to large graphs. In the following, we survey the techniques in each category.

1) TC SIZE COMPUTATION
Calculating the accurate TC size using fast binary matrix multiplication [2], [20] has time complexity O(|V|^2.38) and space complexity O(|V|^2), and therefore cannot scale to large graphs due to the expensive computation and space overheads. In [18], Tang et al. proposed an algorithm based on path decomposition, which decomposes the graph into a set of n paths and then computes the TC sizes of the nodes on each path in a bottom-up way. The time and space complexities are O(n × |E|) and O(|V|), respectively. However, in practical applications, n is usually very large, resulting in high time complexity and poor scalability.

2) TC SIZE ESTIMATION
In [21], Cohen proposed to estimate TC size by performing k random (kr) permutations. The basic idea is to generate a random rank value for each node and sort all nodes according to this rank. Then, a DFS/BFS is performed from each node v in the sorted order, after which all nodes that v can reach are deleted. The estimated TC size of each node is calculated based on the rank value of the node that first reaches it. The estimation error of this algorithm decreases as the number of rounds k increases; obviously, its accuracy depends on the value of k, and a large k usually means inefficiency. The cost of each random permutation is O(|V| + |E|), and the overall time complexity is O(k(|V| + |E|)).
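Cohen's estimator can be sketched as follows. This is a minimal illustration rather than the paper's implementation: it uses exponential ranks (one common variant) and the standard (k − 1)/sum estimator, and propagates minima in one reverse-topological pass per round instead of explicit DFS/BFS; all function names are ours.

```python
import random

def topo_order(nodes, adj):
    # Kahn's algorithm: repeatedly remove nodes with no incoming edges
    indeg = {u: 0 for u in nodes}
    for u in nodes:
        for v in adj.get(u, []):
            indeg[v] += 1
    stack = [u for u in nodes if indeg[u] == 0]
    order = []
    while stack:
        u = stack.pop()
        order.append(u)
        for v in adj.get(u, []):
            indeg[v] -= 1
            if indeg[v] == 0:
                stack.append(v)
    return order

def cohen_estimate(nodes, adj, k=100, seed=0):
    """Estimate |reach(v)| (here including v itself) for every node
    with k rounds of min-rank propagation; cost O(k(|V| + |E|))."""
    rng = random.Random(seed)
    order = topo_order(nodes, adj)
    sums = {u: 0.0 for u in nodes}
    for _ in range(k):
        rank = {u: rng.expovariate(1.0) for u in nodes}
        least = {}
        for u in reversed(order):       # out-neighbors are finished first
            m = rank[u]
            for v in adj.get(u, []):
                m = min(m, least[v])
            least[u] = m
        for u in nodes:
            sums[u] += least[u]
    # the minimum of n i.i.d. Exp(1) ranks is Exp(n), so (k-1)/sum estimates n
    return {u: (k - 1) / sums[u] for u in nodes}
```

On a 4-node chain a→b→c→d with a few thousand rounds, the estimates concentrate near the true reachable-set sizes 4, 3, 2, 1 (counting the node itself), illustrating how accuracy grows with k at the cost of k full passes.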
In [22], Zhu et al. proposed an algorithm to estimate the lower and upper bounds (denoted lb and ub, respectively) of a node's TC size. The lower bound of u is obtained by summing up the contributions of u's out-neighbors in G, where each out-neighbor v contributes 1/|i_G(v)| of its own lower bound to u. If |o_G(u)| = 1 and |i_G(v)| > 1, the lower bound of u may be less than that of v, even though u can reach every node that v can reach. On the other hand, the upper bound of u is the sum of the upper bounds of u's out-neighbors. Since many nodes share the same set of reachable nodes in G, the upper bound of u may be much larger than the accurate result. Although both bounds can be computed efficiently, they are usually far from accurate.
Zhou et al. [24] proposed an algorithm, namely linearTC, to estimate every node's TC size more accurately with time complexity O(|V| + |E|). This approach works efficiently but suffers from inaccuracy in practice, as discussed in the Introduction. We follow a similar idea and make improvements by proposing novel heuristics. The differences between linearTC and our approach lie in two aspects. First, the aim is different: our approach aims at a more accurate estimation of TC size, whereas linearTC aims at using the estimated TC sizes to obtain a better node processing order, i.e., it is concerned more with which node has the larger TC size than with whether the estimated value is accurate. Second, the operations are different. Guided by our aim, we improve on linearTC in three aspects: (1) we perform a verification operation to improve the accuracy, (2) we propose a novel spanning tree that marks nodes with larger TC sizes, which is used to generate more accurate estimations for more nodes, and (3) we use two spanning trees to mark more important nodes.
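The linearTC rule described above can be sketched in a few lines. This is our simplified reading of [23], [24], assuming the DAG is given as adjacency lists with a topological order already available, and that the TC size excludes the node itself:

```python
def linear_tc_estimate(order, adj):
    """One reverse-topological pass: a node's estimate is its out-degree
    plus the largest estimate among its out-neighbors (0 if it has none)."""
    est = {}
    for u in reversed(order):
        outs = adj.get(u, [])
        est[u] = len(outs) + (max(est[v] for v in outs) if outs else 0)
    return est
```

The overestimation problem is easy to reproduce: on the triangle u→v, u→w, v→w, the rule yields Ñ(u) = 2 + Ñ(v) = 3, while the accurate TC size of u is 2, because w is counted both as u's out-neighbor and inside v's estimate.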

III. TC SIZE ESTIMATION
Zhou et al. [23], [24] proposed an algorithm, linearTC, to estimate every node's TC size with time complexity O(|V| + |E|). It is worth noting that in [23], [24], the goal is to know, for a node, which of its out-neighbors has the largest TC size; the difference between the estimated and accurate values does not matter there, as long as the out-neighbor with the largest TC size is identified correctly. Different from [23], [24], the aim of our approach is a more accurate estimation of TC size.

A. PROCESSING STRATEGY
Our heuristics are based on careful observations, which help verify the feasibility of our approach. Given a node u, let v_max be the node that has the largest TC size among u's out-neighbors, and let n_u be the number of out-neighbors of u; TC_err denotes the relative difference between u's accurate TC size and n_u + N(v_max). The smaller the value of TC_err, the smaller the difference between u's TC size and the sum of the largest TC size among its out-neighbors and the number of its out-neighbors. From Table 2, we know that for most datasets, TC_err is very small, usually less than 0.1%, which means that if we can get the TC sizes of u's out-neighbors as accurately as possible, then we can get a more accurate estimation of u's TC size. Therefore, we propose a bottom-up strategy to estimate the TC size of all nodes, during which we use a spanning tree to help obtain more accurate estimations.

FIGURE 1. A DAG G with topo-order X and its spanning tree T. The integer on the right of each node in G is its topo-order, and the pair on the right of each node in T is its interval.

1) THE SPANNING TREE
Given a topo-order X of a DAG G, the topo-order spanning (TPS) tree T is a spanning tree of G. The incoming edge to a node v in T is from its last in-neighbor u, i.e., the in-neighbor with the maximum topo-order w.r.t. X. For example, given G in FIGURE 1(a), where the integer on the right of each node is its topo-order, the tree T is shown in FIGURE 1(b). We assign each node v an interval I_v = [s, e] to facilitate checking the ancestor-descendant relationship for nodes in T, where I_v.s = t_v and I_v.e is the maximum topo-order among v's descendants in the spanning tree T. I_v ⊂ I_u means that u is an ancestor of v.
Given a DAG G, we use Algorithm 1 to obtain the topo-order X and generate a TPS tree T. First, to get the topo-order X, the topological sorting is done by (1) finding the set of ''start nodes'' without incoming edges and inserting them into a stack S (lines 2-5); (2) popping a node v from S, assigning v its visiting order t_v, deleting the edges starting from v, and pushing those out-neighbors of v that no longer have incoming edges into S (lines 7-14); and (3) repeating (2) until S becomes empty. The time complexity of topological sorting is O(|V| + |E|), and the space complexity is O(|V|).
Second, we construct the TPS tree T while performing the topological sorting. Specifically, when a node v receives its topo-order, its tree parent is set to be its in-neighbor with the maximum topo-order (line 14). Therefore, T is constructed by inserting nodes into it in ascending topo-order. The interval of each node v ∈ V is then constructed in descending topo-order (lines 15-17).
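The construction above can be sketched as follows. This is a minimal illustration under the assumption that the graph is given as adjacency lists; the function name and output format are ours, not the paper's, and the tree parents are derived in a separate pass for clarity rather than inside the sorting loop as in Algorithm 1.

```python
def build_tps(nodes, adj):
    """Topological sort (Kahn's algorithm with a stack), TPS parents,
    and tree intervals I_v = [t_v, max topo-order in v's subtree]."""
    indeg = {u: 0 for u in nodes}
    for u in nodes:
        for v in adj.get(u, []):
            indeg[v] += 1
    stack = [u for u in nodes if indeg[u] == 0]   # the "start nodes"
    t, order = {}, []
    while stack:
        u = stack.pop()
        t[u] = len(order) + 1                     # visiting order = topo-order
        order.append(u)
        for v in adj.get(u, []):
            indeg[v] -= 1
            if indeg[v] == 0:
                stack.append(v)
    # TPS parent: the in-neighbor with the maximum topo-order
    parent = {u: None for u in nodes}
    for u in nodes:
        for v in adj.get(u, []):
            if parent[v] is None or t[u] > t[parent[v]]:
                parent[v] = u
    # intervals, built in descending topo-order (children finish before parents)
    interval = {u: [t[u], t[u]] for u in nodes}
    for u in reversed(order):
        p = parent[u]
        if p is not None:
            interval[p][1] = max(interval[p][1], interval[u][1])
    return t, parent, interval, order
```

For the DAG a→b, a→c, b→d, c→d (out-neighbors pushed in list order), node a gets interval [1, 4], while c, whose tree parent is a, gets the singleton interval [2, 2].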
Theorem 1: Given a CN v, its TC size is equal to its interval length.
Proof: For a node v, v's TC size is the number of nodes that v can reach. If v is a CN, then by Definition 1 the interval of v covers exactly v's reachable nodes, so the length of the interval equals the accurate TC size.
For example, a is a CN in FIGURE 1(b). According to Theorem 1, Ñ(a) = I_a.e − I_a.s = 11 − 1 = 10, which is its accurate TC size. Thus, if we know which nodes are CNs, we know their accurate TC sizes, based on which we can make more accurate TC size estimations. To mark all CNs, we first need to know, for each node v, the farthest node u that v can reach: let far(v) be u's topo-order, i.e., v can reach u and u has the largest topo-order among all nodes of o*_G(v). We first compute far(v) for every node v; after that, if I_v.e = far(v), we can safely conclude that v is a CN. The correctness is based on the following result.
It can further be shown that the converse holds as well: if v is a CN, then I_v.e = far(v).
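Marking CNs then amounts to one more reverse-topological pass. The sketch below is ours, assuming the topo-orders t, the TPS-tree intervals, and a topological order have already been computed; for uniform handling of leaves, far(v) here also considers v's own topo-order.

```python
def mark_cns(nodes, adj, t, interval, order):
    """A node v is a CN iff the end of its tree interval equals far(v),
    the largest topo-order among v and all nodes that v can reach."""
    far = {}
    for u in reversed(order):        # out-neighbors are finished first
        far[u] = max([t[u]] + [far[v] for v in adj.get(u, [])])
    return {u for u in nodes if interval[u][1] == far[u]}
```

On the DAG a→b, a→c, b→d, c→d with topo-orders a:1, c:2, b:3, d:4 and tree intervals a:[1,4], b:[3,4], c:[2,2], d:[4,4], the CNs are {a, b, d}; c is excluded because it reaches d (topo-order 4) outside its own subtree. For a, Theorem 1 then gives the accurate TC size 4 − 1 = 3 = |{b, c, d}|.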

2) ESTIMATION HEURISTICS
According to Theorem 1, if u is a CN, then we do not need to estimate u's TC size, since we already know the accurate value. For the other nodes, we use the following heuristics to make estimations, where v_max is the node with the largest estimated TC size in o_G(u).
(H1) Using v_max to make the estimation: Ñ(u) = |o_G(u)| + Ñ(v_max).
(H2) Using the sum of sub-tree sizes as the estimation: Ñ(u) = |o_G(u)| + |o*_T(u)|.
Example 2: For H1, consider d in FIGURE 1(a). o_G(d) = {e, h}, e is the node with the largest estimated TC size in o_G(d), and Ñ(e) = 4. With H1, we get the estimated size of d, i.e., Ñ(d) = |o_G(d)| + Ñ(e) = 2 + 4 = 6. For H2, consider the same node d: Ñ(d) = |o_G(d)| + |o*_T(d)| = 2 + 0 = 2.
As can be seen above, H1 and H2 are complementary to each other: when one produces a smaller value, the other usually produces a larger one. For example, for d in FIGURE 1(a), H2's result is 2, which is smaller than that of H1, and neither is greater than the accurate result. Further, when the sub-trees of a node u have similar sizes, H1 may produce results smaller than those of H2. Considering this, we take the larger value of H1 and H2 as the estimated result, as shown by Equation (1).
There are two cases in Equation (1): (Case-1) u is a CN, and we take its interval length I_u.e − I_u.s as its accurate TC size; (Case-2) u is not a CN, and we take the larger value of H1 and H2 as the estimated result.
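Equation (1) can be sketched as a single reverse-topological pass. This is our reading of H1/H2, with `tree_desc[u]` assumed to hold u's descendant count |o*_T(u)| in the TPS tree; the function name and input format are ours.

```python
def estimate_tc(order, adj, interval, cns, tree_desc):
    """Bottom-up estimation: accurate interval length for CNs,
    the larger of heuristics H1 and H2 for all other nodes."""
    est = {}
    for u in reversed(order):
        if u in cns:
            # Case-1: Theorem 1 gives the accurate TC size
            est[u] = interval[u][1] - interval[u][0]
        else:
            outs = adj.get(u, [])
            # H1: out-degree plus the largest out-neighbor estimate
            h1 = len(outs) + (max(est[v] for v in outs) if outs else 0)
            # H2: out-degree plus the tree-descendant count
            h2 = len(outs) + tree_desc[u]
            est[u] = max(h1, h2)    # Case-2
    return est
```

On the DAG a→b, a→c, b→d, c→d with CNs {a, b, d}, the only non-CN c receives max(1 + Ñ(d), 1 + 0) = 1, which here happens to equal its accurate TC size.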

3) TOP-DOWN VERIFICATION
Although both H1 and H2 usually produce values smaller than the accurate TC sizes, they may also produce estimated values larger than the accurate results. For example, for b in FIGURE 1(a), the estimated TC size is 13. However, b's in-neighbor is a, which is a CN, and we know that a's accurate TC size is 10 according to Equation (1). Given the accurate TC size of a, the TC size of b is at most 9, because every node that b reaches is also reached by a, while a additionally reaches b itself. Directly using the estimated result of b, i.e., 13, is thus far from accurate and should be avoided.
Motivated by this, we propose to revise the estimated results by a top-down verification based on CNs.
The basic idea of our approach is to visit all nodes in a top-down way. For each visited node v ∈ V, if the estimated value of an out-neighbor u of v is larger than that of v, we set Ñ(u) = Ñ(v) − 1, such that each node's estimated value is smaller than that of its in-neighbors.
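The verification can be sketched as one forward pass in topological order. This is a minimal illustration of the rule just stated, not the paper's Algorithm 4; the function name is ours.

```python
def verify_top_down(order, adj, est):
    """Clamp each out-neighbor's estimate to stay below its in-neighbor's.
    Visiting nodes in topological order guarantees that every in-neighbor
    has already been finalized before its out-neighbors are checked."""
    out = dict(est)
    for v in order:                  # in-neighbors are visited first
        for u in adj.get(v, []):
            if out[u] > out[v]:
                out[u] = out[v] - 1
    return out
```

For the example above, where the CN a has accurate TC size 10 and its out-neighbor b carries the overestimate 13, b is clamped down to 9.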

1) LARGE DEGREE NODES ON THE RIGHT
Assume that node v is one of u's out-neighbors. When the difference between the TC sizes of u and v is large, the estimated TC size of u, which is computed based on v's estimated TC size, may be far from accurate. According to Definition 1, every node on the rightmost path of the TPS tree is marked as a CN. Since a node u with a large out-degree is more likely to have a large TC size, if we can obtain its TC size accurately and use it to help estimate the TC sizes of u's in-neighbors, the estimation accuracy of many nodes can be improved. We therefore propose an optimization strategy that processes nodes with small out-degrees first when constructing the TPS tree, so as to make sure that nodes with large out-degrees end up on the rightmost path of the spanning tree T.
Example 4: We process G using topo-order X2 in FIGURE 3(a), which is generated by processing nodes with small out-degrees first during the topological sorting, as shown by the integer on the right of each node. During the topological sorting, we generate the TPS tree T2, as shown in FIGURE 3(b). For example, for node a and its out-neighbors b and f, we first process f because |o_G(b)| > |o_G(f)|.
For each node v of the given graph, let big be the out-neighbor with the largest out-degree among v's out-neighbors. We use Algorithm 5 to move big to the last position of v's adjacency list, such that big will be processed last. Then, the TPS tree constructed by Algorithm 1 will produce CNs with large TC sizes, based on which we can make more accurate TC size estimations. Algorithm 5 processes each node of the graph G one by one (line 1). When processing a node v, it identifies the out-neighbor with the largest out-degree and moves it to the last position of v's adjacency list. The time complexity of this algorithm is O(|V| + |E|), and the space complexity is O(|V|).
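The reordering step can be sketched as follows. This is a minimal, hypothetical version of Algorithm 5 (ties are broken by keeping the first maximum; the function name is ours):

```python
def move_max_outdeg_last(adj):
    """Move each node's largest-out-degree out-neighbor to the end
    of its adjacency list, leaving the relative order of the others intact."""
    result = {}
    for u, nbrs in adj.items():
        if len(nbrs) > 1:
            # index of the out-neighbor with the largest out-degree
            i = max(range(len(nbrs)), key=lambda j: len(adj.get(nbrs[j], [])))
            result[u] = nbrs[:i] + nbrs[i + 1:] + [nbrs[i]]
        else:
            result[u] = list(nbrs)
    return result
```

For a node a with out-neighbors [b, f] where |o_G(b)| = 2 > |o_G(f)| = 1, a's list becomes [f, b], so b is pushed last during the topological sorting.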

2) ESTIMATING WITH TWO TREES
According to Definition 1, we can get the accurate TC size of a CN; the more CNs we identify, the more accurate the TC size estimation. Considering this, we propose to use two TPS trees to find more CNs. The heuristics for TC size estimation are shown as Equation (2). To use Equation (2), we first construct two TPS trees and identify the CNs on both trees; after that, we can make more accurate estimations using Equation (2).
There are three cases in Equation (2). In (Case-1) and (Case-2), u is a CN in the first or the second TPS tree, respectively, and we take the corresponding interval length as the accurate TC size, i.e., |o*_G(u)| = I_u.e − I_u.s. In (Case-3), u is not a CN in either tree, and we take the larger of the two trees' estimated results, which guarantees that Ñ(u) is closer to the accurate value.
Given a DAG G, our algorithm buTCS (Algorithm 6) obtains a TC size estimation for every node. It first calls Algorithm 5 to reorder the adjacency lists of G such that the out-neighbor with the maximum out-degree is at the last position of each node's adjacency list, and then calls Algorithm 1 to build two different TPS trees. After that, it marks the CNs by calling Algorithm 2, calls Algorithm 3, which uses Equation (2), to obtain the estimated results, and finally calls Algorithm 4 to verify the estimated results. The overall time complexity of buTCS is O(|V| + |E|). Since we need to maintain two TPS trees and the CNs during processing, the space complexity of buTCS is O(|V|).
Example 6: Given G with topo-order X in FIGURE 1(a), the estimated results of linearTC [23], [24] and buTCS are shown in FIGURE 4(a) and (b), respectively. With the linearTC algorithm, we obtain Ñ(a) = 13, Ñ(b) = 13, and Ñ(c) = 8, which are far from the accurate TC sizes. With the buTCS algorithm, however, we mark a, b, and c as CNs and obtain their accurate TC sizes.

IV. EXPERIMENT
We compare several existing TC size estimation approaches, including lb, ub [22], kr (k = 50/100/150) [21], linearTC [23], [24], and our buTCS. All experiments were run on a PC with an Intel(R) Core(TM) i5-7300HQ CPU @ 2.50GHz, 8 GB memory, and Ubuntu 18.04 Linux OS. Table 3 shows the statistics of the 13 real datasets used in our experiments. Among them, email,1 wiki,1 LJ,1 and web1 are directed graphs initially; we transformed them into DAGs by coalescing each strongly connected component into a single node, and all four were downloaded from the same web page.1 All other datasets are DAGs initially, and all of these are large datasets (|V| > 100,000). email is a DAG transformed from the directed graph email-EuAll, which is an email network from an EU research institution. unip150m1 is a DAG obtained from the RDF graph of UniProt,2 which contains many nodes without incoming edges and few nodes without outgoing edges. wiki is a DAG transformed from the Wikipedia talk (communication) network wiki-Talk. LJ is a DAG of the online social network soc-LiveJournal1. web is a DAG of the web graph web-Google. Patent1 (cit-Patents) and citeseerx1 are both citation networks with the out-degree of non-leaf nodes ranging from 10 to 30. dbpedia3 is the DAG of a knowledge graph. govwild4 is a DAG transformed from a large RDF graph. gounip1 (go-uniprot) and 10go-unip1 (10go-uniprot) are DAGs transformed from the joint graph of Gene Ontology terms with the annotation file from UniProt. twitter1 is a DAG transformed from a large-scale social network obtained from twitter.com. web-uk1 is a DAG of a web graph dataset. It is worth noting that for the twitter dataset in Table 3, p↓leaf + p↑leaf > 1, because many nodes have neither in-neighbors nor out-neighbors.

1 http://snap.stanford.edu/data/index.html

A. TC SIZE ESTIMATION ACCURACY
Let N(u) = |o*_G(u)|, and let X be an algorithm for TC size estimation. We use two metrics, error(u) and dev(X, G), to show the accuracy of the TC size estimation of different approaches. error(u) is computed by Equation (3) and evaluates the quality of TC size estimation for a single node: the smaller error(u), the more accurate u's estimated TC size. As a supplement, dev(X, G) is the standard deviation computed by Equation (4), which measures how far the estimated TC sizes of the nodes deviate from their accurate values w.r.t. X on G, and thus evaluates the quality of TC size estimation over all nodes of the input graph G. The smaller the standard deviation, the less the estimated values deviate from the accurate values and the less the overall estimation fluctuates.
Further, we define the deviation promotion rate as Equation (5), by which we can evaluate the improvement of buTCS over each existing approach X. For a graph G and a method X, a large promotion(buTCS, X, G) means that buTCS achieves a significant improvement over X on G.
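Since Equations (3)-(5) are not reproduced here, the following is only one plausible reading of the three metrics, offered as an illustration: relative error per node, root-mean-square of those errors over all nodes, and relative deviation reduction. The exact formulas in the paper may differ.

```python
import math

def error(n_est, n_acc):
    # relative error of one node's estimate (guarded against TC size 0)
    return abs(n_est - n_acc) / max(n_acc, 1)

def dev(est, acc):
    # root-mean-square of the per-node errors over all nodes
    errs = [error(est[u], acc[u]) for u in acc]
    return math.sqrt(sum(e * e for e in errs) / len(errs))

def promotion(dev_butcs, dev_x):
    # fraction by which buTCS reduces the deviation of method X
    return (dev_x - dev_butcs) / dev_x
```

Under this reading, a promotion of 0.9 means that buTCS's deviation is one tenth of the compared method's, matching the way Table 5 is discussed below.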
FIGURE 5 shows the comparison of the error rates of different TC size estimation algorithms on the 13 datasets, from which we know that for all graphs, our buTCS is more accurate than existing methods, since with our method most nodes have error(u) ∈ [0, 0.2). For example, for email, unip150m, wiki, LJ, web, citeseerx, and twitter, every node's error is in the [0, 0.2) interval when using buTCS. Table 4 shows the standard deviations of different TC size estimation algorithms, from which we can see that for all graphs, buTCS has the smallest standard deviation, which means that buTCS makes more accurate estimations than existing algorithms. For example, buTCS achieves a standard deviation of less than 0.01 on several datasets, such as email, unip150m, wiki, LJ, web, citeseerx, twitter, and web-uk. Even though linearTC also achieves better estimations than the other existing approaches, buTCS still works better than linearTC on all datasets. The reason lies in that (1) our spanning tree helps identify more CNs, and (2) our approach performs a top-down verification after estimation; both steps make the estimation more accurate than that of linearTC. For the kr method (k = 50, 100, 150), as k increases from 50 to 150, the standard deviation becomes smaller, but it is still larger than that of our approach. Table 5 shows the deviation improvement of our approach over existing approaches, from which we have the following observations.
(2) Compared with kr (k = 50, 100, 150), buTCS's improvement ratio is more than 0.90 on amaze, kegg, email, unip150m, wiki, LJ, web, and twitter. kr estimates the results based on k random permutations, so its accuracy depends on the value of k: the larger k is, the more accurate the estimation. Even so, the improvement ratio is positive on every dataset, which means that buTCS works better than kr on all datasets.
(3) Compared with linearTC, buTCS's improvement ratio is more than 0.20 on email, unip150m, wiki, web, dbpedia, govwild, and twitter. Especially on unip150m, the improvement ratio is 1, which also means that linearTC does not work well on this dataset. Further, for linearTC, the estimated TC size of a node may be greater than the accurate value of its in-neighbor, which may leave the estimated value of the in-neighbor far from accurate.
However, it is worth noting that buTCS does not work equally well on all datasets, especially on graphs where most nodes have no in-neighbors, such as go-unip and 10go-unip, where more than 95% of the nodes have no in-neighbors. The reason lies in that for these two graphs, the number of CNs is less than 0.1%, which means that for most nodes, the estimated TC size may not equal the accurate value under our approach. As a comparison, for the wiki dataset, 98.9% of the nodes are CNs, which means that for 98.9% of the nodes, the estimation equals the accurate TC size. Further, our approach is inspired by TC_err, and from the fifth column of Table 2, we know that TC_err is more than 30% for go-unip and 10go-unip, which also indicates that our approach may not work well on these two datasets. From FIGURE 5, Table 4, and Table 5, we know that even though our approach does not work as well on go-unip and 10go-unip as on the other datasets, it still achieves the best estimation accuracy among all existing approaches on these two datasets.

FIGURE 6 shows the comparison of actual memory usage, from which we know that buTCS consumes more memory than linearTC (usually 7%-10% more), because buTCS uses two spanning trees. However, as both approaches have linear space complexity, all these graphs can be loaded into memory and processed with limited space. For example, web-uk has more than 20,000,000 nodes, and the actual memory usage is 985 MB for linearTC and 1,072 MB for buTCS. FIGURE 7 shows the comparison of running times, from which we know that buTCS consumes a similar amount of time as ub, lb, and linearTC, because the time complexity of these algorithms is O(|V| + |E|). As a comparison, kr needs much more time for TC size estimation, and as k increases, it consumes even more time.

B. TC SIZE ESTIMATION EFFICIENCY
From FIGURE 6 and FIGURE 7, we know that for buTCS, all these datasets can be loaded into memory and efficiently processed with limited space, since buTCS has linear space and time complexities.

C. IMPACTS OF THE OPTIMIZATION
To show the impacts of the optimization techniques used in our algorithm, we test three versions of our approach: (1) buTCS-B, the basic version without the two optimizations; (2) buTCS-1, which uses only the first optimization; and (3) buTCS-2, which uses both optimizations and is the full buTCS algorithm.
The experimental results show that buTCS-2 works more accurately than buTCS-B and buTCS-1. The reason lies in that buTCS-2 not only constructs two TPS trees but also moves large-out-degree nodes to the right of the spanning tree. Both optimizations help find more CNs, which results in more accurate estimations.
From Table 6, we know that buTCS-2 has a smaller standard deviation than buTCS-B and buTCS-1. Table 7 shows the comparison of running times, from which we know that buTCS-2 consumes a little more time than buTCS-B and buTCS-1, since buTCS-2 uses both optimizations.

V. CONCLUSION
In this paper, we focus on TC size estimation. We propose an algorithm, namely buTCS, to estimate all nodes' TC sizes in linear time O(|V| + |E|), where V is the set of nodes and E is the set of edges of the input graph. The basic idea is to estimate a node's TC size in two steps. First, it computes the initial estimation of a node based on the estimated TC sizes of its out-neighbors. Second, it revises the result with a top-down verification. We further propose two optimization strategies to make improvements. We conduct extensive experiments on 13 real datasets. The experimental results show that our approach obtains better estimations efficiently compared with the existing algorithms. As an indication, among the 13 real datasets tested, the standard-deviation improvement ratio of buTCS is around 97% compared with lb, ub, and kr. Compared with linearTC, buTCS obtains better estimations on all datasets, and the standard-deviation improvement ratio is more than 30% on several datasets.