Structural Decomposition Model for the Evolution of AS-Level Internet Topologies

Modeling Internet graphs at the autonomous-system (AS) level helps in recognizing and predicting the development trend of the evolving Internet topology from a macro perspective. In contrast to global statistical models, such as the power-law distribution of node degrees, structural decomposition models can represent local connections more effectively. In this paper, we propose a structure-based model. Starting with the classification of links among the AS nodes, the proposed model partitions the core and periphery of Internet graphs into 16 atomic-level solid and dotted components. Additionally, the model captures the stable evolving features of these components based on the UCLA dataset, which continuously explored Internet graphs over a long historical period from 2001 to 2015. Finally, according to the structure-based model, we design a new Internet-topology generator. Compared with recently proposed generators, our generator has the following advantages: (1) it accurately captures the structural decomposition property studied in this work, (2) it performs best on three statistical properties, namely, the distance, assortativity coefficient, and maximum degree, and (3) it exhibits the best comprehensive performance in terms of runtime and multiple graph properties.


I. INTRODUCTION
The autonomous-system (AS)-level Internet topology has evolved over time, and its size (i.e., the number of AS nodes) has rapidly grown from 10,000 in 2001 to approximately 50,000 in 2015 [1]. In addition, many potential Internet structures, e.g., named data networking [2], [3], information-centric networking [4], and location-based networking [5], may lead the Internet into a new era. Thus, predicting the evolution trend of the topology is critical. Moreover, topological models are commonly used tools for testing network technologies, such as routing protocols [6]. In contrast to large-scale realistic networks, these models can generate small-scale graphs with adjustable characteristics, which greatly reduces the cost and provides a variety of topological scenarios for testing. Therefore, accurately modeling and recognizing the evolution of the Internet topology is important.
The associate editor coordinating the review of this manuscript and approving it for publication was Zehua Guo.
The development of Internet-topology models has gone through three stages, namely, random, power-law, and structure-based graphs. Random graphs, such as those of the Waxman model [7], are generated by connecting node pairs at random, independently of one another. Although the random-graph models are easy to use, they do not capture the statistical features of the Internet well [8]. One of the important metrics used in topology analysis is the distribution of the node degrees. Faloutsos et al. [9] demonstrated that this property is well described by power-laws in the Internet topology. In addition, many studies have focused on power-law models [10], such as the Inet-3.0 model. However, the inherent biases of traceroute sampling and of collecting border gateway protocol (BGP) data from limited vantage points have led researchers to question whether power-laws truly exist in the topology, although the degree distribution is widely accepted to be heavy-tailed [11]. In contrast to the power-law models, which focus on global statistical properties, the structure-based models pay more attention to the local structures, which represent the unique characteristics of the Internet. An AS is a set of routers within a single administrative domain, and the Internet is built on two domain categories, i.e., transit and stub [12]: a transit AS usually carries traffic among other domains, whereas a stub AS, which is connected to the end hosts, relies on at least one transit AS for connectivity to the rest of the Internet. According to this physical interpretation, transit-stub models [12], [13] have been widely used to represent the hierarchical structure of the Internet.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
However, having only a few AS categories usually leads to poor performance of the generators that use these models in terms of statistical properties, such as the power-law of the node degrees [13]. In addition, many studies have investigated the local structures [14]-[23]. Zhou et al. [16] found the rich-club connectivity phenomenon, namely, that the Internet core has a high connection density. Carmi et al. [17] partitioned the Internet using k-shell decomposition. Çetinkaya et al. [18] analyzed the Internet backbone. Jia et al. [19] studied the structure of the evolving IPv6 Internet. Liu et al. [20] partitioned the Internet into single-edge, binary, and triangular components. Accongiagioco et al. [21] found that the Internet core can be broken down into two layers. Lei et al. [22] studied the structure entropy. Zu et al. [23] proposed a city-level IP geolocation algorithm that decomposes the Internet through the geographical location of nodes. However, structural models that capture the evolution features of the Internet topology need to be studied further.
Recently, we partitioned the Internet transit and stub AS nodes into seven categories by analyzing the physical meaning of the normalized Laplacian spectral properties [24]-[29]. However, these works only provided a static node classification and did not structurally distinguish the core from the periphery. In this paper, the novelty is twofold. First, we establish an Internet core-periphery structural model composed of 16 atomic-level solid and dotted components and find many uniform distribution features in the decomposed components based on the UCLA dataset, which spans 15 years [1]. In addition, we observe that most of the node and edge properties of these components remain constant during the historical evolution from 2001 to 2015, except for the top five highest degrees of the transit AS nodes. Second, we propose an Internet-topology generator based on our structural model and numerically verify that the generator demonstrates the best comprehensive performance in terms of runtime and multiple graph properties compared with recently proposed generators.
The remainder of this paper is organized as follows. Section II describes the background and related work. Section III presents our Internet core-periphery structural model. Section IV presents the investigation of the evolutionary stability of the different components of our model using the UCLA dataset. Section V proposes a new generator, namely, SICPS (Simulates Internet graphs using the Core-Periphery Structure). Section VI shows the results obtained by SICPS. Finally, Section VII concludes this paper.

II. BACKGROUND AND RELATED WORK
The Internet topology is usually studied at two levels: router and AS levels [30], [31]. The former adopts routers as nodes, whereas the latter, which is the focus of our research, maps a set of routers to an AS node and describes the BGP connections among these AS nodes.

A. GLOBAL STATISTICAL PROPERTIES
The AS-level Internet topology is represented as a simple undirected graph $G = (V, E)$, where $V$ and $E$ are the AS node and edge sets, respectively. Because the topology inherits the nontrivial properties of complex networks, it is usually described using statistical properties such as degree, distance, and clustering [30]-[36]. A commonly used degree property is the degree distribution $P(k)$, which is the probability that a randomly selected node has degree $k$. However, the Internet topology is usually represented by another degree property, namely, the complementary cumulative distribution function (CCDF) of the degree, which is defined as $F(k) = \sum_{d > k} P(d)$, because this degree property is closer to the power-law [10].
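The two degree properties above can be illustrated with a minimal Python sketch. The function name `degree_ccdf` and the toy star graph are assumptions made for illustration only; nodes without any edge are not represented because the input is an edge list.

```python
from collections import Counter

def degree_ccdf(edges):
    """Return (P, F) for an undirected edge list.

    P[k] is the fraction of nodes with degree k.
    F[k] is the fraction of nodes with degree strictly greater than k,
    i.e. F(k) = sum over d > k of P(d).
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(degree)
    counts = Counter(degree.values())
    P = {k: c / n for k, c in counts.items()}
    F = {}
    tail = 0.0
    for k in sorted(P, reverse=True):
        F[k] = tail      # probability mass of all degrees d > k
        tail += P[k]
    return P, F

# Hypothetical toy graph: one hub connected to three leaves.
P, F = degree_ccdf([("hub", "a"), ("hub", "b"), ("hub", "c")])
```

In this toy graph, three of the four nodes have degree one, so `P[1]` is 0.75, and only the hub has a degree greater than one, so `F[1]` is 0.25.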
The average neighbor connectivity is a degree correlation property [32], which is defined by Eq. (1), where $P(k, k')$ denotes the joint degree distribution, namely, the probability that a randomly selected edge connects $k$- and $k'$-degree nodes, and $k_{max}$ and $\bar{k}$ are the maximum and average degrees, respectively. In addition, the assortativity coefficient [33] summarizes the node interconnectivity and is defined by Eq. (2), where $j_i$ and $k_i$ are the degrees of the nodes at the ends of the $i$th edge and $|\cdot|$ denotes the cardinality of a set. This paper chooses the spectral property $R = \sum_{i=1}^{|V|} (1 - \lambda_i)^4$ to describe the distance, where $\lambda_i$ ($i = 1, 2, \ldots, |V|$) are the eigenvalues of the normalized Laplacian matrix of graph $G$ [34], because our recent work has demonstrated that $R$ is a good indicator of the average path length [26]. Moreover, $R$ can be calculated faster, using the circle enumeration method, than the average path length [28].
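As a hedged illustration of the spectral property $R$, the sketch below relies on the identity $\sum_i (1 - \lambda_i)^4 = \mathrm{tr}(N^4)$ with $N = D^{-1/2} A D^{-1/2}$, which holds because the normalized Laplacian equals $I - N$. The function name `spectral_r` is an assumption, the code is a dense-matrix toy for small graphs, and it is not the circle enumeration method of [28].

```python
import math

def spectral_r(edges):
    """Compute R = sum_i (1 - lambda_i)^4 as trace(N^4).

    N = D^{-1/2} A D^{-1/2}; every node is assumed to have degree >= 1,
    which holds automatically because nodes are read from the edge list.
    """
    nodes = list(dict.fromkeys(x for e in edges for x in e))
    idx = {v: i for i, v in enumerate(nodes)}
    n = len(nodes)
    A = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        A[idx[u]][idx[v]] = A[idx[v]][idx[u]] = 1.0
    deg = [sum(row) for row in A]
    N = [[A[i][j] / math.sqrt(deg[i] * deg[j]) if A[i][j] else 0.0
          for j in range(n)] for i in range(n)]
    # N2 = N * N (plain O(n^3) product, acceptable for a toy example).
    N2 = [[sum(N[i][k] * N[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    # trace(N^4) = trace(N2 * N2) = sum over i, j of N2[i][j] * N2[j][i].
    return sum(N2[i][j] * N2[j][i] for i in range(n) for j in range(n))

# Triangle: the eigenvalues of N are 1, -1/2, -1/2, so R = 1 + 1/16 + 1/16.
r_triangle = spectral_r([(1, 2), (2, 3), (1, 3)])
```

The trace form avoids an explicit eigenvalue decomposition, which is the reason a walk-counting approach to $R$ can be cheaper than computing all shortest paths.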
Moreover, clustering is usually described by the clustering coefficient $C(k) = 2m(k)/(k(k-1))$, where $m(k)$ is the average number of links between the neighbors of $k$-degree nodes. Specifically, $C(k)$ can be summarized as the average clustering coefficient $\bar{C} = \sum_k C(k)P(k)$ [33].
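The clustering definitions can be sketched in a few lines of Python. The function name `clustering_by_degree` is hypothetical, and setting $C(1) = 0$ for one-degree nodes is an assumption of this sketch, since the formula is undefined for $k = 1$.

```python
from collections import defaultdict

def clustering_by_degree(edges):
    """Return (C_k, C_avg): C_k[k] = 2 m(k) / (k (k - 1)), C_avg = sum_k C(k) P(k)."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    n = len(adj)
    links = defaultdict(list)  # degree k -> neighbor-link counts of the k-degree nodes
    for v, nbrs in adj.items():
        # Count each link between two neighbors of v once (a < b).
        m_v = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
        links[len(nbrs)].append(m_v)
    C_k, P_k = {}, {}
    for k, ms in links.items():
        m_k = sum(ms) / len(ms)            # m(k): average over the k-degree nodes
        C_k[k] = 2 * m_k / (k * (k - 1)) if k > 1 else 0.0  # assumed C(1) = 0
        P_k[k] = len(ms) / n
    C_avg = sum(C_k[k] * P_k[k] for k in C_k)
    return C_k, C_avg

# Hypothetical toy graph: a triangle a-b-c with a pendant node d attached to a.
C_k, C_avg = clustering_by_degree(
    [("a", "b"), ("b", "c"), ("a", "c"), ("a", "d")])
```

Here the degree-three node `a` has one link among its three neighbors, so $C(3) = 1/3$, whereas both degree-two nodes sit in the triangle, so $C(2) = 1$.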
Although the aforementioned properties represent the global statistical features of the topology, they neglect the local structure of the Internet, which is an essential factor that distinguishes the Internet from other complex networks.
1) INET-3.0 MODEL
Inet-3.0 is a classical power-law model of the Internet topology [10]. First, it generates $n$ nodes and uses two power-laws, namely, the degree-rank and degree CCDF, to respectively determine the degrees of the top three and the other nodes. Second, it uses a preferential connection rule and the fraction of degree-one nodes to construct a tree that includes the $n$ nodes. Finally, it fills all free degrees of the $n$ nodes and outputs a graph.

2) ORBIS MODEL
ORBIS uses dK to characterize an Internet graph [14], where dK describes the correlations among the degrees of nodes in subgraphs with d nodes. In particular, 1K and 2K reproduce the degree and joint degree distributions, respectively. In addition, ORBIS uses two random-graph-construction algorithms to simulate the 1K and 2K Internet graphs. Mahadevan et al. [14] found that the 2K random graphs could capture more properties of the Internet topologies except clustering and indicated that the clustering could be reproduced by 3K random graphs. However, the design of 3K generators is still a problem that needs to be resolved [14].
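To make the 1K and 2K statistics concrete, the following sketch counts node degrees and degree pairs of edge endpoints from an edge list. It is only an illustration of the quantities that a dK description records, not ORBIS's construction algorithms, and the function name `one_k_two_k` is an assumption.

```python
from collections import Counter

def one_k_two_k(edges):
    """Return (one_k, two_k) for an undirected edge list.

    one_k[k] is the number of nodes with degree k (the 1K statistic).
    two_k[(k1, k2)], with k1 <= k2, is the number of edges whose
    endpoints have degrees k1 and k2 (the 2K statistic).
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    one_k = Counter(degree.values())
    two_k = Counter()
    for u, v in edges:
        k1, k2 = sorted((degree[u], degree[v]))
        two_k[(k1, k2)] += 1
    return one_k, two_k

# Hypothetical toy graph: the path a-b-c-d.
one_k, two_k = one_k_two_k([("a", "b"), ("b", "c"), ("c", "d")])
```

Dividing the pair counts by the number of edges gives an empirical joint degree distribution over unordered degree pairs; that normalization is a simplification chosen here, and other conventions weight the pairs differently.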

3) S-BITE MODEL
S-BITE is a model based on the core structure [21]. The model partitions the Internet topology into two distinct blocks: the core, where the nodes are tightly interconnected, and the periphery. Moreover, it separates the core into two layers: the centrum, whose density curve remains almost constant during the five-year evolution of the Internet from 2007 to 2011, and layer-1, whose degree distribution in the vertical network, which is composed of the centrum nodes plus all the layer-1 nodes and their connections to the centrum, is approximately uniform. The model simulates the peripheral nodes using three global statistical parameters: $p$, $P_1$, and $P_2$. Specifically, $p$ is the probability that a peripheral node connects to two interconnected core nodes. $P_1$ and $P_2$ describe two statistical curves that are respectively related to the edge-number and preference distributions of connecting to the core for each newly added peripheral node. In other words, the local structures of the peripheral nodes are not considered. We note that the preference distribution $P_2$ is time-dependent [21], i.e., the distribution is not constant during the Internet evolution from 2007 to 2011.

4) SINETL MODEL
According to Faloutsos' transit-stub model [9], [37], as shown in Fig. 1(a), we add more multi-homed links to construct a static-node-classification model [25], as shown in Fig. 1(b), in which the red dotted lines represent the multi-homed links. Fig. 1(a) shows that each stub AS node is directly connected to a transit AS node through only one link in the early Internet topology, which is called a single-homed network. However, for better fault tolerance, an increasing number of stub AS nodes tend to be connected to transit AS nodes through more links, which is called a multi-homed network. Fig. 1(b) shows that more dotted (i.e., multi-homed) edges cause the transit and stub AS nodes to be partitioned into seven categories, namely, $Q$, $S$, $P$, $I$, $J$, $K$, and $L$. The seven node categories are defined in Eq. (3) [25], where $G_I = (V_I, E_I)$ is the subgraph of $G$ induced by node set $V \setminus (P \cup Q)$, and $d_G(v)$ and $d_{G_I}(v)$ are the degrees of node $v$ in graph $G$ and subgraph $G_I$, respectively. According to Eq. (3), all one-degree nodes in Internet graph $G$ are marked as $P$, and all nodes that are connected to $P$ are marked as $Q$. Removing all the $P$ and $Q$ nodes from $G$ yields the remaining graph $G_I$, in which all one-degree nodes with neighbors whose degree is not less than two are marked as $K$, all nodes that are connected to $K$ are marked as $S$, all one-degree nodes whose neighbors have a degree of one are marked as $J$, all zero-degree nodes are marked as $I$, and all $d$-degree nodes ($d \geq 2$) that are connected only to $S$ nodes are marked as $L$.
FIGURE 1. Two transit-stub models of the Internet topology. (a) Faloutsos's star-based model [9], [37]: deleting all the links between the transit AS nodes results in a non-connected graph that is constituted by the union of star subgraphs. (b) Our static node classification model [24], [25]: adding more multi-homed links for better fault-tolerance.
In addition, all nodes in Internet graph $G$ that remain unmarked are regarded as noise.
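The marking rules described above can be sketched as follows. This is a minimal reading of the prose, not the authors' implementation: the function name `classify_nodes`, the order in which the rules are applied (for example, $S$ before $L$), and the toy graph are assumptions.

```python
from collections import defaultdict

def classify_nodes(edges):
    """Label every node as Q, S, P, I, J, K, L, or 'noise'."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    label = {}
    # P: one-degree nodes of G. Q: remaining nodes adjacent to a P node.
    for v in adj:
        if len(adj[v]) == 1:
            label[v] = "P"
    for v in adj:
        if v not in label and any(label.get(u) == "P" for u in adj[v]):
            label[v] = "Q"
    # G_I: subgraph induced by the nodes that are neither P nor Q.
    rest = {v for v in adj if v not in label}
    sub = {v: adj[v] & rest for v in rest}
    for v in rest:
        d = len(sub[v])
        if d == 0:
            label[v] = "I"
        elif d == 1:
            (u,) = sub[v]
            # K if the only neighbor has degree >= 2 in G_I, otherwise J.
            label[v] = "K" if len(sub[u]) >= 2 else "J"
    for v in rest:
        if v not in label and any(label.get(u) == "K" for u in sub[v]):
            label[v] = "S"
    for v in rest:
        if (v not in label and len(sub[v]) >= 2
                and all(label.get(u) == "S" for u in sub[v])):
            label[v] = "L"
    for v in rest:
        label.setdefault(v, "noise")  # anything still unmarked
    return label

# Hypothetical toy graph containing one example of each of the six main categories.
edges = [("q1", "q2"), ("p1", "q1"), ("p2", "q2"),
         ("i1", "q1"), ("i1", "q2"),
         ("j1", "j2"), ("j1", "q1"), ("j2", "q2"),
         ("s1", "q1"), ("s1", "k1"), ("s1", "k2"),
         ("k1", "q1"), ("k2", "q2")]
labels = classify_nodes(edges)
```

In the toy graph, `i1` touches only $Q$ nodes and therefore has degree zero in $G_I$, while `j1` and `j2` form an isolated edge in $G_I$, which is why they are marked $I$ and $J$, respectively.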
Using the UCLA dataset [1], which contains AS graphs from 2001 to 2015, we find that the average percentage of noise nodes is 3.7% and the average percentage of the $L$ nodes is 1.6%. Thus, the six node categories $Q$, $S$, $P$, $I$, $J$, and $K$ can be used to accurately model the static node classification of the Internet graph.
Recently, we have designed a model SInetL to sample a given Internet graph using the static-node-classification feature [29]. The model extracts a subgraph using a sampling method of the six types of marked nodes. However, the model neglects the evolving correlation of the Internet graphs at different snapshots. The input of the sampling model is a complete graph that already contains all the topological information, whereas the input of the evolving simulation models only contains some topological features, which are more conducive to human cognition for a nontrivial topological structure. The sampling model is oriented to a static graph and is not suitable for structural cognition and trend prediction.

5) COMPARISON OF THE AFOREMENTIONED MODELS
The AS-level Internet topology is nontrivial [9]. Thus, researchers must recognize the topology from different perspectives. In general, a model is built from a cognitive perspective. Table 1 lists the cognitive perspectives of the aforementioned models. However, researchers usually do not care about the advantages and disadvantages of the different perspectives but pay more attention to whether the model properties from these perspectives exist in the explored graphs [10], [14], [21]. Thus, in Sections III and IV, we use a series of explored AS graphs in the UCLA dataset to mine the properties of our model.
In the existing literature [10], [14], [21], [29], comparison of the models was mainly realized through comparison of the AS graphs that were simulated by generators because the purpose of the generators was to simulate AS graphs that contained properties of these models.

III. CORE-PERIPHERY STRUCTURAL MODEL
Eq. (3) shows that Internet graph $G$ has many one-degree $P$ nodes, and each $Q$ node must be connected to at least one $P$ node. If the one-degree nodes are called pendants and the nodes connected to the one-degree nodes are called quasi-pendants, subgraph $G_I$ is also composed of pendants ($K$ and $J$ nodes), quasi-pendants ($S$ nodes), and the remaining zero-degree $I$ nodes, as expressed in Eq. (3), which exhibits the fractal structure of an Internet graph. Note that in subgraph $G_I$, each one-degree $J$ node can only be connected to another $J$ node. In addition, Eq. (3) shows that in graph $G$, each $I$ node must be connected to at least two $Q$ nodes and each $K$ (or $J$) node must be connected to at least one $Q$ node.
FIGURE 2. Connection relationships among the six node categories $Q$, $S$, $P$, $I$, $J$, and $K$ of Internet graph $G$, where the symbol $X_Y$ denotes the set that consists of all the $X$ nodes connected to $Y$ nodes. Note that the solid lines represent different types of edges that really exist in the graph, and the unidirectional and bidirectional dotted lines respectively establish the injective and bijective relationships between two node sets that belong to the same category. Specifically, $A$ and $B$ are the same node if the dotted line relation maps $A$ to $B$.
According to the aforementioned analysis, we derive the connection relationships among $Q$, $S$, $P$, $I$, $J$, and $K$, as shown in Fig. 2, which classify all the edges of Internet graph $G$ into six bipartite-graph sets and three interconnected sets, where a bipartite-graph set means that a traveler can move from one node set to the other in one hop. Taking the bipartite-graph edge set $I_Q - Q_I$ in Fig. 2 as an example, the subgraph induced by the edge set is a bipartite graph, where $I_Q$ and $Q_I$ are two disjoint node sets, and only edges between $I_Q$ and $Q_I$ are included in the subgraph.
In addition, Fig. 2 shows that the nine subgraphs induced by the distinct edge sets can be merged again using the nine injective or bijective relationships. According to Eq. (3), the structure shown in Fig. 2 can be analyzed as follows.
• First, we neglect the $Q$ nodes that are not connected to other $Q$ nodes because the percentage of such nodes is extremely small. According to this simplification, at the center of Fig. 2, the $Q$ subgraph that is induced by all the edges linking two $Q$ nodes includes all the $Q$ nodes that are defined in Eq. (3). Except for the $Q$ nodes, $S$ represents the set of remaining transit AS nodes. However, the probability of connecting two $S$ nodes is significantly less than that of connecting two $Q$ nodes because the vast majority of the $S$ nodes have low degrees. In other words, as shown in the lower right corner of Fig. 2, the $S_S$ subgraph that is induced by all the edges linking two $S$ nodes includes only part of the $S$ nodes that are defined in Eq. (3).
• Second, according to Eq. (3), we can determine $P = P_Q$.
• Finally, we use $X - Y$ to denote the bipartite subgraph induced by all the edges that link the $X_Y$ and $Y_X$ nodes, where $X, Y \in \{P, Q, K, S, I, J\}$, and analyze some characteristics of these subgraphs as follows. (1) The degrees of all the $P_Q$ nodes in the $P - Q$ subgraph are one. (2) The degrees of all the $K_S$ nodes in the $K - S$ subgraph are one. (3) The degrees of all the $I_Q$ nodes in the $I - Q$ subgraph are not less than two. (4) The $J - J$ subgraph has an even number of nodes, and the degree of each $J_J$ node in the subgraph is one.
To extract the tightly interconnected core from the structure shown in Fig. 2, we predefine a low-degree threshold $d^{tra}_{low}$ for the transit AS nodes and a high-degree threshold $d^{cor}_{hig}$ for the core, and set $d^{tra}_{low} = 10$ and $d^{cor}_{hig} = 100$, which capture the more stable characteristics of the evolution of the Internet (a detailed analysis is presented in Section IV).
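As a small illustration of how the two thresholds split degrees into tiers, consider the sketch below. The function name `degree_tier` and the exact handling of the boundaries are assumptions; they are chosen to agree with the low-degree range 1 to 10 and the middle-degree range 11 to 99 reported for the core in Section IV.

```python
D_TRA_LOW = 10   # low-degree threshold for transit AS nodes (value stated in the text)
D_COR_HIG = 100  # high-degree threshold for the core (value stated in the text)

def degree_tier(d, low=D_TRA_LOW, high=D_COR_HIG):
    """Map a degree to 'zero', 'l' (low), 'm' (middle), or 'h' (high).

    Assumed boundaries: low is 1..low, high is >= high, middle is in between.
    """
    if d <= 0:
        return "zero"
    if d <= low:
        return "l"
    if d >= high:
        return "h"
    return "m"

tiers = [degree_tier(d) for d in (1, 10, 11, 99, 100, 2500)]
```

Components that only distinguish low and high degrees can reuse the same helper by treating every degree above the low threshold as high.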
In addition, we use the following steps to separate the $Q$ subgraph denoted by $Q - Q$, which is induced by all the edges between the $Q$ nodes, into five subcomponents.
Step 1: Derive the low-degree node set $Q^B_{bip}(l)$ and obtain the first subcomponent $Q_{bip}$, i.e., the subgraph of $Q - Q$ induced by all the edges between the $Q^B_{bip}(l)$ and $Q^U_{bip}$ nodes. In addition, we partition $Q^U_{bip}$ into the low-degree node set $Q^U_{bip}(l)$ and the high-degree node set $Q^U_{bip}(h)$.
Step 2: Let $Q_{cor}$ be the subgraph of $Q - Q$ induced by node set $Q^U_{bip}$, and decompose $Q^U_{bip}$ into the low-degree node set $Q^U_{cor}(l)$, the high-degree node set $Q^U_{cor}(h)$, and the middle-degree node set $Q^U_{cor}(m)$ according to $d_{Q_{cor}}(v)$, the degree of node $v$ in subgraph $Q_{cor}$. Then, we derive the second to fourth subcomponents $Q^U_{cor}(h) - Q^U_{cor}(h)$, $Q^U_{cor}(h) - Q^U_{cor}(m, l)$, and $Q^U_{cor}(m, l) - Q^U_{cor}(m, l)$, namely, three subgraphs of $Q_{cor}$ induced by the different types of edges among node sets $Q^U_{cor}(h)$ and $Q^U_{cor}(m, l) = Q^U_{cor}(m) \cup Q^U_{cor}(l)$.
Step 3: Let $Q^B_{red}(l)$ be a subset of node set $Q^B_{bip}(l)$ in which each node has at least one $Q^B_{bip}(l)$ neighbor in $Q - Q$. Then, we obtain the fifth subcomponent $Q^B_{red}(l) - Q^B_{red}(l)$, which is the subgraph of $Q - Q$ induced by all edges between pairs of $Q^B_{red}(l)$ nodes.
Because node set $Q$ is derived from $P$, the topological structure associated with the $Q$ and $P$ nodes is shown in Fig. 3, in which the structural partition of the $P - Q$ subgraph is on the left and that of the $Q - Q$ subgraph is on the right. We define $Q^U_{cor}(0)$ as the set of zero-degree nodes in $Q_{cor}$ and $Q^B_{bip}(0)$ as the set of $Q^B_{bip}(l)$ nodes that are not connected to the $Q^U_{bip}$ nodes. In Fig. 3, we neglect the $Q^U_{cor}(0)$ and $Q^B_{bip}(0)$ nodes because the percentage of such nodes is much less than that of the rest of the topology. In the $P - Q$ subgraph in Fig. 3, $P_Q(l) = P_Q$ means that all the $P_Q$ nodes have low degrees. In addition, $Q_P$ can be divided into the low-degree node set $Q_P(l)$ and the high-degree node set $Q_P(h)$ according to the degree of node $v$ in the $P - Q$ subgraph. Thus, Fig. 3 shows the extraction of the core $Q_{cor}$ and two other components, i.e., $Q_{bip}$ and $P - Q$, from the perspective of edge classification. Note that the dotted ellipses in Fig. 3 can also be viewed as edge connections.
FIGURE 3. The solid ellipses represent the distinct types of edges that really exist in Internet graph $G$, the dotted ellipses establish the mapping (injective and bijective) relationships described in Fig. 2, and the rectangles denote node sets.
Fig. 2 shows that the $Q$ subgraph forms the connection center of the other peripheral structures. At the center of Fig. 4, we reduce the $Q$ subgraph into four types of node sets, namely, $Q^U_{cor}(h)$, $Q^U_{cor}(m)$, $Q^U_{cor}(l)$, and $Q^B_{bip}(l)$, which include all the $Q$ nodes. Using the UCLA dataset [1], we confirm that all stub AS nodes $P$, $I$, $J$, and $K$ have low degrees because the maximum degree of these nodes did not exceed 30 from 2001 to 2015. Thus, Fig. 4 shows the corresponding node sets as $I_Q(l)$, $J_Q(l)$, $J_J(l)$, $K_Q(l)$, and $K_S(l)$. However, transit AS nodes $Q$ and $S$ have two types of degrees, i.e., low and high degrees. Thus, in each subgraph $X - Y$ shown in Fig. 4, which is induced by node sets $X_Y$, $Y_X$ and all the edges between $X_Y$ and $Y_X$, we partition $Y_X$ into the low-degree node set $Y_X(l)$ and the high-degree node set $Y_X(h)$. We note that threshold $d^{tra}_{low}$ is reset to 30 for $Q_S(l)$ and $S_Q(l)$ in the $S - Q$ subgraph based on the analysis presented in Section IV.F. According to the node-set decomposition, we extend the relationships shown in Fig. 2 to the structural model shown in Fig. 4.
FIGURE 4. The solid ellipses represent the distinct types of edges that really exist in the graph, the rectangles denote node sets, and the dotted ellipses $U - W$ establish the mapping relationships between two node sets $U$ and $W$ that belong to the same category.
In summary, the combination of Figs. 3 and 4 forms our structural model, whose components are listed in Table 2.
In Section IV, we present the similarity between Accongiagioco's core model [21] and our model. In addition, our model describes the peripheral structure in detail using rich bipartite-graph connection relationships. Furthermore, we explore the stable characteristics of our model using the UCLA dataset, as presented in Section IV.

IV. EVOLUTIONARY STABILITY ANALYSIS
Our analysis uses the UCLA dataset because it provides public AS graphs that span 15 years from January 2001 to January 2015 [1]. This section first lists some notations associated with our structural model in Table 3 and then presents the analyses of the evolutionary stability of the distinct components of our model, as shown in Figs. 3 and 4 and Table 2.

A. $Q_{cor}$ COMPONENT
First, we analyze the node properties of the $Q_{cor}$ component, which has three node sets, namely, $Q^U_{cor}(h)$, $Q^U_{cor}(m)$, and $Q^U_{cor}(l)$. Fig. 5(a) shows that $|Q^U_{cor}(h)|$ and $|Q^U_{cor}(l)|$ increase linearly and $|Q^U_{cor}(m)|$ increases quadratically with exploration time $t$, which is defined as the number of months since January 2001. In addition, Fig. 5(b) shows that the degrees of the $Q^U_{cor}(l)$ nodes range from 1 to 10, whereas those of the $Q^U_{cor}(m)$ nodes range from 11 to 99. From the statistical curves of the UCLA graphs shown in Fig. 5(b), we observe that the low degrees of the $Q^U_{cor}(l)$ nodes and the middle degrees of the $Q^U_{cor}(m)$ nodes obey two distinct power-law distributions. Fig. 5(b) also shows that the low and middle degrees have constant distributions that do not change with time, whereas the high degrees tend to increase with time. We sort the degrees of the nodes in $Q^U_{cor}(h)$ in decreasing order and let the $i$th highest degree be $d^h_{cor}(i)$. Fig. 6(a) shows that the $d^h_{cor}(i)$ versus $t$ curve approximately obeys a linear relationship when $i \geq 6$, which is defined as $d^h_{cor}(i) = m_i t + b_i$ (Eq. (4)), where $m_i$ and $b_i$ are the slope and intercept of the linear relationship, respectively. However, the $d^h_{cor}(i)$ versus $t$ curve approximately exhibits some piecewise linear features when $i \leq 5$, which means that the evolutionary trends of the top five highest degrees are not constant over the 15-year historical process. From the comparison of Figs. 6(a), (b), and (g), we observe that the high-degree feature widely exists in other components of the Internet topology. In addition, Figs. 6(a), (b), and (g) show the following characteristics.
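The slope $m_i$ and intercept $b_i$ of such a linear relationship can be estimated by ordinary least squares, as in the following sketch. The paper does not state which fitting procedure was used, so least squares is an assumption here, and the data points are synthetic rather than taken from the UCLA dataset.

```python
def fit_line(ts, ds):
    """Least-squares fit of d = m * t + b; returns (m, b)."""
    n = len(ts)
    t_mean = sum(ts) / n
    d_mean = sum(ds) / n
    sxx = sum((t - t_mean) ** 2 for t in ts)
    sxy = sum((t - t_mean) * (d - d_mean) for t, d in zip(ts, ds))
    m = sxy / sxx
    b = d_mean - m * t_mean
    return m, b

# Synthetic example: a degree that grows by 2 per month from 100 at t = 0.
months = [0, 12, 24, 36]
degrees = [100, 124, 148, 172]
m_i, b_i = fit_line(months, degrees)
```

Repeating the fit for every rank $i \geq 6$ yields one $(m_i, b_i)$ tuple per rank, which is the representation used for the high degrees.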
• In $Q_{cor}$ and $Q_{bip}$, which consist of transit AS nodes, the probabilities that the top five highest-degree nodes are connected to other transit AS nodes increase rapidly because the slopes of the lines shown in Figs. 6(a) and (b) indicate an increasing trend with time.
• In subgraphs $X - Q$, where $X \in \{P, I, J, K\}$, the probabilities that the top five highest-degree nodes are connected to stub AS nodes do not change significantly, as shown in Fig. 6(g).
In addition, Fig. 6 shows that the high degrees in the peripheral structure that consists of the $Q_{bip}$ and $X - Y$ subgraphs exhibit better power-law relationships in the $m_i$ ($b_i$) versus $i$ curve when $i \geq 6$.
FIGURE 6. In (a), (b), and (g), when $i \geq 6$, the $i$th highest degree versus $t$ approximately obeys a linear relation that can be represented by a tuple consisting of slope $m_i$ and intercept $b_i$. Specifically, in (c) and (e), the parameter $i$ is defined as the rank of the $Q^U_{cor}(h)$ nodes sorted in decreasing order of degree in $Q_{cor}$; in (d) and (f), the parameter $i$ is defined as the rank of the $Q^U_{bip}(h)$ nodes sorted in decreasing order of degree in $Q_{bip}$; in (h), the parameter $i$ is defined as the rank of the $Q_I(h)$ nodes sorted in decreasing order of degree in $I - Q$.
Next, we analyze the edge property of the $Q_{cor}$ component and point out the similarity between $Q_{cor}$ and Accongiagioco's core model [21]. In [21], the core nodes are divided into two layers, namely, centrum and layer-1. The core is divided into three networks, namely, the centrum network, which consists of all the centrum nodes and their mutual connections; the vertical network, which consists of the centrum network plus all the layer-1 nodes and their connections to the centrum; and the horizontal network, which consists of only the layer-1 nodes plus their connections to other layer-1 nodes. Fig. 3 shows that our model also decomposes core $Q_{cor}$ into three subgraphs, which are respectively induced by three types of edges in core $Q_{cor}$, namely, $Q^U_{cor}(h) - Q^U_{cor}(h)$, $Q^U_{cor}(h) - Q^U_{cor}(m, l)$, and $Q^U_{cor}(m, l) - Q^U_{cor}(m, l)$. In [21], the centrum network has the stable density curve $D(i) = 2e(i)/(i(i-1))$ ($i = 1, 2, \ldots, n_c$), where $1, 2, \ldots, n_c$ represents all the nodes in the network sorted according to a certain order and $e(i)$ is the number of edges in the subgraph of the network induced by the first $i$ nodes, $1, 2, \ldots, i$. In the present study, we also use the density curve to present the edge property in the $Q^U_{cor}(h) - Q^U_{cor}(h)$ subgraph and sort all the nodes in the subgraph in decreasing order of degree $d_{Q_{cor}}(v)$. Fig. 7(a) shows that the $Q^U_{cor}(h) - Q^U_{cor}(h)$ subgraphs of our model also exhibit a stable density curve on the UCLA dataset that spans 15 years.
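A density curve of this kind can be computed incrementally, as in the sketch below. The function name `density_curve` is hypothetical, nodes are ranked by their degree within the given edge list, ties are broken by node name as an assumption, and the curve starts at $i = 2$ because the ratio is undefined for a single node.

```python
from collections import defaultdict

def density_curve(edges):
    """Return [D(2), D(3), ..., D(n)] with D(i) = 2 e(i) / (i (i - 1)).

    e(i) is the number of edges among the i highest-degree nodes.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    order = sorted(adj, key=lambda v: (-len(adj[v]), str(v)))
    seen, e, curve = set(), 0, []
    for i, v in enumerate(order, start=1):
        e += len(adj[v] & seen)  # edges from the new node to earlier nodes
        seen.add(v)
        if i >= 2:
            curve.append(2 * e / (i * (i - 1)))
    return curve

# Hypothetical toy graph: a triangle a-b-c with a pendant node d attached to a.
curve = density_curve([("a", "b"), ("b", "c"), ("a", "c"), ("a", "d")])
```

The first two values equal one because the three highest-degree nodes form a clique, and the last value drops to 2/3 once the pendant node is included.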
Moreover, the degrees of the layer-1 nodes in the vertical network follow a uniform distribution [21]. To capture this feature, we define $P_c(k)$ as the probability that a $Q^U_{cor}(h) - Q^U_{cor}(m, l)$ edge connects a $k$-degree $Q^U_{cor}(m, l)$ node in core $Q_{cor}$. Fig. 7(b) shows that the $P_c(k)$ versus $k$ curve also approximately obeys a uniform distribution when $k$ falls in the middle degrees from 11 to 99. Thus, we infer that the $Q^U_{cor}(m)$ node set is associated with layer-1. We note that Fig. 5(a) shows that $|Q^U_{cor}(m)| \gg |Q^U_{cor}(l)|$, i.e., $Q^U_{cor}(m)$ occupies most of the nodes in core $Q_{cor}$. In addition, Fig. 7(b) shows that the $P_c(k)$ distribution remains stable in the UCLA dataset over the span of 15 years.
Because the degrees of the $Q^U_{cor}(l)$ nodes are obviously less than those of the $Q^U_{cor}(m)$ nodes in the $Q_{cor}$ component, the $Q^U_{cor}(m) - Q^U_{cor}(m)$ edges represent the principal part of the $Q^U_{cor}(m, l) - Q^U_{cor}(m, l)$ connections. To characterize the $Q^U_{cor}(m) - Q^U_{cor}(m)$ connection feature, we use the joint degree distribution $P(k_1, k_2)$, which is defined in Eq. (5) [32], where $m(k_1, k_2)$ is the number of edges connecting the nodes with degrees $k_1$ and $k_2$, $m$ is the total number of edges, and $\mu(k_1, k_2)$ is one if $k_1 = k_2$ and two otherwise. Specifically, $k_1$ and $k_2$ represent the degrees in core $Q_{cor}$. Through the analysis of the UCLA dataset that spans 15 years, we find that, for each given $k_1$, $P(k_1, k_2)$ approximately follows a uniform distribution for $k_2 \in [11, 99]$, which is the range of the middle degrees, as shown in Fig. 7(c). We note that Fig. 7(c) includes three extracted distributions $P(11, k_2)$, $P(50, k_2)$, and $P(99, k_2)$. Moreover, we define $P_l(k)$ as the probability that a $Q^U_{cor}(l) - Q^U_{cor}(m)$ edge connects a $k$-degree $Q^U_{cor}(m)$ node in core $Q_{cor}$. Fig. 7(d) shows that the $P_l(k)$ versus $k$ curve also approximately obeys a uniform distribution. We note that the $Q^U_{cor}(l) - Q^U_{cor}(l)$ edges are neglected in our model because the number of such edges is very small.

B. $Q_{bip}$ COMPONENT
First, we analyze the node properties of the $Q_{bip}$ component, which contains three node sets, namely, $Q^U_{bip}(h)$, $Q^U_{bip}(l)$, and $Q^B_{bip}(l)$. Fig. 8(a) shows that the cardinalities of these node sets, $|Q^U_{bip}(h)|$, $|Q^U_{bip}(l)|$, and $|Q^B_{bip}(l)|$, increase linearly with exploration time $t$. From the statistical curves shown in Figs. 8(b) and (c), we observe that the degrees of the $Q^U_{bip}(l)$ and $Q^B_{bip}(l)$ nodes have constant distributions. We note that the feature of high degrees associated with $Q^U_{bip}(h)$ has been analyzed, as shown in Figs. 6(b), (d), and (f).
Next, we analyze the edge properties of the $Q_{bip}$ component. Specifically, $Q^U_{bip}(h) - Q^B_{bip}(l)$ and $Q^U_{bip}(l) - Q^B_{bip}(l)$ represent the corresponding edge connections in the component.
Because the $Q^B_{bip}(l)$ degree distribution is constant in the UCLA dataset over the span of 15 years, we adopt $P_b(k)$, which is defined as the probability that a $Q^U_{bip}(h) - Q^B_{bip}(l)$ edge connects a $k$-degree $Q^B_{bip}(l)$ node in $Q_{bip}$. Fig. 9(a) shows that the $P_b(k)$ versus $k$ curve obeys a constant distribution in the UCLA dataset that spans 15 years.
In addition, because both the $Q^U_{bip}(l)$ and $Q^B_{bip}(l)$ degree distributions are constant in the UCLA dataset over the span of 15 years, we use the joint degree distribution $P(k_1, k_2)$, which is defined in Eq. (5), to analyze the $Q^U_{bip}(l) - Q^B_{bip}(l)$ connection feature, where $k_1$ and $k_2$ are the degrees of the $Q^U_{bip}(l)$ and $Q^B_{bip}(l)$ nodes in $Q_{bip}$, respectively. According to the UCLA dataset, we observe that $P(k_1, k_2)$ of the $Q^U_{bip}(l) - Q^B_{bip}(l)$ connections for each given $k_2$ approximately follows a uniform distribution for $k_1 \in [1, 10]$, i.e., the range of the degrees of the $Q^U_{bip}(l)$ nodes, as shown in Fig. 9(b).
For given k 2 , we derive the following probability:
$P_S(k_2) = \sum_{k_1=1}^{d^{tra}_{low}} P(k_1, k_2)$
and learn that the distribution of the P S (k 2 ) versus k 2 relationship is constant in the span of 15 years, as shown in Fig. 9(c). According to the aforementioned analysis, we can determine that
$P(k_1, k_2) \approx P_S(k_2) / d^{tra}_{low}$
where d tra low = 10 is the maximum degree of the Q U bip (l) nodes in the Q bip component.
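The uniform approximation stated above implies that the joint distribution can be rebuilt from the marginal P S (k 2 ) alone. The following is a minimal sketch of that reconstruction; the function name `joint_from_marginal` and the dictionary representation are assumptions made for illustration and are not part of the original model description.

```python
def joint_from_marginal(p_s, d_low=10):
    """Spread each marginal value P_S(k2) evenly over k1 = 1..d_low.

    p_s   : dict mapping k2 -> P_S(k2)
    d_low : maximum degree of the low-degree transit nodes (10 in the text)
    Returns a dict mapping (k1, k2) -> approximate P(k1, k2).
    """
    return {
        (k1, k2): p / d_low
        for k2, p in p_s.items()
        for k1 in range(1, d_low + 1)
    }
```

Summing the returned values over k 1 for a fixed k 2 gives back P S (k 2 ), which is the consistency condition that the approximation relies on.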
According to the list in Table 3, the Q B red (l) − Q B red (l) edges are not included in Q bip , but each Q B red (l) − Q B red (l) edge connects two Q B bip (l) nodes (with degrees k 1 and k 2 ) in Q bip , i.e., the edge corresponds to degree pair (k 1 , k 2 ), where k 1 , k 2 ∈ [1, d tra low ] and k 1 ≤ k 2 . We sort the degree pairs associated with the Q B red (l) − Q B red (l) connections using Algorithm 1.

11: End while
In Algorithm 1, line 1 inputs the maximum degree that may occur in the degree pairs. If the maximum value is U b , line 2 outputs a list that contains all possible degree pairs (k 1 , k 2 ) that satisfy k 1 ≤ k 2 ≤ U b . Moreover, lines 3-10 ensure that a degree pair (k 1 , k 2 ) with a smaller value of len = k 1 + k 2 is placed closer to the front of the list.
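The ordering described above can be illustrated with a short sketch written only from this description. The tie-breaking rule among pairs with equal len (smaller k 1 first) and the helper names are assumptions of the example.

```python
def sorted_degree_pairs(u_b):
    """Return all pairs (k1, k2) with 1 <= k1 <= k2 <= u_b, smallest k1 + k2 first."""
    pairs = [(k1, k2) for k1 in range(1, u_b + 1) for k2 in range(k1, u_b + 1)]
    # Primary key: len = k1 + k2. Tie-breaking on k1 is an assumption of this sketch.
    pairs.sort(key=lambda p: (p[0] + p[1], p[0]))
    return pairs


def pair_rank(pairs):
    """Map each degree pair to its 1-based rank in the sorted list."""
    return {p: i + 1 for i, p in enumerate(pairs)}
```

With U b = 10, the list holds 55 pairs, and the rank returned by `pair_rank` can serve as the horizontal axis when a joint degree distribution is plotted against the degree-pair rank.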
We use joint degree distribution P (k 1 , k 2 ) defined in Eq. (5) to show the Q B red (l) − Q B red (l) connections and use Algorithm 1 to sort all possible degree pairs (k 1 , k 2 ). Fig. 9(d) shows that P (k 1 , k 2 ) of the Q B red (l) − Q B red (l) connections also remains stable in the UCLA dataset over the span of 15 years. Furthermore, Fig. 8(d) shows that both node number |Q B bip (l)| and edge number |Q B red (l) − Q B red (l)| increase linearly with exploration time t. Thus, at any t, the two quantities are linearly related, as expressed in Eq. (8). Eq. (8) shows that |Q B red (l) − Q B red (l)| is very small because the slope of the linear expression of the edge number relative to the node number is approximately 0.23.

C. X − Y COMPONENT
This section illustrates that the X − Y component is one of the five instances, namely, P − Q, I − Q, J − Q, K − Q and K − S, which are listed in Table 2. Specifically, X denotes P, I , J , or K , and Y denotes Q or S. These instances share two common features. One is that X and Y are stub and transit AS node sets, respectively, and the other is that X Y (l) = X , i.e., all the X nodes have low degrees and are included in the X − Y component. Y X is divided into high-degree node set Y X (h) and low-degree node set Y X (l), where the corresponding definitions are listed in Table 3. Because the five instances have similar properties, this section presents their combined analyses. Specifically, we take the I − Q subgraph as an example to analyze the X − Y component because the subgraph contains the largest number of edges in these instances.
First, we analyze the node properties of the I − Q component, which contains three node sets: Q I (h), Q I (l), and I Q (l). From Figs. 10(e) and (f ), we can deduce that P (k 1 , k 2 ) ≈ P S (k 2 ) / d tra low when P S (k 2 ) is known.
We omit the repetitive analyses of the P − Q, J − Q, K − Q and K − S components because they are similar to the I − Q component. We note that in the P − Q and K − S components, the degrees of all the P Q (l) and K S (l) nodes are one; thus, the analysis shown in Figs. 10(c)-(f ) can be neglected for the two components. The comparison of Sections IV.B and IV.C reveals many similarities between the Q bip and X − Y components. However, an obvious difference exists between the two components, namely, more distribution characteristics in the X − Y component obey the power-law.

D. J − J SUBGRAPH
Fig. 4 shows that each J J (l) − J J (l) edge (see Table 3) connects two J Q (l) nodes in the J − Q component. Section III confirms that J J (l) = J Q (l) = J and d J −J (v) = 1 for each node v in the J − J subgraph, which means that each J Q (l) node is connected to only one J J (l) − J J (l) edge, i.e., edge number |J J (l) − J J (l)| is strictly equal to half of the number of J nodes, and node number |J| is even.
To study the J J (l) − J J (l) connection feature, we sort all possible pairs of degrees d J −Q (v), where v ∈ J Q (l), using Algorithm 1, whose design is presented in Section IV.B, and create degree pair (d J −Q (v 1 ), d J −Q (v 2 )) for each J J (l) − J J (l) edge that connects nodes v 1 and v 2 . We note that 1 ≤ d J −Q (v) < 30 for each node v ∈ J Q (l). Fig. 11(a) shows that the joint degree distribution P (k 1 , k 2 ) versus the degree-pair rank of the J J (l) − J J (l) connections is constant in the UCLA dataset that spans 15 years and is similar to the power-law.

E. S − S SUBGRAPH
Fig. 4 shows that the S S node set is decomposed into S S (h) = S S ∩ S K (h) and S S (l) = S S ∩ S K (l), where S K (h) and S K (l) are the high- and low-degree S K node sets in the K − S component, respectively. Thus, the S − S subgraph consists of three types of edge sets, namely, S S (h) − S S (h), S S (h) − S S (l), and S S (l) − S S (l).
According to the analysis presented in Section III, S S ⊆ S K = S. Thus, degree pair (d K −S (v 1 ) , d K −S (v 2 )) can be used to characterize each S S (l) − S S (l) edge that connects two S K (l) nodes v 1 and v 2 . We sort all the degree pairs using Algorithm 1, and Fig. 11(b) shows that the joint degree distribution P (k 1 , k 2 ) versus degree-pair rank of the S S (l) − S S (l) connections is constant in the UCLA dataset over the span of 15 years and is similar to the power-law.
Moreover, to study the S S (h) − S S (l) connections, we define P sl (k) as the probability that an S S (h) − S S (l) edge connects a k-degree S S (l) node, where k is the degree of the S S (l) node in the K − S component. Fig. 11(c) shows that the P sl (k) versus k curve obeys a constant power-law distribution in the UCLA dataset that spans 15 years.
Finally, we analyze the connection feature associated with the S S (h) nodes. Fig. 11(d) shows that, at any exploration time t, the relationships expressed in Eq. (9) hold. Eq. (9) indicates that the S S (h) − S S (h) edges, where S S (h) = S S ∩ S K (h), are very sparse compared with node number |S K (h)|. In addition, the difference between the degrees of any two S K (h) nodes in the K − S component is not obvious because S K (h) is far away from the core. Thus, we assume that each S S (h) − S S (h) edge connects two S K (h) nodes selected uniformly at random, and that the probability that an S S (h) − S S (l) edge connects a k-degree S K (h) node is approximately uniform.

F. S − Q COMPONENT
The S − Q component shown in Fig. 4 has two transit AS node sets S Q and Q S , which can be divided into S Q (h), S Q (l), Q S (h), and Q S (l). The division method is listed in Table 3. In addition, the connections in the component are Q S (h) − S Q (h), Q S (h) − S Q (l), Q S (l) − S Q (h), and Q S (l) − S Q (l). Fig. 12(a) shows that the node and edge numbers in the S − Q component increase linearly except for |Q S (h) − S Q (h)| and |Q S (l) − S Q (h)|, which increase quadratically. In addition, Fig. 12(a) shows that, at any exploration time t, the relationships expressed in Eq. (10) hold. Despite the use of a larger low-degree threshold, Figs. 12(b) and (c) show that the degree distributions of the Q S (l) and S Q (l) nodes in the S − Q component remain constant and tend toward the power-law. Owing to the similarity between the S − Q and the X − Y (see Section IV.C) components, such as the high-degree properties shown in Fig. 12(d), the properties of the edges that connect the low-degree nodes, and those of the edges that connect the high- and low-degree nodes, this section only presents the analysis of the difference, i.e., the property of the Q S (h) − S Q (h) edges that connect two high-degree nodes.
In Algorithm 2, we introduce the given n nodes into c ordered categories. Specifically, line 3 ensures that the nodes with higher degrees are introduced in the categories at the front of the list. When n < c, lines 4 and 5 separately introduce one node in each of the first n categories and let the last c − n categories be empty sets. Otherwise, lines 6-10 introduce the n nodes in the c categories and ensure that the cardinality of each category is approximately equal.

Algorithm 2 Node Classification and Sorting
1: Input: High-degree nodes 1, 2, · · · , n and their degrees d 1 , d 2 , · · · , d n , category number c.
2: Output: Node category list L 1 , L 2 , · · · , L c sorted by a certain order.
3: Initialize L j ← ∅ for j = 1, 2, · · · , c, and sort the high-degree nodes by decreasing order of their degrees. Without loss of generality, we assume d 1 ≥ d 2 ≥ · · · ≥ d n .
4: If n < c do
5: Update L j ← {j} for j = 1, 2, · · · , n.
6: Else if n ≥ c do
7: Derive r = n/c, and initialize L 1 ← {1, 2, · · · , round (r)}, where round (r) rounds r to the nearest integer.
8: For j = 2 : 1 : c do
9: Update L j ← {max (L j−1 ) + 1, max (L j−1 ) + 2, · · · , round (j × r)}, where max (X ) returns the largest element in X .
10: End for
11: End if
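The steps of Algorithm 2 can be sketched in Python as follows. This is an illustrative transcription rather than the original implementation: the function names are hypothetical, nodes are identified by arbitrary keys instead of the indices 1, 2, · · · , n, and "nearest integer" is interpreted as round-half-up because Python's built-in `round` rounds halves to the nearest even integer.

```python
import math


def round_half_up(x):
    # "Nearest integer" is read here as half-up (assumption); built-in round() is half-to-even.
    return int(math.floor(x + 0.5))


def classify_nodes(degrees, c):
    """Split nodes into c ordered categories, highest degrees first.

    degrees : dict mapping node -> degree
    c       : number of categories
    Returns a list of c lists of nodes.
    """
    order = sorted(degrees, key=lambda v: -degrees[v])  # decreasing degree, stable
    n = len(order)
    cats = [[] for _ in range(c)]
    if n < c:
        # One node per category for the first n categories; the rest stay empty.
        for j in range(n):
            cats[j] = [order[j]]
        return cats
    r = n / c
    start = 0
    for j in range(1, c + 1):
        end = round_half_up(j * r)  # last (1-based) position placed in category j
        cats[j - 1] = order[start:end]
        start = end
    return cats
```

For ten nodes and c = 4, r = 2.5 and the category sizes are 3, 2, 3, and 2, which shows how the rounding keeps the cardinalities approximately equal.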
Using Algorithm 2, we classify node sets Q S (h) and S Q (h) into the sorted node categories L Q S r 1 and L S Q r 2 , respectively. In addition, we represent the Q S (h) − S Q (h) connections using joint rank distribution P (r 1 , r 2 ), which is defined as follows:
$P(r_1, r_2) = m(r_1, r_2) / m$ (11)
where m denotes the total number of Q S (h) − S Q (h) edges, m (r 1 , r 2 ) is the number of edges that connect a node in L Q S r 1 and another node in L S Q r 2 , and r 1 and r 2 are the ranks of L Q S r 1 and L S Q r 2 , respectively, in the sorted node categories. Fig. 12(e) shows that for any given r 2 , joint rank distribution P (r 1 , r 2 ) tends to be stable and can be characterized using a quintic fitting curve. In addition, Fig. 12(f ) shows that the S Q (h) nodes with higher r 2 tend to be connected to more high-degree Q S (h) nodes with lower r 1 .

G. (DOTTED) NODE MAPPING COMPONENT
Figs. 3 and 4 show that, in addition to the solid-edge sets, our structural model contains dotted-edge sets, which establish injective or bijective mappings between two node sets U and W . Specifically, u ∈ U and w ∈ W are the same node if a dotted edge maps u to w. Thus, we can derive that |U| = e ≤ |W|, where e is the number of dotted edges. Figs. 3 and 4 show eight node-mapping components associated with the dotted edges, which implement the merging of the solid components, as presented in the analysis in Sections IV.A-IV.F. This section presents the use of a joint rank distribution to represent the U − W connections. For each dotted (node mapping) component U − W , we let G (U ) and G (W ) be two solid components that include U and W , respectively, and define d G(U ) (u) and d G(W ) (w) as the degrees of node u ∈ U in G (U ) and node w ∈ W in G (W ), respectively. Then, we can decompose U and W into high-degree and non-high-degree node sets using d G(U ) (u) and d G(W ) (w), respectively, as listed in Table 4.
In Algorithm 3, we introduce all the nodes in set U , which belongs to dotted component U − W , into c + d ordered categories, where c is an input number and d is the degree threshold, as listed in Table 4. Specifically, line 3 decomposes U into high-degree node set U H and non-high-degree node set U N based on Table 4. Line 4 calls Algorithm 2 to divide all the U H nodes into the top c ordered categories. In addition, for any k ∈ {1, 2, · · · , d}, lines 5-10 introduce all nodes u ∈ U N with d G(U ) (u) = k into the (c + d − k + 1)th category, where d G(U ) (u) is the degree of node u in solid component G(U ), as listed in Table 4.

Algorithm 3 Node Classification and Sorting in Dotted Components
1: Input: Node set U , solid component G (U ), node degree d G(U ) (u) for ∀u ∈ U and category number c.
2: Output: Node category list L sorted by a certain order.
3: Decompose U into a high-degree node set U H and a non-high-degree node set U N based on Table 4.
4: Derive the node category list L 1 , L 2 , · · · , L c of U H using Algorithm 2.
5: Let d be the degree threshold shown in Table 4 that is the maximum degree of the non-high-degree nodes. Initialize k ← d and i ← 1.
6: While k ≥ 1 do
7: Derive a node set S = {u | u ∈ U N ∧ d G(U ) (u) = k}, and let the (c + i)th node category L c+i be S.
8: Update k ← k − 1 and i ← i + 1.
9: End while
10: Update L ← {L 1 , L 2 , · · · , L c , L c+1 , · · · , L c+d }.
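A compact sketch of Algorithm 3 is given below for illustration. It assumes that a node is high-degree when its degree exceeds the threshold d (the precise rule is given in Table 4), it reuses an Algorithm 2-style equal-size split with round-half-up rounding, and all function names are hypothetical.

```python
import math


def split_sorted(nodes, c):
    """Algorithm 2-style split of an already degree-sorted list into c categories."""
    n = len(nodes)
    if n < c:
        return [[v] for v in nodes] + [[] for _ in range(c - n)]
    bounds = [math.floor(j * n / c + 0.5) for j in range(c + 1)]
    return [nodes[bounds[j]:bounds[j + 1]] for j in range(c)]


def classify_dotted(degrees, c, d):
    """Return c + d ordered categories for the nodes of U.

    degrees : dict mapping node -> degree in the solid component G(U)
    c       : number of categories for the high-degree nodes
    d       : maximum degree of the non-high-degree nodes
    """
    # Assumption: "high-degree" means degree > d.
    high = sorted((v for v in degrees if degrees[v] > d), key=lambda v: -degrees[v])
    cats = split_sorted(high, c)
    for k in range(d, 0, -1):  # degree d fills category c + 1, degree 1 fills category c + d
        cats.append([v for v in degrees if degrees[v] == k])
    return cats
```

The position of a node's category in the returned list, counted from one, is the rank used by the joint rank distribution.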
Then, we define the joint rank distribution of the U − W edges as P (r 1 , r 2 ) = m (r 1 , r 2 ) / m, where m (r 1 , r 2 ) is the number of dotted edges that map the UL (r 2 ) nodes to the WL (r 1 ) nodes and m is the total number of U − W edges.
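Counting the joint rank distribution from a list of dotted edges is straightforward. The sketch below assumes that every node already carries a 1-based category rank (for example, from a classification such as Algorithm 3); the function name and the data layout are illustrative assumptions.

```python
from collections import Counter


def joint_rank_distribution(edges, rank_u, rank_w):
    """Empirical P(r1, r2) = m(r1, r2) / m for a set of U - W edges.

    edges  : iterable of (u, w) pairs
    rank_u : dict mapping u -> rank r2 of its U category
    rank_w : dict mapping w -> rank r1 of its W category
    Returns a dict mapping (r1, r2) -> probability; empty if there are no edges.
    """
    counts = Counter((rank_w[w], rank_u[u]) for u, w in edges)
    m = sum(counts.values())
    return {pair: cnt / m for pair, cnt in counts.items()}
```

Only rank pairs that actually occur appear in the result; a missing pair corresponds to a probability of zero.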
By considering the {Q P (h), Q P (l)} − {Q U cor (h, m, l), Q B bip (l)} edges as an example, Fig. 13(a) shows that for any given r 2 , joint rank distribution P (r 1 , r 2 ) tends to be stable and can be represented using a quintic fitting curve. Fig. 13(b) shows that the {Q P (h), Q P (l)} nodes with higher r 2 tend to be mapped to the {Q U cor (h, m, l), Q B bip (l)} nodes with higher r 1 . We note that K S (l) in the K Q (l) − K S (l) dotted component cannot be classified using Algorithm 3 because all the degrees of the K S (l) nodes in the K − S solid component are one. To classify K S (l) into the c categories, we sort all the S K (h) and S K (l) nodes as S 1 , S 2 , · · · , S n according to a decreasing order of d K −S (v), which is the degree of node v in K − S, and classify them into c categories. Because each K S (l) node is connected to only one S i (i ∈ {1, 2, · · · , n}) node in K − S, the S i node in the c categories can be replaced by all the K S (l) nodes that are connected to S i , i.e., K S (l) is classified into the c sorted node categories using the classification of the S K (h) and S K (l) nodes.

V. INTERNET-TOPOLOGY GENERATOR SICPS
Our structural model decomposes the Internet topology into many bipartite-graph components, which remain stable in terms of statistical features over a span of 15 years except for the top five highest degrees of transit nodes. However, the unstable factor shows the trend of the Internet, i.e., increasingly more transit nodes tend to be connected to a few highest degree core nodes. This section describes the design of Internet-topology generator SICPS based on our structural model and its evolutionary stability. Specifically, SICPS first generates eight solid components using the stability of the node and the edge properties analyzed in Sections IV.A-IV.F. Then, it realizes the merging of the eight solid components using the dotted-edge properties analyzed in Section IV.G.

A. Q cor COMPONENT GENERATION
According to the analysis presented in Section IV.A, we design Algorithm 4 to generate the Q cor component, which consists of two steps, namely, node and edge generations.
In Algorithm 4, lines 3-6 create nodes and their predefined degrees. We note that the free degree of a node is equal to its predefined degree minus the number of edges that have been connected to the node. Lines 7-9 use the density curve to create Q U cor (h) − Q U cor (h) edges. Lines 10-19 use the preference attachment distribution to create Q U cor (h) − Q U cor (m, l) edges, and lines 20-29 use the uniform distributions to create Q U cor (m, l) − Q U cor (m, l) edges. We note that line 10 defines another sorting method of high-degree nodes because the experimental results show that this method can minimize the number of free degrees while maintaining the P c (k) distribution and avoiding multiple edges. Algorithm 4 avoids multiple edges by checking in advance whether the two nodes are already connected. Because all the nodes and their degrees are predefined, we can apply a 1 × e array L and two 1 × n pointer arrays P 1 and P 2 to describe all the edges in the component, where e is the sum of all the degrees and n is the total number of nodes. We initialize P 1 (i) = P 2 (i) = $\sum_{j=1}^{i-1} d_j + 1$ for arbitrary node i ∈ {1, 2, · · · , n}, where d j is the degree of node j, and update P 2 (i) ← P 2 (i) + 1, L (P 2 (i)) = j, P 2 (j) ← P 2 (j) + 1, and L (P 2 (j)) = i for each newly added edge connecting nodes i and j. Then, set L ([P 1 (i) : 1 : P 2 (i)]) always stores all the nodes that have been connected to arbitrary node i. We note that in line 4 of Algorithm 4, the symbol round (x) rounds x to the nearest integer.

Algorithm 4 Generation of the Q cor Component
1: Input: Exploration time t. Node properties: linear fitting lines of |Q U cor (h)| and |Q U cor (l)|, quadratic fitting curve of |Q U cor (m)|, degree distribution P l (k) of Q U cor (l) nodes where k ∈ {1, 2, · · · , 10}, degree distribution P m (k) of Q U cor (m) nodes where k ∈ {11, 12, · · · , 99}, fitting curves f 1 (t), f 2 (t), · · · , f 5 (t) of the top 5 highest degrees of Q U cor (h) nodes, fitting curves S (i) and I (i) associated with the slope and intercept of the linear relation of the i th highest degree of Q U cor (h) nodes, respectively. Edge properties: density curve D (i) (i = 1, 2, · · · , |Q U cor (h)|), preference attachment distribution P c (k) defined as the probability that a Q U cor (h) − Q U cor (m, l) edge connects a k-degree Q U cor (m, l) node.
2: Output: The Q cor component.
3: Derive the high-degree node number h = |Q U cor (h)|, the middle-degree node number m = |Q U cor (m)| and the low-degree node number l = |Q U cor (l)| at the exploration time t.
4: Generate h high-degree nodes 1, 2, · · · , h and assign the degree d i to the node i (i = 1, 2, · · · , h), where [...]
5: Generate m middle-degree nodes h + 1, h + 2, · · · , h + m and assign the degree d h+i = k ∈ {11, 12, · · · , 99} to the node h + i (i = 1, 2, · · · , m) using the distribution P m (k). Assume d h+i ≥ d h+i+1 for 1 ≤ i ≤ m − 1.
6: Generate l low-degree nodes h + m + 1, h + m + 2, · · · , h + m + l and assign the degree d h+m+i = k ∈ {1, 2, · · · , 10} to the node h + m + i (i = 1, 2, · · · , l) using the degree distribution P l (k). Assume d h+m+i ≥ d h+m+i+1 where i = 1, 2, · · · , l − 1.
7: For i = 2 : 1 : h do
8: [...] Uniformly at random, extract x nodes from {1, 2, · · · , i − 1} that still have at least one free degree, and connect node i to the x nodes.
9: End for
10: Define [...]
[...] Uniformly at random select a k ∈ {11, 12, · · · , 99}. Extract the k-degree node subset Y in {h + 1, h + 2, · · · , h + m} whose nodes still have at least one free degree, connect node h + m + i to a randomly selected node in Y that has not been connected to node h + m + i and update x ← x − 1.
24: End while
25: End for
26: Let x be the total number of free degrees of all the nodes h + 1, h + 2, · · · , h + m, and initialize t c ← 0.
27: While x ≥ 2 ∧ t c < 500 do
28: Uniformly at random select k 1 , k 2 ∈ {11, 12, · · · , 99}. [...] If the two nodes can be extracted, connect the two nodes and update x ← x − 2, otherwise update t c ← t c + 1.
29: End while
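The array-and-pointer bookkeeping described for Algorithm 4 can be illustrated with a small class. This is a sketch under stated assumptions, not the original code: it uses 0-based indices and a half-open range per node instead of the 1-based convention in the text, and the class and method names are hypothetical.

```python
class EdgeStore:
    """Flat neighbour array with one fixed-size slot range per node."""

    def __init__(self, degrees):
        self.degrees = list(degrees)      # predefined degree of each node
        self.start = []                   # plays the role of P1: first slot of node i
        total = 0
        for d in self.degrees:
            self.start.append(total)
            total += d
        self.end = list(self.start)       # plays the role of P2: one past the last used slot
        self.slots = [None] * total       # plays the role of L, length e = sum of degrees

    def free_degree(self, i):
        """Predefined degree minus the number of edges already attached to node i."""
        return self.degrees[i] - (self.end[i] - self.start[i])

    def neighbours(self, i):
        return self.slots[self.start[i]:self.end[i]]

    def add_edge(self, i, j):
        """Add edge i-j; refuse loops, duplicate edges, and nodes without free degrees."""
        if i == j or j in self.neighbours(i):
            return False
        if self.free_degree(i) == 0 or self.free_degree(j) == 0:
            return False
        self.slots[self.end[i]] = j
        self.end[i] += 1
        self.slots[self.end[j]] = i
        self.end[j] += 1
        return True
```

Because every node owns a fixed range of the array, the duplicate-edge check only scans the slots already used by one endpoint, and no memory is reallocated while edges are added.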
In Algorithm 5, lines 3-6 create nodes and their predefined degrees. Lines 7-16 use joint degree distribution P (k 1 , k 2 ), which is derived from P S (k 2 ), to create Y X (l) − X Y (l) edges, and lines 17-26 use preference attachment distribution P c (k) to create Y X (h) − X Y (l) edges. We note that lines 24-26 are used to fill the remaining free degrees of the Y X (h) nodes.
When Algorithm 5 is applied to create the Q bip component, we should map Q U bip (h), Q U bip (l), and Q B bip (l) to Y X (h), Y X (l), and X Y (l), respectively. After the generation of the aforementioned six components, namely, Q bip , P − Q, I − Q, J − Q, K − Q, and K − S, the Q B red (l) − Q B red (l), J J (l) − J J (l), and S S (l) − S S (l) connections remain to be added. Figs. 9(d), 11(a), and 11(b) show that the joint degree distributions of these connections remain stable, and Eqs. (8) and (9) indicate that the number of these connections can be calculated using the inputs of Algorithm 5, namely, the number of nodes. Thus, using the constant fitting curves of these distributions and the edge numbers, the three types of connections can be easily added to the corresponding components. Moreover, Fig. 11(c) shows that distribution P sl (k), which is defined as the probability that an S S (h) − S S (l) edge connects a k-degree S S (l) node, also remains stable. According to the analysis presented in Section IV.E, the S S (h) − S S (h) edges uniformly connect two S K (h) nodes at random, and the probability that an S S (h) − S S (l) edge connects a k-degree S K (h) node is approximately uniform. Thus, using the above-mentioned stable distribution and edge numbers, the S S (h) − S S (h) and S S (h) − S S (l) connections can also be easily added to the K − S component.

C. S − Q COMPONENT GENERATION
The S − Q component is an extension of the X − Y component, and the main difference between them is that the former has an additional type of connection, namely, the edges with two high-degree ends. Hence, the methods in Algorithm 5 for predefining the degrees, connecting the high- and low-degree nodes, and connecting the low-degree nodes can be reused to generate the S − Q component, as indicated by lines 3-5 and 15-16 of Algorithm 6. We note that lines 6-14 of Algorithm 6 use the joint rank distribution to create the Q S (h) − S Q (h) edges.

D. DOTTED COMPONENT GENERATION
Sections V.A-V.C present the creation of eight solid components of the Internet topology. To obtain a complete graph, we need to merge these components using the eight dotted connections shown in Figs. 3 and 4, which can be modeled by U − W that maps each node u ∈ U to only one node w ∈ W . The instances of U and W and solid components G (U ) , G (W ), including U and W , are listed in Table 4. Once the mapping from u ∈ U to w ∈ W is completed, G (U ) and G (W ) can be combined into one graph by merging u and w to a single node. The generation method of the U − W node mapping connections is demonstrated in Algorithm 7.
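The node-mapping step can be sketched as follows: for each node u, a W rank is drawn from the joint rank distribution conditioned on the rank of u, and an unused W node of that rank is selected uniformly at random. This is a hedged illustration, not a transcription of Algorithm 7; the function name, the data layout, and the retry cap `max_tries` (used so that the loop always terminates) are assumptions of the example.

```python
import random


def map_nodes(u_nodes, rank_u, w_by_rank, p_joint, rng=None, max_tries=1000):
    """Map each u to a distinct w using the joint rank distribution.

    u_nodes   : list of U nodes in processing order
    rank_u    : dict mapping u -> rank r2
    w_by_rank : dict mapping rank r1 -> list of W nodes in that category
    p_joint   : dict mapping (r1, r2) -> P(r1, r2)
    Returns a dict u -> w; nodes that cannot be placed are left out.
    """
    rng = rng or random.Random()
    free = {r1: list(ws) for r1, ws in w_by_rank.items()}  # W nodes not yet used
    mapping = {}
    for u in u_nodes:
        r2 = rank_u[u]
        ranks = [r1 for r1 in free if p_joint.get((r1, r2), 0.0) > 0.0]
        if not ranks:
            continue
        weights = [p_joint[(r1, r2)] for r1 in ranks]
        for _ in range(max_tries):
            # choices() normalises the weights, i.e. it samples from P(r1 | r2).
            r1 = rng.choices(ranks, weights=weights)[0]
            if free[r1]:
                mapping[u] = free[r1].pop(rng.randrange(len(free[r1])))
                break
    return mapping
```

Each mapped pair (u, w) is then merged into a single node when the two solid components G (U ) and G (W ) are combined.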

Algorithm 6 Generation of the S − Q Component
1: Input: Exploration time t. Node properties: linear fitting lines of |Q S (h)|, |Q S (l)|, |S Q (h)| and |S Q (l)|, degree distributions P Q (k) of Q S (l) nodes and P S (k) of S Q (l) nodes where k ∈ {1, 2, · · · , 30}, fitting curves associated with the slope and intercept of the linear relations of the i th highest degree of Q S (h) and S Q (h) nodes. Edge properties: three preference attachment distributions, P Q,c (k) defined as the probability that a Q S (h) − S Q (l) edge connects a k-degree S Q (l) node, P S,c (k) defined as the probability that an S Q (h) − Q S (l) edge connects a k-degree Q S (l) node, P S (k 2 ) defined as the probability that a Q S (l) − S Q (l) edge connects a k 2 -degree S Q (l) node. Joint rank distribution P (r 1 , r 2 ) defined in Eq. (11), which is the probability that a Q S (h) − S Q (h) edge connects r 1 -rank and r 2 -rank node categories.
[...] Derive z = round (e h × P (r 1 , r 2 )), and initialize t c ← 0.
10: While z ≥ 1 ∧ t c = 0 do
11: Extract a node v 1 in the category L Q (r 1 ) and a node v 2 in the category L S (r 2 ) where v 1 and v 2 are not connected and have maximum min (fd (v 1 ) , fd (v 2 )). If the two nodes can be extracted, connect the two nodes and update z ← z − 1, otherwise update t c ← 1. Note that fd (v) is defined as the number of free degrees of node v.
12: End while
13: End for
14: End for
15: According to P Q,c (k), connect Q S (h) − S Q (l) edges using the methods of lines 17-26 in Algorithm 5.
16: According to P S,c (k), connect S Q (h) − Q S (l) edges using the methods of lines 17-26 in Algorithm 5.

Algorithm 7 Generation of the U − W Node Mapping Connections
1: Input: Two node sets U and W where |U| ≤ |W| and U − W is one of the eight dotted connections in Figs. 3 and 4; two solid components G (U ) and G (W ) that include U and W respectively; category number c; degree threshold d in Table 2; the joint rank distribution P (r 1 , r 2 ), which is the probability that a U − W edge connects two nodes respectively belonging to r 1 -rank W category and r 2 -rank U category.
2: Output: The U − W node mapping connections.
3: Respectively classify U and W into two sorted category lists UL (1) , UL (2) , · · · , UL (c + d) and WL (1) , WL (2) , · · · , WL (c + d) using Algorithm 3, sort all the nodes in U as u 1 , u 2 , · · · , u m and initialize z (w) ← 0 for each node w ∈ W .
4: For i = 1 : 1 : m do
5: Determine rank r 2 where u i ∈ UL (r 2 ), and initialize b ← 0.
6: While b = 0 do
7: For the given rank r 2 , randomly select a rank r 1 ∈ {1, 2, · · · , c + d} with the rank distribution P (r 1 ) = P (r 1 , r 2 ) / $\sum_{r_1} P(r_1, r_2)$.
[...] Uniformly at random, select a node w ∈ WS, establish a dotted connection between u i and w, namely u i and w are viewed as the same node in the AS-level Internet topology, and update b ← 1 and z (w) ← 1.
10: End if
11: End while
12: End for

VI. EXPERIMENTAL RESULTS
This section compares the realistic AS graphs provided by the UCLA dataset [1] with the graphs produced by our SICPS generator and by four other generators, namely, Inet-3.0 [10], ORBIS [14], S-BITE [21], and SInetL [29], which are introduced in Section II.B. Inet-3.0 is a classical Internet-topology generator that considers both the hierarchical structure and degree power-law properties. ORBIS aims to reproduce the 2K degree distribution, S-BITE captures the topological core structure, and SInetL extracts a subgraph from a given Internet topology while maintaining the normalized Laplacian spectral properties. We note that the comparison uses the graph properties discussed in Section II.A. First, a comparison of the largest AS graph of the UCLA dataset, which was investigated in January 2015, is shown in Fig. 14.
Because the output of SInetL is a series of subgraphs of the largest AS graph, the simulated SInetL graph related to January 2015 is the largest AS graph. Thus, the results of SInetL are not shown in Fig. 14. Fig. 14(a) shows that SICPS and all other generators perform well in terms of the degree power-law property because SICPS adopts the local degree distributions of the different structural components as inputs and the other three generators use three types of global degree distributions as inputs. From the comparison of Figs. 14(b)-(d), we find that SICPS performs best on all three properties because it partitions the AS graph into atomic-level components, separately generates diverse solid components, and merges them using different dotted connections. The structural decomposition feature enables SICPS to capture not only the global degree properties but also the correlation among the different local components. The average neighbor connectivity represents the correlation of the node degrees, which is not adopted by Inet-3.0 and S-BITE. Because ORBIS uses the 2K degree distribution, which is actually the joint degree distribution, it performs well, as shown in Fig. 14(b) and (d). However, Mahadevan et al. [14] indicated that ORBIS does not consider the clustering property. Fig. 15 shows our comparison of the five generators, namely, Inet-3.0, ORBIS, S-BITE, SInetL, and SICPS, using a series of AS graphs in the UCLA dataset investigated from January 2001 to January 2015.
[Fig. 14: (a) F k vs. degree k, where F k = $\sum_{d>k} P_d$ and P d is the probability that a randomly selected node is d-degree. (b) Average neighbour connectivity K k vs. degree k, where K k is simply the average neighbour degree of the average k-degree node [32]. (c) Clustering coefficient C k vs. degree k, where C k = 2m k / (k (k − 1)) and m k is the average number of links between the neighbours of k-degree nodes. (d) Percentage of total nodes vs. node categories P, Q, I, J, K and S defined in Eq. (3) that are the basic node classification of our structure model.]
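Two of the per-degree quantities named in the Fig. 14 caption, F k and C k , can be computed directly from an adjacency map. The sketch below follows the caption's definitions; the function names and the dictionary-of-sets graph representation are assumptions made for illustration, and node identifiers must be mutually comparable.

```python
from collections import Counter, defaultdict


def ccdf(adj):
    """F_k = sum of P_d over d > k, for every degree k that occurs in the graph.

    adj : dict mapping node -> set of neighbours
    """
    n = len(adj)
    counts = Counter(len(nb) for nb in adj.values())
    f, tail = {}, 0
    for k in sorted(counts, reverse=True):
        f[k] = tail / n        # fraction of nodes with degree strictly greater than k
        tail += counts[k]
    return f


def clustering_by_degree(adj):
    """C_k = 2 * m_k / (k * (k - 1)) for each occurring degree k >= 2."""
    links = defaultdict(list)
    for v, nb in adj.items():
        k = len(nb)
        if k >= 2:
            # Number of links among the neighbours of v, each unordered pair counted once.
            m_v = sum(1 for a in nb for b in nb if a < b and b in adj[a])
            links[k].append(m_v)
    return {k: 2 * (sum(ms) / len(ms)) / (k * (k - 1)) for k, ms in links.items()}
```

For a triangle with one pendant node attached, the two degree-2 nodes have C 2 = 1, and the degree-3 node has C 3 = 1/3.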
Figs. 15(a) and (b) show that SICPS performs well on both the distance and clustering properties because it can accurately model the local structures and their correlations using the solid and dotted components, respectively. S-BITE accurately captures the core structure of the Internet topology. However, it does not consider the peripheral structure, which accounts for more than 95% of the nodes and is more important for the performance in terms of the statistical characteristics. S-BITE performs well in terms of the clustering coefficient because it uses statistical parameter p, which is the probability that a newly added peripheral node connects to two interconnected core nodes, to control the property [21]. ORBIS performs well in terms of most of the properties but neglects the clustering property [14]. Inet-3.0 captures the degree power-law properties, but it does not consider the degree correlation [10]. We note that the degree and rank correlations are critical tools for modeling the solid and dotted components of our structural model. Hence, SICPS also performs best in terms of the assortativity and maximum degree properties, as shown in Figs. 15(c) and (d). SInetL extracts a series of subgraphs from the given unique AS graph that was explored in January 2015 while maintaining some properties of the AS graph. In other words, it neglects the evolution of the UCLA dataset from 2001 to 2015, as shown in Figs. 15(a) and (e). Fig. 15(e) shows that the average degrees of the graphs simulated by SICPS are less than those of the real-world AS graphs because some free degrees presented in Section V have not been filled and the L and noise nodes defined in Eq. (3) are removed to simplify the problem. However, these phenomena also exist in S-BITE, ORBIS, and Inet-3.0. Finally, we compare the runtimes of the five generators, as shown in Fig. 15(f ).
SICPS, ORBIS, and Inet-3.0 all generate graphs by filling the free degrees; thus, the time complexity of the three generators is O(|E|), where |E| is the total number of edges. In [29], the time complexity of SInetL was proven to be O(|E|^2). For each newly added peripheral node, S-BITE first randomly selects a node that already exists in the network and then connects the peripheral node to the selected node. Because the peripheral nodes account for more than 95% of the Internet topology, the generators that are based on the free-degree filling realize better time efficiency.
To quantitatively compare the five generators using the results shown in Fig. 15, for each graph property x in explored graph G e and corresponding simulated graph G s , we define a deviation degree in terms of x (G s ) and x (G e ), which denote the property values of x in G s and G e , respectively. We then define average deviation degree D (x, y) of the UCLA dataset for a series of graphs simulated by certain generator y as the mean of this deviation degree over all G e ∈ UCLA, where UCLA is the set of explored AS graphs included in the UCLA dataset, |UCLA| is the cardinality of the set, and y is the generator that simulates graphs G s for the explored AS graphs G e ∈ UCLA.
In Table 5, for graph properties x ∈ {R, C, r, k max , $\bar{k}$} and generators y ∈ {Inet-3.0, ORBIS, S-BITE, SInetL, SICPS}, we list average deviation degrees D (x, y) and the average runtime of the five generators. Table 5 illustrates that SICPS performs best on properties x ∈ {R, r, k max }. Further, we observe that D ($\bar{k}$, SICPS) is close to D ($\bar{k}$, Inet-3.0) and D ($\bar{k}$, S-BITE). Therefore, from the perspective of multiple factors, SICPS demonstrates a greater advantage.
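The averaging behind D (x, y) can be illustrated with a short sketch. The per-graph deviation is taken here to be the relative absolute difference |x (G s ) − x (G e )| / |x (G e )|, which is an assumption of this example (the exact expression may differ), and the function names are hypothetical.

```python
def deviation(sim_value, real_value):
    # Assumed form: relative absolute deviation between simulated and explored values.
    return abs(sim_value - real_value) / abs(real_value)


def average_deviation(pairs):
    """Mean deviation over a series of graphs.

    pairs : list of (x(G_s), x(G_e)) tuples, one per explored AS graph
    """
    return sum(deviation(s, e) for s, e in pairs) / len(pairs)
```

Under this assumed form, a smaller value means that a generator tracks the explored graphs more closely for that property across the whole observation period.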

VII. CONCLUSION AND FUTURE WORK
The Internet is a complex network system. According to different requirements, researchers can model the Internet topologies from various perspectives such as macro, micro, wired, wireless, and information centric networking. From different perspectives, the physical meaning of nodes and edges in the topology may be quite different. In the present study, we focus on the AS-level Internet topology in which the nodes represent ASs and the edges describe the data-communication paths among these nodes. Research on the AS-level topology can satisfy many application requirements of interdomain systems, such as routing optimization.
Although many AS-level Internet-topology models have been studied, the models based on deep structural decomposition still need to be seriously investigated because the exponential growth of the topological scale has become a critical obstacle in the analysis of the current Internet behavior [38,39]. Structural decomposition is a useful means to reduce the complexity of the problems.
In the present study, we introduce the periphery that accounts for more than 95% of AS nodes into the structural model and decompose the Internet topology into 16 atomic-level solid and dotted components from the viewpoint of local connection and evolutionary stability. In contrast to the global statistical characteristics, our structural decomposition model is helpful for researchers to more precisely distinguish the Internet interdomain topology from other complex networks. In addition, our structural model provides many adjustable local characteristic parameters, which can help engineers generate topological environments in diverse scenes.
According to the UCLA dataset that spans 15 years, we obtain many uniform distribution characteristics that exist in the decomposed components of the Internet topology. In contrast to the power-law distribution, the uniform distribution is simpler. The discovery of these simpler distribution characteristics is more conducive to the recognition of the evolution stability of the Internet topology.
In addition, we find that most of the node and edge properties of these components remain constant except for the top five highest degrees of transit AS nodes. The inconstant property implies that the top five transit AS nodes are attracting more nodes (especially other transit AS nodes) to connect to them. The evolution of the Internet topology cannot be stable at all times. In other words, capturing the inconstant property is also important for the recognition and prediction of the topology. Furthermore, we design topology generator SICPS based on our structural model. The comparison results show that SICPS performs best on both global statistical and local structural properties. Although our generator needs more detailed statistical parameters from the 16 components as inputs, we can view the inputs as an accurate portrait of the Internet topology, which is an important reflection of the accuracy of the structural models.
In future work, we will analyze the relationship between the Internet behavior and the 16 decomposed components and apply our structural model and topological generator to network management and other engineering fields.