Topology of International Supply Chain Networks: A Case Study Using Factset Revere Datasets

International supply chain networks play a prominent role in shaping the economic outlook of the world. A recent trend has been to analyse the topology of supply chain networks in order to gain a holistic understanding of the interdependencies between firms. In this work, we undertake an extensive structural and topological analysis of the supply chain networks constructed from the Factset Revere dataset, which is provided by FactSet Research Systems Inc. and captures global supply chain relationships between companies. The dataset consists of 154,862 companies from 216 countries, with 1,571,949 supply relationships among them. In addition to the global network, we also analysed the country-specific networks of ten countries, which are the most significant nations represented in the dataset. The analysis revealed that all supply chain networks studied were relatively sparse scale-free networks, with scale-free exponents ranging from 1.0 to 2.0. In terms of centrality analysis, quite predictably, large multi-national corporates dominated. Comparing the centrality values of firms in the global and the country-specific networks, two classes of firms were found for which the difference in centrality was significant. The first group consisted of small firms with locally-centered business operations, such as Volunteers of America, New York State Teachers Retirement System, and CarePlus Health Plan, whose country-based centrality scores and the rankings based on them were significantly more prominent than the global equivalents. The second group consisted of firms with specific countries of origin which register themselves in other countries, such as China Shengda Packaging Group Inc (registered in the US), Chinacast Education Corps (registered in the US), and China Biologic Products Inc (registered in the US). These firms all had significantly higher global centrality scores compared to country-based centrality scores. 
Overall, however, there was a strong correlation between the global centrality-based ranking and the country-specific centrality ranking of firms. This indicates that, in general, firms which are important to the global supply chain network are also important to the supply chain networks of individual countries. Studying the community structure of the supply chain networks, we identified twelve dominant communities, many of which had significant correlations with particular industries or countries. Some of these communities were made up of firms primarily from a pair of countries, or had other interesting features. The topological analysis of the supply chain networks created from this large dataset therefore gives interesting insights into how international supply chain networks are structured, and how they operate.


I. INTRODUCTION
International supply chains are becoming increasingly interconnected, forming Supply Chain Networks (SCNs), which display features of complex networks [1]-[10]. Therefore, it becomes necessary to study the topological features of such networks, in order to understand their interdependency, evolution, robustness and resilience. A number of recent studies have looked at various supply chain networks, analysing their topological structure and growth patterns [2]-[5], [7]-[10].
The associate editor coordinating the review of this manuscript and approving it for publication was Bohui Wang.
In this paper, we attempt such a task for a specific Supply Chain Network (and a number of country-specific subnetworks) whose topological properties have not been studied extensively before.
The SCNs considered here are generated from the Factset Revere dataset, which is a global dataset containing information about mainly publicly listed firms. We analyse the global supply chain network and ten country-specific networks generated from it, considering basic topological metrics such as centrality measures, clustering, path length, and measures of scale-freeness. We also consider the relationship between country-specific and global rankings of firms based on these metrics, identifying firms which have a global importance not reflected in the country-specific subnetwork, and vice versa. Further, we undertake a community structure analysis of the global network and country-specific subnetworks, highlighting the interplay between specific industries, trade clusters, and international financial and political relationships. Our results show that the sparse scale-free networks created by the supply chains, at international as well as domestic levels, display strong correlations between global and domestic centrality rankings; but there are groups of firms which do not adhere to this pattern, and display a higher prominence in the domestic network compared to the global network, or vice versa. Our analysis further highlights some interesting features in terms of community formation between firms belonging to different economic powerhouses, such as China, the US, and Japan. The analysis provides useful insights into the interconnectedness of international supply chain networks, how they operate, and how they evolve.
This paper makes three important contributions to the modelling and analysis of supply chain networks. Firstly, it provides insight into the topology of supply chain networks, and establishes that they are typically scale-free networks with scale-free exponents which are smaller than those found in most other real world scale-free networks. Secondly, it undertakes a comparative centrality analysis which compares the local and global importance of firms, establishes that there is a strong correlation between these in most cases, and highlights the nature of the companies which violate this general rule: that is, the paper sheds light on the features needed for companies to be more locally central, or more globally central, in supply chains. Thirdly, it undertakes community analysis to highlight the interplay between country-based community formation and industry-based community formation among firms. Overall, the contribution of the paper is to shed light on the topological features of typical supply and inter-firm networks, and to demonstrate how such structural features are necessary for these networks to perform their intended functions.
The rest of the paper is organised as follows: section II provides a general theoretical understanding of, and the state-of-the-art for, the topological analysis of supply chain networks. In section III, a description of the dataset is provided (subsection III-A), followed by a description of the network creation (subsection III-B), and a list of definitions of the topological metrics and measures used in this work (subsection III-C). Section IV describes the analysis that was conducted and the results obtained. Section V offers a broad discussion, including the advantages and shortcomings of the presented approach, and the novelty and importance of the presented results. Finally, section VI offers a summary of conclusions, and a list of potential future research directions.

II. BACKGROUND
Two important streams of theoretical views can be found in the literature on systems of industrial production. The first stream is based on conventional market theory and posits that markets consist of arm's-length transactions between economic actors, and that it is efficient for these actors to distribute their transactions across independent partners. This position is also consistent with resource-dependency theory [11], which recommends maintaining unconstrained access to a large number of competing and substitutable partners.
As a reaction to such individualistic views of interorganizational relationships, economic sociologists have proposed a social network view of systems of industrial production. Social network literature focuses on relational considerations between organizations, which were neglected in the original approaches or considered only in their dyadic form in transaction cost economics. These network conceptualizations of systems of industrial production have become increasingly popular as it has become more accepted that organizational behaviour and performance are not well explained by atomistic individualized approaches.
No business link exists in isolation. Two firms may be more likely to establish a partnership if they already have some trusted partners in common. White [12] proposed that in competitive production markets, firms gravitate toward dense cliques of producers watching each other. According to Uzzi [13], it is beneficial for firms to embed their transactions in a dense network of partnerships with other organizations. Close-knit groups can allow for the emergence of trust, free information flows, resource pooling, and collective problem-solving. Uzzi [13] also predicted that within the same context, firms will imitate successful networking strategies and therefore converge toward similar arrangements. He et al. [14] showed that dense networks of businesses in China help diffuse shocks and decrease firms' risk of default.
The emergence of clustering in business networks has not been thoroughly explored in the international context, in which different mechanisms and constraints may be present compared to those applying to domestic partnerships. This can be linked to a lack of suitable data and the traditional predominance of firm-centric approaches in studies of international business. However, no firm is in complete control of its international networks, let alone the networks of its partners. International business relationships emerge from the interactions of all interdependent actors within their business environment. Chandra and Wilkinson [15] argued that, because of complex feedback loops, international interfirm partnerships are impossible to predict separately, but aggregate structural patterns of internationalization can be explained.
It is against this backdrop that we analyse the topology of international supply chain networks using the Factset Revere dataset, which is more fully described in the next section. Building on network-based theories of international relationships, in this study we pay particular attention to the complex interactions between firms in the world's three largest economies: (1) the United States, (2) China, and (3) Japan, as these have potentially large effects on the rest of the global economy. It should be noted here that a recent paper by Piraveenan et al. [10] analysed one particular aspect, namely assortativity, of the topology of supply chain networks generated from the Factset Revere dataset. They defined and employed a range of customised assortativity measures in their analysis, and showed that these networks show assortativity in terms of country and level of internationalisation: that is, firms have a slight preference to make supply chain relationships with other firms from the same country, as well as with firms which have undergone a similar level of internationalisation. The current study extends the work of Piraveenan et al. [10] by analysing the supply chain networks generated from Factset Revere data using multiple topological metrics and concepts. The broad analysis is meant to shed light on clearly identifiable patterns in the topologies of the global as well as country-specific supply chain networks, and to act as a catalyst for generating and analysing in-depth questions related to specific topological features, such as the one about assortativity already explored in Piraveenan et al. [10].
It should be noted that a number of recent studies have done work related to this paper; nevertheless, this study offers a unique perspective, as described below. For example, Perera et al. [7] undertook a detailed topological analysis of supply chain networks, but focussed on the manufacturing industry only. They focussed on comparing the differences in the topologies of undirected contractual relationship (UCR) and directed material flow (DMF) supply chain networks, and used datasets collected by Willems [16] and Parhi [17]. While their analysis covered a range of basic topological metrics, they did not focus on global vs local centrality or community structure as this study does. Similarly, Pathak et al. [18] also presented a basic topological analysis; however, their focus was on how these topological metrics evolve over time. Hearnshaw et al. [19] also focussed on topological analysis, but their work was more concerned with providing a groundwork for, and justifying the use of, complex network science to analyse supply chain networks. Nayar and Vidal [20] focussed specifically on the robustness of supply chain networks against 'disruptions', and used a multi-agent system to model the supply networks. Thadakamalla et al. [1] also focussed on the 'survivability' (robustness) of supply chain networks modelled through multi-agent systems. Hou et al. [21] also analysed the robustness and resilience of supply chain network topologies using multi-agent system modelling, but they specifically looked at how trust between firms affects the topological evolution of such networks. The focus of Mari et al. [4] was similar, in that they analysed how classical complex network topologies could be adapted to design resilient (robust) supply chain networks. Similarly, Zhao et al. [22] also focussed on the robustness aspect of supply chain networks, analysing their resilience against random and targeted disturbances. Pero et al. [23] focussed on the relationship between structure and function in the supply chain network context. Compared to such existing works, which primarily focus on robustness and resilience and/or the relationship between form and function, the current work is unique in that it focusses, after a succinct topological analysis, mainly on the questions of international importance vs local importance, and on understanding the features of, and driving forces behind, community and cluster formation. In short, the current work is unique because it is presented against the backdrop of increasing internationalisation, and attempts to shed light on those topological aspects of supply chain networks which can only be interpreted by comparing international, country-based, and industry-based viewpoints.

A. DATASET
This study uses 2016 FactSet Revere data [24] that includes 154,862 publicly listed firms and public institutions, including those in the United States (38,708), China (14,058), Japan (7,411), the United Kingdom (6,814), Canada (4,287), India (3,748), Australia (3,155), France (3,005), Singapore (1,736), and Russia (1,202). FactSet collects interfirm relationship data from primary public sources such as investor reports and SEC 10-K annual filings, investor presentations, and press releases. Both relationships disclosed by the company and reverse relationships which are reported by their partners are captured in the data. FactSet analysts continuously monitor and review the quality of the data.
The FactSet dataset includes 129 categories classifying each firm's main industry type. To investigate the general trends in production network structures across major industrial categories, following Piraveenan et al. [10], we manually matched the original FactSet categories with the high level categories of the Standard Industrial Classification system, which is commonly used in the US and the UK, especially by government agencies [25]. The system has ten broad industrial "divisions", which in turn are composed of 202 "industry groups". The ten divisions are: (1) agriculture, forestry and fishing, (2) mining, (3) construction, (4) manufacturing, (5) transportation, communications, electric, gas and sanitary services, (6) wholesale trade, (7) retail trade, (8) finance, insurance and real estate, (9) services, and (10) public administration. This broad industrial classification of the Factset Revere dataset is shown in Fig. 1.

B. NETWORK CREATION
As mentioned, the Factset Revere dataset that we considered consists of 154,862 companies from 216 countries, with 1,571,949 supply relationships among them. We constructed a global network, as well as ten country-specific networks from this dataset, representing the US, Russia, China, India, Australia, Japan, Singapore, France, Canada and the UK. We also constructed an EU network, which considered all European Union countries at the time of purchase of this dataset, including the UK. Each country-specific network includes only firms from that particular country and the supply relationships among them. Thus, the sum of all links from the country-specific networks would be considerably smaller than the total number of links in the global network, even if we considered the country-specific networks of all countries represented in the dataset. In our case we only consider ten countries; the total number of firms in these ten networks is 62,456, while the total number of links is 701,277. Thus, roughly 40% of the nodes and 45% of the links in the global network are represented in the ten country-specific networks. Some properties of the ten country-specific networks we constructed are shown in Fig. 2.
Of course, the country specific networks had a lot of singleton nodes, but most of these are not singletons in the global supply chain network.
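The construction described above can be sketched in a few lines. This is a minimal illustration only: the edge-list layout, the `country_of` mapping and the firm labels are hypothetical and do not reflect the actual FactSet schema.

```python
def country_subnetwork(edges, country_of, country):
    """Keep only links whose two endpoint firms both belong to `country`.

    edges      : list of (firm_u, firm_v) supply relationships (may repeat)
    country_of : dict mapping firm -> country code (hypothetical layout)
    """
    nodes = {firm for firm, c in country_of.items() if c == country}
    sub_edges = [(u, v) for u, v in edges if u in nodes and v in nodes]
    # Firms of this country with no domestic link are singletons here,
    # even if they have links to foreign firms in the global network.
    connected = {firm for edge in sub_edges for firm in edge}
    singletons = nodes - connected
    return nodes, sub_edges, singletons


# Toy data with made-up firm labels (not taken from the dataset).
country_of = {"A": "US", "B": "US", "C": "US", "D": "JP", "E": "JP"}
edges = [("A", "B"), ("A", "B"), ("A", "D"), ("C", "D"), ("D", "E")]

us_nodes, us_edges, us_singletons = country_subnetwork(edges, country_of, "US")
print(us_edges)       # the two parallel A-B links survive
print(us_singletons)  # C only trades with a foreign firm
```

In this toy case firm C is a singleton in the country network but not in the global network, which is the situation noted in the preceding paragraph.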

C. NETWORK TOPOLOGY METRICS

1) SCALE-FREENESS MEASURES
Scale-free networks are ubiquitous in the real world [26]. In a scale-free network, the degree distribution follows a power law, and the probability of a node having a degree of k is given by [27]-[30]:
\[ p_k = A k^{-\gamma} \]
where A is a constant and γ is the power law exponent (also referred to as the scale-free exponent). A higher value of γ results in a degree distribution with a steeper slope, while a lower value of γ results in a flatter degree distribution.
To quantitatively measure the 'scale-freeness' of a particular network, the R^2-correlation of the degree distribution to a power law can be used. To compute this, the degree distribution of the given network should be plotted in log-log scale, a straight line of the form log(p_k) = −γ log(k) + log(A) should be fitted to this distribution, and the R^2-correlation (also called the coefficient of determination) of the fit should be computed. The R^2-correlation is computed as:
\[ R^2 = 1 - \frac{\sum_i (y_i - f_i)^2}{\sum_i (y_i - \bar{y})^2} \]
where y_i are the y-values of the data points, ȳ is the mean y-value of the data points, and f_i are the values returned by the fitted function for data points i [31]. This quantity is briefly called 'scale-free correlation' elsewhere in the text, to mean that it is the R^2-correlation measuring the scale-freeness of a given network.
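As an illustration of the fitting procedure just described, the following sketch fits a straight line to the log-log degree distribution by ordinary least squares and returns the exponent, the constant A and the R^2 value. The function name and the toy degree sequence are assumptions made for this example; the sequence is constructed so that p_k is exactly proportional to 1/k.

```python
import math
from collections import Counter


def scale_free_fit(degrees):
    """Least-squares fit of log(p_k) = -gamma*log(k) + log(A).

    Returns (gamma, A, r_squared). Nodes of degree zero are ignored,
    since log(0) is undefined.
    """
    counts = Counter(d for d in degrees if d > 0)
    n = sum(counts.values())
    ks = sorted(counts)
    xs = [math.log(k) for k in ks]
    ys = [math.log(counts[k] / n) for k in ks]

    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean

    # R^2 = 1 - SS_res / SS_tot, as in the definition above.
    fitted = [slope * x + intercept for x in xs]
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot
    return -slope, math.exp(intercept), r_squared


# Toy degree sequence in which the number of nodes of degree k is 12/k.
degrees = [1] * 12 + [2] * 6 + [3] * 4 + [4] * 3 + [6] * 2 + [12] * 1
gamma, A, r_squared = scale_free_fit(degrees)
print(gamma, A, r_squared)
```

For this synthetic sequence the fit recovers an exponent of one with an essentially perfect R^2; real degree distributions are noisier, particularly in the tail.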

2) PATH LENGTH AND CLUSTERING MEASURES
The average path length, or the characteristic path length, of a network is the average length of the shortest paths between all pairs of nodes in that network, measured in terms of the number of links on each path. The clustering coefficient of a node represents the ratio between the number of links between the neighbours of that node, and the number of all possible links between those neighbours. It is defined as [32]:
\[ C_i = \frac{2 y_i}{k_i (k_i - 1)} \]
where k_i is the degree of node i, and y_i is the number of edges between the neighbours of node i. The network clustering coefficient C is defined as the average of the clustering coefficients of all nodes in that network.
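Both quantities can be computed directly from an adjacency structure. The sketch below assumes an undirected, unweighted, connected graph stored as a dictionary of neighbour sets, and assigns a clustering coefficient of zero to nodes with fewer than two neighbours; both choices are assumptions made for this example rather than details taken from the paper.

```python
from collections import deque


def clustering_coefficient(adj, i):
    """C_i = 2*y_i / (k_i*(k_i - 1)); taken as 0 when k_i < 2 (assumption)."""
    nbrs = adj[i]
    k = len(nbrs)
    if k < 2:
        return 0.0
    # y counts each link between two neighbours of i exactly once.
    y = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
    return 2.0 * y / (k * (k - 1))


def network_clustering(adj):
    """Average of the node clustering coefficients."""
    return sum(clustering_coefficient(adj, i) for i in adj) / len(adj)


def average_path_length(adj):
    """Mean shortest-path length (in links) over all reachable ordered pairs."""
    total, pairs = 0, 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:  # breadth-first search from s
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


# Toy graph: a triangle 0-1-2 with a pendant node 3 attached to node 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
c_network = network_clustering(adj)
apl = average_path_length(adj)
print(c_network, apl)
```

Here nodes 0 and 1 have a clustering coefficient of 1, node 2 has 1/3, and node 3 has 0, so the network value is their average.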

3) CENTRALITY MEASURES
A host of centrality measures have been proposed to analyze complex networks, especially in the domain of social network analysis. Perhaps the simplest of these is the degree centrality, sometimes just called degree, of a node. A node's degree is simply the number of links it has with other nodes in the network, and therefore gives some indication of how important that node is to the network. A family of betweenness measures has been proposed [33]-[40] to measure a node's importance as a conduit of information flow in a network. The first, and perhaps the most well known, of these is the classical betweenness centrality measure proposed in [33]. Betweenness Centrality measures the fraction of shortest paths that pass through a given node, averaged over all pairs of nodes in a network. It is formally defined, for a directed graph, as
\[ BC(v) = \frac{1}{(N-1)(N-2)} \sum_{s \neq v \neq t} \frac{\sigma_{s,t}(v)}{\sigma_{s,t}} \]
where σ_{s,t} is the number of shortest paths between source node s and target node t, σ_{s,t}(v) is the number of shortest paths between source node s and target node t that pass through node v, and N is the number of nodes in the network. Closeness Centrality [34], [41] is a measure of how close a node is, on average, to the rest of the nodes in terms of shortest paths. It essentially measures the average geodesic distance between a given node and all other nodes in the network. It is defined as
\[ CC(v) = \frac{1}{\frac{1}{N-1} \sum_{i \neq v} d_g(v, i)} \]
where d_g(v, i) is the shortest path (geodesic) distance between nodes v and i. Note that the average is 'inverted' so that the node which is 'closest' to all other nodes will have the highest measure of closeness centrality.
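The two definitions can be illustrated with a short sketch. The betweenness routine below uses Brandes' accumulation scheme, a standard way of evaluating the sum over shortest paths; it is not claimed to be the implementation used in this study. The graph is assumed to be directed, unweighted and strongly connected, and all names are chosen for this example.

```python
from collections import deque


def bfs_shortest_paths(adj, s):
    """BFS from s: distances, shortest-path counts, predecessors, visit order."""
    dist = {s: 0}
    sigma = {v: 0 for v in adj}
    sigma[s] = 1
    preds = {v: [] for v in adj}
    order = []
    queue = deque([s])
    while queue:
        u = queue.popleft()
        order.append(u)
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
            if dist[w] == dist[u] + 1:  # u lies on a shortest path to w
                sigma[w] += sigma[u]
                preds[w].append(u)
    return dist, sigma, preds, order


def betweenness_centrality(adj):
    """Directed betweenness, normalised by (N-1)(N-2) as in the definition."""
    n = len(adj)
    bc = {v: 0.0 for v in adj}
    for s in adj:
        _, sigma, preds, order = bfs_shortest_paths(adj, s)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):  # accumulate dependencies back towards s
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {v: b / ((n - 1) * (n - 2)) for v, b in bc.items()}


def closeness_centrality(adj, v):
    """Inverse of the mean geodesic distance from v to the nodes it can reach."""
    dist, _, _, _ = bfs_shortest_paths(adj, v)
    total = sum(d for node, d in dist.items() if node != v)
    return (len(dist) - 1) / total if total > 0 else 0.0


# Toy directed graph: a three-node path with links in both directions.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
bc = betweenness_centrality(adj)
cc_a = closeness_centrality(adj, "a")
cc_b = closeness_centrality(adj, "b")
print(bc, cc_a, cc_b)
```

In the toy graph every shortest path between a and c passes through b, so b attains the maximum normalised betweenness, and it is also the node closest to all others.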

4) COMMUNITY STRUCTURE
The community structure of a network is not a topological metric as such, but a network when partitioned into communities can yield useful information about its structure and function. A network is said to have community structure if the nodes of the network can be easily grouped into (potentially overlapping) sets of nodes such that each set of nodes is densely connected internally [42]-[44]. Communities can be non-overlapping or overlapping, but in both cases, the general idea is that pairs of nodes are more likely to be connected if they are both members of the same community or communities, and less likely to be connected if they do not share communities.

FIGURE 2. Basic topological metrics of the supply chain network for the ten countries of interest. Note: Nodes (global) represents the number of firms belonging to the given country in the dataset, while Nodes (country network) represents the number of non-singleton nodes in the country-specific subnetwork. Therefore, the difference between these quantities represents firms which do not have supply relationships with another firm from the same country. Links (country network) represents the number of links in the country network. The "average links per node (country network)" represents the average number of supply relationships a firm has with other firms from the same country, while the "average degree (country network)" represents the average number of firms that a firm has supply relationships with in the relevant country. The former is always higher than the latter because sometimes a pair of firms may have more than one supply relationship between them in the dataset (multiple links between a pair of nodes are possible). Note also that the "average degree (global)" and "average links per node (global)" are not shown in this table, because they are not country-specific.

VOLUME 8, 2020
There are several algorithms which can be used to partition a network into communities, among which the Louvain method [45] is prominent. The Louvain method optimises the modularity of a network. Modularity is formally defined, for a weighted graph, as:
\[ Q = \frac{1}{2m} \sum_{i,j} \left[ A_{i,j} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j) \]
where A_{i,j} represents the edge weight between nodes i and j, k_i and k_j are the sums of the edge weights attached to nodes i and j respectively, m is the sum of all edge weights in the graph, c_i and c_j are the communities of nodes i and j, and δ(c_i, c_j) equals one if c_i = c_j and zero otherwise. The Louvain method is a greedy optimisation method that optimises the quantity Q, and runs with a time complexity of O(N log N), where N is the number of nodes in the network considered.
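To make the quantity Q concrete, the following sketch evaluates the modularity of a given partition of an undirected weighted graph. It does not implement the Louvain optimisation itself; it only computes the objective that the method maximises. The edge-dictionary layout, the absence of self-loops, and the toy graph are assumptions made for this example.

```python
def modularity(weights, community):
    """Modularity Q of a partition of an undirected weighted graph.

    weights   : dict {(i, j): w}, each undirected edge stored once, no self-loops
    community : dict mapping node -> community label
    """
    strength = {}  # k_i: sum of edge weights attached to node i
    m = 0.0        # sum of all edge weights
    for (i, j), w in weights.items():
        strength[i] = strength.get(i, 0.0) + w
        strength[j] = strength.get(j, 0.0) + w
        m += w

    # Sum of A_ij over same-community pairs; each edge appears as (i,j) and (j,i).
    intra = sum(2.0 * w for (i, j), w in weights.items()
                if community[i] == community[j])

    # Sum of k_i*k_j over same-community pairs equals the squared community strength.
    total_strength = {}
    for node, k in strength.items():
        c = community[node]
        total_strength[c] = total_strength.get(c, 0.0) + k
    expected = sum(s * s for s in total_strength.values()) / (2.0 * m)

    return (intra - expected) / (2.0 * m)


# Toy graph: two triangles joined by a single link, all weights equal to one.
weights = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.0,
           (3, 4): 1.0, (4, 5): 1.0, (3, 5): 1.0,
           (2, 3): 1.0}
two_groups = {0: "x", 1: "x", 2: "x", 3: "y", 4: "y", 5: "y"}
one_group = {node: "x" for node in range(6)}

q_split = modularity(weights, two_groups)
q_single = modularity(weights, one_group)
print(q_split, q_single)
```

Splitting the toy graph into its two triangles gives a clearly positive Q, whereas placing all nodes in a single community gives Q = 0, which is why a modularity-maximising method would prefer the split.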

A. BASIC TOPOLOGY
We first considered the basic topology of the global network, as well as the country networks of the ten countries we have selected, as mentioned above. We analysed the networks in terms of basic topological metrics, namely average degree, characteristic path length, network clustering coefficient, scale-free fitness (that is, the scale-free correlation) and scale-free exponent (naturally, the scale-free exponent is meaningful only in networks where the scale-free fitness is relatively high). These metrics have already been defined in section III-C. The results of this basic topological analysis are shown in Fig. 2.
From Fig. 2 we may observe that the country networks are relatively sparse: except for the US country network, every country network has an average number of links per node of between six and fifteen, depending on the size of the network, which translates to a more or less consistent network density of between 0.104% and 0.117%. The US country network has a domestic average number of links per node of 35.25 which, given its relatively large number of nodes, corresponds to a network density of 0.055%, which is even more sparse. We may also observe that all the country networks are scale-free, with scale-free fitness values ranging from 79% to 91% (in section III-C we have described how the scale-free fitness can be calculated). The scale-free exponents are typically around unity, which is smaller than in most real world networks, which have a scale-free exponent between 2.0 and 3.0 [46]. In the case of the country network of Russia, the scale-free exponent is less than one; therefore this network does not have the typical power law distribution associated with scale-free networks [46]. The characteristic path lengths of the networks are between 3.92 and 5.18, which are on the order of the logarithm of the network size. However, the clustering coefficients are not high enough to attribute small-world properties to these networks.

B. CENTRALITY AND RANK-BASED CORRELATIONS
We then computed the centrality distributions of all the country networks. For this, we considered (i) node degree, (ii) betweenness centrality, and (iii) closeness centrality. Figures 3, 4, and 5 show the ten companies that had the highest centrality values by each of these measures in the US, Japan and Australia, which are among the largest country networks in the dataset (the corresponding results for the rest of the countries considered are similar, but have not been presented due to space restrictions). Note that even though the countries were considered individually, the centrality values were calculated from the single global interfirm network. From these figures we could identify the companies which are most 'central' (and thus, important) in each country. For example, the most influential companies in the United States, according to this dataset, are General Electric, the US Government, IBM, Microsoft Corporation, and Hewlett-Packard Corporation when betweenness centrality is used. It might be noted that, considering the size of the dataset (38,708 US companies are present in the dataset), the top ten list changes little when different centrality measures are employed. General Electric, the US Government, IBM, Microsoft Corporation, Hewlett-Packard Corporation, and Oracle make the top ten list regardless of the centrality measure used. A similar observation can be made from Fig. 4 representing Japan, and Fig. 5 representing Australia. For example, Hitachi, Mitsubishi, Sony, Sumitomo, Toyota, Toshiba, Panasonic, and Fujitsu make the top ten list of Japanese firms regardless of the centrality metric used, and BHP Billiton, Australian Government, Worley Parsons, Rio Tinto, and Telstra all make the top ten list of Australian firms regardless of the centrality metric used. It could be noted that these firms are all well-known multi-national corporates or government controlled bodies.
What happens if we consider the country-specific subnetworks for each country, rather than the single global interfirm network, to calculate the centrality values of firms? Does this significantly change the top ten list of firms for each country, and does it make the top ten list more diversified across different centrality measures? The answers to these questions can be gleaned from figures 6, 7 and 8, which show the list of top ten firms based on centrality value for the United States, Japan and Australia. When country-specific networks are considered, most of the networks are fragmented, so we only consider the largest component of each country-specific network. Comparing the top ten list for the United States based on the global network (Fig. 3) with that based on the country network (Fig. 6), it is clear that the lists are very similar. Out of 22 firms represented in either list, 10 firms find mention in both lists at least once, and most of these firms find mention multiple times, so that the top ten lists look very similar. A similar scenario could be observed with respect to Japan and Australia (and the other countries that we considered, though the corresponding results are not shown). It is therefore fair to say that whether the global or the country-based network is used to analyse the centrality of companies does not change the top ten list of companies in each country by much.
However, we were interested not just in identifying the firms with the highest centralities in the supply network, but also in identifying firms which have a relatively higher local or relatively higher global importance in each country. Therefore, we ranked the firms in each country based on their 'local' centrality value (centrality values calculated by considering the country networks), as well as their 'global' centrality value (centrality values calculated by considering the single global supply chain network), and plotted global rank vs local rank for each country. This is shown in figures 9, 10 and 11 for the US, Japan and Australia respectively (again, the plots for the other countries are not shown due to space restrictions). We also calculated the correlation coefficient between the global ranks and local ranks for each country. Note that, even though the global network has firms belonging to all countries, the ranking was done only for those firms which belonged to the country under consideration. Thus, the 'lowest' rank would equal the size of the country network, for both local and global ranks.
We may note from figures 9, 10 and 11 that the correlation between global and local centrality ranks is high. For the three countries under consideration, across different centrality measures, this correlation ranges from 58.1% to 95.3%, though it is typically above 80%. This correlation between global and local centrality ranks is also high for the other countries that we studied.
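The rank comparison described above can be sketched as follows. The example converts centrality values to ranks (rank 1 being the most central firm) and computes the Pearson correlation coefficient of the two rank lists; the text does not state which correlation coefficient was used, so this choice, the assumption of no tied values, and the firm labels and centrality values are all illustrative.

```python
import math


def ranks(centrality):
    """Rank firms by centrality, rank 1 = highest value (ties not handled)."""
    ordered = sorted(centrality, key=centrality.get, reverse=True)
    return {firm: r for r, firm in enumerate(ordered, start=1)}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up centrality values for five firms of one country.
local_centrality = {"F1": 0.9, "F2": 0.7, "F3": 0.5, "F4": 0.3, "F5": 0.1}
global_centrality = {"F1": 0.8, "F2": 0.9, "F3": 0.4, "F4": 0.2, "F5": 0.3}

local_rank = ranks(local_centrality)
global_rank = ranks(global_centrality)
firms = sorted(local_rank)
rho = pearson([local_rank[f] for f in firms], [global_rank[f] for f in firms])
print(rho)
```

Only firms of the country under consideration enter both rankings, so the two rank lists have the same length, as noted above.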

C. OUTLIER FIRMS BASED ON CENTRALITY MEASURES
Which firms are the 'outliers' in figures 9, 10 and 11; that is, among the firms belonging to each country (US, Japan and Australia), which show the largest difference between their local and global importance? In figures 12, 13 and 14, we show the firms which have the largest difference between their rank based on global-network centrality measures and their rank based on country-network centrality measures. For comparison, the clustering coefficient was also considered, since it can be argued to be a kind of centrality measure. The 'error' in centrality rankings was calculated by fitting a straight line to the global rank versus local rank plot for the relevant centrality measure, and calculating the difference between the global rank and the fitted value for the given firm. Thus, this 'error' can be a decimal number, even though it is calculated from ranks, which are integers. Note that, since we consider ranks based on centrality in figures 9, 10 and 11, the lower the rank, the more important the firm is with respect to the metric in question. Firms which have a negative 'error' (that is, a numerically much lower global rank than the norm for firms in that country) are much more important in the global interfirm network and less important in their domestic network. Similarly, firms which have a positive 'error' (that is, a numerically much higher global rank than the norm for that country) are very prominent domestically but not so on the international scene. We were interested in finding the 'outlier' firms in this respect in all the countries that we considered, and whether there are any similarities between these outliers; some of the results for the prominent countries that we study are shown in figures 12, 13 and 14.
To summarise, in figures 12, 13 and 14 the errors were constructed as the difference between the actual global rank and the 'fitted' global rank (the 'expected' global rank for each firm based on its local rank), so if the error is positive, the firm is more locally important, while if the error is negative, the firm is more globally important.
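The 'error' construction described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code used in the study: the function name `rank_errors`, the firm labels and the rank values are all hypothetical, and an ordinary least-squares line with local rank as the predictor is assumed.

```python
def rank_errors(local_ranks, global_ranks):
    """Residuals of global rank after a least-squares line fit on local rank.

    Negative residual: the firm ranks better (numerically lower) globally
    than its local rank would suggest. Positive residual: the reverse.
    """
    n = len(local_ranks)
    mean_x = sum(local_ranks) / n
    mean_y = sum(global_ranks) / n
    sxx = sum((x - mean_x) ** 2 for x in local_ranks)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(local_ranks, global_ranks))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # error = actual global rank - fitted ('expected') global rank
    return [y - (intercept + slope * x)
            for x, y in zip(local_ranks, global_ranks)]


# Hypothetical ranks for five firms of one country (illustrative values only).
firms = ["A", "B", "C", "D", "E"]
local_ranks = [1, 2, 3, 4, 5]
global_ranks = [10, 20, 5, 40, 50]

errors = rank_errors(local_ranks, global_ranks)
# The firm with the most negative error is the most 'globally central' outlier.
most_global = firms[errors.index(min(errors))]
```

In this toy data, firm C has a global rank far below the fitted line and would therefore appear among the globally important outliers, while the errors, being residuals of a fitted line, need not be integers in general.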
In the case of the United States, which has the highest representation in the dataset, we can see from Fig. 12 that the 'outlier' firms which are comparatively much more important on the global scene are mainly of Chinese or other overseas origin, but are considered US companies because they are registered in the US. Naturally, they do most of their business with Chinese or other overseas firms, and thus their global importance is significantly higher than their domestic importance in the US interfirm network. On the other hand, firms which have relatively higher domestic importance compared to global importance are typically small firms which serve a niche market in the domestic population, such as Volunteers of America, New York State Teachers Retirement System, CarePlus Health Plan, AQR Capital management holdings PLC etc. Large corporates and internationalised firms are not represented among these outliers, which is to be expected.
A similar situation exists in Japan, though less pronounced, as shown in Fig. 13. The Japanese firms which are locally central outliers are all small firms which appear to serve some local niche market, such as Sakai moving service, Nishi Nippon City Bank Employee Stock Ownership plan, Asahimatsu foods etc. No large or well-known Japanese corporate is represented among them. On the other hand, the firms which are the globally central outliers include several companies with apparently non-Japanese roots, even though they are registered in Japan. Examples of these are EPS Corp, Roland Corp, Innotech Corp, Cresco Ltd etc. Therefore, it seems that considering the outliers based on the local-global centrality profiles is an effective way to determine the relative local or global importance of a firm in a supply chain network. This pattern is repeated in most other countries that we have analysed.
When we consider Australia, though, the picture is less clear, as shown in Fig. 14. For example, universities are represented as both globally central outliers and locally central outliers. However, this could be due to some universities having overseas campuses, and thus having to do business with many overseas firms, while others do not. Both the globally central outlier firms and the locally central outlier firms appear to be mostly small and relatively little-known firms, between which it seems hard to draw a distinction.

D. COMMUNITY STRUCTURE OF INTERFIRM NETWORKS
Next we analysed the community structure of the global interfirm network, in order to understand firms from which countries or which industries form the closest supply connections among themselves. To this end, we applied the Louvain method [45] to the global interfirm network, which has 154,862 nodes and 1,571,949 links, to create a community-based partition. We found a total of 1014 communities, of which 12 communities had more than 2000 nodes and 18 communities had more than 50 nodes. In the following analysis, we focus on the largest 12 communities (each of which had more than 2000 nodes), and any company which did not belong to any of these communities is denoted as belonging to the 'other' community. Fig. 15 shows the percentage of firms from each country we considered in the largest 12 communities generated by the Louvain method (with the 13th community representing the 'other'). Conversely, Fig. 16 shows the percentage of firms from each community present in each of the countries we considered. In other words, Fig. 15 shows the country distribution among communities, whereas Fig. 16 shows the community distribution among countries. From Fig. 15, we may see that US firms dominate communities 0, 1, 5 and 6, while Japanese firms dominate community 7, Chinese firms dominate community 9, and all other communities have firms from the 'rest of the' countries (countries outside the list of specifically considered countries in this paper), cumulatively, as the majority. On the other hand, from Fig. 16, we may see that the highest proportion of US firms is in community 0, the highest proportion of Chinese firms is in community 9, the highest proportion of Canadian, Australian, Russian and Indian firms is in community 2, the highest proportion of Singaporean firms is in community 11, the highest proportion of French and European Union firms (France being counted within the European Union) is in community 4, and the highest proportion of Japanese firms is in community 7. So it is important to note that if a community has firms from a certain country in the majority, this does not imply that the majority of firms from that country are in that community. Conversely, if the majority of firms from a particular country are in a community, this also does not imply that firms from that country are a majority in that community. The two relationships coincide only for community 0 and the US, community 7 and Japan, and community 9 and China. Even in these communities, 'majority' simply implies the highest proportion, and not necessarily more than half.
Fig. 17 shows the industries present in each of these communities, according to the industry classification presented earlier in Fig. 1. It can be seen from Fig. 17 that certain industries are predominantly present in certain communities. For example, transportation is predominant in community 2, manufacturing is predominant in community 3, finance in community 5, and the service sector in communities 1 and 6. The other communities were not associated with predominant industries to the same extent, even though manufacturing, finance and service industries together made up most of the firms in most other communities as well. Therefore, it seems that a significant proportion of Canadian, Australian, Russian and Indian firms are in the transportation community (it may be noted that these are all large countries which require considerable transport infrastructure), whereas all the other communities which correspond to the largest share of firms from a particular country are not characterised by any particular industry. This is corroborated by Fig. 1, which shows that Russia, Canada and Australia are the three countries where transport sector firms make up the highest percentages.

FIGURE 12. Outlier firms in US based on centrality scores. The 'centrality errors' shown represent the difference between the global rank of a firm based on the relevant centrality, and the fitted value of the global centrality rank versus local centrality rank profile, as shown in Fig. 9. A relatively higher rank (numerically) would imply relatively lower importance. Therefore a positive 'error' corresponds to higher local (country-based) importance, and a negative 'error' corresponds to higher global importance: (a) Betweenness-centrality based errors (b) Closeness-centrality based errors (c) Degree-centrality based errors (d) Clustering-coefficient based errors.
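The difference between the two views in Figs. 15 and 16 (country shares within a community versus community shares within a country) is a matter of which total the counts are divided by. The following is a minimal sketch with made-up data; the function name `shares`, the country labels "X" and "Y" and all counts are hypothetical and are not taken from the dataset.

```python
from collections import Counter


def shares(pairs):
    """pairs: one (country, community) tuple per firm.

    Returns two dicts keyed by (country, community):
    - share of the community's firms that come from the country
    - share of the country's firms that sit in the community
    """
    pairs = list(pairs)
    joint = Counter(pairs)
    per_country = Counter(country for country, _ in pairs)
    per_community = Counter(community for _, community in pairs)
    within_community = {key: n / per_community[key[1]]
                        for key, n in joint.items()}
    within_country = {key: n / per_country[key[0]]
                      for key, n in joint.items()}
    return within_community, within_country


# Hypothetical toy data: country X has 100 firms, country Y has 10.
firms = ([("X", 0)] * 90 + [("X", 1)] * 10 +
         [("Y", 0)] * 5 + [("Y", 1)] * 5)

within_community, within_country = shares(firms)
x_in_c1 = within_community[("X", 1)]   # X firms as a share of community 1
c1_in_x = within_country[("X", 1)]     # community 1 as a share of X firms
```

Here country X supplies two thirds of community 1, yet only a tenth of X's firms belong to it, which is the same kind of asymmetry the text describes between a country's share of a community and a community's share of a country.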
If we consider community 2 (which is dominated by transportation), firms from the European Union (16.93%), the US (15.66%) and China (10.27%) are most prominent in it, as Fig. 18 a shows. Similarly, if we consider community 3 (which is dominated by manufacturing; see Fig. 18 b), firms from the European Union (23.18%), the US (22.85%) and China (10.15%) are again most prominent in it. Community 5 (which is dominated by finance; see Fig. 18 c) consists largely of firms from the US (32.54%) and the European Union (25.33%). Community 6 (which is dominated by service; see Fig. 18 d) has more than half US firms (50.62%) and a considerable percentage of European Union firms (18.42%).
The other communities are not dominated by any particular industry, although community 7 (Fig. 19 a) has an interesting feature. It is dominated by Japanese firms (58.52%). Moreover, the firms with the next highest presence are Chinese firms (13.20%), and it appears that firms from these two countries in this community are very tightly connected to other firms from their own nation. Similarly, community 9 (Fig. 19 b) is dominated by Chinese firms (65.4%), and firms from no other single country have a significant presence. Conversely, we may note that 45.7% of all Japanese firms are present in community 7 (thus, even though Japanese firms are a majority in community 7, community 7 firms are not a majority among Japanese firms), while 29.37% of all Chinese firms are present in community 9 (again, a majority of Chinese firms are not present in community 9, even though a majority of community 9 firms are Chinese).
Now let us consider the reverse scenario: that is, identifying communities in individual country networks. For this purpose, we again applied the Louvain algorithm, but this time on country supply chain networks rather than the global supply chain network. The results for the United States, Japan, Australia and China are shown in figures 20 a, 20 b, 20 c and 20 d respectively. We can see from Fig. 20 a that the US interfirm network is dominated by communities 0, 1 and 4 (which are themselves dominated by manufacturing, again manufacturing, and service industries, respectively), while communities 2, 3 and 5 also have more than 10% representation. This tallies well with Fig. 1, which shows that the US interfirm network is dominated by manufacturing and service industries. Similarly, we may observe from Fig. 20 b that the Japanese interfirm network is dominated by communities 7 and 3 (where community 7 represents multiple industries, while community 3 is dominated by manufacturing). This tallies well with Fig. 1, which shows that the Japanese interfirm network is dominated by the manufacturing industry. Further, Fig. 20 c shows that the Australian interfirm network is dominated by communities 2, 4 and 5 (which are themselves dominated by transportation, manufacturing and finance respectively). This again corresponds to Fig. 1, which shows that the Australian interfirm network is dominated by firms from finance, manufacturing and transport industries, besides the services industry which is the third dominant industry overall, but does not seem to be represented by any specific large community in Australia. Finally, Fig. 20 d shows that the Chinese interfirm network is dominated by communities 9, 2 and 3 (which are themselves dominated by manufacturing, transportation, and again manufacturing respectively). This again corresponds to Fig. 1, which shows that the Chinese interfirm network is dominated by firms from manufacturing, though transport is not that dominant.

FIGURE 13. Outlier firms in Japan based on centrality scores. The 'centrality errors' shown represent the difference between the global rank of a firm based on the relevant centrality, and the fitted value of the global centrality rank versus local centrality rank profile, as shown in Fig. 10. A relatively higher rank (numerically) would imply relatively lower importance. Therefore a positive 'error' corresponds to higher local (country-based) importance, and a negative 'error' corresponds to higher global importance: (a) Betweenness-centrality based errors (b) Closeness-centrality based errors (c) Degree-centrality based errors (d) Clustering-coefficient based errors.

FIGURE 14. Outlier firms in Australia based on centrality scores. The 'centrality errors' shown represent the difference between the global rank of a firm based on the relevant centrality, and the fitted value of the global centrality rank versus local centrality rank profile, as shown in Fig. 11. A relatively higher rank (numerically) would imply relatively lower importance. Therefore a positive 'error' corresponds to higher local (country-based) importance, and a negative 'error' corresponds to higher global importance: (a) Betweenness-centrality based errors (b) Closeness-centrality based errors (c) Degree-centrality based errors (d) Clustering-coefficient based errors.
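The Louvain method used in this section searches for a partition with high modularity. As a rough illustration of the quantity being optimised (not of the Louvain search itself), the sketch below computes the standard Newman-Girvan modularity of a given partition for an undirected, unweighted graph. The function name `modularity` and the seven-link toy network are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected link listed once.
    community_of: dict mapping node -> community label.
    """
    m = len(edges)
    degree = defaultdict(int)
    internal = defaultdict(int)      # links with both ends in one community
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += 1
    total_degree = defaultdict(int)  # sum of node degrees per community
    for node, k in degree.items():
        total_degree[community_of[node]] += k
    # Q = sum over communities of (internal link share - expected share)
    return sum(internal[c] / m - (total_degree[c] / (2 * m)) ** 2
               for c in total_degree)


# Toy network: two triangles of firms joined by a single supply link (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
single = {node: "all" for node in range(6)}

q_split = modularity(edges, split)     # two communities, one per triangle
q_single = modularity(edges, single)   # everything in one community
```

Placing each triangle in its own community gives a higher modularity than lumping all firms together, which is the kind of improvement a modularity-maximising method looks for when it merges or moves nodes.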

V. DISCUSSION
The extended network analysis of interfirm networks undertaken in this work provided several interesting results. In terms of the basic topology, we observed that the global as well as the ten country-specific networks considered are all scale-free, even though their scale-free exponents ranged from 1.0 to 2.0, rather than the range of 2.0 to 3.0 observed in most other real world networks. The networks were also relatively sparse, and displayed some small-world properties. Then we identified the firms which were the most central in each country network, based on a number of centrality measures. It was found that these were, predictably, quite often large multi-national corporates and government bodies. However, certain small firms with overseas origins registered in a country, such as China Shengda Packaging Group Inc (registered in US), Chinacast Education Corps (registered in the US), and China Biologic Products Inc (registered in the US), had higher centrality when the global supply chain network was considered as opposed to their own country-specific networks, and some other firms, such as Volunteers of America, New York State Teachers Retirement System, CarePlus Health Plan etc, had higher centrality when the local (country-specific) network was considered as opposed to the global supply chain network. To further analyse this trend, we considered the correlation between global centrality and country-specific centrality for all firms in a specific country. In general, this correlation was quite strong for all countries considered by us, ranging from 58.1% to 95.3%, and typically above 80%. However, certain firms were clear outliers in the global centrality versus local centrality profiles of each country. Quite often, these firms seemed to be either firms which had some locally focussed business model, or, on the other side of the spectrum, locally registered firms with overseas roots. Some good examples are New York State Teachers Retirement System and China Shengda Packaging Group Inc, both registered in the United States.
Then we analysed the community structure of the global supply chain network, and applied the communities identified in the global network to classify nodes in the country-specific subnetworks as well. We used the Louvain method for community clustering, and found that there were 12 primary communities in the global network which had more than 2000 nodes each. Some of these communities were primarily aligned with certain industry sectors, such as community 2 which is aligned with the transportation sector, and community 5 which is aligned with the finance sector. Some other communities had majority representation from certain countries, such as community 6 which had 50.62% US firms, or community 7 which had 58.52% Japanese firms. Conversely, some country networks also had their highest representation from certain communities. For example, the Japanese country network had 45.7% of its firms in community 7, while the Russian country network had 44.34% of its firms in community 2. It is important to note, however, that the majority representation relationships between countries and communities are not mutual: that is, if a community has a majority representation from a certain country, this does not mean that that country will have majority representation from the corresponding community, and vice versa. For example, community 6 has a majority of its nodes (50.62%) from the US, but the corresponding US country network has only 3.8% of its nodes from community 6, which is not a numerical majority, and not even the largest community represented in the US country network.
There were also other interesting features in the community structure. For example, the analysis uncovered a dense cluster among Japanese and Chinese firms (community 7). This contrasts with the trading relationships between the US and China, or the US and Japan. Although the US and China, the two largest economies, have the largest numbers of firms in the dataset, no cluster emerged with a predominant presence of firms from these two countries. In contrast to the structure of Japan-China business relationships, US and Chinese firms do not seem to form dense Sino-American business cliques. The supply relationships between firms from these two countries are possibly more of an isolated dyadic nature. The same applies to interactions between US and Japanese firms.
In presenting these results, we were mindful that the analysis had some limitations. For example, we were unable to consider tie strength, since this data was not always available, so all analysis was conducted assuming that all ties were of equal strength, which is obviously not the case. We also did not consider the directionality of the ties, or their durability (how long the ties have existed). These considerations should be taken into account while interpreting the results. The community structure analysis was also subject to the inherent limitations of the Louvain method [45], [47]. For example, since the method is meant to optimise modularity in the analysed network, it is not necessarily the best at identifying hierarchical communities or communities of communities. Similarly, the results of the centrality analysis have to be taken in the context of the inherent weaknesses of each centrality measure used, which are well documented [40]. For example, betweenness-based measures only consider 'shortest paths' in calculating centrality, whereas in the context of interfirm networks, non-shortest paths are also important. Even though some centrality measures could take tie strength into account if available, we did not have this data and thus the centrality measures which we employed did not use it. Perhaps the most significant limitation is that the nature of ties (not merely their strength) is quite variable in supply chain networks. For example, a manufacturer-supplier relationship is inherently different from a wholesaler-retailer relationship, and this work, as well as most other works relating to topological analysis of supply chain networks [7], [19], [23], is forced to ignore this and assume that links are homogeneous, even when tie strength may vary. Some of these difficulties in supply chain network analysis cannot be easily overcome [7], yet it is important to be cognisant of them in considering the results.
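The shortest-path limitation of betweenness mentioned above can be seen on a small example. The sketch below implements unnormalised betweenness for an undirected, unweighted graph using Brandes' algorithm; it is an illustration only, and the function name `betweenness`, the node labels and the toy network are hypothetical rather than drawn from the dataset.

```python
from collections import deque


def betweenness(adj):
    """Unnormalised shortest-path betweenness (Brandes' algorithm).

    adj: dict mapping node -> list of neighbours (undirected, unweighted).
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        dist = {s: 0}
        sigma = {v: 0 for v in adj}    # number of shortest paths from s
        sigma[s] = 1
        preds = {v: [] for v in adj}   # predecessors on shortest paths
        order = []                     # nodes in order of distance from s
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):      # accumulate dependencies backwards
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    # every unordered pair was counted once from each endpoint
    return {v: b / 2.0 for v, b in score.items()}


# Toy network: S and T are linked through A-B-C (short route)
# and through A-D-E-C (one hop longer).
adj = {
    "S": ["A"],
    "A": ["S", "B", "D"],
    "B": ["A", "C"],
    "C": ["B", "E", "T"],
    "D": ["A", "E"],
    "E": ["D", "C"],
    "T": ["C"],
}

scores = betweenness(adj)
```

In this toy network, B receives full credit for the S to T connection, while D and E receive none for that pair even though they provide an alternative route that is only one hop longer; their scores come solely from pairs for which they lie on a shortest path.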
Yet, some of these concerns were partially mitigated by our study design. For example, in identifying locally and globally important firms and analysing the correlation between local and global importance, we used centrality-based ranking rather than centrality values themselves, which eliminated some issues arising from the limitations of centrality analysis. We also consistently used four inherently different centrality measures, so that issues related to particular centrality metrics were mitigated. We also typically had a large number of firms from each country with which to compute correlations, reducing finite-scale effects. Therefore, overall, the analysis resulted in reliable and significant observations. The analysis is among the first efforts to focus specifically on the 'international vs local importance' aspect of firms in supply chains, thus also shedding light on the consequences of increasing internationalisation and globalisation. Topological analysis, it should be noted, is the most efficient way of observing the overall patterns which are prevalent in the structure of supply chain networks, as indirect methods such as inference, machine learning or data mining are likely to result in relatively unclear generalisations which are not easily quantifiable in terms of their significance.

VI. CONCLUSION
Topological analysis of global and country-specific supply chain networks is very useful for understanding interdependencies in terms of supply relationships, and the resultant trade clusters and communities which determine the stability and robustness of the global economy. In particular, during times of crisis such as the current COVID-19 crisis, global supply chains are affected tremendously, so it is vitally important to understand the relative strengths and weaknesses of global and country-specific supply chain networks, and the part individual firms play in shaping these; this paper helps address these questions. The primary contributions of this paper are that it establishes that typical supply chain networks are scale-free with a low scale-free exponent, highlights the nature of firms which are more globally central than locally central in supply networks and vice versa, and identifies which industries and countries drive the global community structure of supply and interfirm networks.
However, more data about individual firms themselves, such as their sales, income, size, human resources etc, and about other relationships between them, such as shareholding relationships and patent relationships, are important to build a holistic and data-rich set of complex networks which will help us clearly understand the dynamics and evolution of the global financial system. In considering future research directions, we are cognisant of this, and our future research will focus on incorporating and analysing datasets which can give us input about the above-mentioned attributes. Specifically, our future research will focus on modelling the impact of crises on supply chain networks, and how such crises affect the economic and financial resilience of firms. In this sense, our future research will focus on modelling 'cascading failures' in interfirm networks, using large data sets with detailed firm attributes. It will also look at 'fire-sale' events and other such responses of interfirm networks to crises, and how these affect the economic stability of world markets. This research has clear applications in several current contexts, such as the COVID-19 crisis threatening the world, the bushfire crisis that threatened parts of Australia last summer, and the crises triggered by trade disputes between countries.
HONGZE JING received the master's degree in complex systems from The University of Sydney, in 2019. His research project was in the area of supply chain management. His research interests include complex systems, complex networks, social networks, and supply chain networks.
PETR MATOUS is currently an Associate Dean with the Faculty of Engineering, The University of Sydney. In 2015, after 13 years at The University of Tokyo, he left his position as an Associate Professor in the Department of Civil Engineering to join The University of Sydney. He researches, lectures, and consults with Australian, Japanese, and international organizations in the fields of network science and international development projects. He also participates in research and social projects in diverse communities across Australia, Asia, and Africa to leverage community networks in contexts constrained by limited resources or environmental disasters. He was a recipient of The University of Tokyo President's Award.
YASUYUKI TODO received the Ph.D. degree in economics from Stanford University. He has been a Professor with the Graduate School of Economics, Waseda University, since 2014, after serving as the Department Head at the Department of International Studies, The University of Tokyo. He is currently a Faculty Fellow with the Research Institute of Economy, Trade & Industry. He has published more than 50 academic articles in refereed journals, including Nature Sustainability, the Journal of Industrial Economics, the Journal of Regional Science, Ecological Economics, Research Policy, and World Development. His research interests include international economics, development economics, and applied micro-econometrics, also focusing on the role of social and economic network in economic growth and resilience based on firm- and household-level data from various countries.