Information-Centric Networking Cache Robustness Strategy for System Wide Information Management

This study focused on the problems of data center overload and resource allocation imbalance during task processing in System Wide Information Management (SWIM). Information-Centric Networking (ICN) technology was adopted at the infrastructure level of SWIM. In order to guarantee the cache performance of ICN in SWIM and reduce the impact of cache node failures on network cache performance, a cache robustness strategy (CRS) based on content popularity and node importance in a double-layer network structure was proposed. Some of the most popular content was cached on both the selected special backup node and the general nodes to ensure that a cache node failure would not significantly affect network cache performance. The results of simulation experiments performed on the ndnSIM platform showed that this strategy is more robust and has a better cache-hit ratio than other cache methods, which ensures the network cache performance of SWIM.


I. INTRODUCTION
The concept of SWIM was put forward by EUROCONTROL in 1997 and accepted by the International Civil Aviation Organization (ICAO) in 2002 [1]. SWIM is a large-scale distributed information transmission and highly integrated sharing network of the Air Transportation System (ATS). As an information sharing platform, SWIM uses a service-oriented architecture. It allows interoperability and consistency based on networking and informatization among relevant units of civil aviation and reduces the difficulty of comprehensive scheduling, processing, and integration of data for independent systems. Thus, SWIM can achieve integrated, standardized, and flexible communication, monitoring, navigation, meteorological, operation, and intelligence data distribution and sharing in the independent air traffic management (ATM) system, to meet the needs of efficient and coordinated operations of future civil aviation.
The associate editor coordinating the review of this manuscript and approving it for publication was Ting Wang.

Ensuring the performance and reliability of the SWIM system is a key issue that must be addressed. Originally, SWIM was built on the TCP/IP protocol, which promoted the development of SWIM. In terms of caching, an independent intradomain-cache scheme was adopted, and the cache redundancy at each node of the entire network can ensure good cache robustness. However, with the further development of SWIM, its internal capacity and performance requirements are constantly increasing. Because traditional TCP/IP is based on direct information exchange between hosts, and internal routers that have a cache function do not cooperate, cache efficiency is not high, information transmission efficiency is low, and there is a risk of attack. Many studies have evaluated the performance and reliability of the traditional network architecture, and some researchers have proposed intradomain-cache cooperation schemes, for example to reduce network cache redundancy and network delay and to improve cache efficiency [2]. However, such measures are rarely considered for a new network architecture that would solve the problems of the TCP/IP network in SWIM and adopt an appropriate routing cache strategy to improve cache efficiency and ensure the cache reliability of the system.
In order to provide good network cache performance and robustness for SWIM, we adopted the ICN architecture, using the node cache characteristics in this network structure, to cache the more popular content not only in the general node, but also in the special backup node in the network. Through theoretical analysis and comparative experiments, the cache robustness strategy (CRS) proposed in this paper can provide better cache performance while ensuring SWIM network cache robustness.
The rest of this paper is organized as follows. Related work is presented in Section II, with a discussion of the strengths and weaknesses of each method. Section III discusses the existing cache mechanism of SWIM. Section IV provides a detailed description of the ICN caching strategy in SWIM. Section V is the experimental part, which presents the simulation results. Section VI concludes this paper.

II. RELATED WORK
There have been many studies on improvements to traditional TCP/IP networks. For example, the content delivery network (CDN) technology proposed by Pallis et al. [3] reduces the excessive service pressure caused in the network system by the large number of current internet users. The peer-to-peer network method proposed by Schollmeier [4] improves the data transmission efficiency of the network system. However, because these methods and technologies are based on IP and occupy too many system resources, problems such as excessive system redundancy and low transmission reliability still exist.
Regarding IP networks applied to the aviation field, Cao et al. [5] studied a routing selection method with link reliability estimation based on mobility prediction, with the goal of improving the reliability and security of information communication between internal nodes of future private aviation networks. Zhang et al. [6] proposed a low-delay and high-reliability routing algorithm based on absorption and load balancing mechanisms, mainly to solve the problems of poor timeliness and dynamically changing service load in aviation communication networks. Ye [7] sought to improve the routing technology used in ground-air communication networks by constructing an aeronautical telecommunication network (ATN) multilayer network reliability structure model, which improved the security and timeliness of information transmission in the communication network. Hu [8] proposed a design scheme for a civil aviation mobile network based on HMIPv6. Shahriar et al. [9] studied and designed a method of routing between an airborne mobile network (MN) and a wired network (WN). Each of the above-mentioned IP routing research methods, however, aims to solve only a specific problem.
In research on caching for information-centric networking, Ju and Lim [10] used a shared digest and a Bloom filter to count cached content items in order to design a strategy for sharing the cache digest between neighboring routers. Psaras et al. [11] researched a caching strategy for ICN that caches the contents of data packets according to probability values of the nodes and designed a probabilistic caching strategy called ProbCache. These strategies and methods help improve the overall performance of the routing cache in information-centric networking. In actual applications, the default placement strategy of most ICN caches is the LCE (Leave Copy Everywhere) [12] method, which is simple to implement. However, because it caches content on all nodes along the data packet's return path, many duplicate cached copies are generated, network redundancy is large, and precious network cache resources are occupied. In [13], the authors designed the BEACON caching strategy based on content popularity prediction. Z. Fan studied a cache method based on cache revenue and content-block popularity to optimize network cache management [14]. Cho et al. studied a cache method called WAVE [15], which performs selective cooperative caching based mainly on the content popularity of data blocks and reduces the redundancy of the system cache. In [16], the authors studied a heuristic routing cache strategy based on cache probability, in which both the popularity of the content and the benefits of cache placement are considered when calculating the cache probability. Most of these research methods focus on cache performance and improve the cache capacity and efficiency of the system, but they seldom study the cache reliability of the network.
Among the extensive and in-depth work on information-centric networking, some scholars have also paid attention to network reliability. Al-Naday et al. [17] studied the survivability strategy of information-centric networking and found that using the PURSUIT framework can shorten the system's service recovery time and reduce unnecessary losses after a network disaster. Sourlas et al. [18] proposed an information elastic recovery strategy under extreme network conditions, which can ensure that necessary information can still be obtained from cache contents in network nodes when a network or a node fails. This study, starting from content popularity and node importance, solves the network reliability problem caused by the failure of network cache nodes through a cooperative caching strategy in a double-layer network structure.

III. SWIM CACHE MECHANISM
In this section, in order to study the application of information-centric networking in SWIM and to improve the system's cache performance and robustness, we analyze the existing problems of the current SWIM network architecture and its caching.

A. SWIM NETWORK ARCHITECTURE
The ISO/OSI protocol is widely used in the field of civil aviation. However, with the continuous improvement of the civil aviation industry's requirements for the safety, efficiency, and stability of the ATN, the ISO/OSI protocol cannot meet the needs of the rapid development of various civil aviation services. Due to the rapid development and widespread application of TCP/IP, relevant technologies are becoming increasingly mature and reliable. TCP/IP has become the industrial standard for network interconnection technology [19]. Presently, IP technology is used in the network layer of ATN and SWIM. The SWIM network structure based on TCP/IP is shown in Figure 1.
In the SWIM network structure of Figure 1, the server transmits and manages information by adopting TCP/IP. There can be multiple independent management domains (e.g., airlines, airports, ATM, and other organizations) connected to the backbone network. Moreover, the interaction and sharing of information resources between different regions can be completed through interdomain routers.
Service-oriented architecture (SOA) is characterized by low cost, loose coupling, standardization, and easy integration. SOA technology was adopted in the SWIM design of domestic civil aviation, and further research and development has been carried out on this basis. SWIM does not generate data, but introduces the concept of a virtual information pool and uses SOA technology to collect, process, analyze, exchange, and share business data information of organizations such as airports, airlines, and aviation administration and to transmit and manage the data flowing through them. SWIM is mainly divided into an information technology infrastructure level, a functional service level, and an application level. The specific SWIM structure and functions are shown in Figure 2.
In the SWIM service function level shown in Figure 2, the main services provided include message sending, interface management, enterprise service management, service security, and so forth. The implementation of these services requires SOA technology and enterprise service bus (ESB) support. Especially when sharing service data information, it requires key steps such as registration, release, and discovery of information. SWIM uses SOA architecture technology to transfer information between various systems through services to realize service data information sharing [20]. First, the service publisher registers its specific location and the service information it can provide with the application center. Then, the application center releases the registered service information. Finally, service subscribers find publishers who can provide services for their needs, and they complete the information interaction through end-to-end or message proxy services [21]. The SWIM information exchange model using SOA technology is shown in Figure 3 [22].
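The register/publish/discover flow above can be sketched as a minimal registry. This is an illustrative sketch only: the class and method names below are our assumptions, not part of any real SWIM or SOA API.

```python
class ApplicationCenter:
    """Registry with which service publishers register and which subscribers query."""
    def __init__(self):
        self.registry = {}  # service name -> publisher endpoint

    def register(self, service_name, endpoint):
        # Step 1: the publisher registers its location and the service it provides.
        self.registry[service_name] = endpoint

    def discover(self, service_name):
        # Step 3: a subscriber looks up a publisher for the needed service.
        return self.registry.get(service_name)

# Step 2 (publication) is modeled implicitly: registered entries are visible.
center = ApplicationCenter()
center.register("weather/metar", "publisher-a:9000")  # hypothetical service name
endpoint = center.discover("weather/metar")
# Step 4 (not modeled): the subscriber contacts `endpoint` end-to-end or
# via a message broker to complete the information exchange.
```

In a real SWIM deployment, registration, publication, and discovery go through the ESB and SOA middleware rather than an in-process dictionary; the sketch only shows the ordering of the steps.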

B. SWIM IP NETWORK CACHE PROBLEM
The original intention of the TCP/IP design was mainly to facilitate data interaction and information resource sharing between hosts. Therefore, the TCP/IP network architecture is relatively simple and can efficiently connect various networks and terminal equipment, but it is not suitable for scenarios with many users and large network traffic. Moreover, the traditional IP network is based on location information and is a host-centered end-to-end network. Although civil aviation does not have information security and confidentiality requirements as high as those of military aviation, and the degree of openness in the field of civil aviation remains limited, the information transmitted in civil aviation must be guaranteed to be safe and reliable, and real-time, efficient information sharing services must be provided. There are many communication nodes in the SWIM network, so the routing and caching of nodes in the SWIM network must be considered as much as possible. Presently, there are several problems in the routing cache mechanism of aviation telecommunication networks using TCP/IP:
i) Based on IP, routing maintenance is complicated, overhead is large, routing time is long, network latency is prolonged, and the load is large.
ii) It is difficult to cache, find, and monitor network content, and there are some problems such as poor security and privacy.
iii) The robustness of the network is not high, and when a link node in the network fails, the system may be interrupted.
iv) With the host-based routing and forwarding communication mode, when multiple requests are made for highly popular resources, the system cannot respond in time, making it difficult to guarantee users' quality of service (QoS) requirements [23].
Based on the above-mentioned problems of the network architecture using TCP/IP, finding methods to improve SWIM's reliability, security, and flexibility has become a focus of experts and scholars, and it is urgent to design a new, optimized SWIM network architecture to improve system performance. ICN [24] is a recently proposed network architecture. Its communication mode is information-centered, which differs from the traditional host-centered network communication mode. In information-centric networking, users no longer pay attention to the location of the information but rather focus on the information itself, and the information of each node in the network can be interconnected. The name is used as the unique transmission identifier of the information, while the IP address can be used as the transmission identifier at the bottom of the network. In ICN, data communication is mainly realized through two message types: interest messages and data messages. Because ICN internal nodes have a cache function, users can obtain the required information from the nearest network node cache, thus improving the efficiency of information transmission and sharing and reducing the redundancy and load pressure of the system [25].

IV. ICN CACHING STRATEGY IN SWIM
Presently, the caching strategies adopted in networks are mainly routing-process interaction, static configuration, and other methods. These cache strategies occupy too many network resources and generate network information redundancy. In this work, the robustness of the system cache was studied: when a cache node fails in the network, the goal is to ensure that the overall cache performance of the network remains unchanged [26], [27].

A. CACHE MECHANISM
In this section, considering the advantages of ICN over the traditional IP network, an ICN structure is adopted in SWIM. Using ICN cache characteristics, SWIM changes the traditional store-and-forward mode to a cache-and-forward mode, which allows users to get the information they need from the cache of the nearest node, avoids excessive load on data center servers, improves network information transmission efficiency, and thus enhances task processing performance throughout SWIM. A schematic diagram of the SWIM infrastructure using ICN is shown in Figure 4.
In the two-layer network structure of SWIM's ICN caching strategy shown in Figure 4, CR is the content router, AR is the access router, the blue dashed line is the interest packet transmission path, and the green dashed line is the data packet transmission path. Visitors such as air traffic control, airlines, and airports can request service information by sending interest packets, and information publishers and intermediate nodes return the corresponding service information through data packets.
The service node in SWIM's ICN consists of three parts: the forwarding information base, the pending interest table, and the content store, as shown in Figure 5 [28].
i) Forwarding Information Base (FIB): It is responsible for recording the mapping between the next hop service node to be forwarded by the interest packet and the corresponding information naming to determine the routing direction. The FIB can use the multicast mechanism to forward interest packets for multiple ports matching the name at the same time, improving the efficiency of content request.
ii) Pending Interest Table (PIT): It tracks and records the input ports at which interest packets have arrived but for which the expected matching data has not yet arrived. The PIT uses one entry to record the content name and interface information of the interest packet, and returns the corresponding data packet according to this information.
iii) Content Store (CS): It is the local content storage table of the routing node, used as a local cache at each service node to store a copy of the information passing through. The CS updates its content according to a certain cache replacement policy, and the data can be reused by content name to save bandwidth resources.
As shown by the dashed blue line in Figure 4, the service subscriber sends an interest packet for the requested resource. When it arrives at a service node, the node searches its local CS for service information whose name matches the prefix of the requested information. If a corresponding entry is found, the information is immediately returned on the incoming interface and the interest packet is discarded. Otherwise, the node performs a longest prefix match on its FIB to determine the next hop node to which to forward the interest packet. Longest prefix matching means comparing the information names and selecting the entry that matches the most leading components of the given name. If a corresponding entry is found in the FIB, the incoming interface of the interest packet is recorded in the PIT and the packet is pushed to the node indicated by the FIB. If the PIT already contains an entry for the same service information, which means that this information has already been requested, the node adds the incoming interface to the PIT entry and discards the interest packet.
As shown by the dashed green line in Figure 4, once service information matching the requested name is found in the CS of the publisher, the interest packet is discarded and the information is returned as a data packet. According to the table maintained in the PIT, the message is sent back to the corresponding subscribers one by one. When a node receives a data packet, it first creates a copy of the corresponding service information and stores it in the local CS. Then, it performs a longest prefix match in its PIT to find the return interfaces matching the data packet. If the PIT entry lists multiple interfaces, the data packet is copied, thus realizing multicast transmission. After that, the node forwards the data packet to these interfaces and deletes the entry from the PIT. If there is no matching entry, the router discards the data packet; finally, the service information is returned to the subscriber [29].
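The interest and data processing pipeline described above can be sketched as follows. The table names (CS, PIT, FIB) follow the paper, while the Python class, its method names, and the dictionary-based name matching are our illustrative assumptions, not an actual NDN forwarder implementation.

```python
class ServiceNode:
    """Minimal sketch of one ICN service node's CS/PIT/FIB lookup pipeline."""
    def __init__(self):
        self.cs = {}    # Content Store: content name -> data copy
        self.pit = {}   # Pending Interest Table: content name -> set of incoming faces
        self.fib = {}   # Forwarding Information Base: name prefix -> next hop

    def longest_prefix_match(self, name):
        # Select the FIB entry matching the most leading name components.
        parts = name.split("/")
        for i in range(len(parts), 0, -1):
            prefix = "/".join(parts[:i])
            if prefix in self.fib:
                return self.fib[prefix]
        return None

    def on_interest(self, name, face):
        if name in self.cs:                 # CS hit: return data, drop the interest
            return ("data", self.cs[name])
        if name in self.pit:                # already requested: aggregate the face
            self.pit[name].add(face)
            return ("aggregated", None)
        next_hop = self.longest_prefix_match(name)
        if next_hop is not None:            # record incoming face, forward upstream
            self.pit[name] = {face}
            return ("forward", next_hop)
        return ("drop", None)               # no route: discard

    def on_data(self, name, data):
        self.cs[name] = data                        # cache a local copy first
        faces = self.pit.pop(name, set())           # then satisfy all waiting faces
        return sorted(faces)                        # multicast to every recorded face
```

Two interests for the same name from different faces are aggregated into one PIT entry, so a single returning data packet is multicast to both, as described above.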

B. NETWORK MODEL OF ROBUST CACHING STRATEGY
The design points of the double-layer hierarchical network structure of the SWIM robust caching strategy [30] are:
i) The content router in the core layer has no cache function, which reduces content lookup operations, thus ensuring high-speed data interaction.
ii) The content router at the edge layer is responsible for user access and has the function of cache content.
iii) Cache nodes are divided into two types, general and special cache nodes, with important and popular content redundantly backed up on the special cache nodes.
Suppose there are m + 1 network nodes, denoted R_0, R_1, ..., R_m. One of the nodes is selected as the special cache node, and the remaining m nodes are general cache nodes.

C. CONTENT POPULARITY CALCULATION
Content popularity: The number of times content objects are accessed by all customers within a certain statistical time.
where R is the content set of the network nodes, R = {r_x | x = 0, 1, 2, ..., m}. The members r_x of R are numbered according to their popularity level. S_x is the access frequency of content object r_x; for all 0 ≤ x < y ≤ m, S_x > S_y holds, and S_x is called the content popularity of r_x in R. S = {S_x | x = 0, 1, 2, ..., m} is called the popularity distribution of R.
The network edge node e performs periodic dynamic statistics on the content popularity according to the request records of content objects. It is expressed as follows:

S_e[r_y, x] = α · (Num_e[r_y, x] / Num_e[x]) + (1 − α) · S_e[r_y, x − 1]    (1)

where S_e[r_y, x] is the content popularity of r_y at e in period x, Num_e[r_y, x] is the number of requests for content r_y at e in period x, Num_e[x] is the total number of content requests at e in period x, and α (0 < α < 1) is the weight of the current period's request frequency in the total content popularity [31].
The content popularity of the network core node c is calculated from the node heat and the content popularity of the downstream nodes connected to it, defined as follows:

S_c[r_y, x] = ( Σ_{i=1}^{j} Hot[c_i, x] · S_{c_i}[r_y, x] ) / ( Σ_{i=1}^{j} Hot[c_i, x] )    (2)

where S_c[r_y, x] is the content popularity of content r_y at c in period x, j is the number of downstream nodes connected to c, c_i is a downstream node of the network core node c, and Hot[c_i, x] is the routing node heat of c_i in period x.
The node heat is expressed as follows:

Hot[c_i, x] = δ · Hot[c_i, x − 1] + (1 − δ) · Num[c_i, x]    (3)

where Num[c_i, x] is the number of requests that arrived at c_i in period x, and δ (0 < δ < 1) is the weight of the node heat of period x − 1 in the node heat of period x.
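The periodic statistics above can be sketched in code. This sketch assumes the common exponentially weighted smoothing form implied by the weights α and δ, and a heat-weighted aggregation for core nodes; the function names and these exact forms are our assumptions rather than the paper's definitive formulas.

```python
def edge_popularity(prev_pop, num_requests_ry, num_requests_total, alpha):
    """Popularity of content r_y at an edge node e in the current period:
    current request frequency weighted by alpha, previous popularity by (1 - alpha)."""
    freq = num_requests_ry / num_requests_total if num_requests_total else 0.0
    return alpha * freq + (1 - alpha) * prev_pop

def node_heat(prev_heat, num_requests, delta):
    """Heat of a downstream node c_i: previous-period heat weighted by delta,
    current-period request count by (1 - delta)."""
    return delta * prev_heat + (1 - delta) * num_requests

def core_popularity(downstream):
    """Popularity at a core node c, aggregated over its downstream nodes.
    `downstream` is a list of (heat, popularity) pairs; the heat-weighted
    average used here is one plausible aggregation, not the paper's exact one."""
    total_heat = sum(h for h, _ in downstream)
    if total_heat == 0:
        return 0.0
    return sum(h * p for h, p in downstream) / total_heat
```

With α (or δ) close to 1, the statistics react quickly to the current period; with values close to 0, they change slowly and smooth out request bursts.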

D. NODE IMPORTANCE CALCULATION
The importance of a node in the network is mainly reflected by the node's betweenness, which refers to the proportion of the number of shortest paths passing through the node to the number of shortest paths between all routing node pairs [32]. It is expressed as follows:

B_i = ( Σ_{s<t} σ_st(i) / σ_st ) / ( m(m − 1) / 2 )    (4)

where σ_st is the number of shortest paths from node s to node t, σ_st(i) is the number of shortest paths in σ_st that pass through node i, and m(m − 1)/2 is the normalization factor for node betweenness in an undirected graph.
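The betweenness measure described above can be computed with Brandes-style shortest-path counting. This is a sketch under the assumption of an undirected, connected topology given as an adjacency dictionary; the function name and input format are ours.

```python
from collections import deque

def betweenness(adj):
    """Normalized betweenness of every node: the fraction of shortest paths
    between node pairs that pass through it. adj: node -> list of neighbors."""
    nodes = sorted(adj)
    m = len(nodes)
    bc = {v: 0.0 for v in nodes}
    for s in nodes:
        # BFS from s: sigma[v] counts shortest s->v paths, preds[v] their predecessors.
        sigma = {v: 0 for v in nodes}
        dist = {v: -1 for v in nodes}
        preds = {v: [] for v in nodes}
        sigma[s], dist[s] = 1, 0
        order, queue = [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate pair dependencies (Brandes' accumulation step).
        delta = {v: 0.0 for v in nodes}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair is visited from both endpoints, so dividing by
    # m(m - 1) equals dividing by 2 and then by the m(m - 1)/2 normalization.
    return {v: bc[v] / (m * (m - 1)) for v in nodes}
```

On a three-node path topology, the middle node lies on the single shortest path of the one pair that does not include it, giving it betweenness 1/3 while the endpoints get 0.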

E. SELECTION OF SPECIAL BACKUP NODE
When a cache node fails in the network, in order to reduce the number of cache-hit hops, the location of the special backup cache node must be selected reasonably. The average distance from node N_i to the other nodes in the network is D_i, given by Equation (5):

D_i = (1 / m) Σ_{j≠i} d(N_i, N_j)    (5)

where d(N_i, N_j) is the hop distance between nodes N_i and N_j, and the sum runs over the m other nodes in the network.
The following compares the number of cache-hit hops when the central node N_c is selected as the special backup cache node with that when a noncentral node N_g is selected [33]. The functions and parameters used in the formulas in this section are listed in Table 1.

1) THE CENTRAL NODE IS SELECTED AS A SPECIAL BACKUP CACHE NODE
When the central node N_c is selected as the special backup cache node, the total content acquisition hop count F_c in the network is defined as:

2) THE NONCENTRAL NODE IS SELECTED AS A SPECIAL BACKUP CACHE NODE
When a noncentral node N_g is selected as the special backup cache node, the total content acquisition hop count F_g in the network is defined as:

From (6)-(10), F_c − F_g ≥ 0; that is, the number of cache-hit hops with the central node is larger than that with the noncentral node. Therefore, for this study, we selected a noncentral node as the special backup cache node.
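The selection step above can be sketched as follows: D_i is computed by BFS over the topology, and a noncentral node is chosen as the special backup cache node. Interpreting "noncentral" as the node with the largest average distance, and the tie-breaking by node id, are our illustrative assumptions; the function names are likewise ours.

```python
from collections import deque

def avg_distance(adj, i):
    """D_i: average hop distance from node i to all other nodes, computed by BFS.
    adj: node -> list of neighbors in the (assumed connected) undirected topology."""
    dist = {i: 0}
    queue = deque([i])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    others = [d for n, d in dist.items() if n != i]
    return sum(others) / len(others)

def select_backup_node(adj):
    """Pick a noncentral node as the special backup cache node, here taken
    as the node with the largest average distance (ties broken by node id)."""
    return max(sorted(adj), key=lambda n: avg_distance(adj, n))
```

On a star topology, the hub has the smallest D_i (it is the central node), so the selection falls on one of the leaves, matching the conclusion above that a noncentral node should hold the backup.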

V. EXPERIMENT AND RESULTS ANALYSIS
In this section, in order to verify the performance and robustness of the presented cache scheme, we referred to the SWIM structure already deployed by a civil aviation air traffic management organization (Figure 4), and a simulation experiment was carried out on the ndnSIM [34] simulation platform.

A. SIMULATION ENVIRONMENT AND PARAMETERS
The simulation experiment topology included 1 server, 1 special backup cache node, 4 access routers (AR), 4 content routers (CR), 1 resource manager (RM), and 300 users. The simulation test topology is shown in Figure 6, and the experimental parameters are shown in Table 2.
The cache robustness strategy proposed in this paper was compared with the default LCE cache placement strategy of reference [12] and the WAVE cache mechanism of reference [15]. At the same time, we simulated CRS-BF, the case in which the special backup cache node fails, and CRS-CF, the case in which a general cache node fails.

B. SIMULATION RESULTS AND ANALYSIS
The simulation experiment was run 1000 times, and the average of the experimental results was taken. The resulting trace files were analyzed with Wireshark, and the results were processed in MATLAB. The experimental results are shown in Figures 7-10. Figure 7 shows the change in the number of service requests at the server when a cache node in the network failed, which reflects the robustness of the network cache.

1) ANALYSIS OF CACHE ROBUSTNESS EXPERIMENTAL RESULTS
As shown in Figure 7, the total simulation time was 400 s. In the first 100 s, there was no failed cache node in the network. The first failed cache node occurred within the second 100 s, and the failed node was a special backup cache node (CRS-BF). In both the third 100 s and the last 100 s, a failed cache node appeared, and the failed node was a general cache node (CRS-CF).
It can be seen from Figure 7 that in the three cache node failure phases, the WAVE scheme had a larger increase in the number of requests; the LCE scheme had the smallest change in the number of requests but the largest total number of service requests; and the CRS-BF scheme had the smallest number of requests received by the content source server, with an increase in the number of requests not much different from that of the CRS-CF scheme. This is because a cache backup strategy was designed in this scheme: popular content is cached in both the general and the special backup cache nodes. Whether the special backup node or a general cache node fails, the remaining nodes can still serve user requests and avoid imposing additional request load on the system's servers. This reflects the importance of the backup cache node to the normal operation of SWIM when a node in the network fails. Figure 8 shows how the average request hop count of the CRS-BF, CRS-CF, WAVE, and LCE caching strategies varied with the popularity value δ.

2) ANALYSIS OF EXPERIMENTAL RESULTS OF CACHE AVERAGE ROUTING COST
It can be seen from Figure 8 that the average number of request hops of the four caching strategies decreased as the δ value increased. The average number of request hops of the CRS-BF caching strategy was always smaller than that of the other three caching strategies, and its advantage was greater when the δ value was small. Because the CRS-BF caching strategy was adopted, many client requests hit within the domain; compared with the other caching strategies, the number of required request hops was significantly reduced. When the popularity was relatively high, for example when the value of δ was 0.9, most service requests were concentrated, and only the hottest content needed to be cached. Therefore, when the content popularity value is very high, the average request hops of the four caching strategies are all relatively small. Figure 9 shows how the cache-hit ratio of the CRS-BF, CRS-CF, WAVE, and LCE caching strategies varied with the popularity value δ.

3) ANALYSIS OF EXPERIMENTAL RESULTS OF CACHE-HIT RATIO
It can be seen from Figure 9 that the cache-hit ratio improved greatly as the δ value increased, with CRS-BF and CRS-CF having obvious advantages. In the LCE caching strategy, each node in the network caches the contents of packets passing through it, which results in large redundancy of cached packet content in the network. In addition, because the data packets cached in network nodes are frequently replaced, matching data packets are often no longer present in a node when interest packets arrive, resulting in a decrease in the cache-hit ratio of routing nodes. The WAVE caching strategy caches data packets at nodes with a certain probability, which solves the problem of content cache redundancy to some extent, so its cache-hit ratio is higher than that of the LCE strategy. The CRS-BF and CRS-CF caching strategies have a higher cache-hit ratio because different data packet contents are cached according to the different centralities of ICN routing nodes, which avoids frequent replacement of data packets in the caches of nodes with higher centrality. When the value of δ was 0.9, the cache-hit ratios of the caching schemes were close to their maximum because popular information was especially concentrated at this point, and the performance of the caching strategies differed little. Figure 10 shows how the average network delay of the CRS-BF, CRS-CF, WAVE, and LCE caching strategies varied with the popularity value δ.

4) ANALYSIS OF NETWORK DELAY EXPERIMENT RESULTS OF CACHE METHOD
It can be seen from Figure 10 that the average network delay of the four caching strategies decreased as the δ value increased. As δ increased, the content of service requests became relatively concentrated, and the content cached in ICN routing nodes could satisfy most service requests. The number of route hops required for interest packets to obtain the information content was reduced, thus reducing the average network delay. The CRS-BF and CRS-CF caching strategies use a two-layer network structure caching strategy based on content popularity and node importance, so that the data packets cached at different levels of network nodes are replaced less frequently, more interest packets hit within the domain, the number of route request hops is reduced, and network transmission efficiency is improved. Therefore, compared with the WAVE and LCE strategies, the network delay is lower.
These caching strategies are compared in terms of cache-hit ratio, robustness, redundancy, and number of user requests. The results are shown in Table 3.
It can be seen from Table 3 that the cache robustness strategy used here showed little difference in cache performance compared with the WAVE mechanism [15], but the robustness of both CRS-BF and CRS-CF was better than that of WAVE. Compared with the LCE strategy [12], the two strategies showed little difference in robustness, but the advantages of this caching strategy were more obvious in cache performance. Therefore, the cache robustness strategy proposed in this paper can ensure the overall cache performance of the network and can also ensure good robustness when cache nodes fail in the network.

VI. CONCLUSION
In this paper, in order to solve the potential network reliability problem caused by cache node failures in existing information-centric networking, we proposed a cache robustness strategy based on content popularity and node importance. By selecting one cache node as a special backup cache node and caching important or popular content on both the general and the special backup cache nodes, the influence of node failures on network performance was prevented, and the survivability and cache robustness of the network were guaranteed. The simulation results showed that, compared with other cache methods, the proposed CRS caching strategy can ensure cache robustness, improve the cache-hit ratio, and reduce network delay. However, some problems still need further research and improvement:
i) The CRS caching strategy proposed in this paper could be improved through other methods, such as combining it with other cache methods (e.g., on-path caching), to further improve the overall cache efficiency of the system.
ii) In the CRS caching strategy proposed in this paper, the selection of core node sets and node backup methods was not ideal and can be optimized in this respect.
iii) In this paper, we mainly studied the ICN cache robustness strategy for SWIM and did not focus on the node cache replacement strategy. In the future, the node cache replacement strategy can be studied by considering the associations between nodes.
iv) The simulation experiment in this paper was carried out in a simulated SWIM environment; the simulation scenario differs from an actual network, and we did not compare with the real network performance of other network topologies (e.g., GEANT, GARR) [35]. In order to further verify the effectiveness and wide applicability of the proposed cache robustness strategy, the method needs to be tested in a real SWIM environment and on the real GEANT topology.