Exploring the Information Capacity of Remote Sensing Satellite Networks

With the increasing demand for geological and weather information, remote sensing satellite networks (RSSNs) play increasingly important roles in monitoring our planet. Because RSSNs differ significantly from traditional communication networks, the traditional communication capacity, which focuses only on the data transmission process, cannot adequately capture the service capability of RSSNs. To provide efficient guidelines for the deployment of RSSNs, in this paper we study a new capacity indicator, called information capacity, which takes into account the whole service process of RSSNs, including information acquisition, processing, storage, and transmission. Specifically, we first propose the formal definition of information capacity. Then, a new graph model called microscopic time-expanded graph (MTEG) is developed, which characterizes the intertwined impact of the observation, computational, storage and transmission resources on the service process of RSSNs. Based on this graph model, a mathematical framework is developed to compute the information capacity. Owing to the NP-hardness of the formulated problem, we decompose it into a flow optimization problem and an arc scheduling problem of the MTEG model, and then propose a Graph-based Information Capacity Solving (GICS) algorithm to efficiently solve the problem. Finally, simulation results highlight the necessity of studying the information capacity of RSSNs.


I. INTRODUCTION
Remote sensing is the technique of deriving information about objects or phenomena remotely, without physical contact with them [1]. Benefiting from their extensive coverage, remote sensing satellites have become important participants in earth observation, and the data they acquire are in great demand in the study of climate change, environmental protection, disaster control and so on. To meet the surging volume and real-time requirements of geological information, more and more remote sensing satellites have been launched to form constellations and networks in past decades [2]. Thus, the analysis and investigation of remote sensing satellite networks (RSSNs) has become a hot topic in the field of space research and design [3], [4].
The associate editor coordinating the review of this manuscript and approving it for publication was Antonino Orsino.
Network capacity is one of the most prominent metrics to characterize network performance. The capacity of terrestrial networks has been widely studied [5]-[7] since the seminal work of Kumar in 2000 [8]. In these works, the capacity is described as the maximum achievable throughput of the network. Similarly, in [9], [10] the authors study two-layered satellite networks, wherein the capacity is defined as the maximum achievable throughput of the network. In [11], [12], the authors investigate the communication capacity of space-ground networks, which is defined as the maximum total amount of data downloaded to the ground per unit time. In the above literature, the capacity defined based on throughput can well capture the service capability of networks. However, the throughput-based capacity cannot work well in RSSNs, due to the remarkable differences between RSSNs and traditional communication networks. Fig. 1 illustrates an example of the service process of RSSNs. First, remote sensing satellites acquire raw images with their imagers. Afterwards, the raw images are compressed by the compression unit. Then, the compressed data are transmitted to ground stations and finally sent to the Data Processing Center (DPC). After arriving at the DPC, the compressed images are reconstructed. At last, the reconstructed image is sent to the user. Based on this example, two major differences between RSSNs and traditional communication networks can be identified.
• Besides the communication process, service capability of RSSNs is also impacted by information acquisition, processing and storage processes.
• Because of the effect of computational resources (e.g., the compression unit), in RSSNs the amount of effective information delivered to the users per unit time characterizes performance better than throughput does.
Therefore, a new metric should be considered to characterize the service capability of RSSNs. As shown in Fig. 2, the new metric should cover the entire service process of RSSNs, which includes information acquisition, processing, storage, and transmission, rather than only taking information transmission into account as the traditional throughput-based capacity does. For convenience in the following discussion, the traditional throughput-based capacity is referred to as communication capacity, and the new indicator is referred to as information capacity hereinafter.
With the development of information technologies, the joint management of heterogeneous resources, especially the 3C resources (i.e., Computing, Caching, and Communication resources), has become a new trend. Currently, some works focus on the joint optimization of multiple information processes with different resources (especially information processing, storage and transmission) in terrestrial networks, such as the internet of things [13], information centric networks [14] and mobile edge computing networks [15], [16]. However, it is still challenging to study the information capacity of RSSNs, for several reasons: 1) Although many works have focused on the capacity of networks, an indicator that can efficiently characterize the capability of RSSNs to provide effective information to users is still absent. 2) On account of the compressing and reconstructing effect of computational resources, the data flows of RSSNs are no longer conserved, which raises challenges for the analysis of information capacity. 3) The interactions among the information acquisition, processing, storage and transmission processes are complex. For example, the observation and transmission processes of an agile remote sensing satellite may conflict with each other when they require different attitudes of the satellite platform. Therefore, the service process of RSSNs cannot be resolved into independent parts that are analyzed separately, and it is essential to represent the observation, computational, storage and transmission resources jointly and to characterize their intertwined impact on the service process.
In this paper we explore the information capacity of RSSNs. We first propose the formal definition of information capacity. Then, we extend the traditional time-expanded graph by modeling remote sensing satellites at a microscopic level and adding virtual arcs and a virtual vertex. A new graph model called microscopic time-expanded graph (MTEG) is developed, which characterizes the intertwined impact of the information acquisition, processing, storage and transmission resources on the service process of RSSNs. Based on this graph model, we develop a mathematical framework to compute the information capacity. Owing to the NP-hardness of the formulated problem, we decompose it into a flow optimization problem and an arc scheduling problem of the MTEG model, and then propose a Graph-based Information Capacity Solving (GICS) algorithm to solve the problem in polynomial time. Simulation results highlight the necessity of studying the information capacity of RSSNs.
The main contributions of this paper are as follows:
• To evaluate the service capability of RSSNs, we formally define a new indicator, information capacity, which involves the entire service process.
• MTEG is developed to capture the time-varying coordination relationship among the observation, computational, storage and transmission resources and their intertwined impact on the service process in RSSNs.
• We formulate the problem of solving information capacity of RSSNs. Since the formulated problem falls in the category of mixed-integer nonlinear program (MINLP), we decompose it into a flow optimization problem and an arc scheduling problem based on the MTEG model. Then, a Graph-based Information Capacity Solving (GICS) algorithm is proposed to efficiently solve the problem.
• Extensive simulation results are provided to validate the effectiveness and necessity of the analysis of information capacity in RSSNs. The impacts of different network resources on the information capacity are evaluated.
The remainder of this paper is organized as follows. Section II introduces the RSSN system model under consideration and proposes the definition of information capacity of RSSNs. In Section III, we develop a graph model called microscopic time-expanded graph (MTEG) and formulate the problem of solving the information capacity of RSSNs. Section IV decomposes the problem and then proposes a graph-based algorithm to solve the information capacity of RSSNs in polynomial time. The performance evaluation by simulations is presented in Section V, followed by concluding remarks in Section VI.

A. NETWORK MODEL
Consider a remote sensing satellite network $\mathcal{N}$ (as illustrated in Fig. 1), which is comprised of:
• A set of remote sensing satellites, denoted by $OS = \{os_1, os_2, \ldots, os_n, \ldots\}$. The payloads of each remote sensing satellite include an imager, an image compression unit, a solid-state mass storage and a transceiver.
• A data processing center (DPC), denoted by $dc$.
During the planning horizon $[0, T]$, there is a set of earth observation missions to be planned, denoted by $OM = \{om_1, om_2, \ldots, om_n, \ldots\}$. A mission $om_j$ can be described by a 3-tuple $[ob_j, cr_j, dy_j]$, where $ob_j$ denotes the observation target of mission $om_j$. Let $OB = \{ob_1, ob_2, \ldots, ob_n, \ldots\}$ denote the set of observation targets in $\mathcal{N}$. $cr_j$ denotes the maximum compression ratio that can be adopted by $om_j$, and $dy_j$ is the tolerable delay of $om_j$, i.e., the maximum time allowed from image acquisition to arrival at the DPC.
The RSSN completes the missions according to the plans provided by the Mission Operation Center (MOC). Fig. 2 illustrates the implementation process of a mission. First, remote sensing satellites acquire the raw images of the observation target when it is in the observable range of the onboard imagers. Then, the raw images are encoded by the compression unit. Because of the orbital movement, the link between a remote sensing satellite and a ground station can be established only when the satellite moves into the coverage of the ground station. That is to say, the compressed data are delivered to the ground station via the store-carry-forward paradigm [18]. Therefore, the output of the compression unit is either stored in the onboard storage or transmitted directly. After arriving at the ground station, the compressed data are sent to the DPC and reconstructed. Finally, the reconstructed image is sent to the user.
Therefore, a mission plan should specify the remote sensing satellites and observation duration, the compression level, the downlinks and the transmission duration for each mission. Moreover, to guarantee the QoS requirement of each mission, a feasible mission plan $p$ should select a compression ratio no larger than $cr_j$ for mission $om_j$ and ensure that all the image data are sent to the DPC within $dy_j$. Let $P$ denote the set of feasible mission plans.

B. ONBOARD COMPUTATIONAL RESOURCES
Being capable of solving the "bandwidth vs. data volume" dilemma of modern spacecraft, the onboard computational resource plays an increasingly vital role in the service of RSSNs. To be specific, with the growth of the spatial resolution and swath of satellite imaging payloads, the ever-increasing data acquisition rates impose a heavy burden on the limited onboard communication and storage resources [19], [20]. For example, the acquisition data rate of CSG (COSMO-SkyMed Second Generation) satellites can be up to 2 × 1.2 Gbit/s, while the data downlink rate and onboard storage capacity are only 2 × 260 Mbit/s and 1500 Gbit, respectively [21]. By compressing mission data, the onboard computational resources compensate for the limited onboard communication and storage resources, and thus effectively improve the service capability of RSSNs. However, few works on network capacity consider the effect of computational resources. Therefore, in this subsection we propose a model of the onboard computational resource.
With the development of onboard image compression algorithms and hardware design [22]-[24], the image compression module has become an indispensable payload of remote sensing satellites. For instance, the image compression module of CSG satellites can provide seven different compression ratios: 10:10 (uncompressed), 10:6, 10:5, 10:4, 10:3, 10:2 and 10:1. There are two classes of image compression methods: lossless and lossy [25]. With lossless image compression, the reconstructed image is exactly the same as the original one, without any information loss. In contrast, lossy image compression reconstructs the image with a varying degree of information loss. Generally, for a given lossy image compression method, the larger the compression ratio is, the more information is lost through compression.
In this paper, we assume each remote sensing satellite is equipped with an image compression module with processing rate $R_p$, which can provide $Q$ levels of lossy compression. Let $\phi_i$ ($1 \le i \le Q$) denote the compression ratio of the $i$th level. Consider an $M \times N$ image with $\sigma$-bit data quantization, where $M$ and $N$ are the numbers of rows and columns of pixels in the image. The original volume of this image is $MN\sigma$ bits. After compression at the $i$th level, only $MN\sigma/\phi_i$ bits of compressed data need to be downloaded to the ground. At the DPC, an image of the same size as the original one can be reconstructed.
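The volume relation above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the compression ratio $\phi_i$ is the "10:d" figure quoted for the CSG module expressed as a number no smaller than 1; the image size and the helper name `compressed_bits` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

# Compression ratios phi_i of the seven CSG levels quoted above (10:10 ... 10:1).
CSG_RATIOS = [Fraction(10, d) for d in (10, 6, 5, 4, 3, 2, 1)]

def compressed_bits(rows, cols, sigma, phi):
    """Volume M*N*sigma/phi (bits) left after compressing at ratio phi >= 1."""
    if phi < 1:
        raise ValueError("compression ratio must be at least 1")
    return Fraction(rows * cols * sigma) / phi

# Hypothetical 1000 x 1000 image with 8-bit quantization.
raw = compressed_bits(1000, 1000, 8, CSG_RATIOS[0])   # 10:10, uncompressed
half = compressed_bits(1000, 1000, 8, CSG_RATIOS[2])  # 10:5 level
```

Exact fractions are used so that ratios such as 10:6 or 10:3 do not introduce rounding error into the volume.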
The distortion of a reconstructed image can be evaluated by image quality measures, such as Maximum Difference (MD), Peak Mean Square Error (PMSE) and Normalized Mean Square Error (NMSE) [26]. Take NMSE as an example, which is defined as follows:
$$\mathrm{NMSE} = \frac{\sum_{i=1}^{M}\sum_{j=1}^{N} \left[ F(i,j) - \hat{F}(i,j) \right]^2}{\sum_{i=1}^{M}\sum_{j=1}^{N} \left[ F(i,j) \right]^2}$$
where $F(i,j)$ and $\hat{F}(i,j)$ denote the samples of the original and reconstructed images. In this paper, we use the average NMSE of the images compressed at the $i$th compression level, denoted by $ds_i$, to quantify the distortion caused by the $i$th compression level.
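As an illustration of how $ds_i$ could be estimated in practice, the sketch below computes the NMSE of one image pair and averages it over a set of pairs. It assumes the usual form of the measure (squared-error energy divided by the energy of the original image); the function names and the list-of-rows image representation are assumptions of this example, not part of the model.

```python
def nmse(original, reconstructed):
    """Normalized mean square error between two equally sized 2-D images."""
    num = 0.0
    den = 0.0
    for row_f, row_g in zip(original, reconstructed):
        for f, g in zip(row_f, row_g):
            num += (f - g) ** 2   # energy of the reconstruction error
            den += f ** 2         # energy of the original image
    if den == 0:
        raise ValueError("original image has zero energy")
    return num / den

def average_distortion(pairs):
    """Average NMSE over (original, reconstructed) pairs: an estimate of ds_i."""
    return sum(nmse(f, g) for f, g in pairs) / len(pairs)
```

A lossless reconstruction gives an NMSE of 0, and larger values indicate more information lost at that compression level.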

C. INFORMATION CAPACITY
In this subsection, we define a new metric called information capacity to capture the service capability of RSSNs. For the sake of convenience in comparison later, we first introduce the definition of communication capacity, before formally defining the information capacity.
Definition 1 (Communication Capacity [12]): The communication capacity of $\mathcal{N}$ is the maximum total amount of data that ground stations can receive from the satellites per unit time. It is expressed in terms of $\alpha_{ij}(t)$, which represents the availability of a link (the existence of a line-of-sight) between remote sensing satellite $os_i$ and ground station $gs_j$, and $r_{ij}(t)$, which denotes the data transfer rate between satellite $os_i$ and ground station $gs_j$.
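To make Definition 1 concrete, the sketch below evaluates one plausible reading of it: the time average over the planning horizon of the data that all available links can carry, with the integral of $\alpha_{ij}(t) r_{ij}(t)$ replaced by a sum over samples. This form is an assumption of the example rather than the expression in [12], and the sketch does not model antenna conflicts, so it over-counts when a satellite sees several ground stations at once.

```python
def communication_capacity(alpha, rate, dt, horizon):
    """Approximate (1/T) * sum_ij integral alpha_ij(t) r_ij(t) dt by a Riemann sum.

    alpha[(i, j)] and rate[(i, j)] are lists sampled every dt time units:
    alpha is 1 when the link is available and 0 otherwise, rate is in bits
    per time unit. horizon is the length T of the planning horizon.
    """
    total_bits = 0.0
    for link, avail in alpha.items():
        for a, r in zip(avail, rate[link]):
            total_bits += a * r * dt
    return total_bits / horizon
```

For example, a single link that is visible for half of the horizon at a constant rate yields half of that rate as capacity.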
As we have discussed in Section I, the major difference between communication capacity and information capacity is that communication capacity focuses only on the information transmission process, while information capacity should cover the entire mission implementation process, including information acquisition, processing, storage, and transmission. Because images are compressed and reconstructed during this process, the amount of data downloaded to the ground alone cannot reflect the service capability. To evaluate the information capacity of RSSNs, both the data volume and the quality of the images received by users should be taken into account. To this end, we define a metric, called effective data volume, to indicate the amount of valid information in a reconstructed image.
Definition 2 (Effective Data Volume): Consider an image which has been compressed under the $i$th level. After reconstruction, the effective data volume of this image is given in terms of $rd$, the data volume of the reconstructed image.
Thus, to measure the capability of RSSNs to provide effective information to the users, the definition of information capacity is given as follows:
Definition 3 (Information Capacity): The information capacity of $\mathcal{N}$ is the maximum total amount of effective data that the DPC can provide to the users per unit time, where the maximum is taken over $P$, the set of all feasible mission plans, which is determined by the given RSSN and the mission set $OM$. Here $rd_j(p)$ denotes the amount of data of $om_j$ that can be reconstructed by the DPC under mission plan $p$, and $cl_j(p)$ denotes the compression level chosen by mission $om_j$ under mission plan $p$.
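Definitions 2 and 3 can be illustrated on a toy instance. The sketch below assumes that the effective data volume discounts the reconstructed volume by the average distortion of the chosen level, i.e. $rd \cdot (1 - ds_i)$; this form, and the brute-force search over an explicit list of plans, are assumptions made for illustration and are not the solution method developed later in the paper.

```python
def effective_volume(rd, ds):
    """Assumed form: reconstructed volume rd discounted by average distortion ds."""
    return rd * (1.0 - ds)

def information_capacity(plans, distortion, horizon):
    """Best total effective volume per unit time over an explicit list of plans.

    Each plan is a list of (rd_j, cl_j) pairs, one per mission, where rd_j is
    the reconstructed data volume and cl_j the chosen compression level.
    distortion maps a level to its average distortion ds.
    """
    best = 0.0
    for plan in plans:
        total = sum(effective_volume(rd, distortion[cl]) for rd, cl in plan)
        best = max(best, total / horizon)
    return best
```

In this toy setting a more aggressive compression level can win despite its distortion, because it lets more reconstructed data reach the DPC within the horizon.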

III. ANALYTICAL FRAMEWORK OF RSSNs
In this section, we propose a graphical model which characterizes the cooperative and interactive relationships among multiple resources (including observation, computational, storage and transmission resources) during the service process of RSSNs. Then, the problem of solving the information capacity of RSSNs can be formulated into the multi-commodity flow problem under the resource conflict constraint in the graphical model.

A. MICROSCOPIC TIME-EXPANDED GRAPH
It is challenging to develop analytical frameworks for RSSNs due to their dynamic characteristics and the intertwined effect of the heterogeneous resources (e.g., observation, computational, storage and communication resources) on the information traversing process. Although time-space graph models (e.g., the time-expanded graph (TEG) [27] and its extended versions [28], [29]) can be employed to model the dynamic and intertwined properties of resources, the effect of computational resources is still not taken into account. An intuitive way to tackle this problem is to modify the above graph models: when the compression condition is satisfied, reduce the value of flows after they pass through the vertices representing satellites with onboard computational resources. Nevertheless, this method breaks the flow conservation condition of the graph model, so that classical network flow theory can no longer be employed in the analysis of information capacity. Therefore, a more sophisticated graph model is required, which not only incorporates the effect of computational resources but also retains the analytical characteristics of the traditional time-expanded graph. To this end, we further extend the traditional time-expanded graph, with the following main modifications:
• To model the onboard compression process, we represent remote sensing satellites in a more microscopic way. To be specific, each remote sensing satellite is decomposed into three parts (as shown in Fig. 3(a)): the imager, the compression unit and the storage-transmitter part, and each part is represented by a vertex in the graphical model.
• To guarantee the flow conservation condition, we add a virtual vertex (i.e., $v_r$ in Fig. 3(b)) into the graph model, which connects to the vertices representing compression units by virtual arcs. With the aid of this virtual vertex and these virtual arcs, the data compression process can be modeled by a flow splitting process.
Since this extended graph model represents satellite nodes from a more microscopic point of view than the traditional time-expanded graph, we refer to it as the microscopic time-expanded graph (MTEG). Fig. 3(b) illustrates the MTEG of the example RSSN in Fig. 1. The MTEG, denoted by $G_K$, is a directed graph composed of $K$ layers. To construct the MTEG, we first divide the planning horizon $[0, T)$ into $K$ slots, each with duration $\tau$. Then, a snapshot is extracted from each slot to form the corresponding layer in the MTEG, which represents the topology during the slot. Note that although they are used to model dynamic networks, MTEGs are static graphs, i.e., all the arcs therein are static. In other words, with the MTEG, an RSSN with continuous topology evolution can be approximated by a network whose topology is static during each slot and only changes at slot transitions.
There are two kinds of vertices in $G_K$: ordinary vertices and virtual vertices, i.e., $V = V_o \cup V_v$. The ordinary vertices correspond to the temporal replicas of the observation targets, imagers, compression units, storage-transmitter parts, ground stations and DPC in $\mathcal{N}$, written $ob_i^k$, $im_i^k$, $ps_i^k$, $st_i^k$, $gs_i^k$ and $dc^k$ for the $k$th layer, respectively. The virtual vertex $v_r$ is a virtual sink of the redundant information which is removed by compression.
There are four kinds of arcs in $G_K$: observation arcs, link arcs, storage arcs and process arcs (represented by the green, blue, red and yellow lines in Fig. 3(b), respectively).
The observation arcs, whose set is denoted by $A_{ob}$, model the opportunities for remote sensing satellites to acquire mission data from observation targets. The capacity of observation arc $(ob_i^k, im_j^k) \in A_{ob}$ is the maximum amount of data that can be acquired by remote sensing satellite $os_j$ from observation target $ob_i$ in the $k$th slot, which depends on $r_{im}$, the data acquisition rate of remote sensing satellites.
The link arcs represent the communication opportunities between remote sensing satellites and ground stations, collected in $A_{og}$, and the links from ground stations to the DPC, collected in $A_{gd}$. The capacity of link arc $(st_i^k, gs_j^k) \in A_{og}$ is the maximum amount of data that can be transmitted by link $(st_i, gs_j)$ in the $k$th slot, which depends on $r_{ij}(t)$, the data rate of link $(st_i, gs_j)$. In addition, since ground stations and the DPC are connected by high-speed wired links, we set the capacity of arc $(gs_i^k, dc^k) \in A_{gd}$ to be infinite.
The storage arcs model the storage capability of satellites, ground stations, and the DPC. The capacity of a data storage arc is determined by $sg(v_i)$, the storage capacity of $v_i$. In addition, as ground stations and the DPC are always equipped with mass storage, their storage capacities are assumed to be infinite.
The processing arcs represent the compression capability of the onboard image compression unit. According to their relationship with the compression vertices (i.e., $ps_i^k$ in the MTEG), there are three kinds of processing arcs, whose sets are $A_{ip}$, $A_{pt}$ and $A_{pv}$. Arcs in $A_{ip}$ model the connections for raw image data passing from the onboard imager to the compression unit.
The capacity of process arc $(im_i^k, ps_i^k) \in A_{ip}$ is set to be the maximum amount of raw image data that can be processed by the compression unit in a slot, i.e., $R_p \tau$, where $R_p$ is the processing rate of the compression unit. Arcs in $A_{pt}$ model the process in which compressed data flow from the onboard compression units to the storage and transceiver parts. Arcs in $A_{pv}$ represent the virtual process in which redundant information removed by the compression unit moves to the virtual vertex. The capacities of the processing arcs in $A_{pt} \cup A_{pv}$ are set to be infinite.
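The per-slot arc capacities described above can be sketched numerically. The example below assumes that an observation or link opportunity is given as a single visibility window, that rates are constant over the window, and that the capacity of such an arc is the rate multiplied by the visible time inside the slot; these simplifications and all function names are assumptions of the sketch. Slots are indexed from 0 here for convenience.

```python
def overlap(window, slot_start, slot_end):
    """Length of the intersection of a visibility window with a slot."""
    start, end = window
    return max(0.0, min(end, slot_end) - max(start, slot_start))

def observation_capacity(window, k, tau, r_im):
    """Assumed form: acquisition rate times visible time inside slot k (0-based)."""
    return r_im * overlap(window, k * tau, (k + 1) * tau)

def link_capacity(window, k, tau, rate):
    """Link-arc capacity for a constant data rate over the contact window."""
    return rate * overlap(window, k * tau, (k + 1) * tau)

def process_capacity(tau, r_p):
    """Process-arc capacity: raw data the compression unit handles in one slot."""
    return r_p * tau
```

A window that does not intersect a slot gives a capacity of zero, which corresponds to the arc being absent from that layer.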
Through the compression vertices, the virtual vertex and the processing arcs, the MTEG can model the image compression process as a flow splitting process. That is, the flow from an imager vertex splits into two parts after passing through a compression vertex. The first part passes through the arcs in $A_{pt}$ and represents the effective data retained by the compression process. The second part goes to the virtual vertex through the arcs in $A_{pv}$ and represents the redundant information removed by the compression process. In this way, the MTEG not only models the effect of computational resources, but also keeps the flow conservation condition of network flow theory.
Thus, the MTEG can model the intertwined effect of the observation, computational, storage and communication resources on the mission completion processes of RSSNs. More specifically, the flows in the MTEG represent mission completion processes (i.e., how the observation, computational, storage and communication resources are scheduled to complete the mission) in $\mathcal{N}$. Take the flow $f$ in Fig. 3(b) as an example: it represents that the remote sensing satellite $os_1$ acquires $x$ bits of raw image data from observation target $ob_2$ with imager $im_1$, and compresses the raw image under the $i$th level during $[0, \tau)$. After compression, $\frac{\phi_i - 1}{\phi_i} x$ bits are removed. Then, the remaining $x/\phi_i$ bits of data are stored in $os_1$ in $[\tau, 2\tau)$, and finally delivered to $dc$ via $gs_1$ in $[2\tau, 3\tau)$. Therefore, the mission scheduling problem in $\mathcal{N}$ can be formulated as a multi-commodity flow problem in the MTEG.
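The flow splitting at a compression vertex can be verified with a small calculation. The sketch below follows the example flow above: of $x$ raw bits, $x/\phi_i$ continue toward storage and $(\phi_i - 1)x/\phi_i$ go to the virtual vertex, so the inflow equals the total outflow. The function name `split_flow` and the numbers are hypothetical.

```python
from fractions import Fraction

def split_flow(x, phi):
    """Split x raw bits at a compression vertex for ratio phi >= 1.

    Returns (kept, redundant): kept follows an A_pt arc toward storage,
    redundant follows an A_pv arc to the virtual vertex v_r.
    """
    kept = Fraction(x) / Fraction(phi)
    redundant = Fraction(x) * (Fraction(phi) - 1) / Fraction(phi)
    return kept, redundant

# Hypothetical flow of 1200 bits compressed at ratio 4.
kept, redundant = split_flow(1200, 4)
```

With a ratio of 1 (uncompressed) the whole flow is kept and nothing is sent to the virtual vertex.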

B. PROBLEM FORMULATION
As we can observe from the definition, the information capacity of RSSNs can be obtained by finding the mission plan which maximizes the amount of effective data output by the DPC. Based on the correspondence between mission completion processes and the flows of the MTEG, we can formulate the problem as a multi-commodity flow problem. More specifically, we first represent the mission execution process (i.e., the information acquisition, processing, storage and transmission process) of each mission as flows in the MTEG. For each mission $om_n$, to guarantee that the execution process satisfies the tolerable delay requirement $dy_n$, the corresponding flows should pass through no more than $dy_n/\tau$ layers of the MTEG. Therefore, the set of alternative execution processes of mission $om_n$ is modeled by the set of flows
$$F_n = \{ f \mid ob_n^k \to \{v_r, dc^l\},\ 0 \le l - k \le dy_n/\tau - 1 \} \tag{16}$$
where $ob_n^k \to \{v_r, dc^l\}$ denotes the flow originating from $ob_n^k$ and destined for $v_r$ and $dc^l$ in the MTEG. Note that flow $ob_n^k \to \{v_r, dc^l\}$ has two destination vertices; we refer to $v_r$ as the virtual destination and to $dc^l$ as the original destination. Let $F = \bigcup_{1 \le n \le |OM|} F_n$ denote the set of flows representing the alternative mission execution processes of all the missions.
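The delay condition in Eq. (16) restricts which source and destination layers a mission flow may connect. The sketch below enumerates the admissible layer pairs $(k, l)$ for one mission; rounding $dy_n/\tau$ down and numbering layers from 1 to $K$ are assumptions of this example.

```python
import math

def candidate_flows(num_layers, dy, tau):
    """List (k, l) layer pairs with 0 <= l - k <= dy/tau - 1 and 1 <= k, l <= K.

    k is the layer of the source vertex ob_n^k and l the layer of the original
    destination dc^l. dy/tau is rounded down, an assumption of this sketch.
    """
    span = math.floor(dy / tau)
    return [(k, l)
            for k in range(1, num_layers + 1)
            for l in range(k, min(k + span - 1, num_layers) + 1)]
```

For instance, with three layers and a tolerable delay of two slots, a flow that starts in layer 1 may end in layer 1 or 2 but not in layer 3.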
We introduce a set of boolean vectors $y_n = (y_n^1, y_n^2, \ldots, y_n^Q)$ to indicate the compression level used by mission $om_n$. More specifically, if the $i$th compression level is employed by mission $om_n$, then $y_n^i = 1$; otherwise, $y_n^i = 0$. The compression ratio selected by $om_n$ can be expressed as $y_n \Phi^T$, where $\Phi = (\phi_1, \phi_2, \ldots, \phi_Q)$. Let $x(f)$ denote the value of flow $f$. The objective is to maximize the total effective data volume that can be provided by the DPC per unit time, which is expressed with the distortion vector $DS = (ds_1, ds_2, \ldots, ds_Q)$.
Hereinafter, we introduce the constraints on the flows in $F$.
Conservation Condition of Flows: As discussed in the previous subsection, with the introduction of the virtual vertex into the MTEG, the flows corresponding to mission execution processes satisfy the flow conservation condition, Eq. (18), where $s(f)$ and $d(f)$ denote the source and original destination vertices of flow $f$, respectively. The flow conservation constraint requires that, after the compression process at ratio $y_n \Phi^T$, the redundant part of the raw image data goes to the virtual vertex and only the compressed part needs to be delivered to the DPC.
Capacity Constraints: The capacity constraints model the effects of the capacities of the observation, processing, transmission and storage resources on the mission execution process: the total flow on each arc must not exceed the capacity of that arc.
Resource Conflict Constraints: There exist conflicts among the schedules of the same resource (or of different resources), on account of the limited service capability of the antenna/imager and the restriction of the satellite platform attitude. For example, because it uses a single-access antenna, a remote sensing satellite can communicate with only one ground station in a slot, even if there are multiple ground stations in its coverage range. To model this kind of conflict, we introduce a set of boolean variables $\delta(st_i^k, gs_j^k)$, whose value is 1 if link $(st_i, gs_j)$ is active in the $k$th slot and 0 otherwise; the conflicts of communication resources are then formulated as constraints on these variables. Similarly, to model the conflicts among observation resources, we introduce a set of boolean variables $\delta(ob_i^k, im_j^k)$, whose value is 1 if imager $im_j$ points at $ob_i$ in the $k$th slot and 0 otherwise, and constrain them in the same way. Moreover, there also exist conflicts between the schedules of communication and observation resources when they require different platform attitudes of the same remote sensing satellite. These conflicts are expressed through $\xi_L(ob_i^k, im_j^k)$, the set of communication resources that conflict with $(ob_i^k, im_j^k)$, and $\xi_O(st_i^k, gs_j^k)$, the set of observation resources that conflict with $(st_i^k, gs_j^k)$. The sets $\xi_L(ob_i^k, im_j^k)$ and $\xi_O(st_i^k, gs_j^k)$ are obtained by computing the attitude required by each observation/transmission schedule from the orbits of the satellites and the locations of the targets/ground stations.
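The single-access antenna example can be turned into a simple feasibility check on a set of active links. The sketch below assumes the communication conflict amounts to "at most one active link per satellite per slot" and represents each active link (a $\delta$ variable equal to 1) as a triple; the representation and the function name are assumptions of this example.

```python
from collections import Counter

def violates_single_access(active_links):
    """Return the (satellite, slot) pairs that have more than one active link.

    active_links is a set of (satellite, ground_station, slot) triples,
    i.e. the link arcs whose delta variable equals 1.
    """
    load = Counter((sat, slot) for sat, _gs, slot in active_links)
    return sorted(key for key, count in load.items() if count > 1)
```

An empty result means the schedule respects this particular conflict constraint; attitude conflicts between observation and transmission would need a separate check against the sets $\xi_L$ and $\xi_O$.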
Since no data are transmitted over non-active links, in the MTEG flows only pass through the link arcs which represent active links in the corresponding time slots; the same holds for the observation arcs.
Compression Constraints: For each mission $om_n$, only one compression level can be selected, i.e.,
$$\sum_{1 \le i \le Q} y_n^i = 1, \quad \forall\, 1 \le n \le |OM|. \tag{27}$$
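To guarantee the image quality requirement, the selected compression ratio of $om_n$ should satisfy
$$y_n \Phi^T \le cr_n, \quad \forall\, 1 \le n \le |OM|, \tag{28}$$
where $\Phi = (\phi_1, \ldots, \phi_Q)$ is the vector of available compression ratios. After being acquired, the raw images of mission $om_n$ are compressed at the compression unit with ratio $y_n \Phi^T$. In other words, each $f \in F_n$ should satisfy Eq. (29), where $ks(f)$ is the layer index of $s(f)$ in the MTEG.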
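The two compression constraints on the indicator vector $y_n$ can be checked directly. The sketch below assumes the selected ratio is the inner product of $y_n$ with the vector of available ratios and that it must not exceed the mission's maximum ratio $cr_n$; the function name and the example ratios are hypothetical.

```python
def check_compression_choice(y, ratios, cr_max):
    """Check a 0/1 vector y against the one-level and maximum-ratio constraints."""
    # Exactly one level must be selected, and entries must be boolean.
    if any(v not in (0, 1) for v in y) or sum(y) != 1:
        return False
    # Selected compression ratio: inner product of y with the ratio vector.
    selected = sum(v * phi for v, phi in zip(y, ratios))
    return selected <= cr_max
```

For a mission with $cr_n = 4$ and available ratios (1, 2, 5, 10), only the first two levels pass the check.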
In summary, considering all the constraints and the objective described above, the problem of maximizing the total effective data volume obtained by the DPC per unit time can be written as problem P1. In P1, $x(f)$ and $x(v_i^k, v_j^k, f)$ are continuous variables, and $y_n^i$ and $\delta(st_i^k, gs_j^k)$ are integer variables. Besides, the constraints in Eq. (18) and Eq. (29) are non-linear. Therefore, P1 is an MINLP (mixed-integer non-linear programming) problem [30], which is NP-hard in general [31].
It should be noted that although the RSSN model considered in this work is relatively simple, including only remote sensing satellites, ground stations and the data processing center, the MTEG and the analytical framework based on it can be readily extended to handle more complicated networks. For example, by adding vertices representing communication satellites (or data relay satellites) and drawing arcs for these vertices according to their visibility relationships, the MTEG can be extended to incorporate a communication satellite network (or a data relay system).

IV. GRAPH-BASED PROBLEM ANALYSIS AND SOLUTION
The difficulty of solving problem P1 optimally mainly comes from two sources. One is the product terms, which make P1 non-linear, and the other is the large number of integer variables. In this section, we first decompose problem P1 into two sub-problems on the MTEG graphical model, i.e., a flow optimization problem and an arc scheduling problem. Then, we develop algorithms to solve the two sub-problems, respectively. At last, a graph-based algorithm is proposed to calculate the information capacity by solving the two sub-problems iteratively.

A. PROBLEM ANALYSIS AND DECOMPOSITION
Observed from the perspective of both the mission completion procedure in the RSSN and the flows in the MTEG, the variables of problem P1 can be divided into three parts:
• Compression level indication variables $y$: boolean variables, which indicate the compression level selected for each mission. In the MTEG, $y_n^i$ determines the splitting ratio of mission flows.

B. TRANSFORMATION AND SOLUTION OF FLOW OPTIMIZATION PROBLEM
Given an arc schedule, the flow optimization problem (FOP) can be obtained by fixing the observation and communication resource scheduling variables δ in problem P1, which can be expressed as follows.
It can be observed that the obtained problem is still an MINLP, because of the integer variables $y$ and the product terms in the objective and constraints. Note that the only integer variables, $y$, result from the discrete selection of compression ratios. With the development of onboard image compression technology, the number of compression ratios that can be supported onboard is gradually increasing. When the adjustment of the onboard compression ratio tends to be continuous, the integer variables $y$ can be approximated by continuous variables. Therefore, to solve the flow optimization problem with high speed and low complexity, we first relax the integer compression level indication variables. In this case, $y_n^i$ can be considered as the selection weight of the $i$th compression ratio, and the selected compression ratio can be approximated by $y_n \Phi^T$. Thus, FOP can be transformed into an LP (Linear Programming) problem by replacing the product terms. Specifically, we consider the transformation
$$\hat{x}_n^i(f) = y_n^i\, x(f), \quad \forall f \in F_n,\ 1 \le i \le Q. \tag{30}$$
Based on this transformation, the objective can be reformulated as a linear function of $\hat{x}_n^1(f), \ldots, \hat{x}_n^Q(f)$. Combining Eq. (30) and Eq. (27), we have the following transformation:
$$\sum_{1 \le i \le Q} \hat{x}_n^i(f) = x(f), \quad \forall f \in F_n. \tag{31}$$
By substituting Eq. (30) and Eq. (31) into Eq. (18), the flow conservation constraint can be reformulated as Eq. (32). Similarly, the constraint in Eq. (29) can be reformulated as Eq. (33). In summary, by means of the transformations in Eq. (30) and Eq. (31), the flow optimization problem can be reformulated as a problem referred to as FOLP. FOLP is an LP (Linear Programming) problem, which can be solved in polynomial time. Based on the results $\hat{x}_n^i(f)$ of FOLP, the relaxed compression level indication variables $\tilde{y}$ can be obtained.
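The step from the LP solution back to a compression level can be sketched as follows. The example assumes that the transformed variables satisfy $\hat{x}_n^i(f) = y_n^i x(f)$, so that the relaxed weight of level $i$ is its share of the mission's total transformed flow, and that the level is then chosen by rounding the weight-averaged level index; both the normalization and the rounding rule are one reading of the text rather than the paper's exact expressions.

```python
def relaxed_weights(x_hat):
    """Recover relaxed weights from x_hat[i], the flow attributed to level i+1."""
    total = sum(x_hat)
    if total == 0:
        return [0.0] * len(x_hat)
    return [v / total for v in x_hat]

def rounded_level(weights):
    """Round the weight-averaged level index (levels numbered from 1).

    Python's round() rounds exact halves to the nearest even integer; the
    result is clamped to the valid range of levels.
    """
    mean_level = sum((i + 1) * w for i, w in enumerate(weights))
    return max(1, min(len(weights), round(mean_level)))
```

For a mission whose transformed flow is split 30 to 10 between levels 2 and 3, the weights are 0.75 and 0.25 and the rounded level is 2.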
Then, the compression level of each mission can be determined by rounding the sum of compression levels weighted by $\tilde{y}$. At last, we solve FOP with fixed y to obtain the mission flow variables x and the information capacity $C_{in}$. Algorithm 1 summarizes the detailed procedure of solving the flow optimization problem.

As we have discussed in Section III.B, the schedules of different observation and/or communication resources may conflict with each other due to the limited service capability of the antenna/imager and the restriction of the satellite platform attitude. To provide a conflict-free schedule in MTEG, we propose a conflict graph, denoted by CG, to model the conflict relationship among different resources. Fig. 4 depicts the conflict graph of the observation and communication resources shown in Fig. 3(b). Each node in the conflict graph represents a possible resource schedule, which corresponds to an arc in MTEG. For example, node $nd(ob_1^1, im_1^1)$ in Fig. 4 represents the schedule in which the observation resource of satellite $os_1$ (i.e., $im_1$) observes $ob_1$ in the 1st slot, which corresponds to arc $(ob_1^1, im_1^1)$ in Fig. 3(b). The edges in the conflict graph represent the conflicts between resource schedules. In other words, if two resource schedules conflict with each other, there exists an edge between the two corresponding nodes in CG. According to the source of the conflicts, the edges of the conflict graph can be divided into two categories:
• Resource service restriction edges: represent the conflicts caused by the limit on the number of services that an observation or communication resource can support at the same time. For example, edge $nd(st_1^1, gs_1^1) \leftrightarrow nd(st_1^1, gs_2^1)$ represents the conflict that, in the 1st slot, satellite $os_1$ can communicate with only one of the ground stations $gs_1$ and $gs_2$.
• Platform attitude restriction edges: represent the conflicts resulting from the different platform attitude requirements imposed on a satellite. For example, edge $nd(ob_2^3, im_1^3) \leftrightarrow nd(st_1^3, gs_2^3)$ represents the conflict that, in the 3rd slot, satellite $os_1$ can either observe $ob_2$ or communicate with ground station $gs_2$, but not both, due to the different platform attitude requirements of $os_1$.

Similar to MTEG, the conflict graph is a layered graph. Moreover, the layers of the conflict graph are independent of each other, because the edges only connect nodes in the same layer. The resource schedules contained in an independent set$^4$ of the conflict graph are conflict-free, since there exists no edge between the nodes of an independent set in CG. Therefore, by sequentially finding an independent set for each layer of the conflict graph, a conflict-free schedule of observation and communication arcs can be obtained.
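The layer-wise use of the conflict graph can be sketched in a few lines of Python. This is only an illustrative sketch: the node labels, edge list and weights below are hypothetical, and the exhaustive search is a stand-in for whatever exact or heuristic solver is actually used. It relies on the observation in the text that each layer is small, so enumerating subsets is affordable.

```python
from itertools import combinations


def max_weight_independent_set(nodes, edges, weight):
    """Exhaustively search one conflict-graph layer for the heaviest
    set of nodes that contains no pair joined by a conflict edge."""
    conflicts = {frozenset(e) for e in edges}
    best, best_weight = (), 0.0
    for size in range(1, len(nodes) + 1):
        for subset in combinations(nodes, size):
            # Skip subsets that contain at least one conflicting pair.
            if any(frozenset(p) in conflicts for p in combinations(subset, 2)):
                continue
            total_weight = sum(weight[n] for n in subset)
            if total_weight > best_weight:
                best, best_weight = subset, total_weight
    return set(best), best_weight


# Hypothetical single layer: one observation arc and two downlink arcs.
layer_nodes = ["ob1-im1", "st1-gs1", "st1-gs2"]
layer_edges = [
    ("st1-gs1", "st1-gs2"),  # resource service restriction (one antenna)
    ("ob1-im1", "st1-gs2"),  # platform attitude restriction
]
layer_weight = {"ob1-im1": 1.0, "st1-gs1": 1.0, "st1-gs2": 1.5}

schedule, total = max_weight_independent_set(layer_nodes, layer_edges, layer_weight)
```

Because the layers share no edges, calling this function once per layer and taking the union of the returned sets yields a conflict-free schedule for the whole graph.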

2) INITIALIZATION OF OBSERVATION AND COMMUNICATION ARC SCHEDULE
An IASC (Initial Arc Schedule Construction) algorithm is proposed to construct a conflict-free schedule of observation and communication arcs through the conflict graph, before the FOP and ASP problems are solved alternately. In order to accelerate convergence, an effective initial arc schedule should be constructed so that the subsequent flow optimization problem yields a good solution. To achieve this goal, we strike a balance between the scheduled observation and communication resources to avoid wasting the scheduled resources. To this end, the main idea of the IASC algorithm is to set the weights of the nodes according to the ratio of the capacities of the scheduled observation and communication arcs, and then find the maximum weight independent set in the conflict graph layer by layer.
$^4$ An independent set of a graph G is a set of nodes no two of which are adjacent in G [32].

Algorithm 2 details the complete procedure of the IASC algorithm, which consists of two stages. In the first stage, the compression level r is initialized as follows.
Then, we search for the observation and communication resources that have the opportunity to complete a mission, and set their weights in the conflict graph to 1.
In the second stage, the arc schedule is solved through the conflict graph layer by layer. For each layer, we first set the weight of the nodes corresponding to observation arcs to ρ, which is computed from $c_o$ and $c_c$, the total capacities of the scheduled observation and communication arcs, respectively. Weight ρ is designed to strike a balance between the scheduled observation and communication resources so as to achieve a high resource utilization. For example, when $c_o > c_c \cdot \phi_r$, the capacity of the scheduled observation resources is larger than the amount of effective data that the scheduled communication resources can transmit, even with the help of data compression. This means that part of the observation resources in the preceding layers cannot be fully utilized. In this case, the weights of the observation arcs are set to ρ < 1 to reduce the share of scheduled observation resources in the succeeding layers. Second, we find the maximum weight independent set$^5$ of the current layer. After all the layers have been traversed, we obtain the independent set IS of the conflict graph by combining the independent sets of all layers, and thus obtain the observation and communication resource scheduling variables.
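The balancing role of ρ can be illustrated with a short Python sketch. The functional form used here is an assumption made for illustration only: it is chosen to satisfy the one property stated in the text (ρ < 1 exactly when $c_o > c_c \cdot \phi_r$) and may differ from the expression actually used in Algorithm 2. The function name and the sample values are likewise hypothetical.

```python
def observation_weight(c_o, c_c, phi_r):
    """Assumed balancing weight for observation nodes (illustrative only).

    c_o   -- total capacity of the observation arcs scheduled so far
    c_c   -- total capacity of the communication arcs scheduled so far
    phi_r -- compression ratio at the initialized compression level r
    """
    downlink_budget = c_c * phi_r  # raw data the downlinks can absorb
    if c_o <= downlink_budget:
        return 1.0  # observation is not the surplus resource
    # Observation capacity exceeds what can be delivered: damp its weight.
    return downlink_budget / c_o


rho_balanced = observation_weight(10.0, 10.0, 2.0)  # downlink not saturated
rho_excess = observation_weight(40.0, 10.0, 2.0)    # observation oversupplied
```

With such a weight, layers processed after an observation surplus has built up favor communication arcs in the maximum weight independent set, which is the balancing behavior the text describes.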

3) UTILIZATION ORIENTED ARC SCHEDULING
A UOASU (Utilization Oriented Arc Schedule Update) algorithm is proposed to improve the arc schedule based on the solution of the flow optimization problem. As shown in Algorithm 3, we first search for the set of scheduled resources with low utilization, denoted by $A_{LU}$. Then, for each arc in $A_{LU}$ (without loss of generality, denoted by $(v_i^k, v_j^k)$), the weights of the nodes in the same layer of CG are reset to improve the opportunities of other resources to be scheduled. More specifically, for the node corresponding to the arc in $A_{LU}$, its weight is reset to its utilization. For the other nodes, the new weight depends on a parameter $0 < \alpha < 1$ and on a random variable uniformly distributed in (0, 1). After the weights of all nodes in the $k$-th layer have been reset, we find the maximum weight independent set of the $k$-th layer of CG, and then update the corresponding arc scheduling variables in δ.

$^5$ Note that the size of each layer in the conflict graph is small because of the limited number of satellites in RSSNs. Therefore, although the maximum weight independent set problem is NP-complete [33], we can solve it within limited time through either enumerative methods (e.g., branch-and-bound algorithms [34]) or approximation heuristic algorithms [35].

Algorithm 2 (steps 4 to 16)
 4: … $\{(1 - ds_i)\phi_i\}$;
 5: for $1 \le n \le |OM|$ do
 6:   for each $(ob_n^k, im_i^k) \in A$ do
 7:     if $\exists (st_i^l, gs_j^l) \in A$ and $l - k \le dy_n/\tau$ then
 8:       $w(ob_n^k, im_i^k) \leftarrow 1$;
 9:       $w(st_i^l, gs_j^l) \leftarrow 1$, $\forall (st_i^l, gs_j^l) \in A$ and $l - k \le dy_n/\tau$;
10:     end if
11:   end for
12: end for
13: for $1 \le k \le K$ do
14:   …
15:   find maximum weight independent set of $CG_k$;
16:   …

Algorithm 3 (steps 8 to 9)
 8: find the maximum weight independent set of the $k$-th layer of CG, and then update the corresponding arc scheduling variables in δ;
 9: end for

D. GRAPH-BASED INFORMATION CAPACITY SOLVING ALGORITHM
Based on the decomposition of P1 and the algorithms for its subproblems, a graph-based algorithm is proposed to solve for the information capacity. The outline of the GICS (Graph-based Information Capacity Solving) algorithm is illustrated in Algorithm 4. First, an initial arc schedule is constructed by the IASC algorithm. Afterwards, the arc scheduling problem and the flow optimization problem are solved iteratively by the UOASU algorithm and the SFOP algorithm, respectively. More specifically, in each iteration, if the new solution is better than the current one, it is accepted. Otherwise, the new solution is accepted with probability $e^{(C_{in} - C_{in}^0)/(\beta^t \cdot M_R)}$, as illustrated in lines 12-14 of Algorithm 4, wherein $0 < \beta < 1$ and $M_R$ is a scaling parameter. It should be noted that the acceptance probability decreases as the gap $C_{in}^0 - C_{in}$ between the current solution and the new one grows, and as the iteration number $t$ increases. When $t$ is small, the acceptance probability is large, to avoid being stuck in a local optimum at early iterations. When $t$ gets large, the acceptance probability becomes small, to accelerate convergence.

Algorithm 4 (lines 7 to 18)
 7: $\delta \leftarrow$ UOASU$(G_K(V, A), OM, x^0, \delta^0)$;
 8: $(C_{in}, x, y) \leftarrow$ SFOP$(G_K(V, A), OM, \delta)$;
 9: if $C_{in} > C_{in}^0$ then
10:   $(C_{in}^0, x^0, y^0, \delta^0) \leftarrow (C_{in}, x, y, \delta)$;
11: else
12:   generate a random number $\sim U(0, 1)$;
13:   if $e^{(C_{in} - C_{in}^0)/(\beta^t \cdot M_R)}$ exceeds the random number then
14:     $(C_{in}^0, x^0, y^0, \delta^0) \leftarrow (C_{in}, x, y, \delta)$;
15:   end if
16: end if
17: $t \leftarrow t + 1$;
18: end while
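The acceptance rule of the iterative procedure can be sketched as follows. This is a minimal illustration in Python rather than the authors' implementation: the function and argument names are hypothetical, and the exponent is read from the text as $(C_{in} - C_{in}^0)/(\beta^t \cdot M_R)$.

```python
import math
import random


def acceptance_probability(c_new, c_cur, t, beta, m_r):
    """Probability of replacing the current solution by the new one.

    A strictly better solution is always accepted; a worse one is
    accepted with probability exp((c_new - c_cur) / (beta**t * m_r)),
    which shrinks as the gap grows and as the iteration index t grows.
    """
    if c_new > c_cur:
        return 1.0
    return math.exp((c_new - c_cur) / (beta ** t * m_r))


def accept(c_new, c_cur, t, beta, m_r, rng=random):
    """Draw a uniform number in [0, 1) and apply the acceptance test."""
    return acceptance_probability(c_new, c_cur, t, beta, m_r) > rng.random()


# Hypothetical values: the same capacity loss early and late in the run.
p_early = acceptance_probability(4.0, 5.0, t=0, beta=0.9, m_r=10.0)
p_late = acceptance_probability(4.0, 5.0, t=50, beta=0.9, m_r=10.0)
```

Because $0 < \beta < 1$, the factor $\beta^t$ shrinks geometrically, so the same capacity loss is tolerated far less often in late iterations than in early ones.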

V. SIMULATIONS
In this section, simulation results are presented to validate our analysis and investigate the information capacity of RSSNs. We construct a baseline scenario which consists of 20 remote sensing satellites, 6 ground stations and one data processing center. The remote sensing satellites are located in four sun-synchronous orbits at an altitude of 619 km.

A. INFORMATION CAPACITY WITH VARYING CAPABILITY OF DIFFERENT RESOURCES
In this subsection, we investigate the impacts of different resources on information capacity, and show the difference between information capacity and traditional communication capacity [12]. To evaluate the impact of the observation resource on information capacity, we vary the resolution of the imagers from 10 m to 1 m and show the information capacity when the tolerable delay is 0.5 h, 1 h and 2 h in Fig. 5. As expected, the communication capacity does not vary with the capability of the observation resource, because it does not take into account the information acquisition process. In comparison, the information capacity is non-decreasing with the capability of the observation resource. To be specific, when the tolerable delay is small, the information capacity increases nearly linearly with the data acquisition rate. This is because, with a small tolerable delay, only the data acquired from observation targets near ground stations have the opportunity to be downloaded, which results in the underutilization of the downloading links. In this case, the larger the data acquisition rate is, the more data can be downloaded. When the tolerable delay is large, the growth of information capacity with the capability of the observation resource slows down until it stagnates. This is because in this case the data from many more observation targets can be delivered through the store-carry-forward paradigm. Therefore, when the data acquisition rate is too large, the information capacity is restricted by the capacity of other resources.

$^6$ In practical applications, the principle for deciding the value of τ is to strike a balance between the accuracy of the model and the computational complexity.

Fig. 6 depicts the relationship between the information (and communication) capacity and the capability of the computational resource. As can be observed, the communication capacity does not vary with the capability of the computational resource, since it does not consider the information processing process.
When the tolerable delay is small, the information capacity increases slowly with the compression ratio. This is because, with a small tolerable delay, the transmission and storage capability is large enough for most images without compression. When the tolerable delay is large, the growth of information capacity is nearly linear in the compression ratio. In this case, the amount of acquired data is much larger than the transmission capacity. Therefore, the larger the compression ratio is, the more effective information can be downloaded.

To evaluate the impact of the communication resource on network capacity, we vary the transmission rate from 20 Mbps to 70 Mbps and plot the communication capacity and the information capacity when the tolerable delay is 0.5 h, 1 h and 2 h in Fig. 7. As expected, the communication capacity increases linearly with the capability of the communication resource. In comparison, the information capacity grows nearly linearly with the capability of the communication resource at first, and then saturates because of the limitation of other resources. More specifically, when the tolerable delay is small, the information capacity hardly changes with the transmission rate. This is because, with a small tolerable delay, only the data acquired from observation targets near ground stations have the opportunity to be downloaded, so the downloading capacity is always larger than the amount of acquired data. When the tolerable delay is large, the information capacity grows nearly linearly with the capability of the communication resource over a longer range, and then tends to saturation. This is because in this case a large amount of acquired data can be delivered via the store-carry-forward paradigm, so a large downloading capability is required. Therefore, the larger the transmission rate is, the more effective information can be downloaded.
At last, we investigate the impact of the storage resource on information capacity, and show the difference between information capacity and traditional communication capacity. Fig. 8 depicts the information capacity and communication capacity when the storage capacity varies from 30 Gbit to 180 Gbit. As can be observed, the communication capacity does not vary with the capability of the storage resource, since it does not consider the impact of storage resources. In comparison, the information capacity is non-decreasing with the capability of the storage resource. To be specific, when the tolerable delay is small, the information capacity saturates after a short period of near-linear growth. This is because in this case only the data acquired from observation targets near ground stations have the opportunity to be downloaded, so the required storage capacity is very limited. When the tolerable delay is large, the information capacity grows nearly linearly with the capability of the storage resource over a longer range, and then tends to saturation. This is because in this case a large amount of data can be delivered via the store-carry-forward paradigm, so a large amount of storage capacity is required.

B. INFORMATION CAPACITY WITH VARYING NETWORK ARCHITECTURAL PARAMETERS
In this subsection, we investigate the impacts of different network architectural parameters (e.g., the orbital parameters of the satellites and the distribution of the ground stations) on information capacity, and show the difference between information capacity and traditional communication capacity. Four new simulation scenarios are constructed from the baseline scenario (referred to as S0). Note that, to investigate the impact of each parameter separately, each new scenario differs from S0 in only one parameter. For the sake of brevity, for each new scenario we list only the parameter that differs from the baseline scenario.
• S1: The altitude of the remote sensing satellites is 500 km.
• S2: The altitude of the remote sensing satellites is 1000 km.

Fig. 9 illustrates the influence of the altitude of the remote sensing satellites on both communication and information capacity. It can be observed that both capacities increase with altitude. This is because the length of the visible windows between the satellites and the ground stations/observation targets increases with the altitude of the remote sensing satellites. Moreover, the variation of communication capacity with altitude is more pronounced than that of information capacity, especially when the tolerable delay is large. This is because the communication capacity increases linearly with the communication windows. In comparison, although a high altitude brings more observation and transmission opportunities, the information capacity is still limited by the contention in resource scheduling and by the amount of storage and computational resources, especially with a large tolerable delay.

Fig. 10 illustrates the influence of different ground station deployments on both communication and information capacity. In scenario S3, we deploy ground stations only in mainland China. In scenario S0, we deploy the ground stations globally and choose locations at lower latitudes. In scenario S4, we deploy the ground stations globally and choose locations at higher latitudes. As can be seen from Fig. 10, the global deployment performs better than the local deployment, and the deployment at higher latitudes performs better than the deployment at lower latitudes. This is because the global deployment avoids overlap among the coverage areas of the ground stations. Moreover, because the remote sensing satellites fly in near-polar orbits, the visible time for a ground station increases with the latitude of the ground station. Therefore, both global and high-latitude deployment of ground stations bring more downloading opportunities for remote sensing satellites.
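The claim that visible windows lengthen with altitude can be checked with standard orbital geometry. The Python sketch below is an idealized illustration, not part of the simulation setup: it assumes a circular orbit, a spherical non-rotating Earth, a pass directly over the ground station, and a hypothetical minimum elevation angle of 5 degrees.

```python
import math

R_EARTH_KM = 6371.0      # mean Earth radius
MU_KM3_S2 = 398600.4418  # Earth's gravitational parameter


def coverage_half_angle(alt_km, min_elev_deg=0.0):
    """Earth central angle (rad) from the sub-satellite point to the
    edge of visibility for a given minimum elevation angle."""
    eps = math.radians(min_elev_deg)
    return math.acos(R_EARTH_KM / (R_EARTH_KM + alt_km) * math.cos(eps)) - eps


def max_pass_duration_s(alt_km, min_elev_deg=0.0):
    """Duration (s) of an overhead pass: the satellite sweeps a central
    angle of twice the coverage half-angle at its mean orbital rate."""
    a = R_EARTH_KM + alt_km
    period = 2.0 * math.pi * math.sqrt(a ** 3 / MU_KM3_S2)
    return period * coverage_half_angle(alt_km, min_elev_deg) / math.pi


# Altitudes of scenarios S1, S0 and S2; the 5-degree mask is an assumption.
durations = [max_pass_duration_s(h, 5.0) for h in (500.0, 619.0, 1000.0)]
```

Under these assumptions the pass duration grows monotonically with altitude, which matches the trend of the communication capacity in Fig. 9.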

VI. CONCLUSION
In this paper, we explore the information capacity of RSSNs. We first propose the formal definition of information capacity. Then, we extend the traditional time-expanded graph by modeling remote sensing satellites at a microscopic level and adding virtual arcs and virtual vertices. The resulting graph model, called the microscopic time-expanded graph (MTEG), characterizes the intertwined impact of the observation, computational, storage and transmission resources on the service process. Based on this graph model, we develop a mathematical framework to compute the information capacity. Owing to the NP-completeness of the formulated problem, we decompose it into a flow optimization problem and an arc scheduling problem of the MTEG model, and then propose a Graph-based Information Capacity Solving (GICS) algorithm to efficiently solve the problem. Finally, extensive simulation results are provided to validate the effectiveness and necessity of analyzing the information capacity of RSSNs. The impacts of different kinds of resources and network architectural parameters on the information capacity are evaluated.