Demand-Aware Onboard Payload Processor Management for High-Throughput NGSO Satellite Systems

High-throughput satellite (HTS) systems with digital payload technology have been identified as a key enabler to support 5G/6G high-data connectivity with wider coverage area. The satellite community has extensively explored resource allocation methods to achieve this target. Typically, these methods do not consider the intrinsic architecture of the flexible satellite digital payload, which consists of multiple processors responsible for receiving, processing, and transmitting the signals. This article presents a demand-aware onboard processor management scheme for broadband nongeostationary satellites. In this context, we formulate an optimization problem to minimize the number of active onboard processors while meeting the system constraints and user requirements. As the problem is nonconvex, we solve it in two steps. First, we transform the problem into demand-driven bandwidth allocation while fixing the number of processors. Second, using the bandwidth allocation solution, we determine the required number of processors with two methods: 1) sequential optimization with the branch-and-bound method and 2) bin packing with next-fit, first-fit, and best-fit methods. Finally, we demonstrate the proposed methods with extensive numerical results. It is shown that the branch-and-bound, best-fit, and first-fit methods manage the processors better than the next-fit method. Furthermore, branch-and-bound method requires fewer processors than the above methods.


I. INTRODUCTION
Satellite communication network providers are expected to offer broadband connectivity to meet the needs of an increasingly heterogeneous market, including the broadcast industry, airplane industry, maritime sector, government agencies, and end-users [1]. Furthermore, satellite technology has a significant role in the era of 5G and beyond in terms of integrating satellite networks with terrestrial networks, providing backhaul services, and offering coverage for Internet-of-Things (IoT) applications [2], [3], [4].
Satellites are now incorporating advanced digital payload technologies to adapt to these diverse markets and emerging applications. These payloads have the reconfigurable capability to perform various functions, including changing the beam coverage, allocating satellite resources, and adjusting the Radio-Frequency (RF) power distribution dynamically in response to traffic demands [5], [6]. However, advanced management techniques must be employed to better exploit the functionality of these digital payloads.
Several studies have been conducted regarding satellite resource management based on user demand requirements. In this case, the power allocation over satellite downlinks in light of traffic demand and channel characteristics has been investigated in [7]. Moreover, in [8], power allocation has been proposed while considering interbeam interference among the beams. Additionally, a two-stage power allocation method using metaheuristics to minimize the system's unmet capacity has been proposed in [9]. Similarly, an energy-aware power-allocation technique to minimize unmet system capacity and transmit power consumption has been explored in [10]. Furthermore, a reinforcement-based approach and a game-based approach for allocating the system power have been developed in [11], [12] and [13], [14], respectively. However, these approaches only focus on power management without considering payload management and bandwidth allocation. Alternatively, bandwidth allocation is explored in [15] and [16] to match the traffic demand with the system capacity. However, power allocation and payload management are not incorporated into these methods.
In the literature, different resource allocation methods have been discussed with regard to joint power and bandwidth allocation. In [17], a modified version of the simulated annealing algorithm has been proposed to achieve a fair distribution of power and bandwidth across beams. Similarly, reinforcement-based techniques in [18], [19], and [20] and a technique using conventional neural networks in [21] have been proposed to jointly allocate the power and the bandwidth of the satellite system. Furthermore, an iterative convex optimization approach to utilize the satellite resources efficiently has been studied in [22], [23], and [24]. Additionally, resource optimization for integrated satellite-terrestrial communication has been considered in [25], [26], [27], [28], [29], and [30].
Few studies have examined the relationship between the management of resources and the intrinsic architecture of digital payloads. A model for quantitatively evaluating the flexibility of payloads with a digital channelizer has been explored in [31]. Additionally, in [32], a frequency resource allocation method based on beam requests has been proposed to improve the overall throughput of satellite communication systems. That work also investigates the advantages of digital channelizers compared with the traditional resource allocation method.
While the above methods manage the satellite resources, they do not address the constraints imposed by the payload processors, which require a different perspective. Fig. 1 illustrates a generic digital payload comprising RF inputs, RF outputs, and digital processor ports equipped with Analog-to-Digital Converters (ADCs), Digital-to-Analog Converters (DACs), filters, and modulation/demodulation techniques. A typical payload of this type is called a regenerative onboard processor. Alternatively, it is known as a digital transparent processor if the payload does not provide modulation and demodulation capabilities [33].
The RF front-end input is used to down-convert the received signal into a baseband/intermediate frequency (IF). It consists of a filter, a low-noise amplifier, a mixer, and a gain controller. The processor then converts the baseband/IF signal to digital format using an ADC and applies channelization (filters), modulation, and demodulation techniques to process the signal further. Finally, the digital signal is converted back to an analog signal through a DAC. The DAC output signal is filtered and amplified on the RF front-end output side for transmission [5], [33].
The payload processes the signals through multiple digital processors because 1) a single processor can handle only a limited number of beams in order to reduce signal processing complexity and to avoid unnecessary delays due to sequential processing of tasks, and 2) two or more signals with the same carrier frequency need independent processors to avoid interference. Therefore, appropriate mapping of the carrier bandwidth of signals to processors is required. On the other hand, the optimized signal bandwidth may depend on the user demand; a low demand requires less bandwidth, whereas a high demand requires a larger bandwidth. Thus, the number of operating processors may vary according to the demand. In general, if the demand is low, fewer processors are needed, while more processors become necessary as demand increases. Hence, depending on the demand, not all of the available processors may need to be switched ON. This reduces the amount of power needed to operate the processors that configure each beam.
In this context, we focus on the optimization of the forward link bandwidth for broadband nongeostationary (NGSO) satellites with the main goal to minimize the number of active onboard processors while meeting the system constraints and end-user traffic demands. The detailed contributions of this article are summarized as follows.
1) We propose a mathematical formulation for the demand-aware onboard payload processor optimization to minimize the number of operating processors while considering the following aspects: a) frequency planning constraints; b) user demand constraint; and c) processor abstraction constraints. Hence, the satellite payload processors can be managed flexibly according to the system constraint and the beam demand requirement.
2) The optimization problem is nonconvex due to nonlinear functions and integer-valued variables present in the formulation. We propose a two-step approach to efficiently tackle the problem. First, we design the bandwidth allocation strategy based on the frequency planning and demand satisfaction requirements while fixing the number of processors. Second, we consider the processor abstraction problem to determine the minimum number of processors required to accommodate the above bandwidth allocation strategy. We propose the following solutions to address this problem: a) sequential optimization based on the branch-and-bound technique; b) bin packing [34] based on next-fit, first-fit, and best-fit methods.
3) Finally, we demonstrate the performance of the proposed methods through extensive numerical evaluations. We observe that the branch-and-bound, best-fit, and first-fit methods have better performance in flexibly managing the payload processors than the next-fit method.
The rest of the article is organized as follows. In Section II, the system model for the NGSO satellites is provided. The proposed demand-aware onboard payload processor management is presented in Section III. In Section IV, the simulation results are discussed. Finally, Section V concludes the article.

II. SYSTEM MODEL
We consider a downlink NGSO satellite that provides high data-rate connectivity to ground users. Fig. 2 shows a single constellation with multiple satellites that continuously serves the globe. Since each NGSO satellite possesses similar characteristics, this model focuses on a snapshot of one NGSO satellite within a specific geographic region. This satellite employs multibeam technology to produce $N$ narrow beams to serve a particular geographical area, see Fig. 2. The beams can be generated using direct radiating arrays or focal-array-fed-reflector-based antennas [35]. The generated beams are assumed to be sparse to reflect a realistic scenario in which the satellite covers only the areas of interest. The O3b mPOWER satellite, for example, uses steerable and shapable spot beams that are continuously shifted and scaled to cover specific geographical areas [36]. Hence, depending on the coverage area, some beams may not have overlapping regions. Furthermore, the desired area covered by the satellite may represent maritime, aeronautical, and terrestrial providers. Therefore, the demand may substantially vary from one beam to another. Although multiple users may be served within the same beam coverage area, in this work, we follow a user scheduling abstraction, i.e., we consider a single user per beam and we examine the system's performance at different user locations. For this, we select the user's location from a uniform distribution within the beam's coverage area. For example, Fig. 3(a) and (b) shows the user selection at different instances. Then, the system response for each instance is recorded and averaged to determine the overall system performance. This will be presented in the simulation results in Section IV. Henceforth, we will use the terms beam and user interchangeably.
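The per-instance user selection and averaging described above can be illustrated with a minimal sketch. It assumes a circular beam footprint in a flat local coordinate system, which the text does not specify; the function names `sample_user_in_beam` and `average_over_instances` and the `metric` callback are hypothetical and introduced only for illustration.

```python
import math
import random


def sample_user_in_beam(center_x, center_y, radius, rng):
    """Draw one user location uniformly over an (assumed) circular footprint."""
    # Taking the square root of the radial draw keeps the density uniform
    # over the area instead of clustering points near the center.
    r = radius * math.sqrt(rng.random())
    phi = 2.0 * math.pi * rng.random()
    return center_x + r * math.cos(phi), center_y + r * math.sin(phi)


def average_over_instances(metric, center, radius, n_instances, seed=0):
    """Average a user-location-dependent metric over random instances."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_instances):
        x, y = sample_user_in_beam(center[0], center[1], radius, rng)
        total += metric(x, y)
    return total / n_instances
```

Here `metric` stands in for whatever system response is recorded per instance, for example the unmet capacity obtained for that user location.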

A. Satellite Payload Processor Model
The processor is responsible for receiving and processing the signals before forwarding them to the user terminals. Using multiple processors, the satellite payload can process a total bandwidth of $B_{\mathrm{fwd}}$ on the forward link. Each processor operates in a Ka-band available bandwidth $B_{\mathrm{ava}}$ from 19.7 to 20.2 GHz. In this case, $B_{\mathrm{fwd}} \geq B_{\mathrm{ava}}$. In conventional satellite systems, the bandwidth $B_{\mathrm{ava}}$ is divided into bandwidth chunks, and a frequency reuse scheme (e.g., a four-color scheme) is used to reuse each bandwidth chunk in adjacent beams. Accordingly, the processors handle the bandwidth chunks. However, bandwidth fragmentation may occur within each processor's available $B_{\mathrm{ava}}$, as shown in Fig. 4(a). This is because conventional systems do not optimize the bandwidth distribution across processors, resulting in the use of a substantial number of processors. Hence, optimizing the bandwidth distribution across the processors helps to minimize both the number of active processors and the bandwidth fragmentation. This can be seen in Fig. 4(b), where the bandwidth chunks are rearranged to be handled by fewer processors than in Fig. 4(a). Optimizing the bandwidth distribution across the processors is addressed in Section III.

B. Frequency Planning Model
In our model, each beam can be allocated an orthogonal carrier to avoid cochannel interference. However, the same carrier frequency band can be reused if beams are sufficiently separated. In this case, a group of beams may share the same carrier frequency band. Additionally, beams belonging to the same group require a different processor to process each beam's signal, see Section III-A. In this context, we define the set of groups as $\mathcal{G} = \{G_1, G_2, \dots, G_M\}$, where $G_m$ is the $m$th group of beams and the maximum number of groups is given by $M = 2^N - 1$. Furthermore, we define $B_m$ as the bandwidth of the carrier assigned to the $m$th group. Fig. 5 shows an example of a satellite system with $N = 3$ beams at different instances. In this case, we have $M = 2^N - 1 = 7$ possible groups. Accordingly, for each group, the satellite allocates the corresponding bandwidth $B_m$ from the available bandwidth $B_{\mathrm{ava}}$. For example, from Fig. 5, at instance one, the satellite allocates $B_1$ and $B_6$ for $G_1$ and $G_6$, respectively, while the bandwidth allocation for the other groups is zero. At instance two, the satellite selects $G_3$, $G_4$, and $G_6$ with their corresponding bandwidths $B_3$, $B_4$, and $B_6$. Hence, in our model, each beam can be assigned to multiple carriers. However, the bandwidth allocation must satisfy two constraints, T1 and T2. The constraint T1 guarantees that the overall bandwidth utilization does not exceed the total available processor bandwidth. The constraint T2 implies that $B_m$ cannot be negative, i.e., $B_m \geq 0$. Furthermore, we denote the transmitted power of beam $i$ belonging to the $m$th group as $p_{i,m}$. The overall power allocation is restricted by the total system power $P_{\mathrm{total}}$. Note that the power sharing enabled by flexible payloads has been assumed, such that the total power can be arbitrarily distributed across beams [37].
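To make the group count concrete, the following sketch enumerates every nonempty subset of beams, which yields the $2^N - 1$ candidate groups. The ordering used here (by group size, then lexicographically) is an assumption made for illustration and need not match the group indexing of Fig. 5; the function name `enumerate_groups` is hypothetical.

```python
from itertools import combinations


def enumerate_groups(num_beams):
    """List all nonempty subsets of beams 1..num_beams as candidate groups.

    Ordering (assumed): smaller groups first, then lexicographic.
    """
    beams = range(1, num_beams + 1)
    groups = []
    for size in range(1, num_beams + 1):
        groups.extend(frozenset(c) for c in combinations(beams, size))
    return groups


if __name__ == "__main__":
    for index, group in enumerate(enumerate_groups(3), start=1):
        print(index, sorted(group))
```

For three beams this lists seven groups, and the count doubles (plus one) with every additional beam, which is the source of the exponential growth discussed later in Section III-B.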

C. Channel Model
The satellite channel is primarily affected by the Line-of-Sight (LoS) component, the antenna radiation pattern, and the rain attenuation [38]. Hence, we model the channel coefficient $h_{i,m}[j]$ from beam $j$ to beam $i$ in the $m$th group in terms of the following components.
1) Free-Space Path Loss (FSPL): The attenuation of a signal as it propagates through free space [39]. This path loss depends on the carrier wavelength $\lambda$ and on the LoS distance $d_i$ between the satellite and the $i$th beam.
2) Antenna gain (AG): This refers to the overall gain obtained from the receiver and transmitter antennas, which is modeled as in [40], [41] in terms of the receiver gain $G_R$, the maximum beam gain $G_{\max}$, and $J_1(u_{i,m}[j])$ and $J_3(u_{i,m}[j])$, the first-order and third-order Bessel functions of the first kind, respectively. The argument $u_{i,m}[j]$ is a function of $\theta_{i,m}[j]$, the angle between the center of beam $j$ and the desired location of beam $i$ as viewed from the satellite, and of $\theta_{3\,\mathrm{dB}}$, the half-power beamwidth.
3) Rain attenuation: This reduces the amplitude of the transmitted signal by scattering and absorbing it. The rain attenuation effect is notably severe for carrier frequencies above 10 GHz. The rain attenuation for beam $i$ is obtained following the recommendations ITU-R P.618-13 [42], ITU-R P.839 [43], ITU-R P.837 [44], and ITU-R P.838 [45], for a percentage $P$ of the average rainfall rate in a year [44].
Let $\mathbf{g}_m[i] = \big[g_{i,m}[1], g_{i,m}[2], \dots, g_{i,m}[|G_m|]\big]$, with $g_{i,m}[j] = |h_{i,m}[j]|^2$, represent the channel gain vector of beam $i$ belonging to the $m$th group, where $|G_m|$ is the cardinality of the $m$th group. Given the transmit power and channel definitions, the Signal-to-Interference-plus-Noise Ratio (SINR) of beam $i$ in the $m$th group is determined in (7), from which the Shannon capacity of beam $i$ in the $m$th group follows. The total capacity provided by the system to beam $i$ should match the demand $D[i] > 0$, which is imposed through the constraint T5. However, for larger demands, the constraint T5 may not be feasible. In this case, the total demand satisfaction over all beams is measured by the unmet system capacity in (11).
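The chain from SINR to capacity to unmet capacity can be sketched as follows. This is an illustration under stated assumptions, not the article's exact expressions: the SINR is taken in the generic form of desired power over co-group interference plus noise, the capacity is the standard Shannon formula, and the unmet capacity is assumed to be the sum of the positive parts of demand minus offered capacity. All function and parameter names are hypothetical.

```python
import math


def sinr(p_signal, g_signal, interferers, noise_power):
    """Generic SINR: desired power over interference plus noise.

    interferers: iterable of (power, gain) pairs for the other beams that
    share the same carrier (assumed interference model).
    """
    interference = sum(p * g for p, g in interferers)
    return p_signal * g_signal / (interference + noise_power)


def shannon_capacity(bandwidth_hz, sinr_linear):
    """Shannon capacity in b/s for a carrier of the given bandwidth."""
    return bandwidth_hz * math.log2(1.0 + sinr_linear)


def unmet_capacity(demands, capacities):
    """Sum of per-beam shortfalls; beams served above demand contribute 0."""
    return sum(max(d - c, 0.0) for d, c in zip(demands, capacities))
```

Only shortfalls are counted in `unmet_capacity`, so over-serving one beam does not compensate for under-serving another.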

III. PROPOSED DEMAND-AWARE ONBOARD PAYLOAD PROCESSOR MANAGEMENT
In this section, we propose a demand-aware onboard payload processor management optimization for high-throughput NGSO satellite systems. First, the processor abstraction will be presented to provide details on how the processor will handle the beam bandwidth chunks. Then, we formulate a problem while considering the satellite payload model, the frequency planning model, the channel model, and the processor abstraction. Afterward, a solution will be presented for this problem.

A. Processor Abstraction
This section discusses in detail the system constraints associated with processor abstraction. A single processor must handle orthogonal carriers to avoid signal interference. Hence, signals with the same carrier frequency require different processors. Additionally, a beam may belong to several groups depending on its demand. Correspondingly, multiple bandwidth chunks can be assigned to the same beam. In this case, it is preferable to allocate the bandwidth chunks belonging to the same beam to the same processor. Consequently, we can avoid the additional signal processing required to recombine bandwidth chunks from multiple processors for the respective beams. In this context, we provide the specific characteristics of the processor abstraction as follows.
1) Bandwidth Mapping: The system is required to assign the corresponding bandwidth chunks of beams to active processors. Let $x_m[i][l]$ be a binary mapping indicator, with $x_m[i][l] = 1$ indicating that the bandwidth chunk $B_m$ of beam $i$ belonging to the $m$th group is mapped to processor $l$. Note that beams belonging to the $m$th group share the same bandwidth chunk $B_m$. Hence, each of these beams requires a different processor to avoid signal interference. On the other hand, beams belonging to different groups can be assigned to the same processor. Furthermore, the system must avoid assigning two or more processors to handle the same bandwidth chunk $B_m$ of beam $i$. Additionally, mapping is permitted only when the $l$th processor is in active mode, i.e., $y_l = 1$; otherwise, $y_l = 0$ indicates that the $l$th processor is offline. Hence, we introduce a set of mapping constraints. Here, T6 prevents beams belonging to the same group from accessing the same processor, whereas T7 avoids repetitive bandwidth mapping to the processors for the $i$th beam belonging to the $m$th group. For $B_m > 0$, the constraint T8 ensures that all beams in the $m$th group are mapped to their respective processors. Note that the symbol $\lceil \cdot \rceil$ in T8 denotes the ceiling function. This mapping is only possible if $y_l$ is equal to one. Furthermore, the total bandwidth of all chunks mapped to the $l$th processor, denoted by $F_l$, must not exceed $B_{\mathrm{ava}}$. Hence, $F_l \leq B_{\mathrm{ava}}$ holds, and expressing $F_l$ in terms of $x_m[i][l]$ and $B_m$ leads to the constraint T12.
2) Carrier Contiguity: For practical reasons, we assume that carrier contiguity (CC) can be employed to assign the bandwidth chunks belonging to the same beam to the same processor. This reduces the additional signal processing required by the system to recombine bandwidth chunks belonging to the same beam from multiple processors for transmission. Therefore, we define a binary-valued variable $z_l[i]$, with $z_l[i] = 1$ indicating that all the bandwidth chunks of beam $i$ are mapped to the $l$th processor. Accordingly, this requirement is expressed through the constraints in (15), where T13 confirms that all the bandwidth chunks are assigned to the $l$th processor. Furthermore, T14 prevents mapping the CC of the $i$th beam onto multiple processors.
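The abstraction rules above (beams sharing a carrier go to different processors, each beam's chunks stay on one processor, and no processor exceeds the available bandwidth) can be checked for a candidate mapping with a short sketch. The data layout is an assumption made for illustration: `beam_chunks` maps each beam to the group labels it uses, `chunk_bw` maps each group label to its bandwidth, and `assignment` maps each beam to a single processor index, which enforces carrier contiguity by construction. None of these names come from the article.

```python
def processor_loads(beam_chunks, chunk_bw, assignment):
    """Return {processor: total bandwidth mapped to it}."""
    loads = {}
    for beam, proc in assignment.items():
        beam_bw = sum(chunk_bw[g] for g in beam_chunks[beam])
        loads[proc] = loads.get(proc, 0) + beam_bw
    return loads


def is_valid_mapping(beam_chunks, chunk_bw, assignment, b_ava):
    """Check a beam-to-processor mapping against the abstraction rules."""
    # Every beam must be mapped, and to exactly one processor.
    if set(assignment) != set(beam_chunks):
        return False
    # Beams that share a group (same carrier) need different processors.
    used = set()
    for beam, proc in assignment.items():
        for group in beam_chunks[beam]:
            if (proc, group) in used:
                return False
            used.add((proc, group))
    # The load of each processor must not exceed the available bandwidth.
    loads = processor_loads(beam_chunks, chunk_bw, assignment)
    return all(load <= b_ava for load in loads.values())
```

For instance, with beams 1 and 2 sharing one carrier, any mapping that places both on the same processor is rejected regardless of how much capacity remains.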

B. Problem Formulation
Here, a payload processor management problem is formulated to determine the minimum number of processors needed for onboard signal processing to handle the bandwidth chunks of each beam. Hence, we can closely match the system's capacity with the beam demand and switch OFF completely unused processors to preserve battery power. Additionally, since the satellite is an NGSO satellite, the situation may change many times during its pass. Hence, to improve the adaptability of the obtained solution to future changes, especially since the size of the bandwidth chunks to be added in the future is unknown, it is reasonable to save as much processing capacity as possible. Accordingly, we propose to load the active processors as much as possible, such that some processors retain large capacity savings, i.e., large bandwidth chunks can be accommodated in the future. In this context, we minimize $\sum_{l=1}^{L} y_l$ while maximizing the load $F_l$ of each active processor. Hence, the problem is formulated as (16), where $\sum_{l=1}^{L} w_l = 1$, $w_l < w_{l+1}$, and $w_l$ is the priority factor of the $l$th processor. Hence, a processor with the lowest priority factor will have a greater chance of being activated than one with the highest priority factor. The objective of problem (16) and the constraints T5, T8, T12, and T13 are nonlinear, while the remaining constraints are integer linear. Hence, the problem is a mixed-integer nonlinear program. The solution to this problem is thus difficult to obtain for the following reasons.
1) Nonconvexity: The nonlinearity of the objective and of the constraints T5, T8, T12, and T13 makes the problem nonconvex; thus, convex optimization methods cannot solve it directly. Hence, (16) needs to be convexified first.
2) Complexity: Two factors contribute to the complexity of this optimization:
a) The total numbers of bandwidth and transmit power optimization variables increase exponentially with the number of beams and are given by $2^N - 1$ and $N(2^N - 1)$, respectively. Thus, as $N$ increases, the computational time for the optimization increases.
b) The search space due to the combinatorial optimization variables $x_m[i][l]$, $y_l$, and $z_l[i]$ increases exponentially as $N$ and $L$ increase, with sizes $2^{(2^N + N + L - 1)}$, $2^L$, and $2^{L+N}$, respectively.
Hence, the computational complexity of (16) combined with its nonconvexity makes the problem much more difficult to solve. For this, complexity reduction is required.
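A quick calculation shows how fast these quantities grow. The sketch below simply evaluates the expressions stated above; the search-space sizes are reported as base-2 exponents to keep the numbers readable. The function name `problem_size` and the dictionary keys are hypothetical.

```python
def problem_size(num_beams, num_processors):
    """Evaluate the variable counts and search-space exponents stated above."""
    num_groups = 2 ** num_beams - 1
    return {
        # Continuous variables
        "bandwidth_variables": num_groups,
        "power_variables": num_beams * num_groups,
        # Binary search spaces, expressed as log2 of their size
        "log2_mapping_space": 2 ** num_beams + num_beams + num_processors - 1,
        "log2_activation_space": num_processors,
        "log2_contiguity_space": num_processors + num_beams,
    }


if __name__ == "__main__":
    for n in (3, 5, 10):
        print(n, problem_size(n, n))
```

Already at ten beams the power variables number in the thousands, which motivates the complexity reduction that follows.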
To address the above issues, we decompose the original problem (16) into two parts. First, we solve the frequency planning problem so that the offered capacity closely matches the beam demand while fixing the number of processors. Consequently, we simplify it into a demand-driven bandwidth allocation problem by convexification and complexity reduction. Then, we consider the processor abstraction problem by fixing the bandwidth allocation. For this, we propose two methods:
1) Sequential optimization, which uses the branch-and-bound technique and considers the processors sequentially for handling bandwidth allocation.
2) Bin packing, which uses the processors as bins and bandwidth chunks as items to be packed heuristically. In this context, we explore next-fit, first-fit, and best-fit bin-packing methods.

C. Demand-Driven Bandwidth Allocation
Here, we want to allocate the necessary bandwidth and transmit power to each beam based on its demand. Hence, matching the offered capacity with the per-beam demand is required. For this, we minimize the system unmet capacity in (11) while fixing the number of processors to $\bar{L}$, $1 \leq \bar{L} \leq L$. For $\bar{L}$ processors, we only consider groups of beams containing $\bar{L}$ or fewer beams. In this case, we need only the groups from $\mathcal{G}$ that satisfy $G_n \in \mathcal{Z}$, where $n \in \mathcal{V}$, $\mathcal{V} = \{m : |G_m| \leq \bar{L}, \forall m\}$, and $\mathcal{Z} = \{G_n\}, \forall n$. For example, with $\bar{L} = 2$, groups that satisfy $|G_n| \leq 2$ are selected. In this case, if $|G_n| = 2$ is chosen, i.e., two beams share the same $B_n$, then it is possible to assign one processor to each beam. Accordingly, problem (16) for $\bar{L}$ processors is simplified to the subproblem (17). Note that our goal is to closely match the per-beam demand with the offered capacity. Hence, the constraint T5 is equivalently represented in the objective function of (17) as the unmet system capacity of (11). Unfortunately, the nonlinearity of the SINR makes (17) nonconvex. Furthermore, it has exponential complexity because of the number of optimization variables. Hence, to tackle the problem, we consider a suboptimal solution of (17). For this, we assume that the spectral power density $S_{\mathrm{psd}}$ is known for each beam. This helps us avoid the nonlinearity of (7). Accordingly, with $p_{i,n} = S_{\mathrm{psd}} B_n$, (7) is rewritten as (18). Then, replacing the max function of the objective with an upper-bound slack variable, (17) is recast as problem (19), subject to T1, T2, and T3. Problem (19) is a linear program that can be solved by well-known solvers [46]. Note that this optimization part does not include processor abstraction such as bandwidth mapping and CC. Hence, $\bar{L}$ in this optimization reflects the number of operating processors without CC, i.e., a beam's bandwidth chunks can be distributed across multiple processors. However, the exact number of processors required by the system to support CC will be determined in Section III-D.
Algorithm 1 shows the demand-driven bandwidth allocation. With an increasing number of processors, the total amount of bandwidth chunks that can be accommodated increases monotonically as well. However, if the number of processors is too low, the bandwidth allocation problem may be infeasible. Hence, the minimum number of processors is determined by solving the bandwidth allocation problem with a single processor and then gradually incrementing the number of processors until the problem becomes feasible. That is, the algorithm solves (19) and updates the number of processors until the objective function of (19) satisfies a threshold value. This threshold represents the minimum level required to match a beam's demand with the capacity offered and is selected on the order of $10^{-6}$ b/s.
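The slack-variable step that turns the unmet-capacity objective into a linear one is a standard epigraph reformulation, sketched below. The sketch assumes that the unmet capacity in (11) is the sum over beams of $\max(D[i] - C[i], 0)$; the shorthand $C[i]$ for the total capacity offered to beam $i$ and the slack symbol $t_i$ are introduced here for illustration and are not taken from the original formulation.

```latex
\begin{aligned}
\min_{\{B_n\},\,\{t_i\}} \quad & \sum_{i=1}^{N} t_i \\
\text{s.t.} \quad & t_i \geq D[i] - C[i], \quad i = 1, \dots, N, \\
& t_i \geq 0, \quad i = 1, \dots, N, \\
& \text{T1},\ \text{T2},\ \text{T3}.
\end{aligned}
```

At the optimum each $t_i$ equals $\max(D[i] - C[i], 0)$, so minimizing the slack sum is equivalent to minimizing the unmet capacity. Under the fixed spectral power density assumption, the SINR no longer depends on $B_n$, so $C[i]$ is linear in the bandwidth variables and the whole problem is a linear program.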

D. Mapping Bandwidth to Processor
In this section, we are interested in mapping the bandwidth of each beam obtained in Section III-C to the processors. With this bandwidth allocation solution, we simplify (16) to the optimization problem (20). Note that $G_n \in \mathcal{Z}$ and $B_n$ is known, see Section III-C. Hence, the constraints T6 to T15 are updated accordingly. Since the bandwidth $B_n$ is known, the constraints T8, T12, and T13 become linear. However, the problem remains challenging because of the nonlinearity of the objective function and the aforementioned exponential complexity associated with the combinatorial nature of the integer part of the problem. To tackle this challenge, we propose two methods: 1) sequential optimization and 2) bin packing.
1) Sequential Optimization Method: We transform (20) into a sequential optimization problem to solve it. Fig. 6 provides a toy example of sequential optimization that shows a mapping of the bandwidth of five beams onto three processors. Initially, all processors are in offline mode; thus, $y_l = 0, \forall l$. The first step in mapping bandwidth chunks is to choose which of the three processors to use. Then, we would like to assign as many bandwidth chunks as possible to this specific processor before moving on to the next. Consequently, we can utilize fewer processors. In this context, the processor with the lowest priority factor is selected first. This is because, from (20), a processor with a lower priority factor will have a greater chance of being considered for operation than one with a higher priority factor. In this case, $w_1$ is the lowest priority factor, so the first processor is chosen, i.e., $y_1$ becomes one while $y_l, \forall l \neq 1$, of the others remain zero. Thus, the minimization of $\sum_{l=1}^{L} \frac{w_l y_l}{F_l + 1}$ is simplified to a minimization of $w_1/(F_1 + 1)$. It is possible to convert the minimization of $w_1/(F_1 + 1)$ into a maximization by interchanging its numerator and denominator, i.e., a maximization of $(F_1 + 1)/w_1 \approx F_1$. Accordingly, we maximize $F_1$ subject to the beams' bandwidth chunks. In this example, beams 1 and 4 together maximize the function $F_1$. Since the first processor is now in use, we map the remaining bandwidth to subsequent processors. We follow the same procedure as described above to select the processor with the lowest priority factor and allocate bandwidth chunks accordingly for the remaining processors. Hence, we select the second processor since it has a lower priority factor than the third processor in this example. This processor operates at maximum efficiency with beams 2 and 5. Finally, the third processor is selected and assigned to beam 3.
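The processor-by-processor idea in the toy example can be sketched in a few lines. This sketch is not the article's branch-and-bound implementation: for readability it replaces branch-and-bound with an exhaustive search over subsets of the remaining beams, which finds the same maximum-load subset but scales exponentially. It assumes carrier contiguity (one total width per beam), represents beams that share a carrier as a set of conflicting pairs, and visits processors in list order, standing in for ascending priority factor. The names `best_subset` and `sequential_mapping` and the example widths are hypothetical.

```python
from itertools import combinations


def best_subset(beams, width, conflicts, capacity):
    """Exhaustively find the conflict-free subset with maximum load."""
    best, best_load = (), 0
    for k in range(1, len(beams) + 1):
        for subset in combinations(beams, k):
            # Skip subsets containing two beams that share a carrier.
            if any(frozenset(p) in conflicts for p in combinations(subset, 2)):
                continue
            load = sum(width[b] for b in subset)
            if load <= capacity and load > best_load:
                best, best_load = subset, load
    return best


def sequential_mapping(width, conflicts, capacity):
    """Fill one processor at a time with the maximum-load feasible subset."""
    remaining = sorted(width)
    processors = []
    while remaining:
        chosen = best_subset(remaining, width, conflicts, capacity)
        if not chosen:
            raise ValueError("a beam does not fit on an empty processor")
        processors.append(list(chosen))
        remaining = [b for b in remaining if b not in chosen]
    return processors


if __name__ == "__main__":
    # Hypothetical per-beam bandwidths in MHz and a 500 MHz processor.
    widths = {1: 300, 2: 250, 3: 350, 4: 200, 5: 250}
    print(sequential_mapping(widths, set(), 500))
```

With these hypothetical widths and no conflicts, the first processor takes beams 1 and 4, the second takes beams 2 and 5, and the third takes beam 3, mirroring the structure of the toy example.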
Note that in this sequential optimization, we take into account one processor at a time, so the minimization in (20) is simplified to the maximization of $F_l$. Furthermore, T8 and T10 of (20) are not required for this sequential optimization because both constraints are only useful when considering multiple processors at once. Furthermore, we can discard T7 and T11 in this sequential optimization because both are equivalently represented by T9. In the above context, for a single processor, (20) reduces to (21). Note that, by observing the equality between the left and right sides of T13, we equivalently rewrite $F_l$ as (22). Then, we consider (22) as the objective function of (21). Problem (21) has less computational complexity than (20) because we reduce the search space of the optimization variables corresponding to bandwidth mapping, processor indicator, and CC from $2^{M+N+L}$, $2^{L}$, and $2^{L+N}$ to $2^{|\mathcal{V}|+N}$, 0, and $2^{N}$, respectively. Furthermore, (21) is an integer linear program that can be solved by a branch-and-bound method using MOSEK in the CVX solver [46]. Algorithm 2 describes the sequential optimization method. First, the algorithm solves (21) for the first processor and records the beams corresponding to this processor in $I_l$. Then, for each subsequent processor, the algorithm solves (21) for all beams not included in the sets $I_l, \forall l$. Finally, it terminates when all the bandwidths of the beams are mapped to processors. This occurs when the total number of selected beams equals the number of beams $N$.
2) Bin-Packing Method: This method solves the bin-packing problem by packing different-sized items into a finite number of bins, each having a limited capacity [34]. We consider processors as bins and bandwidth chunks as items in this context. Let the set of bandwidth chunks per beam be denoted as $\mathcal{B} = \{\mathcal{B}_1, \mathcal{B}_2, \dots, \mathcal{B}_N\}$, where $\mathcal{B}_i$ is the set of bandwidth chunks for beam $i$. The normalized size of $\mathcal{B}_i$ is defined as $W_i$, i.e., the overall bandwidth allocated to the $i$th beam normalized by $B_{\mathrm{ava}}$. Hence, our goal is to pack or map $W_i, \forall i$, to the processors while taking into account the processor abstraction requirements: 1) the bandwidth chunks of $\mathcal{B}_i$ that correspond to beam $i$ must all be mapped to a single processor, 2) beams that share the same bandwidth must be mapped to different processors, and 3) as few processors as possible should be used. In this manner, we consider the following bin-packing methods.
1) Next-fit algorithm: Assuming five processors, each with a capacity of 1, the next-fit algorithm does the following. The first step is to activate processor $l$ and determine whether the first item $W_1$ will fit. Since $W_1$ fits on the current processor, the algorithm places it on this processor. Next, the second item $W_2$ is selected to check if it fits on processor $l$. In this case, the processor cannot accommodate $W_2$ since the sum of $W_1$ and $W_2$ exceeds its capacity. The algorithm then closes processor $l$ and activates processor $l + 1$ to assign $W_2$. It then selects the next item $W_3$ to pack on processor $l + 1$. Subsequently, the item $W_4$ is selected. Since beam 3 and beam 4 belong to the same group $G_{16}$, $W_4$ cannot be placed on processor $l + 1$. Additionally, placing the item $W_4$ with $W_2$ and $W_3$ exceeds the processor's capacity. Hence, the algorithm activates processor $l + 2$ to pack $W_4$. Following that, $W_5$ and $W_6$ are placed on processors $l + 3$ and $l + 4$, respectively. This is because beam 5 and beam 4 belong to the same group $G_{35}$, and thus beam 5 cannot be placed on processor $l + 2$. Additionally, since beam 6 and beam 5 belong to the same group $G_{37}$, placing beam 6 on processor $l + 3$ is not allowed. See Fig. 7 for this example. The computational complexity of this algorithm is given by $O(L)$.
2) First-fit algorithm: In this algorithm, all processors are kept active and arranged in order. Then, the algorithm attempts to place each new item on the first available processor on which it violates neither the bandwidth orthogonality nor the capacity of the processor. With the example above, the first-fit algorithm places $W_1$ on processor $l$ and $W_2$ on processor $l + 1$. Then, $W_3$ is included in processor $l$. Following this, $W_4$ is selected and placed on processor $l + 2$ since both $l$ and $l + 1$ have limited capacity. Furthermore, $W_4$ is not allowed to be with $W_2$ or $W_3$ because it belongs to the same groups $G_{35}$ and $G_{16}$, respectively. Subsequently, $W_5$ is selected and included in processor $l$. Finally, $W_6$ is assigned to processor $l + 2$ because it cannot fit in either processor $l$ or processor $l + 1$.
In this case, we save two processors compared to the next-fit method. See Fig. 8 for this example. The computational complexity of this algorithm is given by $O(L^2)$.

3) Best-fit algorithm: Like the first-fit algorithm, all processors are activated in order, but a new item is placed on the processor that has the maximum load among those where the item does not violate the system constraints described above. For the above example, $W_1$ and $W_2$ are packed into processors $l$ and $l+1$, respectively. Then, the algorithm places $W_3$ on the processor with the maximum load that satisfies the system constraints. Here, processor $l+1$ has a higher load than processor $l$, and thus $W_3$ is assigned to it. Subsequently, the item $W_4$ is placed on processor $l$ because it is the processor with the maximum load that fulfills the system constraints. The algorithm assigns item $W_5$ to processor $l+2$ because it fits on neither processor $l$ nor processor $l+1$. Similarly, $W_6$ is mapped to processor $l+3$. Note that beam 6 and beam 5 belong to the same group $G_{37}$; thus, beam 6 cannot be placed on processor $l+2$. In this case, we save one processor compared to the next-fit method. See Fig. 9 for this example. The computational complexity of this algorithm is given by $O(L^2)$.

Remarks:
1) The next-fit, first-fit, and best-fit methods have polynomial complexity; thus, their solutions can be computed in polynomial time. In contrast, sequential optimization has exponential complexity and therefore requires more computation time than the bin-packing methods.
2) In this work, we assume that a) the number of beams ($N$) equals the number of processors ($L$) and b) the bandwidth allocation per beam cannot exceed the operating processor bandwidth $B_{\mathrm{ava}}$, as indicated in constraints $T_1$ and $T_{12}$. Accordingly, the number of active processors obtained by the sequential optimization and the bin-packing methods lies between 1 and $L$.
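The three bin-packing heuristics can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it reuses the item sizes ($W_1 = 0.6$, $W_2 = 0.7$, $W_3 = 0.3$, $W_4 = 0.4$, $W_5 = 0.1$, $W_6 = 0.55$) and the beam groups $G_7$, $G_{16}$, $G_{22}$, $G_{35}$, and $G_{37}$ of the next-fit example, stores sizes in hundredths to avoid floating-point round-off, and uses the hypothetical helper names `conflict_pairs` and `fits`.

```python
from itertools import combinations

# Normalized beam bandwidths W_i from the example, in hundredths of the
# processor capacity (assumption: integers avoid floating-point round-off).
SIZES = {1: 60, 2: 70, 3: 30, 4: 40, 5: 10, 6: 55}
# Beam groups G7, G16, G22, G35, G37: beams in a group share bandwidth.
GROUPS = [{1, 2}, {3, 4}, {1, 2, 6}, {2, 5, 4}, {2, 5, 6}]
CAPACITY = 100


def conflict_pairs(groups):
    """Pairs of beams that must not be mapped to the same processor."""
    return {frozenset(p) for g in groups for p in combinations(sorted(g), 2)}


CONFLICTS = conflict_pairs(GROUPS)


def fits(beam, proc):
    """True if `beam` respects capacity and orthogonality on processor `proc`."""
    if sum(SIZES[b] for b in proc) + SIZES[beam] > CAPACITY:
        return False
    return all(frozenset((beam, b)) not in CONFLICTS for b in proc)


def next_fit(beams):
    """Only the most recently opened processor is considered."""
    procs = [[]]
    for beam in beams:
        if not fits(beam, procs[-1]):
            procs.append([])  # close the current processor, open the next
        procs[-1].append(beam)
    return procs


def first_fit(beams):
    """Place each beam on the first opened processor where it fits."""
    procs = []
    for beam in beams:
        target = next((p for p in procs if fits(beam, p)), None)
        if target is None:
            target = []
            procs.append(target)
        target.append(beam)
    return procs


def best_fit(beams):
    """Place each beam on the feasible processor with the maximum load."""
    procs = []
    for beam in beams:
        feasible = [p for p in procs if fits(beam, p)]
        if feasible:
            target = max(feasible, key=lambda p: sum(SIZES[b] for b in p))
        else:
            target = []
            procs.append(target)
        target.append(beam)
    return procs


if __name__ == "__main__":
    beams = [1, 2, 3, 4, 5, 6]
    for method in (next_fit, first_fit, best_fit):
        packing = method(beams)
        print(method.__name__, len(packing), packing)
```

On this example the sketch uses five, three, and four processors for next fit, first fit, and best fit, respectively, which agrees with the walkthroughs in the text.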

IV. SIMULATION RESULTS
This section evaluates the performance of the demand-aware onboard payload processor management via numerical simulations. For this, we consider an MEO satellite operating at 8063 km above the Earth. Table I provides a summary of the key simulation parameters. In this simulation, we compare the proposed sequential optimization based on the branch-and-bound method with the bin-packing algorithms, namely next fit, first fit, and best fit. For the sequential optimization method, we simulate the following two scenarios. The first is sequential optimization with CC, as explained in Section III-D1. The second is sequential optimization using the branch-and-bound method without CC (WoCC). In this case, a beam can be assigned to multiple processors. However, beams that share a bandwidth chunk cannot belong to the same processor. The WoCC scenario is obtained by removing the constraint $T_{14}$ of (21) and replacing the objective function with the upper-side equation of (22).
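The per-processor structure of the sequential optimization with CC can be illustrated with a small sketch. It is only an approximation of the method under stated assumptions: an exhaustive subset search stands in for the MOSEK/CVX branch-and-bound solver, the objective $F_l$ is assumed to be the load packed onto the current processor with ties broken in favor of more beams, and the data are the six-beam example used for the bin-packing walkthroughs ($W_1 = 0.6$, $W_2 = 0.7$, $W_3 = 0.3$, $W_4 = 0.4$, $W_5 = 0.1$, $W_6 = 0.55$, sizes in hundredths). The function names are hypothetical.

```python
from itertools import combinations

SIZES = {1: 60, 2: 70, 3: 30, 4: 40, 5: 10, 6: 55}  # W_i in hundredths
GROUPS = [{1, 2}, {3, 4}, {1, 2, 6}, {2, 5, 4}, {2, 5, 6}]  # beam groups
CAPACITY = 100


def best_subset(remaining, sizes, conflicts, capacity):
    """Exhaustively pick the feasible beam subset with the largest load.

    Assumption: the key (load, number of beams) stands in for F_l.
    """
    best, best_key = (), (0, 0)
    for r in range(1, len(remaining) + 1):
        for subset in combinations(remaining, r):
            load = sum(sizes[b] for b in subset)
            if load > capacity:
                continue
            if any(frozenset(p) in conflicts for p in combinations(subset, 2)):
                continue
            key = (load, len(subset))
            if key > best_key:
                best, best_key = subset, key
    return best


def sequential_packing(sizes, groups, capacity):
    """Fill one processor at a time until every beam is mapped."""
    conflicts = {frozenset(p) for g in groups for p in combinations(sorted(g), 2)}
    remaining = sorted(sizes)
    processors = []
    while remaining:
        chosen = best_subset(remaining, sizes, conflicts, capacity)
        if not chosen:
            raise ValueError("a beam exceeds the processor capacity")
        processors.append(list(chosen))
        remaining = [b for b in remaining if b not in chosen]
    return processors


if __name__ == "__main__":
    print(sequential_packing(SIZES, GROUPS, CAPACITY))
```

The loop mirrors the steps of the sequential method: solve for one processor, record its beams, and repeat on the beams that are still unmapped until none remain. The exhaustive search grows exponentially with the number of beams, so it is practical only for small examples.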

A. System Response to Uniform Demand
In this section, we demonstrate how the proposed algorithms behave under uniform demand. Specifically, we assume a demand of $D[i] = 20(\alpha + 2)$ Mb/s $\forall i$, with an integer $\alpha \in [1, 8]$, i.e., demands from 60 to 200 Mb/s in steps of 20 Mb/s. In addition, we average the simulation results over 400 Monte Carlo runs. Fig. 10(a) shows the number of processors needed to accommodate uniform beam demands. The system requires fewer processors for low demands and more for high demands. This is because each beam needs more bandwidth at higher demands, and thus more processors are required to handle the bandwidth of each beam. For example, five processors are adequate for 60 Mb/s, whereas nine processors are sufficient for 100 Mb/s. Similarly, 12 processors are adequate for a 160-Mb/s demand, and as the demand increases to 200 Mb/s, the system can operate with 15 processors.
In addition, Fig. 10(a) compares the performance of the proposed schemes in utilizing the processors in response to the uniform beam demand. In this case, the branch-and-bound-WoCC uses the fewest processors to handle each beam's bandwidth. This results from the branch-and-bound-WoCC's ability to map the bandwidth chunks of a beam to multiple processors. However, this creates additional overhead for the system to recombine these chunks from multiple processors for transmission. The branch-and-bound-CC method requires fewer processors than the next-fit, first-fit, and best-fit methods. For instance, at 100 Mb/s, it uses four processors. In contrast, the next-fit, first-fit, and best-fit methods utilize nine, five, and five processors, respectively. For demand below 100 Mb/s and above 200 Mb/s, the first-fit, best-fit, and branch-and-bound-CC methods have a similar performance. On the other hand, the next-fit method utilizes more processors than the other methods. This is because the algorithm only sees the current state of the current processor and the current bandwidth. It has no knowledge of the remaining processors and bandwidth chunks. Hence, the next fit has limited flexibility in mapping the bandwidth to the processors. The branch-and-bound method has the flexibility to select which beams are assigned to each processor. In contrast, the best-fit and first-fit methods only have the freedom to select the processor onto which the current beam is mapped. Generally, it is possible to implement the branch-and-bound method for a few beams. However, as the number of beams increases, the computational time for branch and bound may increase. Hence, the best-fit or first-fit approach is preferable for a larger number of beams.

Fig. 10(b) depicts the amount of bandwidth handled by each processor when the beam demand is 200 Mb/s. On average, the load varies across processors, with more load on the lowest indexed processors than on the highest ones. For example, processor 1 has more load than processor 16. This is because all methods map the bandwidth chunks sequentially to the processors based on their order. Hence, the lowest indexed processors are more likely to be checked by the algorithm, resulting in more bandwidth handling. However, with the next-fit algorithm, processors 1 to 14 each have similar loads. This is because once the algorithm finds that a beam $i$ does not fit on the $l$th processor, it closes the $l$th processor and moves on to processor $l+1$. Thus, unlike the other methods, it does not check every processor for each bandwidth chunk. This results in only slight load variations between the processors.
The branch-and-bound-CC method, which fits as many beam bandwidths as possible on the processors in order, performs better than the best-fit, first-fit, and next-fit methods. However, it is less efficient than the branch-and-bound-WoCC method. Generally, placing more bandwidth on a single processor leaves free space on the remaining processors, which can be used for other applications. Similarly, Fig. 10(c) shows the number of carrier frequencies handled by each processor when each beam requests 200 Mb/s. For example, in the first processor, 8, 12, 12, 14, and 21 carriers are handled when the next-fit, best-fit, first-fit, branch-and-bound-CC, and branch-and-bound-WoCC methods are applied, respectively. Both the branch-and-bound-WoCC and branch-and-bound-CC methods map more carriers to the processors than the other methods.

Fig. 10(d) shows the number of processors required to handle the beams' frequency carriers at 200 Mb/s. To satisfy all beams, 100 carriers are handled by 11 processors when the branch-and-bound-CC, first-fit, and best-fit methods are applied. In contrast, the branch-and-bound-WoCC and next-fit methods handle the carriers in 4 and 20 processors, respectively. Hence, the branch-and-bound-WoCC method has better performance than the other approaches. However, for practical implementation, it may require a complex signal processing technique to combine the carriers of each beam from multiple processors. Among the other methods, the branch-and-bound-CC method has the best processor utilization. For example, given two and six processors, the branch-and-bound-CC method maps 27 and 71 carriers, respectively. In contrast, the first fit maps 23 and …

Generally, all methods achieve better processor utilization than next fit. For this, see Table II, which gives the percentage of frequency carriers handled versus the number of processors for 200 Mb/s. For example, with eight processors, the branch-and-bound-CC, first-fit, and best-fit methods can handle 88%, 83%, and 83% of the carriers, respectively. However, the next-fit method can only manage 56% of the carriers. In contrast, the branch-and-bound-WoCC can handle 100% of the carriers with five processors.

Fig. 11 shows the computational time for all schemes. We observe that the branch-and-bound method takes more time than the bin-packing methods. For example, for a 100-Mb/s demand, the computational time of the branch-and-bound-CC and branch-and-bound-WoCC is 2.7 s and 1.65 s, respectively. In contrast, the computational time for next fit, first fit, and best fit at 100 Mb/s is 2.4 ms, 3.5 ms, and 3.7 ms, respectively. Typically, the bin-packing method uses a heuristic approach to map bandwidth chunks to processors. This requires less computational time than the analytical optimization approach of branch and bound.
Generally, the proposed algorithms require little computation time, permitting us to control the payload processors in real time. Moreover, if timing becomes a constraint, the proposed method can be used as a basis for an adaptive algorithm to control the number of active processors. This is possible because the traffic demand and the communication channels on a satellite-terrestrial link vary slowly and continuously. Hence, the algorithm need not be recomputed for every slight change in the parameters. However, this is beyond the scope of this work.

Fig. 12 shows the performance gap of the bin-packing methods while considering the branch-and-bound-CC method as a baseline. The performance gap of the next-fit method is significantly higher than that of the first-fit and best-fit methods. For instance, at 100 Mb/s, the performance gap of next fit is 105%, whereas the performance gap for first fit and best fit is 10%. Generally, since the next-fit method has a lower computational time, it is suitable for time-constrained scenarios at the expense of requiring more processors. If the time constraint is not critical, it is possible to implement the branch-and-bound-CC method to utilize fewer processors. The first-fit and best-fit methods offer a tradeoff between the number of active processors and the computation time.

B. Heterogeneous Demand
In this section, we study the performance of the proposed methods for heterogeneous demand. The traffic demand can be modeled as a Poisson distribution, which follows an exponential distribution [47]. Hence, we generate each beam demand using an exponential distribution. The cumulative distribution function of this distribution is given by

$$F(D[i]) = 1 - e^{-D[i]/\beta}.$$

Here, $\beta$ is the mean traffic demand of the system and $D[i]$ is the random variable indicating the demand of beam $i$. Then, we obtain the demand $D[i]$ from $F(D[i])$ using the inverse method [48]. For this, we generate a uniform random number $F(D[i]) = \chi[i]$ in the interval (0, 1). Hence, $D[i]$ is obtained as follows:

$$D[i] = -\beta \ln\left(1 - \chi[i]\right). \tag{25}$$

Accordingly, we generate three heterogeneous demand distributions based on (25), which are called low, moderate, and high demand, as shown in Fig. 13.
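The inverse method described above can be sketched in a few lines of Python. The sketch assumes the standard exponential cumulative distribution function with mean $\beta$, $F(D[i]) = 1 - e^{-D[i]/\beta}$, whose inverse is $D[i] = -\beta \ln(1 - \chi[i])$. The function name, the mean of 100 Mb/s, and the number of beams are hypothetical values chosen for illustration only.

```python
import math
import random


def sample_beam_demands(mean_demand, num_beams, rng):
    """Draw exponentially distributed beam demands by inverse transform.

    mean_demand: beta, the mean traffic demand (same unit as the output).
    num_beams:   number of beams to generate demands for.
    rng:         a random.Random instance (seeded for reproducibility).
    """
    demands = []
    for _ in range(num_beams):
        chi = rng.random()  # uniform sample in [0, 1)
        # Invert F(D) = 1 - exp(-D / beta) to obtain D from chi.
        demands.append(-mean_demand * math.log(1.0 - chi))
    return demands


if __name__ == "__main__":
    rng = random.Random(0)
    demands = sample_beam_demands(mean_demand=100.0, num_beams=20, rng=rng)
    print([round(d, 1) for d in demands])
```

Changing `mean_demand` is one way to obtain low-, moderate-, and high-demand profiles of the kind compared in this section.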

TABLE III Processors Used per Demand
Table III shows the processor usage for the low-, moderate-, and high-demand distributions. Each distribution is represented in the table by its mean value. We observe that the system uses few processors when beams have low demand and more when moderate and high demands occur, with the moderate-demand distribution requiring fewer processors than the high-demand distribution. The branch-and-bound-CC method has better processor utilization than the first-fit, next-fit, and best-fit methods for the moderate- and high-demand distributions. For instance, the branch-and-bound-CC uses four processors at moderate demand, whereas the first fit, the best fit, and the next fit utilize five, five, and seven processors, respectively. However, all the proposed methods except next fit show the same performance in the case of low demand. As expected, the branch-and-bound-WoCC has better processor utilization than the other methods. However, it does not support CC, so it does not guarantee that all bandwidth chunks of a beam are assigned to one processor. Hence, it needs an additional signal processing technique to recombine the bandwidth chunks of a beam from multiple processors for transmission. This problem does not occur in the other methods, which handle a beam's bandwidth chunks in a single processor.
Fig. 14(a) depicts the overall bandwidth of the signals handled by each processor when the demand is high. The processors with the lowest index values handle more bandwidth chunks than those with the highest index values. Similarly, the frequency carriers managed by each processor are shown in Fig. 14(b). The first processor handles the most frequency carriers, while the other processors handle the remaining ones. This is because the algorithms use the processors sequentially, filling the first processor before checking the next one. Additionally, we can see from Fig. 14(c) that a few processors handle the majority of the carriers. For example, using only six processors, the system can deal with 94%-97% of the carriers with the branch-and-bound-CC, first-fit, and best-fit methods. Hence, we can save up to three processors.
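The relation between carriers per processor and the share of carriers covered by the first few processors can be made concrete with a short sketch. The per-processor carrier counts below are hypothetical numbers chosen for illustration, not results from the simulations, and the function names are likewise assumptions.

```python
def cumulative_share(carriers_per_processor):
    """Percentage of all carriers handled by the first k processors, k = 1..L."""
    total = sum(carriers_per_processor)
    shares, running = [], 0
    for carriers in carriers_per_processor:
        running += carriers
        shares.append(100.0 * running / total)
    return shares


def processors_needed(carriers_per_processor, target_percent):
    """Smallest number of leading processors covering `target_percent` of carriers."""
    shares = cumulative_share(carriers_per_processor)
    for count, share in enumerate(shares, start=1):
        if share >= target_percent:
            return count
    return len(carriers_per_processor)


if __name__ == "__main__":
    # Hypothetical carrier counts for nine active processors (100 carriers).
    carriers = [30, 25, 15, 12, 8, 5, 3, 1, 1]
    print(cumulative_share(carriers))
    print(processors_needed(carriers, 94))
```

With these illustrative counts, six of the nine processors already cover at least 94% of the carriers, which is the kind of reading that a cumulative plot of carriers versus processors supports.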

C. Arrangement of Bandwidth Chunks on the Processors
This section demonstrates the distribution of the bandwidth chunks corresponding to each beam among the processors. Here, we consider a moderate-demand scenario, and Fig. 15 shows the arrangement of the bandwidth chunks. Furthermore, two or more beams that share the same bandwidth chunk are represented by the same color. For example, beams 3, 6, and 20 are represented in red since they share the same bandwidth chunk. In addition, Fig. 16(a) and (b) provides the arrangement of the normalized bandwidth chunks on the processors when the system employs the next-fit and branch-and-bound-WoCC methods, respectively. We observe that the branch-and-bound-WoCC has a better bandwidth arrangement than the other methods. This is because the method supports distributing the bandwidth chunks of any beam across multiple processors. For instance, beams 3, 6, and 20 in Fig. 16(b) utilize multiple processors. Therefore, the flexibility to assign bandwidth chunks to each processor increases, and the method utilizes fewer processors than the other methods. However, it requires additional signal processing to recombine the respective bandwidth chunks of the beams. Unlike the branch-and-bound-WoCC, none of the remaining methods require additional signal processing since the bandwidth chunks of a beam are processed in only one processor. For example, beam 3 uses only the first processor.
The branch-and-bound-CC on processor one, the best fit on processor two, and the branch-and-bound-WoCC on processor one, shown in Fig. 15(a), Fig. 15(c), and Fig. 16(b), respectively, have more bandwidth chunks compared to the first-fit and next-fit methods. Consequently, the remaining processors have more space for managing other applications. Regarding the number of processors utilized, the branch-and-bound-CC, first-fit, and best-fit methods show similar performance. On the other hand, the next-fit method shown in Fig. 16(a) utilizes more processors than the other methods. Furthermore, this creates a significant amount of unused bandwidth in each processor. In contrast, the branch-and-bound-WoCC uses fewer processors than the other methods.

V. CONCLUSION
In this article, we propose an approach to manage onboard processors for high-throughput NGSO satellite systems. For this, we develop algorithms that can flexibly control the number of utilized payload processors in response to the system and user requirements. Accordingly, we formulate an optimization problem to minimize the number of operating processors under beam abstraction, demand satisfaction, and processor abstraction constraints.
The optimization problem is nonconvex, and we solve it in two steps. First, we design the bandwidth allocation strategy. Subsequently, we determine the exact number of processors required to accommodate this bandwidth allocation. In this context, we propose a sequential optimization-based branch-and-bound method and a bin-packing method using next fit, first fit, and best fit. We then evaluate the performance of each of the proposed methods. The results show that the branch-and-bound, best-fit, and first-fit algorithms provide the best results in flexibly managing payload processors.

Fig. 7. Mapping bandwidth to processors using the next-fit algorithm.

1) Next-fit algorithm: This activates only one bin/processor at a time. Furthermore, a new processor is activated when the item $W_i$ does not fit in the current $l$th processor. In this case, the item $W_i$ is mapped to the next processor $l+1$ if beam $j$ in the $l$th processor and beam $i$ belong to the same group or if the processor capacity is insufficient to support the $W_i$ of beam $i$. As an example, consider $W_1 = 0.6$, $W_2 = 0.7$, $W_3 = 0.3$, $W_4 = 0.4$, $W_5 = 0.1$, and $W_6 = 0.55$ as well as the selected beam groups: $G_7$ = {Beam 1, Beam 2}, $G_{16}$ = {Beam 3, Beam 4}, $G_{22}$ = {Beam 1, Beam 2, Beam 6}, $G_{35}$ = {Beam 2, Beam 5, Beam 4}, and $G_{37}$ = {Beam 2, Beam 5, Beam 6}.

Fig. 10. Comparison of the proposed techniques for uniform demand distribution. (a) Processors used per demand. (b) Bandwidth per processor, for beam-demand = 200 Mb/s. (c) Number of carriers per processor, for beam-demand = 200 Mb/s. (d) Processors needed to process the number of carriers for beam-demand = 200 Mb/s.

Fig. 11. Computational time of branch-and-bound and bin-packing methods for different demands.

Fig. 12. Bin-packing performance gap compared to branch-and-bound-CC for different demands.

Fig. 14. Comparison of the proposed techniques for heterogeneous demand distribution. (a) Bandwidth per processor, for the high-demand scenario. (b) Number of carriers per processor, for the high-demand scenario. (c) Processors needed to process the number of carriers for the high-demand scenario.
Fig. 15(a)-(c) depicts the arrangement of the normalized bandwidth chunks of each beam on the processors for the branch-and-bound-CC, first-fit, and best-fit methods, respectively. Each figure has a beam number, and the color indicates the bandwidth chunk per beam.