Age-Optimal Downlink NOMA Resource Allocation for Satellite-Based IoT Network

Abstract-The upcoming satellite-based Internet of Things (S-IoT) has the capability to provide timely status updates to massive terrestrial user equipments (UEs) via non-orthogonal multiple access (NOMA), owing to the worldwide coverage of satellites. Considering the constrained power and storage resources of S-IoT and the need to keep information fresh, we first propose three constraint conditions: average/peak power constraints, network stability, and a minimum throughput requirement. Then, we formulate a long-term age of information (AoI) minimization problem under these three constraints. To solve this complex long-term problem, we transform the constraints into three queue stability problems via the Lyapunov optimization framework, thus converting the long-term multi-slot stochastic optimization problem into a series of single-slot deterministic optimization problems. Moreover, we leverage the ListNet algorithm to derive the weights of the queue backlog and channel conditions and obtain an optimized power allocation order with linear complexity. Finally, we utilize the particle swarm optimization algorithm to solve the NOMA long-term AoI minimization (AM) power allocation problem with practical complexity, which we name the NOMA-AM scheme. Simulation results show that the proposed NOMA-AM scheme achieves the lowest expected weighted sum AoI among several benchmark schemes.
Index Terms-Age of information, ListNet algorithm, Lyapunov optimization, resource allocation, satellite-based IoT.

I. INTRODUCTION
The upcoming satellite-based Internet of Things (S-IoT) will enable massive machine type communication (mMTC) anywhere and anytime by integrating satellites with terrestrial IoT user equipments (UEs) [1], [2]. Hence, S-IoT will be a key enabler of fifth generation-advanced (5G-A) and future sixth generation (6G) wireless networks [3], [4]. Furthermore, with the rapid development of mobile communications and S-IoT, there is an increasing need for timely status updates in various scenarios, such as precision agriculture, factory automation, smart cities, environment monitoring, and intelligent transportation systems (ITS) [5]. In these applications, information freshness is of paramount importance, since obsolete information may lead to unpredictable or even disastrous results. For example, an environment monitoring system that detects disastrous phenomena such as forest fires or earthquakes should report them to the control center as soon as possible so that an effective reaction can be made. To characterize the information freshness of status updates, [6] defines a new metric called age of information (AoI), as the time elapsed since the freshest status update was generated. The authors in [7] have compared the AoI performance of non-orthogonal multiple access (NOMA) with orthogonal multiple access (OMA) in a two-UE access system, and validate that both OMA and NOMA could improve the average AoI under different simulation environments.
In the S-IoT downlink network, if the satellite uses the conventional OMA scheme to transmit status updates to massive terrestrial UEs, it can only serve one UE in each time slot. In this way, only the served UE has the chance to reduce its AoI, while the AoI of all other UEs increases, which might increase the average AoI and deteriorate the information freshness in the S-IoT downlink network. Therefore, we introduce the NOMA scheme in the S-IoT downlink network and enable the satellite to transmit status updates to multiple UEs simultaneously [8]. Thus, the AoI of multiple UEs can be reduced simultaneously, and the average AoI in the S-IoT network can decrease to a lower level than with the OMA scheme. In fact, owing to its capability of reducing the number of transmission phases, NOMA is viewed as a potential enabler of mission critical communications (MCC) [9], since the worst-case one-way propagation latency is expected to be 21 ms for a low earth orbit (LEO) satellite at 1200 km, and 13 ms for a LEO satellite at 600 km [10].
Note that, unlike terrestrial networks, a satellite usually has constrained storage and power resources, due to the extremely expensive launch cost and the limited mass of the satellite platform and payload [11], [12]. As a consequence, an AoI minimization resource allocation scheme for the downlink NOMA S-IoT network is well worth studying. The authors in [13] derive an optimal scheduling scheme via a Markov decision process (MDP). The authors in [14] optimize the average AoI by jointly scheduling IoT devices and sampling status updates. However, as the number of system parameters increases, leveraging the MDP method to optimize AoI faces two non-trivial problems, i.e., the exponentially exploding state space and the huge computation complexity [15], which is also termed the curse of dimensionality [16]. To avoid such problems in MDP, the Lyapunov optimization framework, which can solve stochastic network optimization problems with long-term constraints [19], has been utilized to solve AoI optimization problems in recent works [17], [18]. Moreover, the Lyapunov optimization framework is also studied in AoI optimization for sample management and scheduling in [20], [21]. The authors in [22] design an online optimization algorithm to maximize the network throughput subject to an average AoI constraint. Furthermore, finding the optimal order of power allocation to UEs in the downlink NOMA S-IoT network under multiple constraints is important and extremely complicated [23]. The authors in [24] jointly optimize the UE order and power allocation to lower the power consumption. However, due to the limited storage in the satellite and the varying channel conditions in the downlink NOMA S-IoT network, it is extremely difficult to obtain an optimal UE ordering. Therefore, we resort to the learning to rank (LTR) algorithm, which performs ranking by means of machine learning techniques and has received great attention due to its effectiveness in numerous scenarios, especially in natural language processing, rank prediction and data mining [25]. The authors in [26] classify LTR algorithms into three categories: the Pointwise, Pairwise, and Listwise approaches. A large number of experiments show that the Listwise approach outperforms the other two approaches on benchmark data sets and is capable of modeling the ranking problem more naturally [27]. In this paper, we adopt the ListNet algorithm, which is a representative method of the Listwise approach and optimizes a listwise loss function [28], to obtain the optimized UE resource allocation order.
Motivated by the above, we propose an age-optimal resource allocation scheme with the help of the Lyapunov optimization framework in the downlink NOMA S-IoT network under three constraint conditions, and summarize our contributions as follows:
• To the best of our knowledge, this is the first work to propose an AoI minimization resource allocation scheme for the downlink NOMA S-IoT network, which aims to minimize the expected weighted sum AoI (EWSAoI) under three constraint conditions, i.e., the average/peak power constraints, network stability and the minimum throughput requirement. We establish three virtual queues for these constraints to track the power consumption under the average/peak power constraints, the queue backlog for network stability, and the throughput debt under the minimum throughput requirement, respectively. Then, we utilize the Lyapunov optimization framework to solve the long-term stochastic optimization problem.
• Due to the time-varying nature of the channel conditions and queue backlog, it is extremely difficult to determine an optimized power allocation order for the UEs. To avoid the complexity of an exhaustive search, we utilize the ListNet algorithm to derive the weights of the queue backlog and channel conditions and obtain an optimized power allocation order with linear complexity. Then, considering the non-convexity of our age-optimal resource allocation problem, we utilize the particle swarm optimization (PSO) algorithm to derive a NOMA long-term AoI minimization (AM) power allocation policy, named the NOMA-AM scheme. Our NOMA-AM scheme can outperform state-of-the-art schemes because it optimizes both the power allocation order and the power coefficients.
• We analyze the complexity of our age-optimal NOMA-AM scheme and conduct extensive simulations to compare it with existing benchmarks, such as the NOMA-DPPA (dynamic programming based power allocation) scheme [29], the Max-Weight scheme [20], and the NOMA-G, NOMA-Q and OMA schemes. Simulation results demonstrate that NOMA-AM with the ListNet algorithm achieves the lowest EWSAoI among the benchmark schemes. Moreover, the EWSAoI performance of the NOMA-AM scheme is also investigated under different fading channel conditions and different numbers of antennas. Finally, we analyze the tradeoff between the EWSAoI and the average power consumption in the downlink NOMA S-IoT network, where the EWSAoI of our NOMA-AM scheme can be decreased by slightly increasing the average power consumption under the long-term average power constraint.
The remainder of this paper is organized as follows. Section II depicts the system model, including the downlink NOMA S-IoT network and AoI modeling. Section III elaborates the long-term age-optimal problem. Section IV converts the long-term age-optimal problem into Lyapunov optimization. Section V derives the ListNet algorithm and NOMA-AM scheme. In Section VI-B, we demonstrate the simulation results. Finally, we present the conclusion in Section VII.

II. SYSTEM MODEL
In this section, we describe the downlink NOMA S-IoT network in detail and provide the EWSAoI model for the received status updates, which characterizes the information freshness of the whole system.

A. Downlink NOMA S-IoT Network
We consider the downlink NOMA S-IoT network in Fig. 1, containing a LEO multi-beam high throughput satellite (HTS) $S$ and $K$ terrestrial UEs in each steerable spot beam coverage [30], [31]. Assume that the frequency band of $S$ is divided into three sub-bands so that adjacent steerable spot beams are allocated non-overlapping frequency spectrum, as shown in Fig. 1. To further avoid inter-beam interference between adjacent steerable spot beams, $S$ allocates the resources in a hybrid multiple access way, i.e., we assume that $S$ serves different steerable spot beams in an OMA way while it communicates with the $K$ UEs within the same steerable spot beam coverage via NOMA [32]. Hence, we only need to consider one spot beam in this downlink NOMA S-IoT network. We assume that $S$ moves while all $K$ UEs are stationary. Note that since the distance between $S$ and the UEs is several hundred kilometers, the Doppler shifts caused by the motion of $S$ are identical for different UEs in the same spot beam [33]. Moreover, when the guard bandwidth in $S$ is set to twice the Doppler shift, the influence of the Doppler shifts on the system can be relieved [34].
By taking advantage of the NOMA scheme, $S$ can communicate with $K$ activated UEs simultaneously. We divide a time period into $T$ time slots and let $t$ represent the current time slot ($t \in \{0, 1, 2, \ldots, T-1\}$). Without loss of generality, we let the duration $\tau$ of each time slot equal the propagation latency from the satellite to a terrestrial UE. As shown in Fig. 1, we establish three virtual queues to evaluate the network performance: the power debt queue $P(t)$, the queue backlog $Q_i(t)$ ($i = 1, 2, \ldots, K$) and the throughput debt queue $U_i(t)$. The arrival processes of $P(t)$, $Q_i(t)$ and $U_i(t)$ are the average power constraint $P_{mean}$, the data arriving rate $ar_i(t)$ and the data departing rate $br_i(t)$, respectively. The departure processes of $P(t)$, $Q_i(t)$ and $U_i(t)$ are the total power consumption $\sum_{i=1}^{K} p_i(t)$, the data departing rate $br_i(t)$ and UE $i$'s minimum throughput requirement $h_i$, respectively, and the average power consumption is denoted by $\bar{P}$. For convenience, the related notations are summarized in Table I.
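Considering the scattering and masking effects caused by the barriers and obscuration around the terrestrial UEs in satellite communications, we apply the widely used shadowed-Rician fading channel model in the downlink NOMA S-IoT network, which is proposed in [35] and takes both the fading and masking effects into consideration [36], [37]. Moreover, we assume that the channels between $S$ and different UEs are independent and identically distributed (i.i.d.). When $S$ is equipped with a single transmitting antenna, the probability density function (PDF) of the channel gain $|ch_i|^2$ is as follows [35],
$$f_{|ch_i|^2}(x) = \alpha_i e^{-\beta_i x} \, {}_1F_1(m_i; 1; \delta_i x), \quad (1)$$
where $\alpha_i = \frac{1}{2 b_i} \left( \frac{2 b_i m_i}{2 b_i m_i + \Omega_i} \right)^{m_i}$, $\beta_i = \frac{1}{2 b_i}$, $\delta_i = \frac{\Omega_i}{2 b_i (2 b_i m_i + \Omega_i)}$, ${}_1F_1(\cdot; \cdot; \cdot)$ is the confluent hypergeometric function, and $m_i$, $2 b_i$ and $\Omega_i$ are the Nakagami-m parameter, the mean power of the multipath component and the mean power of the line of sight (LoS) component, respectively. Moreover, we assume that the channel state between $S$ and the $K$ UEs is invariant within each time slot but changes randomly from one time slot to another. Integrating (1) term by term, the cumulative distribution function (CDF) of the channel gain $|ch_i|^2$ for $S$ with a single transmitting antenna is as follows [35],
$$F_{|ch_i|^2}(x) = \alpha_i \sum_{k=0}^{\infty} \frac{(m_i)_k \, \delta_i^k}{(k!)^2 \, \beta_i^{k+1}} \, \gamma(k+1, \beta_i x), \quad (2)$$
where $(\cdot)_k$ is the Pochhammer symbol and $\gamma(\cdot, \cdot)$ is the lower incomplete gamma function.
Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.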
When $S$ is equipped with $N$ transmitting antennas, the corresponding CDF of the channel gain $|ch_i|^2$ is given in [38] and is expressed in terms of the Beta function $B(\cdot, \cdot)$.
Suppose that $S$ has $N$ transmitting antennas and each UE has one receiving antenna. Let $s_i(t)$ and $p_i(t) \in \mathbb{C}^N$ denote the desired signal and the complex weight column vector of the allocated transmit power for UE $i$ in time slot $t$, respectively. Without loss of generality, we assume that the transmit powers allocated to the desired signals of the $K$ activated UEs are sorted in ascending order of their index numbers, i.e., the signal $s_K(t)$ has the highest power level $|p_K(t)|^2$, and the superposed signal $s(t)$ for the $K$ activated UEs can be expressed as follows,
$$s(t) = \sum_{i=1}^{K} p_i(t) s_i(t). \quad (4)$$
According to the downlink NOMA scheme, the superposed signal $s(t)$ is broadcast to the UEs. The received signal $y_i(t)$ at UE $i$ depends on three quantities. First, $ch_i(t) \in \mathbb{C}^N$ represents the row vector of channel coefficients from the $N$ antennas, which follows the shadowed-Rician fading distribution. Second, $F = 92.4 + 20 \log f + 20 \log d$ is the free space loss (in dB) from $S$ to UE $i$, where $f$ denotes the downlink spot beam frequency of $S$ (in GHz) and $d$ is the altitude of $S$ (in km). Third, $n_i(t) \sim \mathcal{CN}(0, \sigma^2)$ is the additive white Gaussian noise (AWGN) with zero mean and variance $\sigma^2$.
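The free space loss expression above is simple enough to check numerically. The following is a minimal sketch, assuming the usual convention for the 92.4 dB constant (frequency in GHz, distance in km, base-10 logarithms); the function names are illustrative and do not come from the paper.

```python
import math


def free_space_loss_db(freq_ghz: float, dist_km: float) -> float:
    """Free space loss F = 92.4 + 20 log10(f) + 20 log10(d), in dB.

    Assumes f is given in GHz and d in km, which is the convention
    under which the constant 92.4 applies.
    """
    if freq_ghz <= 0 or dist_km <= 0:
        raise ValueError("frequency and distance must be positive")
    return 92.4 + 20.0 * math.log10(freq_ghz) + 20.0 * math.log10(dist_km)


def loss_db_to_linear(loss_db: float) -> float:
    """Convert a loss in dB to a linear power ratio."""
    return 10.0 ** (loss_db / 10.0)
```

For instance, `free_space_loss_db(f, 300.0)` would give the loss at the 300 km altitude used in the simulation setup, for whatever downlink frequency `f` is assumed.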
To conclude, we make the following important assumptions in our system model without loss of generality: 1) $S$ serves different steerable spot beams in an OMA way while it communicates with the $K$ UEs within the same steerable spot beam coverage via NOMA; 2) the influence of the Doppler shifts in our system can be relieved by setting the guard bandwidth in $S$ to twice the Doppler shift; 3) $S$ has $N$ transmitting antennas while each UE has one receiving antenna; 4) the transmit powers allocated to the desired signals of the $K$ activated UEs are sorted in ascending order of their index numbers.
Then, according to the NOMA scheme, UE $i$ utilizes successive interference cancellation (SIC) to recover its desired signal $s_i(t)$ from $y_i(t)$ by treating other UEs' signals as intra-cell interference. Since the transmit powers allocated to the desired signals of the $K$ activated UEs are sorted in ascending order of their index numbers, the signal $s_K(t)$, which has the highest power level $|p_K(t)|^2$, is decoded first by treating the other signals as interference. If the decoding is correct, $s_K(t)$ is subtracted from $y_i(t)$ and $s_{K-1}(t)$ is decoded next, and so on until $s_i(t)$ is recovered at UE $i$. According to the SIC decoding order, the achievable rate of UE $i$ can be obtained, where $B$ is the bandwidth.
Therefore, in order to guarantee successful SIC decoding at each UE, the received signal powers of different UEs' signals must be distinguishable [39]. Without loss of generality, assume that the UEs are indexed according to their composite channel gains $|g_i|^2$; then the received power of each UE's signal at UE 1 must satisfy the conditions in [40]. Similarly, the received signal power at UE $i$ ($i = 2, \ldots, K$) also needs to satisfy these conditions to guarantee successful SIC decoding. In summary, the power allocated at $S$ to the signal of UE $i$ should satisfy these conditions to guarantee SIC decoding; otherwise, SIC decoding fails.
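The SIC decoding chain described above can be illustrated with a small sketch. It is not the paper's rate expression, which is not reproduced here. It assumes a single-antenna link, a noise-normalized composite gain `g_i` (hypothetical name), scalar powers sorted in ascending order of UE index, and the Shannon rate B log2(1 + SINR).

```python
import math
from typing import List, Sequence


def sic_sinrs(g_i: float, powers: Sequence[float], i: int) -> List[float]:
    """SINRs seen by UE i (0-based) while decoding signals K-1, K-2, ..., i.

    powers[j] is the power allocated to UE j's signal, ascending in j.
    When signal j is decoded, all stronger signals (index > j) have
    already been cancelled, and all weaker ones (index < j) act as
    interference. g_i is the channel gain divided by the noise power.
    """
    sinrs = []
    for j in range(len(powers) - 1, i - 1, -1):
        interference = sum(powers[:j])
        sinrs.append(g_i * powers[j] / (g_i * interference + 1.0))
    return sinrs


def rate_bps(bandwidth_hz: float, sinr: float) -> float:
    """Shannon rate for a given bandwidth and SINR (an assumed rate model)."""
    return bandwidth_hz * math.log2(1.0 + sinr)
```

Under these assumptions, decoding at UE `i` succeeds only if every SINR in the returned list supports the corresponding target rate, which is why the power levels must be sufficiently separated.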

B. Expected Weighted Sum Age of Information
In this paper, we adopt the "generate at will" model proposed in [41] to reduce the queueing delay of status updates waiting in the queue for a transmission opportunity. Thus, the HTS can generate the status updates for the covered UEs at the beginning of each time slot [42], [43]. Let $d_i(t) \in \{0, 1\}$ indicate whether the status update of UE $i$ is transmitted successfully, where $d_i(t) = 1$ means successful decoding at UE $i$, and $d_i(t) = 0$ means an SIC failure at UE $i$. Let $a_i(t)$ denote the AoI of UE $i$ at time slot $t$, which is a non-negative integer depending on the result of SIC. If SIC is successful in time slot $t$ at UE $i$, the AoI of UE $i$ updates to $a_i(t+1) = 1$, as the recovered signal was generated at time slot $t$; otherwise $a_i(t+1) = a_i(t) + 1$, which means that the status at UE $i$ is one time slot older. Thus, the AoI is determined by the SIC result in each time slot at UE $i$, and SIC is highly affected by the power allocated at $S$ to each UE's signal. Therefore, the power allocation strategy should be carefully designed to ensure successful SIC decoding and minimize the EWSAoI. The AoI evolution of UE $i$ is illustrated in Fig. 2.
Thus, the AoI evolution for UE $i$ is determined by SIC, and we have
$$a_i(t+1) = \begin{cases} 1, & d_i(t) = 1, \\ a_i(t) + 1, & d_i(t) = 0. \end{cases} \quad (11)$$
Therefore, we utilize the EWSAoI, denoted by $\bar{A}$, to characterize the information freshness of all $K$ UEs in our downlink NOMA S-IoT network. In $\bar{A}$, the expectation $E[\cdot]$ is affected by the channel variation of all $K$ UEs in the same spot beam and by the resource allocation policy at $S$, and $w_i$ is a positive real number that represents the importance of UE $i$. In addition, if we set $w_i = 1$ ($i = 1, 2, \ldots, K$), the EWSAoI degenerates to the average AoI (AAoI).
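The AoI update rule can be traced with a few lines of code. The sketch below follows the rule stated in the text (reset to 1 on successful SIC, otherwise grow by one slot). The empirical weighted average is an assumption: it normalizes by the number of slots and UEs so that unit weights yield the average AoI, and it replaces the expectation with a finite sample average.

```python
from typing import List, Sequence


def next_aoi(aoi: Sequence[int], decoded: Sequence[bool]) -> List[int]:
    """One-slot AoI update: 1 if d_i(t) = 1, else a_i(t) + 1."""
    return [1 if d else a + 1 for a, d in zip(aoi, decoded)]


def empirical_weighted_aoi(history: Sequence[Sequence[int]],
                           weights: Sequence[float]) -> float:
    """Weighted AoI averaged over the recorded slots and the K UEs.

    history[t][i] is a_i(t). The 1/(T*K) normalization is an assumption
    made here for illustration.
    """
    num_slots, num_ues = len(history), len(weights)
    total = sum(w * a for row in history for w, a in zip(weights, row))
    return total / (num_slots * num_ues)
```

A short trace: starting from `[1, 1]`, one slot in which only the first UE decodes gives `[1, 2]`, and a following slot with no successful decoding gives `[2, 3]`.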

III. AGE-OPTIMAL PROBLEM FORMULATION
In this section, we formulate the EWSAoI minimization problem under three constraint conditions in the downlink NOMA S-IoT network.

A. Average/Peak Power Constraint
Unlike terrestrial networks, a satellite usually has extremely constrained storage and power resources. Hence, when we design the resource allocation policy for the S-IoT network, the short-term peak power constraint should be considered: the total transmit power allocated in any time slot must not exceed $P_{max}$, the maximum power that $S$ can provide in one time slot duration $\tau$. Moreover, the long-term average power constraint should also be taken into account: the time-averaged total transmit power must not exceed $P_{mean}$.

B. Network Stability Constraint
Assume that all the queues in $S$ are empty at the initial time slot 0. At the end of each time slot $t$, the status updates arrive at the queue backlog. If the status update of UE $i$ is not transmitted for many time slots, $Q_i$ grows continually while the storage buffer in $S$ is limited. Therefore, the system should satisfy a long-term network stability constraint on $Q_i(t)$.

C. Minimum Throughput Requirement
In network utility maximization power allocation [39], the UEs with worse channel conditions cannot receive their status updates in time, which might deteriorate the EWSAoI. Since we aim to minimize the EWSAoI by increasing the number of UEs that successfully recover their status updates under the average and peak power constraints, we introduce a long-term minimum throughput requirement. Let $h_i > 0$ represent the long-term minimum throughput requirement of UE $i$. Considering the data departing rate $br_i(t)$, the long-term throughput of UE $i$, denoted by $\bar{h}_i$, is defined as the limit of the time average of $br_i(t)$ as the number of time slots grows to infinity. As a consequence, the long-term minimum throughput requirement of UE $i$ is given by $\bar{h}_i \geq h_i$.

D. Problem Formulation
Therefore, the original resource allocation problem is to minimize the EWSAoI $\bar{A}$ in (12) via the power allocation of the UEs' signals $|p_i(t)|^2$ ($i = 1, 2, \ldots, K$) under the power, throughput and queue backlog constraints. It is formulated as problem (18), which minimizes $\bar{A}$ subject to the constraints C1 to C4, where C1 and C2 are the peak and average power constraints, respectively, C3 is the minimum throughput requirement, and C4 is the network stability constraint, which ensures the stability of the whole system and avoids data overflow. Note that the above optimization problem is non-convex. If $S$ allocates the power according to the descending order of the UEs' channel gains, the optimized performance is similar to that of the NOMA-G scheme. Similarly, the above optimization problem degenerates to the NOMA-Q scheme if $S$ allocates the power according to the queue backlog. Thus, we convert the above long-term age-optimal problem into the Lyapunov optimization framework in Section IV.
However, the explicit relationship between the channel gains and the queue backlog is still unattainable. Therefore, in Section V, we utilize a learning-based approach to find the power allocation order, and then apply the PSO algorithm to search for a globally optimal power allocation. We validate the superior performance of our NOMA-AM scheme over state-of-the-art schemes in Section VI.

IV. LYAPUNOV OPTIMIZATION
In this section, we first provide the evolution of the three virtual queues $P(t)$, $Q_i(t)$ and $U_i(t)$ for the long-term constraints in Section III. Then, we convert the long-term age-optimal problem into the Lyapunov optimization framework.

A. Virtual Queue Model
We leverage the Lyapunov optimization framework to solve the complex long-term age-optimal problem in Section III. To convert the three constraint conditions of the long-term age-optimal problem into a system stability problem in the Lyapunov optimization framework, we establish three virtual queues, whose evolution is given in the following.
Definition 1: The queue $X(t) \in \{P(t), Q_i(t), U_i(t)\}$ is mean-stable if it satisfies the following expression [19]:
$$\lim_{T \to \infty} \frac{E[|X(T)|]}{T} = 0. \quad (19)$$
Lemma 1: If the power consumption debt $P(t)$, queue backlog $Q_i(t)$ and throughput debt $U_i(t)$ are mean-stable, the long-term average power constraint C2, minimum throughput requirement C3 and network stability constraint C4 can be satisfied.
Proof: Please see Appendix A.
We utilize $P(t)$, $Q_i(t)$ and $U_i(t)$ to characterize the long-term system stability of the downlink NOMA S-IoT network. Their evolution is listed as follows:
• First, the evolution of the power consumption debt $P(t)$ is given by (20), so that the long-term average power constraint $P_{mean}$ is the upper bound of the average power consumption $\bar{P}$.
• The queue backlog $Q_i(t)$ describes the status updates that are buffered in queue $Q_i$ and wait to be forwarded to UE $i$. The evolution of $Q_i(t)$ is expressed in (21), where $ar_i(t)$ is the data arriving rate at queue $Q_i(t)$.
• The throughput debt queue $U_i(t)$ records the part of the throughput that falls short of the required throughput $h_i$, and is updated according to (22).
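To make the role of the three virtual queues concrete, the sketch below implements one slot of queue updates. It is not a transcription of (20)-(22), which are not reproduced here. It assumes the standard virtual-queue construction of the Lyapunov framework in [19]: the power debt grows with the power consumed and drains at $P_{mean}$, the data queue is served before new arrivals are added, and the throughput debt grows at $h_i$ and drains with the delivered rate. All names are illustrative.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class VirtualQueues:
    P: float        # power consumption debt
    Q: List[float]  # per-UE queue backlog
    U: List[float]  # per-UE throughput debt


def step(q: VirtualQueues,
         powers: Sequence[float],      # power spent on each UE this slot
         arrivals: Sequence[float],    # ar_i(t)
         departures: Sequence[float],  # br_i(t)
         p_mean: float,
         h_min: Sequence[float]) -> VirtualQueues:
    """One-slot update under the assumed standard max[., 0] forms."""
    p_next = max(q.P + sum(powers) - p_mean, 0.0)
    q_next = [max(qi - b, 0.0) + a
              for qi, a, b in zip(q.Q, arrivals, departures)]
    u_next = [max(ui + h - b, 0.0)
              for ui, h, b in zip(q.U, h_min, departures)]
    return VirtualQueues(p_next, q_next, u_next)
```

With this construction, keeping each virtual queue stable forces the corresponding long-term constraint to hold, which is the content of Lemma 1.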

B. Formulation of Lyapunov Optimization
Let $\Xi(t)$ denote a vector combining $P(t)$, $Q_i(t)$ and $U_i(t)$. The quadratic Lyapunov function $L(\Xi(t))$ is given in (23). To characterize the variation of the Lyapunov function across time slots, we leverage the Lyapunov drift in (24), which represents the change in the Lyapunov function from one time slot to the next [19]. Thus, we can maintain the system stability by reducing the Lyapunov drift, since a lower Lyapunov drift prevents the queue backlogs from entering congestion states. Moreover, to minimize the EWSAoI, a penalty function is defined in (25), and the drift-plus-penalty (DPP) expression $DPP(\Xi(t))$ is given in (26), where $V \geq 0$ is an importance weight that characterizes the relative importance of EWSAoI minimization and system stability. Therefore, by adjusting the value of $V$, we can achieve a tradeoff between the EWSAoI and system stability. Furthermore, we can derive an upper bound of $DPP(\Xi(t))$ according to the Lyapunov optimization, given in (27), where $c$ is a constant satisfying the inequality in (28). The detailed derivation of the upper bound (27) is summarized in Appendix B. Therefore, our original problem in (18) becomes the DPP minimization problem in (29). Then, we extract the parts of (29) containing $p_i(t)$ and formulate the optimal power allocation problem in (30). Note that the optimal power allocation in (30) is a single time slot optimization problem, which is independent of other time slots. Therefore, we can drop the time index $t$ and simplify (30) to (31). We can observe that (31) is non-convex, and the proof is given in Appendix C.
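For orientation, the generic drift-plus-penalty quantities of the Lyapunov framework in [19] can be written as below. This is a sketch of the standard textbook forms under the three virtual queues of this section, not a transcription of (23)-(27). The symbol $\mathrm{pen}(t)$ is a placeholder for the AoI-based penalty, whose exact form is not reproduced here.

```latex
\begin{align}
  % Quadratic Lyapunov function over the three virtual queues
  L(\Xi(t)) &= \frac{1}{2}\Big[ P(t)^2 + \sum_{i=1}^{K} Q_i(t)^2
               + \sum_{i=1}^{K} U_i(t)^2 \Big], \\
  % One-slot conditional Lyapunov drift
  \Delta(\Xi(t)) &= \mathbb{E}\big[ L(\Xi(t+1)) - L(\Xi(t)) \,\big|\, \Xi(t) \big], \\
  % Drift-plus-penalty with importance weight V >= 0
  DPP(\Xi(t)) &= \Delta(\Xi(t)) + V\, \mathbb{E}\big[ \mathrm{pen}(t) \,\big|\, \Xi(t) \big].
\end{align}
% Upper bounds of this kind typically rest on the inequality, valid for
% x, a, b >= 0:
%   (\max[x - b, 0] + a)^2 \le x^2 + a^2 + b^2 + 2x(a - b).
```

Applying the inequality to each queue update and collecting the second-order terms into a constant is the usual route to a bound of the form "constant plus terms linear in the queue states", which is what makes the per-slot problem deterministic.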

C. Optimized Power Allocation Order Based on Queue Backlog Q and Channel Condition g
As mentioned above, the order in which $S$ allocates power to the status updates of the UEs is of great importance in the downlink NOMA S-IoT network. On one hand, the queue backlog has a great effect on the EWSAoI. Our optimization goal is to minimize (31), and the queue backlog is inversely proportional to the objective function. Therefore, to minimize (31), we should give priority to the UEs with less queue backlog when $S$ allocates the power resources. On the other hand, if $S$ allocates power to the status updates of the UEs with the least queue backlog while ignoring their channel conditions, the system cannot make full use of the power resources. Therefore, we take both the queue backlog $Q$ and the channel condition $g$ into consideration when we sort the power allocation order for the status updates of the UEs. To obtain the optimal allocation order based on $Q$ and $g$, we propose a ranking function $F_v(Q, g)$ in (32), where $v_1$ and $v_2$ are the importance weights of the queue backlog $Q$ and the channel condition $g$, respectively. Then, we need to acquire the optimal values of $v_1$ and $v_2$ and calculate $F_v(Q, g)$ to obtain the optimized power allocation order. Therefore, we leverage the ListNet algorithm to derive the optimal $v_1$ and $v_2$ in Section V.
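Once the weights are known, turning the ranking function into an allocation order is a plain sort. The sketch below assumes a linear ranking function, $F_v(Q, g) = v_1 Q + v_2 g$, and assumes that a higher score means earlier allocation; both are illustrative choices, since (32) itself is not reproduced here.

```python
from typing import List, Sequence


def ranking_scores(Q: Sequence[float], g: Sequence[float],
                   v1: float, v2: float) -> List[float]:
    """Per-UE score under an assumed linear form v1 * Q_i + v2 * g_i."""
    return [v1 * q + v2 * gi for q, gi in zip(Q, g)]


def allocation_order(Q: Sequence[float], g: Sequence[float],
                     v1: float, v2: float) -> List[int]:
    """UE indices sorted by descending score (assumed priority rule)."""
    scores = ranking_scores(Q, g, v1, v2)
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

A negative `v1` expresses the preference for UEs with less queue backlog stated in the text, while `v2` trades that preference off against the channel condition.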

V. SOLUTION AND COMPLEXITY ANALYSIS
In this section, we first utilize the ListNet algorithm to derive $v_1$ and $v_2$ in (32), and obtain the optimized power allocation order with linear complexity. Then, we leverage the PSO algorithm to solve (31) and complete the design of the NOMA-AM scheme.

A. ListNet Algorithm
To derive $v_1$ and $v_2$ in (32), we adopt the ListNet algorithm, which uses a neural network as the model and gradient descent as the optimization algorithm [28]. In each time slot, the ranking function $F_v(Q, g)$ assigns a score to each UE. Then, the top one probability of UE $i$ is calculated as:
$$P_{top}(i) = \frac{\exp(F_v(Q_i, g_i))}{\sum_{j=1}^{K} \exp(F_v(Q_j, g_j))}, \quad (33)$$
and the gradient with respect to $v$ is calculated as in (34). Algorithm 1 summarizes the process of the ListNet algorithm in detail. The time complexity of the ListNet algorithm is of order $O(TK)$, while the conventional exhaustive search for the optimal power allocation order is $O(K!)$.
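The ListNet update can be sketched in a few lines for a linear scoring model. The code below is a minimal illustration, not Algorithm 1: it assumes a linear score $v \cdot x_i$ with features $x_i = (Q_i, g_i)$, uses the ListNet top one probability (a softmax over scores) and the cross-entropy loss between target and model top one distributions, whose gradient for a linear model is $\sum_i (P_{model}(i) - P_{target}(i)) x_i$. The target scores and learning rate are hypothetical inputs.

```python
import math
from typing import List, Sequence


def top_one_probs(scores: Sequence[float]) -> List[float]:
    """Top one probabilities: softmax of the scores (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def listnet_step(v: Sequence[float],
                 features: Sequence[Sequence[float]],
                 target_scores: Sequence[float],
                 lr: float = 0.1) -> List[float]:
    """One gradient-descent step on the ListNet cross-entropy loss."""
    scores = [sum(vk * xk for vk, xk in zip(v, x)) for x in features]
    p_model = top_one_probs(scores)
    p_target = top_one_probs(target_scores)
    grad = [
        sum((pm - pt) * x[k] for pm, pt, x in zip(p_model, p_target, features))
        for k in range(len(v))
    ]
    return [vk - lr * gk for vk, gk in zip(v, grad)]
```

Each step costs $O(K)$ for $K$ UEs, which is consistent with the linear per-slot complexity claimed for the ranking stage.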

B. Power Allocation by PSO Algorithm
Since the optimization problem in (31) is non-convex, we leverage the PSO algorithm [44] to solve it, using the objective of (31) as the fitness value. The PSO algorithm has lower computation and storage complexity than dynamic programming and the Karush-Kuhn-Tucker (KKT) conditions, which are usually used in solving non-convex optimization problems.
In the PSO algorithm, every particle has a velocity and a position, and all the particles move in the search space to find the position with the best fitness value, i.e., the optimal power allocation coefficients in (31). Let $po = [po_1, po_2, \ldots, po_K]$ denote the position of a particle and $ve = [ve_1, ve_2, \ldots, ve_K]$ its velocity. The PSO algorithm updates the particles by calculating the following iteration:
$$ve_{m+1} = W \, ve_m + \delta_1 r_1 (Po_m - po_m) + \delta_2 r_2 (G_m - po_m), \quad (35)$$
and
$$po_{m+1} = po_m + ve_{m+1}, \quad (36)$$
where $\delta_1 \geq 0$ and $\delta_2 \geq 0$ are acceleration constants that adjust the step length, $r_1$ and $r_2$ are two random variables ranging from 0 to 1, which make the search more randomized, $Po_m$ represents the best position found so far by the individual particle, $G_m$ is the best position of the whole group of particles, and $m$ denotes the current iteration number. The inertia weight $W$ ($W \geq 0$) is obtained from (37), where $W_e$ and $W_s$ are the final and initial inertia weights, respectively, and $M$ is the maximum number of iterations. This AoI minimization resource allocation scheme is named NOMA-AM and summarized in Algorithm 2.
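The velocity and position updates translate directly into code. The following is a generic PSO sketch, not Algorithm 2: the fitness function is supplied by the caller (a stand-in for the objective of (31)), positions are clipped to a box `[lo, hi]`, and the inertia weight decreases linearly from `w_start` to `w_end`, which is a common choice assumed here. All default parameter values are hypothetical.

```python
import random
from typing import Callable, List, Sequence, Tuple


def pso_minimize(fitness: Callable[[Sequence[float]], float],
                 dim: int, lo: float, hi: float,
                 n_particles: int = 20, max_iter: int = 50,
                 delta1: float = 2.0, delta2: float = 2.0,
                 w_start: float = 0.9, w_end: float = 0.4,
                 seed: int = 0) -> Tuple[List[float], float]:
    """Minimize `fitness` over [lo, hi]^dim with a basic PSO loop."""
    rng = random.Random(seed)
    po = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    ve = [[0.0] * dim for _ in range(n_particles)]
    best_po = [p[:] for p in po]            # personal best positions
    best_fit = [fitness(p) for p in po]     # personal best fitness values
    g_idx = min(range(n_particles), key=lambda n: best_fit[n])
    g_po, g_fit = best_po[g_idx][:], best_fit[g_idx]  # group best

    for m in range(max_iter):
        # Linearly decreasing inertia weight (an assumed schedule).
        w = w_start - (w_start - w_end) * m / max_iter
        for n in range(n_particles):
            for k in range(dim):
                r1, r2 = rng.random(), rng.random()
                ve[n][k] = (w * ve[n][k]
                            + delta1 * r1 * (best_po[n][k] - po[n][k])
                            + delta2 * r2 * (g_po[k] - po[n][k]))
                # Clip to the feasible box after the position update.
                po[n][k] = min(max(po[n][k] + ve[n][k], lo), hi)
            f = fitness(po[n])
            if f < best_fit[n]:
                best_fit[n], best_po[n] = f, po[n][:]
                if f < g_fit:
                    g_fit, g_po = f, po[n][:]
    return g_po, g_fit
```

In the power allocation setting, `dim` would be the number of UEs $K$ and the box would encode the per-slot power limits. Constraints such as the SIC conditions would have to be handled inside `fitness`, for example through a penalty, which this sketch leaves to the caller.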

C. Complexity Analysis
Considering the extremely constrained power and storage resources in $S$, we elaborate the computational complexity and storage occupation of our NOMA-AM scheme in the following. In Algorithm 2, there is a nested "for" loop from Line 8 to Line 16. The computation complexity of updating the $K$ UEs' power allocation coefficients in Line 12 is $O(K)$. The operations over the max particles in Line 9 and over the $M$ iterations in Line 8, which search for the best position, are O(max) and $O(M)$, respectively. Therefore, the total computation complexity of the NOMA-AM-Baseline scheme is O(KM max). Note that the ListNet algorithm can be performed in Line 2 to optimize the power allocation order with $O(TK)$, so the total computation complexity of the NOMA-AM scheme is O(KM max) + $O(TK)$, where $T \ll M$ max, and the computation complexity is only slightly higher than O(KM max). In addition, the computation complexity of the NOMA-DPPA scheme in [29] is polynomial, $O(K^5)$, and that of the Max-Weight scheme is O(KM max), similar to the NOMA-AM-Baseline scheme [20]. A comparison of the computational complexity of the NOMA-AM scheme with the other benchmark schemes is summarized in Table II.
Moreover, the storage consumption of the NOMA-AM scheme is O(2(K + 1) max), where the positions of all the particles occupy O(K max) units, the velocities $ve$ occupy another O(K max) units, and $Po_m$ and fitness($Po_m$) occupy O(2 max) units.
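The orders of growth quoted in this subsection can be compared with simple counting functions. The sketch below only evaluates the expressions stated in the text ($K!$ for an exhaustive order search, $TK$ for ListNet, $K \cdot M \cdot$ particles for the PSO loop, and $2(K+1) \cdot$ particles storage units). The function and parameter names are illustrative, and constant factors are ignored.

```python
import math


def exhaustive_order_count(num_ues: int) -> int:
    """Number of candidate power allocation orders, K!."""
    return math.factorial(num_ues)


def listnet_cost(num_slots: int, num_ues: int) -> int:
    """Operation count of order T * K for the ranking stage."""
    return num_slots * num_ues


def pso_cost(num_ues: int, max_iter: int, num_particles: int) -> int:
    """Operation count of order K * M * (number of particles)."""
    return num_ues * max_iter * num_particles


def storage_units(num_ues: int, num_particles: int) -> int:
    """Storage of order 2 * (K + 1) * (number of particles)."""
    return 2 * (num_ues + 1) * num_particles
```

Even for moderate $K$, the factorial count dominates the linear ranking cost, which is the motivation for learning the order instead of searching it.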

A. Simulation Setup
In this section, we simulate the EWSAoI of our NOMA-AM schemes, including the NOMA-AM-ListNet scheme (which uses the ListNet algorithm to obtain the optimized power allocation order) and the NOMA-AM-Baseline scheme (which does not use the ListNet algorithm). Moreover, we compare them with five other benchmark schemes: 1) the NOMA-Q scheme, where the power allocated to $s_i(t)$ at $S$ is proportional to $Q_i(t)$; 2) the NOMA-G scheme, where the power allocated to $s_i(t)$ is inversely proportional to the composite channel gain $g_i(t)$ of UE $i$; 3) the NOMA-DPPA scheme [29], which obtains the solution through dynamic programming with high computational complexity; 4) the Max-Weight scheme [20], which is designed for AoI optimization by reducing only the Lyapunov drift instead of the drift-plus-penalty in each time slot; 5) the OMA scheme, which uses the conventional OMA scheme to transmit status updates to the UEs successively. We set the altitude of the HTS to 300 km and the one-way propagation latency to 5 ms, which equals the duration $\tau$ of a time slot. The important simulation parameters are summarized in Table III. Moreover, the channels between $S$ and the UEs follow the i.i.d. shadowed-Rician fading distribution, whose PDF and CDF are given in (1) and (2), respectively. The simulated shadowed-Rician fading channel parameters are given in Table IV.

B. Simulation Results
First, we study the effects of different fading parameters on the EWSAoI of our NOMA-AM scheme, and also compare the EWSAoI with the average peak AoI (PAoI) under the FHS fading parameters. In Fig. 3(a), when SNR ≤ 20 dB, the EWSAoI of our NOMA-AM scheme is poor under all three simulated channel conditions, especially under the FHS fading parameters. Moreover, the average PAoI is significantly larger than the EWSAoI as shown in Fig. 3(a), especially when SNR ≤ 20 dB. As the SNR increases, the EWSAoI of the NOMA-AM scheme decreases, since a higher SNR improves SIC decoding in the downlink NOMA S-IoT network. Moreover, the EWSAoI values of our NOMA-AM scheme under the three channel conditions converge when SNR ≥ 30 dB. In Fig. 3(b), as the number of UEs increases, the average PAoI under the FHS fading parameters grows markedly relative to the EWSAoI. Note that the gap between the EWSAoI under the FHS fading parameters and that under the ILS and AS fading parameters becomes larger, because the increasing number of UEs leads to a lack of power resources under the FHS fading parameters and severely deteriorates the freshness of status updates in the downlink NOMA S-IoT network. Therefore, we use the EWSAoI with FHS to validate the efficiency of our proposed NOMA-AM scheme in the following simulations.
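The relation between the time-average AoI and the average PAoI can be illustrated on a toy delivery trace. The sketch below assumes a common slotted AoI model for a single UE (the age grows by one per slot and resets to 1 after a successful delivery); this model, the function name and the sample trace are assumptions for illustration, not the exact definitions used in the paper.

```python
from typing import List, Sequence, Tuple


def aoi_trace(deliveries: Sequence[bool]) -> Tuple[List[int], List[int]]:
    """Return per-slot ages and the peak ages observed just before each reset."""
    age = 1
    ages: List[int] = []
    peaks: List[int] = []
    for delivered in deliveries:
        ages.append(age)
        if delivered:
            peaks.append(age)  # the peak is the age reached right before the reset
            age = 1
        else:
            age += 1
    return ages, peaks


if __name__ == "__main__":
    ages, peaks = aoi_trace([False, False, True, False, True])
    print(sum(ages) / len(ages))     # time-average age over the trace
    print(sum(peaks) / len(peaks))   # average peak age over the deliveries
```

Because the peaks only sample the largest age of each update cycle, the average PAoI on such a trace is at least the time-average age, which is in line with the gap observed in Fig. 3.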
Note that we leverage MRC at UE$_i$ to improve the EWSAoI, and we investigate the EWSAoI performance of the NOMA-AM scheme under different numbers of antennas. As shown in Fig. 4(a) and (b), the EWSAoI improves as the number of antennas $N$ increases when SNR < 20 dB. Specifically, when SNR = 10 dB, the EWSAoI with $N = 4$ is 39.1% lower than that with $N = 2$. Fig. 5 shows the EWSAoI performance versus SNR of the proposed NOMA-AM-ListNet and NOMA-AM-Baseline schemes, and we compare them with the NOMA-DPPA, Max-Weight, NOMA-Q and NOMA-G schemes. Moreover, Fig. 6 compares the EWSAoI performance of the above NOMA schemes and the OMA scheme versus the number of UEs, which validates that the EWSAoI of the NOMA schemes is significantly lower than that of the OMA scheme. The simulation results show that our proposed NOMA-AM scheme outperforms the existing schemes in terms of EWSAoI both in the single-antenna scenario in Fig. 5(a) and Fig. 6(a), and in the multiple-antenna scenario with $N = 4$ in Fig. 5(b) and Fig. 6(b). Moreover, the NOMA-AM-ListNet scheme has a significantly lower EWSAoI than the NOMA-AM-Baseline scheme, which validates that we can further reduce the EWSAoI by leveraging the ListNet algorithm to obtain the optimized power allocation order while introducing negligible computational complexity.
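The benefit of additional receive antennas can be illustrated with the textbook MRC result that the post-combining SNR equals the sum of the per-branch SNRs. The following sketch assumes equal noise power on every branch and uses hypothetical function names and channel values; it is not the paper's signal model.

```python
from typing import List, Sequence


def mrc_weights(channel: Sequence[complex]) -> List[complex]:
    """MRC combining weights: the complex conjugate of each branch coefficient."""
    return [h.conjugate() for h in channel]


def combiner_snr(weights: Sequence[complex], channel: Sequence[complex],
                 tx_power: float, noise_power: float) -> float:
    """Output SNR of a linear combiner with i.i.d. noise on every branch."""
    signal = abs(sum(w * h for w, h in zip(weights, channel))) ** 2 * tx_power
    noise = sum(abs(w) ** 2 for w in weights) * noise_power
    return signal / noise


if __name__ == "__main__":
    channel = [1 + 1j, 0.5 - 0.5j]             # hypothetical two-antenna channel
    snr_mrc = combiner_snr(mrc_weights(channel), channel, 1.0, 1.0)
    snr_single = combiner_snr([1.0], channel[:1], 1.0, 1.0)
    print(snr_mrc, snr_single)                 # combining both branches beats one branch
```

A higher combined SNR makes SIC decoding more likely to succeed, which is consistent with the lower EWSAoI reported for larger $N$ at low SNR.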
Finally, we conduct a simulation on the effects of the importance weight $V$ on the EWSAoI $\bar{A}$ as well as the average power consumption $\bar{P}$ in Fig. 7. When $V$ increases, $\bar{A}$ decreases and $\bar{P}$ increases both in the single-antenna scenario in Fig. 7(a) and in the multiple-antenna scenario in Fig. 7(b), because $\bar{A}$ decreases as the positive value of $P(t)$ in (20) increases, which means that $\bar{P}$ increases as well. Therefore, there is a tradeoff between $\bar{A}$ and $\bar{P}$: when $V$ becomes large, minimizing the EWSAoI is equivalent to increasing the average power consumption, while the power consumption still satisfies (18c) in our NOMA-AM scheme.

VII. CONCLUSION
In this paper, we have proposed an age-optimal resource allocation scheme for the downlink NOMA S-IoT network, which minimizes the EWSAoI under three constraint conditions. First, we converted the long-term age-optimal problem into a series of single-slot problems via the Lyapunov optimization framework. Then, we utilized the ListNet algorithm to derive appropriate weights for the queue backlog and channel condition, and obtained the optimized power allocation order with linear complexity. Finally, we leveraged the PSO algorithm to derive an AoI minimization power allocation scheme within linear complexity, i.e., the NOMA-AM scheme. Simulation results showed that our NOMA-AM scheme has the lowest EWSAoI in comparison with the other benchmark schemes in both the single-antenna and multiple-antenna scenarios. We also studied the EWSAoI performance of the NOMA-AM scheme under different fading channel conditions, discussed the effects of the importance weight $V$ on the EWSAoI, and verified the tradeoff between the EWSAoI and power consumption.

APPENDIX A PROOF OF LEMMA 1
First, it is obvious that if $Q_i(t)$ is mean-stable, the network stability constraint C4 is satisfied. Then, for the power consumption debt $P(t)$, mean stability of $P(t)$ means that $\lim_{t\to\infty} \mathbb{E}[P(t)]/t = 0$. Summing $P(t)$ over the range from 0 to $T$, we obtain:
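The equations of this step are not reproduced in this text. As a hedged sketch of the standard Lyapunov virtual-queue argument, assume that the power debt queue is updated as $P(t+1) = \max\{P(t) + p(t) - P_{\mathrm{avg}}, 0\}$, where $p(t)$ (the power consumed in slot $t$) and $P_{\mathrm{avg}}$ (the average power budget) are notation introduced here for illustration and may differ from the paper's (20). Under that assumption, the argument runs as follows:

```latex
\begin{align*}
P(t+1) &= \max\{P(t) + p(t) - P_{\mathrm{avg}},\, 0\}
        \;\ge\; P(t) + p(t) - P_{\mathrm{avg}}, \\
P(T) - P(0) &\ge \sum_{t=0}^{T-1} p(t) - T\, P_{\mathrm{avg}}, \\
\frac{1}{T}\sum_{t=0}^{T-1} \mathbb{E}[p(t)]
        &\le P_{\mathrm{avg}} + \frac{\mathbb{E}[P(T)]}{T} - \frac{\mathbb{E}[P(0)]}{T}, \\
\limsup_{T\to\infty} \frac{1}{T}\sum_{t=0}^{T-1} \mathbb{E}[p(t)]
        &\le P_{\mathrm{avg}} .
\end{align*}
```

The second line telescopes the first over $t = 0, \ldots, T-1$, the third divides by $T$ and takes expectations, and the last uses $\lim_{T\to\infty} \mathbb{E}[P(T)]/T = 0$ together with a finite $P(0)$.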
Taking the limit and the expectation of (39), we have the following inequality:
Taking the limit and the expectation of (42), we have the following inequality:

APPENDIX C NON-CONVEXITY OF (31)
To prove that (31) is non-convex, we need to calculate its Hessian matrix. For convenience, we assume that S has a single antenna and consider a two-UE scenario, and we have: Then, we calculate the Hessian matrix of (48) in the following: where and Note that $H_{2\times 2}$ is indefinite due to the uncertainty of $l_{1,1}$. Therefore, (48) is non-convex according to [46].
Then, we calculate the Hessian matrix of (31) in the following: where and We can derive that (31) is non-convex in a similar way, which completes the proof.
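The definiteness test used in this argument can be sketched for a generic symmetric $2 \times 2$ matrix. The function below is a hypothetical helper and the sample matrices are illustrative; they are not the Hessian of (48), whose entries are not reproduced here. It relies on the fact that the determinant is the product of the two eigenvalues and the trace is their sum.

```python
def classify_symmetric_2x2(a: float, b: float, d: float) -> str:
    """Classify the symmetric matrix [[a, b], [b, d]] by its eigenvalue signs."""
    det = a * d - b * b      # product of the two eigenvalues
    trace = a + d            # sum of the two eigenvalues
    if det < 0:
        return "indefinite"  # eigenvalues of opposite sign
    if det > 0:
        return "positive definite" if trace > 0 else "negative definite"
    # det == 0: at least one eigenvalue is zero
    if trace > 0:
        return "positive semidefinite"
    if trace < 0:
        return "negative semidefinite"
    return "zero matrix"


if __name__ == "__main__":
    # f(x, y) = x * y has the constant Hessian [[0, 1], [1, 0]].
    print(classify_symmetric_2x2(0.0, 1.0, 0.0))
    # f(x, y) = x**2 + 1.5 * y**2 has the Hessian [[2, 0], [0, 3]].
    print(classify_symmetric_2x2(2.0, 0.0, 3.0))
```

A twice-differentiable function is convex on a convex domain only if its Hessian is positive semidefinite everywhere on that domain, so a Hessian that is indefinite at some feasible point is enough to rule out convexity.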

Fig. 4. EWSAoI performance versus SNR and number of UEs under various numbers of antennas, respectively, where $b_i = 0.063$, $\Omega_i = 0.000897$, $m_i = 1$. (a) EWSAoI versus SNR, where the number of UEs is 3. (b) EWSAoI versus number of UEs, where the SNR is 17.5 dB.

Fig. 7. Tradeoff between EWSAoI and $\bar{P}$, where the number of UEs is 3. (a) Single-antenna scenario. (b) Multiple-antenna scenario with $N = 4$.

Algorithm 1: ListNet Algorithm.
Input: Queue backlog $Q$, channel condition $g$, learning rate $\eta$, number of time slots $T$ and number of UEs $K$;
Output: Importance weights $v_1$ and $v_2$;
1: Initialize parameters $v_1$ and $v_2$;
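Only the initialization step of Algorithm 1 survives in this text. As a hedged illustration of how a ListNet-style update could learn the two weights, the sketch below assumes a linear score $v_1 Q_i + v_2 g_i$ per UE, a top-one (softmax) probability model, and a cross-entropy loss minimized by gradient descent with learning rate $\eta$. The target scores, function names and sample values are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Top-one probabilities of a score list (shifted by the max for stability)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def listnet_step(v1: float, v2: float, queues: Sequence[float],
                 gains: Sequence[float], target_scores: Sequence[float],
                 eta: float) -> Tuple[float, float, float]:
    """One gradient step on the ListNet cross-entropy loss for a linear scorer.

    Returns the updated weights and the loss evaluated before the update.
    """
    scores = [v1 * q + v2 * g for q, g in zip(queues, gains)]
    p_model = softmax(scores)
    p_target = softmax(target_scores)
    loss = -sum(pt * math.log(pm) for pt, pm in zip(p_target, p_model))
    # For a linear scorer, the gradient is sum_i (p_model_i - p_target_i) * feature_i.
    grad_v1 = sum((pm - pt) * q for pm, pt, q in zip(p_model, p_target, queues))
    grad_v2 = sum((pm - pt) * g for pm, pt, g in zip(p_model, p_target, gains))
    return v1 - eta * grad_v1, v2 - eta * grad_v2, loss


if __name__ == "__main__":
    queues, gains = [3.0, 1.0, 2.0], [0.2, 0.9, 0.5]   # hypothetical Q_i and g_i
    target = [3.0, 1.0, 2.0]                           # hypothetical reference scores
    v1, v2 = 0.0, 0.0
    for _ in range(5):
        v1, v2, loss = listnet_step(v1, v2, queues, gains, target, eta=0.1)
        print(round(loss, 4), round(v1, 4), round(v2, 4))
```

Each step costs $O(K)$ for $K$ UEs, so running it over $T$ time slots matches the $O(TK)$ cost attributed to the ListNet stage.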

TABLE IV
TABLE OF CHANNEL PARAMETERS