Reduced Complexity Optimal Resource Allocation for Enhanced Video Quality in a Heterogeneous Network Environment

The latest Heterogeneous Network (HetNet) environments, supported by 5th generation (5G) network solutions, include small cells deployed to increase the performance of the traditional macro-cell network. In HetNet environments, before data transmission starts, there is a user association (UA) process with a specific base station (BS). Additionally, during data transmission, diverse resource allocation (RA) schemes are employed. UA-RA solutions play a critical role in improving network load balancing, spectral performance, and energy efficiency. Although several studies have examined the joint UA-RA problem, there is no optimal strategy that addresses it with low complexity while also reducing the time overhead. We propose two enhanced versions of simulated annealing (SA), Reduced Search Space SA (RS3A) and Performance-Improved Reduced Search Space SA (PIRS3A), for solving the UA-RA problem in HetNets. First, the UA-RA problem is formulated as a multiple knapsack problem (MKP) with constraints on the maximum BS capacity and transport block size (TBS) index. Second, the proposed RS3A and PIRS3A are used to solve the formulated MKP. Simulation results show that the proposed PIRS3A scheme outperforms RS3A and other existing schemes, such as Default Simulated Annealing (DSA) and the Default Genetic Algorithm (DGA), in terms of variability, and outperforms DSA and RS3A in terms of Quality of Service (QoS) metrics, including throughput, packet loss ratio (PLR), delay, and jitter. Simulation results also show that PIRS3A generates solutions that are very close to the optimal solution.

coverage areas, carrier frequencies, back-haul link types, and communication protocols. Indeed, the deployment of small cells within a macrocell can provide support in terms of higher communication speed and better coverage for mobile users located at the macrocell border or in regions with high traffic demand [2].
In particular, the integration of femtocell BSs (FBSs) with macro-cell BSs (MBSs) has drawn considerable attention recently. Fig. 1 illustrates such a macro-femtocell-based HetNet deployment scenario in which MBSs and FBSs collectively serve the users and can provide improved QoS levels [3]. User association (UA) to BSs and resource allocation (RA) during data transmission are primary challenges in such a HetNet environment. As a result, the dual UA-RA problem needs to be examined to enable high QoS support while considering variables such as BS capacity, user requirements, and channel quality.
We have formulated the UA-RA problem as a multiple knapsack problem (MKP) in our paper, which can be broadly described as follows: Given a set of items, each with a weight and a value, the goal is to determine which items to include in a knapsack so that the total weight does not exceed the capacity of the knapsack and the total value is as high as possible [4]. In the MKP, a collection of m knapsacks with various capacities is given. In this solution, the knapsacks are represented by the BSs (MBSs and FBSs), and the items to fit into the knapsacks are instances of user equipment (UE). Item weights are user demands, item values are the available throughput for each knapsack, and knapsack capacity is the maximum capacity available at each BS. We use simulated annealing (SA) to solve the MKP, and the available throughput is determined based on the resource block (RB) utilization rate and the transport block size (TBS) index. This paper introduces two innovative SA solutions: Reduced Search Space Simulated Annealing (RS3A) and Performance-Improved Reduced Search Space Simulated Annealing (PIRS3A). RS3A fine-tunes selected parameters so as to predict the best possible starting solution. A good choice of starting solution helps reduce the search space for SA, which, as with any meta-heuristic algorithm, results in solving combinatorial optimization problems faster and more efficiently. PIRS3A also performs search space reduction, which helps increase the likelihood of selecting UEs with the highest estimated throughput, while also considering other parameters such as BS capacity, UE requirements, and channel efficiency. However, PIRS3A goes one step further and also removes ineffective solutions.
Figure 1. A two-tier macro-femtocell-based HetNet.
The problem search space often contains many ineffective or infeasible solutions, which a typical SA algorithm considers when evaluating the target function, wasting time and effort. PIRS3A removes the ineffective solutions from the solution space before applying SA to solve the MKP and thus finds a good solution faster and in a more efficient manner. This paper's principal contributions are as follows: 1) We propose two improved versions of SA, i.e., RS3A and PIRS3A. We have shown that our proposed scheme PIRS3A converges to a bounded near-optimal solution and outperforms two alternative SA-based approaches and the Default Genetic Algorithm (DGA) in terms of variability and overhead time. 2) Under the maximum BS capacity and TBS index constraints, we introduce PIRS3A as a decentralized scheme for solving the UA-RA problem in HetNets as a MKP. We have also demonstrated that the optimization function represents a hyperplane and is convex, that the maximum BS capacity and TBS index constraints are convex and each represent half-spaces, and that the intersection of these half-spaces and hyperplanes forms a polyhedron. 3) We present simulation results to show the effectiveness of the proposed scheme with different system parameters. The performance of PIRS3A was compared against that of two other schemes: Single-Cell (SC) and Default Simulated Annealing (DSA). In SC, all UEs try to establish a connection with the MBS only. DSA, on the other hand, corresponds to the classic SA scheme for the UA-RA problem formulated as a MKP, without fine-tuning in terms of parameter range and solution search space reduction. The rest of the paper is organized as follows: Section II surveys existing related research work. Section III presents the general form of the MKP, introduces SA, and explains the fine-tuning of SA in terms of parameter ranges and the solution search space. Section IV discusses the system model and formulates the problem. The algorithmic structure of RS3A and PIRS3A is presented in Section V.
Section VI includes a case study for the joint UA-RA problem and presents simulation settings. Analysis of testing results is done in Section VII. Finally, conclusions are drawn in Section VIII.

II. RELATED WORK
The problem of UA-RA in under-laid HetNets has been studied recently. The approaches mainly differ in terms of architecture (i.e., centralized or distributed), the number of parameters considered, and the execution time. For instance, Alnoman et al. [5] proposed a joint UA-RA decentralized approach to maximize the overall network throughput using a Mamdani-type fuzzy logic controller (FLC). The users were first classified based on their data rate requirements, and the controller decided the amount of bandwidth to allocate to each class. The results were compared to greedy-based and best signal-to-interference noise ratio (SINR)-based approaches and showed improvements in the data rate, bandwidth usage, and blocking ratio. However, this work did not consider the network load balancing issue, which affects the overall system throughput. Wang et al. [6] divided the UA-RA problem into two sub-problems. The first sub-problem was solved using graph theory by fixing the power allocation (PA), UA, and RA, while the second was solved using a convex difference function in which UA-RA was fixed and PA was solved. However, this scheme did not provide services for UEs with bad channel conditions. Feng et al. [7] proposed two schemes for the joint UA-RA problem, one centralized and one distributed. The centralized iterative scheme was broken down into two sub-problems: first, the UA problem was solved using a cutting plane approach; second, a primary decomposition approach solved the joint frame design and RA problem. Both sub-problems were iteratively solved to find an optimal solution. The decentralized scheme used repeated games between users, which was shown to achieve a Nash equilibrium. Overall, the centralized scheme had better system throughput than the distributed scheme. However, the scheme incurred a large overhead, which makes its use unrealistic in large-scale networks.
Additionally, the resources between the wireless backhaul and small-cell UEs are assigned orthogonally; hence, the spectrum efficiency decreases as the number of users increases. Using the Stackelberg game method, Zhong et al. [8] solved the UA-RA problem while considering the back-haul potential of BSs. Sapountizs et al. [9] provided an optimal solution for UA in back-haul-restricted HetNets by finding the optimum cell to be associated with. The search for this optimum cell is done iteratively, without affecting its load. Luo et al. [10] suggested a joint UA-RA scheme to minimize network packet delay and proposed different QoS-aware UA (QoSA) strategies: block-coordinate descent, the alternating direction method of multipliers, and multi-flow. These algorithms minimized the packet delays in a distributed way and have lower complexity than conventional UA strategies.
Barbosa et al. [11] proposed the use of DoE (Design of Experiments) [12], RSM (Response Surface Methodology), and racing algorithms to improve the efficiency of the genetic algorithm (GA) and SA in solving classical optimization problems. We have used this work as the base for tuning SA's significant hyper-parameters for solving the MKP. RSM is suggested as a fine-tuning technique by the authors of [13] to achieve greater proximity to regions with promising settings. The racing concept was studied in [15], [16] using F-race, a racing algorithm where candidate configurations are removed using Friedman statistics. The work in [17] helps us understand how to use various indicators, such as the channel quality index (CQI) and modulation coding scheme (MCS), to map the RB usage rate to the available throughput. The authors of [18] suggested the second phase of Radio Network Planning and Optimization (RNPO), i.e., comprehensive planning in which BSs are placed on the geographical area of interest and their positions are optimized based on SA. In summary, the above work shows that SA can be used in a wide range of HetNet applications. The authors of [19] investigate cooperative jamming in a two-tier 5G HetNet and use convex optimization techniques to find feasible solutions to non-convex problems. From this work, we have used some of the suggested convex optimization techniques to prove that the constraints on the MKP are convex. Some simulation configurations in [20], such as the propagation loss model and fading model, have also been used in our work. Graph theory-based resource allocation management via knapsack in cellular networks with device-to-device (D2D) communication underlays is suggested in [21]. In [22], the authors use GA to address RA in HetNets, but with many limitations.
Some of those limitations are: 1) the problem is formulated without taking into account various HetNet parameters, such as fading, interference, and channel state information (CSI); 2) the results of the proposed scheme are compared with a knapsack approach, but no information is given on the weights and values of the items or on the knapsack capacity.
The authors of this paper have previously introduced a novel quality efficient femtocell offloading scheme (QEFOS) which mitigates the effect of interferences and improves QoS and user quality of experience (QoE) [23] and a preliminary version of an enhanced SA for solving the UA-RA problem as a MKP, which this paper extends [24]. Many works suggest solving UA-RA problems, but none of the proposed work considers the TBS index, user demand, and maximum BS capacity in a single problem. Overall, no work solves the UA-RA while reducing complexity and overhead time as proposed in this work.
In our work, we formulate the dual UA-RA problem as a MKP, which is an NP-complete optimization problem [4]. Many techniques based on dynamic programming, branch-and-bound, greedy heuristics, GA, ant colony optimization, and particle swarm optimization (PSO) have been employed for solving the MKP [25]. However, on the one hand, it is challenging to apply exact solution-finding methods due to their exponential computational complexity. On the other hand, approximation techniques avoid most complexity-related drawbacks and achieve good results, as in [25]. Belonging to the latter category, SA was chosen to solve the MKP. SA is a probabilistic technique used to find a global minimum of an objective function by progressing through many local minima [11]. (The RSM framework is available in DoE software, and a more detailed explanation about its use can be found in [14].) Other approximation meta-heuristic algorithms, such as GA and PSO, can also be used to solve the MKP. However, they were not used for the following reasons. All meta-heuristic algorithms have strong searching abilities in general. However, some, such as SA, are single-solution-based algorithms, while others, such as GA and PSO, are population-based algorithms. This means that SA begins with just one solution and attempts to improve it, whereas GA and PSO maintain multiple candidate solutions (i.e., hundreds), depending on the population size. As a result, SA is easier to deploy and converges faster than GA and PSO when solving a combinatorial optimization problem like MKP. Since GA performs more exploration than SA, better optimal solutions can be expected from it; however, this comes at the expense of a long execution and convergence time. SA was chosen due to these properties: in solving a large-scale real-time networking problem like UA-RA, an algorithm with reduced complexity, low convergence time, and low computational resource requirements should be given the utmost importance.
UA-RA is a recent research topic of high interest for network academics and researchers, and solving it using a meta-heuristic algorithm is not the only option available. Many other methods, such as Deep Reinforcement Learning (DRL) and Deep Learning (DL), can solve the UA-RA problem. However, they are not used here due to some major limitations compared to meta-heuristics like SA, which is the algorithmic choice in this paper. DRL and DL are machine learning (ML) research avenues with a strong reputation for solving a wide range of learning tasks, but they are not easy to train [26]. Lately, however, the training cost has been dropping, and an increasing number of such solutions are expected to be considered [27]; it is therefore recommended to follow the progress of this research avenue closely. On the other hand, SA is a compact and effective technique that offers excellent solutions to single- and multi-objective optimization problems while reducing computation time significantly [28]. Meta-heuristic algorithms aim to solve problems faster, solve large problems, and generate robust algorithms. Unlike DL and DRL, they are versatile, easy to design, and simple to implement. Given a low enough temperature and enough perturbations, meta-heuristic algorithms like SA are theoretically guaranteed to find the optimal solution to a problem [29]. DRL, in contrast, lacks the theoretical guarantees of algorithms such as SA, which take a hill-climbing approach and are less prone to policy collapse issues. By setting the greedy criterion to accept only better solutions, SA can achieve monotonically better performance, whereas DRL cannot [30].
Recently, several ML or DL-based algorithms have been proposed and applied to HetNet systems, including the deep neural network (DNN) [31], long short-term memory (LSTM) [32], convolutional neural network (CNN) [33], Q-learning [34], deep Q-network (DQN) [35], and deep deterministic policy gradients (DDPG) [36]. On the one hand, the DL-based models (e.g., DNN, LSTM, and CNN) have outstanding prediction and reasoning capabilities, but they require a considerable amount of labelled training data [37]. On the other hand, when the HetNet system scale grows, DRL-based models (e.g., Q-learning, DQN, and DDPG) cannot converge, and the final results are unstable [38]. Slow convergence speed is a problem with Q-learning, mainly when the problem state space and action space are large. Additionally, the algorithms must save complete tables of an immediate value for each state-action pair, such as the Q-value. These tables may be too big for mobile devices to handle. DRL often performs poorly in this regard [39].
For clarity, the algorithm families discussed in this paper are the following. Population-based approaches maintain and improve multiple candidate solutions, often using population characteristics to guide the search; population-based meta-heuristics include GA, PSO, etc. [44]. Reinforcement Learning is a branch of ML used to help an agent learn the optimal policy when the agent has no information about the surrounding environment [34], [35], [36]. Deep Learning is a branch of ML used to help an agent learn the optimal policy when the agent has some information about the surrounding environment in advance [31], [32], [37]. Deep Reinforcement Learning is an advanced model of reinforcement learning in which deep learning is utilized as an effective tool to improve the learning rate of reinforcement learning algorithms [36], [40].
DL-based Schemes: In [31], a distributed DL algorithm was proposed to make offloading decisions for MEC systems, where several DNNs were trained in parallel and the offloading decisions were made cooperatively. In [32], an LSTM network was proposed to predict the traffic of small base stations (SBSs), and a cross-entropy loss function was applied to evaluate the LSTM and obtain the offloading strategy. In [37], a distributed deployment strategy for a multilayer convolutional neural network was presented, which included two parts: pre-processing and classification. The pre-processing part was deployed on the edge server for feature extraction and data compression to reduce data transmission between the edge and cloud systems, while the classification part was deployed for pattern classification and recognition. These proposed DL-based methods need prior knowledge and labeled samples, which may be hard to obtain in a dynamic environment.
Reinforcement Learning (RL)-based Schemes: In [34], a Q-learning-based mobile offloading strategy was proposed for a mobile offloading game. In [35], a DQN-based approach was applied to jointly optimize the networking, caching, and computing resources in the vehicular networks. These proposed RL-based methods may be unstable and hard to converge for large search spaces with large-scale users.
DRL-based Schemes: In [36], a DRL-based energy-efficient UAV control method was proposed to design the trajectory of the UAV by jointly considering communications coverage, fairness, energy consumption, and connectivity. In [40], a multi-agent DRL-based method was proposed to solve the joint UA-RA optimization problem. The optimization issue is investigated to obtain the optimal long-term network utility while guaranteeing UE QoS requirements, and the optimal solution is obtained by jointly associating UEs to BSs and allocating channels to UEs. [42]; on the other hand, SA's complexity depends on the cooling schedule [43]. The O notation represents an upper bound on the growth rate of an algorithm. We can infer from this in-depth review that our proposed algorithm outperforms other ML, DL, and DRL-based algorithms for real-time wide-scale networking problems like UA-RA in terms of implementation, computational complexity, and producing near-global-optimal solutions with low convergence time. In this context, we propose RS3A and PIRS3A, two novel algorithms for UA-RA in HetNets that focus on optimal UE selection with reduced complexity and low processing time. In the next sections, we concentrate on the problem formulation and solutions, present simulation-based testing, and analyse the results, which demonstrate the benefit of RS3A and PIRS3A in comparison with alternative solutions.

A. Generalized MKP
The MKP is a classical 0-1 combinatorial optimization problem that can be extended to different fields. A set of n items and a set of m resources are given. A profit p_j and a resource consumption value r_{ij} are allocated to each item j (j = 1, ..., n) for each resource i (i = 1, ..., m). The problem is to define a subset of all items leading to the highest total profit while not exceeding the upper bound b_i of each resource. The MKP can be formulated as:

maximize \sum_{j=1}^{n} p_j x_j    (1)

subject to \sum_{j=1}^{n} r_{ij} x_j \le b_i,  i = 1, ..., m    (2)

x_j \in \{0, 1\},  j = 1, ..., n

Variable x_j is an indicator of item j: if x_j is set to 1, item j is selected; if x_j = 0, item j is not selected, for j = 1, ..., n. Eq.(1) represents the total profit of the selected items and Eq.(2) the m resource constraints. A well-stated 0/1 MKP assumes that p_j > 0, and we assume that r_{ij} \le b_i \le \sum_{j=1}^{n} r_{ij} for all i = 1, ..., m and j = 1, ..., n.
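As a minimal illustration of the formulation above, the following Python sketch evaluates a candidate 0/1 selection against the MKP objective of Eq.(1) and the resource constraints of Eq.(2). The data values and function names are illustrative, not taken from our implementation:

```python
# Minimal 0/1 MKP evaluator: item j carries a profit p[j] and a
# per-resource consumption r[i][j]; a selection x is feasible when
# every resource budget b[i] is respected.

def mkp_profit(x, p):
    """Total profit of the selected items, as in Eq.(1)."""
    return sum(pj for pj, xj in zip(p, x) if xj)

def mkp_feasible(x, r, b):
    """Check the m resource constraints of Eq.(2)."""
    for bi, ri in zip(b, r):
        if sum(rij for rij, xj in zip(ri, x) if xj) > bi:
            return False
    return True

p = [10, 7, 4]                 # profits p_j (illustrative)
r = [[5, 4, 3], [2, 6, 1]]     # r[i][j]: consumption of item j on resource i
b = [8, 7]                     # capacities b_i

x = [1, 0, 1]                  # select items 0 and 2
print(mkp_feasible(x, r, b), mkp_profit(x, p))   # → True 14
```

In the UA-RA mapping, the two feasibility loops would correspond to per-BS capacity budgets, with profits playing the role of available throughput.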

B. Generalized SA
SA is a local search algorithm that can circumvent the local optima problem. Its ease of implementation and its convergence properties made it an algorithm of choice for solving combinatorial optimization problems like MKP. It was named as such because of its similarity to the physical solid annealing process [11], which involves heating and controlled cooling of material by varying the temperature. If the temperature decreases very slowly, a stable state can be observed, which cannot be reached if the temperature falls quickly [11].
SA tries to evade local optima by allowing temporal deterioration of the current solution (i.e., moves to a solution that corresponds to a worse objective function value), where the deterioration is controlled by a temperature parameter t, which determines the mobility of the system and is reduced by a positive factor smaller than 1 as the algorithm progresses. The likelihood of accepting a deteriorated solution decreases as the algorithm progresses. For a given value of t, a number of exchange trials D (repetitions) are performed, until the value of t is less than the final temperature δ. The initial temperature t should be initialized as t := α·ρ, where α is defined in Algorithm 1.
Some definitions are needed to define the SA algorithm for MKP formally. Let Υ be the solution space and Φ(ω) the neighborhood function for ω ∈ Υ. SA starts with an initial solution ω ∈ Υ. A neighboring solution ω′ ∈ Φ(ω) is then generated, randomly in most cases. SA is based on the Metropolis acceptance criterion, which models how a thermodynamic system moves from its current solution ω ∈ Υ to a candidate solution ω′ ∈ Φ(ω), in which the energy content is being minimized. The candidate solution ω′ is accepted as the current solution based on the acceptance probability

exp( −(f(ω′) − f(ω)) / t_a )    (4)

Define t_a as the temperature parameter t in iteration a, such that

t_a > 0 for all a and lim_{a→∞} t_a = 0    (5)
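A compact sketch of the generic SA loop described above, with the Metropolis acceptance probability and a geometric cooling schedule, is shown below. The toy objective and parameter values are illustrative only and do not correspond to Algorithm 1:

```python
import math, random

def simulated_annealing(f, neighbor, omega, t, alpha=0.95, delta=1e-3, D=50):
    """Generic SA sketch: minimize f starting from solution `omega`.
    t: initial temperature; alpha: cooling factor (< 1);
    delta: final temperature; D: exchange trials per temperature level."""
    best = omega
    while t > delta:
        for _ in range(D):
            cand = neighbor(omega)
            diff = f(cand) - f(omega)
            # Metropolis criterion: always accept improvements; accept
            # deteriorations with probability exp(-diff / t).
            if diff <= 0 or random.random() < math.exp(-diff / t):
                omega = cand
                if f(omega) < f(best):
                    best = omega
        t *= alpha   # geometric cooling schedule
    return best

# Toy use: minimize (z - 3)^2 over the integers using +/-1 moves.
random.seed(0)
sol = simulated_annealing(lambda z: (z - 3) ** 2,
                          lambda z: z + random.choice((-1, 1)),
                          omega=20, t=10.0)
print(sol)
```

At high temperature the walk wanders broadly; as t falls toward δ, uphill moves are almost never accepted and the search settles into the minimum.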

C. Reduced Search Space SA (RS3A) Phase
Let A be an algorithm applied to various problems P with parameters V. The issue of fine-tuning an algorithm can therefore be summed up as a search over the space

{α: 1, ..., η_α} × {β: 1, ..., η_β} × ... × {ζ: 1, ..., η_ζ}    (6)

where α, β, ..., ζ are parameters of algorithm A for a given problem P, and 1, ..., η_α; 1, ..., η_β; ...; 1, ..., η_ζ are the finite ranges of values assumed for each parameter. The number of parameters as well as their ranges can vary extensively according to the A and P studied, such that η_α × η_β × ... × η_ζ is the possible number of combination tests of A on P [11].
Our fine-tuning approach can be expressed as a process that starts with an arbitrary set of instances from a class of optimization problems and, following the range of each algorithm parameter, applies a 2^k full factorial design to research the response (effect) of multiple parameters (factors). A cause-and-effect relationship between factors and responses can be developed using complete factorial designs, the usual representation of which is an empirical regression model (linear or quadratic) for the mechanism under study, as follows:

Y = β_0 + β_1 X_1 + β_2 X_2 + β_12 X_1 X_2 + ε    (7)

where Y is the response, β_0, β_1, β_2, β_12 are coefficients, X_1 and X_2 are factors, and ε is the experimental error. Factorial designs help define the variables that affect the response. However, since the interest is to identify the factors that can maximize the process and generate factor values closer to the optimum, the relationship between factors and response in Eq.(7) may not be sufficient. We use RSM to achieve greater proximity to an area with promising settings. RSM is a system of statistical and mathematical techniques that employs factorial designs, regression analysis, and optimization methods in situations where many input parameters influence the output of a process [13]. The outcome of RSM is a second-order model, given by

Y = β_0 + Σ_{i=1}^{k} β_i X_i + Σ_{i=1}^{k} β_ii X_i² + Σ_{i<j} β_ij X_i X_j + ε    (8)

The application of a racing algorithm to determine the algorithm's setup is our approach's final stage. The irace package [44] is an iterated racing implementation. Iterated racing is an automated configuration process consisting of three steps: (1) sampling new configurations according to a specific distribution, (2) choosing the best configurations from the newly sampled ones using racing, and (3) updating the sampling distribution to bias future sampling toward the best configurations. These three steps are repeated until a termination condition is met.
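The factorial-design step above can be sketched by fitting the regression model of Eq.(7) to a 2^2 full factorial design via least squares. The factor roles (e.g., treating the cooling factor and trials-per-temperature as the two factors) and the response values are illustrative assumptions, not measurements from our experiments:

```python
import numpy as np

# 2^2 full factorial design in coded units (-1, +1) for two assumed
# SA hyper-parameters: X1 (cooling factor) and X2 (trials per
# temperature level); Y is the observed response (solution quality).
X1 = np.array([-1.0, +1.0, -1.0, +1.0])
X2 = np.array([-1.0, -1.0, +1.0, +1.0])
Y  = np.array([52.0, 60.0, 55.0, 70.0])   # illustrative measurements

# Design matrix for Y = b0 + b1*X1 + b2*X2 + b12*X1*X2 + error (Eq.(7)).
A = np.column_stack([np.ones(4), X1, X2, X1 * X2])
coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
b0, b1, b2, b12 = coef
print(b0, b1, b2, b12)
```

Because the coded design columns are orthogonal, the fitted coefficients are simply signed averages of the responses; a significant interaction term b12 signals that the two parameters should be tuned jointly rather than one at a time.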
In iterated racing as implemented in the irace package, a sampling distribution independent of the other parameters has been associated with a configurable parameter, apart from constraints and conditions between parameters. A truncated normal distribution for numerical parameters or a discrete distribution for categorical parameters is the sampling distribution. Ordinal parameters are used as numerical parameters (integers). In a normal distribution, the distribution update consists of changing the mean and the standard deviation, or the discrete probability value of the discrete distributions. The update biases the distributions to increase the probability of sampling the parameter value in the best configurations discovered so far in future iterations.
The truncated normal distribution is a significant option in probability and statistics, and its use is natural when a normal distribution is employed. For example, when one wants to threshold or screen values from a normally distributed dataset, the remaining data follow a truncated normal distribution. There is substantial motivation to study the truncated normal distribution from a statistical perspective [46]. It was also shown in [47] that truncated normal distribution estimators generally have a smaller mean square error than those of the classical non-truncated normal distribution. This property motivated us to select the truncated normal distribution. We employed a tool, the irace package, which uses sampling based on the truncated normal distribution. The next paragraph explains how and why the truncated normal distribution is used.
The samples are drawn first from a uniform distribution. In subsequent iterations, however, they are drawn from a normal distribution based on the values of the parameters in the elite configurations (this applies only to numerical parameters), so that new samples are more likely to be similar to the best values found thus far. Since parameter values are bounded to a fixed range, the truncated normal distribution is favored over the classical non-truncated distribution. The conventional non-truncated normal distribution does not accommodate out-of-bounds samples. The out-of-bounds samples can be treated by replacing them with the nearest bound values, but this would result in boundary values having a very high probability, possibly greater than that of the distribution's mean. With the truncated normal distribution, if the mean is close to the boundary, values near the boundary are sampled more frequently, and if the mean is far from the boundary, values near the boundary are sampled less frequently.
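The bounded-sampling behavior described above can be sketched with simple rejection sampling. This illustrates the truncated normal idea only; it is not irace's actual implementation, and the parameter names and bounds are assumptions:

```python
import random

def sample_truncated_normal(mu, sigma, lo, hi):
    """Draw from N(mu, sigma) restricted to [lo, hi] by rejection.
    Unlike clipping out-of-bounds draws to the nearest bound, this
    never piles excess probability mass onto the boundaries."""
    while True:
        v = random.gauss(mu, sigma)
        if lo <= v <= hi:
            return v

random.seed(1)
# Sample a cooling factor near the best value found so far (mu = 0.9),
# kept inside its valid range (0, 1).
draws = [sample_truncated_normal(0.9, 0.1, 0.0, 1.0) for _ in range(5)]
print(all(0.0 <= d <= 1.0 for d in draws))   # → True
```

Rejection sampling is adequate when the bounds retain most of the probability mass; dedicated samplers are preferable when the interval lies far in the tail.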
The best configurations are chosen using racing after new configurations are sampled. A race begins with a finite set of candidate configurations; the illustration in Fig. 2 includes ten configurations C_i. The candidate configurations are evaluated on a single instance (I_j) at each step of the race. Those candidate configurations that perform statistically worse than at least one other are discarded after several steps, and the competition continues with the remaining surviving configurations. Since the first elimination test is critical, usually a higher number of instances is seen before conducting the first statistical test. Subsequent statistical analyses are conducted more often (by default, for every instance). This procedure continues until a minimum number of surviving configurations or a maximum number of used instances is reached.
Figure 2. Racing for configuring automated algorithms [44]. Each node is an assessment of a single configuration on one instance: 'x' implies that no statistical test was performed, '−' implies that at least one configuration has been discarded, and '=' implies that no configuration has been discarded by the test.
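The race-and-discard procedure can be sketched as follows. For brevity, the sketch replaces the Friedman statistical test used by F-race with a fixed mean-based margin; all names, scores, and thresholds are illustrative:

```python
# Toy race: configurations are scored per instance; after each
# instance (once enough have been seen), drop any configuration whose
# running mean trails the best by more than `margin`. A real F-race
# would use a Friedman statistical test instead of a fixed margin.

def race(scores, margin=5.0, min_survivors=2, first_test_after=3):
    n_instances = len(scores[0])
    alive = set(range(len(scores)))
    for i in range(n_instances):
        if i + 1 < first_test_after:      # see several instances first
            continue
        means = {c: sum(scores[c][: i + 1]) / (i + 1) for c in alive}
        best = min(means.values())        # lower score = better
        for c in sorted(alive):
            if means[c] > best + margin and len(alive) > min_survivors:
                alive.discard(c)
    return sorted(alive)

# scores[c][i]: cost of configuration c on instance i (illustrative).
scores = [
    [10, 11, 9, 10, 10],    # C0: consistently good
    [12, 10, 11, 11, 12],   # C1: close to C0, survives
    [25, 24, 26, 25, 27],   # C2: clearly worse, eliminated
]
print(race(scores))   # → [0, 1]
```

As in the real procedure, the first elimination happens only after several instances have been seen, and the clearly inferior configuration stops consuming evaluation budget once dropped.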
No study has used this trio (DoE, RSM, and irace) to solve the MKP using SA. All code was built from scratch in Python 3 and R, as no code is directly available for using the irace package to solve the MKP using SA.

D. Performance Improvement
An important issue with SA is that too many ineffective solutions are explored. The reason is that the SA algorithm is essentially a prescription for a partial walk in the configuration or problem space. At each iteration, a random step is produced, and the target function at the associated evaluation point is evaluated to determine whether or not to accept that step. Because the problem search space contains many ineffective solutions, which a typical SA algorithm considers when evaluating the target function, wasting time and effort, PIRS3A removes the ineffective solutions before applying SA to solve the MKP, thereby improving its performance.
To remove these ineffective solutions, the MKP items are arranged in decreasing order of their profits, i.e., the higher the profit, the lower the index. If the number of items n is greater than b_n, we take only the first b_n items to participate in the SA selection process. The reason is that the last n − b_n items are never selected, as the target is to obtain the maximum profit in the knapsack. Therefore, these remaining n − b_n items are associated with ineffective solutions and are removed from the search space.
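This pruning step can be sketched as follows; the cutoff parameter `b_n` and the item profits are illustrative values:

```python
def prune_items(profits, b_n):
    """PIRS3A-style search-space reduction sketch: keep only the b_n
    most profitable item indices. Per the argument above, items past
    the cutoff never appear in a maximum-profit selection, so they
    are dropped before SA runs."""
    order = sorted(range(len(profits)), key=lambda j: profits[j], reverse=True)
    return order[:b_n]

profits = [4, 9, 1, 7, 3]
print(prune_items(profits, 3))   # → [1, 3, 0]
```

The surviving indices are already in decreasing-profit order, matching the arrangement described above, so the SA neighborhood only ever perturbs candidates drawn from this reduced set.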

IV. SYSTEM MODEL
As shown in Fig. 1, consider the downlink of a HetNet consisting of fixed BSs and randomly placed UEs. All BSs are connected to a high-speed back-haul with minimal delay (such as high-speed fiber). Let N be the set of UEs located inside the region G and ψ_j ∈ Ψ be the requested downlink rate (bits per second) of UE j, where Ψ is the discrete set of service classes. In this paper, we are interested in video services, as they require high bandwidth. Each UE can be associated with at most one BS at any time instance, and we define μ as the total path loss (which follows a log-distance path loss model) between BS i and UE j in decibels (dB). Other notations are given in Table III.
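A minimal sketch of the log-distance path loss model used for μ is given below; the reference distance, reference loss, and path-loss exponent are illustrative values, not the simulation settings of this paper:

```python
import math

def log_distance_path_loss(d, d0=1.0, pl0=30.0, n_exp=3.5):
    """Log-distance path loss in dB: PL(d) = PL(d0) + 10*n*log10(d/d0).
    d0 (reference distance, m), pl0 (loss at d0, dB), and n_exp
    (path-loss exponent) are assumed illustrative values."""
    return pl0 + 10.0 * n_exp * math.log10(d / d0)

# Loss grows by 10*n_exp dB per decade of distance.
print(round(log_distance_path_loss(100.0), 1))   # → 100.0
```

Shadow fading is often added as a zero-mean Gaussian term on top of this deterministic loss; it is omitted here for brevity.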
The UEs might not receive high data rates because of time-variable fading channels in wireless communication (i.e., LTE, 5G). Fueled by urbanization, this effect is more extreme in urban areas. Another reason for a low UE data rate can be congestion at the BS, caused by the presence of many users. In today's wireless communication systems, a decentralized scheme is typically used to solve such problems, enabling UEs to communicate with the BSs that provide them with the best channel conditions and satisfy their minimum QoS requirements.

A. Problem Formulation
We aim to build a decentralized scheme, under maximum BS capacity and TBS index constraints, to solve the UA-RA problem in HetNets. This kind of UA-RA problem has to be solved in a self-organized way. Next, the potential objective function is considered, and the UA-RA problem is formulated as an MKP optimization problem.

1) Objective Function: There are two kinds of entities in our system, with different viewpoints and priorities: UEs and BSs. On the one side, each UE, given its QoS requirement, wants to achieve the maximum data rate. The BSs, on the other hand, want to satisfy the UEs' QoS requirements within their limits. Hence, we define the objective function as the sum of available throughput (estimated based on the RB usage rate) under the BSs' maximum capacity and TBS index constraints, while considering each entity's perspective and objectives.
In general, the UA-RA problem formulated as an MKP has a linear objective function with linear constraints, except for Eq.(11), which is an integer (binary) constraint, taking only the values 0 or 1. Hence, the MKP is a mixed-integer linear program (MILP), which PIRS3A solves to achieve a near-optimal solution and enhance the QoS perceived by users.
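To make the MKP view concrete — BSs as knapsacks with capacities b_i, UEs as items with profits p_j (estimated throughput) and demands r_ij — here is a brute-force solver for a toy instance. Exhaustive search is only feasible for a handful of UEs; the paper's PIRS3A replaces it at scale:

```python
from itertools import product

def solve_mkp_bruteforce(profits, weights, capacities):
    """Exhaustive solver for a tiny MKP instance: x[j] = 0 means UE j is
    unassigned, x[j] = i + 1 means UE j is packed into knapsack (BS) i.
    The unique-association constraint holds by construction."""
    n, m = len(profits), len(capacities)
    best_profit, best_assign = 0.0, None
    for assign in product(range(m + 1), repeat=n):
        load = [0.0] * m
        profit = 0.0
        for j, a in enumerate(assign):
            if a > 0:
                load[a - 1] += weights[j]
                profit += profits[j]
        if all(load[i] <= capacities[i] for i in range(m)) and profit > best_profit:
            best_profit, best_assign = profit, assign
    return best_profit, best_assign
```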
2) Optimization Problem: Under QoS provisioning, we state the optimization problem for UA-RA with MBSs and FBSs as follows:

OP: minimize −f(x), subject to (10), (11), (12), (13).

Constraint Eq.(10) states that the resources allocated to UEs do not surpass the maximum BS capacity (b_i). Constraint Eq.(11) expresses the unique-association property: UE j can be associated with only one BS at any moment. Due to the unique association, the number of possible downlink UE associations is reduced from 2^(NM) to M^N [20]. Constraint Eq.(12) requires each UE j to have a TBS index above a certain threshold Γ to participate in the UA-RA problem (i.e., the TBS index vector). Constraint Eq.(13) indicates that each BS must serve at least one UE. OP belongs to a class of assignment problems which are proven to be combinatorial and NP-hard [4], [20], [25].

3) P IRS 3 A -A Decentralized Scheme:
Our objective is to design a decentralized scheme that helps each UE j associate with the BS i that offers the highest throughput (i.e., high CQI, better channel conditions). We deploy PIRS3A in an Information Service Server (ISS) near the MBS, which solves the UA-RA problem episodically, as illustrated in Algorithm 1. We assume that MBSs are chosen first, followed by FBSs, and that UEs and BSs can communicate through control signals (CS). In each episode e, every UE j estimates the throughput it can achieve with BS i, as described in Section IV.B. UE j informs the ISS about this estimated throughput p_j and its requested service demand r_ij (in this paper we assume that UEs demand video services only). Based on Eq.(12), the vector Q_i (i ∈ M) is formed. Using CS, each BS i informs the ISS about its maximum capacity (b_i). On receiving all required information, PIRS3A runs; those UEs that are part of an optimal solution are associated with BS i and flagged as "accepted", and BS i allocates resources (sub-channels) to all the "accepted" UEs (Fig. 3).

B. Available Throughput Estimation Method
In this subsection, we clarify our proposed method of estimating the available throughput based on metrics that can be acquired by a UE. By calculating the available throughput, which corresponds to p_j in OP, the UE determines how much data transfer throughput it can obtain from a base station [17]. The available throughput can be determined by measuring the CQI and mapping it to obtain the MCS and TBS index. The amount of data each RB (Fig. 4) may carry depends on the BS's modulation method, which is chosen based on the quality of the signal between the BS and the UE. The conventional method for estimating the throughput is shown in Fig. 5: the BS sends a CS signal to the UE; from the CS signal, the UE calculates the CQI and sends back CSI feedback, which includes the CQI, to the BS [48].
When the BS receives the CQI, it chooses the MCS for the downlink to the UE; the LTE framework specifies the CQI guidelines for choosing a modulation form. Using the MCS, the BS modulates data on each RB. The amount of data that the BS transmits to the UE thus depends on the MCS and on the number of RBs allocated to the UE. The MCS and TBS indices mapping table [49] is used to obtain the TBS index. This TBS index and the number of RBs allocated, calculated based on the method presented in [50], [51], are employed to estimate the available throughput from each BS i to each UE j. This procedure is illustrated in Fig. 5(b): firstly, BS i sends a CS to each UE j that wants to associate with it. On the basis of the received CS, each UE j calculates the CQI and, using the MCS and TBS mapping indices table [49], derives its TBS index and number of allocated RBs, and estimates the available throughput it can receive from BS i. The UE sends the TBS index and available throughput information to the ISS, where our proposed scheme runs. If the TBS index reported by UE j passes Eq.(12), the UE participates in the MKP problem; if it is selected after solving the MKP using PIRS3A, it is associated with BS i and resources are allocated.
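The CQI-to-throughput estimation chain can be sketched as below; the CQI-to-efficiency table here is a coarse, hypothetical stand-in for the 3GPP MCS/TBS tables of [49], and the resource-element count per RB is a simplifying assumption:

```python
# Coarse, hypothetical CQI -> spectral-efficiency map (bits per resource
# element); the paper uses the full MCS/TBS index tables of [49].
CQI_TO_EFFICIENCY = {1: 0.15, 4: 0.60, 7: 1.48, 10: 2.73, 13: 3.90, 15: 5.55}

def estimate_throughput_mbps(cqi, num_rbs, re_per_rb=168, subframes_per_s=1000):
    """Rough available-throughput estimate for one UE-BS pair:
    efficiency x resource elements per RB x allocated RBs x subframes/s.
    re_per_rb = 12 subcarriers x 14 OFDM symbols is a simplification."""
    bits_per_subframe = CQI_TO_EFFICIENCY[cqi] * re_per_rb * num_rbs
    return bits_per_subframe * subframes_per_s / 1e6
```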
V. ALGORITHM STRUCTURE

PIRS3A is solved in episodes, as shown in Algorithm 1. In each episode e ∈ E, BS i ∈ M informs the ISS of its maximum capacity b_i (in Mbps), which represents the knapsack capacity (a BS that participates in episode e does not participate in the following episodes). Next, each UE j ∈ N provides the TBS index (obtained by mapping), the estimated available throughput p_j, and the demand r_ij to the ISS. UEs that satisfy Eq.(12) are selected into the problem. The vectors P (estimated available throughput (Mbps)), R (user demands (Mbps)), and Q (UEs satisfying Eq.(12)) are generated; UEs that do not satisfy Eq.(12) form a vector Q̄. The sizes of the vectors Q, Q̄, P, R vary, depending on which UEs satisfy Eq.(12). U_p denotes the set of participating UEs, and X := [x_1, x_2, ..., x_ñ] denotes a matrix of variables x_ĩj (j ∈ U_p). Each episode e has two phases: diversification (exploration) and intensification (exploitation). In exploration, only those UEs j which qualify based on Eq.(12) are selected; the remaining ones are considered in the next episode e + 1. In exploitation, the selected UEs take part in the problem; on solving Eq.(9) subject to Eq.(10) and Eq.(11), the selected UEs are associated with BS i, and resources are allocated to them. After each episode, the set N is updated to contain only those UEs which do not meet Eq.(12) and those not part of the obtained near-optimal solution ω. PIRS3A stops when N becomes an empty set, i.e., when no UEs are left to participate in the exploration. Regarding PIRS3A convergence, the meta-heuristic design leads to the concept of probability convergence; we give the following definitions to explain this concept.
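The episodic structure described above can be sketched as follows; the UE record and the `solve_mkp` callback are placeholders for the paper's vectors and the fine-tuned SA solver:

```python
from collections import namedtuple

# Stand-in UE record; the real scheme also carries demands r_ij, etc.
UE = namedtuple("UE", "name tbs_index")

def run_episodes(ues, bss, tbs_threshold, solve_mkp):
    """Skeleton of the episodic loop: per episode, one BS offers its
    capacity, UEs passing the TBS-index check (exploration) enter the
    MKP, and `solve_mkp` (the fine-tuned SA, assumed given) returns the
    accepted set (exploitation). Accepted UEs leave the pool N."""
    associations = {}
    remaining = set(ues)
    for bs in bss:                     # each BS participates once
        if not remaining:
            break                      # N empty: the algorithm stops
        qualified = {u for u in remaining if u.tbs_index >= tbs_threshold}
        accepted = solve_mkp(qualified, bs)
        for u in accepted:
            associations[u] = bs
        remaining -= accepted          # update N for episode e + 1
    return associations
```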

A. Fine Tuning
We selected the following set of parameters, which most affect the efficiency of solving the MKP with the meta-heuristic SA [52]: the initial temperature control parameter (ρ), the control parameter, the final temperature (δ), and the number of iterations during one temperature range (D).
To generalize the results and compare them with each other, we use the relative deviation from the optimum, as indicated in Definition V.1.
where f(e) is the computed solution and f(e*) is the best-known solution to the problem. Thus, the lower the value of κ(e) for a meta-heuristic, the better the performance of the algorithm [11]. In the first step of our approach, we determine the parameters and their corresponding levels (low and high) required by a 2^g complete factorial design. The fine-tuning of the meta-heuristic SA on the MKP uses four arbitrary instances available on GitHub (https://github.com/bharat1992-bit), with well-designed PYTHON 3 code for solving the MKP using SA.
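The relative deviation κ(e) can be computed as follows; this is a standard definition consistent with the surrounding text, as the exact formula of Definition V.1 is not reproduced here:

```python
def relative_deviation_pct(f_e, f_star):
    """kappa(e): percentage gap between a computed solution f(e) and the
    best-known solution f(e*) of a maximisation problem; lower is better."""
    return 100.0 * (f_star - f_e) / f_star
```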
The 2^(g=4) full factorial design is used here [11] to identify the critical factors and their effects on the response; we used a DoE ANOVA study to identify these factors. As a consequence, in the first step of our method, we can conclude the following for the SA algorithm: (i) out of the four factors studied, two are significant regardless of the instance selected, namely the initial temperature control parameter (ρ) and the control parameter.
(ii) The final temperature (δ) and the number of iterations for one temperature (D) are significant in some instances but not others.
(iii) The interactions between the variables vary depending on the instance studied.
The next stage consists of applying RSM to explore the neighborhood regions around a promising area and to obtain values for each parameter according to the studied example. RSM is a mathematical framework with statistical techniques used in problem modeling and optimization where several factors affect the system response. In this context, RSM employs a sampling technique that finds the best match for each studied parameter, obtaining a sub-optimal value for the estimated throughput Eq.(9) with low convergence time. From the RSM results presented in TABLE V, we define a range of values between the minimum and maximum of each parameter, forming a search space of candidate configurations. The procedure consists of varying all four parameters simultaneously until ANOVA shows statistical significance; the RSM results suggest an empirical model for the four parameters. Following RSM, the next stage employs the irace package to implement the racing algorithms. The irace package contains no code for solving the MKP using SA, but it provides various examples of other optimization problems, such as the TSP, solved by SA. We thus combine the racing algorithms' efficiency with RSM's power.
In our last step, we used the irace package, which implements racing algorithms, to select as good a configuration as possible out of many options. For this study, the settings used for SA are ρ ∈ {0.62, 0.7}, the control parameter ∈ {0.255, 0.5, 0.745}, δ ∈ {0.00041, 0.00051}, and D ∈ {39, 52, 65}; the values are taken from TABLE V. Because of the large difference between the minimum and maximum values obtained from the RSM results (TABLE V), three values, including the middle point, are considered for the control parameter and D. Every possible combination leads to a different setting of the algorithm (explained in Section III.C), so our search space comprised 36 different SA parameter settings. After applying the racing algorithm over this search space, we reached the best setting for each algorithm (TABLE VI, where the default column corresponds to the settings of SA used in [11] and the suggested column to the settings obtained from fine-tuning). This reduction in parameter range is what yields RS3A, i.e., Reduced Search Space SA.
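For concreteness, the 36 candidate settings handed to the racing algorithm can be enumerated as follows; `alpha` is our placeholder name for the unnamed control parameter:

```python
from itertools import product

# Candidate levels quoted above; `alpha` stands in for the unnamed
# control parameter.
RHO = (0.62, 0.70)
ALPHA = (0.255, 0.5, 0.745)
DELTA = (0.00041, 0.00051)
ITERS = (39, 52, 65)  # D: iterations per temperature level

def candidate_configurations():
    """Enumerate the 2 x 3 x 2 x 3 = 36 SA settings raced by irace."""
    return [dict(rho=r, alpha=a, delta=d, iters=n)
            for r, a, d, n in product(RHO, ALPHA, DELTA, ITERS)]
```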

B. UA-RA problem simulation settings
In NS3, we perform comprehensive simulations to evaluate our proposed algorithm. As a benchmark, we use the optimal f(e) solution, computed using the purpose-built PIRS3A implementation in PYTHON 3. The major simulation parameters are given in TABLE VII. Fig. 6 shows the chain linking the various elements of the whole data simulation, optimization, and network process. Firstly, for all our experiments, we assume the BSs to be deployed at fixed locations, as shown in Fig. 7: a two-tier HetNet including one MBS, M_1, and ten FBSs, M_2, M_3, ..., M_11. The FBSs are typically located at commercial and residential buildings that constitute hotspots for wireless traffic, and the UEs UE_1, UE_2, ..., UE_n in a region G are served by either the MBS or an FBS selected by the ISS. Second, we randomly deploy UEs (N = 70, 85, or 100) following a homogeneous Poisson point process (PPP) for the different experiments. Third, we consider discrete user demands (i.e., requested data rates). In this network, we consider a log-distance path loss model, as explained and used in [20]. We examine various QoS metrics of those UEs that satisfy all constraints and are part of the near-optimal solution in each episode e ∈ E: 1) throughput (Mbps), 2) packet loss ratio (PLR) (%), 3) jitter (ms), and 4) delay (ms). We compare our proposed schemes with two other schemes, i.e., SC and DSA.

A. Results after parameters and search space reduction
All results provided in this subsection were computed with Eq.(16) and are reported for three configurations, for comparison purposes: before the fine-tuning process, i.e., default SA (DSA); after fine-tuning of the parameter ranges, i.e., RS3A; and after further fine-tuning through solution search space reduction, i.e., PIRS3A.
The first set of results (TABLE VIII) corresponds to ten runs of the meta-heuristics (DSA, RS3A, PIRS3A) on ten instances of the MKP benchmark; here, an instance refers to a problem set comprising a knapsack with an individual capacity and items with weights and values, and all selected instances have 40 items. In TABLE VIII, column AM is the arithmetic mean of Eq.(16) over the ten runs of each instance; column σ is the corresponding standard deviation; Nopt is the number of times the algorithm reaches the optimum for each instance.
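The per-instance statistics of TABLE VIII (AM, σ, Nopt) can be computed along these lines; whether the paper uses the population or sample standard deviation is not stated, so population is assumed here:

```python
import statistics

def instance_statistics(deviations, tol=1e-9):
    """TABLE VIII-style summary for one instance: arithmetic mean (AM)
    and standard deviation (sigma) of the relative deviations over the
    runs, plus Nopt, the number of runs hitting the optimum
    (deviation ~ 0). Population standard deviation is assumed."""
    am = statistics.mean(deviations)
    sigma = statistics.pstdev(deviations)
    nopt = sum(1 for d in deviations if abs(d) <= tol)
    return am, sigma, nopt
```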
The statistics shown in TABLE VIII indicate output improvements for both RS3A and PIRS3A; notably, PIRS3A is the most promising across all instances. Fig. 8 illustrates the arithmetic mean of the ten runs of the meta-heuristics (DSA, RS3A, PIRS3A, DGA) on the ten MKP instances. DSA corresponds to default simulated annealing without any fine-tuning of the parameter ranges or the solution space; RS3A corresponds to SA with fine-tuning of the parameter ranges; DGA corresponds to a classical GA, with hyperparameters from [11]: number of generations = 203, crossover probability = 0.54, mutation probability = 0.79, and population size = 110. We observe that PIRS3A is the closest to the optimal solution, ahead of DSA, RS3A, and DGA.
In contrast, DGA produces better results than DSA and RS3A, owing to its greater exploration, but lower optimal values than PIRS3A. The Nopt column of TABLE VIII also shows a greater likelihood (probability convergence) of obtaining optimal solutions under PIRS3A than under DSA and RS3A. Indeed, DSA is a single-solution-based algorithm with little exploration and high exploitation, so it can easily become stuck in local minima. Parameter search space reduction helps find the best starting solution, while solution search space reduction removes the useless regions, saving significant time in finding the global optimum; by combining these two aspects, PIRS3A achieves the global optimum with low convergence time. Results from the σ column of TABLE VIII also suggest that our proposed modifications to DSA make the process more stable. Fig. 9 depicts the distribution of the near-optimal value f(e) per instance; due to fine-tuning of both the parameter ranges and the solution space, PIRS3A offers solutions closer to the optimum than the other two schemes. Through the results (TABLE VIII), supported by the graphical analysis (Fig. 9), we can highlight that our proposed solution PIRS3A produces better results. TABLE IX, Fig. 10(a), and Fig. 10(b) show the execution times, expressed in ms and s, of DSA, RS3A, PIRS3A, and DGA when varying the number of items.
PIRS3A has an average execution time 60.71%, 15%, and 99.9% shorter than those of DSA, RS3A, and DGA, respectively, for 50 items, and around 69%, 21%, and 99.9% shorter for 2000 items. DGA's extensive exploration results in a long overhead time, making it unsuitable for solving real-time network problems at scale. This confirms that our proposed scheme has reduced complexity while incurring a low overhead time.

B. QoS Assessment
We considered one MBS (M_m = 1) and ten FBSs (M_f = 10). All FBSs are initially off and are turned on sequentially when required. Under the SC scheme, all users are associated with the MBS only. In contrast, under the DSA scheme, we formulated the UA-RA problem as an MKP and solved it using DSA, i.e., without any fine-tuning of the parameter ranges or solution space. Under the PIRS3A scheme, we formulated the UA-RA problem as an MKP and solved it using SA fine-tuned in both the parameter ranges and the solution space. We assessed the QoS metrics (throughput (Mbps), PLR, delay, and jitter) perceived by users in a HetNet environment; the results are presented in Fig. 11, Fig. 12, and Fig. 13, with error bars corresponding to the 95% confidence interval (CI).
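The 95% CI error bars can be derived from per-run samples as sketched below, assuming the usual normal approximation with z = 1.96 (the paper does not state which CI construction it uses):

```python
import math
import statistics

def ci95(samples):
    """95% confidence interval for the mean of a QoS-metric sample,
    using the normal approximation (z = 1.96) on the standard error."""
    mean = statistics.mean(samples)
    half = 1.96 * statistics.stdev(samples) / math.sqrt(len(samples))
    return mean - half, mean + half
```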
1) 70 Users: N = 70 users are randomly placed in a given area G. Throughput results are shown in Fig. 11(a) (TABLE X). The PIRS3A throughput is 40% higher than that achieved when the DSA scheme is employed and 80% higher than that achieved under the SC scheme.
In terms of PLR, from Fig. 11(b), we note that under the SC scheme users experienced more than 54.5% packet loss, with a CI of 46.7–62.2%. Under the DSA scheme, users experienced more than 19% packet loss, with a CI of 13.45–26.35%, and almost 4% packet loss, with a CI of 3–5%, under the PIRS3A scheme (TABLE X). The PIRS3A PLR is 78% lower than that experienced when the DSA scheme is employed and 93% lower than that under the SC scheme.
In terms of delay, from Fig. 11(c), a similar improvement is observed under PIRS3A (note that from now on we use 95% CI and CI interchangeably). From the above results, we can conclude that our proposed scheme PIRS3A outperforms the other two schemes in terms of all QoS metrics considered, hence satisfying the UEs' requested rates ψ_j.
2) 85 Users: N = 85 users are randomly placed in the given area G. From Fig. 12(a), throughput improvements similar to the 70-user case are observed. The PIRS3A PLR is 70% lower than that experienced when the DSA scheme is employed and 89% lower than that achieved under the SC scheme.
In terms of delay and jitter, from Fig. 12 (TABLE XI), the PIRS3A jitter is 13.3% lower than that experienced when the DSA scheme is employed and 34% lower than that achieved under the SC scheme.
From the above results, we infer that when the number of users increases to 85, our proposed scheme still satisfies user demands ψ_j and outperforms the other two schemes in terms of all QoS metrics. This demonstrates our proposed scheme's potential to satisfy user demands under such heavy traffic.
3) 100 Users: N = 100 users are randomly placed in a given area G. Throughput results are shown in Fig. 13(a) (TABLE XII). The PIRS3A throughput is 42% higher than that achieved when the DSA scheme is employed and 84% higher than that achieved under the SC scheme.
In terms of PLR, Fig. 13 shows a similar trend. In terms of jitter, from Fig. 13(d), we note that under the SC scheme users experienced around 6.5 ms jitter, with a CI of 6.1–6.9 ms. Under the DSA scheme, users experienced around 4.65 ms jitter, with a CI of 4.22–5.08 ms, and almost 4.15 ms, with a CI of 3.61–4.69 ms, under the PIRS3A scheme (TABLE XII). The PIRS3A jitter is 11% lower than that experienced when the DSA scheme is employed and 36% lower than that achieved under the SC scheme.
From the above results, we observe that when the number of users increases to 100, our proposed scheme roughly satisfies user demands ψ_j but yields a PLR higher than 10%. Nevertheless, PIRS3A still outperforms the other two schemes in terms of all QoS metrics.

VIII. CONCLUSIONS AND FUTURE WORK
This paper proposed two innovative simulated annealing-based solutions, RS3A and PIRS3A, designed to solve the complex user association and resource allocation (UA-RA) problem in HetNets. The UA-RA problem was formulated as an MKP in which the BSs represent the knapsacks and the UEs are the items to be fitted into them.
It was shown that the decentralized scheme PIRS3A converges to an optimal solution, in comparison to the two other versions of SA (DSA and RS3A) and DGA, with reduced complexity and low overhead time. It was also demonstrated that the optimization function represents a hyper-plane and is convex, that the maximum BS capacity and TBS index constraints are convex and separately reflect half-spaces, and that the intersection of these half-spaces and hyper-planes forms a polyhedron.
The numerical results demonstrate that context-awareness factors like maximum BS capacity, UE demands, and channel conditions (TBS index) significantly improve resource utilization and result in user QoS improvements. Future work will include 1) integration of dynamic interference mitigation into the UA-RA problem and a novel approach solving them jointly, and 2) consideration of quality of experience (QoE) estimation in the problem-solving process.

APPENDIX A
PROOF OF LEMMA 1

Let p represent the vector of item values and x represent the association vector, i.e., each entry of x is either 1 or 0. The problem is to maximize p · x; in each episode, p · x can achieve some value, which we denote Z_max.
p · x = Z_max    (17)

Eq.(17) represents a hyper-plane (Fig. 14). To prove that a hyper-plane is convex, we must show that any weighted linear combination of two of its points remains on it. Take two points x_1 and x_2 on the hyper-plane, with p · x_1 = Z_max and

p · x_2 = Z_max    (20)

Then, for a weight θ,

p · (θ x_1 + (1 − θ) x_2) = θ (p · x_1) + (1 − θ) (p · x_2) = θ Z_max + (1 − θ) Z_max = Z_max    (21)

If there are no restrictions on θ, the set is also affine; every affine set is convex, but the converse is not true.

He has authored over 140 international publications and over ten patents, and has contributed to standards at the Broadband Forum. His main research is in the area of 5G optical networks, where he carries out pioneering work on the convergence of fixed-mobile and access-metro networks and on the virtualization of next-generation networks, and he has been invited to share his vision through several keynotes and talks at major international conferences across the world.
Gabriel Miro Muntean is a Professor with the School of Electronic Engineering, Dublin City University (DCU), Ireland, and Co-Director of the DCU Performance Engineering Laboratory. Prof. Muntean was awarded the PhD degree by DCU for research on adaptive multimedia delivery in 2004 and BEng and MEng degrees in Software Engineering by "Politehnica" University of Timisoara, Romania in 1996 and 1997, respectively. He has published over 400 papers in top-level international journals and conferences, authored four books and 21 book chapters, and edited six additional books. His research interests include quality, performance, and energy saving issues related to multimedia and multiple sensorial media delivery, technology-enhanced learning, and other data communications over heterogeneous networks. Prof. Muntean is an Associate Editor of the IEEE Transactions on Broadcasting, the Multimedia Communications Area Editor of the IEEE Communications Surveys and Tutorials, and chair and reviewer for important international journals, conferences, and funding agencies. Prof. Muntean is senior member of IEEE and IEEE Broadcast Technology Society.