ACO-Based Scheme in Edge Learning NOMA Networks for Task-Oriented Communications

Conventional data-centric communication systems prioritize maximizing network throughput using Shannon's theory, which is primarily concerned with securely transmitting the data despite limited radio resources. However, in edge learning, these methods frequently fall short because they depend on traditional source coding and channel coding principles and therefore fail to improve learning performance. Consequently, it is crucial to shift from a data-centric viewpoint to a task-oriented communications approach in wireless system design. In this paper, we propose efficient communications under a task-oriented principle by optimizing power allocation and edge learning-error prediction in an edge-aided non-orthogonal multiple access (NOMA) network. We propose a novel approach based on the ant colony optimization (ACO) algorithm to minimize the learning error by optimizing the power allocation variables. Moreover, we investigate four additional benchmark schemes (particle swarm optimization, quantum particle swarm optimization, cuckoo search, and butterfly optimization algorithms). Simulation results validate the superiority of the ACO algorithm over the baseline schemes, achieving the best performance with less computation time. In addition, the integration of NOMA in the proposed task-oriented edge learning system obtains higher sum rate values than those achieved by conventional schemes.


I. INTRODUCTION
Deployments and applications in the Internet of Things (IoT) involve a vast network of interconnected users that generate substantial amounts of data. However, transmitting large quantities of data from diverse IoT devices to a distant cloud server creates significant communications challenges and increases latency in transmissions [1]. Consequently, the concept of edge computing has emerged as an alternative to traditional cloud computing to tackle these issues. Edge computing harnesses the storage, communication, and computational capabilities available at edge servers to efficiently collect and manage this immense volume of data. Additionally, edge servers facilitate quick access to the extensive data distributed across end-user devices, enabling rapid model learning and the delivery of intelligent services and applications to IoT users [2]. Aligned with cutting-edge smart IoT sensors in 5G networks and the anticipated 6G networks, edge computing is evolving into edge intelligence, ushering in a new era of more sophisticated and intelligent IoT applications and services [3].
The associate editor coordinating the review of this manuscript and approving it for publication was Christian Pilato.
In the realm of edge learning, the primary goal is to swiftly acquire intelligence from the abundant, yet widely dispersed data generated by subscribed IoT users. This hinges critically on the processing of data at edge servers, and on establishing efficient communication between these servers and IoT users. However, as opposed to the ever-increasing processing capabilities of edge servers, communications encounter obstacles in the form of wireless channel issues, making it the bottleneck in achieving ultra-fast edge learning [4]. Additionally, the diverse nature of ubiquitous IoT users and the complexity of transmission environments introduce interference that significantly undermines the reliability and communication speed of the IoT network, particularly when transmitting vast amounts of data to an edge server [5]. To tackle these challenges, traditional data-centric communication systems aim to maximize network throughput based on Shannon's theory, which focuses on securely transmitting data despite constrained radio resources [6]. Nonetheless, such approaches often prove ineffective in the context of edge learning, because they rely on classic source coding and channel coding theories, failing to enhance learning performance. Therefore, a shift in wireless system design is imperative, moving from a data-centric perspective to one that prioritizes task-oriented communication [1], [7].
In this regard, task-oriented communication aims to extend its scope beyond transmitting data at the micro level, where performance is assessed based on factors like bit or packet error rates, and instead emphasizes communication experiences that consider macro-level performance metrics, such as learning rate and inference accuracy [8]. In particular, task-oriented communication can lessen the communication load by supplying only task-relevant information, such as extracted features for edge inference, as opposed to sending all the data and ignoring information structures [9]. For instance, in [10], the authors introduced a learning-driven communication strategy designed to optimize local feature extraction and distributed feature encoding for task-oriented purposes. This approach aims to eliminate redundant data and transmit only the crucial information needed for downstream inference tasks, instead of reconstructing data samples at the edge server. In [1], the authors achieved effective communication within the task-oriented framework by optimizing power allocation and edge learning error prediction. Moreover, they implemented multi-user scheduling to mitigate interference issues in densely populated networks. Similarly, in [11], the authors focused on enhancing learning performance rather than communication throughput in the edge learning network. They proposed an approach called learning-centric power allocation (LCPA), which is an analytically based solution for allocating radio resources in scenarios driven by learning. Simulation results showed that the LCPA scheme outperformed conventional power allocation methods in terms of classification error.
None of the previous papers discussed above considered non-orthogonal multiple access (NOMA) to address spectrum scarcity issues caused by the massive connectivity for future wireless networks in task-oriented communication systems. Indeed, the IoT's rapid advances for 5G and beyond wireless networks must accommodate the massive connectivity demands imposed by the rapid growth in IoT devices. However, this reality introduces a spectrum scarcity issue, which can be dealt with through the adoption of a NOMA transmission strategy that operates in the power domain and employs techniques like superposition coding and successive interference cancellation [12]. Thus, motivated by the benefits provided by the NOMA technique and next-generation communication systems envisioned to be task-oriented, in this paper, we investigate a low-complexity design to optimize the learning error and power allocation for task-oriented communications in an edge learning NOMA network. In the pursuit of finding optimal solutions, traditional optimization methods can introduce significant computational complexity. Additionally, these methods lack flexibility, requiring reformulation whenever network alterations occur [13]. As a remedy to these limitations, the realm of metaheuristic algorithms within artificial intelligence (AI) has emerged, offering a powerful approach to addressing intricate computation problems. Metaheuristic algorithms systematically generate potential solutions for optimization challenges, subsequently selecting the most promising option, while maintaining a balance between computational efficiency and solution accuracy.
In the domains of science and engineering, several metaheuristic algorithms have garnered substantial popularity, including the genetic algorithm [14], cuckoo search (CS) [15], ant colony optimization (ACO), and particle swarm optimization (PSO) [16]. Notably, Mohiz et al. [15] explored diverse metaheuristic algorithms and assessed their effectiveness in optimizing task placement within network-on-chip cores. Their study concluded that CS outperformed baseline schemes by exhibiting minimal computational overhead. In [13], quantum particle swarm optimization (QPSO), an extension of the PSO algorithm, was applied to optimization problems for wireless communication networks. The simulation results showed that QPSO outperformed the standard PSO and several metaheuristic methods in maximizing the secrecy energy efficiency in a cooperative NOMA system. Motivated by the benefits that metaheuristic algorithms bring to complex optimization problems, this paper adopts evolutionary computing algorithms to solve power allocation optimization problems in an edge-learning NOMA system. Among these strategies, we propose an ACO-based scheme as a promising candidate that strikes a balance between accuracy and computational complexity. To comprehensively assess the performance of our proposed network, we formulate two distinct optimization problems: a single-task case (SC) and a multiple-task case (MC).
The main contributions of this paper can be summarized as follows.
• A task-oriented power allocation scheme is proposed for SC and MC optimization problems where the NOMA transmission strategy is considered in an edge learning system to improve network performance. To validate the advantage of using NOMA, we evaluate its performance against the maximum ratio combining (MRC) technique in terms of achievable data rate.
• Furthermore, we propose a novel power allocation scheme based on the ACO algorithm. This scheme is aimed at minimizing learning errors while optimizing power variables within a task-centric edge learning NOMA system. In addition, for comparison purposes, we solve the conventional sum-rate maximization problem. To assess the practical applicability of our algorithms, we examine their performance in the context of three perception tasks related to autonomous driving systems [17].
• Simulation results show that the proposed ACO-based algorithm efficiently resolves task-specific power allocation issues with significantly reduced computation time compared to conventional baseline schemes. In particular, we investigate four additional algorithms: QPSO [13], PSO, CS, and butterfly optimization [18]. In addition, our simulations reveal that the implementation of NOMA in a task-oriented edge learning system achieves higher data rates in comparison to the standard MRC technique.
The rest of the paper is structured as follows. The system model is described in Section II. In Section III, we present the problem formulation. In Section IV, we describe the proposed ACO-based optimization scheme. In Section V, we provide the simulation results and the computational complexity analysis. In Section VI, we outline unsolved problems and provide future research directions. Finally, conclusions are described in Section VII.

II. SYSTEM MODEL
We consider the task-oriented edge learning NOMA system shown in Figure 1, in which the edge server is equipped with $N$ antennas. We consider $L$ user groups with $L$ different learning tasks, $\{\tau_1, \dots, \tau_L\}$, where the $l$-th group contains the users executing the $l$-th task. We assume each user is exclusively associated with a single task, and the total number of users is denoted as $K$. Each of the $L$ distinct learning tasks depicted in Figure 1 encompasses a specific dataset, a learning model, the process of fine-tuning model parameters, and a task-oriented power assignment. Moreover, $\mathbf{h}_k \in \mathbb{C}^{N \times 1}$ denotes the complex-valued fast-fading channel vector from the $k$-th user to the edge server. It is assumed that users in a higher-indexed group have better channel conditions than users in a lower-indexed group [19]. Without loss of generality, we consider $\|\mathbf{h}_1\|^2 \le \dots \le \|\mathbf{h}_K\|^2$.
Following NOMA principles for uplinks that utilize the successive interference cancellation (SIC) technique, the edge server decodes the users' signals in descending order of power. This means the signal with the highest power, associated with the $k$-th user, is decoded first at the edge server. Consequently, this initial decoding contains interference from all users that have comparatively weaker channel conditions. The achievable rate of decoding a message from the $k$-th user is expressed in (1), where $g_{k,j}$ denotes the composite channel gain from the $j$-th user at the edge server when decoding data of the $k$-th user, $\rho_k$ is the path loss of the $k$-th user, and $\sigma^2$ is the variance of additive white Gaussian noise. Note that under the MRC technique, interference from other users is treated as noise; the corresponding achievable data rate of the $k$-th user is expressed in (2).
At the edge server, the number of samples transmitted by a user to learn task $\tau_l$ is calculated as in (3) [1], where $W$ denotes the total bandwidth in hertz, and $T$ represents the transmission time in seconds. $D_l$ represents the number of bits for each data transmission, and $R_l$ represents the initial amount of historical data for the $l$-th pre-trained task. Furthermore, in order to link wireless resource allocation with the performance of machine learning, a nonlinear exponential function is formulated to represent the characteristics of the learning error function, where the tuning parameters $a_l$ and $b_l$ account for non-independent and identically distributed parallel datasets. In practice, the values of $a_l$ and $b_l$ are determined by fitting the learning error function to the historical dataset. The fitted function closely matches the empirical data of the machine learning model [1].
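As a rough numerical illustration of the rate and learning-error model described above, the following Python sketch compares SIC-based uplink NOMA rates with the MRC baseline and maps the rates to a learning error. The closed forms are assumptions rather than the paper's exact expressions: a log2(1 + SINR) rate in which the k-th user sees interference only from weaker users j < k under SIC (and from all other users under MRC), a sample count equal to the historical samples plus floor(W T r / D) per user, and the power-law error a v^(-b) used in the learning-centric power allocation literature. The function names, the gain matrix layout, and the argument names are hypothetical.

```python
import math


def sic_rates(gains, powers, noise):
    """Uplink NOMA rates in bit/s/Hz under SIC (assumed form).

    Users are indexed by ascending channel strength. User k is decoded
    while only the weaker users j < k remain as interference.
    gains[k][j] is the composite gain of user j when decoding user k.
    """
    rates = []
    for k in range(len(powers)):
        interference = sum(gains[k][j] * powers[j] for j in range(k))
        sinr = gains[k][k] * powers[k] / (interference + noise)
        rates.append(math.log2(1.0 + sinr))
    return rates


def mrc_rates(gains, powers, noise):
    """Baseline rates without SIC: every other user is treated as noise."""
    rates = []
    for k in range(len(powers)):
        interference = sum(
            gains[k][j] * powers[j] for j in range(len(powers)) if j != k
        )
        sinr = gains[k][k] * powers[k] / (interference + noise)
        rates.append(math.log2(1.0 + sinr))
    return rates


def learning_error(group_rates, a, b, bandwidth_hz, time_s, bits_per_sample, history):
    """Assumed power-law learning error a * v**(-b) for one task.

    v is the historical sample count plus the samples each user in the
    task's group can upload within the transmission time.
    """
    uploaded = sum(
        math.floor(bandwidth_hz * time_s * rate / bits_per_sample)
        for rate in group_rates
    )
    samples = history + uploaded
    return a * samples ** (-b)
```

Under these assumptions, SIC can only remove interference terms relative to MRC, so each NOMA rate is at least as large as its MRC counterpart, and a larger rate yields more uploaded samples and a smaller learning error.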

III. PROBLEM FORMULATION
In this paper, we minimize the learning error function by optimizing the power allocation variables in the proposed multi-user NOMA edge learning network with task-oriented communication. We consider both an SC and an MC. The optimization problem for the SC is formulated in (4a), subject to constraints (4b) and (4c), where constraint (4b) indicates that the power allocation of all users does not exceed the total available power, $P$. For the MC, the optimization problem is formulated in (5a), subject to (4b) and (4c), where $R_i D_i$ is the weight of diverse datasets. Note that the objective function in (5) can adapt to various learning tasks by dynamically adjusting the weight factors $\varphi_l$, $\forall l$. Furthermore, for comparison purposes, we formulate the traditional sum-rate maximization problem in (6a), subject to (4b) and (4c).
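For orientation, one plausible way to write the multiple-task problem is sketched below in LaTeX. It is an assumed reconstruction, not the paper's exact formulation: the rate symbol $r_k$, the group symbol $\mathcal{K}_l$, the sample count $v_l$, the power-law error $a_l v_l^{-b_l}$ (borrowed from the learning-centric power allocation literature), and the non-negativity constraint are all assumptions introduced here. The single-task case would correspond to $L = 1$ with $\varphi_1 = 1$.

```latex
% Assumed notation: r_k (SIC rate of user k), \mathcal{K}_l (users of task l),
% v_l (number of samples available for task l).
\begin{align}
r_k &= \log_2\!\left(1 + \frac{g_{k,k}\,p_k}{\sum_{j<k} g_{k,j}\,p_j + \sigma^2}\right), \\
v_l &= \sum_{k \in \mathcal{K}_l} \left\lfloor \frac{W\,T\,r_k}{D_l} \right\rfloor + R_l, \\
\min_{\{p_k\}} \quad & \sum_{l=1}^{L} \varphi_l\, a_l\, v_l^{-b_l}
\quad \text{s.t.} \quad \sum_{k=1}^{K} p_k \le P, \qquad p_k \ge 0 \;\; \forall k .
\end{align}
```

Under this reading, the sum-rate baseline would keep the same constraints and replace the objective with the maximization of $\sum_{k} r_k$.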

IV. ACO-BASED OPTIMIZATION METHOD
In this article, we investigate the ACO algorithm to address the optimization problems formulated in (4a), (5a), and (6a). The ACO algorithm was initially introduced for discrete spaces, drawing inspiration from the foraging behavior of real ants [20]. In the ACO algorithm, a group of artificial ants collaborates to discover the optimal solution by utilizing an artificial pheromone trail to navigate through potential paths or solutions. This sharing of information through pheromone deposition enables the ants to construct effective paths using a discrete probabilistic approach. Nevertheless, the inherent pheromone deposition mechanism of ACO is tailored for discrete domains, necessitating adaptation for continuous spaces. The ACO algorithm designed for continuous domains achieves this by replacing discrete probability distributions with continuous probability distributions, which are represented as probability density functions (PDFs) [21]. Within each iteration of the ACO algorithm, the system learns these PDFs by accumulating a historical record of candidate solutions. To be more specific, ACO retains the most promising solutions in a solution archive, as depicted in Figure 2. This practice forms a Gaussian probabilistic model that implicitly simulates the concept of the continuous pheromone. It is worth noting that a single Gaussian function cannot describe several promising regions of the search space because it possesses only one maximum. To overcome this limitation, a Gaussian kernel PDF is employed, which is essentially a weighted sum of several one-dimensional Gaussian functions. Furthermore, to cater to multi-dimensional search spaces, a distinct Gaussian kernel PDF is constructed for each dimension. Each of these Gaussian kernel PDFs utilizes three essential vectors: the weights, the vector of means, and the vector of standard deviations. This comprehensive approach allows ACO to adapt to and excel at the optimization of real-valued parameters in continuous domains, making it a powerful tool for solving complex optimization problems.
In this paper, we consider ACO to optimize the power allocation variables, $\{p_k\}$, which minimize the cost function given by the learning error in both the SC and the MC. The set of power variables, $p_k$, is denoted by the vector $C$. Let us change the notation of $p_k$ to $c^k$, where $k = 1, \dots, K$ indicates the index of the dimension of the vector. In this paper, $K = 6$ since we consider six users, each of which is assigned one power allocation variable, $c^k$. Accordingly, the group of variables is $C = (c^1, c^2, c^3, c^4, c^5, c^6)$. Then, to build the solution archive [21], random solutions are generated in the range $[0, P]$ for each $c^k$. Every solution $C_j$ of the archive is composed of a vector of the variables to be optimized, $(c^1_j, c^2_j, \dots, c^k_j, \dots, c^K_j)$, the associated objective function value, $f(C_j)$, and the associated weight, $we_j$, where $j = 1, 2, \dots, r$. The solution archive is built in ascending order according to the objective function values, $\{f(C_1), f(C_2), \dots, f(C_r)\}$. Moreover, the weight of the $j$-th element in the solution archive is expressed in (7), where $\zeta$ is the intensification factor, a positive real number that manages the degree of selection pressure in the process. For high values of $\zeta$, a large number of possible solutions in the archive may be chosen. Therefore, as the value of $\zeta$ rises, convergence may be slower.
In addition, $f_r$ denotes the sample size of the new candidate solutions. The generation of the $f_r$ candidates is based on a guide solution. For this purpose, the roulette wheel selection algorithm is utilized to choose a guide solution from the solution archive in accordance with (8), such that the higher the fitness, the higher the probability of selection as a parent solution. After the guide solution is chosen, the $f$-th new candidate solution is generated based on the Gaussian PDF. Thus, for every decision variable, a Gaussian model is built, where $c^k_{\mathrm{guide},f}$ indicates the $k$-th element of the vector $C_{\mathrm{guide}}$ in the guide solution. The mean and variance of the Gaussian kernels associated with the guide solution are then computed. Each generated solution is saved and assessed by solving problem (4a) or (5a) to obtain $f(C_f)$. Accordingly, in each iteration $It$, where $It = 1, 2, \dots, Ite_{ACO}$, the solution archive is updated to retain the best outcomes recorded to this point. $Ite_{ACO}$ denotes the maximum number of iterations in the algorithm. When the termination criterion is met, the best result found so far is returned as the optimization result; otherwise, the ACO algorithm recalculates the probabilistic model. Algorithm 1 summarizes the ACO algorithm to minimize the learning error in problems (4a) and (5a) while optimizing the power allocation $\{p_k\}$. Note that optimization problems (4a) and (5a) are solved individually by the ACO algorithm.
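The archive-based procedure described above can be sketched in Python as follows. This is a minimal illustration that follows the commonly used continuous-domain ACO formulation (rank-based Gaussian weights, roulette-wheel choice of a guide solution, and per-dimension Gaussian sampling whose spread is the mean absolute distance from the guide to the other archive entries), which may differ in detail from the scheme summarized in Algorithm 1. The function name, the parameter names (`zeta`, `xi`, `samples`), their default values, and the clipping to the box $[0, P]$ are assumptions; the sum-power constraint is not enforced here and would need, for example, a penalty term inside `cost`.

```python
import math
import random


def aco_continuous(cost, dim, upper, archive_size=10, samples=5,
                   zeta=0.5, xi=0.85, iterations=100, seed=0):
    """Minimize cost(x) over the box [0, upper]^dim with an archive-based ACO."""
    rng = random.Random(seed)

    # Solution archive: random solutions in [0, upper], sorted best-first.
    archive = [[rng.uniform(0.0, upper) for _ in range(dim)]
               for _ in range(archive_size)]
    archive.sort(key=cost)

    # Rank-based Gaussian weights: rank 0 (the best solution) weighs the most.
    # A larger zeta flattens the weights, so worse solutions are picked more often.
    width = zeta * archive_size
    weights = [math.exp(-(rank ** 2) / (2.0 * width ** 2))
               / (width * math.sqrt(2.0 * math.pi))
               for rank in range(archive_size)]

    for _ in range(iterations):
        candidates = []
        for _ in range(samples):
            # Roulette-wheel selection of the guide solution.
            guide = rng.choices(range(archive_size), weights=weights)[0]
            candidate = []
            for k in range(dim):
                mean = archive[guide][k]
                # Spread: average distance from the guide to the other entries.
                spread = sum(abs(sol[k] - mean) for sol in archive) / (archive_size - 1)
                value = rng.gauss(mean, xi * spread)
                # Keep the sampled variable inside the box (assumed handling).
                candidate.append(min(max(value, 0.0), upper))
            candidates.append(candidate)
        # Merge, re-rank, and keep only the best archive_size solutions.
        archive = sorted(archive + candidates, key=cost)[:archive_size]

    best = archive[0]
    return best, cost(best)
```

Because the archive is re-ranked after every iteration and truncated to its best entries, the best cost in the archive never increases from one iteration to the next.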

V. SIMULATION RESULTS
In this section, we showcase simulation results to assess the performance of the designed schemes in comparison to benchmark methods. To generate the simulation results, we used a computer with a 4 GHz i7-6700K CPU and 16 GB of RAM. The simulation parameter settings for the proposed wireless communications were similar to those in [11]. Specifically, we set the communication bandwidth to $W = 180$ kHz, the number of users to $K = 6$, and the total power budget to $P = 20$ mW; the path loss of the $k$-th user was $\rho_k = -90$ dB, and channel $\mathbf{h}_k$ was generated according to $\mathcal{CN}(\mathbf{0}, \rho_k \mathbf{I})$. Since we considered multiple-user connectivity in an IoT network, we assumed the number of user sets is equal to the number of tasks, $L$ [1]. Therefore, we considered three tasks, one for each set of users (i.e., three user sets). Each set was composed of two users. For task-oriented learning at the edge, we considered three tasks in autonomous driving [17]: Task 1, weather classification utilizing RGB images and a CNN; Task 2, object detection using point cloud data and sparsely embedded convolutional detection (SECOND); and Task 3, traffic design using RGB images and YOLOV5. In these experiments, datasets were generated by the open-source autonomous driving simulation platform called CarlaFLCAV. These datasets are available online at https://github.com/SIAT-INVS/CarlaFLCAV. In the simulation experiments, Task 2 was selected to perform the SC. Meanwhile, the three-task case representing the MC is given by Task 1, Task 2, and Task 3. The size of each RGB image was $D_1 = D_3 = 0.7$ MB, and that of each point cloud sample was $D_2 = 1.6$ MB. Moreover, the number of historical data samples was $R_1 = R_2 = R_3 = 300$. The learning parameters for Task 1, Task 2, and Task 3 were $(a_1, b_1) = (10.34, 1.2)$, $(a_2, b_2) = (0.5, 0.1)$, and $(a_3, b_3) = (8.89, 0.64)$, respectively. The results were averaged over several channel realizations.
In the simulations, we considered five swarm-intelligence schemes: ACO, PSO, QPSO, CS, and the butterfly optimization algorithm (BOA). The settings for the parameters of each algorithm listed in Table 1 were determined by analyzing the best outcomes obtained from several experiments. Figure 3 and Figure 4 show the convergence behavior for the SC and the MC, respectively, of the proposed ACO algorithm and the four additional swarm intelligence baseline schemes when the number of antennas is $N = 6$ and the variance of additive white Gaussian noise is $\sigma^2 = -90$ dBm. The learning error is computed by (4a) and (5a) for the SC and the MC, respectively. From Figure 3 and Figure 4, we observe that as the number of iterations increased, the learning error decreased. Moreover, for both the SC and the MC, PSO achieved faster convergence than the other swarm-learning algorithms, followed by CS and the proposed ACO algorithm. On the other hand, the worst performance was given by BOA, followed by the QPSO algorithm.
To validate the superiority of the ACO algorithm over PSO and CS, we evaluated the algorithms in terms of computational complexity and computation time. In particular, the computational complexity of PSO depends on the number of particles, $N_{PSO}$, and the number of iterations, $Ite_{PSO}$. Therefore, its computational complexity is expressed as $O(N_{PSO} \cdot Ite_{PSO})$. The computational complexity of CS relies on the number of nests, $N_{CS}$, the probability of abandonment, $p_a$, and the number of iterations, $Ite_{CS}$. Therefore, the computational complexity for CS is given by $O(N_{CS} + (p_a \cdot N_{CS}) \cdot Ite_{CS})$. Regarding the proposed ACO algorithm, the computational complexity is based on the sample size, $f_r$, and the number of iterations, $Ite_{ACO}$, which results in $O(f_r \cdot Ite_{ACO})$. Accordingly, the ACO algorithm has the lowest computational complexity because it requires fewer particles and iterations than the PSO and CS algorithms. In Table 2, we evaluate the investigated schemes in terms of computation time and learning error performance. Overall, we can see that the learning error values vary slightly between the algorithms in both the SC and the MC. However, the main difference among them is computation time; the least computation time was obtained by the proposed ACO algorithm, whose result is remarkably lower than those obtained by CS and PSO. This can be attributed to the fact that the ACO algorithm achieves convergence with a smaller number of particles compared to its counterparts. Moreover, in Table 2, we compare the learning performance and computational complexity between the swarm-based algorithms and the traditional exhaustive search (ES) method, typically employed to identify optimal solutions in optimization problems. The ES technique, while thorough, is burdened by its significant computational demands and slow convergence rate due to its systematic evaluation of every potential solution. Notably, the computational complexity of exhaustive search scales with the number of candidate solutions. In contrast, our proposed ACO-based scheme offers expedited convergence towards near-optimal solutions and requires less computational overhead compared to ES.
Figure 5 and Figure 6 show the learning error computed by (4a) and (5), respectively, considering different available transmission power levels. To gain more insight into the proposed edge learning system, Figure 5 and Figure 6 show the learning error versus the number of antennas, with two values for the variance of additive white Gaussian noise: $\sigma^2 = -80$ dBm and $\sigma^2 = -90$ dBm. We can see from Figure 5 and Figure 6 the benefit of a multiple-antenna system, because as the number of antennas increased, the learning error decreased for both the SC and the MC. This demonstrates the advantage of multiple antennas at the edge server.
Furthermore, to validate the advantage of the NOMA system against the conventional MRC method, Figure 7 and Figure 8 show the sum rate, given by the summation of the users' rates, versus the total power budget. Overall, observe that as the transmission power increased, the sum rate improved. This is because by increasing the total available power, a greater amount of power becomes available for data transmission from the users to the edge server, resulting in an increase in the achievable rate that minimizes the objective function defined by the learning error. It is worth noting that according to equations (3) and (5), an increase in the sum rate corresponds to a decrease in learning error. Moreover, from Figure 7 and Figure 8, we can see that the edge learning system with the NOMA technique outperformed the conventional edge learning with the MRC technique in terms of sum rate. This is because NOMA can remove the interference from other users by applying SIC. In this manner, the decoding process is carried out in descending order according to the channel conditions, as expressed in (1), instead of treating the interference from other users as noise, as in (2).
Figure 9 shows the relationship between learning error and transmission power for our proposed scheme in (5), which is designed to minimize the learning error, and for the baseline scheme in (6), which is primarily focused on the traditional goal of maximizing the sum rate. The figure reveals a clear trend: as the power budget increased, the learning error decreased across all scenarios. Further insight emerges when comparing the two schemes. Our proposed optimization framework in (5) outperformed the conventional sum-rate maximization approach in (6) by achieving a significantly lower learning error. This superiority stems from our incorporation of machine learning (ML) techniques in our optimization problem, a facet that the sum-rate maximization scheme in (6) neglects. Furthermore, Figure 10 shows the edge learning error versus the number of antennas for the proposed task-oriented communication with NOMA and the baseline with MRC. From Figure 10, we can observe that NOMA is able to reduce the learning error compared to the benchmark MRC scheme, underscoring the efficacy of NOMA in enhancing edge learning outcomes. In addition, to showcase the generalization of our proposed ACO-based approach for task-oriented communication in NOMA networks, we consider three distinct architectures for various classification tasks: a 6-layer convolutional neural network (CNN6) deployed for classifying the MNIST dataset [22], a deep residual network consisting of 110 layers (ResNet110) applied to the CIFAR10 dataset [23], and a PointNet model utilized for processing 3D point clouds within the ModelNet40 dataset [24]. The learning parameters for these tasks are: 15, 0.44, 1600, 24584), and $(a_3, b_3, R_3, D_3) = (0.95, 0.24, 800, 192008)$, respectively. For further insights into obtaining these learning parameters, readers are directed to Section III of [11]. Accordingly, Figure 11 shows the learning error performance for the aforementioned three-task case versus the transmission power, $P$, for different numbers of users, $K$, when the number of antennas is $N = 2$. Similar to Figure 9, from Figure 11, we can observe that as the transmission power, $P$, increases, the learning error decreases. On the other hand, from Figure 11, we can see that as the number of users increases, the learning error slightly rises, since the total transmission power allocated to the users must not exceed the maximum value, $P$. Therefore, as more users are served in the system, less power is assigned to each user, and the interference between users increases.
It is worth highlighting that for time-varying channel conditions, the ACO algorithm must be executed within each coherence time, subsequent to channel estimation. Approaches rooted in swarm intelligence offer a promising avenue for optimization, boasting both efficiency and low complexity. These methods yield solutions that are nearly optimal while demanding minimal computational resources and ensuring stable convergence [25]. Such characteristics make them particularly suitable for supporting delay-sensitive applications across wireless communication networks. Particularly in scenarios characterized by highly dynamic channel conditions, rapid algorithms such as ACO that are capable of furnishing nearly optimal solutions become necessary. Furthermore, ACO parameters such as the sample size and the maximum iteration count can be adjusted to expedite convergence of the objective function. However, such alterations entail the risk of obtaining sub-optimal solutions, potentially local optima rather than the global optimum.

VI. FUTURE WORKS
Future directions in research should investigate the integration of reinforcement learning (RL) techniques into the task-oriented communication framework for edge learning systems.RL is a branch of machine learning that focuses on learning optimal decision-making policies through interactions with an environment.In the context of edge learning and communication systems, RL could be utilized to dynamically adapt communication strategies based on the learning task at hand [26].This could involve optimizing power allocation, resource allocation, and scheduling decisions in real-time to maximize the learning performance of edge devices.Here, RL algorithms could adapt communication parameters based on feedback from the learning process, such as error rates or learning progress.
Moreover, within the framework of mobile edge computing systems, mobile devices utilize servers to delegate tasks for low-latency computing, a process that can occur in either partial or binary modes. In the partial mode, computational tasks are divided into two segments: one segment is processed locally on the mobile device while the other is transferred to a nearby mobile edge computing server for execution. Conversely, in the binary mode, the entire task is either completed locally on the device or transferred entirely to a nearby mobile edge computing server via the uplink connection [12], [27]. In terms of future directions, an interesting approach to explore is joint task offloading and resource optimization in NOMA-based vehicular networks [27], [28] with edge computing AI technology. This can be beneficial from the resource management perspective since both the user and the server can perform tasks. However, it also brings complexity and security challenges because of the load burden at the user and the task transmission in an open wireless environment susceptible to eavesdroppers' attacks.
Additionally, reducing the communication overhead plays a crucial role in wireless communications for optimizing network performance, reducing costs, minimizing latency, and improving resource allocation. In this paper, the communication overhead for power allocation with ACO is primarily attributed to uplink channel estimation, which is generally performed using pilot symbols. In particular, let us consider an edge computing system with a coherence interval composed of symbol periods. The pilot sequences assigned to users are designed to be pairwise orthogonal and consist of ς symbols [29]. During the coherence interval, channel estimation takes ς symbols, while the data uplink transmission occurs in the remaining − ς symbols. It is important to note that the required number of symbols in the pilot sequence, ς, increases with the number of users. For example, we can select ς ≥ K to avoid pilot contamination. Therefore, the channel estimation overhead increases with the number of users. However, state-of-the-art schemes for power allocation also require channel estimation procedures. Consequently, the proposed ACO-based scheme incurs a communication overhead similar to that of state-of-the-art schemes. In future work, reducing the channel estimation overhead could be studied through pilot reuse, together with schemes that mitigate the pilot contamination arising in those scenarios.
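The growth of the channel estimation overhead with the number of users can be illustrated with a short calculation. The sketch below is only indicative: the function name `pilot_overhead`, the coherence interval of 200 symbol periods, and the choice ς = K (the smallest pilot length satisfying ς ≥ K) are assumptions made for illustration and are not values taken from this paper.

```python
def pilot_overhead(coherence_symbols, num_users, pilot_symbols=None):
    """Return (overhead fraction, data symbols) for one coherence interval."""
    # Assumption: default to the shortest orthogonal pilot length, i.e. one symbol per user.
    if pilot_symbols is None:
        pilot_symbols = num_users
    if pilot_symbols > coherence_symbols:
        raise ValueError("pilot sequence does not fit in the coherence interval")
    # Symbols left for uplink data after channel estimation.
    data_symbols = coherence_symbols - pilot_symbols
    return pilot_symbols / coherence_symbols, data_symbols


# Hypothetical coherence interval of 200 symbol periods.
for num_users in (5, 10, 20):
    fraction, data = pilot_overhead(200, num_users)
    print(f"K={num_users}: overhead={fraction:.1%}, data symbols={data}")
```

Under these assumptions the overhead fraction grows linearly with the number of users, which is the behavior that pilot reuse aims to relax.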

VII. CONCLUSION
In this paper, we proposed a novel power allocation design based on the ACO algorithm to optimize the learning error in a task-oriented edge NOMA system for an SC and an MC. The proposed ACO-based scheme provides a low-complexity solution because it requires fewer particles and iterations to converge than the benchmark swarm intelligence techniques. Moreover, the ACO algorithm achieved the best performance with less computation time than its counterparts. Furthermore, simulation results demonstrated that integrating NOMA into the proposed task-oriented edge learning system yields higher achievable data rates.

FIGURE 1. The system model of NOMA for task-oriented edge learning.
where λ > 0 represents the exploration and exploitation balance. High values of λ indicate high exploration, whereas small values of λ represent high exploitation. It is worth noting that the $f$-th new candidate solution, $C_f = [c_f^1, \ldots, c_f^K]$ with $f = r + 1, \ldots, r + f_r$, is generated, dimension by dimension, based on the guide solution.

Algorithm 1 ACO-Based Scheme to Solve Problems (4a) and (5a)
1: Inputs: set the parameters $Ite_{ACO}$, $K$, $f_r$, $\zeta$, $\lambda$, $N$, $P$.
2: Compute $r$ random solutions in the range $[0, P]$ for the power allocation variables to be optimized, $\{p_k\}$, denoted by $C_j = [c_j^1, \ldots, c_j^K]$, $j = 1, 2, \ldots, r$, and assess their performance by solving the optimization problems to obtain $f(C_j)$.
3: Sort the $r$ solutions in the archive, $A = \{C_1, C_2, \ldots, C_r\}$.
4: Evaluate the weights in accordance with (7a).
5: while $It \le Ite_{ACO}$ do
6:
9: Generate a sample, $c_f^k$, from a Gaussian distribution with parameters $\mu_{guide,f}^k$, $\sigma_{guide,f}^k$.
end
13: Update solution archive $A = \{C_1, C_2, \ldots, C_{r+f_r}\}$ with the best $r$ candidate solutions and remove the remaining.
14: Sort the $r$ solutions in the archive. Increase the number of iterations: $It = It + 1$.
15: end while
16: Output: Set $C_1 = [c_1^1, \ldots, c_1^k, \ldots, c_1^K]$ as the best solution for power variables $\{p_1, \ldots, p_k, \ldots, p_K\}$ of problem (4a) or (5a).

Finally, $f_r$ new candidate solutions are assessed to obtain $f(C_f)$ and are added to the solution archive: $A = \{C_1, C_2, \ldots, C_{r+f_r}\}$. Then, the best $r$ solutions are preserved for the subsequent iteration, and the remaining solutions are discarded, thereby restoring the solution archive's size to $r$.
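The archive-based loop described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the rank-based Gaussian weights and the rule that sets the spread as λ times the mean distance to the guide follow the standard continuous-domain ACO formulation and are assumed here, since (7a) is not reproduced in this section. The function name `aco_continuous`, its default parameter values, and the toy quadratic objective are hypothetical stand-ins for the learning-error objectives of problems (4a) and (5a).

```python
import numpy as np


def aco_continuous(objective, K, P, r=10, f_r=5, zeta=0.5, lam=0.85,
                   max_iter=100, seed=0):
    """Minimize `objective` over K variables in [0, P] with an archive-based ACO."""
    rng = np.random.default_rng(seed)
    # r random initial solutions in [0, P], each evaluated once.
    archive = rng.uniform(0.0, P, size=(r, K))
    cost = np.array([objective(c) for c in archive])
    # Sort the archive from best (lowest cost) to worst.
    order = np.argsort(cost)
    archive, cost = archive[order], cost[order]
    # Assumed rank-based Gaussian weights: better ranks are chosen as guide more often.
    ranks = np.arange(r)
    weights = np.exp(-ranks ** 2 / (2.0 * (zeta * r) ** 2))
    probs = weights / weights.sum()
    for _ in range(max_iter):
        new = np.empty((f_r, K))
        for f in range(f_r):
            guide = rng.choice(r, p=probs)
            # Per-dimension spread: lam times the mean distance of the archive to the guide.
            sigma = lam * np.abs(archive - archive[guide]).sum(axis=0) / (r - 1)
            # Gaussian sample around the guide, clipped to the feasible power range.
            new[f] = np.clip(rng.normal(archive[guide], sigma), 0.0, P)
        new_cost = np.array([objective(c) for c in new])
        # Merge, keep the best r solutions, and discard the rest.
        archive = np.vstack([archive, new])
        cost = np.concatenate([cost, new_cost])
        order = np.argsort(cost)[:r]
        archive, cost = archive[order], cost[order]
    return archive[0], cost[0]


def toy_error(p):
    """Hypothetical objective standing in for the learning error."""
    return float(np.sum((p - 0.3) ** 2))


best_p, best_err = aco_continuous(toy_error, K=3, P=1.0)
print(best_p, best_err)
```

A larger `lam` widens the sampling spread (more exploration), whereas a smaller value concentrates the samples around the guide (more exploitation). Reducing `r`, `f_r`, or `max_iter` shortens the run time at the risk of stopping at a local optimum.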

FIGURE 3. Convergence behavior of the swarm intelligence schemes for the SC.

FIGURE 4. Convergence behavior of the swarm intelligence schemes for the MC.

FIGURE 5. Learning error versus number of antennas in the SC.

FIGURE 6. Learning error versus number of antennas in the MC.

FIGURE 7. Sum rate versus transmission power in the SC between the proposed edge learning task-oriented communications with NOMA and the baseline with MRC.

FIGURE 8. Sum rate versus transmission power in the MC between the proposed edge learning task-oriented communications with NOMA and the baseline with MRC.

FIGURE 9. Learning error versus transmission power in the MC.

FIGURE 10. Learning error versus number of antennas between the proposed edge learning task-oriented communications with NOMA and the baseline with MRC.

FIGURE 11. Learning error versus transmission power and number of users in the MC.

TABLE 1. Simulation parameters for swarm-based schemes.