Software-Defined GPU-CPU Empowered Efficient Wireless Federated Learning With Embedding Communication Coding for Beyond 5G

With the spread of the intelligent Internet of Things (IoT) in beyond-5G networks, wireless federated learning (WFL) has attracted considerable attention as a way to enable knowledge construction and sharing among a huge number of distributed edge devices. However, under unstable wireless channel conditions, existing WFL schemes face the following challenges. First, learning model parameters are disturbed by bit errors caused by interference and noise during wireless transmission, which degrades the training accuracy and the loss of the learning model. Second, traditional edge devices with CPU acceleration are inefficient because of their low computational throughput, especially in accelerating the encoding and decoding process during wireless transmission. Third, current hardware-level GPU acceleration methods cannot optimize complex operations, for instance, complex wireless coding in the WFL environment. To address these challenges, we propose a software-defined GPU-CPU empowered efficient WFL architecture with embedded LDPC communication coding. Specifically, we embed wireless channel coding into the server weight aggregation process and the client local training process to resist interruptions in the learning process, and we design a GPU-CPU acceleration scheme for this architecture. The experimental results demonstrate its anti-interference ability and GPU-CPU acceleration ability during wireless transmission: it achieves 10 times the error control capability of existing WFL schemes and is 100 times faster.

(UAV) [9], [10], robots [11], health care [12], supply chain finance, and so on. However, in the 6G era, the paradigm shift from 'interconnected things' to 'interconnected intelligence' through modern machine learning technology is facing three major challenges. First, transferring data that contains private information to a cloud server is vulnerable to eavesdropping and data tampering attacks. Second, because wireless channel transmission resources are limited, training a machine learning model that aggregates a large amount of distributed data over the wireless network channel may lead to network congestion, which causes excessive network delay and reduces the efficiency of machine learning.
To solve the above problems, researchers have studied and designed federated learning (FL) [13]. FL is a novel distributed learning system that is considered a promising privacy-sensitive and low-latency solution for intelligent IoT applications, with the ability to utilize distributed computing resources. We consider a wireless network with one server and several clients. In this environment, if all client data were simply transmitted to the server, it would place a heavy burden on the resource-limited wireless network, and the server could not obtain all the data quickly, which would degrade the performance of the FL model. Compared with traditional machine learning, FL offers better prospects for data privacy. FL allows the training data to stay local and only requires each IoT device to upload its locally updated model to an edge aggregation server over wireless transmission during model training. This enhances device privacy and data security by preventing data collected on IoT devices from being leaked to other devices and aggregation servers. Within FL, WFL [14], [15], [16], [17], [18] is the most recent research hotspot, and many new studies have emerged around it, covering communication problems [19], [20], user selection problems [21], [22], [23], [24], [25], security problems [26], [27], and automatic modulation classification problems [28], [29]. WFL relies on both cloud servers and edge devices. A complete federated learning process includes multiple rounds. In each round, each participant uses local data to perform local training to obtain a local model, and sends the weight parameters to the cloud server through wireless channel coding, which can detect and correct errors in wirelessly transmitted digital signals and enhance the ability of the data to withstand various kinds of interference.
The cloud server aggregates the weight parameters of all participants to obtain a global model. Then, the weight parameters of the global model are sent back to the edge devices for the next round of training, and this repeats until the model converges. WFL can share federated learning models to complete learning tasks without sharing training data. Because the data does not leave the local device during training, the data security of the participants is guaranteed, and a lot of communication overhead is saved.
Although WFL has the above characteristics, its performance is still affected by the following factors [30]. First, learning model parameters are disturbed by bit errors caused by interference and noise during wireless transmission. In the wireless transmission medium, unstable uplink and downlink transmission causes the weight parameters of the model to be transmitted incorrectly. Second, traditional edge devices with CPU acceleration are inefficient because of their low computational throughput, especially in accelerating the encoding and decoding process during wireless transmission. Third, current hardware-level GPU acceleration methods cannot optimize complex operations, for instance, the complex matrix multiplication in wireless channel coding in the WFL environment.
To address the above challenges, we first designed a new architecture that optimizes WFL with wireless channel coding. Specifically, we embed the encoding and decoding process into the local learning process and the server parameter aggregation process of WFL. Through wireless channel coding, the model parameters obtained by learning and aggregation can be checked and corrected while they are transmitted over the wireless channel, so the transmission of WFL model parameters becomes more reliable. Second, we designed a software-defined CPU-GPU hybrid architecture to speed up the encoding and decoding of model parameters by wireless channel coding. Compared with traditional hardware acceleration, software-defined GPU-CPU acceleration can accelerate the wireless channel coding process of specific model parameters through software programming. Third, the software-defined acceleration method we designed is not limited by hardware conditions. It can optimize the transmission of different parameters of different edge devices through software programming, and it is highly portable. Finally, to the best of our knowledge, our work is the first to combine the parameter transmission process with wireless channel coding, using communication to optimize learning. Simulation experiments show that the WFL architecture with embedded wireless channel coding is more resistant to interference than WFL without wireless channel coding. In addition, we show that the GPU-CPU hybrid acceleration architecture can greatly improve the efficiency of wireless channel coding compared with the pure CPU acceleration method and the pure GPU acceleration method.
The remainder of this article is organized as follows. Section II introduces recent related works on WFL optimization and GPU acceleration methods. Section III introduces the system model of the anti-interference WFL architecture embedded with wireless channel coding. Section IV presents the algorithm we designed to implement the anti-interference WFL architecture. In Section V, we design experiments that simulate a real wireless environment to test the anti-interference and acceleration ability of the anti-interference WFL architecture. Finally, Section VI concludes this paper and points out future work.

II. RELATED WORKS
In recent years, research on WFL has received a lot of attention, and researchers have tried to optimize WFL from different aspects. Under the condition of limited wireless network resources [31] and limited energy resources of the clients participating in WFL local training [23], [32], Zhou proposed a bandwidth allocation algorithm with low energy consumption [33], which enables clients to engage in learning more sustainably. Xu and Wang proposed to intelligently select the clients participating in WFL local learning based on energy consumption from the long-term perspective of learning as a whole [34], rather than being limited to individual learning rounds. By jointly optimizing client selection and bandwidth allocation under long-term client energy constraints, the long-term performance of wireless federated learning is guaranteed in complex network environments. Based on the expected convergence speed of the WFL algorithm, Chen et al. quantify the influence of wireless factors on WFL and, for a given user selection and uplink resource block (RB) allocation scheme [35], derive the optimal transmit power for each user, thereby optimizing user selection and uplink RB allocation to minimize the WFL loss function. The above studies all aim to optimize the resource allocation of clients participating in WFL local training, and do not consider how the instability of the wireless channel affects the transmission of model parameters between the client and the server in a complex wireless environment.
In the wireless network environment, wireless channel coding is the guarantee of fast and correct data transmission. Afshin Abdi compresses the stochastic gradient (SG) transmitted in WFL based on Random Linear Coding (RLC) to reduce communication overhead and accelerate convergence [36]. Amiri and Gündüz propose a new analog scheme called A-Distributed Stochastic Gradient Descent (A-DSGD), which exploits the additive nature of the wireless MAC for over-the-air gradient computation [37]. In A-DSGD, the devices first sparsify their gradient estimates and then project them into a low-dimensional space imposed by the available channel bandwidth. These projections are sent directly over the MAC without using any digital code. Because A-DSGD utilizes the limited bandwidth more efficiently and benefits from the natural alignment of gradient estimates across the channel, it converges faster. Although the above studies optimize the communication process of model parameter transmission between client and server in a wireless environment, they do not consider the influence of wireless transmission errors on model parameters. We simulate the communication process of model parameter transmission between client and server in WFL and embed wireless channel coding in it, so that the accuracy of model parameter transmission is ensured under different bit error rates, which in turn preserves the accuracy and reduces the loss of WFL.
With the development of hardware technology, computers have more and more CPU computing units, which makes programs run faster and faster. However, in some scenarios the GPU has far more computing units than the CPU. Even though the computing power of a single GPU unit is lower than that of a CPU unit, the GPU can perform thousands of calculations at the same time, which is more efficient. In the communication process of WFL, GPU acceleration can greatly accelerate the encoding and decoding of model parameters, which improves the efficiency of learning. Ling and Cautereels studied the GPU as a digital signal processing accelerator for cloud RAN [38], which improves data throughput and accelerates the LDPC decoding process. Chance Tarver changes the parallelization strategy that maps GPU cores to blocks, either using many GPU cores to quickly compute one codeword for low latency, or using the cores to process multiple codewords simultaneously for high-throughput applications [39]. The above studies speed up the encoding and decoding process through GPU hardware optimization. By analyzing the operation characteristics of wireless channel coding, we propose a GPU-CPU hybrid architecture at the software level, which accelerates the encoding and decoding of the model parameters communicated between the WFL client and server.

III. SYSTEM MODEL
To reduce the influence of wireless channel instability on the communication process of WFL and to improve the anti-interference ability of WFL, we designed a WFL architecture with embedded wireless communication coding, together with a software-defined GPU-CPU hybrid acceleration architecture that accelerates the encoding and decoding process between the client and the aggregation server. Next, we introduce the anti-interference WFL architecture embedded with wireless communication coding and the software-defined GPU-CPU hybrid architecture for accelerating wireless channel coding.
The proposed framework of the anti-interference WFL architecture embedded with wireless channel coding is shown in Fig. 2. It consists of the three planes listed below.

A. ANTI-INTERFERENCE WIRELESS FEDERATED LEARNING PLANE
This plane is a network formed by connecting an aggregation server with many edge devices that have limited wireless communication resources, such as smartphones, tablets, unmanned autonomous systems, drones, and industrial equipment. To handle the high-frequency communication between the aggregation server and the edge devices, fast and stable wireless communication must be ensured. However, the instability of the wireless channel limits the communication between the aggregation server and the edge devices. Without an error correction method, the model trained by WFL becomes unreliable when errors occur in the wireless transmission or when the system is subjected to Byzantine attacks, data poisoning, or model inference, especially in an open wireless network environment.

B. WIRELESS CHANNEL CODING EMBEDDING PLANE
The wireless channel coding embedding plane is composed of the aggregation server and the edge client. Each end is further divided into a wireless channel coding module and a wireless channel decoding module. The wireless channel coding module takes the aggregated model parameters of the aggregation server and the local model parameters of the edge client as information symbols and encodes the model parameters block by block, so that the wirelessly transmitted data has error detection and even error correction capabilities. The wireless channel decoding module decodes the local model parameters received by the aggregation server and the global model parameters received by the edge client. With wireless channel coding, even if the data is subjected to unstable wireless transmission, the decoded model parameters after error detection and error correction with the parity bits are close to or even identical to those before coding, so the performance of WFL can be guaranteed.
Commonly used wireless channel codes mainly include Low-Density Parity-Check (LDPC) codes, Cyclic Redundancy Check (CRC) codes, Turbo codes, and Polar codes.

C. CPU-GPU HYBRID ACCELERATION ARCHITECTURE PLANE
Many studies have realized hardware-accelerated wireless channel coding, for example by using a graphics processing unit (GPU) as an alternative to FPGA and ASIC decoders to accelerate the LDPC decoding process. Based on the mathematical characteristics of wireless channel encoding and decoding, we define a CPU-GPU hybrid acceleration architecture at the software level that dynamically accelerates the encoding and decoding of model parameters, which is simpler and more portable than hardware acceleration.
The anti-interference WFL framework with embedded wireless channel coding retains the FL property that data does not leave the local device, and through wireless channel coding it further guarantees the communication security between the aggregation server and the edge devices. First, the aggregation server encodes the initial model parameters and sends them over wireless transmission to all edge devices participating in the learning. Each edge device decodes the received model parameters, initializes the local model, and trains it on the local data. After the local training round ends, each edge device encodes its model parameters and returns them to the aggregation server over wireless transmission. The aggregation server decodes all model parameters and aggregates them into the parameters of a global model. This process is repeated until the global model sent to the edge devices converges, at which point the entire WFL task is complete. Throughout this process, the GPU-CPU hybrid acceleration architecture dynamically optimizes the communication between the aggregation server and the edge clients and allocates CPU acceleration and GPU acceleration to the encoding and decoding processes as appropriate. The accuracy of model parameter transmission is thus guaranteed while communication overhead is reduced and computing resources are saved.
To realize this anti-interference WFL architecture, we next present the specific implementation algorithms for embedding wireless channel coding in wireless federated learning and for the software-defined GPU-CPU hybrid acceleration architecture.

IV. ALGORITHM
This section presents the specific implementation algorithms for embedding wireless channel coding in WFL and for the software-defined GPU-CPU hybrid acceleration architecture.

A. WIRELESS CHANNEL CODING EMBEDDING IN WIRELESS FEDERATED LEARNING
To realize the anti-interference WFL architecture, we embed wireless channel coding on both the edge client side and the aggregation server side of WFL. The edge client decodes when it receives global model parameters from the aggregation server and encodes when it sends local model parameters to the aggregation server. The aggregation server encodes every time it sends the aggregated global model parameters to the edge clients and decodes when it receives an update of the local model parameters uploaded by an edge client.
As shown in Figure 3, this framework consists of one cloud server and $N$ edge clients. Each user $k$ has a local dataset $D_k$. For each local dataset $D_k$, $x_{ki}$ is the input vector of user $k$, and $y_{ki}$ is its corresponding output (we currently only consider single-output federated learning algorithms). We define the vector $\omega_i$ as the local model parameter trained on $x_{ki}$ and the vector $g$ as the global model parameter aggregated from the $\omega_i$. The task of each client $i$ is to train the optimal parameter $\omega_i$ that minimizes the loss function of its local model. The entire FL process thus translates into solving an optimization problem over the local losses, where $f(\omega_i, x_{ki}, y_{ki})$ is the loss function describing the performance of the FL model obtained from the input vector $x_{ki}$ and the output vector $y_{ki}$. Here $g$ is the weight parameter of the aggregation model, obtained from the model parameters $\omega_i$ uploaded by the edge clients to the aggregation server according to the federated average algorithm. As with any machine learning method, the most important factor affecting learning performance is the weight parameters of the model. For the research in this paper, this is the weight parameter $g$ of the aggregation model. Whether $g$ can be transmitted quickly and accurately is directly related to the performance of WFL. Therefore, we embed wireless channel coding in the WFL model to design an anti-interference WFL architecture with error checking and error correction, and we apply wireless channel coding to the wirelessly transmitted weight parameter $g$ of the aggregation model so that $g$ can be transmitted accurately, which ensures the performance of WFL.
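The aggregation of local parameters into a global parameter vector can be illustrated with a minimal sketch. It assumes the usual federated averaging rule in which each client is weighted by its number of local samples; the function name `federated_average` and the flat-list representation of the weights are illustrative choices, not taken from the paper.

```python
from typing import List, Sequence


def federated_average(local_weights: Sequence[Sequence[float]],
                      num_samples: Sequence[int]) -> List[float]:
    """Average client weight vectors, weighting each client by its sample count.

    Assumption: sample-count weighting (standard FedAvg); the paper only
    states that the federated average algorithm is used.
    """
    if not local_weights or len(local_weights) != len(num_samples):
        raise ValueError("need one sample count per client")
    total = sum(num_samples)
    dim = len(local_weights[0])
    g = [0.0] * dim
    for w, n in zip(local_weights, num_samples):
        for j in range(dim):
            g[j] += w[j] * n / total
    return g


# Two clients holding 1 and 3 samples respectively.
g = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In the proposed architecture, each vector passed to such a function would be the decoded output of the channel decoder rather than the raw received bits.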

B. WIRELESS ENCODING
Wireless channel coding is a process of matrix transformation and matrix operation. First, we need to divide the global model parameters $g$ into information symbols for wireless transmission and transform them into a binary matrix that serves as the basis of the operation. The global model parameter $g$ contains the weight parameter of each node of the model, stored as a floating-point number, so we need to convert the floating-point weights in $g$ into a binary matrix for the encoding operation. We divide each floating-point weight into an integer part $F$ and a fractional part $f$, where the subscript denotes the digit index. We convert the integer and fractional parts to binary, store them in the matrices $A = [a_1, \ldots, a_i, \ldots, a_n]$ and $B = [b_1, \ldots, b_i, \ldots, b_n]$, and then merge the matrices $A$ and $B$. Here $F_n$ is the integer part of $F_{n-1}$ divided by 2, written as $F_n = \lfloor F_{n-1}/2 \rfloor$, and $f_n$ is the fractional part of $f_{n-1}$ multiplied by 2, written as $f_n = f_{n-1} \times 2 - \lfloor f_{n-1} \times 2 \rfloor$.
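The two recurrences can be sketched as follows. The sketch assumes the usual digit rules, namely that each integer-part bit is the remainder of the division by 2 and each fractional-part bit is the integer part of the product with 2; the function names are illustrative.

```python
from typing import List


def integer_bits(F: int) -> List[int]:
    """Binary digits of a non-negative integer, most significant bit first."""
    if F == 0:
        return [0]
    bits = []
    while F > 0:
        bits.append(F % 2)  # remainder gives the next least-significant bit
        F //= 2             # F_n = floor(F_{n-1} / 2)
    return bits[::-1]


def fraction_bits(f: float, n: int) -> List[int]:
    """First n binary digits of a fraction 0 <= f < 1."""
    bits = []
    for _ in range(n):
        f *= 2
        bit = int(f)        # floor(f_{n-1} * 2)
        bits.append(bit)
        f -= bit            # f_n = f_{n-1} * 2 - floor(f_{n-1} * 2)
    return bits


# 5.625 = 101.101 in binary
A = integer_bits(5)
B = fraction_bits(0.625, 4)
```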
Because floating-point numbers are stored in the computer according to the IEEE 754 floating-point standard, a weight can be expressed in terms of a sign bit $S$, an exponent $E$, and a mantissa $M$, that is, in the form $(-1)^S \times M \times 2^E$. Therefore, we also need to convert the matrix obtained by combining $A$ and $B$ into a representation composed of the sign bit $S$, the exponent code $E$, and the mantissa bits $M$. The sign bit $S$ is set according to the sign of the weight: 0 for a positive number and 1 for a negative number. The exponent $E$ equals the number of positions by which the binary point is moved when the number is rewritten in scientific notation. The mantissa $M$ is truncated according to the precision to obtain the mantissa bits. Finally, a binary matrix of weight parameters $X = [S\,M\,E]$ is obtained.
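In practice the bit pattern of a stored weight can be read directly from memory instead of being derived digit by digit. The sketch below uses Python's standard `struct` module and assumes single precision (32 bits) with the standard IEEE 754 field order of sign, exponent, mantissa; the precision and the helper names are assumptions made for illustration.

```python
import struct
from typing import List, Tuple


def float_to_bits(w: float) -> List[int]:
    """Return the 32 IEEE 754 single-precision bits of w, MSB first."""
    (as_int,) = struct.unpack(">I", struct.pack(">f", w))
    return [(as_int >> (31 - i)) & 1 for i in range(32)]


def bits_to_float(bits: List[int]) -> float:
    """Inverse of float_to_bits: rebuild the float from 32 bits."""
    as_int = 0
    for b in bits:
        as_int = (as_int << 1) | b
    (w,) = struct.unpack(">f", struct.pack(">I", as_int))
    return w


def split_fields(bits: List[int]) -> Tuple[int, List[int], List[int]]:
    """Split into sign (1 bit), exponent (8 bits), mantissa (23 bits)."""
    return bits[0], bits[1:9], bits[9:]


bits = float_to_bits(-0.15625)          # -1.01b x 2^-3
sign, exponent, mantissa = split_fields(bits)
restored = bits_to_float(bits)
```

The stored exponent is biased by 127 in single precision, so the field for $2^{-3}$ holds 124.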
Second, we need to construct the parity-check matrix $H$ for the binary matrix of global model information symbols. $H$ is a sparse matrix composed only of 0s and 1s, and its shape is determined by the shape of the segmented binary matrix of global model information symbols. Once the shape of $H$ is fixed, we define the $m \times n$ mother matrix $M(H)$ of $H$. $H$ is obtained by replacing each 1 in $M(H)$ with an $L \times L$ cyclic shift sub-matrix $P^{a_{ij}}$ and each 0 with an $L \times L$ all-zero matrix, which yields an $mL \times nL$ matrix $H$. Here $a_{ij}$ is the shift term: $P^{a_{ij}}$ is the $L \times L$ identity matrix cyclically shifted to the right by $a_{ij}$ positions. When storing $H$, we only need to store the value of each $a_{ij}$ instead of the position of each 1; expanding $M(H)$ then gives the constructed parity-check matrix $H$. To perform matrix operations with the binary matrix of global model information symbols, we need to transform $H$ into a generator matrix $G$. First, we apply row and column transformations to $H$ so that its right half becomes an identity matrix, which gives the intermediate matrix $H = [P^T, I]$. We then transpose the left half of the intermediate matrix and exchange it with the identity matrix of the right half to obtain the generator matrix $G = [I, P]$. After obtaining $G$, we compute $r = X \times G$ to get the encoded information $r$.
Because the left side of the generator matrix $G$ is an identity matrix, the left part of the encoded one-dimensional matrix $r$ is identical to the binary matrix $X$ of global model information symbols, and the check bits follow it. This is why $G$ is called a generator matrix: while retaining the original information, it generates check bits for verification, which provide the basis for error detection and correction in subsequent decoding.
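The expansion of stored shift values into a full parity-check matrix can be sketched as below. As a simplification, the mother matrix and the shift terms are merged into a single table in which an integer entry is the shift $a_{ij}$ of a circulant block and `None` marks an all-zero block; this representation and the function names are assumptions made for illustration.

```python
from typing import List, Optional


def circulant(L: int, shift: int) -> List[List[int]]:
    """L x L identity matrix cyclically shifted right by `shift` columns."""
    return [[1 if (r + shift) % L == c else 0 for c in range(L)]
            for r in range(L)]


def expand(shifts: List[List[Optional[int]]], L: int) -> List[List[int]]:
    """Expand an m x n table of shift values into an mL x nL binary matrix."""
    H = []
    for block_row in shifts:
        blocks = [circulant(L, a) if a is not None
                  else [[0] * L for _ in range(L)]
                  for a in block_row]
        for r in range(L):
            # Concatenate row r of every block in this block row.
            H.append([bit for blk in blocks for bit in blk[r]])
    return H


# A 2 x 2 table with L = 3 expands to a 6 x 6 matrix.
H = expand([[0, 1], [None, 2]], 3)
```

Only the four table entries need to be stored here, instead of the positions of the nine 1s in the expanded matrix.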
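The step from a systematic parity-check matrix to a generator matrix, followed by encoding, can be sketched as follows. The sketch assumes $H$ is already in the form $[P^T, I]$ (the row and column transformations are not shown) and, for brevity, uses a small (7,4) Hamming code as a stand-in for an LDPC matrix; all function names are illustrative.

```python
from typing import List

Matrix = List[List[int]]


def generator_from_systematic(H: Matrix) -> Matrix:
    """Given H = [P^T | I_m], return G = [I_k | P]."""
    m, n = len(H), len(H[0])
    k = n - m
    for i, row in enumerate(H):
        if row[k:] != [1 if i == j else 0 for j in range(m)]:
            raise ValueError("right part of H must be an identity matrix")
    P = [list(col) for col in zip(*[row[:k] for row in H])]  # transpose of P^T
    return [[1 if i == j else 0 for j in range(k)] + P[i] for i in range(k)]


def encode(X: List[int], G: Matrix) -> List[int]:
    """r = X x G over GF(2); X is a row vector of k information bits."""
    return [sum(x * g for x, g in zip(X, col)) % 2 for col in zip(*G)]


def syndrome(H: Matrix, r: List[int]) -> List[int]:
    """s = r x H^T over GF(2)."""
    return [sum(h * b for h, b in zip(row, r)) % 2 for row in H]


# (7,4) Hamming code in systematic form, used only as a small example.
H = [[1, 1, 0, 1, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [0, 1, 1, 1, 0, 0, 1]]
G = generator_from_systematic(H)
r = encode([1, 0, 1, 1], G)
```

The first four bits of `r` reproduce the information bits, and the syndrome of `r` is all zero, which is the property the decoder relies on.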

C. WIRELESS DECODING
In wireless signal transmission, noise, pulses caused by alternating current or lightning, transmission equipment failure, and other factors cause bit errors, so that bits of the transmitted binary signal are flipped (from 0 to 1 or from 1 to 0). Therefore, we apply wireless channel coding and add check bits to the original information symbols. If a bit-flip error occurs in the wireless transmission of the global model parameter $g$, the position of the flipped bit can be located and corrected. We embed a bit-flip decoding method in the client, and the client performs decoding after receiving the encoded global model parameter $g$ from the server.
Suppose the client receives the information symbols with parity bits as $r = [r_1 \cdots r_i \cdots r_n]$. The parity-check matrix $H$ now does its job: we compute the matrix product $r \times H^T$ to obtain the syndromes $s$.
The syndrome $s$ is computed with modulo-2 addition. When all syndromes $s_i$ are 0, the wireless transmission is correct. When a syndrome $s_i$ is 1, some of the bits of the coded information matrix $r$ involved in the calculation of $s_i$ may have been corrupted during wireless transmission. Among the bits that take part in the syndromes whose value is 1, the bit $r_i$ that appears in the largest number of such syndromes is the most likely to be in error. We therefore flip that bit and recompute the syndrome $s$, repeating until all syndromes are 0, at which point we have obtained the correct binary information matrix $r$ of the global model parameters. After the binary information matrix $r$ is converted back into floating-point numbers and loaded into each node of the local model, the local model update of the client is complete, and a new round of local training can start. In the next section, we design experiments to test the proposed software-defined GPU-CPU empowered WFL with embedded LDPC communication coding in terms of anti-interference and acceleration ability.
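The bit-flipping procedure described in this subsection can be sketched in a few lines. The sketch flips one bit per iteration, breaks ties by taking the lowest index, and stops after a hypothetical `max_iters` limit; these details, the function names, and the small row/column parity example are assumptions made for illustration rather than the paper's implementation.

```python
from typing import List, Tuple

Matrix = List[List[int]]


def syndrome(H: Matrix, r: List[int]) -> List[int]:
    """s = r x H^T with modulo-2 addition."""
    return [sum(h * b for h, b in zip(row, r)) % 2 for row in H]


def bit_flip_decode(H: Matrix, r: List[int],
                    max_iters: int = 50) -> Tuple[List[int], bool]:
    """Flip the bit involved in the most failed checks until s is all zero."""
    r = list(r)
    for _ in range(max_iters):
        s = syndrome(H, r)
        if not any(s):
            return r, True
        # For every bit, count the failed checks it takes part in.
        counts = [sum(s[i] for i in range(len(H)) if H[i][j])
                  for j in range(len(r))]
        worst = counts.index(max(counts))  # ties: lowest index (assumption)
        r[worst] ^= 1
    return r, not any(syndrome(H, r))


# Example code: 9 bits on a 3 x 3 grid with one parity check per row and column.
H = [[1 if j // 3 == i else 0 for j in range(9)] for i in range(3)]
H += [[1 if j % 3 == c else 0 for j in range(9)] for c in range(3)]
codeword = [1, 1, 0, 1, 0, 1, 0, 1, 1]
received = list(codeword)
received[4] ^= 1                      # single bit flip in transmission
decoded, ok = bit_flip_decode(H, received)
```

In this example the flipped bit is the only one that takes part in two failed checks, so a single iteration restores the codeword.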

A. EXPERIMENT SETUP
Our experiments are deployed on hardware with a 2.90-GHz Intel Xeon Gold 6326 CPU, 256 GB of RAM, and an 8 TB disk. The operating system is Ubuntu Linux 20.04 LTS, and the simulations are conducted in Python 3.8. The parameters of our experiments are as follows.
We set the number of clients m = 10, the number of global iteration rounds G = 100 (that is, the number of communication iterations between the server and the clients), the number of local iteration rounds L = 5, the number of clients participating in training in each round k = 5, the number of samples for each round of local training s = 32, and the learning rate lr = 0.01. The encoder uses a code length of N = 648 and a code rate of r = 1/2.
Unlike under ideal conditions, wireless transmission in a real environment is affected by various factors. Errors occur during the transmission of model parameters, which affects the convergence of the model and, in turn, the accuracy and loss of learning. Therefore, we introduce the bit error rate parameter p to indicate the degree of bit errors that occur during wireless transmission. It represents the probability that a bit of the binary information is flipped during transmission.
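A bit error rate of this kind can be simulated by flipping each transmitted bit independently with probability p. The following minimal sketch assumes such an independent-flip model (a binary symmetric channel); the function name and the explicit random generator argument are illustrative.

```python
import random
from typing import List


def binary_symmetric_channel(bits: List[int], p: float,
                             rng: random.Random) -> List[int]:
    """Flip each bit independently with probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    return [b ^ 1 if rng.random() < p else b for b in bits]


rng = random.Random(0)                 # fixed seed for repeatable runs
sent = [0, 1] * 324                    # one 648-bit frame
received = binary_symmetric_channel(sent, 0.01, rng)
num_errors = sum(a != b for a, b in zip(sent, received))
```

With p = 0.01 and a 648-bit frame, about six or seven flipped bits per frame are expected on average.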
Next, we will conduct experiments on the anti-interference performance and encoding-decoding acceleration capabilities of our proposed framework.

B. EXPERIMENT RESULT
1) Anti-interference experiment: The convergence of our proposed anti-interference WFL architecture with embedded wireless channel coding for p = 0.01, 0.02, 0.03, 0.04 and of traditional federated learning without embedded coding is shown in Fig. 4 and Fig. 5.
From Fig. 4, we can see that as the number of epochs increases, the accuracy of our proposed code-embedded anti-interference WFL architecture gradually increases and converges to a high, stable value. In particular, when the bit error rate is low, the WFL architecture with embedded wireless channel coding performs similarly to traditional FL. Therefore, the impact of embedding wireless channel coding on FL performance is negligible. As the bit error rate increases, the time the code-embedded WFL architecture takes to converge increases and fluctuates slightly. The same effect can be seen in Fig. 5, which plots the loss against the number of epochs for the anti-interference WFL architecture and traditional WFL under the same conditions. When the bit error rate is low, the loss of anti-interference WFL is almost the same as that of traditional WFL under ideal conditions. The loss of anti-interference WFL increases slightly as the bit error rate increases. Therefore, in terms of both model accuracy and loss, the anti-interference WFL architecture does not have a negative effect on the convergence of the model. We then designed an experiment to simulate WFL with embedded wireless channel coding and ordinary WFL without wireless channel coding in a real environment, and compared the performance of the two architectures under the influence of bit errors. Fig. 6 and Fig. 7 show how model accuracy and loss change with the number of epochs.
Table 1 lists the accuracy and loss of the anti-interference WFL with embedded wireless channel coding and of ordinary WFL without embedded wireless channel coding under different bit error rates p.
From Table 1, we find that until the bit error rate p reaches 0.3, the performance of wireless federated learning with embedded anti-interference wireless channel coding remains at the level of wireless federated learning under ideal conditions. When p reaches 0.3, accuracy and loss degrade slightly. In contrast, ordinary federated learning without embedded coding stays close to the level of wireless federated learning under ideal conditions only when p is between 0.001 and 0.005. When p reaches 0.01, its learning performance drops sharply and is not even as good as that of the anti-interference federated learning framework at 10 times the bit error rate (i.e., p = 0.1). When p exceeds 0.01, there are so many bit errors in the transmitted model parameters that the reported model accuracy and loss are NaN, and learning cannot proceed normally. Therefore, embedding wireless channel coding greatly improves the anti-interference ability of the wireless federated learning framework, enabling it to receive accurate model parameters for model update and training in a complex environment and ensuring the performance of wireless federated learning.
2) GPU-CPU hybrid acceleration experiment: To reduce the impact of embedded encoding on learning efficiency, we designed a software-defined GPU-CPU hybrid acceleration architecture.
According to the characteristics of wireless channel coding, we divide the acceleration into two parts. The first part comprises converting the floating-point model parameters into binary matrices and generating the parity-check matrix H through matrix transformation. The second part comprises the multiplication of the binary matrix by the generator matrix and the syndrome calculation of bit-flip decoding. The first part involves a small amount of computation but high computational complexity, which makes it suitable for CPU acceleration. The second part involves a large amount of simple, repetitive, high-throughput computation, which makes it suitable for GPU acceleration. Under the same bit error rate p, we accelerate the coding process with pure CPU acceleration and with GPU-CPU hybrid acceleration, and compare the time each acceleration method takes for the model to reach convergence.
Compared with the GPU-CPU hybrid acceleration method, the pure CPU acceleration method takes more time to reach convergence. The convergence speed of the GPU-CPU hybrid acceleration method is 5 times that of the pure CPU acceleration method.
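The reason the second part suits a throughput-oriented device is that all codewords can be processed by one large matrix product. The sketch below illustrates this batching with NumPy on the CPU; it is not the paper's GPU implementation. Under the assumption that a GPU array library with a NumPy-compatible interface (such as CuPy) is available, the same two functions could operate on device arrays instead. The function names and the small (7,4) example matrices are illustrative.

```python
import numpy as np


def encode_batch(X: np.ndarray, G: np.ndarray) -> np.ndarray:
    """Encode every row of X at once: R = X G (mod 2)."""
    return (X @ G) % 2


def syndromes_batch(R: np.ndarray, H: np.ndarray) -> np.ndarray:
    """Syndromes of every received row at once: S = R H^T (mod 2)."""
    return (R @ H.T) % 2


# Small systematic example code; a real frame would use N = 648.
H = np.array([[1, 1, 0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]])
k = H.shape[1] - H.shape[0]
G = np.hstack([np.eye(k, dtype=int), H[:, :k].T])   # G = [I | P]

X = np.array([[1, 0, 1, 1],
              [0, 1, 1, 0],
              [0, 0, 0, 0]])            # three information words in one batch
R = encode_batch(X, G)
S = syndromes_batch(R, H)
```

Stacking the information words as rows turns many small vector-matrix products into a single matrix-matrix product, which is the repetitive, high-throughput workload described above.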

VI. CONCLUSION
In this article, we proposed a software-defined GPU-CPU empowered efficient wireless federated learning architecture with embedded LDPC communication coding, which integrates FL and wireless coding to enhance the performance of wireless federated learning while guaranteeing information privacy among the clients that participate in FL. For the proposed framework, we first discussed the influence of the framework itself on learning performance in terms of model accuracy and loss. Next, we designed a simulation experiment to compare the accuracy and loss of our proposed framework and the ordinary framework under different bit error rates p. In addition, we compared pure CPU acceleration and GPU-CPU hybrid acceleration in terms of encoding speed. The experimental results show that the proposed anti-interference wireless federated learning framework can ensure the accurate transmission of model parameters in complex environments. Moreover, the GPU-CPU acceleration module defined at the software level speeds up the encoding calculation in the framework, improves the learning efficiency of the framework, and solves the problem of reduced learning efficiency caused by embedded encoding. In future work, we will study the impact of the parity-check matrix on communication optimization when LDPC codes are used to optimize WFL communication under different WFL tasks. In addition, we will compare different wireless channel codes to optimize the communication process of WFL based on the rules and characteristics of model parameters.