Abstract:
Federated learning, a well-known framework for collaborative training among distributed local sensors and devices, has been widely adopted in practical learning applications. To reduce communication resource consumption and training delay, accelerated training algorithms, especially momentum-based methods, have been developed for the training process. However, it is observed that under the influence of transmission noise, existing momentum methods exhibit poor training performance because noise accumulates in the momentum term. This motivates us to propose a novel acceleration algorithm that achieves an efficient tradeoff between training acceleration and noise smoothing. Specifically, to obtain clearer insight into the model update dynamics, we use a stochastic differential equation (SDE) model to mimic the discrete-time (DT) training trajectory. Through a high-order drift approximation analysis of a general momentum-based SDE model, we propose a dynamic momentum weight and gradient stepsize design for the update rule that adapts to both the training state and the gradient quality. This adaptation ensures that the training algorithm seizes good update opportunities and avoids noise explosion. The corresponding DT training algorithm is then derived by discretizing the proposed SDE model, and it shows superior training performance compared to state-of-the-art baselines.
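To make the update rule concrete, below is a minimal sketch of a DT momentum update with a dynamic momentum weight and stepsize of the kind the abstract describes. It is not the paper's derived design: the function names (noisy_aggregated_gradient, beta_fn, eta_fn), the SNR-based momentum weight, and the decaying stepsize are illustrative assumptions standing in for the schedules the paper obtains from its high-order drift approximation analysis.

```python
"""Illustrative sketch, not the authors' exact algorithm: a discrete-time
momentum update whose momentum weight beta and stepsize eta adapt to the
training state and gradient quality, smoothing transmission noise."""
import numpy as np

rng = np.random.default_rng(0)

def noisy_aggregated_gradient(w, noise_std):
    # Stand-in for the server-side aggregate of local gradients, corrupted
    # by transmission noise. Toy loss f(w) = 0.5*||w||^2, so grad f(w) = w.
    return w + noise_std * rng.standard_normal(w.shape)

def beta_fn(g, noise_std):
    # Assumed rule: smooth more (beta closer to 1) when the received
    # gradient is weak relative to the noise level (low gradient quality).
    snr = np.linalg.norm(g) / (noise_std * np.sqrt(g.size) + 1e-12)
    return float(np.clip(1.0 - 0.5 * snr / (1.0 + snr), 0.5, 0.99))

def eta_fn(t, eta0=0.5):
    # Assumed decaying stepsize, damping late-stage noise accumulation.
    return eta0 / (1.0 + 0.1 * t)

w = rng.standard_normal(10)   # model weights
m = np.zeros_like(w)          # momentum buffer

noise_std = 0.3
for t in range(100):
    g = noisy_aggregated_gradient(w, noise_std)
    beta = beta_fn(g, noise_std)
    m = beta * m + (1.0 - beta) * g   # noise-smoothing momentum update
    w = w - eta_fn(t) * m             # model update

print("final ||w|| =", np.linalg.norm(w))  # shrinks toward the noise floor
```

With a constant beta and stepsize this reduces to plain momentum SGD, where the noise injected into m at every round never decays; the point of the dynamic weights is that beta and eta shrink the noise contribution as training progresses.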
Published in: IEEE Internet of Things Journal (Volume: 11, Issue: 8, 15 April 2024)
IEEE Keywords (Index Terms):
- Fading Channel
- Federated Learning
- Wireless Fading Channels
- Superior Performance
- Training Performance
- Training Algorithm
- Influence Of Noise
- Update Rule
- Local Devices
- Stochastic Differential Equations
- Dynamic Weight
- Momentum Term
- High-order Approximation
- Dynamic Gradient
- Training Trajectories
- Convergence Rate
- Dynamic Characteristics
- Stochastic Gradient Descent
- Stochastic Gradient
- Model Weights
- Constant Step Size
- Lyapunov Function
- Local Gradient
- Stochastic Gradient Descent Method
- Convergence Of Algorithm
- Deep Neural Network Model
- Central Server
- Stochastic Algorithm
- Randomization Schedule
- Training Iterations