Abstract:
Optimization methods based on adaptive gradients, such as AdaGrad, RMSProp, and Adam, are widely used to solve large-scale machine learning problems. This work aims to improve the recently proposed and rapidly adopted distributed adaptive gradient descent optimization algorithm (DADAM). Adam has two main components: a momentum component and an adaptive learning rate component. However, regular momentum has been shown, both conceptually and empirically, to be inferior to a related technique called Nesterov's Accelerated Gradient (NAG). We therefore incorporate Nesterov's momentum into the Distributed Adaptive Gradient Method for Online Optimization (DADAM) and obtain our NDADAM algorithm. Experiments show that the convergence speed of the proposed NDADAM algorithm is greatly improved.
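As a rough illustration of the idea summarized above, the sketch below shows a single-node Adam-style update in which the first-moment (momentum) term is given a Nesterov-style look-ahead, in the spirit of NAdam. The parameter names (lr, beta1, beta2, eps) and the exact update form are assumptions made for illustration only and are not the paper's NDADAM; in particular, the distributed (consensus) step of DADAM is omitted.

import numpy as np

def nesterov_adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    # First-moment (momentum) and second-moment (adaptive learning rate) estimates.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Bias corrections, as in Adam.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Nesterov-style look-ahead: blend the corrected momentum with the current gradient.
    m_bar = beta1 * m_hat + (1 - beta1) * grad / (1 - beta1 ** t)
    # Adaptive step scaled by the second-moment estimate.
    theta = theta - lr * m_bar / (np.sqrt(v_hat) + eps)
    return theta, m, v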
Published in: 2021 China Automation Congress (CAC)
Date of Conference: 22-24 October 2021
Date Added to IEEE Xplore: 14 March 2022