
Matcha: A Matching-Based Link Scheduling Strategy to Speed up Distributed Optimization


Abstract:

In this paper, we study the problem of distributed optimization using an arbitrary network of lightweight computing nodes, where each node can only send/receive information to/from its direct neighbors. Decentralized stochastic gradient descent (SGD) has been shown to be an effective method to train machine learning models in this setting. Although decentralized SGD has been extensively studied, most prior works focus on the error-versus-iterations convergence, without taking into account how the topology affects the communication delay per iteration. For example, a denser (sparser) network topology results in faster (slower) error convergence in terms of iterations, but it incurs more (less) communication time per iteration. We propose MATCHA, an algorithm that can achieve a win-win in this error-runtime trade-off for any arbitrary network topology. The main idea of MATCHA is to communicate more frequently over connectivity-critical links in order to ensure fast convergence, and at the same time minimize the communication delay per iteration by using other links less frequently. It strikes this balance by decomposing the topology into matchings and then optimizing the set of matchings that are activated in each iteration. Experiments on a suite of datasets and deep neural networks validate the theoretical analyses and demonstrate that MATCHA takes up to 5x less time than vanilla decentralized SGD to reach the same training loss. The idea of MATCHA can be applied to any decentralized algorithm that involves a communication step with neighbors in a graph.
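The core mechanism described above — decomposing the communication graph into matchings and then activating a random subset of them each iteration — can be illustrated with a small sketch. This is not the paper's implementation: the greedy decomposition and the uniform activation probability below are simplifying assumptions (MATCHA instead optimizes per-matching activation probabilities subject to a communication budget), and all function names are hypothetical.

```python
import random

def decompose_into_matchings(edges):
    """Greedily partition an edge list into matchings
    (edge sets in which no two edges share a vertex)."""
    matchings = []
    for u, v in edges:
        for m in matchings:
            # A matching can absorb (u, v) only if neither
            # endpoint already appears in it.
            if all(u not in e and v not in e for e in m):
                m.append((u, v))
                break
        else:
            matchings.append([(u, v)])
    return matchings

def sample_active_links(matchings, probs, rng=random):
    """Independently activate each matching with its own
    probability; the union of activated matchings is the
    set of links used for communication this iteration."""
    active = []
    for m, p in zip(matchings, probs):
        if rng.random() < p:
            active.extend(m)
    return active

# Toy example: a 5-node ring plus one chord.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (1, 3)]
matchings = decompose_into_matchings(edges)
# Uniform activation probability for illustration only;
# MATCHA tunes these per matching to favor
# connectivity-critical links.
links = sample_active_links(matchings, [0.5] * len(matchings))
```

Because every matching is vertex-disjoint, all of its links can transmit in parallel, so the per-iteration communication delay scales with the number of activated matchings rather than the number of activated edges.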
Published in: IEEE Transactions on Signal Processing (Volume: 70)
Page(s): 5208 - 5221
Date of Publication: 09 November 2022
