I. Introduction
The resurgence of deep learning (DL) has intensified the demand for training high-quality models from distributed data sources. However, the traditional approach of centralized model training raises serious privacy concerns, since data, which may contain personal and sensitive information, must be collected by a server prior to training. As a result, federated learning (FL) [1], [2], a privacy-preserving distributed model training paradigm, has recently received enormous attention [3], [4], [5], [6]. A typical FL system consists of a central server and many clients with private data, which collaborate in an iterative training process. In each training iteration, clients train local models on their local data and upload the resulting model updates to the server. Upon receiving these updates, the server performs model aggregation, and the aggregated global model is disseminated to the clients for the next training iteration.
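To make this iterative process concrete, the following is a minimal sketch of the FL training loop on a toy linear-regression task, assuming FedAvg-style aggregation (weighted averaging of client models by local dataset size, as in [1]); the function names and the learning-rate and round-count settings are illustrative, not part of any specific system described here.

```python
import numpy as np

def local_update(global_weights, data, lr=0.3, epochs=1):
    """Client side: refine the global model on private local data
    (here, gradient steps on a least-squares objective)."""
    w = global_weights.copy()
    X, y = data
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)  # gradient of 0.5*||Xw - y||^2 / n
        w -= lr * grad
    return w

def aggregate(client_weights, client_sizes):
    """Server side: FedAvg-style aggregation, averaging client models
    weighted by the number of local samples."""
    total = sum(client_sizes)
    return sum(n / total * w for w, n in zip(client_weights, client_sizes))

# Toy setup: 3 clients, each holding a private shard of a linear task.
rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0])
clients = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    y = X @ true_w + 0.01 * rng.normal(size=50)
    clients.append((X, y))

global_w = np.zeros(2)
for round_idx in range(30):  # training iterations
    # Clients train locally; only model updates leave the client.
    updates = [local_update(global_w, d) for d in clients]
    # Server aggregates and disseminates the new global model.
    global_w = aggregate(updates, [len(d[1]) for d in clients])

print(global_w)  # approaches true_w without raw data leaving the clients
```

The key property illustrated is that only model parameters travel between clients and server in each round; the raw training data never leaves the clients.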