I. Introduction
Federated Learning (FL), a promising distributed learning paradigm for the era of edge computing, was first proposed by Google in 2016 [1], [2], [3]. Intuitively, the main idea of FL is to train deep learning models over datasets distributed across multiple end devices (i.e., clients) in a privacy-preserving way: clients collaboratively learn a shared model without uploading their raw data. FL has found a wide range of applications in popular mobile apps, such as Google's GBoard [4] and Apple's QuickType [5], as well as in data-sensitive domains including finance [6], medicine [7], and security [8], where distributed data cannot be directly collected by a central server for model training due to factors such as intellectual property rights, government regulations, and physical constraints [9].
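To make the idea concrete, the following is a minimal simulation sketch of FedAvg-style training, the canonical FL scheme in which clients run local gradient steps and a server aggregates the resulting models weighted by client dataset size; all function names, hyperparameters, and the synthetic linear-regression task are illustrative assumptions, not part of any cited system.

```python
import numpy as np

def local_train(w, X, y, lr=0.1, steps=5):
    """A few steps of local gradient descent on squared loss (runs on a client)."""
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def fedavg(client_data, rounds=50):
    """Server loop: broadcast the global model, collect locally trained models,
    and average them weighted by each client's dataset size."""
    dim = client_data[0][0].shape[1]
    w = np.zeros(dim)
    n_total = sum(len(y) for _, y in client_data)
    for _ in range(rounds):
        local_models = [local_train(w.copy(), X, y) for X, y in client_data]
        w = sum((len(y) / n_total) * wk
                for (_, y), wk in zip(client_data, local_models))
    return w

# Two simulated clients holding disjoint, noise-free linear data; note that
# raw data never leaves a client -- only model weights are exchanged.
rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0])
clients = []
for n in (40, 60):
    X = rng.standard_normal((n, 2))
    clients.append((X, X @ w_true))

w_global = fedavg(clients)  # converges close to w_true
```

Weighting the average by dataset size makes the aggregate equivalent to training on the pooled data when client objectives are consistent, which is why it is the standard choice.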