I. Introduction
The availability of massive amounts of data at mobile edge devices has led to a surge of interest in developing artificial intelligence (AI) services, such as image recognition [2] and natural language processing [3], at the edge of wireless networks. Conventional machine learning (ML) requires a data center to collect all data for centralized model training. In a wireless system, collecting data from distributed mobile devices incurs substantial energy and bandwidth costs, high latency, and potential privacy risks [4]. To address these challenges, a new paradigm called federated learning (FL) has emerged [5]. In a typical FL framework, each edge device computes its local model update based on its own dataset and uploads the update to a parameter server (PS). The PS aggregates the received updates into a global model and shares it with the devices. In this way, direct data transmission is replaced by model parameter uploading, which significantly relieves the communication burden and prevents the local data from being exposed to other devices and the PS.
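To make the FL round structure concrete, the following is a minimal sketch of one aggregation scheme. The paper does not specify the aggregation rule here; a FedAvg-style data-size-weighted average is used purely for illustration, and the local learner (`local_update`, a least-squares model trained by gradient descent), the synthetic datasets, and all hyperparameters are assumptions of this sketch rather than the system described above.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_update(global_model, X, y, lr=0.1, epochs=5):
    """Local training on one device (placeholder linear least-squares learner)."""
    w = global_model.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)   # gradient of (1/2n)*||Xw - y||^2
        w -= lr * grad
    return w

# Synthetic local datasets for K edge devices (illustrative only).
K, d = 4, 3
w_true = rng.normal(size=d)
datasets = []
for _ in range(K):
    n = int(rng.integers(20, 60))
    X = rng.normal(size=(n, d))
    y = X @ w_true + 0.01 * rng.normal(size=n)
    datasets.append((X, y))

w_global = np.zeros(d)
for _ in range(20):                           # communication rounds
    # Each device trains locally and "uploads" its model update to the PS.
    local_models = [local_update(w_global, X, y) for X, y in datasets]
    sizes = np.array([len(y) for _, y in datasets], dtype=float)
    # PS aggregation: data-size-weighted average, then broadcast to devices.
    w_global = np.average(local_models, axis=0, weights=sizes)

print("aggregated model:", np.round(w_global, 3))
print("ground truth:    ", np.round(w_true, 3))
```

Only the model vectors traverse the (simulated) uplink and downlink; the raw datasets never leave the devices, which is the communication and privacy benefit described above.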