I. Introduction
As the tension between data silos and the need for data fusion becomes increasingly prominent, the advancement of Big Data-driven artificial intelligence applications is severely limited. Federated learning (FL), as a new distributed machine learning paradigm, enables collaborative training among participants while ensuring that each participant retains full control over its own sensitive data [1], [2], [3]. Although FL avoids disclosing any local data, a significant risk of privacy breaches remains. For example, attackers can exploit intercepted gradients or model parameters to reconstruct part of the sensitive data, or even infer whether a given record originated from a particular participant [4], [5]. Moreover, deploying FL in realistic scenarios raises several challenges that cannot be ignored [6], [7], [8], yet most existing studies address only one or two of them. Therefore, in this work we aim to comprehensively explore four essential issues: data privacy, model utility, communication efficiency, and data heterogeneity.
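For concreteness, the following sketch illustrates the basic FL workflow referred to above: clients train locally on private data and only exchange model parameters, which the server combines by weighted averaging in the style of FedAvg. This is a minimal illustration, not the method proposed in this paper; all function and variable names (e.g., local_update, federated_averaging) are assumptions introduced here for exposition.

```python
# Minimal sketch of federated averaging rounds (illustrative only).
# Each client trains on its own data and sends back parameters,
# never raw samples; the server averages them weighted by data size.

import numpy as np

def local_update(params, data, lr=0.1):
    """Hypothetical local training step: one gradient step of a
    linear-regression MSE loss on the client's private data."""
    X, y = data
    grad = 2 * X.T @ (X @ params - y) / len(y)
    return params - lr * grad

def federated_averaging(global_params, client_datasets):
    """One communication round: aggregate client parameters,
    weighted by the number of local samples."""
    updates, sizes = [], []
    for data in client_datasets:
        updates.append(local_update(global_params, data))
        sizes.append(len(data[1]))
    weights = np.array(sizes) / sum(sizes)
    return sum(w * u for w, u in zip(weights, updates))

# Toy usage: three clients holding private linear-regression data
# of different sizes (a simple form of data heterogeneity).
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for n in (50, 80, 30):
    X = rng.normal(size=(n, 2))
    clients.append((X, X @ true_w + 0.1 * rng.normal(size=n)))

params = np.zeros(2)
for _ in range(100):
    params = federated_averaging(params, clients)
print(params)  # approaches true_w although no client shared raw data
```

Note that the exchanged updates in this sketch are exactly the quantities an attacker could intercept, which is why the gradient- and parameter-based attacks cited above remain a concern even though raw data never leaves the clients.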