On Decentralizing Federated Reinforcement Learning in Multi-Robot Scenarios


Abstract:

Federated Learning (FL) allows learned information to be aggregated collaboratively across several computing devices and shared amongst them, thereby tackling issues of privacy and the need for large bandwidth. FL techniques generally use a central server or cloud to aggregate the models received from the devices. Such centralized FL techniques suffer from inherent problems such as failure of the central node and bottlenecks in channel bandwidth. When FL is used in conjunction with connected robots serving as devices, a failure of the central controlling entity can lead to a chaotic situation. This paper describes a mobile-agent-based paradigm to decentralize FL in multi-robot scenarios. Using Webots, a popular free open-source robot simulator, and Tartarus, a mobile agent platform, we present a methodology to decentralize federated learning in a set of connected robots. With Webots running on different connected computing systems, we show how mobile agents can perform the task of Decentralized Federated Reinforcement Learning (dFRL). Results from experiments using Q-learning and SARSA, in which the corresponding Q-tables are aggregated, show the viability of using decentralized FL in the domain of robotics. Since the proposed work can be used in conjunction with other learning algorithms and also with real robots, it can act as a vital tool for the study of decentralized FL using heterogeneous learning algorithms concurrently in multi-robot scenarios.
Date of Conference: 23-25 September 2022
Date Added to IEEE Xplore: 01 November 2022
Conference Location: Ioannina, Greece
Dept. of Computer Science and Engg, Federal Institute of Science and Technology, Angamaly, India
Dept. of Computer Science and Engg, Indian Institute of Technology Guwahati, Guwahati, India
Dept. of Computer Science and Engg, Indian Institute of Technology Guwahati, Guwahati, India
Dept. of Computer Science and Engg, Federal Institute of Science and Technology, Angamaly, India

I. Introduction

With the ever-increasing use of handheld devices and the consequent explosion in generated data, researchers have been exploring varied techniques to learn from such data by aggregating it in a centralized entity. The learning algorithm is then run on this central server and the knowledge gained is sent back to all connected participating devices. This process is not always viable, since a large amount of data needs to be uploaded to the server and then processed periodically [1]. One work-around is a technique termed Federated Learning (FL) [2] where, in lieu of data, the models generated in-situ or on-device are shared by all participating devices with the central server. The server, in turn, aggregates them using some pre-defined techniques [3]–[5] and sends the modified model back to the devices. Thus, every device has a local dataset with which a model, most often an Artificial Neural Network (ANN), is trained and evolved. Each device periodically sends the trained weights of its ANN to the central server in the form of an update. At the server these weights are, for instance, averaged, and then sent back to all the devices. Most often this new set of (averaged) weights represents a part of the learning performed at each of the participating devices and hence contributes to the enhancement of learning within the network. Over several rounds of this process, the models at the devices converge to fairly homogeneous ones. This centralized version of FL suffers from inherent drawbacks such as a single point of failure, limited scalability, and privacy issues, coupled with the requirement of large client-to-server bandwidth [1]. To overcome these hurdles, decentralized versions of FL have been proposed [6]–[10]. FL has also found a niche in multi-robot scenarios [11], [12].
In such cases, each robot shares its learned model with others, thereby aiding faster learning convergence. The robots could be either in the same environment or in different ones. Most often, robots need to connect to a centralized server, cloud, or controller, which in turn aggregates the models received. Since robots could be mobile, their dynamically changing positions may make or break connections with the central entity. For a group of mobile robots in the same environment or in different environments, a decentralized approach, or a hybrid of the centralized and decentralized approaches, could prove more beneficial. Research on FL in the area of robots most often targets specific or customised robotic scenarios, making it difficult for others to reuse the work.
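
The centralized round described above (devices send weights, the server averages them and sends the result back) can be illustrated with a minimal sketch. This is not the paper's mobile-agent method, whose details are not given in this excerpt. It assumes each model is a flat list of floats of equal length (for example, flattened ANN weights or a flattened Q-table), and the names `average_weights` and `federated_round` are hypothetical.

```python
from typing import List


def average_weights(updates: List[List[float]]) -> List[float]:
    """Element-wise mean of the weight vectors received from the devices.

    Assumption: every update is a flat list with the same length.
    """
    if not updates:
        raise ValueError("at least one update is required")
    length = len(updates[0])
    if any(len(update) != length for update in updates):
        raise ValueError("all updates must have the same length")
    # zip(*updates) groups the i-th weight of every device together.
    return [sum(column) / len(updates) for column in zip(*updates)]


def federated_round(device_weights: List[List[float]]) -> List[List[float]]:
    """One illustrative round: average at the server, send a copy to each device."""
    averaged = average_weights(device_weights)
    # Each device receives its own copy of the averaged model.
    return [list(averaged) for _ in device_weights]


if __name__ == "__main__":
    # Two hypothetical robots, each holding a three-element model.
    robots = [[0.0, 1.0, 4.0], [2.0, 3.0, 0.0]]
    print(federated_round(robots))
```

Plain averaging is only one of the possible aggregation techniques; a weighted mean (for instance, by local dataset size) would change only `average_weights`.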

