Multi-Agent Reinforcement Learning for Intelligent V2G Integration in Future Transportation Systems


Abstract:

Electric vehicles (EVs) are the backbone of the future intelligent transportation system (ITS). They are environmentally friendly and can also be integrated into the smart grid as distributed energy resources (DERs) using a vehicle-to-grid (V2G) scheme. Specifically, utility companies can discharge energy stored in EV batteries back into the electric grid to reduce the peak load. However, integrating EVs into the power grid efficiently requires accurate artificial intelligence (AI) mechanisms to forecast, coordinate, and dispatch the EVs into the grid. This paper proposes a multi-agent reinforcement learning (MARL) mechanism that schedules the day-ahead discharging process of EV batteries to optimize the peak shaving performance of the electric grid. The proposed MARL overcomes the inaccuracy of energy prediction by allowing the agents, i.e., the EVs, to make autonomous decisions. These agents are trained in a centralized fashion but make decisions locally to maintain autonomy and privacy. In particular, the model does not require the EVs to communicate with a centralized entity during the execution stage, which preserves the model's integrity and protects the EVs' private information. To evaluate the model, a comprehensive series of experiments was carried out to demonstrate the effectiveness of the MARL coordination and scheduling mechanism and to show that the model can indeed flatten the peak load.
Published in: IEEE Transactions on Intelligent Transportation Systems (Volume: 24, Issue: 12, December 2023)
Page(s): 15974 - 15983
Date of Publication: 28 June 2023

I. Introduction

Intelligent Transportation Systems (ITS) are expected to support the integration of EVs into the grid network through advanced information and communication technologies (ICT) powered by AI mechanisms and V2G schemes. Indeed, the prospect of a future ITS infrastructure that includes connected EVs and V2G is attracting significant attention from researchers. Part of this attention is devoted to AI-based mechanisms for integrating EVs with the smart grid to flatten the peak load [1], [2], [3], [4]. AI mechanisms such as reinforcement learning (RL) are considered particularly promising: EVs are modeled as ITS agents that learn from the environment and perform actions to receive rewards [5]. The underlying concept of RL is that agents can be autonomous, i.e., their behavior is learned independently by interacting with the environment [6]. This distinctive attribute makes RL particularly suitable for demand response (DR) applications [7], [8], [9], specifically peak load shaving. The future integration of ITS and smart grids gives utility companies the opportunity to engage EVs of all kinds to effectively reduce the peak load. In practice, however, it is not feasible to assume that all brands of EVs can communicate with each other to make a coordinated decision; instead, each EV takes local actions based on a partial observation of the entire system [8]. It is therefore important that the employed mechanism does not need to observe the entire environment to reach a decision. Furthermore, centralized optimization mechanisms require full control of EV data to reach an optimized decision, and that decision does not necessarily maximize the reward of the participating agents. As the number of EVs increases, the communication overhead and the computational complexity make finding an optimal solution intractable.
For these reasons, it is more natural to treat the problem with a distributed mechanism in which each EV agent learns and makes decisions to maximize its own reward. Although distributed solutions may not lead to the optimal social welfare of the participating EVs, the proposed MARL allows the agents to learn cooperatively to maximize the reward function. This paper capitalizes on this fundamental idea and proposes a novel multi-agent reinforcement learning system to efficiently and effectively schedule multiple EVs to reduce the peak load. The proposed MARL approach uses the actor-critic framework to optimize the day-ahead discharge scheduling and coordination of EVs.
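The split between centralized training and local decision-making can be illustrated with a small sketch. It is not the paper's model: the class EVActor, the functions net_load, shared_reward, and tune_thresholds, the per-EV discharge threshold, the one-hour time slots, and the reward (negative peak of the net load) are all assumptions made for illustration, and a plain coordinate search stands in for the actor-critic updates. The point it shows is only structural: tuning scores the joint schedule of all EVs, while each EV's act method uses nothing but its own parameters and the day-ahead forecast.

```python
from typing import List, Sequence


def net_load(base_load: Sequence[float],
             schedules: Sequence[Sequence[float]]) -> List[float]:
    """Hourly grid load after subtracting every EV's discharge (kW)."""
    return [base_load[h] - sum(s[h] for s in schedules)
            for h in range(len(base_load))]


def shared_reward(base_load: Sequence[float],
                  schedules: Sequence[Sequence[float]]) -> float:
    """Assumed cooperative reward: the negative peak of the net load."""
    return -max(net_load(base_load, schedules))


class EVActor:
    """Hypothetical local policy; its only tunable parameter is a threshold."""

    def __init__(self, capacity_kwh: float, max_rate_kw: float,
                 threshold_kw: float) -> None:
        self.capacity_kwh = capacity_kwh
        self.max_rate_kw = max_rate_kw
        self.threshold_kw = threshold_kw

    def act(self, forecast: Sequence[float]) -> List[float]:
        """Day-ahead schedule built from local data only.

        Discharges in the highest-load hours that exceed the threshold,
        until the battery energy is used up (one-hour slots, so kW == kWh).
        """
        remaining = self.capacity_kwh
        schedule = [0.0] * len(forecast)
        hours = sorted(range(len(forecast)),
                       key=lambda h: forecast[h], reverse=True)
        for h in hours:
            if forecast[h] <= self.threshold_kw or remaining <= 0:
                break
            energy = min(self.max_rate_kw, remaining)
            schedule[h] = energy
            remaining -= energy
        return schedule


def tune_thresholds(actors: Sequence[EVActor], base_load: Sequence[float],
                    candidates: Sequence[float]) -> None:
    """Centralized stand-in for training: a coordinate search that scores
    each candidate threshold by the joint reward of all actors."""
    for actor in actors:
        best = actor.threshold_kw
        best_reward = shared_reward(base_load,
                                    [a.act(base_load) for a in actors])
        for candidate in candidates:
            actor.threshold_kw = candidate
            reward = shared_reward(base_load,
                                   [a.act(base_load) for a in actors])
            if reward > best_reward:
                best, best_reward = candidate, reward
        actor.threshold_kw = best


if __name__ == "__main__":
    # Made-up 24-hour forecast with an evening peak at hours 17 to 19.
    base = [50.0] * 17 + [80.0, 100.0, 90.0] + [50.0] * 4
    fleet = [EVActor(20.0, 10.0, 95.0), EVActor(20.0, 10.0, 95.0)]
    tune_thresholds(fleet, base, [95.0, 85.0, 75.0])
    # Execution: every EV decides on its own, with no central coordinator.
    plans = [ev.act(base) for ev in fleet]
    print(max(base), max(net_load(base, plans)))
```

After tuning, no actor needs the other EVs' schedules or a central entity to produce its plan, which mirrors the privacy argument made above. A learned actor would replace the threshold rule, and a critic would replace the exhaustive scoring.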
