Vehicle-Level Fairness-Oriented Constrained Multi-Agent Reinforcement Learning for Adaptive Traffic Signal Control

Abstract:

Multi-agent reinforcement learning (MARL) has shown considerable promise in improving the efficiency of adaptive traffic signal control (ATSC) systems. However, existing MARL approaches focus primarily on optimizing overall traffic flow and often overlook fairness in vehicle waiting times. Since there is no need to pursue absolute fairness, this paper models the ATSC problem as a Constrained Partially Observable Markov Game (CPOMG), in which fairness is a constraint on the maximum waiting time of vehicles on the lanes of each intersection rather than a reward term to be maximized. The CPOMG seeks a cooperative multi-agent control policy with optimal traffic efficiency within the constrained solution space. On this basis, the paper proposes a new cooperative MARL method with centralized training and decentralized execution: vehicle-level fairness multi-agent proximal policy optimization (VF-MAPPO). VF-MAPPO uses a centrally trained global Critic network to estimate the average vehicle traffic efficiency and the maximum vehicle waiting time, and an Actor network shared by all intersections for decentralized execution. It converts the constrained optimization problem into an unconstrained objective through the Lagrange multiplier method and trains with proximal policy optimization. Additionally, VF-MAPPO incorporates spatial-temporal graph attention in the Critic network to efficiently extract state representations in multi-intersection environments. We qualitatively analyze the monotonic improvement guarantee of VF-MAPPO. Extensive experiments on two real-world scenarios and one synthetic scenario show that VF-MAPPO improves vehicle-level fairness while maintaining average traffic efficiency, surpassing state-of-the-art methods.
Published in: IEEE Transactions on Intelligent Transportation Systems (Volume: 26, Issue: 4, April 2025)
Page(s): 4878 - 4890
Date of Publication: 27 February 2025
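
The abstract states that the constrained problem is turned into an unconstrained objective with the Lagrange multiplier method and optimized with a clipped policy-gradient update. The following is a minimal sketch of that general idea only, not the authors' implementation. All names (`clipped_surrogate`, `lagrangian_surrogate`, `update_multiplier`), the combined-advantage form, the learning rate, and the waiting-time limit in seconds are assumptions made for illustration; the paper's exact loss, critic outputs, and hyperparameters are not given in the abstract.

```python
from typing import Sequence


def clipped_surrogate(ratios: Sequence[float],
                      advantages: Sequence[float],
                      clip_eps: float = 0.2) -> float:
    """Mean PPO-style clipped surrogate: E[min(r * A, clip(r) * A)]."""
    total = 0.0
    for r, a in zip(ratios, advantages):
        # Clip the probability ratio to [1 - eps, 1 + eps].
        clipped_r = max(1.0 - clip_eps, min(1.0 + clip_eps, r))
        total += min(r * a, clipped_r * a)
    return total / len(ratios)


def lagrangian_surrogate(ratios: Sequence[float],
                         reward_adv: Sequence[float],
                         cost_adv: Sequence[float],
                         lam: float,
                         clip_eps: float = 0.2) -> float:
    """Unconstrained objective to maximize (assumed form).

    The efficiency advantage is penalized by lam times the cost advantage,
    where the cost stands for the waiting-time constraint signal.
    """
    combined = [ar - lam * ac for ar, ac in zip(reward_adv, cost_adv)]
    return clipped_surrogate(ratios, combined, clip_eps)


def update_multiplier(lam: float,
                      cost_estimate: float,
                      cost_limit: float,
                      lr: float = 0.01) -> float:
    """Gradient-ascent step on the multiplier, projected onto lam >= 0.

    lam grows while the estimated cost exceeds the limit and shrinks
    (down to zero) once the constraint is satisfied.
    """
    return max(0.0, lam + lr * (cost_estimate - cost_limit))


if __name__ == "__main__":
    # Hypothetical batch: probability ratios and per-sample advantages.
    ratios = [1.0, 1.5, 0.5]
    reward_adv = [1.0, 1.0, -1.0]
    cost_adv = [0.5, -0.2, 0.3]
    lam = 0.5
    print(lagrangian_surrogate(ratios, reward_adv, cost_adv, lam))
    # Assumed estimate of 70 s maximum waiting time against a 60 s limit.
    print(update_multiplier(lam, cost_estimate=70.0, cost_limit=60.0))
```

In this sketch the multiplier acts as an adaptive penalty weight: while the estimated maximum waiting time stays under the limit, the multiplier decays toward zero and the update reduces to a plain clipped surrogate on traffic efficiency.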