
Reinforcement Learning Agents Playing Ticket to Ride–A Complex Imperfect Information Board Game With Delayed Rewards


To demonstrate the capability of a general-purpose Reinforcement Learning (RL) algorithm in playing a complex, imperfect information board game -- Ticket to Ride, eight R...

Abstract:

Board games are extensively studied in the AI community because they represent real-world problems at a high level of abstraction and serve as irreplaceable testbeds for state-of-the-art AI algorithms. Modern board games commonly feature partially observable state spaces and imperfect information. Despite recent successes of AI in perfect information board games such as chess and Go, most imperfect information games remain challenging and have yet to be solved. This paper empirically explores the capabilities of a state-of-the-art Reinforcement Learning (RL) algorithm, Proximal Policy Optimization (PPO), in playing Ticket to Ride, a popular board game that features imperfect information, a large state-action space, and delayed rewards. It examines the feasibility of the proposed generalizable modelling and training schemes, which use a general-purpose RL algorithm with no domain knowledge-based heuristics beyond game rules, game states, and scores, to tackle this complex imperfect information game. The performance of the proposed methodology is demonstrated in a scaled-down version of Ticket to Ride with a range of RL agents obtained with different training schemes. All RL agents achieve clear advantages over a set of well-designed heuristic agents. The agent constructed through a self-play training scheme outperforms the other RL agents in a round-robin tournament. The high performance and versatility of this self-play agent provide a solid demonstration of the capabilities of this framework.
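
For readers unfamiliar with PPO, the following is a minimal sketch of its standard clipped surrogate objective, not the authors' implementation. The function names and the clipping parameter `epsilon=0.2` are illustrative assumptions; the abstract does not state the hyperparameters used in the paper.

```python
import math


def ppo_clipped_term(logp_new, logp_old, advantage, epsilon=0.2):
    """Per-sample PPO clipped surrogate term (a quantity to maximize).

    logp_new / logp_old: log-probabilities of the taken action under the
    current and the data-collecting policy. epsilon is an assumed value,
    not one taken from the paper.
    """
    # Probability ratio between the new and old policies.
    ratio = math.exp(logp_new - logp_old)
    # Restrict the ratio to the interval [1 - epsilon, 1 + epsilon].
    clipped_ratio = max(1.0 - epsilon, min(1.0 + epsilon, ratio))
    # Take the more pessimistic of the unclipped and clipped estimates.
    return min(ratio * advantage, clipped_ratio * advantage)


def ppo_clipped_objective(samples, epsilon=0.2):
    """Average the clipped term over (logp_new, logp_old, advantage) tuples."""
    terms = [ppo_clipped_term(n, o, a, epsilon) for n, o, a in samples]
    return sum(terms) / len(terms)
```

With a positive advantage, the term stops growing once the ratio exceeds `1 + epsilon`, which limits how far a single update can move the policy. With a negative advantage, a large ratio is still penalized in full.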
Published in: IEEE Access (Volume: 11)
Page(s): 60737 - 60757
Date of Publication: 16 June 2023
Electronic ISSN: 2169-3536
