We formulate automatic strategy acquisition for the multi-agent card game "Hearts" as a reinforcement learning (RL) problem. Because many cards are unobservable during play, we treat the problem approximately within the framework of a partially observable Markov decision process (POMDP). This article presents a POMDP-RL method based on estimating the unobservable state variables and predicting the actions of the opponent agents. Simulation results show that our model-based POMDP-RL method is applicable to a realistic multi-agent problem.
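To make the idea of estimating unobservable state variables concrete, the following is a minimal sketch in Python, not the method used in this article. It assumes a simple heuristic: each unseen card is assigned to an opponent with probability proportional to that opponent's remaining hand size, excluding opponents known to be void in the card's suit. The function name `estimate_card_beliefs`, its parameters, and the card encoding are all hypothetical and chosen only for illustration.

```python
def estimate_card_beliefs(unseen_cards, hand_sizes, void_suits):
    """Approximate P(opponent holds card) for each unseen card.

    Illustrative heuristic only (an assumption, not the article's estimator).

    unseen_cards: list of (rank, suit) tuples the learner cannot see.
    hand_sizes:   dict mapping opponent -> number of cards still held.
    void_suits:   dict mapping opponent -> set of suits the opponent is
                  known to lack (e.g. it failed to follow suit earlier).
    """
    beliefs = {}
    for card in unseen_cards:
        _, suit = card
        # An opponent that is void in this suit, or has no cards left,
        # cannot hold the card, so it receives no probability mass.
        weights = {
            opp: size
            for opp, size in hand_sizes.items()
            if size > 0 and suit not in void_suits.get(opp, set())
        }
        total = sum(weights.values())
        if total == 0:
            raise ValueError(f"no opponent can hold {card}")
        # Normalise so the probabilities for this card sum to one.
        beliefs[card] = {opp: w / total for opp, w in weights.items()}
    return beliefs


# Hypothetical mid-game situation: north has shown a void in spades.
example_beliefs = estimate_card_beliefs(
    unseen_cards=[("Q", "S"), ("2", "H")],
    hand_sizes={"west": 2, "north": 1, "east": 1},
    void_suits={"north": {"S"}},
)
```

A belief of this kind could then be combined with a model of the opponents' action choices to evaluate candidate plays in expectation. Treating each card independently ignores the constraint that hand sizes must be matched exactly, so this is only a rough approximation.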