Solving multiagent reinforcement learning problems remains a key challenge. Indeed, the complexity of deriving multiagent plans, especially when one uses an explicit model of the problem, increases dramatically with the number of agents. This paper introduces a general iterative heuristic: at each step, one chooses a subgroup of agents and updates their policies to optimize the task, given that the remaining agents have fixed plans. We analyse this process in a general setting and show how it can be applied to Markov decision processes, partially observable Markov decision processes and decentralized partially observable Markov decision processes.
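To make the iterative heuristic concrete, the following is a minimal sketch on a toy one-shot team game rather than on any of the decision-process models named above. Everything in it is an illustrative assumption and not the paper's algorithm or notation: the function `improve_by_subgroups`, the reward `team_reward`, the deterministic cycling over all subgroups of a fixed size, and the exhaustive search within each subgroup. Each "policy" is reduced to a single action per agent.

```python
import itertools


def team_reward(joint):
    """Hypothetical shared reward: agents (0, 1) and (2, 3) form two pairs.

    A pair scores 2 if both play action 1, 1 if both play action 0, else 0.
    """
    def pair(a, b):
        if a == b:
            return 2 if a == 1 else 1
        return 0

    return pair(joint[0], joint[1]) + pair(joint[2], joint[3])


def improve_by_subgroups(n_agents, n_actions, reward, group_size, n_sweeps=3):
    """Repeatedly re-optimize one subgroup while the other agents stay fixed."""
    joint = [0] * n_agents  # assumed starting plan: every agent plays action 0
    for _ in range(n_sweeps):
        # Assumption: visit every subgroup of the given size in a fixed order.
        for group in itertools.combinations(range(n_agents), group_size):
            best, best_value = joint, reward(joint)
            # Exhaustively search the subgroup's joint actions.
            for actions in itertools.product(range(n_actions), repeat=group_size):
                candidate = list(joint)
                for agent, action in zip(group, actions):
                    candidate[agent] = action
                value = reward(candidate)
                # Accept only strict improvements, so the team reward
                # never decreases from one step to the next.
                if value > best_value:
                    best, best_value = candidate, value
            joint = best
    return joint, reward(joint)


single_joint, single_value = improve_by_subgroups(4, 2, team_reward, group_size=1)
pair_joint, pair_value = improve_by_subgroups(4, 2, team_reward, group_size=2)
```

In this toy game, updating one agent at a time stays at the initial plan with reward 2, because any unilateral change breaks a pair, whereas updating subgroups of two agents reaches the plan where all agents play action 1, with reward 4. The cost of each step grows exponentially with the subgroup size, but not with the total number of agents, which is the trade-off the heuristic exploits.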