This paper addresses stochastic resource allocation problems, which are known to be NP-complete. To solve this complex resource management problem effectively, two approaches are merged and adapted: the Q-decomposition model, which coordinates reward-separated agents through an arbitrator, and Labeled Real-Time Dynamic Programming (LRTDP). Q-decomposition reduces the set of states to consider, while LRTDP concentrates planning on significant states only. As the experiments demonstrate, combining these two distinct approaches further reduces the planning time needed to obtain an optimal solution to a resource allocation problem.
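As a rough illustration of the two building blocks, the sketch below uses a hypothetical `mdp` interface (methods `actions`, `cost`, `transitions`, `heuristic`, and `is_goal`, none of which come from the paper). It shows a Q-decomposition-style arbitrator that picks the joint action maximizing the sum of the agents' local Q-values, and a minimal LRTDP loop whose `check_solved` labeling lets later trials skip states whose values have already converged.

```python
import random

def arbitrate(local_qs, state, joint_actions):
    # Q-decomposition arbitrator: each reward-separated agent i reports
    # its local value Q_i(s, a); the arbitrator commits to the joint
    # action that maximizes the summed local Q-values.
    return max(joint_actions, key=lambda a: sum(q(state, a) for q in local_qs))

def lrtdp(mdp, s0, epsilon=1e-3, max_trials=10000):
    # Labeled RTDP: run greedy trials from s0, Bellman-backup the states
    # visited, and label a state "solved" once its residual (and those of
    # its greedy descendants) drop below epsilon, so later trials skip it.
    V, solved = {}, set()

    def value(s):
        return V.get(s, mdp.heuristic(s))  # heuristic as initial estimate

    def q_value(s, a):  # expected cost-to-go of taking a in s
        return mdp.cost(s, a) + sum(p * value(t) for t, p in mdp.transitions(s, a))

    def greedy(s):
        return min(mdp.actions(s), key=lambda a: q_value(s, a))

    def residual(s):
        return abs(value(s) - q_value(s, greedy(s)))

    def check_solved(s):
        converged, open_, closed = True, [s], set()
        while open_:
            x = open_.pop()
            if x in closed or x in solved or mdp.is_goal(x):
                continue
            closed.add(x)
            if residual(x) > epsilon:
                converged = False
                continue
            open_.extend(t for t, _ in mdp.transitions(x, greedy(x)))
        if converged:
            solved.update(closed)      # these states need no further planning
        else:
            for x in closed:           # tighten estimates before the next trial
                V[x] = q_value(x, greedy(x))
        return converged

    trials = 0
    while s0 not in solved and trials < max_trials:
        trials += 1
        s, visited = s0, []
        while s not in solved and not mdp.is_goal(s):
            visited.append(s)
            a = greedy(s)
            V[s] = q_value(s, a)       # Bellman backup along the trial
            succ = mdp.transitions(s, a)
            s = random.choices([t for t, _ in succ], [p for _, p in succ])[0]
        while visited:                 # label states from the trial's end backwards
            if not check_solved(visited.pop()):
                break
    return V, solved
```

This is only a minimal sketch of standard LRTDP; the paper's contribution lies in running such focused planning on the reduced, per-agent state sets that Q-decomposition provides.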