Optimizing decisions on an ensemble of incomplete disturbance trees and aggregating their first stage decisions has been shown as a promising approach to (model-based) planning under uncertainty in large continuous action spaces and in small discrete ones. The present paper extends this approach and deals with large but highly structured action spaces, through a kernel-based aggregation scheme. The technique is applied to a test problem with a discrete action space of 6561 elements adapted from the NIPS 2005 SensorNetwork benchmark.
Published in:
Adaptive Dynamic Programming and Reinforcement Learning, 2009. ADPRL '09. IEEE Symposium on
Date of Conference: March 30 2009-April 2 2009