Efficient Model-Based Exploration

4 Author(s)

Model-Based Reinforcement Learning (MBRL) can profit greatly from world models that estimate the consequences of selecting particular actions: an animat can construct such a model from its experiences and use it to compute rewarding behavior. We study the problem of collecting useful experiences through exploration in stochastic environments. To this end, we use MBRL to maximize exploration rewards (in addition to environmental rewards) for visiting states that promise information gain. We also combine MBRL with the Interval Estimation algorithm (Kaelbling, 1993). Experimental results demonstrate the advantages of our approaches.
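The idea of planning on a learned model with an added exploration reward can be illustrated with a minimal tabular sketch. This is not the authors' algorithm: the abstract does not say how information gain is measured or how Interval Estimation is combined with MBRL. The function name `plan_with_exploration_bonus`, the count-based bonus `bonus_scale / sqrt(n + 1)`, and the optimistic value assigned to untried state-action pairs are all assumptions chosen for illustration.

```python
import math


def plan_with_exploration_bonus(counts, reward_sums, gamma=0.9,
                                bonus_scale=1.0, sweeps=200):
    """Value iteration on a maximum-likelihood model plus an exploration reward.

    counts[s][a][s2]  : number of observed transitions s --a--> s2
    reward_sums[s][a] : sum of environmental rewards observed for (s, a)

    ASSUMPTION: the exploration reward is bonus_scale / sqrt(n + 1), a simple
    count-based stand-in for "information gain"; the source does not give it.
    """
    n_states = len(counts)
    n_actions = len(counts[0])
    q = [[0.0] * n_actions for _ in range(n_states)]

    for _ in range(sweeps):
        # State values from the previous sweep (greedy over actions).
        v = [max(row) for row in q]
        for s in range(n_states):
            for a in range(n_actions):
                n = sum(counts[s][a])
                if n == 0:
                    # ASSUMPTION: an untried pair has no model yet, so it is
                    # given the value of earning the full bonus forever.
                    q[s][a] = bonus_scale / (1.0 - gamma)
                    continue
                mean_reward = reward_sums[s][a] / n
                bonus = bonus_scale / math.sqrt(n + 1)
                # Expected next-state value under the estimated transition model.
                expected_next = sum(c / n * v[s2]
                                    for s2, c in enumerate(counts[s][a]))
                q[s][a] = mean_reward + bonus + gamma * expected_next
    return q


# Two states, two actions. In state 0, action 0 was tried 100 times (always
# staying in state 0 with zero reward); everything else is untried.
counts = [[[100, 0], [0, 0]],
          [[0, 0], [0, 0]]]
reward_sums = [[0.0, 0.0],
               [0.0, 0.0]]
q_values = plan_with_exploration_bonus(counts, reward_sums)
greedy_action_in_state_0 = max(range(2), key=lambda a: q_values[0][a])
```

In this toy case the well-sampled action earns only a small bonus, so the planner prefers the untried action in state 0. Because the bonus enters the planning step rather than only the action choice, exploration value also propagates backward through the model to states that lead toward poorly sampled regions.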