
Dynamic Exploration in Q(λ)-learning

Authors: J. van Ast and R. Babuska, Delft Center for Systems and Control, Delft University of Technology, Mekelweg 2, 2628 CD Delft, the Netherlands

Abstract:

Reinforcement learning has proved its value in solving complex optimization tasks. However, the learning time for even simple problems is typically very long, so efficient exploration of the state-action space is crucial for effective learning. This paper introduces a new type of exploration, called dynamic exploration. It differs from existing exploration methods (both directed and undirected) in that it makes exploration a function of the action selected in the previous time step. In our approach, states are either long-path states, where the optimal action is the same as the optimal action in the previous state, or switch states, where the optimal action differs. In realistic learning problems, the number of long-path states exceeds the number of switch states. Given this information, the exploration method can explore the state space more efficiently. Experiments on different gridworld optimization tasks demonstrate a reduction in learning time with dynamic exploration.
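The abstract's core idea (biasing exploration toward repeating the previous action, since long-path states outnumber switch states) can be sketched as an epsilon-greedy variant. This is an illustrative reading of the abstract, not the paper's exact algorithm; the `p_repeat` parameter and the specific selection rule are assumptions introduced here for the sketch.

```python
import random

def dynamic_explore_action(q_values, prev_action, epsilon=0.1, p_repeat=0.8):
    """Epsilon-greedy selection with a dynamic-exploration bias (sketch).

    With probability 1 - epsilon, exploit: take the greedy action.
    When exploring, repeat the previous action with probability
    p_repeat (long-path assumption); otherwise pick a different
    action uniformly at random (a "switch").
    """
    n_actions = len(q_values)
    if random.random() >= epsilon:
        # Exploit: greedy action w.r.t. the current Q-value estimates.
        return max(range(n_actions), key=lambda a: q_values[a])
    if prev_action is not None and random.random() < p_repeat:
        # Long-path bias: keep doing what we did in the previous step.
        return prev_action
    # Switch: explore uniformly among the remaining actions.
    others = [a for a in range(n_actions) if a != prev_action]
    return random.choice(others)
```

In a gridworld, this bias tends to produce longer straight runs during exploration instead of a random walk, which is consistent with the paper's claim that most states along an optimal path call for the same action as their predecessor.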

Published in:

The 2006 IEEE International Joint Conference on Neural Network Proceedings

Date of Conference:

16-21 July 2006