Improving reinforcement learning algorithms by the use of data mining techniques for feature and action selection

3 Author(s)
de L Vieira, D.C. (Centro de Inf., Univ. Fed. de Pernambuco, Recife, Brazil); Adeodato, P.J.L.; Gonçalves, P.M.

Data mining can be seen as an area of artificial intelligence that seeks to extract information or patterns from large amounts of data, either stored in databases or flowing in streams. The main contribution of this work is to show how the LVF data mining technique improves the Sarsa(λ) algorithm combined with tile coding by selecting the most relevant features and actions from reinforcement learning environments. The objective of this selection is to reduce the complexity of the problem and the amount of memory used by the agent, thus leading to faster convergence. The work is motivated by the rationale behind Occam's razor, which holds that a complex model tends to be less accurate than a simpler one. The difficulty in using data mining techniques in reinforcement learning environments is that no database of examples exists beforehand, so this paper proposes a storage schema for the states visited and the actions performed by the agent. In this study, the selection of features and actions is applied to a specific RoboCup soccer problem, the dribble. This problem is composed of 20 continuous variables and 113 actions available to the agent, which results in a memory consumption of approximately 4.5 MB when the traditional Sarsa(λ) algorithm is combined with tile coding. The experimental results show that the number of variables in the environment was reduced by 35% and the number of actions by 65%, which resulted in a 43% reduction in memory consumption and an increase in performance of up to 23%, according to the relative frequency distribution of the agent's successes. The approach proposed here is both easy to use and efficient.
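
The abstract does not spell out how the selection step works, so the following is only a minimal sketch of the idea, not the authors' implementation. It assumes that LVF refers to the Las Vegas Filter: random feature subsets are drawn, and the smallest one whose inconsistency rate stays within a threshold is kept. The log layout (one row of already discretized state variables per step, with the chosen action as its label), the function names, and the `max_tries` and `threshold` parameters are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def inconsistency_rate(rows, labels, subset):
    """Fraction of rows that disagree with the majority label of their group.

    Rows are grouped by their values on the features listed in `subset`.
    """
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        key = tuple(row[i] for i in subset)
        groups[key][label] += 1
    inconsistent = sum(sum(c.values()) - max(c.values()) for c in groups.values())
    return inconsistent / len(rows)


def lvf(rows, labels, max_tries=500, threshold=0.0, seed=0):
    """Las Vegas Filter sketch: random search for a small, consistent subset.

    `max_tries` and `threshold` are illustrative defaults (assumptions).
    """
    rng = random.Random(seed)
    n_features = len(rows[0])
    best = list(range(n_features))  # start from the full feature set
    for _ in range(max_tries):
        # Never try subsets larger than the best one found so far.
        size = rng.randint(1, len(best))
        candidate = sorted(rng.sample(range(n_features), size))
        if (len(candidate) < len(best)
                and inconsistency_rate(rows, labels, candidate) <= threshold):
            best = candidate
    return best


# Hypothetical log of visited states (three discretized variables) and the
# action taken in each; here the action depends only on variable 0.
rows = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
labels = ["kick" if a else "dribble" for (a, b, c) in rows]

print(lvf(rows, labels))
```

On this toy log the search should settle on variable 0 alone, since the other two variables carry no information about the action. In the setting the abstract describes, the rows would come from the proposed storage schema of visited states and performed actions, and the 20 continuous variables would first need to be discretized, a step this sketch leaves out.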

Published in:

Systems Man and Cybernetics (SMC), 2010 IEEE International Conference on

Date of Conference:

10-13 Oct. 2010