Backpropagation Modification in Monte-Carlo Game Tree Search

2 Author(s)
Fan Xie; Zhiqing Liu — Jiu-Ding Computer Go Research Institute, Beijing University of Posts & Telecommunications, Beijing, China

The UCT algorithm, proposed by Kocsis et al., applies multi-armed bandit techniques to tree-structured search spaces and has achieved remarkable success in several challenging domains. In UCT, Monte-Carlo simulations are performed under the guidance of the UCB1 formula, and their results are averaged to evaluate a given action. We observe that, as more simulations are performed, later simulations usually yield more accurate results, partly because later simulations search deeper in the tree and partly because more prior results are available to direct them. This paper presents a new method that improves the performance of the UCT algorithm by increasing the feedback weight of later simulations. Experimental results in the classical game of Go show that our approach significantly improves the performance of Monte-Carlo simulations when exponential weighting models are used.
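A minimal sketch of the two ingredients the abstract mentions: UCB1-guided selection and a backup that weights later simulations more heavily. The exponential weighting here (weight `gamma**i` for the i-th simulation) is only an assumed reading of the paper's "exponential models", and all names (`ucb1`, `backup_weighted`, `gamma`) are illustrative, not taken from the paper:

```python
import math

C = math.sqrt(2)  # common choice of exploration constant for UCB1

def ucb1(child_value, child_visits, parent_visits, c=C):
    """UCB1 score: exploitation term plus exploration bonus.
    Unvisited children get infinite score so they are tried first."""
    if child_visits == 0:
        return float("inf")
    return child_value + c * math.sqrt(math.log(parent_visits) / child_visits)

def backup_weighted(values, gamma=1.05):
    """Weighted mean where the i-th simulation result gets weight gamma**i,
    so later (presumably more accurate) simulations count more.
    gamma=1.0 recovers the plain average used by standard UCT."""
    weights = [gamma ** i for i in range(len(values))]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)
```

With `gamma=1.0` the backup reduces to the ordinary Monte-Carlo average; with `gamma > 1.0`, a late win or loss moves the estimate more than an early one, which is the behavior the paper's modification targets.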

Published in:

Third International Symposium on Intelligent Information Technology Application (IITA 2009), Volume 2

Date of Conference:

21-22 Nov. 2009