The Ant(λ) ant colony optimization algorithm based on eligibility trace

Authors: Xiao-Rong Wang (National Laboratory of Industrial Control Technology, Zhejiang University, Hangzhou, China); Tie-Jun Wu

The pheromone-based parameterized probabilistic model of the ACO algorithm is presented as a construction graph onto which a combinatorial optimization problem can be mapped. Based on the construction graph, the solution construction procedure and the pheromone update rule of the ACO algorithm are illustrated. The finite deterministic Markov decision process corresponding to the solution construction procedure is described in the terminology of reinforcement learning (RL) theory, and ACO algorithms are fitted into the framework of generalized policy iteration (GPI) in RL based on incomplete information about the Markov state. Furthermore, we show that the pheromone update in the ACS and Ant-Q algorithms is based on Monte Carlo (MC) methods or on a formalistic combination of MC and temporal-difference (TD) methods. TD methods have usually been found to converge faster than MC methods in many applications, but they perform worse than MC methods in non-Markov environments. We propose a novel ACO algorithm, the Ant(λ) algorithm, which introduces the eligibility trace mechanism into the local pheromone update procedure; the algorithm unifies the TD and MC methods mathematically and allows delayed reinforcement to be propagated backward in time.
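The abstract does not spell out the Ant(λ) update rule itself, so the sketch below only illustrates the general mechanism it names: a TD(λ)-style eligibility trace over the edges an ant has visited, so that reinforcement arriving late in a tour is also credited, with geometric decay, to earlier construction steps. The function name, the per-step reward signal, and the parameters alpha, gamma, and lam are illustrative assumptions, not the authors' definitions.

```python
import numpy as np

def ant_lambda_local_update(tau, path, rewards, alpha=0.1, gamma=0.9, lam=0.8):
    """Hypothetical local pheromone update for one ant, TD(lambda)-style.

    tau     : (n, n) pheromone matrix
    path    : list of traversed edges [(i, j), ...]
    rewards : per-step reinforcement, one value per edge in `path`
    """
    e = np.zeros_like(tau)       # eligibility trace per edge
    for (i, j), r in zip(path, rewards):
        e *= gamma * lam         # all traces decay each step
        e[i, j] = 1.0            # replacing trace on the edge just used
        tau += alpha * r * e     # reinforcement credited to all recently
                                 # visited edges, weighted by eligibility
    return tau

# Toy usage: 4 nodes, an ant walks 0 -> 1 -> 2 -> 3 and only the final
# step carries reinforcement; the trace propagates it to earlier edges.
tau = np.ones((4, 4))
path = [(0, 1), (1, 2), (2, 3)]
rewards = [0.0, 0.0, 1.0]        # delayed reward at the last step
tau = ant_lambda_local_update(tau, path, rewards)
print(tau[0, 1], tau[1, 2], tau[2, 3])  # earlier edges get smaller credit
```

With lam = 0 this reduces to a one-step TD-style local rule that credits only the most recent edge; with lam = 1 and gamma = 1, every edge of the tour receives the full reinforcement, matching the MC-style credit assignment that the abstract contrasts TD methods with.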

Published in:

2003 IEEE International Conference on Systems, Man and Cybernetics (Volume 5)

Date of Conference:

5-8 Oct. 2003
