This paper proposes AACO-RL, a novel exploration-exploitation strategy for reinforcement learning (RL) based on an adaptive ant colony system. The elitist strategy ant system (ASelitist), developed from the ant system presented by M. Dorigo, improves efficiency by depositing additional pheromone on the paths of the global optimal solution. However, because the number of elitist ants is chosen from experience, the algorithm may converge quickly to a local optimal solution if that number is not appropriate. In the novel AACO-RL strategy, the learning agent generates an adaptive set of elitist ants (EA) and straggled ants (SA) while exploring the unknown world. In addition, the proposed AACO-RL strategy is shown to converge faster to the optimal solution.
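To make the role of the elitist-ant count concrete, the following is a minimal Python sketch of the textbook elitist ant system pheromone update: evaporation, a deposit from every ant, and an extra deposit on the best tour weighted by the number of elitist ants. It is not the AACO-RL update, which the abstract does not specify. The function names and the parameters `rho` (evaporation rate), `q` (deposit constant), and `n_elitist` (fixed number of elitist ants) are assumptions chosen for illustration.

```python
def tour_edges(tour):
    """Return the edges of a closed tour as (from, to) pairs."""
    return list(zip(tour, tour[1:] + tour[:1]))


def elitist_pheromone_update(tau, tours, lengths, best_tour, best_length,
                             rho=0.5, q=1.0, n_elitist=2):
    """Return a new symmetric pheromone matrix after one elitist update.

    Hypothetical helper for illustration; n_elitist is fixed here,
    whereas AACO-RL is described as adapting it.
    """
    n = len(tau)
    # Evaporation: every trail decays by the factor (1 - rho).
    new_tau = [[(1.0 - rho) * tau[i][j] for j in range(n)] for i in range(n)]

    # Regular deposit: each ant adds q / length on the edges of its tour.
    for tour, length in zip(tours, lengths):
        for i, j in tour_edges(tour):
            new_tau[i][j] += q / length
            new_tau[j][i] += q / length

    # Elitist reinforcement: the best tour receives n_elitist extra deposits.
    for i, j in tour_edges(best_tour):
        new_tau[i][j] += n_elitist * q / best_length
        new_tau[j][i] += n_elitist * q / best_length

    return new_tau
```

In this sketch a large `n_elitist` concentrates pheromone on the current best tour, which speeds up convergence but also raises the risk of settling on a local optimum; that trade-off is what an adaptive choice of the elitist set is meant to address.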