Against the background of agent-alliance combat deduction, we present a two-layer reinforcement learning algorithm, referred to as the TLRL algorithm, designed for the special requirements of studying agents' offensive and defensive decision-making in a battlefield simulation environment. The model has two layers: a global decision-making agent, called the Commandant Agent, which learns from the environment and from the actions of both enemy and friendly forces; and the Servant Agents, which optimize their actions using local environment feedback. Finally, a war-situation deduction carried out on TBS, the simulation platform we set up, shows that the algorithm converges quickly and is effective.
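The abstract does not give the update rules, state encoding, or reward design, so the following is only a minimal sketch of the two-layer structure described above, assuming plain tabular Q-learning on both layers and a one-step toy scenario. Every name in it (QLearner, train, the "attack"/"defend" orders, the "fire"/"cover" moves, and both reward functions) is hypothetical and chosen for illustration; none of it is taken from the paper or from the TBS platform.

```python
import random
from collections import defaultdict

TERMINAL = "end"  # one-step episodes: every transition ends here

# Assumed toy rules, not from the paper.
COUNTER = {"enemy_attacks": "defend", "enemy_defends": "attack"}
GOOD_MOVE = {"attack": "fire", "defend": "cover"}


class QLearner:
    """Tabular epsilon-greedy Q-learner (an assumed stand-in for each layer)."""

    def __init__(self, actions, alpha=0.5, gamma=0.9, epsilon=0.1, rng=None):
        self.actions = list(actions)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = rng if rng is not None else random.Random(0)
        self.q = defaultdict(float)  # (state, action) -> estimated value

    def greedy(self, state):
        # Ties resolve to the first action in self.actions.
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def act(self, state):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)  # explore
        return self.greedy(state)                 # exploit

    def update(self, state, action, reward, next_state):
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


def global_reward(enemy, order):
    """Global feedback for the commander: 1 if the order counters the enemy."""
    return 1.0 if COUNTER[enemy] == order else 0.0


def local_reward(order, move):
    """Local feedback for a servant: 1 if the move suits the current order."""
    return 1.0 if GOOD_MOVE[order] == move else 0.0


def train(episodes=2000, n_servants=3, seed=0):
    rng = random.Random(seed)
    # Upper layer: one commander that observes the enemy's action.
    commander = QLearner(["attack", "defend"], rng=rng)
    # Lower layer: servants that observe only the order they receive.
    servants = [QLearner(["fire", "cover"], rng=rng) for _ in range(n_servants)]
    for _ in range(episodes):
        enemy = rng.choice(["enemy_attacks", "enemy_defends"])
        order = commander.act(enemy)
        commander.update(enemy, order, global_reward(enemy, order), TERMINAL)
        for servant in servants:
            move = servant.act(order)
            servant.update(order, move, local_reward(order, move), TERMINAL)
    return commander, servants
```

Calling train() returns the trained commander and servants; commander.greedy("enemy_attacks") then gives the order the commander has learned for that enemy action. The point of the split in this sketch is that the commander's table is indexed by the global situation only, while each servant's table is indexed by the order it receives, so each layer learns over a much smaller state space than a single flat learner would.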
2010 International Conference on Computer Application and System Modeling (ICCASM), Volume 1
Date of Conference: 22-24 Oct. 2010