This paper deals with reinforcement learning for process modeling and control using a model-free, action-dependent adaptive critic (ADAC). A new modified recursive Levenberg-Marquardt (RLM) training algorithm, called temporal-difference RLM, is developed to improve ADAC performance. Novel application results for a simulated continuously stirred tank reactor process are included to show the superiority of the new algorithm over conventional temporal-difference stochastic backpropagation.
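To make the idea concrete, here is a minimal sketch of a temporal-difference update for an action-dependent critic combined with a recursive, Levenberg-Marquardt-flavoured parameter step. This is an illustrative reconstruction under simplifying assumptions (a linear-in-parameters critic with hand-picked quadratic features, scalar state and action, and a damped recursive Gauss-Newton update standing in for RLM); it is not the paper's exact algorithm, and all names (`TDRLMCritic`, `phi`, `mu`) are hypothetical.

```python
import numpy as np

def phi(x, u):
    """Quadratic features of scalar state x and action u (illustrative choice)."""
    z = np.array([x, u])
    # bias + linear terms + upper-triangular quadratic terms -> 6 features
    return np.concatenate(([1.0], z, np.outer(z, z)[np.triu_indices(2)]))

class TDRLMCritic:
    """Action-dependent critic Q(x, u) = w . phi(x, u), trained by a
    temporal-difference error fed through a damped recursive update
    (a Levenberg-Marquardt-style variant of recursive least squares)."""

    def __init__(self, n_features, gamma=0.95, mu=1e-2):
        self.w = np.zeros(n_features)         # critic parameters
        self.P = np.eye(n_features) * 100.0   # inverse-Hessian estimate
        self.gamma = gamma                    # discount factor
        self.mu = mu                          # damping term (LM flavour)

    def update(self, x, u, r, x_next, u_next):
        f = phi(x, u)
        f_next = phi(x_next, u_next)
        # TD error: delta = r + gamma * Q(x', u') - Q(x, u)
        delta = r + self.gamma * self.w @ f_next - self.w @ f
        # Recursive gain with damping added to the denominator
        Pf = self.P @ f
        k = Pf / (1.0 + f @ Pf + self.mu)
        self.w = self.w + k * delta
        self.P = self.P - np.outer(k, Pf)
        return delta

# Usage: one transition (x, u, r, x', u') updates the critic.
critic = TDRLMCritic(n_features=6)
td_error = critic.update(x=0.5, u=0.1, r=1.0, x_next=0.4, u_next=0.2)
```

The recursive form avoids storing past data: the matrix `P` summarizes the curvature information, and the damping `mu` plays the Levenberg-Marquardt role of interpolating between a Gauss-Newton step and a smaller gradient-like step.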
IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics (Volume 35, Issue 2)
Date of Publication: April 2005