Reinforcement learning is learning what to do, that is, how to map situations to actions, so as to maximize a numerical reward signal. Q-Learning uses discounted reward as its evaluation criterion and therefore cannot show the effect of an action on subsequent situations. To address this problem, the paper proposes AR-Q-Learning, which combines average reward with Q-Learning. To address the curse of dimensionality, in which computational requirements grow exponentially with the number of state variables, the paper proposes the Minimum State Method. AR-Q-Learning and the Minimum State Method are applied to reinforcement learning in Blocks World. The experimental results show that the method exhibits the after-effect property and converges faster than Q-Learning, and that it mitigates, to a certain extent, the curse of dimensionality in Blocks World.
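To make the contrast between the two evaluation criteria concrete, the sketch below places a standard discounted Q-Learning update next to an average-reward update. The abstract does not give the AR-Q-Learning update rule, so the average-reward half is an assumption: it follows the well-known R-learning form as a stand-in and should not be read as the paper's method. All function names, parameter names, and step sizes (`alpha`, `beta`, `gamma`, `rho`) are illustrative.

```python
from collections import defaultdict


def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Standard tabular Q-Learning step using discounted reward."""
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])


def average_reward_update(Q, rho, s, a, r, s_next, actions, alpha=0.1, beta=0.01):
    """Average-reward step in the R-learning form (an assumed stand-in,
    not the paper's AR-Q-Learning rule). Returns the updated estimate
    rho of the average reward per step."""
    best_next = max(Q[(s_next, b)] for b in actions)
    # No discount factor: the reward is measured relative to rho instead.
    Q[(s, a)] += alpha * (r - rho + best_next - Q[(s, a)])
    # Recompute the maxima after the value update.
    best_next = max(Q[(s_next, b)] for b in actions)
    best_here = max(Q[(s, b)] for b in actions)
    # rho is adjusted only when the action taken is currently greedy.
    if Q[(s, a)] == best_here:
        rho += beta * (r - rho + best_next - best_here)
    return rho


if __name__ == "__main__":
    actions = ["left", "right"]  # hypothetical action set
    Q_disc = defaultdict(float)
    q_learning_update(Q_disc, "s0", "left", 1.0, "s1", actions)
    Q_avg = defaultdict(float)
    rho = average_reward_update(Q_avg, 0.0, "s0", "left", 1.0, "s1", actions)
    print(Q_disc[("s0", "left")], Q_avg[("s0", "left")], rho)
```

In the discounted form, `gamma` shrinks the contribution of later rewards, whereas the average-reward form drops `gamma` and scores each reward against the running average `rho`.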