This study presents an improved hierarchical reinforcement learning (HRL) approach that addresses the curse of dimensionality in the dynamic optimisation of generation command dispatch (GCD) for automatic generation control (AGC) under control performance standards. The AGC committed units are first classified into several groups according to their frequency-control time delays, and the core GCD problem is decomposed into a set of subtasks that search for the optimal regulation participation factors with the solution algorithm. A time-varying coordination factor is introduced in the control layer to improve the learning efficiency of HRL, and the generating error, hydro capacity margin and AGC regulating costs are formulated into the reward function of the Markov decision process. Application of the improved hierarchical Q-learning (HQL) algorithm to the China Southern Power Grid model shows that, compared with conventional HQL, a genetic algorithm and an engineering method, the proposed method reduces the convergence time of the pre-learning process, decreases the AGC regulating cost and improves the control performance of AGC systems.
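To make the Q-learning ingredients named above concrete, the following is a minimal illustrative sketch and not the authors' implementation. It shows a reward that combines generating error, hydro capacity margin and regulating cost, followed by a standard tabular Q-learning update over candidate participation-factor vectors. The function names, the weights `w_err`, `w_margin` and `w_cost`, the sign conventions, and the learning parameters `alpha`, `gamma` and `epsilon` are all assumptions made for illustration. The hierarchical decomposition and the time-varying coordination factor are not modelled here.

```python
import random


def reward(gen_error_mw, hydro_margin_mw, regulating_cost,
           w_err=1.0, w_margin=0.1, w_cost=0.01):
    """Hypothetical reward: penalise generating error and regulating cost,
    and credit the hydro capacity margin that is kept in reserve.
    The weights and signs are assumptions, not values from the study."""
    return (-w_err * abs(gen_error_mw)
            + w_margin * hydro_margin_mw
            - w_cost * regulating_cost)


def q_update(q, state, action, r, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one standard tabular Q-learning update and return the new value.
    Unvisited (state, action) pairs are treated as having a Q value of 0."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (r + gamma * best_next - old)
    return q[(state, action)]


def epsilon_greedy(q, state, actions, epsilon=0.1, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q.get((state, a), 0.0))
```

In this sketch each action is a tuple of participation factors, one per unit group, for example `(0.5, 0.5)` for two groups. A state could be a discretised total generation command. Both encodings are assumptions chosen to keep the example small.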