This paper presents a Reinforcement Learning (RL) method for the network-constrained setting of control variables. The RL method formulates the constrained load flow problem as a multistage decision problem. More specifically, the model-free learning algorithm (Q-learning) learns by experience how to adjust a closed-loop control rule mapping states (load flow solutions) to control actions (offline control settings) by means of reward values. Rewards are chosen to express how well control actions satisfy the operating constraints. The Q-learning algorithm is applied to the IEEE 14-busbar and IEEE 136-busbar systems for constrained reactive power control. The results are compared with those given by the probabilistic constrained load flow based on sensitivity analysis, demonstrating the advantages and flexibility of the Q-learning algorithm. Computing times are also compared with those of another heuristic method.
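The closed-loop rule described above can be illustrated with a minimal tabular Q-learning sketch. The toy problem below (a discretized one-dimensional control variable with a single feasible level, rewarded +1 when the operating constraint is met and -1 otherwise) is purely illustrative and is not the paper's formulation or its IEEE test systems; all function names and parameters are assumptions.

```python
import random

def q_learning(n_states, n_actions, step, reward, episodes=500,
               alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning: learn a state -> action rule from rewards.

    step(s, a)  -> next state; reward(s) -> scalar reward for being in s.
    Uses an epsilon-greedy exploration policy and the standard update
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).
    """
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = rng.randrange(n_states)          # random initial operating point
        for _ in range(20):                  # bounded episode length
            if rng.random() < epsilon:       # explore
                a = rng.randrange(n_actions)
            else:                            # exploit current rule
                a = max(range(n_actions), key=lambda x: Q[s][x])
            s2 = step(s, a)
            r = reward(s2)
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q

# Toy environment: 5 discretized levels of one control variable;
# actions 0/1/2 = lower/hold/raise; only level 2 satisfies the constraint.
def step(s, a):
    return max(0, min(4, s + (a - 1)))

def reward(s):
    return 1.0 if s == 2 else -1.0

Q = q_learning(n_states=5, n_actions=3, step=step, reward=reward)
policy = [max(range(3), key=lambda a: Q[s][a]) for s in range(5)]
```

The learned greedy policy raises the control variable below the feasible level, lowers it above, and holds at the feasible point, i.e. the reward signal alone shapes a constraint-satisfying control rule.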