Abstract:
This paper proposes a hybrid mixed-integer quadratic programming-constrained deep reinforcement learning (MIQP-CDRL) framework for energy management of multi-energy communities. The framework employs a hierarchical two-layer structure: the MIQP layer handles day-ahead scheduling, minimizing operational costs while ensuring that system constraints are satisfied, and the CDRL agent makes real-time adjustments. The framework aims to combine the strengths of CDRL in addressing sequential decision-making problems in stochastic systems with the advantages of a mathematical programming model, which guides the agent's exploration during training and reduces the dependency on black-box policies during real-time operation. The system dynamics are modeled as a constrained Markov decision process (CMDP), which is solved by a model-free CDRL agent built upon the constrained policy optimization (CPO) algorithm. Practical test results demonstrate the effectiveness of this framework in improving the optimality and feasibility of the real-time solutions compared to existing stand-alone DRL approaches.
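For orientation, the generic CMDP problem and the trust-region update that CPO solves at each iteration can be written as below. This is the standard textbook form, not the paper's own formulation: the symbols (policy \pi, reward r, constraint costs c_i with limits d_i, discount \gamma, advantages A and A_{C_i}, trust-region radius \delta) are assumed notation, and how the abstract's system constraints map onto the c_i is not stated here.

```latex
% Generic CMDP: maximize expected discounted return subject to
% expected discounted cost limits (assumed notation).
\begin{aligned}
\max_{\pi}\quad & J(\pi) = \mathbb{E}_{\tau\sim\pi}\!\left[\sum_{t=0}^{\infty}\gamma^{t}\, r(s_t,a_t)\right] \\
\text{s.t.}\quad & J_{C_i}(\pi) = \mathbb{E}_{\tau\sim\pi}\!\left[\sum_{t=0}^{\infty}\gamma^{t}\, c_i(s_t,a_t)\right] \le d_i,
\qquad i = 1,\dots,m.
\end{aligned}

% CPO policy update at iteration k: surrogate objective and surrogate
% constraints evaluated under the current policy \pi_k, inside a KL trust region.
\begin{aligned}
\pi_{k+1} = \arg\max_{\pi}\quad & \mathbb{E}_{s\sim d^{\pi_k},\,a\sim\pi}\!\left[A^{\pi_k}(s,a)\right] \\
\text{s.t.}\quad & J_{C_i}(\pi_k) + \frac{1}{1-\gamma}\,
\mathbb{E}_{s\sim d^{\pi_k},\,a\sim\pi}\!\left[A_{C_i}^{\pi_k}(s,a)\right] \le d_i,
\qquad i = 1,\dots,m, \\
& \bar{D}_{\mathrm{KL}}\!\left(\pi \,\|\, \pi_k\right) \le \delta .
\end{aligned}
```

Under this reading, the real-time agent would be trained against both a return objective and explicit cost limits, rather than folding constraint violations into the reward as a penalty.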
Published in: IEEE Transactions on Sustainable Energy (Early Access)