I. Introduction
Many control applications require computing control actions by solving optimisation problems that are only partially defined, because they depend on unknown parameters that must be estimated from data. A microgrid Energy Management System (EMS) is a representative example: the controller has to plan how much energy to store in the storage systems at each time step, based on predictions of unknown variables such as the electricity price. The standard procedure for solving these control problems is to first estimate the unknown parameters and then pass the estimates to the subsequent optimisation problem that computes the control actions, following the so-called Predict then Optimise framework [12]. In recent years, however, performance-based network training has been shown to outperform the Predict then Optimise approach [9, 24]. In this setting, the parameters of the network are adjusted to optimise the ultimate criteria on which the model is evaluated, rather than the estimation performance [21].
In this paper we propose a performance-based self-tuning Model Predictive Control (MPC) algorithm that relies on differentiable optimisation layers. Recent works have shown that a parametrised convex optimisation problem can be included as a layer of a neural network (NN) by implicitly differentiating its optimality conditions [2]. The network is then trained end-to-end (E2E), with its parameters updated based on the optimality of the resulting decisions. The novel idea in this paper is to include the MPC optimisation problem in the learning framework, which enables real-time operation.
The resulting algorithm combines the advantages of MPC and NNs. On the one hand, MPC is a very successful control technique thanks to its ability to compensate for uncertainty and to handle constraints. On the other hand, using deep NNs to estimate the unknown parameters makes it possible to choose a model structure capable of approximating functions with arbitrary accuracy [23].
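To make the idea of differentiating through an optimisation problem concrete, the following is a minimal sketch and not the algorithm proposed in this paper. It assumes a toy equality-constrained quadratic programme in place of the MPC problem, and all names (`solve_qp`, `dx_dp`, `task_loss`, `p_hat`, `p_true`) and numerical values are hypothetical. The predicted price vector enters the problem as the linear cost term, and the sensitivity of the optimal decision to that prediction is obtained by differentiating the KKT optimality conditions.

```python
import numpy as np

def solve_qp(Q, p, A, b):
    """Solve min_x 0.5 x'Qx + p'x  s.t.  Ax = b  through its KKT system."""
    n, m = Q.shape[0], A.shape[0]
    # KKT conditions: Qx + p + A'nu = 0 and Ax = b
    K = np.block([[Q, A.T], [A, np.zeros((m, m))]])
    sol = np.linalg.solve(K, np.concatenate([-p, b]))
    return sol[:n], K

def dx_dp(K, n):
    """Jacobian of the optimal x with respect to p (implicit differentiation)."""
    # Differentiating the KKT conditions gives K [dx; dnu] = [-dp; 0],
    # hence dx/dp = -(K^{-1})[:n, :n].
    return -np.linalg.inv(K)[:n, :n]

# Toy data (assumed): three time steps, total energy fixed to 3 units.
Q = 2.0 * np.eye(3)                  # quadratic penalty on the decision
A = np.ones((1, 3))                  # the decisions must sum to b
b = np.array([3.0])
p_hat = np.array([1.0, 2.0, 4.0])    # predicted prices (e.g. a network output)
p_true = np.array([1.5, 1.0, 5.0])   # prices realised afterwards

def task_loss(p):
    """Cost actually incurred when the decision is planned with prices p."""
    x, _ = solve_qp(Q, p, A, b)
    return 0.5 * x @ Q @ x + p_true @ x

x_star, K = solve_qp(Q, p_hat, A, b)
J = dx_dp(K, 3)

# Chain rule: d(task_loss)/d(p_hat) = J' * d(task_loss)/dx
grad = J.T @ (Q @ x_star + p_true)

# Sanity check of the analytic gradient against central finite differences.
eps = 1e-6
fd = np.array([
    (task_loss(p_hat + eps * e) - task_loss(p_hat - eps * e)) / (2 * eps)
    for e in np.eye(3)
])
print("gradients agree:", np.allclose(grad, fd, atol=1e-6))
```

In a performance-based training loop, `grad` would be propagated further back into the parameters of the network that produced `p_hat`, so that the predictor is tuned for the cost of the decisions rather than for the prediction error. Handling inequality constraints, as an MPC problem requires, needs the full KKT conditions including complementarity, which this sketch leaves out.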