Physics-Informed Neural Network Modeling and Predictive Control of District Heating Systems

This article addresses the data-based modeling and optimal control of district heating systems (DHSs). Physical models of such large-scale networked systems are governed by complex nonlinear equations that require a large amount of parameters, leading to potential computational issues in optimizing their operation. A novel methodology is hence proposed, exploiting operational data and available physical knowledge to attain accurate and computationally efficient DHSs dynamic models. The proposed idea consists in leveraging multiple recurrent neural networks (RNNs) and in embedding the physical topology of the DHS network in their interconnections. With respect to standard RNN approaches, the resulting modeling methodology, denoted as physics-informed RNN (PI-RNN), enables to achieve faster training procedures and higher modeling accuracy, even when reduced-dimension models are exploited. The developed PI-RNN modeling technique paves the way for the design of a nonlinear model predictive control (NMPC) regulation strategy, enabling, with limited computational time, to minimize production costs, to increase system efficiency and to respect operational constraints over the whole DHS network. The proposed methods are tested in simulation on a DHS benchmark referenced in the literature, showing promising results from the modeling and control perspective.


I. INTRODUCTION
T HE growing issue of climate change calls for cutting-edge solutions to substantially reduce carbon emissions.In this context, district heating systems (DHSs), given their high efficiency, are recognized as crucial to reach the energy transition objectives.In fact, the European Commission considers this technology necessary to meet the 2050 decarbonization targets [1], with the aim of covering at least 50% of the heating demand in most European countries [2].A DHS is generally composed of a heating station, comprising different thermal generators, and of an insulated water pipeline network, transferring the generated heat to thermal loads (e.g., residential and The authors are with the Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, 20133 Milan, Italy (e-mail: laura.bocadegiuli@polimi.it; alessio.labella@polimi.it;riccardo.scattolini@polimi.it).
Digital Object Identifier 10.1109/TCST.2024.3355476commercial users), which, exploiting local heat exchangers, absorb the delivered heat and use it for indoor heating and domestic hot water.DHSs are typically operated by heuristic rule-based control strategies, which, however, do not exploit their full efficiency potential, implying the necessity to design advanced optimization-based control strategies [3].This is not a trivial task though, as DHSs are large-scale systems governed by complex nonlinear dynamical equations (e.g., describing transport phenomena in thermo-hydraulic networks), entailing a significant effort to compute their optimal operation and to develop accurate physical models, also due to the considerable number of necessary parameters (e.g., pipes lengths, diameters, friction coefficients, etc.) [4], [5].
To overcome these issues, it is here proposed to rely on identification methods with the purpose of obtaining computationally efficient and accurate models from operational data, which are typically widely available in DHSs.More specifically, recurrent neural networks (RNNs) are employed, being particularly suited to model nonlinear dynamical systems [6], [7].It is worth remarking that RNNs generally do not exploit any physical insight on the identified system: this may lead to the need of large datasets, time-consuming training procedures, or even unreliable data-based models.On the other hand, besides operational data, in engineering systems, there is usually the availability of some physical knowledge, which is worth being used to develop physically consistent data-based models.This has motivated the design of a novel physics-informed RNN (PI-RNN) modeling methodology for DHSs, enabling to achieve enhanced identification performances and efficient training procedures.In particular, the commonly known information about the physical topology of the DHS network (i.e., how thermal loads and generators are interconnected) is exploited to develop a PI-RNN model with an analogous topological structure.It is also shown that the developed PI-RNN model can be effectively employed to design a nonlinear model predictive control (NMPC) regulator, enabling to minimize production costs and to increase the system efficiency while respecting the desired operational constraints (e.g., temperature limits over the network).

A. Related Work
The detailed modeling of DHSs is addressed in [8], focusing on the stability of their nonlinear dynamics.The physical modeling and optimal operation of DHSs are discussed in [4], yet leading to the formulation of a large-scale problem solved using a one-step prediction horizon.In fact, DHS physical models typically include many state variables governed by nonlinear thermo-hydraulic equations, e.g., describing fluid and heat transport in water pipelines, resulting in a modeling complexity hardly tractable by standard optimization-based controllers.To overcome this issue, predictive controllers exploiting simplified models have been proposed in the literature, such as [9], [10], and [11], where the thermal dynamics of the DHS network are not modeled.Nevertheless, the accurate modeling of the network thermal dynamics is crucial for the optimal operation of DHS plants for several reasons: 1) network temperatures must respect operational constraints due to technical limits of thermal generators and to the proper heat supply to thermal loads (e.g., the water temperature supplied to each load must exceed a minimum lower bound [4], [12]) and 2) network thermal dynamics, if modeled, can be optimized to minimize heat losses and to increase the overall DHS efficiency.Other optimization-based control approaches include a dynamical modeling of DHS networks using simplifying assumptions, such as [13], where constant transport delays are considered, or [14], where all thermal loads are assumed to be supplied with the same water temperature (i.e., neglecting heat losses over the DHS network).Given the significant complexity of detailed physical models for DHS networks and the poor accuracy of simplified ones, data-based methods have been proposed to identify control-oriented and accurate DHS models directly from operational data [15], [16].In this context, neural networks (NNs) have been exploited for modeling and optimally controlling heating and cooling networks, thanks to their enhanced capability of representing nonlinear dynamical systems [17], [18].Nevertheless, the mentioned data-based models disregard any available physical insight on the system to be identified, possibly leading to poorly physically consistent and unreliable models.
Currently, in the scientific community, a growing interest is arising to embed available physical knowledge in NN models, enhancing their physical consistency, accuracy, and training procedure [19].To do that, different approaches have been presented in the literature.For instance, in [20], [21], [22], and [23], the loss function used for the NNs training is modified such that, besides minimizing the prediction error, known physical equations or relationships among variables are induced to be respected.Other methods suggest to incorporate the available physical knowledge directly in the NN architecture [24], [25].In this context, in [26], a physics-guided layer, embedding known system dynamics, is placed in parallel to NN hidden layers, improving the modeling performances.Considering the problem of deriving data-based models of interconnected systems, a further method consists in exploiting their physical topology, which is generally known, and interconnecting different NN models accordingly.This idea has been applied to chemical processes in [7] and [27], leveraging the known sequence of operations, and to thermal buildings in [28], exploiting the known connections among different thermal zones.This approach is conceptually similar to graph NNs (GNNs), where different neurons are interconnected by resembling graph-structure data dependencies [29].Nevertheless, none of the mentioned physics-informed identification approaches is applied to energy networks and in particular to DHSs, which are commonly characterized by a well-defined topology, and none of them exploits the developed models for the design of computationally efficient and cost-effective NMPC regulators.

B. Main Contribution
In view of the above discussion, a novel PI-RNN modeling methodology for DHSs is proposed, particularly suited for the design of NMPC regulators.The main contributions of the work are hereafter synthesized.
1) Physics-Informed Neural Network Modeling of DHSs: Given that the DHS network topology is commonly known, this information is leveraged to develop a novel PI-RNN architecture, capable of accurately modeling the main thermal dynamics.More specifically, a different RNN is first paired with each section of the DHS network (e.g., with a thermal load and the corresponding supplying pipes), and, subsequently, all RNNs are interconnected resembling the network physical topology.
Then, the overall PI-RNN, comprising all the interconnected RNNs, is trained as a unique data-based model, embedding in its architecture the physical dependence among the different DHS network sections.This enables to achieve a faster training procedure and higher modeling accuracy with respect to standard RNN models, even when employing reduced-order PI-RNN models, as witnessed by the numerical results.2) NMPC Design for Optimal Operation of DHSs: The developed PI-RNN model is exploited for the design of an NMPC regulator, which optimizes the DHS with a prediction horizon of several hours, enabling to minimize production costs, increase system efficiency, and comply with operational constraints over the whole DHS, e.g., by providing proper heat delivery to all thermal loads.Moreover, as witnessed by the numerical results, the employed PI-RNN model enables to reduce the NMPC computational complexity not only with respect to physical models but also with respect to standard RNN-based ones.The proposed approach is tested in simulation on a DHS benchmark, i.e., the AROMA DHS [4], showing promising results from the modeling and control perspective.

C. Article Outline
This article is organized as follows.A general overview on the DHS physical modeling is presented in Section II, together with the description of the benchmark case study analyzed in this work.Two data-based modeling approaches, i.e., standard RNN and PI-RNN methods, are presented in Section III, with a special focus on the proposed physics-informed databased methodology and its application to the considered DHS benchmark.The formulation of the NMPC regulator exploiting the developed data-based models is described in Section IV.The numerical results regarding the proposed modeling and control methods are reported in Section V. Final conclusions are given in Section VI.

D. Notation
Let R denote the set of real numbers and N the one of natural numbers.Given two vectors of variables x, y ∈ R n , the inequalities between the two, e.g., x > y, are intended elementwise, whereas their Kronecker product is indicated with x ⊗ y.For a vector x ∈ R n , its two-norm is indicated as ∥x∥ 2 , whereas the vectors of corresponding upper and lower bounds are x ∈ R n and x ∈ R n , respectively, with x > x.Considering a real variable a ∈ R, with a > 0, b = ⌊a⌋ is the largest integer less than or equal to a, i.e., b ∈ N ∪ {0}.Given a sequence of variables a 1 , . . ., a n , and the set of their indices N = {1, . . ., n}, the vector a = [a 1 , . . ., a n ] ′ is compactly written as a = {a i } ∀i∈N .Given a set N , its cardinality is denoted as n = |N |.

A. System and Main Modeling Assumptions
The main physical variables and parameters used in the following are reported in Table I.A DHS typically consists of four main elements, as depicted in Fig. 1: 1) the supply network, where water at high temperature flows from the heating station to thermal loads; 2) the return network, where water at cold temperature flows from thermal loads to the heating station; 3) the heating station, which absorbs water from the return network and injects it at higher temperature into the supply network; and 4) the thermal loads (e.g., households or buildings), which absorb water from the supply network, exploiting the delivered heat for internal heating, and inject it into the return network.
For the sake of clarity, two standard assumptions for DHSs are considered in the following.
Assumption 2: The supply and return networks have the same physical topology [8], [31].
Note that the introduced assumptions could be removed in the following at the price of reduced clarity of presentation.
Given Assumption 2, the DHS can be represented by a structured graph G = (N , E), where N identifies the set of nodes, whereas E ⊆ N × N is the set of edges.Each node, denoted as α i , with i ∈ N , represents an element of the DHS, e.g., a thermal load, the heating station, or a junction among multiple pipes, and it includes a connection both with the supply and return network, as depicted in Fig. 1.Given Assumption 1, without any loss of generality, the node where the heating station is connected is denoted as α 0 .Moreover, for the sake of clarity, two specific subsets of nodes are introduced.The first one, defined as N net = N \{0}, contains all the nodes of the DHS network excluding the heating station, whereas the second one, denoted as N c ⊆ N net , with n c = |N c |, includes all the nodes containing thermal loads.
Each edge directed from α i to α j is denoted as e i j = (α i , α j ), with (i, j) ∈ E. As a convention, each edge is oriented according to the water flow direction in the supply network, assumed to be known, e.g., from available operational data or preprocessing techniques [4].Nevertheless, in case  the flow direction in the supply network between α i and α j is not fixed, two opposite edges e i j and e ji are defined to interconnect the corresponding nodes.
Since the development of a DHS mathematical model based on physical laws is well known and out of the scope of this article (the interested reader is referred to [4]), the fundamental relations among the main system variables are here briefly presented, as these will be necessary to better describe the proposed identification method.
First of all, as evident from Fig. 1, each load absorbs a water flow q c i from the supply network at temperature T s i .This water flow goes through an internal heat exchanger absorbing a thermal power P c i , then it is injected into the return network at temperature T c i , which can be modeled as where c w is the water specific heat.Note that the thermal load model is static as its dynamical transients are negligible with respect to the DHS network ones.As discussed in [12], the load water flow q c i is supposed to be regulated by a local controller, tracking a constant reference for the load output temperature, indicated as T c⋆ i .This implies that the load water flow can be generally modeled as a function of the load supply and output temperature, the thermal power and the output temperature reference, i.e., where ζ c i denotes a generic nonlinear function, whereas P c i and T c⋆ i act as external disturbances.Before modeling the supply and return network dynamics, the sets of inlet nodes of α i are defined with respect to the supply and return networks.These are denoted as I s i and I r i , respectively.More specifically, given that E is oriented according to the water flow direction of the supply network, it follows that I s i = { j ∈ N |∃( j, i) ∈ E}.On the other hand, in case the water flow directions of the return network are opposite with respect to the supply one, as typical in DHSs [32], it follows that I r i = { j ∈ N |∃(i, j) ∈ E}.If this does not hold, I r i can be defined according to the actual water flow directions of the return network.
As shown in Fig. 1, each node of the DHS network is characterized by a net water flow at the supply network, i.e., q s i , and by one at the return network, i.e., q r i .These depend on the water flows at the corresponding inlet network nodes and on the one absorbed, or injected, by the thermal load, if present.Thus, ∀i ∈ N net , it holds that Thermo-hydraulic pipes introduce transport delay effects on temperature profiles over the DHS network.It is possible to describe the physics of water flow in pipelines through 1-D Euler equations: in order to write the system in a state-space (SS) form, however, the model of each pipe must be discretized in space (finite volume method), as discussed in detail in [4].Hence, each node of the DHS network is characterized by the following temperature dynamics at the supply network, which, ∀i ∈ N net , can be compacted as where f s i and g s i are nonlinear functions and z s i is a generic vector employed to represent the internal states of the supply temperature dynamic model.Specifically, the finite volume method models each pipeline as a sequence of adjacent subsections, each one characterized by a lumped temperature dynamics [4].Hence, the state variable z s i in (5a) compactly comprises the temperatures of all subsections, considering the supply network pipes interconnecting node α i , with i ∈ N net , and its inlet nodes α j , with j ∈ I s i .Moreover, as evident from (5a), the supply temperature dynamics at each node α i is influenced by the supply temperatures and water flows of all α i 's inlet nodes, i.e., α j , with j ∈ I s i .Additionally, T ext is the external temperature, which corresponds to the ground temperature being DHS pipes typically buried.Finally, the supply temperature T s i can be expressed as a function of the subsection temperatures and of the incoming water flows, as evident from (5b).
Similar to (5), each node is characterized by a temperature dynamics at the return network as well, which, ∀i ∈ N net , is expressed as where f r i , g r i , and g r i are generic nonlinear functions, whereas the vector z r i comprises the internal states of the return temperature dynamic model.In particular, similar to (5a), the internal states of (6a) correspond to the lumped subsection temperatures of the pipes interconnecting node α i , with i ∈ N net , and its inlet nodes α j , with j ∈ I r i , according to the finite volume modeling method [4].Considering the modeling of the return temperature T r i in (6b), note that, differently from (5b), if node α i is connected to a thermal load, the corresponding dynamics is influenced also by the load output temperature, i.e., T c i , and by the load water flow, i.e., q c i (see Fig. 1).Finally, regarding the heating station, even though it is typically composed of different thermal generators (boilers, heat pumps, cogenerators, etc.) and storages, here, similar to [4], its internal configuration is neglected, whereas just its overall power consumption, water flow, return and supply temperatures are taken into account.The latter, denoted as T s 0 , does not depend on the DHS network supply nodes, but it is a control variable imposed by the heating station itself [4], [16].Moreover, T s 0 is not here described by a dynamical equation, as in (5), given that thermal generation is usually characterized by negligible dynamical transients with respect to the DHS network ones [30].On the other hand, the return temperature at the heating station, i.e., T r 0 , is characterized by a dynamical behavior, which can be compacted as where f r 0 and g r 0 are nonlinear functions and z r 0 is the vector that represents the internal states of the dynamic model of the return temperature at the heating station.Furthermore, given Assumption 1, and since additional bypasses between the supply and the return network are not considered, the heating station water flow, indicated as q 0 , is equal to the sum of all load flows, i.e., The heating station power is denoted as P 0 and it depends on the overall water flow and on the difference between the supply and return temperature, i.e., To sum up, it is possible to collect the above-described input and output variables into vectors so as to get an overall SS model of the DHS network.Thus, (1)-( 8) are compacted as where f and g are nonlinear functions resulting from ( 1)-( 2) and ( 5)-( 7), z is the overall state vector, v = T s 0 is the controllable input, whereas d = {P c i } ∀i∈N c , i.e., the thermal load demands, are the disturbances.Note that, being T ext and T c⋆ i constant over time, they are not included as disturbances in the system model.Concerning the outputs, the following ones, being typically measurable, are selected: In detail, T r 0 and q 0 are needed to compute the heating station thermal power, as evident from ( 9), whereas the loads supply temperatures, i.e., T s i , must be monitored to ensure that they respect prescribed operational limits.Finally, as it will be later clarified, the output temperature and water flow of each thermal load, i.e., T c i and q c i , are also convenient to be measured.At the end, considering an appropriate sampling time τ s , the DHS model ( 10) can be discretized using a suitable integration method.Hence, the discretized system model reads as where k = ⌊t/τ s ⌋ is the adopted discrete-time index, and for simplicity the same notation as continuous time-dependent variables is maintained.

B. Case Study
To better comprehend the proposed modeling method, the considered system benchmark, i.e., the AROMA DHS described in [4], is here briefly introduced.A schematic representation of the AROMA DHS is reported in Fig. 2(a).As visible from this scheme, the system is composed of a heating station and a DHS network of nine nodes, including five thermal loads.In particular, the total pipeline length at the supply and return networks is 7262.4m, whereas other details are available in [4].
As previously discussed, it is possible to define a graph describing the considered DHS, as shown in Fig. 2(b).Note that the water flow direction between nodes α 7 and α 8 may be not determined a priori.Consequently, the corresponding edge is doubled (e 78 and e 87 ), as evident from Fig. 2(b).For the sake of clarity, as a convention, loads are numbered in increasing order according to their distance along pipelines with respect to the heating station (node α 0 ).Moreover, the nodes which do not represent thermal loads but pure junctions are numbered with indices greater than the loads ones, again according to their distance from the heating station.Finally, the AROMA DHS physical model, described in [4], has been leveraged to develop a dynamic simulator in the Modelica environment [33], exploiting the library [34].The developed AROMA DHS simulator will be used in the following both for data collection and for control testing.

III. DATA-BASED MODELING
As anticipated, for complex large-scale systems such as DHS networks, developing physical models as (10) may demand a lot of modeling effort due to the required knowledge of a huge amount of parameters (e.g., pipe lengths, diameters, thickness, heat transfer coefficients, etc.).Therefore, the first objective of the work is to identify a computationally efficient DHS network model through data-based approaches.The latter deal with the problem of building mathematical models of dynamical systems based on observed data from the plant itself.The procedure to follow is straightforward: input and output signals from the system are collected and processed by a data analysis technique so as to infer a dynamic model [35].

A. Recurrent Neural Networks
Various identification techniques can be exploited.Since the system under control is characterized by a nonlinear behavior, linear models such as autoregressive with exogenous input (ARX) or output-error (OE) are not appropriate, as discussed in Section V.By contrast, NNs [36], thanks to their enhanced ability to learn nonlinear relationships, are suited to identify complex systems like DHSs.In particular, each NN is characterized by the so-called hyperparameters, which include hidden layers and neurons.A hidden layer is an intermediate layer between the NN input and output layers, and it is the collection of neurons which transfer data to layers [37], [38].Within the NNs framework, RNNs are particularly suited to represent nonlinear dynamical systems and to process time series data [39], being inherently characterized by the presence of state variables [7].Therefore, RNNs will be exploited in the remaining of the work to identify the DHS network model under investigation.
In general, RNNs can be described as a dynamical SS model, i.e., where x ∈ R n x , u ∈ R n u , y ∈ R n y are the state, input, and output vectors, respectively.Besides, is the set of parameters (weights and biases) of the RNN, which must be tuned during the training procedure [7].Specifically, an RNN is constituted by n l hidden layers, each one comprising n [i]  x state variables, with i = 1, . . ., n l , thus implying that the total number of states in (12) is n x = n l i=1 n [i]  x .Note that the number of states of each RNN layer is defined by the selected number of neurons [40].For the purpose of identifying the DHS network modeled in (11), the inputs of the RNN model (12) are ] ′ .Despite their potential, RNNs, in general, do not embed any physical knowledge but they just rely on the available input-output data.Nevertheless, available physical information, such as the network topology in DHSs, is worth to be exploited to enhance their modeling performances.

B. Physics-Informed Recurrent Neural Networks
The proposed PI-RNNs modeling methodology involves the interconnection of different RNNs according to the physical system structure, so that the so-obtained overall PI-RNN architecture resembles the DHS network topology.
First of all, as later clarified, just a subset of the DHS nodes are of interest from the control perspective, i.e., the one comprising the heating station (α 0 ) and the thermal loads (α i , ∀i ∈ N c ).Thus, a reduced graph G = ( N , E) is introduced, where N denotes the set of these significant nodes, i.e., N = {0} ∪ N c .Then, the nodes in N are interconnected according to their physical dependence with respect to the supply network.In fact, the supply temperature at each load node is influenced by the ones at the inlet load nodes, defined according to the water flow direction.Hence, the set of the edges of this reduced graph, i.e., E ⊆ N × N , is defined as and The definition of E in (13) expresses the fact that two nodes in N are connected by an edge if there exists a path in the original DHS graph G = (N , E) which interconnects them and does not contain any other node in N .The definition of the reduced graph G = ( N , E) derives from the fact that, as visible from Fig. 3(a), each load supply temperature is influenced by the ones at the inlet load nodes, defined according to the water flow direction.In particular, let us consider a section of the DHS network comprising the ith thermal load node and the supply pipes connecting it with each jth inlet node, ∀ j : ( j, i) ∈ E [dotted shadow area in Fig. 3(a)].It is evident that the supply temperatures of the inlet load nodes, i.e., {T s j } ∀ j:( j,i)∈ E , have a direct impact on the considered ith DHS section, and thus they can be modeled as  local inputs of this subsystem.The same holds for the thermal demand P c i , which acts as an external disturbance significantly influencing the local load water flow and output temperature.On the other hand, the resulting supply temperature T s i can be modeled as an output for the considered ith DHS section.This must comply with the load operational limits and it constitutes an input for the subsequent DHS section models, defined based on E. Additionally, the load water flow q c i and the output temperature T c i are modeled as outputs for the ith DHS section model as well, since these will be needed to identify the return network dynamics, as explained below.
Consequently, the approach proposed in this work consists in defining a load-associated RNN for each ith DHS section, with i ∈ N c , comprising the ith load and the corresponding supply pipe(s) entering the node, as depicted in Fig. 3(b).Then, each RNN is interconnected to the others according to the topology of the reduced graph, expressing the dependence among inputs and outputs of the different DHS sections.
On the other hand, a different approach applies for the return network dynamics.Indeed, a single return-associated RNN is employed, since nodal variables at the return network (T r i , q r i , ∀i ∈ N c ) are not of interest from the control perspective, as will be evident in Section IV, but only the return temperature T r 0 and water flow q 0 are necessary to compute the heating station produced power P 0 in (9).The return-associated RNN receives as inputs the output temperature and water flow of each ith load, which are outputs of the ith load-associated RNN, ∀i ∈ N c , and it outputs the return temperature and the water flow at the heating station, i.e., T r 0 and q 0 , as evident from Fig. 4.
Thus, the total number of employed RNNs in the proposed PI-RNN approach is equal to the number of loads plus one, i.e., n RNN = n c + 1, and each ith RNN is modeled as with i ∈ {1, . . ., n RNN }.
Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.
The number of states of each ith RNN is indicated with n [i]  x , i.e., x , implying that the total number of states of the whole PI-RNN is n x = n RNN i=1 n [i]  x .In particular, the inputs and outputs of each ith RNN are defined as follows.
1) For each ith load-associated RNN, with i ∈ {1, . . ., n c } 2) For the return-associated RNN, with i = n RNN = n c + 1 In other words, each ith load-associated RNN, i.e., paired with a DHS section comprising the ith load and the supply pipe(s) entering node α i , identifies the corresponding supply temperature among its outputs, see (15d), which consequently constitutes an input for the load-associated RNNs influenced by the ith one, as evident from (15a), according to the reduced graph interconnections described by E. As shown in Fig. 3(b), each ith load-associated RNN returns also as outputs the corresponding load output temperature and water flow, see (15e), which represent an input for the returnassociated RNN (16a), as shown in Fig. 4(b).Moreover, being the local thermal demand a disturbance, each load-associated RNN is also fed with it, as evident from (15b).Finally, the return-associated RNN identifies as outputs the overall water flow and return temperature (16b), being the latter necessary to compute P 0 in (9).Ultimately, by collecting the variables of all RNNs into vectors, the overall PI-RNN model can be written as in (12) by setting x = [x [1] ′ , . . ., } ∀i:(0,i)∈ E , as the supply temperature at the heating station node α 0 is the effective external input of the system, d = [d [1] , and y = [y [1] ′ , . . ., y [n RNN ] ′ ] ′ .PI-RNN Modeling Applied to the AROMA DHS: As discussed, starting from the physical topology of the AROMA DHS, reported in Fig. 2(a), one can extract the structured graph depicted in Fig. 2(b).Therefore, following the procedure described in Section III-B, the reduced graph shown in Fig. 5(a) can be defined, which represents how load supply temperatures influence each other.Finally, the proposed PI-RNN architecture, which reflects the physical system topology, is encoded according to the information contained in the reduced graph, as shown in Fig. 5(b).Please note that v = v [1]  = v [2]  = T s 0 , being the supply temperatures at nodes α 1 and α 2 directly affected by the one at the heating station node α 0 , which is the overall system input.Moreover, v [3]  = y [1]  s , being α 1 the only preceding significant node for α 3 , v [4]  = v [5]  = [y [2] ′ s , y [3] ′ s ] ′ , being α 2 and α 3 the preceding significant nodes for α 4 and α 5 , whereas u [6]  = [y [1] ′ r , y [2] ′ r , y [3] ′ r , y [4] ′ r , y [5] ′ r ] ′ , since the return-associated RNN is fed with the load output temperatures and water flows which are outputs for the five load-associated RNNs.
Remark 1: Considering the AROMA DHS case study, a different RNN is paired with each load.However, in case DHSs are characterized by numerous thermal loads, these can be grouped as shown in [41].In this way, a single RNN can be used to model each loads cluster, thus limiting the size of the overall PI-RNN model.
Remark 2: The topological correspondence between the proposed PI-RNN architecture and the physical system enables to properly tune the number of neurons of each RNN.In particular, if certain sections of the physical system require a higher modeling capability, given an inherent higher dynamical complexity, the number of neurons of the associated RNNs can be suitably increased.Considering the AROMA DHS case study, the farther a thermal load from the heating station, the more complex its thermal dynamics, being influenced by longer pipelines (and consequently by higher head and thermal losses), as well as by the operations of all preceding thermal loads.As a consequence, a higher identification error may occur for the output variables related to farther thermal loads.This issue can be effectively solved by the proposed approach increasing the modeling complexity (e.g., the number of neurons) of each load-associated RNN with the distance of the corresponding thermal load from the heating station, so as to maintain a small identification error for the output variables of interest.
Remark 3: In the PI-RNN approach, each RNN can be fed with some additional physical knowledge to improve its modeling performance.For instance, in order to give the information regarding the total DHS thermal demand to each ith Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.
load-associated RNN, the sum of the other thermal demands can be also provided, i.e., d [i]

IV. NONLINEAR MODEL PREDICTIVE CONTROL
Before showing the performances achieved by the proposed PI-RNN modeling approach, an NMPC regulation strategy is formulated, which periodically optimizes the DHS operation exploiting the derived RNN-based dynamical models.
Let us consider a sampling period τ s and a prediction horizon of N steps.Thus, leveraging the receding horizon strategy [42], the following NMPC problem is solved at each time instant t = k s τ s , with k s ∈ N: where the cost function, the constraints, and the adopted symbols are described in the following.In detail, the cost function (17a) minimizes the production cost of the heating station: the power P 0 is multiplied by the time-varying price c el and divided by the heating station thermal efficiency η.Moreover, a terminal cost is added in the cost function, weighted via c t , to discourage significant variations of the load supply temperatures from a nominal reference value T ⋆ .The dynamical model of the DHS network is embedded in the NMPC formulation in (17b) and (17c), reported in the generic form of (12), so as to include either the standard RNN or the proposed PI-RNN model.It is worth noting that, in principle, (17b) and (17c) could be replaced by the DHS network physical model, i.e., (11), which, as discussed, leads to a large-scale optimization problem hard to be solved [4].Independent of the selected model, as evident from (17d), the system state x must be initialized at each NMPC iteration with x0 , supposed to be measured or estimated.In fact, being the state typically not accessible in RNN models, state observers could be necessary, e.g., the ones proposed in [7] and [43].Constraints (17e) and (17f) are included to comply with temperature limits at the heating station, whose produced thermal power is modeled in (17g) and bounded in (17h).
Moreover, constraint (17i) is imposed to guarantee that the supply temperature at each thermal load respects prescribed limits, enabling the proper heat delivery and functioning of local load exchangers.In particular, note that these temperature limits may change over time, e.g., being higher by day and lower by night, consistently with the thermal demand daily trend [30].Moreover, being (17f), (17h), and (17i) imposed on output variables, in order to ensure problem feasibility [42], these should be stated as soft constraints by means of slack variables, which, for the sake of clarity, are not here explicitly reported.
Finally, to reduce the computational complexity induced by the nonlinear model ( 17b) and (17c), constraints (17j) and (17k) are added.The former implies that the variation of the heating station supply temperature between two consecutive time instants is limited by T s 0 > 0. Constraint (17k) is commonly referred to as input blocking strategy [44], since it limits control variables to vary every N b steps over the prediction horizon, where N b is a positive integer with N b ≪ N , reducing the problem degrees of freedom.This strategy enables to lighten the optimization problem, but in practice, being the NMPC regulator executed with a period τ s , the actual manipulated input T s 0 will still vary at each t = k s τ s .Please note that constraints (17j) and (17k) are not necessary from a conceptual point of view, but they enable to adopt larger prediction horizons, which can be necessary to effectively optimize DHSs, given their slow dynamical transients.
Overall, the just described optimization problem constitutes a basic example, since more advanced NMPC strategies for DHSs are out of the scope of this article.For instance, a thermal energy storage (TES) could be considered in the heating station modeling as a further degree of freedom.

V. NUMERICAL RESULTS
In this section, the performances of the proposed modeling and control approaches applied to the AROMA DHS benchmark are presented.

A. Identification Results
First, to properly identify the system dynamics, a significant dataset of input-output samples is collected.Therefore, the system inputs, i.e., the supply temperature T s 0 and the thermal demands P c i , ∀i ∈ N c , are varied using multilevel pseudorandom binary sequences (MPRBS), composed of steps of random amplitudes and interval time sizes.Specifically, the amplitude of T s 0 is varied between 62 • C and 96 • C, whereas the one of P c i , ∀i ∈ N c , between 30 kW and 420 kW.Considering the system transients, which are typically slow in DHSs, it is reasonable to collect data with a sampling time τ s = 5 min.The system is thus simulated to gather a dataset D tot of 15 690 samples (properly split in training, validation, and testing sets, denoted as D train , D val , D test , respectively).Let us recall that the outputs of interest of the AROMA DHS are the five loads supply and output temperatures, their absorbed water flows (respectively, T s i , T c i and q c i , ∀i ∈ N c ), as well as the overall return temperature T r 0 and water flow q 0 .
Second, in order to quantitatively evaluate the identification performances, specific performance indices are defined.In particular, the fitting (FIT) index and the coefficient of determination R 2 are employed, reported in ( 18) and ( 19), respectively.The FIT index assesses the model overall accuracy on the test set, and it is defined as where is the sequence of measured ones and y avg test is its average, i.e., defined as y ∀i∈D test ⃗ y test (i), as discussed in [43].The R 2 index is leveraged to assess the modeling accuracy of each identified output with respect to the test set [45].The R 2 j related to the jth output is defined as where y for each jth output.In particular, the minimum coefficient of determination, i.e., R 2 = min j=1,...,n y R 2 j , related to the output identified with the worst accuracy and the maximum one, i.e., R 2 = max j=1,...,n y R 2 j , related to the output identified with the best accuracy are evaluated to assess the modeling performances of the developed data-based models.
As anticipated, linear models such as SS, ARX, and OE are not able to properly model the system dynamics, yielding very low FIT values, as shown in [46], and thus they are not considered further.Consequently, standard RNNs are tested in the first place, and then PI-RNNs are developed and compared to the former.The implementation of both types of NNs is performed with the Python programming language (version 3.10), using the library developed in [47] for standard RNNs and customizing it to build up PI-RNNs.The training procedure employed for the different RNNs exploits the so-called truncated backpropagation through time (TBPTT) method, thoroughly described in [40], where the employed loss function is the mean-square error (MSE), measuring the FIT quality of predicted output with respect to the measured one, whereas the RNN states are randomly initialized.Ultimately, all computations are carried out on a laptop with an Intel Core i7-11850H processor.
In detail, two families of RNN architectures are first tested, i.e., long short-term memory (LSTM) [48] and gated recurrent unit (GRU) [43], as well as different combinations of hyperparameters, i.e., amount of hidden layers, of neurons and optimizers, e.g., adaptive moment estimation (ADAM) and root mean squared propagation (RMSProp) [49]).By comparing the performance indices of the different combinations of NNs and hyperparameters (FIT, R 2 , R 2 , best epoch and training time to reach the best epoch), GRU NNs are chosen to identify the AROMA DHS model [46], also thanks to their simpler structure [40].Therefore, only GRU NNs will be considered in the following, even though the proposed identification method applies to any type of RNN.Thus, the performances of a standard GRU and of a physics-informed GRU (PI-GRU) model are now compared.These are trained with the 15 690-sample dataset over 1500 epochs, with the ADAM optimizer and a learning rate of 0.003, so as to get a good trade-off between convergence speed and excessive oscillations avoidance [6].
Since the AROMA DHS is composed of five loads, the PI-GRU model is composed of six GRUs [n RNN = n c + 1, see Fig. 5(b)], each one implemented with a single hidden layer.To make a fair comparison, the standard GRU is composed of six hidden layers accordingly (n l = n RNN ).
Moreover, as highlighted in Remark 2, the number of neurons of the load-associated GRUs in the PI-GRU model increases with the distance of the corresponding thermal load from the heating station.On the other hand, the returnassociated GRU, being paired with a DHS section comprising the overall return network, is assigned a large number of neurons as well.Therefore, given that in GRU networks the state dimension matches the number of neurons of each layer [40], n [i]  x increases with i, as load nodes are numbered according to their distance from the heating station, as previously discussed.By contrast, since hidden layers do not have a physical matching in the standard GRU model, the same amount of neurons is set for each hidden layer, so that its total number of states coincides with the PI-GRU one.
Then, as highlighted in Remark 3, each load-associated PI-GRU is fed, among other inputs described in Section III-B, with the cumulative power consumption.
Fig. 6 reports the comparison between the FIT trend of a 54-state standard GRU and of a PI-GRU one, for the AROMA DHS benchmark.Specifically, n x = 54 is chosen as it is the minimum amount of states that enables the standard GRU to reach approximately an FIT of 50%.The PI-GRU takes 557 epochs and a training time of 91 min to reach its best FIT value of 83.6%, whereas the standard GRU takes 1479 epochs and a training time of 145 min to reach its best FIT value of 54.5%, as evident from Fig. 6.For the sake of completeness, the identification procedure is repeated multiple times due to the random initialization of RNNs weights and biases and the random subsequences extraction of the TBPTT method.The obtained results are reported in Table II, where the average FIT, R 2 and R 2 values are shown, together with their standard deviation, in case the RNN models are trained three times.These results show how the PI-GRU outperforms the Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.

TABLE II COMPARISON BETWEEN THE PERFORMANCE OF A STANDARD GRU AND
A PI-GRU, BOTH HAVING 54 STATES AND TRAINED WITH 15 690 SAMPLES Fig. 7. GRU and PI-GRU identification results.The identified variable is depicted in orange, the measured one in blue.(a) T s 5 identified by the standard GRU.(b) T s 5 identified by the PI-GRU.(c) q c 5 identified by the standard GRU.(d) q c 5 identified by the PI-GRU.
standard GRU, even though the two networks are characterized by the same hyperparameters.In particular, the enhancement in the FIT and in R 2 reported in Table II is promising.Moreover, the PI-GRU takes less than 100 epochs to exceed a FIT of 80%, which is a value that the standard GRU does not even reach within 1500 epochs (see Table II).Moreover, in Fig. 7, the measured supply temperature and water flow trend of the load placed in node α 5 are compared with the predictions both of the 54-state standard GRU and of the PI-GRU one.Being α 5 the farthest node from the heating station, only the PI-GRU is able to properly identify both T s 5 and q c 5 , whereas the standard GRU commits a considerable modeling error (see Table II).In fact, the variables paired with the farthest loads are the most challenging to be identified, achieving low R 2 values.This identification local issue, however, can be tackled by PI-GRUs through a suitable choice of the number of neurons, as discussed in Remark 2, but it cannot be tackled by standard GRUs, and RNNs in general, as their architecture does not have a physical interpretation.
Sensitivity Analysis: A short sensitivity analysis is here reported to assess the robustness of the proposed PI-RNN method.
First, the performances of the standard GRU and PI-GRU networks are evaluated using different amount of states, i.e., n x = 30 and n x = 90.As visible from Fig. 8(a) and (b), the PI-GRU still achieves superior performances, yielding always a FIT above 80%, which slightly decreases as the state dimension drops.In fact, the PI-GRU states dimension has an impact solely on the number of epochs required to exceed the 80% FIT.After several tests, the average FIT, R 2 and R 2 values, together with their standard deviation, are computed and reported in Table III.Ultimately, PI-GRUs, unlike standard   GRUs, are shown in practice to be robust with respect to the number of neurons.In a second test, a dataset of 7845 samples (i.e., half of the original dataset) is used to train both the 54-state standard GRU and the PI-GRU one.Once again, the latter outperforms the standard GRU, as evident from Fig. 8(c).Multiple tests are carried out and the obtained average FIT, R 2 and R 2 values, together with their standard deviation, are reported in Table IV.To conclude, another advantage of such physics-informed method lies in the fact that, when only a limited amount of data is available, the PI-GRU, contrarily to the standard GRU, is still capable of modeling the system dynamics.Additionally, the 54-state PI-GRU trained with a reduced dataset (see Table IV) performs better even than the 90-state standard GRU trained with the complete dataset (see Table III).Remark 4: For a complete analysis, the comparison of a standard GRU and another one characterized by a physics-informed loss function [50] is carried out.In DHSs, Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.
for instance, thermal loads are fed with water at high temperature from the supply network and they deliver it at cold temperature to the return network.Therefore, it is reasonable to enforce in the GRU loss function the physical constraint according to which each load supply temperature is greater than the output one, using techniques similar to [20], [21], [22], and [23].In the AROMA DHS case, when the GRU is trained by minimizing this physics-informed loss function, a performance improvement is not evident with respect to the standard loss function case, as shown in [46].Ultimately, the physics-informed loss function approach has been discarded.

B. Control Results
The formulated NMPC regulator is implemented in MAT-LAB R2023a using the CasADi environment and the Ipopt solver.Moreover, the control tests are carried out on the developed AROMA DHS simulator, implemented in the Modelica environment using [34].
The NMPC regulator is executed with a sampling time τ s = 5 min and it considers a prediction horizon N = 6 h/τ s = 72 steps.The main control design parameters are reported in Table V.The heating station is assumed to be modeled as an equivalent heat pump, normally characterized by an electrical-to-thermal efficiency η larger than one (i.e., the coefficient of performance [51]).The lower bound of the thermal load supply temperatures is time varying, as previously discussed and reported in Table V.For the sake of simplicity, an open-loop observer consisting of the equations of the employed RNN-based model is implemented to provide the state estimate to the NMPC regulator.Actually, the design of closed-loop RNN-based observers is outside the scope of this work, but the interested reader is referred to [7] and [43] for details about their design.Moreover, considering a typical DHSs operation, the daily trend of the considered thermal demands P c i is reported in Fig. 9(a), whereas the electrical price profile c el is depicted in Fig. 9(b).It is worth noting that the thermal demand profiles are assumed to be known to the NMPC regulator, given that the development of suitable forecasting algorithms is beyond the scope of this work and different related techniques are available in the literature (see [30], [52]).
Regarding the model choice, the NMPC performances are tested considering the 54-state standard GRU and the 30state PI-GRU model.On the one hand, the 54-state standard GRU is selected in order to have at least an average FIT of 50% (see Table II) and a computationally tractable optimization problem.Indeed, the 90-state standard GRU yields slightly higher FIT values (see Table III) but yet computationally heavy optimization problems in case of multistep-ahead prediction horizons, because of the model large dimension: the solver takes almost 3 min per iteration, which would introduce unacceptable delays considering τ s = 5 min.contrast, the 30-state standard GRU is computationally lighter but it leads to very poor identification accuracy (see Table III) and therefore to unreliable models.On the other hand, the 30-state PI-GRU is the physics-informed model characterized by the minimum amount of states that reaches a FIT of 80% (see Table III), thus producing accurate predictions and also computationally The NMPC exploiting the standard GRU model and the one exploiting the PI-GRU one are also compared with a rulebased strategy, where the heating station is operated at constant supply temperature, as typical in DHSs [30], [53], by setting T s 0 = 75 • C. The control strategies are tested over a daily simulation and evaluated based on the following performance indices: the daily production cost C p = T t=1 c el (t)P * 0 (t)/η, where P * 0 (t) is the effective power produced by the heating station and T = 24 h/τ s = 288, the average computational time t avg , and the total thermal losses P loss = T t=1 (P * 0 (t) − n c i=1 P c i (t)).The computed indices are reported in Table VI for the three analyzed control strategies.In particular, when the NMPC regulator exploits the PI-GRU model, the computational time is significantly reduced, given that the 54-state standard GRU is characterized by a larger dimension.Moreover, thanks to the greater reliability of PI-GRU predictions, the production cost is lowered as well.By contrast, when the rule-based strategy is adopted, the production cost clearly grows with respect to NMPC strategies.These economic savings are particularly encouraging in terms of efficiency improvement, as thermal losses are significantly reduced when using NMPC strategies, and in particular when exploiting the PI-GRU-based NMPC, as evident from Table VI.It is worth noting that the economic  savings achieved by the NMPC regulator with respect to the rule-based strategy are related to thermal losses reduction and to a better exploitation of the network pipelines storage capability, since the only control variable is the supply temperature at the heating station.If additional degrees of flexibility were present, such as cogeneration units or thermal storage tanks, higher savings could be achieved by an NMPC regulator, see [16] and [30] for further details.Ultimately, by using computationally efficient data-based models, the optimization problem is tractable even with multistep-ahead prediction horizons: the computational complexity issues related to the physical model are overcome.
The trends of the control variable T s 0 and of the five load supply temperatures obtained by executing the PI-GRU-based NMPC for a whole day are reported in Fig. 10.As visible from Fig. 10(a), the heating station supply temperature never exceeds its bounds.Note that when the electrical price reaches its peak, as shown in Fig. 9(b), the NMPC decreases the heating station supply temperature.On the other hand, T s 0 is raised when the electrical price reaches its minimum.This predictive ability enables to reduce the production costs, as the DHS network is charged by raising the supply temperature when convenient.Moreover, given that the PI-GRU embeds the network thermal dynamics, the NMPC is able to optimize the DHS operations while also ensuring that thermal loads are always supplied with water at temperature within prescribed limits, despite their distance from the heating station, as evident from Fig. 10(b).

VI. CONCLUSION
A novel data-based modeling methodology and optimal control strategy are proposed for DHSs.In fact, they typically involve large-scale nonlinear dynamical models, which are not suited for online optimization-based strategies.On the other hand, DHSs are characterized by significant amounts of historical data, which can be leveraged to identify computationally efficient data-based models, e.g., through the use of RNNs.This work first proposes a novel modeling approach where the potential of RNNs is combined with a commonly known physical information in DHSs, i.e., the DHS network topology, leading to the design of PI-RNN models.It is shown that interconnecting multiple RNNs by resembling the DHS network topology leads to significant improvements in terms of faster training procedures, higher identification accuracy, and reduced modeling complexity, with respect to pure black-box RNN methods.The developed PI-RNN model is leveraged for the design of an NMPC regulator, able to minimize production costs, increase system efficiency, and respect operational constraints over the whole DHS network.The proposed PI-RNN-based NMPC strategy enables to optimize the DHS with a prediction horizon of few hours and reduced computational times, obtaining enhanced performances with respect to the standard RNN-based NMPC and a rule-based control strategy.The proposed methods are tested on a DHS benchmark referenced in the literature, i.e., the AROMA DHS, implemented in simulation using the Modelica environment, achieving promising results both from the modeling and control perspective.
The proposed PI-RNNs approach can be actually applied to other types of networked systems where the topology of physical interactions among subsystems is known, such as industrial plants, electrical and gas grids, or biological systems.Thus, future-related works regard the development of a methodological approach to design physics-informed databased models of networked systems, which can be easily extended to other applications.Additionally, the development of a physics-informed closed-loop observer could be carried out, given its key role when dealing with RNN-based regulators.Moreover, it is worth investigating lifelong learning algorithms for physics-informed models, able to adapt, through a continuous information acquisition, existing models to exogenous changes, e.g., given by a variation in the system interconnections topology.

Fig. 1 .
Fig. 1.Schematic representation of a DHS, interconnecting the heating station and the thermal load at node α i (encircled with a dotted line).

Fig. 2 .
Fig. 2. (a) Schematic representation of the AROMA DHS [4].(b) Graph representation of the AROMA DHS, with load nodes highlighted in yellow.

Fig. 3 .
Fig. 3. (a) Schematic representation of the ith DHS section comprising the ith thermal load and the supply pipe(s) entering node α i (dotted shadow area).(b) ith load-associated RNN having inputs and outputs paired with the ones of the corresponding ith DHS section.

Fig. 4 .
Fig. 4. (a) Schematic representation of the return network section (dotted shadow area).(b) Return-associated RNN having inputs and outputs paired with the ones of the corresponding DHS section.

Fig. 5 .
Fig. 5. (a) AROMA DHS reduced graph: nodes containing loads are highlighted in yellow.(b) AROMA DHS PI-RNN architecture: inputs are depicted in purple, disturbances in green, and outputs in orange.

Fig. 6 .
Fig.6.Comparison between the FIT trend of a standard GRU (orange) and of a PI-GRU (yellow) over the training procedure, both having 54 states and trained with a 15 690-sample dataset.The 80% FIT is depicted in dotted black.

Fig. 10 .
Fig. 10.NMPC results over a daily optimization, for the AROMA DHS.Constraints are depicted in black solid lines.(a) Optimized supply temperature at the heating station.(b) Loads supply temperatures: the first load's supply temperature is represented in blue, in purple the second's, in orange the third's, in green the fourth's, and in yellow the fifth's.

TABLE III COMPARISON
BETWEEN THE PERFORMANCE OF A STANDARD GRU ANDA PI-GRU, HAVING 90 AND 30 STATES, TRAINED WITH 15 690 SAM-

TABLE V NMPC
CONTROL PARAMETERS Fig. 9. NMPC inputs.(a)Thermaldemands: the first load's thermal demand is depicted in blue, in purple the second's, in orange the third's, in green the and in yellow the fifth's.(b)Dailyelectricalprice c el profile.tractableproblems.In fact, note that the 30-state PI-GRU model is characterized, by considering input, output and state variables, by n v = 53 variables, whereas the 54-state standard GRU is characterized by n v = 77 variables.By contrast, the AROMA DHS physical model is characterized, overall, by n v = 882 variables.1Therefore,the amount of optimization variables that an NMPC regulator has to manage is n v • N , where, in case of data-based models is a tractable number, whereas in case of physical models is clearly intractable.