A Graph Theoretical Approach to Modeling of District Energy Networks

Simulation of thermal dynamics in city-scale district energy grids often becomes computationally prohibitive for long simulation runs. Most current model order reduction methods offer limited interpretability with regards to the nonreduced system and are not in general applicable for varying flow rates, multiple producers, or changing flow directions. This article presents a graph perspective on modeling of district energy networks. Based on spectral graph theory, a novel method for model order reduction of district energy systems is proposed. The method approximates the solution of an optimization problem, minimizing the coefficients of the local truncation error for the advection equation. Furthermore, a method for calculating the intracluster temperature distribution is presented. It is shown that the method can be used to reduce the thermal dynamic model of a city-scale energy grid, resulting in a coarser temporal and spatial resolution, with a significant decrease in simulation time. The relative root-mean-square error (rRMSE) was 2.5% for the temperature in the evaluation scenario, comparing the reduced-order system with the nonreduced system at the instances of the coarser time step.

due to the computational demands of dynamic simulation [3].The computational performance is highly dependent on the scale of the problem and the spatial and temporal resolution.The scale of the problem is usually given beforehand from the size of the network, but the resolution of interest depends on the use cases.The resolution of interest can also vary within a specific use case.It would be ideal to find methods that can adapt the spatial and time resolution of the simulation between simulation runs or even dynamically during simulation.
Existing methods to reduce the model can be roughly split into four categories.The first category consists of methods to simplify the model by merging nodes and pipes in a structured manner based on the physics of district energy grids.The most popular methods are the Danish and German methods [4].Both methods are limited in that they do not preserve the original structure of the grid.Thus, the states are not directly interpretable with regard to the original system.
The methods cannot handle changing flow directions or multiple production units.While these methods might seem overly simplistic, they are robust and easy to understand.Expanding the models, e.g., using different types of models, is straightforward for this kind of model reduction, and the models work well for a variety of use cases.
A second category consists of data-driven model order reduction techniques for dynamic systems in state-space form.Proper orthogonal decomposition (POD) is a popular method that is relatively easy to implement and has straightforward nonlinear extensions [5].Since the methods are data driven, they depend on measurements or synthetic data of the process.For the district energy process, measured data are scarce, and it is, in many cases, not possible to generate appropriate synthetic data for any possible scenario.
A third category consists of methods for computational fluid dynamics (CFD), e.g., various multigrid methods [6].These methods are mainly focused on higher order models and higher accuracy than in this article and are not generally applicable.
Within the control community, a fourth category of methods based on Hankel model order reduction and balanced truncation [7] are popular choices and, indeed, do provide good, and under some circumstances optimal, approximation properties.Most of these methods are based on linear-and time-invariant models, which are not generally valid for the thermal dynamics of pipe flow with varying flow rates.The methods are projection-based and hard to interpret with regard to the original coordinates.This article is structured as follows.First, the mathematical background for the differential equations of heat transport and heat losses in a district energy pipe is presented, followed by an introduction to the graph theory used in this article.Next, modeling the pressure and flow in a district energy grid is presented.It is then shown how advection on a graph can be modeled as a linear time-varying state-space system and how reduced order models of these state-space systems can be performed.
In Section III, this article proposes a novel method based on graph theory and the partial differential equations (PDEs) of advection with corresponding discretization methods, posing this as an optimization problem, where the spatial resolution of the grid is optimized with regard to the time resolution of interest, minimizing the coefficients of the local truncation error.To avoid falling into the NP-hardness trap common for algorithms on graphs, the solution to the optimization problem is approximated by solving a generalized eigenvalue problem using the graph Laplacian(s).While there are generic algorithms for clustering (sometimes called sparsening or coarsening of graphs) [8], these cannot be directly applied to a district energy grid without considering the underlying physics.The pressure and flow calculations on a district energy graph are included with appropriate references.Heat losses are treated as static within clusters, and a novel method of treating the distribution of intracluster temperatures is presented.
In the results section, the performance of the clustering is evaluated and compared with a model of a city-scale grid, including the accuracy and improvement in computational performance for different numbers of clusters.
The scientific contributions of this article can be summarized as follows.
1) Showing how to find the reduced-order model can be formulated as an optimization problem that minimizes the local truncation error of the discretization for the chosen time resolution.2) Introducing a method to calculate the temperature distribution within the clusters.3) Showing how k-means clustering corresponds to minimizing the average commute time within the clusters for this application.4) Providing heuristics for optimization of the number of clusters, with a lower bound on the accuracy, that gives a higher bound of the number of eigenvalues needed to calculate using iterative solvers.5) Showing how the eigenvalues of the specified graph Laplacians are connected to eigenvalues of the system matrix of the state-space system describing advection on a graph.

A. Assumptions and Delimitations
For the scope of this article, the problem is limited to a temporal resolution ranging from seconds to hours.The pressure wave propagation velocity in a district heating pipe is ∼1500 m/s, while under normal operating conditions, the fluid velocity is between 1 and 5 m/s.Under these conditions, the pressure dynamics can be modeled as static, and the water can be assumed incompressible.
The piping is assumed to be insulated and installed as single pipes in the ground rather than the supply and return carrier pipes within the same insulated jacket pipe.While this allows for some simplification, it is mainly chosen to narrow the scope of this article somewhat, and most results are valid regardless of the choice of piping [9].The surrounding temperature is assumed to be uniform for the whole grid.
For the scope of this article, the production and consumption of all producers and consumers are assumed to be known and are seen as inputs to the model.In order to simulate the return piping, the return temperature needs to be known as well.
The clustering results in an approximately optimal discretization for a set of flow rates, not necessarily corresponding to an actual operating point.For operating points outside of the optimized set of flow rates, the truncation error will be larger, or the step size will need to be reduced.

B. Thermal Dynamics of Pipe Flow
The thermal dynamics of a district heating pipe can be described by a 1-D PDE, where axial diffusion, pressure losses, dissipation, and wall friction have been shown to have a negligible impact on the temperature for the operational ranges of district heating and are thus neglected [10].The remaining PDE is where x is the temperature, t is the time, and z is the axial direction.ρ is the density, c p is the heat capacity, and s is the pipe's cross-sectional area, where all three are modeled as constant.ϕ(z, t) is the heat loss to the surroundings, and v(t) is the flow velocity in the axial direction.Heat losses ϕ(z, t) in district heating pipes is another wellresearched problem [10].For the heat transfer coefficient, most papers refer to the results of [11].For a single insulated pipe in the ground, the heat losses per unit length can be approximated by where λ g is the conductivity in the ground, x o is the temperature of the surroundings, and h 1 is a function of the depth underground where the pipe is located H , the radius of the pipe jacket (including insulation) r 0 and the dimensionless parameter Here, λ i is the conductivity of the insulation, and r i is the radius of the carrier pipe.Suggested approximate formulas for calculating h 1 can be found in [11].
To solve the PDE in (1) numerically, the pipe is discretized along the pipe length.In this article, a finite volume method and an upwind discretization scheme are used [12], a common choice for modeling heat transport in district energy systems.
Using subscripts for spatial steps, the balance equation for each finite volume of the pipe can be written as follows: where k ϕi is the heat transfer coefficient for each volume.
A well-known necessary condition for the stability of explicit time integration schemes, when the model is discretized in both space and time, is that the volume of flow passing through a volume during a time step of the solver does not exceed the size of the volume.This is known as the Courant-Friedrichs-Lewy (CFL) condition and can be written as follows: where t is the time step and the left-hand side is called the Courant number C.Even if the ODE is not explicitly discretized in the time domain, it is helpful to perform the analysis on the discretized system since the CFL condition is, in most cases, the greatest limiter for the step size.
Since the flow velocity in a circular pipe is v(t) = ṁ(t)/(ρ A) for incompressible flow and z = m/(ρ A) we can rewrite (5) as follows: where τ is the transport delay through the pipe segment.The Courant number also relates to the numerical accuracy of the discretization.A truncation error analysis [13] using a Taylor expansion gives that the local truncation error for a finite volume upwind discretization is approximately For C = 1, the truncation error due to discretization vanishes.Notably, the expression for the truncation error is similar to the diffusion equation, and the phenomenon is hence called numerical diffusion.An illustration of the effect of numerical diffusion can be seen from Fig. 1, where a step-shaped temperature drop travels through a pipe.The smoothing effect of the numerical diffusion can be seen at the outlet of the pipe.While many methods exist to minimize numerical diffusion and choose an optimal time step to achieve better accuracy, for practical applications in district energy simulation, it is usually sufficient to find a discretization with a reasonable trade-off between accuracy and stability for the given time resolution.The step size of the solver is not always available or desirable to control, instead one of the two approaches.
1) Discretize so that the model is stable for all flows with the desired step size, and accept the higher truncation error for the lower flow rates.2) Discretize so that the truncation error is minimized for the lowest flow rate and use shorter step sizes for higher flow rates is usually chosen, where further analysis is not dependent on the particular choice.

C. District Energy Grid as a Graph
The district energy grid consists of a network of parallel connected pipes and to express these interconnections, using graph theory is a natural choice.In Fig. 2, a schematic of a minimal district energy grid with three consumers, and one producer is shown.The shaded circles represent nodesconnection points for junctions, consumers, or producers.The lines between the nodes are called edges and represent pipes.This grid and other small grids will be used as an example in the theoretical part, whereas a city-scale grid will be used in the results section to show the method's applicability.
The flows are assumed as balanced so that the supply flow rate equals the return flow rate for each node, with opposite flow directions, i.e., the most common setup for district energy networks.A simplified schematic of the same grid can be seen from Fig. 3, where the labels n i and e i are added for nodes and edges, and a flow direction is included, denoted by arrows.Specifying the flow direction is needed when simulating advection, but the flow direction can change during simulation.The grid can be represented by an incidence matrix of dimension n × m, where the columns represent the edges and rows represent the nodes if node n j is the target of e i − √ w i j , if node n j is the source of e i 0, elsewhere where w i, j is some weight that is specific to the application.
Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.Consumers and producers are not included in the incidence matrix in this article.When the weights are chosen as w i j = 1, the incidence matrix is unweighted.For the example grid, the unweighted incidence matrix is The matrix L w = M w M ⊤ w is called the weighted graph Laplacian, which is undirected, symmetric, and positive definite.The diagonal matrix with the diagonal elements from the weighted graph Laplacian is the degree matrix D w and W = L w − D w is the weighted adjacency matrix.For unweighted matrices and vectors, the subscript is dropped in the notation.
A strongly connected component (SCC) is a subgraph where any node can be reached from every other node, corresponding to loops in the grid.An example of a grid with one loop can be seen from Fig. 4, where all nodes belonging to the loop are one SCC.Any node that is not belonging to a larger SCC is considered an SCC on its own, corresponding to the single node outside the loop in Fig. 4.
The corresponding incidence matrix for the grid with one loop is and the Laplacian matrix is The graph Laplacian has positive values on the diagonal and negative on the antidiagonal.The example Laplacian matrices seem to have a band-like structure, which is in general the case for district energy systems.Any graph, including graphs with loops, can be sorted in a block structure of SCCs, as in (12), where the first and last rows of each block overlaps It is from here assumed that the graph is sorted in a block-diagonal order as in (12).The graph Laplacian is positive-semidefinite, and its eigenvalues are real and nonnegative with the number of zero eigenvalues corresponding to the connected components of the graph-in the context of district energy grids, this corresponds to the first eigenvalue being 0.
In this article, several different weighted and unweighted graphs Laplacians and modified Laplacians will be used.A symmetric weighted Laplacian L w has the property that for any vector f, it holds that When the graph Laplacian is scaled with a diagonal matrix D d , so that L s = D −1 d L w it is no longer symmetric due to the diagonal matrix scaling the rows.However, L s has the same eigenvalues as a symmetric Laplacian For the vector f, the equivalent to (13) becomes The eigenvectors of L s and L sym are similar, only scaled so that for any eigenvector v s of L w , the corresponding eigenvector v sym of L sym is To calculate the eigenvalues of L s = D −1 d L w , the generalized eigenvalue problem is solved.
Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.

D. Pressure and Flow on a Graph
The pressure-flow problem is well researched, mainly in the context of water distribution networks [14].A district heating network is primarily operating in the turbulent flow regime except for low consumption scenarios, where the corresponding pressure drops of the pipes are also low [10].It is, thus, a reasonable approach to model the pressure-flow equations as if the flow was turbulent.
The pressure drop for a pipe can be modeled as a quadratic function of the volumetric flow rate through the pipe u j = ṁ j /ρ, where the head loss h j in meters is and the friction factor r and the empirical exponent γ depend on the friction model used.Index j is from the enumeration of the pipe section.
There are many ways to calculate the friction factor r j using different empiric models [15].For a district energy network, the available measurements are few.Hence, it is generally preferred to have fewer independent parameters.For this article, the Hazen-Williams equation r j = 10.67 l j ξ γ j d 4.8704 (18) is used, with the exponent γ = 1.852,where l j is the length of the pipe section in meters, d is the diameter of the pipe section in meters, and ξ j is a pipe roughness coefficient that can be estimated from tabular values.While, e.g., the Darcy-Weisbach equation is known to be more accurate, it is significantly harder to parameterize.
The pressure heads and volumetric flow rates of the network in vector form are h = [h 1 , . . ., h l ] and q = [q 1 , . . ., q l ], respectively.The mass balance of the network is directly given by the incidence matrix and the vectors of inflow and outflow to and from each node q in , q out , that are assumed to be known Equation ( 17) can be rewritten using the diagonal matrix G, where G ii = r i |q i | (γ −1) , using the fact that all head losses when traversing the graph sum to 0 and, once again, using the incidence matrix The problem is underdetermined; to solve it, the pressure must be known for at least one of the nodes.For most district energy scenarios, the pressure heads can be assumed to be known at nodes corresponding to pumps or reservoirs, where the vector of known heads is denoted by h 0 .
Partitioning the incidence matrix M so that the rows corresponding to unknown pressures are in M 1 and the rows corresponding to known pressures in M 2 , ( 19) and ( 20) can be combined and written as the nonlinear equation system that can be solved iteratively using gradient descent methods as shown in [14].
Generally, the method converges quickly, and the initial guess does not need to obey the rules of mass conservation.The calculations can be efficiently calculated using well-developed linear algebra methods and are calculated at every time step.Generally, there is no need to perform model reduction for the pressure-flow calculations.More complex pressure and flow calculations on matrix form are available, including, e.g., pump characteristics and consumers not meeting their flow demand.When modeling by hand, these models quickly become error prone, and it is the author's advice to interface well-validated models instead.

E. Advection on a Graph
For the model reduction performed for the advection model, two different weighted Laplacian matrices are used-the flow Laplacian L f = M f M ⊤ f , where the weights correspond to the square root of the mass flow rate through each pipe, m where the weights correspond to the square root of the mass of water within each pipe, w i j = √ m i j .The flow rates used as weights w i j are chosen according to the principles outlined in Section II-B.The mass matrix M d is the diagonal matrix The graph itself can be interpreted as a finite volume discretization, where the edges represent the mass flow rate, and the nodes represent the masses of the volumes.For most use cases, this is a very coarse discretization, given that there can be long stretches of piping without any consumers.The problem can be overcome by introducing intermediate nodes along the pipes, which can be interpreted as an oversampling of the graph.The oversampling consists of adding partitions to the incidence matrix so that the oversampled incidence matrix becomes Advection on graphs for a more general case is described in [16].The thermal dynamics of the supply and return pipes are modeled separately.To model advection, the modified incidence matrix and the flow matrix Q d = diag([q 1 , . . ., q m ]), where q i is the flow on edge e i , are introduced.Q do is the diagonal matrix of flows exiting a node from the supply to the return pipe or vice versa.Q di is the diagonal matrix of flows entering each Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.
node and x in the temperature of the flows entering each node.Equation ( 4) can then be written in matrix form as follows: Equation ( 25) can be recognized as a time-varying state space model that can be written in the form with where • denotes the Hadamard product.The matrix M −1 d A(t) is hereafter called the advection Laplacian L adv (t) and is nonsymmetric.For the discretized version, the corresponding advection equation in state-space form is where with the corresponding discrete advection Laplacian where the Courant numbers are the diagonal elements of L adv (k).For the spectral clustering method, the system Laplacian is introduced, that is a symmetric version of the discrete advection Laplacian specified at a set of flow rates.It can be expected that the system Laplacian L s and the discrete advection Laplacian L adv (k) share some common traits since the advection Laplacian is a directed version of the same graph, and the diagonal elements are equal (L s ) ii = (L adv (k)) ii for the same set of flow rates.The system Laplacian has real eigenvalues, whereas the advection Laplacian can have complex eigenvalues.By the Gershgorin circle theorem, the eigenvalues of both L s and L adv (k) are located within disks centered at each diagonal element, that is C i with the radius |C i |, that is the absolute sum of the row elements.This is given by the fact that both matrices are row stochastic.Hence, while not similar, the eigenvalues are located within equal Gershgorin disks, centered at C i with the radius |C i |.
For a sufficiently slowly varying discrete system, a wellknown necessary stability condition [17] is that all eigenvalues of the system matrix are located inside the unit circle.For the advection equation, this means that For a straight pipe or a tree-structured grid without any loops, L adv (k) can be sorted so that it is lower triangular.The eigenvalues of A(k) are hence precisely the diagonal elements (A(k)) ii = 1 − C i .For A(k) to have all eigenvalues within the unit circle, it is thus sufficient that C i < 2. For a more complex grid containing loops, the eigenvalues of the discrete advection Laplacian are bounded to lie within the Gershgorin disks centered at C i with radius |C i |.To guarantee that the real part of the eigenvalues of A(k) are < 1 requires that all C i < 1, that is exactly the CFL condition.Another way to interpret the apparent discrepancy between the CFL condition and stability of the state-space system is by using pseudospectral methods, such as in [13].

F. Low-Rank Approximations by Clustering
Low-rank approximations can be used to improve the computational performance of the previously introduced model.A low-rank approximation of the state space system means that the calculations can be performed in a lower dimensional space R k with k < n, where n is the number of states (nodes in the graph).The resulting number of edges from the reduction is l.This corresponds to contracting the nodes and removing the intracluster edges, as shown in Fig. 5.
A desirable trait of aggregating rather than using projection-based methods is that including and interfacing with other models is straightforward.As shown in Fig. 5, consumers and producers are reconnected to the aggregated node.The flows to and from each node that is aggregated are summed.The total energy in the grid is conserved, and the cluster's temperature corresponds to the mean temperature of the total water volume of the cluster.These are likely some of the reasons for the popularity of aggregation-based model reduction for district energy grids [4].
The reduction matrices P n ∈ R k×n , P e ∈ R m×l calculated from a given clustering serve two separate purposes, namely: 1) merging the nodes of each cluster and 2) removing the edges within each cluster, where and The reduced-order system matrices in (36) can then be computed.Since L = MM ⊤ , P e P ⊤ e = I l , the reduced Laplacian L r can be calculated as follows: For the reduced order state space model, using the notation from (25) and dropping the time index for more compact notation, the reduced versions of the respective vectors and matrices becomes For energy conservation, the reduced-order nodes must contain the same energy as the nodes of the full model.Hence, the reduction is scaled by the mass of the nodes xi = j∈C i m j x j j∈C i m j or in matrix form With the introduced notation, the reduced-order state-space system can be written as follows: where Since y(t) is the lifted state vector in R n , there is a direct interpretation with regards to the states of the original graph, where the lifted states are the averages corresponding to the reduced states.
The method is a singular perturbation model order reduction method for the advection equation, as described in [18].Once the clusters are chosen, one of the nodes of each cluster is chosen as the contraction node n c .The advection dynamics for each node in the cluster is For the contraction, it is assumed that there are no intracluster dynamics.This corresponds to the setting for all i ̸ = c in the cluster.The result is a system of equations That is fulfilled only when The reduced-order model shares the properties of the singular perturbation method; most importantly that the estimation error for advection is 0 at steady state.

G. Heat Losses on a Clustered Graph
The heat loss in (4) and the corresponding low-rank approximation consider the average energy in the cluster.To reach a lower approximation error for larger clusters, the intracluster heat distribution can be taken into account, where the heat distribution is calculated for each point of evaluation.
Adding the heat loss term, discretized in the axial direction, to the discrete advection equation, A(k) and B(k) from (29) are modified so that and the input vector becomes where x o (k) is the temperature of the surroundings, k ϕ is the vector of heat transfer coefficients k ϕi , and K ϕd is a diagonal matrix with k ϕ as diagonal entries.The clustering process and corresponding reduction matrices are unchanged when the heat transfer is included.The intracluster static heat distribution does not interact with the advection dynamics, and is included as a correction term ỹ(k) on the output equation Rewriting (2) as follows: Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.
and integrating along the pipe length gives the solution where l is the pipe length.Notably, there is no flow rate dependency on the static heat loss, which is shown in [10] to be valid for the operating conditions of a district energy grid considered in this article.
For each cluster, this can be put in matrix form using a heat loss incidence matrix M ϕ , where the index for indicating the cluster is dropped in this section to lighten the notation somewhat.First, all loops must be removed by cutting each loop so that all nodes are reachable starting from a root node since the temperature drops do not sum to 0 in a loop.The remaining graph is called an arborescence of n nodes and n − 1 edges and can be calculated, e.g., with the algorithm given in [19].
For each arborescence cluster, inserting (48) gives where Equation (48) can then be written in matrix form for one cluster as follows: The equation system is underdetermined, which is solved by adding the energy balance criteria given that the mean temperature of the cluster is known from the advection dynamics so that The method calculates the temperature distribution within the cluster, and the mean temperature is unchanged.The procedure can be illustrated by performing the operations on the graph shown in Fig. 6 with six nodes and two clusters.The arborescence incidence matrix has the dashed lines removed and becomes Fig. 6.Arborescence clusters for calculation of the heat loss.Fig. 7. Heat losses δx i with a local heat addition x 2 at the node of the producer.
The heat loss incidence matrices are Summing up, the equation system to solve at every point of evaluation for the example grid is and the right-hand side Incoming flows to nodes in the cluster, such as producers on the supply side, will cause a local change in temperature that changes the temperature balance, as shown in Fig. 7.The temperature rise on the incoming nodes can be approximated by assuming that perfect mixing occurs at the node so that Let the temperature deviation for nodes with incoming flow be x i , and the indicator matrix Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.
Equation (52) can then be written as follows: where the indices denote cluster.Since the intracluster heat losses are independent of adjacent clusters, the equations can be solved separately for each cluster or, alternatively, solved as one large linear system.The system scales well as it is only calculated at the point of evaluation.Finally, for each step, the correction term is

III. SPECTRAL GRAPH CLUSTERING
Coarsening the spatial discretization of the grid corresponds to contracting nodes by adding the masses of the contracted nodes and removing the edges connecting the contracted nodes.The coarsening can be treated as a graph clustering problem-the original nodes of the graph should be assigned to the clusters N i , i ∈ [1, . . ., k], where the temperature is averaged within each cluster, and the intracluster edges are removed.
The clustering of the reduced grid should minimize the truncation error for the advection equation with some stability margin with regard to the CFL condition.Without any assumptions on the temperatures in the grid, this corresponds to minimizing the coefficients, i.e., minimizing ∥1 − C∥.There are two optimization variables, the number of clusters k and the clustering indicator matrix where Finding H k , k that minimizes ∥1 − C∥ can then be formulated as the constrained optimization problem min where t is dropped since it does not affect the optimization problem.For each indicator vector h i , the structure of the Laplacian and the mass matrix gives that since the expression is 0 within the cluster i, j ∈ N i and w i j = ṁi j if i, j are in separate clusters.The expression in ( 64) is called the cut, denoted by cut(N i , N j ), where the cuts coincide with the diagonal elements ii and for the mass matrix 62) in a more compact form and including the analysis above gives that min The minima of the problem is given by and for a given k, this is also the minima of ∥1 − C∥.Hence, the optimization problem can be structured as two separate optimization problems that can be solved sequentially.One interpretation of the minimization is the graph-cut analogy used in [20].For a cut in a graph G = (N , E) consisting of nodes N and edges E, the weight of the cut is defined as the sum of the weights of the edges separating Nodes are weighted with m i , and thus, the total mass of a cluster is As an illustration of the clustering problem for advection on a graph, it is clarifying to look at the clustering of the graph to two clusters N 1 , N 2 , i.e., the cut is a bisection.Starting from the previously introduced graph of a district energy grid, edge weights corresponding to the maximum flow rate through each pipe and node weights corresponding to the mass of each node are added, as can be seen from Fig. 8.
Finding the smallest weight that bisects the graph is called the minimum cut problem.The minimum cut is trivial in this case, as shown in Fig. 8 where edge and node weights are added to the graph, corresponding to the maximum flow rate through each pipe and the mass of each node.The minimum cut is not optimal-separating the graph as one small cluster containing a single node and one large cluster of nodes would end up with two largely different Courant numbers so that the simulation either ends up with a large truncation error or becomes unstable, depending on the step size.In the example, the sum Courant number is C 1 + C 2 = t (0.1/15) + t (0.1/1) ≈ 0.11 t.
In the graph, the optimal cut is along the weight, i.e., 0.3, giving a sum of Courant numbers C 1 + C 2 ≈ 0.08 t, and is Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.shown in Fig. 9.With the introduced notation, this corresponds to a balanced cut problem minimizing the expression The problem in (66) is, unfortunately, a nonconvex combinatorial optimization problem that even for each k is known to be NP-hard [21].However, the optimal solution can be approximated with proper relaxation of the problem.
Assume that t can be chosen arbitrarily so that the system can always be stabilized for all C i > 0. The n × k projection matrices V k = [v 1 , . . ., v k ] are introduced, where the vectors v i are mutually orthogonal and can take any real value.The relaxed optimization problem can then be written in a more compact form as follows: min Equation ( 70) is known as the generalized Rayleigh quotient.For the bisection of the graph into two separate clusters so that the H k would be reduced to only two indicator vectors, there is an optimal solution for the relaxed problem.The solution is given by the Rayleigh-Ritz theorem, stating that the minimum is given by choosing v 2 as the second eigenvector of the generalized eigenvalue problem L f v = λM d v, where the eigenvalues are sorted in increasing order.The first generalized eigenvector is trivially v 1 = 1/ √ n with the corresponding eigenvalue 0. The second eigenvector v 2 called the Fiedler vector, which is orthogonal to v 1 , solves the minimization problem.The eigenvalues are related to the Rayleigh quotient as follows: The second eigenvector is not an indicator vector unless the two clusters are disconnected, a case not considered in this article.The most basic solution for bisection is to assign all positive values of v 2 to one cluster and all negative to the other, and then assign the clusters as The bisection can be performed recursively until some stop criterium is met.In that case, the number of clusters k is simultaneously optimized, and the stopping criterium can, e.g., be based on the second eigenvalue with a well-defined physical explanation as the sum of Courant numbers between the two clusters.However, recursive bisection approximates an optimal solution to a subproblem recursively, rather than approximating the optimal solution to the global problem.Empirically recursive bisection performs worse than optimizing all clusters at once for the use cases employed in this article, compared with an optimal solution found by an exhaustive search.
For k clusters, the optimal solution of (70), but not necessarily to the nonrelaxed problem, is given by k first eigenvectors in matrix form as before V k [21].The clusters can no longer be assigned based on the sign of the eigenvectors.Instead, k-means clustering is usually performed on the rows of V k .The nodes are treated as points y i in the k-dimensional space spanned by the rows of V k .The k-means algorithm solves the minimization problem min where µ i is the mean of the nodes in the new coordinate system, spanned by the eigenvectors.

A. Commute Time Perspective on k-Means
The k-means part of the clustering is often presented without any further analysis, but the commute distance can describe how this makes sense for advection.The commute distance c i j on a graph is the distance for a random walk from node i to node j and back.It is closely related to the shortest path but includes all possible paths.The metric relates to the Courant number for this specific application as follows: The computation of the commute time uses the Moore-Penrose pseudoinverse of the Laplacian L † s .Since the Laplacian can be decomposed as L s = V V ⊤ using its eigenvectors the pseudoinverse is The eigenvectors are the same, but the eigenvalues are inversed and, therefore, reversed in order so that λ † i = (1/λ i ) except for the zero eigenvalue that is disregarded for the pseudoinverse.The commute time between vertices is given by where y i is the ith row of the matrix of eigenvectors V and n is the number of vertices.Thus, the expression that is minimized in (72) would be exactly the mean commute distance from the mean µ i if all eigenvectors V would have been used.Since L s is a normal matrix, the eigenvectors are also singular vectors, and the eigenvalues are λ i = √ σ i .For any stable system, all eigenvalues of L s are strictly positive.In this sense, the relaxed spectral clustering can be seen as an SVD low-rank approximation of the commute time Laplacian.

B. Choosing the Number of Clusters
Finding the optimal k presents a dilemma-searching and evaluating a high number of clusters implies calculating many eigenvalues, which defeats one of the purposes of spectral methods and computational efficiency.Spectral clustering is mainly used to find natural clusters in data, and heuristics used for this purpose, such as the spectral eigengap [21] does not apply in general to the use case in this article.
With the results presented in this section, some useful bounds for a sufficient number of eigenvalues are derived to limit the search space.First, we note that the eigenvalues of the generalized eigenvalue problem Lv = λM d v are the same as the eigenvalues of the matrix M −1 d L. We introduce the notation: λ k is eigenvalue k of the unreduced system Laplacian L s .λk is the largest eigenvalue at index k of the relaxed reduced system Laplacian where V k is the matrix with the k first eigenvectors as columns.λk is the largest eigenvalue at index k of the clustered reduced Laplacian and Ck is the maximal Courant number of the clustered reduced system.The Courant numbers of the clustered reduced system are the diagonal entries of the clustered reduced Laplacian We then have the main result that the eigenvalues and the Courant numbers are connected in the sense that 2 Ck ≥ λk ≥ λk > λ k . (79) The first inequality follows from combining the Schur-Horn theorem with the Gershgorin circle theorem.The Schur-Horn theorem states that for a Hermitian matrix where the eigenvalues λ i and diagonal entries d i are sorted in nonincreasing order.It follows that the diagonal entry of the largest eigenvalue is larger or equal to the largest diagonal entry.The Gershgorin circle theorem reads that for a square matrix with the entries a i j i.e., the eigenvalues λ i are located within a disk centered at the diagonal entry with radius Since the Laplacian is row stochastic, the radius is always Algorithm 1 Spectral Graph Clustering of District Energy Network The second inequality follows directly from the Rayleigh-Ritz theorem that V k minimizes the Rayleigh quotient for k eigenvectors and thus provides a lower bound on the eigenvalue.The last inequality follows from the generalized version of the Poincaré separation theorem [22] that states that for a real symmetric matrix L sym and a (semi-)orthogonal matrix V k , for the eigenvalues λi of ( The results imply that if k eigenvalues are used for clustering and the generalized eigenvalue λ k > 2, and then, the reduced order system violates the CFL condition.It is, thus, sufficient for an iterative eigenvalue solver to calculate all eigenvalues λ i < 2C max to find the best stable solution.For an iterative solver, this can serve as a stopping criterium.
The clustering problem is nonconvex, and optimizing k might end up in a local minimum.For large networks and many clusters, the global optimum can be found by restarting the optimization with different starting values.For smaller networks or a small number of clusters, the optimization can be replaced by a sweep evaluating all possible k.Since no derivative is available for the combinatorial problem, gradientfree optimization methods such as Nelder-Mead can be used.
The clustering process is summarized in Algorithm 1.

C. Note on Numerical Methods
The computational bottlenecks for large grids in Algorithm 1 are the calculation of the eigenvalues and the memory requirements of the matrices and corresponding algorithms.However, only the first k eigenvalues and corresponding eigenvectors are needed, and the graph Laplacian matrix is sparse.Storing the graph Laplacian matrices in a sparse format and using iterative algorithms, such as the Lanczos algorithm, allows for efficient calculations of the first k eigenvalues and corresponding eigenvectors when the number of clusters is sufficiently small.For reference, on a standard laptop with an eighth generation i7 Intel processor and 16 GB of RAM, some elapsed CPU times for the clustering algorithm are presented in Table I.

IV. RESULTS
Evaluating a reduced order model has its pitfalls.For a cityscale grid, there is no ground truth available due to a lack Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply. of measurements-this is often the reason that a simulator is needed in the first place.Unknown and uncertain parameters affect the results, and to some extent, both production and consumption are driven by human behavioral patterns.

TABLE I ELAPSED CPU TIME FOR THE CLUSTERING ALGORITHM
A common way to evaluate a model order reduction is by comparing the reduced order model with a fine-grained model [23].Even then, it is not apparent what to compare.For example, an aggregation-based model averaging over several states can be compared with the average of the states of the fine-grained model or all of the states of the fine-grained model.Furthermore, all model order reduction methods can not interpret the reduced states with the original states in a sound manner.
In the end, the type of comparison is a user choice and should reflect the use case appropriately.The comparison in this article relies on the following assumptions.
1) There exists a fine-grained simulator that is verified with either measurements or some other well-verified simulator.2) All states are of equal interest.
3) The compared scenario should reflect realistic consumption patterns.4) The goal is to find a model suitable for running with a larger time step that, for the new sampling rate, emulates the fine-grained model.Based on these assumptions, all the states of the fine-grained model were compared with all the lifted states of the reduced-order model at the exact time instances of the lower sampling rate.The inputs used were smooth and slowly varying without any frequency content above 1/(2τ s ), where τ s is the sampling rate of the reduced-order model.The comparison setup is represented graphically in Fig. 10.
To achieve this, a fine-grained model of a subgrid of the district energy grid of Luleå, Sweden, where the topology is shown in Fig. 11, was generated and validated against models previously validated and developed in the Modelica language.The reference simulator was previously developed within the OPTi project [24].The project started in 2015 as a European Union Horizon 2020 project and was coordinated by Luleå University of Technology, Luleå.
Only the supply piping was included in the simulation, but the models are also valid for the return piping.While the advection equations, pressure drops, flow rates, and heat losses are similar on the return side, with an opposite flow direction, the return temperatures of the consumers need to be modeled to get realistic results for the return piping.
The models were automatically generated from geographical grid data and piping information, including the coordinates of each pipe section and the pipe diameters.The spatial discretization used in the reference model corresponded to each pipe being discretized with a uniform segment length z when the pipe length exceeds z.Elsewhere, the pipe length provided by data was used.The original model consisted of about 3700 pipe sections and 1500 consumers.After some initial preprocessing, where nodes connected by pipes shorter than 10 m were merged, the resulting model included 2715 pipes.
The reference model and reduced order models were generated as Julia code and ran using BDF solvers provided by the package [25].
The simulation scenario used was an increased outdoor temperature, with an increased demand of the consumers and a linearly increasing supply temperature.The demands of the consumers included morning and afternoon peaks, with random perturbations so that the demand pattern resembled what could be expected from real-life data.An example of the demand for 100 different consumers is shown in Fig. 12.The simulation time was set to 24 h, where the reference model used a time step of 5 s, and the reduced model used a time step of 600 s.While the model was optimized with for the time step, this was not necessarily the internal time step of the solver.The reduced model's speedup factor was  around 90, resulting in the reduced-order model running at about 4000 times the real-time speed using a standard laptop computer.To evaluate how close the reduced model tracked the reference model, the states (temperatures) x of the original model are compared with the lifted states P ⊤ n x, at each time step of the larger step size, using the rRMSE rRMSE = 100 1 x The reduced-order model was generated using the algorithm provided in Algorithm 1 and the procedure described in this article.An example of the clustering found by the algorithm for k = 150 clusters is seen in Fig. 13.The number of clusters was chosen so that C max ≈ 0.9 for the time step size and maximum flow rate, corresponding to about 150 clusters for the reduced model.
The first comparison used idealized pipe models only including the advection equation for thermal dynamics, so that (nonnumerical) diffusion and thermal losses were neglected.The simulation comparison resulted in an rRMSE of ≈2.0%.A heat map of the absolute error can be seen in Fig. 14, with more significant deviations following the peaks in heat load.From the truncation error analysis, it can be deduced that the reduced-order model does not necessarily have a larger truncation error, and the reduced model might be better in this regard.However, the dynamics of the more fine-grained model between the 600-s time steps are lost in the comparison, which might be of interest for specific use cases.
In the second comparison shown in Fig. 15, a more realistic model with heat loss to the outside was used, otherwise, using the same setup as in the preceding comparison.The resulting rRMSE was 2.7% for this case, and as expected, some individual nodes-typically for low-consumption consumers-were relatively far off due to averaging the temperature over the cluster.
In the third comparison shown in Fig. 16, the more realistic model with heat loss to the outside was compared with the model with the static intracluster heat losses implemented.The resulting rRMSE was 2.5% for this case, with the maximal error occurring during faster changes in the flow rate.
Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.

TABLE II ELAPSED CPU TIME AND RELATIVE RMSE FOR 24 h OF SIMULATION
In Table II, a comparison between the elapsed CPU time and the rRMSE for different choices for the number of clusters for 24 h of simulation time, compared at the instances of 600-s time steps, is shown.Notably, more clusters did not necessarily result in a better rRMSE for a given time step, consistent with the theoretical truncation error.

V. CONCLUSION
This article presents a graph theoretical approach for reduced-order modeling of district energy grids.The method preserves the total energy of the water in the system, and the states can be directly interpreted with the states of the original system.
As is, the method can be applied for scenario-based simulation within normal operational limits.For a wider range of flow rates, different clustering setups can be needed to achieve the desired accuracy, and for very low flow rates, the heat loss equations need to be modeled in more detail.For automatic control purposes, a more comprehensive treatment with regards to, e.g., observability and controllability might be needed.
Future work includes researching frequency-weighted error bounds since this is hypothesized to be one of the reasons for the relatively high accuracy, although it is a challenging problem for combinatorial optimization such as clustering.The aim is to implement and validate the model in a scenario such as a smaller scale district energy grid where there are more measurements that can be used for validation.Finally, using the model for model-based control is one of the main objectives for the future.He is currently a Professor of automatic control with the Luleå University of Technology, and a CTO of Predge AB, Luleå.He has a background in the development of active safety systems and process control applications for resource-efficient processes.In the active safety area of passenger cars, he has been working with systems for collision avoidance and mitigation, and driver state estimation and warning.In the area of process control, his research work has led to methods, tools, and control solutions which increase the resource efficiency for energy systems, iron and steel-making processes, and processes in the pulp and paper industry.He has coordinated several large international research projects and is a co-founder of several spin-off companies from research projects, like, for example, Sentiente+ and Predge AB, which provides cloud-based decision support solutions for predictive maintenance to the railway sector and process industry.His main research interests are control configuration selection methods under model uncertainties and the application of modeling, simulation, and control in cyber-physical and large-scale complex systems.
Dr. Birk has received the Henry Ford Technology Award, in 2007, for his work relating to the development of Volvo Cars Driver Alert.

Fig. 1 .
Fig. 1.Temperature at the inlet and outlet of a discretized pipe without heat losses showing the effects of numerical diffusion.

Fig. 2 .
Fig. 2. Graphical representation of junctions, pipes, consumers, and one producer, where the shaded areas are referred to as nodes in the graph.

Fig. 3 .Fig. 4 .
Fig. 3. Graph representation of a district energy grid with nodes weighted by n i and edges by e i .

Fig. 5 .
Fig. 5. Three nodes with masses m 1 , m 2 , and m 3 are contracted to one node with mass m1 = m 1 + m 2 + m 3 , and the intracluster flows are removed.

Fig. 8 .
Fig. 8. Graph with edge and node weights and a corresponding minimum cut.

Fig. 9 .
Fig.9.Balanced cut, corresponding to minimization of the sum of Courant numbers, where the node masses are summed to a cluster mass.

Fig. 11 .
Fig. 11.Topology of the grid used as a validation model, showing how the piping is geographically distributed.

Fig. 12 .
Fig. 12.Flow demand of 100 of the total of 1513 consumers for the simulation time of 24 h.

Fig. 13 .
Fig. 13.Example of clusters found by the spectral clustering algorithm.

Fig. 14 .Fig. 15 .
Fig. 14.Heat map of the absolute error for each node, with node number on the y-axis and time on the x-axis, for a simulation of the advection equation only.

Fig. 16 .
Fig.16.Heat map of the absolute error for each node, with node number on the y-axis, and time on the x-axis, for a simulation including the static heat distribution.

Johan
Simonsson received the M.Sc.degree in electrical engineering from the Uppsala University, Uppsala, Sweden, in 2007.He is currently pursuing the Ph.D. degree with Luleå University of Technology, Luleå, Sweden.He has a background as a technical consultant, specializing in modeling and control of industrial processes, at Optimation AB, Uppsala, where he has been working since 2007.His main research interests are modeling, simulation and control of large scale industrial systems, distributed and decentralized control, and graph theoretical methods.Khalid Tourkey Atta received the B.Sc. degree in electrical engineering and the master's degree in control and computer engineering from Baghdad University, Baghdad, Iraq, in 1997 and 2000, respectively, and the Ph.D. degree in automatic control from the Luleå University of Technology, Luleå, Sweden, in 2015.He is currently an Associate Professor in automatic control with the Luleå University of Technology.His research focused on optimizing and controlling complex and distributed processes like district heating, mineral processing, and hydropower plants using model-based and model-free control techniques.Wolfgang Birk received the M.Sc.degree in electrical engineering from the University of Saarland, Saarbrücken, Germany, in 1997, and the Ph.D. degree in automatic control from the Luleå University of Technology, Luleå, Sweden, in 2002.