Estimation of Operation Cost of Residential Multiple Energy System Considering Uncertainty of Loads and Renewable Energies

This paper introduces a new method of estimating the variation interval and the expected value of optimal economic operation cost for the energy hub (EH) model considering load and renewable energy uncertainties in the residential district. Firstly, renewable energy and load uncertainties are defined as uncertain sets of historical data. Due to dispatch factors, the traditional EH model is non-linear and difficult to calculate and analyze. In this paper, a multi-step modeling method is developed for the breakdown of a complex EH model (CEH) into several simple EH (SEH) models and variable substitutions in each SEH, to eliminate the non-linearity of CEH. The upper and lower bound of the operating costs are then obtained based on the duality theorem and the expected value is calculated by expected value model. Finally, the case study in Beijing residential district is carried out to confirm the efficiency of the proposed method.


I. INTRODUCTION
The worldwide energy use today depends mainly on fossil fuels, which causes considerable emissions and are also on the verge of exhaustion [1]- [3]. Countries around the world, therefore, try to promote a revolution in the energy sector by exploiting renewable energy and changing the way they consume traditional energy. Many researchers from different countries have concentrated on multi-energy planning in recent years and slowly reached a consensus on multiple energy system (MES) innovations [4]- [6]. The MES includes a number of energy systems, for example, electricity and natural gas, which greatly enhance renewable energy consumption. Nonetheless, a unified optimization measurement is difficult to perform due to the dynamic nature of the MES with high uncertainty [1], [7]. So, the optimal operating costs The associate editor coordinating the review of this manuscript and approving it for publication was Canbing Li. of MES must be calculated quickly and accurately in light of the uncertainty of loads and renewable energy sources.
A broad variety of MES models have been proposed for the analysis and modeling of MES optimization and can be divided into three main types: EH model [8], interconnected EHs model [9] and detailed multi-energy carrier model [10], which consider the transmission constraints between each component in the system. Different models are chosen to address the problems according to different research objects. The EH model, which the ETH Zurich community first proposed [11], is commonly chosen to optimize residential and district energy management. For example, the optimal electrical and thermal energy management of a residential EH is discussed in [12]. [13] presents the development of a generic optimized industrial load management model based on EH management systems, which demonstrates the effectiveness of both industrial customers (a flour mill and a water pumping facility).
The investment decision and optimum size of residential consumers are discussed in [14]- [18]. Multi-objective EH MES operational management is evaluated with the use of safety, economic and gas emission indicators in [19], [20].
The key challenge in the EH operational analysis is deciding how much energy should be distributed, converted and consumed. The dispatch factor is proposed [8], [9] to clarify this problem. Yet the resolution of the coupling matrix of the EH model is a non-linear problem because of the dispatch factor, which is difficult to quantify with global optimal results. Nowadays, many new electrical appliances, heating and cooling appliances, such as plug-in hybrid electric vehicles (PHEV) and inverter air conditioners, make it difficult to forecast the load variation in smart buildings [21]. Renewable energy, such as rooftop solar panels and building integrated wind turbines, also bring uncertainty to optimal calculation. An additional relevant subject is therefore the optimal calculation of the residential area, considering the volatility of load and renewable energy. The instability of the EH model is discussed in several papers. The main method can be classified into two types: the two-point estimation method [21]- [26] and the Monte Carlo simulation method [27]- [32]. The twopoint estimation method is an approximate method used to estimate the relative accuracy of the result. Yet this method does not take the extreme situation into account. The range of the uncertainty set of operation cost cannot be obtained, which cannot be applied for risk management or multiple energy system design. Through using several simulations, the Monte Carlo method can roughly estimate the region of operation cost. However, in reality, this approach could have two major obstacles: 1) accurate region estimation costs nearly a thousand times simulations, which require a considerable amount of time and energy. 2) There are no systematic methods in place to determine an appropriate function to describe renewable energy output uncertainty and descriptions of these variables are largely subjective.
To overcome these shortcomings, this paper suggests a new way of dealing with the uncertainty of loads and renewable energy and calculate the expected value. In Section II, this paper uses a data cleaning method to remove unreliable data and then to calculate the boundary and the expected value of uncertainty variables in the research time interval. In Section III, this paper introduces a new multi-step modeling approach for a linear transformation of the non-linear EH model. In Section IV, based on the linear model in Section III, the uncertainty set and expected value of optimal operation cost can be calculated on the basis of strong duality theorem and expected value model. In Section V, real measured data of the Beijing residential district is conducted to show the efficiency of the proposed method. All discussions in this paper are concluded in Section VI.

II. DATA SET ANALYSIS OF UNCERTAINTY VARIABLES
For the optimum energy management of MES, historical operational data on renewable energies and loads obtained from real residential areas are important. Nonetheless,

Algorithm 1 Quartile division algorithm
Step 1: Arrange the sample data in ascending order X = [x 1 , x 2 , . . . , x f ] Step 2: According to different remainders, the first quartile Q 1 are calculated as: Step 3: Determine the internal limit of outliers in data sample Step 5: Check each value in the sample data Update h = h+1 and go back to step 4.
the data must be pre-processed before utilization. Data analysis is a process of inspection, purification, transformation, and modeling of data in order to discover useful information, inform conclusions, and support decision-making [33].

A. ELIMINATING ABNORMAL DATA BASED ON QUARTILE DIVISION METHOD
There should be concern about the abnormal data in the data group. Including outliers in the calculation and analysis process of data can have detrimental effects to the result. The abnormal data come primarily from two aspects: 1) Supervisory Control and Data Acquisition (SCADA) system measurement errors. 2) Some unexpected accidents [34]. Interquartile range (IQR) is a reliable statistic, that is, the IQR value does not differ significantly with individual abnormal data. According to IQR, identifying outliers is therefore stable and reliable. In this paper, according to [35], [36], Algorithm 1 for quartile division is adopted to eliminate abnormal data. In the algorithm, f , p, h are subscripts; f represents the total number of data; p represents the quotient of f /4; Q 1 represents the first quartile and Q 3 represents the third quartile; I QR is equal to Q 3 − Q 1 ; The detailed procedure is shown as follows:

B. REGION DEFINITION AND EXPECTED VALUE CALCULATION
The collection of data is based on the research target. If the model is used to estimate the operation of a month, the historical data from that month should be obtained in historical years. The data is then separated by different research time VOLUME 9, 2021 Algorithm 2 Region definition algorithm Step 1: Separate the data into different groups according to the research time scale.
Step 2: Eliminate abnormal data based on Algorithm 1.
Step 3: Calculate the boundary [x min , x max ] of data: . , x f Step 4: Calculate the expected value of the data in each. group: Step 5: Define the uncertainty set of the parameters: scales into different groups. For this paper, we choose one hour as research time interval and divide the measurement data into 24 groups, which corresponds to their own time period. Algorithm 2 is used in each group to calculate the boundary and expected value of group data that will make full use of reliable data.

III. EH MODEL AND ITS SIMPLIFICATION METHOD A. ESTABLISHMENT OF DIRECTED GRAPH OF EH MODEL
Graph theory is adopted to describe the connection of each component to simplify the EH structure. The relationships between physical components and graph theory concepts are a) A node is the abstraction of aggregation points of energy flows and energy converters. The terminals of input and output are known as special nodes. b) Each energy flow to or from the nodes is represented by a branch. Branches carry flows of energy into any vector, e.g. gas, electricity, heat, cooling.
c) The number of input and output ports in each node is fixed.
d) The EH model is a directed graph. Each branch has a specified direction, i.e. from the source to the sink.

B. REGION DEFINITION AND EXPECTED VALUE CALCULATION 1) AGGREGATION POINTS OF ENERGY FLOWS
The aggregation point can be considered as the node with one or more ports (r input ports and s output ports), as shown in Fig. 1. The relationship between input and output ports can be expressed as: where Lout = [L 1 , L 2 , . . . , Ls]; T is the vector of output energy of the node, Pin = [P1, P2, . . . , Pr]T is the vector of input energy of the node, W is s × r matrix of ones.

2) ENERGY CONVERTERS
The energy converters can be considered as the nodes with one or more ports (r input ports and s output ports), as shown in Fig. 1. The relationship between input and output ports is And all energy devices can be regarded as a combination of aggregation points and energy converters In this paper, SEH implies that the input energy reaches the output energy through only one node (excluding the special node). As mentioned in part B), the coupling formulas between the inputs and outputs of the SEH are shown in (4) - (6).
Incolumnj : where i, j is the index of input/output port of the EH. j → i denotes input energy j is connected with load i, j →i denotes input energy j is not connected with load i, m, n is the number of input and output ports of SEH. K is a set of elements in the coupling matrix, η ij of which is 1.

2) MODELLING CEH
In this paper, the CEH refers to fact that the input energy reaches output energy by more than one node. In a practical application, most of the EH models are CEHs. To solve CEH, CEH should first be broken into several SEHs based on CEH separation algorithm called Algorithm 3. The detailed process is demonstrated in Algorithm 3. And each CEH can be broken into several SEHs based on Algorithm 3, as shown in Fig. 2. The output energy of the front SEH is the input energy of the latter SHE in the separated CEH and can be expressed as [37] P step,k = C step,k P step,k−1 (7) 4876 VOLUME 9, 2021 FIGURE 2. Conversion of CEH to multiple SEHs.

Algorithm 3 Multi-step modeling algorithm for CEH
Step 1: Set the input terminals of EH in the initial step (step 0). Step 2: Find the maximum route (from input terminals to the node) of each node.
Step 3: Each node can be divided into different steps sequences (step k), according to its branch number of the maximum route.
Step 4: Output terminals are arranged to the last sequence of step.
Step 5: Search among each branch and find the orders of starting point (step M ) and ending points (step N ) of the branch.
be inserted in the branch.
Step 7: Return to step 5 until all the branches are checked.
Step 8: Return to step 1 ∼ 4 to reorganize the nodes (original nodes and virtual nodes).
where P step,k is output energy of the k th order of SEH, P step,k−1 is output energy of the (k −1) th order of SEH, C step,k is the coupling matrix of the k th order of SEH. Therefore, the coupling matrix C of the CEH is the product of the coupling matrix of each SEH, which can be expressed as where N k is the number of SEHs In order to eliminate the nonlinear coupling property in formula (4), the variables of (4) is replaced by using (9).
where P vs = P ij m×n is variable substitution and o is a Hadamard product. In the P vs , After variable substitutions, (4) can be transformed to (11), where η = η ij m×n is an efficient matrix; η(i) represents the i th row of the matrix; P vs (i) represents the i th row of the matrix.

2) LINEARIZATION OF CEH
From the discussion in part E), the CEH can be expressed as N k SEHs and C step,k is a m k × n k matrix.
After variable substitution, in each SEH, the EH model can be expressed as: where k = 1, 2, . . . N k ; η step,k is efficient matrix in step k; P vs,step,k−1 is variable substitution matrix in step k.

IV. SOLUTION METHODOLOGY OF ECONOMIC OPERATION ESTIMATION CALCULATION A. GENERAL ECONOMIC OPERATION MODEL OF EH
In this subsection, we formulate the residential EH mathematically as an optimization problem considering renewable energy and load uncertainty.

1) THE OBJECTIVE FUNCTION
In this paper, the objective is to minimize total energy purchase cost over the time horizon, which is where γ t G , γ t E , γ t RE are grid electricity, natural gas and renewable energy tariffs at time t, respectively. P t G , P t E , P t RE are purchased energy from power grid, gas network and renewable energy plant.
Obeying Chinese policy and environment protection, the energy system will have as much renewable energy as possible.
2) OPTIMIZATION CONSTRAINTS a) Energy flow equations is (13) b) Capacity constraints: P eq ≤ P eq ≤ P eq (15) where P eq , P eq are lower and upper bound of the capacity of energy converters c) Constraints of variable substitution ij,k∈K

B. SOLUTION METHOD OF ECONOMIC OPERATION ESTIMATION FOR EH
Renewable energy can be regarded as negative loads, which is set among the right-hand side. Inequality constraints can be transformed into equality constraints and unconstraint variables can be transformed to positive variables by adding dummy variables. For ease of analysis, the EH can be written in a compact form: where x is substitution variables and dummy variables; b is uncertainty renewable energy and loads; c 1 , c 2 and A are constant matrices, which can be derived from (13) - (16). The expected value model [38] used to calculate the expected value of operation cost can be expressed as: According to addition theorem of expected value [39], (18) can be transformed as: where The upper and lower boundary problem has been changed so that the best and worst optimal solution of (17) can be determined. In this case, the best optimal solution of the problem is the lower bound of optimal operation cost, which can be expressed as: The worst optimal solution is the upper bound of optimal operation cost, which can be expressed as: Because (21) cannot be calculated directly. So, the following transformation is made: Theorem 1: Weak Slater's condition [40]: If the primal problem is convex, when some of the inequality constraint functions are affiliate (assume f 1 , . . . , f k l are affiliate), then strong duality holds provided the following weaker condition holds: There exists a x ∈ relint D with f s l (x) ≤ 0, s l = 1, . . . , k l , f s l (x) < 0, s l = k l + 1, . . . , m l Remark 1: By the weaker form of the Slater's condition, it can be found that strong duality holds for any linear programming (LP) provided the primal problem is feasible. Applying this result to the duals, the strong duality holds for LPs if the dual is feasible.
For a given b, the dual program of (17) can be expressed as: According to Theorem1, (17) can be replaced by the following quadratic linear program:

V. CASE STUDY
In this section, the proposed method is applied over a 24-hour span in the economic optimization operation estimation of a residential district of Beijing. Fig. 3 illustrates the configuration of the residential district complex MES comprises of a transformer (Tr), a CHP unit, a furnace (F), an electric heater (EHe), a compression chiller (CC), an absorption chiller (AC), micro wind plant (WP), micro photovoltaic (PV) station and solar heat exchange (SHE) station. The respective electricity (E), heat (H) and cooling (C) load demand patterns are the two usual months in summer and winter. The day ahead electricity price is referred to [41]. The gas price is expected to be constant. Renewable energy supply prices are determined by a bilateral contract which includes wind power ( 0.6/kWh in winter, 0.8/kWh in summer), photovoltaic (PV) power ( 0.55/kWh in summer, 0.75/kWh in winter), and heat power ( 0.4/kWh in summer, 0.8/kWh in winter), and preferential access to renewable energy sources. The energy converter parameters are shown in Table 1.

A. UNCERTAINTY SET DEFINE
This paper picks two typical months in February (winter) and August (summer). The data in February and August have more presentative characteristics, which have large boundary compared with December and January (winter) and June and July (Summer). The historical measurement data from the residential district in Beijing, China for these two months from 2014 to 2016. The measurement accuracy is 1s. The total time horizon is divided into 24-time periods and research time interval is 1 hour. Based on the proposed method, abnormal data are omitted and interval distribution of energy demand and renewable energy are obtained, as shown in Fig. 4 and Fig. 5. Energy demand and renewable energy outputs in different time periods are limited in the interval bounds, i.e., the possible value in the uncertain set at each time period always appear within the bounds. Note the output of SHE is identical to PV that is not discussed in this paper.

B. EH MODEL MODELING AND LINEARIZED METHOD
In order to arrange the orders of different nodes in this MES, Fig. 3 has been presented in the form of a directed graph with nodes arranged, as shown in Fig. 6. In Fig. 6, V1-V17 are  virtual nodes, whose efficiencies are 1. In this case, the longest route of energy flow is (D1→ CHP→ D2→ D4→ EHe→ D5→ D7→ AbC→D10) and highlighted mark in Fig. 6. Moreover, the total sequence number of the complex VOLUME 9, 2021 MES is 9. After arranging the orders of real and virtual nodes, the CEH model is decomposed into 9 SEHs. In each SEH, the nodes are decoupled. According to Eq. (4) ∼ (6), the coupling matrix of SEH can be easily formulated. Based on Eq. (9) ∼ (11), each SEH can be changed to linear. The input and output equations of the first SEH is: where v1 and v2 are the dispatch factors of D1.
The variable submission of the first SEH is: The linear input and output equations of the first SEH is: The input and output equations of the second SEH is: The input and output equations of the third SEH is: The input and output equations of the 4th SEH is: where v3 and v4 are the dispatch factors of D4.
The variable substitution of the 4th SEH is: The linear input and output equations of the 4th SEH is: The input and output equations of the 5th SEH is: The input and output equations of the 6th SEH is: The input and output equations of the 7th SEH is: where v5 and v6 are the dispatch factors of D6; v7 and v8 are the dispatch factors of D7. The input and output equations of the 7th SEH is: (40) The linear input and output equations of the 7th seh is The input and output equations of the 8th SEH is: The input and output equations of the 9th SEH is:

1) EXPECTED AND BOUNDARY OPERATION COST VALUE
The price of gas and electricity is shown in Fig. 7. The projected gas and electricity supplies are shown in Fig. 8. As shown in Fig. 8, these two energy supplies are complementary. At low electricity tariff time horizon: 0:00∼6:00 in Fig. 7 (a) and 0:00∼8:00 in Fig. 7 (b), the electricity supply is the priority adoption. At high electricity tariff time horizon (such as 8:00-12:00 and 14:00∼18:00), gas is being used to meet the energy demand. For other extreme situations, such as 12:00 in Fig. 7 (a) and 12:00, 14:00, 18:00 in Fig. 7 (b), the electricity supply is negative, that is, because of high electricity price, the micro energy system also sold electricity back to the grid. We assume that there is no correlation between renewable energy uncertainty and energy loads.
In this paper, we consider four scenarios: a) typical winter month: February b) typical summer month: August c) whole VOLUME 9, 2021 winter including December, January and February d) whole summer from June to August. In Table 2 the limit and the expected value of optimal operating costs are shown as the research time interval with 1 hour, based on the solution method described in Section IV.

2) COMPARISON OF PROPOSED AND TRADITIONAL METHODS a: COMPARISON OF LINEAR AND NONLINEAR CALCULATION METHODS
Taking the calculation of the expected value of operation cost as an example, the comparison of the traditional nonlinear calculation method and the proposed linear calculation method is shown in Table 3. The linear programming (LP) problem is solved by using CPLEX under GAMS. The nonlinear programming (NLP) problem is solved by using SBB/CONOPT under GAMS. The maximum number of iterations is set at 1000. In Scenario 1 (February), the solver cannot solve the problem. In Scenario 2 (August), the classical nonlinear calculation method takes nearly 50 times than that of the linear method without obtaining the global optimum. Because existing software with high computational efficiency can solve the large-scale LP problem, the proposed linearization method is also applicable for a large-scale system with global optimal operation decision obtained.

b: COMPARATION OF THE PROPOSED BOUNDARY ESTIMATION METHOD AND MONTE CARLO SIMULATION METHOD
Monte Carlo simulation method is a universal method, which is commonly used in recent years [27]- [32]. MCS can generate many scenarios based on uncertainty parameters to estimate the value distribution. In this paper, the model has 5 uncertainty parameters PV (SHE), WT, E, C and H. It is assumed that the parameters are uniformly distributed in the uncertainty set and generate 200 Monte Carlo simulations to estimate the interval of optimal operation cost. The comparison of MCS based on the LP method, MCS based on the NLP method and the proposed robust method is shown in Table 4. From Table 4, MCS based on LP and proposed method has similar accuracy, while MCS based on NLP is time-consuming with lower accuracy. This is mainly because in many scenarios, CONOPT solver cannot find the solutions. The proposed method has advantages in both accuracy and computation time.

D. IMPACT OF THE RESEARCH TIME INTERVAL
According to different research time intervals, the time horizon is divided into a different number of groups. The research time intervals in this paper are 2 hours, 1 hour, 30 minutes, and 15 minutes. The time horizon is then divided into 12, 24, 48, and 96 groups accordingly. Taking the winter scenario as an example, the boundary and expected operation cost and computation time are shown in Fig. 9. The computation time primarily consists of uncertainty set definition of variables, while the expected value and boundary calculation is ignored. From Fig. 9, it is clear that the boundary of optimal operating cost becomes narrow, and the results are more precise with a reduction in the research time interval while the computation time is substantially increased. However, only small changes with different time intervals are expected to occur. This is because the expected value uses any data in its entirety and the results are robust. It should be noted that because micro-CHP starts and ends in a short period of time in this paper, because micro CHP start and stop in a short time and the research time interval is sufficiently long, the minimum up and down time constraints of CHP can be ignored. If the time interval is sufficiently short, these constraints should be reconsidered.

VI. CONCLUSIONS AND FUTURE WORKS
This paper analyses and addresses the variation interval and the expected value of the economic operation cost of an EH model-based residential district based on uncertainty load and renewable energy. The proposed estimation method can accurately calculate the boundary with a relatively low computation time, unlike the traditional estimation method. The main conclusions are as follows: 1) Each CEH can be decomposed several SEHs and converted to linear based on the variable substitution method. The computational burden of the linear EH model is decreased by approximately 98% compared to the nonlinear EH model by using the proposed linear form. The proposed model can be implemented in any scenario while the nonlinear model cannot be solved by using the GAMS in certain scenarios.
2) Based on the linear model, the expected value and interval estimation of EH model can be resolved using the expected value model and strong duality theorem. The energy management process, gas and electricity are complementary and even the residential district has sold electricity back to the grid in some high electricity tariff time horizon.
3) The boundary of optimal operation costs is narrowed, with research time interval decreasing, while the expected value is stable.
In the future, we will consider applying the proposed method for a real multiple energy system, which consider the transmission constraints and we will also consider the adjustable loads, such as electric vehicles, air conditioners, and so on. Her research interests include the artificial intelligence and its application on the modeling and optimization of multiple energy systems.