Modeling Daily Load Profiles of Distribution Network for Scenario Generation Using Flow-Based Generative Network

Modeling daily load profiles is of great significance for the economic operation and stability analysis of the distribution network. In this paper, a flow-based generative network is proposed to model the daily load profiles of the distribution network. Firstly, real samples are used to train a series of reversible functions that map the probability distribution of the real samples to a prior distribution. Then, new daily load profiles are generated by feeding random numbers that obey the Gaussian distribution into the inverses of these reversible functions. Compared with existing methods such as explicit density models, the proposed approach does not need to assume the probability distribution of the real samples, and it can be used to model different loads simply by adjusting its structure and parameters. The simulation results show that the proposed approach not only fits the probability distribution of the real samples well, but also accurately captures the spatial-temporal correlation of daily load profiles. Daily load profiles with specific characteristics can be obtained by simple classification.


I. INTRODUCTION

A. BACKGROUND
With the increasing penetration of renewable energy and electric vehicles, uncertainties in their output powers pose challenges to all aspects of the power grid, such as planning and operation. Therefore, methods for modeling this uncertainty are needed. Scenario generation is a popular method that captures the uncertainty of power load and renewable energy by generating a series of possible daily power load profiles [1]. By taking a large number of daily load profiles as the input of the Newton-Raphson method, the probability distributions of voltage and power loss are obtained, which is of great significance for the economic operation and stability analysis of the distribution network [2]-[4].
The associate editor coordinating the review of this manuscript and approving it for publication was Petros Nicopolitidis .
The core idea of the scenario generation method is to learn from given real samples and then generate new daily load profiles that are similar to them. Scenario generation methods can be divided into explicit density models and implicit density models, depending on whether the probability distribution of the real samples has to be assumed [5]. The explicit density models need to know the form of the probability distribution of the real daily load profiles, which are used to fit the parameters of that distribution. Then, new daily load profiles are generated by sampling the fitted probability distribution [6]-[8]. Because the explicit density models rely on artificial assumptions about the probability distribution of real samples, their scope of application is limited: the probability distribution of most real samples is difficult to describe accurately with mathematical formulas. Moreover, these models are not universal, because the probability distributions of daily load profiles vary from region to region [9], [10].

VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

The implicit density model does not require explicitly estimating or fitting the probability distribution of daily load profiles. New data can be generated by training and sampling [11]. For this reason, the implicit density model has been widely used in various fields, such as pattern recognition, image generation, text translation, and data imputation [12]-[14]. As popular frameworks of the implicit density model, the variational autoencoder (VAE) and the generative adversarial network (GAN) are considered to be two distinct paradigms. They have been extensively and independently studied. However, the training process of the GAN is unstable, which makes it difficult to reach the Nash equilibrium. Besides, it is prone to vanishing gradients [15].
Unlike the GAN, which learns from samples by constructing a generator and a discriminator in an adversarial setting, the VAE directly calculates the mean square error between generated samples and real samples, which results in poorer performance than the GAN [16]. Therefore, the VAE and GAN are not suitable for modeling daily load profiles until these shortcomings are overcome. How to accurately model daily load profiles with an implicit density method is still a challenge. The non-linear independent component estimation (NICE) model is a flow-based generative model, a well-known generative framework that followed the GAN and VAE. So far, the NICE model has been widely applied to image generation and classification tasks and has shown good performance [17]. However, there is no report on the use of the NICE model to generate daily load profiles. How to design a NICE model structure suitable for generating daily load profiles needs further study.

B. LITERATURE REVIEW
Most of the existing methods for modeling daily load profiles belong to explicit density models, which first use the real samples to fit an assumed probability distribution and then sample daily load profiles from that distribution with the Monte Carlo method. For example, in order to simulate the daily load profiles of electric vehicle (EV) charging stations, the traffic topology data, road network information, and detailed travel data are used to estimate the probability distribution of EVs in [18]-[20]. The charging curve of each EV is then obtained to form the daily load profiles of the EV charging station. Nevertheless, the travel data of EVs and the traffic flow information are not available at most charging stations, so these methods are generally used for theoretical analysis and are difficult to apply in practice. In [21], a stochastic bottom-up method is proposed to model household load profiles considering the impact of user behavior and financial incentives. In [22], the domestic load profiles are generated by taking into account the time-related features of activities, such as duration and starting time. In general, these methods mainly have the following two shortcomings:
1) They are not universal. The probability distribution is different for various loads such as EVs, household loads and power station loads. Moreover, the probability distribution is also affected by space, time and electricity consumption habits. In this case, a separate mathematical model has to be established for each kind of load, which is laborious.
2) The probability distribution of the real load curve is difficult to describe with a mathematical formula. It can only be approximated by a commonly used probability distribution.
In order to overcome the shortcomings of the explicit density models, some scholars use the autoregressive integrated moving average model (ARIMA) and autoregressive moving average (ARMA) model to estimate the peak of daily load profiles. However, they are linear models that do not take into account the nonlinear relationship between power loads and other factors. Besides, these models can only be used for the simulation of a single daily load profile, and the correlation among multiple daily load profiles cannot be considered.
Recently, some deep neural networks have been applied to generate daily load profiles. The core idea of the VAE is variational inference. The posterior probability is replaced by a variational function, and the similarity between the expected output and the decoder output is measured by the Kullback-Leibler divergence. The advantage of the VAE is that it can automatically extract the natural features of real samples to model load profiles. In [23], a variational autoencoder composed of deep convolution networks is proposed to generate load profiles of EVs. In [24], a variational autoencoder is designed to simulate a large number of time series for wind and photovoltaic powers. The GAN generates data by constructing an adversarial generator and discriminator. If the structure and parameters of the generator and discriminator are set properly and the training process approaches the so-called Nash equilibrium, the new samples generated by the generator are almost identical to the real samples. In [25] and [26], a deep convolution generative adversarial network is used to capture the spatial-temporal complementary and fluctuant characteristics of wind and photovoltaic power. In [27], a conditional generative adversarial network is trained to output power consumption data considering demand response programs. Compared to the explicit density models and time series methods, the GAN and VAE can capture nonlinear time characteristics without manually assuming the probability distribution of daily load profiles, but they also have some limitations. For example, the GAN suffers from vanishing or exploding gradients. The VAE can only infer an approximation of the real probability distribution, which results in poor performance when generating data. The advantage of the NICE model is that it can accurately capture the probability distribution of real samples with latent variables, owing to the reversibility of the transformation function.
At present, the NICE model is mainly used for image processing, and its application in other fields is relatively limited. In [28], a generative framework based on the NICE model is proposed for the synthesis and interpolation of images. Similarly, a flow-based generative network is used to model images through sampling and latent variable manipulation in [29]. To accurately predict power load, a conditional flow-based model is proposed to capture the complex temporal dependency of residential load in [30]. In [31], an improved NICE model is proposed to synthesize high-quality audio from Mel-spectrograms. Overall, the NICE model shows good performance for generating 2-dimensional and 3-dimensional data, such as images and videos. However, the NICE model cannot be used directly to generate 1-dimensional daily load profiles. It is necessary to redesign the structure of the NICE model to make it suitable for processing daily load profiles.

C. PROPOSED METHOD AND KEY CONTRIBUTIONS
This paper aims to design a NICE model to generate daily load profiles. The performance of the proposed method is tested on an actual data set from London. The key contributions of this paper are as follows:
1) New technologies: To the best of our knowledge, this is the first time that the NICE model is designed for generating 1-dimensional daily load profiles. After training, the NICE model can generate daily load profiles that accurately capture the volatility characteristics and spatial-temporal correlation of the real samples.
2) Data-driven load profiles generation: The proposed method utilizes historical data to model daily load profiles without artificially assuming the probability distribution of real samples. Besides, it can be applied to model power loads in different regions simply by adjusting the structure and parameters.
3) Accurate modeling of the probability distribution: The NICE model can accurately capture the probability distribution of real samples with latent variables, which helps capture the nonlinear time characteristics of real samples. The simulation results on actual datasets show that the NICE model has better performance than the VAE and GAN.
4) Conditional load profiles generation: The conditional daily load profiles with specific characteristics (e.g. heavy load, light load) can be obtained by simply classifying the generated load profiles.
The rest of the paper is organized as follows. Section II designs the NICE model to generate daily load profiles. Section III presents some indicators to evaluate the performance of the algorithms. Section IV discusses the simulations and results. Section V presents conclusions and future work.

II. METHODOLOGY

A. THE PRINCIPLE OF NICE
Normally, generative networks are designed to learn the probability distribution $X \sim P_{data}(x)$ of real daily load profiles $x$. The VAE and GAN do not explicitly learn the probability density function (PDF) of real samples, since it is very hard to calculate the latent variables. By contrast, the NICE model solves this hard problem using normalizing flows, which make it possible to efficiently complete many tasks such as density estimation and data generation. As shown in Fig. 1, the NICE model maps the probability distribution of real samples to a prior distribution through a series of reversible functions. Each reversible function is an additive coupling layer, and the data pass through these layers one after another like flowing water, so the NICE model is also called a flow-based generative model.
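Specifically, a series of reversible functions map the real samples $x$ to the latent variables $z$, whose dimension is the same as that of $x$. In order to fit the probability distribution $Z \sim P_z(z)$, the change of variables formula is used:

$$P_{data}(x) = P_z(f(x)) \left| \det \frac{\partial f(x)}{\partial x} \right| \tag{1}$$

In formula (1), $\frac{\partial f(x)}{\partial x}$ is the Jacobian matrix of the function $f$ at $x$. The function $f$ must be reversible, and the determinant of its Jacobian matrix must be easy to compute. By taking the logarithm of formula (1), we get the optimization objective:

$$\log P_{data}(x) = \log P_z(f(x)) + \log \left| \det \frac{\partial f(x)}{\partial x} \right| \tag{2}$$

If the prior distribution can be factorized, then the optimization objective can be rewritten as follows:

$$\log P_{data}(x) = \sum_{i=1}^{D} \log P_{z_i}(f_i(x)) + \log \left| \det \frac{\partial f(x)}{\partial x} \right| \tag{3}$$

In formula (3), $D$ is the dimension of $x$ and $f_i(x)$ is the i-th component of $f(x)$. When the training process is over, random numbers are obtained by sampling the probability distribution $Z \sim P_z(z)$, and the new daily load profiles are generated by the inverse function $f^{-1}(z)$.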
In order to make the determinant of the Jacobian matrix easy to calculate, the NICE model divides $x$ into two parts, $x_1$ and $x_2$. The following transformation from $(x_1, x_2)$ to $(y_1, y_2)$ is regarded as a building block:

$$y_1 = x_1, \qquad y_2 = x_2 + m(x_1) \tag{4}$$

In formula (4), $y_1$ and $y_2$ are the new variables produced by the building block, which is called an additive coupling layer. The function $m$ can be any complex function (e.g. multiple fully connected layers in our model). This building block has a unit Jacobian determinant for any function $m$, and its inverse can be expressed as follows:

$$x_1 = y_1, \qquad x_2 = y_2 - m(y_1) \tag{5}$$

Let $I_1$ be the first $d$ elements of $x$ and $I_2$ be the elements of $x$ from $d + 1$ to $D$, where $D$ is the dimension of $x$. The Jacobian matrix can be rewritten as follows:

$$\frac{\partial y}{\partial x} = \begin{bmatrix} I_d & 0 \\ \dfrac{\partial y_{I_2}}{\partial x_{I_1}} & I_{D-d} \end{bmatrix} \tag{6}$$

In formula (6), the Jacobian matrix is a triangular matrix in which all the diagonal elements are equal to 1. It can be seen from formula (4) that although this transformation is reversible, it is relatively simple, so a single layer has difficulty fitting very complex non-linear relationships. To this end, the normalizing flow is proposed to estimate the probability distribution of real samples. As shown in Fig. 2, the normalizing flow transforms the prior probability distribution $Z \sim P_z(z)$ (e.g. Gaussian distribution) into the complex probability distribution $X \sim P_{data}(x)$ of daily load profiles by using a series of invertible transformation functions.
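Let $h^{(0)} = x$, let $h^{(k)} = f_k(h^{(k-1)})$ be the output of the k-th of $K$ reversible functions, and let $z = h^{(K)}$. From the chain rule and the change of variables theorem, we can get:

$$\frac{\partial z}{\partial x} = \frac{\partial h^{(K)}}{\partial h^{(K-1)}} \cdots \frac{\partial h^{(2)}}{\partial h^{(1)}} \frac{\partial h^{(1)}}{\partial h^{(0)}} \tag{7}$$

Further, the determinant of formula (7) can be obtained:

$$\det \frac{\partial z}{\partial x} = \prod_{k=1}^{K} \det \frac{\partial h^{(k)}}{\partial h^{(k-1)}} \tag{8}$$

Since every additive coupling layer has a unit Jacobian determinant, each factor in formula (8) is equal to 1. So far, the problem of calculating determinants has been solved.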
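As an illustration of the additive coupling layer in formulas (4) and (5), the following minimal Python sketch applies the forward transformation and its inverse to a 4-dimensional vector. The function `toy_m` is a hypothetical stand-in for the multi-layer network m; it is an assumption made for illustration and is not the network used in this paper.

```python
import math


def coupling_forward(x, d, m):
    """Additive coupling layer: y1 = x1, y2 = x2 + m(x1)."""
    x1, x2 = x[:d], x[d:]
    shift = m(x1)
    return x1 + [a + b for a, b in zip(x2, shift)]


def coupling_inverse(y, d, m):
    """Inverse of the layer: x1 = y1, x2 = y2 - m(y1)."""
    y1, y2 = y[:d], y[d:]
    shift = m(y1)
    return y1 + [a - b for a, b in zip(y2, shift)]


def toy_m(x1):
    # Hypothetical stand-in for the MLP m. Any function mapping d inputs
    # to D - d outputs works, because m is only evaluated, never inverted.
    s = sum(x1)
    return [math.tanh(s), s * s]


x = [0.2, 0.5, 0.7, 0.1]              # D = 4, split at d = 2
y = coupling_forward(x, 2, toy_m)     # first half is passed through unchanged
x_back = coupling_inverse(y, 2, toy_m)
```

Because m only needs to be evaluated, the inverse costs the same as the forward pass no matter how complex m is, which is what makes generation by the inverse function practical.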
The random variable $z$ has the same dimension as the real sample $x$, since the NICE model uses a series of reversible transformations to obtain the probability distribution of daily load profiles. Although the dimension of the real sample $x$ is $D$, some of these dimensions may be wasted. For example, the real sample $x$ may have a large number of 0 elements that have no effect. To improve computational efficiency, the NICE model introduces the concept of rescaling. It takes a diagonal scaling matrix $S$ as the top layer, which multiplies the i-th output by $S_{ii}$. Thus, formula (3) can be rewritten as follows:

$$\log P_{data}(x) = \sum_{i=1}^{D} \left[ \log P_{z_i}(f_i(x)) + \log |S_{ii}| \right] \tag{9}$$

In formula (9), the scaling factors $S_{ii}$ can be related to the eigenspectrum of a PCA, since they indicate how much variation is present in each of the latent dimensions (the smaller $S_{ii}$, the more important the dimension i is). More parameters and principles of the NICE model can be found in [32].
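To make formula (9) concrete, the sketch below evaluates the log-likelihood of one sample after the coupling layers. It assumes a standard Gaussian prior for every latent dimension and stores the scaling factors as their logarithms; the names `h` and `log_s` are illustrative assumptions rather than notation from this paper.

```python
import math


def log_likelihood(h, log_s):
    """Log-likelihood of one sample under formula (9).

    h     -- output of the last coupling layer (unit Jacobian up to here)
    log_s -- logarithms of the diagonal scaling entries S_ii (assumed > 0)
    """
    total = 0.0
    for h_i, log_s_i in zip(h, log_s):
        z_i = math.exp(log_s_i) * h_i          # rescaling layer: z_i = S_ii * h_i
        # log-density of a standard Gaussian prior (an assumption of this sketch)
        log_prior = -0.5 * z_i ** 2 - 0.5 * math.log(2.0 * math.pi)
        total += log_prior + log_s_i           # add log|S_ii|
    return total


# Training would minimise the negative of this value over all real samples.
nll = -log_likelihood([0.3, -1.2, 0.0], [0.1, -0.4, 0.0])
```

Parameterising the scaling layer by log S_ii keeps every S_ii positive, so the term log|S_ii| is always defined.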

B. THE TRAINING PROCESS OF THE NICE
The training process of the NICE model can be accomplished with the back propagation algorithm, which mainly includes two stages: forward propagation and weight update [33]. As shown in Fig. 3, in the forward propagation stage, the input information is transmitted to the output layer through layer-by-layer processing in the middle layers. Firstly, the real samples are fed to the first middle layer of the NICE model, which has the parameters $W_1$ and $b_1$. Then, the output of the first middle layer is used as the input of the next layer, which has the parameters $W_2$ and $b_2$. By analogy, the data pass through all layers of the network. The result of the output layer is used to calculate the loss function, which is utilized to update the weights of the network. The core ideas of the weight update stage are gradient descent and the chain rule. The errors generated at the output layer propagate backwards layer by layer, and the weights of each layer are continuously optimized according to the gradient descent algorithm. Firstly, the error that reaches a layer is multiplied by the input of that layer to obtain the gradient of its weights. Then, the product of the gradient and the learning rate is negated to obtain the change of the weights, which is added to the old weights.
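The weight update described above can be sketched for the simplest possible case: a single linear unit with a squared-error loss. This toy loss and the learning rate of 0.05 are assumptions chosen for readability; the NICE model itself is trained on the log-likelihood objective, but the update rule has the same form.

```python
def sgd_step(w, b, x, target, lr):
    """One gradient-descent update for y = w * x + b with squared error."""
    error = (w * x + b) - target   # error at the output layer
    grad_w = 2.0 * error * x       # chain rule: error multiplied by the layer input
    grad_b = 2.0 * error
    # The change of each weight is the negated product of gradient and learning rate.
    return w - lr * grad_w, b - lr * grad_b


w, b = 0.0, 0.0
for _ in range(200):
    w, b = sgd_step(w, b, x=2.0, target=3.0, lr=0.05)
```

With these values the output error halves at every step, so after 200 updates the unit reproduces the target almost exactly.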

C. FORMAT TRANSFORM OF REAL SAMPLES
It is well known that these deep generative networks were originally designed to process images with the same number of rows and columns. However, the daily load profile of the distribution network is a 1 × n vector, which cannot be used directly as the input data of these deep generative networks. Therefore, the daily load profiles should be transformed into square matrices before they are used as input data. The transformation process is shown in Fig. 4.
For the generation of a single daily load profile, an explanation is given by taking 24 sampling points a day as an example. Firstly, the original load profile is transformed into a vector of 1 × 25 scale by adding a 0 element to its end. Then, the vector is transformed into a matrix of 5 × 5 scale by format transformation, which is used as the input data of the NICE model. The data generated by the NICE model is also a matrix of 5 × 5 scale.

In optimization models of the distribution network, the daily load profiles of multiple nodes are often required. For example, the dynamic reactive power optimization model in the IEEE 33 bus system requires 32 daily load profiles of 32 buses. The nodes of the distribution network in one area have similar power consumption and are prone to simultaneous peaks or valleys. Therefore, the simultaneous generation of multiple daily load profiles considering their spatial correlation is of great significance for distribution network planning. Here, the generation of three daily load profiles is taken as an example for explanation. Firstly, three original load profiles of 1 × 24 scale are merged into a vector of 1 × 72 scale. Then, the vector of 1 × 72 scale is transformed into a vector of 1 × 81 scale by adding 9 zero elements to its end. The vector of 1 × 81 scale is transformed into a matrix of 9 × 9 scale by format transformation, which is used as the input data of the NICE model. After training, the NICE model can generate a matrix of 9 × 9 scale, from which the 3 daily load profiles only need to be separated.
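The format transformation and its reversal can be sketched in a few lines of Python. The function names `to_square` and `from_square` are hypothetical; the side length is passed in explicitly because it is a design choice (5 for one profile and 9 for three profiles in the examples above).

```python
def to_square(profiles, side):
    """Merge 1 x n profiles, zero-pad the end, and reshape to side x side."""
    flat = [v for p in profiles for v in p]
    if len(flat) > side * side:
        raise ValueError("profiles do not fit into the requested matrix")
    flat += [0.0] * (side * side - len(flat))        # zero elements at the end
    return [flat[r * side:(r + 1) * side] for r in range(side)]


def from_square(matrix, n_profiles, length):
    """Flatten the matrix row by row and cut out the original profiles."""
    flat = [v for row in matrix for v in row]
    return [flat[i * length:(i + 1) * length] for i in range(n_profiles)]


# Three toy profiles with 24 points each: 72 values padded to 81 = 9 x 9.
profiles = [[float(24 * j + h) for h in range(24)] for j in range(3)]
square = to_square(profiles, side=9)
recovered = from_square(square, n_profiles=3, length=24)
```

Since the padding is always appended at the end, separating the profiles from a generated matrix only requires dropping the trailing elements.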

D. CONDITIONAL LOAD PROFILES GENERATION
In addition to generating a variety of different daily load profiles, it is sometimes desirable to generate load profiles with specific properties (e.g. heavy load and light load). For example, relay protection devices may need to be tested under heavy and light loads to ensure that they send the correct signals in the event of a fault. After the NICE model is trained, random noise obeying the Gaussian distribution is used as the input data of the NICE model to generate a large number of daily load profiles. The daily load profiles with specific properties can be obtained by simply classifying the generated load profiles.
In this paper, an example is given to illustrate the principle of generating conditional daily load profiles. Firstly, three groups which include light load, medium load, and heavy load are defined. Then, the mean value of each daily load profile is calculated, and every generated load profile is distributed to one group according to the average value.
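The grouping rule can be sketched as follows. The function name and the dictionary keys are illustrative assumptions; the two thresholds are left as parameters, and the usage example takes the values 35 and 45 from the case study in Section IV-E.

```python
def group_by_mean(profiles, light_max, heavy_min):
    """Assign each daily load profile to a group according to its mean value."""
    groups = {"light": [], "medium": [], "heavy": []}
    for profile in profiles:
        mean = sum(profile) / len(profile)
        if mean < light_max:
            groups["light"].append(profile)
        elif mean < heavy_min:
            groups["medium"].append(profile)
        else:
            groups["heavy"].append(profile)
    return groups


# Toy generated profiles (constant days) to show the boundaries.
generated = [[30.0] * 24, [35.0] * 24, [40.0] * 24, [50.0] * 24]
groups = group_by_mean(generated, light_max=35.0, heavy_min=45.0)
```

Because the classification happens after generation, the trained model does not need to be changed or retrained when the group boundaries are changed.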

E. THE PROCESS OF GENERATING DAILY LOAD PROFILES BASED ON NICE
To summarize the previous description, the steps to generate daily load profiles using the NICE model are as follows:
1) Data normalization and transformation: Before the data are input into the NICE model, the real samples need to be normalized; otherwise the loss function may not converge. In this paper, the minimum-maximum normalization method is used to transform the input data to the range of 0 to 1. Then, the time series is transformed into a matrix with the same number of rows and columns.
2) Sampling and mapping: Firstly, a series of reversible functions (e.g. built from a multi-layer perceptron in this paper) are used to map the probability distribution of real samples to the prior distribution (e.g. Gaussian distribution in this paper). Then, the daily load profiles are generated by taking random numbers obeying the Gaussian distribution as the input data of the NICE model.
3) Update parameters: The loss function is calculated using generated data and real data. Then, the weights of the NICE model are adjusted by using the back propagation algorithm.
4) Daily load profiles generation: After training, the prior distribution $Z \sim P_z(z)$ is sampled to obtain random numbers, and a large number of daily load profiles are generated by the inverse function (e.g. built from a multi-layer perceptron in this paper). The daily load profiles with specific properties can be obtained by simple classification.
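Step 1 can be illustrated with a short sketch of minimum-maximum normalization. It assumes that a single global minimum and maximum are taken over all training samples, which is one common convention and is not specified above; the function names are hypothetical.

```python
def fit_min_max(samples):
    """Return the global minimum and maximum over all training samples."""
    values = [v for sample in samples for v in sample]
    return min(values), max(values)


def normalize(sample, lo, hi):
    """Map values from [lo, hi] to [0, 1]; assumes hi > lo."""
    return [(v - lo) / (hi - lo) for v in sample]


def denormalize(sample, lo, hi):
    """Map generated values in [0, 1] back to physical units."""
    return [v * (hi - lo) + lo for v in sample]


train = [[10.0, 20.0], [30.0, 50.0]]
lo, hi = fit_min_max(train)
scaled = normalize([10.0, 30.0, 50.0], lo, hi)
restored = denormalize(scaled, lo, hi)
```

The same `lo` and `hi` fitted on the training set would be reused to convert generated profiles back to physical units in step 4.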

III. INDICATORS FOR EVALUATING RESULTS
In order to evaluate the similarity between the generated daily load profiles and the real samples, 5 indicators are used to assess the probability distribution, spatial-temporal correlation, and volatility of the load profiles.
Probability density function: Firstly, several intervals are defined, and the probability that the power load falls in each interval is calculated. Then, scatter plots or frequency histograms are drawn to approximate the probability density function of the power loads [34].
Autocorrelation function: The autocorrelation function represents the temporal correlation of a daily load profile, which is the correlation of the daily load profile with a delayed copy of itself [35]. The autocorrelation function between time $t$ and $t + \tau$ can be expressed as follows:

$$R(\tau) = \frac{E\left[ (X_t - \mu)(X_{t+\tau} - \mu) \right]}{\sigma^2} \tag{10}$$

In formula (10), $X_t$ is the daily load profile and $t$ is any point in time. The daily load profile has a mean $\mu$ and variance $\sigma^2$. $E$ is the expected value operator. $\tau$ is the lag time.
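A sample estimate of formula (10) for one daily load profile can be sketched as below. Replacing the expectation by an average over the available time points and dividing by the profile length n (the biased estimator) are assumptions of this sketch.

```python
def autocorrelation(x, lag):
    """Sample autocorrelation of one profile at the given lag."""
    n = len(x)
    mu = sum(x) / n
    var = sum((v - mu) ** 2 for v in x) / n
    # Average the lagged products over the n - lag available pairs,
    # normalised by n (biased estimator, assumed here).
    cov = sum((x[t] - mu) * (x[t + lag] - mu) for t in range(n - lag)) / n
    return cov / var


alternating = [1.0, -1.0, 1.0, -1.0]
r0 = autocorrelation(alternating, 0)
r1 = autocorrelation(alternating, 1)
```

For the alternating toy series the lag-1 value is negative, reflecting that neighbouring points move in opposite directions; a smooth load profile would instead give values close to 1 at small lags.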
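Pearson correlation coefficient: The Pearson correlation coefficient is used to measure the spatial correlation among multiple daily load profiles [36]. Its mathematical formula is as follows:

$$r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \mu_x)(y_i - \mu_y)}{\sqrt{\sum_{i=1}^{n} (x_i - \mu_x)^2} \sqrt{\sum_{i=1}^{n} (y_i - \mu_y)^2}} \tag{11}$$

In formula (11), $r_{xy}$ is the Pearson coefficient between two daily load profiles. $\mu_x$ and $\mu_y$ are the means of the daily load profiles $x$ and $y$, respectively, and $n$ is the length of the daily load profile.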
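Formula (11) translates directly into code. The following sketch uses only the Python standard library and assumes that both profiles have the same length and are not constant.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long profiles."""
    mu_x = sum(x) / len(x)
    mu_y = sum(y) / len(y)
    numerator = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y))
    denominator = (math.sqrt(sum((a - mu_x) ** 2 for a in x))
                   * math.sqrt(sum((b - mu_y) ** 2 for b in y)))
    return numerator / denominator


same_shape = pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])   # peaks coincide
opposite = pearson([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])     # peak meets valley
```

Two nodes whose peaks and valleys coincide give a coefficient near 1, while opposite patterns give a coefficient near -1.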
Load duration curve: The load duration curve describes the variation of a power load in descending order, so that the greatest load is plotted on the left and the smallest load on the right. The area under the load duration curve represents the power consumption per day [15]. The mathematical formula of the power load duration is as follows:

$$t_j = \sum_{i=1}^{n} \mathbb{1}(P_i > P_j), \qquad j = 1, 2, \ldots, m \tag{12}$$

In formula (12), $t_j$ is the time during which the power load is greater than the level $P_j$, and $\mathbb{1}(\cdot)$ is equal to 1 when its condition holds and 0 otherwise. $n$ is the length of the daily load profile. $P_i$ is the i-th element of the daily load profile. $m$ is the number of intervals for the power load.
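The two views of the load duration curve, sorting the profile in descending order and counting the time spent above a given load level, can be sketched as follows. The function names are hypothetical, and durations are counted in sampling points.

```python
def load_duration_curve(profile):
    """Loads sorted from the greatest (left) to the smallest (right)."""
    return sorted(profile, reverse=True)


def duration_above(profile, level):
    """Number of sampling points at which the load exceeds the level."""
    return sum(1 for p in profile if p > level)


day = [3.0, 1.0, 2.0, 5.0, 4.0, 2.0]
curve = load_duration_curve(day)
hours_above_2 = duration_above(day, 2.0)
```

Sorting does not change the sum of the elements, which is why the area under the curve still equals the energy consumed during the day.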
The volatility of daily load profiles: In addition to the mean and standard deviation, the wave rate is defined to evaluate the volatility of the daily load profile. Its mathematical formula is as follows:

$$\text{wave\_rate} = \text{quantile}(x, 0.9) - \text{quantile}(x, 0.1) \tag{13}$$

In formula (13), the wave rate of a daily load profile $x$ is the difference between its 0.9-quantile and its 0.1-quantile.
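Formula (13) can be evaluated with the standard library function `statistics.quantiles`. The interpolation rule (`method="inclusive"`, linear interpolation between data points) is an assumption of this sketch, since the quantile definition is not specified above.

```python
import statistics


def wave_rate(profile):
    """Difference between the 0.9-quantile and the 0.1-quantile."""
    # With n=10 the function returns the 9 cut points at 0.1, 0.2, ..., 0.9.
    deciles = statistics.quantiles(profile, n=10, method="inclusive")
    return deciles[8] - deciles[0]


flat_day = [5.0] * 24                        # no volatility at all
ramp_day = [float(h) for h in range(11)]     # toy ramp 0, 1, ..., 10
flat_rate = wave_rate(flat_day)
ramp_rate = wave_rate(ramp_day)
```

Unlike the full range (maximum minus minimum), this measure ignores the most extreme 10 % of values at each end, so a single spike has little influence on it.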

IV. CASE STUDY

A. DATA DESCRIPTION AND DETAILS OF MODEL
In order to fully verify the effectiveness of the proposed approach, simulations are carried out using the London smart meter data set and the Spanish transmission service operator data set from Kaggle competitions. The proposed approach is implemented in Keras, a deep learning library. The simulations run on a laptop with a dual-core 2.4 GHz Intel Core i3-3110M processor and 6 GB of memory.
The London smart meter data set records the hourly electricity consumption of each household in 112 blocks from December 2011 to February 2014. Since the recording period differs from block to block, the data from October 15, 2012 to February 9, 2014 are selected for simulation after screening. The load data from October 15, 2012 to January 9, 2014 are used to train the structure and parameters of the NICE model, and the remaining data are used as the test set. The loads of three adjacent blocks are combined to emulate a node in the distribution network. The first 96 blocks form 32 nodes, each of which has 483 historical daily load profiles. In terms of format transformation, 32 daily load profiles are merged into a vector of 1 × 768 scale, and then 256 zero elements are added to the end of this vector to form a new vector of 1 × 1024 scale. The new vector is transformed into a matrix of 32 × 32 scale by format transformation, which is used as the input data of the NICE model.
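The sample counts and dimensions quoted above can be checked with a few lines of arithmetic. The sketch assumes that both end dates of each period are inclusive; the variable names are illustrative.

```python
from datetime import date

first, last = date(2012, 10, 15), date(2014, 2, 9)
train_end = date(2014, 1, 9)

n_days = (last - first).days + 1          # daily profiles per node
n_train = (train_end - first).days + 1    # days in the training period
n_test = n_days - n_train                 # remaining days form the test set

n_nodes = 96 // 3                         # three adjacent blocks per node
vector_len = n_nodes * 24                 # 24 hourly points per profile
padded_len = vector_len + 256             # zero elements appended at the end
side = 32                                 # side of the square input matrix
```

Under these assumptions the selected period contains 483 days, of which 452 fall in the training period and 31 in the test period, and the padded vector has exactly 32 × 32 = 1024 elements.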
The Spanish transmission service operator data set includes the hourly electricity consumption of one node in the distribution network from January 2015 to December 2018. The load data from January 1, 2015 to March 14, 2018 are used to train the structure and parameters of the NICE model, and the remaining data are used as the test set. In terms of format transformation, one zero element is added to the end of the daily load profile to form a new vector of 1 × 25 scale. Then, the new vector is transformed into a matrix of 5 × 5 scale by format transformation, which is used as the input data of the NICE model.
After many experiments on the structure and parameters of the NICE model, the best-performing configuration is selected as follows. The structures of the NICE model for the two data sets are similar. As shown in Fig. 5, the NICE model for the London smart meter data set is used as an example to illustrate the structure and parameters. The NICE model consists of 4 additive coupling layers in which the function m is a multi-layer perceptron (MLP). Each MLP consists of four fully connected layers with 1000 neurons. The 4 additive coupling layers map the daily load profiles into latent variables obeying the Gaussian distribution. After training, random noise obeying the Gaussian distribution is taken as the input data of the NICE model. The inverse functions of the four additive coupling layers map the random noise into new daily load profiles.
In order to test the performance of the NICE model, the VAE and GAN are used as baselines. For the encoder of the VAE, the numbers of neurons in the input layer and the middle layer are 1024 and 256, respectively. The output layer consists of two fully connected layers with 2 neurons each to obtain the mean and variance. As far as the decoder is concerned, the numbers of neurons in the input layer, middle layer, and output layer are 2, 256 and 1024, respectively. More parameters and training methods of the VAE model can be found in [23]. For the GAN, the discriminator consists of 3 fully connected layers, and their numbers of neurons are 512, 256, and 1, respectively. The generator consists of 3 fully connected layers, and their numbers of neurons are 256, 512, and 1024, respectively. More parameters and training methods of the GAN model can be found in [11].

Fig. 6(a) shows the training process of the NICE model for the London smart meter data set. As the number of iterations increases, the loss functions of the training set and the validation set decrease gradually. The loss function of the validation set is larger than that of the training set, which indicates that the NICE model over-fits to some extent. In addition, the loss function of the validation set is the smallest at the 272nd iteration. When the number of iterations is more than 272, the loss function of the training set decreases slowly while the loss function of the validation set increases gradually, which indicates that the generalization ability of the NICE model is deteriorating. Therefore, the NICE model for the London smart meter data set takes the parameters of the 272nd iteration. Similarly, Fig. 6(b) shows the training process of the NICE model for the Spanish transmission service operator data set. The NICE model for this data set takes the parameters of the 20th iteration, since the loss function of the validation set is the smallest at the 20th iteration.
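The rule used above to pick the final parameters (take the iteration with the smallest validation loss) can be sketched as follows. The loss values in the example are invented for illustration and are not results from this paper.

```python
def best_iteration(val_losses):
    """Return the 1-based iteration with the smallest validation loss."""
    best_index = min(range(len(val_losses)), key=val_losses.__getitem__)
    return best_index + 1


# Hypothetical validation curve: it falls, reaches a minimum, then rises
# again while the training loss would keep decreasing (over-fitting).
toy_val_losses = [3.0, 2.0, 1.5, 1.7, 1.9]
chosen = best_iteration(toy_val_losses)
```

In practice the parameters would be saved whenever the validation loss improves, so that the state at the chosen iteration can be restored after training ends.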

B. SCENARIO GENERATION
After the deep generative networks are trained, 8000 groups of random numbers obeying the Gaussian distribution are used as the input data of the VAE, GAN and NICE model. Each generative network produces 8000 new daily load profiles. Then, some moments of the daily load profile are selected at random, and the probability distributions of the power load at these moments are estimated, as shown in Fig. 7.
The overall trends of the probability density functions of the power loads generated by the VAE and GAN are consistent with those of the real samples, but the specific details are quite different. For example, the GAN and VAE cannot fit the peaks and valleys of the probability density function well. In contrast, the probability density function of the power load generated by the NICE model is very similar to that of the real samples, which indicates that the NICE model has a better ability to fit the probability distribution than the VAE and GAN.
As shown in Fig.8, the mean, standard deviation and wave rate are calculated to compare the volatility of the daily load profiles generated by the NICE model with that of the real load profiles.
The probability distributions of the volatility characteristics (e.g. mean, standard deviation and wave rate) of the daily load profiles generated by the different deep generative networks are almost the same as those of the real samples, which shows that these models can simulate the volatility of real daily load profiles well. Further, the performance of the NICE model is better than that of the GAN and VAE, because the probability density function of the NICE model is the closest to that of the real samples.

C. TEMPORAL CORRELATION
Some daily load profiles generated by the NICE model are randomly selected in Fig.9 with comparison to some real samples from the test set. It is obvious that the daily load profiles generated by the NICE model closely resemble the real daily load profiles from the test set, which did not participate in the training process of the NICE model.
The autocorrelation functions of the generated daily load profile and the real daily load profile follow the same trend, which shows that the NICE model can capture the temporal correlation of the real daily load profile well.
The time axis of the load duration curve gives the duration for which a certain load level is sustained during the day. From the third row of Fig. 9, it can be found that the load duration curves of the generated daily load profile and the real daily load profile almost coincide, which shows that they correspond to roughly the same energy consumption in a day.

D. SPATIAL CORRELATION
The nodes of the distribution network have similar power consumption in one area, and are prone to simultaneous peaks or valleys. Therefore, it is necessary to consider the spatial correlation when modeling daily load profiles. The London smart meter data set composed of 32 nodes is used to test the performance of each generative network. Fig. 10 shows the real daily load profiles and the daily load profiles generated by the NICE model. It is obvious that the real daily load profiles of these 32 nodes have the same trend, which indicates that their power consumption is similar. The NICE model takes into account the spatial correlation among multiple nodes while generating the daily load profiles, which is in line with the actual situation.
Further, the spatial correlation among multiple daily load profiles is quantitatively analyzed using the Pearson coefficient. Firstly, a Pearson coefficient matrix among the 32 nodes is calculated. Then, the Pearson coefficient matrices of the generative networks are each subtracted from the Pearson coefficient matrix of the real samples to obtain error matrices. Finally, the absolute value of each error matrix is taken and visualized, as shown in Fig. 11.
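The three steps above can be sketched as follows, assuming the data are arranged with one column per node. The function name and the array layout are illustrative assumptions.

```python
import numpy as np

def pearson_error_matrix(real, generated):
    """Absolute difference between the Pearson coefficient matrices.

    real, generated: arrays of shape (n_timesteps, n_nodes), one column per node.
    Returns an (n_nodes, n_nodes) matrix of absolute errors.
    """
    real_corr = np.corrcoef(real, rowvar=False)      # node-by-node correlation
    gen_corr = np.corrcoef(generated, rowvar=False)
    return np.abs(real_corr - gen_corr)
```

The maximum entry of the returned matrix corresponds to the maximum error reported for each model, and the matrix itself can be drawn as a heat map.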
As can be seen from Fig. 11, most of the errors of the NICE model are between 0.05 and 0.1, whereas the errors of VAE and GAN mostly fall between 0.1 and 0.2. The maximum errors of VAE, GAN and the NICE model do not exceed 0.194, 0.234 and 0.163, respectively. In general, the Pearson coefficient matrices of the daily load profiles generated by these networks differ only slightly from the Pearson coefficient matrix of the real samples, which indicates that they can capture the spatial correlation of power load well. Moreover, the NICE model performs better than VAE and GAN in capturing this correlation.

E. CONDITIONAL LOAD PROFILES GENERATION
In order to analyze the performance of the NICE model in generating conditional daily load profiles, we first calculate the mean of each real sample. Then, the daily load profiles are divided into three groups based on the mean u(x) (in kW): u(x) < 35, 35 ≤ u(x) < 45 and 45 ≤ u(x). Further, every generated daily load profile is assigned to one group according to its mean value. The probability density functions of the real samples and the generated daily load profiles in each group are shown in Fig. 12.
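The grouping rule can be sketched as follows. The function name is hypothetical, and the default thresholds are taken from the group definitions above.

```python
import numpy as np

def group_by_mean(profiles, lower=35.0, upper=45.0):
    """Assign each daily load profile to a group according to its mean (kW).

    Returns 0 for mean < lower, 1 for lower <= mean < upper, 2 for mean >= upper.
    profiles: array of shape (n_days, n_points).
    """
    profiles = np.asarray(profiles, dtype=float)
    means = profiles.mean(axis=1)
    # With right=False (the default), np.digitize uses bins[i-1] <= x < bins[i]
    return np.digitize(means, [lower, upper])
```

Selecting the rows of the generated profiles whose label equals 2, for example, yields the heavy-load subset.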
In each group, the generated samples and the real samples follow the same probability distribution. For example, in group 1, the mean values of the generated samples are relatively small, corresponding to light load, while in group 3 the mean values of the generated load profiles lie at or above the upper threshold, so they can be used to simulate targeted heavy-load days. These conditional load profiles play an important role in power system planning and operation. For example, relay protection devices may need to be tested under heavy and light loads to ensure that they signal properly in the event of a fault. In short, the proposed method can generate daily load curves with special characteristics, which is of great help to distribution network planning.

V. CONCLUSION AND FUTURE WORK
Modeling daily load profiles helps capture the uncertainty of power load, which is of great significance to the optimization and operation of the power system. In this paper, a flow-based generative network, the non-linear independent components estimation (NICE) model, is proposed to model daily load profiles. The proposed approach does not require explicitly estimating or fitting the probability distribution of daily load profiles. After training, new data can be generated by sampling from the Gaussian prior. Besides, it can be applied to model different loads in different regions only by adjusting the structure and parameters. The conclusions drawn from the simulation are as follows:
1) The proposed method utilizes historical data to model daily load profiles without artificially assuming the probability distribution of real samples. The simulation shows that the probability distribution of the daily load profiles generated by the NICE model is very similar to the probability distribution of the real samples, and the daily load profiles generated by the NICE model have the same volatility characteristics as the real samples.
2) By comparing the real samples with the daily load profiles generated by the NICE model, it is found that the NICE model can accurately capture the spatial-temporal correlation of the real samples. In addition, the power consumption of the generated daily load profiles closely resembles that of the real samples. Simulation results show that the NICE model is more suitable for generating daily load profiles than VAE and GAN.
3) By simple classification, we can obtain conditional daily load profiles with specific characteristics (e.g., light load, medium load, and heavy load), which can be utilized to simulate targeted load days.
For future work, we can try to use the NICE model to model power profiles for renewable energy (e.g., wind and photovoltaic power). These power profiles can then be used for power system optimization, such as robust reactive power optimization. In addition, we can try to combine the NICE model with other generative networks (e.g., VAE and GAN) to improve performance.