A Method for Completing Missing Power Data in Low-Voltage Station Areas Based on an Improved Deep Convolutional Autoencoder Network

Irregularities in the collection and transmission of user power data in low-voltage distribution station areas lead to errors in subsequent application analysis. To ensure the integrity of power data in low-voltage station areas, a multi-user missing power data completion method based on an improved deep convolutional autoencoder is proposed. First, according to the characteristics of missing multi-user power data in the low-voltage station area, the power data are arranged into a spatio-temporal tensor format suitable for one-dimensional convolution operations. Then, the encoding and decoding capabilities of the improved deep convolutional autoencoder network are used to reconstruct the missing data, and the network structure is optimized by introducing residual learning and batch normalization (BN). Finally, the proposed method is applied to two cases of user power data loss in a real station area: random missing and continuous missing. The results show that the method can accurately complete 40% randomly missing data and 2 consecutive days of missing data, and that it improves completion accuracy over traditional methods to varying degrees.


I. INTRODUCTION
With the widespread use of electricity information collection systems and advanced metering infrastructure (AMI), the intelligent collection of power data for users in the station area has achieved full coverage [1], and a large amount of data has been accumulated in the measurement system [2]. Therefore, the use of big data technology for station area application analysis (line loss calculation, user load forecasting, energy efficiency analysis, etc.) has become a trend [3][4][5][6][7].
Research conclusions obtained from complete and reliable power data have practical application value and can correctly reflect the operating characteristics and objective laws of the station area [2]. However, during the collection and transmission of power data in the station area, irregular data loss caused by local channel instability and smart meter failure is significant, and the available data do not meet the requirements of big data analysis. As a result, the accuracy of subsequent application analysis decreases [1]. Therefore, to ensure the integrity of power data in the low-voltage station area, it is necessary to complete the missing power data.
Traditional missing data completion methods include the mean filling method, hot-deck and cold-deck filling methods, the regression filling method, etc. [8][9]. Although these methods are simple and fast, they ignore the spatio-temporal nature of user power data in low-voltage station areas, and the filling quality is not high. For this reason, some scholars have proposed methods based on matrix completion [10]. These methods use the low rank (temporal and spatial) of the original data matrix and the sparsity of the error matrix to fill in randomly missing data. However, the optimization of the resulting non-convex problem is complicated, and such methods are not effective for long-term continuous missing data. Other researchers have proposed a filling method based on shallow autoencoders [11]. Although this type of method handles various missing patterns, a shallow network structure has difficulty accurately describing the deep nonlinear characteristics of power data in the station area. The development of deep learning can solve this problem. Deep learning uses unsupervised methods to extract deep features from large-scale high-dimensional input data through multiple levels of abstraction [12][13][14][15]. Deep convolutional autoencoder methods proposed in recent years have made breakthrough progress in missing image restoration and high-resolution image reconstruction.
Literature [16] proposed a low-light image restoration method based on a deep convolutional autoencoder network to extract features of ultra-low-illumination images, which can effectively improve the signal-to-noise ratio and contrast of low-light images. Literature [17] applied a deep autoencoder network to image deblurring, showing that a generator network built with a multi-scale convolutional neural network has strong image restoration ability and a certain generalization ability. Missing image filling and segmentation map restoration share an obvious commonality with filling missing power data in low-voltage station areas: both generate missing components that conform to objective laws from the non-missing parts [18], and both face high-dimensional data distributions and complex modeling [19]. Therefore, this type of deep learning method is well suited to filling missing data in the low-voltage station area. Missing data in the low-voltage station area includes random missing and continuous missing data; any missing data reduces the accuracy of subsequent application analysis. Therefore, this paper proposes a missing data completion method for multi-user power data based on an improved deep convolutional autoencoder. First, according to the missing patterns of multi-user power data in the low-voltage station area, the power data are arranged into a spatio-temporal tensor format. Then, the encoding and decoding capabilities of the improved deep convolutional autoencoder are used to reconstruct the missing power data, and the network structure is optimized through residual learning and the introduction of BN. Finally, the effectiveness of the proposed method is verified with real user power data from an actual station area.
Experiments show that this method can effectively complete user power data with a 40% random missing rate and with 2 consecutive whole days missing. Under the three error evaluation indicators, when the random missing rate is 40%, the completion accuracy is improved by 1.17 times, 1.23 times, and 0.61 times respectively compared with the autoencoder. When 2 consecutive whole days are missing, the completion accuracy is improved by 1.67 times, 1.15 times, and 2.30 times compared with the autoencoder.

A. DEEP CONVOLUTIONAL AUTOENCODER
The problem of missing power data (voltage, current, or power) in the low-voltage station area can essentially be transformed into a problem of reconstructing the missing data. A deep convolutional autoencoder is an unsupervised deep learning network that can reconstruct missing data through end-to-end learning [20][21].

In the encoding stage, the number of convolution kernels in a convolutional layer equals the number of extracted feature maps. For an input vector $x$, the output $h_i$ of the $i$-th convolution kernel is

$$h_i = \sigma(x * w_i + b_i) \quad (1)$$

where $*$ denotes one-dimensional convolution, $w_i$ is the weight of the $i$-th convolution filter, $b_i$ is the bias of the $i$-th feature map, and $\sigma(\cdot)$ is the activation function.

The reconstruction in the decoding stage is

$$y = \sigma\left(\sum_{i=1}^{H} h_i * \tilde{w}_i + c\right) \quad (2)$$

where $y$ is the reconstructed value, $H$ is the number of feature maps, $c$ is the bias, and $\tilde{w}_i$ is $w_i$ with its weights flipped.

The loss function is the mean squared reconstruction error

$$L = \frac{1}{2N} \sum_{j=1}^{N} \left\| y_j - x_j \right\|^2 \quad (3)$$
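As an illustration of formulas (1)-(3), the sketch below implements the one-dimensional convolutional encoding, the decoding with flipped kernels, and the mean squared reconstruction loss in NumPy. The kernel count, kernel size, and the 96-user input dimension are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def encode(x, W, b):
    """Formula (1): one feature map per kernel.
    W: (H, k) kernels, b: (H,) biases -> (H, len(x)) feature maps."""
    return np.stack([relu(np.convolve(x, W[i], mode='same') + b[i])
                     for i in range(W.shape[0])])

def decode(h, W, c):
    """Formula (2): sum over feature maps convolved with flipped kernels."""
    y = sum(np.convolve(h[i], W[i][::-1], mode='same')
            for i in range(W.shape[0]))
    return relu(y + c)

def mse_loss(x, y):
    """Formula (3): mean squared reconstruction error."""
    return 0.5 * np.mean((y - x) ** 2)

rng = np.random.default_rng(0)
x = rng.random(96)               # one sample: 96 users at one time point
W = rng.normal(0, 0.1, (8, 3))   # 8 kernels of size 3 (illustrative)
b = np.zeros(8)
h = encode(x, W, b)
y = decode(h, W, c=0.0)
print(h.shape, y.shape)  # (8, 96) (96,)
```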

B. RESIDUAL LEARNING
To increase the model's ability to abstract the contextual semantic information of the input station-area power data and to improve completion accuracy, residual learning is introduced on the basis of the deep convolutional autoencoder. The residual network (ResNet) uses residual unit modules to solve the optimization problems brought about by deepening the network [22]. The residual unit module is realized by a shortcut connection; its structure is shown in Figure 1, and its formula is

$$H(x) = F(x, w) + x \quad (5)$$

In formula (5), the partial derivative of $H(x)$ with respect to $x$ is greater than 1, which alleviates the vanishing-gradient problem during network training. When $H(x)$ is the identity mapping, that is, when the training target of $F(x, w)$ is close to 0, information can be transmitted across multiple layers, which alleviates the network degradation that occurs as the network deepens [22].
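A minimal single-channel sketch of the residual unit in formula (5), assuming a two-convolution residual branch $F(x, w)$; the kernel sizes and the use of ReLU here are illustrative choices, not the paper's exact module.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def conv1d(x, w):
    """One-dimensional convolution with 'same' padding."""
    return np.convolve(x, w, mode='same')

def residual_unit(x, w1, w2):
    """H(x) = F(x, w) + x: residual branch plus the shortcut connection."""
    f = conv1d(relu(conv1d(x, w1)), w2)  # residual branch F(x, w)
    return relu(f + x)                   # shortcut adds the input back

rng = np.random.default_rng(1)
x = rng.random(96)
w1 = rng.normal(0, 0.1, 3)
w2 = rng.normal(0, 0.1, 3)
out = residual_unit(x, w1, w2)
print(out.shape)  # (96,)
```

If the residual branch learns weights near zero, the unit passes its input through unchanged, which is the identity-mapping behavior described above.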

A. IMPROVED DEEP CONVOLUTIONAL AUTO-ENCODING NETWORK STRUCTURE
The improved deep convolutional autoencoder network (IDAE) combines a convolutional autoencoder with residual learning, specifically through the introduction of two kinds of shortcut connections. One is the shortcut connection within a single module, which avoids the network degradation caused by deepening the convolutional layers. The other is a symmetric shortcut connection between modules; by virtue of the residual unit's ability to learn the identity mapping, it avoids the loss of input power data information as the network deepens. The effect of the two shortcut connections is essentially to increase the depth of the model and optimize the network structure, giving it stronger learning ability suitable for filling missing power data in the low-voltage station area. The deepened network structure makes it easier for the model to capture the spatio-temporal characteristics of the power data.

B. CONSTRUCTION OF MODEL INPUT TIME AND SPACE
Considering the spatio-temporal nature of station-area power data, a spatio-temporal input volume must be constructed for the model. The construction process is shown in Figure 2. To capture the spatiality of the users' power data (the connections between different users), the power data of all Nh users in the station area at each time point are formed into an Nh×1 tensor; that is, each time point is one sample tensor. To capture the timeliness of each user's own power data, the power tensors at the m time points immediately before the time point to be filled are selected (to handle extreme fluctuations, i.e., abrupt changes), and the power tensors at the same time point in the previous n weeks are also selected (taking into account the weekly periodicity of power data); these are stacked with the tensor to be filled. The resulting (m+n+1) time points form an Nh×(m+n+1) spatio-temporal tensor, which serves as the model input and makes it easier for the model to capture the spatio-temporal correlations of station-area power data.
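The construction described above can be sketched as follows. The 15-minute sampling interval (96 points per day, 672 per week) matches the case study later in the paper; the helper name `build_spatiotemporal` and the specific m, n, and shapes are illustrative.

```python
import numpy as np

WEEK = 96 * 7  # 15-min sampling: 96 points/day, 672 points/week

def build_spatiotemporal(P, t, m=3, n=2):
    """P: (T, Nh) power matrix; returns the Nh x (m+n+1) input for target
    time index t: n same-time points from previous weeks (periodicity),
    m immediately preceding points (timeliness), and the (possibly
    missing) target column itself."""
    cols = [P[t - w * WEEK] for w in range(n, 0, -1)]  # weekly periodicity
    cols += [P[t - j] for j in range(m, 0, -1)]        # recent points
    cols.append(P[t])                                  # column to be filled
    return np.stack(cols, axis=1)

rng = np.random.default_rng(2)
P = rng.random((96 * 40, 96))          # 40 days of data for 96 users
X = build_spatiotemporal(P, t=96 * 20)  # a target in the 21st day
print(X.shape)  # (96, 6)
```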

C. POWER MISSING DATA RECONSTRUCTION MODEL AND TRAINING BASED ON IMPROVED DEEP CONVOLUTIONAL AUTOENCODING NETWORK
The structure and hyperparameters of the network-based missing power data reconstruction model were determined experimentally. The model structure is shown in Figure 3, where the number of modules is the initial setting, Nh is the dimension of the input data, and F1, F2, and F3 are the numbers of filters.
Since station-area user power data are one-dimensional time series, one-dimensional convolutional layers are used for feature extraction, and ReLU is used as the activation function to improve training speed. Adam is used to update the network parameters during training. The experiment selects the Nh users in the station area and stacks (m+n) historical tensors onto the spatial tensor to be repaired; the input is Nh×(m+n+1) and the output is Nh×1. That is, the filling result for the entire station area is generated at one time, which improves filling efficiency.
Since the goal of this model is to fill missing station-area data, more attention is paid to the reconstruction accuracy at the missing positions. To achieve this, the loss function is computed on the masked values:

$$L = \frac{1}{N} \sum \left\| \tilde{y} - \tilde{x} \right\|^2 \quad (6)$$

$$\tilde{x} = x \odot m, \qquad \tilde{y} = y \odot m \quad (7)$$

where $m$ is the mask marking the missing positions. The trainable parameters of this model are concentrated in the convolutional layers; the input layer, pooling layers, and upsampling layers have no trainable parameters. The number of trainable parameters of a convolutional layer is

$$P = (k \times F_{i-1} + 1) \times F_i \quad (8)$$

where $k$ is the size of the convolution kernel, $F_i$ and $F_{i-1}$ are the numbers of feature maps (filters) of this layer and the previous layer respectively, and the 1 accounts for the bias.
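The masked loss and the parameter count can be sketched as follows. The convention that the mask is 1 at missing positions is an assumption based on the surrounding description.

```python
import numpy as np

def masked_mse(x_true, y_pred, miss_mask):
    """Reconstruction loss restricted to missing positions, in the spirit
    of formulas (6)-(7); miss_mask is 1 where data is missing (assumption)."""
    return np.mean((miss_mask * (y_pred - x_true)) ** 2)

def conv1d_params(k, f_prev, f_cur):
    """Trainable parameters of a 1-D conv layer: (k * F_prev + 1) * F_cur."""
    return (k * f_prev + 1) * f_cur

# e.g. a layer with kernel size 3, 1 input map, 64 output maps
print(conv1d_params(3, 1, 64))  # 256
```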
The convolution kernel size in the model defaults to 3, and the number of filters is determined by experiment. The main tuning strategy is to halve or double the number of filters throughout the network, choosing the smaller value when the accuracies are close, in order to reduce the number of parameters and increase training speed. To improve training stability and speed up convergence, BN is added after each convolutional layer (except those close to the input and output, to prevent sample oscillation).

IV. CASE ANALYSIS
The experiments were implemented in Python on a PC with an Intel i5 2.2 GHz CPU, 8 GB RAM, and a 64-bit Windows 10 operating system, and the performance of the proposed model was verified on actual user power data.

A. EXAMPLE DATA DESCRIPTION
The data used in this example come from user power data in a station area of the distribution network of a city in China. From the automatically collected data, 40 consecutive days of non-missing power data were manually selected, and the data were divided into a training set and a test set at a ratio of 3:1. The data set includes power data of 96 users from 00:00 on July 1, 2019 to 24:00 on August 9, 2019 (3840 samples in total, with a sampling interval of 15 minutes). The shape of the training set Xtrain is (2880, 96, 1), and the shape of the test set Xtest is (960, 96, 1).
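The 3:1 split described above can be reproduced as a short sketch; the random data stands in for the real measurements, which are not public.

```python
import numpy as np

# 40 days x 96 points/day = 3840 time-point samples, 96 users, 1 channel
data = np.random.default_rng(3).random((3840, 96, 1))

split = int(len(data) * 3 / 4)                # 3:1 train/test split
X_train, X_test = data[:split], data[split:]
print(X_train.shape, X_test.shape)  # (2880, 96, 1) (960, 96, 1)
```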

B. ERROR EVALUATION INDEX
In order to better evaluate the filling effect of missing values, the mean absolute error (MAE), mean absolute percentage error (MAPE) and root mean squared error (RMSE) are used to evaluate the performance of various algorithms for filling missing data in the station area.
MAE is the mean of the absolute differences between the true values and the completed values; taking absolute values prevents errors from cancelling each other out, so MAE accurately reflects the magnitude of the filling error:

$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| f_i - x_i \right|$$

MAPE reflects the relative size of the error and better shows the filling performance for missing station-area data:

$$\mathrm{MAPE} = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{f_i - x_i}{x_i} \right| \times 100\%$$

RMSE is more sensitive to large deviations between the true and completed values:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (f_i - x_i)^2}$$

where $f_i$ is the completed value, $x_i$ is the true value, and $N$ is the number of missing values.
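The three indicators are straightforward to compute; a small NumPy sketch with made-up values:

```python
import numpy as np

def mae(x, f):
    return np.mean(np.abs(f - x))

def mape(x, f):
    return np.mean(np.abs((f - x) / x)) * 100.0

def rmse(x, f):
    return np.sqrt(np.mean((f - x) ** 2))

x = np.array([1.0, 2.0, 4.0])  # true values
f = np.array([1.1, 1.8, 4.4])  # completed values
print(round(mae(x, f), 4), round(mape(x, f), 2), round(rmse(x, f), 4))
# 0.2333 10.0 0.2646
```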

C. CONSTRUCTION OF MODEL INPUT TIME AND SPACE
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/. This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.

In the experiment, the power data of the 96 users in the station area at each time point constitute a 96×1 tensor (each time point is one sample tensor), giving 3840 spatial tensors in total. Considering the timeliness of each user's own power data, the power tensors at the three time points immediately before the time point to be filled are selected (to handle extreme fluctuations, i.e., abrupt changes), together with the tensors at the same time point in the previous two weeks (taking into account the weekly periodicity of power data), and these are stacked with the tensor to be filled. The resulting 6 time points form a 96×6 spatio-temporal tensor, which serves as the model input and makes it easier for the model to capture the spatio-temporal correlations of station-area power data.

D. ANALYSIS OF THE COMPLEMENTARY RESULTS OF POWER MISSING DATA IN STATION AREA
In the low-voltage station area of the distribution network, each user's power data is collected by a smart meter and then transmitted to a concentrator for management. This paper considers two cases: random loss of user power data due to channel instability during transmission, and long-term data loss for a small number of users due to serious smart meter failures that cannot be repaired in a short time. The effectiveness of the proposed method is further evaluated by quantifying the accuracy of missing power data completion.

1) POWER DATA MISSING AT RANDOM COMPLETION RESULTS
To test the model's ability to complete randomly missing power data, 40% of the test set is randomly removed and the result is used as input to the trained model. Figure 4 shows a heat map of a subset of the test set (one sample per time point; the 96 samples of August 1st) with 40% of the power data randomly missing, together with the completed data. Figure 4(a) is the randomly missing heat map (missing rate 40%), with missing points represented by the value 0 (dark blue). The completion result is shown in Figure 4(b): the peak load period (12:00-20:00) is light yellow (large values) and the low period (4:00-8:00) is blue (small values). When the missing rate is 40%, the model's completion results are consistent with the actual load situation, indicating that the trained model has a strong ability to capture the spatio-temporal structure of user power data in the station area.
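The random-missing test input can be generated by zeroing a random fraction of entries, mirroring the value-0 encoding of missing points in Figure 4(a); the helper name `random_mask` and the shapes are illustrative.

```python
import numpy as np

def random_mask(X, miss_rate=0.4, seed=0):
    """Zero out a random fraction of entries (missing points encoded as 0);
    returns the corrupted copy and the boolean missing mask."""
    rng = np.random.default_rng(seed)
    mask = rng.random(X.shape) < miss_rate
    Xc = X.copy()
    Xc[mask] = 0.0
    return Xc, mask

X = np.ones((960, 96))          # stand-in for the test set
Xc, mask = random_mask(X, 0.4)
print(Xc[mask].sum())  # masked entries are zeroed, so this is 0.0
```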

FIGURE 2. Construction diagram of the model input spatio-temporal volume
To further demonstrate the completion of randomly missing power data, completion curves were drawn for one sample in the test set (the time point 12:00 on August 1st) and for one user over four consecutive days (August 1st to August 4th). The filling effect on missing data is shown in Figure 5, where the green curve is the true value, the red curve is the reconstructed value after 40% of the data is randomly removed, and the blue markers are the missing points. Figure 5(a) shows that the green and red curves fit well, indicating that when 40% of the power data is randomly missing, the model can use the nonlinear spatial relationship between non-missing and missing points to complete the missing data well. Figure 5(b), the completion result for a single user with 40% randomly missing, shows a similar fit. Figure 6 shows the distribution of MAE, MAPE, and RMSE for completion over all test set samples (dates 07-31 to 08-09, 960 samples in total). The box plot (pink) shows the median, upper and lower quartiles, and outliers of the completion error, and the violin plot (blue-green) shows the data distribution. The median MAE is 0.018 A, the median MAPE is 2.012%, and the median RMSE is 0.068 A. The completion results are mainly distributed near the median, the number of samples with large errors is very small, and there are fewer than 5 outliers for each of the three error types. Even for the outliers, the maximum MAE, MAPE, and RMSE are 0.0255, 2.952%, and 0.101 A respectively, so the completion effect is still very good.

FIGURE 6. Current reconstruction data distribution
The greater the missing rate, the less relevant information the model can obtain from the data. The missing ratio of the test set was varied and the reconstructed values obtained from the trained model; the changes of MAE, MAPE, and RMSE between the reconstruction and the true values with the missing rate are shown in Table 1. When the missing rate is less than 50%, the MAE between the true and completed values is less than 0.026 A, the MAPE is less than 3.254%, and the RMSE is less than 0.089 A; that is, when less than half the data is missing, the model can effectively complete the missing power data. This shows that even when the model cannot obtain filling information from other spatially related non-missing power data, it can still complete the data based on the temporal and spatial correlation of historical load data. However, when the missing rate exceeds half, MAPE, MAE, and RMSE increase greatly, and the completion performance is worst when the missing rate exceeds 80%. In practice, the missing rate is generally not higher than 40%, so the proposed method can effectively complete load data in real applications.

2) COMPLETION RESULTS OF MISSING POWER DATA FOR CONSECUTIVE DAYS
Unlike large-scale random loss of user power data caused by communication attacks or interference, continuous whole-day loss generally refers to a small number of specific users whose power data is missing for consecutive whole days due to meter damage or physical failure of the communication device [17]. For such long-term, non-random missing measurement data, deep convolutional autoencoding can fill in the missing data while correctly reflecting its true pattern of change. Compared with traditional filling methods, using reconstructed data that fits the actual situation can greatly improve the accuracy of subsequent application analysis. The following analyzes the filling effect of the proposed method on long-term missing power data. Figure 7(a) is a heat map of one customer missing for two consecutive days (August 4th to August 5th) in the test set. Figure 7(b) shows the completion result of the improved deep convolutional autoencoder model. The completion result is consistent with residents' electricity consumption habits, indicating that the model still completes well when data is missing for consecutive days. Figure 8 shows the error between the completed and true values. The true and reconstructed curves fit well, which shows that for power data missing continuously over a long period, this method not only achieves high completion accuracy but also correctly reflects the variation characteristics of the power data sequence.
For the load peaks (load mutations) in the graph, the filling results given by the model not only reflect the real trend but also differ little in magnitude, highlighting the superiority of the model. Table 2 shows the completion error of the proposed method for different numbers of missing days. The trends of MAE, MAPE, and RMSE are similar to those in the random missing case. When more than 8 days are missing, performance degrades faster, and the worst completion performance occurs at 9 missing days; however, in practice such failures are generally repaired within 7 working days. Therefore, the model remains valid for whole-day power data loss.

E. ALGORITHM COMPARISON
To verify the advantages of the improved deep convolutional autoencoder model proposed in this paper, the method is compared with the autoencoder and cubic interpolation. Figure 9 shows the trends of MAE, MAPE, RMSE, and running time as the random missing rate of the station-area test set increases from 10% to 50%. From Figure 9(d), the running time of the improved deep convolutional autoencoder is the longest, followed by the autoencoder and then cubic interpolation. As the random missing rate increases, the running time of cubic interpolation increases. Overall, Figure 9(a)~(c) show that the errors of all the algorithms increase with the missing rate. Cubic interpolation has the largest values under all three errors and the worst completion effect, followed by the autoencoder; the improved deep convolutional autoencoder has the smallest values under all three errors and the best completion effect. This shows that introducing the residual network and BN into deep convolutional autoencoding enhances the model's adaptability to user load data in the station area and improves its ability to encode high-level semantic information.
In detail, the MAE of the improved deep convolutional autoencoder model is about 0.006, 0.008, and 0.0121 at missing rates of 10%, 20%, and 30%; compared with cubic interpolation the accuracy is improved by about 3 times, and compared with the autoencoder by about 2 times. The MAPE is about 0.800, 1.012, and 1.500 at the same missing rates; compared with cubic interpolation the accuracy is improved by about 3 times, and compared with the autoencoder by about 1.7 times. The RMSE is about 0.012, 0.015, and 0.022 at the same missing rates; compared with cubic interpolation the accuracy is improved by about 4 times, and compared with the autoencoder by about 2 times. When the missing rate exceeds 40%, the MAE, MAPE, and RMSE of the improved deep convolutional autoencoder increase slowly, while the errors of the other algorithms increase rapidly. In summary, the improved deep convolutional autoencoder model can learn the mapping between the historical and non-missing load data of the low-voltage station area and the load data to be repaired, and its ability to extract load data characteristics and abstract temporal and spatial correlations is better than that of the other algorithms; it efficiently repairs missing power data under different missing rates. Figure 10 shows the trends of MAE, MAPE, RMSE, and running time when the test set power data is missing for 1 to 5 whole days. From Figure 10(d), the trend of the running times is similar to that in the random missing case.
Figure 10(a)~(c) show that the error trends are similar to those in the random missing case in Figure 9. The error curve of the improved deep convolutional autoencoder is always below those of the other two algorithms, indicating that it has the highest completion accuracy. To sum up, whether user power data is missing randomly or for whole consecutive days, the improved deep convolutional autoencoder completion method performs better than the cubic interpolation and autoencoder completion methods.

V. CONCLUSION
Addressing the large amount of missing user power data in current low-voltage station areas, this paper proposes a missing power data completion method based on an improved deep convolutional autoencoder. By constructing the station-area power data into a spatio-temporal tensor format, the method makes it easier for the model to capture the spatio-temporal characteristics of the power data. Through training the improved deep convolutional autoencoder network, the model learns complex spatio-temporal relationships that are difficult to model explicitly, such as the correlation of multi-user power data in low-voltage station areas and the pattern of load fluctuations. The introduction of the residual network and BN enhances the adaptability of the deep convolutional autoencoder to user power data in the station area, increases network depth and convergence speed while avoiding network degradation, and strengthens the completion model's ability to encode high-level semantic information. The proposed method is applicable to both random missing and whole-day missing scenarios. Calculation of the three error evaluation indicators verifies that the method can accurately complete user data at a 40% missing rate and with two consecutive days of missing data.