Quantifying Analysis of the Impact of Haze on Photovoltaic Power Generation

Haze has a significant impact on photovoltaic (PV) power generation. When fine particulate matter reaches a certain concentration, it becomes the main factor affecting solar irradiance and seriously reduces PV power generation, yet few studies have quantified the effect of haze on PV power generation. This study proposes an improved degree of grey slope incidence method to analyze the weight of the effect of haze on irradiance. An exponential-linear model is used to describe the impact of haze on irradiance. Furthermore, a PV system model is used to quantify the loss of PV power under the influence of haze. By modeling and analyzing PV power generation data from Hangzhou, China, we conclude that the losses caused by haze in 2017 and 2018 were 5.25 ± 1.19% and 6 ± 1.16% of the original PV power generation, respectively. We extended the analysis to PV data from Tianjin, China. From December 2018 to December 2019, the loss of PV power generation caused by haze in Tianjin was 8.77 ± 0.9%. The quantitative analysis of the effect of haze on PV power can provide an effective basis for the economic evaluation of new PV systems and also plays an important role in the prediction and scheduling of PV power generation.


I. INTRODUCTION
Haze is a common form of air pollution with a wide range of influence. Since 2013, haze has become common in winter in many areas of China, mainly because of temperature inversions and the larger amount of heating-related pollutants emitted in winter [1]. After years of governance and the development of clean energy such as solar energy, the haze problem in China has improved significantly [2]; however, compared to developed countries, air pollution in China remains a serious problem [3]. Air pollution mainly includes inhalable particulate matter (PM10) and fine particulate matter (PM2.5), among which PM2.5 is the main component of haze [4], [5]. Its effect on solar irradiance is mainly to reduce direct irradiance and increase scattered irradiance, which weakens the amount of irradiance reaching the ground to a certain extent. Moreover, some studies have used the air quality index (AQI) as an additional parameter in models that estimate global solar irradiance on the horizontal plane, and experiments have shown that it improves model accuracy [6]. In recent years, rapid economic growth has led to the rapid development of various industries, with a concomitant increase in hazy weather. Studies have confirmed that pollutants in the atmosphere reduce the solar radiation reaching the surface while also reducing visibility [7]. Air pollution is a major health risk, especially for respiratory and cardiovascular diseases [8]-[10], because fine particles can enter the human lungs and blood, increasing premature mortality [11], [12]. Because PV generation is free of noise, free of pollution and has no geographical limitations, it has contributed substantially to the reduction of air pollution. According to statistics, by the end of 2018, the cumulative installed PV capacity nationwide was 174 million kWp. With the rapid development of the economy, hazy weather is likely to continue, leading to a continued reduction of solar irradiance whose impact is difficult to estimate [13]. Solar irradiance is an important energy source for human activities, but the haze resulting from those activities reduces the amount of irradiance reaching the surface, and different haze concentrations have been found to influence irradiance to different degrees [14]. When the haze concentration reaches a certain value, it becomes the main factor affecting irradiance and seriously reduces PV power generation [15].
The associate editor coordinating the review of this manuscript and approving it for publication was Anisul Haque.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Many studies have revealed the impact of air pollution on solar irradiance by constructing an index representing atmospheric conditions or by analyzing the relationship between haze and irradiance [16], [17]. Other studies have combined observations of atmospheric particulate matter concentration with irradiance observations to compare the trends of the two, further confirming that hazy weather has an apparent effect on irradiance [18]-[20]. By classifying the weather type and the degree of haze and selecting similar days to control for meteorological factors other than haze, the independent impact of haze on PV power generation can be analyzed [21]. Furthermore, the influence of different fog and haze levels on daily global irradiance has been analyzed based on principal component analysis. The measured data indicate that irradiance gradually weakens as the AQI increases, but the rate of weakening gradually slows [22]. Existing studies have made some progress on the impact of atmospheric pollution on solar irradiance, but most have only discussed the relationship between haze concentration and irradiance from a qualitative perspective [23]. This study focuses on the effect of haze on irradiance and the resulting reduction in PV power generation. The conversion between irradiance and PV power generation can be estimated using the photoelectric conversion efficiency, but this leads to large errors in the analysis results. In this work [24], by analyzing the correlation between haze and PV power generation, the relationship between haze and irradiance is established to calculate the irradiance loss in hazy weather. We combine this relationship with the actual PV system model to calculate the loss of power generation in hazy weather and provide a scientific basis for the rational use of solar resources [25]. The rest of the article is organized as follows. Section II describes data preprocessing and database creation. Section III introduces the grey slope incidence method and analyzes the correlation degree of the factors influencing PV power. Section IV fits the relationship between PM2.5 and irradiance reduction and quantitatively calculates the effect of haze on PV power generation. Conclusions follow at the end.

II. DATABASE CREATION
The meteorological and irradiance data used in this paper are from the Hangzhou Meteorological Bureau (E120°17′58.7″, N30°22′90.88″). High-quality, high-frequency irradiance and air pollution data help ensure the accuracy of the analysis results. The original meteorological and irradiance data are sampled once every 5 minutes, and the average of the 12 samples in each hour is used as the calculation data. To accurately capture the influence of haze and other factors on irradiance, the data need to be processed. The data processing in this paper has two parts: removing severe-weather data and eliminating the influence of seasonal factors on irradiance.
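The hourly averaging described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `hourly_means`, the sample readings and the choice to drop an incomplete trailing hour are all assumptions.

```python
from statistics import mean


def hourly_means(samples, per_hour=12):
    """Average consecutive 5-minute samples into hourly values.

    `samples` is a flat, time-ordered list of readings. An incomplete
    trailing hour is dropped (an assumption; the text does not say how
    partial hours are handled).
    """
    full_hours = len(samples) // per_hour
    return [
        mean(samples[h * per_hour:(h + 1) * per_hour])
        for h in range(full_hours)
    ]


# Hypothetical irradiance readings in W/m^2: one steady hour, one ramping hour.
readings = [500.0] * 12 + [600.0 + 10.0 * i for i in range(12)]
hourly = hourly_means(readings)
```

Averaging over fixed blocks of 12 keeps one value per hour, which matches the hourly resolution used in the rest of the analysis.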
A. DATA FILTERING
Cloudy, rainy and snowy weather has a huge impact on irradiance. If the severe-weather data are not discarded, the analysis of the influence of haze on irradiance will deviate greatly.
For clear-sky filtering, we follow the data quality-control method in [17]. We establish clear-sky filters to remove all data recorded under weather other than clear sky. The weather type is taken from the real-time weather records in the data provided by the Meteorological Bureau. As high humidity values are strongly related to rain and foggy weather [26], data with a relative humidity greater than 80% are discarded to obtain clear-sky data. In addition, when the relative humidity reaches 80%, its effect on visibility is particularly obvious [27], and a decrease in visibility indicates that solar irradiance is affected. Figure 1 shows the clear-sky filtering process for the Hangzhou data of 2017. Clear-sky filtering is performed on all the data used in the study, leaving only the clear-sky data that meet the conditions.
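A minimal sketch of such a filter is given below. The record layout (`weather` and `rh` keys) and the label `"clear"` are hypothetical; only the 80% relative-humidity threshold comes from the text, and the full quality-control procedure of [17] is not reproduced.

```python
CLEAR_SKY_TYPES = {"clear"}      # hypothetical weather-record label
MAX_RELATIVE_HUMIDITY = 80.0     # percent; threshold stated in the text


def is_clear_sky(record):
    """Keep a record only if it is logged as clear sky and RH <= 80%."""
    return (
        record["weather"] in CLEAR_SKY_TYPES
        and record["rh"] <= MAX_RELATIVE_HUMIDITY
    )


def clear_sky_filter(records):
    """Return the records that pass the clear-sky conditions."""
    return [r for r in records if is_clear_sky(r)]


# Hypothetical hourly records.
records = [
    {"weather": "clear", "rh": 55.0, "irradiance": 720.0},
    {"weather": "rain", "rh": 92.0, "irradiance": 110.0},
    {"weather": "clear", "rh": 85.0, "irradiance": 430.0},
]
kept = clear_sky_filter(records)
```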

B. DATA NORMALIZATION
In the study, 24 months of insolation data were considered. Because the earth revolves around the sun over the course of a year, the solar irradiance and temperature obtained in different periods differ, resulting in differences between the four seasons [28], [29]. In the subsequent analysis, to eliminate the impact of seasonal differences, the data from different months are normalized to the same month [24]. This study uses min-max normalization to transform a numerical variable from its original value range (min x to max x) to the specified target range, here the range of the reference month. Because the transformation is linear, it preserves the relationships in the original data, so the normalized data do not mislead the analysis.
FIGURE 2. Use normalization to eliminate differences in irradiance due to seasons, irradiance before normalization (a), irradiance after normalization (b).
The effect of this normalization is shown in Figure 2 for the example of normalizing Hangzhou's January 2017 data to July. Using the same method, the data of the remaining months are normalized to July of the same year and a clear-sky database is established.
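The min-max mapping onto a reference month can be sketched as follows. The function name and the January and July numbers are hypothetical; the sketch assumes the target range is simply the minimum and maximum of the reference month.

```python
def min_max_to_range(values, target_min, target_max):
    """Linearly map `values` from [min, max] onto [target_min, target_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("values must not all be equal")
    scale = (target_max - target_min) / (hi - lo)
    return [target_min + (v - lo) * scale for v in values]


# Hypothetical clear-sky irradiance values in W/m^2.
january = [100.0, 250.0, 400.0]
july_min, july_max = 200.0, 900.0   # assumed range of the reference month
normalized = min_max_to_range(january, july_min, july_max)
```

Because the mapping is linear, the ordering and relative spacing of the January values are preserved after they are moved onto the July range.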

III. CORRELATION ANALYSIS OF INFLUENCING FACTORS OF PV POWER
In PV systems, PV power is related to meteorological factors such as irradiance, temperature, PM2.5 and wind speed, as well as to the interactions between these factors. The most important are irradiance and temperature. By establishing the correlation of the influencing factors, the weight of their influence on the irradiance can be analyzed. Because the factors that affect PV power are many and complex, this study proposes the use of the improved degree of grey slope incidence model, which can reflect both positive and negative correlation and is symmetric. Compared with other correlation measures such as Pearson, Kendall and Spearman, the grey correlation degree is more suitable for exploring nonlinear correlation because it judges correlation by the similarity of the geometric shapes of two variables and places no restriction on the data type. In this work, the relationship between the variables is not linear but complex and nonlinear, so grey correlation analysis is considered more applicable.

A. IMPROVED MODEL OF DEGREE OF GREY SLOPE INCIDENCE
Since the influence of multiple factors on power generation in the PV system is not clear, the system can be regarded as a grey system. Grey correlation analysis measures the correlation of various influencing factors according to their similarity or dissimilarity. The degree of grey slope incidence was proposed on the basis of the grey relation grade and the grey absolute relation grade, and it better reflects the correlation between the sub-sequence and the mother sequence. Its basic idea is to express correlation through the closeness of the relative change trends of the factor time-series curves. The traditional correlation degree, however, has the following defects [30]: (1) the value of the correlation is not unique, because the resolution coefficient is generally set to 0.5 and changing it changes the correlation; (2) negative correlation is rarely discussed, although two sequences may be negatively correlated; (3) the degree of correlation is not symmetric [31]. Considering these defects, the grey slope correlation degree is improved based on the difference in slope between the reference sequence and the comparison sequence. The improved correlation model can reflect negative correlation and is symmetric. The improved degree of grey slope incidence model is calculated as follows:

1) ESTABLISH THE CORRELATION CALCULATION MODEL OF EACH INFLUENCING FACTOR OF THE PV SYSTEM AND THE AMOUNT OF POWER GENERATION BY USING DEGREE OF GREY SLOPE INCIDENCE THEORY
In the correlation model of the PV system, $X_i(k)$, $i = 1, 2, \ldots, m_1$, are the influencing factors and $Y_j(k)$, $j = 1, 2, \ldots, m_2$, are the affected quantities such as the PV power, where $k$ indexes the time series. For each influencing factor, the average value over the period from 8 a.m. to 5 p.m. on day $k$ is taken as the variable $X$, with the PM2.5 value as $X_1$; the total PV power on day $k$ is taken as $Y_1$ and the irradiance as $Y_2$.

2) GREY SLOPE CORRELATION COEFFICIENT
For each sequence, at time $k$, $\gamma_{ij}(k)$ is the grey slope correlation coefficient between $X_i(k)$ and $Y_j(k)$. It is built from the following quantities: $\mathrm{sgn}_k$ is the sign function of the correlation degree; $\Delta x_i(k)/\Delta k$ is the gradient of the PV system influencing factor from $k$ to $k + 1$; and $\Delta y_j(k)/\Delta k$ is the gradient of the PV system affected factor from $k$ to $k + 1$.

3) CORRELATION
$\gamma_{ij}$ is called the correlation degree of $X_i(k)$ and $Y_j(k)$. When $-1 \le \gamma_{ij} < 0$, $X_i(k)$ and $Y_j(k)$ are negatively correlated; the higher the absolute value $|\gamma_{ij}|$, the stronger the negative correlation. When $0 < \gamma_{ij} \le 1$, $X_i(k)$ and $Y_j(k)$ are positively correlated; the higher the absolute value $|\gamma_{ij}|$, the stronger the positive correlation. When $\gamma_{ij} = 0$, $X_i(k)$ and $Y_j(k)$ are uncorrelated. The correlation degree is used to calculate the weight of each influencing factor on PV power and provides a basis for the subsequent model of the impact of haze on PV power generation.
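To make the properties listed above concrete (values in [−1, 1], sign-aware, symmetric), the following sketch computes a slope-based correlation degree. The coefficient formula used here is a hypothetical stand-in chosen only for illustration; it is not the paper's improved grey slope incidence formula, which is not recoverable from this text. Slopes are normalized by each sequence's mean, which is also an assumption.

```python
def slope_correlation_degree(x, y):
    """Illustrative slope-based correlation degree in [-1, 1].

    Hypothetical coefficient per step k (NOT the paper's formula):
        sgn(a * b) / (1 + | |a| - |b| |)
    where a and b are the mean-normalized slopes of x and y from k to k+1.
    The degree is the average of the coefficients over all steps.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("sequences must have equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    if mean_x == 0 or mean_y == 0:
        raise ValueError("sequence means must be non-zero")
    total = 0.0
    for k in range(len(x) - 1):
        a = (x[k + 1] - x[k]) / mean_x
        b = (y[k + 1] - y[k]) / mean_y
        product = a * b
        sign = (product > 0) - (product < 0)   # +1, 0 or -1
        total += sign / (1.0 + abs(abs(a) - abs(b)))
    return total / (len(x) - 1)


# Hypothetical daily series: PM2.5 rising while PV energy falls.
pm25 = [20.0, 35.0, 60.0, 90.0]
pv_energy = [520.0, 500.0, 455.0, 410.0]
degree = slope_correlation_degree(pm25, pv_energy)
```

Swapping the two arguments gives the same value, and sequences that move in opposite directions at every step yield a negative degree, which mirrors the interpretation of $\gamma_{ij}$ given above.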

B. CORRELATION DEGREE OF INFLUENCING FACTORS OF PV POWER
Using the grey slope correlation model established in the previous section, we calculate the correlation between PM2.5 concentration and PV power generation at seven temperature levels. Setting different temperature levels eliminates the influence of season on power generation while maintaining the features of the original data. Table 1 shows the correlation of PM2.5 with PV power generation and with irradiance at the seven temperature levels. The average correlation between PM2.5 and PV power generation is −0.713, indicating that PM2.5 has a high negative correlation with PV power generation. As a major influencing factor, the concentration and duration of PM2.5 directly affect PV power generation. Haze affects PV power generation by reducing irradiance: Table 1 shows that the correlation between PM2.5 and irradiance is −0.73. In this work, irradiance is therefore taken as the intermediate variable, and the effect of PM2.5 on it is calculated to enable the quantitative analysis.

IV. ANALYSIS AND DISCUSSIONS
A. RELATIONSHIP BETWEEN PM2.5 AND IRRADIANCE REDUCTION
PM2.5 is divided into levels according to the distribution of the meteorological data. By averaging the irradiance at the same time point within each level, the irradiance curve corresponding to each PM2.5 level and the dispersion of the irradiance can be obtained. The solid line in Figure 3 is the fitted irradiance curve corresponding to each PM2.5 level. The size of each circle represents the dispersion of the data. It can be seen that an increase in the haze value has an obvious effect on irradiance.
The correlation analysis above and the observed effect of PM2.5 on irradiance show that the more severe the haze, the greater its impact on irradiance. The more severe the hazy weather, the higher the concentration of particulate matter and nitrogen dioxide in the atmosphere and the greater the absorption and reflection of sunlight [32], so the PV module receives less solar irradiance. To quantify the impact of PM2.5 on the relative irradiance, the relationship between PM2.5 and irradiance reduction is described by a fitted function. The influence of PM2.5 on irradiance has been found to follow the Lambert-Beer law, giving an exponential decay relationship [33]. Other literature describes the relationship between PM2.5 and irradiance with a linear regression model [34]. Building on the existing exponential and linear models, we propose a composite model to fit the relationship between PM2.5 and irradiance. This model supplies the exponential trend that the linear regression model lacks and the linear term that the exponential model lacks. Fitting verification shows that the composite model has higher fitting accuracy. The least-squares method is used to solve the fitting parameters of the linear model [35], where the parameters that minimize the residual are taken as the best fit. The Levenberg-Marquardt algorithm is used to solve the fitting parameters of the exponential and composite models [36]. It is the most widely used nonlinear least-squares algorithm; it combines the advantages of gradient descent and the Gauss-Newton method and provides numerical solutions to nonlinear minimization problems (local minima). Figure 4 shows the three fitting methods used to describe the relationship between PM2.5 and irradiance reduction: linear function, exponential function and composite function.
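For the linear case, the least-squares fit has a closed form, sketched below with the standard library only. The data points are synthetic and the function names are hypothetical. The exponential and composite models need an iterative nonlinear solver such as Levenberg-Marquardt, which is not implemented here.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Coefficient of determination of the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Synthetic points: PM2.5 in ug/m^3 against irradiance in W/m^2.
pm25 = [10.0, 50.0, 100.0, 150.0]
irradiance = [1000.0, 980.0, 955.0, 930.0]
slope, intercept = fit_line(pm25, irradiance)
r2 = r_squared(pm25, irradiance, slope, intercept)
```

Comparing $R^2$ across candidate models fitted to the same data is how the linear, exponential and composite fits can be ranked.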
The fitting method in Figure 4a is the same as that used for Singapore [17], but the data collected in this study cover a wider range, which is an advantage. The exponential fitting in Figure 4b is the same as that used for Delhi [24]. The blue dotted line in the figure reproduces the Delhi result, which is roughly consistent with the findings of this article. The data in this study are provided directly by the local meteorological bureau, so they are more accurate and form a large data set. However, the $R^2$ of the relationships fitted by these existing methods is not the best. Because the relationship between PM2.5 and irradiance is complex and unclear, this paper proposes the composite model in Figure 4c, which effectively combines the advantages of the first two fitting methods and improves the fitting accuracy.
The dashed line in Figure 5 is the 95% confidence interval. According to the above fits, solar irradiance shows a decreasing trend as the PM2.5 concentration increases, and the exponential-linear composite model fits best among the three relationships. To test the generality of the exponential-linear composite function, haze and irradiance data under clear-sky conditions in Tianjin from January to October 2019 are used for verification. Figure 5 shows that 92% of the data lie within the confidence interval. Therefore, (5) is used to describe the relationship between the PM2.5 value and irradiance in the subsequent analysis:

$$E_e = \frac{Y}{1000} = \exp(-0.0012 \cdot \mathrm{PM}_{2.5} + 6.91 \pm 0.014) - 0.0455 \cdot \mathrm{PM}_{2.5} + 4.1 \qquad (5)$$

where $E_e$ is the irradiance and $\mathrm{PM}_{2.5}$ is the measured PM2.5 concentration.
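As a rough numerical illustration, the sketch below evaluates the right-hand side of (5) with the central coefficients (ignoring the ± 0.014 term) and reports the relative reduction against a haze-free baseline. The baseline of PM2.5 = 10 is the haze-free setting the paper uses in its irradiance-loss analysis; the function names are hypothetical. Only ratios are reported, so the result does not depend on the overall scaling of the expression.

```python
import math

HAZE_FREE_PM25 = 10.0  # haze-free setting used in the paper's loss analysis


def eq5_rhs(pm25):
    """Right-hand side of (5) with central coefficients (no +/- term)."""
    return math.exp(-0.0012 * pm25 + 6.91) - 0.0455 * pm25 + 4.1


def relative_irradiance_loss(pm25, baseline=HAZE_FREE_PM25):
    """Fractional irradiance reduction relative to the haze-free baseline."""
    return 1.0 - eq5_rhs(pm25) / eq5_rhs(baseline)


# Hypothetical PM2.5 concentrations in ug/m^3.
for level in (10.0, 50.0, 100.0, 150.0):
    print(f"PM2.5 = {level:5.1f}: loss = {relative_irradiance_loss(level):.3f}")
```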

B. IRRADIANCE LOSS BY HAZE
This subsection analyzes the irradiance loss caused by haze in Hangzhou in 2017 and 2018. The haze situation in Hangzhou during the two years is shown in Figure 6. Haze days in Hangzhou occur mostly from January to March and from November to December. Figure 7 shows the irradiance loss caused by haze, calculated using (5). The amount of irradiance discussed in this study is accumulated from the hourly irradiance values that meet the clear-sky conditions between 8 a.m. and 5 p.m. Haze-free weather is defined in this study by setting the PM2.5 index to 10. Figures 6 and 7 show that the higher the PM2.5 index, the greater the irradiance loss.
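The accumulation over the 8 a.m. to 5 p.m. window can be sketched as below. The dictionaries map an hour label to an hourly mean irradiance for clear-sky hours only; treating the label as the start of the hour (so the window covers hours 8 to 16) is an assumption, as are the function names and numbers.

```python
def daily_irradiation(hourly, start_hour=8, end_hour=17):
    """Sum hourly mean irradiance (W/m^2) within the window, giving Wh/m^2.

    `hourly` maps the starting hour of each clear-sky record to its value.
    """
    return sum(v for hour, v in hourly.items() if start_hour <= hour < end_hour)


def irradiation_loss_fraction(measured, haze_free):
    """Share of the haze-free daily irradiation that is lost to haze."""
    clean = daily_irradiation(haze_free)
    return (clean - daily_irradiation(measured)) / clean


# Hypothetical clear-sky hours; the 18:00 record falls outside the window.
measured = {8: 100.0, 12: 800.0, 18: 50.0}
haze_free = {8: 110.0, 12: 880.0, 18: 55.0}
loss = irradiation_loss_fraction(measured, haze_free)
```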

C. PV POWER GENERATION MODEL
Based on the actual PV system in Hangzhou, a corresponding Simulink PV model is established; the model structure is shown in Figure 8. The PV model uses the irradiance and the corresponding temperature as its variable inputs. Accounting for the effect of temperature on PV power generation effectively improves the accuracy of the PV power generation model. The PV power generation system was installed in Hangzhou in 2009, with an installed capacity of 120 kWp and a rated maximum output power of 167 Wp per panel. Considering the longitude and latitude of Hangzhou and the maximum summer power generation, the installation angle is 30 degrees, and the inverter efficiency is 95%.
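The paper's Simulink model is not reproduced here. As a much simpler stand-in, the sketch below uses a common first-order approximation in which output scales linearly with irradiance and is derated by temperature. The 120 kWp capacity and 95% inverter efficiency come from the text; the temperature coefficient of −0.4% per °C, the use of cell temperature as the input and the function name are assumptions, and the 30-degree tilt is ignored.

```python
STC_IRRADIANCE = 1000.0   # W/m^2, standard test conditions
STC_CELL_TEMP = 25.0      # deg C, standard test conditions


def pv_ac_power_kw(irradiance, cell_temp_c, rated_kwp=120.0,
                   inverter_eff=0.95, temp_coeff=-0.004):
    """First-order PV output estimate in kW (not the paper's Simulink model).

    temp_coeff is an assumed power temperature coefficient per deg C.
    """
    derate = 1.0 + temp_coeff * (cell_temp_c - STC_CELL_TEMP)
    dc_power = rated_kwp * (irradiance / STC_IRRADIANCE) * derate
    return max(0.0, dc_power * inverter_eff)


# Hypothetical operating points.
at_stc = pv_ac_power_kw(1000.0, 25.0)
hot_hazy = pv_ac_power_kw(700.0, 45.0)
```

Feeding such a model with measured irradiance and with the estimated haze-free irradiance gives two power series whose difference is the haze-induced loss.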
Historical data are used to verify the accuracy of the model. Take the power generation data of June 10, 2017, as an example (maximum temperature 34 °C, maximum relative air humidity 76%, and average PM2.5 index 22 µg·m⁻³). The modeled and actual power generation curves are shown in Figure 9. The instantaneous power error is within 3 kW, and the cumulative power generation error is 1.8%.
In June 2017, there were 22 sunny days after excluding rainy days, and the measured power generation was 7951 kWh. The Simulink PV model estimates the power generation at about 8037.6 kWh, which is 151.9 kWh different from the actual power generation. Figure 10 shows the error between the modeled and actual power generation; the average error for the month is 1.89%.

D. QUANTIFYING ANALYSIS OF THE IMPACT OF HAZE ON PV POWER GENERATION
Sunny days with sufficient sunshine were selected to analyze the power generation at four different haze values, since sunny days are more useful for analyzing the relationship between haze, irradiance and power generation. The principle of similar days was used to select clear-sky days with similar temperature, cloud cover and wind speed. Figure 11 shows the irradiance and PV power generation curves for four days with different haze levels. It can be seen that the loss of irradiance has a great impact on power generation. Comparing the haze values in Figure 11 with the actual daily power generation shows that the daily generated energy decreases as the PM2.5 value increases. Between the data for December 2, 2017 and February 27, 2017, the daily power generation decreased by 118 kWh when the PM2.5 value increased by 74 µg·m⁻³.
An increase in PM2.5 concentration seriously reduces PV power generation. The quantitative impact of haze on PV power generation in Hangzhou in 2017 and 2018 can be obtained by comparing the actual generated energy with the model-generated energy. We use (5) to calculate the haze-free irradiance, convert it to PV power with the 120 kWp PV system model, and compare the result with the actual power generation. As shown in Table 2, the haze-induced losses of power generation in 2017 and 2018 were 5.25 ± 1.19% and 6 ± 1.16% of the original power generation, respectively.

V. EXPERIMENTAL VERIFICATION
The above method can also be used to calculate the quantitative impact of haze on PV power generation in Tianjin. Tianjin is in China's second-class solar irradiance area, where solar energy resources are relatively abundant [37], but haze pollution in Tianjin is more serious than in Hangzhou. Figure 12 shows the PM2.5 in Tianjin from December 2018 to December 2019. The PV system in Tianjin has an installed capacity of 50 kWp, the rated power of each PV panel is 265 Wp, and the inverter efficiency is 97.9%.
According to calculations, the PV system with an installed capacity of 50 kWp in Tianjin has 2422 hours of clear-sky sunshine in a year, a power generation of 61963.1 kWh and a PV panel area of 318 m², so the annual clear-sky power generation per unit area is 194.9 kWh·m⁻². We use (5) to calculate the loss of irradiance due to haze and, based on the Tianjin PV system model, analyze the quantitative impact of haze on generated energy. As shown in Table 3, the annual loss is 8.77 ± 0.9% of the actual power generation.
FIGURE 11. Comparison of four different levels of PV power generation with haze values. The yellow curve is irradiance and the blue curve is power generation.
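The per-area figure quoted for Tianjin can be checked with a few lines of arithmetic. All inputs are taken from the text; the two derived quantities other than the per-area value (energy per installed kWp and mean output over the clear-sky hours) are extra illustrations that the paper does not report.

```python
annual_energy_kwh = 61963.1   # annual clear-sky generation, from the text
panel_area_m2 = 318.0         # PV panel area, from the text
installed_kwp = 50.0          # installed capacity, from the text
clear_sky_hours = 2422        # clear-sky sunshine hours, from the text

# Annual clear-sky generation per unit panel area (kWh per m^2).
energy_per_area = annual_energy_kwh / panel_area_m2

# Derived illustrations, not reported in the paper.
energy_per_kwp = annual_energy_kwh / installed_kwp      # kWh per kWp
mean_power_kw = annual_energy_kwh / clear_sky_hours     # average kW

print(round(energy_per_area, 1))
```

The rounded result agrees with the 194.9 kWh·m⁻² stated in the text.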

VI. CONCLUSION
In recent years, China's PV power generation has continued to develop rapidly. With continuously improving technology and significantly reduced cost, it has played an important role in building a clean, low-carbon, safe and efficient energy system. The development of PV power generation has also contributed to reducing haze pollution. However, the analysis in this paper, based on the degree of grey slope incidence, shows that haze pollution is one of the main factors affecting irradiance. The haze in Hangzhou is mainly concentrated in January to April and November to December. The collected data were preprocessed to establish a database, which was used to fit the relationship between PM2.5 and irradiance. The haze-free PV power could then be obtained by converting the haze-free irradiance to power with the Simulink PV power model.
By comparing the haze-free PV power with the actual power generation, we conclude that the power generation loss caused by haze in Hangzhou was 5.25 ± 1.19% of the original power generation in 2017 and 6 ± 1.16% in 2018. Because haze varies geographically, the loss of PV power generation caused by haze would increase greatly in areas with severe haze pollution. The haze situation in Tianjin is more serious than that in Hangzhou. In this paper, the PV system data of Tianjin are used to verify the relationship between the haze value and irradiance. Of the Tianjin irradiance data, 92% conform to the exponential-linear model of PM2.5 and irradiance, and the loss caused by haze in Tianjin is 8.77 ± 0.9% of the original power generation. The research in this paper can be applied to the analysis of the loss of PV power generation caused by haze in any region, and it can also be used as a reference for PV site selection and for calculating the return on investment of a PV system. In follow-up research, PM2.5 can be included as one of the influencing factors of PV power generation, and the power generation prediction system can be updated to provide efficient and orderly information for the scheduling of the PV power system. This can also help the PV system choose the optimal scheduling strategy to maximize operating efficiency.

A. CONTRIBUTION
With the accelerated development of urbanization and industrialization, many regions of China have obvious air pollution problems. In cities where such regional air pollution occurs frequently, large amounts of aerosol scatter and absorb sunlight, which reduces the irradiance. This work provides a feasible method to quantify more accurately the reduction of PV power generation caused by haze. Through the establishment of a database, the relationship between PM2.5 and irradiance is obtained, and the loss of power generation caused by PM2.5 is quantified. We believe that this can provide effective suggestions for solar energy investment.

B. OPPORTUNITIES
China has seven of the ten most polluted cities in the world, which shows that China's air pollution is quite serious. Only when everyone pays attention to it can the relevant departments take scientific and reasonable measures to deal with and prevent air pollution. PV power generation is a feasible way to help control air pollution. A quantitative description of the influence of haze on PV power can be used to evaluate the economic loss it causes and to provide a reference for PV investment. In future work, we will focus on PV power generation prediction under the influence of haze. Accounting for the influence of haze is indispensable for improving prediction accuracy, and accurate PV forecasting helps ensure the safety and economy of the power system.

C. AIR POLLUTION
Solar energy helps reduce air pollution while itself being affected by it. We hope this will form a virtuous circle. The causes of haze are extremely complex and include automobile exhaust, industrial emissions, dense population and widespread damage to the ecological environment. Air pollution seriously harms our health, especially the respiratory and cardiovascular systems, and also increases the incidence of infectious diseases. Haze cannot be brought under control overnight; it requires the joint efforts of all mankind. We hope this work will draw people's attention and contribute to reducing air pollution.

DECLARATION OF COMPETING INTEREST
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.