Introduction
The contradiction between global economic development and environmental pollution is becoming increasingly prominent, as illustrated by the 1952 London air pollution incident. The destruction of the environment by industrial development causes ecological crises [1], [2]. China faces particularly severe dual challenges of environment and development in the process of economic transformation [3], [4]. The green development concept proposed by the Chinese government emphasizes the comprehensive consideration of environmental protection and resource conservation in economic development [5].
Zhang et al. aimed to improve the ecological driving performance and green economic benefits of vehicles by establishing transient fuel consumption models and discrete vehicle speed planning models. They analyzed the computational complexity and verified the fuel-saving features through comparative experiments. The new algorithm saved over 90% of computation time at the same accuracy and achieved real-time speed curve optimization for vehicles [6]. Saputra et al. introduced a federated learning based energy demand forecasting method to maximize the economic benefits of electric vehicle charging station networks. By exchanging learning models with other charging stations, the prediction quality was improved. The accuracy of energy demand forecasting in this framework was improved by 24.63%, and it outperformed other economic models [7]. To address the urgent issue of low-carbon energy development in the context of global climate change, Chen et al. used a multi-universe quantum harmony search algorithm to construct a carbon emission prediction model. The composite model predicted carbon emissions effectively, with a low error rate [8].
Luo combined data envelopment analysis methods to design workflows, create predictive models, and introduce sliding factor matrices to achieve carbon emission prediction. The model predicted carbon emissions effectively, with a low error rate [9]. Jia et al. put forth a model predictive control strategy for the constrained optimization problem inherent to wave energy converter systems. The model used a tracking cost function, and a general economic cost function was established. The effectiveness of the proposed strategy was verified, and the model improved the accuracy of efficiency prediction [10]. Zhang et al. used provincial panel data in China from 2005 to 2016, together with linear regression, mediation effect, and threshold regression models, to address the neglect of the impact of technological innovation. The mediating role of energy consumption in the impact of technological innovation on carbon reduction was found to be significant, and there was a threshold effect [11].
Although the above research has made significant progress in promoting green economy and low-carbon technology, it generally has the following shortcomings. Firstly, some models may be too idealized and lack adaptability to complex real-world environments. Secondly, economic, social, and technological factors are not considered comprehensively enough, which may overlook the interactive effects of multidimensional factors [12], [13]. In view of this, this study builds on a Fixed Effects Model (FEM) and introduces a classification algorithm that combines the Long Short-Term Memory network with Bootstrap (LSTMS-B) to assist in selecting the most suitable fixed effects hierarchy, so as to improve the accuracy and explanatory power of the model in measuring Green Economic Efficiency (GEE). By improving the attention gate mechanism of LSTM neurons, the internal parameters of the model are reduced and important historical location information is emphasized. Finally, a new Enhanced Fixed Effects Model-based Comprehensive Scheme for Green Economic Efficiency Measurement (EFEM-CEEM) is designed. It aims to improve the FEM for measuring and analyzing GEE, providing theoretical support and empirical evidence for formulating more effective green development strategies.
Design of EFEM-CEEM Scheme Based on FEM
A. Construction of Green Economy Model Based on Fixed Effects Model
The FEM is a panel data analysis method [14], [15]. It controls for unobservable heterogeneity at the individual level (i.e., factors that remain constant over time but differ between individuals), which is particularly important for studying green economy efficiency, because regions and enterprises differ significantly in resource endowments, technological foundations, and other aspects that may affect the results of efficiency evaluation [16], [17]. Using the FEM, the actual impact of policy interventions or other variable changes on GEE can be identified more accurately, thereby providing more reliable data support for decision-making [18], [19]. To apply the FEM in empirical analysis, it is first necessary to distinguish cross-sectional data from panel data. Cross-sectional data are collected from different individuals at the same point in time, and the regression model for cross-sectional data is shown in equation (1).\begin{equation*} Y_{i} =\mu +\beta X_{i} +\varepsilon _{i} \tag {1}\end{equation*}
In equation (1), \begin{equation*} Y_{it} =\mu _{i} +\beta X_{it} +\gamma Z_{i} +\alpha _{i} +\varepsilon _{it} \tag {2}\end{equation*}
In equation (2),
In Figure 1, simple regression methods cannot fully represent the relationships between data. If individual fixed effects are added to the model, that is, countries are added as dummy variables, each country has its own regression line [20], [21]. The slope of the new regression lines is higher than before, so after controlling for variations between countries, per capita GDP better explains changes in GEE. In addition, the regression lines of different countries are parallel, which indicates that the FEM assumes the causal effect of per capita GDP on GEE to be constant across individuals. The FEM can control for unobservable individual heterogeneity, such as region-specific resource endowments or policy environments, to more accurately estimate the influence of technological progress on GEE. To comprehensively consider multiple levels of efficiency evaluation and the impact of technological progress, this study introduces the super-efficiency Slack Based Measure (SBM) model to conduct relative efficiency evaluation within peer groups. Even when several decision-making units lie on the technology frontier, the more efficient units can still be distinguished. By combining these two methods, not only can the current GEE level of each region be evaluated, but the driving factors behind efficiency changes over time can also be examined. The study first uses the super-efficiency SBM model to assess the GEE of each decision-making unit and obtain its efficiency value. The construction of the super-efficiency model is shown in equation (3).\begin{equation*} \rho ^{\ast }=\min \frac {1-\frac {1}{L}\sum \nolimits _{l=1}^{L} {\frac {s_{l}^{x} }{x_{kl}^{t}}}}{1+\frac {1}{M+N}\left ({{\sum \nolimits _{m=1}^{M} {\frac {s_{m}^{y}}{y_{km}^{t}}}+\sum \nolimits _{n=1}^{N} {\frac {s_{n}^{b}}{b_{kn}^{t}}}}}\right)} \tag {3}\end{equation*}
In equation (3), \begin{equation*} ML=\sqrt {\frac {1+D_{0}^{t} \left ({{x^{t},y^{t},b^{t},y^{t},-b^{t}}}\right)}{1+D_{0}^{t} \left ({{x^{t+1},y^{t+1},b^{t+1},y^{t+1},-b^{t+1}}}\right)}\times \frac {1+D_{0}^{t+1} \left ({{x^{t},y^{t},b^{t},y^{t},-b^{t}}}\right)}{1+D_{0}^{t+1} \left ({{x^{t+1},y^{t+1},b^{t+1},y^{t+1},-b^{t+1}}}\right)}} \tag {4}\end{equation*}
In equation (4),
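As an illustration of equation (4), the following minimal Python sketch computes the index from four directional distance values. The function name, the argument names, and the numeric inputs are hypothetical and are not taken from the study; in practice each distance value would come from solving the corresponding efficiency model.

```python
import math


def ml_index(d_t_t, d_t_t1, d_t1_t, d_t1_t1):
    """Index in the form of equation (4).

    d_a_b is the directional distance of the period-b observation
    measured against the period-a frontier (illustrative naming).
    """
    ratio_t = (1.0 + d_t_t) / (1.0 + d_t_t1)      # both against the period-t frontier
    ratio_t1 = (1.0 + d_t1_t) / (1.0 + d_t1_t1)   # both against the period-(t+1) frontier
    return math.sqrt(ratio_t * ratio_t1)          # geometric mean of the two ratios


# Hypothetical distances: the unit moves closer to both frontiers in period t+1
example = ml_index(d_t_t=0.20, d_t_t1=0.10, d_t1_t=0.30, d_t1_t1=0.15)
print(round(example, 4))
```

Under the usual reading of this index, a value above 1 suggests an improvement between the two periods, and identical distance values in both periods give exactly 1.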
In Figure 2, AB is taken to represent the frontier at a given time, and X1 and Y are both technically efficient points, with Y superior to X1. X represents a point that is not yet technically efficient, so the distance it must cover to reach a technically efficient point can be represented by XX1. However, the decision point has then not yet reached the optimal point on the production surface [22], [23]. To reach the technically optimal point, the distance from the technically efficient point to the optimal point must be added, giving XX1+X1Y. In the efficiency chart, X is the inefficient point, Y is the efficient point, and X1 is the optimal point. The distance to reach the technically efficient point is XY, and the distance to reach the technically optimal point is XX2+X1Y. Then, based on the calculated efficiency values, a FEM is used for regression analysis to explore the influence of different factors on the GEE. Green GDP is the total economic output after deducting resource consumption costs and environmental degradation costs from traditional GDP. The evaluation index system for green economy is shown in Figure 3.
In terms of environmental indicators, the discharge of industrial wastewater measures the pollution of water resources, usually in units of 10,000 tons. The amount of industrial waste gas emissions (using sulfur dioxide as an example) measures industrial air pollution, also in units of 10,000 tons. The quantity of industrial solid waste produced measures the solid waste generated by industrial activities, again in units of 10,000 tons [24], [25]. In terms of social indicators, the investment in technological innovation measures the degree of attention to and intensity of investment in technological innovation in a region, usually measured by R&D expenditure in billions of yuan. The per capita green space area reflects the quality of life and ecological environment of local residents, measured in square meters per person. Resource utilization efficiency measures the amount of resources consumed per unit of economic output, reflecting the degree of resource conservation. These indicators together constitute a comprehensive evaluation system for the benefits of green economy. Thus, a variable system for the GEE model based on the fixed effects model can be constructed, as shown in Figure 4.
In Figure 4, in the variable system of the fixed effects model proposed in the study, Technical Efficiency (TE) is the key dependent variable, measuring the ratio between actual output and the theoretical maximum output under given input conditions. The number of employed people (X1), capital stock (X2), and energy consumption (X3) constitute the basic independent variables in the green economy efficiency model. X1 reflects the size of the population participating in labor in the economy. X2 represents the total value of available capital assets within the economy. X3 measures the total amount of energy consumed in production activities. In addition, human capital structure upgrading (HSTRUC), as an important independent variable, captures the proportion of highly educated or skilled personnel in the labor force, which directly relates to the development potential and technological innovation capability of the green economy. In terms of explanatory variables, energy consumption intensity (enc) describes the amount of energy consumption required per unit of output, industrial structure (ind) reflects the relative size and contribution of different industrial sectors in the economic system, and government regulation (gov) covers policies, regulations, and tax measures. The degree of financial agglomeration (la) captures the concentration of financial activities within a region.
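To make the dummy-variable interpretation of the FEM discussed with Figure 1 concrete, the following minimal Python sketch contrasts a pooled regression slope with the within (fixed effects) slope on a small synthetic panel. The data, the unit labels, and the function names are hypothetical and only illustrate the mechanism; the study's own estimation uses more variables and real data. For the slope, demeaning within each unit is equivalent to adding one dummy variable per unit.

```python
from collections import defaultdict


def ols_slope(xs, ys):
    """Slope of a pooled simple regression of ys on xs (with intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def within_slope(units, xs, ys):
    """Fixed effects (within) slope: demean x and y inside each unit first."""
    groups = defaultdict(list)
    for unit, x, y in zip(units, xs, ys):
        groups[unit].append((x, y))
    dx, dy = [], []
    for obs in groups.values():
        mean_x = sum(x for x, _ in obs) / len(obs)
        mean_y = sum(y for _, y in obs) / len(obs)
        dx.extend(x - mean_x for x, _ in obs)
        dy.extend(y - mean_y for _, y in obs)
    return sum(a * b for a, b in zip(dx, dy)) / sum(a * a for a in dx)


# Hypothetical panel: three units, each following y = alpha_unit + 2 * x
units = ["A"] * 3 + ["B"] * 3 + ["C"] * 3
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
alphas = {"A": 10, "B": 4, "C": -2}
ys = [alphas[u] + 2 * x for u, x in zip(units, xs)]

pooled = ols_slope(xs, ys)            # ignores unit heterogeneity
within = within_slope(units, xs, ys)  # removes unit-specific intercepts
print(pooled, within)
```

In this toy panel the pooled slope is 0.2, whereas the within slope recovers the common slope of 2 shared by the parallel unit-level lines, which mirrors the steeper slope obtained after controlling for individual effects.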
B. Improvement Plan for Green Economy Efficiency Calculation Based on FEM
The traditional FEM has some limitations in measuring the GEE. Firstly, this type of model assumes that all individual differences can be captured through fixed effects, which may overlook some important heterogeneity factors. Secondly, if the time effects or other hierarchical effects included in the model are not correctly specified, the estimation results may be biased [26], [27]. In addition, excessive fixed effects may reduce the degrees of freedom, thereby affecting the statistical power and predictive ability of the model. To overcome these problems, the LSTMS-B classification algorithm is introduced to assist in selecting the most suitable fixed effects hierarchy and finding the most explanatory model, which effectively controls unobservable heterogeneity at the individual level, reasonably accounts for other potentially important variables such as time effects, and helps identify the factors that have the greatest influence on the GEE. The structure of LSTMS-B is shown in Figure 5.
In Figure 5, LSTMS-B improves the traditional FEM using deep learning techniques. The method generates multiple sample sets through the Bootstrap technique, trains a separate LSTM model on each sample set, and then validates and weights the training results of the classifiers using the data. By introducing LSTM, LSTMS-B captures dynamic features in time series data. For each sample set, the corresponding LSTM model is trained as shown in equation (5).\begin{equation*} \theta _{i} =\arg \min _{\theta } \sum \limits _{(x,y)\in S_{i}} L \left ({{f_{LSTM}^{i} (x;\theta),y}}\right) \tag {5}\end{equation*}
In equation (5), L is the loss function; \begin{equation*} \hat {y}=\sum \limits _{i=1}^{N} {w_{i}} \cdot f_{LSTM}^{i} \left ({{x;\theta _{i}}}\right) \tag {6}\end{equation*}
In equation (6), each classifier corresponds to a weight
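The bootstrap-train-weight structure of equations (5) and (6) can be sketched in a few lines of Python. This is a sketch under stated assumptions, not the study's implementation: a one-dimensional threshold classifier stands in for each LSTM, the weights are taken as normalized out-of-bag accuracy (the weighting rule is an assumption, since the text does not spell it out), and all names and data are hypothetical.

```python
import random


def fit_stump(sample):
    """Stand-in base learner: threshold on x that minimizes the 0/1 loss."""
    best_thr, best_err = None, None
    for thr in sorted({x for x, _ in sample}):
        err = sum((1 if x >= thr else 0) != y for x, y in sample)
        if best_err is None or err < best_err:
            best_thr, best_err = thr, err
    return best_thr


def predict_stump(thr, x):
    return 1 if x >= thr else 0


def bootstrap_ensemble(data, n_models, seed=0):
    """Train one base learner per bootstrap sample and weight it (cf. eq. (5))."""
    rng = random.Random(seed)
    models, scores = [], []
    for _ in range(n_models):
        idx = [rng.randrange(len(data)) for _ in data]  # sample with replacement
        chosen = set(idx)
        sample = [data[i] for i in idx]
        oob = [data[i] for i in range(len(data)) if i not in chosen]
        thr = fit_stump(sample)
        # Assumed weighting rule: accuracy on the out-of-bag observations
        acc = sum(predict_stump(thr, x) == y for x, y in oob) / len(oob) if oob else 1.0
        models.append(thr)
        scores.append(acc)
    total = sum(scores)
    weights = [s / total for s in scores] if total > 0 else [1.0 / n_models] * n_models
    return models, weights


def ensemble_predict(models, weights, x):
    """Weighted combination of the base learners (cf. eq. (6))."""
    return sum(w * predict_stump(thr, x) for thr, w in zip(models, weights))


# Hypothetical one-dimensional data: label 1 when x is large
data = [(0.1, 0), (0.2, 0), (0.3, 0), (0.4, 0), (0.6, 1), (0.7, 1), (0.8, 1), (0.9, 1)]
models, weights = bootstrap_ensemble(data, n_models=5, seed=1)
print(ensemble_predict(models, weights, 0.95), ensemble_predict(models, weights, 0.05))
```

Replacing `fit_stump` and `predict_stump` with an LSTM training and inference routine would give the same overall structure; the weights always sum to 1, so the ensemble output stays on the scale of the individual predictions.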
In Figure 6, the study uses an attention mechanism (AM) in place of the forget gate (FG). The main function of the FG is to determine the locations where historical information is deleted and to add new information at those locations, which is essential in updating the cell state [28]. Equation (7) represents the status update of the FG.\begin{equation*} c_{t} =f_{t} \ast c_{t-1} +(1-f_{t})\ast a_{t} \tag {7}\end{equation*}
In equation (7), \begin{equation*} f_{t} =\delta (V_{f} \ast \tanh (W_{f} \ast c_{t-1})) \tag {8}\end{equation*}
In equation (8),
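The update in equations (7) and (8) can be traced numerically with a small Python sketch. Several points are assumptions made for illustration: the function written as delta in equation (8) is taken to be the logistic sigmoid, the products with W_f and V_f are treated as matrix-vector products, a_t is treated as a candidate state supplied from outside, and the combination in equation (7) is applied element-wise. The weights and states are hypothetical.

```python
import math


def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def attention_gate(c_prev, W_f, V_f):
    """Gate in the form of equation (8): f_t = sigmoid(V_f tanh(W_f c_{t-1}))."""
    hidden = [math.tanh(z) for z in matvec(W_f, c_prev)]
    return [sigmoid(z) for z in matvec(V_f, hidden)]


def update_cell(c_prev, a_t, W_f, V_f):
    """Cell update in the form of equation (7), applied element-wise."""
    f_t = attention_gate(c_prev, W_f, V_f)
    return [f * c + (1.0 - f) * a for f, c, a in zip(f_t, c_prev, a_t)]


# Hypothetical two-unit example with zero weights, so every gate value is 0.5
zero = [[0.0, 0.0], [0.0, 0.0]]
c_t = update_cell(c_prev=[1.0, -2.0], a_t=[3.0, 0.0], W_f=zero, V_f=zero)
print(c_t)
```

Because the gate depends only on the previous cell state in this form, it needs no input-to-gate weight matrix, which is consistent with the stated aim of reducing the internal parameters. Each updated value is a convex combination of the previous state and the candidate, so with gate values of 0.5 the result is their midpoint.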
Step 1: Relevant panel data is collected, including individual and temporal dimensions.
Step 2: The collected data is cleaned, including handling missing values, outliers, and data conversion.
Step 3: Based on the research objectives, a green economic benefit indicator system including economic, environmental, and social indicators is constructed.
Step 4: Descriptive statistical analysis is conducted on variables to understand their distribution.
Step 5: The correlation between variables is analyzed to determine their interrelationships and their strength.
Step 6: The FEM is established, including determining the form of the model, explanatory variables, and dependent variables.
Step 7: The fixed effects hierarchy is determined, and the LSTMS-B classification algorithm or other methods are used to determine the most suitable fixed effects hierarchy.
Step 8: Statistical methods such as multiple regression analysis are used to estimate the model and obtain the regression coefficients of each variable.
Step 9: Variance Inflation Factor (VIF) and other methods are used to test for multicollinearity in the model.
Step 10: The model results are subjected to robustness testing to ensure their reliability.
Step 11: The application scenarios and practical significance of the model are explored.
Step 12: The results of the estimation are presented and the impact of each variable on the GEE and its economic significance are analyzed.
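The multicollinearity check in Step 9 can be sketched as follows. The VIF of a variable is 1/(1 - R^2), where R^2 comes from regressing that variable on the remaining explanatory variables, and the tolerance is its reciprocal. The sketch uses only the Python standard library; the function names and the two small datasets are hypothetical and are not the study's data.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination (assumes a is non-singular)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(target, predictors):
    """R^2 of an OLS fit of target on the predictors plus an intercept."""
    n = len(target)
    design = [[1.0] + [p[i] for p in predictors] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(k)] for a in range(k)]
    xty = [sum(row[a] * y for row, y in zip(design, target)) for a in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(beta, row)) for row in design]
    mean_y = sum(target) / n
    ss_res = sum((y - f) ** 2 for y, f in zip(target, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in target)
    return 1.0 - ss_res / ss_tot


def vif(columns):
    """VIF per variable: regress each column on all the others."""
    result = {}
    for name in columns:
        others = [columns[o] for o in columns if o != name]
        result[name] = 1.0 / (1.0 - r_squared(columns[name], others))
    return result


# Hypothetical data: mutually uncorrelated columns versus two nearly collinear ones
independent = {"x1": [1, -1, 1, -1], "x2": [1, 1, -1, -1], "x3": [1, -1, -1, 1]}
collinear = {"x1": [1, 2, 3, 4, 5], "x2": [1, 2, 3, 4, 6]}
print(vif(independent), vif(collinear))
```

In the first dataset every VIF equals 1, while the nearly collinear pair yields a VIF of 37, well above the commonly used threshold of 10.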
Analysis of Improving Fixed Effect GEE Calculation Scheme
A. EFEM-CEEM Model Performance Testing
To address the neglect of important heterogeneity factors when measuring GEE with the FEM, this study proposed the LSTMS-B classification algorithm to assist in selecting the most suitable fixed effects hierarchy. The LSTM-GA model proposed in reference [29] and the GA-BP classification model proposed in reference [30] were used as baselines for comparison with LSTMS-B. All three algorithms were implemented in Python on the Windows 10 platform, with an NVIDIA GeForce GTX 1050 Ti graphics card and an Intel Core i5 CPU. The experimental results are shown in Figure 8.
Figure 8. Classification performance of various classification algorithms on different green economy indicators.
In Figure 8 (a), the LSTMS-B algorithm performed well in the classification task, with an average recall rate of 0.9813 across the indicator classifications, indicating that the algorithm identified nearly all positive class samples. In contrast, the average recall rate of the LSTM-GA algorithm was 0.9713, and that of the GA-BP algorithm was 0.964. Although the recall rates of these two algorithms were also high, they remained below that of LSTMS-B. In Figure 8 (b), the average area under the ROC curve (AUC-ROC) of the LSTMS-B algorithm reached 0.975, indicating strong classification performance across classification thresholds and a high ability to distinguish between positive and negative class samples. This result indicated that the LSTMS-B algorithm had high accuracy and discriminative ability in the classification task of green economy indicators. Overall, the LSTMS-B algorithm showed a high AUC-ROC across the economic indicator categories, supporting its suitability for measuring GEE. The study also compared the accuracy of the three algorithms, as shown in Figure 9.
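For reference, the two metrics reported in Figure 8 can be computed as in the following minimal Python sketch. The labels and scores are hypothetical and unrelated to the study's data; the AUC is computed through its rank interpretation, the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half.

```python
def recall(y_true, y_pred):
    """Share of actual positives that were predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


def auc_roc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and classifier scores
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # fixed 0.5 decision threshold

print(recall(y_true, y_pred), auc_roc(y_true, scores))
```

Recall depends on the chosen decision threshold, whereas the AUC summarizes ranking quality over all thresholds; an AUC of 1.0 corresponds to perfect separation and 0.5 to random ranking. In this toy case the recall is 2/3 and the AUC is 8/9.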
In Figure 9 (a), the three models (LSTMS-B, LSTM-GA, and GA-BP) showed a clear improvement trend, and the classification accuracy of all models gradually increased. The LSTMS-B model achieved a classification accuracy of approximately 0.97 after about 300 iterations, demonstrating high learning efficiency. The LSTM-GA and GA-BP models maintained stable classification accuracies of approximately 0.95 and 0.93, respectively, demonstrating good generalization ability. In Figure 9 (b), the loss values of all models decreased significantly as training progressed, indicating that the models were gradually optimizing and reducing prediction errors. The loss value of the LSTMS-B model decreased the most rapidly and eventually stabilized, converging to 0.32, reflecting its optimization efficiency and stability. Overall, all three algorithms effectively improved classification accuracy and reduced loss values during training, but the LSTMS-B model performed slightly better. Table 1 presents the results of the descriptive statistical analysis of each variable.
In Table 1, TE was used as the dependent variable, with a minimum of 0.50, a maximum of 0.95, a mean of 0.75, and a Standard Deviation (SD) of 0.10. This indicated that the TE values in the research sample were relatively concentrated, and the technical level of most enterprises was close to the average, with a certain degree of fluctuation. The independent variable X1 had a minimum of 1200, a maximum of 48000, a mean of 24000, and an SD of 11000. This reflected significant differences in the number of employed individuals in the sample, which may be related to enterprises of different sizes. The range of capital stock (X2) was relatively large, from 12000 to 480000, with a mean of 240000 and an SD of 120000. This indicated significant differences in capital stock among enterprises, and capital size may have a significant impact on their technological efficiency. Energy consumption (X3) had a minimum of 150, a maximum of 4800, a mean of 2400, and an SD of 1200. This may reflect differences in energy efficiency among enterprises, as those with higher energy consumption may face higher costs and environmental pressures. HSTRUC had a minimum of 0.15, a maximum of 0.55, a mean of 0.38, and an SD of 0.12. This indicated that the degree of advanced human capital was fairly evenly distributed in the sample, with a certain degree of variation. The explanatory variable enc had a minimum of 0.02, a maximum of 0.14, a mean of 0.07, and an SD of 0.04. This indicated that the fluctuation of energy consumption intensity in the sample was relatively small, which may mean that most enterprises were relatively efficient in energy use.
The mean values of the variables ind and gov were 0.55 and 0.60, respectively, both close to the middle of their possible ranges, indicating a relatively balanced level of industrial structure and government regulation in the sample. Then, the correlation between the variables was analyzed using the model proposed in this study, and the results are shown in Figure 10.
Figure 10 shows the correlation matrix of the variables, with color depth indicating the strength of each correlation. X1, X2, X3, and X4 were highly correlated, with correlation coefficients of 1.00, indicating a strong positive correlation between these variables. TE was also fully correlated with HSTRUC, with a coefficient of 1.00, while their correlations with ind, gov, and la were weaker, at 0.54, 0.38, and 0.42, respectively. This indicated that although these explanatory variables had a certain impact on technological efficiency, their influence was not as strong as that of the advanced human capital structure. HSTRUC was strongly correlated with ind (0.54), but its correlations with gov and la were weak. It was worth noting that gov and la were again completely correlated (1.00). The correlation between ind and la was 0.42, indicating a certain positive correlation, which may reflect that the optimization of industrial structure promotes the agglomeration of financial resources. The correlation between gov and ind was the weakest, at 0.14, which may indicate that the impact of government regulation on industrial structure was limited, or that the relationship between the two was complex and influenced by other variables.
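A correlation matrix of the kind summarized in Figure 10 can be computed as in the following minimal Python sketch, which uses the Pearson coefficient. The column names and values are hypothetical placeholders rather than the study's data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Nested dict holding the pairwise correlation of every pair of columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical columns: b rises exactly with a, c falls exactly with a
toy = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8], "c": [4, 3, 2, 1]}
matrix = correlation_matrix(toy)
for name, row in matrix.items():
    print(name, {k: round(v, 2) for k, v in row.items()})
```

The matrix is symmetric and its diagonal entries equal 1.00 by construction, so only off-diagonal coefficients carry information about relationships between different variables; an off-diagonal value of 1.00 means that two variables are exact increasing linear functions of each other.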
B. Applicability and Robustness Analysis of EFEM-CEEM Model
Then, this study applied EFEM-CEEM to actual green economy efficiency analysis. This study first obtained major data such as GDP growth rate, per capita GDP, and proportion of the tertiary industry in a certain region from 2015 to 2020. Some of the data were shown in Table 2.
To verify the degree and significance of the impact of different economic factors on TE values, this study adopted the EFEM-CEEM scheme and conducted multiple regression analysis on the model using SPSS 13 statistical software. The regression results, together with the VIF analysis, are shown in Table 3.
Table 3 shows the degree and significance of the influence of the independent variables on the dependent variable TE. The number of employed people (X1) had a significant impact, although its significance level was close to 0.05, indicating that the impact was relatively weak. The capital stock (X2) had a significant negative impact on the TE value, and the degree of impact was considerable. The energy consumption (X3) had a positive impact on the TE value, and the significance was relatively high. However, the upgrading of HSTRUC had a slight negative impact on the TE value, and its significance level was relatively low. The impacts of ind and gov on TE values differed in direction, and the influence of gov was not significant. The la had a remarkable positive impact on the TE value, and the degree of impact was moderate. In addition, the multicollinearity problem in the model was evaluated through tolerance and VIF values, and the results showed that the VIF values of most variables were below 10, indicating that the multicollinearity problem in the model was not severe. In addition, Figure 11 shows the average Shapley value contribution of the main green economy indicator features to the accuracy of the model.
Figure 11. Shapley value of the average contribution of the main green economy indicator features to the accuracy of the model.
Figure 11 uses Shapley values to quantify the importance of each variable. In the figure, “per capita GDP”, “proportion of the tertiary industry”, and “urbanization rate” were the main factors. Green GDP had a high contribution in the three dimensions of X1, HSTRUC, and ind, indicating that this indicator had important value for model prediction. The contribution of per capita GDP in all dimensions was relatively balanced and high, indicating that this indicator had a positive impact on the model. The proportion of the tertiary industry was significant in both the HSTRUC and gov dimensions, demonstrating the importance of service industry development in optimizing the overall economic structure. To verify this finding, this study examined the actual data. The survey results are shown in Figure 12.
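The Shapley attribution used in Figure 11 can be illustrated with an exact computation for three features. In the following Python sketch, the accuracy table is entirely hypothetical (it is not derived from the study's model), and the feature names are placeholders chosen to echo the indicators discussed above. The Shapley value of a feature is its marginal contribution to the value function, averaged over all orders in which features can be added.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, value):
    """Exact Shapley values for a value function defined on feature subsets."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_accuracy(subset):
    """Hypothetical model accuracy when only the given features are used."""
    score = 0.5
    score += 0.20 if "per_capita_gdp" in subset else 0.0
    score += 0.10 if "tertiary_share" in subset else 0.0
    score += 0.10 if "urbanization" in subset else 0.0
    if "per_capita_gdp" in subset and "tertiary_share" in subset:
        score += 0.06  # assumed interaction, split equally by the Shapley rule
    return score


features = ["per_capita_gdp", "tertiary_share", "urbanization"]
phi = shapley_values(features, toy_accuracy)
print({name: round(v, 4) for name, v in phi.items()})
```

The contributions sum to the gap between the accuracy with all features and the accuracy with none (0.46 in this toy table). Exact enumeration is only practical for a handful of features because the number of subsets grows exponentially; larger models usually rely on sampling-based approximations.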
In Figure 12, the per capita GDP, urbanization rate, and proportion of the tertiary industry showed significant consistency with the efficiency of green GDP. Regions with per capita GDP exceeding 60000 yuan (samples 16-20) generally had higher green GDP efficiency, indicating that economically developed regions place more emphasis on green growth. The 16 regions with urbanization rates exceeding 60% (samples 8-12) also had high green GDP efficiency, reflecting the parallel progress of urbanization and environmental benefits. The regions where the proportion of the tertiary industry exceeded 50% (samples 8-20) also had significant green GDP efficiency, suggesting that the development of the service industry promoted economic green transformation. These values supported the view that increases in per capita GDP, urbanization rate, and the proportion of the tertiary industry were positively correlated with the enhancement of GEE, jointly promoting sustainable economic development.
Conclusion
This study constructs an EFEM-CEEM based on an improved FEM to analyze the key factors affecting GEE and to verify the effectiveness and robustness of the model. The study found that the capital stock (X2) had a remarkable negative influence on TE, with a Beta value of -0.220, indicating that an increase in capital stock may lower the TE of the economy. Energy consumption (X3) had a positive impact on TE, with a Beta value of 0.180. The industrial structure (ind) had a remarkable positive influence on TE, with a Beta value of 0.230, indicating that optimizing the industrial structure helped to improve the technological efficiency of the economy. The la also had a significant positive impact on TE, with a Beta value of 0.150, indicating that the agglomeration of financial resources could promote the improvement of economic efficiency. Through empirical analysis, the study further confirmed the positive correlation between per capita GDP, urbanization rate, and the proportion of the tertiary industry on the one hand and green GDP efficiency on the other. Specifically, regions with per capita GDP exceeding 60000 yuan, urbanization rates exceeding 60%, and a proportion of the tertiary industry exceeding 50% all demonstrated high green GDP efficiency. The level of economic development, the urbanization process, and the growth of the service industry were essential in promoting the green transformation of the economy. In summary, the EFEM-CEEM model effectively identifies the key factors that affect GEE and verifies its applicability at different levels of economic development. The model results emphasize the positive effects of industrial structure optimization, financial agglomeration, and energy consumption on improving technological efficiency. The limitations lie in the generalization ability of the model and the analysis of long-term dynamic effects. Future research needs to further expand the sample range and consider the dynamics of time series.