Introduction
Urbanization, with its multitude of effects, offers both societal and environmental benefits alongside various challenges. It is crucial to acknowledge the potential consequences, especially regarding climate change, to promote sustainable growth. In 2021, the urban population accounted for over half (56.6%) of the world's total population [1], [2]. According to projections, this figure is expected to increase to 68% by 2050 [3]. The rise in urban population contributes to the growth of cities and a subsequent surge in development, leading to heightened urban heat island (UHI) effects [4], [5]. The UHI phenomenon refers to elevated temperatures in urban areas relative to adjacent rural areas. The UHI effect mainly arises from the widespread presence of densely constructed urban infrastructure and a significant proportion of impermeable surfaces. Urban warming is commonly observed at many scales, ranging from small towns to large cities, where the most intense UHI effects are measured. Specifically, the UHI effect is more prominent at night: it begins at sunset and persists until sunrise [6]. Conversely, covered areas in urban regions tend to be cooler during the daytime, whereas open rural areas are directly exposed to solar radiation [7], [8]. Urbanization is the principal cause of the UHI effect, resulting in several detrimental environmental impacts, such as loss of vegetation activity, increased energy consumption, and a degraded urban climate. Furthermore, the UHI effect adversely impacts human health; studies indicate that a 1°C rise in air temperature might elevate mortality by as much as 4.1%, endangering millions residing in urban areas [9].
Over the past few decades, much research has been conducted to assess the effects of urban characteristics, including the scale of buildings, distance from city centers, and the nature and quantity of vegetation, on the severity of the UHI effect and its variation across different areas [10], [11]. The UHI is classified according to the temperature being measured, leading to the distinction between air or canopy-layer UHI (AUHI or CLUHI), which is based on air temperature, and surface UHI (SUHI), which deals with surface temperature [12]. Thermal data retrieved from satellite remote sensing is crucial for examining the relationship between land surface temperature (LST) and urban land use/land cover (LULC) [13]. The SUHI phenomenon elucidates the variations in temperature within urban areas, establishes a correlation between various land use patterns and LST, and identifies regions with exceptionally elevated temperatures [14], [15]. However, there is still a need to improve the interpretation of its relationship with air temperature so that it appropriately conveys important urban characteristics [16]. A comprehensive network of air temperature monitoring in all urban land-use sectors can provide significant advantages [1], [17]. Nevertheless, it requires substantial investments of both time and money [18]. As a result, remote sensing technology has become increasingly essential for investigating urban climate, particularly in measuring LST [19], [20]. Remote sensing and Google Earth Engine (GEE) have become potent platforms for examining the UHI impact, facilitating the analysis of extensive spatial and temporal data with remarkable accuracy. Utilizing satellite imagery, such as Landsat, researchers can extract essential indices like the normalized difference vegetation index (NDVI) and normalized difference built-up index (NDBI) to evaluate LST and urban heat dynamics [21], [22].
GEE's cloud-based technology enables efficient processing of extensive information, facilitating trend analysis and predictive modeling of UHI dynamics. These technologies offer essential information for urban planning and climate mitigation efforts [23], [24].
Urban land use is essential in exacerbating the UHI phenomena. Urban expansion leads to a higher ratio of developed land compared to areas with vegetation, which ultimately impacts the thermal properties of the urban environment [25], [26]. The explosion of impermeable surfaces and the decrease in vegetated areas exacerbate heat retention, amplifying the effects of the UHI phenomenon. To accurately evaluate and mitigate the negative impacts of UHI, it is essential to possess a consistent and methodical understanding of the fluctuations in LST and the changes in LULC [27], [28]. Advanced remote sensing technology provides numerous indices for studying the components contributing to the UHI phenomena [29], [30]. The NDVI and NDBI are particularly important for the evaluation of LST and UHI [31], [32]. These indicators precisely delineate vegetation and built-up regions and have demonstrated a robust correlation with LST. The NDVI provides a quantitative assessment of vegetation's level of greenness or vitality, indicating the abundance and condition of vegetation [33], [34]. In contrast, the NDBI index can accurately identify and estimate the size of urbanized regions. Both indices are significant in comprehending the thermal properties of urban environments since they provide information about the surface's ability to retain, assimilate, and reflect solar energy [35], [36].
The link between SUHI, NDVI, and NDBI is complex and mutually influential. High NDVI values indicate abundant vegetation and are often associated with lower surface temperatures due to the cooling influence of evapotranspiration and tree canopy coverage [37], [38]. Conversely, areas with high NDBI and low NDVI indicate abundant built-up structures, and these areas have higher temperatures due to the high heat storage capacity of concrete and asphalt and low evapotranspiration [12]. The examination of UHI in relation to NDVI and NDBI is especially pertinent given swift urban growth and climate change. Moreover, local temperature patterns can worsen or alleviate the UHI impact, making it a dynamic and intricate phenomenon to investigate and control [36], [39].
Accurately evaluating UHI effects is challenging because of the many contributing factors, spatial heterogeneity, and temporal variability. However, integrating remote sensing data with advanced machine learning and deep learning techniques offers a promising way to improve UHI prediction [6], [40], [41]. Machine and deep learning algorithms, including XGBoost (eXtreme Gradient Boosting), convolutional neural networks (CNNs), and recurrent neural networks (RNNs), have demonstrated remarkable capabilities in modeling complex, nonlinear relationships and high-dimensional data [42], [43]. Neural networks, especially deep neural networks, are well known for their ability to approximate nonlinear functions. Feed-forward neural networks have been employed to represent the complex connections in sequential data, as time series data frequently display both linear and nonlinear patterns [10], [26]. To reduce the computing load of dealing with linear and nonlinear learning separately, researchers introduced RNNs built explicitly for modeling sequential data. Nevertheless, RNNs encounter obstacles such as vanishing or exploding gradients [44], [45], which require careful design of the learning process during backpropagation with gradient descent. Advanced variants of RNNs, namely long short-term memory (LSTM) networks, have been created to capture sequential dependencies better [16], [46]. In addition to LSTM, gated recurrent unit (GRU)-based RNNs have proved able to evaluate and predict environmental factors. GRUs excel at capturing temporal relationships and provide gating mechanisms that address the problems of vanishing and exploding gradients, rendering them well suited for constructing resilient time series prediction models [5], [47]. Due to the dynamic nature of SUHI, which is influenced by transient climate patterns and urban development, there is a lack of real-time adaptability [2], [48].
However, an adaptation of GRU, known for its ability to handle large datasets and nonlinear relationships effectively, can create more accurate predictive models of SUHI, even in complex urban environments [49].
This study aims to investigate the SUHI in response to NDVI and NDBI utilizing a novel GRU-RNN model using Landsat data [50]. This model will help urban planners and policymakers develop effective strategies to reduce the UHI effect and improve urban environments' thermal comfort and sustainability. Accordingly, the study has the following primary objectives:
to assess the spatial and temporal trends of NDVI, NDBI, and UHI;
to investigate the temporal relationship between NDVI, NDBI, and UHI;
to develop a novel deep learning method to evaluate UHI and its influencing factors.
Our study is distinctive as it will present a novel approach to deep learning in combination with wavelet transformation and its utilization for evaluating and forecasting UHI. Additionally, this study utilized the long-term monthly data from the Landsat satellite to explore the temporal correlation between UHI and its influencing factors using wavelet coherence. Through this comprehensive approach, the study enhances knowledge of UHI and its mitigation, offering insights crucial for developing a more resilient and sustainable urban environment.
Material and Method
This study applied a novel GRU-RNN model to construct a UHI predictive model, utilizing a dataset of environmental variables (NDVI and NDBI) extracted from remotely sensed data. The dataset was preprocessed by normalization, which ensures uniform inputs for the model.
A. Study Area
Multan is located in the southern region of Punjab, Pakistan, between 71° 00′ 54″ E and 72° 58′ 43″ E in longitude and between 29° 27′ 21″ N and 30° 45′ 30″ N in latitude. Fig. 1 shows the complete methodology used in this study. With a total area of 3720 km², the district has a population of 5,362,305 and a population density of about 1400 people per km². The district encompasses a variety of terrains, including productive farmland, metropolitan areas, and significant historical sites, resulting in a dynamic and diversified region. In recent times, Multan has encountered severe climatic conditions, with the maximum temperature reaching roughly 52°C and the minimum dropping to around −1°C. The region receives an average precipitation of approximately 186 mm, and heatwaves typically occur during the peak summer months of May and June. Changes between the summer and winter seasons occur rapidly due to fluctuating weather conditions [51]. Multan experiences intense heat and swift shifts in climate, making it one of the regions most vulnerable to extreme weather events.
B. Datasets Used
This study investigates the UHI phenomenon and its driving factors, with special emphasis on NDVI and NDBI. The analysis spanned 23 years (2001–2023), providing ample time to identify significant trends in urban heat and its influencing factors, thereby enhancing the robustness and relevance of the results. Data from Landsat-7 and Landsat-8 were used to obtain the monthly LST, NDVI, and NDBI, as shown in Table I. The first phase of the study covered 12 years (2001–2012) and used Landsat-7 data, whereas the following 11-year period (2013–2023) was based on Landsat-8 data. Only images with a cloud cover ratio below 5% were used throughout the analysis period. If no image met the cloud cover criterion for a particular month, linear interpolation was applied to estimate the missing LST, NDVI, and NDBI values. All data processing to obtain LST, NDVI, and NDBI was done in GEE, a cloud platform that provides extensive remotely sensed data together with robust capabilities for data analysis and interpretation.
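The gap-filling step can be illustrated with a minimal sketch. It assumes that months without a usable image are stored as NaN in a regularly spaced monthly series; the function name `fill_missing_months` is hypothetical and is not taken from the study's processing chain.

```python
import numpy as np

def fill_missing_months(values):
    """Linearly interpolate NaN entries of a regularly spaced monthly series."""
    values = np.asarray(values, dtype=float)
    idx = np.arange(values.size)
    valid = ~np.isnan(values)
    # np.interp interpolates between valid neighbours and holds the first or
    # last valid value constant if the gap is at the start or end of the series.
    return np.interp(idx, idx[valid], values[valid])
```

The same function would be applied separately to the LST, NDVI, and NDBI series.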
C. Land Surface Temperature Retrieval
A standardized LST retrieval technique was used for all datasets to guarantee uniformity in the LST derived from Landsat-7 and Landsat-8 imagery. The image-based method was chosen for its simplicity over inversion algorithms that require intricate simulation of atmospheric parameters at the time of the satellite overpass. This technique was considered appropriate for obtaining LST from all the Landsat images examined in this study. The image-based method begins by calculating the brightness temperature (Ts) using the following equations:
\begin{equation*} T_s = \frac{K_2}{\ln \left( 1 + \frac{K_1}{L_\lambda} \right)} \tag{1} \end{equation*}
\begin{equation*} L_\lambda = M_L \cdot Q_{\text{cal}} + A_L \tag{2} \end{equation*}
\begin{equation*} T = \frac{T_s}{1 + \left( \lambda \times \frac{T_s}{\alpha} \right) \ln \left( \varepsilon \right)}. \tag{3} \end{equation*}
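In the formulas given above, $T$ denotes the land surface temperature, $T_s$ the brightness temperature, $L_\lambda$ the top-of-atmosphere spectral radiance, $K_1$ and $K_2$ the band-specific thermal conversion constants, $M_L$ and $A_L$ the band-specific multiplicative and additive rescaling factors, $Q_{\text{cal}}$ the quantized and calibrated pixel value, $\lambda$ the wavelength of the emitted radiance, $\alpha = hc/\sigma \approx 1.438 \times 10^{-2}$ m K (with $h$ Planck's constant, $c$ the speed of light, and $\sigma$ the Boltzmann constant), and $\varepsilon$ the land surface emissivity, which is estimated from the proportion of vegetation (PV) as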
\begin{align*}{\rm{PV\ }} & = {{\left( {\frac{{{\rm{NDVI\ }} - {\rm{\ NDV}}{{\mathrm{I}}_{\text{min}}}}}{{\text{NDV}{{\mathrm{I}}_{\text{max}}}\ - {\rm{\ NDV}}{{\mathrm{I}}_{\text{min}}}}}} \right)}^2}\ \tag{4}\\ \varepsilon & = \left( {0.00149 \times PV} \right)\ + 0.986. \tag{5} \end{align*}
By applying this method, the study maintains a consistent approach to LST retrieval, mitigating the potential variability that could arise from different retrieval methodologies. Fig. 2 explains the whole methodology.
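The retrieval chain in (1)–(5) can be sketched in a few lines of NumPy. The constants below are the values commonly published for Landsat-8 Band 10 and are used here only as illustrative assumptions; in practice they should be read from the metadata of each scene, and Landsat-7 Band 6 uses different values. All function names are hypothetical.

```python
import numpy as np

# Illustrative Landsat-8 Band 10 values (assumptions; take real ones from scene metadata).
K1, K2 = 774.8853, 1321.0789   # thermal conversion constants in (1)
M_L, A_L = 3.342e-4, 0.1       # radiance rescaling factors in (2)
WAVELENGTH_UM = 10.895         # wavelength of emitted radiance, micrometres
ALPHA_UM_K = 14380.0           # h*c/sigma ~ 1.438e-2 m K, expressed in micrometre-kelvin

def brightness_temperature(q_cal):
    """Eqs. (2) and (1): digital number -> spectral radiance -> brightness temperature (K)."""
    radiance = M_L * np.asarray(q_cal, dtype=float) + A_L
    return K2 / np.log(1.0 + K1 / radiance)

def emissivity(ndvi, ndvi_min, ndvi_max):
    """Eqs. (4) and (5): proportion of vegetation -> land surface emissivity."""
    pv = ((np.asarray(ndvi, dtype=float) - ndvi_min) / (ndvi_max - ndvi_min)) ** 2
    return 0.00149 * pv + 0.986

def land_surface_temperature(q_cal, ndvi, ndvi_min, ndvi_max):
    """Eq. (3): emissivity-corrected land surface temperature (K)."""
    t_s = brightness_temperature(q_cal)
    eps = emissivity(ndvi, ndvi_min, ndvi_max)
    return t_s / (1.0 + (WAVELENGTH_UM * t_s / ALPHA_UM_K) * np.log(eps))
```

Because the emissivity is below 1, its logarithm is negative and the corrected temperature is slightly higher than the brightness temperature.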
D. SUHI Calculation
This work employed remote sensing data from the Landsat satellite to accurately assess the thermal properties of Multan by calculating the SUHI effect. The SUHI is generally calculated based on the LST data using
\begin{equation*} \text{SUHI}\ = \frac{{{{T}_s} - {{T}_m}}}{{{{T}_{\text{std}}}}}\ \tag{6} \end{equation*}
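A minimal sketch of (6) is given below. It assumes that the numerator is the LST of a pixel (or month) minus the mean LST of the study area (or period), and that the denominator is the corresponding standard deviation; these symbol meanings are an interpretation, since they are not spelled out alongside the equation. The function name is hypothetical.

```python
import numpy as np

def suhi_index(lst):
    """Standardize LST values as in (6): (LST - mean) / standard deviation."""
    lst = np.asarray(lst, dtype=float)
    # nanmean/nanstd ignore masked (NaN) pixels such as cloud gaps.
    return (lst - np.nanmean(lst)) / np.nanstd(lst)
```

Positive values then mark locations or months that are warmer than average, and negative values mark cooler ones.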
E. Indices Calculation
Monthly NDVI and NDBI were extracted using Landsat imagery to evaluate the contribution of NDVI and NDBI toward changes in UHI. NDVI was extracted according to the following equation using near-infrared (NIR) and red bands of Landsat imagery:
\begin{equation*}{\rm{NDVI\ }} = \frac{{\text{NIR} - \text{Red}}}{{\text{NIR} + \text{Red}}}\ . \tag{7} \end{equation*}
The NDBI was calculated from the short-wave infrared (SWIR) and NIR bands according to
\begin{equation*}{\rm{NDBI\ }} = \frac{{\text{SWIR} - \text{NIR}}}{{\text{SWIR} + \text{NIR}}}\ . \tag{8} \end{equation*}
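Equations (7) and (8) translate directly into array operations. The sketch below assumes reflectance arrays as inputs; the band mapping mentioned in the comments is the usual one for Landsat and is stated as an assumption about the processing, not as a detail taken from the study.

```python
import numpy as np

def ndvi(nir, red):
    """Eq. (7). Usual mapping (assumption): Landsat-8 B5/B4, Landsat-7 B4/B3."""
    nir, red = np.asarray(nir, dtype=float), np.asarray(red, dtype=float)
    return (nir - red) / (nir + red)

def ndbi(swir, nir):
    """Eq. (8). Usual mapping (assumption): Landsat-8 B6/B5, Landsat-7 B5/B4."""
    swir, nir = np.asarray(swir, dtype=float), np.asarray(nir, dtype=float)
    return (swir - nir) / (swir + nir)
```

Both indices are bounded between −1 and 1 for non-negative reflectances, which makes them directly comparable across sensors.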
F. Trend Analysis
The modified Mann–Kendall (MMK) trend test and the Theil–Sen estimator (Sen's slope) were used to identify trends in the UHI, NDVI, and NDBI time series. The Mann–Kendall (MK) test is a nonparametric test that is extensively used to detect trends in time series. Its key advantages are its insensitivity to sudden changes in the data and its independence from assumptions about the linearity of the trend. The MMK method was adopted because it adjusts the variance of the test statistic to account for the serial correlation that regularly occurs in time series data. Sen's slope, a robust alternative to linear regression, was employed to estimate the trend magnitude as the median slope between all pairs of sample points. The MK trend test is defined mathematically as follows:
\begin{equation*} S\ = \mathop \sum \limits_{i = 1}^{n - 1} \mathop \sum \limits_{j = i + 1}^n \text{sig}\left( {{{x}_j} - {{x}_i}} \right)\ \tag{9} \end{equation*}
\begin{equation*} \text{sig}\ \left( {{{x}_j} - {{x}_i}} \right) = \left\{ {\begin{array}{c} {1{\rm{\ if\ }}{{x}_j} > {{x}_i}}\\ {0{\rm{\ if\ }}{{x}_j} = {{x}_i}}\\ { - 1{\rm{\ if\ }}{{x}_j} < {{x}_i}} \end{array}.} \right.\ \tag{10} \end{equation*}
The variance of S, denoted as Var(S), is calculated as
\begin{equation*} \text{VAR}\ \left( S \right) \!=\! \frac{{n\left( {n \!-\! 1} \right)\left( {2n \!+\! 5} \right) \!-\! \mathop \sum \nolimits_{k = 1}^m {{t}_k}\left( {{{t}_k} - 1} \right)\left( {2{{t}_k} + 5} \right)}}{{18}}\ \tag{11} \end{equation*}
\begin{equation*} Z = \left\{ {\begin{array}{cl} \frac{{S - 1}}{{\sqrt {\text{VAR}\left( S \right)} }}& \text{if} \ S > 0\\ 0 & \text{if}\ S = 0\\ \frac{{S + 1}}{{\sqrt {\text{VAR}\left( S \right)\ } }} & \text{if}\ S < 0 \end{array}} \right.. \tag{12} \end{equation*}
If Z > 0, it indicates an increasing trend, and if Z < 0, it indicates a decreasing trend. Trend analysis was also carried out on a seasonal basis, and for this, the datasets were classified into four seasons according to the region's climatology. Winter comprises December to February; spring comprises March, April, and May; summer comprises June, July, and August; and autumn comprises September to November.
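The classical MK statistic in (9)–(12) and Sen's slope can be sketched as follows. This is the unmodified MK test: the variance correction for serial correlation that distinguishes the MMK variant is not included, and the function names are hypothetical.

```python
import numpy as np

def mann_kendall(x):
    """Return (S, Var(S), Z) of the classical Mann-Kendall test, eqs. (9)-(12)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # Eq. (9)-(10): sum of signs of all pairwise differences x_j - x_i with j > i.
    s = sum(np.sign(x[i + 1:] - x[i]).sum() for i in range(n - 1))
    # Eq. (11): variance with the correction for groups of tied values.
    _, counts = np.unique(x, return_counts=True)
    tie_term = np.sum(counts * (counts - 1) * (2 * counts + 5))
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0
    # Eq. (12): continuity-corrected standard normal statistic.
    if s > 0:
        z = (s - 1) / np.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / np.sqrt(var_s)
    else:
        z = 0.0
    return float(s), float(var_s), float(z)

def sens_slope(x):
    """Median of the slopes between all pairs of points (unit time spacing assumed)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    slopes = [(x[j] - x[i]) / (j - i) for i in range(n - 1) for j in range(i + 1, n)]
    return float(np.median(slopes))
```

For seasonal analysis, the same functions would be applied to the subseries formed from the months of each season.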
G. Correlation Analysis
The Pearson correlation test given in (13) [43] was adopted to evaluate the correlation between UHI, NDVI, and NDBI at annual and seasonal intervals and to detect the key factors affecting UHI. A correlation test was conducted for all variables at a significance level of 5%.
\begin{equation*} r = \frac{\sum\nolimits_{i = 1}^n \left( x_i - \bar{x} \right)\left( y_i - \bar{y} \right)}{\sqrt{\sum\nolimits_{i = 1}^n \left( x_i - \bar{x} \right)^2 \cdot \sum\nolimits_{i = 1}^n \left( y_i - \bar{y} \right)^2}}. \tag{13} \end{equation*}
The Pearson correlation coefficient, represented as r, is a metric that indicates the magnitude and direction of the linear association between two variables. A negative r value indicates an inverse correlation, whereas a positive value indicates a positive correlation. This study also employs wavelet coherence to investigate the temporal link between UHI, NDVI, and NDBI. Continuous wavelet transform (CWT) was initially used to convert the data into the time–frequency domain [53]. This transformation is crucial for evaluating the localized regularity and managing the variable character inherent in environmental time series. The Morlet wavelet was selected due to its ideal balance between time and frequency resolution, which is essential for evaluating the smooth, oscillating patterns in UHI, NDVI, and NDBI.
Although wavelets like Haar and Mexican Hat effectively localize time, they do not possess the necessary frequency resolution for conducting in-depth environmental investigations. While effective for multiresolution analysis, the Daubechies wavelets do not provide the same level of clarity in continuous time–frequency analysis as the Morlet wavelet. The wavelet coherence between two series $x$ and $y$ is computed from their cross-wavelet transform (XWT), denoted as $W_n^{xy}(s)$, and their individual wavelet transforms, $W_n^x(s)$ and $W_n^y(s)$, as
\begin{equation*} R_n^2 \left( s \right) = \frac{\left| S\left( s^{-1} W_n^{xy}\left( s \right) \right) \right|^2}{S\left( s^{-1} \left| W_n^x\left( s \right) \right|^2 \right) \cdot S\left( s^{-1} \left| W_n^y\left( s \right) \right|^2 \right)} \tag{14} \end{equation*}
where $S$ is a smoothing operator and $R_n^2(s)$ is the local correlation coefficient between the two series in the time–frequency domain, which ranges from 0 to 1.
H. Discrete Wavelet Transformation
Discrete wavelet transformation (DWT) is frequently preferred over CWT for forecasting applications due to its reduced computational time and greater ease of implementation [54]. Conversely, CWT is not often used for forecasts due to its computing complexity and time demands. The following equation represents the DWT:
\begin{equation*}{{\omega }_{x,y}}\ \left( t \right) = \frac{1}{{\sqrt {{{p}^x}} }}\ \omega \left\{ {\frac{{t - yq{{p}^x}}}{{{{p}^x}}}} \right\} \tag{15} \end{equation*}
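\begin{equation*} x\left( t \right) = T + \sum\limits_{x = 1}^{M} \sum\limits_{y = 0}^{2^{M - x} - 1} W_{x,y}\, 2^{-\frac{x}{2}}\, \omega \left( 2^{-x} t - y \right) \tag{16} \end{equation*}
where $\omega$ is the mother wavelet, the integers $x$ and $y$ control the scale and translation, $p$ is the dilation step, $q$ is the location parameter, $T$ is the approximation (smoothed) component, and $W_{x,y}$ are the wavelet coefficients; (16) corresponds to the dyadic case $p = 2$ and $q = 1$.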
This study employed the DWT to decompose and reassemble the data, thereby improving the performance of the GRU-based deep learning model. Each sub-time series derived by decomposition uniquely contributes to the original data, playing a distinct role in its structural formation. Correlation analysis investigated the relation between each sub-time series and the original data to determine the most relevant wavelet components. The correlation coefficients quantified the varying impact of each wavelet component on the original series. The significant and selected wavelet components were integrated to improve the efficacy of the GRU-based deep learning model. The approximation subseries were integrated to provide a new, comprehensive time series for each reformed series. This method facilitates the removal of noisy data while maintaining quasiperiodic and periodic signals. This study utilized a “coif1” wavelet to decompose all variable data up to three levels of wavelet decomposition (see Fig. 3).
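The decompose-and-reconstruct step can be sketched with the PyWavelets package (`pywt.wavedec` and `pywt.waverec`). Which components are retained is an assumption here: the default keeps only the level-3 approximation, whereas the study selected components by their correlation with the original series. The function name and the `keep` argument are hypothetical.

```python
import numpy as np
import pywt

def dwt_reconstruct(series, wavelet="coif1", level=3, keep=("A3",)):
    """Decompose a series with the DWT and rebuild it from the selected components."""
    series = np.asarray(series, dtype=float)
    # wavedec returns [cA_level, cD_level, ..., cD_1].
    coeffs = pywt.wavedec(series, wavelet, level=level)
    names = [f"A{level}"] + [f"D{k}" for k in range(level, 0, -1)]
    # Zero out the coefficients of every component that is not kept.
    kept = [c if name in keep else np.zeros_like(c) for name, c in zip(names, coeffs)]
    reconstructed = pywt.waverec(kept, wavelet)
    # waverec can return one extra sample for odd-length inputs.
    return reconstructed[: series.size]
```

Passing all component names in `keep` returns the original series up to rounding error, which is a convenient check that the decomposition is set up correctly.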
I. GRU Model
GRU is a type of RNN that is more streamlined than the LSTM model. In contrast to LSTM's three gates, it has only two gates: reset (rt) and update (zt). The reset gate establishes the method for integrating the incoming input with the prior memory, whereas the update gate specifies the extent of the old memory to preserve. The fundamental computations of the GRU model shown in Fig. 4 are given in the following equations:
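\begin{align*} r_t & = \sigma \left( w_r \left[ h_{t - 1}, x_t \right] + b_r \right) \tag{17}\\ z_t & = \sigma \left( w_z \left[ h_{t - 1}, x_t \right] + b_z \right) \tag{18}\\ \hat{h}_t & = \tanh \left( W \left[ r_t \circ h_{t - 1}, x_t \right] + b_{\hat{h}} \right) \tag{19}\\ h_t & = z_t \circ \hat{h}_t + \left( 1 - z_t \right) \circ h_{t - 1}. \tag{20} \end{align*}
In the above equations, $\sigma$ is the sigmoid activation function, $\tanh$ is the hyperbolic tangent, $\circ$ denotes element-wise multiplication, $x_t$ is the input at time step $t$, $h_{t-1}$ is the previous hidden state, $\hat{h}_t$ is the candidate hidden state, $h_t$ is the updated hidden state, $w_r$, $w_z$, and $W$ are weight matrices, and $b_r$, $b_z$, and $b_{\hat{h}}$ are bias terms.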
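A single GRU update following (17)–(20) can be written in NumPy as below. This is a didactic sketch of the cell equations, not the trained model; the parameter dictionary and its keys (`w_r`, `w_z`, `W`, `b_r`, `b_z`, `b_h`) are hypothetical names chosen to mirror the symbols in the equations.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, p):
    """One GRU update. Weight matrices have shape (hidden, hidden + inputs)."""
    concat = np.concatenate([h_prev, x_t])
    r_t = sigmoid(p["w_r"] @ concat + p["b_r"])           # reset gate, eq. (17)
    z_t = sigmoid(p["w_z"] @ concat + p["b_z"])           # update gate, eq. (18)
    gated = np.concatenate([r_t * h_prev, x_t])
    h_hat = np.tanh(p["W"] @ gated + p["b_h"])            # candidate state, eq. (19)
    return z_t * h_hat + (1.0 - z_t) * h_prev             # new hidden state, eq. (20)

def gru_forward(sequence, p, hidden_size):
    """Run the cell over a (time, inputs) sequence and return the final hidden state."""
    h = np.zeros(hidden_size)
    for x_t in np.asarray(sequence, dtype=float):
        h = gru_step(x_t, h, p)
    return h
```

With all weights and biases set to zero, both gates equal 0.5 and the candidate state is zero, so each step simply halves the previous hidden state; this gives a quick sanity check of the implementation.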
J. Feature Importance
This study utilized SHAP (SHapley Additive exPlanations) to ascertain the significance of each feature. SHAP is based on the concept of Shapley values, which quantify the influence of each feature on the outcome variable. SHAP values are employed in feature selection to quantify the individual impact of each feature and thereby assess its significance. The dataset is first fed into the model, after which the SHAP framework assigns a SHAP value to each feature of every data point, reflecting the individual contribution of that feature to the model's output. Therefore, the SHAP value calculation depends on the model being used. The SHAP value of feature $j$ is calculated as
\begin{equation*} \phi_j = \frac{1}{\left| N \right|!} \sum\limits_{S \subseteq N\backslash \left\{ j \right\}} \left| S \right|! \left( \left| N \right| - \left| S \right| - 1 \right)! \left[ f\left( S \cup \left\{ j \right\} \right) - f\left( S \right) \right] \tag{21} \end{equation*}
where $N$ is the set of all features, $S$ is a subset of features that excludes feature $j$, and $f(S)$ is the model output obtained using only the features in $S$.
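For a small feature set, (21) can be evaluated exactly by enumerating all subsets. The sketch below does this for an arbitrary value function; it illustrates the formula only and is not the approximation scheme used inside the SHAP library. The names `shapley_values` and `value_fn` are hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(features, value_fn):
    """Exact Shapley values; value_fn maps a frozenset of features to a model output."""
    n = len(features)
    phi = {}
    for j in features:
        others = [f for f in features if f != j]
        total = 0.0
        for size in range(n):  # subset sizes 0 .. n-1
            # Weight |S|! (|N| - |S| - 1)! / |N|! from eq. (21).
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (value_fn(s | {j}) - value_fn(s))
        phi[j] = total
    return phi
```

The values sum to the difference between the output with all features and the output with none, which is the additivity property that SHAP relies on. With only two inputs (NDVI and NDBI), the enumeration involves just four subsets.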
Results
A. Temporal Trends
The study analyzed the temporal trend of the UHI over the study area over the last 23 years (2001–2023). Monthly data was processed to eliminate seasonal variations, and the MMK trend test assessed the temporal trends. This study examines the UHI phenomenon and demonstrates noteworthy trends for seasonal and annual data (see Fig. 5). The monthly standardized UHI data exhibits a marginal upward trend (stable) from 2001 to 2023, as indicated by a Tau value of 0.121 and a Sen's Slope of 0.0001. The winter UHI exhibits a significant decline, as seen by a Tau value of −3.486 and a Sen's Slope of −0.059. These values indicate a decrease in the intensity of UHI throughout the winter season. The UHI throughout the summer exhibits a significant rise, shown by a Tau value of 0.158 and a Sen's Slope of 0.012. The yearly UHI data shows a consistent decline, as seen by a Tau value of −1.268 and a Sen's Slope of −0.004 (p < 0.05). The data indicate a decrease in the UHI effect during winter but an increasing tendency during summer, suggesting seasonal fluctuations in urban heat effects.
The NDVI trends result in Multan for annual and seasonal data are shown in Fig. 6. The monthly normalized NDVI data exhibits a significant upward trend from 2001 to 2023, as indicated by a Tau value of 2.23 and a Sen's Slope of 0.002. During winter, the NDVI shows a significant rise, with a Tau value of 4.54 and a Sen's Slope of 0.006 (p < 0.001), indicating enhanced vegetative health. The summer NDVI shows a significant upward trend, with a Tau value of 2.69 and a Sen's Slope of 0.002 (p < 0.05). The yearly NDVI data supports this favorable trend, as indicated by a Tau value of 3.43 and a Sen's Slope of 0.002 (p < 0.001). The results demonstrate a general improvement in vegetation coverage and condition throughout all seasons, with notable increments, particularly during winter and summer. The NDBI data for Multan show substantial decreases seasonally and annually (see Fig. 7). The monthly normalized NDBI data exhibits a significant downward trend from 2001 to 2023, as indicated by a Tau value of −9.907 and a Sen's Slope of −0.007. During the winter, there is a significant decrease in the NDBI, with a Tau value of −4.121 and a Sen's Slope of −0.009. This indicates a fall in the extent of built-up regions. The summer NDBI exhibits a significant decline, shown by a Tau value of −4.756 and a Sen's Slope of −0.007. The annual NDBI data provides additional evidence to support these conclusions. It shows a consistent decrease over time, with a Tau value of −4.754 and a Sen's Slope of −0.007. The data demonstrate a steady decrease in developed regions, specifically in the winter and summer, indicating shifts in urban growth trends.
B. Relationship of SUHI and Environmental Variables
Correlation analysis was performed on standardized values of monthly SUHI and environmental variables (NDBI and NDVI) for Multan from 2001 to 2023. Fig. 8 shows the correlation results from Pearson correlation analysis. The monthly correlations reveal a noteworthy inverse correlation of −0.540 between NDVI and UHI, indicating that more vegetation cover is linked to reduced UHI impacts. In contrast, the NDBI exhibits a positive association of 0.344 with UHI, suggesting that the expansion of built-up regions leads to elevated UHI levels. Every year, the NDVI consistently shows a negative connection of −0.394 with the UHI, whereas the NDBI maintains a positive association of 0.347. Significantly, in winter, there is a stronger negative connection (−0.568) between NDVI and UHI, whereas there is a considerable positive correlation (0.590) between NDBI and UHI. In winter, a highly negative correlation (−0.912) between NDVI and NDBI indicates a clear inverse association between vegetation and built-up areas. There are weaker but still significant connections during the spring and summer seasons. In particular, a negative correlation (−0.384) exists between NDVI and UHI during spring. The wavelet coherence analysis demonstrates UHI's dynamic and time-varying connections with NDBI and NDVI. The wavelet coherence between UHI and NDVI in Multan reveals a strong and statistically significant coherence, particularly throughout the time range of 4 to 16 months, spanning from 2003 to 2022 [see Fig. 9(a)]. The strongest correlation is found between 4 to 8 months, indicating a seasonal connection between vegetation cover and urban heat. The arrows inside these locations predominantly indicate a leftward and downward direction, signifying that the UHI and NDVI exhibit an out-of-phase relationship, with NDVI leading SUHI. This correlation suggests that when vegetation cover grows, there is a corresponding decrease in the UHI, indicating the cooling impact of vegetation on urban heat. 
The coherence is particularly robust during the 16 months, suggesting an annual cycle in which vegetation changes have a substantial and lasting impact on urban heat. Additionally, the wavelet coherence of the study shows distinct areas of strong coherence between UHI and NDBI at various time intervals [see Fig. 9(b)]. The time period spanning from around 2003 to 2015 exhibits a strong coherence ranging from 4 to 16 months, suggesting a persistent correlation between UHI and NDBI over these specific time intervals. The arrows within these prominent areas primarily indicate a rightward and upward direction, suggesting a strong correlation between UHI and NDBI, with NDBI leading UHI. This correlation indicates that the expansion of developed regions (as represented by NDBI) is causing an increase in the UHI. The coherence remains high throughout shorter periods (4–8 months), suggesting a seasonal impact where changes in urban infrastructure rapidly impact urban heat dynamics.
Fig. 9. Temporal correlation of (a) SUHI and NDVI and (b) SUHI and NDBI for Multan from wavelet coherence.
C. Novel GRU-Based Model
The novel GRU-based model was developed to forecast UHI from two environmental variables: NDVI and NDBI. The raw data were decomposed and reconstructed using DWT to improve the model's efficacy. Subsequently, the data were organized for time-series forecasting by constructing sliding windows. A look-back period of 12 months was selected, meaning that the model uses data from the preceding 12 months to forecast the subsequent month's UHI. The dataset was partitioned into training (80%) and testing (20%) subsets, and k-fold cross-validation (k = 5) was employed to evaluate the model's performance across different subsets of the dataset. The model comprises three GRU layers: the first two contain 128 units each, and the third contains 64 units. The larger number of units in the first two layers enhances the model's capacity to learn intricate patterns. A dropout layer with a rate of 0.2 follows each GRU layer to mitigate overfitting. A dense output layer with a single neuron forecasts the UHI value for the next time step. The model uses the Adam optimizer and the RMSE loss function. Early stopping is used as a callback to terminate training if the validation loss fails to improve for ten successive epochs. The model is trained on the training set for a maximum of 100 epochs with a batch size of 12; the 100-epoch limit provides sufficient time for learning, whereas early stopping prevents training from continuing once the validation loss stabilizes. Accuracy metrics, including R2, RMSE, and MAE, were used to compare the predicted and actual UHI values.
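The sliding-window construction with a 12-month look-back can be sketched as follows. Two points are assumptions rather than details given in the text: each window contains only the NDVI and NDBI columns, and the 80/20 partition is chronological (no shuffling). The function names are hypothetical.

```python
import numpy as np

def make_windows(features, target, look_back=12):
    """Build (samples, look_back, n_features) inputs and next-month targets."""
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    n_samples = len(target) - look_back
    # Window i covers months i .. i+look_back-1 and predicts month i+look_back.
    X = np.stack([features[i:i + look_back] for i in range(n_samples)])
    y = target[look_back:]
    return X, y

def chronological_split(X, y, train_fraction=0.8):
    """Split without shuffling so that the test set follows the training set in time."""
    cut = int(len(X) * train_fraction)
    return (X[:cut], y[:cut]), (X[cut:], y[cut:])
```

For 276 monthly observations (2001–2023), this yields 264 windows of shape (12, 2), which matches the input shape a GRU layer expects.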
D. GRU-Based Model Performance
The efficacy of the innovative GRU-based model in predicting UHI effects was extensively assessed utilizing diverse performance criteria. The learning curve, residuals, and accuracy plots provide evidence of the model's robust capacity to acquire knowledge, generalize, and make precise predictions of UHI values over the research duration. The learning curve plot displays the loss values for the training and validation datasets throughout 75 epochs [see Fig. 10(a)]. At first, the training and validation losses fall quickly, suggesting that the model is learning efficiently. During the 10th epoch, there is a temporary increase in the validation loss, indicating a possible case of overfitting. However, it later reduces and closely aligns with the training loss by the 20th epoch. The ultimate loss values reach a stable state, with minimal deviation between the training and validation losses, suggesting the model has strong generalization. The behavior observed indicates that the innovative GRU-based model successfully acquires the fundamental patterns in the data without experiencing excessive overfitting, thereby demonstrating a robust and reliable model performance.
Fig. 10. (a) Learning curves of the GRU-based model for the training and validation phases. (b) Residual plot of the GRU-based model.
The residual plot displays the prediction errors of the GRU-based model on the testing dataset [see Fig. 10(b)]. The residuals, when plotted against time, are randomly distributed around the zero-error line, suggesting the absence of any discernible systematic bias in the model's predictions. The errors are typically small, as most data points lie near the zero line, indicating precise predictions. A few larger residuals indicate occasional large errors; however, these occurrences are relatively rare. Overall, most prediction errors are evenly spread around the zero line, supporting the model's reliability in forecasting UHI levels.
The accuracy plot compares the observed and predicted UHI values throughout the study period, distinguishing the training, testing, and prediction stages (see Fig. 11). For the training data, the model achieves an R2 of 0.96, an RMSE of 0.21, an MSE of 0.04, and an MAE of 0.17. The testing and prediction data also correlate strongly with the observed values. The testing-phase metrics (RMSE = 0.30, R2 = 0.90, MSE = 0.09, and MAE = 0.25) indicate high accuracy and predictive capability. The R2 value of 0.90 indicates that the model explains 90% of the variance in the UHI data, highlighting the usefulness of the GRU-based technique in capturing the intricate dynamics of UHI. Overall, the GRU-based model learns from the available data, generalizes to unseen data, and delivers predictions with a small margin of error. The high coefficient of determination (R2) and low RMSE support the model's suitability for UHI prediction tasks.
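For reference, the four metrics reported above can be computed as in the following sketch. It uses the standard definitions; the function name `regression_metrics` and the sample values are illustrative assumptions and are not taken from the study. Because RMSE is the square root of MSE, the reported testing values are mutually consistent (0.30 squared is 0.09), as are the training values after rounding (0.21 squared is about 0.04).

```python
import math


def regression_metrics(actual, predicted):
    """Compute R2, RMSE, MSE, and MAE for paired observations."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_actual = sum(actual) / n
    total_ss = sum((a - mean_actual) ** 2 for a in actual)
    # R2 = 1 - (residual sum of squares / total sum of squares)
    r2 = 1.0 - (mse * n) / total_ss
    return {"r2": r2, "rmse": math.sqrt(mse), "mse": mse, "mae": mae}


# Illustrative values only, not the study's UHI data.
actual = [2.0, 2.5, 3.0, 3.5, 4.0]
predicted = [2.1, 2.4, 3.2, 3.3, 4.1]
metrics = regression_metrics(actual, predicted)
print({name: round(value, 3) for name, value in metrics.items()})
```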
E. Feature Importance
The SHAP analysis conducted on the GRU-based model reveals the role of NDVI and NDBI in predicting the UHI effect. NDVI is the most important feature, with the highest mean absolute SHAP value (0.036), indicating that it has the largest impact on the model's output (see Fig. 11). The importance of NDVI highlights the role of vegetation in regulating urban heat, since greater NDVI values usually correspond to lower UHI intensity due to the cooling effect of vegetation. Although NDBI is less important than NDVI, it contributes considerably to the model's predictions. Its mean absolute SHAP value (0.023) reflects its role in capturing the impact of built-up areas on urban heat dynamics. NDBI also exhibits a substantial coverage value, indicating that variations in built-up areas have a wide-ranging effect.
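The ranking by mean absolute SHAP value can be sketched as follows. The sketch assumes that SHAP values have already been obtained and reduced to one value per sample and feature (for a sequence model this implies some aggregation over time steps, which the text does not describe). The function `mean_abs_shap` and the sample rows are hypothetical; only the two totals in `reported` come from the text.

```python
def mean_abs_shap(shap_rows, feature_names):
    """Rank features by the mean absolute SHAP value across samples."""
    n = len(shap_rows)
    importance = {}
    for column, name in enumerate(feature_names):
        importance[name] = sum(abs(row[column]) for row in shap_rows) / n
    # Sort from most to least important.
    return dict(sorted(importance.items(),
                       key=lambda item: item[1], reverse=True))


# Hypothetical per-sample SHAP values: columns are [NDVI, NDBI].
shap_rows = [
    [-0.05, 0.02],
    [0.03, -0.03],
    [-0.04, 0.01],
    [0.02, -0.03],
]
ranking = mean_abs_shap(shap_rows, ["NDVI", "NDBI"])
print(ranking)

# Values reported in the text: NDVI's share of the total attribution.
reported = {"NDVI": 0.036, "NDBI": 0.023}
ndvi_share = reported["NDVI"] / sum(reported.values())
print(round(ndvi_share, 2))
```

Under the reported values, NDVI accounts for roughly 61% of the summed mean absolute attribution of the two features.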
F. Comparison of Performance of Developed Model
To evaluate the efficacy of the GRU-based model, we compared its accuracy with that of one of the most widely tested and effective deep learning models, the LSTM-RNN model [55]. Figs. 12 and 13 compare the two models in terms of R2, RMSE, and MAE, indicating that the GRU-based model outperforms the LSTM model in predictive accuracy. The GRU model achieves an R2 of 0.90, a closer agreement between forecasted and actual values than the LSTM model (R2 = 0.84). The GRU model also has lower errors, with an RMSE of 0.25 and an MAE of 0.09, compared with an RMSE of 0.28 and an MAE of 0.12 for the LSTM model. The lower RMSE and MAE indicate that the GRU model's prediction errors are smaller on average, which makes it more accurate and reliable for UHI forecasting.
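To put the comparison in relative terms, the short sketch below computes the error reduction of the GRU model over the LSTM model, using the values exactly as reported in this comparison. The helper `relative_error_reduction` is a hypothetical name introduced for illustration.

```python
# Metric values as reported in the model comparison.
reported = {
    "GRU": {"r2": 0.90, "rmse": 0.25, "mae": 0.09},
    "LSTM": {"r2": 0.84, "rmse": 0.28, "mae": 0.12},
}


def relative_error_reduction(baseline, candidate):
    """Fraction by which the candidate's error is below the baseline's."""
    return (baseline - candidate) / baseline


rmse_reduction = relative_error_reduction(reported["LSTM"]["rmse"],
                                          reported["GRU"]["rmse"])
mae_reduction = relative_error_reduction(reported["LSTM"]["mae"],
                                         reported["GRU"]["mae"])
r2_gain = reported["GRU"]["r2"] - reported["LSTM"]["r2"]

print(f"RMSE reduction: {rmse_reduction:.1%}")
print(f"MAE reduction: {mae_reduction:.1%}")
print(f"R2 gain: {r2_gain:.2f}")
```

On these figures the GRU model's RMSE is about 11% lower and its MAE 25% lower than the LSTM model's, with an R2 higher by 0.06.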
Representation of actual and forecasted data in the testing phase of the GRU-based model.
Discussion
This study investigates the UHI phenomenon in Multan over 23 years (2001–2023). The analysis incorporates statistical approaches, wavelet coherence, and deep learning techniques. The findings indicate stable but slightly rising trends in UHI effects, although these trends are not statistically significant [56]. However, the environmental variables showed significant trends: NDBI declined, whereas NDVI increased [57], [58]. These changes reveal the intricate relationship between urbanization, vegetation cover, and local thermal conditions. In alignment with Zhang et al. [59], our research supports the role of increased vegetation in mitigating the UHI effect (NDVI and UHI are inversely correlated), underscoring the significance of green infrastructure in urban planning. Our results also agree with those of Hidalgo-García and Arco-Díaz, who reported that an upward trend in NDVI helps mitigate the UHI effect [60], [61]. These findings emphasize the pivotal role of green infrastructure in urban planning to improve thermal comfort and alleviate heat-related risks. Consistent with the UHI increases observed in global mega-cities and in cities of different sizes, our study shows similar increases for two seasons and for the monthly data. In contrast, annual, winter, and autumn UHI show a slightly decreasing trend in the study area. This difference indicates the role of local factors in shaping urban thermal landscapes [62], [63]. Additionally, the absence of densely populated urban areas and the low density of high-rise buildings in Multan, compared with large cities around the globe, may explain these results. The stable UHI trend may be associated with Pakistan's green infrastructure projects, i.e., Green Pakistan and the Billion Tree Tsunami. These green initiatives promoted afforestation, which increases NDVI and can stabilize the UHI effect [64].
Our correlation analysis between UHI and NDBI supports the findings of Tariq et al. [65], who highlighted a significant relationship between built-up areas and UHI in the Multan region. More broadly, the significant relationships between UHI and the environmental variables NDBI and NDVI are consistent with recent studies emphasizing the role of land cover in modulating urban heat islands. Furthermore, the study used a GRU-based deep learning model, a class of model increasingly used in environmental research for its predictive power and interpretability. The model's performance, with an RMSE of 0.30 and R2 of 0.90, aligns with recent advancements in predictive modeling for UHI analysis. These results are in line with those of Tian et al. [66], who also reported an R2 of 0.90 for a GRU-based model, and Alamgir et al. [67], who also demonstrated the high accuracy of the GRU-based model. Additionally, a previous study suggested the suitability of the GRU-based model over LSTM [68], which agrees with our results. Our analysis highlights NDVI as the most influential predictor, which aligns with previous research emphasizing the crucial significance of local surface temperatures in urban heat dynamics.
A. Limitations of the Study
Although this study has contributed to the understanding of the UHI phenomenon through wavelet analysis for temporal correlation and a GRU-based model for prediction, there are still opportunities for future research to build upon these findings [53], [69]. This study used monthly data to analyze long-term UHI trends; higher temporal resolution data, such as daily or weekly observations, could provide more granular insights into short-term UHI dynamics and their drivers, and represent a potential avenue for future research [70], [71]. Further research should explore additional environmental and socioeconomic factors that influence the UHI phenomenon, such as population density, automotive heat emissions, changes in land use, albedo, and industrial heat emissions. Although the GRU-based model has shown strong performance, all models have inherent limitations. The accuracy of the GRU-based model depends on the quality of the input data. Furthermore, the transferability of the GRU-based results to other urban environments may be limited by weather effects and differences in urban structure. The reliability of the forecasts could be improved through further validation with real-time data and sensitivity analysis of the model parameters. In addition, testing different models and comparing their accuracies can improve the selection of the most appropriate model for predicting the UHI phenomenon. Despite these limitations, the study lays a solid groundwork for using deep learning in environmental modeling. Its results can inform urban planning and help formulate more efficient strategies for climate change adaptation. Future research should improve the model, use more diverse datasets, and promote interdisciplinary collaboration to address the complex problems of UHI and sustainable urban development.
Conclusion
This study enhances the current understanding by offering specific insights into the dynamics of UHI in Multan. Additionally, it showcases the effectiveness of GRU-based deep learning methods in environmental research. The findings highlight the need to plant trees and to prioritize green infrastructure in urban planning to alleviate the impacts of UHI. Policymakers should strengthen afforestation efforts, expand urban green areas, and promote sustainable land-use practices. Future studies should prioritize long-term assessments of the enduring effects of green infrastructure on the UHI phenomenon. Additionally, future research should investigate the potential of emerging technologies such as the Internet of Things in managing urban heat. Furthermore, it is crucial to analyze the socioeconomic advantages of mitigating UHI, particularly regarding public health and urban comfort. Researchers should prioritize the improvement of deep learning models to enhance predictive accuracy. They should also incorporate a broader range of environmental and socioeconomic variables to better understand the complex dynamics of the UHI phenomenon.