Impact of High Concentrations of Saharan Dust Aerosols on Infrared-Based Land Surface Temperature Products

An analysis of three operational satellite-based thermal-infrared land surface temperature (LST) products is presented for conditions of heavy dust aerosol loading. The LST products are compared against ERA5’s skin temperature (SKT) across the Sahara Desert and Sahel region, where high concentrations of dust aerosols are prevalent. Large anomalous differences are found between satellite LST and ERA5’s SKT during the periods of highest dust activity, and satellite–ERA5 differences are shown to be strongly related to dust aerosol optical depth (DuAOD) at 550 nm, indicating an underestimation of LST in conditions of heavy dust aerosol loading. In situ measurements from two ground stations in the Sahel region provide additional evidence of this underestimation, showing increased biases of satellite LST with DuAOD, and no significant dependence of ERA5’s SKT biases on dust aerosol concentrations. The impact of atmospheric water vapor content on LST and SKT is also examined, but dust aerosols are shown to be the primary driver of the inaccurate LSTs observed. Based on comparisons with in situ data, we estimate an aerosol-induced underestimation of LST of approximately 0.9 K for every 0.1 increase in DuAOD. Analysis of brightness temperatures (BTs) in the thermal infrared atmospheric window reveals that dust aerosols have the opposite effect on BT differences compared to water vapor, leading to an underestimation of atmospheric correction by the LST retrieval algorithms. This article highlights a shortcoming of current operational LST retrieval algorithms that must be addressed.

Satellite-based LSTs are estimated most commonly from radiometric measurements in the thermal infrared (TIR) atmospheric window (8-13 μm). Estimating LST is a challenging task since multiple factors, such as surface properties, atmospheric composition, and satellite viewing geometry, affect the measured signal. These can be mostly accounted for via radiative transfer calculations, but require accurate knowledge of surface emissivity, vertical atmospheric profiles, and their respective effects on the measured TIR radiances. Numerous LST retrieval algorithms have been developed over the years [17] and many already provide LST with accuracies better than 1 K [18], [19], [20], [21], [22] under most surface and atmospheric conditions. However, despite continued development of these algorithms, they still have limitations and potential to be improved.
One limitation concerns retrievals under high aerosol loading. In the case of semiempirical methods, relating surface temperature with top-of-atmosphere brightness temperatures (BTs) in the TIR (split-window or mono-channel algorithms), a simplistic correction of aerosol effects is often included, assuming static average aerosol distributions during algorithm calibration [17], [23]. Although this approach is adequate for typical clear-sky conditions, the atmospheric transmissivity in the TIR decreases considerably in conditions of heavy aerosol loading [24] and this will lead to significant errors in LST retrievals if not properly accounted for. The impact of high aerosol concentrations on TIR-based retrievals of sea surface temperature (SST) has been studied since the 1980s [25], [26], [27], when stratospheric volcanic aerosols were found to cause a systematic underestimation of SSTs by up to 7 K following a major volcanic eruption. aerosol-correction schemes exist [25], [26], [27], [28], [29], [30], [31], [32], [33], [34].
Despite the growing interest in developing aerosol-robust LST retrieval algorithms, operational LST products largely disregard this effect-either ignoring aerosols or simply flagging pixels with high aerosol concentrations. The articles published on this issue over the past decade [35], [36], [37], [38], [39], [40], [41] have generally focused on simulating the effect of aerosols on LST retrievals using radiative transfer models, showing a similar impact as on SST (i.e., an underestimation of LST with increased aerosol loading). In some studies, aerosol-robust algorithms [36], [38], [39] have already been proposed.
The aim of this article is to assess three operational and commonly used LST products in conditions of high dust aerosol concentrations, complementing the theoretical findings of previous modeling exercises [35], [36], [37], [39], [40], [41], as a first step toward building the case for the implementation of aerosolrobust algorithms in operational LST retrievals. We analyze the LST product provided by EUMETSAT's Satellite Applications Facility on Land Surface Analysis (LSA SAF) [42] that uses a generalized split-window (GSW) algorithm applied to data from the Spinning Enhanced Visible and InfraRed Imager (SEVIRI) on board Meteosat Second Generation (MSG) satellites; and two LST products provided by NASA, both using data from the Moderate Resolution Imaging Spectroradiometer (MODIS) onboard the Terra and Aqua satellites-one employing a similar GSW algorithm [43] and the other a temperature-emissivity separation (TES) algorithm [44]. We focus here on the effect of dust aerosols as they are one of the most abundant types of aerosols by mass and have the highest maximum optical thickness of all aerosol types [45].
In the first part of this study (Section III-A), we compare the three LST products against the skin temperature (SKT) from ECMWF's fifth-generation climate reanalysis, ERA5, over the Sahara Desert and surrounding regions-an area of high prevalence of dust aerosols [46], [47]. ERA5's SKT corresponds to the temperature of the interface between the atmosphere's lower model layer and the soil's upper layer. It determines the longwave radiation emitted by the model's surface, being therefore physically equivalent to the satellite LST. It is well-known that ERA5's SKT is not error (or bias-) free [48], [49], [50]. Indeed, ERA5's SKT is not meant to be used here as an absolute reference for the LST products, but it is used instead as a spatially and temporally coherent dataset, appropriate to support consistency assessments of LST retrievals. The satellite-ERA5 differences (termed LST Delta) are assessed under varying atmospheric conditions, namely dust aerosol (Section III-B) and water vapor (Section III-C) concentrations, as represented by the Copernicus Atmosphere Monitoring Service (CAMS) global reanalysis of atmospheric composition, EAC4, and ERA5, respectively. The inherent differences between LST and SKT (more details in Section II-C) mean that the results from this analysis are only a first indication of the impact of dust aerosols on the examined LST products. The second part of this study (Sections III-D and III-E) makes use of in situ measurements from two ground stations within the domain of study to complement the results from the LST versus SKT comparisons. The stations are located in the Sahel region (Niamey, Niger during 2006 and Dahra, Senegal from 2009 to 2013), south of the Sahara Desert, and provide in situ LSTs, which serve as an independent reference to both the satellite-based LST products and the reanalysis SKT. These stations are located in a region where high dust aerosol concentrations are frequent, making them particularly appropriate to assess LST retrievals under such conditions. To our knowledge, there are no previous studies providing a consistent comparison between satellite-retrieved LST (regardless of the algorithm) and in situ data under high aerosol loading. In the final part of this work (Section III-F), we examine MSG's BTs used in the estimation of LST to show how dust aerosols affect the LST retrievals.

A. Area and Period of Study
This study is performed across the area delimited by latitude circles 40°N and 10°N, and by meridians 20°W and 40°E. This domain contains the Sahara Desert, which is an optimal location to analyze the effect of dust aerosols on LST retrievals: it is a vast region with a relatively homogenous surface, largely stable over time; it is the largest source of airborne dust particles on the globe [46], [47]; the atmosphere over the desert has a relatively low water vapor content [51], which means LST retrievals over this area are less affected by this critical atmospheric component; and it has negligible amounts of other types of aerosols (such as soot, organic matter, or sea-salt) [52]. We examine the effect of dust aerosols on LST retrievals during the year 2006 and the years 2009 to 2013, to ensure overlapping with the available in situ LSTs within the area of study.

B. Satellite Data
Three different satellite-based LST products are examined in this work, all derived from TIR top-of-atmosphere observations for clear-sky pixels. When comparing against ERA5's SKT, all products are reprojected onto a regular grid of 0.25°× 0.25°. This reprojection is done by averaging all valid (i.e., cloud-free) LST datapoints inside each 0.25°× 0.25°grid-box. During regridding, we exclude bundles with less than 70% valid pixels to minimize the impact of cloud contamination on our results. The comparison between satellite-based LST and in situ LST is performed using the original resolution of the satellite products. A brief description of the three LST products follows.
1) MSG SEVIRI: LSA SAF's LST product [42] is based on radiance measurements of the SEVIRI sensor on-board MSG satellites and is retrieved with the GSW algorithm [53], [54]. It is available every 15 min in near-real-time in the original geostationary satellite projection (3 km at the subsatellite point) as well as on a regular 0.05°× 0.05°grid. Here, we restrict our analysis to daytime observations between 8:00 UTC and 14:00 UTC, and to night-time observations between 20:00 UTC and 02:00 UTC, which ensures overlap with MODIS overpasses.
Special attention is given to this LST product in the last part of this work, where a more detailed analysis of the effect of dust aerosols on MSG observations is performed. Specifically, we examine the behavior of the BT difference between SEVIRI channels centered on 10.8 and 12.0 μm in relation to dust aerosol loading and water vapor. The BT data are also preprocessed to avoid cloud-contaminated values: first, its spatial resolution is adjusted to match ERA5's regular grid of 0.25°× 0.25°, during which all bundles with less than 70% cloud-free pixels are eliminated, taking into account the cloud mask of the MSG LST product. During this specific analysis, only desert pixels are considered, to reduce the impact of surface emissivity effects on the BT signal. Information on the landcover is obtained from the International Geosphere-Biosphere Program [55], which is currently used by LSA SAF to estimate MSG/SEVIRI emissivities [56], [57]. Datapoints with viewing zenith angle (VZA) above 42°are discarded to limit the effect of increased atmospheric path length on BT differences. This threshold is chosen as a compromise between having enough data for this analysis and limiting as much as possible the view-angle effects in atmospheric correction.
2) Terra/Aqua MODIS: The remaining two LST products are based on measurements of the MODIS sensor on-board Terra and Aqua, two satellites of NASA's Earth Observing System. a) MxD11 product: We consider the gridded MYD11C1 [58] and MOD11C1 [59] products for the comparisons against ERA5's SKT; and the Level-2 MYD11A1 [60] and MOD11A1 [61] products for the comparison against in situ data. The MxD11 product is derived using a GSW algorithm, similar to that of the LSA SAF SEVIRI product. b) MxD21 product: Following the strategy described above, we consider the gridded products MYD21C1 [62] and MOD21C1 [63] for comparisons against ERA5's SKT; and the Level-2 version, MYD21A1D [64], MYD21A1N [65], MOD21A1D [66], and MOD21A1N [67] for comparisons against in situ data ("D" and "N" indicate the daytime and night-time products, respectively). The MxD21 product is derived using a TES algorithm [68]. Terra and Aqua have sun-synchronous orbits (i.e., cross the equator at the same local time); for the study area, the daytime Terra overpass is between 8:00 UTC and 13:00 UTC; and between 20:00 UTC and 01:00 UTC for the night-time overpass, whereas that of Aqua is between 10:00 UTC and 14:00 UTC (between 22:00 UTC and 02:00 UTC) for the daytime (night-time) overpass. "C1" products have a spatial resolution of 0.05°× 0.05°and "A1" products maintain the original 1 km resolution.
The details of the satellite LST products used are summarized in Table I.

C. ERA5 Data
ERA5 is the fifth-generation climate reanalysis from ECMWF [69]. The following three ERA5 variables are used throughout this study.
1) SKT, for the comparison against satellite LST.
2) Total cloud cover, to eliminate SKT data where cloud coverage is above 30%, following [48]. TIR-based LST products cover clear-sky conditions only and, therefore, meaningful satellite-model comparisons can only be performed considering (nearly) clear-sky SKT (e.g., [48], [50]). 3) Total column water vapor (TCWV), to assess the impact of atmospheric correction on the performance of LST algorithms, given that water vapor is the main component affecting the infrared bands used by the studied LST products. All variables are provided in a 0.25°× 0.25°grid with hourly temporal resolution.
ERA5's SKT represents the temperature at the interface between the model's soil and atmosphere, ensuring the closure of the surface energy balance. The SKT determines the model's surface thermal emission and, therefore, is physically compatible with LST estimates from TIR observations. SKT is not considered an absolute reference but, as shown in this work, is useful to diagnose the potential impact of dust aerosols on LST retrievals over an area with scarce in situ measurements. The conclusions drawn from the comparison between SKT and LST are later evaluated using in situ LST from two stations within the domain of study.
The satellite-based LST products are compared against SKT by computing their difference (as LST minus SKT), hereafter referred to as LST Delta, which is then systematically assessed for various ranges of dust aerosol and water vapor concentrations. For the calculation of LST Delta, ERA5's SKT values are directly matched to the hourly observations of SEVIRI/MSG and are linearly interpolated to the satellite observation time in the case of MODIS products.

D. CAMS Dust Aerosol Optical Depth (DuAOD) at 550 nm
Dust aerosol concentrations are obtained from EAC4, a CAMS global reanalysis optimized to estimate atmospheric composition [70]. EAC4 data are available three hourly on a 0.75°× 0.75°grid. In this work, we use the DuAOD at 550 nm. CAMS DuAOD is only available for the visible range; however, since it is found to be linearly related to DuAOD in the TIR [71], [72], [73], we consider it a suitable variable to quantify aerosol loading.
To analyze the relation between LST Delta and DuAOD, the DuAOD data are regridded with a nearest-neighbor approach to the 0.25°× 0.25°grid adopted for all products and are linearly interpolated to match the respective LST retrieval times. Fig. 1 shows the monthly mean DuAOD corresponding to the observation times of MSG. Dust production and transport have a clear seasonality, with higher values in the south-western region of the Sahara Desert during June to August, in agreement with previous studies of dust aerosol distributions in this area [46], [47].

E. In Situ LST
In situ retrievals of LST are available from two locations within the area of study, obtained during two different periods. The first set of measurements was obtained by an Atmospheric Radiation Measurement mobile facility in Niamey, Niger, located at 13°28 38.28 N, 2°10 32.88 E (marked with a cross in Fig. 1), during 2006 [74]. This site provided downwelling and upwelling longwave radiation measured by broadband radiometers. The data are provided every minute, with data available from January 13, 2006 to December 8, 2006, from which LST can be calculated via Stefan-Boltzmann's law [75] where R u and R d are the measured upward and downward longwave radiances, respectively, ε BB is the surface broadband emissivity, and σ SB is the Stefan-Boltzmann constant. Broadband emissivity of the site is obtained from the ASTER Global Emissivity Dataset 100-m V003 [76].
The second set of in situ LST originates from an LST validation station operated by the Karlsruhe Institute of Technology in Dahra, Senegal, located at 15°24 8.28 N, 15°25 58.08 W (marked with a dot in Fig. 1), from 2009 to 2013 [77]. The station is equipped with narrow-band radiometers measuring surface upwelling and atmospheric downwelling radiances from wavelengths of 9.6 to 11.5 μm. The data are provided every minute, in the period between July 12, 2009 and December 31, 2013. The LST is calculated by inverting Planck's function where λ is the radiometer central wavelength (10.55 μm), L λ,u and L λ,d are the upwelling and downwelling radiance measured by the station radiometers, and ε λ is the surface emissivity at λ. In this case, the emissivity estimates are those derived for the SEVIRI/MSG channel centered on 10.8 μm, which is close to the radiometers' central wavelength (following [77]). These two sets of in situ LST are used as an independent reference for the satellite-based LSTs and ERA5's SKT. Further precautions are taken to avoid selecting cloud-contaminated data, especially during the wet season, when the Intertropical Convergence Zone (ITCZ) crosses the stations [78], [79]. First, only satellite datapoints where all surrounding pixels are cloud-free are used: if only one of the eight surrounding pixels is cloud-contaminated, the satellite datapoint is discarded. Second, only SKT datapoints where the corresponding MSG LST is cloud-free (according to the previous condition) are used, to avoid comparing cloudy in situ observations against clear-sky ERA5 data. Furthermore, the in situ observations are averaged over 5-min intervals centered at the satellite/reanalysis observation/simulation time, in order to minimize the effect of fluctuations in the ground observations. Similar to the analysis of LST Delta, the differences between satellite-LST/ERA5-SKT and in situ LST (as satellite-LST/ERA5-SKT minus in situ LST), termed LST Error, are calculated and systematically compared to DuAOD and water vapor content. Additionally, the bias, the dispersion, and the uncertainty of each satellite-LST/ERA5-SKT product with respect to the in situ LSTs are calculated. These calculations follow the recommendations by the Committee on Earth Observation Satellites Working Group on Calibration and Validation -Land Product Validation Subgroup. Accordingly, the robust bias between satellite-LST/ERA5-SKT and in situ LST is given by the median error where LST sat/rea and LST insitu represent the satellite-LST/ERA5-SKT and in situ LST, respectively. The dispersion of the differences between satellite-LST/ERA5-SKT and in situ LST is given by the robust standard deviation (SD) The root-mean-square error (RMSE) of the satellite/reanalysis products is computed as where N is the sample size.

A. Comparison of Satellite LST Against ERA5's SKT
Monthly averages of LST Delta for the area and period of study are presented in Fig. 2 for the three investigated LST products (MSG, MxD11, and MxD21). In agreement with previous studies [48], [49], [50], [80], [81], ERA5's SKT has an overall cold bias with respect to satellite-based LST (i.e., LST Delta is positive), throughout most months over most of the area. However, strong negative LST Delta values can be observed in all three comparisons during summer (June, July, and August), when concentrations of dust aerosols are generally higher (see Fig. 1). These results suggest that high concentrations of dust aerosols might be affecting LST retrievals, leading to an underestimation of LST and the observed negative values of LST Delta.
Negative LST Delta are particularly pronounced in the MODIS products, which have lower values than the MSG product. The variable viewing geometry of MODIS (and longer optical paths) increases the sensitivity of the product to aerosols and may at least partially explain the differences between MODIS and MSG when compared against ERA5. Differences are also observed between the MODIS products: MxD21 shows a more localized negative pattern than MxD11, particularly at the southern edge of the domain. However, during summer months (wet season) the atmospheric water vapor content in this region is particularly high due to the position of the ITCZ [78], [79], which is also associated with a strong greening of the vegetation. As such, the discrepancy seen between LST products in this area  could be related to differences in atmospheric correction and/or surface emissivity estimates.
To better examine the role of dust aerosol and water vapor concentrations on values of LST Delta, a statistical analysis of the relation between LST Delta and these two variables follows.

B. Relation Between LST Delta and DuAOD
To assess the relation between LST Delta and dust aerosol loading, the DuAOD data are subdivided into six classes: DuAOD≤0.2, 0.2<DuAOD≤0.4, 0.4<DuAOD≤0.6, 0.6<DuAOD≤0.8, 0.8<DuAOD≤1.0, and DuAOD>1.0. The LST Delta values for each LST product are then sorted into the respective DuAOD class. The resulting LST Delta distributions are shown in Fig. 3 for daytime (red) and night-time (blue) separately.
Under low dust aerosol loading (DuAOD≤0.4) LST Delta values are predominantly positive for daytime observations, independently of the LST product considered, and slightly negative for night-time observations. This is in agreement with the patterns seen in Fig. 2 for regions with low dust aerosol concentrations. More significantly, and as expected from the results of the previous section, LST Delta values of each LST product decrease with increasing dust aerosol concentration, independent of the time of day, although the sensitivity to DuAOD seems to be stronger for daytime observations. In all LST products, there is a noticeable increase in the interquartile range for higher DuAOD classes. This could be partially related to increased DuAOD variability in situations of high aerosol concentrations that may not be well captured by the coarse temporal and spatial resolution of the CAMS reanalysis.
LST Delta shows a significantly lower dependence on TCWV than for DuAOD (see Fig. 3), although all LST products tend to have positive LST Delta values for drier atmospheres and negative or lower LST Delta values for moister atmospheres. However, while Fig. 3 revealed a systematic change in LST Delta with DuAOD, regardless of the satellite sensor or LST algorithm, the variation of LST Delta with TCWV is less pronounced and appears to be dependent on the respective algorithms' atmospheric correction.

D. Comparison of Satellite LST Against In Situ LST -Niamey
For an independent assessment of the three LST products (and ERA5's SKT), we now compare these products against in situ LST, starting with the data from the Niamey site. Fig. 5 displays the satellite LST and ERA5's SKT datapoints versus the corresponding in situ LST for northern hemisphere winter (DJF) and summer (JJA) months, the seasons with the lowest and highest dust aerosol concentrations, respectively. The corresponding bias, dispersion, and RMSE of the satellite LST and reanalysis SKT products are presented in Table II. MSG LSTs are in general agreement with ground measurements during winter months, showing a bias of -1.46 K (0.73 K) for the daytime (night-time) and relatively low SD and RMSE, with values of 1.17 K (1.05 K) and 2.35 K (1.34 K) for the daytime (night-time) observations, respectively. The MODIS products show overall higher dispersion and a stronger contrast between daytime and night-time overpasses. MxD11 presents a strong negative daytime bias (-4.48 K), which is significantly reduced to -0.85 K for night-time overpasses; the SD and RMSE also decrease significantly between day and night, from 3.72 to 1.41 K and from 6.18 to 2.67 K, respectively. In contrast, MxD21 has a weaker daytime bias (-0.38 K) compared to night-time (2.32 K), whereas the dispersion (SD) and RMSE also decrease from daytime to night-time (from 2.91 to 1.45 K and from 5.11 to 2.83 K, respectively). The higher dispersion of the MODIS products compared to MSG seems to be related to the variability   in the viewing geometry of satellites with polar orbit. In fact, we found that LST Deltas for high DuAOD tend to increase with increasing view angle (not shown). Finally, ERA5's SKT presents a significant daytime cold bias of -4.50 K (with SD of 2.19 K). Its RMSE (6.63 K) is high due to the high bias value. At night, the bias (-0.38 K), dispersion (1.56 K), and RMSE (1.49 K) are all lower. These results are in agreement with previous works showing that ECMWF SKT tends to underestimate the daily amplitude in arid and semiarid conditions (e.g., [48], [49], [50]). In summer months, a period of higher prevalence of dust aerosols in Niamey (see Fig. 1), all satellite LST products present stronger negative biases, larger dispersions, and higher uncertainties. The daytime (night-time) biases increase to -3.70 K (-3.41 K), -8.64 K (-5.09 K), and -2.62 K (-2.55 K) for MSG, MxD11, and MxD21 products, respectively, and the corresponding RMSE increases to 5.95 K (4.47 K), 9.70 K (6.86 K), and 7.43 K (4.68 K). This indicates less accurate LST retrievals during this season and is consistent with the previous conclusions that heavy dust aerosol loading causes LST to be underestimated. Interestingly, ERA5's daytime cold bias is significantly reduced to -0.37 K but increased (-2.12 K) for night-time simulations. The reasons for this seasonal change in ERA5 accuracy are out of scope of this work: these may be related to effects of aerosols on radiation fluxes, changes in surface fluxes and their simulation during the wet season (as opposed to those in the dry DJF season), or even due to compensation by other errors. Nevertheless, this result confirms that the comparisons between SKT and LST must be taken with caution.
To assess the role of dust aerosol and water vapor concentrations on the LST Errors (i.e., satellite LST or ERA5's SKT minus in situ LST), a similar statistical analysis as presented in Sections III-B and III-C is performed. For this analysis, the data for the entire year 2006 are used. Fig. 6 displays the relationship between LST Error and DuAOD (top panels) and TCWV (bottom panels), separated into daytime (red) and night-time (blue) observations. The panels clearly show an increased underestimation of all satellite LST products with increased DuAOD. For these products, the relation between the LST Error and TCWV is less steady, but all exhibiting a local minimum for TCWV values between 30 and 40 kg/m 2 . The limited sample size should be noted, especially in the case of MODIS products. Regarding ERA5's SKT, the LST Error shows a small dependence on DuAOD and a slightly stronger dependence on TCWV. Daytime cool biases tend to be larger for dry atmospheres and low DuAOD (more frequent during the dry season), being likely that multiple factors, such as seasonal variation in soil moisture, the representation of vegetation cover and radiative fluxes at the surface, and their respective role on the sensible/latent heat partition, play a role here [49].

E. Comparison of Satellite LST Against In Situ LST -Dahra
We now examine the satellite LST products over the Dahra site during the years 2009-2013. Fig. 7 displays the comparisons of satellite LST (and ERA5's SKT) against in situ LST. Table III presents the corresponding statistical results. Just as in Niamey, all satellite LST products show stronger negative biases for JJA, as well as larger dispersion and higher uncertainties than for the winter months, for both daytime and night-time observations. The daytime (night-time) biases increase to -5.09 K (-3.06 K), -11.16 K (-4.87 K), and -6.87 K (-2.86 K) for the MSG, MxD11, and MxD21 products, respectively, whereas RMSE increases to 7.54 K (4.51 K), 12.32 K (6.29 K), and 10.35 K (4.31 K). Regarding ERA5's SKT, we see an increase in its night-time bias (to -2.12 K), as well as in the daytime value (to -5.99 K), in contrast to Niamey, where ERA5 daytime statistics are better for JJA. However, while the SD and RMSE increase during daytime (to 7.01 and 9.32 K, respectively), they decrease during night-time (to 1.06 and 2.39 K). Again, these results, together with the ones for Niamey, show the limitations of using ERA5's SKT as an absolute diagnostic reference for LST and highlight the need for independent ground measurements.
As in the previous section, the relation between LST Error and dust aerosol and water vapor concentrations is analyzed statistically (see Fig. 8) for the entire years 2009-2013. The panels show similar results to the ones for Niamey, i.e., an increased underestimation of all LST products with increased dust aerosol loading. The relation with TCWV is similar for all satellite LST products, with the largest median LST Error around the fourth and fifth TCWV class. ERA5's LST Error exhibits no obvious dependence on DuAOD and a noticeable dependence on TCWV (especially at daytime), similar to Niamey.

F. MSG BT Analysis
To better understand the underestimation of satellite-based LSTs under heavy aerosol loading, we perform a more detailed analysis of MSG data. In particular, we analyze the BT difference between the channels centered on 10.8 and 12.0 μm, as it is explicitly used in the GSW algorithm and implicitly impacts the TES approach. Here, we only analyze BT differences (and not the BT mean) since BT values vary strongly with the season, making it difficult to decouple the effects of aerosol loading, atmospheric water vapor, and LST amplitude. The BT difference is predominantly determined by the emissivity of the surface and the transmissivity of the atmosphere. It is therefore useful to study how the BT difference varies with dust aerosol concentrations as this might help understand the underestimation of LST. Fig. 9 shows the two-way dependence of BT differences on DuAOD and TCWV for the desert region of the study area where VZA<42°, for the entire period of study. The data clearly show that BT differences increase with TCWV (underlining the role of water vapor in the GSW algorithm) and decrease with DuAOD. When water vapor and dust aerosol concentrations are low, typical of winter months, BT differences are negative since desert emissivities at 12.0 μm are generally higher than at 10.8 μm. In summer months, when water vapor and dust aerosol concentrations are higher, the BT difference may be negative or positive, and for certain combinations of water vapor and dust aerosol concentrations, their effect can balance out.
The increased BT difference with TCWV is a well-studied behavior [17]: water vapor mainly affects the "dirty" 12.0 μm channel, significantly decreasing the transmissivity of the atmosphere in this channel compared to the 10.8 μm channel and resulting in an increasingly positive BT difference with increased water vapor. The behavior of BT difference with DuAOD is consistent with the optical properties of dust aerosols: the imaginary part of the complex index of refraction of dust aerosols is higher in the 10.8 μm channel than in the 12.0 μm channel [82], [83], meaning that an atmosphere with heavy dust aerosol loading will have a lower transmissivity in the 10.8 μm channel, resulting in a progressively lower BT difference with increased dust aerosol concentrations. This result is also in agreement with findings from previous studies of the aerosol effect in the TIR [29], [32], [35], [41].
These changes in transmissivity due to dust aerosols, combined with the additional atmospheric emission caused by them, will introduce large errors in LST retrievals for two reasons: 1) by affecting the mean BT, which is also an essential component of the GSW algorithm; 2) by changing the expected dependence of the BT differences on TCWV, which is critical in algorithms such as the GSW. In the case of the TES, the emissivity retrieval is based on the spectral contrasts between the different channels. Therefore, any contribution of the atmosphere to the observed BT spectral contrasts not accounted for may introduce significant errors to the emissivities [20].

IV. DISCUSSION
Satellite-based estimations of LST are a valuable asset in the global study of earth's land surface. However, current TIR LST products largely ignore the effect of aerosols on satellite retrievals, either by only using simplistic approaches (e.g., using average aerosol loadings or profiles in the simulations for algorithm calibration) or by not accounting for this effect at all [43], [84], [85], [86], [87]. In this work, we evaluate the performance of three operational LST products (LSA SAF's SEVIRI/MSG and NASA's MxD11 and MxD21) in conditions of heavy dust aerosol loading, assessing them over the Sahara Desert and the Sahel region, where high concentrations of dust aerosols frequently occur. The LST products are compared against ERA5's SKT over this domain, as well as against in situ measurements from two ground stations within it. We provide compelling evidence that current LST products underestimate in conditions of heavy dust aerosol loading, complementing previous findings from modeling exercises about the impact of aerosols on LST retrievals [35], [36], [37], [38], [39], [40], [41].

A. Comparison of Satellite LST Against ERA5's SKT
In order to understand the spatial and temporal extent of aerosol-affected LST retrievals, we compare the three LST products against a corresponding reanalysis variable, ERA5's SKT. Several works indicate that ECMWF tends to underestimate maximum daily values of SKT (compared to LST) and slightly overestimate minimum (night-time) temperatures, especially in arid regions such as Northern Africa [48], [50], [80], [81]. Our satellite versus reanalysis comparisons, computed as LST minus SKT (termed LST Delta), exhibit these time-of-day-dependent biases over much of the area and period of study (see Fig. 2).
During summer months, however, coinciding with high concentrations of dust aerosols over the Sahara Desert, the LST Delta signal inverts to being strongly negative. Indeed, LST Delta (for all three LST products) is shown to be related to dust aerosol concentrations (DuAOD), becoming more negative with increased DuAOD. With the use of in situ measurements, we ultimately show that the anomalous LST Delta signal observed during summer months is associated with underestimated satellite LST due to heavy dust aerosol loading.
Given the similar seasonality of water vapor and dust aerosols in the region of study, the relation between LST Delta and atmospheric water vapor content, i.e., TCWV, is also examined. The dependence of LST Delta values on TCWV is shown to be much weaker (see Fig. 4), being slightly more pronounced for MODIS than MSG. Although it is difficult to decouple the effects of water vapor and dust aerosols, there is a stronger relation between the negative LST Delta anomalies and high dust aerosol concentrations, suggesting that this is the primary driver of these anomalies.
In a recent study [88], large biases of the MxD11 LST product were found when compared to ground measurements performed at the Heihe River Basin, an arid region in northwest China. The biases were particularly high during spring, when concentrations of dust aerosols are highest in that region. Although the authors attributed those large errors to the misrepresentation of surface emissivity, it is likely that dust aerosols also play a major role in the inaccuracies. Here, the comparisons against ERA5 over the Sahara region are particularly useful to disentangle the effect of emissivity and aerosols, given that the whole area has fairly stable and similar emissivity values. Since the high LST-SKT differences are mostly confined to the westernmost part of the study region, where DuAOD values are highest, it is possible to pinpoint aerosols as key to the underestimation of satellite LST.

B. Comparison of Satellite LST Against In Situ LST
Recognizing that ERA5's SKT cannot be considered an absolute reference, we use in situ retrievals of LST as an independent reference to both satellites LST and ERA5's SKT. At the two locations analyzed, Niamey and Dahra, the LST products show increased negative biases during the season of high dust aerosol concentrations (i.e., summer) compared to the season of low dust aerosol concentrations (i.e., winter) (see Figs. 5 and 7, Tables I and II). The dispersion and uncertainty of the LST products also generally increased from winter months to summer months, indicating less accurate LST retrievals during the summer. Moreover, the differences between satellite LST and in situ LST  Tables I and II), this change in accuracy is not sufficient to account for the negative LST Delta anomalies seen in Fig. 2 and, therefore, cannot be its , suggesting that the changes in LST Delta with DuAOD (Fig. 3) can be mostly attributed to the influence of dust aerosols on the LST products.
The relation between ERA5's LST Error and DuAOD and TCWV [see Figs. 6(d), (h) and 8(d), (h)] suggests that there are various potential sources of error in the SKT estimates. After all, SKT is estimated through a complex surface model that requires a large array of inputs and is limited by the model's representation of surface conditions and their interaction with the atmosphere, making it prone to errors in complex situations such as during episodes of high aerosol concentrations. While a full characterization of these errors exceeds the scope of this study, it is worth noting that ERA5 only uses a monthly climatology of optical depth at the surface [69], [89]. The misrepresentation of peak values in aerosol loading, as well as possible limitations in the simulation of their impact on radiation fluxes, may be a source of SKT errors. It is also relevant to consider that the seasonal pattern of dust aerosols in the Saharan region is closely linked to the ITCZ, which in turn results in similar patterns of atmospheric water vapor concentrations and vegetation density. Vegetation cover may affect SKT due to limitations in the representation of surface emissivity, especially if the representation of vegetation mean values and seasonal dynamics is limited (e.g., [48]). For these reasons, ERA5's SKT is used here only as a means to diagnose the LST products and to assess the extent in space and time of aerosol-induced underestimations.

C. Limitations of Ground Stations
The ground measurements used in this work are essential to unambiguously show the weak performance of the examined LST retrieval algorithms under high dust aerosol concentrations, but they also have limitations, namely 1) the small number of sites offers only a restricted range of atmospheric (dust aerosol and water vapor) and surface conditions; 2) both stations are located in the Sahel region, a transition zone from desert to savanna, meaning the surface around them is not as homogenous as ideally needed to assess the intricate effect of aerosols on LST retrievals. The heterogeneity of the land surface around the station means they have a limited representativeness on the satellite pixel scale. Dahra, for example, is covered by sparse trees and seasonal grass, with a strong greening during the rainy season (July to October) [77], which can introduce discrepancies if not correctly represented in the reanalysis and satellite products. Niamey's measurements, on the other hand, were performed at an airport site [90], which will have highly heterogeneous thermal surface emissions, whereas the surrounding area is mostly urban fabric and sparse vegetation. The full attribution of sources of error at these stations is, therefore, difficult.

D. Evaluation of Aerosol-Induced Underestimation of LST
Assuming that uncertainties in the ground measurements are small compared with the satellite LST Errors, a rough quantitative evaluation of the underestimation under high DuAOD can be made: based on MSG's LST Error in Dahra, by applying a simple linear regression to the data presented in Fig. 8(a) (daytime and night-time combined), we find an underestimation in satellite LST of approximately 0.9 K for every 0.1 increase in DuAOD. We chose the Dahra station for this evaluation because of its more homogeneous surroundings, and the MSG product because of the larger sample size. The value we obtain is much higher than that simulated by [36] (∼4 K) and [39] (∼3 K) for an AOD of 1.0 and a VZA of approximately 25°(MSG's VZA at Dahra). The higher values obtained here may be related to the fact that those studies relied on simulated data, which allows to control each source of error in the LST retrieval, whereas our comparisons against in situ are likely affected by a combination of different error sources. Still, if we apply this correction to the three LST products, the dependence of LST Error (at Dahra and Niamey) on DuAOD significantly decreases (see Fig. 10). Better results are obtained for MSG's LST product since the DuAOD adjustment is based on this product's LST Error, with no accounting of variable viewing angles. It must be stressed that this is only a rough evaluation of the underestimation of LST, devoid of a complete examination of the radiative processes causing it and ignoring aerosol vertical distributions. Even so, until a more comprehensive study of the aerosol effect on LST retrievals is conducted, this correction might be considered by users of LSA SAF's LST product in regions of heavy dust aerosol loading, especially over West Africa in JJA.

E. Analysis of MSG BT Differences
As a brief investigation into the potential causes of the underestimated LSTs, we analyzed MSG's BT differences between the channels centered on 10.8 and 12.0 μm-a key component of MSG's GSW algorithm. In general, for atmospheres with high transmissivity (i.e., dry and transparent), BT differences are mostly controlled by the emissivity difference between the two channels. With the increase of TCWV, the BT difference is known to increase (as shown in Fig. 9). Significantly, dust aerosols have the opposite effect on BT differences compared to water vapor, i.e., higher concentrations lead to more negative BT differences, as indicated by Fig. 9. This response of TIR radiances to the presence of dust aerosols is consistent with findings from other authors [29], [32], [35], [41]. In the studied region, where high DuAOD values occur near or during the wet season (i.e., with moderate-to-high TCWV values), the opposite effects of DuAOD and TCWV on BT differences result in a reduced atmospheric correction performed by the GSW algorithm, as the observed BT difference effectively corresponds to that of a drier atmosphere. This reasoning may explain the relatively high adjustments to DuAOD changes suggested by Dahra's observations (and corroborated by Niamey's data) when compared to previous studies [36], [39].

V. CONCLUSION
In this work, we have shown that LSA SAF's SEVIRI and NASA's MODIS LST products systematically underestimate LST over the Sahara Desert during late spring and summer, when aerosol activity is highest. We find that dust aerosols have the opposite effect on top-of-atmosphere BT differences in the TIR window compared to water vapor, leading to a reduced atmospheric correction by the LST retrieval algorithms. The underestimation of LST, which in situ measurements suggest may reach up to 9 K for a dust AOD of 1 (at 550 nm), is likely to also occur in other regions with high dust aerosol concentrations, such as the deserts in the Middle East and Central and Southern Asia [46], [47], as well as in the surrounding areas affected by the advection of dust. Furthermore, the fact that such underestimation affects three different LST products, obtained from diverse observations (SEVIRI and MODIS) and algorithm approaches (GSW and TES), means that this issue may also impact other LST products derived from TIR measurements.
The behavior of BT differences observed in conditions of heavy dust aerosol loading is consistent with the wavelengthdependent reduction in atmospheric transmissivity controlled by the dust aerosols' optical properties. However, a layer of dust aerosols not only blocks part of the surface-emitted TIR radiation but also contributes to the atmospheric emission. This is corroborated by preliminary radiative transfer simulations, which indicate that an atmospheric layer with a high concentration of dust aerosols causes a reduction in atmospheric transmissivity and an increase in atmospheric emission, especially in the channel centered at 10.8 μm. These two opposing effects will have to be considered in the development of an aerosol-robust LST retrieval algorithm, which will be the focus of our future work.