Daytime Sea Surface Temperature Retrieval Incorporating Mid-Wave Imager Measurements: Algorithm Development and Validation

Incorporation of mid-wave infrared (MWIR) channel/s into the prevalent regression-based split-window technique (SWT) for operational daytime sea surface temperature (SST) retrieval is challenging. However, the MWIR channels are highly desirable for obtaining unambiguous information from the surface, since these channels offer high transparency with respect to the earth’s atmosphere and are very sensitive to the thermal emission from the surface. On the other hand, the MWIR channel/s can be easily incorporated into any physical-based SST retrieval scheme. Daytime SST retrieval using various physical-based methods is studied, and it is found that the physical deterministic sea surface temperature (PDSST) retrieval scheme is the best choice. This article discusses various scientific aspects of the daytime PDSST retrieval including MWIR channels from a theoretical point of view, and its application to real data from the Moderate Resolution Imaging Spectroradiometer (MODIS)-Aqua. Daytime SST retrievals from PDSST, including MWIR channels, are also compared with the currently operational SWT-based SSTs from MODIS-Aqua and MODIS-Terra produced by NASA without MWIR channels. The root-mean-square difference between PDSST and the in situ buoy measurements, using the global matchup data for daytime MODIS-Aqua SSTs, is ~0.28 K for the complete cloud-free set and ~0.38 K for MODIS-Aqua and MODIS-Terra when the quasi-deterministic cloud and error masking algorithm is applied for cloud detection. The information gain is defined by combining two metrics: the quality improvement and the increase in cloud-free data. The PDSST suite rendered two to three times as much information as the NASA-produced daytime regression-based SST.


I. INTRODUCTION
Manuscript received June 5, 2020; accepted July 8, 2020. Date of publication July 23, 2020; date of current version March 25, 2021. This work was supported by the National Aeronautics and Space Administration (NASA), USA, under Grant 80NSSC18K0705.
The author is with the Earth System Science Interdisciplinary Center, University of Maryland, College Park, MD 20740 USA (e-mail: pkoner@umd.edu).
Color versions of one or more of the figures in this article are available online at https://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TGRS.2020.3008656
Sea surface temperature (SST) is one of the most crucial parameters for any earth science model. It is selected as a basic climate variable by the World Meteorological Organization, Geneva, Switzerland. Recently designed modern imager instruments with information-rich multichannels and low noise equivalent differential temperature offer high potential for obtaining SST data with both high accuracy and large coverage. Nevertheless, this potential is not yet used to the greatest extent, because most of the operational agencies are still using regression-based methods where only two to three channels are used [1]. Anding and Kauth [2] first proposed, almost five decades ago, the concept of two-channel regression of 11 and 12 μm to derive SSTs, using two simultaneous brightness temperature (BT) measurements of long wave infrared (LWIR) window channels in different spectral bands for correcting the atmospheric water vapor absorption, which was later tested by Prabhakara et al. [3]. The original split-window technique (SWT) algorithm, theoretically derived by McMillin [4], relies on a coarse assumption: the atmospheric water vapor absorption in the window channel is directly related to the difference in BT of two window channels.
Since the above-mentioned assumption does not hold true for all atmospheric conditions, the SWT includes the initial guess (IG) of SST to reduce the errors in SST retrieval [5]. The triple window algorithm (TWA) is based on regression of the 3.7-, 11-, and 12-μm channels [6], and the SST4 algorithm for the Moderate Resolution Imaging Spectroradiometer (MODIS) [7] is also a regression of two channels (3.9 and 4.0 μm). Both algorithms are used only at night to avoid complications with scattered solar radiation. Recently, a regression-based algorithm has been developed in which two additional channels of ∼8.5 and ∼10.3 μm are included in the SWT to improve SST quality [8]. However, both channels are from the LWIR region and their atmospheric attenuation is high. On the other hand, mid-wave infrared (MWIR) channels, where the Planck function is steeper, the attenuation of water vapor is considerably lower, and the measurement noise is smaller, are highly desirable for obtaining unambiguous surface information. However, no operational regression-based daytime SST algorithm has yet incorporated the MWIR channels into its processing chain, because generating linear coefficients for the daytime MWIR channels is highly onerous: these channels contain nonlinear solar scattering and reflection. As opposed to the conventional approach, the incorporation of the MWIR channels is straightforward in the physical deterministic sea surface temperature (PDSST) retrieval suite [9]-[11], which is based on a physical model. The MODIS instrument is used as a case study because only MODIS has three MWIR channels, which helps in understanding the importance of MWIR channels in SST retrieval. In this article, a thorough investigation of daytime SST retrieval using MWIR channels in the PDSST retrieval scheme will be conducted.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
The different scientific aspects of daytime SST retrieval including MWIR channel/s, under a completely cloud-free subset obtained using the experimental filter (EXF) proposed in [9], will be discussed. Daytime SSTs including MWIR channels in PDSST will also be compared against the regression-based daytime (MODIS-Aqua and MODIS-Terra) SSTs obtained from the Physical Oceanography Distributed Active Archive Center (PO.DAAC). Specifically, Section II will discuss the generic shortcomings of the SWT without MWIR channel/s. Section III will provide a brief description of the PDSST retrieval methodology and the extension of PDSST for daytime SST retrieval including MWIR channels. Section IV will examine the error and information content of the retrieval for PDSST, and daytime SST retrieval using PDSST will be compared with other physical-based SST retrieval methods. Section V will discuss the modification of the cloud and error masking (CEM) algorithm for cloud detection in daytime scenarios. Section VI will provide the validation of the PDSST product against in situ measurements.

II. SHORTCOMINGS OF REGRESSION-BASED SST RETRIEVAL
The final form of the SWT [5] is given by (1), where $T_{sw}$ is the retrieved temperature using the SWT; $T^{m}_{11}$, $T^{m}_{12}$, and $T_{ig}$ are the measured BTs for the 11- and 12-μm channels and the IG SST, respectively; $\alpha_{(0,1,\ldots)}$ are the regression coefficients; and $\theta_{stz}$ is the satellite zenith angle (SZA). The coefficients are conventionally derived using direct regression of (cloud-free) satellite measurements against the in situ measurements. The coarse assumptions of radiative transfer (RT) physics in the SWT were verified only in limited atmospheric conditions, for example, mid-latitude winter, mid-latitude summer, tropical, and subarctic summer, due to the lack of computational facilities to run forward models at that time. Due to tremendous advances in computational facilities, fast-forward-model-based near-real-time operational SST retrieval is now achievable [9]. However, most environmental and space agencies still supply satellite-derived SST products using only two-channel regression. Using advanced computational facilities, an experiment has been conducted to exhibit the shortcomings of the SWT using a month of global MODIS-Aqua matchups from September 2019. The "point" matchups are generated at the locations of the in situ buoy measurements from the iQuam database [12] using level-1 (L1B) pixel data. The Community Radiative Transfer Model (CRTM) v2.3 is used for the forward model simulation, and the input data for the CRTM are obtained from the Global Forecast System (GFS), NOAA, USA. The in-built Wu-Smith emissivity model in the CRTM and wind speed data from GFS are used to calculate the sea surface emissivity for the forward model simulation.
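For orientation, the nonlinear split-window form that is widely used operationally can be written with the symbols defined for (1). This is a sketch of the commonly published form, offered on the assumption that (1) follows it; the exact arrangement and number of coefficients in (1) may differ.

$$T_{sw} = \alpha_0 + \alpha_1 T^{m}_{11} + \alpha_2\, T_{ig}\,\bigl(T^{m}_{11} - T^{m}_{12}\bigr) + \alpha_3\,\bigl(\sec\theta_{stz} - 1\bigr)\bigl(T^{m}_{11} - T^{m}_{12}\bigr)$$

In this form, the BT difference between the two window channels carries the water vapor correction, scaled once by the IG SST and once by a slant-path term.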
The histograms of the calculated transmission of the 11- and 3.9-μm channels are plotted in Fig. 1. The transmission values of a large number of cases are less than 0.4, as seen in Fig. 1. The basic assumption of the SWT is that the 11-μm channel is a "Window" channel. However, the 11-μm channel cannot be considered a "Window" channel when its transmissivity is less than 0.4 for numerous global locations. Put another way, how should the detectability of the parameters be defined from an RT-based measurement perspective? The transmission function of the RT equation is an exponential of the path distance; thus, the parametric information of an RT-based measurement comes, in principle, from an infinite distance in the medium/atmosphere. One safe way to define the "Window" channel, or the detectability of RT measurements, is to use the "penetration depth" concept of the RT literature. The "penetration depth," which corresponds to an optical depth of one, is well defined in the scientific literature. It is the depth in the material/medium at which the intensity of radiation reduces to 1/e (about 37%) of its initial value, where "e" is Euler's number. According to this criterion, all the forward simulation outputs for the matchups are divided into two groups by a "green line" in Fig. 1. The total number of matches is 36 129 for this month, and the number of occurrences on the left side of the "green line" is 14 344. This implies that ∼40% of the total simulations are beyond the "penetration depth." Note that a similar finding was also reported earlier by Koner [13]. When the transmissivities of the atmosphere are low (the left side of the green line in Fig. 1), the satellite measurements detect the air temperatures and not SSTs, according to the "penetration depth" concept.
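The "penetration depth" criterion above amounts to comparing each simulated transmittance with 1/e. The following is a minimal sketch of that split, not the processing code used for Fig. 1; the function name and the sample transmittances are hypothetical, and only the two counts are taken from the text.

```python
import math

# Transmittance at one "penetration depth": intensity reduced to 1/e.
PENETRATION_THRESHOLD = math.exp(-1.0)


def fraction_beyond_penetration_depth(transmittances):
    """Return the fraction of cases whose transmittance is below 1/e."""
    values = list(transmittances)
    if not values:
        raise ValueError("at least one transmittance is required")
    below = sum(1 for tau in values if tau < PENETRATION_THRESHOLD)
    return below / len(values)


# Counts quoted in the text for September 2019 (11-um channel, MODIS-Aqua).
reported_fraction = 14344 / 36129

# Hypothetical transmittances, for illustration only.
example_fraction = fraction_beyond_penetration_depth([0.2, 0.3, 0.5, 0.9])
```

With the counts quoted in the text, `reported_fraction` evaluates to roughly 0.397, which is the ∼40% stated above.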
Another drawback of implementing the SWT in realistic situations is that it requires skin temperature to develop the regression coefficients, but the common practice is to use buoy temperature ($T_b$) due to the lack of skin temperature data. This introduces a significant ambiguity in the calculated values of the coefficients, as the IR radiation can penetrate only a few micrometers into a water body, whereas buoys are generally located at around 1-2 m depth in the sea. Note that $T_b$ can be used for first-order statistical validation of the satellite-derived skin-SST product. Moreover, skin temperature retrieval using a two-channel regression-based method is extremely difficult when temperature inversion occurs; for example, the water temperature is sometimes significantly lower than the air temperature due to convection in high-latitude regions. This implies that a large portion of SST retrieval using the SWT is ambiguous.
On the other hand, almost all the transmittance values for the 3.9-μm channel, as shown in Fig. 1 (red line), are above the boundary line of the "penetration depth" intensity. This confirms that the 3.9-μm channel is highly desirable for SST retrievals or to extract surface information from satellite measurements. Currently, most satellite-derived SST producers are using the TWA/SST4 algorithm or at least one of the MWIR channels (3.7, 3.9, and 4 μm) for nighttime retrieval in their operational chain. However, MWIR channel/s are rarely used in operational daytime SST retrieval; at the same time, two-channel SWT is still applied for daytime and nighttime to retain a consistent record [14].
To illustrate further the shortcomings of the SWT, a hypothetical experiment is conducted where SSTs are retrieved by the SWT (1) using the following conditions.
1) The coefficients and validations are made using the same set of buoy measurements.
2) The values of $T_{ig}$ are replaced by "truth" ($T_b$) in (1).
A month of matchups from May 2015 and a new experimental filter ($\mathrm{EXF}_{new}$) for completely cloud-free measurements are used for this study. $\mathrm{EXF}_{new}$ will be discussed in Section III-A.
The Jacobian values of the 11-μm channel with respect to SST ($K^{11}_{sst}$) from the CRTM output are used as a representative for the transmittance of the 11-μm channel. The differences "$T_b - T_{sw}$," the retrieval errors of the SWT, are plotted against $K^{11}_{sst}$ in Fig. 2(a). It can be seen from Fig. 2(a) that the differences between $T_{sw}$ and $T_b$ are more than 3 K for numerous cases when the values of $K^{11}_{sst}$ are below 0.6. Note that this is a hypothetical experiment where coefficients and validation are made using the same set of buoy measurements. This result also supports the above discussion of the "penetration depth" concept.
The ratio of the cloud-free pixels to the total matches, the so-called data coverage (DC), using $\mathrm{EXF}_{new}$ is 25.6%, and the root-mean-square difference (RMSD) between the hypothetical retrieved SST using the SWT and in situ is 0.86 K. Note that ∼50% of the SST data under $\mathrm{EXF}_{new}$ have been masked by a low quality flag (QF) of 1. To avoid further disagreement on this issue, all the comparisons are made using the result of the end product (retrieval plus cloud detection) in terms of the values of RMSD and DC.
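The two end-product metrics used throughout the comparisons, RMSD and DC, together with the SD that appears later in the article, can be sketched as follows. This is an illustrative helper, not the author's processing code; the function names are hypothetical, and the SD is assumed to be the population form (division by n), so that RMSD squared equals bias squared plus SD squared.

```python
import math


def retrieval_statistics(retrieved, in_situ):
    """Return (bias, sd, rmsd) of retrieved minus in situ SST, in kelvin."""
    diffs = [r - b for r, b in zip(retrieved, in_situ)]
    if not diffs:
        raise ValueError("at least one matchup is required")
    n = len(diffs)
    bias = sum(diffs) / n
    rmsd = math.sqrt(sum(d * d for d in diffs) / n)
    # Population SD (assumption): spread of the differences about the bias.
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / n)
    return bias, sd, rmsd


def data_coverage_percent(n_cloud_free, n_total):
    """Return DC: cloud-free matches as a percentage of all matches."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_cloud_free / n_total
```

A gap between RMSD and SD signals a systematic offset, since the two coincide only when the bias is zero.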
To demonstrate the importance of the 3.9-μm channel for SST retrieval, another regression-based daytime SST retrieval has been conducted using an additional 3.9-μm channel on top of the SWT. The parametric form is given in (2), where $\theta_{sp}$ is the specular angle. All other symbols in (2) are the same as in (1). The differences in daytime SST retrievals from in situ using the regression of three channels (3.9, 11, and 12 μm) are plotted in Fig. 2(b). Although the value of RMSD using 3.9 μm reduces to 0.69 K from 0.86 K, the RMSD value is still much higher than the SST error reported by different operational agencies, because operational SST producers often use multiple sets of regional/seasonal regression coefficients obtained by piecewise regression to reduce these errors [1]. As a result, the values of the atmospheric correction coefficients are large when $K^{11}_{sst}$ is below 0.6, and the quality of these retrievals is still questionable because the measurement error is also multiplied by the same factor in the regression coefficients. Some approaches [15] suggest considering more physical variables (i.e., aerosols, under-screened cloud, wind speed) to improve the quality of SSTs. However, it remains unclear whether it is possible to reduce the retrieval errors by considering these variables without increasing the number of measurements. Two measurements of the SWT can provide a maximum of two pieces of information. If one piece of information is preserved for correct SST retrieval, then the only remaining piece of information cannot characterize the several errors caused by dynamic global atmospheric absorptions, along with the measurement error. In reality, a set of erroneous retrieval values using the SWT is also discarded using the QF to improve the quality at the cost of DC. For example, one of the ambiguous MODIS operational constraints in terms of the QF is that the retrieved SST is very different from the Reynolds SST analysis [16].
In addition, the presence of sunlight on the surface makes it more complex to model the skin-bulk effect in daytime and increases the variability between the satellite-derived skin temperature and the bulk temperature, which were assumed to be equal for the above-mentioned hypothetical experiment. Thus, the errors in both the calculated regression coefficients and the validations are large for daytime SSTs. It should be noted that the very thin layer (<100 μm) between the dynamic ocean and the atmosphere plays a crucial role in earth science research [17]. It is extremely challenging to design an in situ instrument that can measure the temperatures in such a thin layer above the turbulent ocean. Since the IR electromagnetic skin layer has a thickness comparable to that of the thermal skin layer, the thermal skin temperature value can be determined from the passive remote-sensing-based satellite IR measurements if a physical deterministic (PD) inverse scheme is applied. Section III will discuss the PDSST retrieval scheme.

III. PDSST RETRIEVAL
Different scientific aspects of PDSST have been discussed in several publications [9]-[11], [18]-[21]. The PDSST retrieval suite applied to the global matchup data shows that a 3-4 times information gain can be achieved on real data, compared with both the current NASA-distributed MODIS-Aqua SSTs [11] and the NOAA operational GOES-13 SSTs [18]. The PDSST retrieval scheme determines the SST parameter values by inverting the RT equation at the pixel level, and not by an estimation method. It requires a fast-RT model and numerical weather prediction (NWP) data, including aerosol forecast profiles. The choice of forecast data from GFS, instead of more accurate analysis data, for the fast forward model calculation is purposeful, to demonstrate that PDSST is independent of the IG data. It has already been shown in earlier publications [18] that PDSST can produce a good retrieval when $T_{ig}$ is far away (10 K) from truth. Also, forecast data give the flexibility to implement the PDSST scheme for near-real-time operational retrieval.
Innovations of the PDSST algorithm include the following: 1) pixel-level analytical error estimation and information content analysis of the retrieval; 2) supplemental cloud detection capability; 3) including aerosol information in both the CRTM and the retrieved parameters; and 4) substantially improved performance over prevalent methods, as demonstrated in several publications [18]-[21]. The PDSST suite is suitable for SST retrieval from dynamic coastal regions and at a large SZA (>55°). The inclusion of more parameters in the retrieved vector and more measurements in the measurement vector is straightforward in the PDSST retrieval scheme. The PDSST retrieval scheme inverts the RT model at single-pixel measurements, while stochastic retrievals use a set of measurement instances to develop the coefficients or error covariances. The PDSST suite inherently adjusts to improvements in cloud detection, RT modeling, instrument calibration, and so on, due to the dynamic determination of the regularization strength in a single measurement instance using singular value decomposition to dampen the noise at the solution time. Another important advantage of the PDSST approach is that error is not treated as definite information. It ensures the balance between information content and noise of the retrievals, that is, the sensitivity is close to 1 when $T_{ig}$ is "far" from truth.
Two different forms of deterministic algorithms, namely, modified total least squares (MTLS) and truncated total least squares (TTLS), have been developed for two different classes of instruments. MTLS is designed using three measurements and two unknowns, and it can be used when instruments are limited by the number of channels (AVHRR, until GOES-15) [10], [18]. TTLS is designed using six measurements and three unknowns for the modern geostationary IR imagers (e.g., GOES-16, GOES-17, Himawari-8, MSG), including MODIS, where instruments comprise several channels [11], [13]. The detailed formulations of MTLS and TTLS are available in [10] and [11]. The parametric forms of MTLS and TTLS are

$$x_{mtls} = x_{ig} + \left(K^{T}K + \frac{2\log(\kappa)}{\gamma_{snr}^{2}}\,\sigma_{end}^{2}\,I\right)^{-1}K^{T}y_{\delta} \tag{3}$$

$$x_{ttls} = x_{ig} + \left(K^{T}K + \lambda I\right)^{-1}K^{T}y_{\delta} \tag{4}$$

where $x_{ig}$ is the IG of the retrieved vector, $\sigma_{end}$ is the lowest singular value of the augmented matrix $[K\;\, y_{\delta}]$, $I$ is the identity matrix, $\kappa$ is the condition number of the Jacobian, $K$ is the Jacobian, $y_{\delta}$ is the BT residual (i.e., observation minus model), and $\gamma_{snr}$ is an additional empirical constraint parameter. $x_{mtls} = [s\;\, \log(w)]$ is used for MTLS and $x_{ttls} = [s\;\, \log(w)\;\, \log(a)]$ is used for TTLS, where $w$ is the total column water vapor (TCWV), $s$ is the SST, and $a$ is the sum of the total column of all aerosols (TCA). The logarithmic values of $w$ and $a$ are chosen in the retrieved vector to obtain the unit of the respective Jacobian in Kelvin, as is accepted in the tangent-linear model [28]. The variable regularization parameter for TTLS is $\lambda$: $\lambda = \sigma_{end-1}^{2}$ when $r \le t$ and $\lambda = \left(\sigma_{end-1}/\sqrt{\log(\kappa)}\right)^{2}$ when $r > t$, where $r$ is the L2-norm of the residual and $t$ is the threshold, which has to be determined empirically (it may vary with channel combinations, different instrumental characteristics, and the number of channels). This measure reduces the regularization strength for high-residual retrievals, increasing the sensitivity of the algorithm at the cost of a reasonable amount of error in the retrieved parameters.
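The regularized inverses described above can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the PDSST implementation: $\sigma_{end-1}$ is interpreted as the second-lowest singular value of the augmented matrix, the default `gamma_snr` is a placeholder rather than a value from the article, the function name is hypothetical, and treating the returned product of the inverse with $y_{\delta}$ as the increment added to the IG state is an assumption.

```python
import numpy as np


def regularized_inverse(K, y_delta, method="ttls", gamma_snr=1.0, t=2.0):
    """Return (A_inv, increment) for an MTLS- or TTLS-style inverse.

    K         : (m, n) Jacobian, m measurements and n retrieved parameters.
    y_delta   : (m,) BT residual, observation minus model, in kelvin.
    gamma_snr : empirical constraint parameter (placeholder default).
    t         : residual threshold in kelvin for the TTLS lambda switch.
    """
    K = np.asarray(K, dtype=float)
    y_delta = np.asarray(y_delta, dtype=float)
    n = K.shape[1]
    # Singular values of the augmented matrix [K y_delta], descending order.
    sigma = np.linalg.svd(np.column_stack([K, y_delta]), compute_uv=False)
    kappa = np.linalg.cond(K)  # condition number of the Jacobian
    if method == "mtls":
        reg = 2.0 * np.log(kappa) / gamma_snr**2 * sigma[-1] ** 2
    else:
        r = np.linalg.norm(y_delta)  # L2-norm of the residual
        if r <= t:
            reg = sigma[-2] ** 2
        else:
            # Weaker regularization for high-residual cases (needs kappa > 1).
            reg = (sigma[-2] / np.sqrt(np.log(kappa))) ** 2
    # A_inv = (K^T K + reg * I)^-1 K^T, computed without an explicit inverse.
    A_inv = np.linalg.solve(K.T @ K + reg * np.eye(n), K.T)
    return A_inv, A_inv @ y_delta
```

Solving the linear system instead of forming the matrix inverse explicitly is a numerical design choice of this sketch; the diagonal of `A_inv @ K` then gives the sensitivity of each retrieved parameter.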
Empirically, the value of $t$ is set to 2 K for MODIS-Aqua when six measurements and three parameters are considered. The approximated inverse matrix $A_{inv}$ is equal to $\left(K^{T}K + (2\log(\kappa)/\gamma_{snr}^{2})\,\sigma_{end}^{2}\,I\right)^{-1}K^{T}$ for MTLS and $\left(K^{T}K + \lambda I\right)^{-1}K^{T}$ for TTLS, and it is used for the analytical error calculation. The model resolution matrix, $M_{rm}$, which is equal to $A_{inv}K$, is used for the calculation of the expected quantity of information content in the retrieval. The values of the degree of freedom in retrieval (DFR) are the diagonal elements of $M_{rm}$. The pixel-level analytical error (AE or $e$) for the retrieval is given in (5). Another advantage of the TTLS and MTLS methods is that the retrieval quality indices ($\mathrm{QI}_{ttls}$ and $\mathrm{QI}_{mtls}$) can be calculated [9] using the AE as defined in (5). The binning of the x-axis for plots is done in nine logarithmically evenly spaced bins, depending on the values of AE ($0.1 < \|e\| < 1$). The percentage of total matches for each bin is based on the cumulative AE along the x-axis. Pixels where the value of AE is greater than 1 are allotted to the tenth bin. A bin is combined with its subsequent bin to get a smooth curve when it has less than 10% of the cloud-free data. When the retrieved SST is highly negative and the TCWV is highly positive, it is suspected to be a "bad retrieval." It may be due to cloud leakage or forward model simulation error, and these "bad retrievals" are placed in the last bin.
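The AE binning described above can be sketched as a small helper. This is an illustrative reading of the text, not the plotting code behind the figures: the function name is hypothetical, assigning AE values at or below 0.1 to the first bin is an assumption, and the merging of sparsely populated bins and the handling of "bad retrievals" are left out.

```python
import math


def ae_bin_index(ae, n_bins=9, lower=0.1, upper=1.0):
    """Return the 1-based bin index for an analytical error norm.

    Bins 1..n_bins are logarithmically evenly spaced between lower and
    upper; bin n_bins + 1 collects values at or above upper.
    """
    if ae <= 0.0:
        raise ValueError("the analytical error norm must be positive")
    if ae >= upper:
        return n_bins + 1
    if ae <= lower:
        return 1  # assumption: very small errors join the first bin
    span = math.log10(upper) - math.log10(lower)
    fraction = (math.log10(ae) - math.log10(lower)) / span
    return min(n_bins, int(fraction * n_bins) + 1)
```

For instance, an AE of 0.5 K falls in the seventh of the nine logarithmic bins, and any AE above 1 K is sent to the tenth bin.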
The above-mentioned PDSST retrieval methodology was developed using mainly nighttime data, and this article extends PDSST to daytime to obtain a generic PDSST retrieval for both day and night. Two major aspects of the PDSST retrieval methodology that differ between day and night, namely, the generic version of $\mathrm{EXF}_{new}$ and the channel selection for daytime SST retrieval, will be discussed in Sections III-A and III-B.

A. Verification on CRTM-Simulated MWIR Channels' Output
To verify the solar contribution generated for the MWIR channel by CRTM v2.3, a hypothetical experiment is made using a month of MODIS-Aqua matchups (December 2017). A set of CRTM-required input parameters is generated from these matchups, which are passed through the old experimental filter ($\mathrm{EXF}_{old}$). $\mathrm{EXF}_{old}$ is designed for determining the nighttime cloud-free measurements, and its details have been thoroughly explained in earlier publications [11], [18]. The basic working principles are reiterated here for the convenience of the readers. It assumes that $T_b$ is "true" (as per QF = 5 from iQuam) and that the measurement of the 3.9-μm channel is free from water vapor attenuation [22]. The 3.9-μm Jacobian ($K^{3.9}_{sst}$) is the partial derivative of the 3.9-μm channel BT with respect to the SST. The single-channel retrieval component ($\mathrm{rtv}_{3.9}$) is the difference between the observation and the model of the 3.9-μm channel divided by $K^{3.9}_{sst}$. Adding $T_{ig}$ to $\mathrm{rtv}_{3.9}$ provides the single-channel satellite-derived SST at the surface, and a direct comparison with $T_b$, within a threshold, can provide the cloud-free subset. Note that $T_{ig}$ here is the GFS-supplied surface temperature. The value of the threshold was empirically determined as $\sim 0.75/K^{3.9}_{sst}$ using several months of matchup data [11].
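The working principle of the old experimental filter can be sketched as a single test per matchup. This is one reading of the description above, not the author's implementation: the function and argument names are hypothetical, and forming the single-channel SST as the IG SST plus the 3.9-μm retrieval component is an interpretation of the text.

```python
def exf_old_is_cloud_free(bt_obs_39, bt_sim_39, k_sst_39, t_ig, t_b, scale=0.75):
    """Return True when a matchup passes the EXF_old-style test.

    bt_obs_39 : observed 3.9-um BT (K).
    bt_sim_39 : CRTM-simulated 3.9-um BT (K).
    k_sst_39  : 3.9-um Jacobian with respect to SST (dimensionless).
    t_ig      : IG SST, here the GFS-supplied surface temperature (K).
    t_b       : buoy temperature, taken as "truth" (K).
    scale     : numerator of the empirical threshold, about 0.75 K.
    """
    if k_sst_39 <= 0.0:
        raise ValueError("the Jacobian must be positive")
    rtv_39 = (bt_obs_39 - bt_sim_39) / k_sst_39  # single-channel component
    sst_single = t_ig + rtv_39  # interpretation: IG plus the component
    return abs(sst_single - t_b) < scale / k_sst_39
```

Because the threshold scales with the inverse of the Jacobian, the test is looser where the 3.9-μm channel is less sensitive to the surface.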
Fig. 3. (a) Differences in the CRTM-simulated 3.9-μm BT between "day" and "equivalent night" with respect to the specular angle using $\mathrm{EXF}_{old}$. $R_{pix}$ is the ratio of the number of pixels with a specular angle less than 40° to all cloud-free data. (b) Retrieval errors of PDSST in terms of SD (dashed line) and RMSD (solid line) with respect to the percentage of total matches using $\mathrm{EXF}_{old}$ and $\mathrm{EXF}_{new}$, compared with PO.DAAC. AE stands for analytical error values. (c) Differences in the CRTM-simulated 3.9-μm BT between "day" and "equivalent night" with respect to the specular angle using $\mathrm{EXF}_{new}$. $R_{pix}$ is defined as in (a).
A CRTM-required nighttime data set is generated from the above-mentioned daytime data set by adding 90° to the solar zenith parameter and keeping all the other parameters the same. The differences between the CRTM-simulated 3.9-μm BTs of "equivalent nighttime" and daytime with respect to the specular angle are plotted in Fig. 3(a). The specular angle, $\theta_{sp}$, is calculated as follows:

$$\theta_{sp} = \frac{180}{\pi}\cos^{-1}\bigl(\max\bigl(\min(-\theta_{vw},\, 1),\, -1\bigr)\bigr) \tag{6}$$

$$\theta_{vw} = \cos(\theta_{slz})\cos(\theta_{stz}) + \sin(\theta_{slz})\sin(\theta_{stz})\cos(\theta_{rz}) \tag{7}$$

where $\theta_{vw}$, $\theta_{slz}$, $\theta_{stz}$, $\theta_{sla}$, and $\theta_{sta}$ are the view angle, solar zenith angle, SZA, solar azimuth angle, and satellite azimuth angle, respectively. Fig. 3(a) shows that the solar contribution for 3.9 μm can go up to 15 K when the specular angle tends to zero. However, the solar contribution is small if the specular angle is greater than 40°. This is a reasonable output as per RT physics under the variation in specular angle, and thus, CRTM-calculated 3.9-μm BTs can be used for further study. TTLS retrieval with six channels and three parameters is applied to the above-mentioned cloud-free matches, and the error statistics in terms of RMSD and standard deviation (SD) are shown in Fig. 3(b). Both RMSD and SD are plotted in Fig. 3(b) to verify whether there is any systematic error caused by modeling error. Fig. 3(b) shows that the systematic error is almost negligible in this case. Although the RMSD is 0.33 K after discarding the last bin, the RMSD for the whole data set is 0.9 K, which is significantly high. Note that the PDSST suite has the advantage that it can inherently discard, at the solution time, the pixels that produce high retrieval error due to fractional cloud cover and/or forward model error, as already discussed in earlier publications [10], [11]. A high retrieval error of PDSST under $\mathrm{EXF}_{old}$ is expected because, in the measured daytime BT of the 3.9-μm channel, the cold IR effect and the warm solar contribution can compensate each other in the presence of fractional clouds. This implies that a subset of pixels covered by fractional cloud passes through $\mathrm{EXF}_{old}$. Thus, an additional filter is implemented in such a way that it can deal with the daytime issue without affecting the nighttime results, to obtain a generalized $\mathrm{EXF}_{new}$. The additional filter is designed using double differences (dd) of measurements and simulations between the 3.9- and 11-μm channels as dd(3.9, 11) < max w 20. This additional filter does not exclude any pixel from the nighttime cloud-free subset of the same month, but the results for daytime are improved significantly for the whole data set, as shown in Fig. 3(b). Although the new filter offers substantially fewer pixels (4558 compared with 5629 using $\mathrm{EXF}_{old}$) for daytime, the ratio of the pixels highly affected by solar radiation to the total number of pixels ($R_{pix}$) is almost the same, as shown in Fig. 3(a) and (c). The pixels highly affected by solar radiation are empirically defined as those where the specular angle is less than 40°. This is expected because the location of the fractional cloud is independent of the specular angle.
This also confirms that other artifacts of the daytime CRTM simulation were not the cause of the high error under $\mathrm{EXF}_{old}$. Fig. 3(c) shows that a significant number of measurements from the 3.9-μm channel are affected by solar radiation, but a good retrieval is possible using the PDSST retrieval method, which is not possible using the SWT. The RMSD value of the operational retrievals from PO.DAAC under the complete cloud-free set using $\mathrm{EXF}_{new}$ is 0.64 K for a DC of 23.6% for this month of matchups. This RMSD is high compared with the ∼0.42 K obtained from the same database, with a DC of ∼7%, for QF = 5, as shown in Fig. 3(b). Note that the validation of the satellite-derived skin SST against the bulk temperature of buoys is generally arguable. Although the cool skin offset of 0.17 K for a wind speed ≤ 5 m/s for nighttime [23] can be accepted, the additional source of direct sunlight on the skin makes it more complex to model the skin-bulk effect in daytime. High insolation and low wind speed can increase the skin-bulk differences up to a few Kelvin [24]. As alternative in situ databases for skin SST are not readily available, first-order validation of daytime SST is made in this study using RMSD without removing the cool skin effect, including an additional flagging on top of the QF from iQuam for a completely cloud-free set (applying $\mathrm{EXF}_{new}$). The additional flagging is made using the differences between the measurement and simulation of the 11-μm channel, and between the corresponding buoy temperature ($T_b$) and the IG SST for simulation ($T_{ig}$), as given in (9), where $T^{c}_{11}$ is the CRTM-simulated BT for 11 μm. Note that only 2%-3% of the good buoy measurements are discarded using this additional filter.
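The specular-angle computation in (6) and (7) can be transcribed directly as printed. This is a minimal sketch with hypothetical function names; $\theta_{rz}$ is not defined in the text, so treating it as the relative azimuth between the sun and the satellite is an assumption, as is the use of degrees for all inputs. The 40° limit is the empirical value quoted in this section.

```python
import math


def specular_angle_deg(solar_zenith_deg, sat_zenith_deg, rel_azimuth_deg):
    """Return the specular angle in degrees, following (6) and (7) as printed.

    rel_azimuth_deg stands in for theta_rz (assumed to be the relative
    azimuth between the sun and the satellite).
    """
    slz = math.radians(solar_zenith_deg)
    stz = math.radians(sat_zenith_deg)
    rz = math.radians(rel_azimuth_deg)
    # Equation (7): cosine-type term combining the solar and view geometry.
    theta_vw = (math.cos(slz) * math.cos(stz)
                + math.sin(slz) * math.sin(stz) * math.cos(rz))
    # Equation (6): clip to [-1, 1] before the arccosine, then go to degrees.
    clipped = max(min(-theta_vw, 1.0), -1.0)
    return math.degrees(math.acos(clipped))


def is_highly_sun_affected(specular_angle, limit_deg=40.0):
    """Flag pixels with a specular angle below the empirical 40-degree limit."""
    return specular_angle < limit_deg
```

The clipping step only guards the arccosine against round-off that pushes the argument slightly outside the interval from -1 to 1.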

B. Channel Selection for Daytime SST Retrieval
The weakness of the PDSST retrieval is that it is sensitive to errors due to the approximations of fast-radiative transfer modeling (FRTM) and un-modeled NWP data, but these errors reduce with time due to advances in both the NWP data quality and the FRTM capability. Also, the reduced state-space parameters in the retrieval vector may cause high error. For example, the RT calculation is related to the water vapor profiles, but the reduced state vector of TCWV is considered as a retrieved parameter in the PDSST scheme due to the limited number of channels in the imager. Atmospheric water vapor correction using only the one parameter of TCWV in the retrieved vector is possible in the PDSST scheme if the shape of the supplied profile is reasonably correct. A drastic departure of the profile shape from the truth will produce errors in SSTs, but these can be minimized using more channels than the traditional two. The peak values of the Jacobians of the water vapor profiles for the various channels are at different altitudes, which helps correct the shape of the profile using any kind of LS minimization. The increased number of measurements (by means of more channels) and the number of elements in the retrieval vector for the PDSST retrieval scheme can also be used for reducing the SST retrieval error.
A detailed procedure for the selection of channels for nighttime SST retrieval is available in an earlier publication [11]; it is extended here to a daytime case study using matchups from December 2017. TTLS and $\mathrm{EXF}_{new}$ are used for retrievals and the cloud-free set, respectively, and an additional cloud detection at the solution time is applied by removing the last bin. The minimum number of channels required to implement TTLS with three parameters is 4, which are selected from the LWIR channels of [8.7, 11, 12, 13.4] μm for the first experiment (black line in Fig. 4). This can be treated as the reference set for SST retrieval without MWIR channel/s, which produces an SST error of more than 0.9 K for the whole data set. The error is reduced significantly (∼50%) when one MWIR channel of 3.9 μm is integrated (see green line in Fig. 4). The error is further reduced by adding a second MWIR channel of 4 μm (red line). However, the error is increased when a third MWIR channel of 3.7 μm is combined (blue line) with the previous set. To illustrate further, the 3.7-μm channel is added to the reference set of only LWIR channels and produces more error (cyan line) than the LWIR-only channel set (black line). This is most likely due to the high error in the solar modeling of this channel using CRTM v2.3, because the same 3.7-μm channel reduced the error for nighttime [11]; this will be investigated in the future. Although the performances of the combinations of [3.9, 4, 4.5, 11, 13.4, 13.6] and [3.9, 4, 8.7, 11, 12, 13.4] μm are comparable in this study, the combination of the [3.9, 4, 4.5, 11, 13.4, 13.6]-μm channels is considered the best for daytime SST retrievals based on a study of several months of matchups.

IV. PERFORMANCES OF DAYTIME SST RETRIEVAL USING MWIR
Several months of matchups are studied and some of the important results are presented here. A comparative study among the various physical-based SST retrieval algorithms is discussed using the sensitivity of retrieval against the error in retrieval to demonstrate the strength of the PDSST approach, including MWIR channels, for daytime SST retrieval in Sections IV-A and IV-B. Note that this study is made using EXF new as a cloud-free subset to remove the ambiguity from cloud detection error.

A. Information Content and Error Analysis
The sensitivity of the forward model-based algorithms can be easily compared using their analytically calculated information content. The comparative sensitivity of the MTLS (DFR) and stochastic optimal estimation method (OEM) [30], in terms of degrees of freedom for signal (DFS), has already been published using GOES-13 nighttime matchups, where a cloud-free set of measurements was determined using the Bayesian cloud detection (BCD) algorithm [10]. The LS method was also included for comparison purposes since the typical condition number of the Jacobian for the two-parameter SST retrieval problem is very low (∼5), and thus, the noise amplification to the state space from the measurement space using LS should not be excessive [25], [26]. Note that the sensitivity of LS, by definition, is always 1. An experiment similar to the nighttime one reported in [10], comparing MTLS, OEM, and LS, each with three measurements (3.9, 11, and 13.4 μm) and two state-space parameters (SST and TCWV), is also conducted here to verify the repeatability of the results under a different setup: 1) daytime SST retrieval instead of nighttime; 2) MODIS-Aqua measurements instead of GOES-13; and 3) EXF new for the cloud-free set instead of BCD.
For OEM implementation, the values of measurement error covariances are taken from the NASA website. However, the publicly available error covariances' data for the CRTM simulation are difficult to obtain. Thus, the previously used values of 11 and 13.4 μm are kept the same here. Previously, the error of the CRTM simulation for 3.9 μm was assumed to be 0.25 K for night. It is expected that the error of the CRTM simulation for 3.9 μm for day will be higher, so it is assumed here to be 0.3 K. The value of DFR/DFS is a measure of the amount of information coming from the measurements for a selected inverse problem. The metrics of SST retrievals using different algorithms are plotted in Fig. 5 using a month (MODIS-Aqua, May 2015) of matchup data. Since this study is performed using matchup data (i.e., truth is known), the IG error can be calculated and is also plotted in Fig. 5. In addition, the OEM hypothetical solution (OEM h ) and information content (DFS h ) have been calculated using exact a priori error at each and every pixel (see Fig. 5). The x-axis of the plot is the percentage with respect to all matches (including cloudy ones), binned using the value of DFR (MTLS). The y-axis of the plot is the cumulative errors of all retrievals and the average values of DFS and DFR within the x-bin. Note that to display the average values of DFR in a bin, binning of the x-axis is made using the calculated DFR value.
Although the DFR values of MTLS vary from 0.97 to 0.32, it produces higher quality SST retrievals (RMSD ranging from 0.41 to 0.33 K) when compared with other methods. MTLS inherently adapts the regularization strength according to the problem; for example, the regularization increases when the problem has high error in measurement space, is highly ill-conditioned, or both. The MTLS solution is closer to the a priori/IG when the dynamically calculated regularization is high, as discussed in a previous study [10]. This behavior ensures that the MTLS error remains below the IG error throughout the retrieval space. On the other hand, the value of DFS from OEM is ∼0.88 throughout, but the a priori error is lower than the OEM error for ∼85% of the cloud-free retrievals, as seen in Fig. 5. Degrading a priori knowledge by allowing noise injection into the state space to maximize the sensitivity value, as seen in the case of OEM, is regarded as undesirable. An interesting observation is that the LS error is always lower than the OEM error in retrieval, which is justifiable from a deterministic viewpoint because the increased error in retrievals is due to using error covariances as input. Input error covariances are error, not statistical information as is assumed in stochastic literature. The results shown in Fig. 5 are similar to those of our previous study for nighttime scenarios in [10], which confirms the repeatability of the results.
Some of the key inferences are reiterated here: 1) of the practical retrieval schemes, only the MTLS retrieval error is always less than the a priori error (this ought to be regarded as a prerequisite for any retrieval system); 2) the sensitivity of MTLS is adaptive, that is, it is high when desired (a priori error is high) and vice versa; 3) the OEM error is always higher than that of all other methods, even the LS error; 4) although the sensitivity of OEM is always close to 1.0, the OEM solution results in degraded a priori knowledge for ∼85% of cloud-free retrievals; 5) the values of DFS (DFS h in Fig. 5) using the exact a priori error are significantly lower than the DFR of the MTLS for ∼70% of cases; and 6) the values of OEM h are closer to the MTLS results, but "truth" is required to obtain such results from OEM. A similar finding is also reported in [27], and OEM can produce better retrieval than regression when truth is used as a priori.
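The information-content comparison above can be illustrated with a minimal sketch of the standard averaging-kernel calculation for OEM and for unregularized LS. The Jacobian `K`, the a priori covariance `S_a`, and the 11- and 13.4-μm error values below are hypothetical placeholders chosen only for illustration (only the 0.3-K daytime value for 3.9 μm comes from the text); the MTLS regularization is not reproduced here.

```python
import numpy as np

def averaging_kernel_oem(K, S_e, S_a):
    """OEM averaging kernel A = (K^T S_e^-1 K + S_a^-1)^-1 K^T S_e^-1 K."""
    KtSeK = K.T @ np.linalg.inv(S_e) @ K
    return np.linalg.solve(KtSeK + np.linalg.inv(S_a), KtSeK)

def averaging_kernel_ls(K):
    """Unregularized LS: A = (K^T K)^-1 K^T K, i.e. the identity matrix."""
    KtK = K.T @ K
    return np.linalg.solve(KtK, KtK)

# Hypothetical Jacobian: rows are channels (3.9, 11, 13.4 um),
# columns are state parameters (SST, TCWV). Not taken from the paper.
K = np.array([[0.95, -0.05],
              [0.85, -0.30],
              [0.30, -0.40]])

# Measurement/forward-model error covariance (K^2). The 0.3 K entry for
# 3.9 um follows the text; the other two values are placeholders.
S_e = np.diag([0.30, 0.15, 0.20]) ** 2

# Hypothetical a priori error covariance for (SST, TCWV).
S_a = np.diag([1.0, 0.5]) ** 2

A_oem = averaging_kernel_oem(K, S_e, S_a)
A_ls = averaging_kernel_ls(K)

print("condition number of K:", np.linalg.cond(K))
print("OEM DFS (trace of A):", np.trace(A_oem))
print("OEM SST sensitivity (A[0, 0]):", A_oem[0, 0])
print("LS SST sensitivity (A[0, 0]):", A_ls[0, 0])
```

With any full-rank Jacobian, the LS averaging kernel is the identity, which is why the LS sensitivity is always 1, whereas the OEM sensitivity depends on the supplied error covariances.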

B. Effects of Additional Parameterization in Retrieved Vector
Fig. 6. Plots of retrieval errors for two parameters' MTLS ("circle blue") without aerosol and three parameters' TTLS ("+blue") with aerosols, and information content in terms of DFR ("red") under DFR ttls and DFR mtls . RMSD: solid and SD: dash.
Although all the realistic scientific problems are related to multiple parameters, most realistic inverse algorithms have been designed for the reduced state-space parameters due to lack of adequate measurements. Traditionally, physical-based SST retrieval was designed with two parameters: SST and TCWV. An additional parameter, TCA, is included in the PDSST scheme for MODIS-SST retrieval because MODIS has several channels available in the IR region and more than four of them can be used for SST retrieval. A sensitivity study for the TTLS algorithm for a month of MODIS-Aqua matchup data under complete cloud-free conditions using EXF old has already been published for nighttime scenarios [11]. The extension of this study to the daytime SST retrieval problem using EXF new is shown in Fig. 6. It compares the performance of the three-parameter TTLS and the two-parameter MTLS methods, which are applied for SST retrievals using a month of matchups from January 2016. Note that the aerosols are not supplied in the forward model calculation for the two-parameter MTLS with three measurements (3.9, 11, and 13.4 μm) to keep consistency with the above experiment. TTLS retrievals have been conducted using six measurements (3.9, 4, 4.5, 11, 13.4, and 13.6 μm). Binning has been made using the calculated values of DFR and the cumulative errors have been plotted on the y-axis. The detailed description of this experiment can be found in an earlier publication [11]. Fig. 6 shows that the TTLS performs better, yielding lower error in retrievals and higher values of DFR for all cases, similar to our previous work but different in magnitude. It can be concluded from this experiment that the aerosol presence in real measurements is treated as noise in the two-parameter MTLS solution. The high noise in residuals due to the reduced state vector of two elements leads to high regularization for the MTLS solution and yields a low DFR value. The addition of a third parameter in TTLS increases the degrees of freedom to improve the algorithm sensitivity without deteriorating the noise performance. More specifically, the inclusion of TCA as a retrieved parameter in the three-parameter TTLS retrieval algorithm minimizes the residual with respect to the three variables s, w, and a.
However, TTLS three-parameter retrieval cannot be implemented when the number of channels of a sensor is limited.
The lowest DFR in Fig. 6 using TTLS (4) when compared with MTLS (3) is improved from 0.31 to 0.55 for daytime, whereas the same was improved from ∼0.43 to ∼0.69 for the nighttime cases [11]. The information content of MTLS and TTLS is reduced significantly in daytime when compared with nighttime due to the fact that the variability of the measured BT at 3.9 μm for daytime in the presence of aerosols is much higher than that for nighttime. More specifically, a high inherent regularization, caused by a high error in the residual in the absence of TCA, decreased the lowest daytime DFR value of MTLS. As can be seen in [11], the range of RMSD of nighttime cases for TTLS is from 0.24 to 0.22 K, which is almost equal to the implicit "buoy" random error [29]. Note that a first-order approximation for the skin-bulk SST differences was accounted for by a constant value of 0.17 K for nighttime cases in our earlier study [11]. However, no offset is applied for daytime validation and the range of RMSD of daytime for TTLS in Fig. 6 is from 0.32 to 0.27 K. Note that the total error including the cool-skin offset for nighttime is ∼0.28 K ($\sqrt{0.17^{2} + 0.22^{2}}$), which is comparable with the daytime results in Fig. 6. The DFR value for the lowest bin of the daytime is ∼0.55, which is rather low compared with the nighttime value of 0.69. This likely occurs due to a deficiency of CRTM and the limitation of daytime validation using buoy measurements. However, the calculated solar component due to the variation in the specular angle using CRTM is reasonable.

V. CLOUD AND ERROR MASKING
The experimental filter, EXF new , which is used to obtain a complete cloud-free set from matchup data, can help design the retrieval algorithm and conduct performance studies of cloud detection schemes. Nevertheless, it cannot be used as an operational algorithm because it is based on the in situ measurements. The novel CEM algorithm was based on both the RT-based tests and functional spectral differences, considering the GOES-13 measurements for the operational environment [18]. The extension of CEM for MODIS-Aqua using the functional dd test for nighttime has already been reported in peer-reviewed literature [19]. Although the target is to keep uniform cloud detection schemes for both day and night, the lower threshold test for the functional spectral differences between 3.9 and 11 μm is removed from daytime cloud detection to increase the DC. The additional test incorporated for a generic scenario (both day and night) is the dd test $dd(3.9, 11) < \max(w/20, 1.5)$ when $\theta_{sp} > 30^{\circ}$, as discussed earlier in (9), due to the different physical processes associated with the 3.9-μm measurements for day and night.
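The additional dd test can be sketched as a simple predicate. This is only an illustration under stated assumptions: the function name and arguments are hypothetical, the dd value is taken as a precomputed input (its definition is given in (9), not here), w is assumed to be the TCWV, and the test is assumed to be skipped (pixel passes) when the specular angle is 30° or less.

```python
def passes_dd_test(dd_39_11, tcwv, theta_sp_deg):
    """Return True if a pixel passes the dd(3.9, 11) test.

    dd_39_11     : precomputed dd value for the 3.9- and 11-um channels (K)
    tcwv         : w in the text, assumed here to be the TCWV
    theta_sp_deg : specular angle in degrees
    """
    # Assumption: the test is only applied when theta_sp > 30 degrees;
    # otherwise the pixel is not rejected by this particular test.
    if theta_sp_deg <= 30.0:
        return True
    # Threshold from the text: max(w / 20, 1.5).
    return dd_39_11 < max(tcwv / 20.0, 1.5)

# Hypothetical example values.
print(passes_dd_test(1.0, 20.0, 45.0))  # threshold 1.5 -> True
print(passes_dd_test(2.0, 20.0, 45.0))  # threshold 1.5 -> False
print(passes_dd_test(2.0, 50.0, 45.0))  # threshold 2.5 -> True
```

The floor of 1.5 keeps the threshold from becoming overly strict in dry atmospheres, while the w/20 term relaxes it as the water vapor load grows.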

VI. COMPARISON AND VALIDATION OF PDSST USING IN SITU POINT MATCHUPS
To illustrate the performance of the PDSST suite for daytime, the comparative performances of PDSST and operational PO.DAAC retrievals using a month of matchups from October 2019 from MODIS-Terra and MODIS-Aqua are plotted in Fig. 7(a) and (b), respectively. Note that the PDSST suite is the combination of the truncated TTLS (4) retrieval algorithm and the CEM cloud detection algorithm [11], and PO.DAAC is the regression-based retrieval plus a threshold-based cloud detection algorithm. The x-axis is binned for PO.DAAC according to the values of QF from 3 to 5 and for PDSST based on the values of AE. The values of SD and RMSD of the two retrieved SSTs (PDSST, blue and PO.DAAC, red) from the in situ buoy measurements are plotted on the y-axis. Two different error metrics, SD and RMSD, are considered to understand the systematic error, which is the difference between the two metrics. Fig. 7(a) and (b) shows that the systematic error for both products is very low. Fig. 7(a) shows that the RMSD of PDSST retrievals from MODIS-Terra is 0.37 K with a DC of ∼29% after discarding the last bin, whereas the PO.DAAC RMSD is 0.47 K with a DC of 12% for QF = 5. The DC for PO.DAAC can be increased to 17.5% by setting QF ≥ 3, but the value of RMSD then increases to 0.54 K. Fig. 7(b) shows that the RMSD of MODIS-Aqua using PDSST is 0.39 K with a DC of ∼30%, and that of PO.DAAC is 0.45 K with a DC of ∼12% at QF = 5 and 0.57 K with a DC of 17% when QF ≥ 3. This implies that PDSST simultaneously increases the DC significantly and reduces the error value relative to PO.DAAC.
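The relationship between the two error metrics can be made concrete with a short sketch. It relies on the standard identity RMSD² = bias² + SD² (with the population SD), so a small gap between RMSD and SD indicates a small systematic error. The function name and the sample values are hypothetical and are not taken from the matchup data.

```python
import numpy as np

def error_metrics(retrieved_sst, buoy_sst):
    """Return (bias, SD, RMSD) of retrieved minus in situ SST, in K."""
    diff = np.asarray(retrieved_sst, dtype=float) - np.asarray(buoy_sst, dtype=float)
    bias = diff.mean()                    # systematic component
    sd = diff.std()                       # population SD (ddof=0), random component
    rmsd = np.sqrt(np.mean(diff ** 2))    # total error
    return bias, sd, rmsd

# Hypothetical matchups (K), for illustration only.
retrieved = [290.1, 289.8, 291.3, 288.9]
buoy = [290.0, 290.0, 291.0, 289.0]

bias, sd, rmsd = error_metrics(retrieved, buoy)
print("bias:", bias, "SD:", sd, "RMSD:", rmsd)
print("RMSD^2 == bias^2 + SD^2:", np.isclose(rmsd ** 2, bias ** 2 + sd ** 2))
```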
To illustrate further, the geographical error distributions for the two products using a month of MODIS-Aqua matchups from July 2017 are shown in Fig. 8.
The RMSD values of the retrieved SSTs from the corresponding buoy measurements on a fine grid of 1° × 1° latitude/longitude are shown in Fig. 8. The fine 1° × 1° grid allows comparison of both the accuracy and the DC of different retrieval and cloud detection schemes. It is challenging to discern the exact error distribution by the human eye when the errors are mapped at the fine-grid level, and thus, the spread of errors (Serr), expressed as numeric values, is introduced for the comparison of different products. The maximum value of Serr is determined by sorting the lowest 95% of data and keeping the minimum value unaltered.
TABLE I. INFORMATION GAIN OF PDSST OVER PO.DAAC SSTS FOR DIFFERENT MONTHS AND FOR THE TWO SENSORS OF MODIS-TERRA AND MODIS-AQUA
No specific trend of the error distribution with respect to a particular geographical location is identified for either product, but the DC of PO.DAAC is significantly low at high latitudes, where coverage is highly desirable for many geoscience problems. The values of RMSD and DC for the two methods vary significantly for different months. Thus, the information gain ($G_{\mathrm{inf}}$) of PDSST over the operational PO.DAAC performance, the total improvement combining the two metrics, is introduced and calculated as
$$G_{\mathrm{inf}} = \left(1 + \frac{\varepsilon_{\mathrm{podac}} - \varepsilon_{\mathrm{pdsst}}}{\min(\varepsilon_{\mathrm{pdsst}}, \varepsilon_{\mathrm{podac}})}\right) \times \left(1 + \frac{C_{\mathrm{pdsst}} - C_{\mathrm{podac}}}{\min(C_{\mathrm{podac}}, C_{\mathrm{pdsst}})}\right) \tag{11}$$
where $\varepsilon_{\mathrm{pdsst}}$ and $\varepsilon_{\mathrm{podac}}$ are the RMSD of PDSST and PO.DAAC, respectively. Similarly, $C_{\mathrm{pdsst}}$ and $C_{\mathrm{podac}}$ are the DC of PDSST and PO.DAAC, respectively. The values of $G_{\mathrm{inf}}$ for some months of matchups from 2019 for MODIS-Aqua and MODIS-Terra are shown in Table I. The two cloud-free sets for PO.DAAC SSTs are assumed to be QF = 5 and QF ≥ 3 for the purpose of the experiment, because there is no explicit cloud flagging available in this database. Table I shows that PDSST can extract two to three times more information when compared with PO.DAAC. The values of $G_{\mathrm{inf}}$ for QF ≥ 3 are lower than those for QF = 5; however, the RMSD values of PO.DAAC are significantly higher (e.g., ∼0.55 K for October 2019) for the cloud-free subset using QF ≥ 3. SSTs having such high error content may not be suitable for a higher level scientific model. Although several months of matchups are studied randomly, a systematic comparative time series performance study will be reported in a future publication.
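As a worked illustration of (11), the following sketch evaluates the information gain using the October 2019 RMSD and DC values quoted earlier in this section for QF = 5. The function and argument names are hypothetical, and the grouping of the two factors (quality improvement times coverage increase) is assumed from the definition of the information gain as a combination of the two metrics.

```python
def information_gain(rmsd_pdsst, rmsd_podaac, dc_pdsst, dc_podaac):
    """Information gain of PDSST over PO.DAAC following (11).

    rmsd_* : RMSD against in situ buoys (K)
    dc_*   : data coverage (percent or fraction, same units for both)
    """
    # Quality factor: relative RMSD reduction, normalized by the smaller RMSD.
    quality = 1.0 + (rmsd_podaac - rmsd_pdsst) / min(rmsd_pdsst, rmsd_podaac)
    # Coverage factor: relative DC increase, normalized by the smaller DC.
    coverage = 1.0 + (dc_pdsst - dc_podaac) / min(dc_podaac, dc_pdsst)
    return quality * coverage

# MODIS-Terra, October 2019, PO.DAAC at QF = 5 (values from the text).
g_terra = information_gain(0.37, 0.47, 29.0, 12.0)
# MODIS-Aqua, October 2019, PO.DAAC at QF = 5 (values from the text).
g_aqua = information_gain(0.39, 0.45, 30.0, 12.0)

print(round(g_terra, 2))  # approximately 3.07
print(round(g_aqua, 2))   # approximately 2.88
```

Normalizing each difference by the smaller of the two values keeps the expression symmetric in form, so the same function could be used with the roles of the two products exchanged.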

VII. CONCLUSION
This work concludes that the MWIR channels can be used in the PDSST retrieval scheme for daytime SST retrieval from MODIS-Aqua and MODIS-Terra to increase the accuracy of the satellite-derived SSTs. It is found that PDSST is superior to prevalent stochastic SST retrieval methods, namely, OEM and SWT. The improvement of PDSST performance over PO.DAAC for daytime SST is even greater than that for nighttime SST, which was reported in our earlier publications. This is due to the fact that the PDSST suite can use the MWIR channels for daytime SST retrieval, whereas there is little scope to use MWIR channels in regression-based daytime SST retrievals. Even though a physical-based estimation method, namely, OEM, has the capability to include MWIR channels in daytime SST retrieval, the comparison of information content with respect to the retrieval error for PDSST and OEM leads to conclusions similar to those reported in earlier publications [10], [25], [31]-[34]. For example, the OEM retrieval error is higher than the a priori error, and the OEM error is comparable with the error in PDSST retrieval when the exact a priori error is supplied as an input. Sequential inclusion of the retrieved parameters in the PDSST retrieval scheme for daytime SST retrieval in the operational environment is well harmonized with our previous publications using nighttime data.
The use of MWIR channels for daytime SST retrieval, which is thoroughly investigated here, is the first of its kind when compared with conventional community practice. This study shows, using global matchups, that daytime satellite-derived SSTs from the PDSST scheme can gather two to three times as much information as the predominant regression-based SST products. One of the explanations for such an improvement is the inclusion of MWIR channels in the daytime SST retrieval scheme.