PCRTM-RA Enhancements for Improving CO Retrievals Using NAST-I Measurements From the FIREX-AQ Field Campaign

The principal-component-based radiative transfer model retrieval algorithm (PCRTM-RA) for carbon monoxide (CO) retrieval has been improved to make better use of National Airborne Sounder Testbed-Interferometer (NAST-I) measurements obtained during the Fire Influence on Regional to Global Environments and Air Quality (FIREX-AQ) field campaign. One of the explicit purposes of the campaign was to characterize wildfire-induced atmospheric changes. Coincident measurements from various airborne instruments, including NAST-I infrared hyperspectral measurements from the NASA ER-2 aircraft, provided an opportunity to test and improve the PCRTM-RA CO retrieval. Relaxing the vertical correlation of CO profiles in the a priori covariance constraint significantly improves the vertical structure of the retrieved CO. The methodology is validated using a synthetic testing dataset that covers observations associated with various CO vertical profiles, including that of an extremely polluted atmosphere. The methodology is also applied to real NAST-I measurements, and the results demonstrate that PCRTM-RA retrievals can be used to monitor CO plume evolution and transport.

It is reasonable to assume that a retrieval solution exists within the range of variance of climatology [15]. However, a global climatology constraint is not expected to be the best choice when the measurements target atmospheric states that do not closely follow the preassumed a priori covariance. One example is the set of NAST-I measurements used for retrieving CO during a recent field campaign, the Fire Influence on Regional to Global Environments and Air Quality (FIREX-AQ), which took place in the western United States in the summer of 2019 [37]. The original CO climatology variance used in PCRTM-RA was constructed from globally distributed CO profiles collected from various data sources, with a large percentage of samples measured in relatively clean environments. In other words, the variance currently optimized for global-scale CO retrieval may not be optimal for NAST-I measurements in a polluted environment, such as that during the FIREX-AQ campaign. One focus of this study is to evaluate how CO retrieval is affected by decreasing the contribution of the a priori constraint in determining vertical structure.
During the FIREX-AQ campaign, various in situ CO observations were collected as reference data. However, the spatial and temporal sampling differences between NAST-I measurements and in situ measurements do not yield a collocated sample dataset large enough to represent a diverse range of CO concentrations, making a direct quantitative comparison difficult. Therefore, we assess the CO retrieval using synthetic NAST-I measurements simulated by PCRTM for a range of possible atmospheric states, including CO profiles from clean to extremely polluted cases.
The unique capability of NAST-I as a remote sensing instrument to monitor CO distribution within and surrounding wildfire combustion during the FIREX-AQ campaign was reported by Zhou et al. [38]. They demonstrated NAST-I CO retrieval capability, intercompared remotely sensed CO with in situ CO observations, and assessed the plume correlation between CO and smoke-dust observed from lidar. The CO results from PCRTM-RA hold the same potential as one of the NAST-I retrieval data products. Here, we test the capability of PCRTM-RA to capture the CO distribution over a wildfire during its developing and decaying stages. We also compare our product with in situ observations for assessment purposes. Using the same date and collocation strategy as the previous study allows an intercomparison of results from different algorithms. We believe that a detailed intercomparison of two retrieval algorithms is critical for understanding the accuracy of retrieval products. However, without independent validation data, especially from extremely polluted regions, differences between two retrieval algorithms that arise from the ill-posed nature of the retrieval problem cannot be explained and could lead to misconceptions.
This article is organized as follows. A brief introduction of the data used in this study, the introduction of PCRTM-RA and retrieval optimization for NAST-I measurements, and the way of balancing the contribution of constraint for improving CO retrieval are given in Section II. Discussions on a qualitative assessment of improvement by using a synthetic dataset with CO retrieval demonstration and evaluation with in situ observation are presented in Section III. Finally, Section IV concludes this article.

A. Data Used
This study uses two different datasets: NAST-I measurements during the FIREX-AQ campaign as raw input radiance data for retrievals, and in situ CO geophysical observations obtained during the same campaign.
NAST-I is an airborne infrared hyperspectral sounder that measures radiances over the infrared region from 645 to 2500 cm⁻¹. The number of channels is 8632, and the spectral resolution is 0.25 cm⁻¹. NAST-I provides high spatial resolution: the linear footprint at nadir equals 13% of the aircraft altitude, and 13 NAST-I scan angles give 13 fields of view (FOVs) across the aircraft track (i.e., ∼2.6 km for FOVs directly below the aircraft and ∼3.4 km apart on the ground from an ER-2 altitude of 20 km) [38]. Fig. 1(a) shows an example spectrum for NAST-I measurements. The channels cover carbon dioxide (CO₂) and water vapor (H₂O) absorption bands for temperature and water vapor sounding. Also covered are many absorption bands for trace gases, such as O₃, CO, methane (CH₄), and nitrous oxide (N₂O). During the FIREX-AQ campaign, NAST-I flew onboard the ER-2 aircraft for 11 days, on 2, 6, 7, 8, 12, 13, 15, 16, 19, 20, and 21 August 2019, over the western United States. NAST-I made 4000 to 16 000 measurements during each flight day.
During the FIREX-AQ campaign, various in situ instruments on the DC-8 aircraft were also used to observe wildfires. Unlike remote sensing data, in situ observations are point observations because they are generated from point-sampled air parcels. In this study, we choose the Differential Absorption Carbon monOxide Measurement (DACOM) [39] as the reference observation for assessment purposes. The in situ data taken from the DC-8 are available online at https://asdc.larc.nasa.gov/project/FIREX-AQ.

B. PCRTM and Its Retrieval Algorithm for NAST-I Measurements
PCRTM, developed by Liu et al. [23], adopts a principal component (PC) approach to reduce the computational cost of simulating thousands of hyperspectral channels while maintaining accuracy. It can now simulate the radiative contribution and corresponding Jacobians for various trace gases, including CO₂, H₂O, O₃, CO, CH₄, and N₂O. PCRTM has been continuously improved in various aspects, including multiple-scattering cloud modeling, treatment of solar contributions, nonlocal thermodynamic equilibrium, and physical coefficients updated from the latest gas absorption database. An intercomparison study with other radiative transfer models has been made, and a PCRTM performance assessment is available [40], [41]. This study uses the most recent version of PCRTM to simulate apodized NAST-I measurements using a Kaiser-Bessel apodization function. NAST-I radiances are represented using 160 PC scores (i.e., 60 for the long-wave infrared region, 50 for the mid-wave infrared region, and 50 for the short-wave infrared region).
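The PC representation of a spectrum can be illustrated with a minimal NumPy sketch. It is not the PCRTM implementation: the training set is random toy data, and the helper names (`fit_pcs`, `to_pc_scores`, `from_pc_scores`) and the sizes (400 channels, 5 PCs) are assumptions chosen only to keep the example small. It shows how a spectrum is projected onto a few leading PCs and how reconstructing from those scores discards most of the channel noise.

```python
import numpy as np

def fit_pcs(training_spectra, n_pcs):
    """Return the mean spectrum and the n_pcs leading spectral PCs (as rows)."""
    mean = training_spectra.mean(axis=0)
    # Rows of vt are orthonormal directions in channel space, sorted by variance.
    _, _, vt = np.linalg.svd(training_spectra - mean, full_matrices=False)
    return mean, vt[:n_pcs]

def to_pc_scores(spectrum, mean, pcs):
    """Compress a channel-space spectrum into PC scores."""
    return pcs @ (spectrum - mean)

def from_pc_scores(scores, mean, pcs):
    """Reconstruct a channel-space spectrum from PC scores."""
    return mean + pcs.T @ scores

rng = np.random.default_rng(0)
n_channels, n_train, n_modes = 400, 200, 5   # toy sizes, not NAST-I values

# Toy training spectra built from a few hidden modes (hypothetical data).
basis = rng.normal(size=(n_modes, n_channels))
training = rng.normal(size=(n_train, n_modes)) @ basis

mean, pcs = fit_pcs(training, n_pcs=n_modes)

# A noise-free spectrum and a noisy "measurement" of it.
clean = rng.normal(size=n_modes) @ basis
noisy = clean + 0.1 * rng.normal(size=n_channels)

scores = to_pc_scores(noisy, mean, pcs)        # 400 channels -> 5 scores
filtered = from_pc_scores(scores, mean, pcs)   # noise outside the PC space is removed

print("error before filtering:", np.linalg.norm(noisy - clean))
print("error after filtering: ", np.linalg.norm(filtered - clean))
```

In this toy setting the signal lies entirely in the retained PC space, so only the small part of the noise that projects onto those PCs survives the reconstruction.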
Since PCRTM provides analytical Jacobians, representing the derivatives of PC scores with respect to physical parameters, such as atmospheric temperature and cloud optical depth, it is well suited to retrieving atmospheric states from PC-compressed hyperspectral data. This approach not only enables the use of all hyperspectral information but also increases the speed of inversion. Furthermore, the PC noise filtering of the measured radiances reduces the ill-conditioning of the inverse problem. Currently, PCRTM-RA routinely retrieves temperature, water vapor, CO, CO₂, O₃, CH₄, and N₂O profiles and cloud properties, such as cloud optical thickness, cloud top height, and cloud effective radius. Surface emissivity and surface temperature are also retrieved. It retrieves those parameters simultaneously, using the rich spectral information provided by hyperspectral measurements. The application to hyperspectral sounder observations has been demonstrated in our previous studies [24], [25], [31]. Similar to applying PC analysis (PCA) to NAST-I measurements, we also use PCA to compress our state vector (or retrieval variables) from 101 pressure levels to their PC domains. Table I shows the number of PC modes for each state variable. By removing PCs with low singular values in the state vector, PCRTM-RA avoids retrieving a nonphysical profile (e.g., a vertical atmospheric profile with oscillations between adjacent levels, which often happens with ill-posed inversions [42]). PCRTM-RA was first used to analyze NAST-I data from the EAQUATE field campaign [31].
The optimal estimation described in [43] is one of the most common ways of performing atmospheric retrievals [44], [45], [46]. PCRTM-RA is also designed using an optimal estimation framework. In the optimal estimation approach, the maximum-likelihood solution (i.e., retrieval) is constrained with a priori information. The a priori information consists of a mean state vector and an associated error covariance matrix (also called the background and background error covariance matrix), which describes the variance of the state vector as well as correlations between the elements of the state vector (e.g., the vertical correlations of the atmospheric profiles). For PCRTM-RA, climatology and a covariance matrix are used as the constraint. Improvement of trace gas retrievals, including CO, using a relaxed a priori covariance matrix in a simultaneous physical retrieval has been studied in [46]. The authors set a rather loose constraint for CO retrievals based on climatology, which shows a CO variability larger than 30%. By relaxing the trace gas error covariance, the optimal estimation retrievals lean more toward unconstrained minimization. They also noted that a good constraint for the temperature profile is needed when relaxing CO and other minor trace gas variances in their simultaneous retrievals. In this study, we keep the CO a priori profile the same as the global climatology and only relax the variance and vertical correlations in the CO error covariance matrix.
The default constraint forces retrievals to have the vertical correlation of global CO profiles, which are measured in a relatively clean atmospheric environment. To eliminate this constraint, we removed the off-diagonal components of the error covariance matrix for CO and added the excluded variability to the diagonal elements of the CO a priori error covariance matrix. We then added vertical CO correlations using an exponential function with a scale height of about 2 km. This correlation function was suggested for CO retrieval of the Tropospheric Emission Spectrometer [47]. Details of this function and its scientific background can be found in [48]. Fig. 2 compares the two error covariance matrices before and after relaxing the off-diagonal factors. After relaxation, the correlation is smoothed, i.e., the off-diagonal features shown in dark blue and/or blue in the original covariance matrix have disappeared. This indicates that the new a priori allows the CO retrieval to have a smaller vertical correlation than the original one. Vertical correlations between different pressure levels are clearly more localized to adjacent levels in the relaxed a priori covariance matrix [see Fig. 2(b)] than in the original one [see Fig. 2(a)]. In other words, the relaxed covariance matrix has a pronounced diagonal structure (i.e., higher values along the diagonal of the matrix), and the contours next to the diagonal are narrower after relaxation.
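For reference, the optimal estimation framework can be written in its standard textbook form, as in [43]. This is a sketch of the general method, not a statement of the exact PCRTM-RA iteration (which operates on PC scores and may add damping). The symbols x (state vector), x_a (a priori mean), y (measurement), and F (forward model) are notation assumed here; S_x, S_y, and K denote the a priori error covariance, the observation error covariance, and the Jacobian. The retrieval minimizes the cost function in the first line, and a Gauss-Newton iteration for doing so is given in the second line.

```latex
J(\mathbf{x}) = [\mathbf{y} - F(\mathbf{x})]^{T}\,\mathbf{S}_y^{-1}\,[\mathbf{y} - F(\mathbf{x})]
              + (\mathbf{x} - \mathbf{x}_a)^{T}\,\mathbf{S}_x^{-1}\,(\mathbf{x} - \mathbf{x}_a)

\mathbf{x}_{i+1} = \mathbf{x}_a
  + \left(\mathbf{K}_i^{T}\mathbf{S}_y^{-1}\mathbf{K}_i + \mathbf{S}_x^{-1}\right)^{-1}
    \mathbf{K}_i^{T}\mathbf{S}_y^{-1}
    \left[\mathbf{y} - F(\mathbf{x}_i) + \mathbf{K}_i(\mathbf{x}_i - \mathbf{x}_a)\right]
```

Relaxing S_x (larger variances or weaker correlations) reduces the weight of the second term in J, so the solution is driven more by the measurement and less by the climatology.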
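The construction described above can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the authors' code: the function name `relax_covariance`, the level heights, the toy default matrix, and the `inflation` factor (a hypothetical stand-in for the excluded variability added to the diagonal, whose exact form the text does not give) are all invented for the example. Only the exponential correlation with a scale of about 2 km comes from the text.

```python
import numpy as np

def relax_covariance(s_x, heights_km, scale_km=2.0, inflation=1.0):
    """Rebuild a covariance with exponentially decaying vertical correlation.

    s_x        : (n, n) original a priori covariance matrix
    heights_km : (n,) level heights in km (hypothetical input)
    scale_km   : correlation scale; about 2 km according to the text
    inflation  : hypothetical variance multiplier (1.0 keeps the variances)
    """
    s_x = np.asarray(s_x, dtype=float)
    z = np.asarray(heights_km, dtype=float)
    # Keep only the (optionally inflated) diagonal of the original matrix.
    sigma = np.sqrt(inflation * np.diag(s_x))
    # Correlation between levels i and j decays as exp(-|z_i - z_j| / scale).
    corr = np.exp(-np.abs(z[:, None] - z[None, :]) / scale_km)
    return corr * np.outer(sigma, sigma)

# Toy example: 11 levels spaced 2 km apart with strong long-range correlation.
z = np.linspace(0.0, 20.0, 11)
sigma0 = np.linspace(0.4, 0.2, 11)                  # made-up standard deviations
strong_corr = np.full((11, 11), 0.9) + 0.1 * np.eye(11)
s_default = strong_corr * np.outer(sigma0, sigma0)

s_relaxed = relax_covariance(s_default, z)

# Correlation between the lowest level and every other level, before and after.
print(np.round(s_default[0] / (sigma0[0] * sigma0), 2))
print(np.round(s_relaxed[0] / (sigma0[0] * sigma0), 2))
```

With a 2 km level spacing, the rebuilt correlation between adjacent levels is exp(-1), and it falls off quickly with distance, which mirrors the narrower band around the diagonal described for the relaxed matrix.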
For the observational error covariance matrix, we used the NAST-I radiance noise derived from a calibration study using the onboard blackbody. We also considered uncertainties, such as radiative transfer model error and PC representation error. The actual observation error that we used is shown in Fig. 1(b).

A. Assessment of PCRTM CO Retrieval Improvement Using a Synthetic Dataset
Since the impact of the a priori constraint on an optimal estimation retrieval is not straightforward, it is necessary to assess how the algorithm responds to a given change in the constraint. To do so, a synthetic dataset that covers various ranges of CO concentrations, including extremely high concentrations for wildfire-induced CO, has been constructed by running PCRTM-RA for multiple cases of measurements. It should be mentioned that the purpose of this assessment is to see how the algorithm generates a different product with a given retrieval setting under extreme cases, unlike the previous study by Wu et al. [25]. That study assessed the quantitative accuracy of PCRTM-RA for CO retrieval by using independent synthetic data from a chemical model, which does not have fire-induced extreme CO profiles. Fig. 3(a) shows the CO profiles included in the synthetic database. A total of 2852 diverse CO profiles were used to simulate NAST-I radiances using PCRTM, with other atmospheric and surface parameters collected from the PCRTM-RA retrievals. NAST-I instrumental noise was then added to those synthetic radiance spectra for retrieval analysis.
From a set of simulated NAST-I radiances, two sets of CO profiles are retrieved with PCRTM-RA using different a priori covariance matrices: the default global climatology-based a priori [see Fig. 3(b)] and the relaxed a priori with reduced vertical correlation [see Fig. 3(c)]. The same color in Fig. 3 indicates the same case. The default scheme successfully separates high- and low-concentration cases, and the general decrease with increasing height is well captured. However, the retrieved CO profiles always have a similar structure even though the input radiance spectra were simulated with very different CO profiles. The truth profiles [see Fig. 3(a)] include very extreme cases with peak concentrations reaching around 800 ppbv, but the maximum concentration shown in Fig. 3(b) is less than 500 ppbv. CO is underestimated at the peak level and overestimated at other heights. Compared with the truth profiles, the retrieved CO concentration is generally lower in the lower to middle atmosphere and slightly higher in the upper atmosphere. This is caused by a strong vertical correlation between upper and lower atmospheric CO imposed by the default a priori error covariance matrix. On the other hand, CO profiles retrieved with a relaxed constraint better represent the vertical variation of the truth profiles, even though there is still room for improvement. For example, the algorithm misses a few high-concentration cases that peak near 700 hPa, shown in light green or sky blue. Also, a few overestimations of high-concentration cases around 400-500 hPa, colored in dark blue, suggest that the vertical correlations in those pressure regions may still be high in our modified CO a priori covariance matrix.
Fig. 4 shows the performance of the scheme using the relaxed a priori for cases with very different vertical structures. Fig. 4(a) and (d) show the testing CO profiles, and Fig. 4(b) and (e) show the profiles retrieved with the relaxed a priori covariance matrix. The differences between the testing and retrieved CO profiles are given in Fig. 4(c) and (f). One subset is for higher concentration cases [see Fig. 4(a)-(c)]. Despite the general agreement, overestimation around 400-500 hPa (sky blue lines) and underestimation below 700 hPa (dark blue lines) are more clearly seen here. Another interesting point is the overestimation below 700 hPa for lower concentration cases (red lines). Such behavior is also seen in Fig. 3. The retrieval seems to have a CO structure for clean air, but the truth structure shows a light plume signal whose peak is near 400 hPa. The other subset shows a comparison for low-concentration cases [see Fig. 4(d)-(f)]. Unlike the previous comparison, most truth states represent the CO structure of a clean environment. This comparison shows that a covariance matrix relaxed to better represent enhanced CO emission also works well for CO retrieval in a clean environment.
A deeper understanding of the differences between the two retrievals can be gained from an averaging kernel analysis. The averaging kernel A is defined as follows:
$$\mathbf{A} = \left(\mathbf{K}^{T}\mathbf{S}_y^{-1}\mathbf{K} + \mathbf{S}_x^{-1}\right)^{-1}\mathbf{K}^{T}\mathbf{S}_y^{-1}\mathbf{K}$$
where S_x and S_y are the error covariance matrices for the a priori and the observation, respectively, and K is the Jacobian matrix. The averaging kernel shows how the retrieval responds to a change in the true state. Fig. 5(a) and (b) are examples of CO averaging kernels from the default and modified retrieval settings, respectively, whereas the CO climatology used in this study as the a priori is given in Fig. 5(c). In the modified retrieval setting, all parameters are the same except for S_x. Here, the averaging kernels for different pressure levels are represented by curves of different colors. One can interpret this as follows: the CO state is modified from this climatology in response to a change in the actual state, which is reflected in the radiance domain of NAST-I measurements.
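The effect of changing only S_x on the averaging kernel can be explored with a small NumPy sketch. It assumes the standard optimal-estimation form A = (K^T S_y^-1 K + S_x^-1)^-1 K^T S_y^-1 K; the Jacobian, noise level, level grid, and both covariance matrices below are made-up toy values, not NAST-I or PCRTM quantities.

```python
import numpy as np

def averaging_kernel(k, s_x, s_y):
    """A = (K^T S_y^-1 K + S_x^-1)^-1 K^T S_y^-1 K for symmetric S_x, S_y."""
    # solve(S_y, K) gives S_y^-1 K; its transpose is K^T S_y^-1 (S_y symmetric).
    kt_sy_inv = np.linalg.solve(s_y, k).T
    info = kt_sy_inv @ k                       # K^T S_y^-1 K
    return np.linalg.solve(info + np.linalg.inv(s_x), info)

n_levels, n_channels = 10, 3
levels = np.arange(n_levels)

# Hypothetical Jacobian: three broad weighting functions peaking at levels 3, 5, 7.
peaks = np.array([3.0, 5.0, 7.0])
k = np.exp(-0.5 * ((levels[None, :] - peaks[:, None]) / 2.0) ** 2)

s_y = 0.05 ** 2 * np.eye(n_channels)           # toy uncorrelated observation noise
sigma = 0.3 * np.ones(n_levels)                # toy a priori standard deviations
dist = np.abs(levels[:, None] - levels[None, :])

# Same variances, different vertical correlation lengths (in level units).
s_x_default = np.outer(sigma, sigma) * np.exp(-dist / 8.0)   # long-range
s_x_relaxed = np.outer(sigma, sigma) * np.exp(-dist / 1.0)   # localized

a_default = averaging_kernel(k, s_x_default, s_y)
a_relaxed = averaging_kernel(k, s_x_relaxed, s_y)

# Row i of A shows how retrieved level i responds to perturbations at each level.
print(np.round(a_default[5], 2))
print(np.round(a_relaxed[5], 2))
print("trace (degrees of freedom):", np.trace(a_default), np.trace(a_relaxed))
```

Comparing the printed rows gives a feel for how a shorter a priori correlation length changes the spread of each averaging kernel; the trace is bounded by the number of channels in this toy setup.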
When we relax the a priori covariance matrix for CO, the algorithm better captures the level of a perturbation. Also, as the vertical resolution improves, more extreme variability can be captured. Even with the improved vertical resolution, the overall retrieval is still affected by perturbations in the mid-troposphere around 500 hPa. In other words, NAST-I channels sensitive to the mid-troposphere carry much of the information used to retrieve CO for the whole troposphere. The amplitudes of both averaging kernels for the upper troposphere decrease drastically with increasing height and are close to zero at the aircraft level. This indicates that NAST-I has less information on the upper levels, so the a priori background primarily determines the solution there. Since the current a priori profile for CO does not represent wildfire-polluted CO in the upper troposphere, upper tropospheric CO retrieved from NAST-I with such an a priori would not be accurate. It is clear from this simulation-retrieval study that the upper tropospheric CO values from the default PCRTM-RA may be influenced by the values in the mid-lower atmosphere due to the strong vertical correlations of the global background CO covariance matrix.
B. Significance of CO Retrieval From PCRTM-RA
1) Demonstration of CO Plume Development: CO concentration is related to the evolution of the wildfire, and the CO plume may vary in time with the intensity of the wildfire source and with changes in the environment, such as wind. Therefore, retrieval success can be indicated by how well these temporal variations are captured. The case from August 16, 2019 is an excellent example showing how CO retrieval from NAST-I measurements captures plume development, including growth and decay. The location of the wildfire and the corresponding visible image of the plume are presented in Fig. 6(a). The background image in Fig. 6(a) is obtained from the VIIRS true-color image data, and the area of fire (red spot) is determined by measurements from MODIS on both the Terra and Aqua satellites and VIIRS on Suomi-NPP (available at https://worldview.earthdata.nasa.gov/). On this day, the ER-2 flew around 33°N to 38°N and 110.5°W to 114.5°W on the northwest side of Arizona. As shown in the visible image, most measurements were made over a cloud-free area, where the concern of cloud contamination for infrared remote sensing is minimized. Fig. 7(a) shows the early stage of the CO plume, in which a high-concentration area is visible over a relatively small region. The plume has not yet propagated as far as in the other scenes. As time passes, a weak concentration area appears around the source region [see Fig. 7(b) and (c)]. In the fourth and fifth scenes, the plume travels in a northeastern direction, around 0.5° longitude and 0.4° latitude from the wildfire source [see Fig. 7(d) and (e)]. A change of wind direction can be inferred from the CO distribution between the fifth and sixth scenes [see Fig. 7(e) and (f)]: the eastward wind component appears stronger in the sixth scene than in the fifth, because the plume direction changes from northward to eastward. In the seventh scene, the plume spreads over a wider area, and the peak concentration is weaker than in previous stages [see Fig. 7(g)].
There seems to be no additional CO from the source region in the eighth scene, and the highest concentration is found on the northeast side instead of at the source. This higher concentration is thought to result from the movement of the core of the plume. Fig. 9 shows the partial three-dimensional structure of CO in this scene. Here, the color scale is adjusted to better highlight the plume structure. Among the thirteen scanlines of NAST-I, we plot the horizontal and vertical CO profiles starting with the sixth scanline, which has the maximum concentration of CO. In this partial three-dimensional figure, CO over the source region around 112.9°W is close to the background, whereas NAST-I captures the core of the plume around 112.3°W. From the ninth scene, the concentration notably decreases, and it is close to the background concentration in the twelfth scene, though the scene still has the signature of a plume [see Fig. 7(i)-(l)]. Since CO has a lifetime ranging from 1.5 weeks to a year [49], such weakening is attributed to dilution by mixing with clean air. A time evolution of CO concentration similar to that at 700 hPa is captured at 400 hPa (see Fig. 8). As altitude increases from the source of fire on the surface, the overall concentration of CO decreases. The area affected by the plume also becomes slightly smaller. Also, the background CO concentration is approximately 60 ppbv at 400 hPa, compared with around 100 ppbv at 700 hPa.
On August 8, a wider plume spread over 47.8°N to 49°N and 113.8°W to 115.5°W is targeted.
In the August 6 case, CO retrievals capture a plume that blows from the source of fire toward the east [see Fig. 10(a)-(c)]. Among the three levels, retrievals over 500 and 700 hPa successfully capture the plume propagation direction. The PCRTM-RA retrieved CO profiles indicate that the plumes are located in the mid to lower troposphere. Compared to the August 8 and August 16 cases, the August 6 plume was weaker in terms of concentration.
In Fig. 10(d)-(f), the CO retrieval from August 8 captures the plume shown in the visible image. This plume is relatively well developed vertically, with high CO concentrations at 500 and 700 hPa. It is clear from Fig. 10(e) and (f) that the PCRTM-RA retrieved CO can be used to separate clean air from polluted air, with heavy plume areas identified. We also see a slightly enhanced concentration of around 250 ppbv at 300 hPa, even though the information from infrared sounders is limited there. It is interesting to note that the highest concentration area at 700 hPa is shifted slightly to the southeast relative to that at 500 hPa. This suggests that the plumes may come from two different core locations.
2) Comparison of NAST-I CO Retrieval With in situ Observations: For assessment purposes, we evaluate the retrieval for the day used in previous work (i.e., the case of August 6, 2019) and follow the same collocation strategy; the detailed strategy can be found in [38]. Fig. 11(a) compares collocated PCRTM-RA retrieved CO (CO pc ) and aircraft in situ measured CO (CO ref ) at different pressure levels. Red and black represent CO pc and CO ref , respectively. A corresponding scatter plot between CO ref and CO pc is provided in Fig. 11(b) as an alternative view of the same data. Here, Δlatitude and Δlongitude are less than 0.015°, and ΔUTC is less than 2 h. Based on Fig. 11, CO pc generally captures in situ CO vertical variations very well.
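The collocation thresholds quoted above can be expressed as a simple filter. The sketch below uses only the three stated limits (Δlatitude and Δlongitude below 0.015°, ΔUTC below 2 h); the full strategy is described in [38] and may include further steps not shown here. The record layout, function names, and sample values are hypothetical.

```python
from datetime import datetime, timedelta

MAX_DLAT_DEG = 0.015             # latitude threshold from the text
MAX_DLON_DEG = 0.015             # longitude threshold from the text
MAX_DT = timedelta(hours=2)      # time threshold from the text

def is_collocated(obs_a, obs_b):
    """True if two records (dicts with 'lat', 'lon', 'time') meet all thresholds."""
    return (abs(obs_a["lat"] - obs_b["lat"]) < MAX_DLAT_DEG
            and abs(obs_a["lon"] - obs_b["lon"]) < MAX_DLON_DEG
            and abs(obs_a["time"] - obs_b["time"]) < MAX_DT)

def collocate(retrievals, in_situ):
    """Return every (retrieval, in situ) pair that satisfies the thresholds."""
    return [(r, s) for r in retrievals for s in in_situ if is_collocated(r, s)]

# Made-up sample records; coordinates, times, and CO values are illustrative only.
t0 = datetime(2019, 8, 6, 21, 0, 0)
retrievals = [
    {"lat": 40.000, "lon": -115.000, "time": t0, "co_ppbv": 180.0},
    {"lat": 40.500, "lon": -115.000, "time": t0, "co_ppbv": 120.0},
]
in_situ = [
    {"lat": 40.010, "lon": -115.005, "time": t0 + timedelta(minutes=45), "co_ppbv": 165.0},
    {"lat": 40.010, "lon": -115.005, "time": t0 + timedelta(hours=3), "co_ppbv": 150.0},
]

pairs = collocate(retrievals, in_situ)
for r, s in pairs:
    print(r["co_ppbv"], s["co_ppbv"])
```

In this toy data only the first retrieval and the first in situ record pair up: the second in situ record is too late, and the second retrieval is too far north.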
To better understand the comparison results, it helps to consider the inherent differences between in situ and remote sensing data. The in situ DC-8 CO observation had a very limited sampling of the atmosphere, while the NAST-I retrieved CO has continuous three-dimensional atmospheric sampling between the ER-2 flying altitude and the ground. It is therefore very difficult to ensure that the NAST-I sensor and the in situ CO instruments are sampling the same air parcel (or air volume) at different altitudes. Furthermore, the ER-2 and DC-8 have different flight patterns that overpass plumes at different times. If the DC-8 dwells in the relatively clean area above the plume event, biased sampling and lower resolution result in a large gap between retrieval and in situ observations.
A similar explanation applies to the general underestimation in the lower troposphere. As shown by the averaging kernel, a change of CO in the mid-troposphere contributes considerably to the lower level. That is, a radiative signal from a plume can be diluted due to coarse vertical resolution. Moreover, underestimation could be enhanced if the DC-8 flies inside a plume that is very shallow and confined to the lower troposphere. Beyond the vertical resolution difference, the horizontal resolution difference and the time difference in the collocation procedure also contribute to the difference between CO pc and CO ref .

IV. CONCLUSION
To improve retrievals for better observation and characterization of CO plumes from NAST-I measurements over the wildfire rampant area during the FIREX-AQ campaign, adaptive modification on the standard PCRTM-RA algorithm has been made by localizing the vertical correlation imposed by the a priori and, therefore, providing a de facto relaxed constraint. The impact of the relaxing vertical correlations defined by the a priori covariance matrix is assessed from a synthetic dataset of NAST-I measurements representing various wildfire-induced CO plume cases, as well as a nominal clean atmospheric environment. We also demonstrate the CO retrieval results after applying the modified algorithm to real NAST-I campaign data and validate the corresponding results using in situ observations.
Notably, the a priori used in the standard PCRTM-RA for global CO retrievals limits the variation of possible retrieved CO profiles. A strong correlation between different altitude levels exists in the a priori covariance matrix typically used for global CO retrieval. Such correlation is due to vertical distribution characteristics of the global scale CO profiles, which can be very different from that of a local event when high concentration CO profiles dominate. Also, the lack of information provided by radiance observations for CO retrieval, as compared to T, q, or O₃ retrieval, requires more careful use of the a priori constraint. Simulation-based assessment has shown that the default algorithm for global retrieval cannot accurately capture the vertical distribution of high concentration CO profiles. Using the updated a priori constraint results in a notable improvement in the vertical structure of retrieved CO profiles, i.e., increased variability over mid-lower altitude and suppressed variability over high altitude CO that better matches the truth.
The CO plume formation and evolution are nicely captured by the updated PCRTM-RA CO retrievals using NAST-I measurements. Also, spatially and temporally coherent features, which are expected for wildfire plumes, are seen very well from the retrieval results. In situ CO measurements cannot catch such CO plume features because in situ observations are made point by point. It is also hard to see from polar-orbiting satellite infrared hyperspectral sounders due to their limited spatial and temporal resolutions. Despite the challenges of reconciling results from different types of measurements, we can still confirm a reasonable agreement between CO pc and CO ref .
The current approach is optimized for atmospheric conditions that include both nominal and highly concentrated CO. It has been demonstrated with FIREX-AQ NAST-I retrievals evaluated against coincident in situ CO measurements with a large variation of CO profiles, from upwind clean to downwind polluted regions, although this experiment was done within small regions near the wildfires at different geographic locations. For global application, infrared hyperspectral observations could be classified according to the magnitude of the CO from an initial retrieval, and the final retrieval could then use an appropriate CO error covariance matrix. We can first perform the PCRTM-RA optimal estimation retrievals using a global error covariance constraint. If a retrieved scene has a CO concentration above a given threshold, PCRTM-RA will perform one or two more iterations with a relaxed CO error covariance matrix to achieve a better CO retrieval.
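The proposed two-stage strategy can be sketched as a small control-flow example. Everything here is hypothetical: the `retrieve` callable stands in for a PCRTM-RA retrieval and is replaced by a toy stub, the 200 ppbv threshold is an arbitrary placeholder (the text does not give a value), and using the peak of the profile as the trigger is one possible reading of "a CO concentration above a given threshold".

```python
def two_stage_retrieval(radiance, retrieve, s_x_global, s_x_relaxed,
                        threshold_ppbv, extra_iterations=2):
    """Retrieve with the global constraint, then refine polluted scenes.

    retrieve(radiance, s_x, first_guess, n_iter) -> CO profile in ppbv.
    This callable is a hypothetical stand-in for the actual retrieval.
    """
    profile = retrieve(radiance, s_x_global, None, None)
    # Assumption: the peak concentration of the first pass decides the branch.
    if max(profile) > threshold_ppbv:
        profile = retrieve(radiance, s_x_relaxed, profile, extra_iterations)
    return profile

# Toy stub in place of a real retrieval: it records which constraint was used
# and scales the input to mimic a relaxed constraint allowing higher peaks.
calls = []

def fake_retrieve(radiance, s_x, first_guess, n_iter):
    calls.append(s_x)
    scale = 1.0 if s_x == "global" else 1.5
    return [scale * value for value in radiance]

THRESHOLD_PPBV = 200.0   # placeholder value, not from the text

clean = two_stage_retrieval([90.0, 100.0, 80.0], fake_retrieve,
                            "global", "relaxed", THRESHOLD_PPBV)
calls_clean = list(calls)

calls.clear()
polluted = two_stage_retrieval([150.0, 400.0, 120.0], fake_retrieve,
                               "global", "relaxed", THRESHOLD_PPBV)
calls_polluted = list(calls)

print(calls_clean)      # only the global pass runs for the clean scene
print(calls_polluted)   # the polluted scene triggers the relaxed pass
```

Keeping the global constraint for the first pass means that clean scenes are processed exactly as before, and only scenes flagged as polluted pay the cost of the additional iterations.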