CMOS System-on-Chip Spectrometer Processors for Spaceborne Microwave-to-THz Earth and Planetary Science and Radioastronomy

This article presents the development of a CMOS system-on-chip spectrometer processor for spaceborne Earth science, planetary science, and astronomy. The chip is intended for rotational emission spectroscopy observations from microwave to sub-millimeter wavelengths performed from Earth-orbiting or deep-space exploration spacecraft. The article first provides an overview of rotational spectroscopy for the space sciences and highlights some of the important results achieved in Earth science, planetary science, and astronomy. It then discusses the need for smaller and more compact spectrometer instruments across these three space science areas, which motivates this work. Key considerations in the development of spectrometer processors for space science are then discussed, including the significance of bandwidth, quantization efficiency, and spectral channel shapes. The article then details the design of the developed spectrometer processor, focusing on the many calibration techniques included to allow the chip to adapt to operation in space environments where large temperature ranges and high radiation levels are encountered. Presented measurements demonstrate that the spectrometer processor operates up to 12.8 GS/s to provide up to 6.4 GHz of processing bandwidth with a frequency resolution of 8192 spectral channels and an analog resolution of 4 bits while consuming only 3960 mW of total DC power.

microwave, millimeter-wave, or Terahertz frequency range. The source is then swept across the band of interest to identify the molecular resonance frequencies of the sample gas, which reveal themselves in the experiment as frequencies at which the source energy is absorbed [1], [2], [3]. Absorption spectroscopy is by its nature already frequency resolved, mapping absorption to frequency, since the source is controlled and operates at only one frequency at a time. Other common variations on absorption spectroscopy use pulsed sources and observe the absorbed and re-emitted energy [4], [5], preventing the source noise (thermal and phase noise) from limiting the sensitivity of the measurement.
While simple to implement, absorption spectroscopy using artificial sources is not common for space exploration as sources capable of illuminating entire atmospheres or planets are not practical from a signal power standpoint, and certainly not compatible with the limited power and mass budget of an exploration spacecraft regardless of its destination. Beyond the practicality of sources, absorption spectroscopy requires the investigated sample to be placed between source and detector. Such configurations are possible at Earth for targeted situations like the GRACE and SWITCH concepts [6], [7], but not easily achieved at distant planets in our solar system as it would require multiple spacecraft in well understood constellations around the target object. Absorption spectroscopy can only be employed in astronomy when a natural signal source (star or other radio source) exists behind the object of study in the field of view. Using an artificial signal source for absorption is obviously not applicable in astronomy as the objects of study are astronomically distant (many light-years away), making a "sample in the middle" configuration impossible to configure (without many millions of years to perform the experiment).
For these key reasons, spaceborne exploration across Earth and planetary science and astronomy primarily employs emission rotational spectroscopy, which is passive from the instrument standpoint. All molecules thermally excited above 0 K (so basically all matter in the universe) continuously emit some of their thermal energy at their rotational resonance frequencies, with the emitted power level being proportional to their absolute temperature. While the spectral purity or "linewidth" of these emissions is governed by their absolute temperature (leading to Doppler broadening of the resonance) and their gas pressure (which dampens the resonance at higher pressures due to increased inter-molecular collisions through a process called pressure broadening), the absolute frequency of these resonances is governed by the structure of the molecule itself, making it an excellent means to identify and characterize the molecular components of gasses [8]. These benefits stated, emission-based spectroscopy has several challenges that make it more difficult to perform than its absorption counterpart. First, as the emissions of gasses at their rotational resonances only occur at power levels proportional to their own temperature, they are not observable in a warm environment where the background behind the gas is at a comparable temperature. In this situation the contrast will be too low to perform an emission detection with a meaningful signal-to-noise ratio (SNR). Note that if the background is at a far higher temperature than the target gas, then an absorption measurement is probably more suitable, as the gas will produce an absorption feature. Second, the receiver noise temperature or system temperature itself often limits the SNR of these detections to levels far below what absorption spectroscopy can achieve. Fig. 1 provides a comparison showing the key differences between absorption and emission spectroscopy.
As an example, let us quantitatively consider trying to detect water molecules in the atmosphere of Titan, an exciting moon of Saturn that may have the potential to support life. Titan's atmospheric temperature peaks at about 94 K [9], so against the cold space temperature of 2-3 K set by the cosmic microwave background (CMB), this leads to a limited contrast of about 92 K. To detect H2O molecules at their 556.96 GHz resonance, you would need a receiver at these frequencies like the systems in [10], [11], which at best provide 1500 K noise temperatures, giving you a 1500 K to 1592 K contrast ratio in the most optimistic case. Radioastronomy investigates objects at even lower temperatures than this, and thus demands extremely sensitive detectors to attain a workable contrast. As the signal-to-noise levels are quite low even in these ideal cases, integration is often employed in these spectroscopic measurements: the measurement is repeated many thousands of times and averaged to statistically suppress the receiver noise, which is uncorrelated from measurement to measurement.
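The effect of integration on the Titan example can be illustrated with a few lines of arithmetic. The sketch below is only an illustration under a simplifying assumption that is not stated in the text: the noise of a single measurement is taken to have a standard deviation equal to the on-source system temperature, so that averaging N uncorrelated measurements improves the SNR by a factor of sqrt(N). The function names are hypothetical.

```python
import math

CONTRAST_K = 92.0            # Titan vs. cold space contrast, from the text
T_SYS_ON_SOURCE_K = 1592.0   # 1500 K receiver plus 92 K source, from the text


def snr_after_averaging(contrast_k, t_sys_k, n_averages):
    """SNR after averaging n_averages uncorrelated measurements.

    Assumption: single-measurement noise standard deviation equals t_sys_k.
    """
    return (contrast_k / t_sys_k) * math.sqrt(n_averages)


def averages_needed(contrast_k, t_sys_k, target_snr):
    """Smallest number of averages that reaches target_snr under the same assumption."""
    return math.ceil((target_snr * t_sys_k / contrast_k) ** 2)


if __name__ == "__main__":
    single = snr_after_averaging(CONTRAST_K, T_SYS_ON_SOURCE_K, 1)
    needed = averages_needed(CONTRAST_K, T_SYS_ON_SOURCE_K, 10.0)
    print(f"single-measurement SNR: {single:.3f}")
    print(f"averages needed for SNR of 10: {needed}")
```

Under this assumption a single measurement has an SNR well below one, which is why tens of thousands of averages are typical before a line becomes clearly visible.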

II. IMPORTANCE OF ROTATIONAL EMISSION SPECTROSCOPY FOR SPACE SCIENCES
In Earth science, rotational spectroscopy is most often employed for studying Earth's atmosphere, primarily to gain a better understanding of atmospheric processes, their mechanisms, and how they may be influenced by the effects of climate change. One excellent example is the Microwave Limb Sounder (MLS) series of missions, including the UARS MLS spaceborne spectrometer [12] and the more recent Aura MLS spaceborne spectrometer [13]. Both MLS instruments are operated in a configuration called limb-sounding, where the spectrometer observes the atmosphere from orbit on a tangent line, looking at the emission of atmospheric gasses against the cold background of space as depicted in Fig. 2(a). MLS results have not only been important in making future predictions about climate change [14], [15], but have also led to the discovery of several important atmospheric circulations including the stratospheric "tape-recorder" [16], and have played an important role in understanding the mechanisms of ozone loss caused by human-emitted chlorofluorocarbons [17] as well as the recovery since the Montreal Protocol to protect Earth's ozone layer was adopted in 1987 [18].
In planetary science (focused on missions exploring other planets in our solar system) rotational spectroscopy can be performed in a limb-sounding configuration like that in Earth science, but it also has other interesting configurations related to plumes and jets. Several primitive planetary bodies and moons are known to emit plumes or jets of material into space that can be sensed by a spectrometer to measure the molecular composition of the emitted material. Two exciting examples of this are Jupiter's moon Europa and Saturn's moon Enceladus, both thought to have sub-surface oceans that may harbor the conditions necessary to support life. Fig. 2(b) depicts how a planetary mission can be configured for this type of plume sensing investigation. An example of a planetary emission spectrometer is the Microwave Instrument for the Rosetta Orbiter (MIRO) [19], which flew on the Rosetta mission to comet 67P. The MIRO instrument with its 180 and 500 GHz spectrometer bands investigated the comet's H2O and found that the Deuterium to Hydrogen (D/H) isotopic ratio of the water present did not match predictions. This has major implications for the models of solar system evolution that were thought to be settled at the time [20] and have now been opened for further debate.
Finally, in astronomy, emission spectroscopy is used both in ground telescopes and spaceborne telescopes to probe the molecular content of stellar objects and understand the distribution of matter in key regions both inside and outside of our Milky Way galaxy, as in the configuration depicted in Fig. 2(c). Beyond analyzing the molecular composition, spectroscopy has another important purpose in radio-astronomy, which is to provide velocity-resolved measurements. Looking out into the universe, many different objects will likely lie along the view-path, including dust and gas within our own solar system, material in the surrounding Oort cloud, as well as material in our own galaxy overlapping the view of the distant objects being studied. As the molecular features in the spectrum are subject to the Doppler shifts induced by both solar and galactic motions as the signals travel, this provides a means to separate foreground and background regions of space in astronomical observations. An important example of a spaceborne telescope spectrometer is the Herschel-Heterodyne Instrument for the Far-Infrared (HIFI) on ESA's Herschel space telescope launched in 2009. HIFI performed several important surveys of star-forming regions and improved understanding of the material and processes that promote the generation of new stars [21]. Beyond these configurations, there exist others that need to be acknowledged for completeness, including limb-sounding for Earth science from ballooncraft [22], spectroscopy for astronomy from ballooncraft [23], and airborne configurations [24], [25]. The focus of this work, however, is on spaceborne implementations of these exciting science investigations.

III. MOTIVATION: THE NEED FOR COMPACT AND LOW-POWER SPECTROMETER SYSTEMS
While instrumentation for Earth science, planetary science and astronomy has vastly different requirements, all three share a common need for compact and low-power spectrometer systems, albeit for very different motivations. In the Earth science arena, a major scientific need exists for continuous global coverage of atmospheric measurements, especially so that transient atmospheric events (severe storms and monsoons, convection events, and even volcanic eruptions) can be captured. These needs are leading to new mission architectures where large constellations of smallsats or cubesats are used in coordination to provide near-continuous coverage of large areas, as opposed to the limited spatial-temporal coverage of a single large spacecraft. A recent example of this is the Tempest-D demonstration instrument that flew on a cubesat in 2018 [26]. These small cubesat and smallsat spacecraft have extremely limited payload resources and are often restricted to just 5-10 W of total power for their science instrumentation, demanding far lower-power instruments than the 250 W scale spectrometers like Aura MLS. In planetary science power is similarly limited, but by the nature of the journey, not the spacecraft form factor itself. As you travel into the outer solar system, sunlight and the resulting solar power become extremely limited, and so multi-hundred-watt payloads cannot be supported. As an alternative, radioisotope-thermoelectric power can be used, as on the Cassini and Voyager missions [27], but here again power is limited by the launch mass requirements of the spacecraft, which restrict the total amount of radioactive material. Meanwhile in astronomy, where spaceborne telescopes are not destined for the outer solar system, an abundance of solar power exists to provide a science payload with several hundred watts at a minimum. For example, the James Webb Space Telescope, launched in late 2021, provides over 2000 W of power to the science payload.
However, in radioastronomy, unlike Earth and planetary science, telescopes need to map the entire sky to produce images, not perform targeted measurements on single objects, and they typically have much longer integration times (hours to days) as the stellar object temperatures are much lower. The long integration times and many measurements greatly limit the number of observations that can be performed in a mission lifetime.

IV. CONSIDERATIONS FOR THE DESIGN OF EMISSION SPECTROMETER PROCESSORS
To overcome this, astronomers have begun to consider spectrometer arrays of many pixels (10-100) which can parallelize these observations, making even measurements with extremely long integration times practical. While these arrays have not yet flown in space, several airborne examples exist, including the upGREAT array on the SOFIA observatory [28] and the GUSTO balloon-borne array [23]. Spaceborne implementations of this type of spectrometer telescope are thought to be an eventuality, although advancements in both processing and receiver technology need to be made before such a mission can be undertaken. Shown in Fig. 3 is a generic block diagram of an emission spectrometer instrument. While rotational spectral features exist across a wide range (1-3000 GHz) for common molecules, spaceborne systems for exploration tend to be clustered in frequency space by discipline (although exceptions do exist). Earth science typically focuses on water vapor and pollution molecules in the 20-400 GHz range, while planetary science generally focuses on isotopes of water and small organics found in the 100-750 GHz range. Astronomy, although broad in scope, tends to target small atomic molecules (O+, O2, H2) across the 0.3-3 THz range. All the spaceborne spectrometer instruments referenced in this article are variations on this general block diagram, with small differences in the number of down-conversion steps and in single vs. double sideband operation. Essentially, a spectrometer is a heterodyne receiver system that captures a band containing the molecular rotational resonance frequencies to be studied and down-converts them to a lower intermediate frequency (IF). The local oscillator (LO) signal for the down-converting mixer is often generated with a frequency multiplier chain [29], [30] for higher frequency bands in the THz regime, while millimeter-wave spectrometers have many examples of a synthesizer operating directly at the band of interest [31].
In all configurations, the IF signal at the receiver output is ultimately passed to a spectrometer processor, the focus of this article. The spectrometer processor performs four key functions within the spectrometer instrument: 1) Digitize the IF signal with an analog-to-digital converter (ADC) so that subsequent digital signal processing (DSP) operations can be performed. 2) Apply a windowing function to the input stream to reduce spectral leakage between spectrometer bins (bins are often referred to as "channels" in spectroscopy). 3) Compute the power spectral density (PSD) of the input signal with a discrete Fourier transform (often implemented as a fast Fourier transform (FFT) processor). 4) Average or accumulate the power spectral density to increase the signal-to-noise ratio of the measurement by averaging out noise power in the IF bandwidth.
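The four functions listed above can be sketched in a few lines of Python. This is a toy model for illustration only, not the chip's implementation: it assumes a mid-rise 4-bit quantizer, uses a simple Hann window and a naive DFT as stand-ins for the windowing and FFT hardware, and works on 32-point frames. All function names are hypothetical.

```python
import cmath
import math
import random


def quantize(x, bits=4, full_scale=1.0):
    """Step 1: mid-rise quantizer with clipping (an assumed ADC model)."""
    levels = 2 ** bits
    step = 2.0 * full_scale / levels
    code = math.floor(x / step)
    code = max(-levels // 2, min(levels // 2 - 1, code))
    return (code + 0.5) * step


def hann(n_points):
    """Step 2: window coefficients (Hann used here purely as a placeholder)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * k / n_points) for k in range(n_points)]


def power_spectrum(frame):
    """Step 3: power in each positive-frequency bin via a naive DFT."""
    n = len(frame)
    spectrum = []
    for m in range(n // 2):
        acc = sum(frame[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectrometer(samples, n_points, n_averages):
    """Steps 1-4 chained: digitize, window, transform, then average the PSD."""
    window = hann(n_points)
    accum = [0.0] * (n_points // 2)
    for a in range(n_averages):
        frame = samples[a * n_points:(a + 1) * n_points]
        digitized = [quantize(x) for x in frame]
        windowed = [w * x for w, x in zip(window, digitized)]
        for m, p in enumerate(power_spectrum(windowed)):
            accum[m] += p
    return [p / n_averages for p in accum]


if __name__ == "__main__":
    random.seed(0)
    n, averages = 32, 50
    # Weak tone in bin 5 buried in Gaussian noise of comparable amplitude.
    data = [0.2 * math.sin(2 * math.pi * 5 * k / n) + random.gauss(0.0, 0.2)
            for k in range(n * averages)]
    psd = spectrometer(data, n, averages)
    print("strongest channel:", psd.index(max(psd)))
```

Averaging does not raise the tone itself; it reduces the scatter of the noise floor between channels so that the spectral feature stands out.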

A. BANDWIDTH CONSIDERATIONS
Although the need for digitization in the processing pipeline is obvious, there are some important considerations, notably quantization efficiency and bandwidth. Many spectrometers in Earth and planetary science observe multiple lines that are spread throughout a large bandwidth. For example, the lines used for the D/H ratio discussed in the MIRO results [19] lie 48 GHz apart, with the "D" line at 509 GHz and the "H" line at 557 GHz, and with other interesting isotopes of water also present at 520 and 530 GHz. To perform this measurement the LO is essentially stepped one IF bandwidth at a time to cover the entire RF band of the receiver. The wider the digitization bandwidth, the fewer steps are needed to cover the spectrometer's RF band and the more efficient the instrument will be in terms of observation time for a given signal-to-noise ratio. Additionally, in some planetary science flyby scenarios two spectral features may need to be compared simultaneously, and time-interleaved switching of the LO frequency is not suitable because the relative velocity between spacecraft and observed target is so high that the antenna is pointed at a different location by the time the switching operations and integration cycle have occurred. For these reasons a large-bandwidth spectrometer processor is highly desired.
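The relationship between processing bandwidth and the number of LO steps is simple arithmetic, sketched below for the 48 GHz span between the 509 GHz and 557 GHz lines. The sketch assumes contiguous, non-overlapping LO steps and ignores line widths and band-edge roll-off; the function name is hypothetical.

```python
import math


def lo_steps(rf_span_ghz, if_bandwidth_ghz):
    """Number of LO settings needed to tile rf_span_ghz with one IF bandwidth per step."""
    return math.ceil(rf_span_ghz / if_bandwidth_ghz)


if __name__ == "__main__":
    span = 557.0 - 509.0  # GHz, from the D/H example in the text
    for bandwidth in (3.0, 6.4):
        print(f"{bandwidth} GHz processor: {lo_steps(span, bandwidth)} LO steps")
```

With these assumptions, moving from a 3 GHz to a 6.4 GHz processor halves the number of LO settings, and therefore roughly halves the observation time for the same integration per step.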

B. QUANTIZATION EFFICIENCY CONSIDERATIONS
Emission spectrometers are essentially only capturing noise, not deterministic signals, so deciding the correct number of bits for a given system is not as straightforward as for a wireless system or radar, where the dynamic ranges of the signals involved are well understood. Instead, we use the concept of quantization efficiency introduced in [32], which describes the loss of the quantized spectral shape vs. the ideal physical shape at the receiver input as a degradation of "SNR". This formulation assumes the spectrometer input exhibits a Gaussian distribution (typically the case for noise signals) with a variance of σ² that is quantized to N levels by the digitizer (N being even, N = 2n), with level spacing ε (in units of σ). With this, the quantization efficiency η_Q, defined as the loss in "SNR" in spectral shape due to added quantization noise, is expressed by (1) as a function of N and ε. If we use this expression and design the spectrometer system so that the IF signal from the receiver excites the input full-scale of the digitizer (through automatic gain control), the expression in (1) can be evaluated versus the number of bits in the digitizer to produce the plot shown in Fig. 4. Of course, this plot is simplistic and assumes that white quantization noise is the only source of distortion, omitting possible spurious contributions of the A/D circuitry. As is clear from the plot, designing beyond the 4-bit resolution level has diminishing returns on the improved "SNR" of the spectral shape, while having major implications for the A/D's implementation complexity, the level of calibration required, and the power dissipation of both the A/D itself and the downstream DSP circuitry, which must carry larger word-widths. For these reasons our implementation selected an A/D with 4-bit resolution.
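The diminishing return beyond 4 bits can be reproduced numerically. The sketch below assumes that the even-level quantization efficiency takes the form commonly given in the radio-astronomy literature, η_Q = (2/π)·[1/2 + Σ_{m=1}^{n−1} exp(−m²ε²/2)]² / [(n − 1/2)² − 2·Σ_{m=1}^{n−1} m·erf(mε/√2)], with N = 2n levels; whether this matches expression (1) term for term is an assumption. It also assumes that the gain control sets the level spacing ε to the value that maximizes η_Q, found here by a coarse grid search. Function names are hypothetical.

```python
import math


def quantization_efficiency(num_levels, eps):
    """Efficiency for an even number of levels with spacing eps (in units of sigma).

    Assumed literature form; see the lead-in for the expression.
    """
    n = num_levels // 2
    numerator = (0.5 + sum(math.exp(-(m * eps) ** 2 / 2.0) for m in range(1, n))) ** 2
    denominator = (n - 0.5) ** 2 - 2.0 * sum(
        m * math.erf(m * eps / math.sqrt(2.0)) for m in range(1, n))
    return (2.0 / math.pi) * numerator / denominator


def best_efficiency(bits, grid_points=3000):
    """Maximum efficiency over a grid of level spacings (assumed AGC operating point)."""
    levels = 2 ** bits
    spacings = [0.01 + 2.99 * k / grid_points for k in range(grid_points + 1)]
    return max(quantization_efficiency(levels, eps) for eps in spacings)


if __name__ == "__main__":
    for bits in range(1, 7):
        print(f"{bits} bit(s): eta_Q = {best_efficiency(bits):.4f}")
```

Under these assumptions the efficiency rises steeply from 1 to 3 bits and gains only about a percent or less for each bit added beyond 4, which is consistent with the trend described for Fig. 4.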

C. WINDOWING FUNCTION CONSIDERATIONS
Windowing functions are common in many DSP operations where a discrete Fourier transform (DFT) or fast Fourier transform (FFT) is taken, and they serve the important purpose of reducing spectral leakage, the energy that spreads between bins of a DFT or FFT because only a finite number of time samples are used. Spectral leakage effects are extremely important in spectroscopy, as we are often comparing one spectral quantity to another (e.g., the ratio of molecule A to molecule B), so contamination of each FFT bin by other frequencies is highly undesirable.
While in communication and radar applications Nyquist-length windows are common (windowing functions that have the same number of points as the input sequence), for example the Hanning, Hamming, and Blackman-Harris window functions, in spectroscopy overlength windows are typically employed. An overlength window has more points in the window function representation than the final number of frequency domain points after the DFT or FFT is completed. In particular, polyphase filter bank (PFB) windows are well suited to spectroscopy [33] and weight the time-domain sequence with an oversampled sinc function. These window functions offer an excellent balance between implementation complexity and the degree of spectral leakage. In this work we use a 4x overlength PFB window, which is compared with a common Nyquist-length Hanning window in Fig. 5 to illustrate the advantages in terms of spectral leakage. As seen, while not perfectly flat, the PFB window provides a flatter gain response than a Nyquist-length window function.
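The leakage advantage of an overlength window can be checked with a small numerical experiment. The sketch below is an illustration under stated assumptions rather than the window used on the chip: it uses a plain truncated sinc with 4 taps per channel (no additional taper), a 32-point transform for speed, and measures how strongly a unit tone placed 1.0 and 1.5 bins away from a channel center appears in that channel. Function names are hypothetical.

```python
import cmath
import math


def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def pfb_window(n_fft, taps):
    """Overlength window: a sinc spanning taps * n_fft points (assumed untapered)."""
    length = n_fft * taps
    center = (length - 1) / 2.0
    return [sinc((k - center) / n_fft) for k in range(length)]


def hann_window(n_fft):
    """Nyquist-length Hanning window with n_fft points."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * k / n_fft) for k in range(n_fft)]


def bin_response(window, n_fft, bin_index, tone_bins):
    """Magnitude seen in bin_index for a unit complex tone at tone_bins (units of bins).

    The weighted input is folded into one n_fft-long frame before a single DFT bin
    is evaluated, which is the polyphase structure; for a Nyquist-length window the
    fold is a no-op.
    """
    folded = [0j] * n_fft
    for k, w in enumerate(window):
        folded[k % n_fft] += w * cmath.exp(2j * math.pi * tone_bins * k / n_fft)
    return abs(sum(folded[n] * cmath.exp(-2j * math.pi * bin_index * n / n_fft)
                   for n in range(n_fft)))


def relative_leakage(window, n_fft, offset_bins, bin_index=8):
    """Response to a tone offset_bins away, normalized to the on-center response."""
    on_center = bin_response(window, n_fft, bin_index, bin_index)
    return bin_response(window, n_fft, bin_index, bin_index + offset_bins) / on_center


if __name__ == "__main__":
    n_fft = 32
    for offset in (1.0, 1.5):
        pfb = relative_leakage(pfb_window(n_fft, 4), n_fft, offset)
        hann = relative_leakage(hann_window(n_fft), n_fft, offset)
        print(f"offset {offset} bins: PFB {pfb:.3f}, Hanning {hann:.3f}")
```

A tone centered in the neighboring channel leaks into a Hanning-windowed bin at half amplitude, while the 4-tap sinc suppresses it far more strongly, at the cost of storing and weighting four times as many input samples per output frame.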

V. SYSTEM-ON-CHIP IMPLEMENTATION FOR SPECTROMETER PROCESSORS
Spectrometer processors for spaceborne instruments have gone through several iterations of implementation, including early FPGA-based implementations [34] that were considerably power hungry (>25 W) and custom ASIC implementations with off-chip A/D converters [35]. More recently, system-on-chip (SoC) based spectrometer processors have become available [36], [37] which contain the A/D, DSP, and clocking PLLs within a single chip, providing the lowest possible power solution, which is desirable given the power constraints discussed in Section III across Earth science, planetary science and astronomy spectrometer systems. In this article we present the design of a next-generation spectrometer building on the prior work in [38]. The new design moves the implementation from a 65 nm to a 28 nm technology, achieving a higher sample rate (12.8 GS/s over 6 GS/s previously), a 4-bit implementation (over the prior 3-bit work), double the number of FFT bins (from 4096 to 8192) and double the input bandwidth (6.4 GHz over 3 GHz). In addition, the presented 28 nm chip makes several architectural changes that better lend themselves to the high radiation and extreme temperatures encountered in space exploration environments. Although the chip operates up to 12.8 GS/s, it can be operated at lower speed to reduce power if the full bandwidth is not required. Fig. 6 shows the block diagram of the spectrometer processor, which begins with two interleaved 4-bit analog-to-digital converters (ADCs) operating up to 6.4 GS/s each for digitization. The ADC inputs are taken from a matching circuit which impedance-matches the ADC input track-and-hold circuitry to the 50-ohm cabling and PCB traces outside the chip and also establishes the required input common-mode levels. After the two interleaved ADCs, the digitized signal is de-muxed by 2 (for a total of 4 streams, each at a maximum of 3.2 GS/s) and then retimed so that all four sub-rate streams transition at the same clock edge.
The chip also contains a replica ADC used for calibration purposes (details are provided in the next section). After the demux and retiming, the digital streams are passed to the DSP section of the chip, which further de-muxes the digital signals into 32 parallel streams of at most 400 MS/s each. The polyphase windowing function is applied, and a 16K-point FFT is taken. From that FFT output the power spectral density (PSD) is computed, and each bin or "channel" is then averaged a programmable number of times to increase the spectral shape SNR. At the input to the PFB window block, a scale monitor checks the code statistics coming from the ADCs to ensure all quantization levels are being excited; this is used for feedback and automatic gain control at the IF stages before the processor chip.
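The role of the scale monitor can be illustrated with a toy model. The article does not describe the decision rule, so the sketch below is purely hypothetical: it histograms the 4-bit codes, asks for more IF gain when some code is never excited, and asks for less gain when too large a fraction of samples lands in the two end codes. The 5 % clipping threshold, the quantizer model, and all names are assumptions.

```python
import math
import random


def quantize_4bit(x, full_scale=1.0):
    """Map x in [-full_scale, full_scale) to a code 0..15, clipping outside (assumed model)."""
    code = int(math.floor(x / full_scale * 8.0)) + 8
    return min(15, max(0, code))


def scale_monitor(codes, clip_fraction=0.05):
    """Return a gain-control hint from the code histogram (hypothetical rule)."""
    histogram = [0] * 16
    for code in codes:
        histogram[code] += 1
    if any(count == 0 for count in histogram):
        return "increase_gain"      # some quantization levels are never excited
    clipped = (histogram[0] + histogram[15]) / len(codes)
    if clipped > clip_fraction:
        return "decrease_gain"      # too many samples pile up in the end codes
    return "hold"


def gaussian_codes(sigma, count=20000, seed=0):
    """Codes for Gaussian noise of standard deviation sigma (in units of full scale)."""
    rng = random.Random(seed)
    return [quantize_4bit(rng.gauss(0.0, sigma)) for _ in range(count)]


if __name__ == "__main__":
    for sigma in (0.05, 0.4, 3.0):
        print(f"sigma = {sigma} x full scale ->", scale_monitor(gaussian_codes(sigma)))
```

In a real instrument the hint would steer the IF gain stages ahead of the chip; the thresholds would be chosen to hold the level spacing near the value that maximizes quantization efficiency.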
Clocking for the chip originates at an on-chip phase-locked loop (PLL). The PLL locks to a low-frequency 50 MHz reference supplied from outside the chip and synthesizes the high-speed clocks internally. A programmable divider allows the spectrometer to be clocked from the 6.4 GHz PLL directly, or from a divided version of it at 3.2 GHz, 1.6 GHz or 800 MHz. Optionally, the PLL can be bypassed, allowing the clock to be fed in directly from outside; however, this is only intended for low frequencies, as operating from an external clock near full rate can lead to signal integrity problems with standing waves on the PCB clock line. The PLL clock output is provided to a clock management system (CMS) responsible for aligning all phases of the clock and its divided outputs to maintain setup and hold timing closure for each stage of the spectrometer processor. This is adjusted in closed loop using several embedded phase detectors, allowing timing closure even when extreme temperatures are encountered or when timing changes induced by total ionizing dose (TID) radiation degradation are experienced. The spectrometer processor offers a simple SPI interface for setting important registers like the number of averages, clock frequency settings and other analog calibration values. Data is read out via a simple 8-bit parallel bus that directly latches the power quantity of each averaged frequency bin. The following sections describe in detail each individual section of the spectrometer processor.
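The rates quoted for the data path follow directly from the selected clock frequency, as the short sketch below tabulates. It assumes, per the description of the interleaved ADCs, that the aggregate sample rate is twice the clock frequency, that the processing bandwidth is the Nyquist bandwidth, and that the 16K-point transform of a real input yields 8192 channels. The function and key names are hypothetical.

```python
def rate_plan(clock_ghz, fft_points=16384, dsp_streams=32):
    """Derive data-path rates from the selected clock frequency (in GHz)."""
    sample_rate_gsps = 2.0 * clock_ghz          # two sub-ADCs on opposite clock edges
    bandwidth_ghz = sample_rate_gsps / 2.0      # Nyquist bandwidth
    channels = fft_points // 2                  # real-input transform
    return {
        "sample_rate_gsps": sample_rate_gsps,
        "bandwidth_ghz": bandwidth_ghz,
        "channels": channels,
        "channel_width_khz": bandwidth_ghz / channels * 1e6,
        "dsp_stream_msps": sample_rate_gsps / dsp_streams * 1e3,
    }


if __name__ == "__main__":
    for clock in (6.4, 3.2, 1.6, 0.8):          # divider settings named in the text
        plan = rate_plan(clock)
        print(f"{clock} GHz clock: {plan['sample_rate_gsps']} GS/s, "
              f"{plan['bandwidth_ghz']} GHz bandwidth, "
              f"{plan['channel_width_khz']:.3f} kHz per channel, "
              f"{plan['dsp_stream_msps']:.0f} MS/s per DSP stream")
```

At the full 6.4 GHz clock this gives channels roughly 781 kHz wide; each lower divider setting halves both the bandwidth and the channel width while keeping 8192 channels.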

VI. IMPLEMENTATION FOR SPECTROMETER PROCESSOR ADC CONVERTER
The 4-bit ADC employed in the spectrometer processor is a two-way interleaved flash ADC whose overall block diagram is shown in Fig. 7. While flash ADCs commonly generate their reference voltages with a resistive ladder structure [39], [40], here we use individual 8-bit R2R DACs to generate each reference voltage for the comparator stages. Use of the R2R DACs allows the reference levels to be adjusted, not only to compensate for process variation and on-chip mismatches, but also for the effects of temperature or radiation the chip may encounter in spaceflight.
The thermometer output of the flash is not decoded to binary values directly following the comparator stage, but is instead decoded after a demux-by-2 operation is applied. Implementing the decoder logic at the demuxed stage consumes added chip area, as four copies of the decoder are needed. However, the logic operates at half the net sample rate, allowing higher overall bandwidth. Power remains similar in either implementation, as power consumption is linear with clock frequency (we have implemented twice the logic at half the clock speed). Once decoded, the input streams are further demuxed by 32 before passing to the digital signal processor that performs the windowing and transform operations.
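The ordering of the demux and decode operations can be illustrated with a behavioral model. The article does not specify the decoder logic, so the sketch below assumes a simple ones-counting decoder and 15 evenly spaced thresholds; it only demonstrates that decoding after a demux-by-2 yields the same codes as decoding at full rate, with each decoder copy seeing half the samples. All names are hypothetical.

```python
THRESHOLDS = [-0.875 + 0.125 * k for k in range(15)]   # assumed evenly spaced references


def flash_compare(sample, thresholds=THRESHOLDS):
    """15 comparator decisions for one sample (thermometer code, lowest threshold first)."""
    return [1 if sample > t else 0 for t in thresholds]


def thermometer_to_binary(thermometer):
    """Ones-counting decoder (an assumed decoder style, tolerant of isolated bubbles)."""
    if len(thermometer) != 15:
        raise ValueError("a 4-bit flash produces 15 comparator outputs")
    return sum(thermometer)


def demux_by_2(stream):
    """Split one stream into two half-rate streams (even and odd samples)."""
    return stream[0::2], stream[1::2]


if __name__ == "__main__":
    samples = [-1.0, -0.3, 0.0, 0.2, 0.6, 0.99]
    thermometer_stream = [flash_compare(s) for s in samples]

    # Decode first, then demux (decoder runs at the full sub-ADC rate).
    full_rate_codes = [thermometer_to_binary(t) for t in thermometer_stream]

    # Demux first, then decode (two decoder copies, each at half rate).
    even, odd = demux_by_2(thermometer_stream)
    even_codes = [thermometer_to_binary(t) for t in even]
    odd_codes = [thermometer_to_binary(t) for t in odd]

    print(full_rate_codes, even_codes, odd_codes)
```

With two interleaved sub-ADCs, two half-rate decoders per sub-ADC give the four decoder copies mentioned in the text.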
At the demux-by-2 stage, we have also added two R2R monitor DACs, one connected to each of the two interleaved sub-ADCs. Although these DACs support nowhere near the bandwidth of the ADCs themselves, they are useful for debugging with slow waveforms. Detailed schematics for each of the three key ADC circuit stages are shown in Fig. 8 (the ADC contains many copies of each). The track-and-hold amplifier (THA) circuit (Fig. 8(a)) is implemented as a simple NMOS pass-gate switch and source follower with a metal-over-metal (MOM) structure as the sampling capacitor. As the ADC has only 4-bit resolution, even this very simple track-and-hold structure supports the required linearity over our large 500 mV peak-to-peak differential voltage range. The clock applied to the sampling gate of the THA stage is a simple near 50/50 duty cycle clock at half the sample rate. The differential input match is implemented as a string of series resistors, where the midpoint (a virtual ground) is biased with an opamp in closed loop to establish the correct input common mode. The opamp tracks an 8-bit calibration R2R DAC to establish this common mode, which is set by the calibration algorithms discussed later in this section. After the sampling circuitry, a simple source-follower buffer is used as a distribution amplifier to drive the large load of the 15 downstream preamplifier stages (in direct flash architectures 15 preamplifiers and 15 comparators are needed for 4 bits of resolution).
The preamplifier stage, shown in Fig. 8(b), is implemented as a resistor-loaded, double-balanced differential pair. The negative input of the preamplifier is biased by two 8-bit R2R DACs which serve as the single-ended sides of the differential reference voltage. While the static integral and differential non-linearity (INL and DNL) of the 8-bit R2R DAC are constrained by on-chip mismatch and may show large offset errors relative to its own 8-bit LSB voltage value (<5 LSB INL and DNL in simulation), these offsets are quite small when referred to the much lower resolution 4-bit LSB voltage step of the ADC (scaled by a factor of 2^4) and do not impact overall static ADC linearity when the R2R values are correctly configured.
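The scaling argument can be made concrete with one line of arithmetic. The sketch below assumes, for illustration, that the 8-bit reference DAC and the 4-bit ADC span the same full-scale range, so that one ADC LSB equals 2^4 = 16 DAC LSBs; the function name is hypothetical.

```python
def dac_error_in_adc_lsb(error_dac_lsb, dac_bits=8, adc_bits=4):
    """Refer a reference-DAC error (in DAC LSBs) to the ADC LSB, assuming equal full scale."""
    return error_dac_lsb / 2 ** (dac_bits - adc_bits)


if __name__ == "__main__":
    worst_case = dac_error_in_adc_lsb(5)   # simulated bound of <5 LSB from the text
    print(f"5 DAC LSB corresponds to {worst_case} ADC LSB")
```

Under this assumption the simulated worst case stays below about a third of an ADC LSB before any calibration, and the residual left after trimming is bounded by the much finer 8-bit step.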
The comparator circuit (Fig. 8(c)) is a traditional StrongARM configuration [39], [40] which leverages both pMOS and nMOS cross-coupled pairs to increase the transient current into the decision nodes, reducing settling time. Fig. 9 shows post-layout simulations of both the preamplifier bandwidth and the comparator decision time in response to a fast rate, showing that each circuit supports beyond 6 GHz of bandwidth. Note that these circuits are part of the two-way interleaved sub-ADCs behind the THA stages, so their maximum input bandwidth need only be 3.2 GHz to produce the full 6.4 GHz interleaved bandwidth (the Nyquist bandwidth at 12.8 GS/s). The comparator itself needs to clock at a maximum rate of 6.4 GS/s. DC calibration of the sub-ADCs to account for process variation, temperature and radiation effects is accomplished with a replica circuit as shown in Fig. 10. The replica ADC is an exact copy of the two interleaved sub-ADCs, except that it remains in a quiescent state with no input signal applied and with the comparators un-clocked. Its THA switches are continuously on to pass DC, sampling only a copy of the external common mode of the live THA with no input signal applied. The digital control signals that set the R2R DACs for DC-related settings in the ADC (reference voltages, bias currents, and common modes) are connected in parallel to all three sub-ADCs (the two "live" interleaved ones and the replica). Chip layout and floor-planning are done to minimize on-chip variation (OCV) between the three sub-ADCs. In this configuration the replica ADC will have very similar DC conditions to the live ADCs (as they have the same settings and are the same circuit, excluding OCV), but because no signals are superimposed on the replica ADC's nodes, the DC values can be interrogated directly by low-speed sensing circuitry.
Software running on an external microcontroller co-located on the PCB senses the replica DC conditions in closed loop with a low-speed, high-resolution ADC, and then adjusts the R2R settings to track the desired design biases (predetermined from manual tuning while observing the spectrometer output). Beyond the simplicity of sensing, the use of a replica eliminates the need for DC sensing paths within the "live" ADCs, which would add parasitic capacitance and loading and reduce the overall bandwidth. Fig. 11 shows the measured output of the monitor DACs, which provide an analog representation of each of the two interleaved sub-ADC outputs, in response to a 3 MHz full-scale input. Fig. 11(a) shows the waveforms when the R2Rs are in their neutral position, while Fig. 11(b) shows the waveforms when the closed-loop calibration is engaged for the track-and-hold stage common-mode bias, preamplifier current bias, and reference settings. As seen in the figure, not only is the distortion reduced, but the gain mismatch and offset mismatch between the two interleaved channels are also minimized.
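The closed-loop tracking performed by the external software can be sketched as a simple code search. The sketch below is a hypothetical model and not the flight software: `sense` stands in for a reading of one replica node through the low-speed ADC, the node voltage is assumed to rise monotonically with the 8-bit R2R code, and the loop steps the code one LSB at a time until the error to the target bias stops shrinking. The linear replica models and all voltages are invented for illustration.

```python
def calibrate_code(sense, target_v, start_code=128, max_iterations=256):
    """Step an 8-bit R2R code until the sensed replica voltage is closest to target_v.

    Assumes sense(code) is monotonic in code (hypothetical replica behavior).
    """
    code = start_code
    for _ in range(max_iterations):
        error = target_v - sense(code)
        step = 1 if error > 0 else -1
        candidate = min(255, max(0, code + step))
        if abs(target_v - sense(candidate)) >= abs(error):
            break                      # no further improvement: keep the current code
        code = candidate
    return code


if __name__ == "__main__":
    # Invented replica models: nominal condition and a drifted condition
    # (for example after a temperature change) with a shifted offset.
    nominal = lambda code: 0.10 + 0.003 * code
    drifted = lambda code: 0.13 + 0.003 * code
    target = 0.55                      # desired bias in volts (illustrative)
    print("nominal code:", calibrate_code(nominal, target))
    print("code after drift:", calibrate_code(drifted, target))
```

Because the same digital settings drive the live sub-ADCs, the code found on the replica is applied to all three copies, with on-chip variation as the remaining error.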

VII. IMPLEMENTATION FOR SPECTROMETER PROCESSOR CLOCK MANAGEMENT
The clock system of the spectrometer processor has two elements: the clock generation block, which contains an internal PLL and divider to generate the high-speed clocks for the processor, and the clock management system (CMS), which is responsible for maintaining setup and hold timing closure in the data pipeline across process variation, changes in temperature and radiation effects.
A detailed block diagram of the clock generation system is shown in Fig. 12, where a clock mux is used to select the output clock frequency which drives the ADC and DSP portions of the spectrometer, setting its total processing bandwidth. The PLL is implemented as a standard type-2 architecture with a phase-frequency detector, charge pump and off-chip loop filter. The VCO is implemented as a cross-coupled LC oscillator similar to the one in [30]. The loop filter DC control voltage (Vctrl) is fed back to a microcontroller co-located on the PCB to monitor locking conditions. Vctrl will be at VDD or VSS when unlocked and somewhere between them when the PLL is locked onto the reference input. The measured frequency range over which the PLL remains locked to the reference frequency input is shown in Fig. 13 as a function of the measured control voltage. The lock range provides over 1 GHz of coverage around the designed nominal frequency of 6.0 GHz (corresponding to 12 GS/s), although the spectrometer processor was found to function up to 6.4 GHz, providing 12.8 GS/s operation. As each sub-ADC clocks on opposite clock edges of the same 6.4 GHz clock, the sub-ADC sample and clock frequency have a one-to-one relationship.
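As a small worked example of the relationships in this paragraph, the sketch below checks a sensed Vctrl against the supply rails and converts a clock frequency into the interleaved sample rate and Nyquist bandwidth. The rail values and the 50 mV margin are assumptions for illustration, not values taken from the design.

```python
def pll_locked(vctrl, vdd=1.0, vss=0.0, margin=0.05):
    """Treat the PLL as locked when Vctrl sits away from both rails.
    vdd, vss and margin are assumed values, in volts."""
    return (vss + margin) < vctrl < (vdd - margin)


def sample_rate_gsps(clock_ghz, interleave=2):
    """Two sub-ADCs sampling on opposite edges of one clock yield
    two samples per clock period."""
    return clock_ghz * interleave


def nyquist_bandwidth_ghz(clock_ghz, interleave=2):
    """Processing bandwidth is half the aggregate sample rate."""
    return sample_rate_gsps(clock_ghz, interleave) / 2.0


for f_clk in (6.0, 6.4):
    print(f_clk, sample_rate_gsps(f_clk), nyquist_bandwidth_ghz(f_clk))
```

With two-way interleaving the Nyquist bandwidth in GHz equals the clock frequency in GHz, which is the one-to-one relationship noted above.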
As the high-speed clocks within the spectrometer processor are not accessible from outside the chip (and routing them outside would incur a significant power penalty and risk clock contamination), we instead created a stand-alone chip to evaluate the clock generator's output spectrum and phase noise characteristics. The stand-alone die along with its output spectrum are shown in Fig. 14.
No spurs or spectral artifacts are visible in the wideband output spectrum. Measured phase noise of the clock generation PLL is shown in Fig. 15 and shows a performance of −110 dBc/Hz @ 1 MHz, with partial on-board coupling from the 125 kHz SPI clock and its 2nd harmonic at 250 kHz.
The overall block diagram of the clock management system (CMS) is shown in Fig. 16 along with the key waveforms involved. The CMS receives the input clock from the clock generator system and then internally generates clock phases at both the input and divided frequencies that are required by the data pipeline. At each clock output a phase shifter is added so that the relative setup and hold time between major blocks can be adjusted to add extra margin, not only over process corners but also to compensate for changes in timing due to accumulated total ionizing dose and changes in operating temperature. Also shown in the block diagram, several phase detectors are positioned between key clock branches to provide feedback that allows software to make the correct adjustments in a closed-loop fashion. Clock points "A" and "B" provide the sampling clock to the two interleaved ADC track-and-hold (TH) stages, and feedback from phase detector PD1 ensures that they are adjusted to provide true differential sampling. Clock points "C" and "D" drive the comparator arrays, which are clocked 90 degrees behind the track-and-hold phases of each ADC. The TH, comparator and demux stages do not require explicit phase detection, as errors in these timing values are extremely obvious from the output data stream of the spectrometer. After the demux operation, the data pipeline carries a retiming stage to bring all four parallel ADC data streams onto the same clock edge for capture by the downstream DSP. Phase detectors PD2 and PD3 ensure that the output edges of the retiming blocks for both interleaved sides of the chip (points "G" and "H") are aligned with the clock driving the DSP section.
The phase shifters are implemented as switched-capacitor delay lines using the delay cell shown in Fig. 17, where an added capacitance is switched on or off and loads a differential inverter chain. In the full phase shifter implementation, many of these stages are cascaded together to achieve complete 0-360° coverage at all the possible internal clock settings of the clock generator. Fig. 18 also shows the simulated delays for a chain of 8 delay cells with 2, 4, 6 and 8 cells activated to provide an evaluation of the time resolution achieved. Each phase detector is implemented as a simple XOR gate followed by a DC-extracting low-pass filter implemented off-chip on the carrier PCB.
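To illustrate why an XOR gate followed by a low-pass filter works as a phase detector, the sketch below models two ideal 50% duty-cycle clocks and averages their XOR. For such clocks the averaged output rises linearly from 0 at 0° to 1 at 180°. The delay step size, number of codes and target value in the usage example are hypothetical and are not taken from the chip.

```python
import numpy as np


def xor_pd_output(phase_deg, samples_per_period=3600, periods=4):
    """Mean of XOR between two ideal 50% duty clocks offset by phase_deg.
    The mean stands in for the DC value left after low-pass filtering."""
    n = samples_per_period * periods
    t = np.arange(n) / samples_per_period          # time in clock periods
    clk_a = (t % 1.0) < 0.5
    clk_b = ((t - phase_deg / 360.0) % 1.0) < 0.5
    return float(np.mean(clk_a ^ clk_b))


def best_delay_code(target, step_deg, n_codes, initial_phase_deg=0.0):
    """Sweep a delay-line code and return the one whose detector output
    is closest to the target (a simple software search, assumed here)."""
    outputs = np.array([xor_pd_output(initial_phase_deg + c * step_deg)
                        for c in range(n_codes)])
    return int(np.argmin(np.abs(outputs - target)))


# Hypothetical example: 10-degree steps, aiming for quadrature (output 0.5).
print(best_delay_code(target=0.5, step_deg=10.0, n_codes=18,
                      initial_phase_deg=20.0))
```

One consequence of the XOR characteristic is that the output is symmetric about 180°, so a software search of this kind has to stay within a half-period range or use an additional cue to resolve the sign of the phase error.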
One final issue that needs to be addressed within the CMS is the clock dividers themselves. Digital flop-based clock dividers can arrive at two different output phase configurations depending on their fabrication mismatch. To resolve this ambiguity, we employ the clock phase correction circuit depicted in Fig. 18. Here the positive edge of the 0/180° output stage is used to sample one of the 90/270° stages to see if it is inverted or not. If inverted, a pair of transmission-gate multiplexers flip the 90/270° output clock order, so an unambiguous clock phase relationship is maintained at the divider output.
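The phase correction logic can be sketched behaviorally as follows. The waveform model, the sample counts and the function names are assumptions for illustration, and the sketch describes only the decision logic, not the transmission-gate circuit itself.

```python
def square_wave(phase_deg, samples_per_period=8, periods=3):
    """Ideal 50% duty clock delayed by phase_deg, as a list of 0/1 samples."""
    return [1 if ((k / samples_per_period - phase_deg / 360.0) % 1.0) < 0.5 else 0
            for k in range(samples_per_period * periods)]


def correct_divider_phase(clk_0, clk_a, clk_b):
    """clk_a is nominally the 90-degree output and clk_b the 270-degree one.
    Sample clk_a on a rising edge of clk_0: a correctly phased 90-degree
    clock is still low there. If it reads high, swap the two outputs."""
    for k in range(1, len(clk_0)):
        if clk_0[k - 1] == 0 and clk_0[k] == 1:    # rising edge of 0-degree clock
            inverted = clk_a[k] == 1
            return (clk_b, clk_a) if inverted else (clk_a, clk_b)
    raise ValueError("no rising edge found on the 0-degree clock")


clk_0 = square_wave(0)
q_good, q_bad = square_wave(90), square_wave(270)
clk_90, clk_270 = correct_divider_phase(clk_0, q_bad, q_good)
print(clk_90 == q_good, clk_270 == q_bad)
```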

VIII. IMPLEMENTATION FOR SPECTROMETER DIGITAL SIGNAL PROCESSOR
The DSP portion of the spectrometer processor begins with the application of the poly-phase window function and is implemented as shown in Fig. 19 using a Weight, Overlap and Add (WOLA) operation. In this structure, 4N time-domain ADC samples are weighted by a sinc function also of length 4N, and the resulting 4N-sample sequence is divided into 4 individual N-sample sequences that are then superimposed on each other prior to entering the FFT processor used for the PSD calculations. The sinc function itself carries 6 bits of precision, and storage of the multiplied product with the ADC samples is accomplished using single-port SRAMs embedded within the signal processor. Similar to [38], coefficients of the sinc(n) poly-phase window function are generated on-the-fly. The division between sin(πn) and πn has both operands estimated by first-order linearization, and the division itself is implemented using the COordinate Rotation DIgital Computer (CORDIC) method. Employing on-the-fly computation is far more hardware-efficient, and thus area-efficient, than a lookup ROM implementation and is also used for the twiddling multiplication between the parallel FFT and pipelined FFT as shown in Fig. 20.
The hybrid architecture is developed here to optimize the delay-energy product. Each of the 32-way 512-point sub-FFTs adopts a single-path delay-feedback (SDF) topology with a Radix-2^K (K ≤ 5) implementation. The Radix-2^K twiddle factors are contained within each butterfly unit (BU) similar to [38] and have the same set of recurring numerals as the 32-point pipeline stage, all of which are represented in the Canonical Signed Digit format [41], allowing the twiddling multiplication to require mere shift and add operations instead of full multiplication. The power spectral density is computed from the FFT output bins by computing the (I^2 + Q^2) term of each frequency bin.
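The weight, overlap and add flow followed by the power spectrum calculation can be sketched in NumPy as below. This is a floating-point behavioral model under stated assumptions: the sinc is centered on the 4N-sample block, the 6-bit coefficient precision is modeled as simple rounding to a signed 6-bit grid, and a real-input FFT (`numpy.fft.rfft`) stands in for the hybrid parallel/pipelined FFT. None of the CORDIC or Canonical Signed Digit details are modeled.

```python
import numpy as np


def wola_psd(samples, n_fft, taps=4, coeff_bits=6):
    """Weight 4N samples with a sinc window, fold into N samples, then
    return the power (I^2 + Q^2) of each FFT bin."""
    samples = np.asarray(samples, dtype=float)
    length = taps * n_fft
    if samples.size != length:
        raise ValueError("expected taps * n_fft samples")

    # Sinc window spanning the whole block; np.sinc(x) = sin(pi x) / (pi x).
    n = np.arange(length)
    weights = np.sinc((n - (length - 1) / 2.0) / n_fft)

    # Assumed model of limited coefficient precision (signed, coeff_bits wide).
    scale = 2 ** (coeff_bits - 1) - 1
    weights = np.round(weights * scale) / scale

    # Weight, split into `taps` segments of N samples, and superimpose them.
    folded = (samples * weights).reshape(taps, n_fft).sum(axis=0)

    spectrum = np.fft.rfft(folded)
    return spectrum.real ** 2 + spectrum.imag ** 2


# Usage: a tone centered on bin 10 of a 64-point transform.
n_fft, k0 = 64, 10
t = np.arange(4 * n_fft)
psd = wola_psd(np.cos(2 * np.pi * k0 * t / n_fft), n_fft)
print(int(np.argmax(psd)))
```

Folding four weighted segments before the FFT gives each channel a sharper passband than a plain N-point window at the cost of only one extra multiply per input sample, which is the usual motivation for this structure.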
The final stage of the processor, which accumulates the power spectral density to increase the SNR of the "spectral shape", is implemented as two ping-pong single-port SRAM banks that take turns writing and reading back the accumulated results. Flow control is achieved by tracking the LSB of the address counter, which counts the number of accumulation cycles. The transformation from base-2 order back to natural order at the output of the PSD processor is accomplished by performing simple bit manipulations as results are read out of the processor chip. Fig. 21 shows the SRAM structure of the PSD accumulator and the key flow control signals. Note that integration needs to be paused during the readout process.
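A behavioral sketch of the ping-pong accumulation and output reordering is given below. It assumes that "base-2 order" refers to the bit-reversed index order typical of pipelined FFT outputs, and it models each SRAM bank as a Python list. The class and method names are hypothetical.

```python
def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


class PingPongAccumulator:
    """Two banks alternate roles: one is read while the other is written."""

    def __init__(self, n_bins):
        self.banks = [[0] * n_bins, [0] * n_bins]
        self.cycle = 0                      # number of accumulation cycles

    def accumulate(self, psd):
        read_bank = self.banks[self.cycle & 1]        # LSB selects the read bank
        write_bank = self.banks[(self.cycle & 1) ^ 1]
        for i, power in enumerate(psd):
            write_bank[i] = read_bank[i] + power
        self.cycle += 1

    def readout(self, bits):
        """Return the latest totals in natural bin order.
        Integration is assumed to be paused while this runs."""
        latest = self.banks[self.cycle & 1]
        return [latest[bit_reverse(i, bits)] for i in range(len(latest))]


natural = [10, 11, 12, 13, 14, 15, 16, 17]
scrambled = [natural[bit_reverse(j, 3)] for j in range(8)]
acc = PingPongAccumulator(8)
for _ in range(3):
    acc.accumulate(scrambled)
print(acc.readout(3))
```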

IX. FULL SPECTROMETER ASSEMBLY AND DEMONSTRATION
The die photograph of the full spectrometer processor is shown in Fig. 22 with key blocks identified in the photo. The 4.0 mm × 3.8 mm chip is bump-bonded onto a carrier PCB that contains several support components, including a microcontroller that runs the calibration software and provides the high-resolution low-speed ADCs used to read the main-ADC bias conditions and the phase detector outputs from the CMS. The PCB additionally provides bridge devices from the SPI and readout interfaces into USB 2.0 standard interfaces. A photo of the PCB assembly and X-ray images of the bump bonding are shown in Fig. 23.
To demonstrate the full operation of the spectrometer processor, we use the emulated spectrometer setup shown in Fig. 24, a common approach to spectrometer processor characterization [42]. In this setup we first amplify a white noise source (noise diode) and power-split it into two branches. In the first branch we simply pass the white noise unmodified, which emulates the broadband thermal noise at the output of a millimeter or sub-millimeter wave emission spectrometer receiver. In the second branch we filter the white noise with a very narrowband filter (5.25 to 5.26 GHz in this test) and apply a small amount of gain, in this case 1 dB. This narrowband path with slight amplification emulates, in the laboratory setting, the added power of the emission signal from an observed molecule. Finally, we combine the outputs of the two branches to produce a composite signal with broadband thermal noise that covers the spectrometer processor's input bandwidth and contains a small added noise power within one narrow frequency band to emulate an emission signal.
A photograph of the actual setup is shown in Fig. 25 with key components and their values identified. The setup uses an extremely high-Q 5.25 to 5.26 GHz bandpass filter to provide the emission branch and uses a combination of a Mini-Circuits ZX-60 amplifier and cascaded attenuators to achieve a calibrated 1.0 dB gain in the emission branch of the setup.
The spectrometer processor is set to operate at the nominal rate of 12.0 GS/s and integrates the emulated spectrum for 10 seconds. Then the emission branch is disconnected and terminated for a second capture. The results of these two captures are overlaid in Fig. 26(a), where the emission feature in the spectrum is clearly visible only when the emission branch is enabled and occurs in the expected 7174th bin (5.255 GHz / 6.0 GHz × 8192 bins ≈ 7174) based on the 5.255 GHz center frequency of the high-Q bandpass filter.
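The bin arithmetic quoted above can be reproduced with a short helper. The function name is hypothetical, and truncating to an integer bin index is an assumption about the indexing convention, chosen because it matches the 7174 value given in the text.

```python
def frequency_to_bin(freq_ghz, sample_rate_gsps=12.0, n_channels=8192):
    """Map an input frequency to a spectral channel index, assuming the
    channels evenly divide the Nyquist band from 0 to sample_rate / 2."""
    nyquist_ghz = sample_rate_gsps / 2.0
    return int(freq_ghz / nyquist_ghz * n_channels)


def channel_width_khz(sample_rate_gsps=12.0, n_channels=8192):
    """Width of one spectral channel in kHz."""
    return (sample_rate_gsps / 2.0) * 1.0e6 / n_channels


print(frequency_to_bin(5.255))                           # filter center
print(frequency_to_bin(5.25), frequency_to_bin(5.26))    # filter edges
print(channel_width_khz())
```

At 12 GS/s each channel is roughly 732 kHz wide, so the 10 MHz wide filter passband spans about fourteen channels rather than a single bin.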
The ratio of the two traces can also be taken as shown in Fig. 26(b), which removes the background standing wave features that arise from the cabling and the limited impedance matching of the PCB traces. This ratio represents the actual output that would be produced in a true emission measurement where "turning off the emission" is accomplished by shuttering the antenna using a method called Dicke switching [43].
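The reason a ratio removes the standing-wave background is that the ripple multiplies both captures equally. The sketch below shows this with synthetic data; the ripple shape, the line position and the 25% line strength are made-up values for illustration and are not measurements.

```python
import numpy as np


def on_off_ratio(on_spectrum, off_spectrum):
    """Bin-by-bin ratio of the emission-on capture to the emission-off capture."""
    return np.asarray(on_spectrum, dtype=float) / np.asarray(off_spectrum, dtype=float)


bins = np.arange(256)
ripple = 1.0 + 0.2 * np.sin(2 * np.pi * bins / 64)   # synthetic standing wave
line = np.zeros(256)
line[100:104] = 0.25                                  # synthetic emission feature

off = ripple                  # emission branch terminated
on = ripple * (1.0 + line)    # emission branch connected

ratio = on_off_ratio(on, off)
print(ratio[98:106])
```

In this idealized model the ratio is exactly 1 away from the feature. In a real capture, residual noise remains in both traces and sets the floor of the ratio.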
By configuring the spectrometer processor to accumulate indefinitely, and slowly increasing the reference clock provided to the clock system, we can map out the DC power consumption of the spectrometer processor as a function of the clock frequency (one-to-one with the input bandwidth). As the ADC and DSP portions of the chip are powered from separate 1.0 V supplies, their power consumption can be evaluated independently, as shown in Fig. 27.
Finally, to demonstrate the spectrometer processor operating with real-world emission signals, we use the setup shown in Fig. 28, where the processor is coupled with the IF of a 500-600 GHz receiver from the "Water-Hunting Advanced Terahertz Spectrometer on an Ultra-Small Platform" (WHATSUP) space instrument [10]. Unlike most sub-millimeter spectrometers, which use a flip mirror for receiver calibration, the WHATSUP instrument instead uses a microelectromechanical system (MEMS) based waveguide switch [44] to periodically switch the receiver input to a waveguide load to calibrate absolute gain.
The receiver is pointed into a gas vessel or "gas cell" which has a transparent Teflon window at each end of the chamber, allowing sub-millimeter radiation to enter and exit. At one end we place the receiver, while at the other end we use optics to direct the view through the window onto a liquid nitrogen (LN2)-soaked absorber to provide contrast. The gas cell is evacuated, and a small amount of H2O is introduced with a partial pressure of 10 mTorr. As the H2O is at room temperature (293 K) and the LN2 boils at 77 K, a 216 K contrast exists, allowing the H2O emission signals to be detected.
FIGURE 28. Spectrometer processor coupled to a 500-600 GHz instrument [10] for emission sensing, where H2O is introduced into a gas vessel or "gas cell" at a low pressure of 10 mTorr and a liquid nitrogen (LN2) soaked absorber is placed in the background to provide contrast for the emission signal.
It should be noted that, while this approach does allow spectroscopic emission signal measurements to be performed in the laboratory, the attainable signal-to-noise ratios are limited to much lower values than in an actual spaceborne measurement. This is because the gas vessel (20 cm long in our setup) is much shorter than the path length of a view through the atmosphere on orbit or during a flyby (100s to 1000s of km). Also, the sub-millimeter wave beam must pass through the Teflon window material, which has some signal loss (approximately 1 dB per end) that would not exist in a spaceborne configuration where the environment is already at near-vacuum pressures. Fig. 29 plots the captured H2O emission feature at 556.96 GHz [8] from the spectrometer processor coupled to the WHATSUP 500-600 GHz receiver. Additionally, the theoretical emission level (in K) given the parameters of the test (pressure = 10 mTorr, path length = 20 cm, Tcontrast = 216 K, Tsys = 5500 K) is overlaid for comparison.
As seen in the figure, the captured H2O feature agrees well with the theoretical emission in terms of contrast level between emission and noise floor, and in terms of absolute rest frequency. The emission bandwidth also matches the expected values quite closely.
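For context on the noise floor against which such a feature is detected, the standard ideal radiometer equation, ΔT = Tsys / sqrt(Δν · τ), gives the per-channel noise for a given system temperature, channel bandwidth and integration time. The sketch below applies it using the stated Tsys of 5500 K. The 12 GS/s rate and the 10 s integration time used in the example are assumptions, since the integration time of the gas cell measurement is not stated here, and the extra penalty from load switching and window loss is not modeled.

```python
import math


def channel_width_hz(sample_rate_sps=12.0e9, n_channels=8192):
    """Width of one spectral channel, assuming the channels evenly divide
    the Nyquist band."""
    return (sample_rate_sps / 2.0) / n_channels


def radiometer_noise_k(t_sys_k, bandwidth_hz, integration_s):
    """Ideal radiometer equation: delta_T = T_sys / sqrt(bandwidth * time)."""
    return t_sys_k / math.sqrt(bandwidth_hz * integration_s)


# Hypothetical example: 10 s of integration per channel.
noise_k = radiometer_noise_k(5500.0, channel_width_hz(), 10.0)
print(round(noise_k, 2))
```

Because the noise falls only as the square root of integration time, halving the noise floor takes four times as long, which is why long accumulation in the PSD stage matters for weak lines.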

X. CONCLUSION
In this paper we first discuss the importance of and unique requirements for emission rotational spectroscopy systems across spaceborne Earth/planetary science and astronomy investigations. We then discuss the driving forces in these science arenas that demand more compact and low-power spectrometer systems for future missions and that motivate the presented work. We explore key considerations in implementing spectrometer processor systems, including the important roles of bandwidth, quantization efficiency and the spectral leakage that arises from the selection of windowing functions. We then present a CMOS system-on-chip spectrometer processor implemented in 28 nm CMOS technology. Circuit and system design of the ADC, clock and DSP systems are described in detail, along with performance measurements for each sub-block and simulations of performance parameters not accessible from the implemented chip. An end-to-end demonstration of the spectrometer processor is performed using a laboratory emission experiment, showing an excellent match to theoretically expected results. The 4-bit resolution, 8192-point spectrometer processor occupies 4.0 mm × 3.8 mm of chip area and consumes 3385 mW (155 mW ADC + 3230 mW DSP) when operated at the nominal rate of 12 GS/s. The added support components co-located on the PCB (microcontrollers, regulators, ...) consume an additional 348 mW for a total module power of 3733 mW.