The Φ-Sat-1 Mission: The First On-Board Deep Neural Network Demonstrator for Satellite Earth Observation

Artificial intelligence (AI) is paving the way for a new era of algorithms focusing directly on the information contained in the data, autonomously extracting relevant features for a given application. While the initial paradigm was to have these applications run on a server-hosted processor, recent advances in microelectronics provide hardware accelerators with an efficient ratio between computation and energy consumption, enabling the implementation of AI algorithms "at the edge." In this way only the meaningful and useful data are transmitted to the end-user, minimizing the required data bandwidth and reducing the latency with respect to the cloud computing model. In recent years, the European Space Agency (ESA) has been promoting the development of disruptive innovative technologies on-board earth observation (EO) missions. In this field, the most advanced experiment to date is Φ-Sat-1, which has demonstrated the potential of AI as a reliable and accurate tool for cloud detection on-board a hyperspectral imaging mission. The activities involved included demonstrating the robustness of the Intel Movidius Myriad 2 hardware accelerator against ionizing radiation; developing the CloudScout segmentation neural network (NN), run on the Myriad 2, to identify, classify, and eventually discard on-board the cloudy images; and assessing the innovative HyperScout-2 hyperspectral sensor. This mission represents the first successful attempt to run a deep convolutional NN (CNN) performing inference directly on a dedicated accelerator on-board a satellite, opening the way for a new era of discovery and commercial applications driven by the deployment of on-board AI.

The growing number of commercial actors that complement the traditional public sector agencies in the space arena shows how attractive and rich with new opportunities this sector is [1]-[3]. However, the adoption of new technology on-board satellites is still strongly limited by the requirements of reliability and availability, which traditionally have imposed the use of components with flight heritage and extensive qualification. This is why on-board processing on space-borne data systems still relies on old components that do not provide enough computational power to run most of the innovative state-of-the-art algorithms [4], [5].
Artificial intelligence (AI) shows the ability to solve very complex problems exploiting only the intrinsic information contained within data, reducing the preprocessing and postprocessing that is required by standard on-board techniques. In this sense, AI can provide the needed boost in actual performance that will allow new applications to be realized [6]. AI algorithms, and especially those related to image processing [such as convolutional neural networks (CNNs) and deep neural networks (DNNs)], are not suited for the typical class of processors used on-satellite due to their limited computational power and memory resources [7]-[9]. Furthermore, flight hardware has to tolerate failures and faults caused by ionizing radiation in orbit [10]. The radiation exposure is dependent on the orbital altitude, so that the total ionizing dose (TID) for low earth orbit (LEO) missions is much less than that for geostationary earth orbit (GEO) and interplanetary missions [11]. The rapid and continuous advancement in semiconductor technology is resulting in commercial processors that are increasingly compute-powerful. The performance of these commercial off-the-shelf (COTS) devices, and in particular their efficiency and ability to implement state-of-the-art AI algorithms with low power consumption at the edge, leads to an increasingly large gap between their capabilities and those of traditional, reliable space flight hardware [12]. In recent years, as part of its initiative to promote the development of radically innovative technologies such as AI capabilities on-board earth observation (EO) missions, the European Space Agency (ESA) developed the first Φ-Sat mission [13], [14], leveraging the development by the Agency's Technology and Quality Department of a new processor board for LEO missions, named eyes of things (EoT) [15].
The EoT board features the Intel Movidius Myriad 2 vision processing unit (VPU), capable of performing fast inference while keeping the power consumption well below 2 W [16]. Among the activities supported by ESA to assess and trial the Myriad 2, one of the most important is the successful radiation characterization of the device at several European test facilities, including the European Organization for Nuclear Research (CERN) [17]. This provided the confidence to progress to an in-flight demonstration of AI applied on-satellite on Φ-Sat-1. This article presents the design and the first in-flight results of the Φ-Sat-1 mission.
The aim of the mission is to demonstrate in-flight the capability, robustness, and accuracy of AI acceleration using the Myriad 2 device, and the suitability of AI algorithms for handling raw L0 data or performing L0 to L1 data processing directly on-board. Specifically, the goal is to demonstrate accurate band coregistration and precise cloud detection to increase the efficiency of the downlink in terms of the ratio between useful and nonuseful images acquired by an innovative hyperspectral camera called HyperScout-2 [18].
At the time of the Φ-Sat-1 mission, insufficient data volume from the HyperScout product line was available to train the CNN to classify each pixel as cloudy or not cloudy. This is a common problem with space imagers, where bespoke cameras are developed for each new mission. The approach was to derive synthesized images starting from the Sentinel-2 dataset [19], after proper processing to emulate the new camera sensor characteristics, and to use those for training and initial test. This mission represents the first use of deep CNN inference on a dedicated COTS VPU processor on-board a satellite, with the aim to autonomously identify hyperspectral images from HyperScout-2 that contain a percentage of cloudy pixels less than a given threshold.
Φ-Sat-1, the first experiment of the Φ-Sat mission series, is part of the ESA EO Directorate initiative to promote the space-oriented development and adaptation of radically innovative technologies such as AI [20]. The Φ-Sat mission series objective is to address brand new mission concepts, fostering novel architectures or sensing modalities that enable user-driven science and/or applications by means of on-board processing. Specifically, the primary objective of Φ-Sat-1 was to demonstrate in flight the ability of a DNN inference engine running on a dedicated COTS AI accelerator to reliably detect clouds in acquired hyperspectral images, allowing the removal of the cloudy pixels and thereby reducing the amount of data to be downloaded while increasing the information content of these data. The performance of an on-board inference engine based on a machine learning (ML) algorithm for cloud detection was validated in flight. This is the first time such an experiment has been conducted in space.

II. ARTIFICIAL INTELLIGENCE ON-BOARD
Recent advances in space avionics have led to more decentralized on-board compute. COTS edge processors are ideally positioned to deliver low-latency and distributed edge compute at source for value-added services from orbit [21]. Furthermore, rapid mission design cycles are possible using COTS devices that incorporate suitable mitigation strategies. Mission lifetime extensions, as well as improvement by means of delta training, are also immediately feasible for AI solutions via dynamic reconfigurability of the neural networks (NNs). This paradigm opens new prospects and opportunities enabled by robust and accurate on-board processing and L1 to L2 product generation, with respect to the classical approach of downloading mainly raw data to the ground for subsequent processing. DNNs have demonstrated remarkable results in several space applications, such as scene classification [22], object recognition [23], pose estimation [24], change detection [25], and others [26]. This capability of performing complex tasks with credible and robust precision has pushed researchers to investigate the possibility of moving DNN applications on board satellites [2].
Moving AI to the edge can have a twofold benefit: 1) it enables new remote sensing techniques and 2) it enables new types of applications, such as those requiring minimal-latency direct downlink to the final user, or those optimizing the downlink bandwidth by transmitting to ground only useful data or only meta-information [27], [28]. In particular, the deployment of DNNs on-board can help to reduce mission/application bandwidth requirements by filtering out nonuseful data [2], [29]. This ability becomes particularly relevant as high revisit times and limited budgets push the increased adoption of small and nano satellites, and CubeSats, which feature extremely limited downlink data rates [3]. It is worth noting that the robustness and reliability of the processing are of paramount importance, since a false detection can lead to an ultimate and definitive loss of data. In order to port DNNs on board spacecraft, Kothari et al. [17] and Furano et al. [1] propose to perform inference using COTS hardware accelerators, which feature improved energy efficiency, low cost, and low mass.
Furthermore, COTS accelerators are capable of exploiting the regular structure of NNs that, regardless of the specific layers, share the same structure and require the repeated execution of the same type of building block operations, such as multiply and accumulate (MAC). Thanks to this feature, the use of COTS devices has strong potential to enable the use of the same hardware for different applications, with advantages in terms of reduced mission set-up times, greater market access, and reduced costs [1], [17], [23].

A. Hyperspectral Instrument and CloudScout Processing Chain
HyperScout 2, shown in Fig. 1, is a miniaturized spectral camera developed by cosine remote sensing, with a hyperspectral channel in the visible to near-infrared (VNIR) offering 45 bands from 400 to 1000 nm and a multispectral channel in the thermal infrared (TIR) range [12] with three bands from 8 to 14 μm, as detailed in Table I. HyperScout 2 is the second generation of the HyperScout product line, with the first generation launched and demonstrated in orbit in 2018 [5]. The instrument, detailed in Fig. 2, is based on two sets of 2-D sensors used in push-broom mode, exploiting the orbital motion of the satellite for the acquisition and reconstruction of the entire sensor spectral coverage for each ground pixel.
HyperScout 2 includes a number of subsystems, including the telescope, the VNIR and TIR focal plane arrays, the instrument control unit (ICU), and the back-end electronics. For data processing, the system is equipped with a central processing unit (CPU)-based on-board data handler and the EoT, as described in Section III-B. HyperScout has been equipped with a dedicated interface board for control and latch-up protection of the EoT.
The telescope is an athermal system based on a monolithic structure. The VNIR focal plane assembly (FPA) is based on a CMOS sensor and a hyperspectral filtering element used to separate the different wavelengths. The TIR FPA is based on a microbolometer and a multispectral filtering element. The ICU provides the contact point for the HyperScout, allowing in-flight debugging of the basic electronic element (BEE)/on-board data handler (OBDH) subsystem. Housekeeping data are logged in an idle state. Because the ICU is located on an independent MCU, each component can be completely powered down from the ICU. The BEE is the electrical interface to the spacecraft and is latch-up protected. It distributes power, clocks, telemetry, and commands between the units, controls the detector, and serves as the data and control interface, providing clock timings, frame rate control, exposure, and gain control. The BEE then merges the acquired data with the platform ancillary information, creating L0 payload image data. The latter are reconstructed unprocessed information at full space-time resolution coming from the imager payload, with all available supplemental information to be used in subsequent processing appended. These data are then stored in the payload mass memory unit (MMU).
The OBDH hardware serves multiple purposes, the most distinct being the platform for both the acquisition and the processing modes. During the acquisition mode, data will be transferred from the BEE into the memory of the OBDH, which is then written to the MMU via SATA. During processing mode, the data are retrieved from the MMU and processed in memory on the OBDH. Both the acquired L0 image data and processed data are stored on board the payload's MMUs.
Standard methods of constructing spectral cubes from push-broom sensors rely on a multitude of satellite platform-dependent instrumentation. These on-board instruments include global positioning system (GPS), star trackers, and other attitude determination systems. This method inevitably makes the critical interface between platform and instrument very complex to manage in terms of synchronization, and additionally imposes requirements on the platform that need to be managed, verified, and validated. Furthermore, the available instrumentation makes the respective spectral cubing algorithm platform-dependent, which could be a disadvantage for push-broom satellites constituting a constellation on multiple platforms where it is desirable for all the satellites to behave in a similar manner.
Cosine's HyperScout is intended to be flown by various customers on different platforms and could indeed be used to create a constellation in the near future. Consequently, a spectral cubing algorithm completely based on machine vision techniques was developed. This algorithm, which is currently in the process of being patented, can construct spectral cubes without the use of attitude determination and control system (ADCS) data, allowing the HyperScout instruments, in principle, to operate in a plethora of environments, e.g., on various space-based platforms, on-board airplanes, or as part of an unmanned aerial vehicle (UAV), while all using the same code base.

B. EoT Board
As introduced in Section I, the AI processing engine on Φ-Sat-1 is a custom build of a Myriad 2-based EoT development board from Ubotica Technologies. Initially developed as part of the "Eyes of Things" H2020 project [15], the EoT board is a low-power vision-enabled Internet of Things (IoT) edge processing platform. All EoT processing and control is performed by the Intel Movidius Myriad 2 VPU, positioning the board ideally as a readily available Myriad 2 hardware platform for the inference task.
The Myriad 2 VPU is a system on chip (SoC) with integrated dynamic random access memory (DRAM) that has been designed from the ground up considering high-performance edge compute for vision applications. It is a heterogeneous 14-core SoC, with two reduced instruction set computer (RISC) LEON processors managing functionality and controlling the 12 integrated vector processors. These streaming hybrid architecture vector engines (SHAVEs) are 128-bit very long instruction word (VLIW) processors that have concurrent access to a 2 MB multiported random access memory (RAM), with 400 GB/s of sustained internal memory bandwidth supported between the SHAVEs and RAM. The SHAVE processors contain wide and deep register files controlling multiple functional units, including extensive single instruction multiple data (SIMD) capability for high parallelism and throughput at both the functional unit and processor level. Firmware on the Myriad 2 utilizes the 12 SHAVEs to efficiently perform parallelized NN inference, including memory management and direct memory access (DMA) for fast network weight loading to the multiported memory, providing exceptional and highly sustainable NN inference performance.
Key to its selection for Φ-Sat-1 was the Myriad 2's compute efficiency. With a core voltage of 0.9 V (which guarantees a good level of robustness against the most dangerous destructive radiation effects), it can operate at 600 MHz while nominally consuming only 1 W. Twenty independently controllable internal power islands help to minimize power dissipation. Further efficiencies for image processing operations are achieved via the computer vision (CV) and image signal processing (ISP) hardware acceleration blocks. Together, these features provide the Myriad 2 with in excess of 1 TFLOPS of compute.
In order to support its deployment on Φ-Sat-1, and in satellite applications in general, the Myriad 2 has undergone radiation characterization via a range of test campaigns in European test facilities. During these campaigns the device was assessed for susceptibility to latch-up, and to determine radiation cross sections across a range of energies [17]. The Myriad 2 has demonstrated no single event latch-up (SEL) effects at energies up to 8.8 MeV·cm²/mg, with further results for recent tests at higher energies pending analysis. The in-package DRAM of the device was shown during single event upset (SEU) campaigns to have high immunity to bit upsets (per-device cross section of 2 × 10⁻¹⁴ at the above energy), indicating its suitability for code and NN model storage, and providing a level of inherent protection against functional upsets. TID testing was conducted up to 49 krad, with the device found to have no sensitivity to cumulative Co-60 radiation effects up to this dose.
The EoT board was designed to expose the wide range of Myriad 2 interfaces and peripherals in a compact 76 mm × 68 mm form factor, facilitating broad application development. Universal serial bus (USB) (2.0 and 3.0) is the high-speed control and data interface to the board, while multiple low-speed interfaces [inter-integrated circuit (I2C), serial peripheral interface (SPI), universal asynchronous receiver-transmitter (UART)] for peripheral attachment are exposed, alongside serial and parallel image sensor interfaces. The EoT board, being Myriad-centric, was selected as a suitable AI accelerator for the Φ-Sat-1 mission as it provides an ideal base platform from which to build a complete inference engine: a payload-compatible, host-controlled, reconfigurable, low-power, low-heat-generation, high-speed-interfaced device, in a form factor that integrates into the available space atop the sensor payload. However, although functionally capable for Φ-Sat-1, the board was not designed with the harsh conditions of launch and space operation in mind, and its design consists entirely of COTS components. Consequently, a thorough analysis of the board design was conducted wherein nonessential functionalities were identified, thermal and vibration factors were considered, and a board-wide component risk analysis was conducted. Out of this, a robust version of the board consisting of a custom-assembled COTS EoT mounted on a protection PCB (see Fig. 4) was produced for Φ-Sat-1.
All active components associated with EoT board functionalities that were not required for Φ-Sat-1 were excluded where possible from the board assembly, along with debug indicators and unused interface headers. In final preparation for integration, the board was conformally coated to protect against tin whiskers. The flight configuration of the Φ-Sat-1 EoT board build is shown in Fig. 4.
Inference functionality is enabled on the EoT board via Ubotica and Intel Movidius firmware [30] executing on the Myriad 2, with host-side control of inference achieved via inference libraries that were custom built to target the payload on-board computer (OBC). The libraries expose a compact API for downloading NN models to the EoT, and subsequently for submitting input tensors to the inference engine and receiving the corresponding inference results. Asynchronous inference is supported via input and output tensor queues in the firmware. A built-in self-test (BIST), with board-level and device-level self-test coverage, is executable on-demand from the OBC, enabling health monitoring of internal memories and processors, device interfaces, and board peripherals (see Section V-D), and detection of SEU effects. The OBC is also responsible for booting the Myriad 2 via USB (with either BIST or inference boot images).
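The asynchronous submit/receive pattern described above can be emulated on the host side with a pair of queues and a worker thread. The sketch below is purely illustrative: the `inference_worker` function and the stand-in `model` are hypothetical, not the Ubotica inference libraries, whose actual API is not documented here.

```python
import queue
import threading

def inference_worker(model, in_q, out_q):
    """Drain the input tensor queue, run the model, push (tag, result) pairs."""
    while True:
        item = in_q.get()
        if item is None:               # sentinel: shut the worker down
            break
        tag, tensor = item
        out_q.put((tag, model(tensor)))

# Stand-in "model"; the real device would run the CloudScout NN on the Myriad 2.
model = lambda t: sum(t)
in_q, out_q = queue.Queue(), queue.Queue()
worker = threading.Thread(target=inference_worker, args=(model, in_q, out_q))
worker.start()
for tag, tensor in enumerate([[1, 2], [3, 4]]):
    in_q.put((tag, tensor))            # submit input tensors asynchronously
in_q.put(None)
worker.join()
results = dict(out_q.get() for _ in range(2))
```

Decoupling submission from retrieval in this way lets the host keep feeding tensors while earlier inferences are still in flight, which is the point of the firmware's tensor queues.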

A. Dataset Preparation
Most ESA missions implement either a new sensor or a new sensing technique, implying a general lack of data for the construction of the datasets. Φ-Sat-1, as a highly innovative mission, is no exception: the lack of preflight sensor data implied the need to construct and label synthetic datasets based on proxy off-the-shelf datasets (in this case the Sentinel-2 archive). Of course, the authors appreciate that this is a potential source of inaccuracy in the in-flight inference results.
In order to prepare a representative sample of Sentinel-2 data, the whole archive of Sentinel-2 tiles, available through the Sentinel-Hub services [31], was randomly sampled. Fig. 5 shows the locations of the randomly sampled Sentinel-2 images. For each sample, the 13 bands of Sentinel-2, at 10 m/px resolution, were acquired. To simulate the behavior of the HyperScout-2 imager, the original samples were rebinned from 10 to 70 m/px, which corresponds to the ground resolution of the HyperScout-2 imager. Resampling was done using the default interpolation method of Sentinel-Hub: nearest neighbor. Although nearest neighbor is not the best solution for image resampling, it represents a good tradeoff between dataset management and computational effort. The data was then retrieved and stored in the coordinate reference system (CRS) of the corresponding Sentinel-2 tile, so as to remove the need to re-project data. To prepare the dataset to train the DNN, the associated cloud mask, as produced by the s2cloudless package, was added to each image. s2cloudless [32] is an automated cloud-detection algorithm for Sentinel-2 imagery based on a gradient boosting algorithm. The algorithm is mono-temporal, does not take into account any spatial context, and can be executed at any resolution. The input features are Sentinel-2 Level-1C top-of-atmosphere (TOA) reflectance values of the following ten bands: B01, B02, B04, B05, B08, B8A, B09, B10, B11, and B12; the output of the algorithm is a cloud probability map. Users of the algorithm can convert the cloud probability map to a cloud mask by thresholding it. The masks were produced at 70 m/px resolution, the same resolution used to download the Sentinel-2 data, using a cloud probability threshold of 0.4. The distribution of the cloud ratio coverage of the data generated is shown in Fig. 6.
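The two preparation steps above, nearest-neighbor rebinning and probability-map thresholding, can be sketched in a few lines of numpy. This is an illustrative reimplementation, not the Sentinel-Hub or s2cloudless code; the function names and the toy arrays are our own.

```python
import numpy as np

def rebin_nearest(band: np.ndarray, factor: int) -> np.ndarray:
    """Nearest-neighbor downsampling: keep one source pixel per factor x factor block."""
    return band[::factor, ::factor]

def cloud_mask(prob_map: np.ndarray, threshold: float = 0.4) -> np.ndarray:
    """Binarize an s2cloudless-style cloud probability map at the given threshold."""
    return prob_map >= threshold

# 10 m/px -> 70 m/px corresponds to a factor-7 reduction per axis.
band = np.arange(14 * 14, dtype=np.float32).reshape(14, 14)
small = rebin_nearest(band, 7)                     # (14, 14) -> (2, 2)
probs = np.array([[0.10, 0.50], [0.39, 0.95]])
mask = cloud_mask(probs)                           # [[False, True], [False, True]]
```

Nearest neighbor simply subsamples the grid, which is why it is cheap but not ideal for radiometric fidelity, the tradeoff noted in the text.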
Clearly, although images were randomly selected, the nature of the observed phenomenon within the instrument swath is such that the vast majority of the images were either almost fully covered by clouds or almost cloud-free. This drives the decision on the value of the cloud coverage threshold used to declare an image cloudy and not download it. From Fig. 6 it is clear that choosing a threshold of 70% enables the detection of fully cloudy images.
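The image-level downlink decision reduces to comparing the mask's cloud coverage ratio with the 70% threshold. A minimal sketch (the function name and arrays are illustrative, not mission code):

```python
import numpy as np

def keep_for_downlink(mask: np.ndarray, coverage_threshold: float = 0.70) -> bool:
    """True when the image is worth downloading: cloud coverage below threshold."""
    return float(mask.mean()) < coverage_threshold

mostly_clear = np.zeros((4, 4), dtype=bool)        # 0% cloudy -> keep
mostly_cloudy = np.ones((4, 4), dtype=bool)
mostly_cloudy[0, 0] = False                        # 15/16 = 93.75% cloudy -> discard
```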
HyperScout 2 is able to sense 10 of the 13 bands available on the Sentinel-2 satellites. However, using a high number of bands for inference would require substantial preprocessing directly on-board, which would place significant memory and energy demands on the satellite system. In fact, the data processing chain, illustrated by the first three blocks of Fig. 3, is executed on the HyperScout 2 OBDH before the extracted spectral bands are fed into the EoT board for the inference step. During the first preprocessing, the raw frames are radiometrically corrected for gain and offset and are geometrically corrected to compensate for the distortions created by the instrument's optical train. Next, for the spectral cube construction step, the corrected frames are stacked on one another and aligned, using CV techniques, to form a hyperspectral data cube. The appropriate bands from this cube are then extracted and normalized during the second preprocessing step.
To overcome this problem, we performed principal component analysis (PCA) on the ten bands that HyperScout 2 has in common with Sentinel-2 to select the best three bands to use directly on-board. Using the three most important components from PCA, a sample image that would be submitted to Myriad for inference is shown in Fig. 7 (left), while reconstruction of the true RGB image from the three PCA components is shown in Fig. 8 (left), with the original RGB Sentinel-2 image on the right.
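PCA of a band stack can be sketched with a covariance eigendecomposition, treating each pixel's ten-band spectrum as one sample. This is a generic textbook implementation for illustration, not the authors' analysis script; the array shapes are assumptions.

```python
import numpy as np

def top_pca_components(pixels: np.ndarray, n: int = 3) -> np.ndarray:
    """Project (num_pixels, num_bands) spectra onto their top-n principal components."""
    centered = pixels - pixels.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n]           # indices of the n largest
    return centered @ eigvecs[:, order]

rng = np.random.default_rng(0)
spectra = rng.normal(size=(1000, 10))               # ten bands, as in the shared band set
components = top_pca_components(spectra, n=3)       # shape (1000, 3)
```

Inspecting the loadings in `eigvecs` is what links the principal components back to individual physical bands.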
Nevertheless, in-flight reduction of the ten Sentinel-2 bands into three using PCA, although possible using the CPU on board HyperScout-2, was deemed prohibitive due to the power required. The chosen alternative is to extract from the HyperScout-2 acquired data cube the three most important bands as determined by the s2cloudless model. The feature importance analysis of the s2cloudless model highlighted the three most important Sentinel-2 bands that have highly sensitive corresponding bands on HyperScout-2, the first of these being B01. After the selection of the three most useful Sentinel-2 bands, the HyperScout-2 data was simulated. The Sentinel-2 Level-1C data (digital numbers), Sun zenith angles, per-band solar irradiances, and Sun-Earth distances were used to calculate TOA radiances. For each pixel, the per-band radiance and HyperScout-2 imager per-band noise characteristics were used to calculate (per-band) root-mean-square (rms) values, and simulated Gaussian noise with zero mean and the rms value as standard deviation was added to the radiances. If the radiance value for a given band was above the HyperScout-2 imager saturation threshold, the value was capped. Finally, all the images produced were normalized to the range [0, 1] and stored using 16-bit floating-point precision. This normalization allowed us to fully exploit the input channel of the EoT board, i.e., 16-bit FP per pixel.
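The simulation pipeline just described (digital numbers to TOA radiance, Gaussian noise injection, saturation capping, and [0, 1] normalization in 16-bit floats) can be sketched as follows. All numeric parameters here (quantization divisor, solar irradiance, noise rms, saturation level) are placeholder values for illustration, not the HyperScout-2 calibration figures, and the reflectance-to-radiance formula is the standard textbook one, assumed rather than quoted from the paper.

```python
import numpy as np

def simulate_band(dn, quant=10000.0, e_sun=1913.0, sun_zenith_deg=30.0,
                  d_au=1.0, noise_rms=0.5, sat_level=600.0, seed=0):
    """Convert Level-1C digital numbers to noisy, saturation-capped, normalized radiance."""
    rng = np.random.default_rng(seed)
    reflectance = dn / quant
    # TOA radiance from reflectance, solar irradiance, Sun zenith angle, Sun-Earth distance
    radiance = reflectance * e_sun * np.cos(np.radians(sun_zenith_deg)) / (np.pi * d_au ** 2)
    radiance = radiance + rng.normal(0.0, noise_rms, radiance.shape)   # per-band sensor noise
    radiance = np.minimum(radiance, sat_level)                         # cap at saturation
    norm = (radiance - radiance.min()) / (radiance.max() - radiance.min() + 1e-12)
    return norm.astype(np.float16)                                     # 16-bit FP, as fed to the EoT

dn = np.linspace(0, 12000, 64).reshape(8, 8)
band = simulate_band(dn)
```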

B. CloudScout Segmentation Deep Neural Network
A custom segmentation NN architecture was developed for the Φ-Sat-1 mission in order to achieve high detail and good granularity in the network result. A semantic segmentation network (see Fig. 9) is generally composed of two parts: an encoder and a decoder [33]. The encoder extracts only the most relevant features from the input image and propagates them through the entire network, increasing the level of detail within each feature. At the end of the encoding phase, a nonhuman-readable vector is extracted that captures a compressed version of the input image. This vector is input to the decoder, which, aided by concatenation with the encoder layers of the same dimension, reconstructs only the most valuable information, creating the segmented images.
As already mentioned in Section III, the hardware accelerator used in this mission is the Myriad 2 VPU. Due to limitations on its maximum intra-layer memory, particular attention had to be devoted to the implementation of the convolutional/deconvolutional layers, and to the quantization of the weights to the 16-bit floating-point arithmetic available in the VPU. Furthermore, to avoid saturating the memory, an input size reduction was performed. In contrast to the binary classification model described in [34], the segmentation network input tensor size is 192 × 192 × 3. This input reduction allows increasing the number of deconvolutional layers within the network model, although the output size has been halved to allow better handling and postprocessing of the output data by the on-board processor. The CloudScout network, shown in Fig. 10, was inspired by U-Net [25], a network used to segment different scenes, with particular attention to false negative values [35]. Moreover, this network owes its success to the low number of training images required compared to the mean intersection over union (mIoU) obtained. Exploiting the same criteria, the CloudScout network uses only the lowest section of the U-Net.
In particular, the network is composed of convolutional, deconvolutional, and max-pooling layers. The convolutional layers have a kernel size of 3 × 3 and stride 1, while the deconvolutional layers are of two types: 1) doubling the input image size by exploiting a stride of 2 and a kernel of 2 × 2 and 2) increasing the input image size by two pixels per axis, exploiting stride 1 and a kernel of 3 × 3. All layers use ReLU activation functions. The training phase, detailed in Fig. 11, was conducted using the binary cross-entropy loss function, with a learning rate starting from 0.01 and the AdaDelta optimizer. Furthermore, in order to reduce the memory footprint and to avoid excessive memory demand during the deconvolutional phase, the reconstructed images, and consequently the number of channels per layer, were reduced in size.
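The two deconvolutional layer types follow directly from the standard unpadded transposed-convolution size formula, output = (input − 1) × stride + kernel. A quick check of the arithmetic (the helper function is our own, for illustration):

```python
def deconv_out(in_size: int, kernel: int, stride: int) -> int:
    """Unpadded transposed-convolution output size: (in - 1) * stride + kernel."""
    return (in_size - 1) * stride + kernel

doubled = deconv_out(96, kernel=2, stride=2)   # type 1: 96 -> 192 (doubles the size)
grown = deconv_out(96, kernel=3, stride=1)     # type 2: 96 -> 98 (adds two pixels per axis)
```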
The receiver operating characteristic (ROC) analysis in Fig. 12 shows the variation of the performance with respect to the per-pixel confidence score threshold applied to the last layer. This threshold represents the minimum confidence score needed by the network to classify a pixel as cloudy or not cloudy, providing fine control of the final output. Furthermore, it is not applied by the NN but is computed by the on-board processor, and can be changed to adjust the percentage of FP/FN without retraining the network. The red dot represents the best tradeoff for the CloudScout network in terms of pixel-wise accuracy and false positive rate, as shown by the confusion matrix in Table II. This is obtained using a threshold of 0.6 on the output mask of the last layer. The final pixel-wise accuracy is 88.4%. Although this is indicative of the overall quality of the inference, the main parameter that sets the performance and represents the actual index of quality is the false positive (FP) rate. The main reason is that, when using this inference in an operative mission to decide which images are worth downloading to the ground and which can be discarded on-satellite, the false positives, being images actually not cloudy but detected as cloudy, represent the net loss of good data containing useful information. Therefore, the chosen quality index is the percentage of FPs, which for pixel classification is equal to 5.6% (see Table II). Moving from pixel classification back to image classification, with cloudy images defined as images whose percentage of cloudy pixels is higher than 70%, the associated confusion matrix is that of Table III. It is worth noting that the 88.4% accuracy of the segmentation algorithm corresponds to a 95.1% tile accuracy, with only 0.8% of the images classified as false positives within the synthetic dataset.
It is possible to further reduce the number of FPs by increasing via software the threshold of the last layer, at the expense of some percentage reduction in the pixel-wise accuracy. The inference time is approximately 102 ms exploiting 8 of the 12 available SHAVE vector processors of the Myriad 2, with a memory footprint of only 62.9 KB.
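The post-inference thresholding and the FP-rate quality index can be sketched as below. The helper functions and toy arrays are illustrative, not the on-board processor's code; the 0.6 threshold is the value from the ROC analysis above.

```python
import numpy as np

def classify_pixels(confidence: np.ndarray, threshold: float = 0.6) -> np.ndarray:
    """Apply the post-inference confidence threshold to the network's output mask."""
    return confidence >= threshold

def false_positive_rate(pred: np.ndarray, truth: np.ndarray) -> float:
    """Fraction of truly clear pixels wrongly flagged as cloudy."""
    fp = np.logical_and(pred, ~truth).sum()
    negatives = max(int((~truth).sum()), 1)
    return float(fp) / negatives

confidence = np.array([[0.9, 0.2], [0.7, 0.8]])
truth = np.array([[True, False], [False, True]])   # ground-truth cloudy pixels
pred = classify_pixels(confidence)                 # [[True, False], [True, True]]
fpr = false_positive_rate(pred, truth)             # 1 FP out of 2 clear pixels -> 0.5
```

Because the threshold lives outside the NN, raising it to trade FPs against accuracy, as the text notes, requires no retraining, only a change to this host-side comparison.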

A. HyperScout-2 Preprocessing Chain
The HyperScout-2 preprocessing chain was run on 17 datasets acquired during the Φ-Sat-1 mission, whose coordinates and acquisition times are summarized in Table IV. For each of these datasets, three 1152 × 1152 pixel bands with central wavelengths of 450, 494, and 862 nm (see Section III) were produced. Nine of these three-band sets, combined to form color images, are shown in Fig. 13.
For every band produced, a saturation mask of Boolean true/false values was generated so that the impact on the relative band radiances could be assessed. Each pixel that exceeded the saturation threshold was marked true, and all other pixels were marked false. Overall, 1.2% of the pixels were saturated due to bright clouds. The percentage of saturated pixels per acquisition is plotted in Fig. 14. Acquisition 02CE is considerably more saturated than the other datasets, accounting for 50.6% of the total saturated pixels.
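The saturation masking step amounts to a simple per-band comparison; a minimal sketch follows (the threshold value and function name are illustrative assumptions, as the actual HyperScout-2 saturation level is not given here):

```python
import numpy as np

def saturation_stats(band: np.ndarray, saturation_threshold: float):
    """Return a Boolean saturation mask and the percentage of saturated
    pixels for one band.

    band                 : 2-D array of raw band values (e.g. 1152x1152).
    saturation_threshold : sensor-dependent saturation level (assumed
                           value; the flight value is not quoted here).
    """
    mask = band > saturation_threshold   # True where the pixel saturated
    percent = 100.0 * mask.mean()        # fraction of True -> percent
    return mask, percent

# Toy 2x2 band with two saturated pixels out of four.
band = np.array([[10.0, 4095.0], [100.0, 4095.0]])
_, pct = saturation_stats(band, saturation_threshold=4000.0)
print(pct)  # 50.0
```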
Since significant feature misalignment between the bands can affect the quality of the NN's inference, the inter-band alignment precision was also assessed. To quantify the precision, key points were identified with the ORB algorithm [20]. The distance between corresponding key points was then calculated and averaged for each band set. At least 100 suitable key point pairs were found for each acquisition except for 038B/Tumbarumba, where only 33 pairs were identified due to the lack of sharp features in the image. All key point pairs with distances of over ten pixels were discarded because these pairs were observed to be the result of incorrectly identified key points, and misalignments of this magnitude were not visually observed. Overall, the mean separation between the key points for all pairs was calculated as 1.14 ± 1.33 pixels.
The key point separation per acquisition is plotted in Fig. 15.
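The alignment metric above reduces, once key points have been matched (e.g., with ORB as in the paper), to averaging the distances between matched pairs after discarding gross mismatches. A minimal sketch under that assumption (the function name and the ten-pixel rejection value follow the text; the toy coordinates are ours):

```python
import numpy as np

def mean_keypoint_separation(pts_a, pts_b, max_dist: float = 10.0):
    """Mean and std of distances between matched key-point pairs,
    discarding pairs farther apart than max_dist pixels, which the
    paper treats as incorrectly identified key points.

    pts_a, pts_b : (N, 2) arrays of matched key-point coordinates in
                   two bands; the matching itself (e.g. via ORB and a
                   Hamming-distance matcher) is assumed already done.
    """
    pts_a = np.asarray(pts_a, dtype=float)
    pts_b = np.asarray(pts_b, dtype=float)
    d = np.linalg.norm(pts_a - pts_b, axis=1)  # pairwise separations
    d = d[d <= max_dist]                       # drop mismatched pairs
    return d.mean(), d.std()

a = [[0, 0], [10, 10], [5, 5]]
b = [[1, 0], [10, 12], [50, 50]]   # last pair is a gross mismatch
m, s = mean_keypoint_separation(a, b)
print(round(m, 2))  # 1.5
```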

B. Synthesized Dataset Quality Performance
In order to train an NN capable of segmenting the images produced by the HyperScout-2 camera, a dedicated dataset reflecting the characteristics of the sensor itself is needed. As described in Section IV-A, the dataset was synthesized from Sentinel-2 images, which, although similar, differ in two main aspects: 1) radiance versus reflectance and 2) the relative SNR per band.
Simulating radiance from reflectance images may introduce additional noise. This noise matters for the NN, which uses the intrinsic characteristics of the data to classify pixel values. To this end, for each generated synthetic radiance image, we both added and removed 5% of the nominal noise expected for the selected band in HyperScout-2: additive white Gaussian noise (AWGN) was injected to raise the baseline value, and a denoising algorithm was applied to reduce it. Varying the SNR in this way allows us to train a more robust NN capable of working correctly in different situations, and possibly with different bands. Furthermore, to challenge the network and avoid wasting potentially good images, some images containing clouds over salt lakes, snow, etc. were included in the synthetic dataset. These images, even though they do not represent the main goal of the Φ-Sat-1 mission, reduce the overall classification accuracy on the synthetic dataset; on the other hand, they improve the ability of the network to recognize boundary situations. In operational missions, improvements can easily be obtained by adding location information. Finally, in order to evaluate these capabilities, we mixed in some bands obtained from a cube generated by the HyperScout-2 sensor, as shown in Fig. 16.
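The noise-injection half of this augmentation can be sketched as follows (only the AWGN side; the denoising half is omitted, and `nominal_sigma` stands in for the expected per-band noise of HyperScout-2, whose value is assumed here):

```python
import numpy as np

def perturb_noise(radiance: np.ndarray, nominal_sigma: float,
                  fraction: float = 0.05, seed: int = 0) -> np.ndarray:
    """Add extra AWGN worth `fraction` of the nominal band noise.

    radiance      : synthetic radiance image (2-D array).
    nominal_sigma : nominal noise standard deviation expected for the
                    selected HyperScout-2 band (assumed value).
    fraction      : fraction of the nominal noise to add (5% in the
                    augmentation described above).
    """
    rng = np.random.default_rng(seed)
    extra = rng.normal(0.0, fraction * nominal_sigma, size=radiance.shape)
    return radiance + extra

img = np.full((192, 192), 100.0)     # flat toy radiance tile
noisy = perturb_noise(img, nominal_sigma=2.0)
```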
The classification result is almost the same for all cases, achieving about 97% pixel accuracy for each image. This dataset is a real-world example of generating a synthetic dataset for cloud detection, demonstrating that it is possible to use/generate synthetic images for new sensors to be exploited directly on board.

C. Segmentation Network Performance
The CloudScout segmentation has gone through two completely different testing phases. The first was conducted during the design phase, at the end of the training and validation stages, using the synthesized dataset; the second was conducted in-flight using the images acquired by HyperScout-2 during the Φ-Sat-1 mission. While the first testing phase aimed to assess the network's capabilities and validate the inference against the requirements using the synthetic dataset, the second phase was a direct evaluation of the performance on the hypercubes acquired in-flight by the HyperScout-2 camera. The results of the first phase have already been presented and summarized in the confusion matrix in Table III. As already highlighted, they meet the most stringent requirement: a false positive rate under 1% on images across the entire test set. After the launch, 17 cubes acquired by the HyperScout-2 hyperspectral camera were used to assess the performance of the NN. As noted above, each cube is preprocessed to extract a 1152 × 1152 pixel portion with only the three bands, which is then tiled into 6 × 6 images of size 192 × 192 × 3 bands, for a total of 612 images tested. An example of original input images and the NN output mask is shown in Figs. 17 and 18.
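The tiling step described above (a 1152 × 1152 × 3 cut-out split into a 6 × 6 grid of 192 × 192 × 3 tiles, giving 17 × 36 = 612 tiles) can be expressed compactly in NumPy; this is a generic sketch, not the flight preprocessing code:

```python
import numpy as np

def tile_image(cube: np.ndarray, tile: int = 192) -> np.ndarray:
    """Split a (1152, 1152, 3) three-band cut-out into a 6x6 grid of
    (192, 192, 3) tiles, returned as a (36, 192, 192, 3) array in
    row-major tile order."""
    h, w, c = cube.shape
    n = h // tile  # 6 for the 1152-pixel cut-out
    return (cube
            .reshape(n, tile, n, tile, c)  # (row-block, y, col-block, x, band)
            .swapaxes(1, 2)                # (row-block, col-block, y, x, band)
            .reshape(n * n, tile, tile, c))

cube = np.zeros((1152, 1152, 3))
print(tile_image(cube).shape)  # (36, 192, 192, 3)
```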
The entire test executed on the HyperScout-2 images produced the results shown in Table V. Although the result of 0% false positives is very promising, it should be highlighted that the HyperScout-2 acquisitions were not planned to challenge the NN but rather to perform actual applications: they were planned and executed according to requirements related to EO applications, and all of them were used for the Φ-Sat-1 in-orbit demonstration. This implies, for example, that no images of clouds over snow or salt lakes were acquired, while the training and synthetic test datasets, randomly chosen from the Sentinel-2 archive, contained many configurations that are quite challenging for the NN. This is also observable in the overall accuracy obtained by the NN model, which is higher on the in-flight acquisitions than on the more challenging synthetic test set.

D. EoT Inference Engine In-Flight Performance
In addition to the CloudScout results acquired during the Φ-Sat-1 mission, in-flight performance data were also acquired for the EoT inference engine. Four separate EoT hardware test phases were executed over a 70-day period of the mission. During each phase, the EoT built-in self-test (BIST) routine was initiated, which performed both EoT chip-level and board-level diagnostics and reported back the results. Chip-level test coverage encompassed memories, caches, interfaces, and functional tests to dynamically exercise the SHAVEs and the multiported memory. Board-level diagnostics evaluated the PMU, flash, and serial device (SD) card. The executed diagnostics and their results are summarized in Table VI. Every diagnostic test passed at each phase, with the exception of the SD card test: 2- and 3-bit errors were observed in two of the data readbacks from the SD card, where the temporal gap between write and readback was 41 s. Note that the SD card functionality was not used on the Φ-Sat-1 mission. Test 027A included an additional 240 s run during which NN inference with an exemplar TinyYOLO [36] model was continuously executed, with all inference outputs exactly matching the reference values. The in-flight diagnostic tests indicate that the EoT inference engine performed as expected on-board Φ-Sat-1 without experiencing any functional upsets or any functional degradation due to radiation. All future installations of the Myriad VPU in space will be equipped with this BIST, which will allow the correct performance of the hardware to be monitored over time.
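The bit-error figures quoted for the SD-card readbacks are counts of differing bits between the written buffer and what was read back; a generic sketch of such a check (not the flight BIST code) is:

```python
def count_bit_errors(written: bytes, readback: bytes) -> int:
    """Count the number of differing bits between a written buffer and
    its readback, as a write/readback diagnostic would. XOR exposes the
    flipped bits; popcounting them gives the bit-error count."""
    assert len(written) == len(readback), "buffers must match in length"
    return sum(bin(a ^ b).count("1") for a, b in zip(written, readback))

# Toy example: one byte differs in two bit positions.
print(count_bit_errors(b"\x00\x07", b"\x00\x04"))  # 2
```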

VI. CONCLUSION
Φ-Sat-1 is part of the ESA initiatives to promote the development of disruptive innovative technology capabilities on-board EO missions. The Φ-Sat-1 satellite represents the first ever on-board AI deep CNN inference on a dedicated chip, exploiting deep NN capability for EO. In particular, the mission carries two innovative devices: the HyperScout-2 and the EoT inference engine (detailed in Section III).
The HyperScout-2 (Section III-A), developed and produced by Cosine Remote Sensing B.V., is a hyperspectral camera based on a 2-D sensor used in push-broom mode. This HyperScout model provides hyperspectral imaging in the visible and near-infrared to analyze the Earth's composition, along with three thermal infrared bands to retrieve the temperature distribution, broadening the range of EO applications. As part of the Φ-Sat-1 mission it has been demonstrated that it is possible to run the full preprocessing chain before inference on-board the HyperScout OBDH, including computing spectral data cubes at pixel accuracy without relying on platform ADCS data but solely on machine vision techniques, ensuring robustness and independence from the pointing performance of small satellite platforms.
As shown in Fig. 19, at the bottom of its structure sits the EoT hardware accelerator, developed by Ubotica (Section III-B), latch-up protected and controlled by Cosine's HyperScout subsystems. The EoT, powered by the Myriad 2 VPU, accelerates both CV and AI while operating in a low power envelope. Acceleration is achieved via a unique combination of parallel processing and high-bandwidth multiported memory on the multicore Myriad 2 SoC. The NN inference acceleration provided by the EoT on Φ-Sat-1 is enabled by a compact, host-integrated API, wherein the EoT acts as an inference server, supporting frame-by-frame inference requests over a high-speed USB interface. In-flight test and self-test data from Φ-Sat-1 demonstrate the ability of the board to reliably, accurately, and robustly perform the inference task throughout the duration of the mission. The success of this board as an AI accelerator on Φ-Sat-1 has led to the development of a successor board (the UB0100) to enable future AI-based cubesat missions to build on the success of Φ-Sat-1.
In order to demonstrate the potential of using AI directly on board, the CloudScout segmentation NN was developed by the University of Pisa (Section IV-B). It assigns to each pixel a binary classification: cloudy or not cloudy. The CloudScout NN exploits only three of the ten bands available from HyperScout-2, for two main reasons: memory constraints on the Myriad 2 and power limits derived from the satellite power budget. To perform the NN training, a synthetic dataset was developed by Sinergise (Section IV-A), starting from Sentinel-2 images. The dataset was built in three phases: 1) PCA to select the best band combination for our goal; 2) re-binning the Sentinel-2 images from 10 to 70 m GSD; and 3) using the Sinergise s2cloudless algorithm to construct a label/ground-truth mask for the input images.
The training of the network aimed to obtain the highest accuracy while maintaining a low number of false positives. Maintaining a low false-positive rate is of paramount importance for the application, as images wrongly classified as cloudy would not be transmitted to the ground, resulting in a potential loss of interesting data. The CloudScout NN was tested both on synthetic images on the ground and subsequently in-orbit on Φ-Sat-1, with live images acquired by the HyperScout-2 sensor. In both cases the solution demonstrated a tile-level accuracy in excess of 95.9% and a commensurately low FP rate, achieving 96% accuracy when performing cloud detection on live images on-satellite, as stated in Table V.
The Φ-Sat-1 mission represents the first AI on-board demonstrator able to autonomously select noncloudy images for transmission to the ground. Thanks to its in-flight measured performance, Φ-Sat-1 has demonstrated the capability of AI to perform reliable and accurate on-board image processing. This technological advancement in the field of space AI, together with the use of low-power COTS hardware-accelerated inference, paves the way for the exploitation of on-board AI in future EO and remote sensing applications, enabling the development of smarter and more efficient satellites for EO.