Abstract:
Fault-tolerant computers have been developed in recent years to operate in the harsh radiation environment of outer space. These computers employ multiple copies of soft processors in a reconfigurable hardware environment and can automatically repair faults caused by radiation strikes. However, during certain recovery procedures, data collection and processing can be halted, and valuable scientific data can be lost. In addition, current fault recovery procedures may inadvertently make the computer more susceptible to faults or errors, for example, by introducing voltage and temperature changes. Machine learning feature extraction algorithms have the potential to reduce data loss by identifying patterns related to computational fault mitigation and recovery techniques. In this work, we will gather telemetry data from RadPC, a reconfigurable, radiation-tolerant computer that has been developed over the past 12 years by Montana State University to advance high-performance space computing under varying environmental conditions. RadPC has recently been configured to provide regular telemetry data to measure and communicate the performance of the radiation-tolerant computing platform. Specifically, the telemetry data includes information about data memory integrity, faults experienced, and successful repairs, as well as measurements including voltage, current, and temperature. While RadPC has been under development for some time, the developers have never searched the telemetry data for associations between fault recovery procedures and the physical state of the hardware itself (e.g., voltage and current levels of power supplies or internal temperature). In this work, the computer will be subjected to synthetic faults, emulating radiation strikes that may occur in space, and will perform standard recovery procedures.
The tests will be performed with the RadPC on a high-altitude balloon flight as well as inside a temperature-controlled vacuum chamber, allowing for a range o...
Date of Conference: 13-14 May 2022
Date Added to IEEE Xplore: 20 June 2022