Identifying Patterns in Fault Recovery Techniques and Hardware Status of Radiation Tolerant Computers Using Principal Components Analysis | IEEE Conference Publication | IEEE Xplore

Abstract:

Fault-tolerant computers have been developed in recent years to operate in the harsh radiation environment of outer space. These computers employ multiple copies of soft processors in a reconfigurable hardware environment and can automatically repair faults caused by radiation strikes. However, during certain recovery procedures, data collection and processing can be halted, and valuable scientific data can be lost. In addition, current fault recovery procedures may inadvertently make the computer more susceptible to faults or errors, for example, by introducing voltage and temperature changes. Machine learning feature extraction algorithms have the potential to reduce data loss by identifying patterns related to computational fault mitigation and recovery techniques. In this work, we will gather telemetry data from RadPC, a reconfigurable, radiation-tolerant computer that has been developed over the past 12 years by Montana State University to advance high-performance space computing under varying environmental conditions. RadPC has recently been configured to provide regular telemetry data to measure and communicate the performance of the radiation-tolerant computing platform. Specifically, the telemetry data includes information about data memory integrity, faults experienced, and successful repairs, as well as various measurements including voltage, current, and temperature. While RadPC has been under development for some time, the developers have never searched the telemetry data for associations between fault recovery procedures and the physical state of the hardware itself (e.g., voltage and current levels of power supplies or internal temperature). In this work, the computer will be subjected to synthetic faults (emulating radiation strikes that may occur in space) and perform standard recovery procedures. The tests will be performed with the RadPC on a high-altitude balloon flight as well as inside a temperature-controlled vacuum chamber, allowing for a range o...
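
As a rough illustration of the kind of analysis the title refers to, the following minimal sketch applies principal components analysis to a small telemetry table. It is not the authors' implementation: the column names (temperature, voltage, current, fault_count), the synthetic values, and the helper function pca are all hypothetical and chosen only for illustration. Only NumPy is assumed.

```python
import numpy as np


def pca(X, n_components=2):
    """PCA via SVD of the standardized data matrix (illustrative helper)."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0                      # avoid division by zero for constant columns
    Z = (X - mean) / std                     # standardize: telemetry units differ (V, A, deg C)
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    components = Vt[:n_components]           # rows are principal directions (loadings)
    scores = Z @ components.T                # samples projected onto those directions
    explained = (S ** 2) / np.sum(S ** 2)    # fraction of total variance per component
    return scores, components, explained[:n_components]


# Hypothetical telemetry: synthetic values, NOT measurements from RadPC.
rng = np.random.default_rng(0)
n = 200
temperature = rng.normal(25.0, 5.0, n)
voltage = 3.3 - 0.002 * (temperature - 25.0) + rng.normal(0.0, 0.005, n)
current = 0.5 + 0.01 * (temperature - 25.0) + rng.normal(0.0, 0.02, n)
fault_count = rng.poisson(2.0, n).astype(float)

telemetry = np.column_stack([temperature, voltage, current, fault_count])
scores, components, explained = pca(telemetry, n_components=2)

print("explained variance ratio:", explained)
print("loadings (temperature, voltage, current, fault_count):")
print(components)
```

In a sketch like this, large loadings of several columns on the same component would suggest that those quantities vary together, which is the type of association between recovery activity and hardware state that the abstract proposes to look for.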
Date of Conference: 13-14 May 2022
Date Added to IEEE Xplore: 20 June 2022
Conference Location: Orem, UT, USA

