Digital circuits operating in highly radiative environments are frequently subject to transient or permanent faults. We propose real-time techniques for automatically reconfiguring a damaged circuit implemented on Field-Programmable Gate Arrays (FPGAs) during mission time.
Digital circuits will play a major role in the design of future space exploration vehicles. It will be critical for long-duration manned flights to have a robust and efficient fault recovery mechanism that protects the electronic circuits from high-energy particle impacts.
Our current research follows two main directions. The first applies genetic algorithms to evolve a damaged circuit until a correct reconfiguration is found. Most evolutionary approaches to fault recovery in FPGAs focus solely on evolving logic, as opposed to evolving both logic and routing. Evolutionary fault-recovery systems should benefit from accommodating routing as well, since the majority of transistors in a typical FPGA are dedicated to interconnects (approximately 80% in one estimate). The Evolvable Systems Group at NASA Ames has developed an evolutionary fault-recovery system employing a genetic representation that takes into account both logic and routing configurations.
Experiments were conducted using a software model of the Xilinx Virtex FPGA. Using four Virtex configurable logic blocks (CLBs), the system was able to evolve a 100% accurate quadrature decoder finite state machine in the presence of a stuck-at-0 fault. This technology is efficient and has a small footprint, which makes implementation on flight systems feasible.
Figure: A damaged FPGA with a stuck-at-0 fault.
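The idea of evolving logic and routing together can be illustrated with a minimal sketch. It is not the system described above: the four 2-input lookup-table cells, the 3-input parity target, the fault location, and all names (evolve, fitness, FAULTY_CELL, and so on) are assumptions chosen for illustration. Each gene holds both the routing of a cell (which earlier signals feed it) and its logic (the lookup-table bits), and the simulated fault forces one cell's output to 0.

```python
import random

N_INPUTS = 3      # primary inputs a, b, c
N_CELLS = 4       # 2-input LUT cells, evaluated in feed-forward order
FAULTY_CELL = 0   # assumed fault: this cell's output is stuck at 0

def target(a, b, c):
    """Reference behavior to recover (3-input parity, chosen for illustration)."""
    return a ^ b ^ c

def random_cell(rng, index):
    # Routing: each input may come from a primary input or an earlier cell.
    n_sources = N_INPUTS + index
    lut = tuple(rng.randrange(2) for _ in range(4))   # logic: 4 truth-table bits
    return (rng.randrange(n_sources), rng.randrange(n_sources), lut)

def random_genome(rng):
    return [random_cell(rng, i) for i in range(N_CELLS)]

def evaluate(genome, inputs):
    """Simulate the damaged device; the last cell drives the circuit output."""
    signals = list(inputs)
    for i, (src_a, src_b, lut) in enumerate(genome):
        out = lut[2 * signals[src_a] + signals[src_b]]
        if i == FAULTY_CELL:
            out = 0                                   # stuck-at-0 fault
        signals.append(out)
    return signals[-1]

def fitness(genome):
    """Number of truth-table rows (out of 8) that match the target."""
    score = 0
    for row in range(8):
        a, b, c = (row >> 2) & 1, (row >> 1) & 1, row & 1
        score += evaluate(genome, (a, b, c)) == target(a, b, c)
    return score

def mutate(genome, rng):
    """Change one routing choice or flip one LUT bit in a random cell."""
    child = list(genome)
    i = rng.randrange(N_CELLS)
    src_a, src_b, lut = child[i]
    n_sources = N_INPUTS + i
    choice = rng.randrange(3)
    if choice == 0:
        src_a = rng.randrange(n_sources)
    elif choice == 1:
        src_b = rng.randrange(n_sources)
    else:
        bit = rng.randrange(4)
        lut = tuple(v ^ (k == bit) for k, v in enumerate(lut))
    child[i] = (src_a, src_b, lut)
    return child

def evolve(seed=0, pop_size=50, generations=200):
    """Elitist GA with tournament selection; returns the best genome found."""
    rng = random.Random(seed)
    population = [random_genome(rng) for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        if fitness(best) == 8:
            break
        children = [best]                             # elitism keeps the best
        while len(children) < pop_size:
            parent = max(rng.sample(population, 3), key=fitness)
            children.append(mutate(parent, rng))
        population = children
        best = max(population, key=fitness)
    return best

best = evolve(seed=0)
print(fitness(best), "of 8 truth-table rows correct")
```

As with any genetic algorithm, the search is stochastic: a run may stop at a partially correct circuit, which is why the fitness of the result has to be checked before the configuration is used.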
The other approach is quite recent and uses modern resolution algorithms that operate on Boolean formulas (SAT solver technology) to automatically generate a correct FPGA configuration. The genetic algorithm approach does not guarantee that the repaired circuit will behave exactly as it did before the fault occurred. In contrast, the SAT-based technology mathematically guarantees that the reconfigured circuit implements exactly the same functionality as the original one. Preliminary experiments show good scalability and high resilience in the presence of multiple faults.
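A minimal sketch of the SAT-based idea follows, under stated assumptions. The circuit (two cascaded 2-input lookup tables computing 3-input parity), the variable numbering, and every function name are hypothetical, and the tiny DPLL procedure stands in for a production SAT solver. The configuration bits are the unknowns, a stuck-at-1 configuration bit is added as a unit clause, and one set of clauses per truth-table row requires the output to equal the original function. Any satisfying assignment is therefore a configuration that reproduces the original behavior on every input.

```python
def dpll(clauses, assignment):
    """Toy DPLL solver: returns a satisfying {var: bool} dict, or None."""
    simplified = []
    for clause in clauses:
        if any(assignment.get(abs(lit)) == (lit > 0) for lit in clause):
            continue                          # clause already satisfied
        rest = [lit for lit in clause if abs(lit) not in assignment]
        if not rest:
            return None                       # clause falsified: backtrack
        simplified.append(rest)
    if not simplified:
        return assignment
    unit = next((c[0] for c in simplified if len(c) == 1), None)
    if unit is not None:                      # unit propagation
        return dpll(simplified, {**assignment, abs(unit): unit > 0})
    var = abs(simplified[0][0])
    for value in (True, False):               # branch on an unassigned variable
        result = dpll(simplified, {**assignment, var: value})
        if result is not None:
            return result
    return None

# Hypothetical variable numbering: LUT1 bits 1..4, LUT2 bits 5..8,
# and one intermediate signal per truth-table row, 9..16.
def lut1(k): return 1 + k
def lut2(k): return 5 + k
def mid(row): return 9 + row

def encode(target, stuck_at_1=()):
    """CNF stating: out = LUT2[2 * LUT1[2a + b] + c] equals target on all rows."""
    clauses = [[v] for v in stuck_at_1]       # damaged bits are forced to 1
    for row in range(8):
        a, b, c = (row >> 2) & 1, (row >> 1) & 1, row & 1
        m, bit1 = mid(row), lut1(2 * a + b)
        clauses += [[-m, bit1], [m, -bit1]]   # m <-> selected LUT1 bit
        sign = 1 if target(a, b, c) else -1
        clauses.append([m, sign * lut2(c)])       # m = 0 selects LUT2 bit c
        clauses.append([-m, sign * lut2(2 + c)])  # m = 1 selects LUT2 bit 2 + c
    return clauses

def repair(target, stuck_at_1=()):
    """Return (LUT1 bits, LUT2 bits) of a correct configuration, or None."""
    model = dpll(encode(target, stuck_at_1), {})
    if model is None:
        return None                           # no exact repair exists
    bits1 = [int(model.get(lut1(k), False)) for k in range(4)]
    bits2 = [int(model.get(lut2(k), False)) for k in range(4)]
    return bits1, bits2

def simulate(bits1, bits2, a, b, c):
    return bits2[2 * bits1[2 * a + b] + c]

def parity(a, b, c):
    return a ^ b ^ c

# Bit 3 of LUT1 is stuck at 1, so the usual XOR table (0, 1, 1, 0) is unusable.
config = repair(parity, stuck_at_1=[lut1(3)])
print(config)
```

In this toy instance the solver has to fall back on a pair of XNOR tables, since XNOR(XNOR(a, b), c) equals a XOR b XOR c. When the faults leave no exact repair, the formula is unsatisfiable and repair returns None instead of an approximate circuit.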
Critical operations performed by a spacecraft, such as pyrotechnic command control, are usually implemented on specialized digital circuits that can achieve high throughput and short response times. In space, digital circuits can be exposed to extreme radiative and thermal conditions that may cause transient or permanent damage to the device. The common approach to this problem combines radiation-hardened circuits with triple modular redundancy (TMR). NASA is interested in investigating alternative and/or complementary technologies that could increase the reliability of digital circuits in radiative environments.
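For readers unfamiliar with triple modular redundancy, the voting step can be sketched in a few lines. This is a generic illustration rather than a description of any flight design; the function names are hypothetical, and module outputs are modeled as integers whose bits are voted on independently.

```python
def majority_vote(a, b, c):
    """Bitwise 2-of-3 majority of three redundant module outputs."""
    return (a & b) | (a & c) | (b & c)

def disagreeing_modules(a, b, c):
    """Indices of modules whose output differs from the voted result."""
    voted = majority_vote(a, b, c)
    return [i for i, out in enumerate((a, b, c)) if out != voted]

# Module 2 has one bit stuck at 0; the vote masks the error.
print(bin(majority_vote(0b1010, 0b1010, 0b0010)))
print(disagreeing_modules(0b1010, 0b1010, 0b0010))
```

The voter masks a single faulty module but does not repair it, which is where reconfiguration becomes complementary: a module flagged as disagreeing is a candidate for repair.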
FPGAs offer an alternative approach to this problem. Unlike traditional ASICs, whose design is fixed at fabrication and cannot be changed later, FPGAs are reprogrammable circuits. An FPGA is a blank matrix of gates and wires (up to millions in the most recent devices) that can be dynamically reconfigured to implement a specific circuit. If an element of an FPGA is damaged, for example a bit permanently stuck at 1, the internal redundancy of the circuit (the unused gates and wires) can be used to set up a new configuration that works around the faulty part. Using FPGAs in combination with modular redundancy, for example, enables the off-line reconfiguration of the damaged circuit and its reintegration into the system after repair. For this new approach to be applicable to long-duration space missions, the tasks of identifying the damaged portion of the FPGA and finding an alternative reconfiguration must be performed on board.
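The two on-board tasks named above, locating the damage and working around it with unused resources, can be sketched on a toy device model. Everything here is an assumption made for illustration: the SimulatedFabric class, the constant-driving test, and the block names do not correspond to any real FPGA interface, and routing is ignored entirely.

```python
class SimulatedFabric:
    """Toy device model: n cells, some with outputs stuck at a fixed value."""

    def __init__(self, n_cells, stuck=None):
        self.n_cells = n_cells
        self.stuck = dict(stuck or {})    # cell index -> stuck value (0 or 1)

    def read(self, cell, programmed_value):
        """Output of a cell that has been programmed to drive a constant."""
        return self.stuck.get(cell, programmed_value)

def find_faulty_cells(fabric):
    """Drive each cell to 0 and to 1; a cell that cannot follow is faulty."""
    return {cell for cell in range(fabric.n_cells)
            if any(fabric.read(cell, v) != v for v in (0, 1))}

def remap(placement, faulty, n_cells):
    """Move logic blocks off faulty cells onto unused healthy cells.

    placement maps a logical block name to a physical cell index.
    Returns a new placement, or None when there are not enough spare cells.
    """
    used = {cell for cell in placement.values() if cell not in faulty}
    spares = [c for c in range(n_cells) if c not in faulty and c not in used]
    new_placement = {}
    for block, cell in placement.items():
        if cell in faulty:
            if not spares:
                return None
            cell = spares.pop(0)          # take the first available spare
        new_placement[block] = cell
    return new_placement

fabric = SimulatedFabric(6, stuck={1: 1})             # cell 1 is stuck at 1
placement = {"decoder": 0, "counter": 1, "latch": 2}  # hypothetical blocks
faulty = find_faulty_cells(fabric)
print(remap(placement, faulty, fabric.n_cells))
```

A real device would also need the interconnect to be rerouted to the relocated block, which is the harder part of the problem and the reason a search procedure is needed at all.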
We propose effective techniques that can automatically analyze and reconfigure a damaged circuit with a high level of reliability. These techniques are lightweight and can be implemented in the resource-constrained environment of a spacecraft.