Philbo King
Yeah, I didn't want to delve into the fixes too much... I don't get it, do error correction fail-safes not exist for such designs, something akin to ECC RAM for example? (I'm talking about your bit flips post, not this RNG generator topic).
A brief summary:
- A checksum can be run periodically on the FPGA RAM; on a mismatch, the FPGA auto-resets and reloads from ROM.
- In safety-critical applications (like a flight control computer) there are often 3 computers running in parallel, with voting to detect when one system disagrees with the other 2. A disagreement can trigger a reboot of the errant FPGA. If the same computer disagrees again, it is logged in a systems diagnostic computer as a hard fault (not an NSEU, just broken) and that computer is locked out for the rest of the flight.
- Another approach simply resets and reloads the FPGA RAM at a fixed interval, for example every 250 ms. An NSEU-caused miscalculation can still have an effect, but only very briefly.
These work only because the odds of 2 or 3 devices being hit by an NSEU in the same RAM cell at the same time are infinitesimally low.
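To make the first two items a bit more concrete, here is a rough Python sketch. It is only an illustration, not how any real flight computer does it: the names `scrub` and `Voter` are made up, a CRC-32 stands in for whatever checksum the hardware actually uses, copying a "golden" image stands in for a reset and reload from ROM, and I'm assuming the disagreement count per computer is never cleared during the flight.

```python
import zlib
from collections import Counter


def scrub(ram: bytearray, golden: bytes) -> bool:
    """Checksum the working RAM against the golden (ROM) image.

    On a mismatch, overwrite RAM with the golden image, which stands in
    for a reset and reload from ROM. Returns True if a reload happened.
    """
    if zlib.crc32(ram) != zlib.crc32(golden):
        ram[:] = golden
        return True
    return False


class Voter:
    """2-of-3 majority voter (hypothetical).

    First disagreement by a channel: request a reboot.
    Second disagreement by the same channel: log a hard fault and lock it out.
    """

    def __init__(self, channels: int = 3):
        self.strikes = [0] * channels   # disagreements seen per channel
        self.locked_out = set()         # channels excluded for the rest of the flight
        self.fault_log = []             # stands in for the diagnostic computer's log

    def vote(self, outputs):
        # Only channels that are still trusted take part in the vote.
        active = {i: v for i, v in enumerate(outputs) if i not in self.locked_out}
        value, count = Counter(active.values()).most_common(1)[0]
        if count < 2:
            raise RuntimeError("no majority among active channels")

        actions = {}
        for i, v in active.items():
            if v != value:
                self.strikes[i] += 1
                if self.strikes[i] == 1:
                    actions[i] = "reboot"       # assumed to be an NSEU, recoverable
                else:
                    self.locked_out.add(i)      # treated as a hard fault
                    self.fault_log.append(i)
                    actions[i] = "lockout"
        return value, actions
```

With three channels, `Voter().vote([7, 9, 7])` returns the majority value 7 plus a reboot request for channel 1; if channel 1 disagrees again on a later vote, it is locked out and the remaining two channels have to agree with each other.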
For what it's worth, I haven't seen any radiation-hardened FPGAs in my work (avionics and aerospace), only commercial-grade ones. Buying and stocking thousands of one-time-build hardened (i.e., expensive) spares for an aircraft that will be in service for more than 50 years gets outrageously expensive, and the solutions listed above already far exceed the reliability requirements.