Failures in Aerospace Applications, Part 3

Part 2 of this blog series described SEL (single event latchup), a failure that can occur in ICs for aerospace applications. This failure is caused by radiation sources present in the aerospace environment (protons and heavy ions). SEL is the most common type of SEE (single event effect), but other failures are possible. The types of error that radiation can cause in the aerospace environment are listed below:

  • SEL (single event latchup)
  • SEU (single event upset)
  • SET (single event transient)
  • SEB (single event burnout)
  • SEFI (single event functional interrupt)
  • SEGR (single event gate rupture)

Now let’s look at the second type of error, the SEU, which is quite common in aerospace applications.

SEU (single event upset)
This effect can occur in integrated memories. Consider a generic cell of a silicon-based memory, in particular the data bit a_ij stored in row i and column j of the memory. When a heavy ion (or a proton) crosses the memory, it generates e⁻/p⁺ (electron-hole) pairs that create a parasitic current, which modifies the value of the bit stored in the memory cell. (See Figure 1.)
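
At the logical level, the effect described above can be modelled as the inversion of a single stored bit. The following is only a minimal sketch of that idea, not a physical simulation: the 4 x 4 memory of 8-bit words, the helper name `flip_bit`, and the chosen row, column, and bit position are all assumptions made for illustration.

```python
def flip_bit(word: int, bit: int) -> int:
    """Return `word` with one bit inverted, mimicking a single upset."""
    return word ^ (1 << bit)


# Hypothetical memory: 4 rows by 4 columns, each cell holding an 8-bit word.
memory = [[0b10100101 for _ in range(4)] for _ in range(4)]

i, j = 2, 1                      # row and column of the struck cell a_ij
before = memory[i][j]
memory[i][j] = flip_bit(memory[i][j], bit=3)   # the modelled upset
after = memory[i][j]

print(f"a[{i}][{j}]: {before:08b} -> {after:08b}")
```

Because the upset changes only the stored value and not the cell itself, applying the same flip again restores the original word, which is why an SEU is usually described as a non-destructive error.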

Figure 1: The mechanism of SEU

An SEU can be revealed by testing a reference memory cell: if the reference value has changed, an SEU has occurred. The corresponding error is known as a data retention fault (DRF). On an aerospace module, the memory cells can be tested with a routine that checks the data integrity of the SRAM memories, the march test, which comprises four main steps:

  • Step 1: Write a pattern of bits to a row of memory cells (e.g., 101)
  • Step 2: Read the pattern of bits back from the row and verify that it’s correct
  • Step 3: Write the complementary pattern of bits to the same row (e.g., 010)
  • Step 4: Read the pattern of bits back from the row and verify that it’s correct
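
The four steps above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the routine used on a real module: the function name `four_step_test`, the `write` and `read` callbacks, and the simulated 8-row, 3-bit memory with one corrupted bit are all hypothetical.

```python
def four_step_test(write, read, n_rows, pattern, width):
    """Run the write/read/complement/read sequence on every row.

    Returns a list of (row, expected, observed) tuples, one per failed read.
    """
    mask = (1 << width) - 1
    complement = ~pattern & mask
    failures = []
    for row in range(n_rows):
        for expected in (pattern, complement):
            write(row, expected)          # steps 1 and 3
            observed = read(row)          # steps 2 and 4
            if observed != expected:
                failures.append((row, expected, observed))
    return failures


# Hypothetical memory: 8 rows of 3 bits; bit 1 of row 5 always reads inverted.
cells = [0] * 8


def write(row, value):
    cells[row] = value


def read(row):
    return cells[row] ^ 0b010 if row == 5 else cells[row]


failures = four_step_test(write, read, n_rows=8, pattern=0b101, width=3)
for row, expected, observed in failures:
    print(f"row {row}: expected {expected:03b}, read {observed:03b}")
```

Each row costs two writes and two reads, so the total number of operations in this sketch grows linearly with the number of rows tested.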

The march test is widely used in industry because its complexity is linear: the number of read and write operations grows in proportion to the number of memory cells under test. A march test for memories affected by SEU can also be generated automatically. (See Figure 2.)

Figure 2: The algorithm for the automatic generation of a march test

(Source: Alberto Scionti, Graduate Thesis, University of Torino, Faculty of Informatics Engineering)

What’s your experience with SEU? How dangerous do you think it is? Do you think the march test is a good way to detect SEU and to guarantee both the functionality of ICs exposed to radiation and the protection of the data stored in the memory of the aerospace module? Do you think the memory block of the aerospace application board should be protected by a lead screen, or something similar, to preserve data integrity?

10 comments on “Failures in Aerospace Applications, Part 3”

  1. etnapowers
    March 31, 2014

    Although SEU is not a destructive failure, it can be very dangerous: losing the data stored in memories might be a really big problem, for example when some recovery procedures are stored in the mass memories of the application board.

  2. etnapowers
    April 1, 2014

    The algorithm that checks for the presence of SEU might be applied to a reference memory cell acting as a remote checker; its content should be checked periodically to monitor the presence of SEU remotely.

  3. etnapowers
    April 1, 2014

    If an impacting heavy ion damages the physical structure of the memory, the data might not be lost but may become inaccessible. The data would then be unavailable even though there would be no SEU, because the data have not been modified.

  4. etnapowers
    April 2, 2014

    SEU detection might also be used to monitor the radiation rate, providing a feedback warning signal that lets the overall system increase its level of protection when it is under stress from a high radiation level.

  5. eafpres
    April 2, 2014

    @Paolo & @etnapowers–it seems to me that redundancy is a possible design approach considering the statistical probability of a damaging event occurring for the same memory location in two or three memory units.  I have heard of systems where there are triple redundant answers and if one disagrees it is thrown away.

    Is this a viable approach for something you need really high reliability in space?

  6. etnapowers
    April 3, 2014

    @Blaine & @eafpres: absolutely yes! This approach is really a viable solution because it detects a failure in a specific position.

    A similar approach applies to the rad-hard testing procedure for finished wafers: the outlier dice are scrapped, and the area of the wafer in which the outlier parts are located is recorded accordingly.

  7. etnapowers
    April 4, 2014

    @Blaine: the statistical approach is the basis for evaluating the effect of radiation on an IC, because the fluence (i.e., the radiation rate) and the type of radiation are unknown parameters, so a statistical approach is the only approach possible.

  8. M_S
    April 7, 2014

    SEU and other related failures are not just seen in aerospace applications!

  9. etnapowers
    April 8, 2014

    True, it's a failure that can appear in any integrated memory. If it happens in aerospace circuitry it can be a serious problem, because some important data may be lost.

  10. yalanand
    April 30, 2014

    It is good that the presence of the single event upset can be present in integrated memories. It can be detected when a proton crosses the memory as it generates the parasitic current. This will help improve the memory as it modifies the value of the bit. It can also be revealed by testing the reference memory cell, which can be done on an aerospace module. It can be said to be a failure because very important information can be lost without the presence of a backup.
