Fundamentals of Digital Engineering:
Designing for Reliability
A Micro-Course
May 21, 2001

Abstract
The concept of designing for reliability will be introduced, along with a
brief overview of reliability, redundancy and traditional methods of fault
tolerance as applied to current logic devices. The fundamentals of
advanced circuit design and analysis techniques will be the primary focus.
The introduction will cover the definitions of key device parameters and how
analysis is used to prove circuit correctness. Basic design techniques such as
synchronous vs. asynchronous design, metastable state resolution time/arbiter
design, and finite state machine structure/implementation will be reviewed.
Advanced topics will be explored, such as skew-tolerant circuit design, the
use of triple-modular redundancy and circuit hazards, device transients and
preventive circuit design, lock-up states in finite state machines generated
by logic synthesizers, device transient characteristics, radiation mitigation
techniques, worst-case analysis, the use of timing analyzers and simulators,
and others. Case studies and lessons learned from spaceflight designs will
be given as examples.
Introduction

This Seminar
• This is a seminar, not a class
  - Two Way Conversation
  - Basic Theory
  - Lessons Learned
  - Case Studies for Discussion
• Present Your Own Case Studies for Discussion and
  Future Inclusion
• Under Development
  - First Time This Seminar Is Given
  - Not All Topics Are Fully Developed
  - What Areas Are Useful? Guide Development.
Reliability
Motivation - A Case Study (1961)
"First, I believe that this nation should commit itself to
achieving the goal, before this decade is out, of
landing a man on the moon and returning him safely
to the earth."
Special Message to the Congress on Urgent National Needs
President John F. Kennedy
Delivered in person before a joint session of Congress
May 25, 1961

Reliability
Motivation - A Case Study (1986)
"It appears that there are enormous differences of opinion as to the
probability of a failure with loss of vehicle and of human life. The
estimates range from roughly 1 in 100 to 1 in 100,000. The higher
figures come from the working engineers, and the very low figures
from management. What are the causes and consequences of this
lack of agreement? Since 1 part in 100,000 would imply that one
could put a Shuttle up each day for 300 years expecting to lose
only one, we could properly ask 'What is the cause of
management's fantastic faith in the machinery?'"
R. P. Feynman, Report of the PRESIDENTIAL COMMISSION on the Space Shuttle
Challenger Accident, Volume 2, Appendix F - Personal Observations on Reliability of
Shuttle, June 6th, 1986
Reliability
Motivation - A Case Study (2001)
When discussing the impact of the high observed FIT
rate for the FPGAs, the IAT asked Lockheed Martin
"What's the reliability allocation?" Lockheed Martin
responded, "Hell if I know."
The IAT followed up by stating that it appeared that
there has been no calculation of the probability of
mission success. Lockheed Martin concurred and JPL
added: "No programmatic requirement for reliability
numbers."
From the Mars Odyssey FPGA Independent Assessment Team, April 2, 2001

Increasing Reliability
• Fault Prevention
  - Eliminate Faults
  - In Practice, Reduce Probability of Failure to an
    Acceptable Level
• Fault Tolerance
  - Faults Are Expected
  - Use Redundancy
    • Additional Hardware, Software, Time
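
The case study above turns on the fact that nobody had converted a FIT rate into a probability of mission success. A minimal sketch of that conversion follows, assuming a constant failure rate (exponential model), a series system in which any device failure ends the mission, and an ideal voter for the 2-of-3 redundancy case. The function names and the example numbers are hypothetical and are not taken from the Mars Odyssey assessment.

```python
import math

HOURS_PER_YEAR = 8766  # average year length, including leap years


def failure_rate_per_hour(fit: float) -> float:
    """Convert a FIT rate (failures per 1e9 device-hours) to failures per hour."""
    return fit / 1e9


def mission_reliability(fit: float, n_devices: int, mission_hours: float) -> float:
    """Probability that none of n_devices fails during the mission.

    Assumes a constant failure rate and a series system.
    """
    lam = failure_rate_per_hour(fit) * n_devices
    return math.exp(-lam * mission_hours)


def tmr_reliability(r: float) -> float:
    """Reliability of 2-of-3 majority redundancy, assuming an ideal voter."""
    return 3 * r**2 - 2 * r**3


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    r = mission_reliability(fit=100.0, n_devices=10, mission_hours=3 * HOURS_PER_YEAR)
    print(f"simplex: {r:.4f}  2-of-3 redundant: {tmr_reliability(r):.4f}")
```

Note that 2-of-3 redundancy only helps when the single-unit reliability is above 0.5; below that, the redundant arrangement is worse than a single unit.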
Conventional Techniques for High-
Reliable Spaceborne Digital Systems
• Use of Conservative Design Practices
  - Derating, Simplicity, Wide Tolerances
• Parts Standardization
• 100% Screening of Parts and Assemblies, Including Thorough
  Burn-in
• Detailed Laboratory Analyses and Corrective Action for All
  Failed Parts
• Use of Extreme Care in Manufacture of Parts
• Thorough Qualification of Parts and Manufacturing Processes

Conventional Techniques for High-
Reliable Spaceborne Digital Systems
(cont'd)
• Thermal Cycling and Vibration Testing of All Completed
  Assemblies
• Establishment of an efficient field service feedback system to
  report on equipment failures in the Field
• Design of the Equipment to Minimize Stress During
  Assembly and to Facilitate Replacement of Failed
  Components
NASA Space Vehicle Design Criteria (Guidance and Control),
Spaceborne Digital Computer Systems, SP-8070
What We Will Do
• Cover Basic Concepts
• Present Data and Design Techniques
• Case Studies
  - Solutions for Previous Missions
  - Mistakes from Previous Missions

What We Will Not Do
• Provide Exhaustive Coverage
  - We only have a few hours
  - Too much material
• Solve All Problems
  - Goal is to make you think
• Not discuss "Mom and Apple Pie" [well, at
  least minimize it]
The Lessons of Designing for
Reliability
"... we must not repeat the errors of the past. This
is blocking and tackling, not rocket science."
Dan Goldin, April 27, 2000.
Barto's Law: Every circuit is considered guilty
until proven innocent.
Termination of Special Pins
A Very Basic Topic But A Source of
Frequent Failures and Problems

Special Pins
• MODE pin (test program mode).
• VPP pin (programming voltage).
• TRST* (Reset to JTAG TAP controller)
• TCLK (provides clock to TAP controller)
• SDI, DCLK (varies for each device type)
• Others
MODE Pin - Test, Debug and
Programming Control
[Figure]

MODE Pin
• Left Floating
  - Device can be non-functional
  - High currents
  - Uncontrolled I/O
• Tied High During Test
  - Working device stopped functioning
  - Power supply rise time key
MODE Pin - Test, Debug and
Programming Control
[Figure]

IEEE JTAG 1149.1 TCLK
The CLK pin may turn into an output driving low, clamping
the oscillator's output at a logic '0'. The TAP controller
cannot reset and restore I/O operation. Most FPGAs do not have
the optional TRST* pin. Note TRST*, when present, has a
pull-up.
IEEE JTAG 1149.1 - Scan Path
[Figure: scan path diagram. Legible labels: Serial Input, Shift Register,
Parallel Latch, System Logic, Input, Output.]

IEEE JTAG 1149.1 TCLK
[Figure: TAP block diagram. Legible labels: TCK, TAP Controller (State
Machine), TDI, Shift Register, TDO, Parallel Latch.]

IEEE JTAG 1149.1 - Scan I/O Cell
[Figure: scan I/O cell. Legible labels: Out Enable, Data Out, To Next Pin,
JTAG Data Path.]
Input Stages

Input Stages - Introduction
• Most CMOS inputs have rise/fall time limits
  - Most inputs also have some hysteresis
• Typical symbols in specifications
  tR, tTLH - rise time
  tF, tTHL - fall time
  tT - transition time
• Waveform measurement
  - typically from 10% to 90% but not always
  - sometimes the parameter measurement method is not
    specified
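
Because the measurement thresholds are not always 10% and 90%, it helps to make them explicit whenever a transition time is quoted. The following is a minimal sketch of measuring a rise time from sampled waveform data by linear interpolation between samples. The function names and the default thresholds are illustrative assumptions, not a method taken from any data sheet.

```python
def crossing_time(times, volts, level):
    """Return the time of the first upward crossing of `level`.

    Uses linear interpolation between adjacent samples.
    """
    samples = list(zip(times, volts))
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if v0 < level <= v1:
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    raise ValueError("waveform never crosses the requested level")


def rise_time(times, volts, v_low, v_high, lo_frac=0.10, hi_frac=0.90):
    """Rise time between two fractions of the swing from v_low to v_high."""
    swing = v_high - v_low
    t_lo = crossing_time(times, volts, v_low + lo_frac * swing)
    t_hi = crossing_time(times, volts, v_low + hi_frac * swing)
    return t_hi - t_lo


if __name__ == "__main__":
    # Ideal 0 V to 5 V ramp lasting 100 ns (times in ns).
    t = [0.0, 100.0]
    v = [0.0, 5.0]
    print(rise_time(t, v, 0.0, 5.0))              # 10% to 90%
    print(rise_time(t, v, 0.0, 5.0, 0.20, 0.80))  # 20% to 80%
```

The same edge gives a noticeably shorter figure with 20%/80% thresholds, which is why an unspecified measurement method makes a data sheet limit hard to apply.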
Input Stages - Practice
• Data sheets may list a parameter for
  information only and not 100% tested
• Laboratory devices have shown that not all
  qualified devices will meet the data sheet
  - One case was when a part was shrunk
  - Migration to a faster process
  - Oscillations observed
• Conservative margins recommended

Input Stages - Termination
• Floating CMOS inputs are, in general, 'bad.'
  - Totem-pole currents, oscillations, etc.
• Some devices offer pull-up/down resistors
  - SX-S only active during power transitions
  - Xilinx resistors controlled by SRAM
  - Care on internal tri-state lines
• Dedicated Inputs
  - Actel unused inputs were handled by s/w
  - Not true for some SX, SX-S clocks
  Check each case carefully
Input Stages - Termination
Case Study: SX-S Clock Pin
[Figure]

Input Transition Times
Part Number      Reference     tT (max), ns
A1020            [1]           500
A1020B           [3]           500
RH1020           [illegible]   500
A1280A           [4]           500
RH1280           [4]           500
RT54SX16         [7]           50
RT54SX-S         [9]           [illegible]
XQR4000XL        [10]          [illegible]
Virtex           [11]          [illegible]
[illegible]      [12]          ?
AT6010           [13]          50
AT6010 (LV)      [14]          50
Input Transition Times
References
[1] ACT 1 Field Programmable Gate Arrays, March 1991.
[2] ACT 1 and ACT 2 Military FPGAs, April 1992.
[3] ACT 1 Series FPGAs, April 1996.
[4] Radiation Hardened FPGAs, v3.0, January 2000.
[5] ACT 2 Series FPGAs, April 1996.
[6] Accelerator Series FPGAs - ACT 3 Family, September 1997.
[7] 54SX Family FPGAs RadTolerant and HiRel, Preliminary v1.5,
March 2000.
[8] HiRel SX-A Family FPGAs, Advanced v.1, April 2000.
[9] RT54SX-S RadTolerant FPGAs for Space Applications, Advanced
0.2, November 2000.
[10] QPRO XQR4000XL Radiation Hardened FPGAs, DS071 (v1.1), June
25, 2000.
[11] QPRO Virtex 2.5V Radiation Hardened FPGAs, DS028 (v1.0),
April 25, 2000, Advance Product Specification.
[12] Not in data sheet.
[13] Configurable Logic Data Book, Atmel, August 1995.
[14] AT6000LV, Atmel, October 1999.

Clock Transition Time Specification
A Difficult Case
[Figure: data sheet excerpt, "AC Electrical Characteristics Over the Operating
Temperature Range (Read and Write Cycle Timing)".]
Transition Time Requirements
Implications - Pullup Resistors
• Often used for tri-state or bi-directional
  busses
• Rise time (10% - 90%) ≈ 2.2 RC
• Example
  C = 50 pF
  R = 10 kΩ (keep power levels reasonable)
  RC = 500 ns, so the 10% - 90% rise time ≈ 1.1 µs
  violates many devices' specifications (see table)

Transition Time Requirements
Implications - Filters and Protection Circuits
• Often used on signals
  - Elimination of noise
  - ESD protection
  - Etc.
• RC filters or clamps (high C) can often
  substantially degrade transition times
• Consider discrete hysteresis buffers,
  particularly for clock signals
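
The pull-up arithmetic above can be sketched as a quick check. This assumes an ideal first-order RC charge, where the 10% to 90% rise time is RC·ln(9), rounded to 2.2 RC as on the slide. The function names are illustrative.

```python
RISE_10_90_FACTOR = 2.2  # ln(0.9 / 0.1) is about 2.197; the slide rounds to 2.2


def rc_rise_time(r_ohms: float, c_farads: float) -> float:
    """10% to 90% rise time, in seconds, of an ideal RC pull-up."""
    return RISE_10_90_FACTOR * r_ohms * c_farads


def max_pullup_resistance(t_max_s: float, c_farads: float) -> float:
    """Largest pull-up resistance, in ohms, that keeps the rise time within t_max_s."""
    return t_max_s / (RISE_10_90_FACTOR * c_farads)


if __name__ == "__main__":
    c = 50e-12  # 50 pF bus capacitance, from the example above
    print(f"10 kohm pull-up: {rc_rise_time(10e3, c) * 1e9:.0f} ns")
    for limit_ns in (500, 50):  # two of the input limits listed in the table
        r_max = max_pullup_resistance(limit_ns * 1e-9, c)
        print(f"limit {limit_ns} ns -> R <= {r_max:.0f} ohm")
```

For a 50 ns input limit the resistor falls to a few hundred ohms, which conflicts with keeping power levels reasonable. That conflict is the case for a discrete hysteresis buffer.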
Bus Hold Circuit in an FPGA
[Figure]

Transition Time Requirements
Implications - Interfacing with older logic
families
• Case Study (1)
  - CD4000B CMOS NOR gate
  - VDD = 5V
  - tT (typ) = 100 ns
• Case Study (2)
  - CD4050B (used as a level shifter, for example)
  - VDD = 5V
  - tT (max, 25 °C) = 160 ns
Transition Time Requirements
Implications - Interfacing with older logic
families (cont'd)
• Case Study (3) - 54HC00 CMOS NAND gate
  - 5962-8403701VDA, NAND GATE, QUAD 2-INPUT

Test: output rise and fall (transition) time 3/
Symbols: tTHL, tTLH
Test conditions: CL = 50 pF, see figure 4
(-55°C ≤ TC ≤ +125°C unless otherwise specified)

  TC               VCC (V)   Max limit (ns)
  +25°C            2.0       75
  +25°C            4.5       15
  +25°C            6.0       13
  -55°C, +125°C    2.0       110
  -55°C, +125°C    4.5       22
  -55°C, +125°C    6.0       19

3/ Transition time (tTHL, tTLH), if not tested, shall be guaranteed to the
specified limits in table I.

Transition Time Requirements
Parameter Measurement
[Figure: output waveform between VOL and VOH showing how tTHL and tTLH are
measured.]
From: Figure 4, 5962-8403701VDA, NAND GATE, QUAD 2-INPUT
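
The three case studies can be compared directly with receiver input limits. A minimal sketch follows, using the driver transition times quoted above and two receiver limits as read from the input transition time table (A1020 at 500 ns, RT54SX16 at 50 ns). It ignores actual loading, since the data sheet figures apply only at their stated test conditions. The dictionary keys and function name are illustrative.

```python
# Driver output transition times in ns, as quoted in the case studies.
DRIVER_TT_NS = {
    "CD4000B, VDD=5V (typ)": 100,
    "CD4050B, VDD=5V (max, 25C)": 160,
    "54HC00, VCC=4.5V (max, -55C/+125C)": 22,
    "54HC00, VCC=2.0V (max, -55C/+125C)": 110,
}

# Receiver input transition time limits in ns, as read from the table.
RECEIVER_TT_MAX_NS = {
    "A1020": 500,
    "RT54SX16": 50,
}


def violations(drivers, receivers):
    """Return (driver, receiver) pairs where the driver edge exceeds the limit."""
    return [
        (drv, rcv)
        for drv, tt in drivers.items()
        for rcv, limit in receivers.items()
        if tt > limit
    ]


if __name__ == "__main__":
    for drv, rcv in violations(DRIVER_TT_NS, RECEIVER_TT_MAX_NS):
        print(f"{drv} is too slow for {rcv}")
```

With these numbers every flagged pair involves the 50 ns receiver, while the 500 ns parts tolerate all four drivers.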
Transition Time Requirements
Case Study: RH1020
• Production Parts
  - Input stage was modified for clock upset
• VCC = +5 VDC
• T = 25°C
• CLKBUF monitored on output
  Because of design of the buffer, difficult to see effects on the input
  pin
• Used a low impedance signal generator, triangle waveform
• Commercial specification is tR, tF of 500 ns
  - RH1020 did not meet this specification
  - SMD 5962-90965 does not specify this parameter

Transition Time Requirements
Case Study: RH1020 CLKBUF
[Figure: oscilloscope capture, 100 ns/div, realtime.]
Transition Time Requirements
RH1020 CLKBUF @ VIN threshold
[Figure: oscilloscope capture, 200 ns/div, realtime; measured frequency
104.000 MHz.]

Transition Time Requirements
Case Study: RH1020 CLKBUF Notes
Conditions: Room temp; VCC = 5.0 V.
Oscillations detected consistently at tR =
360 ns
Sporadic output pulses at tR = 300 ns
Transition time requirement not symmetric
  - Oscillations detected consistently at tF = 1.5 µs
  - Sporadic output pulses observed at tF = 1.0 µs
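
The RH1020 observations can be turned into a simple margin check. This is a sketch under stated assumptions: the data are the room-temperature observations listed above, the 1.0 figure for falling edges is taken to be in microseconds, and the 0.5 derating factor is a hypothetical placeholder rather than a recommended value.

```python
# Edge times in ns at which misbehaviour was observed (room temp, VCC = 5.0 V).
# The falling-edge "sporadic" value assumes the unit is microseconds.
OBSERVED_NS = {
    "rising": {"sporadic": 300, "consistent": 360},
    "falling": {"sporadic": 1000, "consistent": 1500},
}

COMMERCIAL_SPEC_NS = 500  # commercial tR, tF specification


def meets_spec(edge: str, spec_ns: float = COMMERCIAL_SPEC_NS) -> bool:
    """True if no misbehaviour was observed at or below the spec limit."""
    return OBSERVED_NS[edge]["sporadic"] > spec_ns


def design_limit_ns(edge: str, derating: float = 0.5) -> float:
    """Hypothetical design limit: a fraction of the fastest failing edge time."""
    return derating * OBSERVED_NS[edge]["sporadic"]


if __name__ == "__main__":
    for edge in OBSERVED_NS:
        print(edge, "meets 500 ns spec:", meets_spec(edge),
              "| derated limit:", design_limit_ns(edge), "ns")
```

The asymmetry matters in practice: a single transition time limit for both edges has to be set by the rising edge, which fails first.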
Interfacing - Voltage Margin
• TTL → CMOS
  - Problem with discrete circuits (still seen)
  - Normally not a problem with 5V FPGAs
  - Issue with new FPGAs
    • 0.35 µm may only pull up to 3.3 VDC
    • 0.25 µm may only pull up to 2.5 VDC
    • Can be issue with parts having a VIH = 70% VDD
    • Ringing can cause false triggering
• VIL = 0.8V and fast devices are sensitive to
  ringing on a backplane.

Inputs: RT54SX16 tT
[Figure: oscilloscope capture.]
RT54SX16 output (bottom trace) with a slow rising input (top trace)
which clocks a divide by two counter, resulting in a "glitch." The
clock input was provided by an HP8110A pulse generator.
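
The voltage margin concern for newer, lower-voltage drivers feeding inputs with VIH = 70% VDD can be sketched numerically. This assumes the best case, in which the driver's high level reaches its own supply rail; real VOH under load is lower. The function name is illustrative.

```python
def high_level_margin(driver_voh: float, receiver_vdd: float,
                      vih_fraction: float = 0.70) -> float:
    """Logic-high noise margin in volts: driver VOH minus receiver VIH.

    A negative result means the driver cannot guarantee a valid high.
    """
    return driver_voh - vih_fraction * receiver_vdd


if __name__ == "__main__":
    # Best case: VOH equal to the driver's supply rail.
    for voh in (3.3, 2.5):
        margin = high_level_margin(voh, receiver_vdd=5.0)
        print(f"{voh} V driver into 5 V CMOS input: margin {margin:+.2f} V")
```

Both cases come out negative even before ringing or loading is considered, which is why the interface needs a level shifter or a receiver with TTL-compatible thresholds.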
Inputs: RT54SX16 tT
[Figure: oscilloscope capture, 20.0 ns/div, realtime.]
RT54SX16 output (bottom trace) with a slow rising input (top trace)
which clocks a divide by two counter, resulting in a "glitch." The
clock input was provided by an HP8110A pulse generator.

JTAG and Loss of Control
• Run TCK with TMS = '1'
  - Guaranteed to return to TEST_LOGIC_RESET
    state within 5 clocks.
• Share system clock with TCK
• JTAG Hit
  • Inputs turn to outputs
  - Clock pin turns to output, clamps system clock
  No TCK, system hangs.
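
The five-clock guarantee can be checked against the TAP controller state diagram. The following is a minimal sketch, with the 16-state transition table transcribed by hand from the IEEE 1149.1 state diagram. The dictionary layout and function name are illustrative.

```python
# state: (next state when TMS = 0, next state when TMS = 1)
TAP = {
    "Test-Logic-Reset": ("Run-Test/Idle", "Test-Logic-Reset"),
    "Run-Test/Idle":    ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-DR-Scan":   ("Capture-DR", "Select-IR-Scan"),
    "Capture-DR":       ("Shift-DR", "Exit1-DR"),
    "Shift-DR":         ("Shift-DR", "Exit1-DR"),
    "Exit1-DR":         ("Pause-DR", "Update-DR"),
    "Pause-DR":         ("Pause-DR", "Exit2-DR"),
    "Exit2-DR":         ("Shift-DR", "Update-DR"),
    "Update-DR":        ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-IR-Scan":   ("Capture-IR", "Test-Logic-Reset"),
    "Capture-IR":       ("Shift-IR", "Exit1-IR"),
    "Shift-IR":         ("Shift-IR", "Exit1-IR"),
    "Exit1-IR":         ("Pause-IR", "Update-IR"),
    "Pause-IR":         ("Pause-IR", "Exit2-IR"),
    "Exit2-IR":         ("Shift-IR", "Update-IR"),
    "Update-IR":        ("Run-Test/Idle", "Select-DR-Scan"),
}


def clocks_to_reset(state: str) -> int:
    """Number of TCK rising edges with TMS = 1 needed to reach Test-Logic-Reset."""
    clocks = 0
    while state != "Test-Logic-Reset":
        state = TAP[state][1]
        clocks += 1
    return clocks


if __name__ == "__main__":
    worst = max(clocks_to_reset(s) for s in TAP)
    print("worst case clocks to reset:", worst)
```

The recovery works only while TCK is toggling. If TCK shares the system clock and an upset turns the clock pin into an output that clamps it, no clock edges arrive and the controller cannot be reset, which is the hang described above.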