SOA, Watts Up, Transistor?: A Mystery of Self-Destructing MOSFETs

Editor’s note: I am pleased to have Ken Coffman and his partner-in-crime in this excellent blog, Jon Dutra, Principal Electrical Engineer, Microsoft.

Being an engineer is often like being a detective. Clues are studied. Data is collected. Deductions are, uh, deduced. Pipes are smoked. No, wait, that’s Sherlock Holmes.

In this case we have a circuit that worked well at one operating point, but reliably self-destructed at another operating point. The active load circuits were the same. The transistor power dissipation was the same. The engineer pulls out his hair and cries out to the heavens: ‘What is going on?’

In this circuit, we have a pair of parallel power transistors in a constant current load, with each FET dissipating 18W. One design delivers 18A with VDS at 1V with great reliability. The modified circuit delivers 1.5A with VDS at 12V. It lets the smoke out. The power dissipation in the FETs is the same, so the failure rate should be the same, right? Not so fast, Doctor Watson.

Generally speaking, Safe Operating Area (SOA) is a poorly understood parameter. If you don’t believe us, ask your FET supplier to explain how the SOA test is performed and what the results mean in a typical design. Good luck with that.

From the FET datasheet (because we’re serious professionals, we won’t mention the vendor; wait, yes we will, it’s ST and the part number is STP27N3LH5), we see the SOA chart below with the two operating points marked with stars. Basically, what we’re saying is the green star works and the red star fails. Both stars are in the “safe” area and you’d be forgiven for thinking the red star should represent a more robust operating point. Take a minute to study the SOA plot while studiously trying to ignore the typo. What’s a Sinlge pulse? But, let’s not digress.

STP27N3LH5 SOA Plot

For this part, ST does not show the DC test result. Why is that? Is it because the DC performance was outstanding and they did not want to brag? Maybe. Maybe not, but here we are, digressing again.

What’s with the ascending line labeled “Operation in this area is Limited by max RDS(On)”? Beyond questionable syntax, could this note have a relation to the reliable parallel-FET performance? [Insert an image here of Jon and Ken looking at each other with puzzled expressions.]

The note means the SOA performance might be good in that area, but we’ll never know because the FET RDS(On) will not allow the control loop to drive us into that region. Put another way: the control loop will overdrive the FET gate to try to operate in that region. Aha. In the 1V@18A case, the FETs are not operating in the high-stress linear mode—their channels are either fully enhanced or close to fully enhanced and the FETs are forced into sharing current.
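One way to see the difference between the two stars is to compare the effective channel resistance each operating point demands, VDS/ID, against the datasheet's maximum RDS(On). Here is a minimal Python sketch, taking the article's figures at face value as per-FET values. The function names are ours, and `rds_on_max_ohm` is deliberately left as an input to be read from the datasheet rather than assumed here.

```python
# Per-FET operating points quoted in the article: (VDS in volts, ID in amps).
OPERATING_POINTS = {
    "green star (works)": (1.0, 18.0),
    "red star (fails)": (12.0, 1.5),
}


def power_w(vds_v, id_a):
    """Power dissipated in the FET at a DC operating point."""
    return vds_v * id_a


def effective_resistance_ohm(vds_v, id_a):
    """Resistance the channel must present to sit at this operating point."""
    return vds_v / id_a


def rds_on_limit_current_a(vds_v, rds_on_max_ohm):
    """Current on the 'limited by max RDS(On)' line of an SOA plot at a given VDS.

    rds_on_max_ohm must come from the datasheet; no value is assumed here.
    """
    return vds_v / rds_on_max_ohm


if __name__ == "__main__":
    for name, (vds_v, id_a) in OPERATING_POINTS.items():
        r_mohm = effective_resistance_ohm(vds_v, id_a) * 1000.0
        print(f"{name}: P = {power_w(vds_v, id_a):.1f} W, VDS/ID = {r_mohm:.1f} mOhm")
```

Both points dissipate 18 W, but the green star asks the channel to look like roughly 56 mΩ, which is on-resistance territory, while the red star asks for 8 Ω, about 144 times more. The FET can only present that much resistance when it is partially enhanced, which is the linear mode.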

‘Linear mode.’ There is a phrase to strike sheer terror into the hearts of engineers. For the 12V@1.5A case, the FETs are definitely forced by the control loop into the high-stress linear mode. That’s why they fail. Shall we discuss linear mode operation in more detail? Sure, if you want us to, we will go into infinite, tedious detail in a follow-on blog.

A secondary problem is that nothing forces the red-star FETs to share current across the die. Why not? It’s common knowledge that RDS(On) increases with increasing temperature and that’s good for forcing parallel FETs to share current. So why won’t the FETs share if they are operating in the linear mode? The ST datasheet does not show this, but with increasing temperature, the VGS threshold decreases and the ΔiDrain/ ΔvGS transconductance increases. This is not good—it means the hot area of the transistor hogs more current.

The same thing happens inside a single device, because one FET is actually composed of many FET cells in parallel to reduce the on-resistance. With a low VDS, the increase in current does not correspond to a large relative increase in local power. With a VDS of 12 V, however, the local temperature rises a lot with the increased current, which further increases transconductance and local heating. BOOM: the spot gets too hot and fails locally, and now the FET is shorted.
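The positive feedback described above can be put into a rough number. What follows is a hedged Python sketch, not a model of the actual part. A hot cell's extra current per degree is approximated as transconductance times threshold drift, the extra power is that current times VDS, and the extra temperature is that power times a local thermal resistance. The product is a loop gain: above 1, a small temperature bump grows instead of dying out. The 3 A/V and 3 mV/°C figures are the illustrative ones this article uses for its arithmetic. The 10 °C/W local thermal resistance is purely hypothetical, and the sketch ignores both the mobility drop that opposes the effect at high gate drive and the action of the control loop.

```python
def thermal_loop_gain(vds_v, gm_a_per_v, dvth_v_per_degc, rth_degc_per_w):
    """Rough electro-thermal loop gain of a hot spot at fixed gate voltage.

    dvth_v_per_degc is the magnitude of the threshold drop per degree C.
    rth_degc_per_w is a HYPOTHETICAL local thermal resistance.
    """
    extra_amps_per_degc = gm_a_per_v * dvth_v_per_degc
    extra_watts_per_degc = vds_v * extra_amps_per_degc
    return extra_watts_per_degc * rth_degc_per_w


def critical_rth_degc_per_w(vds_v, gm_a_per_v, dvth_v_per_degc):
    """Local thermal resistance at which the loop gain reaches 1."""
    return 1.0 / (vds_v * gm_a_per_v * dvth_v_per_degc)


if __name__ == "__main__":
    GM = 3.0          # A/V, illustrative figure used in the article
    DVTH = 0.003      # V per degree C (30 mV per 10 degrees)
    RTH_LOCAL = 10.0  # degree C per W, hypothetical placeholder
    for vds in (1.0, 12.0):
        gain = thermal_loop_gain(vds, GM, DVTH, RTH_LOCAL)
        rth_crit = critical_rth_degc_per_w(vds, GM, DVTH)
        print(f"VDS = {vds:4.1f} V: loop gain = {gain:.2f}, critical Rth = {rth_crit:.1f} C/W")
```

Whatever the real thermal resistance turns out to be, the sketch shows the scaling: for the same transconductance and threshold drift, the 12 V point has 12 times the loop gain of the 1 V point, so it tips into runaway at a 12 times smaller local thermal resistance.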

It is interesting that the transconductance parameter does not appear in the ST data sheet. Clicking around on the web, we found that for a similar FET, if the transistor heats up by 10 °C, the threshold voltage drops by about 30 mV. Recall that ID = Gm × VGS. Let’s look at these two cases. Let’s say that Gm = 3 A/V and we have an on-die 33 °C rise, lowering the threshold by 100 mV. In the low-voltage, 18 amp case, the VGS needs to be 6 volts to satisfy 18 = 3 × 6.

In the 12 volt VDS, 1.5 amp case, we have 1.5 = 3 × 0.5, so VGS is only 0.5 volts. The same 33 °C rise, with its 100 mV threshold shift, therefore locally increases current, and power, by 20% (0.1 V out of 0.5 V) in the 12 V VDS, low-current case, but only by about 1.7% (0.1 V out of 6 V) in the high-current, low-voltage case. At some point, thermal runaway will occur. As a localized spot on the die gets hot, the threshold decreases, so it gets hotter and it hogs the current. The op amp gate driver forces the same current to flow into the smaller area and BOOM.
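The arithmetic above is easy to replay. Here is a minimal Python sketch using the article's simplified ID = Gm × VGS model and its illustrative numbers (Gm = 3 A/V, a 100 mV threshold drop); the function names are ours.

```python
GM_A_PER_V = 3.0          # illustrative transconductance from the article
THRESHOLD_DROP_V = 0.100  # article's rounded threshold shift for the on-die rise


def required_vgs_v(id_a, gm_a_per_v=GM_A_PER_V):
    """Gate drive needed under the simplified ID = Gm * VGS model."""
    return id_a / gm_a_per_v


def relative_current_increase(id_a, threshold_drop_v=THRESHOLD_DROP_V,
                              gm_a_per_v=GM_A_PER_V):
    """Fractional rise in local current when the threshold drops at fixed gate voltage.

    In this model a threshold drop acts like the same amount of extra gate drive,
    so the fractional change is simply threshold_drop / VGS.
    """
    return threshold_drop_v / required_vgs_v(id_a, gm_a_per_v)


if __name__ == "__main__":
    for label, id_a in (("1 V, 18 A", 18.0), ("12 V, 1.5 A", 1.5)):
        vgs = required_vgs_v(id_a)
        pct = 100.0 * relative_current_increase(id_a)
        print(f"{label}: VGS = {vgs:.2f} V, local current rises by {pct:.1f}%")
```

Because VDS is held fixed at each operating point, the local power rises by the same percentage as the local current, and the low-current case is 12 times more sensitive to the same threshold shift.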

In the circuit that fails, what is happening? The FETs attempt to share current, but there are hotspots on the die of the failed device and the pinpoint thermal stress is incredible. How would we fix this problem? What are the characteristics of FETs that work well in the linear mode? We will not answer that question here and now, but, as a hint, we will clip from an ancient Hitachi 2SK1058 FET datasheet. This old FET was renowned for good linear mode operation. Look for yourself in the SOA plot below. Would 12V@1.5A drive this FET into the RDS(On) limit area and force the FETs to share current? You probably wouldn’t use this ancient, obsolete FET in your design, but you’d try to find a modern FET with a similar characteristic.

One thing that might help is to use FETs with lower transconductance for the higher VDS case. In this way it will run with a higher VGS and the effect of thermally induced threshold shift will be reduced.
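To put a rough number on that suggestion, here is a short Python sketch under the same simplified ID = Gm × VGS model used earlier in the article. The 3 A/V value is the article's illustrative figure; the lower Gm values are hypothetical and are not taken from any datasheet.

```python
def hot_spot_current_increase(id_a, gm_a_per_v, threshold_drop_v=0.100):
    """Fractional local current rise for a given threshold drop at fixed gate voltage."""
    vgs_v = id_a / gm_a_per_v  # a lower Gm means a higher VGS for the same current
    return threshold_drop_v / vgs_v


if __name__ == "__main__":
    ID_A = 1.5  # the 12 V, 1.5 A operating point
    for gm in (3.0, 1.0, 0.5):  # A/V; 1.0 and 0.5 are hypothetical candidates
        vgs = ID_A / gm
        pct = 100.0 * hot_spot_current_increase(ID_A, gm)
        print(f"Gm = {gm:.1f} A/V: VGS = {vgs:.2f} V, current rise = {pct:.1f}%")
```

In this model the sensitivity scales directly with Gm: dropping from 3 A/V to 0.5 A/V raises the required VGS from 0.5 V to 3 V and cuts the effect of a 100 mV threshold shift from 20% to about 3.3%.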

For those who know us, we love to argue. So, if you disagree, make your case in the comments section below. Otherwise, case closed, Dr. Watson. Pour the sherry.

Old School FET, the 2SK105x SOA Plot

15 comments on “SOA, Watts Up, Transistor?: A Mystery of Self-Destructing MOSFETs”

1. Jerry Steele
October 18, 2016

Spot on guys.

When I was working with MOSFETs in a very high power hot swap application (it should be noted the hot swap controller could monitor MOSFET power dissipation, not just current), our rule was that when paralleling MOSFETs you treat them as if you have the SOA of a single MOSFET. The only reason this was reliable with the hot swaps is that the loop is always designed to force the devices DOWN to the desired current level. In such a case you are assuming that the MOSFET hogging current is the only one being controlled (all others are off, or close to it). So you selected current limits, power limits, and timeouts to respect that.

Also, I know from having talked to several MOSFET manufacturers that some, not all, characterize SOA by testing to failure (which always must result in a large pile of failed devices). Some don't. Not sure all of them even know how to spell SOA. But some are pretty good at it.

2. RobTheNob
October 19, 2016

Great article. I discovered more or less how this works at the expense of a big pile of ruined FETs, with case sizes increasing towards the top of the pile. In our case we ended up with one huge FET (gate capacitance not an issue for this job) and servo loop control.

3. Hooey0
October 19, 2016

Excellent post. However, I'm surprised that the Spirito Effect was not mentioned. For those interested in understanding what it's about, there's ample literature available on the web.

4. David_Ashton_EC
October 19, 2016

It would have been nice if the schematic could have been enlarged – it's usually done with a link below it, talk to the editor about this.

When in the SOA characteristics they say “Single Pulse” (I'll ignore the typo as you say) it's unlikely anyone would design a circuit for just one pulse.  So is it one pulse in a second?  At a certain duty cycle?  Do they define this at all?

You talk about constant current and mention linear mode, so I get the impression you were doing just that, delivering a current at a 100% duty cycle?  Am I right?  A larger schematic would help here.

But a great article, thanks.

5. David.Ellison
October 19, 2016

It is not uncommon for large MOSFETs to contain multiple die. This should strike great fear in the heart of engineers with linear applications. The manufacturers invariably match these die with switching applications in mind, and do not pay attention to the gate parameters that you identified as crucial for matching in linear applications. In addition, their sales staff rarely understand these issues, and are often powerless to provide good information when they do understand. For engineers that are paralleling separate MOSFET packages, there is little excuse for not providing some form of external feedback to force nearly perfect sharing. Sense resistors and op-amps just aren't that expensive when compared to the cost of leaving them out.

6. ethan77
October 20, 2016

Is that old transistor systems approach the stereo channels today?

7. Ken.Coffman
October 22, 2016

You're right, David. This is an active load circuit, so it's intended to draw a calibrated DC current. We're working on getting a higher resolution schematic. There are standard methods of testing SOA, but even when I was in the power FET business (Fairchild), there were few who really understood how the data was collected and what it means. And worse, there was no guidance on extrapolating the test results to an actual circuit. SOA is generally tested with a cold plate fixed at 25C. How many designs ship with an infinite heat sink? When we did specific SOA testing for a customer, I was shocked at how wide the variance was, and that's with a captive fab; I can only imagine the situation is worse with an outsourced fab. Let the FET buyer beware.

8. Ken.Coffman
October 22, 2016

You make good points, Jerry. One thing that is interesting is what “test to fail” means. For Fairchild (and probably most everyone else, but I don't know), test to fail meant the die reached the high operating temperature (usually 150C or 175C). The silicon is not actually damaged at that temperature, so there is some margin at that limit. I was always more interested in the operating point where the die is actually being damaged, but it's time-consuming and tedious to collect that data for all the operating points in the SOA curve.

9. David_Ashton_EC
October 22, 2016

Thanks Ken.  There's a lesson in interpreting Datasheet info in here!  I know most MOSFETS are used in pulse mode in SMPSs, but I've always been wary of the claimed dissipation ratings for the usual small TO-220 packages, even on good heatsinks.  Thanks again.

10. Ken.Coffman
October 24, 2016

Yes, Hooey, you're right, we should have mentioned the Spirito Effect as a name for what we are describing. This is the thermal instability region where the VGS threshold has a negative temperature coefficient. At higher VGS, the thermal coefficient goes positive, which helps FET cells to share current. For trench devices, operating with a low VGS is very dangerous and the SOA capability can be very much lower than what you expect. For trench FETs that don't show the Spirito bend on the right side of the SOA plot, be very suspicious of the SOA capability.

11. Hooey0
October 24, 2016

Ken, thanks for the response. This is a great thread, with a ton of very useful information for designers of circuitry requiring both linear and switch mode MOSFETs.

I had the opportunity to deal with this exact issue a few years ago when a subcontracted electronics board used the wrong type of MOSFET for a linear application. We burnt up quite a few devices in the process until we became aware of what the root cause of the failures was. The designer had never heard of the Spirito Effect before, but he sure did afterwards…

I have a PDF file from IR (P. Spirito worked there), but I don't see any way to attach it. It's IR application note AN-1155 for those who are interested.

Note that there's more information available on this subject, as I noted in my earlier post.

12. Thomas7
October 25, 2016

Great and informative post!

13. Ken.Coffman
October 25, 2016

That's nice of you to say, Monica. Jon and I really appreciate it.

14. taylor123az
November 6, 2016

I agree with you at this point

15. JamesLankford
November 18, 2016

Way cool! Some substantial focuses extremely! I welcome you
