Extrapolation and small numbers: peril and promise

I got angry again when I heard yet another news report that highlighted a few scattered data points about some crisis and then added a phrase such as “if this trend continues . . .” or “at this rate . . .” Let's be honest: it's tricky and dangerous to extrapolate from a small set of events in a very large pool of numbers. It can give you insight, but it can also lead you astray.

As engineers, we often do have to take rare events and use them to understand the bigger picture. For example, we look at causes of errors when the bit error rate (BER) is 1 in 10^12 or less, to see if there is a systemic problem or just truly random fluctuations due to the laws of physics. We also look at scattered, seemingly random failure reports to understand why a product has failed.
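To put a number on how rare such events are, here is a minimal sketch of the standard zero-error confidence estimate: how many bits must pass without a single error before you can claim the BER is below a target. The 95% confidence level and the 10 Gb/s line rate are assumptions chosen for illustration, and `bits_needed` is a hypothetical helper name.

```python
import math

def bits_needed(ber_target: float, confidence: float = 0.95) -> float:
    """Error-free bits required to claim BER < ber_target at the given confidence."""
    # With zero observed errors, P(no errors in n bits) = (1 - ber)^n, about exp(-n * ber).
    # Requiring that probability to be at most (1 - confidence) gives n.
    return -math.log(1.0 - confidence) / ber_target

n = bits_needed(1e-12)      # roughly 3e12 bits for a 1-in-10^12 target
seconds = n / 10e9          # test time at an assumed 10 Gb/s line rate
print(f"{n:.2e} bits, about {seconds:.0f} s at 10 Gb/s")
```

That is the best case, with no errors at all. Once one or two errors do show up, telling a systemic fault from random noise takes considerably more data.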

But the flip side is that it is too easy to start concluding that something is really happening when it may not be. Several months ago, I read an article about static electricity apparently sparking fires at gas stations while people were filling up their cars' gas tanks (“Static Fires Are a Peril at the Pump,” The New York Times, July 27, 2008). It seems, and I emphasize “seems,” that static may have caused 170 such fires from 1992 to 2006, or roughly a dozen per year. When you compare this to the 11 billion fill-ups per year, you have to wonder about the significance of the data, and whether you can or should draw any conclusions.
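A quick back-of-the-envelope check makes the scale concrete. The sketch below uses the article's figures (170 fires, 11 billion fill-ups per year) and makes two assumptions of its own: that 1992 to 2006 counts as 15 years, and that the fires behave like a Poisson process, so a yearly count with mean m has a standard deviation of about sqrt(m).

```python
import math

fires = 170                  # total fires, from the article
years = 2006 - 1992 + 1      # 15, counting both end years (an assumption)
fillups_per_year = 11e9      # from the article

fires_per_year = fires / years
rate_per_fillup = fires_per_year / fillups_per_year

# Poisson assumption: yearly counts scatter around the mean by about sqrt(mean).
sigma = math.sqrt(fires_per_year)
low, high = fires_per_year - 2 * sigma, fires_per_year + 2 * sigma

print(f"{fires_per_year:.1f} fires per year, {rate_per_fillup:.1e} per fill-up")
print(f"ordinary year-to-year range: about {low:.0f} to {high:.0f} fires")
```

Under those assumptions the rate works out to about one fire per billion fill-ups, and a year with 5 fires followed by a year with 18 would be ordinary scatter, not a trend.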

To add to the small-number cloudiness, the researchers said that 78% of the drivers who had these fires were women, and began speculating about why this was. Among the reasons, and I am not making this up, one researcher speculated that men are heavier than women and therefore grasp the car door when getting out, thus giving the static a discharge path to ground. Could be, but hold on: while 78% sounds so precise, the small number of fire events in the data makes me very wary of drawing any conclusions.
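One way to see how much precision a figure like 78% really carries is to put a confidence interval around it. The sketch below computes a Wilson score interval for an observed proportion of 0.78 at several sample sizes. The sample sizes are hypothetical: the article does not say how many of the fires had the driver's sex recorded, so 20 and 50 are illustrative guesses and 170 is simply the upper bound. `wilson_interval` is a helper name introduced here.

```python
import math

def wilson_interval(p_hat: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives about 95%)."""
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

# Hypothetical sample sizes; the true number behind the 78% is not given.
for n in (20, 50, 170):
    lo, hi = wilson_interval(0.78, n)
    print(f"n = {n:3d}: 78% could plausibly be anywhere from {lo:.0%} to {hi:.0%}")
```

The interval narrows as the sample grows, but with only a few dozen cases the second digit of “78%” tells you essentially nothing. And even a tight interval says nothing about why the split is what it is.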

That's the dilemma for us: how to work with small numbers of events in a large sea of data, and draw valid conclusions without going down a meaningless or deceptive path. You also have to watch out for the corollary danger of extrapolating such data to the point of ridiculousness. I've heard many experts say that product time-to-market and cycles are getting shorter and shorter, leavened with a number such as “shrinking by x% per year,” then followed with a casual extrapolation such as “at this rate, they will be half of what they are now, in just a few years.”

Yes, and if that continues, the time-to-market cycle will soon asymptotically approach zero. Hey, I say, go one better, and use a linear decline of so many months per cycle instead of a percentage. That way, the time-to-market may even go negative, and products will be introduced before they are conceived!
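For anyone who wants to watch the absurdity unfold, here is a small sketch of both extrapolations. Every number in it is made up for illustration: a 24-month starting cycle, a 15% shrink per year standing in for “x%,” and a 4-month-per-year linear decline.

```python
import math

start_months = 24.0      # hypothetical current time-to-market
pct_per_year = 0.15      # hypothetical "shrinking by x% per year"
linear_per_year = 4.0    # hypothetical linear decline, months per year

def percent_decline(years: float) -> float:
    """Cycle length if the same percentage is shaved off every year."""
    return start_months * (1 - pct_per_year) ** years

def linear_decline(years: float) -> float:
    """Cycle length if the same number of months is shaved off every year."""
    return start_months - linear_per_year * years

# Years until the percentage model halves the cycle.
halving_years = math.log(0.5) / math.log(1 - pct_per_year)
print(f"half the current cycle after {halving_years:.1f} years")

for y in (2, 4, 6, 8):
    print(f"year {y}: percent model {percent_decline(y):5.1f} months, "
          f"linear model {linear_decline(y):5.1f} months")
```

With these made-up figures the percentage model halves the cycle in a little over four years and then keeps creeping toward zero without ever reaching it, while the linear model hits zero at year six and goes negative after that. Neither curve knows anything about what actually limits a design cycle.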

-x x x-
