Would it surprise you that the likelihood of two people out of any twenty-three having the same birthday is better than 50/50? At my last gig, three of us out of a group of maybe 30 shared the same birthdate! There's even a name for this coincidence: the Birthday Paradox. If you've ever cherry-picked analog components looking for ones that precisely matched, you've depended on a similar phenomenon. As it turns out, the mathematics behind the Birthday Paradox shows cherry picking doesn't work when it comes to designing a profit-making product — or does it?

I've been working for a while now on a variant of classical cherry picking that works great for challenges like ultrasound analog front-end design, where the future is lower power at lower cost with literally hundreds or even thousands of precision-matched channels.

I put together a chart, Figure 1, which quantifies why old-school cherry picking doesn't work and why this relatively new variant does. I designed the chart while contemplating capacitor matching for high-precision, low-power data converters, but to keep it simple let me use amplifier offset voltage as the objective.

**Figure 1**

Let's say you bought a boatload of LM324 quad op amps real cheap, but you only want those with input offset voltages equal to the typical 3mV specification or better. So the question is, “How many amplifiers will you need to test before you find an amplifier with the typical 3mV offset?” The answer can be found by looking at the chart.

Think of the X-axis of Figure 1 as the maximum number of times you'd need to pick until you found that 3mV amplifier. Since you aren't looking for any reduction relative to the typical, find the Y-axis intercept where the Cherry Picking curve crosses one, since a 3mV objective / 3mV typical = 1. Then follow the dashed line down to the X-axis intercept. Here you find that you may need to pick up to 15 times before you find your first one. Now it won't be that bad all the time. The curves represent a 3-ppm failure rate on picking, which is about 6-sigma. More likely you'd only need to look at one or two, which is why you can usually just buy the next higher grade for about that same effective cost ratio. Essentially, the manufacturer cherry picks for you: LM124s versus LM324s, for example. But classical cherry picking falls apart, even for the manufacturer, when one needs to be lucky hundreds or thousands of times on a thousand-channel ultrasound analog front-end.
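Under the simplest assumption — each pick independently has some probability p of meeting the objective — the worst-case pick count for a given miss rate falls out of a one-line formula. This is only a sketch with hypothetical names; the chart reads about 15, so it evidently builds in a somewhat better-than-even per-pick probability:

```python
import math

def worst_case_picks(p_success, failure_rate):
    """Number of independent picks needed so the chance of never
    finding a qualifying part drops below failure_rate."""
    # Solve (1 - p_success)**n <= failure_rate for n
    return math.ceil(math.log(failure_rate) / math.log(1.0 - p_success))

# If "typical" means half the lot is at or better than 3 mV:
print(worst_case_picks(0.5, 3e-6))  # 19 picks for a 3-ppm miss rate
```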

To get around cherry picking, analog IC designers use the Brute Force curve. Looking back at the chart, one sees that the 1.0 Y-axis intercept for Brute Force is at the rightmost edge of the drawing, crossing over the series of blue dots at about 26. What this tells a designer is that they need to increase the total area of the input transistors by a factor of 26. So what they do is lay down 26 typical transistors in two composite groups of 13. In the context of the Birthday Paradox, this is choosing 13 from a group of 26, once. The key word here is *once*. And then they crank up the operating current to keep the bandwidth constant despite the accompanying higher capacitances.
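The physics behind Brute Force is the familiar Pelgrom-style matching law: averaging over more device area shrinks the random mismatch sigma as the square root of area. A quick sketch (the 3 mV sigma here is a hypothetical stand-in, not a datasheet number):

```python
import math

def mismatch_sigma(sigma_single, n_devices):
    """Pelgrom-style scaling: averaging n matched devices shrinks the
    random mismatch standard deviation by sqrt(n)."""
    return sigma_single / math.sqrt(n_devices)

sigma1 = 3.0  # mV, hypothetical single-device offset sigma
print(mismatch_sigma(sigma1, 26))  # about 0.59 mV: roughly a 5x improvement
```

Note the cost: a 5x improvement in sigma took 26x the area, and the current had to scale with it.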

Now consider if the design enabled choosing 13 from 26 in all possible ways after the IC was manufactured. Tracing 26 down to the X-intercept shows that there are literally millions of ways to cherry pick from a pool of 26 transistors. And perhaps even more astounding, at least one of those combinations would be a factor of 100,000 better, as indicated on the Cherry Picking curve! This is a ridiculous example, of course, but it makes a very important point.
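The "literally millions" is easy to check — it's just a binomial coefficient:

```python
from math import comb

# Number of distinct ways to choose 13 transistors out of a pool of 26
print(comb(26, 13))  # 10,400,600 -- literally millions of combinations
```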

Nothing in this variant of cherry picking has anything to do with size, unlike Brute Force, where larger always has limits and always translates to more power loss. Like the Birthday Paradox, it is only combinations that drive the result. So not only can this variant greatly improve matching, it can also lower power. As the chart indicates, the blue-dot intercept is at about 8 typical transistors, so 35 possible combinations guarantee with 6-sigma confidence that 3mV becomes 1.5mV with 65 percent less power. These same benefits can work for matched capacitors in data converters.
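One way to arrive at the figure of 35 (my assumption, since the chart's bookkeeping isn't spelled out here): split the 8 devices into two matched halves of 4, and count a split and its complement as the same configuration:

```python
from math import comb

n = 8        # typical transistors in the pool (the chart's blue-dot intercept)
half = n // 2
# comb(8, 4) counts every group of 4; each split is counted twice
# (once as the group, once as its complement), hence the division by 2.
splits = comb(n, half) // 2
print(splits)  # 35 distinct ways to pair up the pool
```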

Consider driving the huge capacitance at the front end of an ADC. Manufacturers will tell you that it is large to minimize kT/C noise, and then sell you an external amplifier to drive that large capacitance, consuming even more power. In a low-power SAR, this is a dominant point of energy loss. But ultrasound designers use other tricks to minimize noise rather than Brute Force. They take advantage of the phenomenon whereby noise adds as the square root while signals add linearly, and hence build parallel signal-processing systems.
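The square-root trick in numbers: summing n identical channels grows the signal by n but the uncorrelated noise only by sqrt(n), so SNR improves by sqrt(n):

```python
import math

def snr_gain_db(n_channels):
    """Summing n identical channels: signal grows as n, uncorrelated
    noise grows as sqrt(n), so SNR improves by sqrt(n)."""
    return 20.0 * math.log10(n_channels / math.sqrt(n_channels))

print(round(snr_gain_db(2), 2))  # two parallel channels buy about 3 dB
```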

Why not do the same with an ADC and put down two ADCs in parallel? Maybe it's time to rethink the old methods of Brute Force, lasers, auto-zero switching clocks, and programming trim code ROMs. Cherry picking a static combination is a digital problem: cheap and low power without sacrificing performance. How far can this be carried? Well, how small can one make a capacitor on an IC? Start there.

One must be careful with the birthday paradox – you've got to state it as you did – “With a room of [pick a PeopleValue between 20 and 30 depending on how bold you feel] people, how likely is it that two will share a birthdate (not year specific, of course)?”

As opposed to – same PeopleValue as above – “How likely is it someone in this room has the same birthdate as me?” That's sort of what everyone perceives, and they therefore assume the odds are 1 in 365.
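The gap between the two questions is striking when you compute both (assuming 365 equally likely days, function names mine):

```python
def any_pair_shared(n):
    """P(at least two of n people share a birthday), 365 equally likely days."""
    p_all_distinct = 1.0
    for i in range(n):
        p_all_distinct *= (365 - i) / 365
    return 1.0 - p_all_distinct

def someone_matches_me(n):
    """P(at least one of n OTHER people shares MY birthday)."""
    return 1.0 - (364 / 365) ** n

print(round(any_pair_shared(23), 3))     # about 0.507: better than even
print(round(someone_matches_me(23), 3))  # about 0.061: what intuition expects
```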

Reminds me of the Monty Hall-Let's Make a Deal Door #1, #2, or #3 puzzle. Counterintuitive.

Hi Scott. The birthday paradox seems simple compared to the component sorting you are discussing. I'm confused in part because I envision component values to vary continuously according to some distribution, and you are sampling from that distribution, as compared to the birthday case where there are a finite number of answers (365 or 366 different days, plus assuming away the non-uniform distribution of birth dates etc.).

The other thing making my head hurt is that for most purposes, it would seem the population from which you are selecting the desired components is essentially infinite, vs. the very finite 20 to 30 people in the room. So I want to address your component problem with probabilistic approaches vs. combinatoric approaches.

Where am I going wrong?

Hi Scott–thanks for your additional notes. I'm OK with the binomial coefficient description of the birthday problem. But if you look into that, the number of “states” that the date can have (i.e. Jan. 1, Jan. 2…Dec. 30, Dec. 31=365 states) factors into the specific result of how many would you have to choose to have at least a 50% chance of two the same. In other words, if we said how many would you have to choose to have at least a 50% chance that the birth date AND hour were the same, then, using a 24 hour clock, the number of states is 365*24 and the number we have to pick out in pairwise sampling is much higher.

So, when I look at the components, and you are selecting for some value, say the value of a resistor, there are infinitely many possible values for each resistor.

My comment about the distribution is not about the shape but more about turning the problem back into the “choose 6 from a group of 12” problem–you have to define what range of values is “the same” for your purpose, then that defines a bin size, and the tolerance of the original parts would tell you what the distribution of the parts might be and from that you get how many bins you need to cover the distribution. The number of bins is the number of states the parts can have, so you are roughly back to the birthday problem.
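The binning idea above can be sketched directly: pick a "same-enough" bin width, count how many bins the part distribution spans, and the collision math is the birthday problem with that many states. The part numbers below are hypothetical, purely for illustration:

```python
def collision_probability(k, d):
    """Birthday-style: P(at least two of k samples land in the same bin),
    with d equally likely bins."""
    p_distinct = 1.0
    for i in range(k):
        p_distinct *= (d - i) / d
    return 1.0 - p_distinct

# Hypothetical: 1% resistors (a 2% total span), "matched" means within 0.05%.
tolerance_span = 2.0   # percent, -1% to +1%
bin_width = 0.05       # percent
d = int(tolerance_span / bin_width)  # 40 bins = 40 "states"
print(round(collision_probability(8, d), 2))  # with 8 parts, a match is already better than even
```

With d = 365 it reproduces the classic birthday result, which is the commenter's point: binning turns the continuous problem back into the finite-state one.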

I think the original point and the point even more clearly in your additional notes is that if you just need parts to match, not necessarily to have a particular value, then the odds of finding a group of them of some size all with a same (similar) value goes way up. I fully agree that this is true and it is a very important point.

@eafpres

You wrote, “..if you just need parts to match, not necessarily to have a particular value, then the odds of finding a group of them of some size all with a same (similar) value goes way up.”

Precisely. As you were suggesting, when one starts asking for an exact state (i.e. 1032.4 ohms), this is a different problem. But, interestingly, not too different. It really boils down to the mean and standard deviation of the starting lot. If one desires a value to the left of the mean, then this is simply a reduction in the probability. That is a problem in absolute matching versus relative matching. There's a trick to solve this, but I'd get in trouble with others if I started talking about that.

My focus has been on matching because my feeling is just about EVERYTHING in integrated analog is about matching. With amplifiers: input offset voltage, common-mode rejection ratio, PSRR, input bias current, etc. all become irrelevant when all the transistors match. Precision voltage references have less drift and are more precise when transistors match. In data converters, perfect linearity occurs when all the passive component ratios are perfectly matched. Even ADC offset from zero is because of poor matching: when the internal sample switches are open, if the charge injection is exactly the same, there is no offset. Well, okay, then we worry about the comparator or amplifier, but I think you probably get the point.

You're right on track. It's about matching; xyz = xyz, not 1.024 = 1.024.

Scott – can you speak to the possibility (likelihood) of a skewed distribution of values in the case where the manufacturer picks all the “good” parts from the production run and labels them as the “premium” grade; then marks the rest as the average grade. This gives the average grade devices a peculiar, non-bell shaped distribution (high at the left and right sides and none in the middle).

Hey – nifty little calculator. I like it.

Graded parts are an interesting topic. Like you point out, cherry picking at the manufacturer means the lower grade shipped distributions won't be normal, but could be bimodal (since offset would have a plus limit and a minus limit if only the offset voltage is graded). The higher-grade parts would have a more normal distribution though if there were only two grades to buy. But even that claim is flawed because humans get involved when deciding how to meet an order if there is only flavor A in stock versus flavor B.
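The hole-in-the-middle distribution is easy to see in a quick Monte Carlo sketch (sigma and grade limit are hypothetical numbers, just to illustrate the shape):

```python
import random

random.seed(42)
sigma = 1.5          # mV, hypothetical offset sigma of the raw production lot
premium_limit = 1.0  # mV, hypothetical cut for the premium grade

lot = [random.gauss(0.0, sigma) for _ in range(100_000)]
premium = [v for v in lot if abs(v) <= premium_limit]
standard = [v for v in lot if abs(v) > premium_limit]

# The standard grade has a hole in the middle: nothing inside +/-1 mV,
# two humps on either side -- the bimodal shape described above.
print(min(abs(v) for v in standard) > premium_limit)  # True
```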

When grades are determined the issues become very complicated since there could be more than one specification that is graded (i.e. offset voltage and common mode rejection ratio). So it is too simplistic to make a prediction about what one will see in a distribution unless the purchased lot has no human intervention in the distribution. LM324s are a good example case to discuss because the volumes are so huge and price so low that games aren't typically played when deciding test limits or guard bands. Ship everything!

Perhaps the more interesting discussion is what constitutes a typical specification. Is typical the mean? Plus/minus three standard deviations?

Consider the LM324 case I wrote about. A 3mV typical offset is not the one-sigma offset because the manufacturers ship grades. If the worst case is about 9mV, then more likely the one sigma seen by the manufacturer is about 1.5 mV. So a 3mV typical spec means THE MANUFACTURER is seeing about 2 standard deviations or 70% of the yield better than typical. But if a huge order for LM124s came in the day before yours, don't expect your second LM324 distribution to look like the first order. It will be in spec, but probably the mean will be worse. And then we would have to talk about guard band limits at final test. Maybe those get moved around to hit a volume order?

At the risk of courting controversy… I'd be concerned about a design whose integrity and performance depends on the issues of grading and central limit theorem 'safety in numbers'. I speak as someone who's been involved in product designs that need to do exactly that – but that was in the 1980s, in a commercial galaxy now far, far away.

In those days you spent some money buying inherent stability and then some more in sorting components to make use of a value measurement phase that acts effectively as a transfer calibration, a gift of accuracy to the end product that you hope will last.

Thirty years on from those exciting, manual days, I'd propose that the only two sound modern approaches are either to architect the solution so that it is structurally insensitive to the variations that modern component suppliers foist on you, or so that it can be automatically calibrated on the production line to unambiguously eliminate the disturbing effect of those variations.

I wouldn't bet my company on a design which relies on the distribution of parameters in disparate batches of components being 'random'. A good design will work, to however many sigma you plan for, with /any/ combination of components that completely meet the specifications the suppliers undertook to supply.

just my personal view, having had to work on both sides of the precision divide.

OK – good explanation – and it helps explain why parts vary from lot to lot. Thanks.

I can see that sorting issue both ways. I worked at a small company that bought lots of JFETs. We sorted some of them for one model that needed a certain gate threshold voltage; then used the rest elsewhere where it wasn't critical. We might ship 100k units per year. Our labor cost (person doing the testing) was pretty low, so it made sense economically and from an engineering perspective.

But at another company where the product was very high-end, expensive equipment and we sold between 200 and 1000 systems per year, we did careful sensitivity studies and avoided a design where minor variations in device/component specs would muck up the design. And there were lots of software-driven calibration procedures in place, too.