You are undoubtedly aware of the absolute mess that Volkswagen is in, having deceitfully and deliberately switched between different diesel engine- and exhaust-control management algorithms depending on whether the car was on the road or undergoing formal evaluation on the test bed. The apparent intention was to sidestep the tradeoff between meeting lower emissions mandates, at the cost of lower mileage, in the tests, while still delivering good mileage results on the road. (Sad to say, many engineers were surely involved, and that's the kind of ethics black eye which our profession does not need.) I won't try to further dissect why they did it or the long-term implications, as every pundit, blogger, and columnist out there is already doing that.
I was, however, intrigued by how this scheme was discovered. As noted in an article in The Wall Street Journal, "VW Emissions Problem Was Exposed by West Virginia University Researchers," some students and a professor, working on a small grant-funded project to review some diesel-emissions data, did something basic and obvious: they bought a real car, borrowed some others, hooked their sensors and instrumentation directly to the tailpipe, and took emissions data from actual on-the-road driving. When their results differed significantly from the official data, they checked again, and that's how the deceit began to unravel.
This brings me to my concern, and it is not just related to cars, VW, or similar situations. We have so much sophistication and complexity in our analysis of signals coming from so many sensors and translated into data that it is easy to forget to ask some basic questions: How do we know this answer is correct? Are there any independent cross-checks we can do? Can we first measure something directly and do a basic analysis of the data, using rough calculations and estimates, to see if the numbers are likely correct?
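A rough cross-check of that sort can be as simple as comparing instrument output against an independently estimated range. Here is a minimal sketch in Python; the function, the tolerance, and the numeric figures are all hypothetical illustrations, not real emissions data:

```python
# Hypothetical back-of-envelope sanity check on instrument readings.
# All names and figures below are illustrative, not real emissions data.

def sanity_check(readings, expected_low, expected_high, tolerance=0.5):
    """Return the readings that fall outside a rough, independently
    estimated range, widened by a tolerance fraction on each side."""
    low = expected_low * (1 - tolerance)
    high = expected_high * (1 + tolerance)
    return [r for r in readings if not (low <= r <= high)]

# Suppose a rough independent calculation says the value should land
# between 0.2 and 0.6 (units and limits invented for illustration).
suspect = sanity_check([0.25, 0.4, 5.8, 0.3], 0.2, 0.6)
print(suspect)  # -> [5.8]: far outside even the widened envelope
```

Nothing here is sophisticated, and that is the point: a crude envelope derived from first principles is often enough to flag a reading that the fancy analysis downstream would happily accept.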
I had a glimmer of this situation several years ago, when I was a judge at a local high-school science fair. Many of the projects were straightforward, such as growing plants under different lights, or building some mechanical devices; you could follow the chain of reasoning, analysis, and results; there was a certain provenance, you might say.
But there were a few projects where the student had gotten access to very sophisticated lab equipment, and was doing gene sequencing or something along those lines. I wondered: did the student have any grasp of what the likely or right answer should be? Did the student blindly accept the numbers which the very expensive "black box" spits out, and reach conclusions based on them? Perhaps the instrument is way out of calibration, perhaps its algorithms are faulty, or perhaps it is programmed maliciously to generate random numbers around the nominal answers; who would know?
That's why it's always important to figure out a way to check sensor-based data and results as close to the source as possible, before the fancy analysis, the color charts, and the razzle-dazzle have been added. You could be going off in a very wrong direction and not even suspect it.
This is not a new problem, nor is it one that only engineers face. In Einstein's Ph.D. dissertation, "A New Determination of Molecular Dimensions" (one of five brilliant papers he published in 1905, including the best-known one on special relativity), he does a complex analysis of the motion of particles in liquids, diffusion, the kinetic theory of liquids, and more. I'll admit I can't follow his analysis, but I do know how the story ends: after all his equations and conclusions, he takes some well-established data from other researchers on diffusion coefficients of various solutions and puts them into his equations. The result is a value for Avogadro's number (a parameter not immediately related to the paper's subject) which is very close to the value that had been independently determined by many other techniques through the years. In other words, he was able to verify his intense and unique insight using basic data and an accepted chemistry number.
[Note: if you are interested in reading more on this and his other papers, check out "Einstein's Miraculous Year: Five Papers That Changed the Face of Physics"; even if you can't follow Einstein's analysis in each paper (I certainly can't), each paper has a detailed, clarifying introduction.]
How do you check your results? What's your sanity check? Have you ever seen results presented which simply couldn't be right or didn't make sense, and which perhaps didn't pass the "smell test," but which were accepted primarily because that's what the instrumentation and data-analysis software said they were?