Statistical Analysis of Fuel Economy Claims
I’m subtitling this post “don’t believe all the data you read.”
The mean was 3.0 MPG, right at the top of the curve.
The same is true of something like MPG measured over the same distance
several times: if you run the car many times in the same configuration, the
natural variability in the system will give a roughly normal distribution
centered on the simple average. Put another way, a single run may or may not
tell you anything about the actual average, because it may be sitting out in
one “tail” of the normal distribution, far from the actual mean. To continue
the example above, if I pick a tank at random, the difference between
displayed and actual MPG might be 3.0 (exactly at the mean), but it might
also be 5.2 or 0.7, and if that is my only datum I will grossly over- or
underestimate the actual mean. A draw that extreme is unlikely, but we have
no way of knowing with only one datum!
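To make the single-datum problem concrete, here is a small simulation sketch in Python. It draws the displayed-minus-actual MPG difference from a normal distribution with the 3.0 mean from the example. The standard deviation of 1.0 MPG is purely my assumption for illustration, not a measured value, and the function names are made up for this sketch.

```python
import random
import statistics

TRUE_MEAN = 3.0   # mean displayed-minus-actual MPG difference, from the example
ASSUMED_SD = 1.0  # ASSUMPTION: illustrative spread, not a measured value


def simulate_tank(rng):
    """Draw one tank's displayed-minus-actual MPG difference."""
    return rng.gauss(TRUE_MEAN, ASSUMED_SD)


def mean_of_tanks(rng, n):
    """Estimate the mean difference from n simulated tanks."""
    return statistics.fmean(simulate_tank(rng) for _ in range(n))


rng = random.Random(42)  # fixed seed so the run is repeatable

# Repeat the "experiment" many times with 1 tank and with 15 tanks.
single_estimates = [mean_of_tanks(rng, 1) for _ in range(2000)]
averaged_estimates = [mean_of_tanks(rng, 15) for _ in range(2000)]

spread_single = statistics.stdev(single_estimates)
spread_averaged = statistics.stdev(averaged_estimates)

print(f"spread of 1-tank estimates:  {spread_single:.2f} MPG")
print(f"spread of 15-tank estimates: {spread_averaged:.2f} MPG")
```

Under these assumptions the one-tank estimates scatter far more widely around 3.0 than the 15-tank averages do, which is the whole point: any single tank could be the 5.2 or the 0.7.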
So, the first problem is this: you need far more than one
datum for each configuration. Statisticians commonly use n = 15 as a
rule-of-thumb minimum; with fewer tests than that, the results must be much
more robust and consistent to support the same confidence in a claim.
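One way to see why small samples demand more consistent results is to look at the half-width of a 95% confidence interval for the mean, which is t × s / √n. The sketch below is illustrative only: the critical values are standard two-sided 95% entries from a Student's t table, while the sample standard deviation of 1.0 MPG and the helper name are my assumptions.

```python
import math

# Two-sided 95% critical values from a standard Student's t table,
# keyed by sample size n (degrees of freedom = n - 1).
T_CRIT_95 = {3: 4.303, 5: 2.776, 10: 2.262, 15: 2.145, 30: 2.045}

ASSUMED_SD = 1.0  # ASSUMPTION: illustrative sample standard deviation, in MPG


def ci_half_width(sd, n):
    """Half-width of a 95% confidence interval for the mean of n runs."""
    return T_CRIT_95[n] * sd / math.sqrt(n)


for n in sorted(T_CRIT_95):
    print(f"n = {n:2d}: mean is known to within +/- {ci_half_width(ASSUMED_SD, n):.2f} MPG")
```

With the same scatter in the data, three runs pin down the mean far less tightly than fifteen do, so a claim based on three runs needs a much smaller standard deviation to earn the same confidence.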
H1: μ_without < μ_with