When statistics lie - and when they tell the truth

A major part of how I earn my living is the honest use of statistics to show what is going right or wrong in the industry in which I work.
 
Unfortunately, as today's "Quote of the Day" suggests, some people are skilled at using statistics to deceive. Mark Twain attributed the quote to Disraeli and, whoever really said it first, it goes back well over a century.
 
When someone once said in my hearing that "You can prove anything with statistics", another person replied "Including the truth". They both had a point.
 
Most people reading this will probably have heard exchanges at Prime Minister's Question Time or in other debates when speakers from two opposing viewpoints throw statistics at each other which appear to show opposite things. Usually in terms of literal facts both are telling the truth but of course, one or both statistics must either be misleading or fail to show the complete picture.
 
The graph below, which shows housing starts and completions in England over the past decade, is an excellent basis for an illustration of how people should choose their statistics to give an honest picture, or how a less scrupulous person could pick them selectively to slant the argument in a particular direction.
 

Any honest and intelligent person who looks at this graph should have no difficulty in agreeing that

1) House-building in England grew steadily for most of the first decade of this century before suffering a catastrophic fall during the last two years of the last Labour government.

2) The rate at which houses were started and built stabilised at about the time of the 2010 election, and

3) During the time the Coalition government has been in office, there has been a modest upward trend and the house-building rate is now higher than they inherited, though it has yet to rise to even the level of the early years of this century.

That's what an honest person would say. But supposing you are a Labour spinner who wants to produce a statistic that makes Labour's housing numbers look great and the Conservatives and Lib/Dems much worse? All too easy. Ditto if you are a Conservative or Lib/Dem spin doctor and want to make a modest recovery look like a fantastic one.

First of all, if you wanted to give an honest answer to the question of what is happening to house-building, you would compare equivalent points of the cycle, e.g. peak to peak or trough to trough.

If you look at the lowest points of the curve you have an increase from a little under 80,000 annual housing starts at the end of the 2008/9 year to about 100,000 in 2012/13, an increase of something over 20,000 housing starts or about 25% in three years. Equally, and perhaps more appropriately if you are trying to produce a fair assessment of housing progress under the coalition, you could measure from the peaks in housing starts, e.g. about 112,000 in mid-2012 to 140,000 in 2014, an increase of about 28,000 and, again, about 25%, this time in two years.
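The like-for-like arithmetic above can be checked with a short Python sketch. The figures are the approximate values quoted in the text (rounded readings from the chart, not official data), and the helper name `pct_change` is purely illustrative.

```python
def pct_change(base, later):
    """Percentage change from a base value to a later value."""
    return (later - base) / base * 100.0

# Approximate annual housing starts as quoted in the text (rounded).
trough_to_trough = pct_change(80_000, 100_000)  # 2008/9 low -> 2012/13 low
peak_to_peak = pct_change(112_000, 140_000)     # mid-2012 peak -> 2014 peak

print(f"Trough to trough: {trough_to_trough:.0f}%")
print(f"Peak to peak:     {peak_to_peak:.0f}%")
```

Both comparisons use equivalent points of the cycle, which is why they land on roughly the same figure.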

However, if you were a coalition spin doctor, and the genuine truth that the present government has got the construction of new homes back onto an upward trend is not enough for you, you could say something like this:

"We have nearly doubled house-building: Housing starts have risen from 80,000 in the last full year of the last Labour government to 140,000 last year!"

Literally true but not entirely fair. The trick, of course, is that you would have taken your base from the lowest part of the cycle and the point you compare with from a peak.
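As a rough illustration of how much the choice of base matters, the sketch below runs the same percentage calculation on the mismatched trough-to-peak pair. The figures are the rounded values quoted in the text, and `pct_change` is a hypothetical helper name.

```python
def pct_change(base, later):
    """Percentage change from a base value to a later value."""
    return (later - base) / base * 100.0

trough = 80_000  # approximate low point of the cycle, as quoted in the text
peak = 140_000   # approximate 2014 peak, as quoted in the text

spun_rise = pct_change(trough, peak)  # base from a trough, comparison from a peak
ratio = peak / trough

print(f"Trough to peak: {spun_rise:.0f}% rise (ratio {ratio:.2f})")
```

On these rounded figures the mismatched comparison gives a rise about three times the size of the 25% obtained by comparing like with like.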

Similarly, a totally honest statistician looking at the Labour record would probably consider the whole period recorded by the graph and thereby take into account the early successes and the swing-back after the crash as well as the crash itself. Such a person might note that housing starts were 150,000 at the start of the century and 110,000 when Labour left office, and conclude that overall Labour had cut housing starts by 40,000, or a little over a quarter.

But someone who wanted to make Labour's record look even worse - not that it really needs it - could take peak to trough of the 2008 crash and say "On Labour's watch, the economy fell over a cliff and housing starts were worse than halved, falling by more than 100,000 annually from 180,000 to less than 80,000!"

And they wouldn't be lying except in Disraeli's sense of lies, damned lies, and statistics.
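The two readings of the Labour record can be set side by side in a small sketch. It uses the endpoint figures quoted above, treats "less than 80,000" as 80,000 for simplicity (an assumption that slightly understates the fall), and the helper name `pct_change` is illustrative only.

```python
def pct_change(base, later):
    """Percentage change from a base value to a later value."""
    return (later - base) / base * 100.0

# Whole period: start of the century -> when Labour left office.
whole_period = pct_change(150_000, 110_000)

# Cherry-picked: pre-crash peak -> post-crash trough (80,000 is an upper bound).
peak_to_trough = pct_change(180_000, 80_000)

print(f"Whole period:   {whole_period:.1f}%")
print(f"Peak to trough: {peak_to_trough:.1f}%")
```

Same government, same chart, and the size of the fall roughly doubles depending on which two points are chosen.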

The same trick could be played in reverse by Labour spinners to blame the disastrous slump in the last two years of the Labour government on the coalition.

Suppose you were a Labour party spin doctor with no interest whatsoever in showing what had really happened and wanted to make Labour's performance in building houses look better than the coalition's.

All you have to do is pick a Labour benchmark which is wholly or mostly based on data taken before the crash and compare it with any Coalition benchmark, which will be after the crash. Any such comparison will be perfect for the aptly-named Mr Balls to use.

Housing completions lag behind starts, so someone with this agenda will take completions rather than starts because it makes it easy to ensure that your Labour base comparison is less affected by the slump. That way the average completions rate over the 2005-2010 parliament is mostly driven by houses which were finished before the catastrophic fall in housing starts from 2008 really started to bite.

Conversely, most of the houses which were started as the recovery kicked in from about 2013 were not finished in time to show up in the period covered by the chart above.

So all a dishonest Labour spinner who wanted to blame the coalition for the crash during his or her own government's term of office would have to do is take the average completions during the last Labour parliament, and it will come out around 140,000 p.a. Then compare with average completions during the coalition's term of office up to the most recent figures available, and it will come out at around 115,000 annually. So they can accuse the coalition of building 25,000 houses a year fewer than Labour.
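For completeness, here is the arithmetic behind that headline claim as a minimal sketch. The two averages are the rounded figures quoted above rather than values computed from the underlying series, and the variable names are illustrative.

```python
# Rounded average annual completions, as quoted in the text.
labour_avg = 140_000     # 2005-2010 parliament, mostly pre-crash completions
coalition_avg = 115_000  # coalition period up to the latest figures in the chart

gap = labour_avg - coalition_avg
shortfall_pct = gap / labour_avg * 100.0

print(f"Gap: {gap:,} completions a year ({shortfall_pct:.1f}% below the Labour average)")
```

The subtraction is correct; what the calculation cannot show is that the two averages sit on opposite sides of the crash, which is the whole point of the trick.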

This would be just as true - and just as dishonest - as a Conservative spinner claiming that Labour had halved housing starts and the Coalition doubled them.

Does this mean that all statistics are lies? Absolutely not.

An intelligent person can look at the graph in this article and see what has really happened. As I hope I have explained above, a totally honest statistician can make a perfectly good case for the coalition and against Labour on these figures which, although less exciting than the "spun" figures, is still powerful, without needing to rig them at all.

The key thing for the honest and intelligent voter is to try to get the full facts. Two excellent books I can particularly recommend to help you cut through misleading statistics are

How to Lie with Statistics by Darrell Huff (an oldie but a goodie)

and

Damned Lies and Statistics by Joel Best.

Comments

Jim said…
Statistics can tell you a lot of facts. It's just that those individual facts are not telling the same overall truth.

There are no cross dressers living on Seaview Drive. There has never been a road accident on Seaview Drive. These two facts are both true, and they correlate (both are at zero), but correlation does not always mean causation.

So I could not take these two facts and state that the moment a cross dresser buys a home on Seaview Drive the road accident rate will increase.
Chris Whiteside said…
Indeed: correlation does not prove causation.
Jim said…
No doubt you have seen in your career a number of people who will state a case for it though. It's one of the biggest "statistical lies".
Chris Whiteside said…
Oh, absolutely, though in my experience most people who jump to the conclusion of "post hoc ergo propter hoc" or similarly confuse correlation with proof of causality are mistaken rather than deliberately lying.
Jim said…
That's what I mean by "statistical lies": it's not the statistics that are lying, it's the way they are being interpreted that is mistaken.
I'm not intentionally calling anyone who used them previously a deliberate liar, just saying they were often wrong in their interpretation of what was available.
Chris Whiteside said…
Fair enough. Some statistical errors are entirely innocent - I think we both agree this is an example.
