The practical implications of misplaced confidence when dealing with statistical evidence are obvious and worrying
A little two-part test for you. Imagine you’re a doctor, considering whether to recommend a particular kind of cancer screening, “A”. You discover that this form of screening improves five-year survival rates from 68 per cent to 99 per cent. (The five-year survival rate is the proportion of patients alive five years after the cancer was discovered.) The question is: does the screening test “A” save lives?
Part two: now you consider an alternative screening test, “B”. You discover that test “B” reduces cancer deaths from two per 1,000 people to 1.6 per 1,000 people. So: does screening test “B” save lives?
The second question is easier. Screening test “B” unambiguously saves lives: to be precise, it saves 0.4 lives per 1,000 people. That might not seem a lot – and if the test is expensive or has unpleasant side-effects it might not be worth it – but that is the nature of cancer screening. Most people don’t have the cancer in question, so most people cannot be helped by the test.
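For anyone who wants to check the sums on test “B”, here is a minimal sketch. The variable names are mine, and the last figure assumes that the “per 1,000 people” rates refer to people who are offered the screening, which the summary statistic does not spell out.

```python
from fractions import Fraction

# Figures quoted for test "B": cancer deaths per person, without and with screening.
deaths_without = Fraction(2, 1000)
deaths_with = Fraction(16, 10000)  # 1.6 per 1,000, kept exact to avoid rounding noise

# Absolute reduction: how many fewer deaths per person.
absolute_reduction = deaths_without - deaths_with

# Relative reduction: what share of the expected cancer deaths is prevented.
relative_reduction = absolute_reduction / deaths_without

# Assumption: the rates apply to the screened population, so this is the
# number of people who must be screened to prevent one death.
screened_per_life_saved = 1 / absolute_reduction

print(f"Lives saved per 1,000 people: {float(absolute_reduction * 1000)}")
print(f"Share of cancer deaths prevented: {float(relative_reduction):.0%}")
print(f"People screened per life saved: {screened_per_life_saved}")
```

The same pair of numbers looks modest as an absolute figure (0.4 per 1,000) and substantial as a relative one (a fifth of the deaths), which is part of why presentation matters.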
What about screening test “A”? This question is harder. The numbers look impressive, but survival rates are a treacherous way to evaluate a screening programme. Imagine a group of 60-year-olds who all develop an incurable cancer that will kill them at 70. They have no symptoms until age 67, so the five-year survival rate when they are diagnosed at 67 is, I’m afraid, zero. Introduce a screening programme and you can discover the cancer much earlier, at age 62. The five-year survival rate is now 100 per cent. But the screening hasn’t saved any lives: it’s merely given early warning of a disease that cannot be treated.
In general, screening programmes look impressive when evaluated by survival rates, because the purpose of screening is to detect the cancer earlier. Whether any lives are saved or not is a different issue entirely.
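The thought experiment above is easy to reproduce. This is only a sketch of that hypothetical group of patients, using the ages given in the example; the function name is invented for illustration.

```python
def alive_five_years_after_diagnosis(age_at_diagnosis, age_at_death):
    """True if the patient outlives the diagnosis by more than five years."""
    return age_at_death - age_at_diagnosis > 5

AGE_AT_DEATH = 70  # the cancer is incurable, so screening cannot change this

# Without screening, symptoms prompt a diagnosis at 67.
survives_without = alive_five_years_after_diagnosis(67, AGE_AT_DEATH)

# With screening, the same cancer is found at 62.
survives_with = alive_five_years_after_diagnosis(62, AGE_AT_DEATH)

# Everyone in the group is identical, so the survival rate is all or nothing.
rate_without = 100 if survives_without else 0
rate_with = 100 if survives_with else 0

print(f"Five-year survival without screening: {rate_without} per cent")
print(f"Five-year survival with screening: {rate_with} per cent")
print(f"Age at death in both cases: {AGE_AT_DEATH}")
```

The survival rate jumps from zero to 100 per cent while the age at death does not move at all: the only thing that has changed is when the clock starts.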
I’ll admit, this is a tricky pair of questions. You’d have to be a doctor, rigorously trained in how to handle the evidence base for medical treatments, to get this sort of thing right. But here’s the bad news: doctors do not get this sort of thing right.
An article published in the Annals of Internal Medicine in March put these questions to a panel of more than 400 doctors with relevant clinical experience. Eighty-two per cent thought they’d been shown evidence that test “A” saved lives – they hadn’t – and of those, 83 per cent thought the benefit was large or very large. Only 60 per cent thought that test “B” saved lives, and fewer than one-third thought the benefit was large or very large – which is intriguing, because the test saves 20 per cent of the few people on course to die from cancer. In short, the doctors simply did not understand the statistics on cancer screening.
The practical implications of this are obvious and worrying. It seems that doctors may need a good deal of help interpreting the evidence they are likely to be exposed to on clinical effectiveness, while epidemiologists and statisticians need to think hard about how they present their discoveries.
The situation could be worse. A recent survey by the Royal Statistical Society’s “getstats” campaign asked MPs to give the probability of getting two heads when tossing a coin twice. More than half failed to get the answer correct – including a humiliating three-quarters of Labour MPs.
The answer, of course, is 25 per cent, and this is appallingly basic stuff. If I try to translate from numeracy to literacy, I’d say that the doctors’ failure was the equivalent of being unable to write a decent essay about “The Waste Land”, while the MPs’ failure was more like the inability to read a newspaper.
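If in doubt, the coin question can be settled by listing every outcome. A small sketch, assuming a fair coin and two independent tosses:

```python
from fractions import Fraction
from itertools import product

# Every equally likely result of tossing a fair coin twice: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))

# Only one of those results is two heads.
two_heads = [o for o in outcomes if o == ("H", "H")]

p_two_heads = Fraction(len(two_heads), len(outcomes))

print(f"Possible outcomes: {len(outcomes)}")
print(f"Probability of two heads: {p_two_heads} ({float(p_two_heads):.0%})")
```

One outcome in four is two heads, which is the same as multiplying a half by a half.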
The Royal Statistical Society reported that about three-quarters of MPs said they felt confident when dealing with numbers. This confidence is misplaced.
Also published at ft.com.