Our last post dealt with the way we cling to unproven, often unprovable, and sometimes downright already-disproven hypotheses. Perhaps you've gotten the "Flu and the ONION" email that promises that you'll not get the flu if you put onions in your socks. We mentioned a few reasons that people keep believing things like this, in spite of the less than convincing evidence. From aquatic apes to Big Foot, these are a particular kind of it's-always-over-the-next-hill hypothesis. Given the huge territory in which he's said to have been sighted, it's dashed hard to prove that Big Foot doesn't exist!
And then there's phlebotomy. That's the theory that therapeutic bleeding will cure almost any disease, on the grounds that disease represents an imbalance among vital factors (such as the four 'humours' of classic Galenic medicine). Such theories hang on for centuries (now, with a faster pace to life, decades may be the normal turnover time). Afterwards, such as now, we laugh heartily at the foolish simpletons who actually believed in such things--in the face, we'd say, of the overwhelming evidence that it simply doesn't work, you Dufus! If only they had looked at the evidence--that is, if they'd adopted our idea of 'evidence-based medicine'--they'd readily have seen how foolish their ideas were.
But that's not quite so, and it's not quite fair. Those Dufi (that's the plural of Dufus) had the same IQ points that you and we have. They were absolutely aware of and discussed the evidence. Even Hippocrates, roughly 400 BC, wrote voluminously about evidence!
Then why did they believe in their system? Because, by their lights, it worked! It's all very nice for us to say it was bunkum, but physicians had their theories of why it worked (e.g., the four humours) and, after all, the vast majority of their patients recovered! What better evidence can you ask for? Is that not evidence-based medicine? And even today, a lot of grandmothers survive the flu because onions soak up the germs.
Why the four humours (or onions) seemed like such a plausible explanation is, of course, down to a combination of the power of a belief system and the kinds of explanations its adherents had for the failures (some patients did die, after all). For those, post hoc explanations would be offered (too far gone by the time of treatment, body too weakened by mistreatment -- grog or carousing -- to recover, God was calling, etc., etc.). Sometimes there was just an honest "I don't know, but the treatment doesn't always work." These were treated not as refutations of the theory, but as exceptions. Wisdom and experience (and Galen's texts) were the criteria for interpreting the evidence.
What we have now in cases like this are formal statistical criteria, and a probabilistic view of efficacy. Probabilities leave room for failure that is 'explained' by the fact that the therapy only works with some probability. We often, if not usually, say we don't know why. But our formal statistical tests ask whether the treatment works relative to no treatment, or to some alternative treatment. We ask whether more people recovered than would be expected by chance if the treatment had no effect, and we judge that by a p-value: the probability of seeing a difference at least as large as the one observed if the treatment truly did nothing. We call the result statistically significant if p falls below some chosen cutoff, often set to 0.05 (that is, we accept a 5% chance of crediting a treatment that has no effect).
The cutoff is subjective, but gives us at least an accepted, or conventional, criterion for success. We know this approach is not perfect, but it sets a kind of standard that is at least somewhat more objective than 'wisdom'. So if the treatment passes this test, we say it works. In many ways, it's a formalization of the kind of subjective way that phlebotomy and the four humours were evaluated, but being more standardized it does seem to improve the confidence we can have that our theory works.
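To make the comparison concrete, here is a minimal sketch in Python of the kind of test described above. It uses a one-sided Fisher exact test, which is only one of several tests one could choose, and the patient counts are invented purely for illustration; they are not data from any real study of bleeding, onions, or anything else. The function name and the 0.05 cutoff are likewise our own choices for the example.

```python
from math import comb

ALPHA = 0.05  # conventional cutoff; the choice itself is subjective


def fisher_one_sided(recovered_treated, n_treated, recovered_control, n_control):
    """Probability of at least this many treated recoveries if treatment does nothing.

    If the treatment has no effect, the total number of recoveries is what it
    is, and how many of them land in the treated group is a matter of chance
    (a hypergeometric distribution).
    """
    total = n_treated + n_control
    total_recovered = recovered_treated + recovered_control
    most_possible = min(n_treated, total_recovered)
    ways_to_pick_treated = comb(total, n_treated)
    tail = sum(
        comb(total_recovered, k) * comb(total - total_recovered, n_treated - k)
        for k in range(recovered_treated, most_possible + 1)
    )
    return tail / ways_to_pick_treated


# Hypothetical numbers. Case A: 90 of 100 bled patients recover, but so do
# 88 of 100 who were left alone. Case B: 90 of 100 versus 70 of 100.
p_case_a = fisher_one_sided(90, 100, 88, 100)
p_case_b = fisher_one_sided(90, 100, 70, 100)

for label, p in (("A", p_case_a), ("B", p_case_b)):
    verdict = "passes the test" if p < ALPHA else "does not pass"
    print(f"Case {label}: p = {p:.4f}, {verdict} at the {ALPHA} cutoff")
```

In both invented cases the 'vast majority' of treated patients recover, which is just what the old physicians saw. The difference is that the test only credits the treatment when the untreated do noticeably worse, and a practitioner with no comparison group has no way to tell Case A from Case B.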
In fact, there are some aspects of the four humours theory that are right. Things are out of 'balance' when we get sick: we're out of our physiologic equilibrium or homeostasis. We're a lot more specific: blood pressure or glucose level may be too high, rather than the 'sanguinous factor is excessive.' Blood-letting in the evening can, in physiological fact, lead to feeling improved in the morning after. And there is the psychological lift one gets from being treated and cared for.
But the post hoc justification for what one is doing is not just something in the benighted past. It's part of our own hopeful thinking today. GWAS and Huge Comparison Studies (like biobanks) have been notoriously incomplete, to put it kindly. But many papers pour forth raving about their success. This is about the same as the successes of phlebotomy: whatever works is credited as a 'hit', and the failures are ignored, downplayed, hand-waved, or in other ways dismissed as exceptions that don't undermine the wish.
This similarity applies even though GWAS and biobank studies use rigorous (that is, conventionally subjective) statistical criteria for interpreting results. If the achieved p-value does not quite reach the nominal cutoff, we call the result 'suggestive' and soldier on in the same direction. If it is small and convincing, but the finding accounts for only a small fraction of cases, that's treated as encouraging. We may wish to call our approach very different from and much better than the bad old days, but the story has closer resemblances than we like to think.
There are many stories of theories believed for reasons of stubbornness or worse. There are explanations of the unobservable past or future, for which proof is elusive and whose plausibility rests on our own experience and culture, or on a belief system to which we tend to cling, such as that 'it must be due to natural selection for X'.
And then there's phlebotomy--and GWAS!