The Mermaid's Tale: decline effect

"Our facts are losing their truth"
We all are taught the 'scientific method' by which we now understand the world. It is widely held that science is marching steadily toward an ever more refined and objective understanding of the one true truth that's out there. Some philosophers and historians of science question just how objective the process is, or even whether there's just one truth. But what about the method itself? Does it really work as we believe (and is 'believe' the right word for it)?

An article in the Dec 13 issue of the New Yorker, "The Truth Wears Off" by Jonah Lehrer, raises important issues in a nontechnical way.

On September 18, 2007, a few dozen neuroscientists, psychiatrists, and drug-company executives gathered in a hotel conference room in Brussels to hear some startling news. It had to do with a class of drugs known as atypical or second-generation antipsychotics, which came on the market in the early nineties. The therapeutic power of the drugs appeared to be steadily falling. A recent study showed an effect that was less than half of that documented in the first trials, in the early nineties. Before the effectiveness of a drug can be confirmed, it must be tested again and again. The test of replicability, as it’s known, is the foundation of modern research. It’s a safeguard for the creep of subjectivity. But now all sorts of well-established, multiply confirmed findings have started to look increasingly uncertain. It’s as if our facts are losing their truth. This phenomenon doesn’t yet have an official name, but it’s occurring across a wide range of fields, from psychology to ecology.

Unofficially, it has been called the 'decline effect', and Lehrer cites many examples of strong effects going on to 'suffer from falling effect size'.

He cites John Ioannidis, a leading advocate of meta-analysis--pooling studies to gain adequate sample size and more stable estimates of effects--and the paper Ioannidis wrote about why most big-splash findings are wrong. The gist of the argument is that major scientific journals like Nature (if that's actually a scientific journal) publish big findings--that's what sells, after all. But what are big findings? They're the unexpected ones, with strong statistical evidence behind them... in some study.

But statistical effects arise by chance as well as by cause. That's why we have to support our case with some kind of statistical criterion, such as the level of a 'significance' test. But if hundreds of investigators are investigating countless things, even if they use a test such as the standard 5% (or 1%) significance criterion, some of them will, just by chance, get such a result. The more studies that are tried, and the more people trying them, the more dramatic the most extreme fluke result will be. Yet that is what gets submitted to Nature and what history shows they love to publish.
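To make the arithmetic concrete, here is a minimal sketch, assuming the studies are independent and that no real effect exists in any of them. The function names are ours, invented for illustration, and the simulated z-scores are a stand-in for whatever statistic a real study would report.

```python
import random

def prob_at_least_one_fluke(n_studies, alpha=0.05):
    """Chance that at least one of n independent null studies passes the test."""
    return 1.0 - (1.0 - alpha) ** n_studies

def most_extreme_null_z(n_studies, rng):
    """Largest |z| among n studies in which there is no real effect at all."""
    return max(abs(rng.gauss(0.0, 1.0)) for _ in range(n_studies))

rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (1, 10, 100, 1000):
    p_fluke = prob_at_least_one_fluke(n)
    extreme = most_extreme_null_z(n, rng)
    print(f"{n:5d} studies: P(some fluke) = {p_fluke:.3f}, most extreme |z| = {extreme:.2f}")
```

With a 5% criterion, a hundred independent null studies have better than a 99% chance of producing at least one 'significant' result, and the most extreme result tends to look more impressive the more studies there are.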

Genome-wide association studies (GWAS) magnify the problem greatly. With hundreds of thousands of marker tests in a given study, the chance of some 'significant' result arising just by chance is substantial. Investigators are well aware of this, and try to adjust for it by using more stringent significance criteria, but nonetheless with lots of studies and markers and traits, flukes are bound to arise.
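A rough sketch of the numbers involved, under loudly stated assumptions: 500,000 markers per study, independent tests, a Bonferroni-style correction (one common way of making the criterion more stringent, not necessarily what any given study uses), and 50 separate studies. All of these figures and function names are illustrative, not taken from any real dataset.

```python
def expected_false_hits(n_markers, alpha):
    """Expected number of null markers passing an uncorrected test."""
    return n_markers * alpha

def bonferroni_threshold(n_markers, family_alpha=0.05):
    """Per-marker threshold that holds the study-wide false-positive rate near family_alpha."""
    return family_alpha / n_markers

def prob_fluke_somewhere(n_studies, family_alpha=0.05):
    """Chance that at least one of n corrected studies still reports a fluke.

    Assumes independent studies, each with a study-wide rate of about family_alpha.
    """
    return 1.0 - (1.0 - family_alpha) ** n_studies

n_markers = 500_000  # assumed marker count, for illustration only
print("uncorrected false hits expected:", expected_false_hits(n_markers, 0.05))
print("corrected per-marker threshold:", bonferroni_threshold(n_markers))
print("P(a fluke in at least one of 50 studies):", round(prob_fluke_somewhere(50), 3))
```

Even when each study is corrected carefully, the correction only controls the error rate within that study. Across many studies, markers and traits, the chance that a fluke turns up somewhere stays high.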

Worse, what makes for a Big Splash is not the significance test value but the effect size. The usual claim is not just that someone found a GWAS 'hit' in relation to a given trait, but that the effect of the high-risk variant is major--explains a huge fraction of a disease, for example, making it a juicy target for Big Pharma to try to develop a drug or screen for.

[Image from the Dec 6 New Yorker]

But a number of years ago Joe Terwilliger and John Blangero showed by simulation that even when there is no causal element in a genomic search, the estimates of the effect size for the sites that survive the significance tests are bloated... that's how they reached significance when the criteria were cautiously stringent. The effect size, conditional on passing a stringent significance test, is biased upwards. So, as Joe et al. said way back then, you have to draw a new, unbiased sample to test the particular purported cause before you can begin to estimate the strength of the effect that the cause actually has.
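The selection bias can be illustrated with a toy simulation. This is our own sketch, not the simulation Terwilliger and Blangero ran: it assumes each site's effect estimate is normally distributed around the true effect with a known standard error, and the function name and parameter values are hypothetical.

```python
import random
from statistics import NormalDist, mean

def simulate_selected_effects(true_effect, std_error, n_sites, alpha, rng):
    """Return the effect estimates that survive a two-sided significance test."""
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    estimates = (rng.gauss(true_effect, std_error) for _ in range(n_sites))
    return [e for e in estimates if abs(e) / std_error > z_crit]

rng = random.Random(0)  # fixed seed so the run is repeatable
# true_effect = 0.0 means no site has any causal effect at all
hits = simulate_selected_effects(0.0, 1.0, 200_000, 0.001, rng)
print("sites passing the test:", len(hits))
if hits:
    print("mean |estimated effect| among them:", round(mean(abs(e) for e in hits), 2))
    print("true effect at every site: 0.0")
```

Every estimate that passes has to exceed the critical value times the standard error, so the surviving estimates are large by construction, whatever the truth is. A fresh sample of the same site is not filtered this way, which is why it gives an unbiased estimate.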

And this brings us back to the New Yorker story of the diminution of findings with follow-up, and why facts are losing their truth.

Lehrer concludes,

Even the law of gravity hasn't always been perfect at predicting real world phenomena. (In one test, physicists measuring gravity by means of deep boreholes in the Nevada desert found a two-and-a-half-per-cent discrepancy between the theoretical predictions and the actual data.) Despite these findings...the law of gravity remains the same.

...Such anomalies demonstrate the slipperiness of empiricism. Although many scientific ideas generate conflicting results and suffer from falling effect sizes, they continue to get cited in the textbooks and drive standard medical practice. Why? Because these ideas seem true. Because they make sense. Because we can't bear to let them go. And this is why the decline effect is so troubling. ...it reminds us how difficult it is to prove anything. We like to pretend that our experiments define the truth for us. But that's often not the case. Just because an idea is true doesn't mean it can be proved. And just because an idea can be proved doesn't mean it's true. When the experiments are done, we still have to choose what to believe.

These points are directly relevant to evolutionary biology and genetics--and to the over-selling of genetic determinacy that we so often post about. They are sobering for those who actually want to do science rather than build their careers on hopes and dreams, using science as an ideological vehicle to do it, in the same way that other ideologies, like religion, are used to advance societal or individual self-interest.

But these are examples of 'inconvenient truths'. They are well-known but often honored mainly in the breach. Indeed, even as we write, on our campus a speaker is going to invoke all sorts of high-scale 'omics' (proteomics, genomics, etc.) to tech our way out of what we know to be true: the biological world is largely complex. Some Big Splash findings may not be entirely wrong, but most are at least exaggerated. There are too many ways that flukes or subtle wishful thinking can lead science astray, which is why the supposedly iron-clad 'scientific method', as an objective way to understand our world, isn't so objective after all.


Friday, December 10, 2010

Why most Big Splash findings in science are wrong