Friday, May 18, 2012

Non-replicability in science: your antelope for the day

A piece in the May 17 Nature supports one of Ken's favorite observations, something he says while wearing his Anthropologist's hat -- "Journal articles are just an academic's antelope for the day."  We're still just hunter/gatherers -- our published papers are, more often than not, nothing more than the way we feed ourselves.  Our basket of berries -- eaten today, droppings tomorrow.

Blackbuck male, females; Photo from Wikimedia, Mr Raja Purohi
Ed Yong, in "Replication studies: Bad copy," reports that most published studies can't be replicated.  This is something we often talk about with respect to genetic studies, and there are many reasons for this that are specific to genetic data, but apparently it's even more rampant in psychology, for reasons also specific to the field.  

And there is the notorious problem that 'negative' results are not published very often.  They're not glamorous and won't get you tenure -- even though some of the most important findings in science are 'negative' ones, steering work towards valid rather than dreamt-of theories or hypotheses.  Clinical trials are a major example, but less noticed are the ephemeral natural selection stories about evolution.

A paper published last year claiming support for extrasensory perception, or psi, for example, produced a major kerfuffle (we blogged about it at the time).  The aftermath has been no less interesting, and informative about the world of publishing, as researchers who tried to replicate the findings but failed also failed to find publishers for their results.  This led to a lot of discussion about the implications of negative results not being published, a discussion that flares up frequently in academia, as well it should, although we're no closer than ever to resolving it.
“There are some experiments that everyone knows don't replicate, but this knowledge doesn't get into the literature,” says [Eric-Jan] Wagenmakers [mathematical psychologist at the University of Amsterdam]. The publication barrier can be chilling, he adds. “I've seen students spending their entire PhD period trying to replicate a phenomenon, failing, and quitting academia because they had nothing to show for their time.”
But we'll leave that issue for another time.

The question of why studies so often aren't replicable is a different, if related, one.  And it is one that the Reproducibility Project, a large-scale collaboration of scientists from around the world, is addressing head on, as it attempts to replicate every study published in three major psychology journals in 2008, as described last month in the Chronicle of Higher Education.  
For decades, literally, there has been talk about whether what makes it into the pages of psychology journals—or the journals of other disciplines, for that matter—is actually, you know, true. Researchers anxious for novel, significant, career-making findings have an incentive to publish their successes while neglecting to mention their failures. It’s what the psychologist Robert Rosenthal named “the file drawer effect.” So if an experiment is run ten times but pans out only once you trumpet the exception rather than the rule. Or perhaps a researcher is unconsciously biasing a study somehow. Or maybe he or she is flat-out faking results, which is not unheard of. 
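The 'file drawer' arithmetic is easy to sketch.  If a true-null experiment is run repeatedly at the conventional 0.05 significance level, the chance that at least one run comes up 'significant' climbs quickly, and reporting only that run is all it takes to put noise into the literature.  A minimal back-of-the-envelope sketch -- the numbers here are ours for illustration, not the Chronicle's:

# Probability that at least one of n runs of a null experiment
# crosses the conventional significance threshold by chance alone.
# alpha and n are illustrative assumptions, not figures from the article.

alpha = 0.05   # conventional significance level
n = 10         # number of times the experiment is run

p_at_least_one_false_positive = 1 - (1 - alpha) ** n
print(f"P(at least one 'significant' result in {n} null runs) = "
      f"{p_at_least_one_false_positive:.2f}")   # about 0.40

So roughly a 40% chance of having an 'exception' to trumpet, even when there is nothing there at all.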
According to Yong, the culture in psychology is such that experimental designs that "practically guarantee positive results" are perfectly acceptable.  This is one of the downsides of peer review -- when all your peers are doing it, good scientific practice or not, you can get away with it, too.
And once positive results are published, few researchers replicate the experiment exactly, instead carrying out 'conceptual replications' that test similar hypotheses using different methods. This practice, say critics, builds a house of cards on potentially shaky foundations.
So, if a study isn't replicated exactly (or as exactly as it can be), it may be because the methods were not described in enough detail for the study to be replicated.  Or, and this is a problem certainly not confined to psychology, the effect was small and significant by chance, as epidemiologist John Ioannidis suggested in a paper published in 2005 that garnered a lot of attention for saying most Big-Splash studies are false.  He explained this in statistical terms, having to do with bias, significance levels, the low prior odds of newly tested hypotheses, and similar issues.
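The statistical core of Ioannidis's argument can be sketched with simple Bayes' rule: when true effects are rare among the hypotheses being tested, and studies are underpowered, false positives at the 5% level can outnumber the true ones.  Here's a minimal sketch of that calculation -- the numbers are made-up illustrations, and his actual model also folds in bias and multiple competing teams:

# Post-study probability that a 'significant' finding reflects a real effect,
# given how rare true effects are among the hypotheses being tested.
# Illustrative assumptions only; not the exact model from the 2005 paper.

def ppv(prior, alpha=0.05, power=0.8):
    """Probability a positive result is a true positive (simple Bayes)."""
    true_positives = power * prior
    false_positives = alpha * (1 - prior)
    return true_positives / (true_positives + false_positives)

# A well-powered study in a field where 1 in 10 tested hypotheses is true:
print(f"prior 0.10, power 0.8: PPV = {ppv(0.10):.2f}")              # ~0.64

# An underpowered study chasing small effects:
print(f"prior 0.10, power 0.2: PPV = {ppv(0.10, power=0.2):.2f}")   # ~0.31

In the underpowered case, most 'positive' findings are false -- which is exactly the headline that got all the attention.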

As the Chronicle story says about non-replicability:
The researchers point out, fairly, that it’s not just social psychology that has to deal with this issue. Recently, a scientist named C. Glenn Begley attempted to replicate 53 cancer studies he deemed landmark publications. He could only replicate six. Six! Last December I interviewed Christopher Chabris about his paper titled “Most Reported Genetic Associations with General Intelligence Are Probably False Positives.” Most!
So, psychology is under attack.  We blogged not long ago about an op/ed piece in the New York Times by two social scientists calling for an end to the insistence that the social sciences follow any scientific method.  Enough with the physics envy, they said, we don't do physics.  Thinking deeply is the answer.  But, would giving these guys free rein to completely make stuff up really be the solution?  Well, it might just be, if their peers agree. But, let's not just pick on psychology.  The problem is rampant throughout the sciences. 

Meanwhile, the motto seems to be:  Haste makes....nutrition for scientists!

8 comments:

James Goetz said...

Great post Anne.

Lately, I am refocusing on the definition of the scientific method. For example, if we interpret unrepeatable observations, including unrepeatable experiments, then at best we are doing scientific speculation, history, or philosophy. We of course can make scientific hypotheses and theories about the past. In these cases, the scientists are not replicating the past, but making inferences from, for example, comparative genetics, comparative anatomy, geology, and the redshift of galaxies. These theories of the past have their limits, while there is no need to doubt, for example, the common genetic ancestry of all animals. However, if interpretations of unrepeatable medical or psychological experiments are called a scientific hypothesis instead of mere speculation, then the researcher has crossed the line into pseudoscience.

I will also clarify that I fully appreciate the need for speculation and philosophy, but the problem is when the speculations and philosophy are improperly labeled a scientific hypothesis.

Per "boring" negative results, perhaps open access journals is a convenient place to publish them.

I hope to have more time to criticize such problems in science, and you guys are a great source of information for this :-)

Anne Buchanan said...

Thanks, Jim. Many people have written and are writing about the philosophy of science, so that's another place to turn.

The question of non-replicability is interesting, because there are many possible reasons why a study can't be repeated. It's not so much of a problem when the effect of the cause is major, and swamps confounders or factors that might have minor effects. You can reproduce cholera pretty reliably, for example. But when the effects of the risk factors you're looking at are minor -- something that, as you know, we write about all the time -- or the effect is due to many factors, or there are multiple ways to produce an effect, then you can easily get into the realm of non-replicable results.

But the wrinkle is that non-replicable doesn't necessarily mean wrong. One population has a disease because of one allele, and another population has the same disease because of another. Both real, both non-repeatable. Even though, by convention, results in science are expected to be repeatable, sometimes that's not an accurate reflection of truth.

James Goetz said...

I wholeheartedly agree that "non-replicable doesn't necessarily mean wrong" because the historical method applied to the non-replicable realm can be powerful and accurate. But regardless of the amount of empirical observation used with the historical method, all historical theories and hypotheses are not scientific theories or hypotheses. Perhaps your example of different alleles causing the same disease was discovered by historical method with a lot of empirical observation.

Anne Buchanan said...

Here's an interesting piece from today's NYT about the unreliability of the social sciences.

Ken Weiss said...

Our forthcoming pair of posts questions what kind of 'science' it is that has such poor predictive power, as in much or most of social and behavioral science, and in the aspects of biomedical, genetic, and evolutionary science that we are typically concerned with.

The classical criteria for scientific knowledge are replicability and predictive ability. Those criteria derive from the idea of laws of Nature. If Nature is not law-like, then what is it?

Clearly these sciences relate to the real world. Clearly some aspects seem genuinely scientific. But what about the rest?

Artists relate to the real world, but by arranging it in their own minds. To what extent are these fields of something-like-science similarly adventures in our own minds?

James Goetz said...

I look forward to these posts. I suppose part of the problem is that circumstances with complex variables are difficult or even impossible to replicate while we still need to make some judgment of complex circumstances that are beyond replication.

Ken Weiss said...

Well, "making some judgment" is an accurate way to put it, and maybe we haven't much choice. But if we know the judgments are usually wildly and upredictably off, then why go to the trouble and cost to collect the data--and how do you know whose judgment to trust when it matters?

In any case, it raises the question about what science is and how the world actually works. That's my interest, since I'm not a policy-maker or physician.

James Goetz said...

Exactly. We are not policymakers, but we are writers and taxpayers who want to inform the public and policymakers about what is scientific theory versus hypothesis versus speculation versus pseudoscience. And the pressure of society, including ourselves, wanting to know answers to various complex questions, combined with the phenomenon of political struggle, increases the challenge of clear and honest communication about these issues.

Incidentally, I recently refocused on this and I am glad that you are starting this series of posts. Despite my opinionated approach, I want to learn more about this from different angles such as yours.