Friday, September 5, 2014

When do you believe research?

At the end of the mini-course we taught in Helsinki, after a week spent discussing what were essentially philosophy-of-science issues, including how to make decisions about cause and effect, how to determine whether a trait is 'genetic', and whether it can be predicted from genes, a student asked how we decide which studies to believe.  That is, responding to our questioning nature, he wanted to know how we decide which research reports to be skeptical about and which to believe.  I've been thinking a lot about that.  I don't really have answers because it's a fundamental question, but here are a few thoughts.

The class, called Logical Reasoning in Human Genetics, is meant to get students thinking about how they know what they think they know.  Ken gave a lecture on the first day in which he talked about epistemological issues, including the scientific method. We're all taught from childhood that knowledge advances by the scientific method. There are multiple definitions, but let's just go with what's cited in the Wikipedia entry on the subject, in turn taken from the Oxford English Dictionary.  It's "a method or procedure that has characterized natural science since the 17th century, consisting in systematic observation, measurement, and experiment, and the formulation, testing, and modification of hypotheses." But most definitions go further, to say one adjusts the hypothesis until there is no discrepancy between it and the latest results.  This is how many web pages and books portray the process.

But this is awfully vague and not terribly helpful (if it's even true: for example, when is there no discrepancy between hypothesis and actual data?).  Who decides what is systematic observation, how and what we measure, how we conduct experiments, and formulate, test and modify hypotheses?  And even if we do agree on all this, it wouldn't give any hint as to which results should be believed.  Any that follow the method?  Any that lend evidence to our hypotheses?  There was plenty of evidence for the sun revolving around the Earth, and spontaneous generation, and the miasma theory of disease, all based on systematic observation and hypotheses, after all.  Clearly, empiricism isn't enough.

In his first lecture, Ken showed this slide:

The essential tenets of the scientific method.  Most of us would include at least some of these criteria in a list of essentials, right?  Ken discussed them all, and then showed why each of them in turn may be useful but cannot in fact be a solid basis for inferring causation.  One may hypothesize that all swans are white, and it may seem to stand up to observation -- but observing a single black swan is enough to sink that theory.  Figuratively speaking, when can we ever be sure that we'll never see a non-white swan?  So induction is not a perfectly reliable criterion for forming general inferences.  Prediction is an oft-cited criterion for scientific validity, but in areas of biology it depends on knowing future environments, which is impossible in principle.  Scientists claim that theories may never be provable but can always be falsified, which leads to better theory. But scientists rarely, if ever, actually work to falsify their own theories.  And one can falsify an idea by a bad experiment even if the idea is correct.  Cutoffs for statistical significance are subjective choices: P = 0.05 was not decreed by God. And so on.
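To make the point about P = 0.05 a little more concrete, here is a minimal sketch in Python.  It is our own illustration, not something from Ken's lecture, and it rests on stated assumptions: every simulated 'study' compares two groups drawn from the same normal distribution (so there is no true effect), the variance is treated as known, and a simple two-sided z-test is used.  The function names and sample sizes are hypothetical.  Under those assumptions, the share of studies declared 'significant' should simply track whatever cutoff we happen to choose.

```python
import random
from statistics import NormalDist, mean


def null_study_p_value(rng, n=50):
    """Two-sided z-test p-value for one simulated study with NO true effect."""
    # Both groups come from the same distribution (mean 0, variance 1).
    a = [rng.gauss(0.0, 1.0) for _ in range(n)]
    b = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Variance is assumed known (1 in each group), so the standard error
    # of the difference in means is sqrt(2 / n).
    z = (mean(a) - mean(b)) / (2.0 / n) ** 0.5
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def simulate_p_values(n_studies=2000, seed=1):
    """P-values from many independent studies of a non-existent effect."""
    rng = random.Random(seed)
    return [null_study_p_value(rng) for _ in range(n_studies)]


def fraction_significant(p_values, alpha):
    """Share of studies that would be called 'significant' at cutoff alpha."""
    return sum(p < alpha for p in p_values) / len(p_values)


if __name__ == "__main__":
    p_values = simulate_p_values()
    for alpha in (0.10, 0.05, 0.01):
        share = fraction_significant(p_values, alpha)
        print(f"cutoff {alpha:.2f}: {share:.3f} of null studies 'significant'")
```

The data are identical in every row of the output; only the cutoff changes.  Which of those null 'findings' get reported as discoveries is therefore a matter of convention, not of nature.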

So, then Ken added the following criterion:

This is probably a better description of how scientists actually do science.  And I'm writing this in Austria, so I'll mention that if you've read "Against Method", by the Austrian philosopher of science Paul Feyerabend, this will sound familiar.  Feyerabend believed that strict adherence to the scientific method would inhibit progress, and that a bit of anarchy is essential to good science.  Further, he argued that the usual criteria, e.g. consistency and falsification, are antithetical to progress. Indeed, as a philosopher who took a long, hard look at the history of scientific advances, Feyerabend concluded that the best description of good science is "anything goes," a phrase for which he is famous, and often condemned. But he didn't mean it as a principle; rather, it was a description of how science is actually done.  It is a social and even political process.

However, even an anarchic bent doesn't help us decide which results to believe, even if it does mean that we shouldn't assume that sticklers for method have an advantage.

How do we decide?
A few weeks ago we wrote about a paper that claimed that tick bites are causing an epidemic of red meat allergies in the US and Europe.  Curious.  Curious enough to lead me to read 3 or 4 papers on the subject, all of which suggested a pattern of exposure and symptoms consistent with the habitat of the tick, as well as a mechanism that explained how the tick bite could cause this often severe allergy.  Seemed convincing to us.

But, someone on Twitter wasn't convinced:
The link is to a Lancet article, but it restricts its discussion to the anti-science claims of those who believe that Lyme disease is not what 'evidence-based' medicine says it is.
"Similar to other antiscience groups, these advocates have created a pseudoscientific and alternative selection of practitioners, research, and publications and have coordinated public protests, accused opponents of both corruption and conspiracy, and spurred legislative efforts to subvert evidence-based medicine and peer-reviewed science. The relations and actions of some activists, medical practitioners, and commercial bodies involved in Lyme disease advocacy pose a threat to public health."
But should we be skeptical about all tick-borne diseases?  The CDC still lists a number of them.  I don't know enough about this subject to comment further, but it's interesting indeed that antiscience claims can themselves be couched in a semblance of the scientific method.  Or at least a parallel track, with its own 'experts', publications, peer reviewers, and so on.  In fact this makes the question of how one decides what to believe almost mystical, or dare we say religious. Surprisingly, while it is often said that science, unlike other areas of human affairs, isn't decided by a vote, in reality group consensus about what is true is a kind of vote among competing scientists; the majority, or those in the most prominent positions, do tend to set established practice and criteria.

Or what about this piece, posted last week by the New York Times, on the effects of bisphenol on ovarian health?  Evidence seems to be mounting, but even people in the field are cautioning that it's hard to tell cause and effect.  Or, what about the causes of asthma?  Environmental epidemiology has found that breast feeding is a cause, but also bottle feeding, excessive hygiene, or pollution.  Same methods -- "systematic observation, measurement, and experiment, and the formulation, testing, and modification of hypotheses" -- contradictory results.

Or, what about climate change?  How do we decide what to believe?  Few of us are expert enough in meteorology, geology, or climate history to make a decision based on the data, so essentially we must decide based on whether we believe -- yes, believe -- that the science is being rigorously conducted.  But, how would we know?  Do we count the number of peer-reviewed papers reporting that the climate is changing?  If so, that's just a belief that peer review adds weight to findings, rather than simply being evidence of a current fad in thinking about climate, or circling of wagons, or some other sociological quirk of science.  Do we count the number of papers or op-ed pieces written by US National Academy members, or Nobel prize winners?  In which case, we're even further from actual scientific evidence.

We can list one criterion that, today, must be true.  The results must be evolutionarily sound.  Evolution is probably as close as biology comes to 'theory': descent with modification from a common ancestor.  If results don't fit within that theory, they are probably wrong.  But not definitively -- we should always be testing theory.

Here's another one that must be true when considering causation -- the cause must precede the effect.  (This is one in a list of nine criteria sometimes relied upon in epidemiology; the rest aren't necessarily true, as was recognized even by Bradford Hill, who devised the list.)  But this isn't terribly helpful. Many things can precede an effect, not just one, and many things that precede the event are unrelated to it.  Which such preceder do we accept?

Two criteria that might help are replication and consistency, but for many reasons they can't be considered sufficient or necessary.  They might confirm what we think we know -- but consistent and replicated findings of disease due to bad air, prior to the germ theory of disease, confirmed miasma as a cause.  Life is about diversity, and that is how it evolves, so something about, say, genetic causation can be true under some circumstances but not all, which means replication is not a necessary criterion for it to be true.

Science is done by scientists in (and these days supported by) society.  We need jobs and we try to seek truth.  But one proverbial truth is that science should always be based on doubt and skepticism: rarely do we know everything perfectly.  Once we stop questioning -- and the hardest person to question is oneself -- then we become dogmatists, and our science is not that different from received truth in religion.

Scientists may rarely think seriously or critically about their criteria for truth.  We believe that there is truth, but it's elusive much of the time, especially in complex areas like evolutionary biology, genetics, and biomedical causation.  A major frustration is that we have no formal criteria for inference that always work.  Inference is a kind of collective, social decision process, based on faith, yes, faith in whatever a given scientist believes or is pressured by his/her peers to believe.  The history of science shows that this year's 'facts' are next year's discards.  So which study do we believe when there are important implications for that decision?  If it's not true that you can "use whatever criteria you want", for various pragmatic reasons, then what is true about scientific inference in these areas of knowledge?


Anonymous said...

If you are not aware, we recently published a paper debunking a previous 'discovery' that had been widely touted in media.

I cannot believe how sloppy the debunked paper is in every step of analysis, but sloppiness is a feature, not a bug, for a large number of papers in related fields. During the course of my investigation of related fields, I noticed a trend that is quite disturbing. The government (NIH) is funding a large number of psychologists, because the government has a separate budget for them. Those guys are told to work on their subject area (psychology) and are also paid to investigate gene and genome-based mechanisms, because that is the fashionable area of the day. As a result, any nonsense is funded, gets published in good journals and is then touted by the media as a major discovery.

As a result, the same nonsense gets 'reproved' by another psychologist in another sloppy paper published in a major journal, and then another. Soon you have a theory confirmed by many papers that will go into textbooks.

Based on what I have seen, I will NOT trust any research, especially research funded by US government agencies, unless I personally know the researcher to be careful, and even then I will look critically into the paper before I trust it.


Ken Weiss said...

To Manoj
I think the core of your comment is a profound point, though I don't think US government funding is any particular culprit (think of drug trials, and is the EU or China any better?). But there is a serious problem these days, of the sort you identify--somewhat off the exact topic of specific criteria for inference, but highly relevant. We think we'll write more on this topic, in reference to your message and the paper you linked us to.