Thursday, December 30, 2010

The problem of correlation and causation is solved.....or not

The BBC Radio 4 program, More or Less, is a show about statistics and how they are used and abused in reporting the news. Among its other regular messages, the presenters spend a lot of time explaining that correlation is not causation, which of course is something we like to hear, since we say it a lot on MT, too (e.g., here).

For the 12/17 show, they decided to test science journalists in Britain, to see whether they'd bite on a correlation/causation story the show cooked up, or whether they were by now savvy enough not to.  The numbers were true, but the mathematician on the show tried to sell the idea that one variable caused the other, hoping the story would warrant a spot on the news.

This guy's story was that there's an extremely strong correlation between the number of mobile phone towers in a given location and the number of births.  In fact, each tower is correlated with 17.4 births, to be precise.  A small village with only 1 tower will have very few births, and a city with a lot of towers will have many more.  Well, no one bit.  Or rather, one media outlet bit on the story, Radio Wales, which wanted to talk with him about the problem of confusing correlation and causation.  Apparently it was pretty obvious.

At first glance, the mathematician assumed, it would appear that the number of towers causes an increase in births.  But in fact, of course, both the number of towers and the number of births are consequences of population size.  They are confounded by population size, an unmeasured variable that affects both observed variables.  And, regular readers know that the issue of confounding is another frequent feature of MT.
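The tower-and-birth effect is easy to reproduce in a toy simulation.  This is a sketch with invented numbers, nothing from the radio program: both variables are driven by population size, so their raw correlation is nearly perfect, while the partial correlation controlling for population is near zero.

```python
# Toy illustration of confounding by population size.
# All numbers are invented for illustration.
import random

random.seed(0)

pops, towers, births = [], [], []
for _ in range(500):
    pop = random.randint(1_000, 1_000_000)            # the confounder
    pops.append(pop)
    towers.append(pop / 10_000 + random.gauss(0, 5))  # towers track population
    births.append(pop / 80 + random.gauss(0, 200))    # births track population

def corr(xs, ys):
    """Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return sxy / (sx * sy)

r_tb = corr(towers, births)   # towers vs. births: spuriously strong
r_tp = corr(towers, pops)
r_bp = corr(births, pops)

# Partial correlation of towers and births, controlling for population:
partial = (r_tb - r_tp * r_bp) / (((1 - r_tp**2) * (1 - r_bp**2)) ** 0.5)

print(f"raw correlation: {r_tb:.3f}")                # close to 1
print(f"controlling for population: {partial:.3f}")  # close to 0
```

The "effect" of towers on births vanishes the moment the confounder is measured and controlled for, which is exactly what makes unmeasured confounders so treacherous.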

The More or Less presenter is hoping that the fact that this story had basically no takers means that British science journalists are beginning to get the correlation-doesn't-equal-causation message, though as the mathematician pointed out, a recent story about mobile phone use causing bad behavior at school suggests otherwise.  And, a glance at the BBC science or health pages is an almost daily confirmation that the problem persists, something we also point out on an as-needed basis.

But that's not really what interested us about this story.  What interested us was what happened next, when the presenter asked the mathematician why making causal links was so appealing to humans, given that they are so often false.

The mathematician answered that it's just our instinct: our brains have developed to recognize patterns and respond to them.  He said we think of patterns as causal links because 'we survive better that way.'  Our ancestors thought that the movement of the stars causes the seasons to change, for example -- and, somehow, that allowed them to live longer.  Thus, he said, it's hard to overcome our instinct to assign causality.

Translated, what he meant was that we evolved to make sense of patterns by finding causal links between two things.  (If true, this certainly isn't unique to humans -- we used to have a dog who was terrified when the wind closed a door.  But if a human closed it, that was perfectly fine.  She actually did understand causation!)

But isn't the mathematician making the very same error he cautions against?  Because we evolved, and because we can see patterns, one caused the other?  This is also something we write a lot about: the idea that because a trait exists, it must have an adaptive purpose -- the Just-So approach to anthropology, or genetics.  Many things come before many other things, but that doesn't help identify the causal principles that connect particular sets of things.  And correlations can arise between variables in many different ways.  Most are not known to us, or at least we're usually just guessing about what the truth is.


James Goetz said...

Perhaps that was the mathematician's second attempt to get the media to bite. :)

Ken Weiss said...

It's somewhat worse, actually. It's confusing axioms with inferences.

The axiom--assumed to be true but not proven or provable--is that everything here needs an adaptive Darwinian explanation.

The inference is that our confusion of correlation with causation is a by-product of our adaptation to noticing co-occurrence: those who didn't, died out.

The strength of reasoning--or its lack of it--is about the same in both instances. But we are so desirous of having answers, and we so like nice 'closed' (no-doubt) stories, that we have a hard time resisting them.

Hmmm, maybe that's because we evolved to need answers so we could find our prey and avoid our enemies.....

John said...

Well, in some cases we can rely on math: not all correlations are subject to the "correlation is not causation" excuse at all. In particular, there are calculable circumstances under which the claim is just false. That is, as Cornfield showed in response to Fisher's chain-smoking claim that the (very high) correlation between smoking and lung cancer was "just a correlation", you can compute how large the association between the alleged confounder (or its multivariate set) and the exposure must be to produce the observed correlation, and reject that alternative on those grounds alone.
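Cornfield's bound is simple enough to sketch numerically. This is a hedged illustration under my own simplifying assumptions (a single binary confounder and no true effect of the exposure itself); the function and the figures are mine, not Cornfield's:

```python
def apparent_rr(p1: float, p0: float, rr_u: float) -> float:
    """Relative risk of disease, exposed vs. unexposed, produced purely
    by a binary confounder U when the exposure itself has no effect.
    p1, p0: prevalence of U among the exposed / unexposed groups;
    rr_u: the U-disease relative risk."""
    return (p1 * rr_u + (1 - p1)) / (p0 * rr_u + (1 - p0))

# Suppose smoking shows an observed lung-cancer RR of about 9.  Cornfield's
# argument: a confounder alone can only produce that if it is itself at
# least ~9 times as prevalent among smokers.  Even a confounder with a
# huge disease effect (rr_u = 50), carried by all smokers but only 20% of
# non-smokers (prevalence ratio p1/p0 = 5), falls well short:
print(apparent_rr(1.0, 0.2, 50.0))   # about 4.63, well below 9
```

In the limit of an infinitely strong confounder the spurious relative risk approaches the prevalence ratio p1/p0, which is the inequality Cornfield used against Fisher's "constitutional hypothesis": no plausible hidden factor is nine times as common among smokers.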

Ken Weiss said...

Thanks very much for the reference and these pointers. I had not known of this paper, though I had known of Fisher and the smoking gun.

The statistical issue here is relevant to many things, but perhaps not to evolutionary hypotheses, because the underlying concepts, including probability and replicability, don't have the same meaning there as is typically assumed in epidemiology--and certainly in the cell-phone example.

I wonder how often the Cornfield points actually apply even in epidemiology, given the heterogeneity and changeability of the complex networks of causation, behavior, exposure, and so on. I'd have to play around with the material in the paper (your link) to get a sense in my own mind of how applicable or useful the ideas would be.

But they certainly seem to be highly relevant, and generally not sufficiently part of routine thinking (my own included!). So, again, thanks.