The program mentioned an entertaining web site, Spurious Correlations, that you can find here. It's run by Tyler Vigen, a Harvard law student, and it makes the correlation/causation problem glaringly obvious. There is even a feature for finding your own spurious correlation.
Figure: Number of people who died by becoming tangled in their bedsheets correlates with Total revenue generated by skiing facilities (US). Source: Spurious Correlations
How to tell if a correlation is spurious is no easy matter. If many things are woven together in nature, or society, they can change together because of some over-arching shared factor. But just because they are found together does not mean that in any practical sense they are causally related. Whether a correlation is statistically significant, meaning that it is unlikely to arise by chance, rests on a subjective judgment about what you count as 'unlikely.' After all, very unlikely events do occur!
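The shared-factor point can be made concrete with a small simulation. This is only a sketch under made-up assumptions: the names `simulate_shared_factor`, `shared`, and `noise_sd` are hypothetical, the data are pure Gaussian noise, and the shared factor can be subtracted out only because we generated it ourselves, which is rarely possible with real data.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def simulate_shared_factor(n=2000, noise_sd=0.5, seed=1):
    """Two variables driven by one shared factor, with no link between them."""
    rng = random.Random(seed)
    shared = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # x and y each equal the shared factor plus their own independent noise.
    # Neither one is ever computed from the other.
    x = [s + rng.gauss(0.0, noise_sd) for s in shared]
    y = [s + rng.gauss(0.0, noise_sd) for s in shared]
    r_raw = pearson(x, y)
    # Assumption: the shared factor is known exactly, so it can be removed.
    x_rest = [xi - s for xi, s in zip(x, shared)]
    y_rest = [yi - s for yi, s in zip(y, shared)]
    r_adjusted = pearson(x_rest, y_rest)
    return r_raw, r_adjusted

if __name__ == "__main__":
    r_raw, r_adjusted = simulate_shared_factor()
    print(f"correlation of x and y:             {r_raw:.2f}")
    print(f"after removing the shared factor:   {r_adjusted:.2f}")
```

In this setup the two variables correlate strongly even though neither affects the other, and the correlation should largely vanish once the shared factor is taken out.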
Causation can be indirect, and it is not always easy to understand just what it means for one thing to 'cause' another. If wealth leads to buying racy cars and racy cars are less safe, is it the driving, the car, or the wealth that 'causes' accidents? If AIDS can be a result of HIV infection, but you can get some symptoms of AIDS without HIV, or have HIV without the symptoms, does HIV cause AIDS? If the virus is indeed responsible, but only drug users or prostitutes get or transmit it, is the virus, the drug use, prostitution, or using prostitutes the 'cause'?
Another problem, beyond how we think about causation and how we judge when a correlation is 'significant', relates to what you measure and how you search for shared patterns. If you look through enough randomly generated patterns, eventually you'll find ones that are similar, with absolutely no causal connection between them.
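Here is a rough sketch of that search problem, not a description of how the Spurious Correlations site actually works. The function names, the choice of 200 series with 10 points each, and the seed are all arbitrary assumptions for illustration. Every series is independent random noise, yet comparing all pairs should turn up at least one correlation that looks striking.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def random_series(rng, length):
    """Independent Gaussian noise: no series has any influence on another."""
    return [rng.gauss(0.0, 1.0) for _ in range(length)]

def strongest_spurious_pair(n_series=200, length=10, seed=0):
    """Compare every pair of unrelated series; return the largest |r| found."""
    rng = random.Random(seed)
    series = [random_series(rng, length) for _ in range(n_series)]
    best_r, best_pair = 0.0, None
    for i in range(n_series):
        for j in range(i + 1, n_series):
            r = pearson(series[i], series[j])
            if abs(r) > abs(best_r):
                best_r, best_pair = r, (i, j)
    return best_r, best_pair

if __name__ == "__main__":
    r, pair = strongest_spurious_pair()
    n_pairs = 200 * 199 // 2
    print(f"searched {n_pairs} pairs of unrelated series")
    print(f"strongest correlation: r = {r:.2f} between series {pair}")
```

The more series you search, and the fewer data points each one has, the stronger the best accidental match tends to be.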
Looking at the examples on the above web site should be a sobering lesson in how to recognize bad, or overstated, or over-reported science. It won't by itself answer the question about how to determine when a correlation means causation. Nobody has really solved that one, if indeed it has any sort of single answer. And there are some curious things to think about.
Just what is causation?
In a purely Newtonian, deterministic universe, everything is in a sense causally connected to everything else, and was determined by the Big Bang. For example, with universal gravity everything literally affects everything else, and does so through a totally deterministic causal process.
In that sense nothing at all is truly probabilistic. But quantum mechanics and various related principles of physical science hold that some things may really be truly probabilistic rather than deterministic. If that is right, then the idea of a 'cause' becomes rather unclear. How can some outcome truly occur only with some probability? It verges on an effect without an actual cause. For example, if the probability of something happening is, say, 15%, what establishes that value? What causes it? A process whose randomness is real, and not just a result of our inability to measure it precisely, in a sense redefines the very notion of cause. Such things, some of them seemingly true of the quantum world, really do violate common sense. So the whole idea of correlation vs causation takes on many different, subtle colors.