Wednesday, October 6, 2010

Air pollution causes diabetes?

Here's a paper in Diabetes Care that we found because it was written up in the New York Times on Monday.  Why?  Proper scientific forewarning?  Scaremongering?

The authors of the paper find a correlation between air pollution (fine particulate matter) and type 2 diabetes.  As the NYT puts it:
A strong link exists between adult diabetes and air pollution, according to a new epidemiological study by researchers at Children’s Hospital Boston. The long-term study builds on previous laboratory studies that have tied air pollution to an increase in insulin resistance, a precursor to diabetes.
The researchers used health, economic, geographical and other data to adjust for known diabetes risk factors, such as obesity, exercise, ethnicity and population density. After controlling for these factors, a strong correlation still emerged between diabetes prevalence and particulate air pollution.
So, those of us who live in polluted cities (such as readers of the NYT) now have to worry about getting diabetes through no fault of our own, just because we live where we do, on top of everything else we have to worry about.  And this one we can't outrun.  Go get a face mask, and hurry!

However, the Times's description of the study doesn't tell the whole story.  From the paper itself:
The relationship between PM2.5 [particulate matter 2.5] levels and diagnosed diabetes prevalence in the U.S. was assessed by multivariate regression models at the county level using data obtained from both the Centers for Disease Control and Prevention (CDC) and U.S. Environmental Protection Agency (EPA) for years 2004 and 2005. Covariates including obesity rates, population density, ethnicity, income, education, and health insurance were collected from the U.S. Census Bureau and the CDC. 
The Times story never mentioned the important fact that this was a county-level study.  That is, the study looked at average diabetes prevalence, average obesity, pollution and so on for whole counties, not at individual exposures and covariates.

This is important because of a well-known epidemiological bias called the "ecological fallacy": the problem of attributing group-level characteristics to individuals, or of equating group-level correlations with causation at the level of the individual.  We'd all agree that it would be silly to assume, say, that everyone in a county is a Republican because the county always votes Republican.  In the same way, though it's harder to intuit, a correlation between high pollution levels and high diabetes rates doesn't tell us anything about any single individual's exposure or duration of exposure to pollution, not to mention whether it caused his or her diabetes.
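
To make the point concrete, here is a small sketch in Python.  The numbers are entirely invented for illustration (they are not the study's data, and "exposure" and "outcome" are just hypothetical scores): five made-up counties of five people each, built so that counties with higher average exposure also have a higher average outcome, while within every county the people with more exposure have the lower outcome.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

# Synthetic data (an assumption, invented for illustration): 5 counties, 5 people each.
# County k shifts both exposure and outcome up by k, while a person's own
# offset j raises their exposure and lowers their outcome.
# Each row is (county, exposure, outcome).
people = [(k, k + 5 * j, k - 5 * j) for k in range(5) for j in range(-2, 3)]

def county_means(rows):
    """Average exposure and outcome per county, which is all an ecological study sees."""
    means = []
    for k in sorted({county for county, _, _ in rows}):
        xs = [x for county, x, _ in rows if county == k]
        ys = [y for county, _, y in rows if county == k]
        means.append((sum(xs) / len(xs), sum(ys) / len(ys)))
    return means

mean_x, mean_y = zip(*county_means(people))
county_r = pearson(mean_x, mean_y)
individual_r = pearson([x for _, x, _ in people], [y for _, _, y in people])

print(f"county-level r:     {county_r:+.2f}")      # +1.00
print(f"individual-level r: {individual_r:+.2f}")  # -0.92
```

In this toy example the county averages line up perfectly, yet the same 25 people, looked at one by one, show a strong correlation in the opposite direction.  Real data need not behave this way, of course; the point is only that the county-level result can't tell us which situation we're in.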

There may well be alternative explanations for the correlation.  Perhaps diabetes care is good in that county and a lot of patients moved there, after being diagnosed, to take advantage of the care.  Or any number of other possible scenarios.  And the epidemiologist on the study seems to know this:
“We didn’t have data on individual exposure, so we can’t prove causality, and we can’t know exactly the mechanism of these peoples’ diabetes,” said John Brownstein, an assistant professor of epidemiology at Children’s Hospital Boston and co-author of the study. “But pollution came across as a significant predictor in all our models.”
Now, pollution may in fact cause diabetes.  Our point here is not about causation per se (the biological link doesn't seem obvious from what's known about type 2, adult-onset, insulin-resistant diabetes, though we certainly can't say it's not possible).  Our point is that the authors haven't convincingly demonstrated a causative link, and it was premature to rush this to print -- and for the NYT to pick up the story -- without better evidence.

The ecological fallacy is in every first-year epidemiology textbook -- and the authors of this paper even refer to it.  The related fundamental logical error is equating correlation with causation -- even at the group level.  When authors know they face these issues, the proper thing is not to publish and call their eager friends at the Times, but to take the result as an indicator that it may be worthwhile following up the possible connection in a proper study.  But that's not the era we live in.

6 comments:

  1. Nice post and another case of "correlation does not equal causality". But newspapers sell on this kind of headline.

  2. If we can't get beyond the problem, we're condemned to misunderstand much, and make poor societal decisions. But history doesn't suggest we're even realizing this.

  3. Ken,
    As you note the principal issue here is not that "correlation does not imply causation", but that of the ecological fallacy (and the possibility of Simpson's paradox). It is quite possible that at the individual level, the correlation is nonexistent, or even reversed (the sign of the current ecological correlation changes those possibilities NOT AT ALL). So, even your claim that "...take the result as an indicator that it may be worthwhile following the possible connection up in a proper study" is too strong: the observed correlation here by itself tells us absolutely nothing about what the correlation at the individual level might be like: that is, it is no more an indicator than any other result would have been.

  4. Certainly what you say is technically right. We did not want to be always too negative! However, if the association seems to be strong or potentially quite serious, then it would be reasonable to follow it up in some at least preliminary way to see if there's anything to it.

    But, again, you're right in what you say. There is a similar issue related to the basis of GWAS mapping approaches that use SNP markers to try to identify chromosomal locations of disease-causing alleles; a discussion of this, with an appropriately provocative title, is by Terwilliger and co-authors, called "The fundamental theorem of the HapMap", in the Eur J Hum Genet, in case you're interested.

  5. Agreed, John. We erred on the side of politeness this time.

    And, in fact, the title of the Terwilliger paper is "An utter refutation of the fundamental theorem of the HapMap."

  6. It is quite possible that at the individual level, the correlation is nonexistent, or even reversed (the sign of the current ecological correlation changes those possibilities NOT AT ALL).
