An inconvenient truth is that the two, retrospective and prospective analysis, are not the same; their connection hinges on these assumptions, and the assumptions are by no means obviously true. We have written about this basic set of problems many times here.
Now a new study, which we first saw reported here in the NY Times, finds that while overall death rates in the US have generally been dropping, this masks "the declining health and fortunes of poorly educated American whites. In middle age, they are dying at such a high rate that they are increasing the death rate for the entire group of middle-aged white Americans, [authors] Dr. Deaton and Dr. Case found...The mortality rate for whites 45-54 years old with no more than a high school education increased by 135 deaths per 100,000 people from 1999 to 2014."
This is very different from other developed countries for this particular age group, as shown by this figure from the authors' PNAS paper, and deviates from the generally improving age-specific mortality rates in those countries.
|From Deaton and Case, PNAS Nov 2015|
There are lots of putative reasons for this observation. The main causes of death were suicide, drug use, and alcohol-related disease, as shown below by the second figure from their paper, with associated factors including mental illness, financial stress, opiate misuse, and so on.
|From Deaton and Case, PNAS Nov 2015|
There are sociological explanations for these results, results that other demographic investigators had apparently not noticed. They do not seem to be mysterious, nor is there any suggestion of scientific error involved. Our point is a different one, based on these results being entirely true, as they seem to be.
When the future is unpredictable, to an unpredictable or unknowable extent
Why were these findings a surprise? First, perhaps, because nobody bothered to look carefully at this segment of our society, or at these particular subsets of the data. To this extent, predictions of disease based on GWAS and other association studies of risk will have used past exposure-outcome associations to predict today's disease occurrences. But they'd have been notably inaccurate, because the factors Deaton and Case identified were not included, and/or because behavioral patterns changed in ways that couldn't have been taken into account in past studies. There may of course be other causes that these authors didn't observe or consider that account for some of the pattern they found, and there may be other subsets of populations with lower or higher risks than expected, if only investigators happened to look for them. There is, of course, no way to know what data, causes, or subsets one may not have known about, not have measured, or simply not considered.
That is a profound problem with risk projections based on past observations. The risk-factor assessments of the past were adjusted for various covariates in the usual way, but one can't know everything one should have included. There is simply no way to know that and, more profoundly, as a result no way to know how inaccurate one's risk projections are. But that is not even the most serious issue.
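To make the point concrete, here is a minimal sketch, in Python with entirely invented numbers, of how an unmeasured covariate can distort a risk estimate: an exposure with no causal effect at all appears strongly associated with the outcome when the factor driving both exposure and outcome is left out of the analysis.

```python
import random

# A hypothetical simulation (all numbers invented for illustration): a hidden,
# unmeasured factor drives both an exposure and the outcome. The exposure has
# no causal effect, yet the crude analysis finds a strong association.
random.seed(42)
n = 200_000

crude = {"exp": [0, 0], "unexp": [0, 0]}            # [cases, total]
strata = {False: {"exp": [0, 0], "unexp": [0, 0]},  # tallies by hidden factor
          True:  {"exp": [0, 0], "unexp": [0, 0]}}

for _ in range(n):
    hidden = random.random() < 0.5                        # unmeasured covariate
    exposed = random.random() < (0.8 if hidden else 0.2)  # exposure tracks it
    case = random.random() < (0.30 if hidden else 0.05)   # outcome: hidden only
    for tally in (crude, strata[hidden]):
        cell = tally["exp" if exposed else "unexp"]
        cell[0] += case
        cell[1] += 1

def risk_ratio(tally):
    """Risk among exposed divided by risk among unexposed."""
    return (tally["exp"][0] / tally["exp"][1]) / (tally["unexp"][0] / tally["unexp"][1])

crude_rr = risk_ratio(crude)                         # misleadingly elevated
adjusted = [risk_ratio(s) for s in strata.values()]  # both near 1.0
print(crude_rr, adjusted)
```

Within each stratum of the hidden factor the exposure shows no effect, but the analyst who never measured that factor sees a risk ratio well above 2; and unlike in this toy setup, a real analyst has no way to know such a factor was missing.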
Much deeper is the problem that even if all exposures and behaviors of the study subjects from whom risk estimates were derived by correlation studies had been perfectly measured, those estimates have unknown and unknowable relevance to future risks. The reason is that the exposures of people in the future to these same risk factors will change, even if their genomes don't (and, of course, no two current people have the same genome, nor the same as anyone's in the studies on which risks were estimated). Even if the per-dose effects were perfectly measured (no errors of any kind), the mixture of exposures to these factors will not be the same, and hence the achieved risk will differ. There is no way to know what that mix will be.
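A toy calculation, with hypothetical numbers and the simplifying assumption of independent exposures with multiplicative effects, illustrates this: hold the per-exposure relative risks fixed and perfectly known, and the population's achieved risk still shifts as the exposure mix shifts.

```python
# Hypothetical per-exposure relative risks; assume they are known exactly.
relative_risk = {"smoking": 2.5, "heavy_drinking": 1.8, "opioid_use": 3.0}

def population_risk(prevalence, baseline=0.01, rr=relative_risk):
    """Expected individual risk, assuming independent exposures whose
    effects multiply (a simplifying assumption for illustration only)."""
    risk = baseline
    for exposure, p in prevalence.items():
        risk *= (1 - p) + p * rr[exposure]  # E[RR^X] for a Bernoulli(p) exposure
    return risk

# Same risk factors, same per-dose effects, different exposure mixes:
past_mix   = {"smoking": 0.30, "heavy_drinking": 0.10, "opioid_use": 0.02}
future_mix = {"smoking": 0.15, "heavy_drinking": 0.12, "opioid_use": 0.15}

print(f"risk under past mix:   {population_risk(past_mix):.5f}")
print(f"risk under future mix: {population_risk(future_mix):.5f}")
```

Even in this best case, where every per-dose effect is measured without error, projecting yesterday's mix forward gives the wrong answer for tomorrow; and in reality the future mix is not merely unmeasured but unknowable.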
Worse, perhaps by far, is that future risk exposures are unknowable in principle. If a new drug for treating people under financial stress, or a new recreational drug, or a new type of cell phone or video screen, or a new diet or behavioral fad comes along, it may substantially affect risk. It will modify the mix of existing exposures, but its quantitative effect on risk simply cannot be factored into the predicted risks because we can't consider what we have no way to know about.
The current study is a miners' canary in regard to predictions of health risks, whether from genetic or environmental perspectives. This particular study is retrospective, and simply shows the impact of failing to consider variables, relative to what had been concluded (in this case, that there has been a general improvement in mortality rates). The risk factors and causes of death reported are within the general set of things we know about, and the study in this case merely shows that incomplete use of the data and so on--not any form of cheating, bad measurement, etc.--was responsible for the surprise discovery. These things can easily be corrected.
But the warning is that there are likely many factors related to health experience that are still not measured, but should be, and also an unknown number that cannot have been measured, for the simple reason that they do not yet exist. The warning canaries have been cheeping as loudly as they can for quite a while, in regard to both environmental and genomic epidemiology. The fault lies not in the canaries, but in the miners' leaders, the scientific establishment, who don't care to hear their calls.