We've already posted some critiques of the current push for ever-larger genomewide association-style studies of disease (GWAS), which have been promoted with glowing promises that huge-scale studies and technology will revolutionize medicine and cure all the known ills of humankind (a slight exaggeration on our part, but not that far off the spin!). We want to explain our reasoning a bit more.
For many understandable reasons, geneticists would love to lock up huge amounts of research grant resources, for huge amounts of time, to generate huge amounts of data that will be deliciously interesting to play with. But such vast up-front cost commitments may not be the best way to eliminate the ills of humankind. It may not even be the best way to understand the genetic involvement in those ills.
In a recent post we cited a number of our own papers in which we've been pointing out problems in this area for many years, and while we didn't give references, we did note that a few others have recently been saying something like this, too. The problem is that the search for genetic differences that may cause disease relies on designs, such as comparing cases and controls, that don't work very well for common, complex diseases like diabetes or cancers. One reason is that if a genetic variant is common, people without the disease (the controls) may still carry a variant that contributes to risk, yet remain disease-free because, say, they haven't been exposed to whatever provocative environment is also associated with risk (diet, lack of exercise, etc.). These designs don't work very well for explaining normal variation, either.
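To make the case-control point concrete, here is a minimal sketch of the arithmetic under a toy model that is entirely our own assumption: a common variant raises risk only in people who are also exposed to some environmental factor. The function name and every number (carrier frequency, exposure frequency, risks) are hypothetical and chosen only for illustration, not taken from any study.

```python
def case_control_summary(carrier_freq, exposure_freq, base_risk, risk_carrier_exposed):
    """Expected case-control figures under a toy gene-by-environment model.

    Assumption (hypothetical): carriers have elevated risk only when exposed;
    unexposed carriers and all non-carriers share the same baseline risk.
    """
    # Carrier risk averaged over exposed and unexposed carriers
    risk_carrier = (exposure_freq * risk_carrier_exposed
                    + (1 - exposure_freq) * base_risk)

    # Population fractions in each cell of the 2x2 table
    cases_carrier = carrier_freq * risk_carrier
    cases_non = (1 - carrier_freq) * base_risk
    controls_carrier = carrier_freq * (1 - risk_carrier)
    controls_non = (1 - carrier_freq) * (1 - base_risk)

    def odds(p):
        return p / (1 - p)

    return {
        # How common the variant is among cases and among controls
        "carrier_share_cases": cases_carrier / (cases_carrier + cases_non),
        "carrier_share_controls": controls_carrier / (controls_carrier + controls_non),
        # Odds ratio a case-control comparison of genotype alone would see
        "odds_ratio_overall": (cases_carrier * controls_non) / (cases_non * controls_carrier),
        # Odds ratio if the comparison were restricted to exposed people
        "odds_ratio_exposed": odds(risk_carrier_exposed) / odds(base_risk),
    }


# Hypothetical inputs: 30% carriers, 20% exposed, 5% baseline risk,
# 15% risk for carriers who are exposed.
summary = case_control_summary(0.3, 0.2, 0.05, 0.15)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

With these made-up inputs, nearly 30 percent of the disease-free controls carry the variant, and the overall odds ratio is only about 1.4 even though the odds ratio among exposed people is about 3.4. Different assumptions would give different numbers; the point is only that a real effect can look weak when the comparison ignores exposure.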
As we have said, the knowledge of why we find as little as we are finding has been around for nearly a century, and it connects us to what we know about evolutionary genetics. Since the facts apply as well to almost any species--even plants, inbred laboratory mice, and single-celled species like yeast--they must be telling us something about life that we need to listen to!
Part of the problem is that environments interact with many different genes to produce the phenotypes (traits, including disease) in ways that would be good to understand. However, our methods of understanding causation necessarily look backwards in time (they are 'retrospective'): we study people who have some trait, like diabetes, and compare them to age-sex-etc. matched controls, to see how they differ. Geneticists and environmental epidemiologists stress their particular kinds of risks, but the trend recently has strongly been to focus on genes, partly because environmental risk factors have proven to be devilishly hard to figure out, and genetics has more glamour (and plush funding) these days: it may have the sexy appearance of real science, since it's molecular!
Like looking in the rear-view mirror, we see the road of risk-factor exposures that we have already traveled. But what we really want to understand is the causal process itself, and for 'personalized medicine' and even public health we need to look forward in time, to current people's futures. That is what we are promising to predict, so we can avoid all ills (and produce perfect children).
We need to look at the road ahead, and what we see in the rear-view mirror may not be all that helpful. We know that the environmental component of most common diseases contributes far more to risk than any specific genetic factors, probably far more than all genetic factors combined do on their own. We know that clearly from the fact that many if not most common diseases have changed, often dramatically, in prevalence just in the last couple of generations, while we've had very good data and an army of investigators tracking exposures, lifestyles, and outcomes.
Those changes in prevalence are a warning shot across the genetics bow, one to which geneticists have had a very convenient tin ear. They rationalize these clear facts by asserting that changes in common diseases are due to interactions between susceptible genotypes and these environmental changes. Even if such unsupported assertions were true, what we see in the rear-view mirror does not tell us what the road ahead will be like, for the very simple but important reason that there is absolutely no way to know what the environmental (that is, the non-genetic) risk factor exposures will be.
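A small worked example may help show why a genotype's measured risk cannot simply be carried forward. It is a sketch under assumptions of our own: the variant matters only in exposed people, and the function name, risks, and exposure frequencies are all hypothetical.

```python
def carrier_relative_risk(exposure_freq, base_risk=0.05, risk_carrier_exposed=0.15):
    """Relative risk of carriers vs. non-carriers in a toy model.

    Assumption (hypothetical): the variant raises risk only in exposed
    people, so the genotype's apparent effect depends on how common the
    exposure is in the population being studied.
    """
    risk_carrier = (exposure_freq * risk_carrier_exposed
                    + (1 - exposure_freq) * base_risk)
    return risk_carrier / base_risk


# The "rear-view mirror" estimate comes from the environment we happened to
# live through; the future environments below are guesses, which is the point.
past_exposure = 0.2
print(f"estimated from the past: {carrier_relative_risk(past_exposure):.2f}")
for future_exposure in (0.0, 0.6):
    rr = carrier_relative_risk(future_exposure)
    print(f"if future exposure is {future_exposure:.1f}: {rr:.2f}")
```

In this toy model the same genotype has a relative risk of about 1.4 under the past exposure level, no excess risk at all if the exposure disappears, and about 2.2 if the exposure becomes three times as common. Which of those applies to someone genotyped today depends on an environment nobody can observe yet.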
No amount of Biobanking will change this, or make genotype-based risk prediction accurate (except for the small subset of diseases that really are genetic), because each future is a new road and risks are inevitably assessed retrospectively. That would hold even if causation were relatively simple and clear, which is manifestly not the case, and no matter how accurately we could identify the genotypes of everyone involved (there are some problems there, too, that we will have to discuss another time).
This is a deep problem in the nature of knowledge about questions of this kind. It is one sober reason, neither far-out nor anti-scientific, why scientists and public funders should be very circumspect before committing major amounts of funding, for decades into the future, to try to track everyone and everyone's DNA sequences. And here we are not even considering the great potential for intrusiveness that such data will enable.
As geneticists, we would be highly interested in poking around in the data that mega-studies would yield. But we think it would not be societally responsible to generate such data, given the other needs and priorities (some of which actually are genetic) that we know we can address with available resources, by working on other problems or taking other approaches.
We can learn things by checking the rear-view mirror, but life depends on keeping our eye on the road ahead.