Seriously, why? Many people have embraced direct-to-consumer (DTC) genotyping, or whole genome sequencing, for reasons that we admit we don't understand. But we are clearly missing something. Do people believe that, as a general statement, future disease is truly predictable from their genome? We think that most geneticists, at least, would say not.
Just to be clear, here we're referring to diseases the person doesn't yet have. If one already has some disease, the usefulness, if any, of genome testing would be limited to those instances where specific causal variants are known to respond to specific kinds of treatment. This, however, is only a small minority of cases, in which the causal variants have relatively clear, strong, and consistent effects. So if future disease generally isn't predictable from the genome, why do people do it? In today's post we lay out our view, hoping it might elicit insights from people who see it differently.
Disease risk prediction
We'll start with GWAS (genomewide association studies), the most common method these days for looking for causal genes. A few recent exchanges on Twitter make it obvious that what people think about the success of GWAS is a glass half full, glass half empty kind of thing. Everyone agrees that genomewide association studies have not explained much variation in most traits, but that's where the agreement ends. Supporters say it doesn't matter because GWAS are teaching us a lot about causal pathways, and have found thousands of replicated signals for complex disease, even if with small effect. Detractors say GWAS are an expensive way to gain very little, and if the point is to be able to predict disease, they can't get us there. A more sanguine view, the glass half full and half empty view, which we hold, is that GWAS have very successfully revealed the general shape of genomic effects on traits -- but did so years ago, and we need not continue to expand the same approach just to identify ever-more-minuscule effects.
Dr Muin Khoury at the CDC, in a recent blog post about the potential public health impact of GWAS, notes the glass half full/half empty quality of the endeavor as well. These studies, he writes, have produced massive amounts of data, but with little application to public health as yet. Though he cites a recent paper by Teri Manolio reporting that specific applications are beginning to be seen, he cautions that it will be decades before the full benefit is felt.
But is it possible?
Of course, effect sizes will generally need to be larger. Further, such time and size estimates depend on extrapolation from what is known today, and may be wildly inaccurate if some deeper insight about genomic causation comes along. From our point of view, what we know today does not suggest a high payoff from continuing business as usual. And we have to be especially circumspect about messages from on high, that is, from heavily funded investigators or, even more so, from NIH staffers (like Teri) who are fine people but who fund the work and hence have a very clear if not unavoidable interest in touting its results.
That said, we think it's still fair to say that many people, including human geneticists, agree that we're far from being able to accurately predict complex disease -- from GWAS or anything else, including whole genome sequencing (WGS). Even so, many people look forward to the day when newborns leave the hospital with their genome on a chip. The drumbeat for personalized genomic medicine, backed by administrative decisions to push much of the research funding toward that promise, is not trivial. This is curious, given that a major lesson of the genome era is that environment is a huge factor in the risk of common diseases that fell us, and yet future environments are inherently unpredictable. We've also learned that there are multiple genetic pathways to many traits.
Risk is elusive
There are also methodological issues with risk prediction from direct-to-consumer companies. A new paper in Genetics in Medicine ("Variations in predicted risks in personal genome testing for common complex diseases", Kalf et al.) reports a comparison of disease prediction from three DTC companies. The authors found substantial differences in predicted risk estimates of specific diseases because the companies use different SNPs, different average population risk estimates, and different formulas for calculating risk. Indeed, average population risk estimates can change with every new study because calculated risk is never the same in different population samples. Is anyone 'right' here? Is everyone 'wrong' and, if so, to a measurable or knowable extent?
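To make the point concrete, here is a minimal sketch of how one and the same person can receive three different risk numbers. Everything in it is an assumption made up for illustration: the SNP names, the effect sizes, the average population risks, and the company labels are hypothetical, and the two formulas (scaling the average risk directly, versus converting to odds and back) are generic textbook approaches, not the actual methods of any company in the Kalf et al. comparison.

```python
from math import prod


def risk_multiplicative(avg_risk, effects):
    """Scale the average population risk by each effect size (capped at 1)."""
    return min(1.0, avg_risk * prod(effects))


def risk_via_odds(avg_risk, effects):
    """Convert average risk to odds, multiply by each effect size, convert back."""
    odds = avg_risk / (1.0 - avg_risk) * prod(effects)
    return odds / (1.0 + odds)


# Hypothetical effect sizes attached to one person's genotypes (invented values).
person_effects = {"snp_a": 1.3, "snp_b": 0.9, "snp_c": 1.2, "snp_d": 1.5}

# Hypothetical companies: each uses its own SNP panel, its own average
# population risk, and its own formula.
companies = {
    "company_1": {"snps": ["snp_a", "snp_b"], "avg_risk": 0.20,
                  "formula": risk_multiplicative},
    "company_2": {"snps": ["snp_a", "snp_c", "snp_d"], "avg_risk": 0.25,
                  "formula": risk_via_odds},
    "company_3": {"snps": ["snp_b", "snp_c"], "avg_risk": 0.15,
                  "formula": risk_multiplicative},
}


def predicted_risks(effects, companies):
    """Return each company's predicted risk for the same person."""
    results = {}
    for name, cfg in companies.items():
        used = [effects[snp] for snp in cfg["snps"]]
        results[name] = cfg["formula"](cfg["avg_risk"], used)
    return results


if __name__ == "__main__":
    for name, risk in predicted_risks(person_effects, companies).items():
        print(f"{name}: predicted lifetime risk {risk:.1%}")
```

Nothing about the person changes between the three calls; only the panel, the baseline, and the arithmetic do, and that alone is enough to move the answer.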
To date, the best predictor of future disease is family history. If a disease follows Mendelian patterns of inheritance, or risk levels are correlated among close family members, you know that risk genotypes are common in your family, and you can infer that you are at higher risk without knowing specifically which one or ten or hundreds of genomic variants are responsible. That is, we don't need GWAS for these diseases.
For the clearer, single-gene diseases, we already have an informed medical system, with professional genetic counselors and physicians, that sorted these out long ago and has long been set up to use various kinds of data to provide very important and useful advice. This is not to say that the system is infallible for people with such diseases. Yes, if the cause of the disease or disorder hasn't yet been identified, WGS or WES (whole exome sequencing) may be helpful for doing so, though certainly not always. Because it's the very rare Mendelian disorders that remain unexplained, the search can be complex -- the causal variant may be due to somatic mutation, or may lie in a regulatory region rather than in protein-coding sequence, or the cause may be multiple interacting genes.
In fact, it may not be widely understood that GWAS works because of inheritance and is essentially itself a kind of family data -- but one with unknown, very deep family connections. It has some advantages in that respect, because if cases are only distantly related, they may only share narrow chromosome regions (at the causal genes), whereas close relatives share huge fractions of their genomes. By contrast, if the disease is genetically caused, unaffected controls will be less closely related to each other than are the cases.
In this sense, GWAS removes the close-relationship 'noise' of shared variation. The problem is that when traits are caused by many different genes, family members may have simpler sets of causal genes than random sets of cases and controls. So there are statistical issues at play here. Nonetheless, for the important, common complex diseases, family data are generally more informative than GWAS-type data. [There are other issues too involved to go into here, such as the guesspothesis that common diseases are caused by very rare genetic variants that might be found in genome sequence data in families.]
So, as we see it, GWAS -- or any other way to identify genetic causation -- will only rarely be useful for predicting common complex diseases in individuals. Again, environment is the often intractable but most important wild card. So, why are currently healthy people interested in having their genomes typed or sequenced? If you've done it, we'd love to know.