The Mermaid's Tale: If I'm healthy, why should I have my genome sequenced?

Tuesday, August 6, 2013

Seriously, why? Many people have embraced direct-to-consumer (DTC) genotyping, or whole genome sequencing, for reasons that we admit we don't understand. But we are clearly missing something. Do people believe that, as a general statement, future disease is truly predictable from their genome? We think that most geneticists, at least, would say not.

Just to be clear, here we're referring to diseases the person doesn't yet have. If one already has some disease, the usefulness, if any, of genome testing would be limited to those instances where specific causal variants are known to respond to specific kinds of treatment. This, however, is only a small minority of cases, in which the causal variants have relatively clear, strong, and consistent effects. If this isn't the case, why do people do it? In today's post we lay out our view, hoping it might elicit insights from people who see it differently.

Disease risk prediction
We'll start with GWAS (genomewide association studies), the most common method these days for looking for causal genes. A few recent exchanges on Twitter make it obvious that what people think about the success of GWAS is a glass half full, glass half empty kind of thing. Everyone agrees that genomewide association studies have not explained much variation in most traits, but that's where the agreement ends. Supporters say it doesn't matter because GWAS are teaching us a lot about causal pathways, and have found thousands of replicated signals for complex disease, even if with small effect. Detractors say GWAS are an expensive way to gain very little, and if the point is to be able to predict disease, they can't get us there. A more sanguine view, the glass half full and half empty view, which we hold, is that GWAS have very successfully revealed the general shape of genomic effects on traits -- but did so years ago, and we need not keep expanding the same approach just to identify ever-more-minuscule effects.

Dr Muin Khoury at the CDC, in a recent blog post about the potential public health impact of GWAS, notes the glass half full/half empty quality of the endeavor as well. These studies, he writes, have produced massive amounts of data, but with little application to public health as of yet. Though he cites a recent paper by Teri Manolio reporting that specific applications are beginning to be seen, he cautions that it will be decades before the full benefit will be felt.

But is it possible?
Of course, effect sizes will generally need to be larger. Further, such time and size estimates depend on extrapolation from what is known today, and may be wildly inaccurate if some deeper insight about genomic causation comes along. From our point of view, what we know today does not suggest a high payoff from continuing business as usual. And we have to be especially circumspect about messages from on high, that is, from heavily funded investigators or, even more so, from NIH staffers (like Teri) who are fine people but who fund the work and hence have a very clear if not unavoidable interest in touting its results.

That said, we think it's still fair to say that many people, including human geneticists, agree that we're far from being able to accurately predict complex disease -- from GWAS or anything else, including whole genome sequencing (WGS). Even so, many people look forward to the day when newborns leave the hospital with their genome on a chip. The drumbeat for personalized genomic medicine, backed by administrative decisions to push much of the research funding toward that promise, is not trivial. This is curious, given that a major lesson of the genome era is that environment is a huge factor in the risk of common diseases that fell us, and yet future environments are inherently unpredictable. We've also learned that there are multiple genetic pathways to many traits.

Risk is elusive
And, there are methodological issues with risk prediction from direct-to-consumer companies. A new paper in Genetics in Medicine ("Variations in predicted risks in personal genome testing for common complex diseases", Kalf et al.) reports a comparison of disease prediction from three DTC companies. The authors found substantial differences in predicted risk estimates of specific diseases because the companies use different SNPs, different average population risk estimates, and different formulas for calculating risk. Indeed, average population risk estimates can change with every new study because calculated risk is never the same in different population samples. Is anyone 'right' here? Is everyone 'wrong' and if so, to a measurable or knowable extent?
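To make the arithmetic concrete, here is a minimal sketch of how three companies could hand the same customer three different risk figures. Everything in it is hypothetical: the SNP names, the per-SNP effect sizes, the average population risks, and the two formulas are illustrative assumptions, not the values or methods reported by Kalf et al. or used by any actual company.

```python
def risk_multiplicative(avg_risk, effects):
    """Scale the average population risk by each per-SNP effect, capped at 1."""
    risk = avg_risk
    for effect in effects:
        risk *= effect
    return min(risk, 1.0)


def risk_via_odds(avg_risk, effects):
    """Convert average risk to odds, scale the odds by each effect, convert back."""
    odds = avg_risk / (1.0 - avg_risk)
    for effect in effects:
        odds *= effect
    return odds / (1.0 + odds)


# Hypothetical effects of ONE customer's genotype at four made-up SNPs
# (values above 1 raise risk, values below 1 lower it).
customer_effects = {"snp_a": 1.30, "snp_b": 0.85, "snp_c": 1.20, "snp_d": 1.10}

# Three hypothetical companies: each picks its own SNPs, its own
# average population risk, and its own formula.
companies = {
    "Company 1": {"snps": ["snp_a", "snp_b"], "avg_risk": 0.20,
                  "formula": risk_multiplicative},
    "Company 2": {"snps": ["snp_a", "snp_c", "snp_d"], "avg_risk": 0.25,
                  "formula": risk_via_odds},
    "Company 3": {"snps": ["snp_b", "snp_c"], "avg_risk": 0.15,
                  "formula": risk_multiplicative},
}


def predict(company):
    """Apply one company's SNP panel, baseline, and formula to the customer."""
    effects = [customer_effects[snp] for snp in company["snps"]]
    return company["formula"](company["avg_risk"], effects)


predictions = {name: predict(company) for name, company in companies.items()}

for name, risk in predictions.items():
    print(f"{name}: predicted lifetime risk {risk:.1%}")
```

Same person, same genome, three different answers. In this toy setup none of the three is computed incorrectly; the figures diverge purely because of choices made before the customer's DNA is ever examined.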

To date, the best predictor of future disease is family history. If a disease follows Mendelian patterns of inheritance, or if risk levels are correlated among close family members, you know that risk genotypes are common in your family, and you can infer that you are at higher risk without knowing specifically which one or ten or hundreds of genomic variants are responsible. That is, we don't need GWAS for these diseases.

For the clearer, single-gene caused diseases, we already have an informed medical system, with professional genetic counselors and physicians, that sorted these out long ago and has long been set up to use various kinds of data to provide very important and useful advice. This is not at all to say that the system is infallible for people with such diseases. Yes, if the cause of the disease or disorder hasn't yet been identified, WGS or WES may be helpful for doing so, though certainly not always, and as it's the very rare Mendelian disorders that remain unexplained, the search can be complex -- the causal variant may be due to somatic mutation, or the cause may involve multiple interacting genes, or lie in a regulatory region rather than in protein-coding sequence.

In fact, it may not be widely understood that GWAS works because of inheritance and is essentially itself a kind of family data -- but one with unknown, very deep family connections. It has some advantages in that respect, because if cases are only distantly related, they may only share narrow chromosome regions (at the causal genes), whereas close relatives share huge fractions of their genomes. By contrast, if the disease is genetically caused, unaffected controls will be less closely related to each other than are the cases.

In this sense, GWAS removes the close-relationship 'noise' of shared variation. The problem is that when traits are caused by many different genes, family members may have simpler sets of causal genes than random sets of cases and controls. So there are statistical issues at play here. Nonetheless, for the important, common complex diseases, family data are generally more informative than GWAS kinds of data. [There are other issues, too many to go into here, such as the guesspothesis that common diseases are caused by very rare genetic variants that might be found in genome sequence data in families.]

So, as we see it, GWAS -- or any other way to identify genetic causation -- won't be very useful for predicting common complex diseases in individuals, or at least only rarely. Again, environment is the often intractable but most important wild-card. So, why are currently healthy people interested in having their genomes typed or sequenced? If you've done it, we'd love to know.

18 comments:

Anonymous, August 6, 2013 at 5:59 AM
I only have 23andMe genotype data, not sequencing, but would also pay a small fee for the latter. I agree genomic prediction using common alleles is not well powered at the moment (and in many cases, in principle) to give answers that would motivate me to take any action. It's mostly for curiosity, some current utility, and some investment to potential future utility that I got myself typed.

1) Curiosity about ancestry. With reference panels from all around the world, individual haplotypes can be traced to founder populations for an illuminating picture of "where I come from", while birth records only go back a century or two.
2) Actionable large effect alleles (drug dosing, BRCA). Recently, a family member had trouble during surgery due to increased warfarin sensitivity that could have been prevented (or at least the surgeons notified) with this information at hand.
3) Carrier status for severe Mendelian diseases. When planning kids, we could double check whether conditions and alleles not covered by common tests (or tests not available in our country) have potential to yield compound heterozygosity that we should test for.
4) Cumulative gain of knowledge over time. There has only been about a decade of sequencing and array-powered genomic discovery; I am sure the utility of genotyping data will increase with time.
5) Contributing my genome data to understanding. Some heritable traits are quaintly interesting (e.g. detached earlobes), but public money should not be spent on mapping them. For some large reference panels (like the one 23andMe is amassing), gathering information on such traits is cheap, and my data can help in the mapping.
Mark Wanner, August 6, 2013 at 9:54 AM
Holly's comment made me laugh. I give tours/talks about genomics and genomic medicine, which personally I find very compelling. Nonetheless I say that I would support anyone's decision to get genetic information from a DTC company only if they take the view that it's likely going to yield entertainment for the most part, not useful understanding. The predictive power is about as good as you'll get from a phone-in psychic at this point.

I am about to get WGS for myself for a couple of reasons though. First is simple curiosity, as mentioned by the first commenter. Second is to add my data to the pool (I'm in PGP) in hopes that if we can sequence millions and one day figure out how to share and manage the data, not to mention all the ELSI stuff, we'll find out some really useful things along the way. Still probabilistic and not predictive most likely, but useful nonetheless.
Eric Turkheimer, August 16, 2013 at 11:13 AM
Hi, I wanted to let you know how happy I am to have discovered this blog, and this seems to be as good an entry point as any. Although I have spent a lifetime thinking about the role of genetics in the genesis of complex human behavior and am currently the past-President of the Behavior Genetics Association, I often find myself in disagreement with my colleagues about reductionistic causal models in gene-behavior relations.

If I can be forgiven for plugging one of my own papers here, you might be interested in:

http://people.virginia.edu/~ent3c/papers2/Turkheimer%20GWAS%20EWAS%20Final.pdf

Some other things I have written will be cited there.

Here is my standard argument about the limitations of GWAS. Suppose you are given a stack of DVDs with movies on them, and a microscope. You are told to examine the pattern of dots or whatever through the microscope. Your task is to figure out on this basis whether the movie is a drama or a comedy.

My conclusions:

1) No one is denying that one way or another all the information about the movie is encoded on the DVD.

2) Nevertheless, it won't work.

3) Because the microscope does not encompass the developmental model via which the data on the disc gets turned into a movie.

4) Sample size is not the issue.

5) Despite everything, you will still get some hits. That is, if you have enough DVDs, sooner or later you would find some location on the disc whose state was correlated with drama v. comedy at some level of statistical significance.

Anyway, thanks again for the blog.

Eric Turkheimer
Eric Turkheimer, August 16, 2013 at 11:47 AM
Thanks. Feel free to post comments and criticisms here when you have them.