Wednesday, November 13, 2013

When causal disease alleles don't cause disease (or is that the norm?)

The general failure of genomewide association studies (GWAS) to explain much of the apparent genomic variation affecting risk of most complex diseases (the 'missing heritability' problem) has led some to suggest that most complex diseases (diabetes, schizophrenia, autism, etc.) are caused by single rare variants with strong effects, rather than by multiple common variants, each with a weak effect.

But GWAS can't find these rare variants without hugely ramped-up sample sizes, which to some is the obvious next frontier.  Of course, the rarer the variant or the weaker its individual effect, the harder this strategy will be to implement successfully.  And, of course, it's based on the assumption that causal variants always cause disease, or at least have a high probability of doing so (or, to put it another way, the assumption -- almost always unstated -- that as sample sizes increase, the signal-to-noise ratio is well-behaved).

A paper in the November Nature Genetics ("Assessing the phenotypic effects in the general population of rare variants in genes for a dominant Mendelian form of diabetes", Flannick et al.) addresses the question of population allele frequencies of rare variants associated with a hereditary form of diabetes called MODY (maturity-onset diabetes of the young).

Flannick et al. chose seven genes that have been well established (e.g., by confirmatory findings) to be associated with the disease and sequenced these in 4,003 individuals from two large heart disease studies.  They found that 1.5% of one group and 0.5% of the other carry 'disease-causing mutations', but show no evidence of disease.  These were individuals well past the MODY age of onset, so it's unlikely that their apparent genetic risk will lead to disease.  That is, the presence of a disease-causing variant doesn't always cause disease.

Flannick et al. conclude: 
"The view that rare variants have deterministic effects, whereas common variants have modest effects, reflects in part the ascertainment bias of study designs used in Mendelian genetic research, as well as the true penetrance of rare mutations."
(As to the use of the term Mendelian itself, in this case, see our views in yesterday's post.) 

There are several studies on the books now that search candidate genes in cases and controls and find roughly similar frequencies of mutational variation in both groups. One study of ion channel and related genes associated with epilepsy comes to mind ("Exome Sequencing of Ion Channel Genes Reveals Complex Profiles Confounding Personal Risk Assessment in Epilepsy," Klassen et al., 2011). The study compared exome sequences of 237 channel genes in individuals with and without sporadic idiopathic epilepsy.
"Rare missense variation in known Mendelian disease genes is prevalent in both groups at similar complexity, revealing that even deleterious ion channel mutations confer uncertain risk to an individual depending on the other variants with which they are combined."
Are these genes non-penetrant, to use the standard parlance? Or non-causal?

There have by now been several extensive-sequencing papers that have found that we all are walking around with many (estimates are, as we recall, about 150) seriously mutated or defunct genes that otherwise should be associated with disease -- disease that we don't have.  Clearly, simple unitary causal models don't account for protective genomic backgrounds or environments -- indeed, even using words like 'protective' or 'compensating' effects gives pride of place to the putatively causal gene; they're saying, in effect, that it's causal except when it's not.  Maybe better would be to say it's harmless except when it isn't -- that may be a more accurate picture even if it's just as weaselly.

This points to a problem with disease prediction from direct-to-consumer screening companies, which report risk probabilities to consumers based on data that may be substantially biased. The same may apply to newborn screening based on whole genome sequencing, such as is just getting underway in Boston, and more generally to the idea that we should all have our genomes sequenced.

But it also suggests that larger and larger studies aren't necessarily going to find causal genes. If rare, apparently causal alleles are sometimes found at equivalent frequencies in affected and unaffected individuals, GWAS of any size aren't going to identify them.
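
To make the point about sample size concrete, here is a minimal sketch of a carrier-versus-non-carrier association test at growing sample sizes. The carrier frequencies (1% and 2%) and the function names are our own illustrative assumptions, not figures or methods from Flannick et al., and a plain Pearson chi-square on a 2x2 table stands in for the more elaborate tests a real GWAS would use.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 df, no continuity correction)
    for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def association_stat(n_cases, n_controls, freq_cases, freq_controls):
    """Chi-square statistic for carriers vs. non-carriers, using the
    expected carrier counts (hypothetical frequencies, no sampling noise)."""
    a = round(n_cases * freq_cases)        # carriers among cases
    b = n_cases - a                        # non-carriers among cases
    c = round(n_controls * freq_controls)  # carriers among controls
    d = n_controls - c                     # non-carriers among controls
    return chi_square_2x2(a, b, c, d)


if __name__ == "__main__":
    for n in (1_000, 100_000, 10_000_000):
        # Same carrier frequency in affected and unaffected people.
        equal = association_stat(n, n, 0.01, 0.01)
        # Carriers twice as common among the affected (assumed for contrast).
        enriched = association_stat(n, n, 0.02, 0.01)
        print(f"n per group = {n:>10,}  equal: {equal:.2f}  enriched: {enriched:.2f}")
```

With equal carrier frequencies the statistic is zero at every sample size, while under the assumed enrichment it grows roughly in proportion to the number of people sampled. Real data add sampling noise, but noise alone doesn't turn into a signal just because the study gets bigger.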

Indeed, as we've tried to note in many past posts, in the face of the kind of complexity we're encountering these days, rather than a rethinking of the situation we face, we see journals publishing lots of 'this is the answer' papers, each author claiming to have the right method, or explaining what 'the reason' is for the perplexing findings.  The very qualifier 'the' betrays the underlying simplism of thinking, or the hunger for simple answers when we know that real life is a complex mix of causal processes, and there is no such thing as 'the' reason.  But changing the perspective to something more creative is threatening, despite all the data, such as in the Flannick paper, that make very clear what we're facing.
