Thursday, January 21, 2010

The complexities of complexity

I've just returned from co-teaching a course in logical reasoning in genetics in Helsinki (whether or not everyone agrees that my reasoning is logical is not for me to say). I worked with Joe Terwilliger from New York, Markus Perola from Finland, and Patrik Magnusson from Sweden. The students were mainly Finns, but others came from the US, Peru, Sweden, and perhaps elsewhere that I've forgotten to mention. Hopefully they got something from the course (along with some fun times in snowy, dark, but very socially hospitable Finland, a very nice place with very good food).

Naturally a lot of attention was paid to ideas, activities, results, and interpretation of GWAS and other whole-genome studies, and we discussed various study designs for inferring the genetic contributions to complex traits.

A picture is emerging that is wholly consistent with theoretical expectations based on basic genetic and evolutionary conceptions that have been around for a long time, and that we present in detail (though in a different, nonbiomedical context) in Mermaid. It's that for many traits, perhaps even most traits, a large number of genes contribute, along with 'environments' (still an elusive term in many biomedical contexts). Sometimes, one or a few genes are far more important than the rest in the biological process that generates the normal trait, or are genes whose mis-firing can lead to disease. Or many genes may be important but, for various reasons, in any given population only one or a few of them may carry variants that have a strong effect on their own.

In these cases, unless lifestyle factors are exceedingly important, the genetic variants can be inferred from family or case-control studies, of which GWAS are one scaled-up instance. The strong effects are typically repeatable, and focus attention on one or a few genes. Cystic fibrosis is an example of a usually single-gene trait, and breast cancer is a trait in which variants in a few genes have disproportionately important effects. In the latter, however, only a small fraction of all cases are accounted for.

Most of the time, however, the picture is different. Genetic variation is clearly contributing to variation in the trait, be it normal variation in stature or pathologic levels of, say, blood pressure, and there is clear family risk: if a close family member is affected, your risk is substantially increased. This implies genes, yet mapping can't find most of them. This is called polygenic variation.

As far as predicting your value for polygenic traits goes, the original method of relating your trait to its value in your relatives, which is due to Francis Galton in the late 1800s, still works best. That's because, when genes contribute substantially to the trait, similarities among relatives follow known relationships that reflect the action of all the genetic variability in the individuals, and you don't gain much by trying to identify or use all the specific genes. But GWAS efforts are attempting to go further, and at least to identify collections or combinations of known variants that may give you a risk 'score' with at least some predictive power. That requires identifying many of the polygenes (genes with individually small effect).
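To make the Galton-style prediction concrete, here is a minimal sketch of regression toward the mean from the parents' average. It assumes a purely additive model with no shared family environment, in which the expected offspring value regresses on the midparent value with a slope equal to the heritability. The function name and the numbers (a population mean of 170, a heritability of 0.8) are hypothetical and chosen only for illustration; they are not taken from the course or from any study.

```python
def predict_from_midparent(mother, father, population_mean, heritability):
    """Expected offspring trait value from the two parental values.

    Assumes an additive model with no shared environment, so that the
    regression of offspring on midparent has slope equal to heritability.
    """
    if not 0.0 <= heritability <= 1.0:
        raise ValueError("heritability must lie between 0 and 1")
    midparent = (mother + father) / 2.0
    # The offspring is expected to deviate from the population mean by only
    # a fraction (the heritability) of the midparent's deviation.
    return population_mean + heritability * (midparent - population_mean)


# Hypothetical example: parents average 180 in a population averaging 170.
predicted = predict_from_midparent(mother=176.0, father=184.0,
                                   population_mean=170.0, heritability=0.8)
print(predicted)
```

Note that the prediction uses no information about which genes are involved. It rests only on the known relationship between relatives, which is the point made above.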

Extremely large sets of data, such as whole-population biobanks with full genome sequence, will become available in the foreseeable future. Such risk scores therefore seem to be in the offing. But even those who are writing papers proposing such personalized genetic risk scores recognize that the predictive power for disease may remain low in most instances, and it may be a long time before it shows clinical value.

One of the complicating ironies is that, in these conditions, the vast majority of cases of a disease of this sort will be the only cases in their families! That is, the trait is substantially genetic, but with so many possible contributing genotypes that rarely will close relatives inherit enough 'risk variants' also to be affected. Familial clustering happens only in the subset in which one or a few strong-effect variants are being transmitted. This is related to the point above that Galton's classical prediction of trait values among relatives is better by far than trying to enumerate the contributing variants.
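The irony above can be illustrated with a small simulation. The sketch below is an assumption-laden toy, not anything used in the course: it adopts the standard liability-threshold model, in which each person has a normally distributed 'liability' made up of many small genetic and environmental contributions, and is affected if the liability exceeds a threshold. Siblings are assumed to share half of the additive genetic variance and no environment. The heritability (0.5), the prevalence (1%), and the function name are hypothetical.

```python
import random
from statistics import NormalDist


def simulate_sib_pairs(n_pairs=200_000, h2=0.5, prevalence=0.01, seed=1):
    """Simulate sib pairs under a polygenic liability-threshold model.

    Returns (sib recurrence risk, fraction of cases whose sib is unaffected).
    Liability has total variance 1; sibs share a component of variance h2/2.
    """
    rng = random.Random(seed)
    # Liability above this threshold means 'affected'.
    threshold = NormalDist().inv_cdf(1.0 - prevalence)
    shared_sd = (h2 / 2.0) ** 0.5        # genetic part common to both sibs
    unique_sd = (1.0 - h2 / 2.0) ** 0.5  # unshared genes plus environment

    index_cases = 0
    both_affected = 0
    for _ in range(n_pairs):
        shared = rng.gauss(0.0, shared_sd)
        liability_1 = shared + rng.gauss(0.0, unique_sd)
        liability_2 = shared + rng.gauss(0.0, unique_sd)
        if liability_1 > threshold:      # first sib is the index case
            index_cases += 1
            if liability_2 > threshold:
                both_affected += 1

    recurrence = both_affected / index_cases
    return recurrence, 1.0 - recurrence


recurrence, sporadic = simulate_sib_pairs()
print(f"risk to sib of a case: {recurrence:.3f}")
print(f"cases with an unaffected sib: {sporadic:.3f}")
```

Under these assumptions, the risk to the sibling of a case should come out well above the population prevalence, which is the 'clear family risk', while the large majority of cases still have an unaffected sibling. A trait can thus be substantially genetic and still appear mostly as isolated cases.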

There is much food for thought here, with serious implications for what it means to say a case of a disease is 'genetic', or for how and when using genetic information will be particularly useful. There are many other issues that are worth discussing in the future, too, but this at least summarizes the essence of what GWAS advocates are discussing these days, even while they recognize that the kind of promise offered for this approach is not going to be realized.

*References to some of these points are technical so I didn't include them here, but they could be sent on request.*


Tomas Lindblad said...

Question: Were you in Helsinki in the role of the skeptic?
I get the impression that the other people you mention are in the business of genetic prediction, just the thing you say will be difficult (impossible?) to achieve...

Ken Weiss said...

No, I wouldn't say that. Joe Terwilliger has been more strongly saying that GWAS has failed to do what it promised (this is phrased vaguely as I don't want to speak for him). Markus Perola is a physician-researcher involved in such studies but is not strident at all, though he asks clearly whether they will have clinical implications (he's an internist as well as a researcher). Patrik Magnusson is involved in various studies but is not an advocate or ideologue at all; instead he is interested in what they can or cannot tell us (a main object of his work is Swedish and EU twin data).

My skepticism is clear, but I do my very best not to be an ideologue, and instead to resist the temptation to seek simple unexceptionable answers to complex questions, and in fact to try to figure out how to deal with complexity as best as can be done.

This was also not a propaganda course, but one that tried to teach students to think a bit 'out of the box' about the reasoning, epistemology, and scientific logic of approaches, designs, and inferences related to complex biomedical genetics. And I tried to explain from an evolutionary point of view why I think we see what we do, and why we should understand and accept its reality.

Francesc said...

Did you discuss this paper?
Yin et al. (2009) Genome-wide association study of the four-constitution medicine. J Altern Complement Med 15:1327-1333. PMID: 19954339

Quite an interesting way to generate a null distribution of p values in a GWAS...

Holly Dunsworth said...

Nice post.

Alexander Grankvist said...

Joe was definitely the most outspoken sceptic on the course (some may even consider him a bit brash at times, although he's only honest about his own beliefs).

I did not get the feeling on the course that the faculty tried to convince me that GWAS was wrong, rather I felt they tried to discuss issues which are normally not raised and to give an in-depth understanding about the processes which have shaped the human genome.