Joe as Abe on the balcony (but not of Ford's Theater)
The point is that sometimes there is a lot of convenient hard-of-hearing even in science, which fancies itself to be an objective search for truth. Some of the details in Joe's post are out-of-date but we, and he, think that the basic thrust is not. In a sense that makes the conclusion all the more cogent, because the same modes of thinking about genomic causation are still predominant, despite the vastly costly but essentially consistent results in the five years since 2008. And, as Joe points out, he and Ken had much the same message in 2000.
One not-so-subtle change, we will note, is that promises by NIH Director Dr Francis Collins, and many others in presumably responsible positions, have steadily altered their due date, which recedes into the distance like, say, an oasis as you grope for water, the fences if you want your pitchers to have a better earned run average, or a preacher's promises of ultimate salvation, if you weekly plunk coins into the basket.
So perhaps the lesson is that under these circumstances, rather than just dismiss critics, science--actual science as it's supposed to be--should feel a need to take stock of what it's doing. But we leave it to you to judge.
And if you hear about Joe in other contexts in weeks to come, remember that he was a sometime geneticist here first:
The Rise and Fall of Human Genetics and the Common Variant - Common Disease Hypothesis
By Joe Terwilliger
Nov 2008
There is an enormous amount of positive press coverage for the Human Genome Project and its successor, the HapMap Project, even though within the field the initial euphoric party when the first results came out has already done a full 180, replaced by the hangover that inevitably follows such excesses.
For
those of you not familiar with the history of this field and the controversies
about its prognosis which were present from the outset, I refer you to a review
paper I and a colleague wrote back in 2000 at the height of the controversy - Nature Genetics 26, 151 - 157. The basic gist
of the argument put forward for the HapMap project was the so-called common
variant/common disease hypothesis (CV/CD) which proposed that "most of the
genetic risk for common, complex diseases is due to disease loci where there is
one common variant (or a small number of them)" [Hum Molec Genet
11:2417-23]. Under those circumstances it was widely argued that, using the
technologies being developed for the HapMap project, one would be able to
identify these genes using "genome-wide association studies" (GWAS),
basically by scoring the genotype for each individual in a cross sectional
study for each of 500,000 to 1,000,000 individual marker loci - the argument
being that if common variants explained a large fraction of the attributable
risk for a given disease, one could identify them by comparing allele
frequencies at nearby common variants in affected vs unaffected individuals.
This point was contested by researchers only with regard to how many markers
you might have to study for this to work if that model of the true state of
nature applied. Many overly optimistic scientists initially proposed 30,000
such loci would be sufficient, and when Kruglyak suggested it might take
500,000 such markers people attacked his models, yet today the current
technological platforms use 1,000,000 and more markers, with products in the
pipelines to increase this even more, because it quickly became clear that the
earlier models of regular and predictable levels of linkage disequilibrium were
not realistic, something that should have been clear from even the most basic
understanding of population genetics, or even empirical data from lower
organisms.
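To make the "comparing allele frequencies" step concrete, here is a minimal sketch of a single-marker allelic test of the kind a GWAS repeats at every locus. The counts are invented for illustration, the function name is hypothetical, and the Bonferroni cutoff simply divides 0.05 by the 1,000,000 markers mentioned above; none of it is taken from an actual study.

```python
import math

def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square (1 df) on a 2x2 table of allele counts.

    Rows are cases/controls, columns are alternate/reference alleles.
    Returns (chi2, p_value).
    """
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    row_case = case_alt + case_ref
    row_ctrl = ctrl_alt + ctrl_ref
    col_alt = case_alt + ctrl_alt
    col_ref = case_ref + ctrl_ref
    cross = case_alt * ctrl_ref - case_ref * ctrl_alt
    chi2 = n * cross ** 2 / (row_case * row_ctrl * col_alt * col_ref)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical marker: 1000 cases and 1000 controls (2000 alleles each),
# allele frequency 0.33 in cases vs 0.30 in controls.
chi2, p = allelic_chi_square(660, 1340, 600, 1400)

# Assumed multiple-testing correction: Bonferroni over 1,000,000 markers.
n_markers = 1_000_000
threshold = 0.05 / n_markers

print(f"chi2 = {chi2:.2f}, p = {p:.3g}, genome-wide threshold = {threshold:.1g}")
print("genome-wide significant:", p < threshold)
```

Under these assumed numbers, a frequency difference that looks respectable at one marker falls far short once the test is repeated a million times, which is one reason sample sizes had to grow along with the marker panels.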
Today
such studies are widespread, having been conducted for virtually every disease
under the sun, and yet the number of common variants with appreciable
attributable fractions that have been identified is minuscule. Scientists have
trumpeted such results as have been found for Crohn's disease, in which 32
genes were detected using panels of thousands of individuals genotyped at
hundreds of thousands of markers - this sounds great until you start looking at
the fine print, in which it is pointed out that all of these loci put together
explain less than 10% of the attributable risk of disease, and for various
well-known statistical reasons, this is a gross overestimate of the actual
percentage of the variance explained. Most of these loci individually explain
far less than half a percent of the risk, meaning that while this may be
biologically interesting, it has no impact at all on public health as most of
the risk remains unexplained. This is completely opposite to the CV/CD theory
proposed as defined above. In fact, this is about the best case for any complex
trait studied; in virtually every example dataset I have personally looked at,
absolutely nothing was discovered at all.
At
the beginning of the euphoria for such association studies, the example
"poster child" used to justify the proposal was the relationship
between variation at the ApoE gene and risk of Alzheimer disease. In an
impressively gutsy paper recently, a GWAS study was performed in Alzheimer
disease and published as an important result, with a title that sent me rolling
on the floor in tears laughing: "A high-density whole-genome association
study reveals that APOE is the major susceptibility gene for sporadic
late-onset Alzheimer's disease" [J Clin Psychiatry. 2007 Apr;68(4):613-8]
- in an amazingly negative study they did not even have the expected number
of false positive findings - just ApoE and absolutely nothing else... And the
authors went on to describe how important this result was and claimed this
means they need more money to do bigger studies to find the rest of the genes.
Has anyone ever heard of stopping rules, that maybe there aren't any common
variants of high attributable fraction??? This was a claim that Ken Weiss and I
put forward many times over the past 15 years, and Ken has been making this
point for a decade before that even, in his book, "Genetic variation and human
disease", which anyone working in this field should read if they are not
familiar with the basic evolutionary theory and empirical data which show why
no one should ever have expected the CV/CD hypothesis to hold...
In
many other fields, the studies that have been done at enormous expense have
found absolutely nothing, and in what Ken Weiss calls a form of Western Zen (in
which no means yes), the failure of one's research to find anything means they
should get more money to do bigger studies, since obviously there are things to
find but they did not have big enough studies with enough patients or enough
markers - it could not possibly be that their hypotheses are wrong, and should
be rejected... It is a truly bizarre world where failure is rewarded with more
money - but when it comes to promising upper-middle-aged men (i.e. Congress)
that they might not die if they fund our projects, they are happy to invest in
things that have pretty much now been proven not to work...
In a truly bizarre propaganda piece, Francis Collins, in a parting sycophantic commentary (J Clin Invest. 2008 May;118(5):1590-605), claimed that the controversy about the CV/CD hypothesis was "... ultimately resolved by the remarkable success of the genetic association studies enabled by the HapMap project." He went on to list a massive table of "successful" studies, including loci for such traits as bipolar, Parkinson disease and schizophrenia, and of course the laughable success of ApoE and Alzheimer disease. To be objective about these claims, let me quote from what researchers studying those diseases had to say.
Parkinson
disease: "Taken together, studies appear to provide substantial evidence
that none of the SNPs originally featured as PD loci (sic from GWAS studies)
are convincingly replicated and that all may be false positives...it is worth
examining the implications for GWAS in general." Am J Hum Genet 78:1081-82
Schizophrenia:
"...data do not provide evidence for involvement of any genomic region
with schizophrenia detectable with moderate [sic 1500 people!] sample
size" Mol Psych 13:570-84
Bipolar
AND Schizophrenia: "There has been great anticipation in the world of
psychiatric research over the past year, with the community awaiting the
results of a number of GWAS's... Similar pictures emerged for both disorders -
no strong replications across studies, no candidates with strong effect on
disease risk, and no clear replications of genes implicated by candidate gene
studies." - Report of the World Congress of Psychiatric Genetics.
Ischaemic
stroke: "We produced more than 200 million genotypes...Preliminary
analysis of these data did not reveal any single locus conferring a large effect
on risk for ischaemic stroke." Lancet Neurol. 2007 May;6(5):383-4.
And
the list goes on and on of traits for which nothing was found, with the authors
concluding they need more money for bigger studies with more markers. It is
really scary that people are never willing to let go of hypotheses that did not
pan out. Clearly CV/CD is not a reasonable model for complex traits. Even the
diseases where they claim enormous success are not fitting with the model -
they get very small p-values for associations that confer relative risks of
1.03 or so - not "the majority of the risk" as the CV/CD hypothesis
proposed.
One
must recall that in the initial paper proposing GWAS by Risch and Merikangas
(Science 1996 Sep 13;273(5281):1516-7) - a paper which, incidentally, pointed
out that one always has more power for such studies when collecting families
rather than unrelated individuals - the authors stated that "despite the
small magnitude of such (sic: common variants in) genes, the magnitude of their
attributable risk (the proportion of people affected due to them) may be large
because they are quite frequent in the population (sic: meaning >>10% in
their models), making them of public health significance." The obvious
corollary of this is that if they are not quite frequent, they do NOT have a
high attributable fraction and are therefore NOT of public health significance.
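The arithmetic behind that corollary can be sketched with Levin's standard formula for the population attributable fraction, PAF = p(RR - 1) / (1 + p(RR - 1)), where p is the frequency of the risk factor in the population and RR is the relative risk. The specific frequencies and relative risks below are illustrative assumptions, not values from Risch and Merikangas or from any GWAS; the function name is hypothetical.

```python
def attributable_fraction(freq, relative_risk):
    """Levin's population attributable fraction.

    freq          -- fraction of the population carrying the risk factor
    relative_risk -- risk in carriers divided by risk in non-carriers
    """
    excess = freq * (relative_risk - 1.0)
    return excess / (1.0 + excess)

# Illustrative scenarios (all numbers are assumptions for the sketch):
scenarios = {
    "common variant, sizable effect (freq 0.30, RR 2.0)": (0.30, 2.0),
    "common variant, tiny effect (freq 0.30, RR 1.03)": (0.30, 1.03),
    "rare variant, large effect (freq 0.002, RR 10)": (0.002, 10.0),
}

for label, (freq, rr) in scenarios.items():
    paf = attributable_fraction(freq, rr)
    print(f"{label}: PAF = {paf:.2%}")
```

Only the first scenario, a variant that is both common and of appreciable effect, yields a large attributable fraction. A common variant with a relative risk near 1.03 accounts for under one percent of cases in this sketch, however small its p-value.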
And
yet, you still have scientists claiming that the results of these studies will
lead to a scenario in which "we will say to you, 'suppose you have a 65%
chance of getting prostate cancer when you're 65. If you start taking these
pills when you're 45, that percent will change to 2'". Amazing claims when
the empirical evidence is clear that the majority of the risk of the majority
of complex diseases is not explained by anything common across ethnicities, or
common in populations... (Leroy Hood, quoted in the Seattle
Post-Intelligencer). Francis Collins recently claimed that by 2020, "new
gene-based designer drugs will be developed for ... Alzheimer disease,
schizophrenia and many other conditions", and by 2010, "predictive
genetic tests will be available for as many as a dozen common conditions".
This does not jibe with the empirical evidence... In Breast Cancer for example,
researchers claimed that knowledge of the BRCA1 and BRCA2 genes (which confer
enormously high risk of breast cancer to carriers) was uninteresting as it had
such a small attributable fraction in the population. Of course now they have
performed GWAS studies and examined tens of thousands of individuals and have
identified several additional loci which put together have a much smaller
attributable fraction than BRCA1 and BRCA2, yet they claim this proves how
important GWAS is. Interesting how the arguments change to fit the data, and
everything is made to sound as if it were consistent with the theory.
I
suggest that people go back and read "How many diseases does it take to
map a gene with SNPs?" Nature Genetics (2000) 26, 151 - 157. There are virtually no
arguments we made in that controversial commentary 8 years ago which we could
not make even stronger today, as the empirical data which has come up since
then basically supports our theory almost perfectly, and refutes conclusively
the CV/CD hypothesis, despite Francis Collins' rather odd claims to the
contrary...
In
the end, these projects will likely continue to be funded for another 5 or 10
years before people start realizing the boy has been crying wolf for a damned
long time... This is a real problem for science in America, however, as NIH is
spending big money on these rather non-scientific technologically-driven
hypothesis-free projects at the expense of investigator-initiated
hypothesis-driven science. Even more tragically training grants are enormously
plentiful meaning that we are training an enormous number of students and
postdocs in a field for which there will never be job opportunities for them,
even if things are successful. Hypothesis-free science should never be allowed
to result in Ph.D. degrees if one believes that science is about questioning
what truth is and asking questions about nature, while engineering is about how
to accomplish a definable task (like sequencing the genome quickly and
cheaply). The mythological "financial crisis" at NIH is really more a
function of the enormous amounts of money going into projects that are
predetermined to be funded by political appointees and government bureaucrats
rather than the marketplace of ideas through investigator-initiated proposals.
Enormous amounts of government funding into small numbers of projects is a bad
idea - one which began with Eric Lander's group at MIT proposing to build large
factories for the sequencing of the genome rather than spreading it across
sites, with the goal of getting it done faster (an engineering goal) instead of
getting more sites involved so that perhaps better scientific research could
have come along the way. This has led to a scenario years later in which the
factories now want to do science and not just engineering, which is totally
contrary to their raison d'etre, and leads to further concentrations of funding
in small numbers of hands when science is better served, perhaps, by a larger
number of groups receiving a smaller amount of money so that more brains are
working in different directions thinking of novel and innovative ideas not
reliant on pure throughput. Human genetics has transformed from a field with
low funding, driven by creative thinking into a field driven by big money and
sheep following whatever shepherd du jour is telling them they should do (i.e.
innovative means doing what the current trend is rather than something truly
original and creative). This is bad for science, and also is bad science. GWAS
has been successful technologically, and it has resoundingly rejected the CV/CD
hypothesis through empirical data. If we accept this and move on, we can put
the HapMap and HGP where they belong, sharing the same scientific fate as the
Supercollider, and let us get back to thinking instead of throwing money at
problems that are fundamentally biological and not technological!
(most
notably in terms of the big money NIH is sending into these non-scientific
technologically-driven hypothesis-free studies, rather than investigator
initiated hypothesis-driven science - one of the main causes of the
"funding crisis" at NIH where a tiny portion of new grants are funded
- get rid of the big science that is not working - like the supercollider! -
and there is no funding crisis)
10 comments:
His google profile says -
"Joe Terwilliger, Attended Peabody Institute, Lives in Pyongyang"
Has he become permanent resident of N. Korea?
Joe Terwilliger, sometime musician, sometime geneticist, sometime resident of Pyongyang...
Is it possible to imagine a more frivolous pursuit than GWAS? Well, other than sequencing, I mean ;). At least the tuba playing, Abe Lincoln impersonating, Korean translating for Rodman, competitive eating, Manhattan walking and all the rest are not being done with false promises at the expense of taxpayers who contribute to the economy by tuba playing, Abe Lincoln impersonating, Korean translating for Rodman, competitive eating, etc... To my mind frivolous is taking money from society under false promises, returning nothing of value compared to what was advertised, and then asking for more money because of said failure... Which pursuit is more frivolous, I ask you......
Which google profile is that? I probably have several and never use them :). I did live in Pyongyang in the summer, teaching human genetics at the university there for a month, and I have a gmail account I only use for that gig...
To be fair, I am a permanent resident of Finland, which is close to DPR Korea :)
I've worked with Joe for many (too many?) years. Only, I hardly ever actually 'work' with him, because usually I can't find him, or even know where the hell he is. He's a resident in [you name it here] about like someone is a 'resident' at a bus stop.
Joe's mind works almost as fast as the planes he flies in (when he's not in Korea). That means that what is for me one hour of collaboration with him is probably about 36 hours for him (and he'd probably call it boredom rather than collaboration, given how little I have to contribute).
I will say this, however: I know from many direct personal observations that Joe does actually sometimes live in Finland. But I can't say if he ever impersonates Abe Lincoln (or Kim) when he's there.
Joe, I got it from google+. Clicked that particular one after seeing your picture with two beautiful Korean women :)
https://plus.google.com/117378265486229750634/posts
Hey! Revealing his deepest secrets??
And one of them has red eyes....a vampire?
Joe, I posted your comment here (http://www.homolog.us/blogs/blog/2013/12/12/rise-fall-human-genetics-common-variant-common-disease-hypothesis/) without asking permission before. Please let me know if you do not like it, and I will remove.