Thursday, August 16, 2018

The Litella Factor: Changing the claimspace of science

We may be starting to see rationalizations and wiggle-words as investigators gradually inch away from many of the genomics-based claims, such as last year's slogan du jour that we're going to deliver 'precision' genomic medicine, or this year's that we'll find genomic causes of disease for 'All of Us'.  Science, of all human endeavors, should be objective about the world, not given to sloganeering, even in order to wangle ever more funding from the public.  Many are by now quietly realizing not only that environments are important--which is nothing new, though minimized by geneticists for a generation--but also that genomics itself is more complex, more variable, and less predictively powerful than has been so widely and so often touted in recent years.

We've known the likely nature of genomic causal complexity for literally a century (RA Fisher's 1918 paper is the landmark).  The idea was a reasoned way to resolve what appeared to be fundamental differences between classically discrete Mendelian traits that took on only one or two states (yellow or green peas), and classically quantitative, 'heritability'-based traits that seemed to vary continuously (like height) and that, as a result, were presumed to be the main basis of Darwinian evolution.  The former states seemed never to change, and hence never to evolve, while selection could shift the average values of continuous traits.

The resolution of these two seemingly incompatible views came from the idea that complex traits are produced by many individual 'Mendelian' genes, each with a very small effect.  This was a major advance in our understanding of both heritable causation and the evolution of life: agricultural and experimental breeding confirmed this 'modern synthesis' of evolutionary genetics to an extensive and consistent, if implicit, degree for a century.  
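Fisher's resolution is easy to see in a toy simulation: sum up many discrete, two-allele Mendelian effects and a smooth, continuously varying trait emerges.  The sketch below is purely illustrative--the locus count, allele frequency, and per-allele effect size are arbitrary assumptions, not estimates from any real trait.

```python
import random
import statistics

random.seed(42)

# Illustrative assumptions, not real genetic parameters:
N_LOCI = 500        # many Mendelian loci, each with a tiny effect
N_PEOPLE = 2000
ALLELE_FREQ = 0.5   # frequency of the '+' allele at every locus
EFFECT = 0.1        # per-allele contribution to the trait, arbitrary units

def genotype_value():
    """Number of '+' alleles (0, 1, or 2) at one discrete biallelic locus."""
    return sum(random.random() < ALLELE_FREQ for _ in range(2))

# Each person's trait is the sum of many small, discrete Mendelian effects;
# the result varies continuously and is approximately bell-shaped.
traits = [EFFECT * sum(genotype_value() for _ in range(N_LOCI))
          for _ in range(N_PEOPLE)]

mean = statistics.mean(traits)
sd = statistics.stdev(traits)
print(f"mean={mean:.1f}, sd={sd:.1f}")
```

Even though every single locus is as discrete as Mendel's peas, the summed trait looks like height: a smooth distribution that selection can shift gradually.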

However, the specific genes responsible were largely implicit, assumed, or unknown.  There was no way to identify them until large-scale DNA sequencing technology became available.  What genomewide mapping (GWAS and other statistical ways to identify associations between genetic variants and trait variation) has shown is (1) that the century-old model was basically right, and (2) that we can identify many of the myriad genome regions whose variation is responsible for trait variation.  This work was given a real boost in public support by the fact that many diseases are familial and, even more, by the promise that if our diseases and other traits are genetic, we can identify the responsible genes (and, hopefully, do something to correct harmful variants).

Phenotypes and their evolution (their effects on health and reproductive success) are in this context usually properties of the individual as a whole, not of individual genes--say, your blood pressure's effect on you as a whole person.  That is, the combinations of polygenic effects that GWAS has identified typically differ for each person, even among people with the same trait measure.  We have also found something that is entirely consistent with the nature of evolution as a population phenomenon: much of the contributing genomescape for a given trait (like blood pressure) involves genome sites whose relevant variants have very low frequency, or effects too small to measure with statistical 'significance', so that only a fraction of the estimated overall genetic contribution in the population (measured as the trait's 'heritability') is accountedted for by mapping.  All of this has been a discovery success, consistent with the basic formal genetic theory of evolution developed over the 20th century.  
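Why mapping captures only a fraction of the heritability can be sketched with another toy model: give many loci variance contributions drawn from skewed distributions (rare variants, mostly tiny effects), and let a 'GWAS' detect only those loci whose contribution clears a power/significance floor.  All the distributions and the floor here are arbitrary assumptions chosen for illustration, not estimates from any study.

```python
import random

random.seed(1)

# Toy model: many loci contribute additively to a trait, but a mapping study
# only 'detects' a locus whose variance contribution exceeds a power floor.
N_LOCI = 2000
loci = []
for _ in range(N_LOCI):
    freq = random.uniform(0.001, 0.5)      # many variants are rare (assumption)
    effect = random.expovariate(10.0)      # most effects are tiny (assumption)
    # Variance contributed by one biallelic locus under simple additivity:
    var = 2 * freq * (1 - freq) * effect ** 2
    loci.append(var)

total_var = sum(loci)
DETECTION_FLOOR = 0.005   # stand-in for a genome-wide significance/power limit
detected_var = sum(v for v in loci if v > DETECTION_FLOOR)

print(f"fraction of genetic variance captured: {detected_var / total_var:.2f}")
```

The loci that are individually too rare or too weak to reach 'significance' still exist and still contribute--which is one reason mapped hits never add up to the trait's estimated heritability.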

Great success--but.....
The very same work, however, has led to a problem: the convenient practice of equating induction with deduction.  That is, we estimate the risk effects of genomic sites from samples of individuals whose current trait-state reflects their genotype and their past lifestyle exposures.  For example, we estimate the average blood pressure among sampled individuals who carry some particular genotype.  That is induction.  But then we promise that, from a new person's genotype, we can predict his/her future state with 'precision'.  That is, we use the data deductively, assuming that the average from past samples is a future parameter--say, a probability p of getting some disease.  That is essentially what a genotype-specific risk is.

But this is based on the achieved effects of the individuals' genotypes, at the test site and elsewhere in the genome, as well as their lifestyle exposures (mainly unmeasured and unknown).  We assume that similar factors will apply in the future, so that we can predict traits from genome sequence.  That is what (by assumption) converts induction to deduction.  It rests on many untested or even untestable assumptions.  It is a dubious port of convenience, because future mutations and lifestyle exposures, which we know are crucial to trait causation, are unpredictable--even in principle.  We know this from clearly documented epidemiological history: disease prevalences change in unpredictable ways, so that the same genotype a century ago would not have the same phenotypic consequences today.
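The induction-to-deduction slip can be made concrete with a deliberately crude simulation: estimate a genotype-specific risk from a past cohort (induction), then watch the same genotype's realized risk change when environmental exposures shift (the failed deduction).  The risk model and all its numbers below are invented for illustration only.

```python
import random

random.seed(7)

def simulate_cohort(n, genetic_risk, env_risk):
    """Crude toy model: a person gets the disease if either a genetic
    or an environmental 'hit' occurs, independently."""
    cases = sum((random.random() < genetic_risk) or (random.random() < env_risk)
                for _ in range(n))
    return cases / n

# Induction: estimate risk for a genotype from a PAST sample, whose
# exposures (env_risk=0.10 here, an assumption) are baked into the estimate.
p_hat = simulate_cohort(20_000, genetic_risk=0.05, env_risk=0.10)
print(f"estimated genotype-specific risk (past cohort): {p_hat:.3f}")

# Deduction (the dubious step): treat p_hat as a fixed future parameter.
# If lifestyle exposures shift (env_risk=0.30), the identical genotype
# has a different realized risk, and the 'precise' prediction fails.
future_risk = simulate_cohort(20_000, genetic_risk=0.05, env_risk=0.30)
print(f"realized risk under changed exposures:          {future_risk:.3f}")
```

Nothing about the genotype changed between the two cohorts; only the unmeasured, unpredictable environment did--which is exactly why a past-sample average is not a future parameter.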

So, while genetic variation is manifestly important, its results are complexly interactive--not nearly the simple, replicable, additive, specific causal phenomena that NIH has so fervently promised to identify in order to produce wonders of improved health.  It's been a very good strategy for securing large budgets, and hence very good for lots of scientists, and perhaps as such--its real purpose?--it is a booming success.  It did, one must acknowledge, document the largely theoretical ideas about complex genotypic causation of the early 20th century.  But the casual equating of induction with deduction has also fed a convenient ideology that has not been good for science, because science should shun ideology: in this case, the focus on enumerable, essentially parametric causation is wrong, and far too narrow.  

Perhaps some realization is afoot
But now we're seeing, here and there, various qualifiers, caveats, and soft, not fully acknowledged retreats from genomics promises.  Some light is being shone on the problems and the practices that are common today.  Few if any are admitting they've been too strident, or wrong, or whatever; instead they assert their revised view either as what we all already knew, or as a kind of new insight of their own.  That is, claiming that things aren't so genomically caused becomes a claim of original insight, and hence a basis for new or continued funding.  No apologies, and no acknowledgment of the critics of the current NIH-promoted Belief System who have been pointing these things out for many years--no offer of Emily Litella's quiet and humble recognition of a mistake:  "Oh.....Never mind!"

"Oh.....Never mind!"  YouTube clip from NBC's Saturday Night Live
How seriously should this quiet backtracking be challenged?  Is it even fair to call the revisionists 'hypocrites'?  We live and learn via science, so perhaps the claimscape change, though quiet and implicit, is a reflection of good science, not just expediency.  Perhaps that is how science should be: reacting, even if slowly, to new knowledge, and giving up on cherished paradigms.

One underlying problem of modern science is not that we sometimes accept wrong notions, but that we rush hasty, excessive claims to the public, the journals, and the funders.  In a sense, this isn't entirely a fault of vanity but of the system we've built for supporting science.  A toning down of claims, a shunning of those who claim too much too quickly, and a much higher threshold for 'going public' would improve science and indeed be more honest to the public.  A stone-age suggestion I've made (almost seriously) is that journals should stop publishing any figures or graphs (in the pages or on the cover) in color--that is, make science papers really, really boring!  Then only serious and knowledgeable scientists would read, much less shout about, research reports (maybe some black-and-white TV science reporting could be allowed, too).  At the least, we are due some serious reforms in science funding itself, so that scientists are not pressured, for their very career survival, into the excessive claimscape of recent years.

In specific terms, I personally think that by far the most important reforms would be to limit the funding available to any single laboratory or project, to stop paying faculty salaries from grants, to provide base funding for faculty hired with research as part of their responsibilities, and to decouple the relentless hustling for money from research itself, so that the science rather than the money would be in the driver's seat.  
Universities, lusting after credit counts and grant overheads, would have to quiet down and reform as well.  

The infrastructure is broad, and altering it would not be easy.  But things were once more sane and responsible (even if always with some venal or show-boating exceptions, humans being human).  If such reforms were implemented, young investigators could apply their fresh minds to science rather than to science hustling.  And that would be good for science.


Ellen said...

Do you think something else is going to rush in to fill the place of genomics (both in terms of promises/attention and funding) and if so what?

Ken Weiss said...

Genomics will still be important, because genes are.  Environment will still be important, because how people live still is.  Interactions among these factors will be better understood.  Some traits that really are 'genetic' will receive attention and, hopefully, effective treatment.  Whether people will adopt better lifestyles is unpredictable--but if they did, a lot of early-onset diseases would become less common, while nasty old-age ones would, unfortunately, become more common.  There is never only one direction.

But as long as the grant system rewards rushing to make claims and announce things to the press, and the need to hawk grants so that faculty can draw their salaries from them, and so on, we will have hype and more hype, as we do now.