Tuesday, October 16, 2018

Where has all the thinking gone....long time passing?

Where did we get the idea that our entire nature, not just our embryological development, but everything else, was pre-programmed by our genome?  After all, the very essence of Homo sapiens, compared to all other species, is that we use culture--language, tools, etc.--to do our business rather than just our physical biology.  In a serious sense, we evolved to be free of our bodies: our genes made us freer from our genes than most if not all other species! And we evolved to live long enough to learn--language, technology, etc.--in order to live our thus-long lives.

Yet isn't an assumption of pre-programming the only assumption by which anyone could legitimately promise 'precision' genomic medicine?  Of course, Mendel's work, adopted by human geneticists over a century ago, allowed great progress in understanding how genes lead at least to the simpler of our traits, those with discrete (yes/no) manifestations.  These traits include many diseases that really, perhaps surprisingly, do behave in Mendelian fashion, and for which concepts like dominance and recessiveness have been applied and, sometimes, at least approximately hold up to closer scrutiny.

Even 100 years ago, agricultural and other geneticists who could do experiments largely confirmed the extension of Mendel to continuously varying traits, like blood pressure or height.  They reasoned that many genes (whatever they were, which was unknown at the time) contributed individually small effects.  If each gene had two states in the usual Aa/AA/aa classroom example sense, but there were countless such genes, their joint action could approximate continuously varying traits whose measure was, say, the number of A alleles in an individual.  This view was also consistent with the observed correlation of trait measure with kinship-degree among relatives.  This history has been thoroughly documented.  But there are some bits, important bits, missing, especially when it comes to the fervor for Big Data 'omics analysis of human diseases and other traits.  In essence, we are still, a century later, conceptual prisoners of Mendel.
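
The classical reasoning about many small-effect genes can be illustrated with a toy simulation.  This is only a minimal sketch under stated assumptions, not anyone's actual method: the number of loci, the allele frequency, the sample size and the independence of loci are all invented for illustration.  Each simulated person's 'trait' is simply the number of A alleles carried across many two-state loci.

```python
import random
import statistics

def simulate_additive_trait(n_loci=100, allele_freq=0.5, n_people=5000, seed=1):
    """Trait value = number of 'A' alleles carried across n_loci two-state loci."""
    rng = random.Random(seed)
    traits = []
    for _ in range(n_people):
        # Two allele copies per locus; each copy is 'A' with probability
        # allele_freq, independently of all others (an assumption of this sketch).
        count_a = sum(1 for _ in range(2 * n_loci) if rng.random() < allele_freq)
        traits.append(count_a)
    return traits

traits = simulate_additive_trait()
mean = statistics.mean(traits)
sd = statistics.pstdev(traits)

# Under this toy model the expected mean is 2*n*p and the variance 2*n*p*(1-p).
print(f"mean = {mean:.1f}, sd = {sd:.2f}")

# Share of people within one sd of the mean; a bell-shaped curve puts
# roughly two-thirds or a bit more of the population there.
within = sum(1 for t in traits if abs(t - mean) <= sd) / len(traits)
print(f"share within 1 sd of the mean = {within:.2f}")
```

Although every locus is strictly yes/no, the summed trait clusters around its mean in a near-continuous, bell-like way, which is roughly the point the early quantitative geneticists were making.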

'Omics over the top: key questions generally ignored
Let us take GWAS (genomewide association studies) on their face value.  GWAS find countless 'hits', sites of whatever sort across the genome whose variation affects variation in WhateverTrait you choose to map (everything simply must be 'genomic' or some other 'omic, no?).  WhateverTrait varies because every subject in your study has a different combination of contributing alleles.  Somewhat resembling classical Mendelian recessiveness, contributing alleles are found in cases as well as controls (or across the measured range of quantitative traits like stature or blood pressure), where the measured trait reflects how many A's one has: WhateverTrait is essentially the sum of A's in 'cases', which may be interpreted as a risk--some sort of 'probability' rather than certainty--of having been affected or of having the measured trait value.

We usually treat risk as a 'probability,' a single value, p, that applies to everyone with the same genotype.  Here, of course, no two subjects have exactly the same genotype so some sort of aggregate risk score, adding up each person's 'hits', is assigned a p.  This, however, tacitly assumes something like the following: that each site contributes some fixed risk or 'probability' of affection.  But this treats these values as if they were essential to the site, each thus acting as a parameter of risk.  That is, sites are treated as a kind of fixed value or, one might say, 'force', relative to the trait measure in question.
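
To make the tacit assumption concrete, here is a minimal sketch of how an aggregate score of this kind might be computed.  Everything in it is hypothetical: the site names, the per-site weights, the baseline, and the choice of a logistic function to turn the score into a p are assumptions made for illustration, not a description of any particular study.

```python
import math

# Hypothetical per-site weights (log-odds added per copy of the 'risk' allele).
# Site names and numbers are invented for illustration only.
SITE_WEIGHTS = {"site_1": 0.10, "site_2": 0.05, "site_3": 0.20}
BASELINE_LOG_ODDS = -2.0  # assumed baseline for someone carrying no risk alleles

def aggregate_score(genotype):
    """Sum of weight * allele count (0, 1 or 2) over the scored sites."""
    return sum(SITE_WEIGHTS[site] * genotype.get(site, 0) for site in SITE_WEIGHTS)

def risk_probability(genotype):
    """Map the additive score to a single p with a logistic function."""
    log_odds = BASELINE_LOG_ODDS + aggregate_score(genotype)
    return 1.0 / (1.0 + math.exp(-log_odds))

# Two different genotypes that happen to add up to the same score.
person_a = {"site_1": 2, "site_2": 0, "site_3": 1}
person_b = {"site_1": 0, "site_2": 0, "site_3": 2}

score_a, score_b = aggregate_score(person_a), aggregate_score(person_b)
p_a, p_b = risk_probability(person_a), risk_probability(person_b)
print(score_a, p_a)
print(score_b, p_b)
```

The two people carry different genotypes yet receive the same score and therefore the same p, and each weight is treated as a fixed property of its site, which is exactly the parameter-like treatment questioned above.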

One obvious and serious issue is that these are necessarily estimated from past data, that is, by induction from samples.  Not only is there sampling variation that usually is only crudely estimated by some standard statistical variation-related measure, but we know that the picture will be at least somewhat different in any other sample we might have chosen, not to mention other populations; and those who are actually candid about what they are doing know very well that the same people living in a different place or time would have different risks for the same trait.
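
The purely statistical part of this problem can be sketched in a few lines.  The sketch below assumes, generously and only for illustration, that a fixed 'true' risk exists for carriers and non-carriers of some variant; all the numbers are invented.  It then draws repeated samples from that same imaginary population and estimates the risk difference in each.

```python
import random

# Assumed 'true' values, fixed by fiat for this sketch (all numbers invented).
CARRIER_FREQ = 0.3
RISK_CARRIER = 0.12
RISK_NONCARRIER = 0.10

def estimated_risk_difference(rng, n_people=500):
    """Draw one sample and estimate risk(carriers) minus risk(non-carriers)."""
    cases = {True: 0, False: 0}
    totals = {True: 0, False: 0}
    for _ in range(n_people):
        carrier = rng.random() < CARRIER_FREQ
        risk = RISK_CARRIER if carrier else RISK_NONCARRIER
        totals[carrier] += 1
        if rng.random() < risk:
            cases[carrier] += 1
    # max(..., 1) guards against an empty group in a very small sample.
    return cases[True] / max(totals[True], 1) - cases[False] / max(totals[False], 1)

rng = random.Random(2018)
estimates = [estimated_risk_difference(rng) for _ in range(20)]
true_difference = RISK_CARRIER - RISK_NONCARRIER

print(f"assumed true difference: {true_difference:.3f}")
print(f"estimates range from {min(estimates):.3f} to {max(estimates):.3f}")
```

Even with the underlying risks held perfectly constant, the sample-to-sample estimates scatter around the assumed value.  This is the mildest version of the problem: the sketch says nothing about different populations, places, or times, where the 'true' values themselves would differ.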

No study is perfect, so we use some conveniently assumed well-behaved regression/correction adjustments to account for the statistical 'noise' due to factors like age, sex, and unmeasured environmental effects.  Much worse than these issues, there are clearly factors of imprecision, and the obvious major one, taboo even to think about much less to mention, is that relevant future factors (mutations, environments, lifestyles) are unknowable, even in principle.  So what we really do, what we are forced to do, is extend what the past was like to the assumed future.  But besides this, we don't count somatic changes (mutations arising in body tissues during life, which were not inherited), because they'd mess up our assertions of 'precision', and we can't measure them well in any case (so just shut one's eyes and pretend the ghost isn't in the house!).

All of these together mean that we are estimating risks from imperfect existing samples and past life-experience, but treating them as underlying parameters so that we can extend them to future samples.  What that does is equate induction with deduction, assuming the past is rigorously parametric and will be the same in the future;  but this is simply scientifically and epistemologically wrong, no matter how inconvenient it is to acknowledge this.  Mutations, genotypes, and environments of the future are simply unpredictable, even in principle.

None of this is a secret, or new discovery, in any way.  What it is, is inconvenient truth. These things should have been enough, by themselves and without badgering investigators about environmental factors (which, we know very well, typically predominate), to show that the NIH's precision promises cannot be accurate ('precise'), or even accurate to a knowable degree.   Yet this 'precision' sloganeering is being, sheepishly, aped all over the country by all sorts of groups who don't think for themselves and/or who go along lest they get left off the funding gravy train.  This is the 'omics fad.  If you think I am being too cynical, just look at what's being said, done, published, and claimed.

These are, to me, deep flaws in the way the GWAS and other 'omics industries, very well-heeled, are operating these days, to pick the public's pocket (pharma may, slowly, be awakening-- Lancet editorial, "UK life science research: time to burst the biomedical bubble," Lancet 392:187, 2018).  But scientists need jobs and salaries, and if we put people in a position where they have to sing in this way for their supper, what else can you expect of them?

Unfortunately, there are much more serious problems with the science, and they have to do with the point-cause thinking on which all of this is based.

Even a point-cause must act through some process
By far most of the traits, disease or otherwise, that are being GWAS'ed and 'omicked these days, at substantial public expense, are treated as if the mapped 'causes' are point causes.  If there are n causes, and a person has an unlucky set m out of many possible sets, one adds 'em up and predicts that person will have the target trait.  And there is much that is ignored, assumed, or wishfully hidden in this 'will'.  It is not clear how many authors treat it, tacitly, as a probability vs a certainty, because no two people in a sample have the same genotype and all we know is that they are 'affected' or 'unaffected'.

The genomics industry promises, essentially, that from conception onward, your DNA sequence will predict your diseases, even if only in the form of some 'risk'; the latter is usually a probability and despite the guise of 'precision' it can, of course, be adjusted as we learn more.  For example, it must be adjusted for age, and usually other variables.  Thus, we need ever larger and more and longer-lasting samples.  This alone should steer people away from being profiteered by DNA testing companies.  But that snipe aside, what does this risk or 'probability' actually mean?

Among other things, those candid enough to admit it know that environmental and lifestyle factors have a role, interacting with the genotype if not, usually, overwhelming it, meaning, for example, that the genotype only confers some, often modest, risk probability, the actual risk much more affected by lifestyle factors, most of which are not measured or not measured with accuracy, or not even yet identified.  And usually there is some aspect that relates to age, or some assumption about what 'lifetime' risk means.  Whose lifetime?

Aspects of such a 'probability'
There are interesting issues, longstanding issues, about these probabilities, even if we assume they have some kind of meaning.  Why do so many important diseases, like cancers, only arise at some advanced age?  How can a genomic 'risk' be so delayed and so different among people?  Why do mice, with genotypes very similar to those of humans (which is why we do experiments on them to learn about human disease), live only to 3 while we live to our 70s and beyond?

Richard Peto raised some of these questions many decades ago.  But they were never really addressed, even in an era when NIH et al were spending much money on 'aging' research including studies of lifespan.  There were generic theories that suggested, on evolutionary grounds, why some diseases were deferred to later ages (it is called 'negative', or more commonly 'antagonistic', pleiotropy), but nobody tried seriously to explain why that was from a molecular/genetic point of view.  Why do mice live only 3 years, anyway?  And so on.

These are old questions and very deep ones but they have not been answered and, generally, are conveniently forgotten--because, one might argue, they are inconvenient.

If a GWAS score increases the risk of a disease that has a long delayed onset pattern, often striking late in life, and that is highly variable among individuals or over time, what sort of 'cause' is that genotype?  What is it that takes decades for the genes to affect the person?  There are a number of plausible answers, but they get very little attention, at least in part because attending to them would stand in the way of the vested interests of entrenched too-big-to-kill Big Data faddish 'research' that demands instant promises to the public it is trephining for support.  If the major reason is lifestyle factors, then the very delayed onset should be taken as persuasive evidence that the genotype is, in fact, by itself not a very powerful predictor.

Why would the additive effects of some combination of GWAS hits lead to disease risk?  That is, in our complex nature why would each gene's effects be independent of every other contributor's?  In fact, mapping studies usually show evidence that other things, such as interactions, are important--but they are at present almost too complex to be understood.

Does each combination of genome-wide variants have a separate age-onset pattern, and if not, why not?  And if so, how does the age effect work (especially if not due to person-years of exposure to the truly determining factors of lifestyle)?  If such factors are at play, how can we really know, since we never see the same genotype twice? How can we assume that the time-relationship with each suspect genetic variant will be similar among samples or in the future?  Is the disease due to post-natal somatic mutation, in which case why make predictions based on the purported constitutive genotypes of GWAS samples?

Obviously, if long delayed onset patterns are due not to genetic but to lifestyle exposures interacting with genotypes, then perhaps lifestyle exposures should be the health-related target, not exotic genomic interventions.  Of course, the value of genome-based prediction clearly depends on environmental/lifestyle exposures, and the future of these exposures is obviously unknowable (as we clearly do know from seeing how unpredictable past exposures have affected today's disease patterns).

The point here is that our reliance on genotypes is a very convenient way of keeping busy, bringing in the salaries, but not facing up to the much more challenging issues that the easy approach (run lots of data through DNA sequencers) can't address.  I did not invent these points, and it is hard to believe that at least the more capable and less me-too scientists don't clearly know them, if quietly.  Indeed, I know this from direct experience.  Yes, scientists are fallible and vain, and we're only human.  But of all human endeavors, science should be based on honesty because we have to rely on trust of each other's work.

The scientific problems are profound and not easily solved, and not soluble in a hurry.  But much of the problem comes from the funding and careerist system that shackles us.  This is the deeper explanation in many ways.  The paint on the House of Science is the science itself, but it is the House that supports that paint that is the real problem.

A civically responsible science community, and its governmental supporters, should be freed from the iron chains of dependence on relentless Big Data for their survival, and start thinking, seriously, about the questions that their very efforts over the past 20 years, on trait after trait, in population after population, and yes, with Big Data, have clearly revealed.

4 comments:

Steven Kurtz said...

Are you saying that humans are different from all other social mammals? We are the most complex ones in the opinion of most biologists I'm aware of. Heredity and natural selection are operative in us as in other life forms. It seems to me that boundaries, predispositions, and parameters are present in humans whether or not it is PC to say so. We are not blank slates.

Ken Weiss said...

Yes. But we have the mythology about ourselves, especially in science to be 'objective' (so we in science seem to say). Whether we are the most complex, or just differently so is probably a matter of judgment (and vanity?). Selection works differently in us because of culture, and much more slowly because we are so globally dispersed and have been very isolated from each other until recently (but also because there are so many of us). As to 'all other social mammals', I guess that would be a matter of definition, discussion, and opinion, but I wouldn't want to argue about it without knowing more about what you have in mind.

Steven Kurtz said...

Thanks for your interaction, Ken. Perhaps culture is part of "group selection." I'm not an academic. Just a retired dilettante! As to complexity, I was referring to our nervous systems, sense organs, and magnitude of language/concepts. We are likely to go extinct before many other species, so I'd not say we are the fittest, evolutionarily speaking!

Ken Weiss said...

I think that 'complex' is in the eye of the beholder in many ways. I am not qualified to judge whether we are more brain-wise 'complex' than an elephant or cat, though language makes us different at least. Anyway, each species has the complexity it needs, one might say.