Monday, May 9, 2016

Darwin the Newtonian. Part III. In what sense does genetic drift 'exist'?

It has been about 50 years since Motoo Kimura and King and Jukes proposed that a substantial fraction of genetic variation can be selectively neutral, meaning that the frequency of such an allele (sequence variant) in a population or among species changes by chance--genetic drift--and, furthermore, that selectively 'neutral' variation and its dynamics are a widespread characteristic of evolution (see Wikipedia: Neutral theory of molecular evolution). Because Darwin had been so influential with his Newtonian-like deterministic theory of natural selection, neutral evolution was and still is referred to as 'non-Darwinian' evolution. That's somewhat misleading, if convenient as a catch-phrase, and it is often used to denigrate the idea of neutral evolution. It is misleading because even Darwin knew there were changes in life that were not due to selection (e.g., gradual loss of traits no longer useful, chance events affecting fitness).

First, of course, is the 'blind watchmaker' argument.  How else can one explain the highly organized functionally intricate traits of organisms, from the smallest microbe to the largest animals and plants?  No one can argue that such traits could plausibly just arise 'by chance'!

But beyond that, the reasoning basically coincides with what Darwin asserted.  It takes a basically thermodynamic belief and applies it to life.  Mother Nature can detect even the smallest difference between bearers of alternative genotypes, and in her Newtonian force-like way, will confer better success on the better genotype.  If we're material scientists, not religious or other mystics, then it is almost axiomatic that a mutation changes the nature of the molecule, if for no other reason than that it requires the use of a different nucleotide and hence the use and/or production of at least slightly different molecules and at least slightly different amounts of energy.

The difference might be very tiny in a given cell, but an organism has countless cells--many many billions in a human, and what about a whale or tree! Every nonessential nucleotide has to be provided for each of the billions of cells, renewed each time any cell divides.  A mutation that deleted something with no important function would make the bearer more economical in terms of its need for food and energy. The difference might be small, but those who then don't waste energy on something nonessential must on average do better: they'll have to find less food, for example, meaning spend less time out scouting and hence exposed to predators, etc.  In short, even such a trivial change will confer at least a tiny advantage, and as Darwin said many times to describe natural selection, nature detects the smallest grain in the balance (scale) of the struggle for life.  So even if there is no direct 'function,' every nucleotide functions in the sense of needing to be maintained in every cell, creating a thermodynamic or energy demand.  In this Newtonian view, which some evolutionary biologists hold or invoke quite strongly, there simply cannot be true selective neutrality--no genetic drift!

The relative success of any two genotypes in a population sample will almost never be exactly the same, and how could one ever claim that there is no functional reason for this difference?  A statistical test may fail to find 'significant' differences, in the probabilistic sense that the observed result would not be particularly unusual if nothing were going on, yet tiny differences can nonetheless obviously be real.  For example, a die that's biased in favor of 6 can, by chance, come up 3 or some other number more often in an experiment of just a few rolls. Significance cutoff values are, after all, nothing more than subjective criteria that we have chosen as conventions for making pragmatic decisions (the reason for dice being this way is interesting, but beyond our point here).
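To make the biased-die point concrete, here is a minimal simulation sketch. The amount of bias (6 comes up 25% of the time, every other face 15%), the numbers of rolls, and the function names are all assumptions chosen purely for illustration. The sketch counts how often the face 3 beats the favored face 6 in experiments of different lengths.

```python
import random

def biased_roll(rng, p_six):
    """One roll of a die that shows 6 with probability p_six;
    the other five faces share the remaining probability equally."""
    if rng.random() < p_six:
        return 6
    return rng.randint(1, 5)

def fraction_three_beats_six(n_rolls, n_experiments=2000, p_six=0.25, seed=1):
    """Fraction of experiments in which 3 comes up strictly more often than 6."""
    rng = random.Random(seed)
    upsets = 0
    for _ in range(n_experiments):
        rolls = [biased_roll(rng, p_six) for _ in range(n_rolls)]
        if rolls.count(3) > rolls.count(6):
            upsets += 1
    return upsets / n_experiments

if __name__ == "__main__":
    # Short experiments often "favor" the wrong face; long ones rarely do.
    for n_rolls in (6, 60, 300):
        print(n_rolls, fraction_three_beats_six(n_rolls))
```

With only a handful of rolls the real bias is frequently masked by chance, and it becomes reliably visible only as the number of rolls grows.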

But what about the lightning strikes?  They are fortuitous events that, obviously, work randomly against individuals in a population in a way unrelated to their genotypes, thus adding some 'noise' to their relative reproductive success and hence to allele (genetic variant) frequencies in the population over time.  That noise would also be a form of true genetic drift, because it would be due to a cause unrelated to any function of the affected variants, whose frequencies would change, at least to some extent, by chance alone. A common, and not unreasonable, selectionist response to that is to acknowledge that, OK! there's a minor role for chance, but nonetheless, on average, over time, the more efficient version must still win out in the end: 'must', for purely physical/chemical energetic reasons if no others.  That is, there can be no such thing as genetic drift on average, over the long haul.  Of course, 'overall' and 'in the end' have many unstated assumptions.  Among the most problematic is that sample sizes will eventually be sufficiently great for the underlying physical, deterministic truth to win out over the functionally unrelated lightning-strike types of factors.

On the other hand, the neutralists argue in essence that such minuscule energetic and many other differences are simply too weak to be detected by natural selection--that is, to affect the fitness of their bearers.  Our survival and reproduction are so heavily affected by those genotypes that really do affect them, that the remaining variants simply are not detectable by selection in life's real, finite daily hurly-burly competition. Their frequencies will evolve just by chance, even if the physical and energetic facts are real in molecular terms.
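The neutralist claim can be illustrated with a toy simulation. What follows is only a sketch, assuming a textbook Wright-Fisher style model (fixed population size, non-overlapping generations, random sampling of gene copies); the population size, the selection coefficient and the function name are hypothetical choices, not anything measured. It asks how often a single new variant takes over a small population when it is truly neutral (s = 0) and when it has a real but tiny advantage (s = 0.001).

```python
import random

def wright_fisher_fixation(n, s, replicates, seed=0):
    """Fraction of replicates in which one new mutant copy reaches fixation.
    n diploid individuals means 2n gene copies; s is the mutant's
    relative advantage (s = 0 is pure drift)."""
    rng = random.Random(seed)
    copies = 2 * n
    fixed = 0
    for _ in range(replicates):
        count = 1  # the mutant starts as a single copy
        while 0 < count < copies:
            p = count / copies
            # Selection nudges the chance that a sampled copy is the mutant.
            p_sel = p * (1 + s) / (p * (1 + s) + (1 - p))
            # Next generation: draw every gene copy at random.
            count = sum(1 for _ in range(copies) if rng.random() < p_sel)
        if count == copies:
            fixed += 1
    return fixed / replicates

if __name__ == "__main__":
    print("neutral  :", wright_fisher_fixation(25, 0.0, 2000))
    print("s = 0.001:", wright_fisher_fixation(25, 0.001, 2000))
```

In a population this small both runs should hover around 1 in 50, the chance expected for any single one of the 50 gene copies under pure drift, so the real advantage is there in the model but is swamped by sampling noise.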

But to say that variants that are chemically or physically different do not affect fitness is actually a rather strong assertion! It is at best a very vague 'theory', and a very strong assumption of Newtonian (classical physics) deterministic principles. It is by no means obvious how one could ever prove that two variants have no effect.

So we have two contending viewpoints.  Everyone accepts that there is a chance component in survival and reproduction, but the selectionist view sees that component as trivial in the face of the basic physical fact that two things that are different really are different and hence must be detectable by selection, while the other view holds that true equivalence is not only possible but widespread in life.

When you think about it, both views are so vague and dogmatic that they become largely philosophical rather than actual scientific views.  That's not good, if we fancy that we are actually trying to understand the real world.  What is the problem with these assertions?

Can drift be proved?
Maybe the simplest thing in an empirical setting would just be to rule out genetic drift, and show that even if the differences between two genotypes are small in terms of fitness there is always at least some difference.  But it might be easier to take the opposite approach, and prove that genetic drift exists.  To do that, one must compare carriers of the different genotypes and show that in a real population context (because that's where evolution occurs) there is no, that is, zero, difference in their fitness. But to prove that something has a value of exactly zero is essentially impossible!

Is each outcome equally likely?  How to tell?

To return to the dice-rolling analogy, a truly unbiased die can still come up 6 a different number of times than 1/6th of the number of rolls: try any number of rolls not divisible by 6!  In the absence of any true theory of causation, or perhaps to contravene the pure thermodynamic consideration that different things really are different, we have to rely on statistical comparisons among samples of individuals with the different competing genotypes.  Since there is the lightning-strike source of at least some irrelevant chance effects and no way to know all the possible ways the genotypes' effects might differ truly but only slightly, we are stuck making comparisons of the realized fitness (e.g., number of surviving offspring) of the two groups.  That is what evolution does, after all.  But for us to make inferences we must apply some sort of statistical criterion, like a significance cut-off value ('p-value'), to decide. We may judge the result to be 'not different from chance', but that is an arbitrary and subjective criterion.  Indeed, in the context of these contending views, it is also an emotional criterion.  Really proving that a fitness difference is exactly zero, without any real external theory to guide us, is essentially impossible.
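Here is a sketch of what such a statistical judgment looks like for dice, using a Pearson chi-square test against the hypothesis that all six faces are equally likely. The 5% cutoff (critical value 11.07 for 5 degrees of freedom) is the usual convention, and the slightly loaded die (face 6 weighted 1.1 against 1.0 for the others) is an assumption made up for this illustration, as are the function names.

```python
import random

CRITICAL_5_PERCENT_DF5 = 11.07  # chi-square cutoff, 5 degrees of freedom, p = 0.05

def roll_counts(n_rolls, weights, rng):
    """Counts of the six faces in n_rolls of a die with the given face weights."""
    faces = rng.choices(range(6), weights=weights, k=n_rolls)
    return [faces.count(face) for face in range(6)]

def chi_square_vs_fair(counts):
    """Pearson chi-square statistic against 'all faces equally likely'."""
    expected = sum(counts) / 6
    return sum((c - expected) ** 2 / expected for c in counts)

def rejection_rate(n_rolls, weights, n_experiments=1000, seed=2):
    """Fraction of experiments in which the test declares the die 'unfair'."""
    rng = random.Random(seed)
    rejected = 0
    for _ in range(n_experiments):
        counts = roll_counts(n_rolls, weights, rng)
        if chi_square_vs_fair(counts) > CRITICAL_5_PERCENT_DF5:
            rejected += 1
    return rejected / n_experiments

if __name__ == "__main__":
    fair = [1, 1, 1, 1, 1, 1]
    loaded = [1, 1, 1, 1, 1, 1.1]  # hypothetical slight bias toward 6
    print("fair die,   600 rolls:", rejection_rate(600, fair))
    print("loaded die, 600 rolls:", rejection_rate(600, loaded))
    print("loaded die, 20000 rolls:", rejection_rate(20000, loaded, n_experiments=100))
```

Two things should show up. A perfectly fair die is still declared 'unfair' in roughly one experiment in twenty, simply because that is where the cutoff was set. And the loaded die usually passes as fair at 600 rolls, while at 20,000 rolls the same bias is detected most of the time: 'not different from chance' says as much about the sample size as about the die.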

All we can really hope to do without better biological theory (if such were to exist) is to show that the fitness difference is very small.  But if there is even a small difference, if it is systematic it is the very definition of natural selection!  Showing that the difference is 'systematic' is easier to say than do, because there is no limit to the causal ideas we might hypothesize.  We cannot repeat the study exactly, and statistical tests relate to repeatable events.

There's another element making a test of real neutrality almost impossible.  We cannot sample groups of individuals who have this or that variant and who do not differ in anything else.  Every organism is different, and so are the details of their environment and lifestyle experiences.  So we really cannot ever prove that specific variants have no selective effect, except by this sort of weak statistical test averaging over non-replicable other effects that we assume are randomly distributed in our sample.  There are so many ways that selection might operate, that one cannot itemize them in a study and rule out all such things.  Again, selectionists can simply smile and be happy that their view is in a sense irrefutable.

A neutralist riposte to this smugness would be to say that, while it's literally true that we can't prove a variant to confer exactly zero effect, we can say that it has a trivially small effect--that it is effectively neutral.  But there is trouble with that argument, besides its subjectivity: the variant in question may in other times and genomic or environmental contexts have some stronger effect, and not be effectively neutral.
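One way to see how context-dependent 'effectively neutral' is: population genetics textbooks give a diffusion approximation, usually attributed to Kimura, for the chance that a single new variant with selective advantage s eventually takes over a diploid population of size N, namely (1 - exp(-2s)) / (1 - exp(-4Ns)), against 1/(2N) for a strictly neutral variant. The sketch below just evaluates that formula. It assumes an idealized population whose effective size equals its census size, and the value of s and the function name are hypothetical.

```python
import math

def fixation_probability(s, n):
    """Diffusion approximation for the fixation probability of one new
    mutant copy with additive advantage s in a diploid population of n
    individuals (assumes effective size equals census size)."""
    if s == 0:
        return 1.0 / (2 * n)  # strictly neutral: one copy among 2n
    if -4.0 * n * s > 700:
        return 0.0  # strongly deleterious: avoid floating-point overflow
    # expm1(x) = exp(x) - 1, accurate for the tiny values of s used here
    return math.expm1(-2.0 * s) / math.expm1(-4.0 * n * s)

if __name__ == "__main__":
    s = 1e-6  # a made-up, very small advantage
    for n in (1_000, 100_000, 10_000_000):
        ratio = fixation_probability(s, n) * 2 * n
        print(f"N = {n:>10,}: {ratio:.3f} times the neutral expectation")
```

Under these assumptions the very same variant behaves as neutral, for all practical purposes, in a population of a thousand, yet is many times more likely than a neutral one to take over a population of ten million.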

A related problem comes from the neutralists' own idea that by far most sequence variants seem to have no statistically discernible function or effect.  That is not the same as no effect.  Genomes are loaded with variants that are judged nearly or essentially neutral by the usual criteria used in bioinformatic computing, such as that neutral sites have greater variation in populations or between species than is found in clearly functional elements.  But this in no way rules out the possibility that combinations of these do-almost-nothings might together have a substantial or even predominant effect on a trait and the carriers' fitness.

After all, is that not just what countless very large-scale GWAS studies have shown? Such studies repeatedly, and with great fanfare, report that there are tens, hundreds, or even thousands of genome sites that have very small but statistically identifiable individual effects but that even these together still account for only a minority of the heritability, the estimate of the overall amount of contribution that genetic variation makes to the trait's variation.  That is, it is likely that many variants that individually are not detectably different from being neutral may contribute to the trait, and thus potentially to its fitness value, in a functional sense.
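A toy simulation shows how this can happen in principle. It is a sketch only, not a model of any real GWAS: it assumes 200 unlinked sites, each with allele frequency 0.5 and the same tiny additive effect, plus environmental noise scaled so that genes account for about half of the trait's variance. All of those numbers, and the function names, are invented for illustration.

```python
import random

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def simulate_polygenic_trait(n_people=500, n_loci=200, seed=3):
    """Return (per-site r^2 values, r^2 of all sites combined) for a toy trait."""
    rng = random.Random(seed)
    # Genotype = number of copies (0, 1 or 2) of an allele with frequency 0.5.
    genotypes = [[rng.randint(0, 1) + rng.randint(0, 1) for _ in range(n_loci)]
                 for _ in range(n_people)]
    # Every site adds the same small amount per allele copy (an assumption).
    genetic = [float(sum(row)) for row in genotypes]
    # Each site contributes variance 0.5, so total genetic variance is
    # 0.5 * n_loci; the noise gets the same variance (heritability about 0.5).
    noise_sd = (0.5 * n_loci) ** 0.5
    trait = [g + rng.gauss(0, noise_sd) for g in genetic]
    per_locus_r2 = [pearson_r([row[j] for row in genotypes], trait) ** 2
                    for j in range(n_loci)]
    total_r2 = pearson_r(genetic, trait) ** 2
    return per_locus_r2, total_r2

if __name__ == "__main__":
    per_locus, total = simulate_polygenic_trait()
    print("largest single-site share of trait variance:", round(max(per_locus), 3))
    print("share explained by all sites together:", round(total, 3))
```

No single site explains more than a sliver of the variation, and most would fail any significance test in a sample this size, yet together the sites account for about half of it.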

This is one of the serious and I think deeply misperceived implications of the very high levels of complexity that are clearly and consistently observed, which raises questions about whether the concept of neutrality makes any empirical sense, or remains rather a metaphysical or philosophical idea.  This is related to the concepts of phenogenetic drift that we discussed in Part II of this series, in which the same phenotype with its particular fitness can be produced by a multitude of different genotypes--the underlying alleles being exchangeable.  So are they neutral or not?

In the end, we must acknowledge that selective neutrality cannot be proved, and that there can always be some, even if slight, selective difference at work.  Drift is apparently a mythical or even mystical, or at least metaphoric concept.  We live in a selection-driven world, just as Darwin said more than a century ago.  Or do we?  Tune in tomorrow.


Anonymous said...

Enjoyed the article but will make a point that theories of genetic drift wherein stochastic and non-selected evolutionary changes can, if irreversible, lead to gradual divergence without adaptive differences pre-date Kimura and co. by about three and a half decades, with the Hagedoorns, whom Kimura rightly credits in his famed 1955 paper, being credited as the original 'drifters', and drift originally being referred to as the 'Hagedoorn effect'. The Hagedoorns' theory differed from Kimura's in that instead of being based principally on the idea that some mutations are invisible to selection, they instead argued (as I understand it) that it was possible for alleles to be lost from a population purely through quirks of differential reproduction and non-selective accident. This particularly applied to small populations, e.g. those geographically isolated such as island populations (e.g. founder effect), but was also incorporated into Sewall Wright's shifting balance theory of evolutionary population genetics, based on the idea that natural populations, including mainland ones, naturally tend to be broken up into smaller composite populations in which inbreeding can promote the loss of alleles, as well as the expression of possibly selectively favourable mutations that would otherwise be repressed under panmixia, therefore allowing for higher adaptive 'peaks' to be reached by passing through valleys of decreased adaptation. So there's more to questioning drift than simply discussing neutrality, although the task of proving the Hagedoorn effect and/or shifting balance theory is equally fraught with highly theoretical speculations.

Ken Weiss said...

Thanks for this informative comment. I wondered how I could not have known about Hagedoorn since I was in training at the time that drift was becoming a major component of population genetics. I scrambled to look at my often-read copy of Provine, to find only one scant reference to an idea of H's that was immediately (says Provine) blasted by Fisher. So I don't know if Hagedoorn was coming in from left field at the time, rather than having what we'd call the penetrating insight.

Part of the difference is that in the bad old days we had virtually no understanding of genetic variation, since there were no sequence data and only spotty data on protein variants etc. It was easier to think that one of the (usually) two alleles must have some disadvantage, at least conceptually, since it was an actual translated molecule being identified by protein studies. Also there was Fisher and the evolution of dominance and the heavy Mendelian type thinking that took place before RFLPs and so on came along.

I don't have enough recollection of Wright's views. He certainly accommodated drift but I think that was mainly in terms of frequency change relative to a selective landscape, so maybe not as purely 'drift' as we usually think of today, where the variants have no function relative to a fitness space. I would say that things like inbreeding effects were still being considered as important relative to fitness not truly non-functional variants. But I am just surmising and trying to remember what Wright said--too lazy right now to go check!

In any case, thanks again for raising these points. The issues are worth thinking about with more sophistication than usually given and strongly selectionistic views still predominate, and to me a major issue is the 'forciness' of selection in the face of the many complications that would induce herky-jerky evolution even among variants that, on their own, had functional effects.

Tarquin Holmes said...

(That's me above, by the way, I just wasn't able to publish the comment with my blog identity for some reason). I'd say* that a major difference between Fisher and Wright was that Fisher's highly abstract mathematical models assumed infinite population sizes and a state of panmixia, which did not allow for the Hagedoorn effect. Wright, on the other hand, started from the assumption that natural selection would be most effective when operating in a manner similar to the most effective optimising strategies used in artificial breeding, which typically require small founder populations and heavy inbreeding. Fisherian selection he thought would generally be inefficient in nature, not least as it does not allow for populations to move through maladaptive valleys to more adaptive peaks. That presumptive insight is one of the major bases for his theory of shifting balance. The role of drift for Wright was therefore to contribute, in small populations isolated from panmictic interbreeding, to the possibility of moving through globally maladaptive (but locally adaptively adequate) valleys to locally and globally adaptive peaks.
*This isn't my argument really, it's one that's been made by established historians of science such as MJS Hodge and Jean Gayon.

Ken Weiss said...

This makes sense and thanks for writing it, and very clearly. I don't know what Fisher said about this sort of thing in his feuds with Wright, perhaps that stochasticity in large continuous species habitats was trivially unimportant (which is why most modern population geneticists would say that weak selection takes forever in species with large population size, a point that, as you say, Wright made, so that effective evolution occurs in local areas and spreads). I think this is roughly what Darwin said in his 6th edition of the Origin of Species, in his ch IV on natural selection if my memory is right. I think no modern evolutionary geneticist can dispute the kinds of drift effects you describe. I do think the concept of a smooth terrain of peaks and valleys is too metaphoric to be entirely adequate, and I'm going to have to go back to Provine's critique of Wright and the latter's reply (in, I think, Wright's final paper, in American Naturalist). Even a local adaptation in an otherwise numerous, widely dispersed species, will have a hard time displacing its genotypic rivals, not least because not all parts of the species' environments will be favorable to the once-local variants. Perhaps what mainly happens is that the local variants become a new species.

Ken Weiss said...

A further thought. As we try to explain (or, at least, assert!) in this series of posts, environments screen any species on multiple traits all the time. Most models, and I think it's fair to say those of Fisher and Wright in the founding decades of modern evolutionary theory, were thinking of single genes or of single traits. This is probably misleading, though still I think the usual mode, even in terms of polygenic or GWAS-like studies, where additivity is assumed or little effort made to really test epistasis. Those days were when little was known about actual DNA or developmental/metabolic/etc. systems at the 'gene' level.

If environments screen multiway in any local area/time, then one can expect it to be correspondingly less likely that a subpopulation from a local 'peak' could expand into the greater population and its 'adaptive' genes still retain their advantage. I think, also, that Fisher or others argued that because of diploidy and recombination and the multi-locus nature of most interesting traits, the genotypic combinations that were fostered by the small population at the local Wright-type peak would quickly be diluted, so that this was not much of an explanation of adaptation for numerous or widely dispersed species. I am pretty sure Fisher or others made that argument, and it may have been part of Kimura's and others' views on drift (that the sequence elements were non-functional, so the recombination/dilution process would pose no problem for those variants' neutral evolution).

Ken Weiss said...

You might find it interesting to read Wright's 1988 (American Naturalist) paper--which I retrieved again. I don't have Provine's critique at hand, but I think Wright makes clear the kinds of circumstances he was referring to (at least, in retrospect!). I believe that these arguments are somewhat out of date given what we know about genomes, complex gene function and the like and I agree with Wright's statement in 1988 that this is not an area where we should be using formal mathematical theory in the face of what we know to be the complexities and irregularities of reality.

Tarquin Holmes said...

Ken, thank you for your replies. I'm responding more as a historian than as someone arguing for what is the 'right' model today for evolution through natural selection (not least as I'm less up-to-date and informed on a lot of recent developments, though I try to keep up every now and then). I certainly agree about the problems of classical population genetics with its tendency to assume simple Mendelian alleles that do not significantly interact with the rest of the genome. I will definitely look up the 1988 Wright paper, as well as Provine's original critique, thank you for the tip.