Thursday, October 16, 2014

What if Rev Jenyns had agreed? Part III. 'Group' selection in individuals, too.

We have been using Darwin's and Wallace's somewhat different views of evolution to address some questions of evolutionary genetics and their consequences for today's attempts to understand the biological, especially genomic, basis of traits of interest. Darwin had a more particularistic, individual focus on the dynamics of evolutionary change, and Wallace a more group-focused, ecological one.

HMS Beagle in the Straits of Magellan

As a foil, we noted that a friend of Darwin's, Leonard Jenyns, was offered the naturalist's job on the Beagle first, but turned it down, opening the way for Darwin. We mused about how we might think today had Wallace's view of evolution, announced in the same year that Darwin's was, been the first view of the new theory. Where we'd be now if we'd had a more group than individual focus is of course not knowable, but we feel Wallace's viewpoint, at least in some senses, has been wrongly neglected.

Population genetic theory traces what happens to genetic variants in a population over time. Almost without exception the theory treats each individual as representing a single genotype. We take individual blood samples or cheek swabs, and let our "Next-Gen" sequencer grind out the nucleotide sequences as though on a proverbial assembly line. In this sense, each individual--or, rather, the individual's genotype--is taken to be the unit of evolution.

Populations were, and generally still are, seen as a mix of these individual, internally non-varying, homogeneous units, each having a genotype. But that's an obviously inaccurate way to view life, another reflection of the difference in viewpoint about variation in life that we've been characterizing by relating it symbolically to the different emphases in Darwin's and Wallace's views of evolution.

There is a strong tendency to equate genotypes with the traits they cause. This derives from the tendency to reduce natural selection to screening of single genes, because if single genes cannot be detected effectively by selection, they generally won't have high predictive value for biomedicine either. It is easy to see the issue.

But individuals are populations too
Let's ask something very simple: What is your 'genotype'? You began life as a single fertilized egg with two instances of human genomes, one inherited from each parent (here, we’ll ignore the slight complication of mitochondrial DNA). Two sets of chromosomes. But that was you then, not as you are now. Now, you’re a mix of countless billions of cells. They’re countless in several ways. First, cells in most of your tissues divide and produce two daughter cells, in processes that continue from fertilization to death. Second, cells die. Third, mutations occur so that each cell division introduces numerous new DNA changes in the daughter cells. These somatic (body cell) mutations don’t pass to the next generation (unless they occur in the germline) but they do affect the cells in which they are found.

But how do we determine your genotype? This is usually done from thousands or millions of cells—say, by sequencing DNA extracted from a blood sample or cheek swab. So what is usually sequenced is an aggregate of millions of instances of each genome segment, among which there is variation. The resulting analysis picks up, essentially, the most common nucleotides at each position. This is what is then called your genotype and the assumption is that it represents your nature, that is, all your cells that in aggregate make you what you are.
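To make the point concrete, here is a minimal sketch of what "picking the most common nucleotide at each position" does to the variation among cells. It is a toy illustration under our own assumptions, not how any actual sequencing pipeline works: the function name `consensus_call` and the ten short 'reads' are hypothetical, and each read stands in for the genome segment of one cell.

```python
from collections import Counter

def consensus_call(reads):
    """Return, for each position, the most common base and the fraction
    of reads that carry some other base (hypothetical toy caller)."""
    calls = []
    for column in zip(*reads):  # one column per genome position
        counts = Counter(column)
        base, n = counts.most_common(1)[0]
        calls.append((base, 1 - n / len(column)))
    return calls

# Hypothetical toy data: the same 4-base segment from 10 cells,
# two of which carry a somatic change at one position each.
reads = ["ACGT"] * 8 + ["ACGA", "ATGT"]

calls = consensus_call(reads)
consensus = "".join(base for base, _ in calls)
print(consensus)                              # the reported 'genotype'
print([round(minor, 2) for _, minor in calls])  # variation the call discards
```

The reported 'genotype' is a single string, while the second line shows the within-sample variation that the single string leaves out.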

In fact, however, you are not just a member of a population of different competing individuals each with their inherited genotypes. In every meaningful sense of the word each person, too, is a population of genomes. A person's cells live and/or compete with each other in a Darwinian sense, and his/her body and organs and physiology are the net result of this internal variation, in the same sense that there is an average stature or blood pressure among individuals in a population.

If we were to clone a population of individuals, each from a single identical starting cell, and house them in entirely identical environments, there would still be variation among them (we see this, imperfectly, in colonies of inbred laboratory strains such as of mice). They are mostly the same, but not entirely. That’s because they are aggregates of cells, with genomes varying around their starting genome.

Yesterday we tried to describe why the traits in individuals in populations have a central tendency: most people have pretty similar stature or glucose levels or blood pressure. The reason is a group-evolutionary phenomenon. In a population, many different genomic elements contribute to the trait, and because the population is here and hence has evolved successfully in its competitive environment, the mix of elements and their individual frequencies is such that random draws of these elements mainly generate rather similar results.

It is this distribution of random draws of all the genetic variants in the population that determines the context and hence the success of a given variant. But the process is a relativistic one, rather than absolute effects of individual variants. Gene A's success depends on B's presence and vice versa, across the genome. There is always a small number of outliers, having drawn unusual combinations, and evolution screens these in a way that results in a central tendency that may shift over time, etc.

The same explanation accounts for the traits in individuals. There would be a central tendency in our hypothetical cloned mice. That’s because the somatic mutations generate many different cells, but most are not too different from each other. As in evolution in populations, if they are dysfunctional the cell dies (or, in some instances, they doom the whole cell-population to death, as when somatic mutations cause cancer in the individual). Otherwise, they usually comprise a population near the norm.

Is somatic variation important?
An individual is a group, or population of differing cells. In terms of the contribution of genetic variation among those cells, our knowledge is incomplete to say the least. From a given variant's point of view (and here we ignore the very challenging aspect of environmental effects), there may be some average risk--that is, phenotype among all sampled individuals with that variant in their sequenced genome. But somatically acquired variation will affect that variant's effects, and generally we don't yet know how to take that into account, so it represents a source of statistical noise, or variance, around our predictions. If the variant's risk is 5% does that mean that 5% of carriers are at 100% risk and the rest zero? Or all are at 5% risk? How can we tell? Currently we have little way to tell and I think manifestly even less interest in this problem.
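A small simulation may help show why the 5% question is so hard to answer from counts of affected carriers alone. This is only a sketch under assumptions of our own: the function `simulate`, the two model names, and the carrier numbers are hypothetical. In the "heterogeneous" model 5% of carriers are certain to be affected and the rest are not at risk; in the "uniform" model every carrier has the same 5% chance.

```python
import random

def simulate(n_carriers, model, risk=0.05, seed=0):
    """Fraction of carriers affected under one of two hypothetical risk models."""
    rng = random.Random(seed)
    if model == "heterogeneous":
        # Each carrier is either at 100% risk or at 0% risk;
        # a fraction `risk` of carriers fall in the first group.
        personal_risk = [1.0 if rng.random() < risk else 0.0
                         for _ in range(n_carriers)]
    else:
        # Every carrier has the same modest risk.
        personal_risk = [risk] * n_carriers
    affected = sum(rng.random() < p for p in personal_risk)
    return affected / n_carriers

hetero = simulate(50_000, "heterogeneous")
uniform = simulate(50_000, "uniform", seed=1)
print(hetero, uniform)  # both come out close to 0.05
```

Both models give essentially the same observed fraction of affected carriers, so data of this aggregate kind cannot by themselves tell us which picture of individual risk is the right one.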

Cancer is a good, long-studied example of the potentially devastating nature of somatic variation, because there is what I've called 'phenotype amplification': a cell that has inherited (from the person's parents or the cell's somatic ancestors) a carcinogenic genotype will not in itself be harmful, but it will divide unconstrained so that it becomes noticeable at the level of the organism. Most somatic mutations don't lead to uncontrolled cell proliferation, but they can be important in more subtle ways that are very hard to assess at present. But we do know something about them.

Evolution is a process of accumulation of variation over time. Sequences acquire new variants by mutations in a way that generates a hierarchical relationship, a tree of sequence variation that reflects the time order of when each variant first arrived. Older variants that are still around are typically more common than newer ones. This is how the individual genomes inherited by members of a population come to differ, and it is part of the reason that a group perspective can be an important but neglected aspect of our desire to relate genotypes to traits, as discussed yesterday. Older variants are more common and easier to find, but are unlikely to be too harmful, or they would not still be here. Rarer variants are very numerous in our huge, recently expanded human population. They can have strong effects but their rarity makes them hard to analyze by our current statistical methods.

However, the same sort of hierarchy occurs during life as somatic mutations arise in different cells at different times in individual people. Mutations arising early in embryonic development are going to be represented in more descendant cells, perhaps even all the cells in some descendant organ system, than recent variants. But because recent variants arise when there are many cells in each organ, the organ may contain a large number of very rare, but collectively important, variants.
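The hierarchy just described can be sketched with a deliberately oversimplified toy model. The assumptions are ours and are not meant as realistic biology: cells divide synchronously, no cell dies, there is no selection, and each daughter cell gains exactly one new mutation. The function name `grow` is hypothetical.

```python
from collections import Counter
from itertools import count

def grow(generations):
    """Grow a cell population from one founder cell by synchronous division.
    Toy assumption: every daughter cell gains exactly one new mutation."""
    new_id = count()
    cells = [frozenset()]  # the founder carries no somatic mutations
    for _ in range(generations):
        # Each cell yields two daughters, each inheriting the parent's
        # mutations plus one new mutation of its own.
        cells = [cell | {next(new_id)} for cell in cells for _ in range(2)]
    return cells

cells = grow(6)  # 2**6 = 64 cells

# For each mutation, how many cells carry it?
carriers = Counter(m for cell in cells for m in cell)
# How many distinct mutations are carried by 1, 2, 4, ... cells?
spectrum = Counter(carriers.values())

for n_cells in sorted(spectrum, reverse=True):
    print(f"{spectrum[n_cells]:3d} mutations are each carried by {n_cells} cells")
```

Even in this crude model, the earliest mutations are few but present in half of all cells, while the most recent ones are each confined to a single cell yet outnumber all the older mutations combined.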

The mix of variants, their relative frequencies, and their distribution of resulting effects are thus a population rather than individual phenomenon, both in populations and individuals. Reductionist approaches done well are not ‘wrong’, and tell us what can be told by treating individuals as single genotypes, and enumerating them to find associations. But the reductionist approach is only one way to consider the causal nature of life.

Our society likes to enumerate things and characterize their individual effects. Group selection is controversial in the sense of explaining altruism, and some versions of group selection as an evolutionary theory have well-demonstrated failings. But properly considered, groups are real entities that are important in evolution, and that helps account for the complexity we encounter when we force hyper-reductionistic, individual thinking to the exclusion of group perspectives. The same is true of the group nature of individuals' genotypes.

We have taken Darwin and Wallace as representatives of these differing perspectives. Had Jenyns taken the boat ride he was offered, we'd have been more strongly influenced by Wallace's population perspective because we wouldn't have had Darwin's. Instead, Darwin's view won, largely because of his social position and being in the London hub of science, as has been well-documented. A consequence is that the ridicule to which group-based evolutionary arguments have been subjected is a reflection of the resulting constricted theoretical ideology of many scientists—but not of the facts that science is trying to explain.

What needs to be worked on is not, or certainly not just, increased sample size to somehow make enumerative individual prediction accurate. For reasons we've tried to suggest, retrospective fitting to the particular agglomerate of genotypes does not yield accurate individual prediction--and here we're not even considering non-genomic aspects of each genome-site's environment. Instead, we should try to develop a better population-based understanding of the mix of variants and their frequencies, and a better sense of what a given allele's 'effect' is when we know each allele's effect is neither singular nor absolute, but is strictly relative to its context both in terms of its individual and population occurrences. It's not obvious (to us, at least) how to do that, or how such an understanding might relate to whether accurate individualized prediction is likely to be possible in general.


Anne Buchanan said...

A friend sent this post to a list of his friends, many in genetics but also in agriculture and other fields as well, with this comment. "Attached please find a tutorial from Ken that I believe is particularly relevant for those who are proponents of personalized medicine and the genomic analysis of tumors. The reality that there is a population of genomes within each member of a population of individuals (each with a different genome at the zygote stage) is one of the most overlooked inconvenient truths that makes our search for an understanding of the etiology of phenotypes far more difficult than 99.37% seem to admit. Adding the complexity of similar variation within and among individuals of the epigenome makes one wonder what unmeasured force is the molasses that keeps life as orderly as it appears at the level of the whole. This reality challenges us to think differently about the integration of genomic information into medicine and agriculture. It also raises the question of how we have made so much progress in applying genetics in ag and medicine without this most recently emerging knowledge."

Ken Weiss said...

Reply to that response:

The descent tree of cell lineages during embryology, from the early cell divisions of the zygote to the adult, and the location of cells that continue to divide, are known. There will be a hierarchical tree of mutational accumulation as well in these cells, a distribution of genotypes and their relative frequencies.

The genotypic tree could perhaps be worked out by some careful sampling from a mouse or even a human cadaver, and with some reasonable assumptions, one might work out the distributional properties of somatic mutation if, say, the alleles found by GWASing are assumed to be the same as would arise during development etc.

One could at least try to get a sense of the variance of genotypic effects (forgetting environments--a huge issue, of course) that might apply to each person's nominal genotype. It might at least help to sharpen our idea of the unresolvable noise surrounding nominal genotype effects alone.