Monday, October 5, 2015

Life in 'trans'-it: Why genomic causation is often so elusive

We are in a time when genes are in the daily news, with reports of how this gene or that gene is related to disease, evolution, race, ancestry, and even social behavior.  But what are 'genes', and what do they do?  This is so often presented--in classes, even at higher levels of education--as a simple story in which genes are bits of DNA that code for a protein, and proteins are the molecules that carry out the functions of life.  We are still heavily influenced by the pioneering work of Gregor Mendel, who did his famous experiments with peas more than 150 years ago.  So, we still think of genes as elements with one or more variant states in a population, transmitted from parents to offspring, which cause some trait (he studied traits like size, shape, or color in his pea plants to try to use this fact to breed better agricultural crops).

Mendel's intentionally focused, single-cause approach opened the way for an understanding of the mechanisms of inheritance and enabled one of the most powerful research strategies in all of science. But the idea of one gene and one function is a 19th century legacy that has put a conceptual cage around our thinking ever since.  Mendelian inheritance and its terms (like dominance and recessiveness, and even some of his notation) are still around, and indeed it all is rather ubiquitous even at the university level.  But we now know better, and can do better, and the many discoveries of the last century in biology and genetics present us with many 'mysterious' facts, basically unanticipated by the long, persistent shadow of Mendel's well-chosen simplifications.  It requires some thinking outside the Mendelian box to understand what they might mean.  

The cis image of the world
DNA is located in the nucleus of our cells, but where does genetic function take place?  The usual Mendelian way of thinking is that the action occurs in a particular place in our DNA where a 'gene' is. The gene codes for protein and (usually) has nearby DNA sequences that regulate the gene's usage--turning on its expression by transcribing the gene into messenger RNA.  That is, the gene itself determines how it's used.  It's in a given place in our DNA, and the presence of a complex of regulatory proteins that attach to nearby sequence causes the gene to be transcribed into messenger RNA, which exits the nucleus and is in turn translated into an amino acid chain specified by the sequence.  The amino acid chain is then folded up into a functional protein.

This local, focal view of gene action is what is called a cis perspective.  The Latin origin has a meaning like 'right here', or 'on this side'.  The specifics of this process differ depending on the gene, as no two genes work exactly alike, but the variation in the details is not central to the main point here: the widespread perception of genes as modular, chromosomally local, self-standing functional units.

But this common idea of how genes work is a fundamentally inaccurate way to understand genes and genomic function.

The fundamental nature of life in trans-it
DNA is itself essentially an inert molecule.  It doesn't do anything by itself.  In turn that means that each nucleotide, and that means each new mutational change, cannot be said to have a function or effect, or effect size, on its own.  It only has an effect in terms of its interactions with other aspects of the genome in the same cell, other materials in that cell, that cell in its respective organ and that organ in the organism as a whole, and indeed all of this in relation to environmental factors. While some gene-regulatory regions are near a coding gene, and act in cis, most function involves things elsewhere, on the same chromosome or on others.  This is the trans causal world of life, and it means we cannot really understand what's 'here' without knowing what's elsewhere.
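The point that a variant has no effect size 'on its own' can be illustrated with a deliberately tiny sketch. It assumes a toy multiplicative model that is not from this post: a gene's expression is the product of the affinity of a nearby (cis) binding site and the level of a regulator protein coded elsewhere (trans). The function names and numbers are hypothetical, chosen only for illustration.

```python
# Toy model (an assumption for illustration, not a biological measurement):
# expression = cis binding-site affinity * trans regulator level.

REFERENCE_AFFINITY = 1.0  # hypothetical affinity of the reference binding site
VARIANT_AFFINITY = 0.5    # hypothetical affinity after a mutation in that site


def expression(binding_affinity, regulator_level):
    """Expression of the gene under the toy multiplicative model."""
    return binding_affinity * regulator_level


def variant_effect(regulator_level):
    """Change in expression caused by the variant, for a given trans context."""
    return (expression(VARIANT_AFFINITY, regulator_level)
            - expression(REFERENCE_AFFINITY, regulator_level))


# The same DNA change, evaluated in three different trans contexts.
for level in (0.0, 1.0, 4.0):
    print(f"regulator level {level}: variant effect {variant_effect(level)}")
```

In this sketch the identical nucleotide change does nothing where the regulator is absent and has a progressively larger effect as the regulator level rises, so asking for 'the' effect of the variant has no answer until the context elsewhere in the genome and cell is specified.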

Indeed, even Darwinian evolution is fundamentally an ecological phenomenon--it's about organisms' resources, threats, mates, and so on, at any given time.  Besides luck, many levels and aspects of life may involve competition for resources and the like, and these matter to survival and reproduction.  But cooperating, in the sense of appropriate interaction, is by far the most prevalent, immediate, and vital aspect of life (Richard Dawkins' excessive, ideological 'selfish gene' assertions notwithstanding).

Trans means cooperation in life and evolution
Trans interactions are just that: interactions.  That means multiple components working together, which involves the 'right' combinations in the 'right' time and the 'right' cellular place.  By 'right' I mean functionally viable.  During development and subsequent life, organisms require suitable expression patterns of genes and suitable dispersion and processing patterns of gene products.  If this combinatorial action--this cooperation--doesn't occur to a suitable degree, the organism fails and its reproduction is reduced.  The extent of this failure depends on the nature of the combinatorial action.

In this sense, trans interactions may be reproductively better or worse, and that can be a form of natural selection, whose result is that the 'better' (more viably successful) patterns proliferate.  But this does not require Darwinian selection among organisms competing for limited resources.  Genomic variants whose cooperative interactions do not function can lead to embryonic lethality, for example, which need have nothing whatever to do with competition, and certainly not with other organisms seeking mates, food, or safety.  Ineffective cooperation is an evolutionary factor not identical to natural selection in its mechanism, but with similarly 'adaptive' effects.

In our view, cooperation based on trans interactions is more important, more prevalent, and more fundamental than Darwinian natural selection (as we write in our book The Mermaid's Tale).  Interactions that are successful become increasingly installed in the life history of organisms ('canalized' to use CH Waddington's venerable term for it), and this constrains the way and perhaps the rate at which evolution can occur.  This is neither heresy nor surprise.  For example, genes present today are the descendants of 4 billion years of evolutionary history, and most are used in multiple ways in the organism (at least in complex multicellular organisms; we don't know how true this is of simple or single-celled species).  They are less likely to suffer mutational change without serious effect, mainly negative. This is a very long-established idea, and is clearly supported by the high degree of sequence conservation of genes in genomes.

Genomewide mapping of most traits identifies many different genome regions that can statistically affect a trait's presence or measure.  But mapping rarely identifies coding regions.  Most 'hits' are in regulatory regions or regions with other (usually unknown) function.

This should surprise no one.  First, as noted above, 'genes' (protein coding regions) are largely of long evolutionary standing and embedded in interaction patterns, usually in multiple contexts (they are 'pleiotropic'), so the coding parts are harder than regulatory parts to modify viably by mutation. It is empirically much more likely that their expression patterns can be varied.  Second, every gene is a complex of many different components (protein code, splice and polyadenylation signals--where the required AAAAA... tail of a mRNA molecule is attached--promoter sites, enhancer sites, and so on). Each of these is mutable in principle, and ample evidence shows that regulatory regions are especially so.  And each transcription factor or other gene product that is needed to activate a given gene (that is, the tens of proteins and their DNA binding sites that must assemble to cause a nearby gene to be expressed) is itself a gene with the same sort of complex modular structure.  RNA has to be processed, transported and translated by factors that, again, are potentially mutable.  And so on.  And then most final functions, whether physiological, developmental, metabolic, or physical, are the result of complex processes over time, involving many genes and systems.

In fact, in recognition of biological complexity, many investigators suggest that the proper level of analysis should be of systems, that is, organized pathways of interaction that bring about some end result.  Gene regulation, physiology and metabolism, and so on, represent such entities.  The 'emergence' of the result cannot be predicted by listing the individual contributing elements, in the same sense that the effect of a new mutational change cannot be understood without considering its context.  However, systems themselves have overlap, redundancy, and elements that contribute to different systems at different times, and many systems may themselves interact in what one might call hyper-systems for a result--like you--to come about.  Analyzing emergent systems is at present an active but in many ways immature endeavor, because we probably still lack adequate understanding, and perhaps even adequate technology, for the job.  But it's important that people are considering the trans world in this and other ways.

Causal complexity is predictable, and what we expect is what we see
Causation in life is fundamentally about cooperation, which is about trans interactions.  Because cells are isolated from each other, so that they can sense their own environments and respond to them, they actively signal to each other, and a major way gene expression is regulated is through complex signal sending and receiving mechanisms.  'Signals' can mean gene-coded proteins secreted from cells, or the detection by cells of ions or other chemicals in their environment, and so on.  Signaling and responding to environmental conditions involves large numbers of genes and their regulation in time and space.  Most genes, in fact, have such cooperative, communicative function.

In turn, this implies that traits have many contributing genes, each with modular coding and regulatory sequences (along with other forms of genome function, such as packaging and many different types of RNA), and that each of these is potentially mutable and potentially variable within and between samples, populations, and species.  The result is the high level of causal complexity that is being so clearly documented.  A very large amount of viable contributing variation can be expected, if the individual variants have small effect.  The trait itself must be viable, but viability can coexist with large amounts of variation in the hundreds of contributing components.  This is what genomewide association studies (GWAS) consistently find, and it is wholly consistent with how evolution works.
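The claim that viability can coexist with large amounts of contributing variation can be sketched with a small simulation. Everything in it is an assumption made for illustration, not data: 200 independent sites, each with a variant at frequency 0.5 and an equal small additive effect, and a hypothetical 'viable' window on the resulting trait score.

```python
import random

N_LOCI = 200               # hypothetical number of contributing sites
N_INDIVIDUALS = 500        # hypothetical sample size
ALLELE_FREQ = 0.5          # assumed frequency of the variant at every site
VIABLE_RANGE = (170, 230)  # assumed window of viable trait scores


def simulate_population(seed=1):
    """Return one genotype per individual: a tuple of 0, 1 or 2 variant copies per site."""
    rng = random.Random(seed)
    return [
        tuple(
            (rng.random() < ALLELE_FREQ) + (rng.random() < ALLELE_FREQ)
            for _ in range(N_LOCI)
        )
        for _ in range(N_INDIVIDUALS)
    ]


def trait_value(genotype):
    """Toy additive trait: every variant copy adds one small, equal unit."""
    return sum(genotype)


def is_viable(genotype):
    low, high = VIABLE_RANGE
    return low <= trait_value(genotype) <= high


population = simulate_population()
viable = [g for g in population if is_viable(g)]
print("fraction viable:", len(viable) / len(population))
print("distinct genotypes:", len(set(population)))
print("distinct trait scores:", len({trait_value(g) for g in population}))
```

Under these assumptions nearly every simulated individual falls inside the viable window, yet no two carry the same combination of variants, and many individuals reach the same trait score by different routes. A real trait would add unequal effects, interactions, and environment, all of which this sketch leaves out.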

Life is complex in very understandable (and predictable) ways.  Enumerating causes, or even defining 'causes', is often a fool's errand, because different variants in different genome regions in different samples and populations are to be expected.

It's a highly cooperative trans world out there!

2 comments:

Anonymous said...

Given that the biologically co-operative trans-world is so extensive with "innumerable" processes involved, could one at least dream that every individual has his or her own unique combination of co-operative processes, akin to the uniqueness of a fingerprint ... a lot of commonality but still unique.

Ken Weiss said...

That is not a dream, but it's the clear reality! At least, each of us has a unique combination of specific variants in the contributors to these processes (we inherit the same array of process-related genome functional units, though in each of us some of them carry inactivation variants of some sort). It is as unique as a fingerprint.