Wednesday, April 22, 2015

Seattle's finest (thinkers, not coffee)

We're currently in Seattle, not blogging, but I gave a presentation here about genomic causation at the Institute for Systems Biology. I was hosted by Sui Huang and his group, and had many good conversations. This is a research institute where a lot of clever people are working hard, in various ways, to understand the causal complexity of genomic variation.

Many are dedicated at present to a Big Data informatics or computational approach to causation, using GWAS or similar kinds of data.  My talk did critique that approach, in the sense of things we've posted about many times here, and it probably didn't go down well with those at ISB for whom this is a GWAS world.  Still, my point was that extensive genome mapping has shown that traits are complex in ways we had reason to expect before the GWAS era began 20 or so years ago.  We've learned a lot, but one lesson is the disappointing one that mapping would not find 'the', or the few, genes for common disease traits.

The 'precision' individual predictive medicine is here, whether one likes it or not, because that's where the money is, from NIH, thanks to the PR spin that director Francis Collins has managed to sell.  The romance with heavily inductive Big Data, computationally extensive and without strong hypotheses, is what's afoot.  Nonetheless, reservations not being dismissed, there are people here at ISB and elsewhere who are trying to exploit what is in this sort of data.

One major 'new' approach is to use extensive genome sequencing in, yes, families.  Without any sense of embarrassment at how GWASers have for a long time sneered at pedigree data, the greater power of families to reveal strong genetic causal factors is being realized.  To be fair, most of the younger people involved were not the sneerers (that was their elders).  Whether major high-risk genes or variants will be found in numbers justifying this way of analyzing Big Data remains to be seen, and one needs to beware of claims for this or that success as showing that this approach gives value. Such claims will be made, because that's how we operate these days.  But the question isn't asked whether we could get more bang for the buck with other approaches.

Anyway, we've said these sorts of things many times, and here there are people trying to think hard about what can be learned from Big Data and mapping approaches.  My talk was on something different, namely, to point out that genes function only by interacting with other genes and 'environmental' factors.  DNA is inert by itself, and only works by interacting with molecules in its environment, many of which are coded by DNA elsewhere in the genome.  If a mutation arises, does it have an 'allelic effect'?  If so, what is it?  Essentially, and fundamentally, it depends on its context in the individual--the rest of the genome and its net effects, plus environmental conditions.

I asked this in the context of an evolutionary simulation program, called ForSim, that I and colleagues have developed and used over the years.  When a new mutation arises in a simulation, one has to specify how it affects traits (phenotypes) being simulated. How to assign such an effect is by no means trivial and leads to many important issues about genetic causation.
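To make the question concrete, here is a toy sketch in Python.  It is not ForSim's code or its model; the function names, the haploid 0/1 genotypes, the pairwise interaction term, and all the numbers are assumptions made up purely for illustration.  It shows only that once loci interact with each other and with the environment, the 'effect' of the same new mutation is not a single number.

```python
# Toy illustration only: NOT ForSim's model. All names and values are hypothetical.

def phenotype(genotype, effects, interactions, environment):
    """Trait value: per-locus effects plus pairwise interactions, scaled by environment.

    genotype     -- list of 0/1 allele states, one per locus (haploid for simplicity)
    effects      -- per-locus contribution when the mutant allele (1) is present
    interactions -- dict mapping (locus_i, locus_j) to an interaction weight
    environment  -- a single multiplicative environmental factor
    """
    additive = sum(e * g for e, g in zip(effects, genotype))
    epistatic = sum(w * genotype[i] * genotype[j]
                    for (i, j), w in interactions.items())
    return (additive + epistatic) * environment


def allelic_effect(locus, background, effects, interactions, environment):
    """Change in trait value from putting the mutant allele at `locus`
    onto a given genomic background in a given environment."""
    without = list(background)
    without[locus] = 0
    with_mutant = list(background)
    with_mutant[locus] = 1
    return (phenotype(with_mutant, effects, interactions, environment)
            - phenotype(without, effects, interactions, environment))


# Hypothetical three-locus trait; loci 0 and 1 interact negatively.
effects = [0.5, 0.25, 0.125]
interactions = {(0, 1): -0.25}

background_a = [0, 0, 0]   # mutation arises on an 'empty' background
background_b = [0, 1, 0]   # same mutation, but locus 1 already carries a variant

effect_a = allelic_effect(0, background_a, effects, interactions, environment=1.0)
effect_b = allelic_effect(0, background_b, effects, interactions, environment=1.0)
effect_a_env2 = allelic_effect(0, background_a, effects, interactions, environment=2.0)

print(effect_a, effect_b, effect_a_env2)
```

In this sketch the very same mutation at locus 0 changes the trait by a different amount on each background and in each environment, so any single 'allelic effect' one assigns in a simulation is really a statement about a particular context.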

As importantly, differentiated organisms exist because their cells are partially isolated from each other, so they can specialize, but connected through communications like signaling, to the other cells (and to the external environment as perceived by the chemical and other sensory systems, etc.).  These interactions are in a sense not working in cis as the term is used, meaning along a given DNA molecule, but are trans effects, meaning involving DNA elsewhere in the genome.  In addition, it is local combinations of factors, their location and timing and presence, absence, or concentrations that, together, bring about biological effects.

The cis-trans distinction is not at all new, but there are many phenomena that are very 'trans' in nature.  For example, a given cell responds to environmental factors like genetically coded signal molecules, that are produced elsewhere.  The local cell is the location for the interactions among many factors arriving from various places.  That is how tissue behavior and differentiation respond to changed conditions, and/or are produced initially in embryos as they differentiate their various organ systems.

However, there are many aspects of what goes on genomically within a given cell that involve important trans effects that are basically not yet understood--and often hardly investigated or, as one might say, that do not give pause to the Big Data train racing right past, paying little attention. Monoallelic expression of various sorts is an example.  I talked with various people who found these issues interesting to think about, and in general we discussed strange facts that might be given higher priority in trying to understand genetic causation.  For example, a single gene may be associated with one type of disease (the Huntingtin protein and Huntington's Disease, or BRCA1 and breast/ovarian cancer, are examples).  These are interesting because the gene is expressed in all cells, and the dangerous variants seem to have particular roles to play that would lead one to expect that they would have negative effects on all tissues, not just the 'major' ones the genes are known for, yet they don't.  Similarly, these sorts of genes often do not cause the same problem in mice, even though the genes are in mice and actively used.

Why is this?  The generic answer is that the genomic background or other life-course factors differ among species, or among cells.  But generic answers at this stage are rather hand-waving by nature. To me, these are examples of 'strange' facts that could, potentially, provide far more important answers if they were understood, than what we'll get by increasing the scale of the same sorts of studies we've been doing.

Another topic on which we had very good talks was the way that genomic causal complexity has evolved to enable organisms to achieve biological success (survival and reproduction) under varying conditions.  The fact that there are many causal paths to traits like, say, glucose levels or behavior, means that different genotypes can succeed in a given environment, or the species can succeed in diverse environments.  Thus, the genomic complexity being clearly shown by mapping makes evolutionary sense.

In that context, another important point is that some genes, especially perhaps those involved in early development, are highly conserved evolutionarily.  Why is that, and how might that channel aspects of biology, keeping them from differing too much even over eons of geologic history?  These are important points.

We've discussed other things even in the 2 days we've been here, and hope to comment on them in the near future (but we're on holiday as of tomorrow, for a few days, so we may not get to that til next week).  We also want to mention that we met and had dinner with another of our regular, thoughtful correspondents, Manoj Samanta, whose blog often has topics MT readers might find interesting.

My final time at ISB ended with several hours of extremely stimulating discussions with a number of their people, students and staff alike.  What a refreshing and invigorating experience!  I intend to keep in touch with many of the people there.

1 comment:

It was a great pleasure to meet both of you !

I mentioned P. W. Anderson and broken symmetry. In that context, you may find Anderson's following paper interesting. It came as a surprise in 1972, when it was written, because physicists of that era used to believe that finding the fundamental laws would be enough to explain everything else in nature.

I was wrong about him being a founder of the Santa Fe Institute. He was heavily involved with the organization but was not a founder; Murray Gell-Mann (another well-respected physicist) was.