Showing posts with label paradigm shift. Show all posts

Wednesday, December 28, 2016

Post-truth science?

This year was one that shook normal politics to its core.  Our beliefs in free and fair elections, in the idea that politicians strive to tell the truth and are ashamed to be caught lying, in real news vs fake, in the importance of tradition and precedent, indeed in the importance of science in shaping our world, have all been challenged.  This has served to remind us that we can't take progress, world view, or even truth and the importance of truth for granted.  The world is changing, like it or not.  And, for scientists who assume that truth actually exists and whose lives are devoted to searching for it, the changes are not in familiar directions.  We can disagree with our neighbors about many things, but when we can't even agree on what's true, this is not the 'normal' world we know.

To great fanfare, Oxford Dictionaries chose "post-truth" as its international word of the year.
The use of “post-truth” — defined as “relating to or denoting circumstances in which objective facts are less influential in shaping public opinion than appeals to emotion and personal belief” — increased by 2,000 percent over last year, according to analysis of the Oxford English Corpus, which collects roughly 150 million words of spoken and written English from various sources each month.  New York Times
I introduce this into a science blog because, well, I see some parallels with science.  As most of us know, Thomas Kuhn, in his iconic book, The Structure of Scientific Revolutions, wrote about "normal science", how scientists go about their work on a daily basis, theorizing, experimenting, and synthesizing based on a paradigm, a world view that is agreed upon by the majority of scientists.  (Although not well recognized, Kuhn was preceded in this by Ludwik Fleck, a Polish-Israeli physician and biologist who, way back in the 1930s, used the term 'thought collective' for the same basic idea.)

When thoughtful observers recognize that an unwieldy number of facts no longer fit the prevailing paradigm, and develop a new synthesis of current knowledge, a 'scientific revolution' occurs and matures into a new normal science.  In the 5th post in Ken's recent thought-provoking series on genetics as metaphysics, he reminded us of some major 'paradigm shifts' in the history of science -- plate tectonics, relativity and the theory of evolution itself.

We have learned a lot in the last century, but there are 'facts' that don't fit into the prevailing gene-centered, enumerative, reductive approach to understanding prediction and causation, our current paradigm.  If you've read the MT for a while, you know that this is an idea we've often kicked around.  In 2013 Ken made a list of 'strange facts' in a post he called "Are we there yet or do strange things about life require new thinking?" I repost that list below because I think it's worth considering again the kinds of facts that should challenge our current paradigm.

As scientists, our world view is supposed to be based on truth.  We know that climate change is happening, that it's automation not immigration that's threatening jobs in the US, that fossil fuels are in many places now more costly than wind or solar.  But by and large, we know these things not because we personally do research into them all -- we can't -- but because we believe the scientists who do carry out the research and who tell us what they find.  In that sense, our world views are faith-based.  Scientists are human, and have vested interests and personal world views, and seek credit, and so on, but generally they are trustworthy about reporting facts and the nature of actual evidence, even if they advocate their preferred interpretations and, like anyone else, do their best to support their own views and even their biases.

Closer to home, as geneticists, our world view is also faith-based in that we interpret our observations based on a theory or paradigm that we can't possibly test every time we invoke it, but that we simply accept.  The current 'normal' biology is couched in the evolutionary paradigm often based on ideas of strongly specific natural selection, and genetics in the primacy of the gene.

The US Congress just passed a massive bill in support of normal science, the "21st Century Cures Act", with funding for the blatant marketing ploys of the brain connectome project, the push for "Precision Medicine" (first "Personalized Medicine," this endeavor has been rebranded -- cynically? -- yet again as "All of Us"), and the new war on cancer.  These projects are nothing if not born of our current paradigm in the life sciences: reductive enumeration of causation and the promise of predicting disease.  But the many well-known challenges to this paradigm lead us to predict that, like the Human Genome Project, which among other things was supposed to lead to the cure of all disease by 2020, these endeavors can't fulfill their promise.

To a great if not even fundamental extent, this branding is about securing societal resources for projects too big and costly to kill, in a way similar to any advertising, or even to the way churches promise heaven when they pass the plate.  But it relies on widespread acceptance of contemporary 'normal science', despite the unwieldy number of well-known, misfitting facts.  Even science is now perilously close to being 'post-truth'.  This sort of dissembling is deeply built into our culture at present.

We've got brilliant scientists doing excellent work, turning out interesting results every day, and brilliant science journalists who describe and publicize their new findings. But it's almost all done within, and accepting, the working paradigm. Too few scientists, and even fewer writers who communicate their science, are challenging that paradigm and pushing our understanding forward. Scientists, insecure and scrambling not just for insight but for their very jobs, are pressed explicitly or implicitly to toe the current party line. In a very real sense, we're becoming more dedicated to faith-based science than we are to truth.

Neither Ken nor I am certain that a new paradigm is necessary, or that it's right around the corner. How could we know? But there are enough 'strange facts' that don't fit the current paradigm, centered around genes as discrete, independent causal units, that we think it's worth asking whether a new synthesis, one that can incorporate these facts, might be necessary. It's possible, as we've often said, that we already know everything we need to know: that biology is complex, genetics is interactive not additive, every genome is unique and interacts with unique individual histories of exposure to environmental risk factors, evolution generates difference rather than replicability, and we will never be able to predict complex disease 'precisely'.

But it's also possible that there are new ways to think about what we know, beyond statistics and population-based observations, to better understand causation.  There are many facts that don't fit the current paradigm, and more smart scientists should be thinking about this as they carry on with their normal science.



---------------------------------
Do strange things about life require new concepts?
1.  The linear view of genetic causation (cis effects of gene function, for the cognoscenti) is clearly inaccurate.  Gene regulation and usage are largely, if not mainly, not just local to a given chromosome region (they are trans);
2.  Chromosomal usage is 4-dimensional within the nucleus, not even 3-dimensional, because arrangements are changing with circumstances, that is, with time;
3.  There is a large amount of inter-genic and inter-chromosomal communication leading to selective expression and non-expression at individual locations and across the genome (e.g., monoallelic expression).  Thousands of local areas of chromosomes wrap and unwrap dynamically depending on species, cell type,  environmental conditions, and the state of other parts of the genome at a given time; 
4.  There is all sorts of post-transcription modification (e.g., RNA editing, chaperoning) that is a further part of 4-D causation;
5.  There is environmental feedback in terms of gene usage, some of which (epigenetic marking) can be inherited and borders on being 'Lamarckian';
6.  There are dynamic symbioses as a fundamental and pervasive rather than just incidental and occasional part of life (e.g., microbes in humans);
7.  There is no such thing as 'the' human genome from which deviations are measured.  Likewise, there is no evolution of 'the' human and chimpanzee genome from 'the' genome of a common ancestor.  Instead, perhaps conceptually like event cones in physics, where the speed of light constrains what has happened or can happen, there are descent cones of genomic variation descending from individual sequences -- time-dependent spreading of variation, with time-dependent limitations.  They intertwine among individuals, though each individual's is unique.  There is a past cone of ancestry leading to each current instance of a genome sequence, from an ever-widening set of ancestors (as one goes back in time), and a future cone of descendants and their variation that's affected by mutations.  There are descent cones in the genomes among organisms, among organisms in a species, and between species.  This is of course just a heuristic, not an attempt at a literal simile or to steal ideas from physics!
Light cone: Wikipedia

8.  Descent cones exist among the cells and tissues within each organism, because of somatic mutation, but here the metaphor breaks down: these cones have singular rather than complex ancestry, because within an individual they go back to a single point, the fertilized egg, and across individuals to life's Big Bang;
9.  For the previous reasons, all genomes represent 'point' variations (instances) around a non-existent core that we conceptually refer to as 'species' or 'organs', etc. ('the' human genome, 'the' giraffe, etc.);
10.  Enumerating causation by statistical sampling methods is often impossible (literally) because rare variants don't have enough copies to generate 'significance', significance criteria are subjective, and/or because many variants have effects too small to generate significance;
11.  Natural selection, which along with chance (drift) generates current variation, is usually so weak that it cannot be demonstrated, often even in principle, for similar statistical reasons: if the cause of a trait is too weak to show, the cause of fitness is too weak to show; and there is not just one way to be 'adapted';
12.  Alleles and genotypes have effects that are inherently relativistic.  They depend upon context, and each organism's context is different;
13.  Perhaps analogously with the ideal gas law and its like, phenotypes seem to have coherence.  We each have a height or blood pressure, despite all the variation noted above.  In populations of people, or organs, we find orderly (e.g., 'bell-shaped') distributions, which may be the result of a 'law' of large numbers: just as human genomes are variation around a 'platonic' core, so blood pressure is the net result of the individual action of many cells.  And biological traits are always changing;
14. 'Environment' (itself a vague catch-all term) has very unclear effects on traits.  Genomic-based risks are retrospectively assessed but future environments cannot, in principle, be known, so that genomic-based prediction is an illusion of unclear precision; 
15.  The typical picture is of many-to-many genomic (and other) causation for which many causes can lead to the same result (polygenic equivalence), and many results can be due to the same cause (pleiotropy);
16. Our reductionist models, even those that deal with networks, badly under-include interactions and complementarity.  We are prisoners of single-cause thinking, which is only reinforced by strongly adaptationist Darwinism that, to this day, makes us think deterministically and in terms of competition, even though life is manifestly a phenomenon of molecular cooperation (interaction).  We have no theory for the form of these interactions (simple multiplicative? geometric?).
17.  In a sense all molecular reactions are about entropy, energy, and interaction among different molecules or whatever.  But while ordinary nonliving molecular reactions converge on some result, life is generally about increasing difference, because life is an evolutionary phenomenon.
18. DNA is itself a quasi-random, inert sequence. Its properties come entirely from spatial, temporal, combinatorial ('Boolean'-like) relationships. This context works only because of what else is in (and on the immediate outside) of the cell at the given time, a regress back to the origin of life.

Wednesday, October 30, 2013

Are we there yet or do strange things about life require new thinking?

Yesterday we wrote about the state of things in genomics. The idea of genetics as essentially a reductionistic one gene, one trait approach to understanding causation and prediction is still a live one, despite decades of evidence to the contrary.  Indeed, despite the fact that we've known for 100 years that life is far more complex than that.  Yet still today the prevailing paradigm is to collect more data, enumerate more genes and gene variants associated with disease, and other sorts of 'omics' Big Data, and we'll finally understand causation and be able to predict disease.  It is largely raw induction--the data will speak for themselves by the patterns computers can find in them.  But in many ways, the closer we look, the stranger things seem, not clearer.

In his book The Structure of Scientific Revolutions, published in 1962, Thomas Kuhn described what scientists do as 'normal science' interrupted by rare, transformative changes of fundamental viewpoint.  He called these moments 'paradigm shifts', now a terribly over-worked phrase.  People are very reluctant to give up a worldview they know and have worked with, and are either oblivious to contrary facts or problems that seem insoluble.  Until someone comes along with a fundamentally better idea that accounts for those contrary facts, and then people wonder, as Huxley did about Darwin's theory, why they hadn't seen it all along.

We have seen over the last few years that there are important areas in which the proverbial emperor of genomics has been shown to have less than adequate clothes, or more accurately, that there is not very much emperor inside the huge cloak of modern 'omics'. We're awash in data, with new sorts appearing regularly (e.g., ever-growing lists of SNPs, copy number variation, microbiomes, epigenetic modifications, genes in pathways, etc.).  This has added potentially causal elements to efforts to relate genomic data to the traits of organisms, like disease or adaptations, and a bewildering amount of complexity, with much angst about how it can be parsed.  Some findings have been quite important, but most have been minor at best, and often totally ephemeral.

But what we get, and what most are seeking, are just lists, of a sort that only a computer can love (or have the patience to look through), and lists don't account for the many, many spatial and temporal entanglements, of diverse form, among the multitude of factors we know are involved in making organisms what they are, in 4-dimensional space and time.

It is tempting to think that some revolutionary theory is just around the corner if only someone makes the profound discovery--the next Newton or Einstein (or Darwin).  Darwin's insight was as profound as these others, but what he saw was that life, unlike atoms, seems imprecise by nature--based as it is not on replication but on divergence by random variation weakly screened by experience.  And despite widespread but uncritical views to the contrary, Darwin's very Newtonian simple causal determinism was patently imprecise or incomplete.  Is there something fundamental about causation in life and genomes that is yet to be discovered?

In a sense, the evolutionary and functional genomics professions are clinging to conventional notions much the way early 20th century physicists clung to 'ether' in the face of relativity theory:  if we just have better technology, bigger samples and enumerate more and more things, and build statistical models to infer patterns we attribute to causation, we'll understand everything and answer riddles like 'hidden heritability' or enable 'personalized genomic medicine' ... finally!  So, defenders of the faith say to skeptics: patience, please--let us carry on!

But is this right?  What if we ask whether there might be something more involved in life than relentless 'omic'-scale beetle-collecting?

Do strange things about life require new concepts?
Here is another list, this time of a few discoveries or realizations that don't easily fit into the prevailing view, suggesting that simple ramping up of enumeration may not be our salvation:
1.  The linear view of genetic causation (cis effects of gene function, for the cognoscenti) is clearly inaccurate.  Gene regulation and usage are largely, if not mainly, not just local to a given chromosome region (they are trans);
2.  Chromosomal usage is 4-dimensional within the nucleus, not even 3-dimensional, because arrangements are changing with circumstances, that is, with time;
3.  There is a large amount of inter-genic and inter-chromosomal communication leading to selective expression and non-expression at individual locations and across the genome (e.g., monoallelic expression).  Thousands of local areas of chromosomes wrap and unwrap dynamically depending on species, cell type,  environmental conditions, and the state of other parts of the genome at a given time;
4.  There is all sorts of post-transcription modification (e.g., RNA editing, chaperoning) that is a further part of 4-D causation;
5.  There is environmental feedback in terms of gene usage, some of which (epigenetic marking) can be inherited and borders on being 'Lamarckian';
6.  There are dynamic symbioses as a fundamental and pervasive rather than just incidental and occasional part of life (e.g., microbes in humans);
7.  There is no such thing as 'the' human genome from which deviations are measured.  Likewise, there is no evolution of 'the' human and chimpanzee genome from 'the' genome of a common ancestor.  Instead, perhaps conceptually like event cones in physics, where the speed of light constrains what has happened or can happen, there are descent cones of genomic variation descending from individual sequences -- time-dependent spreading of variation, with time-dependent limitations.  They intertwine among individuals, though each individual's is unique.  There is a past cone of ancestry leading to each current instance of a genome sequence, from an ever-widening set of ancestors (as one goes back in time), and a future cone of descendants and their variation that's affected by mutations.  There are descent cones in the genomes among organisms, among organisms in a species, and between species.  This is of course just a heuristic, not an attempt at a literal simile or to steal ideas from physics!
Light cone: Wikipedia

8.  Descent cones exist among the cells and tissues within each organism, because of somatic mutation, but here the metaphor breaks down: these cones have singular rather than complex ancestry, because within an individual they go back to a single point, the fertilized egg, and across individuals to life's Big Bang;
9.  For the previous reasons, all genomes represent 'point' variations (instances) around a non-existent core that we conceptually refer to as 'species' or 'organs', etc. ('the' human genome, 'the' giraffe, etc.);
10.  Enumerating causation by statistical sampling methods is often impossible (literally) because rare variants don't have enough copies to generate 'significance', significance criteria are subjective, and/or because many variants have effects too small to generate significance;
11.  Natural selection, which along with chance (drift) generates current variation, is usually so weak that it cannot be demonstrated, often even in principle, for similar statistical reasons: if the cause of a trait is too weak to show, the cause of fitness is too weak to show; and there is not just one way to be 'adapted';
12.  Alleles and genotypes have effects that are inherently relativistic.  They depend upon context, and each organism's context is different;
13.  Perhaps analogously with the ideal gas law and its like, phenotypes seem to have coherence.  We each have a height or blood pressure, despite all the variation noted above.  In populations of people, or organs, we find orderly (e.g., 'bell-shaped') distributions, which may be the result of a 'law' of large numbers: just as human genomes are variation around a 'platonic' core, so blood pressure is the net result of the individual action of many cells.  And biological traits are always changing;
14. 'Environment' (itself a vague catch-all term) has very unclear effects on traits.  Genomic-based risks are retrospectively assessed but future environments cannot, in principle, be known, so that genomic-based prediction is an illusion of unclear precision;
15.  The typical picture is of many-to-many genomic (and other) causation for which many causes can lead to the same result (polygenic equivalence), and many results can be due to the same cause (pleiotropy);
16. Our reductionist models, even those that deal with networks, badly under-include interactions and complementarity.  We are prisoners of single-cause thinking, which is only reinforced by strongly adaptationist Darwinism that, to this day, makes us think deterministically and in terms of competition, even though life is manifestly a phenomenon of molecular cooperation (interaction).  We have no theory for the form of these interactions (simple multiplicative? geometric?).
17.  In a sense all molecular reactions are about entropy, energy, and interaction among different molecules or whatever.  But while ordinary nonliving molecular reactions converge on some result, life is generally about increasing difference, because life is an evolutionary phenomenon.
18. DNA is itself a quasi-random, inert sequence. Its properties come entirely from spatial, temporal, combinatorial ('Boolean'-like) relationships. This context works only because of what else is in (and on the immediate outside) of the cell at the given time, a regress back to the origin of life.
. . . you can probably add other facts, curious things about life that are not simply list-like and that at the very least challenge the idea that we can understand genomic causation with current approaches.
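The 'law of large numbers' intuition in point 13 is easy to see in a toy simulation.  The sketch below (plain Python; the locus count, allele frequencies, and effect sizes are all invented for illustration) sums many individually trivial allelic effects and checks that the resulting 'phenotype' distribution is orderly and roughly bell-shaped:

```python
import random
import statistics

random.seed(1)

N_LOCI = 500       # many contributing loci, each with a tiny effect (hypothetical)
N_PEOPLE = 2000

freqs = [random.uniform(0.05, 0.95) for _ in range(N_LOCI)]   # allele frequencies
effects = [random.gauss(0, 0.05) for _ in range(N_LOCI)]      # individually trivial effects

def phenotype():
    # Two allele 'draws' per locus; the trait is just the sum of all the tiny effects.
    total = 0.0
    for p, e in zip(freqs, effects):
        copies = (random.random() < p) + (random.random() < p)  # 0, 1, or 2 copies
        total += e * copies
    return total

values = [phenotype() for _ in range(N_PEOPLE)]
mean = statistics.mean(values)
sd = statistics.stdev(values)

# For a normal ('bell-shaped') distribution, ~68% of values fall within 1 SD of the mean.
within_1sd = sum(abs(v - mean) < sd for v in values) / N_PEOPLE
print(round(within_1sd, 2))   # close to 0.68: order emerges from polygenic noise
```

The point is not the particular numbers, but that an orderly population distribution requires no single orderly cause.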

Is there an analog of 'complementarity' or something equivalently important missing?
These facts are, to paraphrase Einstein on strange phenomena in quantum physics, 'spooky' if you think about them in terms of normal ideas about life, or even just about genes.  They are far from the idea of DNA as a linear code or 'the' blueprint for life, or even as a source of 'information' read off the way one reads a sentence in an email message.  Yet, generally, we explain biological causation with statistical descriptions of the above sorts of phenomena, based on sampling and enumeration studies -- but even huge studies of hundreds of thousands of people and millions of genomic loci aren't getting us very far.

We do, of course, have a huge array of experimental ways of using reductionist approaches to understand many sorts of processes--transcription, physiological reactions, translation, countless others.  We use animal and cell culture models, with many fine results where reductionist approaches are in order or suited to our objectives.  Each gives us something of a view of biological causation.  But often, if not usually, these views lack asymptotic precision--and the imprecision is more than just measurement error.

Even in most of these instances, and especially at higher levels of observation, we currently have no theory that is remotely comparable to the fundamental theories of chemistry and physics.  There are general evolutionary and population genomic patterns that may even be widely observed, but the patterns are basically empirical rather than predicted by some sort of 'laws' comparable to those of physics.  It's not even clear how deeply we understand how things work.  We can observe statistical patterns, but they are not of the rigorous kind of probabilistic processes found in physics or chemistry.  As with, say, relativity, you can ignore the deeper theory unless you approach a critical point (the speed of light, say).  Then you must have a better theory of what's happening and a better way to assess it.  Perhaps we have reached such a point in our desire to make precise predictions about genomes, a kind of limit to the utility of enumeration-based thinking.

If countless, ephemeral variants make individually minor but collectively substantial contributions to traits, enumeration and statistical significance criteria simply won't work, even if the effects are real.  Similarly, the number of interactions even in biologically simple reactions is typically (and obviously) vastly greater than can be characterized or identified by sampling and standard statistical analysis.  Everyone knows this.  But we don't yet have anything comparable to 'entropy' or 'statistical mechanics' to deal with this adequately--for example, to make precise predictions.  Yet the standard view, not based on any profound creativity, is that what we need is bigger enumerative studies--'Big Data'.
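A hedged sketch of this significance problem: simulate many loci, every one of which truly affects the trait, and count how many clear a Bonferroni-style significance bar in a naive per-locus test.  All the sample sizes and effect sizes below are invented; this is an illustration, not a power analysis.

```python
import math
import random

random.seed(2)

N_LOCI, N_PEOPLE = 300, 2000
beta = math.sqrt(0.004)   # per-locus effect: each locus truly explains ~0.1% of variance

# Genotypes: 0/1 at each locus, frequency 0.5 (hypothetical)
genos = [[random.choice((0, 1)) for _ in range(N_LOCI)] for _ in range(N_PEOPLE)]
# Phenotype = sum of all the real, tiny genetic effects plus environmental noise
pheno = [sum(beta * g for g in person) + random.gauss(0, math.sqrt(0.7))
         for person in genos]

def corr(xs, ys):
    # Pearson correlation, computed from scratch to keep the sketch dependency-free
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Bonferroni-style bar: |r| needed for two-sided p < 0.05 / N_LOCI (z ~= 3.8)
r_crit = 3.8 / math.sqrt(N_PEOPLE)
hits = sum(abs(corr([person[i] for person in genos], pheno)) > r_crit
           for i in range(N_LOCI))
print(hits, "of", N_LOCI, "truly causal loci reach 'significance'")
```

By construction the loci together explain a substantial fraction (~30%) of the trait's variance, yet locus-by-locus enumeration almost entirely misses them.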

Is there something missing at a basic level?  The list above suggests that this may be so.  In many ways we may not even be asking well-posed questions.  Would a true conceptual change of some kind lead us to the kinds of predictive uses of genomic data that are being promised?  Will it lead to a serious new theory of genetics?  Is such a change even in the offing, or will we just plow ahead with very expensive, sausage-grinding normal science indefinitely?

It is easy to think of the strange facts and argue that a transformative insight will put them all in place. And of course it is always possible that the majority view is simply correct, that we understand life well enough already, basically just needing more data, more computational power, and revised statistical tweaks.

Our personal feeling is that we are ripe for something radically new, something that will make many facts that don't now fit the current paradigm fall into place. In reality, right now, most bets -- and almost all scientific momentum, the way science works, and the way careers are built -- are in the business-as-usual, normal-science mode. Can deeper thinking change that? People aren't yet asking well-posed questions, and we think not enough even recognize that there's a problem.

What we've tried to do here is suggest reasons why we think a change in how we view the role of genes in biology may be overdue, and to trigger readers to think seriously about what that might be.

Thursday, March 28, 2013

Just like pornography!

In a famous obscenity case, Supreme Court Justice Potter Stewart said he couldn't define 'pornography' but he knew it when he saw it (that was before the internet, so he actually had to do some work to get his, um, exposure, so to speak).

There is something similar in relation to scientific explanations that really have transformative power:  when it happens, you may or may not be able to explain it in its details, but you recognize it.

Goya, The Nude Maja
Every day in evolutionary and biomedical genetics, we see a stream, indeed, a flood of reviews, overviews, commentaries, papers, blog posts, and op-eds promising progress on understanding biological complexity.  Papers with sentences like
"Here we develop a model that shows how considering [sequence, systems biology, epigenetics, copy number variation, evolution, new functional equations, neuroimaging, high throughput analysis, new 'omics' data, methylation, acetylation, ..... (pick your favorite)] yields major advances in understanding the biology of complex traits and diseases.  Our method ....."
But, then, where is all this promised progress?  It might not exactly be obscene, but are such bevies of claims just posturing and careerism, what we are forced to do to succeed in academic careers these days?  That may be an apt way to put it, because what we really see these days is incremental change, some of it progress but most of it trivial or useless, yet fed by the constant pressure for more large-scale, high-powered computational this or that.  And that leads to all the self-congratulation.  But it's about as paradigm-shifting as Goya's Maja is pornography.

If you think about the major advances in science that by most counts really were progress, the so-called revolutions or real (rather than self-flattering) paradigm shifts that have happened in science, from Galileo, to Darwin/Wallace, Einstein, the discovery of the nature of DNA sequence, or continental drift, these changes were very similar:
1.  Many diverse things that had been given separate, forced, or hand-waving explanations fell dramatically and quickly into place;
2.  This was almost instantly recognized;
3.  The new ideas were conceptually very simple.
The new theory may have involved some technical details, like fancy math or biochemistry and the like, but the ideas themselves were over-arching, synthesizing, and simplifying.

As Thomas Huxley famously proclaimed after learning Darwin's explanation of the mechanism for evolution:  "How extremely stupid not to have thought of that!"  

Once you see it, you realize its import.  In genetics and biomedicine today, people are always saying it... but we're not yet seeing it.

Monday, January 28, 2013

Genomic analysis results: understanding, or Fairy Dust?

We are daily seeing claims of major discoveries from genomics and other 'omics' kinds of studies.  These are being proclaimed by their investigators as if they have waved a magic wand and solved critical human problems 'urgently' in need of solution.

Yet many realize that GWAS and other 'omics', or idea-free, methods have provided a much lower yield than was promised or expected.  This is expressed in terms like 'hidden heritability', referring to the familial clustering that should be genetic but for which specific genes cannot be found, or for which only many individually trivial contributing genome regions are identified.  In fact, this is what we should have expected, based on long-standing evolutionary theory and ideas about genetics.  We've posted many, many times about this.

The evidence is consistent.  Many genes interact to produce biological traits, in humans as well as other species including yeast and bacteria, and plants.  These genes have to be regulated to control the timing and amount of their expression in cells, and gene regulation involves many interactions among genes and other DNA regions where regulatory proteins bind.  Each of the functional DNA regions that are involved is subject to mutation that, if not lethal, can circulate in the population over generations.

This is known as 'polygenic' variation.  The word simply means a great many contributing genetic elements that mainly have individually tiny effects.  Findings from GWAS and other types of studies consistently point to evidence for just this kind of polygenic control.  But the frustrating thing (for proponents of genetics-are-everything and of personalized genomic medicine, etc.) is that with many individually trivial contributions, each person's genotype is different and each case of the 'same' disease is due to different genotypes and/or environmental exposures.
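The 'same disease, different genotypes' point can be made concrete with a toy liability-threshold model (a sketch; the locus count, allele frequency, and case fraction are all invented).  Everyone above a risk-allele burden threshold is 'affected', yet any two affected individuals share only a small fraction of their risk alleles:

```python
import random

random.seed(3)

N_LOCI, N_PEOPLE = 200, 10000
RISK_FREQ = 0.1   # hypothetical frequency of the risk allele at each locus

# Each person's genotype: the set of loci at which they carry a risk allele
genos = [frozenset(i for i in range(N_LOCI) if random.random() < RISK_FREQ)
         for _ in range(N_PEOPLE)]

# Liability-threshold model: the 5% with the highest risk-allele burden are 'cases'
cases = sorted(genos, key=len, reverse=True)[:N_PEOPLE // 20]

# Jaccard overlap of risk-allele sets between successive pairs of cases
pairs = list(zip(cases, cases[1:]))[:100]
mean_overlap = sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)
print(round(mean_overlap, 2))   # well below 0.5: same 'disease', different genotypes
```

Each case got to the same clinical endpoint, but via a largely different collection of contributing variants, which is exactly why case-control enumeration struggles.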

At the same time, major mutational changes in contributing genes can yield a serious effect that proper analysis can assign to that specific gene.  Our methods identify these, and we generally refer to their effects as 'Mendelian'.  These are often due to changes that inactivate ('knock out') the gene.  This success in easily identifying the large causes, while only problematically finding the small ones, suggests that the reason there appears to be so much genetic control (reflected in measures like family correlation or heritability) is simply what we think it is: traits really are polygenic.

But can life be that complicated??
In the face of this apparent complexity, many argue that life can't really be that complex.  One may feel that it's just not plausible that hundreds or thousands of genes can be the explanation for traits that show orderly value distributions in populations.  That orderliness, the relatively orderly nature of evolution, and the fact that a trait can be knocked out by a single gene might all be seen as indicating that life must have evolved our complex traits in a way that is not so complex after all.  We're just not understanding--yet!

The usual approach to this view is to argue that we just need longer, larger, costlier studies, or more kinds of 'omics' approaches--like epigenetics, copy number variation, nutrigenomic, microbiomic studies and the like.  Then, this view goes, we'll (whew!) finally identify essentially all the causal elements.  But if things really are polygenic, this may be hopeless.

But do we really know what's going on, whether or not causation is totally enumerable as the current belief system holds?  We know this belief system is based in part, perhaps in major part, on the kinds of professional vested interests and paucity of better ideas that we often write about here.  One way to view this is simply to assert, yet again, that for nearly a century we have had the right kinds of knowledge and the right interpretation, even if we lacked the technology to document those ideas, and that recent technologies are showing just what we expected to find.  Despite resistance to the contrary, the idea that we can reduce complexity to simple genetics, or even omics, is largely based on wishful thinking.

But what about an alternative?

Fairy Dust?
Source: Wikimedia
Suppose this is one of those situations in which we are documenting the hell out of trivia, because our theoretical understanding leads us there--we try to force the current 'paradigm' to fit facts that really don't fit!  But perhaps not only are the wishful-thinkers wrong, but so are those of us who have been arguing that we see what we expected to see and that, alas, life really is as complex as our polygene theory suggests it is.

If both sides of these issues are wrong, perhaps there is some other explanation for what seems like the tractable theory of life's coherence, something other than many tiny contributing factors.  Could there be some force or factor--call it 'Fairy Dust'--that we simply have not discovered but that underlies what we are struggling to understand?

Such factors would be analogous to those that were discovered in other 'paradigm shifts', or revolutionary changes in scientific gestalt, that have happened over time.  We might refer to quantum effects, dark energy or dark matter, gravity particles, and so on as exemplars of such factors in other sciences. It could be some kind of 'field' or 'force' whose nature we don't know of or even suspect.  Or just another way of thinking about what we know already.

One can never deny the possibility that Fairy Dust exists.  But neither can we propose studies to find it as if we knew it existed.  Given human nature and the history of science, it's understandable that we'll press ahead, ever more intently, trying herd-like to force things to fit our theory, or trying to outwit it, the way that's par for the course now, until someone somehow stumbles on the insight required to identify the fairy dust and improve our biological explanations.  This is just how Thomas Kuhn described the way science works (as we posted about yesterday).

But don't hold your breath, because to us, right now, it does not seem that our explanations are missing any such thing.

Thursday, January 24, 2013

A 'paradigm shift' in science....or a maneuver?

Thomas Kuhn's 1962 book The Structure of Scientific Revolutions suggested that most of the time we practice 'normal' science, in which we take our current working theory--he called it a 'paradigm'--and try to learn as much as we can.  We spend our time at the frontiers of knowledge, and at some point we have to work harder and harder to make facts fit the theory.  Something is missing, we don't know what, but we insist on forcing the facts to fit.

Then, for reasons hard to account for but in a way that happens regularly enough that it's a pattern Kuhn could outline (even if rare), someone has a major insight, and shows how a totally unexpected new way to view things can account for the facts that had heretofore been so problematic.  Everyone excitedly jumps onto the new bandwagon, and a 'paradigm shift' has occurred. Even then, some old facts may not be as well accounted for, or the new paradigm may just explain issues of contemporary concern, leaving older questions behind.   But the herd follows rapidly, and an era of new 'normal science' begins.

The most famous paradigm shifts involve people like Newton and Galileo in classical physics, Darwin in biology, Einstein and relativity, and the discovery of continental drift.  Because historians and philosophers of science have in a sense glamorized the rare genius who leads such changes, the term 'paradigm shift' has become almost pedestrian:  we all naturally want to be living--and participating--in an important time in history, and far, far too many people declare paradigm shifts far too frequently (often humbly referring to their own work).  It's become a kind of label to justify whatever one is doing, a lobbying tactic, or a bit of wishful thinking.

Is 'omics' a paradigm shift?
The idea that empiricism (observation), rather than just thinking, was the secret to understanding the world grew out of the Enlightenment period in Europe, starting about 400 years ago.  But pure empiricism--just gathering data--was rejected in the sense that the facts were supposed to lead to theoretical generalizations, the discovery of the 'laws' of Nature, which is what science is all about.  This led to the formation of the 'scientific method': forming hypotheses based on current theory, setting up studies specifically to test each hypothesis, and adjusting the theory according to the results.

The 17th-19th centuries were largely spent gathering data from around the world, a first rather extensive kind of exploration.  But by the 20th century such 'Victorian beetle collection' was sneered at, and the view was that to do real science you must be constrained by orderly hypothesis-driven research.  Data alone would not reveal the theory.

With advances in molecular and computing technology, and with the complexity of life increasingly documented, the ethos changed in the 'omic' era, which began with genomics.  Now we are again enamored of massive data collection, unburdened by the necessity to specify what we think is going on in any but the most generic terms.  The first omics effort, sequencing the human genome, led to copy-cat omics of all sorts (microbiomics, nutrigenomics, proteomics, .....) in which expensive and extensive technology is thrown at a problem in the hope that fundamental patterns will be revealed.

We now openly aver, if not brag, that we are not doing 'hypothesis-driven' research, as if there is now something wrong with having focused ideas!  Indeed, we now often treat 'targeted' research as a kind of after-omics specialty activity.  Whether this is good or not, I recently heard a speaker refer to the omics approach as a 'paradigm shift'.  Is that justified?

Before we could even dream of genomic-scale DNA sequencing and the like, we must acknowledge, our understanding of genetic function and of the complex genome had perplexed us in many ways.  If we had no 'candidate' genes in mind--no specific genetic hypothesis--for some purpose, such as understanding a complex disease, but were convinced for some reason that genetic variation must be involved, what was the best way to find the gene(s)?  The answer was to go back to 'Victorian beetle collection': just grab everything you can and hope the pieces fall into place.  Given the new technology, there was a feeling of hope that this might help (even though we had many reasons to believe we would find what we indeed did find, as some of us were writing even then).

The era of Big Science
Omics approaches are not just naked confessions of ignorance.  If that were the case, one might say we should not fund such largely purposeless research.  No, more is involved.  Ever since the Manhattan Project and a few others like it, it has not escaped scientists' attention that big, long, too-large-to-be-canceled projects can sequester huge amounts of funding.  We shouldn't have to belabor the point here: universities and investigators, their salaries and careers, have become dependent on, if not addicted to, external grants; and once started down a costly path, one can argue that stopping now would throw away the money invested so far (e.g., current Higgs boson/Large Hadron Collider arguments?).  Professors are not dummies, and they know how to strategize to secure funds!

It is fair to ask two questions here:
First, could something more beneficial have been done, perhaps for less cost, in some other way?  Omics-scale research of course does lead to discoveries, at least some of which might not happen or might take a long time to occur.  After the money's been spent and the hundred-author papers published in prestige journals, one can always look back, identify what's been found, and argue that that justifies the cost. 

Second, is this approach likely to generate importantly transformative understanding of Nature?  This is a debatable point, but many have said, and we generally agree, that the System created by Big Science is almost guaranteed to generate incremental rather than conceptually innovative results.  (E.g., economist Tim Harford talked about this last week on BBC Radio 4, comparing the risk-taking science of the Howard Hughes Medical Institute with the safe and incremental science of the NIH.)  Propose what's big in scale (to impress reviewers or reporters), but safe: you know you'll get some results!  If you compare 250,000 diabetics to 500,000 non-diabetic controls and search for genetic differences across a genome of 3.1 billion nucleotides, you are bound to get some result (even if no gene stands out as a major causal factor, that is a 'result').  It is safe.
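The arithmetic behind that safety is worth a sketch.  The following Python toy (sample sizes and SNP count invented for illustration; real studies test millions of variants) simulates a case-control comparison with no true genetic effects at all, and still produces thousands of nominally 'significant' hits, which is exactly why GWAS adopted a stringent genome-wide significance threshold near 5e-8.

```python
import numpy as np
from math import erf

rng = np.random.default_rng(1)
n_snps = 50_000                  # a modest chip; real studies test millions
n_cases = n_controls = 2_000
freqs = rng.uniform(0.1, 0.9, n_snps)   # same allele frequencies in both groups

# allele counts per group: 2N chromosomes sampled per SNP, NO true effects
cases = rng.binomial(2 * n_cases, freqs)
controls = rng.binomial(2 * n_controls, freqs)

# two-proportion z-test at each SNP
p1, p2 = cases / (2 * n_cases), controls / (2 * n_controls)
pooled = (cases + controls) / (4 * n_cases)        # 4N chromosomes in total
se = np.sqrt(pooled * (1 - pooled) / n_cases)      # 1/(2N) + 1/(2N) = 1/N
z = (p1 - p2) / se

# two-sided p-values via the normal approximation (no scipy needed)
pvals = 2 * (1 - 0.5 * (1 + np.vectorize(erf)(np.abs(z) / np.sqrt(2))))

print(int((pvals < 0.05).sum()))   # roughly 5% of 50,000: ~2,500 false 'hits'
print(int((pvals < 5e-8).sum()))   # genome-wide threshold: almost surely 0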

This is not providing a daring return on society's largesse, but it is the way things largely work these days.  We post about this regularly, of course.  The idea of permanent, factory-like, incremental, over-claimed, budget-inflated activity as the way to do science has become what too many feel is necessary in order to protect careers.  Rarely do they admit this openly, of course, as it would be self-defeating.  But it is very well known, and almost universally acknowledged off the record, that this strategy of convenience seriously under-performs, yet remains the way to do business.

Hypothesis-free?
This sort of Big Science is often said to be 'hypothesis free'.  That is a big turn away from classical Enlightenment science in which you had to state your theory and then test it.  Indeed, this change itself has been called a 'paradigm shift'.

In fact, even the omics approach is not really theory- or hypothesis-free.  It assumes, though this is often not stated, that genes do cause the trait, and that the omics data will find them.  It is hypothesis-free only in the sense that we don't have to say in advance which gene(s) we think are involved.  Pleading ignorance has become accepted as a kind of insight.

For better or worse, this is certainly a change in how we do business, and it is also a change in our 'gestalt' or worldview about science.  But it does not constitute a new paradigm about the nature of Nature!  Nothing theoretical changes just because we now have factories that can systematically churn out reams of data.  Indeed, the theories of life that we had decades ago, even a century ago, have not fundamentally changed, even though they remain incomplete and imperfect and we have enormous amounts of new understanding of genes and what they do.

The shift to 'omics' has generated masses of data we didn't have before.  What good that will do remains to be seen, as does whether it is the right way to build a science Establishment that generates good for society.  However that turns out, Big Science is certainly a strategy shift, but it has so far generated no sort of paradigm shift.

Monday, June 14, 2010

The Fired Coach Syndrome

The Fired Coach Gets Hired Again
Well, sports fans, you probably are all familiar with the Fired Coach Syndrome. That's the regular pattern whereby, when a coach is fired because his team doesn't win--even if it's because they haven't got good enough players--he is immediately hired by some other team and treated as their savior. We need not name names, because as one coach said, all coaches have either been fired, or will be fired (with the single exception of our own Joe Paterno, shown at left!).

What has this got to do with genetics, evolution, or any other MT theme? The NY Times has a story--a front page story--on the failure of genome-wide association studies (GWAS) to lead to many disease 'cures'. Big news!

Or is it?

This story is by an op-ed writer who has built his career over many years by serving as a notorious mouthpiece for the genomics 'industry'. Now, without a whiff of contrition, he's telling a new tale, and again making a Big Story of it. But, as before, he does it uncritically, too.

Genomewide Association Studies
It's true that countless millions have been spent on GWAS with little of the long-promised 'cures' to show for it. A few of us have been long-term critics of this approach, for the decade in question, and for legitimate reasons. Complex traits are here today because whatever their genetic basis, it passed the sieve of evolution. For human geneticists, the main interest has of course been the genetic basis of disease, for obvious societal and funding reasons.

But for many decades it has been clear that, while genes contribute substantially to their variation, these traits (normal or otherwise) are not due mainly to single genetic effects.  The evolutionary assumptions implicit in complex disease mapping studies never, except for wishful thinking, predicted other than what we have seen.  And another fact that the NYT story and a recent journal article or two report as curious--that we can predict risk better with a patient's family history than with his/her genotype data--is entirely expected, for reasons that everyone properly trained in genetics should have known, given roughly 100 years of work on the genetics of 'polygenic' traits.
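That expectation is easy to demonstrate with a toy simulation.  In this Python sketch, every parameter (1,000 loci, the effect sizes, heritability near 50%, a hypothetical 'top 10' of mapped loci) is invented for illustration: when a trait is polygenic, a predictor built from even the largest-effect loci, an idealized stand-in for replicated GWAS hits, captures less of the trait than the parents' own phenotypes do, because family history integrates the whole genotype at once.

```python
import numpy as np

rng = np.random.default_rng(2)
n_fam, n_snps = 5_000, 1_000
freqs = rng.uniform(0.1, 0.9, n_snps)
beta = rng.normal(0, 0.05, n_snps)          # many tiny additive effects

def genotypes(n):                           # 0/1/2 allele copies per SNP
    return rng.binomial(2, freqs, size=(n, n_snps))

def transmit(g):                            # Mendelian transmission: pass one
    return rng.binomial(1, g / 2)           # of each parent's two alleles

mom, dad = genotypes(n_fam), genotypes(n_fam)
child = transmit(mom) + transmit(dad)

def phenotype(g):                           # genetic value + equal noise: h2 ~ 0.5
    gv = g @ beta
    return gv + rng.normal(0, gv.std(), len(gv))

y_mom, y_dad, y_child = phenotype(mom), phenotype(dad), phenotype(child)
family_history = (y_mom + y_dad) / 2        # mid-parent value

# 'GWAS hits': the 10 loci with the largest true effects (an idealized best case)
top = np.argsort(np.abs(beta))[-10:]
gene_score = child[:, top] @ beta[top]

r_family = np.corrcoef(family_history, y_child)[0, 1]
r_score = np.corrcoef(gene_score, y_child)[0, 1]
print(r_family > r_score)                   # True: family history predicts better
```

The point is not that genotypes carry less information in principle; it is that a handful of enumerable hits cannot substitute for the aggregate polygenic background, which family history samples for free.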

The Evolution of Complexity
Evolution generates redundancies, removes nasty variants, and generates a distribution of mutation effects and allele (genetic variant) frequencies that will not, as a rule, lead to common, chronic, late-onset, or complex traits being caused by just a gene or two (and here we're not even considering the wild card: environmental effects, which are usually substantially more important than the genetic ones). But that doesn't mean there would be no identifiable contributing genes: for most traits some alleles, in some genes, in some populations, will have marked effects. Far from a complete 'bust', that, too, is what we find, and it was predictable.

From countless GWAS and other approaches to enumerating the genetic causes of complex traits, we know of a large number of contributing genes, say around 1000. But surprisingly few of these contribute replicably, or with high probability and nontrivial frequency. What we see today is what everyone should have known to expect, and a few of us were saying so in papers at the beginning of the decade in question.

Selling the Brooklyn Bridge--Again
So if there's no news in this news, why does the former hawker for the vested genomic interests now have the creds to write as if insightfully about the current state of affairs? It's by consulting the very people who, mainly knowingly, lobbied and hyped everyone into the GWAS era. Why should he be getting assessments of the situation from those whose mouthpiece he was for an approach that didn't work, and for predicted reasons? The same vested interests are now advocating their new grand theories, such as that we'll finally solve the problem in terms of (take your pick) copy number variation, epigenetics, whole genome sequences, large biobanks, many very rare alleles with strong effects, combinations of common alleles with modest effects, etc.

Why should those who sold us the old Brooklyn Bridge be listened to when they advocate these new ideas which, not incidentally, require larger, longer, more costly studies (for their own labs?). They are not giving us reasons, just hand-waving invocation of current fads or Plan B's. Naturally, they like the idea of locking in funding for the rest of their careers. But would people have any interest at all in buying this new Brooklyn Bridge?

The FCS
The answer is the Fired Coach Syndrome. Somehow, we tend to believe that these people with known rap sheets are still the experts. Their past statements and advocacy are not taken to be the lobbying that they actually were. Now, they're being listened to as the coaches of the new Team GenomeSequence.

But there are deeper and more serious issues. Unfortunately, they are reflected in the latest Big Story now being marketed in this Times article and elsewhere, which continues to suffer from uncritical hyperbole and oversimplification.

We Already Knew All This
The fact is that after a few GWAS (and other kinds of studies of variation underlying single-gene traits), many years ago, we confirmed the theory we had about evolution and complex genetics (which goes back nearly 100 years). We should have declared our knowledge firm, and thought of better ways to understand complex traits. But too many vested interests, without better ideas, demanded the big GWAS funding. In that sense GWAS and its fellow-travelers have generally been a very expensive bust.

But here we have to again criticize the attempt to make this a new Big Story. Because we have learned a lot, even if at great cost, about genetic control. Huge and useful data bases have been created. Technology has been developed. DNA sequencing is now about as cheap as your HDTV set. Many corporations have flourished--money for their stockholders, and jobs for employees. Much has been learned about genomes of many species, and their variation. Despite private greed, much or most of these data are freely available to anyone. Genetics has flourished perhaps like no science ever before.

Maybe the funds should have been spent in other ways, but they have led to much new knowledge. This doesn't mean that 1000 new promising targets for Pharma have been revealed, as Francis Collins and Eric Lander enthusiastically claim: instead, most of those genes are trivial contributors. Still, these investigators have sponsored or done technically excellent work, contributing to the huge public databases and to new understanding of genomic functions of many kinds. They've done this while, largely unrecognized even by themselves, showing that classical theory was right--and that is a substantial baby in the bathwater.

There are important questions about complex causation, which science can address, especially if we could decide to confront complexity on its own terms rather than promising to reduce it to a manageable number of enumerable causes as has been done to date and is essentially still being done.

However, the reportage errors continue. Despite the waste and the catering to vested political interests, the Times article's main claim to Big Story status--the fact that we haven't developed many 'cures' for complex disease--is irrelevant to whether GWAS was a scientific success.

Disease is complex, organisms evolved to resist tinkering from the outside--especially tinkering with our genomes--and attacking diseases genetically is a difficult engineering feat. Whether, when, or how that will be done successfully is not yet clear, but nobody has any right to expect it to be rapid. It's a bum rap on GWAS to say they failed because the gene engineers haven't yet figured out how to use the results to make cures. The real rap, and what should discredit much of the hype machine, is that they promised 'cures' (in fact, and to be fair, some of the main advocates--though not Francis Collins--have been at least slightly circumspect, warning that this is all for our grandchildren or beyond, even if such caveats were declared in passing and did not temper their lobbying for GWAS and related resources).

Bigger, Longer, Pricier More-of-the-same
However, the Fired Coach Syndrome suggests that we should be very skeptical about the same writers and scientists who are now deftly expostulating their latest grand theories, paradigm shifts, and new strategies, because these are largely rationales for more-of-the-same, except at larger scale, longer time frames, and greater cost. What we should be doing is understanding where the baby is, not just fostering existing self-interest by saying there was no baby in the bath so let's run a lot more bathwater and maybe we'll find triplets in there someplace.

One thing that some are advocating is to search genomewide sequence data from patients, to find clearly harmful mutations in functional genome regions that are known to be related to the physiology in question--such as rare 'knockout' mutations in those genes which might be inferred to be causal and then tested experimentally. Whether this justifies the continued use of mega-scale genomic approaches is debatable. Though one can predict there will not be a huge bonanza of findings with much therapeutic value, there certainly will be some, and at least to some extent this genome-screening approach is very different from the statistical-evolutionary basis underlying GWAS. The reason is too complex for this post, but the gist is that such studies will finally rest on biology rather than baloney, even if it's already being over-sold in the usual lobbying way.

Today's PT Barnums
As to Fired Coaches, it must be said that high level expertise and capability is not to be found everywhere, and our genetic PT Barnums got there largely because of both management and scientific skill. Once the research community has its hands on funds, and labs have been built, it's politically difficult if not impossible to shift from this entrenchment to new investigators or unrelated approaches. Science is part of society and works by evolution, which includes careerism. Scientific revolutions can't be ordered up by the media or by funding institutions with their turf-protecting bureaucrats: they just happen when and where they happen. So in that sense we must in practice rely on the same cast of characters, though whether we should try to move gradually away to fresh approaches is a legitimate question.

In science, we should at least be aware of what's afoot, because otherwise resources continue to be diverted to less rather than more effective use. But can we learn to temper our claims--or at least penalize those who don't? Can the system demand accountability for results if we promise them? Can we establish full-stop criteria for large projects once it's clear that they aren't bearing fruit? Probably not, because the real name of the game is getting funding. To a substantial extent, scientific facts themselves are secondary. That's the anthropological truth. Or perhaps, as Marshall McLuhan said decades ago, in modern science the medium really is the message. And now we are clearly going to hire the same coaches to guide us into the future!

Maybe we have no choice. But we should be aware that's what we're doing. And if you believe their new pronouncements or their media megaphones, then we have a bridge that we'd like to sell you. And, by the way, Joe Paterno has stayed in the national rankings and just signed a new contract....at age 83.