Thursday, October 9, 2014

Lots of people commented or tweeted about our post yesterday on the latest stature genomics paper. It's true that we think tons of money are being thrown at chasing rainbows such as the genomic basis of stature; we said what we felt about that yesterday.
But people often defend current practice reflexively, including on Twitter, and so avoid having to face up to the serious need for new and genuinely creative thinking.
I'm pretty clumsy at Twitter, and too verbose, but I think the issues demand more than 140-character exchanges anyway. Such exchanges can make the issues seem dismissible or trivial, when from a biological point of view they are anything but.
We did one good deed for the genetics community: we provided some great sport, as defenders of the faith tweeted, re-tweeted and re-re-tweeted dismissive and often derisive comments. Circling the wagons is reinforcing, and heresy is never well-received. Still, the issue remains.
The sneering aside, there are a number of real issues
It is strange that not a single objector provided even one reason why our comments were in any substantial error, or why continuing the vain attempt to enumerate the thousands of trivial causes of stature is a good way to keep investing resources. A characteristic of the objecting tweets is that the tweeters reacted defensively or didn't actually read the post carefully. We were, for example, accused of thinking that the huge number of hits only involved genes in the traditional sense (protein-coding regions). We were actually clear that the word 'gene' is losing its traditional, restricted meaning because we now know there are so many other types of DNA function. Many now use the word quite vaguely, sometimes to refer to a single nucleotide, or more generally to specific spots in the genome; some use it only when claiming that the variant site has relevant function. To criticize us on such a trivial point is to miss the point: if there are figuratively or even literally countless causal contributors, not all of them digital or enumerable, then such semantic details don't matter. What does matter is that we continue to churn out these kinds of results as if we're making progress, when findings of this sort were predicted long ago.
The fact that we mentioned unborn or dead people was criticized, but our point was that the sample used in this study is literally not replicable, and replicability is a fundamental aspect of the kind of science being applied to the data. The sample consists of people who may no longer be alive (or soon won't be), or whose stature will actually change (the ages in the study ranged from 14 to 103; the young ones are still growing, the old ones surely losing height). The human population is turning over in such numbers that the genotypes in one study don't represent what the next study would sample (and here we can set aside the non-trivial detail that this stature study basically included only Europeans).
Quibbling about the details won't change the overall bottom line, and is only a careless or intentional distraction from the real message, which even the sneerers know very well is the truth: when even easily measurable traits like stature show this level of causal (or, mainly, statistical-associational) complexity, and we see similar results for traits in many different kinds of species, including plants, animals and even yeast, then we have a problem! This is not irrelevant complexity.
Calling something complex and arguing that therefore we just need to enumerate causes in ever-larger samples is a way of stalling: circling the wagons rather than recognizing that we now know the lay of the land and it's time to think differently. One might even dare to ask what kind of sense it makes to say that 9,500 sites in the genome contribute to stature (in this current, restricted sort of sample). But that is perhaps too profound a question to raise in polite company.
Even saying, as one tweeter implied, that the study was essentially closed in terms of its causal elements (that is, everything was there, with no new inputs or outputs, and hence enumerable) really means only that risk can be assessed within closed sets of data. But the implication of the paper, and of the authors' reassuring suggestion that the causes were 'finite', is that in general, and in relation to diagnosis or prediction, the causal elements are closed in number, enumerable, and discrete (categorical). These are simply fictions.
So here's an analogy I used in a brief tweet. If we sample some trees and enumerate their branches, we simply cannot say we know how many branches trees in general have. But perhaps we can find ways to understand how branches form, relate to each other, change over the lifetime of a tree, vary among tree species, and so on. Enumerability is not the objective of an adequate theory of branching.
Similarly, we should be asking how it is that so many functional spots in the genome, not the same from sample to sample, can be involved in a seemingly simple trait like stature--with qualitatively similar findings even in far better-controlled, even experimental, studies. What does it actually mean to say that thousands of parts of the genome affect the trait? It probably means something we do not yet understand about the organization of organisms, something that goes beyond enumerability (since clearly open-ended numbers of combinations yield similar results--normal height, blood pressure, glucose levels, brain functions, etc.).
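To make that last point concrete, here is a toy simulation--entirely our own invented numbers (10,000 loci with tiny additive effects), not the paper's estimates--showing how completely different genotypes can yield essentially the same trait value:

```python
# Toy sketch: many tiny additive effects, interchangeable combinations.
# All numbers here are invented for illustration only.
import numpy as np

rng = np.random.default_rng(1)
n_loci, n_people = 10_000, 5
freqs = rng.uniform(0.05, 0.5, n_loci)     # hypothetical allele frequencies
effects = rng.normal(0, 0.01, n_loci)      # tiny per-allele effects (in cm)

# Each 'person': count of effect alleles (0, 1, or 2) at each locus
genotypes = rng.binomial(2, freqs, size=(n_people, n_loci))
heights = 170 + genotypes @ effects        # purely additive toy model

print(heights.round(1))                    # very similar heights...
# ...yet no two 'people' share a genotype:
print(len({g.tobytes() for g in genotypes}))   # -> 5 distinct genotypes
```

In a model like this, open-ended numbers of distinct genotypes land within a centimeter or two of one another, which is the phenomenon the paragraph above describes.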
One might argue that what we can get from mega-sequencing extravaganzas like this is some sense of the 'shape' of causation in various ways, and that's good. True enough, but it is not even clear what one would mean by 'shape', and there comes a time when we have enough post hoc enumerative data and need to go beyond it, because such data are generic rather than specific when it comes either to understanding causation or, more to the point, to prediction. If this work has little predictive value, then it is being misrepresented.
In the case of stature, since we can't predict future environments, we cannot, in principle, predict stature conditional on genotype, except perhaps in the real, often pathologic extremes. Every tweeter who knows his/her genetics knows this.
Since in truth the same thousands-of-sites genotype never recurs (and here we refer only to the constitutive genotype, not the entire somatic genotype we discussed in our post), there is a major limit to our predictive power. We know this even from the many nearly single-locus diseases that have been studied in detail.
The authors' claim of a substantial (if far from complete) accounting of the high heritability of stature is potentially very misleading. They retrofitted a model to a set of data, which describes correlations but doesn't really address causation, and hence prediction, conditional on some mix of environmental exposures of the sampled individuals. Heritability is a population-specific ratio, as everyone knows, so what is actually being accounted for by such a percentage is unclear, and that needs to be said up front. A proper understanding of 'heritability' must also recognize the fundamentally relative nature of that measure, even if all elements at work were measured and samples were large enough to resolve everything sampled, and even if 'the' environment were in some sense a unitary population measure--which it isn't.
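To see why the ratio is population-specific, here is a toy calculation--our own invented variance numbers, not anything estimated from the study:

```python
# Toy illustration: heritability is a ratio, so it is population-specific.
# The variance figures below are made up purely for illustration.
def heritability(var_genetic: float, var_environment: float) -> float:
    """Proportion of total phenotypic variance attributed to genetic variance."""
    return var_genetic / (var_genetic + var_environment)

# Identical genetic variance, different environments, different 'heritability':
print(heritability(40.0, 10.0))  # 0.8 in a uniform, well-nourished population
print(heritability(40.0, 60.0))  # 0.4 where environmental exposures vary widely
```

The same genes, dropped into a more variable environment, 'explain' half as much--which is why a percentage of heritability accounted for says little on its own.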
Retrofitting is one thing, but prediction is another. Prediction is further limited because only a fraction of the presumed contributing genomic elements will ever be seen again, or even have assessable effects within the study itself (because they're too rare)--not to mention lifestyle factors.
Shape is a different thing. If we can get a proper definition of causal 'shape', with some serious theoretical background, we might get somewhere. That is elusive at present in both evolutionary and present-day genomics--or, more properly, in causal biology. Maybe it's time to take these things (and the other strange facts we know about genomics but sweep under the rug) seriously, and to stop laughing and start thinking beyond enumeration.
Various such aggregate approaches to GWAS results have been suggested, but they're mainly additive summaries of retrofitted causal associations, plus hopeful post hoc mention of purportedly relevant pathways.
Better science is needed.
Wednesday, October 8, 2014
The height of folly: are the causes of stature numerous but at least 'finite'?
By Ken Weiss
In 1926, geneticist Thomas Hunt Morgan wrote this about stature:
"A man may be tall because he has long legs, or because he has a long body, or both. Some of the genes may affect all parts, but other genes may affect one region more than another. The result is that the genetic situation is complex and, as yet, not unraveled. Added to this is the probability that the environment may also to some extent affect the end-product." (TH Morgan, The Theory of the Gene, p 294, 1926)

His point, of course, was not about stature per se but about the difficulty of identifying genes 'for' traits, because there are many pathways to a trait, and they aren't all genetic. This was understood eighty-eight years ago, and yet we have had study after study, ever larger, merging smaller studies, with all sorts of fancy statistics to account for various internal complications in genome sampling, and still the results pour forth as if we haven't learned what we need to know about this and many traits like it.
The most recent is an extensive study in (where else?) Nature Genetics, by a page-load of authors. In summary, the authors found, pooling all the data from many different independent studies in different populations, that at their most stringent statistical acceptance ('significance') level, 697 independent (uncorrelated) spots scattered across the genome each individually contributed in a significance-detectable way to stature. The individual contributions were generally very small, but together about 20% of the overall estimated genetic component of stature was accounted for. However, using other criteria and analysis, including lowering the acceptable significance level, and depending on the method, up to 60% of the heritability could be accounted for by around 9,500 different 'genes'. Don't gasp! This kind of complexity was anticipated by many previous studies, and the current work confirms that.
Many issues are swept under the rug here, however--that is, relegated to a sometimes obscure warren of tunnels in the Supplemental information. The individuals were all of European descent, so genetic contributions in other populations are not included. The analysis was adjusted for sex and age. Subjects were all, presumably, 'normal' in stature (i.e., no victims of Marfan syndrome or dwarfism), and all healthy enough to participate as volunteers. The majority were older than 18, but the range was 14 to 103, and 45% were male, 55% female. The data were also adjusted for family relationships.
These are very capable authors--with such a large crew, consisting of more 'authors' than mapped locations in the genome, there must be at least some who know their business (and have actually read and understood the paper they purportedly authored). In all seriousness, however, there is no reason to seriously criticize their methods or their findings. They pooled many different data sources and had to adjust for various sampling and data-quality issues. And it is comforting that the study confirms the earlier projections in general terms (indeed, we think the data from earlier studies are part of the current meta-analysis).
But if there's no reason to question the study itself, there is a deeper issue.
The deeper issue
The authors rather glibly come to what they seem to feel (did they take a poll?) is the comforting conclusion that while the number of causally contributing genes is huge, it isn't 'infinite'. But that is at most technically true, and in fact is farther from the real truth than the authors intended, or perhaps even realized.
First, if by 'gene' the authors meant coding genes, then their 9,500 is nearly half of all such genes in our genome. Of course the definition of 'gene' is vague and debatable these days, so we'll let that problem pass. Still, 9,500 is a lot, and it's just the proverbial tip of the iceberg (see below).
Secondly, stature is affected to a substantial extent by non-sequence-based variables, such as diet, early childhood diseases, maternal nutrition and so on. There is no clear limit to what or how many such influences there are, and they are often quantitative, meaning that exposure can in principle be measured to an infinitesimal degree; hence there are infinitely many exposures and infinitely many statures. This may be less of an issue within Europeans, but if we're talking about human stature generally, we have to recognize the causes not included here.
But, third, there are about 3 billion nucleotide sites in the human genome, each of which can carry any of 4 nucleotides. Each person is diploid, so there are 10 possible site-specific genotypes (AA, AC, AG, AT, CC, CG, CT, GG, GT, TT) at each site, and thus 10 to the 3-billionth power possible genome-wide genotypes. That is more than all the stars, or even all the atoms, in the universe. Of course the vast majority of these possible combinations will never occur in real people, and probably don't affect stature (under any conditions?), but we basically have no way of knowing which those are, or of showing that they don't matter.
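For the numerically inclined, here's that arithmetic as a quick back-of-the-envelope sketch (nothing more):

```python
# Back-of-the-envelope check of the combinatorics above.
from itertools import combinations_with_replacement

# Unordered diploid genotypes per site: AA, AC, AG, AT, CC, CG, CT, GG, GT, TT
per_site = list(combinations_with_replacement("ACGT", 2))
print(len(per_site))  # -> 10

sites = 3_000_000_000
# 10 choices per site means 10**sites genome-wide genotypes: a number with
# about 3 billion digits, versus roughly 10**80 atoms in the observable universe.
print(f"10**{sites:,} possible genome-wide genotypes")
```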
Well, fine, you say, that's a really big number of contributors--but it's not infinite! That isn't true.
Actually, it is infinite!
By saying the number of causes of stature variation is finite, the authors essentially mean that the causes are enumerable and that there is an end to the counting, by which time all causes of stature will have been accounted for. That is simply false.
There are deletions and insertions of open-ended location, number and size in our genomes. There is no known upper limit to genome size (as some huge-genomed species show), and hence no limit to the number of insertions that may occur sometime or somewhere. And, as Morgan recognized so long ago, there are many ways to be short or tall. Not all genome sites need have an effect on stature, but we don't know which do, and the more we look the more we find; so in practice the number of genomic influences on stature, not counting environmental factors, is not limited or countable: it can keep on growing indefinitely. It is literally not finite, but infinite!
One must also remember that by far the greatest effects on stature are not even being counted! These studies only include adults in a 'normal' range at normal adult ages (old people, for example, lose height substantially as their posture sags and their intervertebral disks shrink). The authors report that across their age range they didn't see dramatic mapping differences, but in pooled studies with different numbers at different ages, this has to be a weak test. More important, however, is that different cohorts, decades apart, have very different statures. In Japan, for example, mean height has changed by several inches in the last 75 years, and of course in most of the world stature is impaired by chronic infectious disease, nutritional deficits, and the like. Thus sex and cohort (when you are living--now vs the Middle Ages, for the Europeans who are the subjects of this study) are regressed out and don't count, and the effects being reported are just relatively small modifications of these major causes. So when one speaks of cause, this study is retrofitting genomic data to existing individuals, which is by no means the same as using genomes to predict the stature of future individuals--and that raises questions about the finiteness of the whole enterprise.
Further, even if a mere 10,000 positions in the genome affect stature, there are still 10 to the 10,000th power possible contributing genotypes, vastly more than the number of people alive today (large, but finite), and not measurable even in principle in people who have died or are yet to be born--many millions of such events having occurred since the studies in this paper were done.
Most of the causal variant sites are likely to be exceedingly rare in the population, because most variants are due to very recent mutations. That means, first, that they aren't in the authors' samples, and likely won't be in anybody's samples, even if some bewildered agency keeps funding such studies. And their effects won't be detectable, because a very rare variant, even in samples as large as these, cannot generate a statistically significant result.
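A rough power calculation makes the point. This is our own hypothetical sketch: the sample size, allele frequency and effect size are invented, and the test is a generic 1-degree-of-freedom association test, not the paper's actual analysis.

```python
# Hypothetical sketch: why a very rare variant can't reach genome-wide
# significance even in an enormous sample. Uses the standard noncentral
# chi-squared power approximation for a 1-df additive association test.
from scipy import stats

def gwas_power(n: int, maf: float, beta_sd: float, alpha: float = 5e-8) -> float:
    """Approximate power: n people, minor allele frequency maf,
    per-allele effect beta_sd (in phenotypic standard deviations)."""
    ncp = n * 2 * maf * (1 - maf) * beta_sd ** 2   # noncentrality parameter
    crit = stats.chi2.ppf(1 - alpha, df=1)         # genome-wide threshold
    return stats.ncx2.sf(crit, df=1, nc=ncp)

# 250,000 people, a 1-in-10,000 allele, a modest 0.1 SD per-allele effect:
print(gwas_power(250_000, 1e-4, 0.1))  # effectively zero power
```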
Worse, you have roughly a trillion cells in your body. All their genomes descend from the dual set of chromosomes in the fertilized egg with which you started life, but each cell division creates a few new mutations that, if not lethal to that cell, are transmitted to the cell lineage it founds. So you have 6 billion nucleotide sites, each potentially variable in any unspecified number of a trillion cells. Your 'genotype' is the aggregate of these possible variants--far more potential variation than exists among the 7 billion living humans, each of whom is likewise the net genotype of his/her trillion cells. Since cells in a person are constantly dying and being lost while new ones are produced, not even a single person's genotype is in any sense countable or enumerable. And again, we have no way to know which variants in which cells in which tissues (in which lifestyle and other environments) affect stature.
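The orders of magnitude are easy to sketch; the mutations-per-division figure below is an assumed round number for illustration, not a measurement:

```python
# Rough orders of magnitude for the somatic-variation point above.
# The per-division mutation count is an assumed round number.
cells = 1e12                 # ~a trillion cells in an adult body
mutations_per_division = 3   # assumed few new mutations per cell division
divisions = cells            # building 10**12 cells takes ~10**12 divisions
somatic_mutations = divisions * mutations_per_division   # ~3e12 in total
diploid_sites = 6e9          # nucleotide positions in a diploid genome

# On average, each genomic site is mutated anew in hundreds of your cells:
print(somatic_mutations / diploid_sites)  # -> ~500
```

On these assumptions, essentially every site in the genome is mutated somewhere in your body--which is the sense in which even one person's 'genotype' isn't a single enumerable thing.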
Well, fine, you say, but the number of causes is at least manageable. But that isn't true, either!
And not even countably infinite!
We can push this even further. In the 1800s the German mathematician Georg Cantor pioneered the study of infinity. He identified various levels or types of infinity. The smallest, if such a term can be used, is countable infinity. Like the integers 0, 1, 2, ..., one can count or enumerate such a set, even if one never reaches the end. Further, one can pair its members with another countable set: for example, the even and the odd numbers form equal-sized infinite series, and can be paired 1-2, 3-4, 5-6, ..., one by one, without limit.
But then there is a next level: uncountable infinity. The set of 'real' (decimal) numbers between 0 and 1 is like that. One cannot match them up with a countably infinite set such as the integers. Cantor's famous diagonal argument shows why: given any proposed pairing of integers with decimals--say 1 - 0.01, 2 - 0.02, 3 - 0.03, and so on--one can always construct a new decimal that differs from the first listed number in its first digit, from the second in its second digit, and so on down the diagonal. That new number appears nowhere in the list, so no list can ever be complete. The reals will always be 'ahead' of any countable pairing we try.
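Here is the diagonal trick as a minimal sketch; the 0.01, 0.02, ... listing stands in for any attempted pairing, and the whole thing is purely illustrative:

```python
# Minimal sketch of Cantor's diagonal argument. Given ANY attempted pairing
# of integers with decimals in [0, 1), build a decimal missing from the list.

def diagonal_counterexample(digit_of, n_digits=10):
    """digit_of(i, j): j-th decimal digit of the i-th listed number.
    Returns (the start of) a decimal that differs from every listed number."""
    new_digits = []
    for i in range(n_digits):
        d = digit_of(i, i)                      # i-th digit of the i-th number
        new_digits.append(5 if d != 5 else 6)   # choose any different digit
    return "0." + "".join(map(str, new_digits))

# The attempted listing 0.01, 0.02, 0.03, ... from the text:
def digit_of(i, j):
    num, den = i + 1, 100                       # the (i+1)/100 listing
    return (num * 10 ** (j + 1) // den) % 10

print(diagonal_counterexample(digit_of))  # 0.5555555555 -- nowhere in the list
```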
This is a second level of infinity. If you think of DNA as a sequence of nucleotides, even with some of the complications mentioned above, you might suppose the contributors to stature were at most countably infinite, one per nucleotide. But it's worse. Stature itself is infinitesimally measurable, significance cutoff values are infinitesimally divisible, and DNA is not just a digital sequence: it has four-dimensional properties that involve other molecules (epigenetic modification, for example), levels of gene expression depend quantitatively on the efficiency of binding of regulatory proteins, and what happens in one part of the DNA depends on what is happening in other parts at a given time. For such reasons, stature does not come in a merely countable number of values, and so neither can its causes. From a causal point of view, stature is uncountably infinite.
Another way to see this is that even with the countably infinite (or even with the authors' claim that stature is causally 'finite') we cannot discriminate in the proper causal way. That is because open-ended, unknown (probably unknowable) numbers or sets or combinations of causes can yield the same stature value (which, of course, changes with age, historical period, and sex). We cannot infer genomic cause from stature (an inference I've called Detectance in the past). Nor can we predict stature from genotype (the Predictance of a genotype), because there is more to stature than genes, and there is a continuous range of possible stature values for a given genotype (just as we won't be able to predict an individual's facial appearance from genes, but that's another story). And here we're not even considering computing power, measurement error, and the like. Nor can the prediction be done in the usual statistical way, because that requires replicability, yet each person's genotype (as estimated from a blood sample, not to mention from all his/her cells) is unique. And each person's stature changes during life.
What to do
In essence, we're not just playing word games here. The number of causes, even just the genetic causes, of stature variation is truly infinite. It is misleading of the authors to try to reassure readers that at least the number is finite. That is in essence a tactic, perhaps inadvertent, that justifies the enumerative form of business as usual.
In fact, the causes of stature have no limit (at least not until the death of our species). That means we basically cannot understand such traits comprehensively by enumeration or cause-counting. Stature is no more enumerable than, say, the values of a magnetic or gravitational field, which differ at every infinitesimally small location.
The wise TH Morgan realized these issues, in early 20th-century form, without needing the expensive and extensive data we are now amassing. But his statement was generic, and one might say it called for confirmation. We have had many other sorts of confirmation for similar traits, but perhaps what's been reported for stature closes the book on the basic question.
So now, if the science is to advance beyond a pretense of causal enumerability, what we need is to develop new, quantitative rather than enumerative causal concepts. How to do that is unknown, unclear, debatable... and in our business-as-usual environment, probably unfundable.
Tuesday, October 7, 2014
"The Book that Saves the World": When is a book legitimate science?
By Ken Weiss
Yesterday in the mail, I received a copy of a book called Is It To Be: Terminal Alienation or Transformation for the Human Race, written by one Jeremy Griffith, described in the book as an Australian biologist of unspecified (if any) professional affiliation. The book arrived at my office address without my having ordered it (or paid for it), and the package included an advertising poster for the book and a long letter to me (clearly addressed from some mailing list, because it went to the wrong department). The letter was from the author of the Preface, one Harry Prosen, MD. It seemed on the face of it to be tailored to me, since it mentioned me and my wife, co-blogger and co-author Anne (misspelled), with some quotes of ours embedded in the 3-page letter. Apparently Griffith has established what he calls the World Transformation Movement to help fill what he asserts is an urgent need.
The book is quite long, at 639 jam-packed pages littered with breathless font changes--lots of italics, bold-face, and underlining. It purports to argue that we have (that is, the author has) finally understood the human condition based on biology rather than religion, and further, that we finally know what is needed to rescue our species before it is too late. The countless quotes are from, dare I say, reputable scientists, as well as from popular culture. Except for the personalized nature of the tome, it seems to want to give the appearance of being a work of science, based on research. On this basis the author proffers a 'Transformed Lifeforce Way of Living' that the Movement advocates. But what this actually is, is buried so thickly in word salad as to be inscrutable to me (if you doubt this, the book is available online, and its web pages are easily found).
Someone apparently funded this book's printing and mass mailing, of which I am surely but one of countless unsuspecting recipients. While sent to scientists and purporting to be based on science, it is easy to see it as fitting a pattern--the breathless font-changing style, the miscellany of quotes and sources, the claim that the author has found the answer, and so on, all written as if the author's resulting deep insight is transformative. But a tipoff to the kind of book this is, is that there is no index, and it was published by his own Movement.
It sounds wacky and is totally nonconforming in its treatment of science (or even philosophy). Its rambling message is largely impenetrable, and easy to dismiss reflexively... indeed, it fits into a category.
Over the years, I've received many such unsolicited books. They have included the hugely massive and beautiful Atlas of Creation, by the Turkish author Harun Yahya; two pro-eugenics books, including one on Jewish inherent superiority, by John Glad (Jewish Eugenics; Future Human Evolution: Eugenics in the Twenty-First Century); and Philippe Rushton's classically racist tract (Race: Evolution and Behavior). But not all of these kinds of books are distributed freely. There's Michael Behe, who hasn't sent out freebies to my knowledge but is well-known for his views on the non-evolution of biological complexity. And there are many of the standard Christian Creationist or Fundamentalist books in the mix.
These authors usually have no, or no relevant, academic appointment, though there are exceptions (e.g., Rushton, Behe). Their funding, if any, may be unclear, but some are funded by the religion-based Templeton Foundation. Others set up their own 'institutes' of which they are the only employee. Their formats and styles make them sound like cranks, and I have never considered any of these books to be serious treatments of their subjects.
Still, should having an academic appointment lend credence to these kinds of books? We would all agree, I think, that being in academia is far from a guarantee that someone knows the truth as it is best known in his/her time. Should having a religious agenda be disqualifying? Only to those who deny religion, presumably; and likewise, being funded by a religious organization doesn't automatically vitiate the work itself.
And what about the books that are widely read, reviewed, and cited in the public as well as the academic world, but that are written by journalists? Some of them are in every sense of the word 'tracts'. Journalists often get things wrong or culpably distort their subjects' importance, yet their stories appear in the leading media. The books are reviewed, often by actual scientists, as if the authors were themselves qualified scientists. Some clearly seem (to me) to be credible books, but how do we tell? And what about articles written for major magazines or journals by program officers in funding agencies, or by editors of major journals? Oddly enough, they always tell a glowing tale about what they are funding!
And what about popular science books written by academics? These have long been looked on with some disdain by other academics. Indeed, writing a popular book is generally not a way to advance an academic career, though a few have done so and a number of others have made considerable income that way. Some have even kept their academic reputation (even if what they've written is far afield of what they actually control). And Edison and Galen and Boyle were showmen in the media of their times, as one might say was the style then.
So, how and why do we judge these books, and collectively reject the subset like the one I just received? I have wondered whether it is because we don't like their point of view because it challenges our own, or because they're not playing by the usual institutional rules like peer review, or because of their styles. Or is it just that we feel threatened, and so we ostracize them?
Darwin and many others in his era wrote books for sale through the public and commercial presses. That was normal, and there were very controversial books circulating at the time. They didn't just flood everyone with freebies so far as I am aware, but I don't happen to know enough of the history of science publication to know how things should be considered.
Galileo, for example, wrote in Italian (not Latin), supposedly in order to be read and understood by the general public, doing an end run around the constricting Church-controlled media of his time. Einstein published in legitimate journals, but couldn't get an academic job and worked instead for the Swiss patent office during his famous 1905-6 flurry of brilliance.
These and many other authors were one-man institutes--funding their own work, doing it in their basement. They did not work in big institutions.
So when a book like Is It To Be crosses the threshold, should it go straight into the circular file or not? How do we tell? Sometimes even the tone of its prefatory material alone is enough.
Should we toss a book that includes the following?
"Revealing great, unusual, and remarkable spectacles, opening these to the consideration of every man, and especially of philosophers....."
The book continues in that vein. Unfortunately for our willingness to pass easy judgments about quality, this is the preface to Galileo's Starry Messenger (Sidereus Nuncius, 1610), perhaps the single most important book in opening the age of modern science. That's where Galileo devastated accepted theory by pointing what amounted to a new toy, a telescope, at the moon. Not surprisingly, the book, and the author's attitude, rankled the Establishment.
[Image: Galileo's self-promotion, 1610]
Scientists are a lot better off than, say, people who argue over which religion tells the Real Truth or which is The Best Wine, or who debate 'Art', because in science we at least have various sorts of evidence to collect and means to put a scientific claim to the test. Of course, we have to agree on the kinds of evidence and methods, criteria that history shows can change. Where does alchemy become chemistry, or phrenology become fMRI? And why, for example, do we believe the latter and not the former?
Indeed, I think that conformity to the club rules of the clubby environment of academe--degrees, professional jobs, and the like--is part of the criteria we use. I think we do it partly in a tribal way, and at least partly by intuition: a book just doesn't feel right, or comes to conclusions long dismissed as untenable based on current evidence. Or the author isn't playing by our rules of decorum. As a collective enterprise, scientists 'know', or sense, or maybe just informally agree on, what is credible--at least what is worth looking at seriously--and what is crackpot.
But it's a disturbing thing to ponder, because we like to believe that science is at its roots open to any doubts or ideas that may help us explain nature, and in principle ones from out of the blue may be, just occasionally, among the deepest insights. We would hate to miss some penetrating insight and then exclaim, as Thomas Huxley did about evolution, "How extremely stupid not to have thought of that!" Yet when the unsolicited book, with strong self-promotional material, from an irregular source, comes into the mailbox, it tends to go out the same afternoon in a less savory container.
Monday, October 6, 2014
Big Data and its rationale - why are we still at it?
By Ken Weiss
We're in the era of Big Data projects. This is the result of a fervor of belief that more data--essentially comprehensive, completely enumerative data--will lead to deeper or even complete understanding. It is an extension of the reductionism and induction that were a major part of the Age of Science, ushered in about 400 years ago by the likes of Bacon, Newton, Galileo and other iconic figures. Examples in physics include huge collider studies and very costly space activities, and of course the Big Data drive is hugely prevalent in genomics and other biological and biomedical sciences. But in our age, several centuries later, why are we still at it in this way?
The story isn't completely simple. The Age of Science also led to the so-called 'scientific method', a systematic way to increase knowledge through highly focused hypothesis-testing. Many philosophers of science have argued, or tried to show, how this self-disciplined approach refines our understanding of truth, even if ultimate truth may always elude our meagre brainpower. But why, then, a return to raw induction?
One reason, and we think a predominant reason, is the pragmatic competition for research resources. As technological abilities rise (pushed by corporate interests for their own reasons), we have become able to collect ever more detailed data. The Age of Science was itself ushered in by technology in many ways, the iconic examples being optics (telescopes and microscopes). But sociopolitical reasons also exist. Long, large projects lock up large amounts of funds for years or even decades, guaranteeing jobs and status for people who thus can avoid the draining, relentless pursuit of multiple, small 3- or even 5-year grants. As the science establishment has grown, driven by universities for good as well as greedy reasons, funding inevitably became more competitive.
Careerism and enumerative ways to judge careers by administrators (paper and grant counting) are driving this system, but the funding agencies, too, have become populated with officials whose careers involve holding and building portfolios, using public relations to tout their achievements, and so on. And once you've got to the top of the Big Data pile, it's a high that's hard to come down from!
[Cartoon: "It takes Big Data to make it Big--but I did it!" (drawn by the author)]

The history of a worldview
From the Manhattan Project in WWII, and several open-ended research efforts that followed, the idea became obvious that if you can state some generic problem, get funding to study it, and justify why it requires large-scale, expensive technology over the long term, well, you've snared yourself a career's worth of secure funding and all the status and perks--and, of course, actual research--that go with it. It's only human to understand those reasons.
However, there are also some good, scientifically legitimate reasons for the growth of Big Data. When I have queried investigators in the past--and here, this means over about 20 years--about why they were advocating genome mapping approaches to diseases of concern to them, they often said, rather desperately, that they were taking a 'hypothesis-free' approach because nothing else worked! If biology is genetics, genes must be involved, and mapping may show us what genes or systems are involved. For example, psychiatrists said that they simply had no idea what was going on in the brain to cause traits like schizophrenia, no candidate genes or physiological processes, so they were taking a mapping approach to get at least some clues they could then follow with focused research.
But to a great extent, what started out as a plausible justification for arch-induction approaches has become an excuse and a habit, a convenience or strategy rather than a legitimate rationale. That is not because the earlier reasoning was wrong at the time. It is because the Big Data approach has, in a sense, worked: it has by and large proven not to provide the kind of results that initially justified it. Instead of identifying causes that couldn't have been expected, exhaustive Big Data studies, in genomics and other areas of epidemiology and biomedical science, have identified countless minor or even trivial 'causes' of traits (the psychiatric traits are good examples), showing that such traits are not well explained by enumerative approaches, genetic or otherwise. For example, if hundreds of genes, each varying within and among populations, contribute to a trait, every occurrence of a disease, or everyone's blood pressure, etc., is due to a different combination of causes. Big Data epidemiology has found the same for environmental and lifestyle factors.
What we should do now is take these findings from mapping studies seriously. Rather than flooding the media with hyperbole about the supposed successes of current approaches, we should adapt to what we've learned and somehow take a time-out to reflect on what other conceptual approaches might actually work, given what we now know rather clearly. We may even have to substantially reform the types of questions we ask.
The reason for that sort of new approach is that once we plunge (or as we plunge) into ever more too-big-to-terminate studies, with their likely minimal cost-benefit payoff, we lock up resources that clever thinkers might find better ways to use. And unless we do something of the sort, the message to scientists is to be driven even more strongly to think in Big Data terms--because they'll know that's where the money is and how to keep their hold on it. This is exactly what's happening today.
Unfortunately, even many fundamental things in physics, the archetype of rigorous science, are being questioned by recent findings; the life sciences are in many relevant ways a century behind even that level. But this seems not to give us, or our funders, pause. Changing gears goes against the grain of how our industrialized society works.
In times of somewhat crimped resources from traditional funders, it's no wonder that universities and investigators are frantically turning to any possible source they can find. As we know from our own experience and that of colleagues, so much time is spent hustling, and so relatively little in doing actual research, that the latter is becoming a sidelight of the job. But this doesn't really seem to be changing how people think, and the push for Big Data is an understandable part of the strategy. The way we think about science itself is not changing under this pressure. At least not yet.
We keep harping on this message because it involves both the nature of knowledge and the societal aspects of how we acquire it. Even if there were no material interests, in terms of allocation of resources, in clinging to Big Data, we would face a scientific, even epistemological, problem that few seem interested in confronting. There is simply too little pressure to force people to think differently.
Perhaps, if the message is said enough times, and read by enough people, sooner or later, somewhere, someone might get it, and show the rest of us a better way to fathom the causal complexities of the world.
The story isn't completely simple. The Age of Science also led to the so-called 'scientific method', a systematic way to increase knowledge through highly focused hypothesis-testing. Many philosophers of science have argued that, or tried to show how, the approach enabled our understanding of truth refined in this self-disciplined way, even if ultimate truth may always elude our meagre brainpower. But why then a return to raw induction?
One reason, and we think a predominant reason, is the pragmatic competition for research resources. As technological abilities rise (pushed by corporate interests for their own reasons), we have become able to collect ever more detailed data. The Age of Science was itself ushered in by technology in many ways, the iconic examples being optics (telescopes and microscopes). But sociopolitical reasons also exist. Long, large projects lock up large amounts of funds for years or even decades, guaranteeing jobs and status for people who thus can avoid the draining, relentless pursuit of multiple, small 3- or even 5-year grants. As the science establishment has grown, driven by universities for good as well as greedy reasons, funding inevitably became more competitive.
Careerism and enumerative ways to judge careers by administrators (paper and grant counting) are driving this system, but the funding agencies, too, have become populated with officials whose careers involve holding and building portfolios, using public relations to tout their achievements, and so on. And once you've got to the top of the Big Data pile, it's a high that's hard to come down from!
![]() |
| "It takes Big Data to make it Big--But I did it! (Drawn by the author) |
From the Manhattan Project in WWII, and several open-ended research efforts that followed, the idea became obvious that if you can state some generic problem and get funding to study it, and justify why it requires large scale, expensive technology, and long term, well, you snared yourself a career's worth of secure funding and all the status and perks--and, of course, actual research--that go with it. It's only human to understand those reasons.
However, there are also some good, scientifically legitimate reasons for the growth of Big Data. When I have queried investigators in the past--and here, this means over about 20 years--about why they were advocating genome mapping approaches to diseases of concern to them, they often said, rather desperately, that they were taking a 'hypothesis-free' approach because nothing else worked! If biology is genetics, genes must be involved, and mapping may show us what genes or systems are involved. For example, psychiatrists said that they simply had no idea what was going on in the brain to cause traits like schizophrenia, no candidate genes or physiological processes, so they were taking a mapping approach to get at least some clues they could then follow with focused research.
But to a great extent, what started out as a plausible justification for arch induction approaches, has become an excuse and a habit, a convenience or strategy rather than a legitimate rationale. The reason is not because their reasoning in the past was wrong at the time. The reason is because the Big Data approach has in a sense worked successfully: it has by and large proven not to provide the kind of results that initially justified it. Instead of identifying causes that couldn't have been expected, exhaustive Big-Data studies, in genomics and other areas of epidemiology and biomedical science, have identified countless minor or even trivial 'causes' of traits (and the psychiatric traits are good examples), showing they are not well explained by enumerative approaches, genetic or otherwise. For example, if hundreds of genes, each varying in and among populations, contribute to a trait, every occurrence of a disease, or everyone's blood pressure etc. is due to a different combination of causes. Big Data epidemiology has found the same for environmental and life-style factors.
What we should now do is recognize this set of findings from mapping studies for the success it is. Rather than flood the media with hyperbole about the supposed triumphs of current approaches, we should adapt to what we've learned and take a time-out, somehow, to reflect on what other conceptual approaches might actually work, given what we now know rather clearly. We may even have to substantially reform the types of questions we ask.
The reason for that sort of new approach is that as we plunge into ever more too-big-to-terminate studies, with their likely minimal cost-benefit payoff, we lock up resources that clever thinkers might find better ways to use. And unless we do something of that sort, the message to scientists is to think ever more in Big Data terms--because they'll know that's where the money is, and how to keep their hold on it. This is exactly what's happening today.
Unfortunately, even many fundamental things in physics, the archetype of rigorous science, are being questioned by recent findings. Life sciences are in many relevant ways a century behind even that level. But this seems not to give us, or our funders, pause. Changing gears seems to go against the grain of how our industrialized society works.
In times of somewhat crimped resources from traditional funders, it's no wonder that universities and investigators are frantically turning to any possible source they can find. As we know from our own experience and that of colleagues, so much time is spent hustling, and so relatively little on actual research, that the latter is becoming a sideline of the job. The push for Big Data is an understandable part of that strategy. But the pressure doesn't seem to be changing how people think about science itself. At least not yet.
We keep harping on this message because it involves both the nature of knowledge and the societal aspects of how we acquire it. Even if there were no material interest, in terms of resource allocation, in clinging to Big Data, we would still face a scientific, even epistemological, problem that few seem interested in facing. There is simply too little pressure to force people to think differently.
Perhaps, if the message is said enough times, and read by enough people, sooner or later, somewhere, someone might get it, and show the rest of us a better way to fathom the causal complexities of the world.
Friday, October 3, 2014
An example of the problem of risk projection
By Ken Weiss
One of the biggest problems in biomedical, including genomic, disease risk prediction is that it is almost always based on projections of past risks into the future. We wrote about that the other day (here), but here's yet another example--and they abound.
[Image: Baby swimming; Wikipedia]
The Oct 1 NYTimes had a story about a boom in pre-school fitness programs. If parents, and it will largely be middle-class privileged parents, adopt this fad, it may have long-term, even lifelong, implications for the future health of the kids who partake. If the Times is right that this is a boom industry, one can imagine a whole generation of super healthy upper and upper middle class future adults in the making. That would be quite good (unless, of course, it turns out that various muscle, skeletal, or other traits are harmed by overdoing this early exercise), and so a beneficial practice for individual and public health.
But, even if they are healthier than today's adults, eventually these babies will develop diseases as they grow older. From our perspective as scientists who think about pitfalls to doing science, this raises some potential problems for future researchers doing disease genetics or environmental epidemiology, looking for risk factors associated with disease:
1. If it's predominantly parents of a given ancestry, European urbanites say, who enroll their kids, this can induce false positive genomic signals (population stratification; see the sketch after this list). Any other kind of clustering related to who enrolls can be equally problematic;
2. The kids themselves may not remember, or investigators decades from now may not be aware of these early fitness programs even to ask about them. The exposure to such programs' effects may as a result go under-reported in epidemiological or genetic association studies, leading to distorted estimates of other risk factors;
3. If parents who enroll their kids are, as seems likely, themselves into fitness plans, there can be a familial association between altered risk and genotype that will be challenging to identify and correct for, since shared family environments can look genetic;
4. If the kids are inculcated with other health habits, based on today's do-this/don't-do-that fashions (e.g., here's a story in the Times about 6 year olds choosing to be vegan), there will be correlations with later disease that will not necessarily be identifiable; indeed, it may be the parents' attitudes that are responsible, not the kids' genotypes or freely made behavioral choices.

Our society already spends much media ink and many research resources hyping risk estimates for genes and lifestyle factors alike--estimates made retrospectively from the behavioral and exposure antecedents of today's disease cases, as ascertained by means such as interview questionnaires (Did you smoke? How much, for how long? Did you get exercise when you were a child? How much, for how long? How many eggs did you eat per week when you were in your twenties?). Those answers are not only quite inaccurate, involving things that occurred decades ago, but the chronic, complex disease risks we're exposed to today generally won't materialize for decades into the future. Indeed, if we read about a risky behavior or food, many of us change our behavior, yet another complication--and one that operates continually as we read advice from the latest research, not always aware of its potential weaknesses.
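To illustrate point 1 above, here is a toy simulation--entirely invented numbers, ours alone, not drawn from any study. A marker allele with no biological effect at all ends up looking protective, simply because ancestry predicts both the allele's frequency and the unrecorded childhood exposure:

```python
# Toy population-stratification confound: ancestry drives both a neutral
# marker's frequency and enrollment in the (later forgotten) programs.
import random

random.seed(2)

def person(group):
    allele = random.random() < (0.6 if group == "A" else 0.2)    # neutral marker
    enrolled = random.random() < (0.7 if group == "A" else 0.1)  # patterned exposure
    disease = random.random() < (0.05 if enrolled else 0.20)     # only real effect
    return allele, disease

sample = [person(g) for g in ["A"] * 5000 + ["B"] * 5000]

def risk(carrier):
    outcomes = [d for a, d in sample if a == carrier]
    return sum(outcomes) / len(outcomes)

print(f"disease risk in marker carriers:  {risk(True):.3f}")
print(f"disease risk in non-carriers:     {risk(False):.3f}")
```

The marker does nothing, yet carriers come out markedly 'protected'--an association a future mapping study could easily mistake for genetics.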
So risk estimates are about the future, but future exposures can't even in principle be known. This is obvious, so why is awareness of the problem so low? And what, if anything, can we do about it besides discounting risk estimates and acknowledging that they usually have unknown precision?
Thursday, October 2, 2014
Ignore this study!
A piece in the New York Times reports that a new study shows that working long hours causes type 2 diabetes, but only in people of lower socioeconomic status. The study is a meta-analysis of 4 published studies plus 19 unpublished ones, and appears in The Lancet: Diabetes and Endocrinology ("Long working hours, socioeconomic status, and the risk of incident type 2 diabetes: a meta-analysis of published and unpublished data from 222 120 individuals," Kivimäki et al.).
Type 2 diabetes, characterised by hyperglycaemia and insulin resistance or insulin insufficiency, causes substantial disease burden. Globally, more than 285 million people have type 2 diabetes, and its prevalence is predicted to increase to 439 million by 2030. The findings from prospective cohort studies show that working long hours is associated with factors that contribute to diabetes, such as unhealthy lifestyle, work stress, sleep disturbances, and depressive symptoms. Working long hours is also associated with an increased risk of cardiovascular disease, which is one of the complications of type 2 diabetes. However, the direct association between long working hours and incident type 2 diabetes has been assessed in only a few studies.

And they found that "In this meta-analysis, the link between longer working hours and type 2 diabetes was apparent only in individuals in the low socioeconomic status groups." People of low SES who worked more than 55 hours a week were at a 30% higher risk of developing type 2 diabetes than those who worked 35-40 hours a week.
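To put that headline figure in proportion (with a baseline we are assuming purely for illustration, since neither the abstract quoted here nor the Times story gives one): if the underlying incidence were, say, 5 cases per 100 people over the follow-up period, a 30% relative increase would mean 6.5 per 100--about one and a half extra cases per hundred long-hours workers. Relative risks always sound more dramatic than the absolute differences behind them.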
Which means, of course, that working long hours doesn't in fact have a direct effect, but is correlated with something else that's associated with living life in the lower socioeconomic strata. Or at best working long hours exacerbates the effect of that unidentified, confounding variable, or variables. The authors note that they adjusted for age, sex, obesity and physical activity, and excluded shift workers, and that the effect of long working hours was independent of these variables. But, while they wonder what it is about working long hours that might be causal, they do also note that long working hours may be mediating the effect of some unknown variable.
Indeed, there are many differences between socioeconomic strata that might be associated with risk of T2D, including ethnicity and genetic predisposition, diet, maternal health during pregnancy, type of job and thus pay scale, and other possible risk variables associated with poverty. So it's interesting that, for what looks to us like an inconclusive study--one that suggests but doesn't identify confounding variables associated with risk of type 2 diabetes--the NYT piece emphasizes only the effect of long working hours, while acknowledging that this is associated with depression, sleep deprivation, and unhealthy lifestyle, any of which may be causal. It does not ask, though, why any of that would hold only for poorer people.
Curious that the lead author is quoted in the NYT on type 2 diabetes prevention this way:
“My recommendation for people who wish to decrease the risk of Type 2 diabetes,” he said, “applies both to individuals who work long hours and those who work standard hours: Eat and drink healthfully, exercise, avoid overweight, keep blood glucose and lipid levels within the normal range, and do not smoke.”

So, basically, he's saying: ignore this study.
And here's another study we might want to ignore
Steven Salzberg at Forbes alerts us to another work-related danger. Standing up at our desks is said to be good for our health. Indeed, it's said even to reduce risk of type 2 diabetes (and presumably the benefit increases the longer we stand and work, unless of course we're lower class, in which case the longer we stand, the more likely we'll get type 2 diabetes). The mechanism has now been found--you know how telomere length (the length of the ends of our chromosomes) is associated with longevity? The longer they are, the longer we live, right? Well, it seems that standing up lengthens telomeres!
[Image: Telomeres; Wikipedia]
An RCT (randomized controlled trial) in the British Journal of Sports Medicine ("Stand up for health—avoiding sedentary behaviour might lengthen your telomeres: secondary outcomes from a physical activity RCT in older people," Sjögren et al.) reports that the less time their subjects spent sitting, the greater the lengthening of their telomeres six months from the inception of the study: "Reduced sitting time was associated with telomere lengthening in blood cells in sedentary, overweight 68-year-old individuals participating in a 6-month physical activity intervention trial."
But, as Salzberg points out, this result is based on 12 individuals, and in fact only 2 of them showed a marked effect and probably drove the results. And what do blood-cell telomeres have to do with diabetes or longevity, unless this has to do with the immune system? We ask because red blood cells have no chromosomes, and white blood cells are basically part of the immune system, as are some discarded cells sloughed off from other tissues (and hence no longer being used by the body). Where is a plausible mechanism, unless it has to do with lifestyle correlates of those who choose stand-up careers? The authors owe us at least some explanation. Or is it the journal that owes its readers an explanation for why it published such a paper?
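For anyone who wants to see how little it takes, here is a hedged toy example--invented numbers, not the paper's data--in which two extreme individuals out of twelve manufacture a strong correlation all by themselves:

```python
# Two outliers in a sample of 12 creating (and dissolving) a correlation.
import random
from statistics import correlation  # Python 3.10+

random.seed(3)
# Ten people with no real sitting-time/telomere relationship...
sitting = [random.uniform(4, 10) for _ in range(10)]  # hours sat per day
change = [random.gauss(0, 1) for _ in range(10)]      # telomere change, arbitrary units
# ...plus two who sat much less and happened to show large lengthening.
sitting += [1.0, 1.5]
change += [4.0, 3.5]

print(f"correlation with the two outliers: {correlation(sitting, change):+.2f}")
print(f"correlation without them:          {correlation(sitting[:10], change[:10]):+.2f}")
```

A strong negative correlation (less sitting, more lengthening) appears and disappears with just two of the twelve individuals--which is roughly Salzberg's complaint about the paper.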
Anything can get published
There are good studies and there are bad studies. Even good (legitimate) findings can be falsely attributed to some measured putative cause without sufficient justification. The publication and promotion of loose, over-interpreted, over-sold studies is one reason that we don't know which foods are good for us and which are bad. But the problem is deeper than that -- reductionist science, which aims to identify single causal factors for complex diseases, no matter how well done the study and expert the analysis, is simply the wrong approach to understanding complexity. It is systematically misleading.
Why these reports keep flowing is understandable in our news-cycle media culture. But when the bottom line is basically that a reasonable, moderate, balanced lifestyle is the best and almost the only reliably known way to defer many chronic diseases, it's strange that scientists themselves can't see the relative nonsense they are purveying.
Wednesday, October 1, 2014
Nature or nurture? Tristram Shandy weighs in!
By Ken Weiss
It is only because of our very casual and cursory attention to history that we credit Charles Darwin, in 1858, with showing us that every aspect of our natures is due, entirely and with infinite determinism, to natural selection fine-tuning our genomes. We like heroes and because we're scientists, we want the heroes to be other scientists (so we can liken ourselves, and our own inherent brilliance, to those heroes). We dismiss philosophers and historians of science as meddlers in our business if they do not cling to the mythologies we prefer about ourselves (and our inherent brilliance). But if we're really scientists, and truly truth-seekers, we must bow to discoveries that undermine our self-flattering tales. It hurts, it really hurts, if the discovery shows that the pioneers of our field were, in fact, religious or, worse, totally un-versed in the lore of science and the pursuit of objective fact.
But such is the sad reality about the fact of complete genetic determinism.
The opinions and discoveries of Laurence Sterne (as expressed by Tristram Shandy)
In 1761-3 the Rev. Laurence Sterne published his study of environmental determinism, called The Life and Opinions of Tristram Shandy, Gentleman.

The book is written as a narrative of his life, by the title character. Tristram notes that at the very moment of the act that led to his conception Mrs Shandy blurted out "Pray my Dear, have you not forgot to wind up the clock?" To that question her husband, though a man of exceedingly regular habits, replied "Good G..! Did ever woman, since the creation of the world, interrupt a man with such a silly question?" The clock in question is shown in the background of the second figure below; in the foreground is a depiction of the use of forceps in deliveries (which crushed Tristram's nose during his birth).
[Image: Laurence Sterne (1713-68); image from Google images]
[Image: The Shandy home (and the clock), from the 1761 edition; by Wm Hogarth, from Wikipedia images]
Here is how Tristram describes the lifelong impact of his mother's ill-timed distraction:
I wish either my father or my mother, or indeed both of them, as they were in duty both equally bound to it, had minded what they were about when they begot me; had they duly consider'd how much depended upon what they were then doing;--that not only the production of a rational Being was concerned in it, but that possibly the happy formation and temperature of his body, perhaps his genius and the very cast of his mind;--and, for aught they knew to the contrary, even the fortunes of his whole house might take their turn from the humours and dispositions which were then uppermost;--Had they duly weighed and considered all this, and proceeded accordingly,--I am verily persuaded I should have made a quite different figure in the world, from that in which the reader is likely to see me.--Believe me, good folks, this is not so inconsiderable a thing as many of you may think it;--you have all, I dare say, heard of the animal spirits, as how they are transfused from father to son, & &--and a great deal to that purpose:--Well, you may take my word, that nine parts in ten of a man's sense or his nonsense, his successes and miscarriages in this world depend upon their motions and activity, and the different tracks and trains you put them into, so that when they are once set a-going, whether right or wrong, 'tis not a half-penny matter,--away they go cluttering like hey-go mad; and by treading the same steps over and over again, they presently make a road of it, as plain and as smooth as a garden-walk, which, when they are once used to, the Devil himself sometimes shall not be able to drive them off it.

The surprise is that, more than two centuries ago, at least some views were directly contrary to the view that predominates today and directs so much of our current research resources--the idea that genes rather than experience are what (also from the moment of conception) make us what we are. But even then, not everyone shared Sterne's view!
. . . . .
--Then, let me tell you, Sir, it was a very unseasonable question at least,--because it scattered and dispersed the animal spirits, whose business it was to have escorted and gone hand in hand with the Homunculus, and conducted him safe to the place destined for his reception. . . . . Now, dear Sir, what if any accident had befallen him in his way alone!--or that through terror of it, natural to so young a traveller, my little Gentleman had got to his journey's end miserably spent;--his muscular strength and virility worn down to a thread;--his own animal spirits ruffled beyond description,--and that in this sad disorder'd state of nerves, he had lain down a prey to sudden starts, or a series of melancholy dreams and fancies, for nine long, long months together.--I tremble to think what a foundation had been laid for a thousand weaknesses both of body and mind, which no skill of the physician or the philosopher could ever afterwards have set thoroughly to rights.
. . . . .
That I should neither think nor act like any other man's child:--But alas! continued he, shaking his head a second time, and wiping away a tear which was trickling down his cheeks, My Tristram's misfortunes began nine months before ever he came into the world.
Things were debated even back then!
Last week we noted that in 1862, just after Darwin's Origin of Species, the novelist Wilkie Collins expressed the debate between Nurture advocates and their Nature foes as to which was responsible for our behavior. This was exactly a century after the view of Sterne's just described. Sterne had no notion of 'genes', and Tristram didn't attribute his nature to inheritance, which would put him in the Nurture category, even if at the extreme (being parentally distracted in flagrante delicto defined the imminent conceptus's entire future).
It wasn't long thereafter that the power of inheritance was debated, in little-remembered works applied to the same behavioral characteristics discussed in fiction. Almost half-way between Sterne and Collins, in 1808, one M. Portal, a French professor of medicine, published "Considerations on the Nature and Treatment of some hereditary or Family Diseases" (London Med. Phys. Journal, 21, 229–239, 281–296). Members of the upper classes (at least) were concerned that they might sully their noble posterity by transmitting disease, especially mental disease, from parent to offspring. Many traits were known to be transmitted; as Montaigne is quoted by Portal, "We find that not only are the marks of the body transmitted from father to son, but also a resemblance of temper, complexion, and inclinations of the mind."
A few years later, in 1814, a British physician named Joseph Adams took Portal's work to task, arguing that there were other ways that traits could cluster in families. He, too, was speculating in that he hadn't the kinds of precision or systematic analytic methods we have today. But he carefully pointed out that life-experience, infection, and other causes could account for such clustering. Diseases present at birth, for example, were more likely to be hereditary than diseases that, even if similar among relatives, only appeared later. He noted that inherited predisposition could lead to a disorder only after experiencing some environmental or life-style factor. He was particularly interested to calm down those in the upper classes who were worried that behavioral traits ('madness') were inherited. Additionally, Adams explicitly anticipated much of Darwin's ideas about evolution by natural selection, but that's unrelated to our topic today. We've discussed Adams' book on MT before.
[Image: Joseph Adams' book, 1814]
The bottom line, from these instances that I just happened to know about, is that both literature and science reflect a long-standing fact: in the post-Enlightenment era we often think of as the Age of Science, western culture has long known of what appeared to be inherited traits, even if 'genes' per se weren't yet known of; and yet it was also clear that experience and living conditions could generate not only traits, but family resemblance in traits.
But nobody knew the how, when, or why of the two kinds of cause, and without knowing units of causation in either life-experience or genetics, we could only guess or speculate about these things. The 20th century gave us Mendelian patterns to look for in families as signatures of genetic causation, but once it was realized, early on, that many different genes could generate similar traits, it was clear (to those who cared to recognize the fact) that Mendelian patterns need not appear even in 'genetic' traits.
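A minimal sketch of that point--purely our own illustration, not anyone's data: suppose the 'same' recessive trait can arise from homozygosity at either of two unlinked loci. Then two affected parents can produce only unaffected children, and no single-locus Mendelian ratio appears in the family at all:

```python
# Toy locus heterogeneity: one trait, two interchangeable recessive causes.
import random

random.seed(4)

def affected(genotype):
    # Affected if homozygous 'aa' at EITHER of the two loci.
    return any(tuple(sorted(locus)) == ("a", "a") for locus in genotype)

mom = (("a", "a"), ("A", "A"))  # affected through locus 1
dad = (("A", "A"), ("a", "a"))  # affected through locus 2

def child(mother, father):
    # One randomly chosen allele per locus from each parent.
    return tuple((random.choice(m), random.choice(f))
                 for m, f in zip(mother, father))

kids = [child(mom, dad) for _ in range(8)]
print([affected(k) for k in kids])  # -> [False] * 8: all carriers, none affected
```

Read at any single locus, this family looks thoroughly non-Mendelian for the trait, even though the trait is entirely 'genetic'.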
We have almost the same level of confusion, mixture, debate, and lack of clarity today, centuries later, in our amped-up Age of Technical Science. We now throw around terms like 'interaction' without, usually, having much direct sense of what we are actually referring to. Sometimes the contorted way we describe our ideas of causation seems not much different from the way Tristram Shandy did--only that was satire, and we seem to be deadly serious!
Something's missing.
After-note: If you haven't read Tristram Shandy, and are interested in more than just reading the flood of science articles in the journals that jam up your mail box every day, Sterne's book is a good, if wacky, romp through sense and nonsense. Reading it takes patience, since it intentionally doesn't always flow in one direction, but it's no more contorted and obscure than those same science articles.