Thursday, June 22, 2017

Everything is genetic, isn't it?

There is hardly a trait, physical or behavioral, for which there is not at least some familial resemblance, especially among close relatives.  And I'm talking about what is meant when someone scolds you saying, "You're just like your mother!"  The more distant the relatives in terms of generations of separation, the less the similarity.  So you really can resist when told, "You're just like your great-grandmother!" The genetic effects decline in a systematic way with more distant kinship.
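The 'systematic' decline can be put in numbers.  As a minimal sketch, assuming only the standard expectation that each generation passes on half of a genome (the function name is made up for illustration, and real shares vary around this average because of the randomness of recombination):

```python
def expected_shared_fraction(generations: int) -> float:
    """Expected fraction of a person's autosomal genome inherited from
    one direct ancestor `generations` steps back (1 = parent)."""
    if generations < 1:
        raise ValueError("generations must be at least 1")
    # Each generation of separation halves the expected share.
    return 0.5 ** generations


for label, g in [("mother", 1), ("grandmother", 2), ("great-grandmother", 3)]:
    print(f"{label}: {expected_shared_fraction(g):.3f}")
```

On this expectation a great-grandmother contributes about one eighth of your genome, against one half for your mother.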

The 'heritability' of a trait refers to the relative degree to which its variation is the result of variation in genes, the rest being due to variation in non-genetic factors we call 'environment'.  Heritability is a ratio that ranges from zero when genes have nothing to do with the trait, to 1.0 when all the variation is genetic.  The measure applies to a sample or population and cannot automatically be extended to other samples or populations, where both genetic and environmental variation will be different, often to an unknown extent.
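To make the ratio concrete, here is a minimal sketch.  It assumes the simplest possible model, in which total trait variance is just genetic variance plus environmental variance, with no gene-environment interaction or correlation; the function name and the variance numbers are invented for illustration only.

```python
def heritability(var_genetic: float, var_environment: float) -> float:
    """Fraction of total trait variance attributable to genetic variance,
    under the simplifying assumption V_total = V_genetic + V_environment."""
    total = var_genetic + var_environment
    if total <= 0:
        raise ValueError("total variance must be positive")
    return var_genetic / total


# Same genetic variance, two hypothetical populations with different
# amounts of environmental variance (illustrative numbers, not data).
h2_uniform_environment = heritability(9.0, 1.0)
h2_varied_environment = heritability(9.0, 9.0)
print(h2_uniform_environment, h2_varied_environment)
```

The same genetic variance gives a heritability of 0.9 in one setting and 0.5 in the other, which is why the estimate belongs to the sample in which it was measured and not to the trait itself.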

Most quantitative traits, like stature or blood pressure or IQ scores, show some amount, often quite substantial, of genetic influence.  It often happens that we are interested in some trait that we think must be produced or affected by genes, but for which no relevant factor, like a protein, is known.  The idea arose decades ago that if we could scan the genome, and compare those with different manifestations of the trait, using mapping techniques like GWAS (genomewide association studies), we could identify those sites, genomewide, whose variation in our chosen sample may affect the trait's variation.  Qualitative traits like the presence or absence of a disease (say, diabetes or hypertension), may often be due to the presence of some set of genetic variants whose joint impact exceeds some diagnostic threshold, and mapping studies can compare genotypes in affected cases to unaffected controls to identify those sites.

Genes are involved in everything. . . . .
Many things can affect the amount of similarity among relatives, so one has to try to think carefully about attributing ideas of similarity and cause.  Some traits, like stature (height), have very high heritability, sometimes estimated to be about 0.9, that is, 90% of the variation being due to the effects of genetic variation.  Other traits have much lower heritability, but there's generally familial similarity.  And, that's because we each develop from a single fertilized egg cell, which includes transmission of each of our parents' genomes, plus ingredients provided by the egg (and perhaps to a tiny degree sperm), much of which were the result of gene action in our parents when they produced that sperm or egg (e.g., RNA, proteins).  This is why traits can usually be found to have some heritability--some contribution due to genetic variation among the sampled individuals.  In that sense, we can say that genes are involved in everything.

Understanding the genetic factors involved in disease can be important and laudable, even if tracking them down is a frustrating challenge.  But because genes are involved in everything, our society also seems to have an unending lust for investigators to overstate the value of their findings or, in particular, to estimate or declaim on the heritability, and hence genetic determination, of the most societally sensitive traits, like sexuality, criminality, race, intelligence, physical abuse and the like.

. . . . . but not everything is 'genetic'!

If the estimated heritability for a trait we care about is substantial, then this does suggest the obvious: genes are contributing to the mechanisms of the trait and so it is reasonable to acknowledge that genetic variation contributes to variation in the trait.  However, the mapping industry implies a somewhat different claim: it is that genes are a major factor in the sense that individual variants can be identified that are useful predictors of the trait of interest (NIH's lobbying machine has been saying we'll be able to predict future disease with 'precision').  There has been little constraint on the types of trait for which this approach, sometimes little more than belief or wishful-thinking, is appropriate.

It is important to understand that our standard measures of genes' relative effect are affected both by genetic variation and environmental lifestyle factors.  That means that if environments were to change, the relative genetic effects, even in the very same individuals, would also change.  But it isn't just environments that change; genotypes change, too, when mutations occur, and as with environmental factors, these change in ways that we cannot predict even in principle.  That means that we cannot legitimately extrapolate, to a knowable extent, the genetic or environmental factors we observe in a given sample or population, to other, much less to future samples or populations.  This is not a secret problem, but it doesn't seem to temper claims of dramatic discoveries, in regard to disease or perhaps even more for societally sensitive traits.

But let's assume, correctly, that genetic variation affects a trait.  How does it work?  The usual finding is that tens or even hundreds of genome locations affect variation in the test trait.  Yet most of the effects of individual genes are very small or rare in the sample.  At least as important is that the bulk of the estimated heritability remains unaccounted for, and unless we're far off base somehow, the unaccounted fraction is due to the leaf-litter of variants individually too weak or too rare to reach significance.

Often it's also asserted that all the effects are additive, which makes things tractable: for every new person, not part of the study, just identify their variants and add up their estimated individual effects to get the total effect on the new person for whatever publishable trait you're interested in.  That's the predictive objective of the mapping studies.  However, I think that for many reasons one cannot accept that these variable sites' actions are truly additive. The reasons have to do with actual biology, not the statistical convenience of using the results to diagnose or predict traits.  Cells and their compounds vary in concentrations per volume (3D), binding properties (multiple dimensions), surface areas (2D) and some in various ways that affect how proteins are assembled and work, and so on.  In aggregate, additivity may come out in the wash, but the usual goal of applied measures is to extrapolate these average results to prediction in individuals.  There are many reasons to wish that were true, but few to believe it very strongly.
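For readers who want to see what 'just add them up' means in practice, here is a minimal sketch of the additive assumption itself.  The site names, effect sizes, and baseline are hypothetical values invented for illustration; they do not come from any study, and the sketch deliberately includes none of the interaction or environmental terms that the argument above says real biology would require.

```python
from typing import Dict


def additive_score(genotype: Dict[str, int],
                   effects: Dict[str, float],
                   baseline: float = 0.0) -> float:
    """Predict a trait value by summing per-site estimated effects,
    each weighted by how many copies (0, 1, or 2) the person carries."""
    return baseline + sum(
        effect * genotype.get(site, 0)  # sites not typed count as 0 copies
        for site, effect in effects.items()
    )


# Hypothetical per-allele effects estimated in some study sample.
effects = {"site_A": 0.5, "site_B": -0.25, "site_C": 0.125}

# A new person, not part of the study, with allele counts at each site.
new_person = {"site_A": 2, "site_B": 1, "site_C": 0}

score = additive_score(new_person, effects, baseline=170.0)
print(score)
```

Everything in the prediction rides on the `effects` table: if those estimates do not carry over to the new person's genomic background and environment, the sum is still easy to compute but no longer means much.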

Even if they were really additive, the clearly very different leaf-litter background that together accounts for the bulk of the heritability can obscure the numerical amount of that additivity from sample to sample and person to person.  That is, what you estimated from this sample, may not apply, to an unknowable extent, to the next sample.  If and when it does work, we're lucky that our assumptions weren't too far off.

Of course, the focus and promises from the genetics interests assume that environment has nothing serious to do with the genetic effects.  But it's a major, often by far the major, factor, and it may even in principle be far more changeable than genetic variation.  One would have to say that environmental rather than genetic measures are likely to be, by far, the most important things to change in society's interest.

We regularly write these things here not just to be nay-sayers, but to try to stress what the issues are, hoping that someone, by luck or insight, finds better solutions or different ways to approach the problem that a century of genetics, despite its incredibly huge progress, has not yet found.  What it has done is to show us, in exquisite detail, what the problems are.

A friend and himself a good scientist in relevant areas, Michael Joyner, has passed on a rather apt suggestion to me, that he says he saw in work by Denis Noble.  We might be better off if we thought of the genome as a keyboard rather than as a code or program.  That is a good way to think about the subtle point that, in the end, yes, Virginia, there really are genomic effects: genes affect every trait....but not every trait is 'genetic'!

Tuesday, June 20, 2017

Spooky action at a (short) distance

Entanglement in physics is about action that seems to transfer some sort of 'information' across distances at speeds faster than that of light.  Roughly speaking (I'm not a physicist!), it is about objects with states that are not fixed in advance, and could take various forms but must differ between them, and that are separated from each other.  When measurement is made on one of them, whatever the result, the corresponding object takes on its opposite state.  That means the states are not entirely due to local factors, and somehow the second object 'knows' what state the first was observed in and takes on a different state.

You can read about this in many places and understand it better than I do or than I've explained it here.  Albert Einstein was skeptical that this could occur, if the speed of light were the fastest possible speed.  So he famously called the findings as they stood at that time "Spooky action at a distance." But the findings have stood many specific tests, and seem to be real, however it happens.

Does life, too, have spooky action? 
I think the answer is: maybe so.  But it is at a very short distance, that within the nuclei of individual cells.  Organisms have multiple chromosomes and many species, like humans, have 2 instances of each (are 'diploid'), one inherited from each parent.  I say 'instances' rather than 'copies', because they are not identical to each other nor to those of the parent that transmitted each of them.  They are perhaps near copies, but mutation always occurs, even among the cells within each of us, so each cell differs from its contemporary somatic fellows and from what we inherited in our single-cell beginnings as a fertilized egg.

Many clever studies over many years have been documenting the 3-dimensional, context-specific conformation, or detailed physical arrangement of chromosomes within cells.  The work is variously known, but one catch-term is chromosome conformation capture, or 3C, and I'll use that here.  Unless or until this approach is shown to be too laden with laboratory artifact (it's quite sophisticated), we'll assume it's more or less right.

The gist of the phenomenon is that (1) a given cell type, under a given set of conditions, is using only a subset of its genes (for my  purposes here this generally means protein-coding genes proper); (2) these active genes are scattered along and between the chromosomes, with intervening inactive regions (genes not being used at the moment); (3) the cell's gene expression pattern can change quickly when its circumstances change, as it responds to environmental conditions, during cell division, etc.; (4) at least to some extent the active regions seem to be clustered physically together in expression-centers in the nucleus; (5) this all implies that there is extensive trans communication, coordinating, and physically juxtaposing, parts within and among each chromosome--there is action at a very short distance.

Even more remarkably, I think, this phenomenon seems somehow robust to speciation because related species have similar functions and similar sets of genes, but often their chromosomes have been extensively rearranged during their evolutionary separation. More than this: each person has different DNA sequences due to mutation, and different numbers of genes due to copy number changes (duplications, deletions); yet the complex local juxtapositions seem to work anyway.  At present this is so complicated, so regular, and so changeable and thus so poorly understood, that I think we can reasonably parrot Einstein and call it 'spooky'.

What this means is that chromosomes are not just randomly floating around like a bowl of spaghetti.   Gene expression (including transcribed non-coding RNAs) is thought to be based on the sequence-specific binding of tens of transcription factors in an expression complex that is (usually) just upstream of the transcribed part.  Since a given cell under given conditions is expressing thousands of condition-specific genes, there must be very extensive interaction or 'communication' in trans, that is, across all the chromosomes. That's because the cell can change its expression set very quickly.

The 3C results show that in a given type of cell under given conditions, the chromosomes are physically very non-randomly arranged, with active centers physically very near or perhaps touching each other.  How this massive plate of apparent-spaghetti even physically rearranges to get these areas together, without getting totally tangled up, yet remains quickly rearrangeable is, to me, spooky if anything in Nature is.  The entanglement, disentanglement, and re-entanglement happens genome wide, which is implicitly what the classical term 'polygenic' essentially recognized in relation to genetic causation, but is now being documented.

The usual approach of genetics these days is to sequence and enumerate various short functional bits as being coding, regulatory, enhancing, inhibiting, transcribing etc. other parts nearby.  We have long been able to analyze cDNA and decide which parts are being used for protein coding, at least. Locally, we can see why or how this happens, in the sense that we can identify the transcription factors and their binding sites, called promoters, enhancers and the like, and the actual protein or functional RNA codes.  We can find expression correlates by extracting them from cells and enumerating them.  3C analysis appears to show that these coding elements are, at least to some extent, found juxtaposed in various transcription hot-spots.

Is gene expression 'entangled'?
What if the molecular aspects of the 3C research were shown to be technical artifacts, relative to what is really going on?  I have read some skepticism about that, concerning what is found in single cells vs aggregates of 'identical' cells.  If 3C stumbles, will our idea of polygenic condition-specific gene usage change?   I think not.  We needn't have 3C data to show the functional results since they are already there to see (e.g., in cell-specific expression studies--cDNA and what ENCODE has found). If 3C has been misleading for technical or other reasons, it would just mean that something else just as spooky but different from the 3D arrangement that 3C detects, is responsible for correlating the genomewide trans gene usage.  And it's of course 4-dimensional since it's time-dependent, too.  So what I've said here still will apply, even if for some other, unknown or even unsuspected reason.

The existing observations on context-specific gene expression show that something 'entangles' different parts of the genome for coordinated use, and that can change very rapidly.  The same genome, among the different types of cells of an individual, can behave very differently in this sense. Somehow, its various chromosomal regions 'know' how to be, or, better put, are coordinated.  This seems at least plausibly to be more than just that a specific context-specific set of transcription factors (TFs) binds selectively near regions to be transcribed and changes in its thousands of details almost instantly.  What TFs?  And how does a given TF know which binding sites to grab or to release, moment by moment, since they typically bind enhancers or promoters of many different genes, not all of them expression-related?  And if you want to dismiss that, by saying for example that this has to do with which TFs are themselves being produced, or which parts of DNA are unwrapped at each particular time, then you're just bumping the same question about trans control up, or over, to a different level of what's involved.  That's no answer!

And there is even another, seemingly simpler example to show that we really don't understand what's going on: the alignment of homologues in the first stage of meiosis.  We've been taught that empirical and necessary fact about meiosis for many decades. But how do the two homologues find each other to align?  This is essentially just not mentioned, if anyone even was asking, in textbooks.  I've seen some speculative ideas, again involving what I'll call 'electromagnetic' properties of each chromosome but even their authors didn't really claim it was sufficient or definitive.  Just for examples, homologous chromosomes in a diploid individual have different rearrangements, deletions, duplications, and all sorts of heterozygous sequence details, yet by and large they still seem to find each other in meiosis.  Something's going on!

How might this be tested?
I don't have any answers, but I wonder, on the hypothesis that these thoughts are on target, how we might set up some critical experiments to test this.  I don't know if we can push the analogy with tests for quantum entanglement or not, but probably not.

One might hope that 'all' we have to do is enumerate sequence bits to account for this action-at-a-distance, this very detailed trans phenomenon.  But I wonder......I wonder if there may be something entirely unanticipated or even unknown that could be responsible.  Maybe there are 'electromagnetic' properties or something akin to that, that are involved in such detailed 4D contextually relativistic phenomena.

Suppose that what happens at one chromosomal location (let's just call it the binding of a TF), directly affects whether that or a different TF binds somewhere else at the same time.  Whatever causes the first event, if that's how it works, the distance effect would be a very non-local phenomenon, one so central to organized life in complex organisms that, causally, is not just a set of local gene expressions.  Somehow, some sort of 'information' is at work very fast and over very short distances. It is the worst sort of arrogance to assume it is all just encoded in DNA as a code we can read off along the strand and that will succumb to enumerative local informatic sequence analysis.

The current kind of purely local hypothetical sequence enumeration-based account seems too ordinary--it's not spooky enough!

Monday, June 19, 2017

More thoughts, and just plain provocateuring, on genomic causal complexity. . . .

Here are some follow-up reflections on my recent post about GWAS and kindred methods and claims.  I know I'm being contentious, but science has always been contentious.  However, socioeconomic issues (careers, salaries, etc) also enter the picture in a way that is relevant to the inertial nature of our profession.  Readers who haven't read Ludwik Fleck's 1930's volume on 'thought collectives', one preceding Kuhn's 'normal science/paradigm' discussion, should do that, because it's relevant to where we stand now.

The causal complexity of genetic control of quantitative traits was in principle understood by Fisher and others almost exactly a century ago.  The development of mapping tools opened the door to seeing what that meant more specifically, at the genome level.

Some key facts about this, I think, are that when there is a strong single signal, we see segregation in families (when there are enough families, as there were in Utah for BRCA mapping), or some other indicator (a detectable chromosomal deletion in Wilms' tumor and perhaps something similar in retinoblastoma).  Those were families and mainly monogenic in the Mendelian sense (that is, of the traits Mendel carefully chose to study for their simple states).

But for BRCA and, I think for different reasons, retinitis pigmentosa, mapping by association rather than in families doesn't find these genes 'for' the trait.  They're individually strong, but relatively minor in a population and hence association-mapping sense.  And, in nearly all cases, even with 'single locus' diseases, once the gene is known, we see genotypic complexity, including often very low 'penetrance' (showing that 'the' gene isn't a single-locus cause by itself).

BRCA-associated breast cancer risk, once the gene was known and could specifically be typed, is very different even among women carrying known high-risk BRCA1/2 alleles, depending on the cohort and the study.  The purported single-locus Hemochromatosis gene (HFE) mutations are associated with high risk in the original sample, in Utah if my recollection is correct, but the mutation does not cause the disease in other samples.  Even the classic PKU is not always caused by PAH alleles, and not all pathogenic PAH alleles cause PKU.  Ditto for CF and the CFTR gene.  In some cases, at least, it is likely other interacting genes that in particular populations lead the target gene to seem causal in a Mendelian-like sense.

And of course there is now a substantial literature showing that individuals carrying dead (non-activated) disease-causing genes are walking around without the disease.  I think estimates have shown that each of us carries many (100 or so?) such genes, at least some if not all of which are diploid-negative.  If this doesn't suggest pervasive redundancy and the mappability problems I and others have written about, what does it suggest?

I will once again utter the apparently off-color factor that few want to acknowledge or say in mixed company: somatic mutation. Enough said on that black-box subject.

And while invoking the Truth's name in vain, I'll just whisper here another off-color word: environment.  Enough said on that black-box subject, too.

And there is the non-reductionistic 4th dimension of genetic causation in cells, which is being studied by chromosomal conformation methods (3C and its variants).  What this will lead to is unclear, to me anyway, but clearly there are extensive trans phenomena that methods for parsing and enumerating sequence, par for the course now for many decades, are not solving.  If they were working, we wouldn't need a plethora of new terms, and gilded promises from on high (i.e., NIH).

I've often mentioned that much of what we do relies on statistical inference.  That's been getting a well-deserved bad name, but rest assured that the SAS and SPSS people will guarantee you that their packages or use-instructions have been fixed so they won't lead you astray any further.  Nonetheless, there is this third little secret: statistical methods in this arena assume various aspects of replicability while adaptive evolution is fundamentally about non-replicability.

In any case, estimating risk-factor (causal SNP) effects retrospectively is data-fitting and not, in itself, related to cause or prediction, much less doing so with 'precision'.  Such extrapolation rests on the assumption that past fractions mean future probability, which is critical here (especially when sampling, environment, mutation, somatic mutation etc. are inherently unpredictable and essentially non-replicable).

And is it too identity-political to mention that there is the unseemly fact that most of this intensive mapping work has been done on Europeans for the sometimes even openly acknowledged rationale that Europeans have the moolah to pay for the gene-targeted drugs that Pharma has been promising for decades of the genome era? In any case, that's mere racism relative to the deeper genetic-causal issues themselves.  Even restrictive sampling doesn't guarantee replicability; a point I won't mention again lest I be accused of being as repetitive as someone doing GWAS on obesity.

These are just the simple issues one can conjure up without even doing any PubMed searching.  What amount of hammering does it take to get the message to sink in?  By sinking in I mean not just being noted, briefly and in passing, but to force some change of approach beyond enumerating, random sampling, and cachet marketing words (like gene regulatory networks, precision genomic medicine, omnigenic, and all the 'omics'-du-jour, etc.).

I would want to be clear even for those who wish to trash all my thoughts: Go ahead!  But at least acknowledge, as I acknowledged in my previous post, that the mapping era did do us great service by providing, for the first time, some specific sense of the genomic details underlying life's causal complexity and showing that in a general sense the original polygenic model was basically right.  Family studies are better when some really meaningfully single strong factor is at work, but the use of IBD assumptions to do association mapping cast, like a flashlight in the dark, light upon what had perforce remained dark to our understanding.  But it's now been quite a while that we have had the understanding we need to know that we should think of different ways to approach the subject of life's causation.  The flashlight's batteries are fading.

And here's my bit of sympathy for what is going on.  Complementing the complexity landscape that is the obvious reality are the key facts underlying all of this: scientists are people and, including yours truly, have limited abilities and can't just facilely be slammed for their not accounting for everything perfectly and immediately.  We're people who, mainly, need salaries, facilities in which to work, and employers like universities who these days have to operate in the black.  These are the deeply socioeconomic underlying problems that serve to encourage or even to force safe science, big science, and oversold science.  That the news media and other vested interests compound the felony is simply one of the problems of our type of imperfect society.

Moving the Big Money that has been locked up by the current haves, to redistribute to more important-payoff kinds of research would inevitably meet resistance, including from NIH's head office, which has been a sloganeering center that makes PT Barnum look like an amateur.  Whether, how, or how much funding will be redirected, which is what's actually at the unstated core of much of the controversy, is obviously not predictable.  But the importance of trying is what motivates my perhaps too-often and too-cranky posts:  Somebody has to speak of the Emperor's clothes!

Until we fix these underlying issues, whatever mess our current thrust is embedding us in, they will persist until some lucky day when an actually better idea stumbles upon the stage.

Saturday, June 17, 2017

The GWAS hoax....or was it a hoax? Is it a hoax?

A long time ago, in 2000, in Nature Genetics, Joe Terwilliger and I critiqued the idea then being pushed by the powers-that-be, that the genomewide mapping of complex diseases was going to be straightforward, because of the 'theory' (that is, rationale) then being proposed that common variants caused common disease.  At one point, the idea was that only about 50,000 markers would be needed to map any such trait in any global populations.  I and collaborators can claim that in several papers in prominent journals, in a 1992 Cambridge Press book, Genetic Variation and Human Disease, and many times on this blog we have pointed out numerous reasons, based on what we know about evolution, why this was going to be a largely empty promise.  It has been inconvenient for this message to be heard, much less heeded, for reasons we've also discussed in many blog posts.

Before we get into that, it's important to note that unlike me, Joe has moved on to other things, like helping Dennis Rodman's diplomatic efforts in North Korea (here, Joe's shaking hands as he arrives on his most recent trip).  Well, I'm more boring by far, so I guess I'll carry on with my message for today.....




There's now a new paper, coining a new catch-word (omnigenic), to proclaim the major finding that complex traits are genetically complex.  The paper seems solid and clearly worthy of note.  The authors examine the chromosomal distribution of sites that seem to affect a trait, in various ways including chromosomal conformation.  They argue, convincingly, that mapping shows that complex traits are affected by sites strewn across the genome, and they provide a discussion of the pattern and findings.

The authors claim an 'expanded' view of complex traits, and as far as that goes it is justified in detail. What they are adding to the current picture is the idea that mapped traits are affected by 'core' genes but that other regions spread across the genome also contribute. In my view the idea of core genes is largely either obvious (as a toy example, the levels of insulin will relate to the insulin gene) or the concept will be shown to be unclear.  I say this because one can probably always retroactively identify mapped locations and proclaim 'core' elements, but why should any genome region that affects a trait be considered 'non-core'?

In any case, that would be just a semantic point if it were not predictably the phrase that launched a thousand grant applications.  I think neither the basic claim of conceptual novelty, nor the breathless exploitive treatment of it by the news media, are warranted: we've known these basic facts about genomic complexity for a long time, even if the new analysis provides other ways to find or characterize the multiplicity of contributing genome regions.  This assumes that mapping markers are close enough to functionally relevant sites that the latter can be found, and that the unmappable fraction of the heritability isn't leading to over-interpretation of what is 'mapped' (reached significance) or that what isn't won't change the picture.

However, I think the first thing we really need to do is understand the futility of thinking of complex traits as genetic in the 'precision genomic medicine' sense, and the last thing we need is yet another slogan by which hands can remain clasped around billions of dollars for Big Data resting on false promises.  Yet even the new paper itself ends with the ritual ploy, the assertion of the essential need for more information--this time, on gene regulatory networks.  I think it's already safe to assure any reader that these, too, will prove to be as obvious and as elusively ephemeral as genome wide association studies (GWAS) have been.

So was GWAS a hoax on the public?
No!  We've had a theory of complex (quantitative) traits since the early 1900s.  Other authors argued similarly, but RA Fisher's famous 1918 paper is the typical landmark paper.  His theory was, simply put, that infinitely many genome sites contribute to quantitative (what we now call polygenic) traits.  The general model has jibed with the age-old experience of breeders who have used empirical strategies to improve crop or pet species.  Since association mapping (GWAS) became practicable, they have used mapping-related genotypes to help select animals for breeding; but genomic causation is so complex and changeable that they've recognized even this will have to be regularly updated.

But when genomewide mapping of complex traits was first really done (a prime example being BRCA genes and breast cancer) it seemed that apparently complex traits might, after all, have mappable genetic causes. BRCA1 was found by linkage mapping in multiply affected families (an important point!), in which a strong-effect allele was segregating.  The use of association mapping  was a tool of convenience: it used random samples (like cases vs controls) because one could hardly get sufficient multiply affected families for every trait one wanted to study.  GWAS rested on the assumption that genetic variants were identical by descent from common ancestral mutations, so that a current-day sample captured the latest descendants of an implied deep family: quite a conceptual coup based on the ability to identify association marker alleles across the genome identical by descent from the un-studied shared remote ancestors.

Until it was tried, we really didn't know how tractable such mapping of complex traits might be. Perhaps heritability estimates based on quantitative statistical models were hiding what really could be enumerable, replicable causes, in which case mapping could lead us to functionally relevant genes. It was certainly worth a try!

But it was quickly clear that this was in important ways a fool's errand.  Yes, some good things were to be found here and there, but the hoped-for miracle findings generally weren't there to be found. This, however, was a success not a failure!  It showed us what the genomic causal landscape looked like, in real data rather than just Fisher's theoretical imagination.  It was real science.  It was in the public interest.

But that was then.  It taught us its lessons, in clear terms (of which the new paper provides some detailed aspects).  But it long ago reached the point of diminishing returns.  In that sense, it's time to move on.

So, then, is GWAS a hoax?
Here, the answer must now be 'yes'!  Once the lesson is learned, bluntly speaking, continuing on is more a matter of keeping the funds flowing than of gaining profound new insights.  Anyone paying attention should by now know very well what the lessons of GWAS and related approaches have been: complex traits are not genetic in the usual sense of being due to tractable, replicable genetic causation.  Omnigenic traits, the new catchword, will prove the same.

There may not literally be infinitely many contributing sites as in the original statistical models, be they core or peripheral, but infinitely many isn't so far off.  Hundreds or thousands of sites, accounting for only a fraction of the heritability, means essentially infinitely many contributors for any practical purposes.  This is particularly so since the set is not a closed one:  new mutations are always arising and current variants dying away, and along with somatic mutation, the number of contributing sites is open ended, and not enumerable within or among samples.
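To make the arithmetic concrete, here is a minimal sketch in Python of what 'many sites, each explaining little' looks like.  It assumes a purely additive model in which a site with allele frequency p and per-allele effect a contributes 2p(1-p)a^2 to the genetic variance.  The function names, the number of sites, and the distributions the frequencies and effects are drawn from are all hypothetical choices made for illustration, not estimates from any real study.

```python
import random

def site_variances(n_sites, seed=0):
    """Additive genetic variance contributed by each site: 2*p*(1-p)*a**2.

    Allele frequencies p and effects a are drawn from arbitrary illustrative
    distributions (an assumption, not values taken from real data).
    """
    rng = random.Random(seed)
    variances = []
    for _ in range(n_sites):
        p = rng.uniform(0.01, 0.5)   # hypothetical allele frequency
        a = rng.expovariate(1.0)     # hypothetical per-allele effect size
        variances.append(2 * p * (1 - p) * a ** 2)
    return variances

def fraction_explained(variances, top_k):
    """Share of the total additive variance due to the top_k largest sites."""
    ranked = sorted(variances, reverse=True)
    return sum(ranked[:top_k]) / sum(ranked)

variances = site_variances(10_000)
for k in (10, 100, 1000, 10_000):
    # Print how much of the total the k strongest sites account for.
    print(k, round(fraction_explained(variances, k), 3))
```

Under these assumed distributions the handful of strongest sites accounts for only a small share of the total, and the share creeps up slowly as more sites are added.  A different choice of distributions would change the numbers, and the sketch leaves out entirely the open-endedness described above (new mutations, somatic mutation, changing environments).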

The problem is actually worse.  All these data are retrospective statistical fits to samples of past outcomes (e.g., sampled individuals' blood pressures, or cases' vs controls' genotypes).  Past experience is not an automatic prediction of future risk.  Future mutations are not predictable, not even in principle.  Future environments and lifestyles, including major climatic dislocations, wars, epidemics and the like are not predictable, not even in principle.  Future somatic mutations are not predictable, not even in principle.

GWAS have almost uniformly found that (1) mapping results differ among samples or populations, (2) only a fraction of heritability is accounted for by tens, hundreds, or even thousands of genome locations, and (3) even relatively replicable 'major' contributors, themselves usually (though not always) small in their absolute effect, have widely varying risk effects among samples.

These facts are all entirely expectable based on evolutionary considerations, and they have long been known, both in principle and from detailed mapping of complex traits.  Among the well-known reasons to expect this kind of picture is the blatantly obvious redundancy in genetic causation, which is the result of the origin of genes by duplication and of the highly complex pathways to our traits.  We've written about them here in the past.  So, given what we now know, more of this kind of Big Data is a hoax, and as such, a drain on public resources and, perhaps worse, on the public trust in science.

What 'omnigenic' might really mean is interesting.  It could mean that we're pressing up ever more intensely against the log-jam of understanding based on an enumerative gestalt about genetics.  Ever more detail, always promising that if we enumerate and catalog just a bit more (in this case, the authors say we need to study gene regulatory networks), we'll understand.  But that is a failure to ask the right question: why and how could every trait be affected by every part of the genome?  Until someone starts looking at the deeper mysteries we've been identifying, we won't have the transformative insight that seems to be called for, in my view.

To use Kuhn's term, this really is normal science pressing up against a conceptual barrier, in my view. The authors work the details, but there's scant hint they recognize we need something more than more of the same.  What is called for, I think, is young people who haven't already been propagandized about the current way of thinking, the current grantsmanship path to careers.

Perhaps more importantly, I think the situation is at present an especially cruel hoax, because there are real health problems, and real, tragic, truly genetic diseases that a major shift in public funding could enable real science to address.

Saturday, June 3, 2017

The real reason Graham Spanier's going to jail

This post will seem to be a serious diversion from our usual topics, but in another sense it is actually of the same sort.  There's a lesson to be learned from Penn State's recent inglorious history, or perhaps a 'meta-lesson', and since we are at Penn State, it's perhaps an appropriate context for us.

Our former President, Graham Spanier, was just yesterday sentenced to some jail time for his conviction on charges related to negligent response to reports of Jerry Sandusky's child abuse. Spanier and two other high officials at Penn State were convicted of essentially turning their heads away from reports of abuse.  They have essentially acknowledged this, though Dr Spanier still seems to be wriggling, unconvinced that, despite the tragedy for many abused boys, he looked the other way when action could have been taken.  All the evidence we see in the news, at least, suggests that the administrators did in fact do that, as did rightfully legendary football coach Joe Paterno.

They all didn't want to know unpleasant things, or conveniently didn't see them.  This doesn't mean they were bad people who wanted little boys to be abused.  It means they didn't respond to some indirect information they received about possible abuse going on in a Penn State athletic facility.  In a sense this was the most convenient response, rather than be bothered by something that they, after all, didn't see directly and that would at best be a distraction from their busy lives.

But there is a far deeper part of the story, and it has to do with why this happened. Seeing that bigger picture is the only way to understand what has happened.  The reason has to do with the nature of our society generally and how it applies to universities.  Our former President is going to jail because he was part of a system that favors money, convenience, and appearance over ethics.

The Spin Society
As university president, Graham Spanier seemed to have an insatiable hunger for attention.  He was in front of a camera all the time and everywhere.  He was a very good spinner and fundraiser.  The University's coffers grew admirably during his administration.  He became prominent in many ways, none of them intellectual or very seriously about education.

Penn State grew in size, even though it's hard to argue that bigger classes or athletic programs were about serious-level education.  Dr Spanier didn't impede good faculty hiring or strong research. Indeed, he approved of it and helped.  But his administration was not an intellectual one or a drive for higher educational quality or more rigor per se.  More research could be counted in terms of dollars flowing in and publications flowing out.  If it happened, or if he could help, that was terrific.

But Graham Spanier's downfall was because he was part of the Spin Society.  His presidency was focused on image, and image was based on money.  Uncomfortable facts were dealt with quietly, or brushed aside.  Social activism was ignored or given a patronizing pat on the head.  Anything that was good for image was good for Penn State.

Graham Spanier is going to jail because he was a cog in this wheel, a wheel that is the essence of contemporary American society.  In the Spin Society, competition for resources is the bottom line, more the operant word, and cardboard cutouts the easiest approach to take.  Boards of Trustees expect it (and may not tolerate Presidents who actually try to do something to shake the system itself).  Since Boards are responsible for the tenor of the university and the tenure of its leaders, maybe it is they who should be behind bars, not the agent whom they encouraged and rewarded for 15+ years.

If we see Spanier's sentence as a just reward for a negligent official or a miscreant, then we are misunderstanding what happened and are complicit in the worse that is likely to come.  If we see this as a Penn State problem, then we are complicit in allowing the rest of the country to carry on business as has become usual.

A 1952 French movie called Nous Sommes Tous des Assassins, or We Are All Assassins, was the story of a murderer on death row who was being personally held responsible and punished, even though his acts were more deeply the result of the nature of society, for which we are all responsible.

We do not need an increasingly spin-driven society, where essential dishonesty is at the core of how we operate. There surely are other, better ways to live.

And in a case like this one at Penn State, who is really the guilty party?

Friday, June 2, 2017

Allegiance to the Earth: The Environmentalism Pledge


For her final project in Anthropology/Biology 282G Sapiens: The Changing Nature of Human Evolution, recent University of Rhode Island graduate Marisa DeCollibus created something wonderful.

During her studies at URI, she gained expertise in psychology, learning, and education. From that vantage, she wrote a pledge of allegiance to the earth to be recited daily by K-12 students.

In the companion paper she writes, "Today's Kindergarteners are tomorrow's impactful Sapiens ..." and "I tried to create imagery that invoked being a part of a collective landscape instead of being the rulers of the landscape..."

Here's the pledge.

The Environmentalism Pledge
by Marisa DeCollibus

I pledge to care for the natural world
Of all living and nonliving formations
And to the resources
Of which we share
One planet
Amongst stars
Irreplaceable
With intention
And effort
For all

**
Please share widely and please let us know if you think it's just as wonderful as we do and, especially, if you begin reciting this with your children. If you'd like to get in touch with Marisa, please let us know.