Friday, October 19, 2018

Nyah, nyah! My study's bigger than your study!!

It looks like a food-fight at the Precision Corral!  Maybe the Big Data era is over!  That's because what we really seem to need (of course) is even bigger GWAS or other sorts of enumerative (or EnumerOmics studies, because then (and only then) will we really realize how complex traits are caused, so that we can produce 'precision' genomic medicine to cure all that ails us.  After all, there no such thing as enough 'data' or a big (and open-ended) enough study.  Of course, because so much, at stake, such a food-fight is not just children in a sand box, but purported adults, scientists even, wanting more money from you, the taxpayer (what else?).  The contest will never end on its own.  It will have to be ended from the outside, in one way or another, because it is predatory: it takes resources away from what might be focused, limited, but actually successful problem-solving research.

The idea that we need larger and larger GWAS studies, not to mention almost any other kind of 'omics enumerative study, reflects the deeper idea that we have no idea what to do with what we've got.  The easiest word to say is "more", because that keeps the fiscal flood gates open.  Just as preachers keep the plate full by promising redemption in the future--a future that, like an oasis to desert trekkers, can be a mirage never reached, scientists are modern preachers who've learned the tricks of the trade.  And, of course, since each group wants its flood gates to stay wide open it must resist any even faint suggestion that somebody else's gates might open wider.

There is a kind of desperate defense, as well as food fight, over the situation.  This, at least, is one way to view a recent exchange between an assertion by Boyle et al. (Cell 169(7):1177-86, 2018**)  that some few key genes perhaps with rare alleles scattered across the genome are the 'core' genes responsible for complex diseases, but that lesser often indirect or incidental genes across the genome provide other pathways to affect a trait, and are detected in GWAS.  If a focus on this model were to take place, it might threaten the gravy train of more traditional, more mindless, Big Data chasing. As a plea to avoid that is Wray et al.'s falsely polite spitball in return (Cell 173:1573-80, 2018**)  urging that things really are spread all over the genome, differently so in everyone.  Thus, of course, the really true answer is some statistical prediction method, after we have more and even larger studies.

Could it be, possibly, that this is at its root merely a defense of large statistical data bases and Big Data per se, expressed as if it were a legitimate debate about biological causation?  Could it be that for vested interests, if you have a well-funded hammer everything can be presented as if it were a nail (or, rather, a bucket's worth of nails, scattered all over the place)?

Am I being snide here? 
Yes, of course. I'm not the Ultimate Authority to adjudicate about who's right, or what metric to use, or how many genome sites, in which individuals, can dance on the head of the same 'omics trait.  But I'm not just being snide.  One reason is that both the Boyle and Wray papers are right, as I'll explain.

The arguments seem in essence to assert that complex traits are due either to many genetic variants strewn across the genome, or to a few rare larger-effect alleles here and there complemented by nearby variants that may involve indirect pathways to the 'main' genes, and that these are scattered across the genome ('omnigenic').  Or that we can tinker with GWAS results and various technical measurements from them to get the real truth?

We are chasing our tails these days in an endless-seeming circle to see who can do the biggest and most detailed enumerative study, to find the most and tiniest of effects, with the most open-ended largesse, while Rome burns.  Rome, here, are the victims of the many diseases which might be studied with actual positive therapeutic results by more, focused, if smaller, studies.  Or, in many cases, by a real effort at revealing and ameliorating the lifestyle exposures that typically, one might say overwhelmingly, are responsible for common diseases.

If, sadly, it were to turn out that there is no more integrative way, other than add-'em-up, by which genetic variants cause or predispose to disease, then at least we should know that and spend our research resources elsewhere, where they might do good for someone other than universities.  I actually happen to think that life is more integratively orderly than its effects typically being enumeratively additive, and that more thoughtful approaches, indeed reflecting findings of the decades of GWAS data, might lead to better understanding of complex traits.  But this seemingly can't be achieved by just sampling extensively enough to estimate 'interactions'.  The interactions may, and I think probably, have higher-level structure that can be addressed in other ways.

But if not, if these traits are as they seem, and there is no such simplifying understanding to be had, then let's come clean to the public and invest our resources in other ways to improve our lives before these additive trivia add up to our ends when those supporting the work tire of exaggerated promises.

Our scientific system, that we collectively let grow like mushrooms because it was good for our self interests, puts us in a situation where we must sing for our supper (often literally, if investigators' salary depends on grants).  No one can be surprised at the cacophony of top-of-the-voice arias ("Me-me-meeeee!").  Human systems can't be perfect, but they can be perfected.  At some point, perhaps we'll start doing that.  If it happens, it will only partly reflect the particular scientific issues at issue, because it's mainly about the underlying system itself.

**NOTE: We provide links to sources, but, yep, they are paywalled --unless you just want to see the abstract or have access to an academic library.  If you have the looney idea that as a taxpayer you have already paid for this research so private selling of its results should be illegal--sorry!--that's not our society.


Ken Weiss said...

Damn! On Twitter, Tarquin Holmes' tweet pointed out the famous story by Borges, On Exactitude in Science, about a mapmaker whose map was so good.....that it was the full size of the entire territory being mapped. That's about it: the goal for the Big Data people inevitably must be to sequence everyone:. the guaranteed never-ending 'study'!

Hey, wait a sec! This might go by some other name, such as national health database for a comprehensive national health service. Geez, some civilized nations actually have such civilized health services, so the BiggestPossibleBigData study might actually be possible and, ironically, for the public good. Presumably genome sequencing won't be so expensive that it can, if there's a good reason, be included.

Of course, even that neglects somatic mutation, environments, and some fundamental other things. Such as that lifestyle changes can, by and large, make by far more improvement in public health. But at least if we were to do something noble, we'd be moving in the right direction....

Ken Weiss said...

It can't be proved and would not generally be stated out loud, but in the past there seemed to be clear (and, I know, sotto voce) concerns on both sides of the Atlantic (and perhaps in China and Japan) ideas that if one country out-did another in Big Data, it might lead to Big Pharma Profits from discoveries, so that all the Big Players had to be in on consortium or other agreements. If for nobler reasons, one might have to agree that cooperation rather than competition (esp. if the latter were for proprietary or profiteering motives), would be ideal. Something like a United Nations Global Health Resource.