Thursday, April 4, 2013

When the crystal ball is cloudy: calling sequence data correctly

Here's a monkey wrench of a paper (O'Rawe et al.), just published in Genome Medicine.  We're all being sold on the idea that knowing our whole genome sequence is going to make us much healthier. The DNA sequencer cum crystal ball will tell us what we're likely to be in for, and this will give us plenty of lead time to prevent it -- by running, lowering our cholesterol intake, losing weight, or whatever -- or to prepare for it.

But, among many other assumptions, this assumes first and foremost that the data are being read correctly, with no false positives or negatives.  And here's the clincher: O'Rawe et al. compared five different software packages that read and interpret DNA sequence data, and they report low concordance between results.  Discrepancies have been found before, but not when comparing reads of the same raw data.

This group sequenced the whole exomes of 15 individuals in 4 families, and fed the raw data through 5 sequence analysis pipelines.  They also sequenced one whole genome.  Exomes were sequenced at 20-154X coverage (120X on average), meaning each targeted nucleotide was read at least 20 times, usually many more, and at least 80% of the target sequence was obtained.
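To make the coverage numbers concrete, here is a minimal sketch of how mean coverage and the fraction of the target covered at a given depth are computed.  The depths below are invented for illustration, not the paper's data:

```python
# Hypothetical per-site read depths across a tiny 10-base "target" region.
# (Invented numbers, just to illustrate what 20-154X coverage means.)
depths = [22, 118, 97, 154, 20, 131, 0, 88, 140, 60]

# Mean coverage: average number of reads overlapping each target position.
mean_coverage = sum(depths) / len(depths)

# Fraction of the target read at least 20 times (the paper's minimum).
fraction_at_20x = sum(1 for d in depths if d >= 20) / len(depths)

print(f"mean coverage: {mean_coverage:.0f}X")
print(f"target covered at >=20X: {fraction_at_20x:.0%}")
```

A site with zero depth (like the one above) is simply never read, so no caller, however good, can report a variant there.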

They found that the 5 programs agreed on single nucleotide variants (SNVs) about 60% of the time.  That is, 40% of the time a SNV was called by fewer than 5 of the programs.  Each of the pipelines detects variants that the others do not, and they aren't necessarily all false positives.
This disagreement is likely the result of many factors including alignment methods, post alignment data processing, parameterization efficacy of alignment and variant calling algorithms, and the underlying models utilized by the variant calling algorithm(s).
That is, each step along the way potentially introduces errors.  Indel (insertion/deletion: the gain or loss of one or more nucleotides) concordance rates were even lower, at 26% among three indel-calling programs.  (The paper goes into much more detail about specific pipelines and error rates.)  Using family data can help reduce inaccuracies when it makes it possible to determine which calls simply cannot be correct.  But otherwise, with current methods, reducing false positives means increasing false negatives, and vice versa.
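The concordance comparison itself amounts to set arithmetic on each pipeline's variant calls.  A toy sketch of the idea, where the pipeline names and the calls themselves are hypothetical, not taken from the paper:

```python
# Toy sketch of pipeline concordance: treat a variant call as a
# (position, alternate allele) pair, one call set per pipeline.
# All names and calls here are invented for illustration.
calls = {
    "pipeline_A": {(101, "A"), (250, "T"), (309, "G"), (402, "C")},
    "pipeline_B": {(101, "A"), (250, "T"), (309, "G")},
    "pipeline_C": {(101, "A"), (250, "T"), (470, "G")},
    "pipeline_D": {(101, "A"), (250, "T"), (309, "G"), (470, "G")},
    "pipeline_E": {(101, "A"), (250, "T"), (555, "A")},
}

union = set().union(*calls.values())           # called by at least one pipeline
consensus = set.intersection(*calls.values())  # called by all five

# Fraction of all distinct calls on which every pipeline agrees.
concordance = len(consensus) / len(union)
print(f"{len(consensus)} of {len(union)} variants called by all pipelines "
      f"({concordance:.0%} full concordance)")
```

The calls outside the consensus are the hard part: each could be a false positive in the pipeline that made it, or a false negative in the pipelines that missed it.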

The authors write,
In the realm of biomedical research, every variant call is a hypothesis to be tested in light of the overall research design. Missing even a single variant can mean the difference between discovering a disease causing mutation or not. For this reason, our data suggest that using a single bioinformatics pipeline for discovering disease related variation is not always sufficient.
This somewhat understates the problem.  Seriously testing a SNP (single nucleotide polymorphism) to see whether it affects disease risk is no joke--especially when these effects are typically very small in any case, and biased upwards in GWAS-type data.  What do you do?  Put that single change into a lab mouse or rat and see if it might be more likely to develop slightly higher blood pressure in old age?  Or have a slightly higher risk of some sort of cancer (again, to be a human model, it should be at older ages)?  Which mouse strain would you use?  If humans are to be used for validation, how would you do it?

The questions are serious because miscalls by sequencers go both ways.  A sequencer can miss a SNV call, so you don't identify one of the variants that you really want to be checking.  Or, it can give you a false positive, and lead you farther astray.  And if you must choose between hundreds of variants across the genome, with comparable estimated effects, you are already in a bit of a bind even if they are all perfectly called!

No technology, or medical test, will be correct 100 percent of the time, and sequencing technologies are likely to get better, not worse (though if MS Windows is any guide, that's not necessarily true!).  But, when disease risk estimates depend on accurate DNA sequence, it is obvious that we are way premature in proclaiming findings so loudly and demanding that so much effort and resources be poured into doing more of the same.  Again, focused studies of problems that are more important, clearer, and less vulnerable to these kinds of errors are where the effort should be going.

And, some subtle manipulation, too?
By the way, the standard term for a single nucleotide variation in a population is SNP (single nucleotide polymorphism).  Now, some authors use SNV (single nucleotide variant), essentially doing two things.  First, they are rhetorically equating 'variant' with causal variant--that is, tacitly, subtly, or surreptitiously planting in your mind that they are onto something causal.  And second, they are tacitly, subtly, or surreptitiously suggesting that one of the two is the 'good' or 'normal' (i.e., health-associated) variant.  This perpetuates the 'wild type' thinking--see our earlier post 'walk on the wild-type side'.

These are ways in which the community of researchers inadvertently or intentionally (you decide which) cooks the books in your mind, in journalists' minds, and even in their own, entrenching a de facto genetic-causation worldview in everybody's thinking.  That's good for business, of course.


Anonymous said...

Thanks for posting this, I wasn't aware of the paper. And excellent point regarding de facto "cooking" the data, spinning occurrence into something causal.

Ken Weiss said...

Thanks for your comment.
The issue is subtle. We really don't want to make things up because we're careless, but history, going back as far as we can trace it, shows our tendency to accept an explanatory framework (natural or supernatural) into which we retro-fit what we observe.

Historians and philosophers of science have also shown our history of being blind to errors, vulnerable to wishful thinking, affected by professional or other vested interests, and less critical of our results than we profess to be. Our study designs themselves tend to reinforce our biases.

These things all deserve criticism, as a corrective, but it's also true that we do need some working framework, and even biases, to get anything done. And even our gross mistakes can lead to something (as, for example, alchemy led to chemistry).

Our impatient and 'productivity' culture speeds this up, and the number of us who need to earn our living this way exacerbates the problem.

So the task is to balance wastefulness, over-acceptance, and haste with what can be most truly productive.

In these senses, it is really largely societal decisions and priorities, plus other aspects of politics, and not only the facts, that decide what is to be done in science.

Anne Buchanan said...

The paper, of course, goes into detail about error associated with each of the pipelines and so forth, so if you are interested in the detail, do see the paper.

There has always been error associated with sequencing and genotyping, but it's all so high tech now that people seem to have forgotten the old GIGO caution -- "garbage in garbage out".

Anonymous said...

Ken, today more than most days I feel the force of your "explanatory framework" comment--after having received yet another rejection on an article of mine reporting what I would consider a potentially foundational aspect of autism genetics, but which, I admit, may pose more questions than it gives answers. Alas, editors are not fond of these kinds of papers. I'm having to downgrade my sights re journals and go with a sure thing just to get it published. I've been surprised by the resistance to a subtle shift in conceptual framework. But then again, I'm not so surprised after all.

Anne, I'm definitely going to be going over the paper in more detail. I was overrun with labwork today, but have it sitting on my computer desktop in wait. :)

Ken Weiss said...

Rejection is always hard to take, and we may sometimes of course come up short (i.e., reviewers can be correct). But there seems to be no doubt that thinking too differently is punished by these forms of ostracism.

Fortunately, if one pays serious attention to what the reviewers or editors were saying, one may be able to titrate one's work to give it a better shot next time. But perhaps more important, and I think that on the whole it's a very good thing, even 'lesser' journals are indexed and easily Googlable, so ideas that are 'out there' can be found.

People can put their work on their web pages, use the blogosphere, and so on, so that one can circumvent the atherosclerotic aspects of established orthodoxy.

Even if most new ideas are wrong (and basically all ideas are wrong to some extent), airing them is a way to correct what's current, nudging it to better directions. Even wrong ideas have reasons behind them, except for those from the truly wacky, so that they can help us all think about shortcomings in the current prevailing view and ask 'why does so-and-so think we need his new idea?', and hence sharpen our explanations of why we think the current view is right.

It is also not new that one needs to have a thick skin if one's work is out there for public viewing.

I'm listening right now to BBC Radio 3 (classical music). They're having a wonderful Puccini week. Turandot is playing as I write, but yesterday they were describing how abashed even he felt when one of his works (Butterfly) was criticized. He was in good company: Verdi was horribly upset by critics who among other things said he was copying Wagner, and so was Wagner himself, who was so far off the charts.

It's the job of critics to criticize, of course, but it should be in a positive or constructive sense. In these musical examples, the victims were people of creative genius who happened not just to be toeing the conventional line.

Of course, sadly, few of us are in the Puccini category.....

Anonymous said...

Very true, and even though I've had critics who've crossed the line from critique into outright criticism, some of their messages have rung true and, when I listened, often made the paper that much stronger. At the moment, however, the paper-- while definitely reporting results from an experiment-- is more an observation about a common feature which seems to characterize autism-related genes in general. It was a small study, meant mainly as a bioinformatics approach as opposed to anything wetlab-based. The first critique was simply that the study was too specific a topic for the journal; the second, that they wanted to see more work on underlying mechanisms. This latter one I can definitely agree with, but the purpose of the study was more preliminary, and we'll be moving into the lab-based work in the future.

But I guess one problem was that it was more an observation with hints of causation, but nothing strictly delineated. And I made it so purposefully, as an observation which could lead to new avenues of research. I stressed "relationship" as opposed to "causation". Which was why your comment about how articles are "spun" reminded me of my current situation.

But I do agree regarding smaller journals and simply getting it out there on the net. Which is Plan B and so it's gone on for review now.

Ken Weiss said...

"More work needed" and "too specific for this journal" are polite reasons or excuses for bumping something reviewers don't want to see out there (or, to be more charitable, honorably can't see the value of because of their own blinkers).

See our post on Broadreach University. There, we suggest that the current changes in how business is done, using online publication and social media etc., will help loosen these kinds of constraints. Some of this is pay-as-you-go publishing, which has its problems and may continue to favor the funded elite, but there are also outlets like PeerJ that don't do that.

You might search MT for 'peer review' where we've also commented on these problems. Articles in BioEssays have discussed them cogently as well.

Junior faculty will have to mount a grass roots charge. There will be bumps along the way, but creative uses of the social network sphere, broadly defined, are coming and will help provide end-around possibilities.

Anne Buchanan said...

Michael Eisen's March 27 lecture on "The Past, Present, and Future of Scholarly Publishing" (the transcript of which he posts on his blog) is, I think, a wonderful summary of the whole modern enterprise.

Anonymous said...

Thanks for the resources, Ken and Anne. A new age certainly seems to be dawning, one in which I hope that information reigns more than researchers' names do (not that we should lose the importance of having "leaders" in the respective fields, who are also necessary for basic progress). I've been curious about PeerJ since I heard about it. After these latest experiences--a shock to myself, a young researcher who has so far had the unusual luck of remaining relatively naive to them, thanks to connections made over my partner's lifetime career--I can see the importance of this grass roots movement even more. I am definitely being made a convert.

Ken Weiss said...

I'm now an editor (one of a large number!) for PeerJ, so I'll see how it works.

Nature's March 28 issue has all sorts of stuff on open publishing. It must be a bit of a hedge or feint by them, since they make their money on media-drummed publicity and circulation (because of ads), so they can't be all that pleased. However, I think they've been hedging their bets with some online, pay-as-you-go versions.

We'll see how it all washes out, and probably soon, since the pubscape is changing so rapidly.

Ken Weiss said...

Did you see our post last week or so (google Broadreach) on how career evaluation by administrators and grant reviewers will have to change, in a positive direction, as well?

Anonymous said...

Congrats on the editorial position at PJ, Ken. :) I was wandering around the site and noticed a call for people; I'm wondering whether I might be helpful as a reviewer perhaps.

I didn't see the piece about Broadreach, I'll have to backtrack a bit.

Ken Weiss said...

It was March 29. Search on that or on Gutenberg. We gave our view on the impact of electrons vs printer's ink.

Daniel said...

This is a reasonable summary of the take-homes from this paper, although I think it's worth emphasizing that most findings from next-gen sequencing are validated with independent technology as a matter of course - so the impact of false positives on the research enterprise is not as great as you might imagine. False negatives, on the other hand, continue to plague us.

Unfortunately I really disagree with this section of your post:

First, they are rhetorically equating 'variant' with causal variant--that is, tacitly, subtly, or surreptitiously planting in your mind that they are onto something causal.

Both the intent and, I would argue, the effect of this terminology change is precisely the opposite of this: this is a term that decreases rather than increases ambiguity about causality. In fact the term variant has been adopted precisely because it is a neutral term that avoids the baggage associated with the terms "polymorphism" (a variant generally assumed to be benign) and "mutation" (a variant thought to be pathogenic).

And second, they are tacitly, subtly, or surreptitiously suggesting that one of the two is the 'good' or 'normal' (i.e., health-associated) variant.

This objection has more merit in it - but not enough, I think, to argue against adopting the term over its historically muddled alternatives.

Ken Weiss said...

One can have differing opinions on this. Many years ago we had a perfectly good term for nucleotide variants: SNPs. It was a good and clear alternative to 'allele', because it focused on a single site in DNA.

It was somewhat co-opted, as a matter of convenience or, I would say, manipulation of just the issues in our post, when 'polymorphism' was redefined to mean some minimal minor allele frequency, say 1% or 10%.

That went along with redefinitions-of-convenience for designing HapMap, inclusion in dbSNP, and 'theories' like common variant/common disease. In CD/CV the 'common' changed when things didn't work out as they had been advertised ('common' originally meant things like 30% or more, when HapMap was being justified). And there, of course, 'variant' clearly meant disease!

It is true that 'mutation' and polymorphism have also been used as you say. But that was wrong to begin with. Mutation is an event and if we have a term--allele--we should not redefine mutation to mean pathogenic change, especially since the epistemology of causation clearly makes causal assignment dicey under many if not most circumstances.

I can't argue that using 'variant' to get around the mutation/SNP issue wouldn't be OK, but I think we would be better off not to keep inventing, and then redefining, terms without good cause.

Anyway, you're right to raise the issues, and I'd only say that terms should be for clarity and their meaning not made into moving targets to claim success when it hasn't been achieved, or to imply importance when it hasn't been demonstrated.

All life is semantics to a great extent (I think it was Wittgenstein who most famously stressed the fundamental nature of such things), but we should not be distracted by constant meaning changes.

Of course, we have a colossal example, and I have no idea what to do about it, with the word 'gene'! So V vs P is probably minor....

Ken Weiss said...

Let me amplify my thinking a little further--to explain, if not to defend, my view about these terms. Forgive me if you already know all of this.

In broad terms, in the early 1900s we only could see phenotypes. Most changed phenotypes ('mutations') were grotesquely harmful. The theory grew that there was a good ('wild type') variant and one or a few harmful ones. The former had been favored by adaptive selection, which would generally purge the latter.

This made it difficult to relate Mendelism to evolution, since the latter was about gradual refinements rather than big adaptive leaps.

As geneticists noticed that, in fact, many if not most mutations (heritable phenotype changes) were small and _not_ harmful, it became possible to resolve Mendel and Darwin, in the 1930s, in the 'modern synthesis.'

The idea of 'mutation' as bad also fit into the hyper-adaptationism that still prevails in biology and biomedicine, despite very extensive data and theory showing that most genetic changes have little if any effect--and that even conditional on their having an effect, it is typically very slightly harmful more often than slightly helpful.

Without getting into the serious epistemological challenges of what this means (we've discussed this in our MT book, and elsewhere), the 'neutral' view of biology in some senses led to the use of terms like 'mutation' for harmful, and SNP for innocuous. But how does one know? Only in some cases, like the major alleles at CFTR or PAH or BRCA, can we be very confident about harm, or its degree.

I may be justly accused of being a purist, but since _every_ new allele arises via mutation, and since the unconditional assignment of value (harm, help, neutral) is usually very problematic, and since terms like 'common' have been used in advocacy by various schools of thought, I think we should be careful about our terms.

If we can all agree to use variant instead of polymorphism (they're actually essentially synonymous, since both imply multiple states), then OK, but let's agree more clearly. At least, that's my view as a purist!

Anonymous said...

The way I see the terms used, all polymorphisms are variants, but not all variants are polymorphisms. The difference is frequency: for a variant to be a polymorphism, its frequency should exceed some threshold level. Both terms are neutral with respect to role in adverse phenotypes.

Ken Weiss said...

This is all obviously a matter of opinion on usage. Personally, I take poly-morphism to mean many forms, regardless of function (as you say). But why have a cutoff frequency? Historically that has varied and has often been used to make distinctions (and lobby for genomics promises) by saying that the minimal frequency must be, say, 1% or more. But the arbitrariness and changeability of 'some threshold' is the issue (for me).

Since this is arbitrary, I would prefer a different term such as common variant, or some concoction, that was clear as to its meaning and didn't change as a matter of convenience.

Some people, and I'm one, don't like reporting 'significance' values with stars etc. in papers, but just reporting the p-value without any cutoff: let the reader decide.

I'd prefer something like that here--but when it makes a difference, such as which SNP alleles you include in, say, a genomic study, then just say that you're using an x% minor allele frequency cutoff. Then we have clarity (at least, again, that's my view).
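That last suggestion--state the cutoff explicitly rather than bake an arbitrary threshold into the word 'polymorphism'--can be sketched in a few lines. The variant IDs and frequencies below are invented for illustration:

```python
# Sketch of making a minor-allele-frequency (MAF) cutoff explicit in a
# study's filtering step, instead of hiding it in terminology.
# All IDs and frequencies here are hypothetical.
variants = {
    "rs0001": 0.32,    # MAF at each (invented) site
    "rs0002": 0.008,
    "rs0003": 0.11,
    "rs0004": 0.0004,
}

def passes_cutoff(maf, cutoff=0.01):
    """Include a variant only if its MAF meets the stated cutoff."""
    return maf >= cutoff

included = sorted(v for v, maf in variants.items() if passes_cutoff(maf))
print("included at a stated 1% MAF cutoff:", included)
```

Whether 1% is the right threshold is exactly the arbitrariness at issue; the point is only that a stated, inspectable cutoff leaves the reader free to judge, much like reporting a raw p-value instead of significance stars.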