Friday, June 4, 2010

A TUF nut to translate? The function of 'noncoding' RNA

One of the fundamental principles of life, which we've written about from time to time and include a lot about in our book, is chance. It's everywhere, from the randomness of which genes we inherit from each parent, to being the unlucky fish that gets caught in the shark's maw as it sweeps through a school with an open mouth.

Another aspect of chance is sloppiness, which is also found everywhere. Echolocating bats are notably imprecise in their discrimination among prey, and error-prone transcription of DNA into RNA is routine--it has been estimated, in fact, that perhaps 3% of a cell's energy is used to correct transcription errors.*

The environment presents us with the unpredictable, too, in the form of phenomena like hail storms or hurricanes, unusual heat or cold, volcanoes, etc., as well as the regular, predictable fluctuation of the seasons. (Of course, as Ken is a former meteorologist, he's inserting the self-defense caveat here that weather is not entirely unpredictable in the short run, though many details are, and weather may never be predictable very many days in advance--but then, for organisms other than humans, and for humans until only recently, this is irrelevant.)

Chance is so ubiquitous that being able to adapt to chance effects--within limits, of course; no amount of adaptiveness will help that fish escape being that marauding shark's dinner--is a characteristic of life that had to have evolved very early because all organisms can do it to some degree. The lineages that couldn't disappeared long ago.

Errors and chance are found at the molecular level, too. It has been known for decades that only a small fraction of DNA is transcribed into messenger RNA and translated into proteins, and it was thought that the 98% of the genome that wasn't transcribed was 'junk', detritus from evolutionary trial and error. But recently, unexpected transcripts from non-coding DNA have been reported by a number of labs, leading to speculation about the actual role of all that 'junk DNA'. A recent paper in PLoS Biology by van Bakel et al., accompanied by a commentary, addresses this question. van Bakel et al. describe these excess transcripts this way:
Dubbed transcriptional “dark matter”, the “hidden” transcriptome, or transcripts of unknown function (TUFs), the exact nature of much of this additional transcription is unclear, but it has been presumed to comprise a combination of novel protein coding transcripts, extensions of existing transcripts, noncoding RNAs (ncRNAs), antisense transcripts, and biological or experimental background. Determining the relative contributions of each of these potential sources is important for understanding the nature and possible biological function of transcriptional dark matter.
To find out what these are, van Bakel et al. compared the usual method for identifying transcripts (tiling arrays) with a "single- and paired-end RNA-Seq" method, and found that RNA-Seq identified far fewer unexpected, or 'dark matter', transcripts (reminding us that high-powered technology can be error-prone, too). Most of the transcripts were identifiably from intronic regions, suggesting that they were perhaps "fragments of pre-mRNAs", and were associated with open chromatin, that is, segments of DNA that are open for business, ready to be transcribed.
We conclude that analysis of data from tiling arrays leads to vast overestimates of the proportion of transcriptional dark matter. However, the mammalian transcriptome does contain thousands of unannotated transcripts, exons, promoters, and termination sites.
That is, they still found enough unidentified stuff to write home about, maybe 2% of the transcripts they analyzed. While that is considerably less than the dark matter that's given rise to so much speculation, it still suggests that even after 3 billion years, the enzymes that copy DNA into RNA make mistakes. As Richard Robinson says in the PLoS commentary,
The emerging picture of RNA polymerase is of an inherently imprecise, not to say promiscuous, copyist, one whose output includes some mistakes along with lots of valuable product. In this view, most dark matter transcripts are not signals emerging from a hidden universe within the genome, but instead simply the noise emitted by a busy machine.
If these latest results are borne out, it will mean that at least some of the enthusiasm in recent years for the idea that there is a huge unknown realm of DNA functions uninvolved with direct protein expression may not be completely warranted, and that our standard ('old-fashioned'?) theory, including the error-proneness of the system, may be pretty accurate after all.

*Kurland, C., and J. Gallant. (1996). "Errors of heterologous protein expression." Current Opinion in Biotechnology 7(5):489-493.


Holly Dunsworth said...


Ken Weiss said...

Most new discoveries have added to rather than subtracted from genetic complexity. If this paper is right, it's a case of our over-reaction to the cool term and findings of 'dark matter' in the genome. Time will tell who's been right.

James Goetz said...

From my perspective, if various DNA has no significant selective constraints, then it's junk DNA regardless of what it does.

James Goetz said...

I also want to add that large quantities of DNA with no selective constraints are consistent with observation and mathematics. For example, in any given lineage, if the total quantity of neutral mutational insertions exceeds the total quantity of neutral mutational deletions, then the lineage would accumulate neutral insertions. And basic concepts of mutations imply significantly more insertions compared to deletions, at least in many animals. And as stated above, I hold that DNA with no selective constraints has no evolutionary importance apart from occasional random generation of new genes. I hope I stated this clearly and accurately.
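
The arithmetic behind that last comment can be sketched in a few lines of Python. This is only a toy illustration, not anything from van Bakel et al. or from the commenter: the function names, the per-generation probabilities, and the starting length are all made-up assumptions. It treats the length of an unconstrained stretch of DNA as a random walk in which, each generation, one base is inserted with probability p_insert or deleted with probability p_delete.

```python
import random


def expected_length(generations, p_insert, p_delete, start_length=1000):
    """Expected length after a number of generations.

    Each generation adds one base with probability p_insert and removes
    one with probability p_delete, so the expected net change per
    generation is (p_insert - p_delete). The floor at zero length is
    ignored here.
    """
    return start_length + generations * (p_insert - p_delete)


def simulate_length(generations, p_insert, p_delete, start_length=1000, seed=0):
    """Simulate one lineage as a random walk in sequence length.

    All rates are hypothetical; no selection acts on the region.
    """
    rng = random.Random(seed)  # seeded so a run can be repeated
    length = start_length
    for _ in range(generations):
        r = rng.random()
        if r < p_insert:
            length += 1  # neutral insertion
        elif r < p_insert + p_delete and length > 0:
            length -= 1  # neutral deletion (length cannot go below zero)
    return length


if __name__ == "__main__":
    # Hypothetical rates: insertions twice as likely as deletions.
    print("expected:", expected_length(10_000, 0.02, 0.01))
    print("simulated:", simulate_length(10_000, 0.02, 0.01))
```

With insertions even slightly more likely than deletions, the region drifts longer over time without any selection at all; reverse the two rates and it shrinks. Whether real insertion rates exceed deletion rates in a given lineage is an empirical question the sketch does not answer.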