Catalogues of mammalian long noncoding RNAs: modest conservation and incompleteness.

Marques AC, Ponting CP - Genome Biol. (2009)

Bottom Line: Here we sought to resolve the reported discrepancy between the evolutionary rates for these two sets. Our analyses reveal lincRNA and macroRNA exon sequences to be subject to the same relatively low degree of sequence constraint. The more tissue-specific macroRNAs are enriched in predicted RNA secondary structures and thus may often act in trans, whereas the more highly and broadly expressed lincRNAs appear more likely to act in the cis-regulation of adjacent transcription factor genes.

Affiliation: MRC Functional Genomics Unit, University of Oxford, Department of Physiology, Anatomy and Genetics, Oxford OX1 3QX, UK. ana.marques@dpag.ox.ac.uk

ABSTRACT

Background: Despite increasing interest in the noncoding fraction of transcriptomes, the number, species-conservation and functions, if any, of many non-protein-coding transcripts remain to be discovered. Two extensive long intergenic noncoding RNA (ncRNA) transcript catalogues are now available for mouse: over 3,000 macroRNAs identified by cDNA sequencing, and 1,600 long intergenic noncoding RNA (lincRNA) intervals that are predicted from chromatin-state maps. Previously we showed that macroRNAs tend to be more highly conserved than putatively neutral sequence, although only 5% of bases are predicted as constrained. By contrast, over a thousand lincRNAs were reported as being highly conserved. This apparent difference may account for the surprisingly small fraction (11%) of transcripts that are represented in both catalogues. Here we sought to resolve the reported discrepancy between the evolutionary rates for these two sets.

Results: Our analyses reveal lincRNA and macroRNA exon sequences to be subject to the same relatively low degree of sequence constraint. Nonetheless, our observations are consistent with the functionality of a fraction of ncRNA in these sets, with up to a quarter of ncRNA exons having evolved significantly slower than neighboring neutral sequence. The more tissue-specific macroRNAs are enriched in predicted RNA secondary structures and thus may often act in trans, whereas the more highly and broadly expressed lincRNAs appear more likely to act in the cis-regulation of adjacent transcription factor genes.

Conclusions: Taken together, our results indicate that each of the two ncRNA catalogues unevenly and lightly samples the true, much larger, ncRNA repertoire of the mouse.

Figure 3: Distribution of highly conserved sequence across ncRNA exon sequences. Examples of phastCons elements (as in [38]) within (a) lincRNA (located on chromosome 10, 68730506-68731547) and (b) macroRNA (located on chromosome 1, 47378880-47380310) exons. Blue histograms represent the conservation in 17 vertebrates based on a phylogenetic hidden Markov model [13]. Green histograms represent pairwise conservation to other vertebrate species. Images have been taken from the UCSC genome browser.

Mentions: Guttman et al. previously presented evidence that the highest level of constraint in 12-nucleotide windows within lincRNA exons exceeds that within macroRNAs [10]. This would be consistent with the modest difference in multi-species conserved (phastCons) elements between the two sets (Table 1). Nevertheless, a greater proportion of macroRNA exons show significant evidence of constraint than lincRNA exons (Table 1). Taken together, these findings are consistent with lincRNA exons containing short regions of highly constrained sequence, whereas constraint in macroRNA exons is distributed more diffusely (Figure 3). Short functional DNA elements, such as those regulating the expression of transcription factor genes, may contribute more to sequence constraint on lincRNA exons than they do to macroRNA exon constraint. Furthermore, in contrast to macroRNAs, we found no statistical evidence for lincRNAs being enriched in predicted RNA secondary structures. Consequently, macroRNA locus function may more frequently be RNA sequence-specific, whereas lincRNA loci may more often act in an RNA sequence-independent manner, for example, by transcriptional interference [36].
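
The contrast drawn above between short, highly constrained regions and diffusely distributed constraint can be illustrated with a small sketch. It is not the authors' or Guttman et al.'s method: the per-base scores are invented toy values standing in for something like phastCons scores, and the function names `max_window_mean` and `overall_mean` are hypothetical. The only detail taken from the text is the 12-nucleotide window length.

```python
from typing import Sequence


def max_window_mean(scores: Sequence[float], window: int = 12) -> float:
    """Return the highest mean score over any contiguous window of `window` bases."""
    if len(scores) < window:
        raise ValueError("sequence is shorter than the window")
    running = sum(scores[:window])  # sum of the first window
    best = running
    for i in range(window, len(scores)):
        # Slide the window one base to the right.
        running += scores[i] - scores[i - window]
        best = max(best, running)
    return best / window


def overall_mean(scores: Sequence[float]) -> float:
    """Return the mean score across the whole sequence."""
    return sum(scores) / len(scores)


# Toy per-base conservation scores (invented for illustration, not from the paper).
# "concentrated": one short, highly conserved block in an otherwise unconserved exon.
concentrated = [0.05] * 44 + [0.95] * 12 + [0.05] * 44
# "diffuse": the same total conservation spread evenly over the exon.
diffuse = [0.158] * 100

for name, exon in (("concentrated", concentrated), ("diffuse", diffuse)):
    print(name, round(overall_mean(exon), 3), round(max_window_mean(exon), 3))
```

Both toy exons have the same overall mean score, yet the window statistic is far higher for the concentrated one. This is why a maximum-over-windows measure and an exon-wide measure of constraint can rank two sets of exons differently without contradicting each other.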

