Comparative analysis of the transcriptome across distant species.

Gerstein MB, Rozowsky J, Yan KK, Wang D, Cheng C, Brown JB, Davis CA, Hillier L, Sisu C, Li JJ, Pei B, Harmanci AO, Duff MO, Djebali S, Alexander RP, Alver BH, Auerbach R, Bell K, Bickel PJ, Boeck ME, Boley NP, Booth BW, Cherbas L, Cherbas P, Di C, Dobin A, Drenkow J, Ewing B, Fang G, Fastuca M, Feingold EA, Frankish A, Gao G, Good PJ, Guigó R, Hammonds A, Harrow J, Hoskins RA, Howald C, Hu L, Huang H, Hubbard TJ, Huynh C, Jha S, Kasper D, Kato M, Kaufman TC, Kitchen RR, Ladewig E, Lagarde J, Lai E, Leng J, Lu Z, MacCoss M, May G, McWhirter R, Merrihew G, Miller DM, Mortazavi A, Murad R, Oliver B, Olson S, Park PJ, Pazin MJ, Perrimon N, Pervouchine D, Reinke V, Reymond A, Robinson G, Samsonova A, Saunders GI, Schlesinger F, Sethi A, Slack FJ, Spencer WC, Stoiber MH, Strasbourger P, Tanzer A, Thompson OA, Wan KH, Wang G, Wang H, Watkins KL, Wen J, Wen K, Xue C, Yang L, Yip K, Zaleski C, Zhang Y, Zheng H, Brenner SE, Graveley BR, Celniker SE, Gingeras TR, Waterston R - Nature (2014)

Bottom Line: Moreover, we use expression patterns to align the stages in worm and fly development and find a novel pairing between worm embryo and fly pupae, in addition to the embryo-to-embryo and larvae-to-larvae pairings. Furthermore, we find that the extent of non-canonical, non-coding transcription is similar in each organism, per base pair. Finally, we find in all three organisms that the gene-expression levels, both coding and non-coding, can be quantitatively predicted from chromatin features at the promoter using a 'universal model' based on a single set of organism-independent parameters.


Affiliation: [1] Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA [2] Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA [3] Department of Computer Science, Yale University, 51 Prospect Street, New Haven, Connecticut 06511, USA [4] [5].

ABSTRACT
The transcriptome is the readout of the genome. Identifying common features in it across distant species can reveal fundamental principles. To this end, the ENCODE and modENCODE consortia have generated large amounts of matched RNA-sequencing data for human, worm and fly. Uniform processing and comprehensive annotation of these data allow comparison across metazoan phyla, extending beyond earlier within-phylum transcriptome comparisons and revealing ancient, conserved features. Specifically, we discover co-expression modules shared across animals, many of which are enriched in developmental genes. Moreover, we use expression patterns to align the stages in worm and fly development and find a novel pairing between worm embryo and fly pupae, in addition to the embryo-to-embryo and larvae-to-larvae pairings. Furthermore, we find that the extent of non-canonical, non-coding transcription is similar in each organism, per base pair. Finally, we find in all three organisms that the gene-expression levels, both coding and non-coding, can be quantitatively predicted from chromatin features at the promoter using a 'universal model' based on a single set of organism-independent parameters.



Figure 5: Summary of annotated ncRNAs, TARs, and ncRNA predictions in each species. The number of elements, the base pairs covered and the fraction of the genome for each class (see also Supplement section C). There are comparable numbers of tRNAs in humans and worms but about half as many in fly. While the number of lncRNAs in human is more than an order of magnitude greater than in either worms or flies, the fractional genomic coverage in all three species is similar. Finally, humans have at least 5-fold more miRNAs, snoRNAs and snRNAs compared to worm or fly. The fraction of the genome covered by TARs (highlighted squares) for each species is similar. A large amount of non-canonical transcription occurs in the introns of annotated genes, presumably representing a mixture of unprocessed mRNAs and internally initiated transcripts. The remaining non-canonical transcription (249 Mb, 16 Mb, and 14 Mb in human, worm, and fly) is intergenic and occurs at low levels, comparable to that observed for introns (Table S2). Overall, the fraction of the genome transcribed, including intronic, exonic, and non-canonical transcription, is consistent with that previously reported for human despite the methodological differences in the analysis (Fig. S2, Supplement section C).

Mentions: The annotation in the resource represents capstones for the decade-long efforts in human, worm, and fly. The new annotation sets have numbers, sizes and families of protein-coding genes similar to previous compilations; however, the numbers of pseudogenes and annotated ncRNAs differ (Figs. ED2, ED3, S1). Also, the number of splicing events is greatly increased, resulting in a concomitant increase in protein complexity. We find the proportion of the different types of alternative splicing (e.g., exon skipping or intron retention) is generally similar across the three organisms; however, skipped exons predominate in human while retained introns are most common in worm and fly[7] (Figs. ED4, S1 and Table S1).

