Comparative analysis of the transcriptome across distant species.

Gerstein MB, Rozowsky J, Yan KK, Wang D, Cheng C, Brown JB, Davis CA, Hillier L, Sisu C, Li JJ, Pei B, Harmanci AO, Duff MO, Djebali S, Alexander RP, Alver BH, Auerbach R, Bell K, Bickel PJ, Boeck ME, Boley NP, Booth BW, Cherbas L, Cherbas P, Di C, Dobin A, Drenkow J, Ewing B, Fang G, Fastuca M, Feingold EA, Frankish A, Gao G, Good PJ, Guigó R, Hammonds A, Harrow J, Hoskins RA, Howald C, Hu L, Huang H, Hubbard TJ, Huynh C, Jha S, Kasper D, Kato M, Kaufman TC, Kitchen RR, Ladewig E, Lagarde J, Lai E, Leng J, Lu Z, MacCoss M, May G, McWhirter R, Merrihew G, Miller DM, Mortazavi A, Murad R, Oliver B, Olson S, Park PJ, Pazin MJ, Perrimon N, Pervouchine D, Reinke V, Reymond A, Robinson G, Samsonova A, Saunders GI, Schlesinger F, Sethi A, Slack FJ, Spencer WC, Stoiber MH, Strasbourger P, Tanzer A, Thompson OA, Wan KH, Wang G, Wang H, Watkins KL, Wen J, Wen K, Xue C, Yang L, Yip K, Zaleski C, Zhang Y, Zheng H, Brenner SE, Graveley BR, Celniker SE, Gingeras TR, Waterston R - Nature (2014)

Bottom Line: Moreover, we use expression patterns to align the stages in worm and fly development and find a novel pairing between worm embryo and fly pupae, in addition to the embryo-to-embryo and larvae-to-larvae pairings. Furthermore, we find that the extent of non-canonical, non-coding transcription is similar in each organism, per base pair. Finally, we find in all three organisms that the gene-expression levels, both coding and non-coding, can be quantitatively predicted from chromatin features at the promoter using a 'universal model' based on a single set of organism-independent parameters.


Affiliation: [1] Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA [2] Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA [3] Department of Computer Science, Yale University, 51 Prospect Street, New Haven, Connecticut 06511, USA [4] [5].

ABSTRACT
The transcriptome is the readout of the genome. Identifying common features in it across distant species can reveal fundamental principles. To this end, the ENCODE and modENCODE consortia have generated large amounts of matched RNA-sequencing data for human, worm and fly. Uniform processing and comprehensive annotation of these data allow comparison across metazoan phyla, extending beyond earlier within-phylum transcriptome comparisons and revealing ancient, conserved features. Specifically, we discover co-expression modules shared across animals, many of which are enriched in developmental genes. Moreover, we use expression patterns to align the stages in worm and fly development and find a novel pairing between worm embryo and fly pupae, in addition to the embryo-to-embryo and larvae-to-larvae pairings. Furthermore, we find that the extent of non-canonical, non-coding transcription is similar in each organism, per base pair. Finally, we find in all three organisms that the gene-expression levels, both coding and non-coding, can be quantitatively predicted from chromatin features at the promoter using a 'universal model' based on a single set of organism-independent parameters.

Figure 8: Details on Expression Clustering. (A) Pie charts showing gene conservation across 56 Ensembl species for the blocks in the Fig. 1 heatmap enclosed with the same symbol (i.e. pentagon here matches pentagon in Fig. 1a). Overall, species-specific modules tend to have fewer orthologs across 56 Ensembl species. (B) The expression levels of a conserved module (Module No. 5) in D. melanogaster and its orthologous counterparts in other 5 Drosophila species are plotted against time. The x-axis represents the middle time points of two-hour periods at fly embryo stages. The boxes represent the log10 modular expression levels from microarray data of 6 Drosophila species centered by their medians. The modular expression divergence (inter-quartile region) becomes minimal during the fly phylotypic stage (brown, 8-10 hours). (C) The modular expression correlations over a sliding 2-hour window (Pearson correlation per 5 stages, middle time of two-hour period in x-axis) among 16 modules in worm are plotted. The modular correlations (median shown as bar height in y-axis) are highest during the worm phylotypic stages (brown), 6-8 hours. One can, in fact, directly see this coordination as a local maximum in the between-module correlation for the worm, which has a more densely sampled developmental time course. (This figure provides more detail on Fig. 1a and 1c. More details on all parts of this figure are in Supplement section D and Figure S3.)
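
The between-module correlation described in panel (C) can be illustrated with a small sketch. This is not the authors' code: the function name `sliding_module_correlation`, the layout of the input matrix (one row per module, one column per time point) and the reading of "per 5 stages" as a window of 5 consecutive time points are all assumptions made for illustration.

```python
import numpy as np


def sliding_module_correlation(expr, window=5):
    """Median pairwise Pearson correlation between modules per window.

    expr   : array of shape (n_modules, n_timepoints), one expression
             profile per module (hypothetical input layout).
    window : number of consecutive time points per window (assumed to be 5,
             following the caption's "per 5 stages").
    Returns one median correlation for each window position.
    """
    expr = np.asarray(expr, dtype=float)
    n_modules, n_time = expr.shape
    # Index pairs (i, j) with i < j, i.e. each module pair counted once.
    upper = np.triu_indices(n_modules, k=1)
    medians = []
    for start in range(n_time - window + 1):
        block = expr[:, start:start + window]
        # np.corrcoef treats each row as one variable.
        corr = np.corrcoef(block)
        medians.append(np.median(corr[upper]))
    return np.array(medians)


# Toy input: three made-up module profiles over seven time points.
base = np.array([0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0])
toy = np.vstack([base, 2.0 * base + 1.0, 0.5 * base - 3.0])
per_window = sliding_module_correlation(toy, window=5)
```

A peak in the returned values would correspond to the local maximum the caption describes. Note that a module whose expression is constant within a window has undefined Pearson correlation, so real data may need filtering first.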

Mentions: Given the uniformly processed nature of the data and annotations, we were able to make comparisons across organisms. First, we built co-expression modules, extending earlier analysis[14] (Fig. 1a). To detect modules consistently across the three species, we combined across-species orthology and within-species co-expression relationships. In the resulting multilayer network we searched for dense subgraphs (modules), using simulated annealing[15,16]. We found some modules dominated by a single species, whereas others contain genes from two or three. As expected, the modules with genes from multiple species are enriched in orthologs. Moreover, a phylogenetic analysis shows that the genes in such modules are more conserved across 56 diverse animal species (Figs. ED6, S3). To focus on the cross-species conserved functions, we restricted the clustering to orthologs, arriving at 16 conserved modules, which are enriched in a variety of functions, ranging from morphogenesis to chromatin remodeling (Fig. 1a, Table S3). Finally, we annotated many TARs based on correlating their expression profiles with these modules (Fig. ED5).
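
As a rough illustration of the module search described above, the sketch below runs simulated annealing on a toy multilayer network in which within-species co-expression edges and across-species orthology edges are pooled into one edge list. It is a minimal sketch under stated assumptions, not the published method: the objective (number of internal edges in a module of fixed size), the swap move, the geometric cooling schedule and every name and gene label here (`anneal_dense_module`, `count_internal_edges`, `h1`, `w1`, ...) are hypothetical.

```python
import math
import random


def count_internal_edges(nodes, edges):
    """Number of edges with both endpoints inside `nodes`."""
    return sum(1 for u, v in edges if u in nodes and v in nodes)


def anneal_dense_module(all_nodes, edges, size, steps=5000,
                        t_start=1.0, t_end=0.01, seed=0):
    """Search for a dense node subset of fixed `size` (requires size < n).

    Assumed objective: maximise the number of internal edges. Each move
    swaps one node inside the module for one outside; worse moves are
    accepted with probability exp(delta / temperature).
    """
    rng = random.Random(seed)
    all_nodes = sorted(all_nodes)
    current = set(rng.sample(all_nodes, size))
    current_score = count_internal_edges(current, edges)
    best, best_score = set(current), current_score
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        leaving = rng.choice(sorted(current))
        entering = rng.choice([n for n in all_nodes if n not in current])
        candidate = (current - {leaving}) | {entering}
        cand_score = count_internal_edges(candidate, edges)
        delta = cand_score - current_score
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, cand_score
            if current_score > best_score:
                best, best_score = set(current), current_score
    return best, best_score


# Toy multilayer network; nodes are (species, gene) pairs with made-up names.
coexpression = [  # within-species layer
    (("human", "h1"), ("human", "h2")),
    (("worm", "w2"), ("worm", "w3")),
    (("fly", "f2"), ("fly", "f3")),
]
orthology = [  # across-species layer
    (("human", "h1"), ("worm", "w1")),
    (("human", "h1"), ("fly", "f1")),
    (("human", "h2"), ("worm", "w1")),
    (("human", "h2"), ("fly", "f1")),
    (("worm", "w1"), ("fly", "f1")),
    (("human", "h3"), ("worm", "w3")),
]
edges = coexpression + orthology
nodes = {n for edge in edges for n in edge}
module, score = anneal_dense_module(nodes, edges, size=4)
```

In this toy network the four genes `h1`, `h2`, `w1` and `f1` form the densest possible group of size 4, mixing both edge types across three species, which mirrors the multi-species modules enriched in orthologs mentioned in the text. A realistic objective would likely weight the two layers differently and let the module size vary.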

