Enhanced transcriptome maps from multiple mouse tissues reveal evolutionary constraint in gene expression.

Pervouchine DD, Djebali S, Breschi A, Davis CA, Barja PP, Dobin A, Tanzer A, Lagarde J, Zaleski C, See LH, Fastuca M, Drenkow J, Wang H, Bussotti G, Pei B, Balasubramanian S, Monlong J, Harmanci A, Gerstein M, Beer MA, Notredame C, Guigó R, Gingeras TR - Nat Commun (2015)

Bottom Line: This core set of genes captures a substantial fraction of the transcriptional output of mammalian cells, and participates in basic functional and structural housekeeping processes common to all cell types. Perturbation of these constrained genes is associated with significant phenotypes including embryonic lethality and cancer. Evolutionary constraint in gene expression levels is not reflected in the conservation of the genomic sequences, but is associated with conserved epigenetic marking, as well as with a characteristic post-transcriptional regulatory programme, in which sub-cellular localization and alternative splicing play comparatively large roles.


Affiliation: [1] Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88, Barcelona 08003, Spain; [2] Faculty of Bioengineering and Bioinformatics, Moscow State University, Leninskie Gory 1-73, 119992 Moscow, Russia.

ABSTRACT
Mice have been a long-standing model for human biology and disease. Here we characterize, by RNA sequencing, the transcriptional profiles of a large and heterogeneous collection of mouse tissues, augmenting the mouse transcriptome with thousands of novel transcript candidates. Comparison with transcriptome profiles in human cell lines reveals substantial conservation of transcriptional programmes, and uncovers a distinct class of genes with levels of expression that have been constrained early in vertebrate evolution. This core set of genes captures a substantial fraction of the transcriptional output of mammalian cells, and participates in basic functional and structural housekeeping processes common to all cell types. Perturbation of these constrained genes is associated with significant phenotypes including embryonic lethality and cancer. Evolutionary constraint in gene expression levels is not reflected in the conservation of the genomic sequences, but is associated with conserved epigenetic marking, as well as with characteristic post-transcriptional regulatory programme, in which sub-cellular localization and alternative splicing play comparatively large roles.


Figure 3: Genome-wide conservation of antisense expression and splicing profiles. (a) The joint distribution of the average antisense-to-total expression ratio (the number of reads mapped to the opposite strand as a fraction of the number of reads mapped to both strands) in pairs of orthologous protein-coding genes; cc=0.68. (b,c) Contour plots of the joint probability distribution of the average usage (ψ, per cent-spliced-in) of splice junctions (SJ; b), and of the standard deviation of SJ usage (c) in orthologous SJ pairs. Logistic transformation (logit) was used in a and b. SJ with constant complete inclusion or exclusion are not shown. ‘Alternative’ denotes SJ that are annotated alternative in both species.

Overall, there is substantial genome-wide conservation of expression levels between human and mouse, irrespective of the cell or tissue type of the sample. We computed genome-wide expression profiles, measured as average read density, for all orthologous 100-nt bins spaced equally along the human and mouse genomes (Supplementary Fig. 4A,B, and Supplementary Methods). We found substantial correlation in average read density at orthologous bins (Pearson correlation coefficient, cc=0.67, Fig. 2a). This correlation is significant not only for exonic regions [2] (Supplementary Fig. 4C), but also for alignable intronic (Supplementary Fig. 4D) and intergenic regions (Fig. 2b). However, most of this intergenic transcription is proximal to annotated genes (41% less than 10 kb from the closest annotated gene termini, Fig. 2c). This is partially the consequence of the decreasing number of intergenic bins with distance to the closest gene (Supplementary Fig. 5). In any case, the murine-human expression correlation decays with distance to the closest annotated gene (Fig. 2d). Permissive transcription close to protein-coding genes could be the origin of many lncRNAs. However, when computing the distance between annotated lncRNAs and the closest neighbouring annotated gene, we found this distance to be larger than for protein-coding genes (on average, approximately 66 and 35 kb, respectively). Expression levels correlate with phylogenetic conservation, as measured by phastCons scores [21,22] (Fig. 2e). However, a fraction of orthologous bins with low sequence conservation are still densely transcribed (5% of the least conserved bins have read density greater than 10), and the bins that correspond to higher expression span a wide range of sequence conservation values (Fig. 2f).
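The bin-level comparison described above can be illustrated with a minimal sketch. It assumes read densities have already been computed per sample for a set of orthologous bins listed in the same order in both species; the function names and the toy numbers are hypothetical, and any transformation or filtering applied in the Supplementary Methods is not reproduced here.

```python
import math


def average_density(per_sample_densities):
    """Average read density per bin across samples.

    per_sample_densities: list of lists, one inner list per sample,
    each holding one read-density value per orthologous bin.
    """
    n_samples = len(per_sample_densities)
    return [sum(column) / n_samples for column in zip(*per_sample_densities)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy data (hypothetical): two samples per species, four orthologous bins.
human_samples = [[0.0, 2.0, 10.0, 40.0], [0.0, 4.0, 14.0, 60.0]]
mouse_samples = [[1.0, 3.0, 8.0, 30.0], [1.0, 1.0, 12.0, 50.0]]

human_avg = average_density(human_samples)
mouse_avg = average_density(mouse_samples)
cc = pearson(human_avg, mouse_avg)
print(f"cc = {cc:.3f}")
```

The same two functions could be applied separately to exonic, intronic and intergenic bins to compare the correlation between region classes, provided the bins are labelled beforehand.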
Highly expressed intergenic bins are slightly enriched for genome-wide association study (GWAS) hits (Fisher test, P-value≈0.055), and strongly enriched for cis-expression quantitative trait loci (cis-eQTLs; P-value<2.2e-16, Supplementary Methods), the latter suggesting an important role for enhancer transcription in the regulation of gene expression. There is also substantial conservation of antisense transcription [18]. For each sense/antisense orthologous gene pair, we computed the ratio of antisense-to-total gene expression averaged over all conditions, and found strong correlation of the average antisense-to-total gene expression ratio in orthologous genes (cc=0.68) as well as of its variation among samples (cc=0.52; Fig. 3a and Supplementary Fig. 6). Antisense transcription has been suggested as an important regulatory mechanism [23], and our results indicate that it may have been conserved over large evolutionary distances.
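As a rough illustration of the antisense-to-total ratio, the sketch below follows the definition in the Figure 3 caption: reads on the opposite strand as a fraction of reads on both strands. Two points are assumptions rather than the authors' stated procedure: the ratio is computed per condition and then averaged (instead of pooling reads first), and conditions with no reads on either strand are skipped. The function names and counts are hypothetical. A logit helper is included because the caption notes that a logistic transformation was used for plotting.

```python
import math


def antisense_ratio(sense_reads, antisense_reads):
    """Antisense reads as a fraction of reads on both strands.

    Returns None when the gene has no reads in this condition.
    """
    total = sense_reads + antisense_reads
    if total == 0:
        return None
    return antisense_reads / total


def average_ratio(conditions):
    """Mean antisense-to-total ratio over conditions.

    conditions: list of (sense_reads, antisense_reads) tuples.
    Assumption: per-condition ratios are averaged; empty conditions are skipped.
    """
    ratios = [antisense_ratio(s, a) for s, a in conditions]
    ratios = [r for r in ratios if r is not None]
    return sum(ratios) / len(ratios) if ratios else None


def logit(p):
    """Logistic transformation log(p / (1 - p)), defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


# Toy counts (hypothetical) for one gene across three conditions.
gene_conditions = [(90, 10), (70, 30), (0, 0)]
avg = average_ratio(gene_conditions)
print(avg, logit(avg))
```

Ratios of exactly 0 or 1 have no finite logit, so they would need to be excluded or clipped before plotting; which of these was done in the paper is not stated in this excerpt.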

