8.2% of the Human genome is constrained: variation in rates of turnover across functional element classes in the human lineage.

Rands CM, Meader S, Ponting CP, Lunter G - PLoS Genet. (2014)

Bottom Line: While enriched with ENCODE biochemical annotations, much of the short-lived constrained sequence we identify is not detected by models optimized for wider pan-mammalian conservation. By contrast, protein coding sequence has been highly stable, with an estimated half-life of over a billion years (d1/2 = 2.1-5.0). These results reveal that the evolutionary history of the human genome has been highly dynamic, particularly for its noncoding yet biologically functional fraction.


Affiliation: MRC Functional Genomics Unit, Department of Physiology, Anatomy, and Genetics, University of Oxford, Oxford, United Kingdom.

ABSTRACT
Ten years on from the finishing of the human reference genome sequence, it remains unclear what fraction of the human genome confers function, where this sequence resides, and how much is shared with other mammalian species. When addressing these questions, functional sequence has often been equated with pan-mammalian conserved sequence. However, functional elements that are short-lived, including those contributing to species-specific biology, will not leave a footprint of long-lasting negative selection. Here, we address these issues by identifying and characterising sequence that has been constrained with respect to insertions and deletions for pairs of eutherian genomes over a range of divergences. Within noncoding sequence, we find increasing amounts of mutually constrained sequence as species pairs become more closely related, indicating that noncoding constrained sequence turns over rapidly. We estimate that half of present-day noncoding constrained sequence has been gained or lost in approximately the last 130 million years (half-life in units of divergence time, d1/2 = 0.25-0.31). While enriched with ENCODE biochemical annotations, much of the short-lived constrained sequence we identify is not detected by models optimized for wider pan-mammalian conservation. Constrained DNase I hypersensitivity sites, promoters and untranslated regions have been more evolutionarily stable than long noncoding RNA loci, which have turned over especially rapidly. By contrast, protein coding sequence has been highly stable, with an estimated half-life of over a billion years (d1/2 = 2.1-5.0). From extrapolations we estimate that 8.2% (7.1-9.2%) of the human genome is presently subject to negative selection and thus is likely to be functional, while only 2.2% has maintained constraint in both human and mouse since these species diverged. These results reveal that the evolutionary history of the human genome has been highly dynamic, particularly for its noncoding yet biologically functional fraction.
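
To make the half-life figures above concrete, the following is a minimal sketch that assumes constrained sequence is lost by simple exponential decay in divergence, so that a half-life d1/2 corresponds to a rate of ln(2)/d1/2. The exponential form, the function names, and the rate defined here are illustrative assumptions; the rate is not necessarily parameterised the same way as the authors' b parameter, and only the d1/2 ranges are taken from the abstract.

```python
import math

def turnover_rate(d_half):
    """Decay rate implied by a half-life, assuming exponential decay (an assumption)."""
    return math.log(2) / d_half

def fraction_retained(d, d_half):
    """Fraction of constrained sequence still shared after divergence d."""
    return math.exp(-turnover_rate(d_half) * d)

# Half-life ranges quoted in the abstract, in units of divergence time.
HALF_LIVES = {
    "noncoding (lower bound)": 0.25,
    "noncoding (upper bound)": 0.31,
    "coding (lower bound)": 2.1,
    "coding (upper bound)": 5.0,
}

if __name__ == "__main__":
    d = 0.25  # hypothetical divergence chosen only for illustration
    for label, d_half in HALF_LIVES.items():
        kept = fraction_retained(d, d_half)
        print(f"{label}: rate={turnover_rate(d_half):.3f}, retained at d={d}: {kept:.3f}")
```

Under this assumption, a class with a short half-life loses most of its shared constrained sequence over divergences at which coding sequence is almost fully retained, which is the contrast the abstract draws.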


Figure 3 (pgen-1004525-g003): Constraint and turnover for different classes of human functional element. A. The total quantities of constrained sequence estimated for the present day by extrapolation for different element types. B. The estimated rate of turnover (b parameter) for different types of constrained element.
Mentions: We next investigated whether various classes of functional element, identified in human primarily by the ENCODE project [5], exhibit contrasting levels of constraint, and whether these constrained element classes show a propensity to turn over at different rates. Of the functional classes we considered, promoters, untranslated regions (UTRs), DNase hypersensitivity sites (DNase HSs) and transcription factor binding sites (TFBSs), enhancers and un-annotated sequences (defined as sequences not within 50 bp of ENCODE DNase HSs, TFBS loci, lncRNAs from [21], Ensembl coding sequence, or UTRs) all show intermediate levels of turnover (Figure 3; Figure S7, Figure S8). LncRNA sequences show the highest level of turnover (Figure 3; Figure S8), and an even higher rate of turnover was inferred when the ENCODE-defined lncRNAs were used rather than the set from [21] (Figure S9). The fraction of sequence that the model inferred to be under present-day constraint also varied across these categories, with intermediate fractions inferred for UTRs, DNase HSs and TFBSs, and lower fractions for lncRNAs and enhancers. As expected, the lowest fractions were observed for un-annotated sequence; nevertheless, in absolute terms the amount of constrained sequence in this category is considerable (70 Mb, 45–85 Mb) (Figure 3). Constrained sequence in this category may represent lineage-specific functional sequences that were not identified by the ENCODE project, for instance because of their function in tissues or developmental stages not investigated by ENCODE. Finally, transposable element-derived sequences show very small amounts of constraint, and as a result our methods have little power to detect turnover in this class.
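
The definition of un-annotated sequence above (not within 50 bp of any annotated element) can be sketched as an interval operation: pad every annotated interval by 50 bp, merge overlaps, and take the complement along the chromosome. This is an illustrative sketch only, not the authors' pipeline; the function names, the half-open coordinate convention, and the treatment of the exact 50 bp boundary are assumptions.

```python
def merge_padded(intervals, pad, chrom_len):
    """Pad half-open (start, end) intervals by `pad` bp, clip, and merge overlaps."""
    padded = sorted((max(0, s - pad), min(chrom_len, e + pad)) for s, e in intervals)
    merged = []
    for s, e in padded:
        if merged and s <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], e)
        else:
            merged.append([s, e])
    return [(s, e) for s, e in merged]

def unannotated(chrom_len, annotated, pad=50):
    """Return half-open intervals lying more than `pad` bp from every annotation."""
    gaps = []
    pos = 0
    for s, e in merge_padded(annotated, pad, chrom_len):
        if s > pos:
            gaps.append((pos, s))
        pos = max(pos, e)
    if pos < chrom_len:
        gaps.append((pos, chrom_len))
    return gaps

if __name__ == "__main__":
    # Hypothetical annotations on a 1,000 bp toy chromosome.
    toy = [(100, 200), (220, 300), (900, 950)]
    for start, end in unannotated(1000, toy):
        print(start, end)
```

In practice the annotated set would be the union of all element classes named in the definition, pooled before padding so that sequence near any one class is excluded.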

