The impact of the nucleosome code on protein-coding sequence evolution in yeast.

Warnecke T, Batada NN, Hurst LD - PLoS Genet. (2008)

Bottom Line: A reduced rate of evolution in linker is especially evident at the 5' end of genes, where the effect extends to non-synonymous substitution rates. We conclude that selection operating on DNA to maintain correct positioning of nucleosomes impacts codon choice, amino acid choice, and synonymous and non-synonymous rates of evolution in coding sequence. The results support the exclusion model for nucleosome positioning and provide an alternative interpretation for runs of rare codons.


Affiliation: Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom.

ABSTRACT
Coding sequence evolution was once thought to be the result of selection on optimal protein function alone. Selection can, however, also act at the RNA level, for example, to facilitate rapid translation or ensure correct splicing. Here, we ask whether the way DNA works also imposes constraints on coding sequence evolution. We identify nucleosome positioning as a likely candidate to set up such a DNA-level selective regime and use high-resolution microarray data in yeast to compare the evolution of coding sequence bound to or free from nucleosomes. Controlling for gene expression and intra-gene location, we find a nucleosome-free "linker" sequence to evolve on average 5-6% slower at synonymous sites. A reduced rate of evolution in linker is especially evident at the 5' end of genes, where the effect extends to non-synonymous substitution rates. This is consistent with regular nucleosome architecture in this region being important in the context of gene expression control. As predicted, codons likely to generate a sequence unfavourable to nucleosome formation are enriched in linker sequence. Amino acid content is likewise skewed as a function of nucleosome occupancy. We conclude that selection operating on DNA to maintain correct positioning of nucleosomes impacts codon choice, amino acid choice, and synonymous and non-synonymous rates of evolution in coding sequence. The results support the exclusion model for nucleosome positioning and provide an alternative interpretation for runs of rare codons. As the intimate association of histones and DNA is a universal characteristic of genic sequence in eukaryotes, selection on coding sequence composition imposed by nucleosome positioning should be phylogenetically widespread.

Regional biases in nucleosome occupancy. (A) Occupancy states are unevenly represented across CDS regions. The top panel shows regional variation in the proportion of linker (orange), fuzzy (purple), and well-positioned (black) nucleosomes across yeast CDS regions. In the core panel, the 150 codons bordering each CDS end are depicted. The bottom panel gives mean proportions of nucleotides called as one of the three occupancy states for the terminal 100 codons and the core across genes ≥906 nt. (B, C) CDS regions have distinct substitution dynamics but differences linked to nucleosome occupancy are still evident within regions. Rates of synonymous (B) and non-synonymous (C) evolution between S. cerevisiae and S. mikatae discriminated by CDS region and occupancy state. The dot represents the respective rate determined from the concatenated sequence. The vertical bar represents the distribution of Ka(Ks) values expected under a random model (see Methods) where identity of aligned codons is independent of nucleosome occupancy. Data for the restricted core are shown to make variances comparable.
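The random model mentioned in the caption can be pictured as a label-shuffling procedure. The following is a minimal sketch under stated assumptions, not the authors' pipeline: the function names, the toy codon pairs, and the use of a simple "fraction of differing codons" in place of a real Ka or Ks estimator are all hypothetical simplifications chosen for illustration.

```python
import random

def diff_rate(codon_pairs):
    """Fraction of aligned codon pairs that differ.
    A crude stand-in for Ka or Ks; the paper uses proper rate estimates."""
    return sum(a != b for a, b in codon_pairs) / len(codon_pairs)

def observed_rate(codon_pairs, labels, state):
    """Rate computed from all codons carrying the given occupancy label."""
    return diff_rate([p for p, l in zip(codon_pairs, labels) if l == state])

def shuffled_null(codon_pairs, labels, state, n_iter=1000, seed=0):
    """Rates for `state` after shuffling labels across codons, so that
    codon identity is independent of occupancy (the null expectation)."""
    rng = random.Random(seed)
    shuffled = list(labels)
    null = []
    for _ in range(n_iter):
        rng.shuffle(shuffled)  # label counts are preserved by shuffling
        null.append(observed_rate(codon_pairs, shuffled, state))
    return null

# Hypothetical toy alignment: 10 linker codons (2 differ),
# 10 nucleosome-bound codons (5 differ).
toy_pairs = ([("GCT", "GCT")] * 8 + [("GCT", "GCC")] * 2
             + [("AAA", "AAA")] * 5 + [("AAA", "AAG")] * 5)
toy_labels = ["linker"] * 10 + ["nucleosome"] * 10

toy_observed = observed_rate(toy_pairs, toy_labels, "linker")
toy_null = shuffled_null(toy_pairs, toy_labels, "linker", n_iter=200)
```

In this toy setting the observed linker rate would be compared against the spread of `toy_null`, in the same spirit as the dot and vertical bar in panels B and C.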

Mentions: However, within-gene comparisons can only be carried out for a small number of genes (N = 158) because rarely is there sufficient sequence for all OSs within the same gene to obtain reliable rate estimates. Consequently, this sample is biased towards very long genes (see Methods). Further, within-gene comparisons might still not reflect the true relationship between nucleosome occupancy and sequence evolution if there is intra-genic heterogeneity in substitution dynamics. This is because nucleosomes exhibit promoter-specific architectures, in line with their role in regulating promoter accessibility [23],[25]. As the majority of translational start sites (ATG) in yeast are positioned within one nucleosomal rotation of the transcriptional start site [33], 5′ ends of CDSs show regular occupancy patterns (Figure 1A), which have repeatedly been described in the literature. This intimate association of CDS region and OS only gradually collapses downstream because linker length variation is typically modest [23]. Furthermore, regularities can also be detected across 3′ ends of CDS [26] (Figure 1A). If, then, there existed gene-region distinct evolutionary trajectories, we would expect any analysis of OS-based differences to be biased as a result of the uneven representation of OSs across these regions (Figure 1A bottom panel).
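The bias described here can be made concrete with a small worked example. The sketch below uses entirely hypothetical counts (the region names, site numbers, and substitution numbers are invented for illustration and are not from the paper): within each region the two occupancy states evolve at the same rate, yet pooling across regions produces an apparent difference because linker sites are over-represented in the slower-evolving region.

```python
# Hypothetical counts: (region, occupancy state) -> (substitutions, sites).
toy_counts = {
    ("5prime", "linker"): (30, 300),
    ("5prime", "nucleosome"): (10, 100),
    ("core", "linker"): (20, 100),
    ("core", "nucleosome"): (60, 300),
}

def pooled_rate(counts, state):
    """Substitutions per site for one state, ignoring CDS region."""
    subs = sum(s for (_, st), (s, _) in counts.items() if st == state)
    sites = sum(n for (_, st), (_, n) in counts.items() if st == state)
    return subs / sites

def stratified_rates(counts, state):
    """Substitutions per site for one state, reported per CDS region."""
    return {region: s / n
            for (region, st), (s, n) in counts.items() if st == state}

pooled_linker = pooled_rate(toy_counts, "linker")
pooled_nucleosome = pooled_rate(toy_counts, "nucleosome")
by_region_linker = stratified_rates(toy_counts, "linker")
by_region_nucleosome = stratified_rates(toy_counts, "nucleosome")
```

With these invented numbers the pooled linker rate (50/400) is lower than the pooled nucleosome rate (70/400) even though the per-region rates are identical, which is why a comparison of occupancy states needs to be made within CDS regions.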

