Selection against tandem splice sites affecting structured protein regions.

Hiller M, Szafranski K, Huse K, Backofen R, Platzer M - BMC Evol. Biol. (2008)

Bottom Line: We found multiple lines of evidence that the human protein coding sequences are under selection against such in-frame tandem splice events, indicating that these events are often deleterious. Investigating structures of functional protein domains, we found that tandem acceptors are preferentially located at the domain surface and outside structural elements such as helices and sheets. We estimate that ~2,400 introns are under selection against possessing a tandem site.


Affiliation: Bioinformatics Group, Albert-Ludwigs-University Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany. hiller@informatik.uni-freiburg.de

ABSTRACT

Background: Alternative selection of splice sites in tandem donors and acceptors is a major mode of alternative splicing. Here, we analyzed whether in-frame tandem sites leading to subtle mRNA insertions/deletions of 3, 6, or 9 nucleotides are under natural selection.

Results: We found multiple lines of evidence that the human protein coding sequences are under selection against such in-frame tandem splice events, indicating that these events are often deleterious. The strength of selection is not homogeneous within the coding sequence as protein regions that fold into a fixed 3D structure (intrinsically ordered) are under stronger selection, especially against sites with a strong minor splice site. Investigating structures of functional protein domains, we found that tandem acceptors are preferentially located at the domain surface and outside structural elements such as helices and sheets. Using three-species comparisons, we estimate that more than half of all mutations that create NAGNAG acceptors in the coding region have been eliminated by selection.

Conclusion: We estimate that ~2,400 introns are under selection against possessing a tandem site.
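
The in-frame tandem acceptors described in the Background can be illustrated with a small sketch. It scans a nucleotide sequence for the NAGNAG motif (two AG dinucleotides 3 nucleotides apart) and checks whether an insertion/deletion length preserves the reading frame. The function names `find_nagnag` and `is_in_frame` and the example sequence are hypothetical and chosen for illustration only; this is not the authors' pipeline, and it ignores the criteria the paper uses to separate plausible from implausible sites.

```python
import re

# Hypothetical helper, not from the paper: N is any nucleotide, so a NAGNAG
# motif is [ACGT]AG[ACGT]AG. The lookahead lets overlapping motifs be reported.
NAGNAG = re.compile(r"(?=([ACGT]AG[ACGT]AG))")


def find_nagnag(seq):
    """Return the 0-based start position of every NAGNAG motif in seq."""
    return [m.start() for m in NAGNAG.finditer(seq.upper())]


def is_in_frame(indel_length):
    """An insertion/deletion of 3, 6, or 9 nt keeps the reading frame."""
    return indel_length > 0 and indel_length % 3 == 0


if __name__ == "__main__":
    # Made-up sequence around an intron/exon boundary, for illustration only.
    window = "TTTCAGCAGGT"
    print(find_nagnag(window))
    print([n for n in (3, 4, 6, 9) if is_in_frame(n)])
```

A real analysis would restrict the search to the annotated acceptor position of each intron instead of scanning the whole window.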


Figure 3: Distribution of plausible and implausible NAGNAG acceptors. (A) Human and (B) C. elegans UTR vs. CDS introns; (C) Human CDS introns divided into a location in ordered or disordered regions. Each bar is the percentage of introns having a plausible (blue) or implausible (green) NAGNAG acceptor. Absolute intron numbers are given above the bars.

Mentions: First, we compared the percentage of CDS and UTR introns that have a plausible or implausible NAGNAG acceptor. The frequency of plausible NAGNAG sites is 1.9-fold lower in CDS introns compared to UTR introns (Figure 3A). In contrast, the frequency of implausible sites is very similar in CDS and UTR introns. This shows a significant depletion of plausible sites in CDS introns (Fisher's exact test: P < 0.0001). Consistently, AAG and CAG but not the synonymous codons AAA and CAA have been found to be avoided at the 5' exon boundary [30,31], although AAG/CAG is more often part of splicing enhancer motifs than AAA/CAA [32]. Furthermore, GAG is not underrepresented at the 5' exon boundary compared to the synonymous GAA codon [31].
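
The comparison above (a fold difference in NAGNAG frequency between CDS and UTR introns, tested with Fisher's exact test) can be sketched as follows. The intron counts used here are invented placeholders, not the numbers shown in Figure 3, and `fisher_exact_two_sided` and `fold_difference` are hypothetical helper names. The two-sided p-value is computed by summing the probabilities of all 2x2 tables that are no more likely than the observed one, which is one common definition; the source does not say which variant the authors used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def weight(x):
        # Unnormalized probability of a table with x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x)

    observed = weight(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    extreme = sum(w for w in map(weight, range(lo, hi + 1)) if w <= observed)
    return extreme / comb(total, col1)


def fold_difference(hits_a, total_a, hits_b, total_b):
    """How many times lower the frequency in group A is than in group B."""
    return (hits_b / total_b) / (hits_a / total_a)


if __name__ == "__main__":
    # Placeholder counts (assumptions, NOT from the paper):
    # introns with / without a plausible NAGNAG acceptor.
    cds_with, cds_total = 20, 2000
    utr_with, utr_total = 19, 1000
    print(fold_difference(cds_with, cds_total, utr_with, utr_total))
    print(fisher_exact_two_sided(cds_with, cds_total - cds_with,
                                 utr_with, utr_total - utr_with))
```

Integer arithmetic is used for the table weights so that the comparison with the observed table is exact; only the final division produces a floating-point value.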

