Identification of motifs that function in the splicing of non-canonical introns.

Murray JI, Voelker RB, Henscheid KL, Warf MB, Berglund JA - Genome Biol. (2008)

Bottom Line: While the current model of pre-mRNA splicing is based on the recognition of four canonical intronic motifs (5' splice site, branchpoint sequence, polypyrimidine (PY) tract and 3' splice site), it is becoming increasingly clear that splicing is regulated by both canonical and non-canonical splicing signals located in the RNA sequence of introns and exons that act to recruit the spliceosome and associated splicing factors. In vivo splicing studies show that C-rich and G-rich motifs function as intronic splicing enhancers in a combinatorial manner to compensate for weak PY tracts. The enrichment of specific intronic splicing enhancers upstream of weak PY tracts suggests that a novel mechanism for intron recognition exists, which compensates for a weakened canonical pre-mRNA splicing motif.


Affiliation: Department of Chemistry, Institute of Molecular Biology, University of Oregon, Eugene, Oregon, USA.

ABSTRACT

Background: While the current model of pre-mRNA splicing is based on the recognition of four canonical intronic motifs (5' splice site, branchpoint sequence, polypyrimidine (PY) tract and 3' splice site), it is becoming increasingly clear that splicing is regulated by both canonical and non-canonical splicing signals located in the RNA sequence of introns and exons that act to recruit the spliceosome and associated splicing factors. The diversity of human intronic sequences suggests the existence of novel recognition pathways for non-canonical introns. This study addresses the recognition and splicing of human introns that lack a canonical PY tract. The PY tract is a uridine-rich region at the 3' end of introns that acts as a binding site for U2AF65, a key factor in splicing machinery recruitment.

Results: Human introns were classified computationally into low- and high-scoring PY tracts by scoring the likely U2AF65 binding site strength. Biochemical studies confirmed that low-scoring PY tracts are weak U2AF65 binding sites while high-scoring PY tracts are strong U2AF65 binding sites. A large population of human introns contains weak PY tracts. Computational analysis revealed many families of motifs, including C-rich and G-rich motifs, that are enriched upstream of weak PY tracts. In vivo splicing studies show that C-rich and G-rich motifs function as intronic splicing enhancers in a combinatorial manner to compensate for weak PY tracts.

Conclusion: The enrichment of specific intronic splicing enhancers upstream of weak PY tracts suggests that a novel mechanism for intron recognition exists, which compensates for a weakened canonical pre-mRNA splicing motif.


Figure 1: Computational analysis of human intron PY tracts. (a) Distribution of intronic motifs (branchpoint (BPS), G-triples (GGG) and U2AF65 binding sites (U2AF65)) adjacent to the 3' end of human introns. The BPS curve is a composite of the distribution of all pentamers containing YTRAC (Y = T or C, R = A or G). The G-triple curve is the composite for all pentamers containing GGG. The U2AF65 curve is a composite of the occurrence of the ten most abundant pentamers found in the U2AF65 SELEX sequences [27,39] (Additional data file 1). The distributions were determined over all human introns, and for each curve the total area under the curve was normalized to unity. The two regions used in this study are depicted below the curves. The PY tract region consisted of the region from -30 to -3, and the upstream PY (UPY) tract region was defined to be from -80 to -30 (relative to the acceptor splice-junction (SJ)). (b) Distribution of U2AF65 binding site scores (S65 scores) for all human introns (filled blue) and for the U2AF65 SELEX sequences used as the training set for the binding site score (vertical solid black lines). The distributions were generated using a bin size of 0.02, and the total area under the curves was normalized to unity. The median (used as the cutoff for 'weak' and 'strong' binding sites) is depicted as a vertical dashed line.
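The region definitions and the motif curves in panel (a) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, the coordinate convention (position -1 is the last intronic nucleotide before the acceptor SJ, both region bounds inclusive) is an assumption, and the 100-nucleotide window is an arbitrary choice. Sequences are written in the DNA alphabet, as in the YTRAC definition.

```python
import re
from collections import Counter

# Assumed convention: position -1 is the last intronic nucleotide before the
# acceptor splice junction (SJ); region bounds are treated as inclusive.
PY_REGION = (-30, -3)    # PY tract region from the caption
UPY_REGION = (-80, -30)  # upstream PY (UPY) tract region from the caption


def region(intron: str, start: int, end: int) -> str:
    """Return intron positions start..end (negative, inclusive, relative to the SJ)."""
    if len(intron) < -start:
        raise ValueError("intron is shorter than the requested region")
    stop = end + 1
    return intron[start:] if stop == 0 else intron[start:stop]


def is_bps(pentamer: str) -> bool:
    """True for pentamers matching YTRAC (Y = T or C, R = A or G)."""
    return re.fullmatch(r"[CT]T[AG]AC", pentamer) is not None


def has_g_triple(pentamer: str) -> bool:
    """True for pentamers containing GGG."""
    return "GGG" in pentamer


def pentamer_positions(intron: str, predicate, window: int = 100, k: int = 5):
    """Start positions (relative to the SJ) of k-mers that satisfy predicate."""
    tail = intron[-window:].upper()
    n = len(tail)
    return [i - n for i in range(n - k + 1) if predicate(tail[i:i + k])]


def normalized_distribution(introns, predicate, window: int = 100):
    """Positional motif frequencies summed over introns, scaled to sum to 1.

    With one bin per nucleotide, a unit sum corresponds to unit area.
    """
    counts = Counter()
    for intron in introns:
        counts.update(pentamer_positions(intron, predicate, window))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pos: c / total for pos, c in sorted(counts.items())}
```

For example, `region(intron, *PY_REGION)` returns the 28 nucleotides scored as the PY tract under these assumptions, and `normalized_distribution(introns, is_bps)` gives a curve comparable in spirit to the BPS curve in panel (a).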

Mentions: Many human introns have been shown to be enriched in motifs containing GGG in the region upstream of the PY tract [42,43] (Figure 1a). This observation demonstrates that this region is under compositional selection. G-triples located upstream of a weak PY tract have been shown to affect splice site usage [20]. We hypothesized that other cis-elements may also be located upstream of the PY tract and may compensate for PY tracts containing weak U2AF65 binding sites. To explore this possibility, we performed a computational analysis to determine whether the region upstream of the PY tract is enriched in specific motifs when the PY tract does not contain a strong U2AF65 binding site.
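The comparison described here can be sketched as follows. This is only an illustration under stated assumptions: the S65 scores are taken as already computed (the scoring model is not described in this section), introns are split at the median as in Figure 1b, and enrichment is expressed as a simple frequency ratio with a pseudocount, which stands in for whatever statistic the authors actually used. All function names are hypothetical.

```python
from collections import Counter
from statistics import median


def split_by_median(scores):
    """Label each intron 'weak' or 'strong' relative to the median score.

    Assumption: scores equal to the median are labeled 'strong'.
    """
    cutoff = median(scores.values())
    return {name: ("weak" if s < cutoff else "strong")
            for name, s in scores.items()}


def kmer_frequencies(seqs, k=5):
    """Relative frequency of every k-mer across a collection of sequences."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        counts.update(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: c / total for kmer, c in counts.items()}


def enrichment(weak_upy, strong_upy, k=5, pseudo=1e-6):
    """Ratio of k-mer frequency in weak-PY versus strong-PY upstream regions.

    Values above 1 mean the k-mer is more frequent upstream of weak PY tracts.
    The pseudocount (an arbitrary choice) avoids division by zero.
    """
    fw = kmer_frequencies(weak_upy, k)
    fs = kmer_frequencies(strong_upy, k)
    return {kmer: (fw.get(kmer, 0.0) + pseudo) / (fs.get(kmer, 0.0) + pseudo)
            for kmer in set(fw) | set(fs)}
```

In use, the upstream regions of introns labeled 'weak' and 'strong' would be passed as `weak_upy` and `strong_upy`, and pentamers with the highest ratios would be candidates for further testing. A ratio alone says nothing about statistical significance, so a real analysis would need an appropriate test and multiple-testing correction.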

