Genome-wide association between branch point properties and alternative splicing.

Corvelo A, Hallegger M, Smith CW, Eyras E - PLoS Comput. Biol. (2010)

Bottom Line: Finally, the comparison between exons of different evolutionary ages and pseudo exons suggests a key role of the BP in the pathway of exon creation in human. Our computational and experimental analyses suggest that BP recognition is more flexible than previously assumed, and it appears highly dependent on the presence of downstream polypyrimidine tracts. The reported association between BP features and the splicing outcome suggests that this, so far disregarded but yet crucial, element buries information that can complement current acceptor site models.


Affiliation: Computational Genomics, Universitat Pompeu Fabra, Barcelona, Spain.

ABSTRACT
The branch point (BP) is one of the three obligatory signals required for pre-mRNA splicing. In mammals, the degeneracy of the motif combined with the lack of a large set of experimentally verified BPs complicates the task of modeling it in silico, and therefore of predicting the location of natural BPs. Consequently, BPs have been disregarded in a considerable fraction of the genome-wide studies on the regulation of splicing in mammals. We present a new computational approach for mammalian BP prediction. Using sequence conservation and positional bias we obtained a set of motifs with good agreement with U2 snRNA binding stability. Using a Support Vector Machine algorithm, we created a model complemented with polypyrimidine tract features, which considerably improves the prediction accuracy over previously published methods. Applying our algorithm to human introns, we show that BP position is highly dependent on the presence of AG dinucleotides in the 3' end of introns, with distance to the 3' splice site and BP strength strongly correlating with alternative splicing. Furthermore, experimental BP mapping for five exons preceded by long AG-dinucleotide exclusion zones revealed that, for a given intron, more than one BP can be chosen throughout the course of splicing. Finally, the comparison between exons of different evolutionary ages and pseudo exons suggests a key role of the BP in the pathway of exon creation in human. Our computational and experimental analyses suggest that BP recognition is more flexible than previously assumed, and it appears highly dependent on the presence of downstream polypyrimidine tracts. The reported association between BP features and the splicing outcome suggests that this, so far disregarded but yet crucial, element buries information that can complement current acceptor site models.


pcbi-1001016-g004: Sequence counts correlate with U2 binding energy. Barplot showing, for each nonamer cluster, the U2 binding energy (blue) and the number of occurrences in the consTNA and consTNA-BP5 sets (grey and green, respectively). The fraction of cases eliminated by the use of the 124 BP-associated pentamers is shown in orange. Nonamer clusters were grouped by core pentamer (the 5 central positions).

Mentions: To test whether the frequencies of different BP-associated words correlate with U2 binding stability, we grouped all the consTNA-BP5 9-mers by their central pentamer sequence, i.e. the five positions with the highest information content (IC) in the BP signal, and calculated the mean U2 binding energy for each group (see Methods). Figure 4 shows a clear correlation between the stability of binding to U2 and the occurrence of these words in the consTNA-BP5 set (Spearman's rank correlation, rho = −0.65, p = 6.53×10⁻⁹; see Figure 5 in Text S1), which validates the captured BP signal. Interestingly, compared with the consTNA set, the consTNA-BP5 set shows a drastic reduction in the frequency of some words for which the U2 binding energy is low (Figure 4). These are mainly PPT-associated words, which were filtered out from the consTNA set. Thus, we can consider the consTNA-BP5 set a good representative of putative functional BPs.
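The grouping and rank correlation described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the `(nonamer, count, energy)` record layout, and the toy values are all hypothetical. It also assumes that the group energy is an unweighted mean over distinct 9-mers and that a more negative energy means more stable binding, which is how a negative rho would fit with frequent words binding U2 more stably.

```python
from collections import defaultdict
from statistics import mean


def central_pentamer(nonamer):
    """Return the 5 central positions (3 to 7) of a 9-mer."""
    if len(nonamer) != 9:
        raise ValueError("expected a 9-mer, got %r" % nonamer)
    return nonamer[2:7]


def group_by_pentamer(records):
    """Group (nonamer, count, energy) records by core pentamer.

    Returns {pentamer: (total_count, mean_energy)}. The mean is unweighted
    over distinct 9-mers, which is an assumption of this sketch.
    """
    counts = defaultdict(int)
    energies = defaultdict(list)
    for nonamer, count, energy in records:
        core = central_pentamer(nonamer)
        counts[core] += count
        energies[core].append(energy)
    return {core: (counts[core], mean(energies[core])) for core in counts}


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Toy records: (9-mer, occurrences, binding energy). Values are invented.
toy_records = [
    ("TACTAACAA", 40, -8.0),
    ("TTCTAACTT", 60, -9.0),
    ("TTCTGACTT", 50, -6.0),
    ("AACTCACTT", 20, -4.0),
    ("TTTTTATTT", 5, -1.0),
]

groups = group_by_pentamer(toy_records)
group_counts = [groups[core][0] for core in groups]
group_energies = [groups[core][1] for core in groups]
rho = spearman_rho(group_counts, group_energies)
print(groups)
print(rho)
```

The rank-based coefficient is written out by hand here only to keep the sketch dependency-free; a statistics library would also supply the p-value, which this sketch does not compute.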

