Genome-wide association between branch point properties and alternative splicing.

Corvelo A, Hallegger M, Smith CW, Eyras E - PLoS Comput. Biol. (2010)

Bottom Line: Finally, the comparison between exons of different evolutionary ages and pseudo exons suggests a key role of the BP in the pathway of exon creation in human. Our computational and experimental analyses suggest that BP recognition is more flexible than previously assumed, and it appears highly dependent on the presence of downstream polypyrimidine tracts. The reported association between BP features and the splicing outcome suggests that this, so far disregarded but yet crucial, element buries information that can complement current acceptor site models.


Affiliation: Computational Genomics, Universitat Pompeu Fabra, Barcelona, Spain.

ABSTRACT
The branch point (BP) is one of the three obligatory signals required for pre-mRNA splicing. In mammals, the degeneracy of the motif combined with the lack of a large set of experimentally verified BPs complicates the task of modeling it in silico, and therefore of predicting the location of natural BPs. Consequently, BPs have been disregarded in a considerable fraction of the genome-wide studies on the regulation of splicing in mammals. We present a new computational approach for mammalian BP prediction. Using sequence conservation and positional bias we obtained a set of motifs with good agreement with U2 snRNA binding stability. Using a Support Vector Machine algorithm, we created a model complemented with polypyrimidine tract features, which considerably improves the prediction accuracy over previously published methods. Applying our algorithm to human introns, we show that BP position is highly dependent on the presence of AG dinucleotides in the 3' end of introns, with distance to the 3' splice site and BP strength strongly correlating with alternative splicing. Furthermore, experimental BP mapping for five exons preceded by long AG-dinucleotide exclusion zones revealed that, for a given intron, more than one BP can be chosen throughout the course of splicing. Finally, the comparison between exons of different evolutionary ages and pseudo exons suggests a key role of the BP in the pathway of exon creation in human. Our computational and experimental analyses suggest that BP recognition is more flexible than previously assumed, and it appears highly dependent on the presence of downstream polypyrimidine tracts. The reported association between BP features and the splicing outcome suggests that this, so far disregarded but yet crucial, element buries information that can complement current acceptor site models.

Figure 7 (pcbi-1001016-g007): BP sequence, position, intron length and exon skipping. (A) Percentage of exons for which there is skipping evidence and (B) average exon EST inclusion level, both as a function of BP distance. These values were computed using a sliding window of length 20 and step 10. (C) Percentage of exons for which there is skipping evidence as a function of BP sequence score. This was computed using a sliding window of length 1 and step 0.25. (D) Mean BP sequence score as a function of intron length. This was computed in bins of 100 nts. The error bars represent the standard error. In A and C, the standard error is given by the formula $\sqrt{p(1-p)/n}$, where $p$ is the probability and $n$ the overall sample size.
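The sliding-window percentages and their standard errors described in the caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the half-open window convention starting at distance 0, and the toy data are all assumptions made for the example.

```python
import math

def binomial_se(p, n):
    """Standard error of a proportion p estimated from n observations."""
    return math.sqrt(p * (1.0 - p) / n)

def sliding_window_skipping(exons, width=20, step=10):
    """Percentage of skipped exons per BP-distance window.

    exons: list of (bp_distance, is_skipped) pairs (hypothetical input format).
    Windows are half-open intervals [start, start + width), an assumption.
    Returns (window_start, percent_skipped, se_percent, n) for non-empty windows.
    """
    if not exons:
        return []
    max_dist = max(d for d, _ in exons)
    rows = []
    start = 0
    while start <= max_dist:
        flags = [skipped for d, skipped in exons if start <= d < start + width]
        n = len(flags)
        if n > 0:
            p = sum(flags) / n
            rows.append((start, 100.0 * p, 100.0 * binomial_se(p, n), n))
        start += step
    return rows

# Made-up toy data, only to show the shape of the input and output.
toy = [(18, False), (22, False), (25, True), (31, False), (38, True), (44, True)]
rows = sliding_window_skipping(toy)
for row in rows:
    print(row)
```

Because consecutive windows overlap by half their length, neighbouring points in such a curve share exons and are not independent estimates.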


Mentions: Interestingly, BP-3SS distance positively correlates with AS. Skipped exons are more frequently preceded by introns containing distant BPs than constitutive exons are (Mann-Whitney, p = 1.97×10^-8) (Figure 7A). As the BP-3SS distance increases, so does the percentage of exons for which there is skipping evidence, and the relationship between BP distance and the frequency of skipped exons is almost linear. We found skipping evidence for approximately 43% of the exons in which the BP is located more than 100 nts upstream, whereas only 28.6% of the exons preceded by proximal BPs (BP-3SS distance < 50 nts) were skipped (Chi-square, p = 1.61×10^-6). Remarkably, this association also holds for the exon inclusion level. For the fraction of skipped exons, inclusion was calculated based on expressed sequence tag (EST) data (see Methods) and is plotted in Figure 7B. Exon inclusion decreases with BP distance. While skipped exons preceded by proximal BPs (BP-3SS distance < 50 nts) are included on average in 85% of the transcripts, this value drops to 65% for exons with a distal BP (BP-3SS distance > 100 nts) (Mann-Whitney, p = 2.87×10^-9). Additionally, BP sequence score also correlates with AS. In Figure 7C, we observe that skipping of the downstream exon is more frequent for introns with lower BP sequence scores, and that skipping increases fairly gradually as the score decreases. Even though the sequence score distribution is skewed towards high values (not shown), the difference in skipping percentage between the lower and upper sequence score quartiles (defined by scores lower than −0.338 and higher than 1.838, respectively) is strongly significant (Chi-square, p = 1.83×10^-10). Moreover, there is a small but statistically significant difference in BP sequence score between skipped (mean = 0.706) and constitutive (mean = 0.797) exons (Mann-Whitney, p = 1.78×10^-9), further validating that observation.
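A comparison of skipping frequencies between two groups of exons, such as the proximal and distal BP classes above, can be sketched with a Pearson chi-square test on a 2×2 table. The excerpt does not report the underlying exon counts or say whether a continuity correction was applied, so the counts below are made up to match the quoted percentages, the test is uncorrected, and the resulting p-value is not expected to reproduce the published one. The function name is hypothetical.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the chi-square survival function
    # equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical counts (NOT from the paper), chosen only so that the
# proportions match the quoted 43% and 28.6%.
distal_skipped, distal_not = 43, 57        # BP more than 100 nts upstream
proximal_skipped, proximal_not = 286, 714  # BP less than 50 nts upstream

stat, p = chi_square_2x2(distal_skipped, distal_not,
                         proximal_skipped, proximal_not)
print(f"chi-square = {stat:.3f}, p = {p:.3g}")
```

The closed-form survival function is used here only to keep the sketch free of third-party dependencies; it is valid for a single degree of freedom and does not generalise to larger tables.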

