Verification of alternative splicing variants based on domain integrity, truncation length and intrinsic protein disorder.

Hegyi H, Kalmar L, Horvath T, Tompa P - Nucleic Acids Res. (2010)

Bottom Line: However, for 4000 human proteins in PDB, only 14 human proteins have structures of at least two alternative isoforms. We found that strict rules govern the selection of alternative splice variants aimed to preserve the integrity of globular domains: alternative splice sites (i) tend to avoid globular domains or (ii) affect them only marginally or (iii) tend to coincide with a location where the exposed hydrophobic surface is minimal or (iv) the protein is disordered. These observations provide the basis for a prediction method (currently under development) to predict the viability of splice variants.


Affiliation: Institute of Enzymology, Biological Research Center, Hungarian Academy of Sciences, PO Box 7, 1518 Budapest, Hungary. hegyi@enzim.hu

ABSTRACT
According to current estimates, ∼95% of multi-exonic human protein-coding genes undergo alternative splicing (AS). However, for 4000 human proteins in PDB, only 14 human proteins have structures of at least two alternative isoforms. Surveying these structural isoforms revealed that the maximum insertion accommodated by an isoform of a fully ordered protein domain was 5 amino acids; other instances of domain changes involved intrinsic structural disorder. After collecting 505 minor isoforms of human proteins with evidence for their existence, we analyzed their length, protein disorder and exposed hydrophobic surface. We found that strict rules govern the selection of alternative splice variants aimed to preserve the integrity of globular domains: alternative splice sites (i) tend to avoid globular domains or (ii) affect them only marginally or (iii) tend to coincide with a location where the exposed hydrophobic surface is minimal or (iv) the protein is disordered. We also observed an inverse correlation between the domain fraction lost and the full length of the minor isoform containing the domain, possibly indicating a buffering effect for the isoform protein counteracting the domain truncation effect. These observations provide the basis for a prediction method (currently under development) to predict the viability of splice variants.


Figure 3: Frequency distribution of percentage disorder in protein regions deleted (A), substituted (B) or disrupted by an insertion (C) by AS in the ‘named’ group, all of Swissprot and randomized Swissprot. The three groups differed significantly from one another according to χ² tests (P < 0.05) for deletions (A) and substitutions (B), but not for insertions (C), owing to the small sample size of the ‘named’ group. For further details see ‘Results’ section.

Mentions: The most frequently occurring splice events are deletions. Comparing the frequency distribution of percentage disorder in the deleted protein region (Figure 3) with the control groups using the χ² test, we found statistically significant differences between all groups (difference between ‘named’ and Swissprot, P = 0.024; between Swissprot and ‘random’, P < 0.0001). Significant differences could also be observed for substitutions (for the full region replaced, significance of difference between ‘named’ and Swissprot, P = 0.01, whereas between Swissprot and ‘random’, P < 0.0001, data not shown). Due to the relatively small sample size of insertions (1467, compared to 6635 substitutions and 10 634 deletions, as described in ‘Materials and Methods’ section), significance could not be established between the ‘named’ group and Swissprot; however, Swissprot was significantly more disordered than the ‘random’ group (P < 0.0001).
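A comparison of this kind, between two groups whose regions have been binned by percentage disorder, can be sketched as a Pearson χ² test on a contingency table. The sketch below is illustrative only: the bin boundaries, the counts and the function names are hypothetical and are not taken from the paper, and the excerpt does not state how the authors binned their data or which software they used. The closed-form tail probability applies only when the number of degrees of freedom is even; for the general case a statistics library would be the usual choice.

```python
from math import exp


def chi_square_homogeneity(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c count table.

    Each row is one group; each column is one disorder bin.
    Every column must have a non-zero total, otherwise an expected count is zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if both groups shared one disorder distribution.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


def chi_square_sf_even_dof(statistic, dof):
    """Upper-tail probability P(X >= statistic) for a chi-square with EVEN dof.

    Uses the closed form exp(-x/2) * sum_{i<k} (x/2)^i / i!  with dof = 2k.
    """
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("closed form implemented for positive even dof only")
    half = statistic / 2.0
    term, total = 1.0, 1.0
    for i in range(1, dof // 2):
        term *= half / i
        total += term
    return exp(-half) * total


if __name__ == "__main__":
    # HYPOTHETICAL counts of deleted regions per disorder bin
    # (0-20%, 20-40%, 40-60%, 60-80%, 80-100%); not data from the paper.
    named_group = [30, 20, 15, 15, 20]
    swissprot = [300, 150, 120, 130, 300]

    stat, dof = chi_square_homogeneity([named_group, swissprot])
    p_value = chi_square_sf_even_dof(stat, dof)  # dof = 4 here, so even
    print(f"chi2 = {stat:.3f}, dof = {dof}, P = {p_value:.4f}")
```

With many regions per group, as for the deletions and substitutions, even modest differences in the binned distributions can reach significance; with few regions in one group, as for the insertions in the ‘named’ group, the same test has little power, which is consistent with the authors' remark about sample size.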

