Verification of alternative splicing variants based on domain integrity, truncation length and intrinsic protein disorder.

Hegyi H, Kalmar L, Horvath T, Tompa P - Nucleic Acids Res. (2010)

Bottom Line: However, for 4000 human proteins in PDB, only 14 human proteins have structures of at least two alternative isoforms. We found that strict rules govern the selection of alternative splice variants aimed to preserve the integrity of globular domains: alternative splice sites (i) tend to avoid globular domains or (ii) affect them only marginally or (iii) tend to coincide with a location where the exposed hydrophobic surface is minimal or (iv) the protein is disordered. These observations provide the basis for a prediction method (currently under development) to predict the viability of splice variants.


Affiliation: Institute of Enzymology, Biological Research Center, Hungarian Academy of Sciences, PO Box 7, 1518 Budapest, Hungary. hegyi@enzim.hu

ABSTRACT
According to current estimates, ∼95% of multi-exonic human protein-coding genes undergo alternative splicing (AS). However, for 4000 human proteins in PDB, only 14 human proteins have structures of at least two alternative isoforms. Surveying these structural isoforms revealed that the maximum insertion accommodated by an isoform of a fully ordered protein domain was 5 amino acids; other instances of domain changes involved intrinsic structural disorder. After collecting 505 minor isoforms of human proteins with evidence for their existence, we analyzed their length, protein disorder and exposed hydrophobic surface. We found that strict rules govern the selection of alternative splice variants aimed to preserve the integrity of globular domains: alternative splice sites (i) tend to avoid globular domains or (ii) affect them only marginally or (iii) tend to coincide with a location where the exposed hydrophobic surface is minimal or (iv) the protein is disordered. We also observed an inverse correlation between the domain fraction lost and the full length of the minor isoform containing the domain, possibly indicating a buffering effect for the isoform protein counteracting the domain truncation effect. These observations provide the basis for a prediction method (currently under development) to predict the viability of splice variants.



Figure 2: Percentage distribution of relative domain size of domains truncated by AS in the ‘named’/all Swissprot/randomized Swissprot sets. Bin size increment is 0.1.
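The binning behind a plot of this kind can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `relative_size_distribution`, the input format (pairs of remaining and full domain lengths) and the example values are all assumptions made for the sketch. Only the bin size of 0.1 comes from the caption.

```python
def relative_size_distribution(domains, n_bins=10):
    """Percentage of truncated domains per relative-size bin.

    `domains` is a list of (remaining_length, full_length) pairs in residues
    (a hypothetical input format). With n_bins=10 each bin spans 0.1 of the
    relative domain size, matching the bin size stated in the caption.
    """
    if not domains:
        raise ValueError("no domains given")
    counts = [0] * n_bins
    for remaining, full in domains:
        if full <= 0 or not 0 <= remaining <= full:
            raise ValueError(f"invalid lengths: {remaining}/{full}")
        # Integer arithmetic avoids floating-point edge cases at bin borders;
        # a fully retained domain (ratio 1.0) is placed in the last bin.
        idx = min(remaining * n_bins // full, n_bins - 1)
        counts[idx] += 1
    total = len(domains)
    return [100.0 * c / total for c in counts]


# Made-up example data: most domains keep nearly all of their length.
example = [(95, 100), (98, 100), (50, 100), (10, 100), (100, 100)]
for i, pct in enumerate(relative_size_distribution(example)):
    print(f"{i / 10:.1f}-{(i + 1) / 10:.1f}: {pct:.0f}%")
```

Working from integer residue counts rather than precomputed ratios keeps the bin assignment exact, which matters when many values sit close to a bin boundary.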

Mentions: After summarizing all splice events in the ‘named’/all Swissprot/randomized Swissprot sets, it became apparent that splice sites preferentially avoid globular domains: while 11 576 out of the total of 33 223 (∼35%) randomized splice sites in the Swissprot splice variants fall into a domain, this value for the actual splice variants without randomization is 7146 (out of 33 223, ∼22%) and further decreases to ∼9% for the ‘named’ set (1019 out of 10 743 domains). However, even when the splice site falls into a globular domain, the relative length of the remaining domain is not evenly distributed between 0 and 1 but is strongly biased towards values close to 1, as shown in Figure 2 (also apparent from Figure 1). This bias is again most apparent in the ‘named’ group, i.e. there is a strong selection against severe truncation of globular domains, which preserves their structural integrity. In contrast, in the randomly generated data set the frequencies of the various truncations were almost uniform (except for the minor truncations caused by the overrepresented small-sized splice events). The difference between Swissprot and the ‘named’ group was again highly significant (P < 0.0001).
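The percentages quoted above can be rechecked from the raw counts. The sketch below does that and, purely as an illustration, runs a Pearson chi-square test on the randomized versus actual Swissprot counts. The choice of test is an assumption: the excerpt does not say which test produced P < 0.0001, and that P value refers to the Swissprot versus ‘named’ comparison, not to the comparison computed here. The helper names are hypothetical.

```python
import math


def percent(hits, total):
    """Share of splice sites (or domains) hit, as a percentage."""
    return 100.0 * hits / total


def chi_square_2x2(a_hits, a_total, b_hits, b_total):
    """Pearson chi-square (1 degree of freedom, no continuity correction)
    for two proportions; returns (statistic, p_value)."""
    a_miss, b_miss = a_total - a_hits, b_total - b_hits
    n = a_total + b_total
    hits, miss = a_hits + b_hits, a_miss + b_miss
    stat = n * (a_hits * b_miss - a_miss * b_hits) ** 2 / (
        a_total * b_total * hits * miss
    )
    # For 1 degree of freedom the chi-square survival function reduces to erfc.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Counts as quoted in the text.
randomized = (11576, 33223)
actual = (7146, 33223)
named = (1019, 10743)

for label, (hits, total) in [("randomized", randomized),
                             ("actual", actual),
                             ("named", named)]:
    print(f"{label}: {percent(hits, total):.1f}%")

# Illustrative comparison only (see the assumptions stated above).
stat, p = chi_square_2x2(*randomized, *actual)
print(f"chi-square = {stat:.0f}, p < 0.0001: {p < 1e-4}")
```

Note that the ‘named’ figure is reported per domain while the other two are per splice site, so the three percentages are compared here only as stated in the text.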

