The (in)dependence of alternative splicing and gene duplication.

Talavera D, Vogel C, Orozco M, Teichmann SA, de la Cruz X - PLoS Comput. Biol. (2007)

Bottom Line: Altogether, these data strongly suggest that both phenomena result in interchangeability between their effects. Further, we conducted a detailed comparison of the effect of sequence changes in both alternative splice variants and gene duplicates on protein structure, in particular the size, location, and types of sequence substitutions and insertions/deletions. Our results reveal an interesting paradox between the anticorrelation of AS and GD at the genomic level, and their impact at the protein level, which shows little or no equivalence in terms of effects on protein sequence, structure, and function.


Affiliation: Molecular Modeling and Bioinformatics Unit, Parc Científic de Barcelona, Barcelona, Spain.

ABSTRACT
Alternative splicing (AS) and gene duplication (GD) are both processes that diversify the protein repertoire. Recent examples have shown that sequence changes introduced by AS may be comparable to those introduced by GD. In addition, the two processes are inversely correlated at the genomic scale: large gene families are depleted in splice variants and vice versa. Altogether, these data strongly suggest that both phenomena result in interchangeability between their effects. Here, we tested the extent to which this applies with respect to various protein characteristics. The amounts of AS and GD per gene are anticorrelated even when accounting for different gene functions or degrees of sequence divergence. In contrast, the two processes appear to be independent in their influence on variation in mRNA expression. Further, we conducted a detailed comparison of the effect of sequence changes in both alternative splice variants and gene duplicates on protein structure, in particular the size, location, and types of sequence substitutions and insertions/deletions. We find that, in general, alternative splicing affects protein sequence and structure in a more drastic way than gene duplication and subsequent divergence. Our results reveal an interesting paradox between the anticorrelation of AS and GD at the genomic level, and their impact at the protein level, which shows little or no equivalence in terms of effects on protein sequence, structure, and function. We discuss possible explanations that relate to the order of appearance of AS and GD in a gene family, and to the selection pressure imposed by the environment.

Figure 6 (pcbi-0030033-g006): The Size Distribution of Insertions/Deletions in AS and GD

All analyses of indels have been made for gene families with both AS and GD (i.e., AS+/GD+).

(A) AS indels are longer than GD indels. Indels for GD were obtained from the alignments of GD families at 40% (dark red) and 80% (light violet) sequence identity. Information on AS indels (green) was obtained from the SwissProt record of the corresponding protein. Indel size distributions for both GD40 and GD80 are very similar, with most of the indels being shorter than five residues. In contrast, many AS indels are longer than 100 residues.

(B,C) Size distribution for external and internal indels in AS and GD. External indels (B) lie at the N- or C-terminal ends of the protein; internal indels (C) lie in the middle. AS and GD40 indel sizes are different depending on the position of the indels in the sequence. While AS indels are generally larger than GD indels (also see Figure 6A), external indels (B) are larger than internal ones (C), both for AS and GD. The shift in indel sizes implies that large indels (as often introduced by AS) are better tolerated at the N- and C-termini of proteins, where they are less likely to induce important structural changes.

Mentions: Second, we studied indels, which modify protein structure in a different way than substitutions. A first and intuitive measure of their impact is provided by indel size: small indels are more likely to have a small effect on structure than larger ones. We find that indel sizes are substantially different for AS and GD (Figure 6A), for both GD40 and GD80. AS indels are of domain size (≥30 aa, many even >100 aa), in agreement with previous results [27,28]. In contrast, about three-quarters of GD indels are shorter than five residues, and thus much shorter than a domain (Figure 6A). AS therefore strongly predominates over GD for indels that span whole domains. Changes in the domain composition, in turn, can modulate protein function very abruptly, for example by on/off switch mechanisms that result in dominant-negative regulators [29,30].
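
The indel measurements described above (indel size, and external versus internal position) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: according to the figure caption, GD indels came from family alignments and AS indels from SwissProt records. The sketch only assumes one row of a gapped alignment in which `-` marks a gap; the function names `indel_runs` and `fraction_shorter_than` and the toy sequence are hypothetical.

```python
import re
from typing import List, Tuple

# One indel: (start column, length in residues, "external" or "internal")
Indel = Tuple[int, int, str]


def indel_runs(aligned: str, gap: str = "-") -> List[Indel]:
    """List every run of gap characters in one aligned sequence.

    A run is called "external" if it touches the first or last column of
    the alignment (N- or C-terminal), and "internal" otherwise.
    """
    runs: List[Indel] = []
    for match in re.finditer(re.escape(gap) + "+", aligned):
        touches_end = match.start() == 0 or match.end() == len(aligned)
        kind = "external" if touches_end else "internal"
        runs.append((match.start(), match.end() - match.start(), kind))
    return runs


def fraction_shorter_than(runs: List[Indel], cutoff: int = 5) -> float:
    """Fraction of indels strictly shorter than `cutoff` residues."""
    if not runs:
        return 0.0
    return sum(1 for _, length, _ in runs if length < cutoff) / len(runs)


# Toy aligned sequence (made up for illustration, not data from the paper).
toy = "MKT------LLV--AAG---"
toy_runs = indel_runs(toy)
toy_fraction = fraction_shorter_than(toy_runs)
```

On the toy sequence, the six-residue and two-residue gaps are classified as internal and the trailing three-residue gap as external. Applied to real alignments, the same per-run lengths could be pooled into size distributions such as those compared in Figure 6.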

