Positive and strongly relaxed purifying selection drive the evolution of repeats in proteins


ABSTRACT

Protein repeats are considered hotspots of protein evolution, associated with acquisition of new functions and novel phenotypic traits, including disease. Paradoxically, however, repeats are often strongly conserved through long spans of evolution. To resolve this conundrum, it is necessary to directly compare paralogous (horizontal) evolution of repeats within proteins with their orthologous (vertical) evolution through speciation. Here we develop a rigorous methodology to identify highly periodic repeats with significant sequence similarity, for which evolutionary rates and selection (dN/dS) can be estimated, and systematically characterize their evolution. We show that horizontal evolution of repeats is markedly accelerated compared with their divergence from orthologues in closely related species. This observation is universal across the diversity of life forms and implies a biphasic evolutionary regime whereby new copies experience rapid functional divergence under combined effects of strongly relaxed purifying selection and positive selection, followed by fixation and conservation of each individual repeat.



Figure 6. A model of repeat evolution. Existing repeats (R1, R2) are functional and conserved. Once a new copy (R3) is generated, evolution of all copies starts to accelerate. This acceleration is presumably more pronounced for the new copy and for the copies physically adjacent to it (see colour scale of the evolutionary rate). During this phase, the new copy (and possibly others) can acquire a new function (neofunctionalization) or become specialized towards improved execution of one of the existing functions (subfunctionalization). This phase is dominated by strongly relaxed purifying selection and positive selection, resulting in substantially increased evolutionary rates of R1, R2 and R3 (right panel). Once the new functions are fixed, evolution of the repeats slows down to rates characteristic of, or even below those of, the non-repetitive (NR) part of the protein (grey), and the repeats are maintained, presumably by strong purifying selection.

For duplicated genes, it has been shown that, although subject to purifying selection (dN/dS ≪ 1), they evolve slightly but significantly faster than orthologous genes of similar divergence (refs 56, 57). Thus, the paralogues experience a phase of relaxed purifying selection shortly after duplication. Similarly, here we show that orthologous repeats evolve under selection that is, on average, at least as strong as the selection on the non-repetitive parts of protein sequences, if not somewhat stronger, in agreement with previous studies. Importantly, we show that this selection acts on each copy, clearly indicating that individual repeats possess unique functions maintained by selection. However, within a protein, paralogous repeats diverge from each other substantially faster (typically, by an order of magnitude) than orthologous repeats, as indicated by the high dN/dS values characteristic of the horizontal evolution of repeats. On average, the horizontal dN/dS values, although much higher than the vertical values, are below unity, suggestive of strongly relaxed purifying selection. Nonetheless, the presence of long tails in the dN/dS distributions, and especially the homogenization of repeats in some proteins that is apparently caused by selective drive, suggest that positive selection is also involved in the horizontal evolution of repeats. Taken together, these observations translate into a general scenario for the evolution of repetitive regions in proteins. A burst of duplication leads to the emergence of a repetitive region; this is followed by an initial phase of rapid sequence and functional diversification, driven in part by positive selection; the sequences of individual repeats are then fixed and subsequently evolve slowly under purifying selection (Fig. 6).
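The way dN/dS values are read in the paragraph above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the tolerance around unity and the numerical values are all assumptions chosen for illustration, and estimating dN and dS from real alignments requires a dedicated codon-model tool. The sketch only shows the conventional reading of the ratio (well below 1 for purifying selection, near 1 for neutrality, above 1 for positive selection) and how a horizontal value can be about tenfold higher than a vertical one while still remaining below unity.

```python
import math


def omega(dn: float, ds: float) -> float:
    """Return the dN/dS ratio, or NaN when dS is not positive (ratio undefined)."""
    if ds <= 0:
        return math.nan
    return dn / ds


def selection_regime(w: float, tol: float = 0.1) -> str:
    """Coarse, conventional reading of dN/dS; `tol` is an arbitrary illustrative band."""
    if math.isnan(w):
        return "undefined"
    if w > 1 + tol:
        return "positive selection"
    if w < 1 - tol:
        return "purifying selection"
    return "near-neutral"


# Hypothetical values, not taken from the paper.
vertical = omega(dn=0.02, ds=0.40)    # orthologous repeats: strongly constrained
horizontal = omega(dn=0.30, ds=0.60)  # paralogous repeats within one protein

fold_increase = horizontal / vertical  # roughly an order of magnitude here

print(selection_regime(vertical), selection_regime(horizontal), round(fold_increase, 1))
```

In this toy case both ratios fall in the purifying range, yet the horizontal ratio is much closer to 1, which is the pattern the text describes as strongly relaxed purifying selection. A mean below unity does not exclude positive selection on individual repeats, which is why the text also points to the long tails of the dN/dS distributions.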

