Pervasive adaptive protein evolution apparent in diversity patterns around amino acid substitutions in Drosophila simulans.

Sattath S, Elyashiv E, Kolodny O, Rinott Y, Sella G - PLoS Genet. (2011)

Bottom Line: All suffer from confounding factors, however, such that the interpretation of the evidence, in particular conclusions about the rate and strength of beneficial substitutions, remains tentative. Interestingly, it reveals two classes of beneficial fixations: a minority (approximately 3%) that appears to have had large selective effects and accounts for most of the reduction in diversity, and the remaining 10%, which seem to have had very weak selective effects. More generally, our findings provide unequivocal evidence for strongly beneficial substitutions in Drosophila and illustrate how the rapidly accumulating genome-wide data can be leveraged to address enduring questions about the genetic basis of adaptation.


Affiliation: Department of Ecology, Evolution, and Behavior, Hebrew University of Jerusalem, Jerusalem, Israel.

ABSTRACT
In Drosophila, multiple lines of evidence converge in suggesting that beneficial substitutions to the genome may be common. All suffer from confounding factors, however, such that the interpretation of the evidence, in particular conclusions about the rate and strength of beneficial substitutions, remains tentative. Here, we use genome-wide polymorphism data in D. simulans and sequenced genomes of its close relatives to construct a readily interpretable characterization of the effects of positive selection: the shape of average neutral diversity around amino acid substitutions. As expected under recurrent selective sweeps, we find a trough in diversity levels around amino acid but not around synonymous substitutions, a distinctive pattern that is not expected under alternative models. This characterization is richer than previous approaches, which relied on limited summaries of the data (e.g., the slope of a scatter plot), and relates to underlying selection parameters in a straightforward way, allowing us to make more reliable inferences about the prevalence and strength of adaptation. Specifically, we develop a coalescent-based model for the shape of the entire curve and use it to infer adaptive parameters by maximum likelihood. Our inference suggests that ∼13% of amino acid substitutions cause selective sweeps. Interestingly, it reveals two classes of beneficial fixations: a minority (approximately 3%) that appears to have had large selective effects and accounts for most of the reduction in diversity, and the remaining 10%, which seem to have had very weak selective effects. These estimates therefore help to reconcile the apparent conflict among previously published estimates of the strength of selection. More generally, our findings provide unequivocal evidence for strongly beneficial substitutions in Drosophila and illustrate how the rapidly accumulating genome-wide data can be leveraged to address enduring questions about the genetic basis of adaptation.

Figure 3 (pgen-1001302-g003): The fit of recurrent selective sweep models to diversity patterns around amino acid substitutions. A. Observed and predicted curves for the average synonymous heterozygosity as a function of distance from amino acid substitutions. The curve based on the data (black) was smoothed using LOESS with a span of 0.5 and divided by divergence, as in Figure 1. The predicted curves correspond to maximum likelihood estimates based on different distributions of beneficial selection coefficients: “1 point” corresponds to a single selection coefficient (blue); “Gamma” to a Gamma distribution (green); “2 point” to two selection coefficients (red); “2 exponentials” to a mixture of two exponentials (orange). B. A close-up on distances up to 4 kb. To reveal more detail of the observed curve on this scale, we used LOESS smoothing with a smaller span of 0.002. See Text S1 for further details.
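
The smoothing step named in the caption can be illustrated with a minimal sketch. It assumes a degree-1 local regression with tricube weights, which is one common LOESS variant; the caption gives only the span, and the actual settings are in Text S1, so this is not necessarily the authors' procedure. The function name `loess_1d` and its arguments are hypothetical.

```python
def loess_1d(xs, ys, span):
    """Local linear regression with tricube weights (illustrative only).

    span is the fraction of data points used in each local fit
    (the caption quotes 0.5 for panel A and 0.002 for panel B).
    Returns the fitted value at every x in xs.
    """
    n = len(xs)
    k = max(2, min(n, int(round(span * n))))  # neighbours per local fit
    fitted = []
    for x0 in xs:
        # Keep the k points nearest to x0.
        nearest = sorted((abs(x - x0), x, y) for x, y in zip(xs, ys))[:k]
        h = nearest[-1][0] or 1.0  # bandwidth: distance to the k-th neighbour
        sw = swx = swy = swxx = swxy = 0.0
        for d, x, y in nearest:
            u = d / h
            w = (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0  # tricube weight
            sw += w
            swx += w * x
            swy += w * y
            swxx += w * x * x
            swxy += w * x * y
        denom = sw * swxx - swx * swx
        if abs(denom) < 1e-12:
            # Too few distinct points for a line: use the weighted mean.
            fitted.append(swy / sw)
        else:
            slope = (sw * swxy - swx * swy) / denom
            intercept = (swy - slope * swx) / sw
            fitted.append(intercept + slope * x0)
    return fitted
```

A large span averages over many neighbouring points and brings out the broad trough, while a small span follows the data closely, which is why the close-up panel uses a much smaller value.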

Mentions: A visual comparison suggests a reasonable fit of these models to the data (Figure 3A). However, the inference based on models with one selection coefficient, or even a Gamma distribution of coefficients, might be dominated by the broad features of the plot, such that any narrower trough caused by beneficial substitutions with weaker selection coefficients could be overlooked. A closer look around the focal substitutions supports this notion, revealing a small trough inside the main trough, on the scale of several hundred bp, which is not captured by either of the two models (Figure 3B). We therefore consider another model, with two beneficial selection coefficients. Using it, we estimate that ∼13% of the substitutions were beneficial, ∼3% with a large selective advantage of ∼0.5% and the rest with a much weaker effect, of approximately one hundredth of a percent (Table 5 in Text S1). A mixture model with two exponentials reveals a similar picture: ∼4% of substitutions are estimated to come from a distribution with a mean selective coefficient of ∼0.5% and 11% from a distribution with a mean of ∼4 × 10⁻⁵ (Table 5 in Text S1). Importantly, both models provide a substantially better fit to the data (Table 5 in Text S1) and they capture the smaller as well as the larger troughs in diversity (Figure 3A and 3B). In turn, estimates under a model with three beneficial selection coefficients are similar to those obtained in the model with only two and offer no improvement to the fit (Table 5 in Text S1). Taken together, these findings indicate that selective sweeps are driven by two classes of beneficial fixations: a minority with large beneficial effects that account for most of the reduction in diversity and a majority with much weaker effects. Moreover, they help explain why previous inferences based on the signatures of sweeps in Drosophila yielded markedly different estimates (ranging over three orders of magnitude) [1]–[4].
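
To see why two selection coefficients can produce a narrow trough nested inside a broad one, consider the following toy calculation. It is a sketch under strong assumptions and is not the coalescent model fitted in the paper. It uses the textbook approximation that a neutral lineage at recombination distance r is dragged along by a sweep with probability about exp(-r·τ), where τ ≈ 2 ln(2Ns)/s is the sweep duration in generations. The effective population size `N_E` and the per-bp recombination rate `RHO` are hypothetical illustration values, as are all function names. Only the fractions and selection coefficients in `TWO_POINT` come from the text above.

```python
import math

# Hypothetical illustration values, NOT taken from the paper:
N_E = 1_000_000  # effective population size (assumed)
RHO = 2e-8       # recombination rate per bp per generation (assumed)


def fixation_time(s, n_e=N_E):
    """Approximate duration of a sweep in generations: 2 ln(2Ns) / s."""
    return 2.0 * math.log(2.0 * n_e * s) / s


def trapped_prob(x_bp, s, rho=RHO, n_e=N_E):
    """Chance that a lineage x_bp away does not recombine off the sweep."""
    return math.exp(-rho * x_bp * fixation_time(s, n_e))


def trough_width_bp(s, rho=RHO, n_e=N_E):
    """Distance (bp) at which trapped_prob falls to 1/e."""
    return 1.0 / (rho * fixation_time(s, n_e))


def mixture_signal(x_bp, classes):
    """Sum over classes of (fraction of substitutions) * trapped_prob."""
    return sum(frac * trapped_prob(x_bp, s) for frac, s in classes)


# Two-point estimates quoted in the text: ~3% at s ~ 0.5%, ~10% at s ~ 0.01%.
TWO_POINT = [(0.03, 0.005), (0.10, 0.0001)]

if __name__ == "__main__":
    for frac, s in TWO_POINT:
        print(f"fraction={frac:g} s={s:g}: width ~ {trough_width_bp(s):.0f} bp")
    for x in (0, 100, 1000, 10000):
        print(x, round(mixture_signal(x, TWO_POINT), 4))
```

With these assumed values the weak class yields a characteristic width of a few hundred bp and the strong class a width on the order of 10 kb. That is qualitatively in line with the nested troughs described above, although the numbers depend entirely on the assumed `N_E` and `RHO`. The sketch also ignores the time elapsed since each sweep and sweeps at other substitutions, both of which a full model has to account for.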

