Mutation bias favors protein folding stability in the evolution of small populations.

Mendez R, Fritsche M, Porto M, Bastolla U - PLoS Comput. Biol. (2010)

Bottom Line: This result is robust with respect to the definition of the fitness function and to the protein structures studied. This provides a possible explanation to the observation that most species adopting obligatory intracellular lifestyles with a consequent reduction of effective population size shifted their mutation spectrum towards AT. To test these predictions we estimated the effective population sizes of bacterial species using the optimal codon usage coefficients computed by dos Reis et al. and the synonymous to non-synonymous substitution ratio computed by Daubin and Moran.


Affiliation: Centro de Biología Molecular Severo Ochoa, Consejo Superior de Investigaciones Científicas and Universidad Autónoma de Madrid, Madrid, Spain.

ABSTRACT
Mutation bias in prokaryotes varies from extreme adenine and thymine (AT) in obligatory endosymbiotic or parasitic bacteria to extreme guanine and cytosine (GC), for instance in actinobacteria. GC mutation bias deeply influences the folding stability of proteins, making proteins on the average less hydrophobic and therefore less stable with respect to unfolding but also less susceptible to misfolding and aggregation. We study a model where proteins evolve subject to selection for folding stability under given mutation bias, population size, and neutrality. We find a non-neutral regime where, for any given population size, there is an optimal mutation bias that maximizes fitness. Interestingly, this optimal GC usage is small for small populations, large for intermediate populations and around 50% for large populations. This result is robust with respect to the definition of the fitness function and to the protein structures studied. Our model suggests that small populations evolving with small GC usage eventually accumulate a significant selective advantage over populations evolving without this bias. This provides a possible explanation to the observation that most species adopting obligatory intracellular lifestyles with a consequent reduction of effective population size shifted their mutation spectrum towards AT. The model also predicts that large GC usage is optimal for intermediate population size. To test these predictions we estimated the effective population sizes of bacterial species using the optimal codon usage coefficients computed by dos Reis et al. and the synonymous to non-synonymous substitution ratio computed by Daubin and Moran. We found that the population sizes estimated in these ways are significantly smaller for species with small and large GC usage compared to species with no bias, which supports our prediction.

pcbi-1000767-g009: Estimates of quantities correlating with effective population size obtained from genomic data. Upper plot: Optimal codon bias estimated by dos Reis et al. [44] versus GC content at the synonymous third codon position (GC3), shown as the mean and standard error of the mean for three bins of GC3 (smaller than 30%, 40 to 60%, larger than 70%). The error bars represent the standard error of the mean and show that the mean values are significantly different, although the data before averaging are rather broadly distributed within each bin. Bottom plot: Values of the synonymous to non-synonymous substitution ratio computed by Daubin and Moran [46], averaged for pairs of bacteria with low, intermediate and high GC content. Both plots support the notion that species with intermediate GC content are characterized by larger effective population size.

Mentions: The effective population size depends on the breeding structure and the natural history of a population, and in particular it is influenced by the bottlenecks that the population may undergo if a few individuals periodically colonize new environments. Therefore, the effective population size cannot be measured experimentally, but is estimated by fitting some observed population feature to its expected value under evolution in a population of given effective size. Optimal codon usage was used several years ago to estimate the effective population size of Escherichia coli [42]. A recent study supports the existence of a correlation between effective population size and synonymous codon usage [43], and the availability of many complete genomes makes it possible to analyze codon usage on a large scale. Codon usage and mutation bias are intimately correlated. It is commonly believed that the mutation bias, rather than selection for optimal codon usage, ultimately influences the global GC content of a genome [18], [19]. The definition of optimal codon usage underlying the results used here considers the excess frequency of preferred codons relative to the frequency expected under mutation alone, and is therefore not expected to depend on the mutation bias in a trivial way. Dos Reis et al. [44] have recently estimated the optimal codon usage in a large number of prokaryotic species. We use their data rather than the analogous data obtained by Sharp et al. [45], since Dos Reis et al. evaluated the optimal codon usage on the entire genome, whereas Sharp et al. concentrated their attention only on ribosomal genes, which can be a biased sample. Fig. 9 shows the average optimal codon usage versus the average GC content at the third codon position, which is not affected by the selection on the amino acid sequence and is expected to be very strongly correlated with the mutation bias.
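As a rough illustration of the quantity on the horizontal axis of Fig. 9, the GC content at the third codon position of a coding sequence can be computed as sketched below. This is a simplified sketch and not the authors' procedure: the function name `gc3_content` is hypothetical, and it counts every third position, whereas a stricter analysis would restrict the count to synonymous sites (for example by excluding non-degenerate codons and stop codons).

```python
def gc3_content(cds: str) -> float:
    """Fraction of G or C at the third position of each complete codon.

    Simplifying assumption: all third positions are counted, including
    those of non-degenerate codons such as ATG (Met) and TGG (Trp).
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3   # ignore a trailing partial codon
    third = seq[2:usable:3]            # every third base, starting at index 2
    if not third:
        raise ValueError("sequence contains no complete codon")
    return sum(base in "GC" for base in third) / len(third)


if __name__ == "__main__":
    # Toy sequence with codons ATG, GCC, AAA; two of the three third
    # positions are G or C.
    print(gc3_content("ATGGCCAAA"))
```

Averaging this value over all coding sequences of a genome would give a genome-level GC3 estimate of the kind used to place a species in one of the three bins.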
We distinguished species with small (below 30%), intermediate (40 to 60%) and large (above 70%) GC content. Species with intermediate GC content turned out to have significantly larger optimal codon usage, which suggests that they have larger effective population size. The scatter plot and the histogram of the GC content are shown in Figs. 7 and 8 in Text S1. Error bars in the plot represent the standard error of the mean, and show that the mean values are significantly different. However, the data before averaging are rather broadly distributed within each bin.
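The binning described above, together with the distinction between the standard deviation (spread of the data) and the standard error of the mean (uncertainty of the bin average), can be sketched as follows. The function names and the input layout are assumptions made for illustration, and treating the 40% and 60% boundaries as inclusive is also an assumption. Species falling between the bins (30 to 40% and 60 to 70%) are left out, as the three bin definitions imply.

```python
import math
from statistics import mean, stdev


def gc3_bin(gc3: float):
    """Assign a GC3 fraction to one of the three bins, or None if in a gap."""
    if gc3 < 0.30:
        return "low"
    if 0.40 <= gc3 <= 0.60:
        return "intermediate"
    if gc3 > 0.70:
        return "high"
    return None


def bin_summary(records):
    """Summarize (gc3, value) pairs per bin: count, mean, SD and SEM."""
    groups = {}
    for gc3, value in records:
        label = gc3_bin(gc3)
        if label is not None:
            groups.setdefault(label, []).append(value)

    summary = {}
    for label, values in groups.items():
        n = len(values)
        # The sample standard deviation needs at least two data points.
        sd = stdev(values) if n > 1 else float("nan")
        summary[label] = {
            "n": n,
            "mean": mean(values),
            "sd": sd,
            "sem": sd / math.sqrt(n),  # standard error of the mean
        }
    return summary


if __name__ == "__main__":
    # Made-up numbers for illustration only, not data from the paper.
    toy = [(0.25, 0.1), (0.28, 0.3), (0.50, 0.6), (0.55, 0.8),
           (0.65, 0.9), (0.75, 0.2), (0.80, 0.4)]
    for label, stats in bin_summary(toy).items():
        print(label, stats)
```

Because the standard error shrinks with the square root of the number of species in a bin, bin means can differ significantly even when the individual values overlap broadly, which is the situation the text describes.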

