Evolution of the genetic code: partial optimization of a random code for robustness to translation error in a rugged fitness landscape.

Novozhilov AS, Wolf YI, Koonin EV - Biol. Direct (2007)

Bottom Line: It has been repeatedly argued that this structure of the code results from selective optimization for robustness to translation errors, such that translational misreading has the minimal adverse effect. The properties of the standard code were compared to the properties of four sets of codes, namely, purely random codes, random codes that are more robust than the standard code, and two sets of codes that resulted from optimization of the first two sets. The reason the code is not fully optimized could be the trade-off between the beneficial effect of increasing robustness to translation errors and the deleterious effect of codon series reassignment that becomes increasingly severe with the growing complexity of the evolving system.


Affiliation: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA. novozhil@ncbi.nlm.nih.gov

ABSTRACT

Background: The standard genetic code table has a distinctly non-random structure, with similar amino acids often encoded by codon series that differ by a single nucleotide substitution, typically in the third or the first position of the codon. It has been repeatedly argued that this structure of the code results from selective optimization for robustness to translation errors, such that translational misreading has the minimal adverse effect. Indeed, it has been shown in several studies that the standard code is more robust than a substantial majority of random codes. However, it remains unclear how much evolution the standard code underwent, what its level of optimization is, and what the likely starting point was.
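
The robustness comparison can be made concrete with a cost function of the kind commonly used in this literature: average, over all single-nucleotide misreadings of all codons, the squared change in some amino acid property. Below is a minimal Python sketch, assuming this mean-squared-change form and using Woese's polar requirement scale (PRS) as the property; the cost function actually used in the paper weights misreadings by codon position and transition/transversion type, so this is an illustration rather than the authors' exact function.

```python
BASES = "UCAG"

# The standard genetic code; codons in the order UUU, UUC, UUA, UUG,
# UCU, ... GGG, with '*' marking stop codons.
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
STANDARD_CODE = dict(zip(
    CODONS,
    "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"))

# Woese's polar requirement scale (PRS), a common choice of
# amino-acid property in code-robustness studies.
POLAR_REQUIREMENT = {
    "F": 5.0, "L": 4.9, "I": 4.9, "M": 5.3, "V": 5.6,
    "S": 7.5, "P": 6.6, "T": 6.6, "A": 7.0, "Y": 5.4,
    "H": 8.4, "Q": 8.6, "N": 10.0, "K": 10.1, "D": 13.0,
    "E": 12.5, "C": 4.8, "W": 5.2, "R": 9.1, "G": 7.9,
}

def code_cost(code, prop=POLAR_REQUIREMENT):
    """Mean squared change of an amino-acid property over all
    single-nucleotide misreadings; lower cost = more robust code."""
    total, n = 0.0, 0
    for codon, aa in code.items():
        if aa == "*":
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                aa2 = code[codon[:pos] + base + codon[pos + 1:]]
                if aa2 == "*":
                    continue  # misreadings to stop codons are skipped
                total += (prop[aa] - prop[aa2]) ** 2
                n += 1
    return total / n
```

In this framework, a code is "more robust than a substantial majority of random codes" if its cost falls in the low tail of the cost distribution obtained by randomly shuffling amino acid assignments among the codon blocks.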

Results: We explored possible evolutionary trajectories of the genetic code within a limited domain of the vast space of possible codes. Only codes that possess the same block structure and the same degree of degeneracy as the standard code were analyzed for robustness to translation error. This choice of a small part of the vast space of possible codes is based on the notion that the block structure of the standard code is a consequence of the structure of the complex between the cognate tRNA and the codon in mRNA, where the third base of the codon plays a minimal role as a specificity determinant. Within this part of the fitness landscape, a simple evolutionary algorithm, with elementary evolutionary steps comprising swaps of four-codon or two-codon series, was employed to investigate the optimization of codes for the maximum attainable robustness. The properties of the standard code were compared to the properties of four sets of codes, namely, purely random codes, random codes that are more robust than the standard code, and two sets of codes that resulted from optimization of the first two sets. The comparison of these sets of codes with the standard code and its locally optimized version showed that, on average, optimization of random codes yielded evolutionary trajectories that converged at the same level of robustness to translation errors as the optimization path of the standard code; however, the standard code required considerably fewer steps to reach that level than an average random code. When evolution starts from random codes whose fitness is comparable to that of the standard code, they typically reach a much higher level of optimization than the standard code, i.e., the standard code is much closer to its local optimum (fitness peak) than most of the random codes with similar levels of robustness. Thus, the standard genetic code appears to be a point on an evolutionary trajectory from a random point (code) about halfway to the summit of the local peak. The fitness landscape of code evolution appears to be extremely rugged, containing numerous peaks with a broad distribution of heights, and the standard code is relatively unremarkable, being located on the slope of a moderate-height peak.
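
A minimal sketch of an evolutionary algorithm of the kind described above, assuming a greedy accept-if-better rule (the paper's exact acceptance criterion and step bookkeeping may differ). Synonymous codon series are derived here as same-amino-acid codons within a "family box" (codons sharing their first two bases), and an elementary step swaps the amino acid assignments of two equal-size series, which preserves the block structure and degeneracy. It reuses STANDARD_CODE and code_cost from the previous sketch.

```python
import random
from collections import defaultdict

def blocks_of(code):
    """Synonymous codon series: codons in the same 'family box' (same
    first two bases) that encode the same amino acid."""
    groups = defaultdict(list)
    for codon, aa in code.items():
        groups[(codon[:2], aa)].append(codon)
    return list(groups.values())

def optimize(code, steps=20_000, seed=0):
    """Greedy hill climb: propose swapping the amino-acid assignments
    of two equal-size codon series; accept if the cost drops. Stop
    codons stay fixed and only two- and four-codon series are swapped,
    so the block structure and degeneracy are preserved."""
    rng = random.Random(seed)
    code, cost = dict(code), code_cost(code)
    blocks = [b for b in blocks_of(code)
              if code[b[0]] != "*" and len(b) in (2, 4)]
    for _ in range(steps):
        b1, b2 = rng.sample(blocks, 2)
        if len(b1) != len(b2):
            continue
        trial = dict(code)
        aa1, aa2 = code[b1[0]], code[b2[0]]
        for c in b1:
            trial[c] = aa2
        for c in b2:
            trial[c] = aa1
        new_cost = code_cost(trial)
        if new_cost < cost:  # accept only improvements (greedy)
            code, cost = trial, new_cost
    return code
```

Counting the accepted swaps before a code stops improving yields the "number of steps to the local peak" statistic that underlies the comparison above.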

Conclusion: The standard code appears to be the result of partial optimization of a random code for robustness to errors of translation. The reason the code is not fully optimized could be the trade-off between the beneficial effect of increasing robustness to translation errors and the deleterious effect of codon series reassignment that becomes increasingly severe with the growing complexity of the evolving system. Thus, the evolution of the code can be represented as a combination of adaptation and frozen accident.


Figure 6: Projection of the code maps onto the plane of the first two principal components (see text for details). Red 'x' signs, random codes, r; red circles, codes resulting from optimization of random codes, o; green squares, random codes that perform better than the standard code, R; green asterisks, codes resulting from optimization of the set R, O; blue square, the standard code; blue asterisk, the code resulting from the optimization of the standard code. (a) PRS; (b) the Gilis matrix.

Mentions: To visualize the relationships between the four sets of codes, we employed the following procedure. For each code, we define its map as the vector of the shortest mutational distances between amino acids (there are 190 elements in this vector, one for each unordered pair of the 20 amino acids). The distance between any two codons is calculated as the weighted sum of the nucleotide substitutions involved, with the weights assigned according to (2). For instance, in the standard code, Ala is encoded by four codons {GCU, GCC, GCA, GCG}, whereas Arg is encoded by six codons {CGU, CGC, CGA, CGG, AGA, AGG}, and, hence, at least two nucleotide substitutions are required to replace an Arg with an Ala (e.g., AGA to GCA). The distance between these two amino acids, then, is D(Arg, Ala) = 1 (first position, transition) + 10 (second position, transversion) + 0 (third position, no substitution) = 11. The code map takes these distances into account without specifying the exact positions of the codons for individual amino acids in the code table. Using principal component analysis (PCA), the code maps for the sets r, R, o, and O were projected onto the plane of the first two principal components; these two components account for 20% of the variation in the data in the case of the PRS (Fig. 6a) and for 27% in the case of the Gilis matrix (Fig. 6b).
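
The code-map construction and PCA projection can be sketched as follows. Only two substitution weights are recoverable from the worked example above (first-position transition = 1, second-position transversion = 10); the other entries of WEIGHTS below are hypothetical placeholders, and the paper's expression (2) defines the actual values. The projection uses a plain centered SVD, which is equivalent to PCA on the covariance matrix but is not necessarily the authors' exact procedure.

```python
import numpy as np
from collections import defaultdict
from itertools import combinations

# (transition, transversion) weight per codon position. Only the two
# values marked (*) come from the worked Arg->Ala example; the rest
# are hypothetical placeholders -- see expression (2) in the paper.
WEIGHTS = {
    0: (1.0, 2.0),   # first position:  transition = 1 (*), transversion assumed
    1: (5.0, 10.0),  # second position: transversion = 10 (*), transition assumed
    2: (0.1, 0.2),   # third position:  both assumed (small, near-neutral)
}
PURINES = frozenset("AG")

def codon_distance(c1, c2):
    """Weighted sum of the nucleotide substitutions separating two codons."""
    d = 0.0
    for pos in range(3):
        a, b = c1[pos], c2[pos]
        if a != b:
            transition = (a in PURINES) == (b in PURINES)
            d += WEIGHTS[pos][0 if transition else 1]
    return d

def code_map(code):
    """Vector of shortest mutational distances between amino acids:
    190 entries, the minimum codon-to-codon distance for each
    unordered pair of the 20 amino acids."""
    by_aa = defaultdict(list)
    for codon, aa in code.items():
        if aa != "*":
            by_aa[aa].append(codon)
    pairs = combinations(sorted(by_aa), 2)
    return np.array([min(codon_distance(c1, c2)
                         for c1 in by_aa[a1] for c2 in by_aa[a2])
                     for a1, a2 in pairs])

def project_2d(maps):
    """PCA via SVD: project code maps onto the plane of the first two
    principal components (rows are codes, columns the 190 distances)."""
    X = np.asarray(maps, dtype=float)
    X = X - X.mean(axis=0)              # center each column
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return X @ vt[:2].T                 # (n_codes, 2) coordinates
```

Stacking the maps of all codes in the sets r, o, R, and O together with the standard code and passing them to project_2d reproduces the kind of scatter shown in Fig. 6.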

