A four-column theory for the origin of the genetic code: tracing the evolutionary pathways that gave rise to an optimized code.

Higgs PG - Biol. Direct (2009)

Bottom Line: Hence, the effects of translational error are minimized with respect to randomly reshuffled codes. As a result, the properties of the amino acids in the final code retain a four-column pattern that is a relic of the earliest stages of code evolution. Nevertheless, the code that results is one in which translational error is minimized.


Affiliation: Department of Physics and Astronomy, McMaster University, Hamilton, Ontario L8S 4M1, Canada. higgsp@mcmaster.ca

ABSTRACT

Background: The arrangement of the amino acids in the genetic code is such that neighbouring codons are assigned to amino acids with similar physical properties. Hence, the effects of translational error are minimized with respect to randomly reshuffled codes. Further inspection reveals that it is amino acids in the same column of the code (i.e. same second base) that are similar, whereas those in the same row show no particular similarity. We propose a 'four-column' theory for the origin of the code that explains how the action of selection during the build-up of the code leads to a final code that has the observed properties.
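The column-versus-row observation can be probed with a short calculation. The sketch below is illustrative only: it assumes Kyte-Doolittle hydropathy as the single physical property (not necessarily the measure used in the paper), skips stop codons, and weights all single-base changes equally. A change at the first base stays within a column, while a change at the second base moves between columns, so a low first-position value relative to the second-position value is what the text describes. The helper name `mean_change` is introduced here for illustration.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered with U, C, A, G at each position.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Kyte-Doolittle hydropathy values (assumed property for this sketch).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_change(position):
    """Mean |hydropathy difference| over single-base changes at one codon position."""
    diffs = []
    for codon, aa in STANDARD.items():
        for base in BASES:
            if base == codon[position]:
                continue
            mutant = codon[:position] + base + codon[position + 1:]
            other = STANDARD[mutant]
            if aa == "*" or other == "*":
                continue  # stop codons are skipped in this sketch
            diffs.append(abs(HYDROPATHY[aa] - HYDROPATHY[other]))
    return sum(diffs) / len(diffs)


for pos in range(3):
    print(f"position {pos + 1}: {mean_change(pos):.2f}")
```

Under these assumptions, second-position changes give the largest mean difference and third-position changes the smallest, consistent with similarity running down columns rather than along rows.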

Results: The theory makes the following propositions. (i) The earliest amino acids in the code were those that are easiest to synthesize non-biologically, namely Gly, Ala, Asp, Glu and Val. (ii) These amino acids are assigned to codons with G at first position. Therefore, the first code may have used only these codons. (iii) The code rapidly developed into a four-column code where all codons in the same column coded for the same amino acid: NUN = Val, NCN = Ala, NAN = Asp and/or Glu, and NGN = Gly. (iv) Later amino acids were added sequentially to the code by a process of subdivision of codon blocks in which a subset of the codons assigned to an early amino acid was reassigned to a later amino acid. (v) Later amino acids were added into positions formerly occupied by amino acids with similar properties because this can occur with minimal disruption to the proteins already encoded by the earlier code. As a result, the properties of the amino acids in the final code retain a four-column pattern that is a relic of the earliest stages of code evolution.
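Propositions (iii) to (v) can be sketched in a few lines. This is a toy illustration, not the cost or barrier function of the paper: it assumes the NAN column codes only for Asp, uses Kyte-Doolittle hydropathy as a stand-in similarity measure, and defines a hypothetical `disruption` score as the mean property change over the codons whose meaning changes. Leu is used as the later amino acid because it occupies the CUN block in the modern code.

```python
from itertools import product

BASES = "UCAG"

# Four-column assignment from proposition (iii); NAN is taken as Asp only,
# which is one of the possibilities the text allows.
COLUMN_AA = {"U": "Val", "C": "Ala", "A": "Asp", "G": "Gly"}

# Kyte-Doolittle hydropathy values (assumed similarity measure for this sketch).
HYDROPATHY = {"Val": 4.2, "Ala": 1.8, "Asp": -3.5, "Gly": -0.4, "Leu": 3.8}


def four_column_code():
    """Map every codon to the amino acid set by its second base."""
    return {"".join(c): COLUMN_AA[c[1]] for c in product(BASES, repeat=3)}


def subdivide(code, first_base, second_base, new_aa):
    """Return a copy of code with the block first_base+second_base+N reassigned."""
    new_code = dict(code)
    for third in BASES:
        new_code[first_base + second_base + third] = new_aa
    return new_code


def disruption(code, new_code):
    """Mean hydropathy change over the codons whose meaning changed."""
    changed = [c for c in code if code[c] != new_code[c]]
    total = sum(abs(HYDROPATHY[code[c]] - HYDROPATHY[new_code[c]]) for c in changed)
    return total / len(changed)


code = four_column_code()
similar = disruption(code, subdivide(code, "C", "U", "Leu"))     # Leu into the Val column
dissimilar = disruption(code, subdivide(code, "C", "A", "Leu"))  # Leu into the Asp column
print(f"Leu into NUN column: {similar:.1f}")
print(f"Leu into NAN column: {dissimilar:.1f}")
```

With these assumed values, placing Leu in the Val column changes existing proteins far less than placing it in the Asp column, which is the sense of "minimal disruption" in proposition (v).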

Conclusion: The driving force during this process is not the minimization of translational error, but positive selection for the increased diversity and functionality of the proteins that can be made with a larger amino acid alphabet. Nevertheless, the code that results is one in which translational error is minimized. We define a cost function with which we can compare the fitness of codes with varying numbers of amino acids, and a barrier function, which measures the change in cost immediately after addition of a new amino acid. We show that the barrier is positive if an amino acid is added into a column with dissimilar properties, but negative if an amino acid is added into a column with similar physical properties. Thus, natural selection favours the assignment of amino acids to the positions that they occupy in the final code.

Figure 2: Proposed four-column structure of the earliest genetic code.

Mentions: It is remarkable that the top 5 amino acids on our list (i.e. those with highest prebiotic concentrations) are precisely those that occupy codons with G at first position. This leads us to propose a very early version of the code that used only these GNN codons. This same early stage of the code is also proposed in a recent version of the coevolution theory [22], and this also forms the starting point for several much earlier treatments of genetic code evolution [17,44-47]. It has also been proposed that a regular pattern of G's at first position could have been important in keeping the early ribosome in frame [48]. However, if three quarters of the codons were unassigned, then all mutations occurring at first position would render the gene non-functional or impossible to translate. We therefore propose that the code rapidly expanded to give the four-column structure in Figure 2. As both Asp and Glu are in our top 5 early amino acids, it is possible that the third column could have been Glu instead of Asp, or that there was a mixture of Asp and Glu codons in this column, or that this column coded both Asp and Glu ambiguously. For concreteness in Figure 2, we have taken the simplest possibility, which is that this column coded only for Asp.
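The argument about unassigned codons can be checked by enumeration. The sketch below assumes that the GNN-only code used the same assignments as the GNN codons of the modern standard code; the variable and function names are illustrative.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered with U, C, A, G at each position.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Hypothetical earliest code: only codons with G at first position are assigned.
gnn_code = {codon: aa for codon, aa in STANDARD.items() if codon[0] == "G"}


def first_position_mutants(codon):
    """All codons that differ from `codon` only at the first base."""
    return [b + codon[1:] for b in BASES if b != codon[0]]


mutants = [m for codon in gnn_code for m in first_position_mutants(codon)]
unassigned = [m for m in mutants if m not in gnn_code]

print(sorted(set(gnn_code.values())))
print(len(gnn_code), "of", len(STANDARD), "codons assigned")
print(len(unassigned), "of", len(mutants), "first-position mutants are unassigned")
```

The GNN block holds exactly Ala, Asp, Glu, Gly and Val (one-letter codes A, D, E, G, V), and every first-position mutation of a GNN codon lands on an unassigned codon, which is the problem the four-column expansion removes.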

