The origin of a novel gene through overprinting in Escherichia coli.

Delaye L, Deluna A, Lazcano A, Becerra A - BMC Evol. Biol. (2008)

Bottom Line: Further, we show that yaaW sequences coding for htgA genes have a slower evolutionary rate than those lacking an overlapped htgA gene. We propose the term janolog (from Jano, the two-faced Roman god) to describe the homology relationship that holds between two genes when one originated through overprinting of the other. One cannot dismiss the possibility that at least a small fraction of the large number of novel ORPhan genes detected in pan-genome and metagenomic studies arose by overprinting.


Affiliation: Facultad de Ciencias, Universidad Nacional Autónoma de México, Apdo. Postal 70-407, Cd. Universitaria, 04510 México DF, México. josedelaye@gmail.com

ABSTRACT

Background: Overlapped genes originate by a) loss of a stop codon among contiguous genes coded in different frames; b) shift to an upstream initiation codon of one of the contiguous genes; or c) by overprinting, whereby a novel open reading frame originates through point mutation inside an existing gene. Although overlapped genes are common in viruses, it is not clear whether overprinting has led to new genes in prokaryotes.

Results: Here we report the origin of a new gene through overprinting in Escherichia coli K12. The htgA gene coding for a positive regulator of the sigma 32 heat shock promoter arose by point mutation in a 123/213 phase within an open reading frame (yaaW) of unknown function, most likely in the lineage leading to E. coli and Shigella sp. Further, we show that yaaW sequences coding for htgA genes have a slower evolutionary rate than those lacking an overlapped htgA gene.

Conclusion: While overprinting has been shown to be rather frequent in the evolution of new genes in viruses, our results suggest that this mechanism has also contributed to the origin of a novel gene in a prokaryote. We propose the term janolog (from Jano, the two-faced Roman god) to describe the homology relationship that holds between two genes when one originated through overprinting of the other. One cannot dismiss the possibility that at least a small fraction of the large number of novel ORPhan genes detected in pan-genome and metagenomic studies arose by overprinting.


Figure 6: Statistical analysis. Distribution of Chi-square values of relative rate tests against the distance of the out-group sequence (O) to node C. Black dots correspond to the first 408 nucleotides of yaaW and crosses correspond to the rest of the gene. The 0.005 and 0.001 significance levels are indicated with dotted lines.

Mentions: In-group yaaW sequences lacking the overlap (B sequences in Figure 3) have accumulated more exclusive mutations (m2 changes in Figure 5) in the first 409 nucleotides than in-group yaaW genes that carry the overlap (A sequences in Figure 3 and m1 changes in Figure 5). This suggests that htgA exerts evolutionary pressure on yaaW within its first 409 nucleotides. Accordingly, we subdivided the yaaW alignment into two sections: the first comprises nucleotides 1 to 408, and the second comprises nucleotides 409 to 714. We then applied the Tajima test [16] to each section independently. As seen in Figure 6, many of the differences are significant at α = 0.05 for the first 408 nucleotides, and some comparisons are significant even at the α = 0.01 level. This is particularly true for the genes encoding the [UniProtKB: O26107] and [UniProtKB: Q9ZJ24] protein sequences (they also align best with the A and B sequences). However, not all comparisons give statistically significant results. Signal erosion in sequences that have experienced more substitutions may partly explain the lack of statistical significance in some relative rate tests, since Chi-square values tend to decrease with increasing genetic distance (Figure 6).
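The relative rate comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline. It assumes the standard form of the Tajima relative rate test, in which m1 counts sites where only sequence A differs from the other two, m2 counts sites where only sequence B differs, and the statistic (m1 - m2)^2 / (m1 + m2) is compared against a Chi-square distribution with one degree of freedom. The function name `tajima_relative_rate`, its parameters, and the toy sequences are hypothetical, and the input is assumed to be three pre-aligned nucleotide strings of equal length.

```python
from math import erfc, sqrt


def tajima_relative_rate(seq_a, seq_b, seq_o, start=0, end=None):
    """Tajima relative rate test on three aligned sequences.

    seq_a, seq_b: in-group sequences; seq_o: out-group sequence.
    start, end: 0-based, end-exclusive slice of the alignment to test.
    Returns (m1, m2, chi_square, p_value).
    """
    if not (len(seq_a) == len(seq_b) == len(seq_o)):
        raise ValueError("sequences must be aligned to the same length")

    valid = set("ACGT")
    m1 = m2 = 0
    columns = zip(seq_a[start:end], seq_b[start:end], seq_o[start:end])
    for a, b, o in columns:
        a, b, o = a.upper(), b.upper(), o.upper()
        # Skip columns containing gaps or ambiguous bases.
        if not {a, b, o} <= valid:
            continue
        if a != b and b == o:
            m1 += 1  # change exclusive to sequence A
        elif a != b and a == o:
            m2 += 1  # change exclusive to sequence B

    if m1 + m2 == 0:
        # No informative sites: no evidence of a rate difference.
        return m1, m2, 0.0, 1.0

    chi_square = (m1 - m2) ** 2 / (m1 + m2)
    # Survival function of the Chi-square distribution with 1 d.f.
    p_value = erfc(sqrt(chi_square / 2.0))
    return m1, m2, chi_square, p_value


# Toy alignment (hypothetical data, not taken from the paper).
out_group = "AAAAAAAAAA"
with_overlap = "CAAAAAAAAA"     # plays the role of an A sequence
without_overlap = "ACCCCCAAAA"  # plays the role of a B sequence

m1, m2, chi_square, p_value = tajima_relative_rate(
    with_overlap, without_overlap, out_group
)
print(m1, m2, round(chi_square, 3), round(p_value, 3))
```

With a real alignment, the two sections described in the text would correspond to `start=0, end=408` for nucleotides 1 to 408 and `start=408` for nucleotides 409 to 714, because Python slices are 0-based and end-exclusive.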

