Blueprint for a minimal photoautotrophic cell: conserved and variable genes in Synechococcus elongatus PCC 7942.

Delaye L, González-Domenech CM, Garcillán-Barcia MP, Peretó J, de la Cruz F, Moya A - BMC Genomics (2011)

Bottom Line: We considered that genes in genomic islands could be found if they showed a combination of: a) unusual G+C content; b) unusual phylogenetic similarity; and/or c) a small number of the highly iterated palindrome 1 (HIP1) motif plus an unusual codon usage. Our estimates of genomic islands in PCC 7942 are larger than those predicted by other published methods like SIGI-HMM. Our results set a guide to non-essential genes in S. elongatus PCC 7942 indicating a path towards the engineering of a model photoautotrophic bacterial cell.


Affiliation: Institut Cavanilles de Biodiversitat i Biologia Evolutiva, Universitat de València, Valencia, Spain.

ABSTRACT

Background: Simpler biological systems should be easier to understand and to engineer towards pre-defined goals. One way to achieve biological simplicity is through genome minimization. Here we looked for genomic islands in the freshwater cyanobacterium Synechococcus elongatus PCC 7942 (genome size 2.7 Mb) that could be used as targets for deletion. We also looked for conserved genes that might be essential for cell survival.

Results: By using a combination of methods, we identified 170 xenologs, 136 ORFans and 1401 core genes in the genome of S. elongatus PCC 7942. These represent 6.5%, 5.2% and 53.6% of the annotated genes, respectively. We considered that genes in genomic islands could be found if they showed a combination of: a) unusual G+C content; b) unusual phylogenetic similarity; and/or c) a small number of the highly iterated palindrome 1 (HIP1) motif plus an unusual codon usage. The origin of the largest genomic island by horizontal gene transfer (HGT) could be corroborated by lack of coverage among metagenomic sequences from a freshwater microbialite. Evidence is also presented that xenologous genes tend to cluster in operons. Interestingly, most genes coding for proteins with a diguanylate cyclase domain are predicted to be xenologs, suggesting a role for horizontal gene transfer in the evolution of Synechococcus sensory systems.

Conclusions: Our estimates of genomic islands in PCC 7942 are larger than those predicted by other published methods like SIGI-HMM. Our results set a guide to non-essential genes in S. elongatus PCC 7942 indicating a path towards the engineering of a model photoautotrophic bacterial cell.
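As an illustration of criterion (a) above, the following is a minimal sketch of how genes with unusual G+C content might be flagged. It is not the authors' pipeline: the function names, the z-score approach and the cutoff of 2 standard deviations are all assumptions chosen for illustration, and the abstract does not state which statistic or threshold was used.

```python
from statistics import mean, pstdev


def gc_fraction(seq):
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def flag_unusual_gc(genes, z_cutoff=2.0):
    """Return the names of genes whose G+C content is an outlier.

    genes    -- dict mapping a gene name to its coding sequence
    z_cutoff -- hypothetical threshold (in standard deviations from the
                genome-wide mean); an assumption, not a value from the paper
    """
    gc = {name: gc_fraction(seq) for name, seq in genes.items()}
    mu = mean(gc.values())
    sigma = pstdev(gc.values())
    if sigma == 0:
        # All genes have identical G+C content, so nothing stands out.
        return []
    return sorted(name for name, value in gc.items()
                  if abs(value - mu) / sigma > z_cutoff)
```

A gene flagged this way would be only a candidate: in the approach described above, G+C content is one signal among several, alongside phylogenetic similarity, HIP1 scarcity and codon usage.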



Figure 3: Distribution of HIP1 motifs per ORF in PCC 7942 versus ORF length. Each dot in the figure represents one gene from the PCC 7942 genome. Genes are plotted by their number of coded amino acids (X-axis) and by the number of HIP1 motifs in their coding sequence (Y-axis). Black triangles show the mean of the size distribution for each category (0 HIP1 motifs, 1 HIP1 motif, etc.); black circles indicate the positions one standard deviation from the mean; green dots represent ORFs with a codon usage reminiscent of highly expressed genes (PHX); blue dots, genes with codon usage similar to (but not as pronounced as) that of highly expressed genes (PX); red dots, putative alien genes (PA); and gray dots, other genes.

Mentions: A fortunate genomic peculiarity of PCC 7942 helped us in the complex task of identifying xenologous genes. We used the scarcity of a highly iterated palindrome (5'-GCGATCGC-3'), designated HIP1 [16], in combination with unusual codon usage as a signal of recent xenology. HIP1 sequences are thought not to be mobile within the genome, but rather to form in situ through mutation [17]. HIP1 sequences are clearly overrepresented in the genome of PCC 7942 (7402 copies instead of the 60 expected). The number of HIP1 sequences per ORF in PCC 7942 ranges from 0 to 20 and their distribution is apparently random (Figure 2). As expected for a random process, the number of copies of HIP1 per gene is related to gene size (Figure 3), with some outstanding cases of ORFs having too few copies of HIP1 relative to their length.
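The comparison of observed and expected HIP1 counts described above can be sketched as follows. This is a hedged illustration, not the authors' method: the source does not say how the expected value was derived, so the independent-base model here (each base drawn independently, with G and C equally frequent and A and T equally frequent) is an assumption, as are the function names.

```python
HIP1 = "GCGATCGC"  # palindromic: identical to its own reverse complement


def count_motif(seq, motif=HIP1):
    """Count occurrences of motif in seq, including overlapping ones.

    Because HIP1 is its own reverse complement, a count on one strand
    also covers the opposite strand.
    """
    seq = seq.upper()
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count


def expected_count(length, gc_fraction, motif=HIP1):
    """Expected number of motif occurrences in a random sequence.

    Assumed model (not from the source): bases are independent, with
    P(G) = P(C) = gc_fraction / 2 and P(A) = P(T) = (1 - gc_fraction) / 2.
    """
    p = {
        "G": gc_fraction / 2,
        "C": gc_fraction / 2,
        "A": (1 - gc_fraction) / 2,
        "T": (1 - gc_fraction) / 2,
    }
    prob = 1.0
    for base in motif:
        prob *= p[base]
    # Number of positions at which the motif could start.
    positions = max(length - len(motif) + 1, 0)
    return positions * prob
```

Applied per ORF, the ratio of `count_motif(orf)` to `expected_count(len(orf), gc)` would give one rough way to spot genes with too few HIP1 copies for their length. Any cutoff on that ratio would be a further assumption.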

