Limits...
Blueprint for a minimal photoautotrophic cell: conserved and variable genes in Synechococcus elongatus PCC 7942.

Delaye L, González-Domenech CM, Garcillán-Barcia MP, Peretó J, de la Cruz F, Moya A - BMC Genomics (2011)

Bottom Line: We considered that genes in genomic islands could be found if they showed a combination of: a) unusual G+C content; b) unusual phylogenetic similarity; and/or c) a small number of the highly iterated palindrome 1 (HIP1) motif plus an unusual codon usage.Our estimates of genomic islands in PCC 7942 are larger than those predicted by other published methods like SIGI-HMM.Our results set a guide to non-essential genes in S. elongatus PCC 7942 indicating a path towards the engineering of a model photoautotrophic bacterial cell.

View Article: PubMed Central - HTML - PubMed

Affiliation: Institut Cavanilles de Biodiversitat i Biologia Evolutiva, Universitat de València, Valencia, Spain.

ABSTRACT

Background: Simpler biological systems should be easier to understand and to engineer towards pre-defined goals. One way to achieve biological simplicity is through genome minimization. Here we looked for genomic islands in the fresh water cyanobacteria Synechococcus elongatus PCC 7942 (genome size 2.7 Mb) that could be used as targets for deletion. We also looked for conserved genes that might be essential for cell survival.

Results: By using a combination of methods we identified 170 xenologs, 136 ORFans and 1401 core genes in the genome of S. elongatus PCC 7942. These represent 6.5%, 5.2% and 53.6% of the annotated genes respectively. We considered that genes in genomic islands could be found if they showed a combination of: a) unusual G+C content; b) unusual phylogenetic similarity; and/or c) a small number of the highly iterated palindrome 1 (HIP1) motif plus an unusual codon usage. The origin of the largest genomic island by horizontal gene transfer (HGT) could be corroborated by lack of coverage among metagenomic sequences from a fresh water microbialite. Evidence is also presented that xenologous genes tend to cluster in operons. Interestingly, most genes coding for proteins with a diguanylate cyclase domain are predicted to be xenologs, suggesting a role for horizontal gene transfer in the evolution of Synechococcus sensory systems.

Conclusions: Our estimates of genomic islands in PCC 7942 are larger than those predicted by other published methods like SIGI-HMM. Our results set a guide to non-essential genes in S. elongatus PCC 7942 indicating a path towards the engineering of a model photoautotrophic bacterial cell.

Show MeSH
Identification of xenologs in PCC 7942 by combining different sources of information. Cases of horizontally transferred genes into the genome of PCC 7942 were identified by having at least two of the following properties: G+C content deviating more than 1 SD from the mean value of all genes; low number of HIP1 motifs relative to gene length well as an unusual codon usage (PA genes); and an unusual phylogenetic relationship. ORFan genes were also considered.
© Copyright Policy - open-access
Related In: Results  -  Collection

License
getmorefigures.php?uid=PMC3025956&req=5

Figure 8: Identification of xenologs in PCC 7942 by combining different sources of information. Cases of horizontally transferred genes into the genome of PCC 7942 were identified by having at least two of the following properties: G+C content deviating more than 1 SD from the mean value of all genes; low number of HIP1 motifs relative to gene length well as an unusual codon usage (PA genes); and an unusual phylogenetic relationship. ORFan genes were also considered.

Mentions: No single method is expected to identify all cases of HGT. For instance, composition based methods are biased towards the detection of relatively recent cases of HTG [22]. On the other hand, phylogenetic based methods detect ancient HGT events [23]. When looking for islands of variability in a genome where there are no genomes from different strains to compare, the best approach is to use a combination of methods, in order to avoid as much as possible the detection of false positives [24]. We compared the set of genes having an unusual G+C content, with those identified as PA (detected by atypical codon usage and low number of HIP1 motifs), and those having a nearest neighbor or best BLAST hit outside cyanobacteria. Those genes identified at least by two of the three methods were assumed to be xenologs. As shown in Figure 8, we detected 128 such cases. We have repeated the analysis with different SD cutoff values for the identification of PA, PX and PHX genes, and for the identification of genes deviating in G+C content. To evaluate the performance of these different cutoffs we used as a control set 323 genes that have not been subject to HGT among cyanobacteria [11]. The usage of 1SD for both cutoffs seems to be the best compromise between xenolog detection and avoidance of likely false positives (see Additional File 2).


Blueprint for a minimal photoautotrophic cell: conserved and variable genes in Synechococcus elongatus PCC 7942.

Delaye L, González-Domenech CM, Garcillán-Barcia MP, Peretó J, de la Cruz F, Moya A - BMC Genomics (2011)

Identification of xenologs in PCC 7942 by combining different sources of information. Cases of horizontally transferred genes into the genome of PCC 7942 were identified by having at least two of the following properties: G+C content deviating more than 1 SD from the mean value of all genes; low number of HIP1 motifs relative to gene length well as an unusual codon usage (PA genes); and an unusual phylogenetic relationship. ORFan genes were also considered.
© Copyright Policy - open-access
Related In: Results  -  Collection

License
Show All Figures
getmorefigures.php?uid=PMC3025956&req=5

Figure 8: Identification of xenologs in PCC 7942 by combining different sources of information. Cases of horizontally transferred genes into the genome of PCC 7942 were identified by having at least two of the following properties: G+C content deviating more than 1 SD from the mean value of all genes; low number of HIP1 motifs relative to gene length well as an unusual codon usage (PA genes); and an unusual phylogenetic relationship. ORFan genes were also considered.
Mentions: No single method is expected to identify all cases of HGT. For instance, composition based methods are biased towards the detection of relatively recent cases of HTG [22]. On the other hand, phylogenetic based methods detect ancient HGT events [23]. When looking for islands of variability in a genome where there are no genomes from different strains to compare, the best approach is to use a combination of methods, in order to avoid as much as possible the detection of false positives [24]. We compared the set of genes having an unusual G+C content, with those identified as PA (detected by atypical codon usage and low number of HIP1 motifs), and those having a nearest neighbor or best BLAST hit outside cyanobacteria. Those genes identified at least by two of the three methods were assumed to be xenologs. As shown in Figure 8, we detected 128 such cases. We have repeated the analysis with different SD cutoff values for the identification of PA, PX and PHX genes, and for the identification of genes deviating in G+C content. To evaluate the performance of these different cutoffs we used as a control set 323 genes that have not been subject to HGT among cyanobacteria [11]. The usage of 1SD for both cutoffs seems to be the best compromise between xenolog detection and avoidance of likely false positives (see Additional File 2).

Bottom Line: We considered that genes in genomic islands could be found if they showed a combination of: a) unusual G+C content; b) unusual phylogenetic similarity; and/or c) a small number of the highly iterated palindrome 1 (HIP1) motif plus an unusual codon usage.Our estimates of genomic islands in PCC 7942 are larger than those predicted by other published methods like SIGI-HMM.Our results set a guide to non-essential genes in S. elongatus PCC 7942 indicating a path towards the engineering of a model photoautotrophic bacterial cell.

View Article: PubMed Central - HTML - PubMed

Affiliation: Institut Cavanilles de Biodiversitat i Biologia Evolutiva, Universitat de València, Valencia, Spain.

ABSTRACT

Background: Simpler biological systems should be easier to understand and to engineer towards pre-defined goals. One way to achieve biological simplicity is through genome minimization. Here we looked for genomic islands in the fresh water cyanobacteria Synechococcus elongatus PCC 7942 (genome size 2.7 Mb) that could be used as targets for deletion. We also looked for conserved genes that might be essential for cell survival.

Results: By using a combination of methods we identified 170 xenologs, 136 ORFans and 1401 core genes in the genome of S. elongatus PCC 7942. These represent 6.5%, 5.2% and 53.6% of the annotated genes respectively. We considered that genes in genomic islands could be found if they showed a combination of: a) unusual G+C content; b) unusual phylogenetic similarity; and/or c) a small number of the highly iterated palindrome 1 (HIP1) motif plus an unusual codon usage. The origin of the largest genomic island by horizontal gene transfer (HGT) could be corroborated by lack of coverage among metagenomic sequences from a fresh water microbialite. Evidence is also presented that xenologous genes tend to cluster in operons. Interestingly, most genes coding for proteins with a diguanylate cyclase domain are predicted to be xenologs, suggesting a role for horizontal gene transfer in the evolution of Synechococcus sensory systems.

Conclusions: Our estimates of genomic islands in PCC 7942 are larger than those predicted by other published methods like SIGI-HMM. Our results set a guide to non-essential genes in S. elongatus PCC 7942 indicating a path towards the engineering of a model photoautotrophic bacterial cell.

Show MeSH