Archaeal Clusters of Orthologous Genes (arCOGs): An Update and Application for Analysis of Shared Features between Thermococcales, Methanococcales, and Methanobacteriales.

Makarova KS, Wolf YI, Koonin EV - Life (Basel) (2015)

Bottom Line: Assessment of the current archaeal genome annotation in public databases indicates that consistent use of arCOGs can significantly improve the annotation quality. The results of phylogenomic analysis that involved both comparison of multiple phylogenetic trees and a search for putative derived shared characters by using phyletic patterns extracted from the arCOGs reveal a likely evolutionary relationship between the Thermococci, Methanococci, and Methanobacteria. The arCOGs are expected to be instrumental for a comprehensive phylogenomic study of the archaea.


Affiliation: National Center for Biotechnology Information, NLM, National Institutes of Health, Bethesda, MD 20894, USA. makarova@ncbi.nlm.nih.gov.

ABSTRACT
With the continuously accelerating genome sequencing from diverse groups of archaea and bacteria, accurate identification of gene orthology and availability of readily expandable clusters of orthologous genes are essential for the functional annotation of new genomes. We report an update of the collection of archaeal Clusters of Orthologous Genes (arCOGs) to cover, on average, 91% of the protein-coding genes in 168 archaeal genomes. The new arCOGs were constructed using refined algorithms for orthology identification combined with extensive manual curation, including incorporation of the results of several completed and ongoing research projects in archaeal genomics. A new level of classification is introduced: superclusters that unite two or more arCOGs and reflect gene family evolution more completely than individual, disconnected arCOGs. Assessment of the current archaeal genome annotation in public databases indicates that consistent use of arCOGs can significantly improve the annotation quality. In addition to their utility for genome annotation, arCOGs are also a platform for phylogenomic analysis. We explore this aspect of arCOGs by performing a phylogenomic study of the Thermococci, which are traditionally viewed as the basal branch of the Euryarchaeota. The results of phylogenomic analysis that involved both comparison of multiple phylogenetic trees and a search for putative derived shared characters by using phyletic patterns extracted from the arCOGs reveal a likely evolutionary relationship between the Thermococci, Methanococci, and Methanobacteria. The arCOGs are expected to be instrumental for a comprehensive phylogenomic study of the archaea.
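
To make the notion of a phyletic pattern concrete, the following is a minimal sketch, not the authors' procedure. It treats a phyletic pattern as the set of genomes in which an arCOG has at least one member, and then looks for arCOGs found in every genome of the three lineages of interest and in no other genome. The genome names, the arCOG identifiers (arCOG_A, arCOG_B, arCOG_C), and the strict all-or-none criterion are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical toy input: (genome, arCOG) membership pairs.
# All names below are placeholders, not real genomes or arCOG ids.
MEMBERSHIPS = [
    ("genome_T1", "arCOG_A"), ("genome_T1", "arCOG_B"),
    ("genome_MC1", "arCOG_A"), ("genome_MC1", "arCOG_B"),
    ("genome_MB1", "arCOG_A"), ("genome_MB1", "arCOG_B"),
    ("genome_H1", "arCOG_A"), ("genome_H1", "arCOG_C"),
]

# Hypothetical lineage assignment for each toy genome.
LINEAGE = {
    "genome_T1": "Thermococci",
    "genome_MC1": "Methanococci",
    "genome_MB1": "Methanobacteria",
    "genome_H1": "Other",
}


def phyletic_patterns(memberships):
    """Map each arCOG to the set of genomes with at least one member."""
    patterns = defaultdict(set)
    for genome, arcog in memberships:
        patterns[arcog].add(genome)
    return dict(patterns)


def shared_exclusively(patterns, lineage, target_lineages):
    """Return arCOGs present in all target-lineage genomes and nowhere else.

    This strict criterion is an illustrative simplification.
    """
    targets = {g for g, name in lineage.items() if name in target_lineages}
    return sorted(a for a, genomes in patterns.items() if genomes == targets)


TARGETS = {"Thermococci", "Methanococci", "Methanobacteria"}
CANDIDATES = shared_exclusively(phyletic_patterns(MEMBERSHIPS), LINEAGE, TARGETS)
```

In this toy data set only arCOG_B qualifies, because arCOG_A also occurs in the outgroup genome and arCOG_C is absent from the target lineages. A real analysis would likely tolerate occasional gene loss rather than require a perfect pattern.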


Figure 4. Phylogenetic analysis of the archaeal enolase family. The MUSCLE program [46] was used for construction of sequence alignment. The approximate maximum likelihood tree was reconstructed using the FastTree program [49] (178 sequences and 410 aligned positions). The sequences are denoted by their GI numbers and species names. Several branches are collapsed and shown as triangles denoted by the respective lineage taxonomy name. The complete tree is available in Supplementary file S4. Color code is the same as in Figure 2. Species or lineages that have paralogs elsewhere in the tree are in bold. The conserved neighborhoods (if any) are shown on the right side of the tree for the respective branches. Homologous genes are shown by arrows of the same color. The arCOG numbers are provided for major branches. Abbreviations: Rpo6: DNA-directed RNA polymerase subunit K/omega and S2: RpsB, ribosomal protein S2.

Mentions: A well-characterized protein in this group is enolase, a key glycolytic enzyme that additionally has been identified as a subunit of the bacterial RNA degradosome [89,90]. The enolase is often encoded in a conserved gene neighborhood within the ribosomal superoperon [91] and is present in a single copy in the majority of archaeal and bacterial genomes. In arCOGs, enolase is represented by arCOG01169 and arCOG01170, which together belong to the supercluster COG0148. All Methanococci and Thermococci, several Methanomicrobia, and Candidatus Caldiarchaeum subterraneum have two enolase paralogs. We reconstructed the phylogenetic tree for the enolases of both arCOGs and overlaid the genomic contexts of the respective genes (Figure 4).
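
As a hedged illustration of how genomes with two enolase paralogs could be flagged from arCOG assignments, the sketch below counts, per genome, the proteins assigned to either of the two enolase arCOGs named above. The input layout (protein, genome, arCOG triples), the protein and genome names, and the third arCOG label are hypothetical; only the two enolase arCOG identifiers come from the text.

```python
from collections import Counter

# The two enolase arCOGs named in the text (supercluster COG0148).
ENOLASE_ARCOGS = {"arCOG01169", "arCOG01170"}

# Hypothetical toy assignments: (protein_id, genome, arCOG).
# Protein ids, genome names, and "arCOG_other" are placeholders.
ASSIGNMENTS = [
    ("p1", "genome_A", "arCOG01169"),
    ("p2", "genome_A", "arCOG01170"),
    ("p3", "genome_B", "arCOG01169"),
    ("p4", "genome_B", "arCOG_other"),
]


def enolase_copy_number(assignments):
    """Count proteins per genome that fall into either enolase arCOG."""
    counts = Counter()
    for _protein, genome, arcog in assignments:
        if arcog in ENOLASE_ARCOGS:
            counts[genome] += 1
    return counts


def genomes_with_paralogs(assignments):
    """Return genomes carrying two or more enolase family members."""
    counts = enolase_copy_number(assignments)
    return sorted(genome for genome, n in counts.items() if n >= 2)


PARALOG_GENOMES = genomes_with_paralogs(ASSIGNMENTS)
```

Counting across both arCOGs at once mirrors the supercluster view: the two paralogs of a genome may sit in different arCOGs, so a per-arCOG count alone would miss them.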

