Archaeal Clusters of Orthologous Genes (arCOGs): An Update and Application for Analysis of Shared Features between Thermococcales, Methanococcales, and Methanobacteriales.

Makarova KS, Wolf YI, Koonin EV - Life (Basel) (2015)

Bottom Line: Assessment of the current archaeal genome annotation in public databases indicates that consistent use of arCOGs can significantly improve the annotation quality. The results of phylogenomic analysis that involved both comparison of multiple phylogenetic trees and a search for putative derived shared characters by using phyletic patterns extracted from the arCOGs reveal a likely evolutionary relationship between the Thermococci, Methanococci, and Methanobacteria. The arCOGs are expected to be instrumental for a comprehensive phylogenomic study of the archaea.


Affiliation: National Center for Biotechnology Information, NLM, National Institutes of Health, Bethesda, MD 20894, USA. makarova@ncbi.nlm.nih.gov.

ABSTRACT
With the continuously accelerating genome sequencing from diverse groups of archaea and bacteria, accurate identification of gene orthology and availability of readily expandable clusters of orthologous genes are essential for the functional annotation of new genomes. We report an update of the collection of archaeal Clusters of Orthologous Genes (arCOGs) to cover, on average, 91% of the protein-coding genes in 168 archaeal genomes. The new arCOGs were constructed using refined algorithms for orthology identification combined with extensive manual curation, including incorporation of the results of several completed and ongoing research projects in archaeal genomics. A new level of classification is introduced, superclusters that unite two or more arCOGs and more completely reflect gene family evolution than individual, disconnected arCOGs. Assessment of the current archaeal genome annotation in public databases indicates that consistent use of arCOGs can significantly improve the annotation quality. In addition to their utility for genome annotation, arCOGs are also a platform for phylogenomic analysis. We explore this aspect of arCOGs by performing a phylogenomic study of the Thermococci that are traditionally viewed as the basal branch of the Euryarchaeota. The results of phylogenomic analysis that involved both comparison of multiple phylogenetic trees and a search for putative derived shared characters by using phyletic patterns extracted from the arCOGs reveal a likely evolutionary relationship between the Thermococci, Methanococci, and Methanobacteria. The arCOGs are expected to be instrumental for a comprehensive phylogenomic study of the archaea.
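
The idea of searching phyletic patterns for putative derived shared characters can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes a phyletic pattern is simply the set of genomes in which an arCOG has at least one member, and it flags arCOGs found throughout a candidate clade and nowhere else. The arCOG labels, genome labels, and the function name `shared_derived_candidates` are hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Set


def shared_derived_candidates(
    patterns: Dict[str, Set[str]],
    clade: Set[str],
    all_genomes: Set[str],
    min_fraction: float = 1.0,
) -> List[str]:
    """Return arCOG ids present in at least `min_fraction` of the clade
    genomes and absent from every genome outside the clade."""
    outside = all_genomes - clade
    hits = []
    for arcog, present in sorted(patterns.items()):
        clade_fraction = len(present & clade) / len(clade)
        if clade_fraction >= min_fraction and not (present & outside):
            hits.append(arcog)
    return hits


# Toy data (made up): two Thermococci-like, one Methanococci-like and one
# Methanobacteria-like genome, plus two outgroup genomes.
genomes = {"Tc1", "Tc2", "Mc1", "Mb1", "Hb1", "Sf1"}
clade = {"Tc1", "Tc2", "Mc1", "Mb1"}

patterns = {
    "arCOG_A": {"Tc1", "Tc2", "Mc1", "Mb1"},         # clade only: candidate
    "arCOG_B": set(genomes),                         # universal: uninformative
    "arCOG_C": {"Tc1", "Tc2"},                       # covers half the clade
    "arCOG_D": {"Tc1", "Tc2", "Mc1", "Mb1", "Hb1"},  # also in an outgroup
}

strict = shared_derived_candidates(patterns, clade, genomes)
relaxed = shared_derived_candidates(patterns, clade, genomes, min_fraction=0.5)
print(strict)
print(relaxed)
```

A presence/absence filter of this kind only nominates candidates. Gene loss and horizontal gene transfer can produce the same pattern, which is one reason to weigh such characters together with phylogenetic trees, as the abstract describes.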



Figure 1: Changes in arCOGs between 2012 and 2014. (A) arCOGs added to and removed from the 2012 set; (B) Annotation changes and additions; (C) Breakdown of annotation changes into arCOG functional categories. 2012 annotations are indicated on top, 2014 annotations on the bottom. Functional categories are as follows. Information storage and processing: J, Translation, ribosomal structure, and biogenesis; K, Transcription; L, Replication, recombination, and repair. Cellular processes and signaling: V, Defense mechanisms; M, Cell wall/membrane/envelope biogenesis; N, Cell motility, secretion, and vesicular transport; X, Mobilome. Metabolism: C, Energy production and conversion; P, Inorganic ion transport and metabolism. Poorly characterized: R, General function prediction only; S, Function unknown. Asterisks indicate other categories combined.

Mentions: In the new version of the arCOGs, many clusters that encompassed only a few genomes and consisted of genes that had paralogs in other, larger arCOGs were merged into the latter, resulting in elimination of 397 old arCOGs (Figure 1). Addition of 48 new genomes resulted in the generation of 3527 new arCOGs, several of which were manually rearranged on the basis of phylogenetic analysis. Notably, these changed arCOGs included the archaeal DNA replicative polymerases of the PolB family that were now manually divided into five arCOGs according to recent comparative genomics and phylogenetic analysis [22]: arCOG15271, PolB1; arCOG00329, PolB2; arCOG00328, PolB3; arCOG04926, DNA polymerase elongation subunit, PolB family; and arCOG15272, Casposon-associated protein-primed PolB family polymerase. Similarly, another family of essential proteins involved in DNA replication, the GINS, were split into separate clusters GINS15 (arCOG00551) and GINS23 (arCOG00552). Thus, the updated arCOG database reflects the evolutionary relationships and functional diversification of these key proteins as accurately as is currently feasible.

