EDGAR: a software framework for the comparative analysis of prokaryotic genomes.

Blom J, Albaum SP, Doppmeier D, Pühler A, Vorhölter FJ, Zakrzewski M, Goesmann A - BMC Bioinformatics (2009)

Bottom Line: As one result of this development, it is now feasible to analyze large groups of related genomes in a comparative approach. Comparative analyses for 582 genomes across 75 genus groups taken from the NCBI genomes database were conducted with the software and the results were integrated into an underlying database. EDGAR provides novel analysis features and significantly simplifies the comparative analysis of related genomes.

Affiliation: Computational Genomics, Center for Biotechnology (CeBiTec), Bielefeld University, Bielefeld, Germany. jblom@cebitec.uni-bielefeld.de

ABSTRACT

Background: The introduction of next-generation sequencing approaches has caused a rapid increase in the number of completely sequenced genomes. As one result of this development, it is now feasible to analyze large groups of related genomes in a comparative approach. A main task in comparative genomics is the identification of orthologous genes in different genomes and the classification of genes as core genes or singletons.

Results: To support these studies, EDGAR - "Efficient Database framework for comparative Genome Analyses using BLAST score Ratios" - was developed. EDGAR is designed to perform genome comparisons automatically in a high-throughput approach. Comparative analyses for 582 genomes across 75 genus groups taken from the NCBI genomes database were conducted with the software and the results were integrated into an underlying database. To demonstrate a specific application case, we analyzed ten genomes of the bacterial genus Xanthomonas, for which phylogenetic studies were awkward due to divergent taxonomic systems. The resultant phylogeny EDGAR provided was consistent with outcomes from recent traditional approaches; moreover, it was possible to root each strain with unprecedented accuracy.
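
The "BLAST score ratio" in EDGAR's name can be illustrated with a small sketch. It assumes the common definition of a score ratio (the bit score of a hit divided by the bit score of the query aligned against itself) combined with a reciprocal best hit requirement. The abstract specifies none of these details, so the function names, the cutoff of 0.3, and the toy scores below are hypothetical and not taken from EDGAR.

```python
def score_ratio(hit_score, self_score):
    """Normalize a bit score by the query's self-hit score (assumed definition)."""
    if self_score <= 0:
        raise ValueError("self-hit score must be positive")
    return hit_score / self_score


def best_hits(scores, self_scores, cutoff):
    """Map each query gene to its best subject gene whose ratio reaches the cutoff.

    scores:      {(query, subject): bit_score}
    self_scores: {query: bit_score of the query against itself}
    """
    best = {}
    for (query, subject), bits in scores.items():
        ratio = score_ratio(bits, self_scores[query])
        if ratio >= cutoff and ratio > best.get(query, (None, 0.0))[1]:
            best[query] = (subject, ratio)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a, self_a, self_b, cutoff=0.3):
    """Return gene pairs that are each other's best hit in both directions."""
    forward = best_hits(a_vs_b, self_a, cutoff)
    reverse = best_hits(b_vs_a, self_b, cutoff)
    return {(a, b) for a, b in forward.items() if reverse.get(b) == a}


# Hypothetical toy scores for two genomes A and B (not real data).
a_vs_b = {("a1", "b1"): 180.0, ("a1", "b2"): 50.0, ("a2", "b2"): 20.0}
b_vs_a = {("b1", "a1"): 175.0, ("b2", "a1"): 58.0}
self_a = {"a1": 200.0, "a2": 150.0}
self_b = {"b1": 190.0, "b2": 100.0}

ortholog_pairs = reciprocal_best_hits(a_vs_b, b_vs_a, self_a, self_b)
print(ortholog_pairs)
```

Normalizing by the self-hit score makes scores comparable across genes of different lengths, which is why a single ratio cutoff can be applied to every gene pair in this sketch.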

Conclusion: EDGAR provides novel analysis features and significantly simplifies the comparative analysis of related genomes. The software supports a quick survey of evolutionary relationships and simplifies the process of obtaining new biological insights into the differential gene content of kindred genomes. Visualization features, such as synteny plots and Venn diagrams, are offered to the scientific community through a web-based and therefore platform-independent user interface (http://edgar.cebitec.uni-bielefeld.de), where the precomputed data sets can be browsed.

Figure 4: Web interface: Core genome presentation. Screenshot of the core genome calculation in the EDGAR web interface. In the upper part (A), one can choose a reference genome and a set of genomes to compare it with. The resulting table is shown in the lower part (B) of the page, in this case the core genome table for Xcc B100, Xca 756C, and Xcv 85-10. For every gene in the core genome, EDGAR displays the orthologous genes of all compared strains together with their gene function (where known). For every set of orthologous genes, multiple alignments can be constructed of the genes themselves and of their upstream regions.

Mentions: The core genome can be calculated for a selected reference genome in comparison to every combination of genomes of its genus. The genes of the reference genome are used as the starting set for the iterative core calculation (see Methods). The calculated core genome is presented as a table of orthologous genes of all selected genomes and their functions, starting with the selected reference genome in the first column (see Figure 4).
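
The iterative core calculation described above can be sketched as follows. The Methods section is not part of this excerpt, so this is only one plausible reading: the reference gene set is reduced genome by genome to the genes that have an ortholog in each compared genome. The function names, the gene identifiers, and the ortholog mappings are hypothetical; only the strain names come from Figure 4.

```python
def core_genome(reference_genes, orthologs_by_genome):
    """Reduce the reference genes to those with an ortholog in every genome.

    reference_genes:     gene ids of the reference genome (the starting set)
    orthologs_by_genome: {genome_name: {reference_gene: orthologous_gene}}
    Returns one row per core gene: [reference_gene, ortholog_1, ortholog_2, ...].
    """
    core = list(reference_genes)
    for orthologs in orthologs_by_genome.values():
        # Each iteration keeps only genes that also occur in the next genome.
        core = [gene for gene in core if gene in orthologs]
    genomes = list(orthologs_by_genome)
    return [[gene] + [orthologs_by_genome[name][gene] for name in genomes]
            for gene in core]


def singletons(reference_genes, orthologs_by_genome):
    """Reference genes without an ortholog in any of the compared genomes."""
    return [gene for gene in reference_genes
            if all(gene not in o for o in orthologs_by_genome.values())]


# Hypothetical gene ids; Xcc B100 plays the role of the reference genome.
reference = ["g1", "g2", "g3", "g4", "g5"]
orthologs = {
    "Xca 756C": {"g1": "xca_a", "g2": "xca_b", "g4": "xca_d"},
    "Xcv 85-10": {"g1": "xcv_a", "g3": "xcv_c", "g4": "xcv_d"},
}

core_table = core_genome(reference, orthologs)
unique_genes = singletons(reference, orthologs)
for row in core_table:
    print("\t".join(row))
```

The returned rows mirror the table layout in the text: the reference gene sits in the first column and is followed by its orthologs in the order the genomes were selected.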

