MBGD update 2015: microbial genome database for flexible ortholog analysis utilizing a diverse set of genomic data.

Uchiyama I, Mihara M, Nishide H, Chiba H - Nucleic Acids Res. (2014)

Bottom Line: Because many of the recently accumulating genomic data are draft genome sequences for which some complete genome sequences of the same or closely related species are available, MBGD now stores draft genome data and allows the users to incorporate them into a user-specific ortholog database using the MyMBGD functionality. In this function, draft genome data are incorporated into an existing ortholog table created only from the complete genome data in an incremental manner to prevent low-quality draft data from affecting clustering results. In addition, to provide high-quality orthology relationships, the standard ortholog table containing all the representative genomes, which is first created by the rapid classification program DomClust, is now refined using DomRefine, a recently developed program for improving domain-level clustering using multiple sequence alignment information.

Affiliation: Laboratory of Genome Informatics, National Institute for Basic Biology, National Institutes of Natural Sciences, Nishigonaka 38, Myodaiji, Okazaki, Aichi 444-8585, Japan; Data Integration and Analysis Facility, National Institute for Basic Biology, National Institutes of Natural Sciences, Nishigonaka 38, Myodaiji, Okazaki, Aichi 444-8585, Japan. uchiyama@nibb.ac.jp

© Copyright Policy - creative-commons

Figure 1: Overview of the data construction procedure in MBGD. Precomputed ortholog tables are colored in light yellow and user-generated ortholog tables are colored in light pink. Three methods (DomClust followed by DomRefine, DomClust only and MergeTree) to create these ortholog tables are shown with different arrows. MergeTree is a program for adding genomes incrementally to an existing ortholog table (base cluster), and thus is represented by two arrows: a base cluster is shown by a solid arrow and an added genome is shown by a broken arrow.

Mentions: The current data construction procedure in MBGD is summarized in Figure 1. In MBGD, a representative organism set is defined by selecting one genome from each genus in the increasing order of release date from the complete genome data. The standard ortholog table is created from these representative genomes using DomClust followed by the DomRefine procedure (see below). The genes in the remaining genomes are then assigned to the ortholog groups in the standard ortholog table, yielding an extended ortholog table that contains all the complete genomes stored in MBGD. This assignment uses the MergeTree program (10), which incrementally adds genomes to a given ortholog table (named the base cluster).
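The representative-selection step described above can be illustrated with a minimal Python sketch. It is not MBGD's actual code: the `Genome` record, the `partition_genomes` function, and the sample genomes and dates are all hypothetical, and the sketch assumes that "increasing order of release date" means the earliest-released complete genome in each genus becomes the representative. The remaining complete genomes are the ones that would later be added incrementally to the base cluster.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Genome:
    """Hypothetical genome record; field names are illustrative only."""
    name: str
    genus: str
    release_date: date
    is_complete: bool


def partition_genomes(genomes):
    """Split complete genomes into per-genus representatives and the rest.

    Assumption: the representative of a genus is its earliest-released
    complete genome. Draft genomes are excluded from both outputs.
    """
    complete = sorted(
        (g for g in genomes if g.is_complete),
        key=lambda g: g.release_date,
    )
    representatives = {}  # genus -> representative genome
    remaining = []        # complete genomes to be added incrementally
    for genome in complete:
        if genome.genus not in representatives:
            representatives[genome.genus] = genome
        else:
            remaining.append(genome)
    return representatives, remaining


# Made-up sample data, for illustration only.
genomes = [
    Genome("genome_A1", "GenusA", date(2005, 3, 1), True),
    Genome("genome_A2", "GenusA", date(2001, 7, 15), True),
    Genome("genome_A3", "GenusA", date(1999, 1, 1), False),  # draft, skipped
    Genome("genome_B1", "GenusB", date(2010, 6, 30), True),
]
representatives, remaining = partition_genomes(genomes)
```

In this sketch the representatives would feed the standard ortholog table (DomClust followed by DomRefine), while the genomes in `remaining` correspond to those assigned to existing ortholog groups by the incremental procedure.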

