Metassembler: merging and optimizing de novo genome assemblies.

Wences AH, Schatz MC - Genome Biol. (2015)

Bottom Line: Genome assembly projects typically run multiple algorithms in an attempt to find the single best assembly, although those assemblies often have complementary, if untapped, strengths and weaknesses. We apply our metassembler algorithm to the four genomes from the Assemblathon competitions and show it consistently and substantially improves the contiguity and quality of each assembly. We also develop guidelines for meta-assembly by systematically evaluating 120 permutations of merging the top 5 assemblies of the first Assemblathon competition.

Affiliation: Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA. alhernan@cshl.edu.

ABSTRACT
Genome assembly projects typically run multiple algorithms in an attempt to find the single best assembly, although those assemblies often have complementary, if untapped, strengths and weaknesses. We present our metassembler algorithm that merges multiple assemblies of a genome into a single superior sequence. We apply it to the four genomes from the Assemblathon competitions and show it consistently and substantially improves the contiguity and quality of each assembly. We also develop guidelines for meta-assembly by systematically evaluating 120 permutations of merging the top 5 assemblies of the first Assemblathon competition. The software is open-source at http://metassembler.sourceforge.net .
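The figure of 120 permutations follows from ordering five input assemblies (5! = 120). As a minimal sketch of how such an evaluation set could be enumerated, assuming placeholder labels A to E (the actual assembly names are not given here), the orderings can be generated and grouped by which assembly is merged first:

```python
from itertools import permutations

# Hypothetical labels; the real Assemblathon 1 assembly names are not listed in this text.
assemblies = ["A", "B", "C", "D", "E"]

# Every possible merge order of the five inputs.
orders = list(permutations(assemblies))

# Group the merge orders by the assembly used first.
by_first = {}
for order in orders:
    by_first.setdefault(order[0], []).append(order)

print(len(orders))                                   # total number of merge orders
print({first: len(group) for first, group in by_first.items()})
```

Each starting assembly heads the same number of orderings (4! = 24), so the per-assembly groups are directly comparable.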



Fig. 2: Boxplots of overall Z scores for Assemblathon 1 metassemblies grouped by initial assembly. Blue circles indicate the Z score of the corresponding initial assembly. Below each circle, the corresponding mean difference in Z scores between the final metassembly and the initial assembly (μ∆) is shown. The global mean difference is also shown at the top.

Mentions: We studied the dependency between the order in which the assemblies are metassembled and the quality of the final metassembly by evaluating the final metassembly of all input permutations. To do so, we computed the Z score of assembly quality, proposed in the Assemblathon 2 paper, which aggregates and summarizes all of the different metrics into a single value based on the mean and standard deviation of the individual metrics. The boxplots shown in Fig. 2 summarize the distribution of overall Z scores for all the metassemblies starting with each of the input assemblies. This shows that our algorithm is capable of significantly improving overall Z scores, with a mean increment of 14.5 standard deviations, but also strongly suggests that the quality and contiguity of the final assembly depend on the order of merging and on which assembly is used first.
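The aggregation described above can be illustrated with a small sketch. It assumes the overall Z score is the sum of per-metric z-scores, with the sign flipped for metrics where lower is better and the population standard deviation used; the exact metric set and conventions of the Assemblathon 2 score are not stated here, so the function names, metric names, and numbers below are hypothetical.

```python
from statistics import mean, pstdev

def overall_z_scores(metrics, lower_is_better=()):
    """Sum per-metric z-scores for each assembly.

    metrics: {assembly_name: {metric_name: value}}
    lower_is_better: metric names whose z-score is negated (assumption).
    """
    metric_names = sorted(next(iter(metrics.values())))
    totals = {name: 0.0 for name in metrics}
    for m in metric_names:
        values = [metrics[name][m] for name in metrics]
        mu, sd = mean(values), pstdev(values)
        if sd == 0:
            continue  # a metric with no spread cannot separate assemblies
        sign = -1.0 if m in lower_is_better else 1.0
        for name in metrics:
            totals[name] += sign * (metrics[name][m] - mu) / sd
    return totals

def mean_delta(initial_z, final_zs):
    """Mean difference between final metassembly Z scores and the initial Z score."""
    return mean(final_zs) - initial_z

# Toy data, invented for illustration only.
toy = {
    "X": {"n50": 10, "errors": 4},
    "Y": {"n50": 20, "errors": 2},
}
totals = overall_z_scores(toy, lower_is_better=("errors",))
delta = mean_delta(1.0, [2.0, 4.0])
print(totals, delta)
```

In this toy case assembly Y is better on both metrics, so it receives the higher overall score; `mean_delta` corresponds to the μ∆ value shown under each circle in Fig. 2 for one starting assembly.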

