Tools and pipelines for BioNano data: molecule assembly pipeline and FASTA super scaffolding tool.

Shelton JM, Coleman MC, Herndon N, Lu N, Lam ET, Anantharaman T, Sheth P, Brown SJ - BMC Genomics (2015)

Bottom Line: We used a custom assembly workflow to optimize consensus genome map assembly, resulting in an assembly equal to the estimated length of the Tribolium castaneum genome and with an N50 of more than 1 Mb. We used this map for super scaffolding the T. castaneum sequence assembly, more than tripling its N50 with the program Stitch. We report the results of applying these tools to validate and improve a 7x Sanger draft of the T. castaneum genome.


Affiliation: KSU/K-INBRE Bioinformatics Center, Division of Biology, Kansas State University, Manhattan, KS, USA. sheltonj@ksu.edu.

ABSTRACT

Background: Genome assembly remains an unsolved problem. Assembly projects face a range of hurdles that confound assembly. Thus a variety of tools and approaches are needed to improve draft genomes.

Results: We used a custom assembly workflow to optimize consensus genome map assembly, resulting in an assembly equal to the estimated length of the Tribolium castaneum genome and with an N50 of more than 1 Mb. We used this map for super scaffolding the T. castaneum sequence assembly, more than tripling its N50 with the program Stitch.

Conclusions: In this article we present software that leverages consensus genome maps assembled from extremely long single molecule maps to increase the contiguity of sequence assemblies. We report the results of applying these tools to validate and improve a 7x Sanger draft of the T. castaneum genome.



Fig. 4: Cumulative length per BNX file for T. castaneum data generated over time. The cumulative length of single molecule maps > 150 kb is plotted on the y-axis (purple X) against date on the x-axis; the grey dashed line marks the upgrade to the V2 IrysChip. Data were generated from 03-2013 to 01-2014. Aborted runs (cumulative length = 0) are excluded.

Mentions: Using Knickers (BioNano Genomics), an in silico label density calculator, we estimated that the Tribolium castaneum genome had 8.66 and 5.51 labels per 100 kb for the nt.BspQI and nt.BbvCI enzymes (New England BioLabs), respectively. The ideal number of labels per 100 kb is between 10 and 15; therefore, we nicked with both enzymes. DNA was nicked, labeled with fluorescent nucleotides, and repaired according to the BioNano protocol, and 93 BNX files were produced from the Irys genome mapping system (Fig. 4 and Additional file 2). Four corrupted files (cumulative length = 0) were excluded from this analysis. All T. castaneum BNX files have been deposited to LabArchives (doi:10.6070/H4V69GK3).
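
The two calculations described above can be sketched in Python. This is an illustrative sketch, not the authors' pipeline. It makes two assumptions that are not stated in the text: that the label densities of the two enzymes simply add (no coinciding nick sites), and that each molecule in a BNX file is introduced by a tab-separated row starting with "0" whose third column holds the molecule length in bp. The function names and the toy records are hypothetical.

```python
import io

MIN_LENGTH_BP = 150_000  # threshold taken from the Fig. 4 caption (> 150 kb)


def combined_label_density(*densities_per_100kb):
    """Sum per-enzyme label densities (labels per 100 kb).

    Assumption: nick sites of the enzymes do not coincide, so densities add.
    """
    return sum(densities_per_100kb)


def cumulative_length_bp(bnx_lines, min_length_bp=MIN_LENGTH_BP):
    """Total length (bp) of molecules longer than min_length_bp in one file.

    Assumption: molecule rows start with "0" and carry the molecule
    length in the third tab-separated column; all other rows are skipped.
    """
    total = 0.0
    for line in bnx_lines:
        if line.startswith("#"):  # header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3 or fields[0] != "0":  # label or quality rows
            continue
        length = float(fields[2])
        if length > min_length_bp:
            total += length
    return total


def usable_files(totals_by_file):
    """Drop files whose cumulative length is 0 (aborted or corrupted runs)."""
    return {name: total for name, total in totals_by_file.items() if total > 0}


# Label densities reported in the text for nt.BspQI and nt.BbvCI.
density = combined_label_density(8.66, 5.51)
print(f"combined density: {density:.2f} labels per 100 kb")
print("within the 10 to 15 target:", 10 <= density <= 15)

# Toy BNX-like content (made-up molecules, for illustration only).
toy_bnx = io.StringIO(
    "# BNX File Version:\t1.0\n"
    "0\t1\t200000.0\t10.0\t12.0\t2\n"
    "1\t5000.0\t150000.0\t200000.0\n"
    "0\t2\t120000.0\t10.0\t12.0\t1\n"
    "1\t60000.0\t120000.0\n"
    "0\t3\t350500.0\t10.0\t12.0\t2\n"
    "1\t1000.0\t300000.0\t350500.0\n"
)
totals = {
    "run_a.bnx": cumulative_length_bp(toy_bnx),
    "run_b.bnx": cumulative_length_bp([]),  # stands in for an aborted run
}
print(usable_files(totals))
```

Under the additivity assumption, the reported densities combine to about 14.17 labels per 100 kb, which falls inside the 10 to 15 target and is consistent with the decision to nick with both enzymes. Running `cumulative_length_bp` once per BNX file and then applying `usable_files` yields the per-file values of the kind plotted in Fig. 4.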

