Scaffolding and validation of bacterial genome assemblies using optical restriction maps.

Nagarajan N, Read TD, Pop M - Bioinformatics (2008)

Bottom Line: We demonstrate the robustness of our methods to sequencing and assembly errors using extensive experiments on simulated datasets. We then present the results obtained by applying our algorithms to data generated from two bacterial genomes, Yersinia aldovae and Yersinia kristensenii. The tools described here are available as an open-source package at ftp://ftp.cbcb.umd.edu/pub/software/soma


Affiliation: University of Maryland, College Park, MD 20742, USA.

ABSTRACT

Motivation: New, high-throughput sequencing technologies have made it feasible to cheaply generate vast amounts of sequence information from a genome of interest. The computational reconstruction of the complete sequence of a genome is complicated by specific features of these new sequencing technologies, such as the short length of the sequencing reads and absence of mate-pair information. In this article we propose methods to overcome such limitations by incorporating information from optical restriction maps.

Results: We demonstrate the robustness of our methods to sequencing and assembly errors using extensive experiments on simulated datasets. We then present the results obtained by applying our algorithms to data generated from two bacterial genomes Yersinia aldovae and Yersinia kristensenii. The resulting assemblies contain a single scaffold covering a large fraction of the respective genomes, suggesting that the careful use of optical maps can provide a cost-effective framework for the assembly of genomes.

Availability: The tools described here are available as an open-source package at ftp://ftp.cbcb.umd.edu/pub/software/soma


© Copyright Policy - creative-commons

Figure 1: The optical mapping process. To generate a whole-genome optical map, DNA is sheared into fragments that are stretched and fixed onto an optical chip and then digested using a restriction enzyme. The resulting pieces are optically analyzed and assembled into a genome-wide map.

Optical mapping, a variant of restriction mapping, is one of multiple laboratory techniques aimed at mapping the location of specific landmarks along the DNA of an organism of interest. In both optical and restriction mapping, the landmarks correspond to the recognition sites for specific restriction enzymes. Restriction mapping involves cleaving a piece of unknown DNA using a restriction enzyme and then using gel-based methods to measure the range of fragment sizes represented in the sample. The spectrum of sizes obtained provides information about the structure of the unknown piece of DNA and can be viewed as a fingerprint of this sequence (Nathans and Smith, 1975). Optical mapping (Samad et al., 1995) extends this approach by providing, in addition to the set of fragment sizes, information about the order in which these fragments occur in the DNA (see Fig. 1 for a schematic representation of the map generation process). This information provides a genome-wide scaffold into which the sequence data can be placed [in a process somewhat akin to comparative assembly (Pop et al., 2004)]. Computational methods for performing this mapping are the focus of this article. Our work was motivated by the recent availability of accurate high-throughput methods for constructing optical maps (specifically the technology developed at Opgen, www.opgen.com) and their increased adoption as a valuable source of information (Latreille et al., 2007). Note that this technology allows an optical map to be constructed within as little as 24 h after receiving a DNA sample, a time-frame comparable to that needed for sequencing the sample with the 454 technology. Optical maps are, therefore, an attractive alternative to a 454-Sanger hybrid approach (Goldberg et al., 2006), as the construction of a paired-end library can take more than a week.
Furthermore, since optical maps and paired-end data have complementary characteristics, they can be used together when both data types are available. Optical maps provide a coarse, genome-wide scaffold, in contrast with the typically fragmented scaffolds generated from paired-end data. The methods described in this article can be easily adapted to a hybrid approach that combines optical maps with paired-end data, by aligning entire paired-end scaffolds to the map instead of individual contigs.
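
To make the distinction between a restriction fingerprint and an optical map concrete, the following is a minimal Python sketch under simplifying assumptions: the sequence is linear, fragment sizes are error-free, and each cut is placed at the start of the recognition site. The function names `digest` and `place_contig`, the toy sequence, and the 10% size tolerance are illustrative assumptions and are not taken from the SOMA package; the exhaustive window scan shown here is a naive stand-in, not the alignment method of the article.

```python
def digest(sequence: str, site: str) -> list[int]:
    """Return the ORDERED fragment lengths obtained by cutting a linear
    sequence at every occurrence of the recognition site.
    Simplification: the cut is placed at the first base of the site."""
    cuts = []
    start = 0
    while True:
        pos = sequence.find(site, start)
        if pos == -1:
            break
        cuts.append(pos)
        start = pos + 1
    bounds = [0] + cuts + [len(sequence)]
    # Skip zero-length fragments (e.g. a site at position 0).
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


def place_contig(contig_frags: list[int], map_frags: list[int],
                 tolerance: float = 0.1) -> list[int]:
    """Return every map index at which the interior fragments of a contig
    match consecutive map fragments within a relative size tolerance.
    The two end fragments of the contig are ignored because they are
    truncated by the contig boundaries."""
    inner = contig_frags[1:-1]
    if not inner:
        return []  # too few restriction sites to place the contig
    hits = []
    for i in range(len(map_frags) - len(inner) + 1):
        window = map_frags[i:i + len(inner)]
        if all(abs(c - m) <= tolerance * m for c, m in zip(inner, window)):
            hits.append(i)
    return hits


# Toy genome with three GAATTC sites (hypothetical data).
SITE = "GAATTC"
genome = "A" * 5 + SITE + "C" * 10 + SITE + "T" * 3 + SITE + "G" * 8

optical_map = digest(genome, SITE)   # ordered sizes, as in an optical map
fingerprint = sorted(optical_map)    # unordered sizes, as in a gel-based digest

contig = genome[3:40]                # a contig covering part of the genome
placements = place_contig(digest(contig, SITE), optical_map)

print("optical map:", optical_map)
print("fingerprint:", fingerprint)
print("contig placements (map index):", placements)
```

The sorted list discards exactly the ordering information that lets a contig be anchored to a position on the map. Real optical maps carry sizing errors, missed or false cuts, and unresolved small fragments, so a practical aligner needs an error-tolerant scoring scheme rather than this exact-window test.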


