BSMAP: whole genome bisulfite sequence MAPping program.

Xi Y, Li W - BMC Bioinformatics (2009)

Bottom Line: However, mapping high-throughput bisulfite reads to the reference genome remains a great challenge due to the increased searching space, reduced complexity of bisulfite sequence, asymmetric cytosine to thymine alignments, and multiple CpG heterogeneous methylation. BSMAP is the first general-purpose bisulfite mapping software. It is able to map high-throughput bisulfite reads at whole genome level with feasible memory and CPU usage.

Affiliation: Division of Biostatistics, Dan L Duncan Cancer Center, Department of Molecular and Cellular Biology, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA. yxi@bcm.edu

ABSTRACT

Background: Bisulfite sequencing is a powerful technique to study DNA cytosine methylation. Bisulfite treatment followed by PCR amplification specifically converts unmethylated cytosines to thymine. Coupled with next generation sequencing technology, it is able to detect the methylation status of every cytosine in the genome. However, mapping high-throughput bisulfite reads to the reference genome remains a great challenge due to the increased searching space, reduced complexity of bisulfite sequence, asymmetric cytosine to thymine alignments, and multiple CpG heterogeneous methylation.

Results: We developed an efficient bisulfite read mapping algorithm, BSMAP, to address the above issues. BSMAP combines genome hashing and bitwise masking to achieve fast and accurate bisulfite mapping. Compared with existing bisulfite mapping approaches, BSMAP is faster, more sensitive and more flexible.

Conclusion: BSMAP is the first general-purpose bisulfite mapping software. It is able to map high-throughput bisulfite reads at the whole-genome level with feasible memory and CPU usage. It is freely available under the GPL v3 license at http://code.google.com/p/bsmap/.
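
The asymmetric cytosine to thymine alignment named in the Background can be illustrated with a small bitwise mask. The Python sketch below is only an illustration under stated assumptions and is not BSMAP's actual implementation: the one-hot base encoding and the function names `allowed_mask` and `bisulfite_mismatches` are hypothetical, and only reads from the bisulfite Watson strand are considered. A read T is accepted against a reference C, but a read C is not accepted against a reference T.

```python
# Hypothetical one-hot encoding of bases (an assumption for illustration).
BASE_BIT = {"A": 1, "C": 2, "G": 4, "T": 8}


def allowed_mask(ref_base):
    """Return the bit mask of read bases accepted at a reference base."""
    mask = BASE_BIT[ref_base]
    if ref_base == "C":
        # An unmethylated reference C may be read as T after conversion.
        mask |= BASE_BIT["T"]
    return mask


def bisulfite_mismatches(read, ref_window):
    """Count mismatches between a read and an equal-length reference window,
    treating a read T over a reference C as a match (but not the reverse)."""
    if len(read) != len(ref_window):
        raise ValueError("read and reference window must have equal length")
    return sum(
        1
        for read_base, ref_base in zip(read, ref_window)
        if not BASE_BIT[read_base] & allowed_mask(ref_base)
    )
```

For example, `bisulfite_mismatches("TTGA", "TCGA")` counts no mismatches, whereas `bisulfite_mismatches("ACGT", "ATGT")` counts one, which is the asymmetry that an ordinary aligner does not model.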

Figure 1: Pipeline of bisulfite sequencing. 1) Denaturation: separating Watson and Crick strands; 2) Bisulfite treatment: converting un-methylated cytosines (blue) to uracils; methylated cytosines (red) remain unchanged; 3) PCR amplification of bisulfite-treated sequences resulting in four distinct strands: Bisulfite Watson (BSW), bisulfite Crick (BSC), reverse complement of BSW (BSWR), and reverse complement of BSC (BSCR).
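
The four strands in Figure 1 can be reproduced in silico. The following Python sketch is a simplified model, not code from the paper: the helper names and the representation of methylation as sets of 0-based positions are assumptions made for illustration, and uracil is written directly as T, which is how it is read after PCR amplification.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def bisulfite_convert(strand, methylated=frozenset()):
    """Convert every C to T except at the 0-based positions in `methylated`."""
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(strand)
    )


def four_strands(watson, methylated_watson=frozenset(), methylated_crick=frozenset()):
    """Return BSW, BSC, BSWR and BSCR for a Watson strand.

    Methylated positions are given per strand, in that strand's own
    5'-to-3' coordinates (an assumption of this sketch).
    """
    crick = reverse_complement(watson)
    bsw = bisulfite_convert(watson, methylated_watson)
    bsc = bisulfite_convert(crick, methylated_crick)
    return {
        "BSW": bsw,
        "BSC": bsc,
        "BSWR": reverse_complement(bsw),
        "BSCR": reverse_complement(bsc),
    }
```

Because conversion acts on each strand separately, BSW and BSC are no longer reverse complements of each other, so a mapper has to consider all four strands rather than two.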

Mentions: Cytosine (C) DNA methylation plays a crucial role in various biological processes such as gene expression, chromatin accessibility, and imprinting, as well as in many diseases including cancer. Over the decades, bisulfite sequencing [1] has remained the gold standard for DNA methylation analysis. Bisulfite treatment of DNA followed by PCR amplification leads to a chemical conversion of unmethylated Cs to Ts without affecting As, Gs, Ts or methylated Cs. This C to T conversion results in non-complementarity in the two strands of DNA (Figure 1). Following strand-specific and locus-specific PCR amplification, direct- or pyro-sequencing is used to determine the methylation ratio of any given C locus as the proportion of remaining Cs in all the sequencing reads. This PCR-based procedure is very labor intensive and time-consuming, and therefore inappropriate for high throughput studies.
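
The methylation ratio described above, the proportion of remaining Cs among the reads covering a C locus, can be computed as in the following Python sketch. The function name `methylation_ratio` is hypothetical, and counting only C and T calls as informative (ignoring any other base call) is an assumption of this example rather than something the text specifies.

```python
from collections import Counter


def methylation_ratio(base_calls):
    """Return the fraction of C calls among C and T calls at one C locus.

    `base_calls` is an iterable of read bases observed at the locus.
    Returns None when no C or T call is present.
    """
    counts = Counter(base_calls)
    informative = counts["C"] + counts["T"]
    if informative == 0:
        return None
    return counts["C"] / informative
```

For instance, three reads showing C and one showing T at a locus give `methylation_ratio("CCCT")`, a ratio of 0.75.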

