PacBio-LITS: a large-insert targeted sequencing method for characterization of human disease-associated chromosomal structural variations.

Wang M, Beck CR, English AC, Meng Q, Buhay C, Han Y, Doddapaneni HV, Yu F, Boerwinkle E, Lupski JR, Muzny DM, Gibbs RA - BMC Genomics (2015)

Bottom Line: PacBio-LITS provides deep sequence coverage at the specified sites at substantially reduced cost compared with PacBio WGS. The new method leverages the cost efficiency of targeted capture-sequencing as well as the mappability and scaffolding capabilities of long sequencing reads generated by the PacBio platform. Other genomic research applications, such as haplotype phasing and small insertion and deletion validation, could also benefit from this technology.


Affiliation: Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA. mwang@bcm.edu.

ABSTRACT

Background: Generation of long (>5 Kb) DNA sequencing reads provides an approach for interrogation of complex regions in the human genome. Currently, large-insert whole genome sequencing (WGS) technologies from Pacific Biosciences (PacBio) enable analysis of chromosomal structural variations (SVs), but the cost to achieve the required sequence coverage across the entire human genome is high.

Results: We developed a method (termed PacBio-LITS) that combines oligonucleotide-based DNA target-capture enrichment technologies with PacBio large-insert library preparation to facilitate SV studies at specific chromosomal regions. PacBio-LITS provides deep sequence coverage at the specified sites at substantially reduced cost compared with PacBio WGS. The efficacy of PacBio-LITS is illustrated by delineating the breakpoint junctions of low copy repeat (LCR)-associated complex structural rearrangements on chr17p11.2 in patients diagnosed with Potocki-Lupski syndrome (PTLS; MIM#610883). We successfully identified previously determined breakpoint junctions in three PTLS cases, and also were able to discover novel junctions in repetitive sequences, including LCR-mediated breakpoints. The new information has enabled us to propose mechanisms for formation of these structural variants.

Conclusions: The new method leverages the cost efficiency of targeted capture-sequencing as well as the mappability and scaffolding capabilities of long sequencing reads generated by the PacBio platform. It is therefore suitable for studying complex SVs, especially those involving LCRs, inversions, and the generation of chimeric Alu elements at the breakpoints. Other genomic research applications, such as haplotype phasing and small insertion and deletion validation, could also benefit from this technology.


Figure 3: Delineation of CGRs in PTLS cases. A) CGRs revealed by aCGH. Human chromosome 17p11.2 is illustrated as a horizontal line at the top of the figure, with coordinates (Mb) indicated below. Red blocks represent duplicated regions and blue segments indicate triplications. RAI1 is indicated by the vertical gray shadow. Yellow and blue shaded areas represent LCRs; purple arrows indicate their orientation [19]. Vertical black lines define the 7 Mb (14.9-21.9 Mb) targeted by the SMS/PTLS probe set. Individual array results are below the schematics, focused on copy number alterations. Coordinates (in Mb) are indicated below the arrays. Previously determined junctions are labeled with a “1”, so that the rearrangement joins together the two “1”s, and junctions identified by PacBio-LITS are labeled with a “2”. Data for 2714 [19] and 2695 [23] were published previously. B) Novel breakpoint junction sequences detected with PacBio-LITS. Breakpoint sequences for the three new junctions identified by PacBio-LITS are aligned to the reference sequence. Transitions between the sequences are indicated with different colors, with gray denoting regions of disagreement with the junction sequence. Chromosome 17 coordinates (hg19/GRCh37) are indicated. Red lettering denotes microhomology. The Alu-Alu mediated alignment in BAB2695 has asterisks (*) denoting positions where the two Alu elements do not align. C) Formation of CGRs. Case BAB2714: i) A map of the reference genome. Colored boxes represent sequence blocks. ii) Black arrows indicate the two template switches resulting in the rearrangement. The template switches could also have occurred in the opposite order. iii) The rearranged region, which has an inversion-duplication of the blue sequence block followed by a direct duplication of the red sequence block. Case BAB2695: i) A map of the reference genome. ii) The resultant rearrangement. Both junctions are mediated by Alu elements and are in a head-to-tail tandem orientation (no inversion).


Mentions: Three PTLS cases (BAB2714, BAB2695 and BAB3793) involving CGRs were selected for investigation. The first two cases (BAB2714 and BAB2695) had each been partially characterized previously, so recovery of their known breakpoint junctions could serve as an indicator of the method's success. Both of these CGRs harbor four copy number transitions, but previously only one breakpoint junction had been elucidated in each patient (Figure 3a, 1 to 1 for BAB2714 and BAB2695) [19,23]. These two previously described breakpoint junctions occurred between Alu elements, yielding chimeric AluY elements with 33 and 31 bp of microhomology in BAB2714 and BAB2695, respectively [19,23]. However, the sequence of a second breakpoint junction was required to fully resolve the SVs in these CGRs (Figure 3a, 2 to 2). Interestingly, the previously undetermined breakpoint junctions in these two patients each have one end within an LCR, which leads to large uncertainty regions (~62-230 Kb) in the aCGH data and makes the junctions difficult to map at sequence resolution (Figure 3a). BAB3793 was a new case that had not been published previously.
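The microhomology lengths quoted above (33 and 31 bp) describe the stretch at a junction that matches both contributing reference sequences, so the exact breakpoint position is ambiguous within it. The following is a minimal sketch of how such a length could be counted, not the authors' pipeline. The function name, its arguments, and the toy sequences are hypothetical, and the sketch counts only perfect identity, whereas real Alu-Alu junctions can contain mismatches.

```python
def microhomology_length(left_ref: str, right_ref: str,
                         left_end: int, right_start: int) -> int:
    """Count bases of perfect identity shared by both references at a junction.

    The junction is modelled as left_ref[:left_end] + right_ref[right_start:].
    Both references are assumed to be given in the same orientation.
    """
    # Extend backward: bases just before the breakpoint on both references.
    back = 0
    while (left_end - back - 1 >= 0
           and right_start - back - 1 >= 0
           and left_ref[left_end - back - 1] == right_ref[right_start - back - 1]):
        back += 1

    # Extend forward: bases just after the breakpoint on both references.
    fwd = 0
    while (left_end + fwd < len(left_ref)
           and right_start + fwd < len(right_ref)
           and left_ref[left_end + fwd] == right_ref[right_start + fwd]):
        fwd += 1

    return back + fwd


# Toy example (made-up sequences): the two references share "CGT" at the junction.
left = "AAAACGTGGGG"
right = "TTTTCGTCCCC"
print(microhomology_length(left, right, 5, 5))
```

Because the shared stretch is counted in both directions, the result does not depend on where inside the microhomology the breakpoint is nominally placed.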

