Detection of genomic variation by selection of a 9 Mb DNA region and high throughput sequencing.

Nikolaev SI, Iseli C, Sharp AJ, Robyr D, Rougemont J, Gehrig C, Farinelli L, Antonarakis SE - PLoS ONE (2009)

Bottom Line: Microarray-based selection and sequencing resulted in 260-fold enrichment, with 41% of reads mapping to the target region. 83% of SNPs in the targeted region had at least 4-fold sequence coverage and 54% at least 15-fold. We observed that regions with low sequence coverage occur in close proximity to low-complexity DNA. Validation experiments using Sanger sequencing were performed for 46 SNPs with 15-20 fold coverage, with a confirmation rate of 96%, suggesting that DNA selection provides an accurate and cost-effective method for identifying rare genomic variants.


Affiliation: Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, Switzerland. Sergey.Nikolaev@unige.ch

ABSTRACT
Detection of the rare polymorphisms and causative mutations of genetic diseases in a targeted genomic area has become a major goal in order to understand genomic and phenotypic variability. We have interrogated repeat-masked regions of 8.9 Mb on human chromosomes 21 (7.8 Mb) and 7 (1.1 Mb) from an individual from the International HapMap Project (NA12872). We have optimized a method of genomic selection for high throughput sequencing. Microarray-based selection and sequencing resulted in 260-fold enrichment, with 41% of reads mapping to the target region. 83% of SNPs in the targeted region had at least 4-fold sequence coverage and 54% at least 15-fold. When assaying HapMap SNPs in NA12872, our sequence genotypes are 91.3% concordant in regions with coverage ≥ 4-fold, and 97.9% concordant in regions with coverage ≥ 15-fold. About 81% of the SNPs recovered with both thresholds are listed in dbSNP. We observed that regions with low sequence coverage occur in close proximity to low-complexity DNA. Validation experiments using Sanger sequencing were performed for 46 SNPs with 15-20 fold coverage, with a confirmation rate of 96%, suggesting that DNA selection provides an accurate and cost-effective method for identifying rare genomic variants.
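The concordance figures in the abstract compare sequence-derived genotypes with previously genotyped HapMap SNPs, restricted to sites above a coverage threshold. A minimal sketch of that kind of comparison is given below. It is not the authors' pipeline: the function name, the tuple layout, and the toy genotypes are all assumptions made for illustration, and genotypes are treated as unordered allele pairs.

```python
def concordance_at_threshold(calls, min_coverage):
    """Fraction of sites with coverage >= min_coverage whose sequenced
    genotype matches the reference (e.g. array-based) genotype.

    calls: iterable of (coverage, sequenced_genotype, reference_genotype),
           genotypes written as two-letter strings such as "AG".
    Returns None when no site passes the coverage filter.
    """
    kept = [(seq, ref) for cov, seq, ref in calls if cov >= min_coverage]
    if not kept:
        return None
    # Allele order carries no meaning, so "AG" and "GA" count as the same call.
    matches = sum(1 for seq, ref in kept if sorted(seq) == sorted(ref))
    return matches / len(kept)


# Made-up toy data, NOT taken from the paper.
toy_calls = [
    (3, "AG", "AG"),
    (5, "AA", "AG"),
    (8, "CT", "TC"),
    (16, "GG", "GG"),
    (20, "CC", "CC"),
]

for threshold in (4, 15):
    print(threshold, concordance_at_threshold(toy_calls, threshold))
```

Raising the threshold discards low-coverage sites, where a heterozygous site is more likely to be miscalled as homozygous; this is consistent with the higher concordance reported at ≥ 15-fold than at ≥ 4-fold.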


Figure 1 (pone-0006659-g001): Enrichment of target regions. All sequencing tags that map uniquely to the reference human genome are displayed in blue. The height of peaks corresponds to the number of tags mapping to the same location. Red circles represent genomic areas plotted on the array. The image was generated using Genome Graphs (http://genome.ucsc.edu/cgi-bin/hgGenome).

Mentions: In order to characterize sequence variants in a genomic region of interest, we performed microarray-based enrichment using genomic DNA from HapMap individual NA12872. The selected region includes 8.9 Mb of non-repetitive DNA from human chromosome 21 (31.6 Mb–46.9 Mb) and chromosome 7 (115.3 Mb–117.2 Mb). A sample from a HapMap individual was chosen in order to compare the variants identified in this study by sequencing with those genotyped previously [30]. The enriched sample was sequenced using an Illumina GAII instrument, and signals were extracted using both the Illumina default output and our own Rolexa analysis [27]. The selection procedure resulted in 260-fold enrichment of the target region when the sample was mixed with COT-1 DNA at the hybridization step. Figure 1 displays all 4.1 million sequence tags recovered by Rolexa that map unambiguously to the reference genome. The high density of reads on HSA21 corresponds to the target region plotted on the capture array, and the second enrichment peak on HSA7 corresponds to ENCODE region ENm001.
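The text above does not state how the fold enrichment was calculated. One common definition is the ratio of read density inside the target to read density outside it, sketched below under stated assumptions: the function name is hypothetical, and the genome size of 3.1 Gb is an assumed round figure rather than a value from the paper. Because the authors' exact definition and denominator are not given here, the result should not be expected to match the reported 260-fold exactly.

```python
def fold_enrichment(on_target_fraction, target_bp, genome_bp):
    """Read density inside the target divided by read density outside it.

    on_target_fraction: fraction of mapped reads falling in the target (0..1)
    target_bp:          size of the target region in base pairs
    genome_bp:          size of the whole genome in base pairs (assumed value)
    """
    if not 0.0 < on_target_fraction < 1.0:
        raise ValueError("on_target_fraction must be strictly between 0 and 1")
    if not 0 < target_bp < genome_bp:
        raise ValueError("target_bp must be positive and smaller than genome_bp")
    on_density = on_target_fraction / target_bp
    off_density = (1.0 - on_target_fraction) / (genome_bp - target_bp)
    return on_density / off_density


# 41% of reads on an 8.9 Mb target (both from the text);
# 3.1 Gb genome size is an assumption for illustration.
estimate = fold_enrichment(0.41, 8.9e6, 3.1e9)
print(f"approximate fold enrichment: {estimate:.0f}")
```

With no enrichment, the on-target read fraction would equal the target's share of the genome and the function would return 1.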

