High-Quality Exome Sequencing of Whole-Genome Amplified Neonatal Dried Blood Spot DNA.

Poulsen JB, Lescai F, Grove J, Bækvad-Hansen M, Christiansen M, Hagen CM, Maller J, Stevens C, Li S, Li Q, Sun J, Wang J, Nordentoft M, Werge TM, Mortensen PB, Børglum AD, Daly M, Hougaard DM, Bybjerg-Grauholm J, Hollegaard MV - PLoS ONE (2016)

Bottom Line: Following sequencing and data analysis, we compared pairwise variant calls to obtain a measure of similarity, the concordance rate. The wgaDNA performed similarly to the matched high-quality reference (whole-blood DNA), based on concordance rates calculated from variant calls. No differences were observed when substituting 2x3.2 mm with 2x1.6 mm discs, allowing for additional reduction of sample material in future projects.


Affiliation: Department for Congenital Disorders, Danish Centre for Neonatal Screening, Section of Neonatal Genetics, Statens Serum Institut, Copenhagen, Denmark.

ABSTRACT
Stored neonatal dried blood spot (DBS) samples from neonatal screening programmes are a valuable diagnostic and research resource. Combined with information from national health registries they can be used in population-based studies of genetic diseases. DNA extracted from neonatal DBSs can be amplified to obtain micrograms of an otherwise limited resource, referred to as whole-genome amplified DNA (wgaDNA). Here we investigate the robustness of exome sequencing of wgaDNA of neonatal DBS samples. We conducted three pilot studies of seven, eight and seven subjects, respectively. For each subject we analysed a neonatal DBS sample and corresponding adult whole-blood (WB) reference sample. Different DNA sample types were prepared for each of the subjects. Pilot 1: wgaDNA of 2x3.2 mm neonatal DBSs (DBS_2x3.2) and raw DNA extract of the WB reference sample (WB_ref). Pilot 2: DBS_2x3.2, WB_ref and a WB_ref replica sharing DNA extract with the WB_ref sample. Pilot 3: DBS_2x3.2, WB_ref, wgaDNA of 2x1.6 mm neonatal DBSs and wgaDNA of the WB reference sample. Following sequencing and data analysis, we compared pairwise variant calls to obtain a measure of similarity, the concordance rate. Before filtering of the variant calls, concordance rates were slightly lower when comparing DBS vs WB sample types than for any two WB sample types of the same subject. The overall concordance rates were dependent on the variant type, with SNPs performing best. Post-filtering, the comparisons of DBS vs WB and WB vs WB sample types yielded similar concordance rates, with values close to 100%. WgaDNA of neonatal DBS samples performs with great accuracy and efficiency in exome sequencing. The wgaDNA performed similarly to the matched high-quality reference (whole-blood DNA), based on concordance rates calculated from variant calls. No differences were observed when substituting 2x3.2 mm with 2x1.6 mm discs, allowing for additional reduction of sample material in future projects.
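
The abstract does not state how the concordance rate was computed. As a rough illustration only, the Python sketch below assumes one common definition: of all variant sites called in either of two samples from the same subject, the fraction at which both samples report the same genotype. The function name, the site keys and the example genotypes are hypothetical and are not taken from the study.

```python
def concordance_rate(calls_a, calls_b):
    """Fraction of sites called in either sample where both genotypes agree.

    Each argument maps a variant site, here a (chrom, pos, ref, alt) tuple,
    to a genotype string such as "0/1". This definition is an assumption;
    the paper may use a different one.
    """
    all_sites = set(calls_a) | set(calls_b)
    if not all_sites:
        raise ValueError("no variant calls to compare")
    shared_sites = set(calls_a) & set(calls_b)
    matching = sum(1 for site in shared_sites if calls_a[site] == calls_b[site])
    return matching / len(all_sites)


# Hypothetical calls for one subject: a DBS sample and its WB reference.
dbs_calls = {
    ("chr1", 100, "A", "G"): "0/1",
    ("chr1", 200, "C", "T"): "1/1",
    ("chr2", 50, "G", "A"): "0/1",
}
wb_calls = {
    ("chr1", 100, "A", "G"): "0/1",
    ("chr1", 200, "C", "T"): "0/1",  # genotype differs from the DBS call
    ("chr2", 50, "G", "A"): "0/1",
    ("chr3", 10, "T", "C"): "0/1",   # called in WB only
}

rate = concordance_rate(dbs_calls, wb_calls)
print(f"concordance rate: {rate:.2f}")
```

Under this definition a site called in only one sample counts as discordant, which is one reason pre-filtering rates can sit below post-filtering rates.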


Fig 2. Exome coverage by depth. The data are presented as box plots of the percentage of the exome covered at sequencing depths greater than 30X. The exact medians of the observations are listed above the plot. From left to right, the pilots are shown in the order: Pilot 1 with DBS_2x3.2 and WB_ref sample types, Pilot 2 with DBS_2x3.2, WB_ref and WB_ref replica sample types, and Pilot 3 with DBS_2x1.6, DBS_2x3.2, WB_WGA_ref and WB_ref sample types. In the plot, the medians are given by a solid black line enclosed in boxes specifying the first and third quartiles. The whiskers represent the statistical dispersion of the data using the interquartile range (1.5*IQR). Data beyond the 1.5*IQR range are outliers and are plotted as dots. Note that the number of observations per sample type (not considering DBS_2x1.6, see below) equals the number of subjects included in the pilot, i.e. Pilot 1 = 7, Pilot 2 = 8 and Pilot 3 = 7. The DBS_2x1.6 sample type was plotted using all 21 observations, resulting from the triplicate experiments per subject on different sets of 2x1.6 mm discs included in Pilot 3. The coverage statistics were calculated with GATK, and R was used to obtain the percentage of exome coverage per sample type shown.
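
The box plot elements described in the caption (median, quartile box, 1.5*IQR whiskers, outlier dots) can be sketched in Python as follows. This is an illustration under assumptions, not the authors' R code: it assumes the standard Tukey convention in which whiskers reach the most extreme observations within 1.5*IQR of the box, it uses the "inclusive" quartile method of Python's statistics module (R's box plots may compute the box edges slightly differently), and the seven coverage values are made up.

```python
import statistics


def box_plot_summary(values):
    """Return the quantities a Tukey-style box plot draws for one sample type."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme observations inside the fences.
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


# Hypothetical percentages of exome covered at >30X for seven subjects.
coverage_pct = [62.0, 64.0, 65.0, 66.0, 67.0, 68.0, 40.0]
summary = box_plot_summary(coverage_pct)
print(summary)
```

With seven observations per sample type, as in Pilots 1 and 3, a single low-coverage sample is enough to appear as an outlier dot without moving the median much.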

Mentions: Defining the required coverage is a subjective matter that depends heavily on the species, the type of sample and the scope of the study. At a given site in a diploid genome, a depth threshold of 30X can be considered sufficient for calling high-quality variants. Taking this as our minimum threshold, we analysed the proportion of target regions in our data with an average depth above 30X in each of the sample types (Fig 2). The statistics reveal a clear-cut difference between the DBS and corresponding WB samples in the percentage of sequences reaching the standard threshold of 30X, with the DBS samples exhibiting reduced threshold coverage. For the DBS_2x3.2 and WB_ref sample types, this difference was ~10.7% in Pilot 1, ~5.9% in Pilot 2 and ~6.9% in Pilot 3. Despite this, the results for the DBS samples are still in an acceptable range (between ~60–75% of exome coverage >30X) compared with other samples for which high-quality WES data have been obtained [24, 25]. We also calculated the medians of the percentage of exome coverage at depth >10X. For Pilot 1 these were 94.9% and 97.6% for the DBS_2x3.2 and WB_ref sample types, respectively. Pilot 2: 86.9%, 93.7% and 94.1% for the DBS_2x3.2, WB_ref and WB_ref replica sample types, respectively. Pilot 3: 88.8%, 87.9%, 92.5% and 92.1% for the DBS_2x1.6, DBS_2x3.2, WB_WGA_ref and WB_ref sample types, respectively. At >10X, the threshold coverage of the DBS_2x3.2 sample type was reduced by ~2.7% in Pilot 1, ~6.8% in Pilot 2 and ~4.2% in Pilot 3 compared with the WB_ref sample type. We found no differences between the DBS_2x3.2 and DBS_2x1.6 sample types of Pilot 3, suggesting that coverage is independent of disc size in the range tested. In Pilot 3 in particular, the analyses were run in triplicate, allowing us to perform a technical replication and verify the absence of random bias. We have not investigated further why the coverage is apparently reduced in WGA samples.
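
The threshold-coverage figures above can be reproduced in outline with a short Python sketch. The source states that GATK and R were used, so this is not the authors' pipeline. The helper `pct_above_threshold` and the per-target depths are hypothetical; only the >10X medians are taken from the text, and the sketch treats the reported reductions as simple differences between those medians.

```python
def pct_above_threshold(depths, threshold):
    """Percentage of target regions whose average depth exceeds the threshold."""
    if not depths:
        raise ValueError("no target regions supplied")
    return 100.0 * sum(1 for d in depths if d > threshold) / len(depths)


# Hypothetical average depths for eight target regions of one sample.
target_depths = [12, 35, 48, 8, 31, 30, 95, 60]
pct_30x = pct_above_threshold(target_depths, 30)
pct_10x = pct_above_threshold(target_depths, 10)
print(f">30X: {pct_30x:.1f}%  >10X: {pct_10x:.1f}%")

# Medians of exome coverage at >10X as reported in the text.
medians_10x = {
    "Pilot 1": {"DBS_2x3.2": 94.9, "WB_ref": 97.6},
    "Pilot 2": {"DBS_2x3.2": 86.9, "WB_ref": 93.7},
    "Pilot 3": {"DBS_2x3.2": 87.9, "WB_ref": 92.1},
}

# Reduction of DBS_2x3.2 relative to WB_ref, as a difference of medians.
gaps_10x = {
    pilot: round(m["WB_ref"] - m["DBS_2x3.2"], 1)
    for pilot, m in medians_10x.items()
}
print(gaps_10x)
```

The differences computed this way agree with the ~2.7%, ~6.8% and ~4.2% quoted for Pilots 1, 2 and 3, which suggests the reported reductions are differences between medians rather than relative changes.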
Duplicate read percentages were found to depend more on the kit than on whether amplification was used. For Pilots 1, 2 and 3, respectively, duplication rates were 18%, 9% and 14% for WB and 18%, 9% and 16% for DBS. The scope of this study was to investigate whether confidently discovered variants were reliable; we leave it to future studies to elucidate whether genomic regions are lost. We conclude that the data quality is of a reasonable standard to proceed to the subsequent steps of variant calling.

