Bioinformatic Challenges in Clinical Diagnostic Application of Targeted Next Generation Sequencing: Experience from Pheochromocytoma.

Crona J, Ljungström V, Welin S, Walz MK, Hellman P, Björklund P - PLoS ONE (2015)

Bottom Line: Recent studies have demonstrated equal quality of targeted next generation sequencing (NGS) compared to Sanger sequencing. We conclude that targeted next generation sequencing has equal quality compared to Sanger sequencing. As acceptable accuracy was noted for a fully automated bioinformatic workflow, we suggest that processing of NGS data could be performed without expert bioinformatics skills, using existing commercially available bioinformatics tools.


Affiliation: Department of Surgical Sciences, Uppsala University, SE-75185, Uppsala, Sweden.

ABSTRACT

Background: Recent studies have demonstrated equal quality of targeted next generation sequencing (NGS) compared to Sanger sequencing. Although these sequencing processes have validated, robust performance, the choice of enrichment method and the reliability of the available bioinformatic software as analysis tools need further investigation in a diagnostic setting.

Methods: DNA from 21 patients with genetic variants in SDHB, VHL, EPAS1 or RET (n=17) or clinical criteria of NF1 syndrome (n=4) was included. Targeted NGS was performed using TruSeq custom amplicon enrichment, sequenced on an Illumina MiSEQ instrument. Results were analysed in parallel using three bioinformatics pipelines: (1) MiSEQ Reporter (MSR), a commercially available, fully automated and integrated software; (2) CLC Genomics Workbench (CLC), a commercially available software with a graphical interface; and (3) ICP, an in-house scripted custom bioinformatic tool.

Results: A tenfold read coverage was achieved in 95-98% of targeted bases. In all workflows, reads aligned to SDHA and NF1 pseudogenes. Compared to Sanger sequencing, variant calling revealed a sensitivity ranging from 83 to 100% and a specificity of 99.9-100%. Only MiSEQ Reporter identified all pathogenic variants in both sequencing runs.
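The coverage figure above is the share of targeted bases whose read depth reaches a threshold. A minimal sketch of that calculation follows, assuming per-base depths are already available as a list of integers; the function name `fraction_covered` and the toy depths are hypothetical and are not taken from the study.

```python
from typing import Sequence


def fraction_covered(depths: Sequence[int], min_depth: int = 10) -> float:
    """Return the fraction of targeted bases with read depth >= min_depth."""
    if not depths:
        raise ValueError("no targeted bases supplied")
    covered = sum(1 for depth in depths if depth >= min_depth)
    return covered / len(depths)


# Toy per-base depths for a hypothetical 20-base target (not study data).
toy_depths = [0, 4, 12, 35, 50, 48, 51, 60, 75, 80,
              92, 88, 70, 66, 40, 31, 22, 15, 10, 9]

print(f"{fraction_covered(toy_depths):.0%} of targeted bases at >=10x")
```

In practice the depths would come from a per-base coverage report restricted to the targeted regions; how each of the three pipelines derived them is not stated here.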

Conclusions: We conclude that targeted next generation sequencing has equal quality compared to Sanger sequencing. Enrichment specificity and bioinformatic performance need to be carefully assessed in a diagnostic setting. As acceptable accuracy was noted for a fully automated bioinformatic workflow, we suggest that processing of NGS data could be performed without expert bioinformatics skills, using existing commercially available bioinformatics tools.



Fig 5 (pone.0133210.g005): Venn diagram of overlapping variants between the two sequencing runs: total (all variants at bases annotated for the 11 included genes) and non-synonymous (variants remaining after filtering out synonymous variants with no calculated splice site disruption). MSR: MiSEQ Reporter; CLC: CLC Genomics Workbench; ICP: in-house custom pipeline.

Mentions: Results from variant calling are presented in detail in Table 4 and Figs 4 and 5. Variant calling identified a total of 1525 variants with MSR (Run01: 1418; Run02: 1409), 768 with CLC (Run01: 740; Run02: 738) and 1880 with ICP (Run01: 1732; Run02: 1747) in the targeted genes. Subsequent filtering of synonymous variants with no probable splice effect left 321 (MSR), 87 (CLC) and 305 (ICP) variants. Of the 47 variants detected by Sanger sequencing, MSR detected all 47 in both sequencing runs, CLC detected 39 (Run01) and 40 (Run02), and ICP detected 42 (Run01) and 43 (Run02). This corresponded to a sensitivity of 100/100% (MSR, Run01/Run02), 82.9/85.1% (CLC) and 89.4/91.4% (ICP). CLC did not detect VHL p.Tyr98His (Run01, patient 3), EPAS1 p.Leu529Pro (Run01 and Run02, patient 8), RET p.Cys611Tyr (Run01, patient 11) or NF1 p.Arg1241* (Run01, patient 19). SDHA p.Tyr629Phe was not detected by CLC or FreeBayes in either sequencing run. Specificity was >99.99% for MSR and ICP, while CLC had 100% specificity (Table 5). In the MSR and ICP workflows, the number of false positive variants could be reduced by removing variants not called in both sequencing runs. In total, 17% of variants were reported by all workflows and about 60% were specific to a single workflow.
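The sensitivity values and the run-concordance filter described above can be illustrated with a short sketch. The detected counts and the total of 47 Sanger-confirmed variants are taken from the text; the function names and the toy variant keys are hypothetical, and the intersection filter is only one plausible reading of "removal of variants not available in both sequencing runs", not the authors' actual implementation.

```python
SANGER_VARIANTS = 47  # variants detected by Sanger sequencing (from the text)

# Sanger-confirmed variants detected per workflow and run (from the text).
DETECTED = {
    "MSR": {"Run01": 47, "Run02": 47},
    "CLC": {"Run01": 39, "Run02": 40},
    "ICP": {"Run01": 42, "Run02": 43},
}


def sensitivity(true_positives: int, total_positives: int) -> float:
    """Sensitivity = TP / (TP + FN), expressed as a percentage."""
    if total_positives <= 0:
        raise ValueError("total_positives must be positive")
    return 100.0 * true_positives / total_positives


def concordant(run01: set, run02: set) -> set:
    """Keep only the variants that were called in both sequencing runs."""
    return run01 & run02


for workflow, runs in DETECTED.items():
    values = "/".join(
        f"{sensitivity(count, SANGER_VARIANTS):.1f}" for count in runs.values()
    )
    print(f"{workflow}: {values}% (Run01/Run02)")

# Hypothetical variant keys (chromosome, position, ref, alt), for illustration only.
run01_calls = {("chr3", 100, "T", "C"), ("chr2", 200, "T", "C"), ("chr5", 300, "A", "T")}
run02_calls = {("chr3", 100, "T", "C"), ("chr2", 200, "T", "C")}
print(sorted(concordant(run01_calls, run02_calls)))
```

Rounded to one decimal, 39/47 and 43/47 come out as 83.0% and 91.5%, whereas the text reports 82.9% and 91.4%; the reported values appear to be truncated rather than rounded.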

