Best practices for evaluating single nucleotide variant calling methods for microbial genomics.

Olson ND, Lund SP, Colman RE, Foster JT, Sahl JW, Schupp JM, Keim P, Morrow JB, Salit ML, Zook JM - Front Genet (2015)

Bottom Line: As experience grows, researchers increasingly recognize that analyzing the wealth of data provided by these new sequencing platforms requires careful attention to detail for robust results. Missing, however, is a focus on critical evaluation of variant callers for these genomes. Variant calling is essential for comparative genomics as it yields insights into nucleotide-level organismal differences.

View Article: PubMed Central - PubMed

Affiliation: Biosystems and Biomaterials Division, Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA.

ABSTRACT
Innovations in sequencing technologies have allowed biologists to make incredible advances in understanding biological systems. As experience grows, researchers increasingly recognize that analyzing the wealth of data provided by these new sequencing platforms requires careful attention to detail for robust results. Thus far, much of the scientific community's focus in bacterial genomics has been on evaluating genome assembly algorithms and rigorously validating assembly program performance. Missing, however, is a focus on critical evaluation of variant callers for these genomes. Variant calling is essential for comparative genomics as it yields insights into nucleotide-level organismal differences. Variant calling is a multistep process with a host of potential error sources that may lead to incorrect variant calls. Identifying and resolving these incorrect calls is critical for bacterial genomics to advance. The goal of this review is to provide guidance on validating algorithms and pipelines used in variant calling for bacterial genomics. First, we will provide an overview of the variant calling procedures and the potential sources of error associated with the methods. We will then identify appropriate datasets for use in evaluating algorithms and describe statistical methods for evaluating algorithm performance. As variant calling moves from basic research to the applied setting, standardized methods for performance evaluation and reporting are required; it is our hope that this review provides the groundwork for the development of these standards.


Figure 5: Scatter plot showing the relationship between two performance metrics for variant call sets. The individual data points are based on metrics calculated from static contingency tables. The error bars represent the 95% confidence interval for each performance metric.
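
As an illustration of the kind of values plotted in Figure 5, the following is a minimal Python sketch (not taken from the paper) of deriving TP and FP rates and their 95% confidence intervals from a single static contingency table. The counts and the use of the Wilson score interval are assumptions for illustration only.

# Hedged sketch: performance metrics plus 95% CIs from one contingency table.
import math

def wilson_ci(successes, total, z=1.96):
    """Approximate 95% Wilson score interval for a proportion."""
    if total == 0:
        return (0.0, 0.0)
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return (center - half, center + half)

# Hypothetical contingency-table counts for one variant call set.
tp, fn, fp, tn = 480, 20, 15, 9485

tp_rate = tp / (tp + fn)   # sensitivity (true positive rate)
fp_rate = fp / (fp + tn)   # 1 - specificity (false positive rate)
print("TP rate:", tp_rate, wilson_ci(tp, tp + fn))
print("FP rate:", fp_rate, wilson_ci(fp, fp + tn))

Each call set evaluated this way yields one point (with error bars) for a scatter plot of one metric against the other.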

Mentions: Plotting the value of a performance metric as a function of the chosen threshold value for the continuous variable can be useful when comparing algorithm performance over a range of cutoff values. Figure 4 depicts the relationship between algorithm performance and variant quality values. The boxplots show how method A has higher FP and TP rates compared to B for a single threshold value. The dynamic plot with the smoothed data shows the range of threshold values for which this relationship holds. The range of threshold values for which a trend (e.g., method A has a higher FP rate compared to method B) is observed is an indication of the robustness of the trend. The ideal values for FP and TP rates are 0 and 1 respectively. As method A has higher values for both metrics compared to method B, a trade-off between accepting FP vs. FN will determine which method to use. Visualizing two metrics plotted against one another can more clearly present the trade-off between the methods. To compare method performance for two metrics with a fixed threshold, a scatter plot can be used (Figure 5). When considering a range of possible thresholds, one metric can be plotted as a function of a second metric (Figure 6). A ROC curve is an example of a comparison between two performance metrics over a range of threshold values. Values of the threshold or additional metrics can be indicated through variations in symbol or line color, width, or pattern. When many metrics are presented, it may prove most effective to plot each metric against a common variable, such as threshold. The uncertainty of the metrics can be presented in a number of ways. Here the uncertainty is represented qualitatively through the comparison of the two methods relative performance across a collection of 16 replicates. The uncertainty of the metrics can also be presented as error bars representing the uncertainty of one metrics given a fixed value for the second metric (marginal distributions), or the combined distribution of both metrics (joint distribution).
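
To make the threshold-sweep idea concrete, here is a minimal Python sketch (not the authors' pipeline) that computes TP and FP rates at a series of variant-quality cutoffs; such (threshold, TP rate, FP rate) triples are the raw material for the dynamic plots and ROC-style curves described above. The call records and truth-set totals are hypothetical placeholders.

# Hedged sketch: TP/FP rates as a function of a variant-quality threshold.
def rates_by_threshold(calls, n_true_sites, n_nonvariant_sites, thresholds):
    """Return (threshold, TP rate, FP rate) for each quality cutoff.

    calls: list of (quality score, is_true_variant) pairs for the call set.
    n_true_sites / n_nonvariant_sites: totals from the truth set.
    """
    results = []
    for t in thresholds:
        kept = [is_true for qual, is_true in calls if qual >= t]
        tp = sum(kept)                # calls retained that are true variants
        fp = len(kept) - tp           # calls retained that are not
        results.append((t, tp / n_true_sites, fp / n_nonvariant_sites))
    return results

# Hypothetical example: six calls scored against a truth set with 5 variant
# and 95 non-variant positions.
calls = [(60, True), (55, True), (48, False), (42, True), (30, False), (25, True)]
for t, tpr, fpr in rates_by_threshold(calls, 5, 95, thresholds=range(20, 70, 10)):
    print(f"qual >= {t}: TP rate {tpr:.2f}, FP rate {fpr:.2f}")

Plotting TP rate against FP rate across the swept thresholds gives a ROC-style comparison; plotting either rate against the threshold itself gives the per-metric curves discussed above.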

