Error correction and diversity analysis of population mixtures determined by NGS.

Wood GR, Burroughs NJ, Evans DJ, Ryabov EV - PeerJ (2014)

Bottom Line: The impetus for this work was the need to analyse nucleotide diversity in a viral mix taken from honeybees. The paper has two findings. A compendium of existing and new diversity analysis tools is also presented, allowing hypotheses about diversity and mean diversity to be tested and associated confidence intervals to be calculated.


Affiliation: Warwick Systems Biology Centre, University of Warwick, Coventry, United Kingdom.

ABSTRACT
The impetus for this work was the need to analyse nucleotide diversity in a viral mix taken from honeybees. The paper has two findings. First, a method for correction of next generation sequencing error in the distribution of nucleotides at a site is developed. Second, a package of methods for assessment of nucleotide diversity is assembled. The error correction method is statistically based and works at the level of the nucleotide distribution rather than the level of individual nucleotides. The method relies on an error model and a sample of known viral genotypes that is used for model calibration. A compendium of existing and new diversity analysis tools is also presented, allowing hypotheses about diversity and mean diversity to be tested and associated confidence intervals to be calculated. The methods are illustrated using honeybee viral samples. Software in both Excel and Matlab and a guide are available at http://www2.warwick.ac.uk/fac/sci/systemsbiology/research/software/, the Warwick University Systems Biology Centre software download site.
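As a rough illustration of what correcting error "at the level of the nucleotide distribution" can look like, the sketch below inverts a very simple error model at a single site. The model is an assumption made for illustration only and is not taken from the paper: each read base is misread with probability `eps`, and a misread is equally likely to be any of the other three nucleotides. The names `apply_error`, `correct_distribution` and `eps` are hypothetical; in the paper the error model is calibrated from a sample of known viral genotypes.

```python
def apply_error(true_dist, eps):
    """Observed A/C/G/T proportions under the assumed symmetric error model.

    Each base is kept with probability 1 - eps and otherwise replaced by one
    of the three other bases with probability eps / 3 each.
    """
    scale = 1 - 4 * eps / 3
    return [scale * p + eps / 3 for p in true_dist]


def correct_distribution(observed, eps):
    """Invert the assumed error model on an observed site distribution.

    Negative estimates (possible with sampling noise) are clipped to zero
    and the result is renormalised so that it sums to one.
    """
    if not 0 <= eps < 0.75:
        raise ValueError("eps must lie in [0, 0.75) for the model to be invertible")
    scale = 1 - 4 * eps / 3
    raw = [(q - eps / 3) / scale for q in observed]
    clipped = [max(0.0, p) for p in raw]
    total = sum(clipped)
    return [p / total for p in clipped]


if __name__ == "__main__":
    true_dist = [0.9, 0.1, 0.0, 0.0]          # hypothetical site, order A, C, G, T
    observed = apply_error(true_dist, eps=0.01)
    print(observed)
    print(correct_distribution(observed, eps=0.01))
```

The point of the sketch is that the correction acts on the four proportions at a site as a whole, without deciding which individual reads are wrong.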



Figure 5: Confidence intervals for true mean diversity across the capsid region, for the high diversity (F3) and low diversity (E7) samples, both before and after correction for the NGS error. Correction has a larger effect when the (uncorrected) diversity is low, clearly revealing the reduction in diversity from F3 (low DWV level) to E7 (high DWV level). The clonal threshold for the mean shows that the corrected E7 data plausibly has non-zero diversity.

Mentions: Figure 5 illustrates the confidence intervals found in Example 2.3, labelled “After correction”, together with corresponding confidence intervals for the uncorrected data, labelled “Before correction”. The corrected clonal mean diversity threshold of Example 3 is also shown. The highly significant difference between the Varroa-free and Varroa-infested nucleotide diversities is evident. Correction of the low diversity sample has a far greater effect on diversity than correction of the high diversity sample, due to the steep slope of the diversity component near zero (Fig. 3). The corrected low diversity sample lies just above the clonal threshold.
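The "steep slope near zero" argument can be checked numerically. The sketch below assumes that diversity at a site is measured by Shannon entropy, so that each nucleotide contributes a component -p ln p whose derivative, -ln p - 1, grows without bound as p approaches zero. The excerpt does not name the diversity measure, so this choice is an assumption, as are the symmetric error model, the error rate and the two example distributions.

```python
import math


def shannon_diversity(dist):
    """Shannon entropy (natural log) of a site distribution; 0 * ln 0 is taken as 0."""
    return -sum(p * math.log(p) for p in dist if p > 0)


def add_symmetric_error(dist, eps):
    """Assumed error model: a base is misread with probability eps, uniformly
    to one of the other three nucleotides."""
    scale = 1 - 4 * eps / 3
    return [scale * p + eps / 3 for p in dist]


def diversity_shift(true_dist, eps):
    """Amount by which the assumed sequencing error inflates the diversity."""
    observed = add_symmetric_error(true_dist, eps)
    return shannon_diversity(observed) - shannon_diversity(true_dist)


if __name__ == "__main__":
    eps = 0.005                               # hypothetical error rate
    low = [0.999, 0.001, 0.0, 0.0]            # near-clonal site
    high = [0.6, 0.4, 0.0, 0.0]               # strongly mixed site
    for name, dist in (("low", low), ("high", high)):
        shift = diversity_shift(dist, eps)
        relative = shift / shannon_diversity(dist)
        print(f"{name}: shift = {shift:.4f}, relative = {relative:.2f}")
```

Under these assumptions the same error rate moves the near-clonal site by more, both in absolute and in relative terms, because the error adds small probabilities exactly where the component -p ln p is steepest.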

