Fish scales and SNP chips: SNP genotyping and allele frequency estimation in individual and pooled DNA from historical samples of Atlantic salmon (Salmo salar).

Johnston SE, Lindqvist M, Niemelä E, Orell P, Erkinaro J, Kent MP, Lien S, Vähä JP, Vasemägi A, Primmer CR - BMC Genomics (2013)

Bottom Line: SNPs located in polyploid regions of the genome were more sensitive to DNA degradation: older samples had lower genotyping success at these loci, and a larger reference panel of individuals was required to accurately estimate allele frequencies. SNP genotyping was highly successful in degraded DNA samples, paving the way for the use of degraded samples in SNP genotyping projects. We provide recommendations for future studies intending to conduct high-throughput SNP genotyping and allele frequency estimation in historical samples.


ABSTRACT

Background: DNA extracted from historical samples is an important resource for understanding genetic consequences of anthropogenic influences and long-term environmental change. However, such samples generally yield DNA of a lower amount and quality, and the extent to which DNA degradation affects SNP genotyping success and allele frequency estimation is not well understood. We conducted high density SNP genotyping and allele frequency estimation in both individual DNA samples and pooled DNA samples extracted from dried Atlantic salmon (Salmo salar) scales stored at room temperature for up to 35 years, and assessed genotyping success, repeatability and accuracy of allele frequency estimation using a high density SNP genotyping array.

Results: In individual DNA samples, genotyping success and repeatability were very high (> 0.973 and > 0.998, respectively) in samples stored for up to 35 years; both increased with the proportion of DNA of fragment size > 1000 bp. In pooled DNA samples, allele frequency estimation was highly repeatable (Repeatability = 0.986) and highly correlated with empirical allele frequency measures (Mean Adjusted R² = 0.991); allele frequency could be accurately estimated in > 95% of pooled DNA samples with a reference group of at least 30 individuals. SNPs located in polyploid regions of the genome were more sensitive to DNA degradation: older samples had lower genotyping success at these loci, and a larger reference panel of individuals was required to accurately estimate allele frequencies.

Conclusions: SNP genotyping was highly successful in degraded DNA samples, paving the way for the use of degraded samples in SNP genotyping projects. DNA pooling provides the potential for large scale population genetic studies with fewer assays, provided enough reference individuals are also genotyped and DNA quality is properly assessed beforehand. We provide recommendations for future studies intending to conduct high-throughput SNP genotyping and allele frequency estimation in historical samples.


Figure 5: Boxplot demonstrating the accuracy of allele frequency estimation using subsets of reference individuals. Parameter estimation was carried out 100 times for each subset. A. The proportion of pooled samples for which frequency estimates could be calculated. B. The mean adjusted R² over all pools and all loci for each simulation. C. The mean difference between empirical and estimated allele frequencies over all pools and loci for each simulation.

Mentions: Allele frequencies in DNA pools were re-estimated using mean genotype cluster positions calculated from smaller subsets of constituent individuals, ranging from 10 to 200 individuals sampled from the full dataset. The mean proportion of pool allele frequencies that could be estimated increased from 0.895 when estimated from N = 10 individuals to 0.988 when estimated from N = 200 individuals; a proportion of > 0.95 was observed when sampling 30 individuals or more (Figure 5A). The mean adjusted R² between the empirical and estimated frequencies was high for all subsets (> 0.985) and increased with the number of individuals sampled (Figure 5B). The mean difference between the estimated and empirical allele frequencies decreased as the number of sampled individuals increased (Figure 5C). For all three estimates, the mean values for each subset size differed significantly from those of the previous category as the sample size increased (two sample t-test P < 0.001; Figure 5). The mean results obtained for each estimate are given in Additional file 1: Table S4.
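The subsampling procedure described above can be illustrated with a minimal Python sketch for a single locus. It is not the authors' implementation. In particular, the estimator here (linear interpolation of a pool's signal between the two homozygote cluster means), the rule that a frequency is inestimable when a homozygote cluster is missing from the subset, the use of the absolute difference, and all function and variable names are assumptions made for illustration only.

```python
import random
import statistics


def cluster_means(signals, genotypes):
    """Mean signal per genotype class (0 = AA, 1 = AB, 2 = BB) in a reference subset."""
    means = {}
    for g in (0, 1, 2):
        values = [s for s, gt in zip(signals, genotypes) if gt == g]
        if values:
            means[g] = statistics.fmean(values)
    return means


def estimate_pool_frequency(pool_signal, means):
    """Assumed estimator: interpolate between the homozygote cluster means.

    Returns None when either homozygote cluster is absent from the subset,
    which this sketch treats as "frequency could not be estimated".
    """
    if 0 not in means or 2 not in means or means[0] == means[2]:
        return None
    freq = (pool_signal - means[0]) / (means[2] - means[0])
    return min(1.0, max(0.0, freq))  # clamp to the valid frequency range


def adjusted_r2(x, y, predictors=1):
    """Adjusted R^2 for a simple linear regression of y on x."""
    n = len(x)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r2 = sxy ** 2 / (sxx * syy)
    return 1 - (1 - r2) * (n - 1) / (n - predictors - 1)


def subsample_once(ref_signals, ref_genotypes, pool_signals, n, rng):
    """One replicate: draw n reference individuals, then estimate each pool."""
    chosen = rng.sample(range(len(ref_signals)), n)
    means = cluster_means([ref_signals[i] for i in chosen],
                          [ref_genotypes[i] for i in chosen])
    return [estimate_pool_frequency(p, means) for p in pool_signals]


def summarise(estimates, empirical):
    """Return (proportion estimable, adjusted R^2, mean absolute difference)."""
    pairs = [(e, t) for e, t in zip(estimates, empirical) if e is not None]
    proportion = len(pairs) / len(estimates)
    if len(pairs) < 3:  # too few points for an adjusted R^2
        return proportion, None, None
    est, emp = zip(*pairs)
    diff = statistics.fmean(abs(e - t) for e, t in pairs)
    return proportion, adjusted_r2(emp, est), diff
```

Repeating `subsample_once` 100 times for each subset size (for example 10 to 200 individuals) and averaging the three summaries would give values analogous to those plotted in Figure 5A, 5B and 5C, under the assumptions stated above.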

