Global diversity lines - a five-continent reference panel of sequenced Drosophila melanogaster strains.

Grenier JK, Arguello JR, Moreira MC, Gottipati S, Mohammed J, Hackett SR, Boughton R, Greenberg AJ, Clark AG - G3 (Bethesda) (2015)

Bottom Line: Another key feature of these strains is their widespread geographic origin, coming from Beijing, Ithaca, Netherlands, Tasmania, and Zimbabwe. We found 83 segregating inversions among the lines, and as expected these were especially abundant in the African sample. We anticipate that this will make a useful addition to the set of reference D. melanogaster strains, thanks to its geographic structuring and unusually high level of genetic diversity.

Affiliation: Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853.

Variant count summary. (A) A total of 5.78 M SNP sites and 971 k small indel sites were discovered in the final panel of 84+1 Global Diversity lines. About half the variant sites per chromosome are shared among more than one population, with the Zimbabwe population contributing the majority of population-specific variant sites for both (B) single-nucleotide polymorphisms (SNPs) and (C) small indels. The ZW184 line has suspect provenance and is excluded from B and C (this line is the “+1” in our designation of 84+1 lines). *Chromosome 4 counts are ×10,000 in panels B and C.

Mentions: We investigated whether the variant quality score (VQS) per site or the genotype quality score (GQ) for individual genotype calls correlated with validation rate. Using the “100×” ZW155 validation set, we found that GQ clearly correlated with validation (Figure S3) but VQS did not (data not shown). Genotype calls with a greater GQ validated at a greater rate. We did find that, following variant quality score recalibration, the sites with the lowest recalibrated VQS also had lower validation rates, especially at heterozygous sites. We used a combination of GQ score, VQS-recalibrated flag, and genotype to filter out classes of genotype calls with validation rates below 90%, including all heterozygous calls outside of “het blocks.” This filtering reduced the number of nonreference SNP genotypes by 12% and the number of variant SNP sites by 5%. Finally, we masked SNP genotype calls if the SNP was within 5 nt of an indel call in the same line. There are more than 5.75 M euchromatic SNPs in the final set of variant sites (Figure 2 and Table S3), with 97% of genotypes called across 84+1 lines.
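
The final masking step described above (dropping SNP genotype calls within 5 nt of an indel call in the same line) can be illustrated with a minimal sketch. This is not the authors' pipeline. The function name `mask_snps_near_indels`, the representation of each indel as a single coordinate, and the inclusive 5 nt boundary are all assumptions made for illustration. The inputs are assumed to be positions on one chromosome arm in one line.

```python
from bisect import bisect_left

# 5 nt comes from the text; treating the boundary as inclusive is an assumption.
MASK_DISTANCE = 5


def mask_snps_near_indels(snp_positions, indel_positions, distance=MASK_DISTANCE):
    """Return one boolean per SNP: True if the call should be masked.

    Hypothetical helper: each indel is reduced to a single coordinate,
    and both lists are assumed to come from the same line and chromosome.
    """
    indels = sorted(indel_positions)
    masked = []
    for pos in snp_positions:
        # Index of the first indel located at or after (pos - distance).
        i = bisect_left(indels, pos - distance)
        # Mask if that indel also lies at or before (pos + distance).
        masked.append(i < len(indels) and indels[i] <= pos + distance)
    return masked


if __name__ == "__main__":
    snps = [100, 106, 250, 300]
    indels = [105, 294]
    print(mask_snps_near_indels(snps, indels))
```

Sorting the indel positions once and using a binary search per SNP keeps the lookup cheap when a line carries many variant calls. A real implementation would likely also account for indel length rather than a single coordinate.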

