SNP selection for genes of iron metabolism in a study of genetic modifiers of hemochromatosis.

Constantine CC, Gurrin LC, McLaren CE, Bahlo M, Anderson GJ, Vulpe CD, Forrest SM, Allen KJ, Gertig DM, HealthIron Investigato - BMC Med. Genet. (2008)

Bottom Line: We contrasted results from two tag SNP selection algorithms, LDselect and Tagger. We examined the pattern of linkage disequilibrium of three levels of resequencing coverage for the transferrin gene and found HapMap phase 1 tag SNPs capture 45% of the ≥ 3% MAF SNPs found in SeattleSNPs where there is nearly complete resequencing. A candidate gene approach should seek to maximise coverage, and this can be improved by adding to HapMap data any available sequencing data.


Affiliation: The Centre for Molecular, Environmental, Genetic and Analytic (MEGA) Epidemiology, School of Population Health, The University of Melbourne, Melbourne, Australia. ccconsta@uci.edu

ABSTRACT

Background: We report our experience of selecting tag SNPs in 35 genes involved in iron metabolism in a cohort study seeking to discover genetic modifiers of hereditary hemochromatosis.

Methods: We combined our own and publicly available resequencing data with HapMap to maximise our coverage to select 384 SNPs in candidate genes suitable for typing on the Illumina platform.

Results: Validation/design scores above 0.6 were not strongly correlated with SNP performance as estimated by Gentrain score. We contrasted results from two tag SNP selection algorithms, LDselect and Tagger. Varying r² from 0.5 to 1.0 produced a near-linear correlation with the number of tag SNPs required. We examined the pattern of linkage disequilibrium of three levels of resequencing coverage for the transferrin gene and found HapMap phase 1 tag SNPs capture 45% of the ≥ 3% MAF SNPs found in SeattleSNPs where there is nearly complete resequencing. Resequencing can reveal adjacent SNPs (within 60 bp) which may affect assay performance. We report the number of SNPs present within the region of six of our larger candidate genes, for different versions of stock genotyping assays.

Conclusion: A candidate gene approach should seek to maximise coverage, and this can be improved by adding to HapMap data any available sequencing data. Tag SNP software must be fast and flexible to data changes, since tag SNP selection involves iteration as investigators seek to satisfy the competing demands of coverage within and between populations, and typability on the technology platform chosen.
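
The dependence of the number of tag SNPs on the r² threshold can be illustrated with a simplified greedy binning procedure. The Python sketch below is only in the spirit of pairwise tag selection; it is not the actual LDselect or Tagger implementation. It assumes genotypes coded as minor-allele counts (0/1/2) and approximates r² by the squared Pearson correlation of those counts. The function names and the toy data are hypothetical.

```python
from math import fsum

def r_squared(a, b):
    """Squared Pearson correlation of two genotype vectors (0/1/2 counts).

    Assumption: this genotype-based value stands in for the
    haplotype-based r2 that dedicated tagging software estimates.
    """
    n = len(a)
    mean_a = fsum(a) / n
    mean_b = fsum(b) / n
    cov = fsum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = fsum((x - mean_a) ** 2 for x in a)
    var_b = fsum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # monomorphic SNP: r2 is undefined, treat as no LD
    return cov * cov / (var_a * var_b)

def greedy_tags(snps, threshold):
    """Pick tag SNPs greedily until every SNP is covered.

    snps: dict mapping SNP name -> list of genotypes (same sample order).
    A SNP covers itself and every SNP whose r2 with it is >= threshold.
    """
    names = list(snps)
    covers = {
        a: {b for b in names
            if a == b or r_squared(snps[a], snps[b]) >= threshold}
        for a in names
    }
    untagged = set(names)
    tags = []
    while untagged:
        # Choose the SNP that covers the most still-untagged SNPs.
        best = max(names, key=lambda a: len(covers[a] & untagged))
        tags.append(best)
        untagged -= covers[best]
    return tags

# Hypothetical toy data: s1 and s2 are in perfect LD, s3 is only
# moderately correlated with them (r2 = 0.5625).
toy = {
    "s1": [0, 1, 2, 0, 1, 2],
    "s2": [0, 1, 2, 0, 1, 2],
    "s3": [2, 0, 1, 2, 1, 0],
}

for threshold in (0.5, 0.8, 1.0):
    print(threshold, greedy_tags(toy, threshold))
```

On this toy data a threshold of 0.5 needs one tag, while 0.8 and 1.0 need two, which mirrors the general pattern that stricter thresholds require more tag SNPs.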


Figure 3: Frequency distribution of captured and uncaptured SNPs from Seattle resequencing of TF using HapMap tag SNPs. The large number of captured SNPs in the 40–45% MAF range reflects a strong block of LD that is captured by a single tag SNP.

Mentions: Table 5 compares the number of SNPs with MAF ≥ 3% identified in data from Caucasians across the four data sources (HealthIron, NHLBI RS&G, SeattleSNPs, and HapMap). The 7 HapMap tag SNPs, chosen with Tagger from the 11 HapMap SNPs with MAF ≥ 3%, capture just 45 of the 101 Seattle SNPs (45%) at an r² threshold of 0.80; that is, only 45% of the Seattle SNPs have an r² of 0.80 or more with at least one tag SNP. Increasing the minimum MAF to 5% increases the capture of Seattle SNPs using HapMap tag SNPs to 48/89 (55%). Decreasing the r² threshold to 0.50, with the minimum MAF still at 3%, improved capture of the Seattle SNPs only to 63/101 (62%). Approximately 55% of variant SNPs for transferrin in the SeattleSNPs database were not captured well using only HapMap phase 1 tag SNPs. In comparison, HapMap Mar 2006 has 38 SNPs within the TF gene (MAF ≥ 3% in Caucasians), with 17 tag SNPs (pairwise using Tagger). Unfortunately, TF was the only gene for which we could make this comparison, since it requires all the HapMap SNPs to lie within the regions for which resequencing data are available. Figure 3 shows the minor allele frequency of the captured and uncaptured SNPs: the uncaptured SNPs are spread evenly across the frequency range rather than being confined to low frequencies.
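
The capture figures in this passage rest on a simple rule: a SNP counts as captured when its r² with at least one tag SNP meets the threshold. A minimal Python sketch of that rule follows, under stated assumptions. Genotypes are coded as minor-allele counts (0/1/2), and r² is approximated by the squared Pearson correlation of those counts rather than the haplotype-based estimate used by Tagger. The function names and the toy genotypes are hypothetical and are not data from the study.

```python
from math import fsum

def r_squared(a, b):
    """Squared Pearson correlation of two genotype vectors (0/1/2 counts).

    Assumption: a stand-in for the haplotype-based r2 used by tagging tools.
    """
    n = len(a)
    mean_a = fsum(a) / n
    mean_b = fsum(b) / n
    cov = fsum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = fsum((x - mean_a) ** 2 for x in a)
    var_b = fsum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # monomorphic SNP: r2 is undefined, treat as uncaptured
    return cov * cov / (var_a * var_b)

def capture(reference, tags, threshold=0.8):
    """Count reference SNPs whose best r2 with any tag SNP is >= threshold.

    reference, tags: dicts mapping SNP name -> genotype list, with the
    same individuals in the same order in both.
    Returns (captured, total).
    """
    captured = 0
    for genotypes in reference.values():
        best = max(r_squared(genotypes, t) for t in tags.values())
        if best >= threshold:
            captured += 1
    return captured, len(reference)

# Hypothetical toy data: one tag SNP and three resequenced SNPs.
tags = {"tag1": [0, 1, 2, 0, 1, 2]}
reference = {
    "snpA": [0, 1, 2, 0, 1, 2],  # identical to the tag, r2 = 1
    "snpB": [2, 1, 0, 2, 1, 0],  # alleles flipped, r2 is still 1
    "snpC": [2, 0, 1, 2, 1, 0],  # r2 = 0.5625 with the tag
}

for threshold in (0.8, 0.5):
    hit, total = capture(reference, tags, threshold)
    print(f"r2 >= {threshold}: {hit}/{total} captured")

# The percentages quoted for TF are counts over totals, e.g.:
print(round(100 * 45 / 101), round(100 * 63 / 101))  # 45 and 62
```

As in the TF comparison, lowering the threshold from 0.80 to 0.50 raises the captured count in the toy example, because the moderately correlated SNP then qualifies.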

