A genome-wide scan for breast cancer risk haplotypes among African American women.

Song C, Chen GK, Millikan RC, Ambrosone CB, John EM, Bernstein L, Zheng W, Hu JJ, Ziegler RG, Nyante S, Bandera EV, Ingles SA, Press MF, Deming SL, Rodriguez-Gil JL, Chanock SJ, Wan P, Sheng X, Pooler LC, Van Den Berg DJ, Le Marchand L, Kolonel LN, Henderson BE, Haiman CA, Stram DO - PLoS ONE (2013)

Bottom Line: Three regions on chromosomes 1, 4 and 18 exhibited moderate haplotype effects. We also proposed a heuristic for determining the significance level and the effective number of independent tests by permutation analysis of chromosome 22 data. It suggests that the effective number was approximately half of the total (7,794 out of 15,645); the halved number could thus serve as a quick reference for evaluating genome-wide significance if a similar sliding window approach to haplotype analysis is adopted in similar populations with similar genotype density.


Affiliation: Department of Preventive Medicine, Keck School of Medicine and Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, CA, USA.

ABSTRACT
Genome-wide association studies (GWAS) simultaneously investigating hundreds of thousands of single nucleotide polymorphisms (SNPs) have become a powerful tool in the investigation of new disease susceptibility loci. Haplotypes are sometimes thought to be superior to SNPs and are promising in genetic association analyses. The application of genome-wide haplotype analysis, however, is hindered by the complexity of haplotypes themselves and by the computational sophistication required. We systematically analyzed the haplotype effects for breast cancer risk among 5,761 African American women (3,016 cases and 2,745 controls) using a sliding window approach on the genome-wide scale. Three regions on chromosomes 1, 4 and 18 exhibited moderate haplotype effects. Furthermore, among 21 breast cancer susceptibility loci previously established in European populations, 10p15 and 14q24 are likely to harbor novel haplotype effects. We also proposed a heuristic for determining the significance level and the effective number of independent tests by permutation analysis of chromosome 22 data. It suggests that the effective number was approximately half of the total (7,794 out of 15,645); the halved number could thus serve as a quick reference for evaluating genome-wide significance if a similar sliding window approach to haplotype analysis is adopted in similar populations with similar genotype density.
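
The "half number" heuristic in the abstract can be illustrated with a little arithmetic. The sketch below is not the authors' permutation procedure; it only takes the two chromosome 22 counts reported above and shows how an effective number of tests translates into a Bonferroni-style per-test threshold. The family-wise level of 0.05 and the function names are assumptions made for illustration.

```python
# Counts reported in the abstract for the chromosome 22 permutation analysis.
TOTAL_TESTS = 15_645
EFFECTIVE_TESTS = 7_794


def bonferroni_threshold(n_tests: float, alpha: float = 0.05) -> float:
    """Per-test significance level for a family-wise error rate of alpha.

    alpha = 0.05 is an assumed default, not a value taken from the paper.
    """
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def quick_genome_wide_threshold(n_windows: int, alpha: float = 0.05) -> float:
    """Half-number shortcut: treat n_windows / 2 as the effective test count."""
    return bonferroni_threshold(n_windows / 2, alpha)


# Fraction of tests that behave as independent on chromosome 22.
ratio = EFFECTIVE_TESTS / TOTAL_TESTS
# Threshold if every window is counted versus only the effective number.
naive = bonferroni_threshold(TOTAL_TESTS)
effective = bonferroni_threshold(EFFECTIVE_TESTS)

print(f"effective / total tests = {ratio:.3f}")
print(f"threshold using all tests       = {naive:.2e}")
print(f"threshold using effective tests = {effective:.2e}")
```

Because the effective count is smaller than the raw count, the resulting threshold is less stringent than a naive correction over every window. `quick_genome_wide_threshold` applies the same idea to a whole-genome window count, which would have to be supplied by the reader.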


Figure 2: Comparison of the significance of individual haplotypes with the most significant SNPs in three regions on chromosomes 1, 4 and 18. These three regions, namely (A) chr1:8,309,317-8,318,147; (B) chr4:122,325,743-122,363,114; and (C) chr18:35,670,316-35,683,522, were identified by the genome-wide haplotype association analysis using 5-SNP sliding windows. The regions were further extended both upstream and downstream by half of the original width to explore underdetected effects. Black circles denote individual haplotypes, with sizes proportional to their haplotype frequencies. Red dots denote genotyped SNPs within the same region. The blue dot shows the most significant SNP.
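
The two operations named in the caption, sliding 5-SNP windows and extending a region by half its width on each side, can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the function names are hypothetical, the width is assumed to be `end - start`, and odd widths are halved with integer division.

```python
from typing import List, Sequence, Tuple


def sliding_windows(snps: Sequence[str], size: int = 5) -> List[Tuple[str, ...]]:
    """Return every run of `size` consecutive SNPs, advancing one SNP at a time."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [tuple(snps[i:i + size]) for i in range(len(snps) - size + 1)]


def extend_region(start: int, end: int) -> Tuple[int, int]:
    """Extend a region upstream and downstream by half of its original width."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    half = (end - start) // 2  # assumed definition of "half of the original width"
    return max(start - half, 0), end + half


# Regions quoted in the caption (chromosome, start, end).
regions = [
    ("chr1", 8_309_317, 8_318_147),
    ("chr4", 122_325_743, 122_363_114),
    ("chr18", 35_670_316, 35_683_522),
]

for chrom, start, end in regions:
    new_start, new_end = extend_region(start, end)
    print(f"{chrom}:{start}-{end} -> {chrom}:{new_start}-{new_end}")

# Placeholder SNP labels, used only to show how the windows overlap.
example_snps = [f"snp{i}" for i in range(1, 8)]
for window in sliding_windows(example_snps):
    print(window)
```

Adjacent windows share four of their five SNPs, which is one reason the haplotype tests are far from independent.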


Mentions: In a search for haplotype peaks where significant SNPs were absent on the Manhattan plots, we found that a region on chromosome 5 exhibited a distinct haplotype effect compared with individual SNP associations at the same chromosomal region (Figure S2). There were five overlapping haplotype blocks defined by 5-SNP sliding windows with global test p-values (p = 1.70×10^-8, 3.16×10^-8, 1.85×10^-7, 1.45×10^-6, and 3.38×10^-6, respectively) smaller than any single SNP's p-value within the same region. However, the most significant SNP, rs6882564 (p = 1.14×10^-4), was part of all the significant haplotypes and was noted to be severely out of Hardy-Weinberg equilibrium (HWE; p < 1×10^-7). A review of the intensity plots for this SNP showed that rs6882564 was clearly miscalled by the genotyping algorithm, and we therefore dropped from consideration all haplotypes containing rs6882564, which left no haplotypes in the same region genome-wide significant. No other haplotype blocks throughout the genome had a global test p-value less than 10^-6. The top 10 independent genomic regions, with haplotype global test p-values between 1.60×10^-6 and 1.51×10^-5, are summarized in Table S3. After visual examination of the Manhattan plots contrasting the haplotype-specific effects with the individual SNP effects, the remaining most significant regions unlikely to be explained solely by SNPs were chr1:8,309,317-8,318,147, chr4:122,325,743-122,363,114, and chr18:35,670,316-35,683,522. Notably, on chromosome 1, the 5-SNP haplotype AGCTG (position: 8309317-8318147; frequency = 0.24) (Figure 2A; Table 2), composed of SNPs rs9628987, rs2289731, rs12711517, rs2305016, and rs7535752, had a p-value three orders of magnitude smaller than that of the most significant SNP contained in the haplotype, rs12711517 (haplotype p = 5.09×10^-6 vs. SNP p = 9.88×10^-3).
When conditioning on this locally most significant SNP, the haplotype effect stayed almost unchanged (adjusted OR = 0.82; 95% CI = 0.74–0.91) and the haplotype remained the most significant one, although the adjusted haplotype-specific association p-value was less significant than the unadjusted one (unadjusted haplotype p = 5.09×10^-6 vs. adjusted haplotype p = 1.36×10^-4). On chromosome 4, a 2-SNP haplotype AG (position: 122340944-122346258; frequency = 0.64) was close to two orders of magnitude more significant than its best individual SNP, rs13116936 (3.37×10^-7 vs. 1.09×10^-5) (Figure 2B), and the unadjusted haplotype-specific effect was among the most significant across the top 10 independent regions. After adjusting for the best SNP, the haplotype effect remained significant at p = 7.54×10^-4. A potentially interesting finding was on chromosome 18 (Figure 2C), where a much rarer 6-SNP haplotype AACGTT (position: 35670316-35684521; frequency = 0.03) became more significant after adjustment, with an adjusted p-value of 2.42×10^-5 in contrast to the unadjusted p-value of 6.96×10^-5. The haplotype-specific effect did not change meaningfully after adjustment for the best SNP (unadjusted OR = 1.72, 95% CI = 1.32-2.25; adjusted OR = 1.79, 95% CI = 1.36-2.34). A carrier of one copy of this haplotype had 1.79 times the breast cancer risk of women who did not carry it, a much stronger effect than that of the best SNP rs47995220 alone (OR = 1.23; 95% CI = 1.11-1.45). These three novel haplotypes found on chromosomes 1, 4 and 18 were further checked against the imputed SNPs based on 1000 Genomes Project data within the same chromosomal regions. None of the aforementioned novel haplotype-specific associations could have been revealed by imputed SNPs (Figure 3A–C).
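
As a rough consistency check on the chromosome 18 figures quoted above, an odds ratio and its 95% confidence interval can be converted back into a Wald z-statistic and a two-sided p-value. The sketch assumes the interval is a symmetric Wald interval on the log-odds scale with a normal critical value of about 1.96; the excerpt does not say how the published p-values were computed, and the published values are rounded, so only approximate agreement should be expected.

```python
import math
from typing import Tuple


def wald_from_or_ci(
    odds_ratio: float, lower: float, upper: float, z_crit: float = 1.959964
) -> Tuple[float, float, float]:
    """Recover (standard error, z, two-sided p) from an OR and its 95% CI.

    Assumes the interval is exp(log(OR) +/- z_crit * SE).
    """
    if min(odds_ratio, lower, upper) <= 0 or lower >= upper:
        raise ValueError("need 0 < lower < upper and a positive odds ratio")
    log_or = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_or / se
    # Two-sided tail probability of a standard normal variate.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Chromosome 18 haplotype AACGTT, values as quoted in the text.
estimates = {
    "unadjusted": (1.72, 1.32, 2.25),
    "adjusted for best SNP": (1.79, 1.36, 2.34),
}

for label, (or_, lo, hi) in estimates.items():
    se, z, p = wald_from_or_ci(or_, lo, hi)
    print(f"{label}: SE = {se:.3f}, z = {z:.2f}, p = {p:.2e}")
```

Under these assumptions the adjusted estimate yields the larger z-statistic, in line with the text's observation that the haplotype became more significant after adjustment.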
As shown in the Manhattan plots contrasting the haplotype effects with those of the imputed SNPs, the most significant haplotypes were independent of the neighboring clusters of imputed SNPs; no adjacent SNPs achieved significance comparable to that of the top haplotypes. Nor were these novel haplotypes confounded by local ancestry inferred from neighboring SNPs (Table S5). The test statistics stayed largely unchanged after further adjusting for local ancestry in addition to global ancestry, a finer correction for population admixture. Among the remainder of the top 10 independent regions with haplotype global test p-values less than 1.51×10^-5, the significance levels of the top individual haplotypes and SNPs were very close for chromosomes 3, 5 and 10, implying that the noticeable haplotype effects shown on the Manhattan plots can be mostly credited to the genotyped SNPs (Figures S3 A–C). On the rest of the chromosomes, the top SNPs were more significant than any inferred haplotypes, so the haplotypes contributed no more information to genetic association tests in those regions than the SNPs themselves.
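
The "orders of magnitude" comparisons quoted in this passage for chromosomes 1 and 4 are differences on the -log10(p) scale used by Manhattan plots. The short sketch below recomputes them from the p-values given in the text; the function name and dictionary labels are illustrative only.

```python
import math


def orders_of_magnitude(p_snp: float, p_haplotype: float) -> float:
    """How many powers of ten the haplotype p-value lies below the SNP p-value."""
    if not (0 < p_snp <= 1 and 0 < p_haplotype <= 1):
        raise ValueError("p-values must lie in (0, 1]")
    return math.log10(p_snp / p_haplotype)


# (best single-SNP p-value, haplotype p-value), as quoted in the text.
comparisons = {
    "chr1 AGCTG vs rs12711517": (9.88e-3, 5.09e-6),
    "chr4 AG vs rs13116936": (1.09e-5, 3.37e-7),
}

gains = {
    label: orders_of_magnitude(p_snp, p_hap)
    for label, (p_snp, p_hap) in comparisons.items()
}

for label, gain in gains.items():
    print(f"{label}: haplotype is {gain:.2f} orders of magnitude more significant")
```

A positive value means the haplotype outperforms its best constituent SNP; a value near zero or below corresponds to regions where the signal is credited to the genotyped SNPs.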

