Imputation without doing imputation: a new method for the detection of non-genotyped causal variants.

Howey R, Cordell HJ - Genet. Epidemiol. (2014)

Bottom Line: This observation motivates popular but computationally intensive approaches based on imputation or haplotyping. These two SNPs are used as predictors in linear or logistic regression analysis to generate a final significance test. Previous analysis showed that fine-scale sequencing of a Gambian reference panel in the region of the known causal locus, followed by imputation, increased the signal of association to genome-wide significance levels.

Affiliation: Institute of Genetic Medicine, Newcastle University, International Centre for Life, Central Parkway, Newcastle upon Tyne, United Kingdom.

Figure 5: Manhattan plots on chromosome 11 for the case/control severe malaria dataset from The Gambia. The top plot shows single-SNP logistic regression P-values obtained from analysis in PLINK. The middle plot shows AI test P-values. The lower plot shows a close-up of the AI test P-values around the causal SNP. The horizontal dashed line shows 10^-8 (a common threshold indicating genome-wide significance). The vertical dashed line indicates the position of the causal SNP, rs334. Light gray points show SNPs with logistic regression P-value below a threshold of 10^-7, which were presumably removed (by eye, following cluster plot checks) in the analysis by Jallow et al. [2009].

Mentions: Figure 5 (top) shows a Manhattan plot of the P-values for chromosome 11 in the Gambian case/control dataset using single-SNP logistic regression implemented in PLINK. To adjust for population stratification, we first performed principal component analysis (PCA) using the smartpca routine of the EIGENSOFT package [Price et al., 2006] on 139,445 autosomal SNPs (pruned to low levels of LD with one another using the PLINK command “--indep 50 5 2”). The first three principal components from smartpca were then included as explanatory variables in the logistic regression models fitted by PLINK. (Note that Jallow et al. [2009] also included the first three principal components to correct for population stratification, within a standard logistic regression framework.) Colored in light gray in Figure 5 (top) are points that appear to be spurious associations and do not appear in the corresponding Manhattan plot shown in Jallow et al. [2009]; we presume that these correspond to untrustworthy, poorly clustering SNPs that were removed “by eye.” A weak association signal (at rs11036635) can be seen in the vicinity of the known causal SNP, rs334, whose position is shown as a dashed vertical line.
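
The workflow described above (LD pruning, PCA, then logistic regression with the first three principal components as covariates) could be driven by PLINK calls along the following lines. This is only a sketch under stated assumptions, not the authors' actual commands: the fileset prefix `gambia`, the covariate file `pcs.txt` (assumed to hold FID, IID and three principal-component columns derived from the smartpca output), and all output names are hypothetical. Only the pruning parameters `--indep 50 5 2` come from the text. The smartpca step itself is not shown, and the script builds and prints the command lines without running them.

```python
import shlex

# Hypothetical names; none of these appear in the paper.
PLINK = "plink"    # assumed PLINK 1.x executable on PATH
DATA = "gambia"    # assumed binary fileset prefix (.bed/.bim/.fam)
PCS = "pcs.txt"    # assumed covariate file: FID IID PC1 PC2 PC3


def prune_cmd(data=DATA, out="pruned"):
    """LD pruning with the parameters quoted in the text:
    window of 50 SNPs, step of 5 SNPs, VIF threshold of 2."""
    return [PLINK, "--bfile", data, "--indep", "50", "5", "2", "--out", out]


def extract_cmd(data=DATA, pruned="pruned", out="gambia_pruned"):
    """Keep only the SNPs listed in <pruned>.prune.in, as input for PCA."""
    return [PLINK, "--bfile", data, "--extract", pruned + ".prune.in",
            "--make-bed", "--out", out]


def logistic_cmd(data=DATA, covar=PCS, chrom=11, out="chr11_logistic"):
    """Single-SNP logistic regression on one chromosome, using the first
    three columns of the covariate file as explanatory variables."""
    return [PLINK, "--bfile", data, "--chr", str(chrom), "--logistic",
            "--covar", covar, "--covar-number", "1-3", "--out", out]


def pipeline():
    """Commands in order; PCA on the pruned fileset would run between
    the second and third steps."""
    return [prune_cmd(), extract_cmd(), logistic_cmd()]


if __name__ == "__main__":
    for cmd in pipeline():
        print(shlex.join(cmd))
```

Keeping each command as an argument list makes it straightforward to pass to `subprocess.run` once the file names have been adapted to a real dataset.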

