Caution in interpreting results from imputation analysis when linkage disequilibrium extends over a large distance: a case study on venous thrombosis.

Germain M, Saut N, Oudot-Mellakh T, Letenneur L, Dupuy AM, Bertrand M, Alessi MC, Lambert JC, Zelenika D, Emmerich J, Tiret L, Cambien F, Lathrop M, Amouyel P, Morange PE, Trégouët DA - PLoS ONE (2012)

Bottom Line: A comprehensive linkage disequilibrium and haplotype analysis of the whole locus where twelve SNPs exhibited association p-values lower than 2.23 × 10⁻¹¹ and the use of independent case-control samples demonstrated that the culprit variant was a rare variant located ~1 Mb away from the original hits, not tagged by current genome-wide genotyping arrays and even not well imputed in the original GWAS samples. This variant was in fact rs1799963, also known as the FII G20210A prothrombin mutation. This work may be of major interest not only for its scientific impact but also for its methodological findings.


Affiliation: INSERM UMR_S 937, ICAN Institute, Université Pierre et Marie Curie, Paris, France.

ABSTRACT
By applying an imputation strategy based on the 1000 Genomes project to two genome-wide association studies (GWAS), we detected a susceptibility locus for venous thrombosis on chromosome 11p11.2 that was missed by previous GWAS analyses that had been conducted on the same datasets. A comprehensive linkage disequilibrium and haplotype analysis of the whole locus where twelve SNPs exhibited association p-values lower than 2.23 × 10⁻¹¹ and the use of independent case-control samples demonstrated that the culprit variant was a rare variant located ~1 Mb away from the original hits, not tagged by current genome-wide genotyping arrays and even not well imputed in the original GWAS samples. This variant was in fact rs1799963, also known as the FII G20210A prothrombin mutation. This work may be of major interest not only for its scientific impact but also for its methodological findings.


Figure 3 (pone-0038538-g003): Box-plot representation of the imputed dose at rs1799963 (FII G20210A) according to measured genotypes in a sample of 1,961 VT cases. The imputed dose in the 419 VT cases of the first GWAS is shown in blue, while results obtained in the 1,542 VT cases of the second GWAS are shown in red.

Mentions: Coming back to our imputation GWAS data, we observed that the rs1799963 variant was imputed with poor quality (r² = 0.12 and r² = 0.27 in the first and second GWAS, respectively) and thus did not pass quality control. Nevertheless, it showed suggestive association with VT, P = 0.053 and P = 0.110 in the first and second GWAS, respectively, for a combined statistical evidence of P = 0.020. A conditional logistic regression analysis was then conducted to estimate the effect of the imputed rs2856656 after adjusting for the imputed rs1799963. As indicated in Table 5, the imputed rs2856656-G allele was still associated with increased risk for VT, P = 7.22 × 10⁻⁴ and P = 1.13 × 10⁻¹¹ in the first and second GWAS, respectively. However, because the rs1799963 variant was typed in the GWAS patients as part of the inclusion/exclusion criteria (see Materials and Methods), we re-ran the conditional analyses using the true genotyped rs1799963 in cases rather than the imputed dose. The association of rs2856656 with VT was then no longer significant, P = 0.643 and P = 0.122 in the first and second GWAS, respectively (Table 5). To corroborate the poor quality of the imputation at rs1799963 mentioned above, we calculated the Spearman correlation between the imputed dose and the true genotype at rs1799963 in all cases for whom both were available (Figure 1). This correlation was only ρ = 0.36 and ρ = 0.48 in the 419 and 1,542 cases from the first and second GWAS, respectively. As shown in Figure 3, the rs1799963 genotype was poorly imputed in heterozygous individuals. Finally, a further haplotype analysis revealed that the rare rs1799963-A allele was mainly carried by the AGT at-risk haplotype discussed above (Table S3).
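The imputation-quality check described above (Spearman correlation between imputed dose and measured genotype) can be sketched in a few lines of Python. This is only an illustration and not the authors' code: the function names `average_ranks`, `pearson` and `spearman` are hypothetical, and the ten toy values are invented for the example rather than taken from the study.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented toy data: measured genotype (count of the rare A allele) and the
# imputed dose for the same ten individuals. Heterozygotes (genotype 1)
# receive doses close to 0, mimicking a poorly imputed rare variant.
true_genotype = [0, 0, 0, 0, 1, 1, 0, 0, 1, 0]
imputed_dose = [0.02, 0.01, 0.05, 0.03, 0.10, 0.04, 0.02, 0.06, 0.30, 0.01]

rho = spearman(true_genotype, imputed_dose)
print(f"Spearman rho = {rho:.2f}")
```

A rank-based correlation suits this comparison because the measured genotype takes only the values 0, 1 or 2, while the imputed dose is continuous and, for a rare allele, heavily skewed toward 0.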
Of note, an LD analysis of the chromosome 11 region from 46,600,000 to 48,000,000 bp, containing 119 SNPs (Figure 4), showed that the rs1799963 variant was not in strong pairwise LD with any of the common SNPs; the highest observed r² was 0.15, with AGBL2 rs7930612.
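A small worked example helps show why a rare allele can sit almost entirely on one common haplotype and still have a low pairwise r² with the common SNPs on it. The sketch below uses the standard definition r² = D² / (pA(1 − pA) pB(1 − pB)) with D = pAB − pA·pB. The function name `ld_r2` and the haplotype counts are assumptions made up for illustration, not data from the study.

```python
def ld_r2(haplotypes):
    """Pairwise LD r^2 from phased haplotypes.

    haplotypes: list of (a, b) tuples, each holding the 0/1 alleles carried
    at two loci on the same chromosome.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom


# Invented example with 100 haplotypes: the rare allele (a = 1) has a
# frequency of 2%, the common allele (b = 1) has a frequency of 30%, and
# every copy of the rare allele lies on the common-allele background.
haps = [(1, 1)] * 2 + [(0, 1)] * 28 + [(0, 0)] * 70

r2 = ld_r2(haps)
print(f"r^2 = {r2:.3f}")
```

In this toy setting the rare allele is fully nested within the common haplotype, yet r² stays below 0.05 because the allele frequencies differ so much. This is the situation in which a rare variant is neither tagged by common array SNPs nor reliably imputed from them.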

