Gene-based multiple regression association testing for combined examination of common and low frequency variants in quantitative trait analysis.

Yoo YJ, Sun L, Bull SB - Front Genet (2013)

Bottom Line: We compared the tests using (1) the set of all SNPs in a gene, (2) the set of common SNPs in a gene (MAF ≥ 5%), and (3) the set of low frequency SNPs (1% ≤ MAF < 5%). For different trait models based on low frequency causal SNPs, we found that combined analysis using all SNPs, both common and low frequency, is a good and robust choice, whereas using common SNPs alone or low frequency SNPs alone can lose power. MLC tests performed well in combined analysis except where two low frequency causal SNPs with opposing effects are positively correlated.


Affiliation: Department of Mathematics Education, Seoul National University, Seoul, South Korea; Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, South Korea.

ABSTRACT
Multi-marker methods for genetic association analysis can be applied to common and low frequency SNPs to improve power. Regression models are an intuitive way to formulate multi-marker tests. In previous studies we evaluated regression-based multi-marker tests for common SNPs and, through identification of bins consisting of correlated SNPs, developed a multi-bin linear combination (MLC) test that is a compromise between a 1 df linear combination test and a multi-df global test. Bins of SNPs in high linkage disequilibrium (LD) are identified, and a linear combination of individual SNP statistics is constructed within each bin. Association with the phenotype is then represented by an overall statistic with as many df as there are bins. In this report we evaluate multi-marker tests for SNPs that occur at low frequencies. There are many linear and quadratic multi-marker tests that are suitable for common or low frequency variant analysis. We compared the performance of the MLC tests with various linear and quadratic statistics in joint or marginal regressions. For these comparisons, we performed a simulation study of genotypes and quantitative traits for 85 genes with many low frequency SNPs based on HapMap Phase III. We compared the tests using (1) the set of all SNPs in a gene, (2) the set of common SNPs in a gene (MAF ≥ 5%), and (3) the set of low frequency SNPs (1% ≤ MAF < 5%). For different trait models based on low frequency causal SNPs, we found that combined analysis using all SNPs, both common and low frequency, is a good and robust choice, whereas using common SNPs alone or low frequency SNPs alone can lose power. MLC tests performed well in combined analysis except where two low frequency causal SNPs with opposing effects are positively correlated.
Overall, across the different analysis sets, the joint regression Wald test showed consistently good performance, whereas other statistics, including those based on marginal regression, had lower power in some situations.
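
As a rough illustration of the binning idea described in the abstract, the Python sketch below fits a joint regression, computes the global Wald statistic, and then forms an equal-weight linear combination of the joint coefficients within each bin, giving a statistic with one df per bin. The function names, the simulated genotypes, the effect sizes, the bin labels, and the equal weights are all assumptions made for illustration; the paper's actual MLC weighting and binning algorithm are not given in this excerpt and may differ.

```python
import numpy as np


def joint_ols(G, y):
    """Joint regression of y on all SNP columns of G, with an intercept.

    Returns the SNP coefficient estimates and their covariance matrix.
    """
    n, k = G.shape
    X = np.column_stack([np.ones(n), G])
    XtX_inv = np.linalg.inv(X.T @ X)
    coef = XtX_inv @ X.T @ y
    resid = y - X @ coef
    sigma2 = resid @ resid / (n - k - 1)  # residual variance estimate
    cov = sigma2 * XtX_inv
    return coef[1:], cov[1:, 1:]  # drop the intercept


def wald_stat(beta, V):
    """Global Wald statistic beta' V^-1 beta; df = number of SNPs."""
    return float(beta @ np.linalg.solve(V, beta)), len(beta)


def binned_lc_stat(beta, V, bins):
    """Equal-weight sum of coefficients within each bin (an assumption,
    not the paper's weights); df = number of bins."""
    labels = sorted(set(bins))
    # J[i, j] = 1 if SNP i belongs to bin j, else 0
    J = np.array([[1.0 if b == lab else 0.0 for lab in labels] for b in bins])
    L = J.T @ beta          # one linear combination per bin
    VL = J.T @ V @ J        # covariance of the bin-level combinations
    return float(L @ np.linalg.solve(VL, L)), len(labels)


# Hypothetical data: one pair of common SNPs and one pair of low
# frequency SNPs, each pair correlated to mimic an LD bin.
rng = np.random.default_rng(0)
n = 500


def ld_pair(maf):
    g1 = rng.binomial(2, maf, n)
    fresh = rng.binomial(2, maf, n)
    g2 = np.where(rng.random(n) < 0.7, g1, fresh)  # partial copy of g1
    return g1, g2


c1, c2 = ld_pair(0.30)
r1, r2 = ld_pair(0.03)
G = np.column_stack([c1, c2, r1, r2]).astype(float)

# Illustrative trait: both low frequency SNPs causal, same direction.
y = 0.5 * G[:, 2] + 0.5 * G[:, 3] + rng.normal(size=n)

beta, V = joint_ols(G, y)
wald, wald_df = wald_stat(beta, V)
mlc, mlc_df = binned_lc_stat(beta, V, bins=[0, 0, 1, 1])
print(f"Wald: {wald:.2f} on {wald_df} df; binned LC: {mlc:.2f} on {mlc_df} df")
```

With every SNP in its own bin the binned statistic reduces to the Wald statistic, and with a single bin it becomes a 1 df linear combination test, which is the compromise the abstract describes.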


Figure 5: Power of gene-based tests using three analysis sets of SNPs for 85 genes under trait Model 3. Genes are ordered along the horizontal axis according to the empirical power of the Wald test using only low frequency SNPs.

Mentions: For Model 3, where the causal effects are opposing, the empirical power of the MLC tests, MinP-M, PC80, SSB, SSBw, and SKAT with analysis of all SNPs was substantially lower than that of the Wald test, whereas for the other trait models these tests were more powerful than the Wald test when the analysis included all SNPs (Figures 2, 5). The expected beta coefficient for the marginal association was low for both common and rare SNPs, which resulted in relatively low power for the tests based on marginal analysis (Table 3). The joint analysis captured the causal effect better than the marginal analysis under Model 3, but neither the MLC nor the LC tests performed well because the captured effects are opposing, as indicated by the sum of β being near zero. Model 3 is essentially a “worst case” for the MLC test construction because the opposing LF SNPs are positively correlated and are assigned to the same bin.
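
The cancellation described above can be sketched with a small calculation. Assuming a linear trait model with two causal SNPs and no other effects, the expected marginal regression coefficient of each SNP is its own effect plus the other SNP's effect scaled by their covariance. The MAF, correlation, and effect sizes below are hypothetical values chosen for illustration and are not taken from the paper's Model 3 or Table 3.

```python
def expected_marginal_betas(var1, var2, corr, beta1, beta2):
    """Expected marginal slopes when the trait is
    beta1 * g1 + beta2 * g2 + noise and the SNPs have correlation corr."""
    cov12 = corr * (var1 * var2) ** 0.5
    m1 = beta1 + beta2 * cov12 / var1
    m2 = beta2 + beta1 * cov12 / var2
    return m1, m2


maf = 0.03                   # hypothetical low frequency SNPs
var = 2 * maf * (1 - maf)    # genotype variance under Hardy-Weinberg
corr = 0.8                   # hypothetical positive correlation
b = 1.0                      # hypothetical effect size

# Opposing effects: marginal slopes shrink toward zero and the
# joint coefficients sum to zero.
m1_opp, m2_opp = expected_marginal_betas(var, var, corr, b, -b)
joint_sum_opp = b + (-b)

# Same-direction effects, shown for contrast: marginal slopes are inflated.
m1_same, m2_same = expected_marginal_betas(var, var, corr, b, b)
joint_sum_same = b + b

print(f"opposing: marginal = ({m1_opp:.2f}, {m2_opp:.2f}), joint sum = {joint_sum_opp:.2f}")
print(f"same direction: marginal = ({m1_same:.2f}, {m2_same:.2f}), joint sum = {joint_sum_same:.2f}")
```

Under these assumed values the opposing-effect marginal slopes are about 0.2 and -0.2 in place of the true 1 and -1, and a within-bin sum of the joint coefficients is zero, which is why both marginal tests and tests that sum coefficients within a bin would be expected to lose power in this setting.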

