Discovering Genetic Interactions in Large-Scale Association Studies by Stage-wise Likelihood Ratio Tests.

Frånberg M, Gertow K, Hamsten A, PROCARDIS consortium, Lagergren J, Sennblad B - PLoS Genet. (2015)

Bottom Line: This sequential testing provides an efficient way to reduce the number of non-interacting variant pairs before the final interaction test. Our new methodology facilitates identification of new disease-relevant interactions from existing and future genome-wide association data, which may involve genes with previously unknown association to the disease. Moreover, it enables construction of interaction networks that provide a systems biology view of complex diseases, serving as a basis for more comprehensive understanding of disease pathophysiology and its clinical consequences.


Affiliation: Atherosclerosis Research Unit, Department of Medicine, Solna, Karolinska Institutet, Stockholm, Sweden; Department of Numerical Analysis and Computer Science, Stockholm University, Stockholm, Sweden; Science for Life Laboratory, Stockholm, Sweden.

ABSTRACT
Despite the success of genome-wide association studies in medical genetics, the underlying genetics of many complex diseases remains enigmatic. One plausible reason for this could be the failure to account for the presence of genetic interactions in current analyses. Exhaustive investigations of interactions are typically infeasible because the vast number of possible interactions imposes hard statistical and computational challenges. There is, therefore, a need for computationally efficient methods that build on models appropriately capturing interaction. We introduce a new methodology where we augment the interaction hypothesis with a set of simpler hypotheses that are tested, in order of their complexity, against a saturated alternative hypothesis representing interaction. This sequential testing provides an efficient way to reduce the number of non-interacting variant pairs before the final interaction test. We devise two different methods, one that relies on a priori estimated numbers of marginally associated variants to correct for multiple tests, and a second that does this adaptively. We show that our methodology in general has improved statistical power in comparison to seven other methods, and, using the idea of closed testing, that it controls the family-wise error rate. We apply our methodology to genetic data from the PROCARDIS coronary artery disease case/control cohort and discover three distinct interactions. While analyses on simulated data suggest that the statistical power may suffice for an exhaustive search of all variant pairs in ideal cases, we explore strategies for a priori selecting subsets of variant pairs to test. Our new methodology facilitates identification of new disease-relevant interactions from existing and future genome-wide association data, which may involve genes with previously unknown association to the disease. Moreover, it enables construction of interaction networks that provide a systems biology view of complex diseases, serving as a basis for more comprehensive understanding of disease pathophysiology and its clinical consequences.
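
The sequential idea in the abstract can be sketched for a single variant pair as follows. This is a minimal illustration and not the authors' implementation. It assumes a 3x3 genotype table of case and control counts, and it uses only those simpler hypotheses that have closed-form maximum likelihood fits: no association, and association through one variant only. The function names, the fixed per-stage level `alpha`, and the omission of both the final no-interaction stage and the paper's multiple-testing corrections are assumptions made to keep the example short.

```python
import math

def binom_loglik(cases, controls, p):
    """Binomial log-likelihood (without constants) of one genotype cell."""
    ll = 0.0
    if cases:
        ll += cases * math.log(p)
    if controls:
        ll += controls * math.log(1.0 - p)
    return ll

def grouped_loglik(cases, controls, group_of):
    """Maximised log-likelihood when cells with the same group label share one risk."""
    tot_cases, tot_n = {}, {}
    for i in range(3):
        for j in range(3):
            g = group_of(i, j)
            tot_cases[g] = tot_cases.get(g, 0) + cases[i][j]
            tot_n[g] = tot_n.get(g, 0) + cases[i][j] + controls[i][j]
    ll = 0.0
    for i in range(3):
        for j in range(3):
            g = group_of(i, j)
            if tot_n[g]:
                ll += binom_loglik(cases[i][j], controls[i][j], tot_cases[g] / tot_n[g])
    return ll

def chi2_sf_even(x, df):
    """Chi-square upper tail; this closed form holds for even df only."""
    half = max(x, 0.0) / 2.0
    term = total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

def stagewise(cases, controls, alpha=0.05):
    """Return the first simple hypothesis that is not rejected, or None."""
    # Saturated alternative: every genotype combination has its own risk.
    ll_sat = grouped_loglik(cases, controls, lambda i, j: (i, j))
    stages = [  # (label, grouping of cells, degrees of freedom vs. saturated)
        ("no association", lambda i, j: 0, 8),
        ("variant 1 only", lambda i, j: i, 6),
        ("variant 2 only", lambda i, j: j, 6),
    ]
    for name, group_of, df in stages:
        stat = 2.0 * (ll_sat - grouped_loglik(cases, controls, group_of))
        if chi2_sf_even(stat, df) > alpha:
            return name  # a simpler hypothesis suffices; the pair is dropped
    return None  # all simple hypotheses rejected; the pair advances

if __name__ == "__main__":
    # Hypothetical counts: risk is raised only when both variants carry a minor allele.
    cases = [[20, 20, 20], [20, 80, 80], [20, 80, 80]]
    controls = [[100 - c for c in row] for row in cases]
    print(stagewise(cases, controls))
```

A pair for which `stagewise` returns `None` is one that none of the simple explanations can account for; in this sketch it would be passed on to a final interaction test, which is where most of the reduction in the number of tests would already have taken place.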


pgen.1005502.g002: The power of the first and last test under a double dominant interaction model. The x-axis is the heritability of the model. The y-axis is the statistical power. The colored lines correspond to two different tests: the one performed in the first stage that tests the hypothesis of no interaction, H1 (red), and the one performed in the last stage that specifically tests the interaction parameters, H2 (blue). The logit link function and a nominal significance level of 0.05 were used for the analysis.

Mentions: The intuitive idea behind the stage-wise methodology is that we aim to (1) reduce the number of tests in later stages compared to earlier ones, while (2) ensuring that actual interactions advance to later stages. We show in the Results section Analysis of biological data, below, that the number of tests in the last stage is in fact substantially reduced, suggesting that aim (1) is unlikely to be a problem. Here, we have investigated aim (2) by comparing the power of the tests in the first and last stage. That is, for data generated from HA, we compare the power of the likelihood ratio test of H1 against HA to that of the test of H4 against HA. Indeed, the results in Fig 2 (using data generated from a double dominant interaction model) suggest that the test in the first stage, at least under these conditions, has substantially greater power than that in the last stage. However, the test in the first stage obviously cannot be used as a test for interaction by itself, since it measures any kind of association to the phenotype, including, for example, pairs for which only one of the variants is associated.
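
The power comparison described above can be illustrated with a small simulation. The sketch below is not the authors' code and does not reproduce Fig 2. It assumes a cohort-style sample rather than a case/control design, a minor allele frequency of 0.5 for both variants, and a log-odds `effect` in place of heritability. It also takes H1 to be the hypothesis of no association and H4 to be a logit model with both marginal genotype effects but no interaction terms. All function names are hypothetical.

```python
import math
import random

def chi2_sf_even(x, df):
    """Chi-square upper tail; this closed form holds for even df only."""
    half = max(x, 0.0) / 2.0
    term = total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

def loglik(cases, totals, probs):
    """Binomial log-likelihood (without constants) over the nine genotype cells."""
    ll = 0.0
    for y, n, p in zip(cases, totals, probs):
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        ll += y * math.log(p) + (n - y) * math.log(1.0 - p)
    return ll

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

# Design rows for logit p_ij = mu + a_i + b_j, with genotype 0 as reference level.
DESIGN = [[1.0, float(i == 1), float(i == 2), float(j == 1), float(j == 2)]
          for i in range(3) for j in range(3)]

def risks(beta):
    """Cell risks implied by the additive logit coefficients."""
    return [1.0 / (1.0 + math.exp(-sum(b * x for b, x in zip(beta, row)))) for row in DESIGN]

def fit_additive(cases, totals, iters=25):
    """Newton-Raphson fit of the no-interaction logit model; returns fitted risks."""
    beta = [0.0] * 5
    for _ in range(iters):
        p = risks(beta)
        grad = [sum(r[k] * (y - n * q) for r, y, n, q in zip(DESIGN, cases, totals, p))
                for k in range(5)]
        info = [[sum(r[k] * r[l] * n * q * (1.0 - q) for r, n, q in zip(DESIGN, totals, p))
                 for l in range(5)] for k in range(5)]
        step = solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < 1e-10:
            break
    return risks(beta)

def simulate_table(n, maf, base_logit, effect, rng):
    """Draw n individuals; risk is raised only if both variants carry a minor allele."""
    cases, totals = [0] * 9, [0] * 9
    for _ in range(n):
        i = (rng.random() < maf) + (rng.random() < maf)
        j = (rng.random() < maf) + (rng.random() < maf)
        logit = base_logit + (effect if i >= 1 and j >= 1 else 0.0)
        cell = 3 * i + j
        totals[cell] += 1
        cases[cell] += rng.random() < 1.0 / (1.0 + math.exp(-logit))
    return cases, totals

def estimate_power(effect, n=1000, reps=100, alpha=0.05, seed=1):
    """Rejection rates of the first-stage (8 df) and last-stage (4 df) tests."""
    rng = random.Random(seed)
    first = last = 0
    for _ in range(reps):
        cases, totals = simulate_table(n, 0.5, 0.0, effect, rng)
        ll_sat = loglik(cases, totals, [y / t if t else 0.5 for y, t in zip(cases, totals)])
        ll_null = loglik(cases, totals, [sum(cases) / sum(totals)] * 9)
        ll_add = loglik(cases, totals, fit_additive(cases, totals))
        first += chi2_sf_even(2.0 * (ll_sat - ll_null), 8) < alpha
        last += chi2_sf_even(2.0 * (ll_sat - ll_add), 4) < alpha
    return first / reps, last / reps

if __name__ == "__main__":
    for effect in (0.0, 0.3, 0.6, 1.0):
        print(effect, estimate_power(effect))
```

Because a double dominant model also produces marginal effects, the first-stage statistic picks up signal that the additive fit absorbs before the last-stage test sees it. This is also why a first-stage rejection alone says nothing about interaction.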

