Epistasis detection on quantitative phenotypes by exhaustive enumeration using GPUs.

Kam-Thong T, Pütz B, Karbalai N, Müller-Myhsok B, Borgwardt K - Bioinformatics (2011)

Bottom Line: The search for significant epistasis (gene-gene interactions) still poses a computational challenge for modern-day computing systems because of the large number of hypotheses that have to be tested. In this article, we present an approach to epistasis detection by exhaustive testing of all possible SNP pairs. The search is implemented on the highly parallel architecture of graphics processing units, which makes completing the full search feasible within a day.


Affiliation: Statistical Genetics, Max Planck Institute of Psychiatry, Munich, Germany. tony@mpipsykl.mpg.de

ABSTRACT

Motivation: In recent years, numerous genome-wide association studies have been conducted to identify the genetic makeup that explains phenotypic differences observed in human populations. Analytical tests on single loci are readily available and embedded in common genome analysis software toolsets. The search for significant epistasis (gene-gene interactions) still poses a computational challenge for modern-day computing systems because of the large number of hypotheses that have to be tested.

Results: In this article, we present an approach to epistasis detection by exhaustive testing of all possible SNP pairs. The search strategy, based on the Hilbert-Schmidt Independence Criterion (HSIC), can help delineate various forms of statistical dependence between the genetic markers and the phenotype. The search is implemented on the highly parallel architecture of graphics processing units, which makes completing the full search feasible within a day.

Availability: The program is available at http://www.mpipsykl.mpg.de/epigpuhsic/.

Contact: tony@mpipsykl.mpg.de.


Figure 1: −log10 linear regression P-values versus the HSIC for 50 SNPs (1225 pairs); r² = 0.9764.

Mentions: To validate the method, data are simulated with a normally distributed phenotype (mean = 0, standard deviation = 1) and SNP genotypes encoded in {0, 1, 2}. The simulation uses 10 000 individuals and 50 SNPs, giving 1225 unique SNP pairs. The SNPs are simulated in Hardy–Weinberg equilibrium (P=0.05). To test the significance of the interaction of a SNP pair with respect to the quantitative phenotype, a standard linear regression on the full-rank model including main effects is performed (ψ(y) = α + βx_A + γx_B + δx_Ax_B), and the significance of the coefficient δ is compared with the HSIC realization derived for quantitative phenotypes in Section 2.1. Only 1225 pairs are compared; this number is small and unrealistic, but it serves only to demonstrate the validity of the method. As illustrated in Figure 1, the HSIC is compared against the −log10 of the P-value obtained from the likelihood ratio test comparing the regression models without and with the interaction term. The r² is 0.9764, indicating that the HSIC is strongly correlated with the significance of the interaction term.
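The validation described above can be sketched in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' GPU implementation: the HSIC formula of Section 2.1 is not reproduced on this page, so the sketch substitutes the generic biased empirical HSIC with linear kernels on the product feature x_A·x_B and on the phenotype. The allele frequency `maf`, the random seed and all function names are hypothetical choices made for illustration, and the sketch is not expected to reproduce the reported r² of 0.9764.

```python
import math
from itertools import combinations

import numpy as np


def simulate(n_subjects=10_000, n_snps=50, maf=0.3, seed=0):
    """Simulate genotypes in {0, 1, 2} and a standard-normal phenotype.

    Drawing Binomial(2, maf) gives Hardy-Weinberg genotype proportions.
    The value of maf is an assumption made for this sketch.
    """
    rng = np.random.default_rng(seed)
    genotypes = rng.binomial(2, maf, size=(n_subjects, n_snps)).astype(float)
    phenotype = rng.standard_normal(n_subjects)
    return genotypes, phenotype


def interaction_lrt_pvalue(y, xa, xb):
    """Likelihood ratio test of y ~ 1 + xa + xb against the same model plus xa*xb."""
    n = y.size
    ones = np.ones(n)
    reduced = np.column_stack([ones, xa, xb])
    full = np.column_stack([ones, xa, xb, xa * xb])
    rss = []
    for design in (reduced, full):
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ beta
        rss.append(float(resid @ resid))
    # Gaussian LRT statistic: 2 * (loglik_full - loglik_reduced) = n * ln(RSS0 / RSS1)
    stat = max(n * math.log(rss[0] / rss[1]), 0.0)
    # Survival function of a chi-square variable with 1 degree of freedom
    return math.erfc(math.sqrt(stat / 2.0))


def hsic_linear(y, z):
    """Biased empirical HSIC, trace(K H L H) / (n - 1)^2, with linear kernels.

    With K = z z^T and L = y y^T the trace collapses to the squared
    inner product of the centred vectors, so no n x n matrix is needed.
    """
    n = y.size
    zc = z - z.mean()
    yc = y - y.mean()
    return float(zc @ yc) ** 2 / (n - 1) ** 2


def compare(genotypes, phenotype):
    """Return (-log10 P, HSIC) for every unordered SNP pair."""
    neg_log_p, hsic = [], []
    for a, b in combinations(range(genotypes.shape[1]), 2):
        xa, xb = genotypes[:, a], genotypes[:, b]
        p = interaction_lrt_pvalue(phenotype, xa, xb)
        neg_log_p.append(-math.log10(max(p, 1e-300)))  # guard against log10(0)
        hsic.append(hsic_linear(phenotype, xa * xb))
    return np.array(neg_log_p), np.array(hsic)


if __name__ == "__main__":
    genotypes, phenotype = simulate()
    neg_log_p, hsic = compare(genotypes, phenotype)
    r = np.corrcoef(neg_log_p, hsic)[0, 1]
    print(f"{len(hsic)} pairs, r^2 = {r ** 2:.4f}")
```

With 50 SNPs the loop visits 50 × 49 / 2 = 1225 pairs, matching the count in the text. The stand-in HSIC measures dependence between the phenotype and the raw product x_A·x_B without adjusting for main effects, so its agreement with the regression P-value may differ from the figure.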

