GPA: a statistical approach to prioritizing GWAS results by integrating pleiotropy and annotation.

Chung D, Yang C, Li C, Gelernter J, Zhao H - PLoS Genet. (2014)

Bottom Line: Results from Genome-Wide Association Studies (GWAS) have shown that complex diseases are often affected by many genetic variants with small or moderate effects. Using our hypothesis testing framework, statistically significant pleiotropic effects were detected among these psychiatric disorders, and the markers annotated in the central nervous system genes and eQTLs from the Genotype-Tissue Expression (GTEx) database were significantly enriched. GPA was able to detect cell lines that are biologically more relevant to bladder cancer.

Affiliation: Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, United States of America; Department of Public Health Sciences, Medical University of South Carolina, Charleston, South Carolina, United States of America.

ABSTRACT
Results from Genome-Wide Association Studies (GWAS) have shown that complex diseases are often affected by many genetic variants with small or moderate effects. Identification of these risk variants remains a very challenging problem. There is a need to develop more powerful statistical methods that leverage available information to improve upon traditional approaches, which focus on a single GWAS dataset without incorporating additional data. In this paper, we propose a novel statistical approach, GPA (Genetic analysis incorporating Pleiotropy and Annotation), to increase statistical power to identify risk variants through joint analysis of multiple GWAS data sets and annotation information because: (1) accumulating evidence suggests that different complex diseases share common risk bases, i.e., pleiotropy; and (2) functionally annotated variants have been consistently demonstrated to be enriched among GWAS hits. GPA can integrate multiple GWAS datasets and functional annotations to seek association signals, and it can also perform hypothesis testing for the presence of pleiotropy and enrichment of functional annotation. Statistical inference of the model parameters and SNP ranking is achieved through an EM algorithm that can handle genome-wide markers efficiently. When we applied GPA to jointly analyze five psychiatric disorders with annotation information, not only did GPA identify many weak signals missed by the traditional single phenotype analysis, but it also revealed relationships in the genetic architecture of these disorders. Using our hypothesis testing framework, statistically significant pleiotropic effects were detected among these psychiatric disorders, and the markers annotated in the central nervous system genes and eQTLs from the Genotype-Tissue Expression (GTEx) database were significantly enriched. We also applied GPA to a bladder cancer GWAS data set with the ENCODE DNase-seq data from 125 cell lines. GPA was able to detect cell lines that are biologically more relevant to bladder cancer. The R implementation of GPA is currently available at http://dongjunchung.github.io/GPA/.
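
To give a feel for what EM-based inference and SNP ranking can look like, the sketch below fits a generic two-group mixture to a vector of GWAS p-values. It is not the GPA implementation and does not model pleiotropy or annotation. It assumes, purely for illustration, that null p-values are Uniform(0, 1) and risk-SNP p-values follow a Beta(alpha, 1) distribution; the function name `fit_two_group_em` and the parameters `pi1` and `alpha` are hypothetical.

```python
import math
import random


def fit_two_group_em(p_values, n_iter=300, pi1=0.5, alpha=0.5):
    """Fit pi0 * Uniform(0, 1) + pi1 * Beta(alpha, 1) to p-values by EM.

    Illustrative assumption only: this two-group model is a stand-in,
    not the model or code of the GPA package.
    Returns (pi1, alpha, posterior probability of being a risk SNP).
    """
    # Clamp to avoid log(0) for p-values that underflow to zero.
    log_p = [math.log(max(p, 1e-300)) for p in p_values]
    post = []
    for _ in range(n_iter):
        # E-step: posterior probability that each SNP is in the risk group.
        post = []
        for lp in log_p:
            f1 = pi1 * alpha * math.exp((alpha - 1.0) * lp)
            post.append(f1 / (f1 + (1.0 - pi1)))
        # M-step: closed-form updates for the mixing weight and alpha.
        total = sum(post)
        pi1 = total / len(post)
        alpha = -total / sum(z * lp for z, lp in zip(post, log_p))
    # Note: post comes from the final E-step, i.e. it was computed with
    # the parameter values in effect just before the last M-step.
    return pi1, alpha, post


# Usage on simulated data: 20% risk SNPs with alpha = 0.2 (hypothetical values).
rng = random.Random(0)
simulated = [
    rng.random() ** 5 if rng.random() < 0.2 else rng.random()
    for _ in range(10000)
]
pi1_hat, alpha_hat, posterior = fit_two_group_em(simulated)

# Rank SNPs by posterior probability of association, highest first.
ranking = sorted(range(len(simulated)), key=lambda j: -posterior[j])
print(f"pi1 = {pi1_hat:.3f}, alpha = {alpha_hat:.3f}, top SNP index = {ranking[0]}")
```

Each iteration is linear in the number of SNPs, which is one reason an EM scheme of this kind scales to genome-wide marker sets.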

Figure 4. Comparison between GPA and GSEA with the number of risk SNPs set to 1000. One parameter was fixed and a second was varied to evaluate the power for sample sizes of 2000 (upper left panel), 5000 (upper right panel), and 10000 (lower left panel). Type I error is shown in the lower right panel. The results are based on 500 simulations.

Mentions: Next, we evaluated the type I error and power of GPA for hypothesis testing on the significance of annotation enrichment for risk SNPs. Gene Set Enrichment Analysis (GSEA) [34] is a popular method for a similar task. Although GSEA is typically used for gene expression data analysis, its input can be a list of p-values obtained from any source. We therefore implemented the GSEA method to test the enrichment of the p-values of the annotated SNPs and compared it with GPA. Following the previous simulation scheme, we simulated one GWAS data set, with the sample size varying from 2000 to 10000 and the number of risk SNPs varying from 500 to 2000. One parameter was fixed at 0.1 and a second was varied from 0.1 to 0.5. We set the statistical significance level at 0.05. Type I error rate and power were evaluated at separate parameter settings. The results for 1000 risk SNPs are shown in Figure 4. In general, GPA provided much higher power than GSEA, while both methods appropriately controlled the type I error rate.
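
The logic of such a type I error and power study can be sketched with a small Monte Carlo experiment. The code below is neither GPA nor GSEA: a one-sided Wilcoxon rank-sum test (normal approximation) stands in as the enrichment test, and p-values are drawn directly rather than from simulated genotypes. It assumes that the fixed and varied quantities are the annotation probabilities of null and risk SNPs (`q_null` and `q_risk`, hypothetical names) and that risk-SNP p-values follow a Beta(alpha, 1) distribution. The SNP counts and replicate number are chosen for speed, not to match the paper.

```python
import math
import random


def simulate_gwas(n_snps, n_risk, q_null, q_risk, alpha, rng):
    """Simulate p-values and a binary annotation for one data set.

    Assumptions (not from the paper): null p-values are Uniform(0, 1),
    risk-SNP p-values are Beta(alpha, 1), drawn as u ** (1 / alpha).
    """
    p_values, annotated = [], []
    for j in range(n_snps):
        is_risk = j < n_risk
        u = rng.random()
        p_values.append(u ** (1.0 / alpha) if is_risk else u)
        annotated.append(rng.random() < (q_risk if is_risk else q_null))
    return p_values, annotated


def rank_sum_enrichment_pvalue(p_values, annotated):
    """One-sided rank-sum test: do annotated SNPs have smaller p-values?"""
    order = sorted(range(len(p_values)), key=p_values.__getitem__)
    n1 = sum(annotated)
    n2 = len(p_values) - n1
    if n1 == 0 or n2 == 0:
        return 1.0
    rank_sum = sum(r for r, j in enumerate(order, start=1) if annotated[j])
    mean = n1 * (n1 + n2 + 1) / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (rank_sum - mean) / sd
    # Lower-tail normal probability: a small rank sum signals enrichment.
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def rejection_rate(q_risk, q_null=0.1, n_reps=200, level=0.05, seed=1):
    """Fraction of simulated data sets in which enrichment is declared."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_reps):
        p_values, annotated = simulate_gwas(2000, 200, q_null, q_risk, 0.2, rng)
        if rank_sum_enrichment_pvalue(p_values, annotated) < level:
            rejections += 1
    return rejections / n_reps


# No enrichment (q_risk == q_null): the rejection rate estimates type I error.
type_one_error = rejection_rate(q_risk=0.1)
# Enrichment (q_risk > q_null): the rejection rate estimates power.
power = rejection_rate(q_risk=0.5)
print(f"type I error = {type_one_error:.3f}, power = {power:.3f}")
```

A test that controls the type I error should reject in roughly 5% of the no-enrichment replicates, while the rejection rate under enrichment measures how often the signal is detected.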

