GPA: a statistical approach to prioritizing GWAS results by integrating pleiotropy and annotation.

Chung D, Yang C, Li C, Gelernter J, Zhao H - PLoS Genet. (2014)

Bottom Line: Results from Genome-Wide Association Studies (GWAS) have shown that complex diseases are often affected by many genetic variants with small or moderate effects. Using our hypothesis testing framework, statistically significant pleiotropic effects were detected among these psychiatric disorders, and the markers annotated in the central nervous system genes and eQTLs from the Genotype-Tissue Expression (GTEx) database were significantly enriched. GPA was able to detect cell lines that are biologically more relevant to bladder cancer.


Affiliation: Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, United States of America; Department of Public Health Sciences, Medical University of South Carolina, Charleston, South Carolina, United States of America.

ABSTRACT
Results from Genome-Wide Association Studies (GWAS) have shown that complex diseases are often affected by many genetic variants with small or moderate effects. Identifying these risk variants remains a very challenging problem. More powerful statistical methods are needed that leverage available information to improve upon traditional approaches, which focus on a single GWAS dataset without incorporating additional data. In this paper, we propose a novel statistical approach, GPA (Genetic analysis incorporating Pleiotropy and Annotation), to increase statistical power to identify risk variants through joint analysis of multiple GWAS datasets and annotation information because: (1) accumulating evidence suggests that different complex diseases share common risk bases, i.e., pleiotropy; and (2) functionally annotated variants have been consistently demonstrated to be enriched among GWAS hits. GPA can integrate multiple GWAS datasets and functional annotations to seek association signals, and it can also test for the presence of pleiotropy and for enrichment of functional annotation. Statistical inference of the model parameters and SNP ranking is achieved through an EM algorithm that can handle genome-wide markers efficiently. When we applied GPA to jointly analyze five psychiatric disorders with annotation information, not only did GPA identify many weak signals missed by the traditional single-phenotype analysis, but it also revealed relationships in the genetic architecture of these disorders. Using our hypothesis testing framework, statistically significant pleiotropic effects were detected among these psychiatric disorders, and the markers annotated in the central nervous system genes and eQTLs from the Genotype-Tissue Expression (GTEx) database were significantly enriched. We also applied GPA to a bladder cancer GWAS dataset with the ENCODE DNase-seq data from 125 cell lines. GPA was able to detect cell lines that are biologically more relevant to bladder cancer. The R implementation of GPA is currently available at http://dongjunchung.github.io/GPA/.
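
To make the EM-based inference and SNP ranking more concrete, the following is a minimal Python sketch of a two-group mixture fitted to the p-values of a single GWAS, without pleiotropy or annotation. It is an illustration under stated assumptions, not the GPA R package: it assumes null p-values are Uniform(0, 1) and non-null p-values follow Beta(alpha, 1) with 0 < alpha < 1, and the function name `fit_two_group_em`, the starting values, and the simulated data are all hypothetical.

```python
import numpy as np


def fit_two_group_em(pvalues, n_iter=500, tol=1e-8):
    """Fit pi0 * Uniform(0,1) + pi1 * Beta(alpha,1) to p-values by EM.

    Returns (pi1, alpha, local_fdr), where local_fdr[j] is the posterior
    probability that SNP j is null given its p-value.
    Assumed model for illustration only; not the GPA package API.
    """
    p = np.clip(np.asarray(pvalues, dtype=float), 1e-300, 1.0)
    log_p = np.log(p)
    pi1, alpha = 0.1, 0.5  # arbitrary starting values

    def posterior_nonnull(pi1, alpha):
        # Beta(alpha, 1) density is alpha * p**(alpha - 1); the null density is 1.
        nonnull = pi1 * alpha * np.exp((alpha - 1.0) * log_p)
        return nonnull / (nonnull + (1.0 - pi1))

    for _ in range(n_iter):
        # E-step: posterior probability that each SNP is non-null.
        z = posterior_nonnull(pi1, alpha)
        # M-step: closed-form weighted maximum-likelihood updates.
        new_pi1 = float(z.mean())
        new_alpha = float(-z.sum() / np.sum(z * log_p))
        # Keep alpha inside (0, 1) so small p-values favour the non-null group.
        new_alpha = float(np.clip(new_alpha, 1e-3, 0.999))
        converged = abs(new_pi1 - pi1) + abs(new_alpha - alpha) < tol
        pi1, alpha = new_pi1, new_alpha
        if converged:
            break

    local_fdr = 1.0 - posterior_nonnull(pi1, alpha)
    return pi1, alpha, local_fdr


# Simulated example: 20% of 10,000 SNPs are non-null with alpha = 0.2.
rng = np.random.default_rng(0)
is_nonnull = rng.random(10_000) < 0.2
u = rng.random(10_000)
pvals = np.where(is_nonnull, u ** (1.0 / 0.2), u)  # U**(1/alpha) ~ Beta(alpha, 1)

pi1_hat, alpha_hat, lfdr = fit_two_group_em(pvals)
print("estimated pi1:", pi1_hat, "estimated alpha:", alpha_hat)
print("SNPs with local fdr <= 0.05:", int(np.sum(lfdr <= 0.05)))
```

Ranking SNPs by `lfdr` in ascending order gives a prioritized list. Extending this sketch to several GWAS datasets and to annotation data would require modelling the joint association status across phenotypes, which is beyond this single-dataset illustration.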

pgen-1004787-g006: Manhattan plots of BPD and SCZ. Top left panel: separate analysis without annotation. Top right panel: separate analysis with CNS annotation. Bottom left panel: joint analysis without annotation. Bottom right panel: joint analysis with CNS annotation. The red and blue lines indicate local fdr = 0.05 and 0.1, respectively.

Mentions: We also compared the results of four analysis approaches: single-GWAS analysis with or without annotation, and two-GWAS joint analysis with or without annotation. The Manhattan plots are shown in Figures 6 and 7. For single-GWAS analysis without annotation, GPA identified 13 and 391 SNPs for BPD and SCZ, respectively, under local false discovery rate (fdr) control. Using the CNS set as annotation, GPA identified 14 and 409 SNPs for BPD and SCZ, respectively, with the same fdr control. For joint analysis without annotation, the number of identified SNPs increased to 383 and 821 for BPD and SCZ, respectively. Using the CNS set as annotation, the number of identified SNPs further increased to 385 and 837 for BPD and SCZ, respectively. We investigated the BPD results in detail to evaluate the power of GPA to identify functionally important SNPs. In the single-GWAS analysis of BPD, GPA identified SNPs located in the ANK3 gene. With annotation data, GPA also identified the CACNA1C gene, which encodes an alpha-1 subunit of a voltage-dependent calcium channel. After incorporating pleiotropy information between SCZ and BPD, GPA identified additional functionally relevant genes, such as PBRM1, C6orf136, DPCR1, and SYNE1. For instance, SYNE1 encodes synaptic nuclear envelope protein 1 (Syne-1), which is found in many tissues and is especially critical in the brain. The Syne-1 protein is expressed in Purkinje cells, which are located in the cerebellum and are involved in signaling between neurons. Mutations in the SYNE1 gene have been found to cause autosomal recessive cerebellar ataxia type 1 (ARCA1), and SYNE1 has recently been implicated as a susceptibility gene for BPD in a large collaborative GWAS study [35]. These results indicate that the statistical power to identify associated SNPs increased when pleiotropy and functional annotation were used; in this real-data example, pleiotropy played a more important role than functional annotation.