Pathways of distinction analysis: a new technique for multi-SNP analysis of GWAS data.

Braun R, Buetow K - PLoS Genet. (2011)

Bottom Line: Importantly, PoDA improves on existing single-SNP and SNP-set enrichment analyses, in that it does not require the SNPs in a pathway to exhibit independent main effects. The results obtained strongly suggest that there exist pathway-wide genomic differences that contribute to disease susceptibility. PoDA thus provides an analytical tool that is complementary to existing techniques and has the power to enrich our understanding of disease genomics at the systems level.


Affiliation: Laboratory of Population Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America.

ABSTRACT
Genome-wide association studies (GWAS) have become increasingly common due to advances in technology and have permitted the identification of differences in single nucleotide polymorphism (SNP) alleles that are associated with diseases. However, while typical GWAS analysis techniques treat markers individually, complex diseases (cancers, diabetes, and Alzheimer's, amongst others) are unlikely to have a single causative gene. Thus, there is a pressing need for multi-SNP analysis methods that can reveal system-level differences in cases and controls. Here, we present a novel multi-SNP GWAS analysis method called Pathways of Distinction Analysis (PoDA). The method uses GWAS data and known pathway-gene and gene-SNP associations to identify pathways that permit, ideally, the distinction of cases from controls. The technique is based upon the hypothesis that, if a pathway is related to disease risk, cases will appear more similar to other cases than to controls (or vice versa) for the SNPs associated with that pathway. By systematically applying the method to all pathways of potential interest, we can identify those for which the hypothesis holds true, i.e., pathways containing SNPs for which the samples exhibit greater within-class similarity than across classes. Importantly, PoDA improves on existing single-SNP and SNP-set enrichment analyses, in that it does not require the SNPs in a pathway to exhibit independent main effects. This permits PoDA to reveal pathways in which epistatic interactions drive risk. In this paper, we detail the PoDA method and apply it to two GWAS: one of breast cancer and the other of liver cancer. The results obtained strongly suggest that there exist pathway-wide genomic differences that contribute to disease susceptibility. PoDA thus provides an analytical tool that is complementary to existing techniques and has the power to enrich our understanding of disease genomics at the systems level.
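
The hypothesis stated in the abstract (cases look more like other cases than like controls across a pathway's SNPs) can be illustrated with a small sketch. This is not the statistic defined in the paper: the genotype coding (0/1/2 minor-allele counts), the Manhattan distance, the function names, and the toy genotypes are all assumptions chosen for illustration. Each sample is left out of its own class when it is scored, so that it is never compared with itself.

```python
from statistics import mean

def manhattan(a, b):
    """Sum of absolute genotype differences across the SNPs of one pathway."""
    return sum(abs(x - y) for x, y in zip(a, b))

def relative_closeness(sample, cases, controls):
    """Mean distance to controls minus mean distance to cases.

    Positive values mean the sample sits closer to the cases.
    Illustrative score only, not the PoDA statistic.
    """
    d_cases = mean(manhattan(sample, c) for c in cases)
    d_controls = mean(manhattan(sample, c) for c in controls)
    return d_controls - d_cases

def leave_one_out_scores(cases, controls):
    """Score every sample against both classes, excluding the sample itself."""
    case_scores = [
        relative_closeness(s, cases[:i] + cases[i + 1:], controls)
        for i, s in enumerate(cases)
    ]
    control_scores = [
        relative_closeness(s, cases, controls[:i] + controls[i + 1:])
        for i, s in enumerate(controls)
    ]
    return case_scores, control_scores

# Made-up genotypes at four SNPs (0, 1 or 2 copies of the minor allele).
cases = [(2, 2, 1, 0), (2, 1, 1, 0), (2, 2, 0, 0)]
controls = [(0, 0, 1, 2), (0, 1, 1, 2), (0, 0, 2, 2)]

case_scores, control_scores = leave_one_out_scores(cases, controls)
print(case_scores)
print(control_scores)
```

In this toy data the two classes are cleanly separated, so every case scores positive and every control negative. Real GWAS data would show overlapping distributions, which is why a formal comparison of the two score distributions is needed.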


Figure 2: PoDA applied to four highly significant SNPs. Shown is the distribution of values in CGEMS cases (red) and controls (black) for a SNP set comprising four highly significant SNPs located in the gene [4]. As expected, there is a substantial difference in case and control values, with the cases having higher values (i.e., closer to other cases) than controls. The discreteness of the distributions is due to the fact that with four SNPs, only a finite number of values are possible.

Mentions: We begin by applying PoDA to the CGEMS breast cancer GWAS data. Having observed (Figure 1) that PoDA performs as expected for the simulated data, we first turn our attention to a simple test in which we select a SNP set comprising the four SNPs in intron 2 of that were reported to show significant association with case status in [4] (rs11200014, rs2981579, rs1219648, rs2420946). We expect to see a strong difference in the test case and test control distributions, and indeed we do: the cases more frequently have positive than do controls in Figure 2. (The discrete peaks in the distribution are a result of the fact that with four SNPs there exist fewer available values of .) Using a nonparametric Wilcoxon rank sum test with the alternative hypothesis that cases have greater than controls, is obtained, confirming our intuition.
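
The one-sided comparison described above can be sketched as follows. This is a minimal pure-Python illustration, assuming a normal approximation with tie correction and no continuity correction; the function names and the score values are hypothetical and are not taken from the CGEMS data. The scores are deliberately discrete, echoing the small number of distinct values a four-SNP set can produce, which is why tied ranks have to be handled.

```python
from collections import Counter
from math import erfc, sqrt

def midranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank of sorted positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum_greater(x, y):
    """One-sided rank sum test; alternative: x tends to be greater than y.

    Returns (U statistic for x, approximate p-value).
    """
    n1, n2 = len(x), len(y)
    pooled = list(x) + list(y)
    n = n1 + n2
    ranks = midranks(pooled)
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    var = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var == 0:
        raise ValueError("all values are identical; the test is undefined")
    z = (u - n1 * n2 / 2) / sqrt(var)
    p = 0.5 * erfc(z / sqrt(2))  # upper-tail probability of a standard normal
    return u, p

# Hypothetical per-sample scores, for illustration only.
case_scores = [3, 1, 3, -1, 1, 3, 1, 3]
control_scores = [-1, 1, -3, -1, 1, -3, -1, -1]

u, p = rank_sum_greater(case_scores, control_scores)
print(f"U = {u}, one-sided p = {p:.4f}")
```

With samples this small the normal approximation is rough; an exact test or a library routine would be preferable in practice.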

