Assessing statistical significance in multivariable genome wide association analysis.

Buzdugan L, Kalisch M, Navarro A, Schunk D, Fehr E, Bühlmann P - Bioinformatics (2016)

Bottom Line: The low predictive power of single SNPs, coupled with the high significance threshold needed to correct for multiple testing, greatly decreases the power of GWAS. Thus, our method tests whether or not a SNP carries any additional information about the phenotype beyond that available by all the other SNPs. Reproducibility of our research is supported by the open-source Bioconductor package hierGWAS. Contact: peter.buehlmann@stat.math.ethz.ch. Supplementary data are available at Bioinformatics online.


Affiliation: Seminar for Statistics, Department of Mathematics, ETH Zürich, Zürich 8092, Switzerland; Department of Economics, University of Zürich, Zürich 8006, Switzerland.


Fig. 2: The final cluster tree. The SNPs are first partitioned into chromosomes, and then a cluster tree is built for each chromosome separately using hierarchical clustering with average linkage. The hierarchical clusters of SNPs within chromosomes are not shown due to their size.

Mentions: Here we adopt the second approach, which is similar to the construction of haplotype maps (Barrett et al., 2005). We use hierarchical clustering with average linkage (Jain and Dubes, 1988), which can be represented as a cluster tree. The method requires a distance or dissimilarity measure between SNPs. We define the distance between two SNPs as one minus their linkage disequilibrium (LD) value, where LD refers to the statistical dependency of the DNA content at nearby locations of the chromosome. One of the most common measures of LD is the square of the Pearson correlation coefficient (Hill and Robertson, 1968), which quantifies the linear dependence between two loci. Thus, two SNPs have an LD equal to one if they are perfectly correlated, and an LD equal to zero if they are uncorrelated. Since LD tends to decay with the distance between the studied loci, close-by SNPs are typically in high LD. This means that SNPs belonging to the same gene or, more generally, neighboring SNPs will end up in the same cluster. Often, LD is studied within each chromosome separately. Therefore, we construct a separate cluster tree for each chromosome (in addition to providing a biological interpretation, clustering each chromosome separately results in substantial computational gains for problems with many SNPs), and we then join these into one tree which contains all the SNPs in the study, as shown in Figure 2.
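
The clustering described above can be illustrated with a small sketch. It is not the hierGWAS implementation (an R package); it is a naive pure-Python version written only to make the steps concrete. The function names (`r_squared`, `average_linkage_tree`, `genome_tree`), the SNP labels and the toy genotypes are all hypothetical, and treating a SNP with no variation as uncorrelated with everything is an assumption made here for convenience.

```python
from itertools import combinations


def r_squared(x, y):
    """LD measured as the squared Pearson correlation of two genotype vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: a SNP without variation is treated as uncorrelated
    return (sxy * sxy) / (sxx * syy)


def average_distance(members_u, members_v, dist):
    """Average-linkage distance: mean pairwise SNP distance between two clusters."""
    total = sum(dist[frozenset((a, b))] for a in members_u for b in members_v)
    return total / (len(members_u) * len(members_v))


def average_linkage_tree(genotypes):
    """Build a cluster tree (nested tuples) for the SNPs of one chromosome.

    genotypes maps a SNP label to its genotype vector (one entry per individual).
    The distance between two SNPs is 1 - LD, with LD = r^2.
    """
    dist = {
        frozenset((a, b)): 1.0 - r_squared(genotypes[a], genotypes[b])
        for a, b in combinations(genotypes, 2)
    }
    # Each key is a (sub)tree; its value lists the SNPs at its leaves.
    clusters = {name: [name] for name in genotypes}
    while len(clusters) > 1:
        # Merge the two clusters with the smallest average pairwise distance.
        u, v = min(
            combinations(clusters, 2),
            key=lambda pair: average_distance(clusters[pair[0]], clusters[pair[1]], dist),
        )
        clusters[(u, v)] = clusters.pop(u) + clusters.pop(v)
    return next(iter(clusters))


def genome_tree(by_chromosome):
    """Join the per-chromosome trees under one root, one child per chromosome."""
    return tuple(average_linkage_tree(g) for g in by_chromosome.values())


# Toy data: genotypes coded 0/1/2 for six individuals (hypothetical values).
example = {
    "chr1": {
        "rs1": [0, 1, 2, 1, 0, 2],
        "rs2": [0, 1, 2, 1, 0, 2],  # identical to rs1: LD = 1, distance = 0
        "rs3": [1, 0, 1, 2, 1, 1],  # uncorrelated with rs1: LD = 0, distance = 1
    },
    "chr2": {
        "rs4": [0, 0, 1, 1, 2, 2],
        "rs5": [2, 2, 1, 1, 0, 0],  # perfectly anti-correlated: r^2 is still 1
    },
}

tree = genome_tree(example)
print(tree)
```

In this toy example rs1 and rs2 are merged first because their distance is zero, and rs3 joins them only at the top of the chromosome 1 tree. The naive merge loop recomputes all average distances at every step, so it is only suitable for illustration; for realistic SNP counts an optimized hierarchical clustering routine would be needed.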

