Leveraging prior information to detect causal variants via multi-variant regression.

Long N, Dickson SP, Maia JM, Kim HS, Zhu Q, Allen AS - PLoS Comput. Biol. (2013)

Bottom Line: Here we develop and evaluate a Bayesian hierarchical regression method that incorporates prior information on the likelihood of variant causality through weighting of variant effects. By simulation studies using both simulated and real sequence variants, we compared a standard single variant test for analyzing variant-disease association with the proposed method using different weighting schemes. We found that by leveraging linkage disequilibrium of variants with known GWAS signals and sequence conservation (phastCons), the proposed method provides a powerful approach for detecting causal variants while controlling false positives.


Affiliation: Center for Human Genome Variation, Duke University School of Medicine, Durham, North Carolina, United States of America. n.long@duke.edu

ABSTRACT
Although many methods are available to test sequence variants for association with complex diseases and traits, methods that specifically seek to identify causal variants are less developed. Here we develop and evaluate a Bayesian hierarchical regression method that incorporates prior information on the likelihood of variant causality through weighting of variant effects. By simulation studies using both simulated and real sequence variants, we compared a standard single variant test for analyzing variant-disease association with the proposed method using different weighting schemes. We found that by leveraging linkage disequilibrium of variants with known GWAS signals and sequence conservation (phastCons), the proposed method provides a powerful approach for detecting causal variants while controlling false positives.


Figure 3 (pcbi-1003093-g003): Causal variant detection in the exome sequencing data analysis. (A): NOD2 data; (B): ITPA data. The two top panels are from one replicate of the simulation. For the single variant test, the SNP effect size is represented by −log10 of the p value from a logistic regression model; for the Bayesian liability model, it is represented by the standardized effect estimated at each SNP. Red dots indicate the two causal variants (see Table 1 for more information). Blue vertical bars show the values of the SNP weights (r × phastCons). The horizontal dashed line indicates the effect size at the significance threshold (permutation p value = 0.01). The bottom panel shows the proportion of simulations in which a variant was detected (i.e., significant at the permutation p = 0.01 level). Causal variants are marked in red.
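The two per-SNP quantities named in the caption can be sketched as below. This is a minimal illustration, not the authors' code: the function names and the toy numbers are hypothetical. Taking the absolute value of r (the linkage disequilibrium correlation with a known GWAS signal) is an assumption, since the caption only gives the weight as r × phastCons.

```python
import math


def neg_log10_p(p_value):
    """Height of a point in the single variant panel: -log10 of the p value."""
    return -math.log10(p_value)


def snp_weight(r, phastcons):
    """SNP weight r x phastCons.

    Assumption: the magnitude |r| is used; the source only writes "r".
    """
    return abs(r) * phastcons


# Toy values for illustration only (hypothetical, not taken from the paper).
variants = [
    {"id": "v1", "p": 0.001, "r": 0.8, "phastcons": 0.9},
    {"id": "v2", "p": 0.2, "r": -0.5, "phastcons": 0.4},
]

for v in variants:
    height = neg_log10_p(v["p"])
    weight = snp_weight(v["r"], v["phastcons"])
    print(v["id"], round(height, 3), round(weight, 3))
```

Under these assumptions a variant gets a large weight only when it is both in strong LD with a known signal and highly conserved, because either factor near zero drives the product toward zero.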

Mentions: In Figure 3 (A) and (B), we first show a Manhattan plot from the single variant test and one from the Bayesian liability model with the r×phastCons weight, based on one representative example out of the 100 simulated data sets. We then summarize results from both methods by displaying, for each candidate variant, the proportion of the 100 simulations in which it was detected (i.e., declared significant).
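The per-variant summary described above (the proportion of simulations in which a variant is declared significant) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function name `detection_proportion`, the data layout, and the toy numbers are hypothetical. The per-simulation thresholds are taken as given inputs here, and how the permutation threshold itself is computed is not covered.

```python
def detection_proportion(stats, thresholds):
    """Fraction of simulations in which each variant reaches the threshold.

    stats[s][j]   : test statistic (or standardized effect) of variant j
                    in simulation s
    thresholds[s] : significance threshold for simulation s (assumed to be
                    supplied, e.g. from a permutation procedure)
    """
    n_sims = len(stats)
    n_variants = len(stats[0])
    counts = [0] * n_variants
    for sim_stats, threshold in zip(stats, thresholds):
        for j, value in enumerate(sim_stats):
            if value >= threshold:
                counts[j] += 1
    return [count / n_sims for count in counts]


# Toy example: 4 simulations, 3 variants (hypothetical numbers).
stats = [
    [5.1, 0.4, 2.2],
    [4.7, 0.9, 3.5],
    [1.2, 0.3, 3.1],
    [6.0, 0.2, 0.8],
]
thresholds = [3.0, 3.0, 3.0, 3.0]

props = detection_proportion(stats, thresholds)
print(props)
```

With 100 simulated data sets, as in the text, `stats` would have 100 rows and each returned value would correspond to one bar in the bottom panel of the figure.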

