Bayesian neural networks for detecting epistasis in genetic association studies.

Beam AL, Motsinger-Reif A, Doyle J - BMC Bioinformatics (2014)

Bottom Line: By using graphics processing units (GPUs), the time needed to build these models is decreased by several orders of magnitude. In comparison with commonly used approaches for detecting interactions, Bayesian neural networks perform very well across a broad spectrum of possible genetic relationships. The proposed framework is shown to be a powerful method for detecting causal SNPs while being computationally efficient enough to handle large datasets.


Affiliation: Center for Biomedical Informatics, Harvard Medical School, Boston, MA, USA. Andrew_Beam@hms.harvard.edu.

ABSTRACT

Background: Discovering causal genetic variants from large genetic association studies poses many difficult challenges. Assessing which genetic markers are involved in determining trait status is a computationally demanding task, especially in the presence of gene-gene interactions.

Results: A non-parametric Bayesian approach in the form of a Bayesian neural network is proposed for use in analyzing genetic association studies. Demonstrations on synthetic and real data show that these networks efficiently and accurately determine which variants are involved in determining case-control status. By using graphics processing units (GPUs), the time needed to build these models is decreased by several orders of magnitude. In comparison with commonly used approaches for detecting interactions, Bayesian neural networks perform very well across a broad spectrum of possible genetic relationships.

Conclusions: The proposed framework is shown to be a powerful method for detecting causal SNPs while being computationally efficient enough to handle large datasets.


Figure 6: Receiver-operator characteristic (ROC) curve for BNNs. Each line represents the ROC curve for a different genetic model, averaged over effect size and MAF. The area under the curve (AUC) for each model is shown in the legend.

The cutoff value used for the ARD test has an obvious impact on the method’s performance. In the extreme case, a cutoff of 0 would result in nothing being declared significant, while a cutoff of 1 would result in everything being declared significant. The cutoff therefore controls the trade-off between sensitivity (i.e. the true positive rate) and specificity (i.e. the true negative rate, which equals 1 minus the false positive rate). Evaluation of the false positive rate at the cutoff value of 0.6 used in the previous experiments indicates that the BNN method properly controls the number of false positives: we observed an average false positive rate (FPR) of roughly 0.005 for the parametric models and 0.06 for the purely epistatic models (see Additional file 1). To examine the trade-off between the true positive rate (TPR) and the FPR as the cutoff changes, we varied the cutoff from 0 to 1 in increments of 0.01 and recorded the TPR and FPR for each data set in the two previous sections. In Figure 6, we averaged the TPR and FPR over effect size and MAF to produce a receiver-operator characteristic (ROC) curve for each of the 4 genetic models. The legend displays the area under the curve (AUC) for each model.
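The cutoff sweep described in this passage can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the toy scores, and the decision rule (a SNP is called significant when its score is below the cutoff, chosen so that a cutoff of 0 flags nothing and a cutoff of 1 flags everything) are all hypothetical. The paper's actual ARD statistic is not reproduced here.

```python
# Hypothetical sketch: names, data, and the decision rule are assumptions,
# not taken from the paper.

def sweep_cutoffs(scores, is_causal, n_steps=100):
    """Return (fpr, tpr) lists for cutoffs 0, 1/n_steps, ..., 1."""
    n_pos = sum(is_causal)
    n_neg = len(is_causal) - n_pos
    fpr, tpr = [], []
    for k in range(n_steps + 1):
        cutoff = k / n_steps
        # Assumed rule: a SNP is called significant when score < cutoff,
        # so cutoff 0 flags nothing and cutoff 1 flags every score in [0, 1).
        called = [s < cutoff for s in scores]
        tp = sum(c and y for c, y in zip(called, is_causal))
        fp = sum(c and not y for c, y in zip(called, is_causal))
        tpr.append(tp / n_pos)  # sensitivity
        fpr.append(fp / n_neg)  # 1 - specificity
    return fpr, tpr


def auc_trapezoid(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    points = sorted(zip(fpr, tpr))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up toy data: True marks a causal SNP, False a non-causal one.
scores = [0.05, 0.10, 0.30, 0.55, 0.70, 0.80, 0.90, 0.95]
is_causal = [True, True, True, False, True, False, False, False]

fpr, tpr = sweep_cutoffs(scores, is_causal)  # 101 cutoffs, step 0.01
auc = auc_trapezoid(fpr, tpr)
print(f"AUC = {auc:.4f}")
```

Each cutoff yields one (FPR, TPR) point, and the 101 points together trace the ROC curve. Averaging such curves over effect size and MAF, as done for Figure 6, would require one sweep per data set and is left out of this sketch.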

