Granger causality vs. dynamic Bayesian network inference: a comparative study.

Zou C, Denby KJ, Feng J - BMC Bioinformatics (2009)

Bottom Line: For synthesized data, a critical point of the data length is found: the dynamic Bayesian network outperforms the Granger causality approach when the data length is short, and vice versa. We then test our results in experimental data of short length which is a common scenario in current biological experiments: it is again confirmed that the dynamic Bayesian network works better. When the data size is short, the dynamic Bayesian network inference performs better than the Granger causality approach; otherwise the Granger causality approach is better.


Affiliation: Department of Computer Science, University of Warwick, Coventry, UK. csrcbh@dcs.warwick.ac.uk

ABSTRACT

Background: In computational biology, one often faces the problem of deriving the causal relationship among different elements such as genes, proteins, metabolites, neurons and so on, based upon multi-dimensional temporal data. Currently, there are two common approaches used to explore the network structure among elements. One is the Granger causality approach, and the other is the dynamic Bayesian network inference approach. Both have at least a few thousand publications reported in the literature. A key issue is choosing which approach to apply to the data, in particular when the two give rise to contradictory results.

Results: In this paper, we provide an answer through a systematic and computationally intensive comparison between the two approaches on both synthesized and experimental data. For synthesized data, a critical point of the data length is found: the dynamic Bayesian network outperforms the Granger causality approach when the data length is short, and vice versa. We then test our results on experimental data of short length, which is a common scenario in current biological experiments: it is again confirmed that the dynamic Bayesian network works better.

Conclusion: When the data size is short, the dynamic Bayesian network inference performs better than the Granger causality approach; otherwise the Granger causality approach is better.


Figure 6: Granger causality and Bayesian network inference applied to a non-linear model with stochastic coefficients. The parameters in the polynomial equation are randomly generated in the interval [-2,2]. (A) We applied both approaches to different sample sizes (from 300 to 900). For each sample size, we generated 100 different coefficient vectors, so the total number of directed interactions for each sample size is 500. (a) The percentage of true positive causalities detected by each approach. (b) Time cost of each approach. (B) For sample size 900, the derived causality (1 represents positive causality and 0 represents negative) is plotted against the absolute value of the corresponding coefficients. For visualization purposes, the plot for Granger causality is shifted upward.

Mentions: In the next step, we extend our non-linear model to a more general setting in which the coefficients in the equations are randomly generated. Figure 6Aa compares the percentage of true positive connections derived by the two methods. Interestingly, a critical point around a sample size of 500 exists in the non-linear model, similar to the linear model before. Figure 6Ab shows that the computing time required for the Bayesian network inference is still much larger than that for the Granger causality. In Figure 6B, we compare the performance for different coefficients (strength of interaction) at a fixed sample size of 900. From the five graphs, we can see that in general the Granger approach is more sensitive to small coefficient values (see Figure 6B, X5 -> X4 and X4 -> X5).
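The kind of experiment described above (draw random coefficients, simulate series of a given length, count how often a true interaction is detected) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two-variable toy system, the single lag, the function names simulate, ols_rss, granger_f and detection_rate, and the approximate 5% critical value 3.84 are not taken from the paper, whose five-variable polynomial model is not reproduced in this excerpt. Only the coefficient interval [-2,2] follows the figure caption.

```python
import random

def simulate(n, coupling, seed=0):
    """Toy system (illustrative assumption, not the paper's model):
    x is a stable AR(1) process; y depends on its own past and on x's past."""
    rng = random.Random(seed)
    x, y = [0.0], [0.0]
    for _ in range(n - 1):
        x_new = 0.5 * x[-1] + rng.gauss(0, 1)
        y_new = 0.5 * y[-1] + coupling * x[-1] + rng.gauss(0, 1)
        x.append(x_new)
        y.append(y_new)
    return x, y

def ols_rss(rows, targets):
    """Residual sum of squares of a least-squares fit (normal equations)."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, targets))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(a[idx][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(k):
            if row != col:
                f = a[row][col] / a[col][col]
                a[row] = [u - f * v for u, v in zip(a[row], a[col])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, t in zip(rows, targets))

def granger_f(x, y):
    """F statistic for 'x Granger-causes y' using one lag."""
    targets = y[1:]
    restricted = [[1.0, y[t - 1]] for t in range(1, len(y))]       # y's past only
    full = [[1.0, y[t - 1], x[t - 1]] for t in range(1, len(y))]   # plus x's past
    rss_r = ols_rss(restricted, targets)
    rss_u = ols_rss(full, targets)
    dof = len(targets) - 3  # observations minus parameters of the full model
    return (rss_r - rss_u) / (rss_u / dof)

def detection_rate(n, trials=20, seed=1, f_crit=3.84):
    """Fraction of random couplings in [-2, 2] flagged as causal.
    f_crit is roughly the 5% critical value of F(1, large dof)."""
    rng = random.Random(seed)
    hits = 0
    for k in range(trials):
        c = rng.uniform(-2.0, 2.0)
        x, y = simulate(n, c, seed=seed * 1000 + k)
        if granger_f(x, y) > f_crit:
            hits += 1
    return hits / trials

if __name__ == "__main__":
    for n in (300, 500, 900):
        print(n, detection_rate(n))
```

In this toy setting the coupling enters linearly, so it says nothing about where the critical point of the paper's non-linear model lies. It only shows the mechanics of sweeping sample size and coefficient strength, and a dynamic Bayesian network counterpart would be needed for the actual comparison.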

