Limits...
Inference of gene regulatory networks from time series by Tsallis entropy.

Lopes FM, de Oliveira EA, Cesar RM - BMC Syst Biol (2011)

Bottom Line: By adopting both data sources, it is possible to estimate the average quality of the inference with respect to different network topologies, transfer functions and network sizes.A remarkable improvement of accuracy was observed in the experimental results by reducing the number of false connections in the inferred topology by the non-Shannon entropy.The obtained best free parameter of the Tsallis entropy was on average in the range 2.5 ≤ q ≤ 3.5 (hence, subextensive entropy), which opens new perspectives for GRNs inference methods based on information theory and for investigation of the nonextensivity of such networks.

View Article: PubMed Central - HTML - PubMed

Affiliation: Federal University of Technology - Paraná, Brazil. fabricio@utfpr.edu.br

ABSTRACT

Background: The inference of gene regulatory networks (GRNs) from large-scale expression profiles is one of the most challenging problems of Systems Biology nowadays. Many techniques and models have been proposed for this task. However, it is not generally possible to recover the original topology with great accuracy, mainly due to the short time series data in face of the high complexity of the networks and the intrinsic noise of the expression measurements. In order to improve the accuracy of GRNs inference methods based on entropy (mutual information), a new criterion function is here proposed.

Results: In this paper we introduce the use of generalized entropy proposed by Tsallis, for the inference of GRNs from time series expression profiles. The inference process is based on a feature selection approach and the conditional entropy is applied as criterion function. In order to assess the proposed methodology, the algorithm is applied to recover the network topology from temporal expressions generated by an artificial gene network (AGN) model as well as from the DREAM challenge. The adopted AGN is based on theoretical models of complex networks and its gene transference function is obtained from random drawing on the set of possible Boolean functions, thus creating its dynamics. On the other hand, DREAM time series data presents variation of network size and its topologies are based on real networks. The dynamics are generated by continuous differential equations with noise and perturbation. By adopting both data sources, it is possible to estimate the average quality of the inference with respect to different network topologies, transfer functions and network sizes.

Conclusions: A remarkable improvement of accuracy was observed in the experimental results by reducing the number of false connections in the inferred topology by the non-Shannon entropy. The obtained best free parameter of the Tsallis entropy was on average in the range 2.5 ≤ q ≤ 3.5 (hence, subextensive entropy), which opens new perspectives for GRNs inference methods based on information theory and for investigation of the nonextensivity of such networks. The inference algorithm and criterion function proposed here were implemented and included in the DimReduction software, which is freely available at http://sourceforge.net/projects/dimreduction and http://code.google.com/p/dimreduction/.

Show MeSH
Similarity between AGNs and inferred networks as a function of the entropic parameter q. Similarity between source and inferred networks as a function of the entropic parameter q for each average connectivity (1 ≤ 〈k〉 ≤ 5): (a) average values for the uniformly-random Erdös-Rényi (ER) and (b) average values for the scale-free Barabási-Albert (BA). The simulations were performed for 100 genes (N = 100) and represent the average over 50 runs.
© Copyright Policy - open-access
Related In: Results  -  Collection

License
getmorefigures.php?uid=PMC3117729&req=5

Figure 1: Similarity between AGNs and inferred networks as a function of the entropic parameter q. Similarity between source and inferred networks as a function of the entropic parameter q for each average connectivity (1 ≤ 〈k〉 ≤ 5): (a) average values for the uniformly-random Erdös-Rényi (ER) and (b) average values for the scale-free Barabási-Albert (BA). The simulations were performed for 100 genes (N = 100) and represent the average over 50 runs.

Mentions: The similarity curves shown in Figure 1 were obtained by averaging 50 runs (different source networks) for each network model. In both network models improvements were observed in the similarity by ranging q, with the maximum 〈Similarity(A, B)〉 being reached by q ≠ 1 for all tried 〈k〉. Besides, it can also be noted that the q* that maximizes the similarity seems to be almost independent of the network model and the average connectivity. Figures 2(a) and 2(b) show the boxplots of the similarity values for each q and k values. It is possible to notice a very small variation in the boxplots, indicating stable results for all q values.


Inference of gene regulatory networks from time series by Tsallis entropy.

Lopes FM, de Oliveira EA, Cesar RM - BMC Syst Biol (2011)

Similarity between AGNs and inferred networks as a function of the entropic parameter q. Similarity between source and inferred networks as a function of the entropic parameter q for each average connectivity (1 ≤ 〈k〉 ≤ 5): (a) average values for the uniformly-random Erdös-Rényi (ER) and (b) average values for the scale-free Barabási-Albert (BA). The simulations were performed for 100 genes (N = 100) and represent the average over 50 runs.
© Copyright Policy - open-access
Related In: Results  -  Collection

License
Show All Figures
getmorefigures.php?uid=PMC3117729&req=5

Figure 1: Similarity between AGNs and inferred networks as a function of the entropic parameter q. Similarity between source and inferred networks as a function of the entropic parameter q for each average connectivity (1 ≤ 〈k〉 ≤ 5): (a) average values for the uniformly-random Erdös-Rényi (ER) and (b) average values for the scale-free Barabási-Albert (BA). The simulations were performed for 100 genes (N = 100) and represent the average over 50 runs.
Mentions: The similarity curves shown in Figure 1 were obtained by averaging 50 runs (different source networks) for each network model. In both network models improvements were observed in the similarity by ranging q, with the maximum 〈Similarity(A, B)〉 being reached by q ≠ 1 for all tried 〈k〉. Besides, it can also be noted that the q* that maximizes the similarity seems to be almost independent of the network model and the average connectivity. Figures 2(a) and 2(b) show the boxplots of the similarity values for each q and k values. It is possible to notice a very small variation in the boxplots, indicating stable results for all q values.

Bottom Line: By adopting both data sources, it is possible to estimate the average quality of the inference with respect to different network topologies, transfer functions and network sizes.A remarkable improvement of accuracy was observed in the experimental results by reducing the number of false connections in the inferred topology by the non-Shannon entropy.The obtained best free parameter of the Tsallis entropy was on average in the range 2.5 ≤ q ≤ 3.5 (hence, subextensive entropy), which opens new perspectives for GRNs inference methods based on information theory and for investigation of the nonextensivity of such networks.

View Article: PubMed Central - HTML - PubMed

Affiliation: Federal University of Technology - Paraná, Brazil. fabricio@utfpr.edu.br

ABSTRACT

Background: The inference of gene regulatory networks (GRNs) from large-scale expression profiles is one of the most challenging problems of Systems Biology nowadays. Many techniques and models have been proposed for this task. However, it is not generally possible to recover the original topology with great accuracy, mainly due to the short time series data in face of the high complexity of the networks and the intrinsic noise of the expression measurements. In order to improve the accuracy of GRNs inference methods based on entropy (mutual information), a new criterion function is here proposed.

Results: In this paper we introduce the use of generalized entropy proposed by Tsallis, for the inference of GRNs from time series expression profiles. The inference process is based on a feature selection approach and the conditional entropy is applied as criterion function. In order to assess the proposed methodology, the algorithm is applied to recover the network topology from temporal expressions generated by an artificial gene network (AGN) model as well as from the DREAM challenge. The adopted AGN is based on theoretical models of complex networks and its gene transference function is obtained from random drawing on the set of possible Boolean functions, thus creating its dynamics. On the other hand, DREAM time series data presents variation of network size and its topologies are based on real networks. The dynamics are generated by continuous differential equations with noise and perturbation. By adopting both data sources, it is possible to estimate the average quality of the inference with respect to different network topologies, transfer functions and network sizes.

Conclusions: A remarkable improvement of accuracy was observed in the experimental results by reducing the number of false connections in the inferred topology by the non-Shannon entropy. The obtained best free parameter of the Tsallis entropy was on average in the range 2.5 ≤ q ≤ 3.5 (hence, subextensive entropy), which opens new perspectives for GRNs inference methods based on information theory and for investigation of the nonextensivity of such networks. The inference algorithm and criterion function proposed here were implemented and included in the DimReduction software, which is freely available at http://sourceforge.net/projects/dimreduction and http://code.google.com/p/dimreduction/.

Show MeSH