Modeling signal transduction from protein phosphorylation to gene expression.

Cai C, Chen L, Jiang X, Lu X - Cancer Inform (2014)

Bottom Line: We were able to effectively identify sparse signaling networks that modeled the observed transcriptomic and proteomic data. Our methods identified distinct signaling pathways for rat and human cells in a data-driven manner, based on the fact that rat and human cells exhibited distinct transcriptomic and proteomic responses to a common set of stimuli. Our model performed well in the SBV IMPROVER challenge in comparison to other models addressing the same task.


Affiliation: Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, USA.

ABSTRACT

Background: Signaling networks are of great importance for understanding the cell's regulatory mechanisms. The rise of large-scale genomic and proteomic data, together with prior biological knowledge, has paved the way for the reconstruction and discovery of novel signaling pathways in a data-driven manner. In this study, we investigate computational methods that integrate proteomic and transcriptomic data to identify signaling pathways transmitting signals in response to specific stimuli. Such methods can be applied to cancer genomic data to infer perturbed signaling pathways.

Method: We proposed a novel Bayesian Network (BN) framework that integrates transcriptomic data with proteomic data reflecting protein phosphorylation states, in order to identify the pathways transmitting the signals of diverse stimuli in rat and human cells. We represented the proteins and genes as nodes in a BN in which edges reflect the regulatory relationship between signaling proteins. We designed an efficient inference algorithm that incorporated prior knowledge of pathways and searched for a network structure in a data-driven manner.

Results: We applied our method to infer rat- and human-specific networks given gene expression and proteomic datasets. We were able to effectively identify sparse signaling networks that modeled the observed transcriptomic and proteomic data. Our methods identified distinct signaling pathways for rat and human cells in a data-driven manner, based on the fact that rat and human cells exhibited distinct transcriptomic and proteomic responses to a common set of stimuli. Our model performed well in the SBV IMPROVER challenge in comparison to other models addressing the same task. The capability of inferring signaling pathways in a data-driven fashion may contribute to cancer research by identifying distinct aberrations in signaling pathways underlying heterogeneous cancer subtypes.


Figure 3: (A) Number of predicted edges against the elastic net mixing parameter α. (B) Approximated log-likelihood against number of edges in the predicted network.

Mentions: We tuned the α and λ parameters of “glmnet” and searched for the optimal penalty, that is, the one giving the sparsest model with the best performance [Equations (8) and (9)]. Figure 3A shows that increasing α (which increases L1 penalization while decreasing L2 penalization) tends to yield sparser networks, as expected. Interestingly, Figure 3B shows that models with around 350 edges return the best marginal log likelihood for both rat and human data, whereas models with too many or too few edges do not fit the data well. This is a key advantage of Bayesian model selection: it penalizes overly complex models that tend to over-fit the data as well as overly simple models that cannot explain the data well, a characteristic commonly referred to as Occam’s razor. We selected the best models with α_rat = 0.9 and α_human = 1 for rat and human, respectively. Over half of the edges were trimmed off for both the rat and human augmented reference networks. We also found that interactions between phospho-proteins in signaling pathways tended to be more translatable, with few divergent points from rat to human, whereas TF–gene interactions tended to be more divergent between rat and human, which explained the difference in gene expression profiles.
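
The effect of α on sparsity described above can be illustrated with a small sketch. It is not the authors' pipeline and does not call glmnet. It is a plain-Python coordinate descent for the elastic net penalty in the form glmnet documents, λ[α‖β‖₁ + (1 − α)/2 ‖β‖₂²], run on synthetic data. The function names (soft_threshold, elastic_net, count_edges), the toy data, and the choice λ = 0.3 are all assumptions made for illustration; each nonzero coefficient stands in for a retained edge.

```python
import random

def soft_threshold(z, g):
    """Soft-thresholding operator: sign(z) * max(|z| - g, 0)."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0

def elastic_net(X, y, lam, alpha, n_iter=200):
    """Coordinate descent (no intercept) for
    (1/2n)||y - Xb||^2 + lam * (alpha*||b||_1 + (1 - alpha)/2 * ||b||_2^2).
    X is a list of rows. Illustrative stand-in, not glmnet itself."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    resid = list(y)  # residual y - Xb; equals y while b is all zeros
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of column j with the residual that excludes feature j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * b[j]
            # L1 part thresholds the coefficient; L2 part shrinks it.
            new_bj = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1 - alpha))
            delta = new_bj - b[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                b[j] = new_bj
    return b

def count_edges(b, tol=1e-12):
    """Number of nonzero coefficients, read here as retained edges."""
    return sum(1 for v in b if abs(v) > tol)

# Synthetic data (assumption): 3 true parents out of 12 candidate parents.
random.seed(0)
n, p = 60, 12
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
true_b = [2.0, -1.5, 1.0] + [0.0] * (p - 3)
y = [sum(x * w for x, w in zip(row, true_b)) + random.gauss(0, 0.5) for row in X]

for alpha in (0.0, 0.5, 0.9, 1.0):
    b = elastic_net(X, y, lam=0.3, alpha=alpha)
    print(f"alpha={alpha:.1f}  retained edges={count_edges(b)}")
```

At α = 0 the penalty is pure ridge, which shrinks coefficients without setting them to zero, so every candidate edge is kept. At α = 1 it is pure lasso, where any candidate whose covariance with the residual falls below λ is dropped exactly. Intermediate values trade off the two behaviors, which is the trend Figure 3A reports.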

