A systems biology approach to the global analysis of transcription factors in colorectal cancer.

Pradhan MP, Prasad NK, Palakal MJ - BMC Cancer (2012)

Bottom Line: Biological entities do not perform in isolation, and often, it is the nature and degree of interactions among numerous biological entities which ultimately determines any final outcome. Starting with just one TF (SMAD3) in the bait list, the literature mining process identified an additional 116 CRC-associated TFs. Among these identified TFs, we obtained a novel six-node module consisting of ATF2-P53-JNK1-ELK1-EPHB2-HIF1A, from which the novel JNK1-ELK1 association could potentially be a significant marker for CRC.


Affiliation: School of Informatics, Indiana University Purdue University Indianapolis, Indianapolis, IN 46202, USA.

ABSTRACT

Background: Biological entities do not perform in isolation, and often, it is the nature and degree of interactions among numerous biological entities which ultimately determines any final outcome. Hence, experimental data on any single biological entity can be of limited value when considered only in isolation. To address this, we propose that augmenting individual entity data with the literature will not only better define the entity's own significance but also uncover relationships with novel biological entities. To test this notion, we developed a comprehensive text mining and computational methodology that focused on discovering new targets of one class of molecular entities, transcription factors (TFs), within one particular disease, colorectal cancer (CRC).

Methods: We used 39 molecular entities known to be associated with CRC, along with six colorectal cancer terms, as the bait list (the list of search terms) for mining the biomedical literature to identify CRC-specific genes and proteins. Using the literature-mined data, we constructed a global TF interaction network for CRC. We then developed a multi-level, multi-parametric methodology to identify TFs significant to CRC.

Results: The small bait list, when augmented with literature-mined data, identified a large number of biological entities associated with CRC. The relative importance of these TFs and their associated modules was identified using functional and topological features. Additional validation of these highly ranked TFs using the literature strengthened our findings. Some of the novel TFs that we identified were SLUG, RUNX1, IRF1, HIF1A, ATF-2, ABL1, ELK-1 and GATA-1. Some of these TFs are associated with functional modules in known pathways of CRC, including the Beta-catenin/development, immune response, transcription, and DNA damage pathways.

Conclusions: Our methodology of using text mining data and a multi-level, multi-parameter scoring technique was able to identify both known and novel TFs that have roles in CRC. Starting with just one TF (SMAD3) in the bait list, the literature mining process identified an additional 116 CRC-associated TFs. Our network-based analysis showed that each of these TFs belonged to at least one of 13 major functional groups that are known to play important roles in CRC. Among these identified TFs, we obtained a novel six-node module consisting of ATF2-P53-JNK1-ELK1-EPHB2-HIF1A, from which the novel JNK1-ELK1 association could potentially be a significant marker for CRC.



Figure 4: A: Ranking comparison between the bait list pathways and the literature-augmented data pathways. B: p-value comparison between the bait list pathways and the literature-augmented data pathways.

Mentions: To better understand the significance of the highly ranked TFs, modules, and the overall TF interaction network, all 2,634 proteins (output from BIOMAP) were analysed using MetaCore™ for their significance in various pathways from the original bait list (39 pathways) and the literature-augmented data-generated list (286 pathways). Figures 4A and 4B compare the rankings and p-values of the bait list and the literature-augmented pathways. For analytic purposes, the 286 pathways were further classified according to their functional groups as given by MetaCore™. Table 7 shows the frequency distribution of these pathways with respect to their functional groups. From Table 7 it can be observed that the top three functional groups were Development, Immune Response, and Apoptosis and Survival, all of which are well known in CRC. Chemotaxis, which Table 7 lists as associated with four pathways, is the unidirectional movement of a cell in response to a chemical gradient, and it plays an important role in innate and acquired responses. The four chemotaxis-associated pathways were the CXCR4 signalling pathway, inhibitory action of IL-8 and leukotriene B4-induced neutrophil migration, and leukocyte chemotaxis, all of which have been associated with CRC in the literature [95,96], as well as the Lipoxin inhibitory action of fMLP-induced neutrophil chemotaxis pathway. This last pathway has not been well studied in CRC, though lipoxins are known to be associated with anti-inflammatory and proresolving mediators in CRC [97]. The analysis of the chemotaxis functional group demonstrates that, while a small bait list or list of experimental proteins may not fully depict the global profile of a disease, literature-augmented data can expand this profile and help to uncover new pathways relevant to the disease.
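
The grouping step described above, tallying pathways by functional group to produce a frequency distribution like the one in Table 7, might be sketched roughly as follows. This is a minimal illustration, not the authors' implementation or MetaCore's interface: the function names and the `PATHWAYS` list are assumptions, only the four Chemotaxis entries come from the text, and the remaining rows are placeholders rather than values from Table 7.

```python
from collections import Counter

# (functional_group, pathway_name) pairs.
# The four Chemotaxis rows are the pathways named in the text above.
# All other rows are hypothetical placeholders, NOT data from Table 7.
PATHWAYS = [
    ("Chemotaxis", "CXCR4 signalling pathway"),
    ("Chemotaxis", "Inhibitory action of IL-8 and leukotriene B4-induced neutrophil migration"),
    ("Chemotaxis", "Leukocyte chemotaxis"),
    ("Chemotaxis", "Lipoxin inhibitory action of fMLP-induced neutrophil chemotaxis"),
    ("Development", "placeholder pathway 1"),
    ("Development", "placeholder pathway 2"),
    ("Immune response", "placeholder pathway 3"),
]


def group_frequencies(pathways):
    """Count how many pathways fall into each functional group."""
    return Counter(group for group, _name in pathways)


def ranked_groups(pathways):
    """Return (group, count) pairs, most frequent first; ties broken by group name."""
    counts = group_frequencies(pathways)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    for group, count in ranked_groups(PATHWAYS):
        print(f"{group}: {count}")
```

With the full list of 286 classified pathways in place of the placeholder rows, the same tally would give the per-group counts that the text reads off Table 7.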

