Integrating Gene Regulatory Networks to identify cancer-specific genes.

Bo V, Tucker A - AMIA Jt Summits Transl Sci Proc (2015)

Bottom Line: To support the reliability of the results, we calculate the prediction accuracy of each gene for the specified conditions and compare it to predictions on other conditions. The most predictive genes are validated using the GeneCards encyclopaedia [1] coupled with a statistical test for validating clusters. Finally, we implement an interface that allows the user to identify unique subnetworks of any selected combination of studies using AND & NOT logic operators.


Affiliation: Department of Computer Science, Brunel University, London, UK.

ABSTRACT
Consensus approaches have been widely used to identify Gene Regulatory Networks (GRNs) that are common to multiple studies. In this research, however, we develop an application that semi-automatically identifies key mechanisms that are specific to a particular set of conditions. We analyse four different types of cancer to identify gene pathways unique to each of them. To support the reliability of the results, we calculate the prediction accuracy of each gene for the specified conditions and compare it to predictions on other conditions. The most predictive genes are validated using the GeneCards encyclopaedia [1] coupled with a statistical test for validating clusters. Finally, we implement an interface that allows the user to identify unique subnetworks of any selected combination of studies using AND & NOT logic operators. Results show that unique genes and sub-networks can be reliably identified and that they reflect key mechanisms that are fundamental to the cancer types under study.
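
The AND & NOT selection described in the abstract can be read as set operations over the edge sets of the per-study networks. The following is a minimal sketch under that assumption, not the authors' interface: the function names `edge` and `unique_edges`, the study labels, and the gene names `g1` to `g6` are all hypothetical, and each network is assumed to be available as a set of undirected edges.

```python
def edge(gene_a, gene_b):
    """Represent an undirected edge so that (a, b) and (b, a) compare equal."""
    return frozenset((gene_a, gene_b))


def unique_edges(networks, include, exclude=()):
    """Edges found in every `include` study (AND) and in no `exclude` study (NOT).

    `networks` maps a study name to a set of edges built with `edge()`.
    """
    if not include:
        raise ValueError("at least one study must be included")
    # AND: keep only edges shared by all included studies.
    selected = set.intersection(*(networks[name] for name in include))
    # NOT: drop any edge that also appears in an excluded study.
    for name in exclude:
        selected -= networks[name]
    return selected


# Toy data: hypothetical edges, not taken from the paper.
networks = {
    "breast": {edge("g1", "g2"), edge("g2", "g3"), edge("g3", "g4")},
    "ovarian": {edge("g1", "g2"), edge("g4", "g5")},
    "lung": {edge("g2", "g3"), edge("g5", "g6")},
}

breast_only = unique_edges(networks, include=["breast"], exclude=["ovarian", "lung"])
breast_and_ovarian_not_lung = unique_edges(
    networks, include=["breast", "ovarian"], exclude=["lung"]
)
```

With a single included study and all others excluded, this reduces to the "present in the network under consideration but not in the others" rule used to build the unique-networks.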


Figure 2: Internal vs External prediction accuracy for each study, averaged among all genes involved in the related unique-network.

Mentions: In this study four cancer datasets from human patients are explored: breast, ovarian, medullary breast (a subtype of breast cancer) and lung. Each dataset contains a different number of samples (see Table 1). The variable selection approach reduces the number of variables/genes to analyse from 54675 to 1629. Variable reduction is followed by the implementation of glasso with the parameter ρ = 0.05. Given the glasso networks for each study, we consider only the edges that are present in the network under consideration but not in the others. Once the unique edges are detected, the genes involved are used to build a BN for each study; these are called unique-networks (U-Ns). An example of these networks is shown in Figure 1. The structure of the glasso U-Ns differs from the structure of the Bayesian U-Ns. In Figures 1a and 1b the nodes with a grey background indicate genes with a prediction accuracy greater than 0.6 (based on our findings in [9]). Given the study descriptions in Table 1, we would expect breast cancer to be very similar (involving almost the same genes) to medullary breast cancer and slightly less similar to ovarian, but very different from lung cancer. This implies that the average internal prediction for each study will not differ much from the external prediction. The internal vs external prediction for each study shown in Figure 2 reveals, as expected, a very clear difference only in Networks 3 and 4 (medullary breast and lung cancer, respectively), with a small difference in 1 and 3. This deduction is supported by the p-values obtained from the applied t-test, as shown in Table 1. We now evaluate the significance of detecting the identified unique genes by calculating the probability score using the normal approximation.
For this paper, s_i is the size of each unique network, k_j is the number of genes in the unique gene-list obtained for each cancer type by comparing the GeneCards gene lists, x is the number of genes that are present in both the unique network and the corresponding unique gene-list, and n is the number of genes in the original unprocessed dataset. The results in Table 2 show the z-score and the corresponding p-value, indicating that the probability of observing x elements from functional group j in cluster i by chance is very small in all four cases. This implies that the unique genes identified by our pipeline are highly significant in all studies.
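
To make the quantities s_i, k_j, x and n concrete, the sketch below computes a z-score and a one-sided p-value using the textbook normal approximation to the hypergeometric distribution. This is an assumption about the form of the test, since the exact expression used by the authors is not shown here; the function names and the example values for s, k and x are hypothetical, and only n = 54675 comes from the text.

```python
import math


def enrichment_z_score(x, s, k, n):
    """z-score for observing x overlapping genes (assumed hypergeometric model).

    x: genes present in both the unique network and the unique gene-list
    s: size of the unique network (s_i)
    k: size of the unique gene-list (k_j)
    n: number of genes in the original unprocessed dataset
    """
    p = k / n
    mean = s * p
    # Hypergeometric variance, including the finite-population correction.
    variance = s * p * (1 - p) * (n - s) / (n - 1)
    if variance <= 0:
        raise ValueError("degenerate inputs: variance is not positive")
    return (x - mean) / math.sqrt(variance)


def upper_tail_p_value(z):
    """Probability that a standard normal variable is at least z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Hypothetical example: s, k and x are illustrative, not results from the paper.
z = enrichment_z_score(x=10, s=100, k=200, n=54675)
p_value = upper_tail_p_value(z)
```

A large positive z (and hence a small p-value) indicates that the overlap x is much larger than the roughly s·k/n genes expected by chance, which is the sense in which the text calls the unique genes significant.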

