Subpathway-GM: identification of metabolic subpathways via joint power of interesting genes and metabolites and their topologies within pathways.

Li C, Han J, Yao Q, Zou C, Xu Y, Zhang C, Shang D, Zhou L, Zou C, Sun Z, Li J, Zhang Y, Yang H, Gao X, Li X - Nucleic Acids Res. (2013)

Bottom Line: Various 'omics' technologies, including microarrays and gas chromatography mass spectrometry, can be used to identify hundreds of interesting genes, proteins and metabolites, such as differential genes, proteins and metabolites associated with diseases. This provides a more accurate level of pathway analysis by integrating information from genes and metabolites, and their positions and cascade regions within the given pathway. Further analysis indicated that the power of a joint genes/metabolites and subpathway strategy based on their topologies may play a key role in reliably recalling disease-relevant subpathways and finding novel subpathways.


Affiliation: College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, PR China.

ABSTRACT
Various 'omics' technologies, including microarrays and gas chromatography mass spectrometry, can be used to identify hundreds of interesting genes, proteins and metabolites, such as differential genes, proteins and metabolites associated with diseases. Identifying metabolic pathways has become an invaluable aid to understanding the genes and metabolites associated with the conditions under study. However, the classical methods used to identify pathways fail to accurately consider the joint power of interesting genes and metabolites and the key regions impacted by them within metabolic pathways. In this study, we propose a powerful analytical method referred to as Subpathway-GM for the identification of metabolic subpathways. This provides a more accurate level of pathway analysis by integrating information from genes and metabolites, and their positions and cascade regions within the given pathway. We analyzed two colorectal cancer and one metastatic prostate cancer data sets and demonstrated that Subpathway-GM was able to identify disease-relevant subpathways whose corresponding entire pathways might be ignored using classical entire pathway identification methods. Further analysis indicated that the power of a joint genes/metabolites and subpathway strategy based on their topologies may play a key role in reliably recalling disease-relevant subpathways and finding novel subpathways.


Identification of metabolic subpathways associated with colorectal cancer. (A) Distances among known disease nodes within metabolic pathways. (B) Empirical cumulative distribution functions of shortest path lengths between each disease node and its nearest disease node within pathways. (C) Plots of pathway significance (–log10 P-value) in Subpathway-GM, Pathway-G, Pathway-M and IMPaLA. Subpathway-GM identified 26 significant metabolic subpathways, corresponding to 25 entire pathways. Plus sign indicates that the pathway was identified by the corresponding method at the 1% significance level. Bold labels represent the additional pathways identified by Subpathway-GM. (D) Interaction network of the subpathway identified by Subpathway-GM. Two subpathways are connected by an edge if they share a non-empty intersection of metabolites or genes. Edge width between subpathways is proportional to the number of genes and metabolites shared by the two connected subpathways. Node size is proportional to the degree of the node. Node color reflects statistical significance of pathway (P-value). Subpathways well supported by existing literature are shown with a black border node.
© Copyright Policy - creative-commons
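
The network construction described for panel (D) can be illustrated with a minimal sketch. It assumes each subpathway is given as a set of gene and metabolite identifiers; the function name `subpathway_network` and the toy identifiers are hypothetical and do not come from the Subpathway-GM software. An edge is created when two subpathways share at least one member, its weight is the number of shared members (edge width in the figure), and the degree of each node is counted (node size in the figure).

```python
from itertools import combinations


def subpathway_network(members):
    """Build weighted edges between subpathways that share genes or metabolites.

    members: dict mapping a subpathway name to a set of gene/metabolite IDs.
    Returns (edges, degree), where edges maps a name pair to the number of
    shared members and degree maps each name to its number of neighbours.
    """
    edges = {}
    for a, b in combinations(sorted(members), 2):
        shared = members[a] & members[b]
        if shared:  # connect only when the intersection is non-empty
            edges[(a, b)] = len(shared)

    degree = {name: 0 for name in members}
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return edges, degree


# Hypothetical toy input, for illustration only.
toy_members = {
    "sub1": {"geneA", "geneB", "metX"},
    "sub2": {"geneB", "metX", "metY"},
    "sub3": {"metY", "geneC"},
    "sub4": {"geneD"},
}
edges, degree = subpathway_network(toy_members)
print(edges)
print(degree)
```

Node colour by P-value and the literature-support border are display attributes and are left out of this sketch.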

Mentions: We analyzed two colorectal cancer and one metastatic prostate cancer data sets using Subpathway-GM with the parameters n = 5 and s = 5. This study focused on identifying disease-related subpathways. We therefore set the parameter s = 5 because subpathways of this type (s ≥ 5) have reportedly been associated with disease in some studies and are considered by many groups to represent a pathway. To set an appropriate n value for the identification of disease-related subpathways, we examined the distances among known disease nodes within pathways based on disease genes and metabolites in the Genetic Association Database (GAD) (24) and HMDB (23) (Figure 2A). The average shortest distance among these nodes was 8.02, which was significantly smaller than that between all nodes within metabolic pathways (P < 2.2E-16; Wilcoxon rank-sum test). We further computed the shortest distance between each disease node and its nearest disease node and found that the distance was <5 for 85% of disease nodes (Figure 2B). Some studies have suggested that genes associated with the same disease tend to lie close together in biological pathways, and that their biological functions tend to be similar (25–27). A value of n = 5 thus seems to represent the closeness of genes and metabolites in diseases.
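
The nearest-disease-node calculation described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the pathway is treated as an unweighted, undirected graph stored as an adjacency dictionary, distance is counted in hops by breadth-first search, and the function names and the toy chain graph are hypothetical. The Wilcoxon rank-sum comparison against all node pairs is not covered.

```python
from collections import deque


def bfs_distances(graph, source):
    """Return shortest hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def nearest_disease_distances(graph, disease_nodes):
    """Map each disease node to the distance of its nearest other disease node."""
    nearest = {}
    for node in disease_nodes:
        dist = bfs_distances(graph, node)
        others = [dist[d] for d in disease_nodes if d != node and d in dist]
        if others:  # skip nodes with no reachable disease neighbour
            nearest[node] = min(others)
    return nearest


def fraction_below(values, threshold):
    """Fraction of values strictly below threshold (one point of the ECDF)."""
    values = list(values)
    return sum(v < threshold for v in values) / len(values)


# Hypothetical toy pathway: a chain a - b - c - d - e - f - g - h.
chain = "abcdefgh"
toy_graph = {n: set() for n in chain}
for u, v in zip(chain, chain[1:]):
    toy_graph[u].add(v)
    toy_graph[v].add(u)

toy_disease_nodes = ["a", "c", "h"]  # assumed disease annotations
nearest = nearest_disease_distances(toy_graph, toy_disease_nodes)
print(nearest)
print(fraction_below(nearest.values(), 5))
```

Evaluating `fraction_below` at a threshold of 5 corresponds to the kind of statistic reported for Figure 2B; on real data the graph and disease annotations would come from the pathway and disease databases cited in the text.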

