ComPath: comparative enzyme analysis and annotation in pathway/subsystem contexts.

Choi K, Kim S - BMC Bioinformatics (2008)

Bottom Line: ComPath allows users to compare biological pathways in multiple genomes using a spreadsheet-style web interface, where various sequence-based analyses can be performed to compare enzymes (e.g. sequence clustering) and pathways (e.g. pathway hole identification), to search a genome for de novo prediction of enzymes, or to annotate a genome in comparison with reference genomes of choice. Gene neighborhood and pathway neighborhood (global network) visualization tools provide context information that complements the conventional KEGG map representation. ComPath is an interactive workbench for pathway reconstruction, annotation, and analysis in which experts can perform various sequence, domain, and context analyses through an intuitive, interactive spreadsheet-style interface.


Affiliation: School of Informatics, Indiana University, Bloomington, IN 47408, USA. kwchoi@indiana.edu

ABSTRACT

Background: Once a new genome is sequenced, one of the important questions is which biological pathways are present and which are absent. Analyzing biological pathways in a genome is complicated because many biological entities are involved in each pathway and because pathways are not identical across organisms. Computational pathway identification and analysis therefore involves a number of computational tools and databases and is typically done in comparison with pathways in other organisms. This computational burden is well beyond the capability of biologists, so information systems for reconstructing, annotating, and analyzing biological pathways are much needed. We introduce a new comparative pathway analysis workbench, ComPath, which integrates various resources and computational tools through an interactive spreadsheet-style web interface for reliable pathway analyses.

Results: ComPath allows users to compare biological pathways in multiple genomes using a spreadsheet-style web interface, where various sequence-based analyses can be performed to compare enzymes (e.g. sequence clustering) and pathways (e.g. pathway hole identification), to search a genome for de novo prediction of enzymes, or to annotate a genome in comparison with reference genomes of choice. To fill in pathway holes or make de novo enzyme predictions, ComPath integrates multiple computational methods: FASTA, Whole-HMM, CSR-HMM (a method of our own introduced in this paper), and PDB-domain search. Our experiments show that the FASTA and CSR-HMM search methods generally outperform the Whole-HMM and PDB-domain search methods in terms of sensitivity, but FASTA search performs poorly in terms of specificity, detecting more false positives as the E-value cutoff increases. Overall, the CSR-HMM search method performs best in terms of both sensitivity and specificity. Gene neighborhood and pathway neighborhood (global network) visualization tools provide context information that complements the conventional KEGG map representation.

Conclusion: ComPath is an interactive workbench for pathway reconstruction, annotation, and analysis in which experts can perform various sequence, domain, and context analyses through an intuitive, interactive spreadsheet-style interface.
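
The trade-off reported in the Results (more false positives as the E-value cutoff increases) can be illustrated with a small sketch. The Python below is not ComPath code: the function name, the toy hit list, and the cutoffs are assumptions made up for illustration, and the numbers are not results from the paper.

```python
from typing import List, Tuple

# Each hit is (e_value, is_true_enzyme). The labels are hypothetical toy data,
# not measurements from the ComPath experiments.
Hit = Tuple[float, bool]

TOY_HITS: List[Hit] = [
    (1e-50, True),
    (1e-20, True),
    (1e-3, True),
    (1e-4, False),
    (0.5, False),
    (5.0, False),
]


def sensitivity_specificity(hits: List[Hit], cutoff: float) -> Tuple[float, float]:
    """Treat a hit as a predicted enzyme when its E-value is <= cutoff."""
    tp = sum(1 for e, pos in hits if pos and e <= cutoff)
    fn = sum(1 for e, pos in hits if pos and e > cutoff)
    fp = sum(1 for e, pos in hits if not pos and e <= cutoff)
    tn = sum(1 for e, pos in hits if not pos and e > cutoff)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


if __name__ == "__main__":
    # A looser (larger) cutoff accepts more hits: sensitivity can only rise,
    # specificity can only fall.
    for cutoff in (1e-10, 1e-2, 1.0):
        sens, spec = sensitivity_specificity(TOY_HITS, cutoff)
        print(f"cutoff={cutoff:g} sensitivity={sens:.2f} specificity={spec:.2f}")
```

With this toy list, raising the cutoff admits the remaining true enzyme but also admits non-enzymes, which is the behaviour the abstract attributes to FASTA search at large cutoffs.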



Figure 7: Case II: isozymes. EC 2.7.1.11 and six gamma-proteobacteria species were selected: Escherichia coli K-12 MG1655 (eco), Escherichia coli O157 EDL933 (ece), Salmonella typhi CT18 (sty), Salmonella enterica serovar typhi Ty2 (stt), Shigella flexneri 301 (sfl), Shigella flexneri 8401 (sfv). Initial pathway reconstruction detected no phosphofructokinase gene in Shigella flexneri 8401 (sfv). BAG clustering clearly divides class I and class II isozymes from the other five species. To detect the two isozyme classes in Shigella flexneri 8401, two sequential CSR-HMM searches using the two sets of protein sequences as queries detect SFV_3578 (class I) and SFV_1498 (class II).

Mentions: Isozymes are enzymes with the same catalytic activity that are distant in terms of sequence similarity. For example, phosphofructokinase (EC:2.7.1.11), which exists as a homotetramer in bacteria and mammals, has two isozymes in Escherichia coli and related species (pfkA and pfkB). pfkB is a minor phosphofructokinase that is not evolutionarily related to the major isozyme (gene pfkA). In Figure 7, the initial reconstruction shows no phosphofructokinase gene for Shigella flexneri 8401, and BAG clustering clearly divides class I and class II isozymes. To detect the two isozyme classes in Shigella flexneri 8401, two sequential CSR-HMM searches detect SFV_3578 (class I) and SFV_1498 (class II).
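
The pathway hole in this case study can be sketched as a lookup over a spreadsheet-like table (EC numbers as rows, genomes as columns). The Python below is only an illustration under stated assumptions: the table layout, the function name, and the placeholder gene labels are hypothetical and are not ComPath's data model. Only the genome codes, EC 2.7.1.11, pfkA/pfkB, SFV_3578 and SFV_1498 come from the text.

```python
from typing import Dict, List

# Rows are EC numbers, columns are genome codes, cells are gene lists.
# "placeholder" entries are hypothetical stand-ins, not real gene identifiers.
Table = Dict[str, Dict[str, List[str]]]

TABLE: Table = {
    "2.7.1.11": {
        "eco": ["pfkA", "pfkB"],
        "ece": ["placeholder_I", "placeholder_II"],
        "sty": ["placeholder_I", "placeholder_II"],
        "stt": ["placeholder_I", "placeholder_II"],
        "sfl": ["placeholder_I", "placeholder_II"],
        "sfv": [],  # initial reconstruction: no phosphofructokinase gene
    },
}


def find_pathway_holes(table: Table, target: str) -> List[str]:
    """Return EC numbers annotated in some other genome but empty in `target`."""
    holes = []
    for ec, row in table.items():
        in_target = bool(row.get(target))
        in_others = any(genes for genome, genes in row.items() if genome != target)
        if in_others and not in_target:
            holes.append(ec)
    return sorted(holes)


if __name__ == "__main__":
    print(find_pathway_holes(TABLE, "sfv"))
    # Fill the hole with the two genes the text reports for sfv
    # (class I and class II), then check again.
    TABLE["2.7.1.11"]["sfv"] = ["SFV_3578", "SFV_1498"]
    print(find_pathway_holes(TABLE, "sfv"))
```

The sketch covers only the bookkeeping around a hole. The searches that actually find the candidate genes (CSR-HMM in the text) are outside its scope.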

