Towards structured output prediction of enzyme function.

Astikainen K, Holm L, Pitkänen E, Szedmak S, Rousu J - BMC Proc (2008)

Bottom Line: A polynomial kernel over the GTG feature set turned out to be a prerequisite for accurate function prediction. Combining GTG with string kernels boosted accuracy slightly in the case of EC class prediction. Structured output prediction with GTG features is shown to be computationally feasible and to have accuracy on par with state-of-the-art approaches in enzyme function prediction.


Affiliation: Department of Computer Science, PO Box 68, FI-00014 University of Helsinki, Finland. astikain@cs.helsinki.fi

ABSTRACT

Background: In this paper we describe work in progress on developing kernel methods for enzyme function prediction. Our focus is on developing so-called structured output prediction methods, where the enzymatic reaction is the combinatorial target object for prediction. We compared two structured output prediction methods, the Hierarchical Max-Margin Markov algorithm (HM3) and the Maximum Margin Regression algorithm (MMR), in hierarchical classification of enzyme function. As sequence features we use various string kernels and the GTG feature set derived from the global alignment trace graph of protein sequences.
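The sequence features described above can be illustrated with a minimal sketch of a polynomial kernel over a feature vector, summed with a simple string kernel. This is an illustration under stated assumptions, not the authors' implementation: the function names, the toy feature vectors standing in for GTG features, the choice of a k-mer spectrum kernel as the string kernel, the degree and offset values, and the unweighted sum are all hypothetical.

```python
from collections import Counter


def polynomial_kernel(x, y, degree=2, c=1.0):
    """Polynomial kernel (<x, y> + c)^degree over two feature vectors.

    x and y are hypothetical stand-ins for GTG feature vectors.
    """
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + c) ** degree


def spectrum_kernel(s, t, k=3):
    """k-mer spectrum kernel: inner product of k-mer count vectors.

    Used here only as one example of a string kernel (an assumption).
    """
    counts_s = Counter(s[i:i + k] for i in range(len(s) - k + 1))
    counts_t = Counter(t[i:i + k] for i in range(len(t) - k + 1))
    return sum(n * counts_t[kmer] for kmer, n in counts_s.items())


def combined_kernel(x, y, s, t, degree=2, c=1.0, k=3):
    """Unweighted sum of the two kernels (a sum of kernels is a kernel)."""
    return polynomial_kernel(x, y, degree, c) + spectrum_kernel(s, t, k)


# Toy inputs, invented for illustration only.
features_a, features_b = [1, 0, 1], [1, 1, 0]
seq_a, seq_b = "MKVLA", "KVLAM"
print(combined_kernel(features_a, features_b, seq_a, seq_b))
```

In practice the two kernels would usually be normalized before being added, so that neither dominates the sum; that step is omitted here to keep the sketch short.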

Results: In our experiments, in predicting enzyme EC classification we obtain over 85% accuracy (predicting the four-digit EC code) and over 91% microlabel F1 score (predicting individual EC digits). In predicting the Gold Standard enzyme families, we obtain over 79% accuracy (predicting the family correctly) and over 89% microlabel F1 score (predicting superfamilies and families). In the latter case, structured output methods are significantly more accurate than a nearest neighbor classifier. A polynomial kernel over the GTG feature set turned out to be a prerequisite for accurate function prediction. Combining GTG with string kernels boosted accuracy slightly in the case of EC class prediction.
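The difference between the two reported measures can be sketched as follows. This is a minimal illustration, assuming that each level of the EC hierarchy on the path to a code counts as one microlabel and that F1 is pooled over all microlabels of all examples; the function names and the example codes are hypothetical and the exact definition used in the paper may differ.

```python
def ec_prefixes(ec_code):
    """Return the hierarchy nodes on the path to an EC code.

    "4.3.1.17" -> {"4", "4.3", "4.3.1", "4.3.1.17"}
    """
    parts = ec_code.split(".")
    return {".".join(parts[:i]) for i in range(1, len(parts) + 1)}


def accuracy(true_codes, pred_codes):
    """Fraction of examples whose full four-digit code is predicted exactly."""
    hits = sum(t == p for t, p in zip(true_codes, pred_codes))
    return hits / len(true_codes)


def microlabel_f1(true_codes, pred_codes):
    """F1 pooled over individual hierarchy nodes (microlabels)."""
    tp = fp = fn = 0
    for t, p in zip(true_codes, pred_codes):
        t_nodes, p_nodes = ec_prefixes(t), ec_prefixes(p)
        tp += len(t_nodes & p_nodes)   # nodes predicted and correct
        fp += len(p_nodes - t_nodes)   # nodes predicted but wrong
        fn += len(t_nodes - p_nodes)   # true nodes that were missed
    total = 2 * tp + fp + fn
    return 2 * tp / total if total else 0.0


# Illustrative codes: the first prediction is wrong only in the last digit.
true_codes = ["4.3.1.17", "1.1.1.1"]
pred_codes = ["4.3.1.19", "1.1.1.1"]
print(accuracy(true_codes, pred_codes))       # 0.5
print(microlabel_f1(true_codes, pred_codes))  # 0.875
```

A prediction that misses only the last digit counts as a complete error for accuracy but still earns credit for the three correct upper levels in the microlabel F1 score, which is why the microlabel figures are higher than the accuracy figures.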

Conclusion: Structured output prediction with GTG features is shown to be computationally feasible and to have accuracy on par with state-of-the-art approaches in enzyme function prediction.


Figure 1: A chemical reaction catalyzed by the enzyme serine deaminase.

Mentions: Enzymes are the workhorses of living cells, producing energy and building blocks for cell growth as well as participating in the maintenance and regulation of the metabolic states of the cells. Reliable assignment of enzyme function, that is, the biochemical reactions (Fig. 1) catalyzed by the enzymes, is a prerequisite for high-quality metabolic reconstruction and the analysis of metabolic fluxes [1].

