MitoFates: improved prediction of mitochondrial targeting sequences and their cleavage sites.

Fukasawa Y, Tsuji J, Fu SC, Tomii K, Horton P, Imai K - Mol. Cell Proteomics (2015)

Bottom Line: Here we describe MitoFates, an improved prediction method for cleavable N-terminal mitochondrial targeting signals (presequences) and their cleavage sites. Interestingly, these include candidate regulators of parkin translocation to damaged mitochondria, and also many genes with known disease mutations, suggesting that careful investigation of MitoFates predictions may be helpful in elucidating the role of mitochondria in health and disease. MitoFates is open source with a convenient web server publicly available.


Affiliation: Department of Computational Biology, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba, 277-8561, Japan


Figure 2: MTS discrimination performance comparison between MitoFates and previous predictors on an independent test data set. A, Comparison by PR-curve. B, True positive rate versus false positive rate. C, Statistical significance (vertical axis) of the true positive rate difference between MitoFates and other predictors plotted against false positive rate. For each input sequence the predictors output both a score (for TPpred2 we extracted the GRHCRF-scores from their software) and a label (mitochondrial, ER, or other); the dashed lines show performance based purely on the scores, and the solid lines always count mislabeled mitochondrial proteins as false negatives and nonmitochondrial proteins with nonmitochondrial labels as true negatives, regardless of score.
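The two counting conventions in the caption (dashed versus solid lines) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function name `confusion_counts`, the record layout, and the `score >= cutoff` rule are assumptions made for illustration only.

```python
from collections import Counter


def confusion_counts(records, cutoff, use_labels):
    """Count TP/FP/TN/FN at one score cutoff.

    records: iterable of (is_mito, score, label) tuples, where label is
    "mitochondrial", "ER", or "other" (hypothetical layout).
    use_labels=False mimics the dashed lines (score only).
    use_labels=True mimics the solid lines: a sequence whose label is not
    "mitochondrial" is never counted as a positive prediction, whatever
    its score.
    """
    counts = Counter()
    for is_mito, score, label in records:
        predicted_positive = score >= cutoff  # assumed thresholding rule
        if use_labels and label != "mitochondrial":
            predicted_positive = False
        if is_mito:
            counts["TP" if predicted_positive else "FN"] += 1
        else:
            counts["FP" if predicted_positive else "TN"] += 1
    return counts


# Toy data, invented purely to show the difference between the two modes.
toy = [
    (True, 0.9, "mitochondrial"),
    (True, 0.8, "ER"),              # mislabeled mitochondrial protein
    (False, 0.7, "other"),          # high score, nonmitochondrial label
    (False, 0.6, "mitochondrial"),
    (False, 0.1, "other"),
]
score_only = confusion_counts(toy, cutoff=0.5, use_labels=False)
label_aware = confusion_counts(toy, cutoff=0.5, use_labels=True)
```

Sweeping `cutoff` over the observed scores and plotting TP/(TP+FN) against FP/(FP+TN) would give one curve per mode, analogous to panel B.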

Mentions: We compared the presequence prediction performance of our predictor (MitoFates) with that of four previously developed predictors, TPpred2, TargetP (ver. 1.1), Predotar (ver. 1.03), and MitoProtII (ver. 1.101), on the independent test data containing 78 presequences described in Methods. Fig. 2A shows the 11-point precision-recall curve (PR-curve) of each predictor, averaged over 10 random selections of 500 negative test set proteins. MitoFates achieves an average precision of 84% on the PR-curve, outperforming TPpred2, Predotar, TargetP, and MitoProtII, which obtained average precisions of 81%, 79%, 78%, and 74%, respectively. In particular, MitoFates attains better precision for recall values of 50–80% (in this range the average precisions of MitoFates, TPpred2, Predotar, TargetP, and MitoProtII are 91%, 81%, 82%, 77%, and 77%, respectively). The ROC AUC of MitoFates is also superior to those of the other predictors (Table I). For MitoFates, we focused on two prediction cutoffs (0.5 and 0.385) based on a 5-fold cross-validation test within the training data set (supplemental Fig. S2): 0.5 is the default cutoff determined by LIBSVM (34), with a precision and recall of 0.83 and 0.73, respectively, and 0.385 corresponds to a precision and recall of 0.79 and 0.80, respectively. At both prediction cutoff values, MitoFates' Matthews correlation coefficient (MCC) is better than those of the other predictors at their default cutoffs. In addition, the PR-curve and ROC AUC of MitoFates are better than those of TargetP and Predotar even when MitoFates is trained on their training data sets (supplemental Fig. S3), suggesting that our novel features contribute to improved prediction accuracy (the training data set of TPpred2 overlapped to a large extent with our test data, so we did not do this experiment on the TPpred2 training data). However, the PR-curve and ROC AUC of MitoFates trained on those data sets are inferior to those of MitoFates trained on its original data set, suggesting that the updated MitoFates training data also contributes to its superior performance.
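For readers unfamiliar with the metrics named above, the following sketch shows one common way to compute an 11-point precision-recall curve, its average precision, and the MCC. It assumes the standard information-retrieval interpolation (at each recall level 0.0, 0.1, ..., 1.0, take the maximum precision reached at any recall at or above that level); the excerpt does not state which interpolation the authors used, and all function names here are hypothetical.

```python
import math


def pr_points(scores, labels):
    """Return (recall, precision) after each rank, highest score first.

    labels: 1 for a true presequence, 0 for a negative. Tied scores are
    ranked in input order, which is a simplification.
    """
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    n_pos = sum(labels)
    tp = 0
    points = []
    for rank, (_, is_pos) in enumerate(ranked, start=1):
        tp += is_pos
        points.append((tp / n_pos, tp / rank))
    return points


def eleven_point_curve(scores, labels):
    """Interpolated precision at recall levels 0.0, 0.1, ..., 1.0."""
    points = pr_points(scores, labels)
    curve = []
    for i in range(11):
        level = i / 10
        # Small tolerance guards against floating-point rounding in recall.
        reachable = [p for r, p in points if r >= level - 1e-12]
        curve.append(max(reachable) if reachable else 0.0)
    return curve


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy scores and labels, invented for illustration (not data from the paper).
curve = eleven_point_curve([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
average_precision = sum(curve) / len(curve)
```

Averaging such curves over repeated random draws of negatives, as described for Fig. 2A, would amount to averaging the 11 values position by position across the draws.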

