MitoFates: improved prediction of mitochondrial targeting sequences and their cleavage sites.

Fukasawa Y, Tsuji J, Fu SC, Tomii K, Horton P, Imai K - Mol. Cell Proteomics (2015)

Bottom Line: Here we describe MitoFates, an improved prediction method for cleavable N-terminal mitochondrial targeting signals (presequences) and their cleavage sites. Interestingly, these include candidate regulators of parkin translocation to damaged mitochondria, and also many genes with known disease mutations, suggesting that careful investigation of MitoFates predictions may be helpful in elucidating the role of mitochondria in health and disease. MitoFates is open source with a convenient web server publicly available.


Affiliation: From the ‡Department of Computational Biology, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5, Kashiwanoha, Kashiwa, Chiba, 277-8561, Japan.


Figure 1: Local sequences and prediction performance of cleavage sites. A, Sequence logo of MPP cleavage sites partitioned into three classes (MPP only, MPP+Icp55, MPP+Oct1) based on recent proteomics data. The dashed line boxes show the range of positions covered by the PWMs for MPP, Oct1, and Icp55. B, Cleavage site accuracy comparison on the yeast data set. Error bars show the standard error of mean estimation based on 10-fold cross validation (only MitoFates is retrained; the other tools are used as distributed without retraining, but their prediction accuracy still varies between test folds).

Mentions: A large majority of presequences are cleaved by MPP, and many of those by secondary proteases as well. MPP cleavage sites display local sequence tendencies (20), the most conspicuous one being the presence of arginine in the −2 position in nearly all cases, consistent with electrostatic interaction between this arginine and negatively charged residues in MPP (27). After cleavage by MPP, the secondary proteases Oct1 and Icp55 further cleave some presequences, removing eight residues or a single residue, respectively (7). It is reasonable to hope that explicit modeling of this two-step process might improve the prediction of those presequences. Thus, we generated a consensus Position Weight Matrix (PWM) based on the frequencies of amino acids between the −4 position and the +5 position of training set sequences aligned by cleavage site. As with the amino acid composition values described above, we smoothed the observed frequencies in each column of the PWM with a 20-component Dirichlet mixture model (26). The PWM score is calculated as the log-odds ratio between those smoothed frequencies and a background composition based on the mature region of cleaved mitochondrial proteins. To predict if putative MPP cleavage sites are further cleaved by Oct1 or Icp55, we employed PWMs based on the cleavage sites of those peptidases in the training data. By inspection of the training data, we chose the range of positions covered by the PWMs to be [+1, +4] (length 4) and [+1, +2] (length 2) for MPP+Oct1 and MPP+Icp55, respectively (Fig. 1A). Because plant data was rather limited and PWMs require a large number of parameters (19 per column), we chose to use PWMs trained on the more abundant yeast data, even when making predictions for plant proteins (however, we did retrain the length distribution as described below).
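The PWM scoring described above can be illustrated with a minimal Python sketch. It is not the MitoFates implementation and rests on several assumptions: the window is taken as nine columns (positions −4 to −1 and +1 to +5, with no position 0); a plain pseudocount stands in for the 20-component Dirichlet mixture smoothing; a uniform background stands in for the mature-region composition; and the four training windows are invented toy data. The function names (build_pwm, score_window, best_cleavage_site, mature_start) are hypothetical. The "19 per column" figure in the text follows from the 20 amino acid frequencies in a column summing to one.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumption: uniform background; the paper instead uses the composition
# of the mature region of cleaved mitochondrial proteins.
BACKGROUND = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}

# Residues removed after MPP cleavage, as stated in the text:
# Icp55 removes one residue, Oct1 removes eight.
SECONDARY_TRIM = {"none": 0, "Icp55": 1, "Oct1": 8}


def build_pwm(windows, background=BACKGROUND, pseudocount=1.0):
    """Build a log-odds PWM from equal-length windows aligned by cleavage site.

    Assumption: a flat pseudocount replaces the Dirichlet mixture smoothing.
    """
    width = len(windows[0])
    if any(len(w) != width for w in windows):
        raise ValueError("all windows must have the same length")
    total = len(windows) + pseudocount * len(AMINO_ACIDS)
    pwm = []
    for col in range(width):
        counts = Counter(w[col] for w in windows)
        pwm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background[aa])
            for aa in AMINO_ACIDS
        })
    return pwm


def score_window(pwm, window):
    """Sum the per-column log-odds scores of one window."""
    return sum(column[aa] for column, aa in zip(pwm, window))


def best_cleavage_site(pwm, sequence, upstream=4):
    """Slide the PWM along the sequence and return (site, score) for the
    best window, where site is the number of residues MPP would remove."""
    width = len(pwm)
    best = None
    for start in range(len(sequence) - width + 1):
        score = score_window(pwm, sequence[start:start + width])
        if best is None or score > best[1]:
            best = (start + upstream, score)
    return best


def mature_start(mpp_site, secondary="none"):
    """Index of the first mature residue after optional secondary cleavage."""
    return mpp_site + SECONDARY_TRIM[secondary]


# Invented toy windows: columns are -4..-1, +1..+5; arginine sits at -2.
toy_windows = ["ARRLSTQAA", "LSRFSSTVP", "KTRYSTLAE", "GARLSSAIK"]
pwm = build_pwm(toy_windows)
site, score = best_cleavage_site(pwm, "DDDDD" + "ARRLSTQAA" + "DDDDD")
print(site, round(score, 2), mature_start(site, "Oct1"))
```

With real data, the windows would come from experimentally determined cleavage sites, and separate shorter PWMs (covering [+1, +4] for MPP+Oct1 and [+1, +2] for MPP+Icp55) would be built the same way to decide which value of `secondary` applies.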

