PredSTP: a highly accurate SVM based model to predict sequential cystine stabilized peptides.

Islam SM, Sajed T, Kearney CM, Baker EJ - BMC Bioinformatics (2015)

Bottom Line: Their effective interstitial and macro-environmental use requires energetic and structural stability. As a result, there is a need for automated high-throughput member classification approaches that leverage their demonstrated tertiary and functional homology. The ability to rapidly filter sequences for potential bioactive peptides can greatly compress the time between peptide identification and testing structural and functional properties for possible antimicrobial and insecticidal candidates.


Affiliation: Institute of Biomedical Studies, Baylor University, Waco, TX, USA. S_Islam@Baylor.edu.

ABSTRACT

Background: Numerous organisms have evolved a wide range of toxic peptides for self-defense and predation. Their effective interstitial and macro-environmental use requires energetic and structural stability. One successful group of these peptides includes a tri-disulfide domain arrangement that offers toxicity and high stability. Sequential tri-disulfide connectivity variants create highly compact disulfide folds capable of withstanding a variety of environmental stresses. Their combination of toxicity and stability makes these peptides remarkably valuable as potential bio-insecticides, antimicrobial peptides and peptide drug candidates. However, the wide variation in sequence, source and modality among group members seriously limits our ability to rapidly identify potential members. As a result, there is a need for automated high-throughput member classification approaches that leverage their demonstrated tertiary and functional homology.

Results: We developed an SVM-based model to predict sequential tri-disulfide peptide (STP) toxins from peptide sequences. One optimized model, called PredSTP, predicted STPs in the training set with sensitivity, specificity, precision, accuracy and a Matthews correlation coefficient of 94.86%, 94.11%, 84.31%, 94.30% and 0.86, respectively, using 200-fold cross-validation. The same model outperforms existing prediction approaches on three independent out-of-sample test sets derived from the PDB.

Conclusion: PredSTP can accurately identify a wide range of cystine stabilized peptide toxins directly from sequences in a species-agnostic fashion. The ability to rapidly filter sequences for potential bioactive peptides can greatly compress the time between peptide identification and the testing of structural and functional properties for possible antimicrobial and insecticidal candidates. A web interface for predicting STP toxins is freely available at http://crick.ecs.baylor.edu/.
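The five scores reported in the Results all derive from a binary confusion matrix (STP versus non-STP). The following is a minimal sketch of the standard definitions, not the authors' code; the function name `classification_metrics` and the example counts are hypothetical and chosen only for illustration.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification scores from confusion-matrix counts.

    tp/fn: positives (e.g. STP chains) predicted correctly/incorrectly.
    tn/fp: negatives (e.g. non-STP chains) predicted correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn)          # true positive rate (recall)
    specificity = tn / (tn + fp)          # true negative rate
    precision = tp / (tp + fp)            # positive predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Matthews correlation coefficient; defined as 0 when any margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only; NOT the paper's confusion matrix.
example = classification_metrics(tp=90, fp=10, tn=190, fn=10)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

Because the negative class is larger than the positive class in this kind of data, precision can sit noticeably below sensitivity and specificity even when both of those are high, which is why MCC is a useful single summary.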


Fig. 3: Schematic of the process followed to develop and evaluate the SVM-based STP toxin classifier

Mentions: The training data set of 144 STP and 393 non-STP chains was evaluated using randomized sampling over 200 iterations to determine the optimal feature set. All six feature sets were examined (Additional file 1: Supplement 3), and the sensitivity, specificity, precision, accuracy and MCC scores were calculated (Fig. 3). Feature set 6 demonstrated the best accuracy and MCC, with values of 94.30% and 0.86, respectively, and was used as the basis for the remainder of the study. The receiver operating characteristic (ROC) curve for feature set 6 is provided in Fig. 4. In the rest of the article, the model is referred to as PredSTP.
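The randomized sampling described above can be pictured as repeated random train/test splits of the 144 STP and 393 non-STP chains. The sketch below is one plausible reading, not the authors' procedure: the source does not state the held-out proportion or whether sampling was stratified, so `test_fraction=0.2`, per-class stratification, the seed and the function name `stratified_random_splits` are all assumptions.

```python
import random


def stratified_random_splits(n_pos, n_neg, n_iter=200, test_fraction=0.2, seed=0):
    """Yield (train_indices, test_indices) for repeated random sub-sampling.

    Indices 0..n_pos-1 stand for positive (STP) chains and
    n_pos..n_pos+n_neg-1 for negative (non-STP) chains.
    test_fraction and stratification are assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    pos = list(range(n_pos))
    neg = list(range(n_pos, n_pos + n_neg))
    k_pos = max(1, round(n_pos * test_fraction))  # positives held out per split
    k_neg = max(1, round(n_neg * test_fraction))  # negatives held out per split
    for _ in range(n_iter):
        rng.shuffle(pos)
        rng.shuffle(neg)
        test = sorted(pos[:k_pos] + neg[:k_neg])
        train = sorted(pos[k_pos:] + neg[k_neg:])
        yield train, test


# Class sizes from the text: 144 STP and 393 non-STP chains, 200 iterations.
splits = list(stratified_random_splits(n_pos=144, n_neg=393, n_iter=200))
train, test = splits[0]
print(len(splits), len(train), len(test))
```

In such a scheme, a classifier would be trained and scored once per split for each candidate feature set, and the per-split scores averaged before the feature sets are compared.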

