Machine learning assisted design of highly active peptides for drug discovery.

Giguère S, Laviolette F, Marchand M, Tremblay D, Moineau S, Liang X, Biron É, Corbeil J - PLoS Comput. Biol. (2015)

Bottom Line: Extensive analyses demonstrate how these algorithms can be part of an iterative combinatorial chemistry procedure to speed up the discovery and the validation of peptide leads. Moreover, the proposed approach does not require the use of known ligands for the target protein since it can leverage recent multi-target machine learning predictors where ligands for similar targets can serve as initial training data. Finally, we validated the proposed approach in vitro with the discovery of new cationic antimicrobial peptides.


Affiliation: Department of Computer Science and Software Engineering, Université Laval, Québec, Canada.

ABSTRACT
The discovery of peptides possessing high biological activity is very challenging due to the enormous diversity for which only a minority have the desired properties. To lower cost and reduce the time to obtain promising peptides, machine learning approaches can greatly assist in the process and even partly replace expensive laboratory experiments by learning a predictor with existing data or with a smaller amount of data generation. Unfortunately, once the model is learned, selecting peptides having the greatest predicted bioactivity often requires a prohibitive amount of computational time. For this combinatorial problem, heuristics and stochastic optimization methods are not guaranteed to find adequate solutions. We focused on recent advances in kernel methods and machine learning to learn a predictive model with proven success. For this type of model, we propose an efficient algorithm based on graph theory, that is guaranteed to find the peptides for which the model predicts maximal bioactivity. We also present a second algorithm capable of sorting the peptides of maximal bioactivity. Extensive analyses demonstrate how these algorithms can be part of an iterative combinatorial chemistry procedure to speed up the discovery and the validation of peptide leads. Moreover, the proposed approach does not require the use of known ligands for the target protein since it can leverage recent multi-target machine learning predictors where ligands for similar targets can serve as initial training data. Finally, we validated the proposed approach in vitro with the discovery of new cationic antimicrobial peptides. Source code freely available at http://graal.ift.ulaval.ca/peptide-design/.

pcbi.1004074.g001: Illustration of the 3-partite graph G^{h_y} with k = 3 and a two-letter alphabet. In this graph, every source-sink path represents a peptide of size 5 (l = n + k − 1) based on the alphabet {A, B}.

Mentions: Here, we assume that we have, for a fixed target y, a prediction function h_y(x) given by Equation (6). In this case, we show how the problem of finding the peptide of maximal bioactivity reduces to the problem of finding the longest path in a directed acyclic graph (DAG). Note that, throughout this manuscript, we assume that the length of a path is given by the sum of the weights on its edges. To solve this problem, we construct a DAG with a source and a sink vertex such that, for every possible peptide x, there exists exactly one path associated with x that goes from the source to the sink. Moreover, the length of the path associated with x is exactly h_y(x). Thus, if the size of the constructed graph is polynomial in l, any algorithm that efficiently solves the longest path problem in a DAG will also efficiently find the peptide of maximal bioactivity. A simplification of the graph is shown in Fig. 1 to assist in the comprehension of the formal definition that follows.
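
The reduction described above can be illustrated with a small sketch. Equation (6) is not reproduced in this excerpt, so the code below assumes, purely for illustration, that the prediction decomposes as a sum of position-specific k-mer scores, h_y(x) = Σ_i W(x[i:i+k], i). The names `best_peptide`, `weight` and `toy_weight` are hypothetical and do not come from the authors' source code. Each layer of the graph holds one vertex per k-mer, an edge links two k-mers that overlap on k − 1 letters, and the longest source-sink path is found by dynamic programming over the layers.

```python
from itertools import product


def best_peptide(alphabet, k, length, weight):
    """Return (score, peptide) maximizing sum_i weight(x[i:i+k], i).

    Assumes the predictor decomposes over position-specific k-mers
    (an illustrative assumption, not the paper's Equation (6)).
    Requires 1 <= k <= length.
    """
    n = length - k + 1  # number of layers, since l = n + k - 1
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]

    # best[s] = (longest path from the source to vertex s in the
    #            current layer, peptide prefix spelled by that path)
    best = {s: (weight(s, 0), s) for s in kmers}

    for i in range(1, n):
        nxt = {}
        for s in kmers:
            # Predecessors of s end with the first k-1 letters of s.
            score, prefix = max(best[a + s[:-1]] for a in alphabet)
            nxt[s] = (score + weight(s, i), prefix + s[-1])
        best = nxt

    # Edges into the sink carry weight 0, so take the best last vertex.
    return max(best.values())


def toy_weight(kmer, i):
    """Hypothetical integer scores standing in for a learned model."""
    return sum((ord(c) - ord("A") + 1) * ((i + j) % 3 - 1)
               for j, c in enumerate(kmer))


if __name__ == "__main__":
    # Same setting as the figure: alphabet {A, B}, k = 3, peptides of size 5.
    print(best_peptide("AB", 3, 5, toy_weight))
```

Under this assumption the graph has n layers of |alphabet|^k vertices, each with |alphabet| incoming edges, so for a fixed k its size grows linearly with the peptide length l, whereas exhaustive enumeration would visit |alphabet|^l peptides.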

