Pareto Optimization Identifies Diverse Set of Phosphorylation Signatures Predicting Response to Treatment with Dasatinib.

Klammer M, Dybowski JN, Hoffmann D, Schaab C - PLoS ONE (2015)

Bottom Line: Multivariate biomarkers that can predict the effectiveness of targeted therapy in individual patients are highly desired. All four candidates reached the same good prediction accuracy (83%) as the originally published biomarker. In summary, the method presented here allows for a robust and simultaneous identification of multiple multivariate biomarkers that are optimized for prediction performance, size, and relevance.


Affiliation: Evotec (München) GmbH, Dept. of Bioinformatics, Am Klopferspitz 19a, 82152 Martinsried, Germany.

ABSTRACT
Multivariate biomarkers that can predict the effectiveness of targeted therapy in individual patients are highly desired. Previous biomarker discovery studies have largely focused on the identification of single biomarker signatures, aimed at maximizing prediction accuracy. Here, we present a different approach that identifies multiple biomarkers by simultaneously optimizing their predictive power, number of features, and proximity to the drug target in a protein-protein interaction network. To this end, we incorporated NSGA-II, a fast and elitist multi-objective optimization algorithm that is based on the principle of Pareto optimality, into the biomarker discovery workflow. The method was applied to quantitative phosphoproteome data of 19 non-small cell lung cancer (NSCLC) cell lines from a previous biomarker study. The algorithm successfully identified a total of 77 candidate biomarker signatures predicting response to treatment with dasatinib. Through filtering and similarity clustering, this set was trimmed to four final biomarker signatures, which then were validated on an independent set of breast cancer cell lines. All four candidates reached the same good prediction accuracy (83%) as the originally published biomarker. Although the newly discovered signatures were diverse in their composition and in their size, the central protein of the originally published signature - integrin β4 (ITGB4) - was also present in all four Pareto signatures, confirming its pivotal role in predicting dasatinib response in NSCLC cell lines. In summary, the method presented here allows for a robust and simultaneous identification of multiple multivariate biomarkers that are optimized for prediction performance, size, and relevance.
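The Pareto principle that the abstract relies on can be illustrated with a minimal sketch. It is not the authors' NSGA-II workflow, only the non-domination test that such an algorithm builds on. The function name `pareto_front`, the toy candidate values, and the convention that all three objectives are expressed as quantities to be minimized (error rate, number of features, network distance to the drug target) are assumptions made for illustration.

```python
import numpy as np

def pareto_front(objectives: np.ndarray) -> np.ndarray:
    """Return a boolean mask of non-dominated rows (all objectives minimized)."""
    n = objectives.shape[0]
    mask = np.ones(n, dtype=bool)
    for i in range(n):
        # Row j dominates row i if it is no worse in every objective
        # and strictly better in at least one.
        no_worse = np.all(objectives <= objectives[i], axis=1)
        better = np.any(objectives < objectives[i], axis=1)
        if np.any(no_worse & better):
            mask[i] = False
    return mask

# Hypothetical candidates (made-up numbers, not data from the study).
# Columns: error rate, number of features, network distance to target.
candidates = np.array([
    [0.10, 5, 2.0],
    [0.10, 8, 2.0],   # dominated by row 0 (same error and distance, more features)
    [0.20, 3, 1.0],
    [0.05, 12, 3.0],
    [0.25, 3, 1.5],   # dominated by row 2
])
front_mask = pareto_front(candidates)
print(front_mask)
```

Rows that survive the filter represent different trade-offs between the objectives, which is why a Pareto-based search returns a set of signatures rather than a single one.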


pone.0128542.g003: Hierarchical clustering of the 53 accepted solutions on the Pareto front in feature space. In each row, red areas represent features (phosphosites) that are part of the corresponding solution. The solutions were subdivided into four clusters according to the row dendrogram on the left. Cluster numbers are indicated on the right.

Mentions: Each of the identified solutions on the Pareto front is optimal in the sense that none is dominated by any other solution. Therefore, each solution could be evaluated individually. Here we took another approach and investigated whether the set of solutions can be reduced by clustering them according to their similarity while retaining discriminatory power. To this end, we hierarchically clustered the solutions in feature space using the Ward method and obtained four major clusters (see Fig 3). For each of these clusters, the solution with the smallest Euclidean distance to the respective cluster centroid was selected as the so-called Pareto signature for further analysis (see Fig 2B).
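The clustering and selection step described above can be sketched as follows. This is a hedged illustration using SciPy's Ward linkage, not the authors' code. It assumes each solution is encoded as a binary row vector over the features (1 if the phosphosite is part of the solution), and it takes the cluster member closest to the centroid as the representative. The helper name `representative_solutions` and the toy matrix are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def representative_solutions(membership: np.ndarray, n_clusters: int = 4):
    """Ward-cluster solutions in feature space and pick one representative per cluster."""
    # Ward linkage on the rows (solutions), Euclidean distance.
    tree = linkage(membership, method="ward")
    # Cut the dendrogram so that at most n_clusters clusters remain.
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")
    reps = {}
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        centroid = membership[idx].mean(axis=0)
        dists = np.linalg.norm(membership[idx] - centroid, axis=1)
        # Representative = cluster member nearest to the centroid.
        reps[int(c)] = int(idx[np.argmin(dists)])
    return labels, reps

# Toy data (assumption): 12 solutions x 10 features in four groups,
# each group sharing two core features, with one variant per group.
core = np.zeros((4, 10))
for g in range(4):
    core[g, 2 * g:2 * g + 2] = 1.0
membership = np.repeat(core, 3, axis=0)
membership[2, 8] = 1.0    # variant of group 0
membership[5, 9] = 1.0    # variant of group 1
membership[8, 8] = 1.0    # variant of group 2
membership[11, 9] = 1.0   # variant of group 3

labels, reps = representative_solutions(membership, n_clusters=4)
print(labels, reps)
```

Selecting an actual cluster member, rather than the centroid itself, keeps each representative a valid solution that was evaluated during the optimization.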

