A p-Median approach for predicting drug response in tumour cells.

Fersini E, Messina E, Archetti F - BMC Bioinformatics (2014)

Bottom Line: The complexity of biological data related to the genetic origins of tumour cells poses significant challenges to gleaning knowledge that can be used to predict therapeutic responses. The experimental investigation performed on the NCI60 dataset highlights three main findings: (1) Consensus p-Median is able to create groups of cell lines that are highly correlated both in terms of gene expression and drug response; (2) from a biological point of view, the proposed approach enables the selection of genes that are strongly involved in several cancer processes; (3) the final prediction of drug responses, built upon Consensus p-Median and the selected genes, represents a promising step for predicting potentially useful drugs. The proposed learning framework represents a promising approach for predicting drug response in tumour cells.


ABSTRACT

Background: The complexity of biological data related to the genetic origins of tumour cells poses significant challenges to gleaning knowledge that can be used to predict therapeutic responses. In order to discover a link between gene expression profiles and drug responses, a computational framework based on Consensus p-Median clustering is proposed. The main goal is to predict anticancer responses in silico by simultaneously extracting common patterns among tumour cell lines, selecting genes that could potentially explain the therapy outcome, and finally learning a probabilistic model able to predict the therapeutic responses.

Results: The experimental investigation performed on the NCI60 dataset highlights three main findings: (1) Consensus p-Median is able to create groups of cell lines that are highly correlated both in terms of gene expression and drug response; (2) from a biological point of view, the proposed approach enables the selection of genes that are strongly involved in several cancer processes; (3) the final prediction of drug responses, built upon Consensus p-Median and the selected genes, represents a promising step for predicting potentially useful drugs.

Conclusion: The proposed learning framework represents a promising approach for predicting drug response in tumour cells.


Figure 8: Comparison of BN accuracy on the Liu dataset. The BNs were trained on the genes selected by the CFS policy.

Mentions: A further validation assesses how accurately Bayesian Networks (BNs) predict drug responses. In particular, we measured the prediction accuracy of BNs trained with the top relevant genes characterizing the groups of cell lines produced by each of the clustering approaches considered. As a baseline, we also report the accuracy of a trivial classifier that predicts each drug response as its majority class in the training data. Figures 5, 6, 7 and 8 compare the investigated approaches in terms of accuracy, i.e. the percentage of drug responses correctly predicted. The BNs are trained on the top ten genes selected by the Information Gain (IG) and Correlation-based Subset Evaluation (CFS) policies. Because the experimental investigation uses leave-one-out cross validation, the genes used for training the BNs are those selected most frequently among the top ten genes of each fold. Specifically, given the L=60 solutions obtained by leave-one-out (for each clustering approach), a voting mechanism is applied: a gene receives a vote whenever it appears in the top ten list of relevant genes for a given leave-one-out run. Once the votes have been collected, the 10 genes with the highest number of votes are selected as the most important and used to train the BN. All the solutions generated by the proposed approach outperform those obtained by the other methods.
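The voting mechanism over leave-one-out folds can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `vote_top_genes`, the alphabetical tie-breaking rule and the toy gene lists are assumptions made only for illustration, since the text does not say how ties between equally voted genes are resolved.

```python
from collections import Counter


def vote_top_genes(fold_rankings, k=10):
    """Return the k genes that appear most often across per-fold top lists.

    fold_rankings: one list of gene names per leave-one-out run
    (the top-ranked genes of that run).
    """
    votes = Counter()
    for top_genes in fold_rankings:
        # A gene gets at most one vote per run, even if listed twice.
        votes.update(set(top_genes))
    # Sort by vote count (descending); ties are broken alphabetically.
    # The tie-breaking rule is an assumption, not taken from the paper.
    ranked = sorted(votes.items(), key=lambda item: (-item[1], item[0]))
    return [gene for gene, _ in ranked[:k]]


# Toy input: three runs with three top genes each (hypothetical data).
folds = [
    ["TP53", "EGFR", "MYC"],
    ["TP53", "MYC", "KRAS"],
    ["TP53", "EGFR", "BRCA1"],
]
selected = vote_top_genes(folds, k=2)
print(selected)
```

In the setting described above there would be 60 runs and k=10; the toy input is kept small so the vote counts can be checked by hand.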
Concerning the Sherf dataset (Figures 5 and 6), the BNs trained according to Consensus p-Median achieve an average prediction accuracy of 85.63% with the IG and 84.98% with the CFS selection policy, outperforming Probabilistic D-Clustering (81.8% with IG and 83.1% with CFS), p-Median (82.9% with IG and 83.33% with CFS), STVQ (83.1% with IG and 82.6% with CFS), k-Means (83.2% with IG and 82.8% with CFS), Relational k-Means (84.0% with IG and 84.2% with CFS) and the trivial classifier (80.5%).
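The trivial baseline mentioned above, which predicts each drug response as its majority class in the training data, can be sketched as follows. This is only an illustrative sketch: it assumes the baseline is evaluated under the same leave-one-out protocol as the BNs, and the function name, the tie-breaking rule and the response labels are hypothetical.

```python
from collections import Counter


def majority_baseline_loo_accuracy(labels):
    """Leave-one-out accuracy (in percent) of a majority-class predictor.

    labels: list of observed responses to one drug, one per cell line.
    """
    correct = 0
    for i, held_out in enumerate(labels):
        # Training data: every cell line except the held-out one.
        training = labels[:i] + labels[i + 1:]
        counts = Counter(training)
        # Most frequent training label; ties are broken by label order
        # (an assumption, the paper does not specify this).
        majority = min(counts, key=lambda lab: (-counts[lab], lab))
        correct += majority == held_out
    return 100.0 * correct / len(labels)


# Toy input: five cell lines with hypothetical response labels.
responses = ["sensitive"] * 4 + ["resistant"]
accuracy = majority_baseline_loo_accuracy(responses)
print(accuracy)
```

Such a baseline shows how much accuracy is attainable from class imbalance alone, which is why the reported figures are compared against it.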