Predicting drug side effects by multi-label learning and ensemble learning.

Zhang W, Liu F, Luo L, Zhang J - BMC Bioinformatics (2015)

Bottom Line: Drug-related features are associated with side effects, and feature dimensions have specific biological meanings. Recognizing critical dimensions and reducing irrelevant dimensions may help to reveal the causes of side effects. Computational experiments demonstrate that FS-MLKNN achieves good performance as well as explainable results.

Affiliation: School of Computer, Wuhan University, Wuhan, 430072, China. zhangwen@whu.edu.cn.

ABSTRACT

Background: Predicting drug side effects is an important topic in drug discovery. Although several machine learning methods have been proposed to predict side effects, there is still room for improvement. Firstly, side effect prediction is a multi-label learning task, so multi-label learning techniques can be adopted for it. Secondly, drug-related features are associated with side effects, and feature dimensions have specific biological meanings. Recognizing critical dimensions and reducing irrelevant dimensions may help to reveal the causes of side effects.

Methods: In this paper, we propose a novel method, the 'feature selection-based multi-label k-nearest neighbor method' (FS-MLKNN), which can simultaneously determine critical feature dimensions and construct high-accuracy multi-label prediction models.

Results: Computational experiments demonstrate that FS-MLKNN achieves good performance as well as explainable results. To achieve better performance, we further develop an ensemble learning model by integrating individual feature-based FS-MLKNN models. When compared with other state-of-the-art methods, the ensemble method performs better on benchmark datasets.

Conclusions: FS-MLKNN and the ensemble method are promising tools for side effect prediction. The source code and datasets are available in Additional file 1.

Fig. 2: (a) Flowchart of FS-MLKNN; (b) details of the GA-based wrapper feature selection; (c) details of constructing the FS-MLKNN prediction model.

Mentions: We design the feature selection-based multi-label k-nearest neighbor method (FS-MLKNN) to simultaneously determine the optimal feature dimensions and build multi-label prediction models. Here, the p dimensions of the feature vectors and the q dimensions of the side effect vectors are denoted as V = {v1, v2, ..., vp} and D = {d1, d2, ..., dq}, respectively. As shown in Fig. 2(a), FS-MLKNN has two steps.
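
The idea of coupling a GA-based wrapper with a multi-label nearest-neighbor predictor can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a plain neighbor-vote scorer in place of the full MLKNN posterior estimate, label-wise accuracy as the wrapper fitness, and a very simple GA (truncation selection, uniform crossover, bit-flip mutation). All function names and parameters (knn_multilabel_scores, mask_fitness, ga_select, p_mut, and so on) are hypothetical. A boolean mask of length p marks which dimensions of V are kept, and each row of Y is a binary vector over the q side effects in D.

```python
import numpy as np


def knn_multilabel_scores(X_train, Y_train, X_test, mask, k=3):
    """Score each label by the fraction of the k nearest training drugs
    that carry it, using only the feature dimensions where mask is True.
    (Simplified neighbor vote; an assumption, not the MLKNN estimate.)"""
    Xtr, Xte = X_train[:, mask], X_test[:, mask]
    # Pairwise squared Euclidean distances, shape (n_test, n_train).
    dist = ((Xte[:, None, :] - Xtr[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(dist, axis=1)[:, :k]
    return Y_train[nearest].mean(axis=1)  # shape (n_test, q)


def mask_fitness(X_train, Y_train, X_val, Y_val, mask, k=3):
    """Wrapper fitness of one feature mask: label-wise accuracy on a
    validation set after thresholding scores at 0.5 (assumed metric)."""
    if not mask.any():
        return 0.0  # an empty feature subset cannot predict anything
    scores = knn_multilabel_scores(X_train, Y_train, X_val, mask, k)
    return float(((scores >= 0.5) == Y_val.astype(bool)).mean())


def ga_select(X_train, Y_train, X_val, Y_val, k=3, pop_size=20,
              generations=15, p_mut=0.1, seed=0):
    """Search for a good feature mask with a deliberately simple GA."""
    rng = np.random.default_rng(seed)
    p = X_train.shape[1]
    pop = rng.random((pop_size, p)) < 0.5  # random initial masks

    def evaluate(population):
        return np.array([mask_fitness(X_train, Y_train, X_val, Y_val, m, k)
                         for m in population])

    for _ in range(generations):
        fitness = evaluate(pop)
        # Truncation selection: keep the better half unchanged (elitism).
        elite = pop[np.argsort(-fitness)[: pop_size // 2]]
        n_children = pop_size - len(elite)
        pa = elite[rng.integers(len(elite), size=n_children)]
        pb = elite[rng.integers(len(elite), size=n_children)]
        # Uniform crossover, then bit-flip mutation.
        children = np.where(rng.random(pa.shape) < 0.5, pa, pb)
        children ^= rng.random(children.shape) < p_mut
        pop = np.vstack([elite, children])

    fitness = evaluate(pop)
    best = int(np.argmax(fitness))
    return pop[best], float(fitness[best])
```

The returned mask indicates which feature dimensions the search retained; the selected dimensions can then be inspected for biological meaning, and a final predictor can be built on them.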

