Novel Bayesian classification models for predicting compounds blocking hERG potassium channels.

Liu LL, Lu J, Lu Y, Zheng MY, Luo XM, Zhu WL, Jiang HL, Chen KX - Acta Pharmacol. Sin. (2014)

Bottom Line: The models were internally validated with the training set of compounds, and then applied to the test set for validation. Doddareddy's experimentally validated dataset with 60 compounds was used for external test set validation. A Bayesian classification model considering the effects of four molecular properties (Mw, PPSA, ALogP and pKa_basic) as well as extended-connectivity fingerprints (ECFP_14) achieved a global accuracy of 91%, a sensitivity of 90% and a specificity of 92% in the test set validation, and a global accuracy of 58%, a sensitivity of 61% and a specificity of 57% in the external test set validation.


Affiliation: Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China.

ABSTRACT

Aim: A large number of drug-induced long QT syndromes are ascribed to blockage of hERG potassium channels. The aim of this study was to construct novel computational models to predict compounds blocking hERG channels.

Methods: Doddareddy's hERG blockage data, containing 2644 compounds, were used and divided into training (2389) and test (255) sets. Laplacian-corrected Bayesian classification models were constructed using Discovery Studio. The models were internally validated with the training set of compounds, and then applied to the test set for validation. Doddareddy's experimentally validated dataset with 60 compounds was used for external test set validation.
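To make the Methods more concrete, the following is a minimal sketch of a Laplacian-corrected naive Bayesian scorer in the form commonly described for fingerprint-based models. It is not Discovery Studio's implementation and not the authors' code. The function names, the feature labels and the toy data are hypothetical, and the weight formula log((A_F + 1) / (N_F · P + 1)) is assumed here, where N_F is the number of training compounds containing feature F, A_F is the number of those that are active, and P is the overall fraction of actives.

```python
import math
from collections import Counter

def train_laplacian_bayes(samples):
    """samples: list of (set_of_feature_ids, is_active) pairs (hypothetical format).
    Returns a dict mapping each feature to its Laplacian-corrected weight."""
    n_total = len(samples)
    n_active = sum(1 for _, active in samples if active)
    base_rate = n_active / n_total  # P: fraction of actives in the training set

    seen = Counter()         # N_F: compounds containing feature F
    seen_active = Counter()  # A_F: active compounds containing feature F
    for features, active in samples:
        for f in features:
            seen[f] += 1
            if active:
                seen_active[f] += 1

    # Assumed weight: log((A_F + 1) / (N_F * P + 1)).
    # Rarely seen features are pulled toward a weight of 0.
    return {f: math.log((seen_active[f] + 1) / (seen[f] * base_rate + 1))
            for f in seen}

def score(weights, features):
    """Sum the weights of the features present; unseen features contribute 0."""
    return sum(weights.get(f, 0.0) for f in features)

# Toy data for illustration only (not taken from the hERG dataset).
toy = [
    ({"basic_N", "aromatic"}, True),
    ({"basic_N", "lipophilic"}, True),
    ({"acid", "polar"}, False),
    ({"polar", "aromatic"}, False),
]
weights = train_laplacian_bayes(toy)
blocker_like = score(weights, {"basic_N", "lipophilic"})
nonblocker_like = score(weights, {"acid", "polar"})
```

In this sketch a feature seen only in actives receives a positive weight, a feature seen only in inactives receives a negative weight, and a feature split evenly between the classes receives a weight of zero, so a higher summed score suggests a compound more likely to be classed as active.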

Results: A Bayesian classification model considering the effects of four molecular properties (Mw, PPSA, ALogP and pKa_basic) as well as extended-connectivity fingerprints (ECFP_14) achieved a global accuracy of 91%, a sensitivity of 90% and a specificity of 92% in the test set validation, and a global accuracy of 58%, a sensitivity of 61% and a specificity of 57% in the external test set validation.
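For reference, the three reported measures follow from a binary confusion matrix in the standard way. The sketch below assumes that blockers are the positive class; the function name and the counts are illustrative only and are not the study's data.

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard binary classification measures from confusion-matrix counts.
    tp/fn: blockers predicted correctly/incorrectly (assumed positive class);
    tn/fp: nonblockers predicted correctly/incorrectly."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,   # fraction of all compounds classified correctly
        "sensitivity": tp / (tp + fn),   # fraction of blockers recovered
        "specificity": tn / (tn + fp),   # fraction of nonblockers recovered
    }

# Illustrative counts, not taken from the paper.
metrics = classification_metrics(tp=9, fn=1, tn=8, fp=2)
```

Because global accuracy pools both classes, it can look favourable on an unbalanced set even when one of the two class-wise rates is poor, which is why sensitivity and specificity are reported alongside it.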

Conclusion: The novel model performs better than those reported in the literature for predicting compounds blocking hERG channels, and can be used for large-scale prediction.



fig2: The mean value of (A) ALogP, (B) MW, (C) FPSA and (D) pKa_basic of the training set. cP<0.01 vs inactive.

Mentions: Many molecular descriptors have been shown to be helpful for hERG blockage prediction, including 2D descriptors (such as ClogP, compound molar refractivity (CMR), polarizability, partial positive/negative surface area, compound diameter, topological polar surface area, static polar surface area, fragment fingerprints and shape descriptors), 3D descriptors (such as GRIND descriptors) and 4D descriptors (such as 4D-FPs) [8–12, 25, 27–49]. Analysis of four simple, interpretable molecular properties for the compounds in the training set indicated that the molecular weight (MW), fractional polar surface area (FPSA), lipophilicity (ALogP) and pKa of the most positive basic nitrogen (pKa_basic) can preliminarily predict hERG blockage. The means and standard deviations of the four descriptors in the training and test sets are summarized in Table 3. As shown, hERG-blocking molecules have a larger molecular weight (mean value of approximately 430) than the nonblockers (mean value of approximately 330). In addition, compared with the nonblockers, the hERG-blocking molecules have a lower FPSA value, a larger ALogP value and a higher basic nitrogen pKa. The mean values of the four molecular properties for the training set are shown in Figure 2, together with a test for significant differences. All four properties differ significantly between blockers and nonblockers (P<0.01).

The models also generated statistics on each feature's normalized probability contribution to the model (Figure 3). A feature's normalized probability is its final contribution to the model's prediction: the presence of the feature increases or decreases the likelihood that a compound is a member of the good subset (here, the blockers). The four molecular properties and their contributions to the activity identification were mapped. A positive normalized probability means that a molecule with the corresponding property value is likely to be an hERG blocker, and a negative normalized probability means that it is likely to be a nonblocker. From this figure, the property ranges that make a molecule more likely to be a blocker or a nonblocker can be read off directly. For example, molecules with an ALogP of −11.97∼−1, MW of 0∼227, FPSA of 0.43∼1.00, and pKa_basic of 0∼1.54 are likely to be nonblockers.
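As an illustration of how the quoted ranges might be used for a quick check, the sketch below flags which of a molecule's four properties fall inside the nonblocker-favouring intervals given in the text. This is a simple rule check, not the Bayesian model itself. The function name, the inclusive treatment of the interval bounds and the two example molecules are assumptions made for illustration.

```python
# Nonblocker-favouring ranges as quoted in the text (bounds treated as inclusive,
# which is an assumption).
NONBLOCKER_RANGES = {
    "ALogP": (-11.97, -1.0),
    "MW": (0.0, 227.0),
    "FPSA": (0.43, 1.00),
    "pKa_basic": (0.0, 1.54),
}

def nonblocker_flags(props):
    """Return {property: True/False} for each property present in props,
    True when the value lies inside the nonblocker-favouring range."""
    return {
        name: low <= props[name] <= high
        for name, (low, high) in NONBLOCKER_RANGES.items()
        if name in props
    }

# Hypothetical property values, chosen only to exercise the check.
small_polar = {"ALogP": -2.0, "MW": 180.0, "FPSA": 0.50, "pKa_basic": 1.0}
large_basic = {"ALogP": 4.2, "MW": 430.0, "FPSA": 0.20, "pKa_basic": 8.5}

flags_small = nonblocker_flags(small_polar)
flags_large = nonblocker_flags(large_basic)
```

A molecule for which all four flags are true sits in the region the text associates with nonblockers. A molecule with no flags set is not thereby predicted to be a blocker, since the text quotes only the nonblocker-favouring ranges.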

