Prediction of 1-octanol solubilities using data from the Open Notebook Science Challenge.

Buonaiuto MA, Lang AS - Chem Cent J (2015)

Bottom Line: The model has been deployed for general use as a Shiny application. The 1-octanol solubility model provides reasonably accurate predictions of the 1-octanol solubility of organic solutes directly from structure. The model was developed under Open Notebook Science conditions, which makes it open, reproducible, and as useful as possible.


Affiliation: Department of Computing and Mathematics, Oral Roberts University, 7777 S. Lewis Avenue, Tulsa, OK 74171 USA.

ABSTRACT

Background: 1-Octanol solubility is important in a variety of applications involving pharmacology and environmental chemistry. Current models are linear in nature and often require foreknowledge of either melting point or aqueous solubility. Here we extend the range of applicability of 1-octanol solubility models by creating a random forest model that can predict 1-octanol solubilities directly from structure.

Results: We created a random forest model using CDK descriptors that has an out-of-bag (OOB) R² value of 0.66 and an OOB mean squared error of 0.34. The model has been deployed for general use as a Shiny application.

Conclusion: The 1-octanol solubility model provides reasonably accurate predictions of the 1-octanol solubility of organic solutes directly from structure. The model was developed under Open Notebook Science conditions, which makes it open, reproducible, and as useful as possible.
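
The two statistics quoted in the Results, R² and mean squared error, can be computed from paired observed and predicted values. The following is a minimal sketch using the standard definitions (MSE as the mean of squared residuals, R² as 1 - SS_res / SS_tot). The function names and the four data points are hypothetical and chosen only for illustration; they are not taken from the authors' dataset or code.

```python
from statistics import fmean


def mean_squared_error(observed, predicted):
    """Mean of the squared residuals (observed - predicted)."""
    return fmean((o - p) ** 2 for o, p in zip(observed, predicted))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical log solubility values, invented for illustration only.
observed = [-1.0, 0.0, 1.0, 2.0]
predicted = [-0.5, 0.0, 0.5, 2.0]

mse = mean_squared_error(observed, predicted)
r2 = r_squared(observed, predicted)
print(f"MSE = {mse:.3f}, R2 = {r2:.3f}")  # MSE = 0.125, R2 = 0.900
```

For an out-of-bag estimate, the predicted value for each compound would come only from the trees whose bootstrap sample did not contain that compound; the arithmetic applied to those predictions is the same.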




Fig. 7: Training set chemical space where red indicates poor model performance

Mentions: Previously published models report only training set statistics, so to compare our model directly with previous models we used our full random forest model to predict the solubilities of the entire dataset (see Fig. 7). For the training set, the model has an R² value of 0.94 and an MSE of 0.06. Abraham and Acree's recommended Eq. (3) for estimating log Soct, when all necessary descriptors are available, has a training set R² value of 0.83 [5], which is lower than our value. Our model also does not require a measured melting point. This makes our model, even with the modest OOB R² value of 0.66, superior to all others previously published.
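
The training set and OOB figures quoted above can be checked against each other. Assuming R² is defined as 1 - MSE / Var(y), which is a common convention for random forest regression summaries but is an assumption here rather than something the text states, each (MSE, R²) pair implies a variance for the observed log Soct values. The sketch below, with a hypothetical helper name, shows that both pairs imply roughly the same variance, as expected when both are computed over the same dataset.

```python
import math


def implied_variance(mse, r2):
    """Variance of observed values implied by R2 = 1 - MSE / Var(y)."""
    return mse / (1.0 - r2)


# Statistics as reported in the text (rounded to two decimals there).
oob_var = implied_variance(mse=0.34, r2=0.66)
train_var = implied_variance(mse=0.06, r2=0.94)

print(f"OOB-implied variance:      {oob_var:.2f}")
print(f"Training-implied variance: {train_var:.2f}")

# Allow a loose tolerance because the reported inputs are rounded.
consistent = math.isclose(oob_var, train_var, rel_tol=0.05)
print("Consistent:", consistent)
```

Under this assumption, the gap between 0.94 and 0.66 reflects the difference between predicting compounds the trees were fitted on and compounds held out of each tree's bootstrap sample, not a difference in the underlying data.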

