Use of cell viability assay data improves the prediction accuracy of conventional quantitative structure-activity relationship models of animal carcinogenicity.

Zhu H, Rusyn I, Richard A, Tropsha A - Environ. Health Perspect. (2008)

Bottom Line: The first HTS results for a set of 1,408 compounds tested for their effects on cell viability in six different cell lines have recently become available via PubChem. We found that compounds classified by HTS as "actives" in at least one cell line were likely to be rodent carcinogens (sensitivity 77%); however, HTS "inactives" were far less informative (specificity 46%). Importantly, the prediction accuracy of the model was significantly improved (72.7%) when chemical descriptors were augmented by HTS data, which were regarded as biological descriptors.


Affiliation: Carolina Environmental Bioinformatics Research Center, University of North Carolina, Chapel Hill, NC 27599-7360, USA.

ABSTRACT

Background: To develop efficient approaches for rapid evaluation of chemical toxicity and human health risk of environmental compounds, the National Toxicology Program (NTP) in collaboration with the National Center for Chemical Genomics has initiated a project on high-throughput screening (HTS) of environmental chemicals. The first HTS results for a set of 1,408 compounds tested for their effects on cell viability in six different cell lines have recently become available via PubChem.

Objectives: We have explored these data in terms of their utility for predicting adverse health effects of the environmental agents.

Methods and results: Initially, the classification k nearest neighbor (kNN) quantitative structure-activity relationship (QSAR) modeling method was applied to the HTS data only, for a curated data set of 384 compounds. The resulting models had prediction accuracies as high as 89%, 71%, and 74% for the training set, the test set (275 compounds in the two sets combined), and the external validation set (109 compounds), respectively. We then asked whether HTS results could be of value in predicting rodent carcinogenicity. We identified 383 compounds for which data were available from both the Berkeley Carcinogenic Potency Database and NTP-HTS studies. We found that compounds classified by HTS as "actives" in at least one cell line were likely to be rodent carcinogens (sensitivity 77%); however, HTS "inactives" were far less informative (specificity 46%). Using chemical descriptors only, kNN QSAR modeling applied to this data set resulted in 62.3% prediction accuracy for rodent carcinogenicity. Importantly, the prediction accuracy of the model was significantly improved (to 72.7%) when chemical descriptors were augmented by HTS data, which were regarded as biological descriptors.
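
The sensitivity, specificity, and accuracy figures quoted in the abstract follow the usual confusion-matrix definitions. The following is a minimal Python sketch of those definitions, assuming binary labels (1 = carcinogen, 0 = noncarcinogen). The function name and the toy labels are hypothetical and are not taken from the study's data.

```python
def classification_metrics(y_true, y_pred):
    """Return (sensitivity, specificity, accuracy) for binary labels.

    Assumes 1 marks the positive class (e.g., rodent carcinogen)
    and 0 the negative class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    # Guard against empty classes so the function never divides by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    accuracy = (tp + tn) / len(pairs) if pairs else float("nan")
    return sensitivity, specificity, accuracy


# Toy illustration only: 4 positives and 4 negatives, not study data.
toy_true = [1, 1, 1, 1, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 1, 1, 0, 0]
sens, spec, acc = classification_metrics(toy_true, toy_pred)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

In this toy case three of four positives and two of four negatives are recovered, which illustrates how a classifier can be fairly sensitive while remaining weakly specific.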

Conclusions: Our studies suggest that combining NTP-HTS profiles with conventional chemical descriptors could considerably improve the predictive power of computational approaches in toxicology.


Figure 1: Seven HTS descriptors with their frequency of use in the 198 kNN QSAR models.

Mentions: kNN QSAR models were selected during model development using a q²/R² cutoff of 0.65/0.65. One hundred three kNN models developed using chemical descriptors alone passed these criteria, whereas this number nearly doubled to 198 when a combined chemicobiological descriptor set was used. Although data from each of the six cell lines or their combination were given equal weight in defining the global NTP-HTS activity of each compound, the cell lines differed in their prognostic value for predicting the rodent carcinogenicity of a chemical. Figure 1 shows the frequency of use of each biological descriptor in the 198 successful kNN QSAR models. The predictive power of the QSAR models was verified using the external validation set of 50 compounds not used in training set modeling (Table 6). QSAR modeling using MolConnZ descriptors only [referred to as kNN-MolConnZ (kNN-MZ) models] achieved 69.2% sensitivity and 55.5% specificity (Table 7). In contrast, 78.6% sensitivity and 66.7% specificity were achieved when the combined chemicobiological descriptor set (referred to as kNN-MZHTS models) was used for modeling. The overall prediction accuracy increased significantly from 62.3% to 72.7%, and the coverage of the external set increased from 88% to 92%; that is, more external compounds fell within (numerically) the same applicability domain when the hybrid descriptor set was used.
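
The idea of augmenting chemical descriptors with HTS results, and of reporting coverage relative to an applicability domain, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: it uses a plain majority-vote kNN without the descriptor selection used in kNN QSAR, a simple fixed distance cutoff stands in for the applicability domain, and all names (`hybrid`, `knn_predict`, `coverage`, `max_dist`) and all numbers are hypothetical toy values.

```python
import math
from collections import Counter


def hybrid(chem, bio):
    """Concatenate chemical and biological (HTS) descriptors into one vector.

    In practice both blocks would be scaled to comparable ranges first.
    """
    return list(chem) + list(bio)


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_X, train_y, query, k=3, max_dist=None):
    """Majority vote among the k nearest training compounds.

    Returns None when the nearest neighbor is farther than max_dist,
    i.e., the query is treated as outside the applicability domain
    (an assumed, simplified cutoff).
    """
    nearest = sorted(
        (euclidean(x, query), y) for x, y in zip(train_X, train_y)
    )[:k]
    if max_dist is not None and nearest[0][0] > max_dist:
        return None
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def coverage(predictions):
    """Fraction of query compounds that received a prediction."""
    return sum(p is not None for p in predictions) / len(predictions)


# Toy data only: two chemical descriptors plus one HTS activity flag.
train_chem = [[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]]
train_hts = [[0], [0], [1], [1]]
train_y = [0, 0, 1, 1]  # 1 = carcinogen, 0 = noncarcinogen (made up)
train_X = [hybrid(c, h) for c, h in zip(train_chem, train_hts)]

queries = [
    hybrid([0.05, 0.1], [0]),
    hybrid([0.95, 1.0], [1]),
    hybrid([5.0, 5.0], [1]),  # far from every training compound
]
preds = [knn_predict(train_X, train_y, q, k=3, max_dist=1.0) for q in queries]
print(preds, coverage(preds))
```

The third toy query lies far from all training compounds, so it receives no prediction and lowers the coverage, which mirrors how coverage of an external set is reported separately from accuracy.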

