The environmental fate of organic pollutants through the global microbial metabolism.

Gómez MJ, Pazos F, Guijarro FJ, de Lorenzo V, Valencia A - Mol. Syst. Biol. (2007)

Bottom Line: A machine learning approach has been instrumental in exposing a correlation between the frequency of 149 atomic triads (chemotopes) common in organo-chemical compounds and the global capacity of microorganisms to metabolise them. Depending on the type of environmental fate defined, the system can correctly predict the biodegradative outcome for 73-87% of compounds. The application of this predictive tool to chemical species released into the environment provides an early instrument for tentatively classifying the compounds as biodegradable or recalcitrant.


Affiliation: Centro de Astrobiología (INTA-CSIC), Ctra. Torrejón Ajalvir, Km 4. Torrejón de Ardoz, Madrid, Spain.

ABSTRACT
The production of new chemicals for industrial or therapeutic applications exceeds our ability to generate experimental data on their biological fate once they are released into the environment. Typically, mixtures of organic pollutants are freed into a variety of sites inhabited by diverse microorganisms, which form complex multispecies metabolic networks. A machine learning approach has been instrumental in exposing a correlation between the frequency of 149 atomic triads (chemotopes) common in organo-chemical compounds and the global capacity of microorganisms to metabolise them. Depending on the type of environmental fate defined, the system can correctly predict the biodegradative outcome for 73-87% of compounds. This system is available to the community as a web server (http://www.pdg.cnb.uam.es/BDPSERVER). The application of this predictive tool to chemical species released into the environment provides an early instrument for tentatively classifying the compounds as biodegradable or recalcitrant. Automated surveys of lists of industrial chemicals currently employed in large quantities revealed that herbicides are the group of functional molecules most difficult to recycle into the biosphere through the inclusive microbial metabolism.

Figure 2: Rationale for developing an experience-based biodegradation prediction system. (A) Strategy to generate environmental fate classifiers with the learning machine c4.5, in the form of sets of propositional rules, starting from information gathered from the biodegradation database UMBBD. (B) Sketch of the functioning and queries of BDPServer.

Mentions: At the time of starting this work, the UMBBD contained information on 850 compounds and 903 reactions (Ellis et al, 2003, 2006). The first issue at stake was whether structural features of the target molecules could be significantly correlated to their known environmental fate. To this end, we described each chemical structure as a set of 152 descriptors that represented atomic triad frequencies, molecular weight (MW) and water solubility, the latter expressed both quantitatively and qualitatively. Such atomic triads (or chemotopes) included 149 groups of three consecutive, connected atoms that can be identified on the structure of a compound, taking into account the type of connecting chemical bonds. For example, the atomic triad C–C–H is different from C=C–H, whereas C=C–H is equal to H–C=C (Figure 1). The choice of atomic triads instead of reactive groups or functional motifs reflected the tradeoff between having significant structural information and handling a minimal number of attributes (see the Discussion section). Deconstruction of each compound in this way is achieved by first translating the SMILES (Weininger, 1988) representation of each molecule, which is available from UMBBD, into other forms of chemical depiction that include explicit information regarding atom connectivity and chemical bond types. Then, the frequency with which the different atomic triads appear in each compound is recorded. MW is also available from UMBBD, and compound solubility is, in some cases, accessible through links to the corresponding entry in ChemFinder (Figure 2A). The collection of atomic triad frequencies, the MW and the solubility were then assembled to generate molecular descriptors, henceforth referred to as compound vectors (Figures 1 and 2A).
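
The triad counting described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the molecule has already been converted from SMILES into an explicit list of atoms and bonds (hydrogens included), it counts every path of three connected atoms once, and it treats a triad and its mirror image as the same chemotope. The function names (`canonical_triad`, `triad_counts`, `compound_vector`), the bond symbols and the layout of the resulting vector are hypothetical choices made for illustration; whether the original work used raw counts or normalised frequencies is not stated in this excerpt.

```python
from collections import Counter
from itertools import combinations


def canonical_triad(a, bond_ab, b, bond_bc, c):
    """Return a single label for a triad and its mirror image.

    Reading the triad in either direction gives the same label,
    so C=C-H and H-C=C are counted as one chemotope.
    """
    forward = f"{a}{bond_ab}{b}{bond_bc}{c}"
    backward = f"{c}{bond_bc}{b}{bond_ab}{a}"
    return min(forward, backward)


def triad_counts(atoms, bonds):
    """Count atomic triads in a molecular graph.

    atoms: list of element symbols, indexed by position.
    bonds: list of (i, j, symbol) tuples, symbol being "-", "=" or "#".
    """
    neighbours = {i: [] for i in range(len(atoms))}
    for i, j, symbol in bonds:
        neighbours[i].append((j, symbol))
        neighbours[j].append((i, symbol))

    counts = Counter()
    # Every unordered pair of neighbours of a central atom defines one triad.
    for centre, adjacent in neighbours.items():
        for (i, bond_i), (k, bond_k) in combinations(adjacent, 2):
            label = canonical_triad(atoms[i], bond_i, atoms[centre], bond_k, atoms[k])
            counts[label] += 1
    return counts


def compound_vector(counts, vocabulary, mol_weight, solubility_value, solubility_class):
    """Assemble a compound vector: one count per triad in a fixed vocabulary,
    followed by MW and the two solubility descriptors (assumed layout)."""
    return [counts.get(t, 0) for t in vocabulary] + [
        mol_weight,
        solubility_value,
        solubility_class,
    ]


# Ethene (C2H4) with explicit hydrogens.
ethene_atoms = ["C", "C", "H", "H", "H", "H"]
ethene_bonds = [(0, 1, "="), (0, 2, "-"), (0, 3, "-"), (1, 4, "-"), (1, 5, "-")]
counts = triad_counts(ethene_atoms, ethene_bonds)
print(dict(counts))  # C=C-H occurs 4 times, H-C-H twice
```

With a vocabulary of 149 triads, `compound_vector` would yield the 152 descriptors mentioned in the text (149 triad entries plus MW and the two solubility values); the ordering used here is only an assumption.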

