Predicting Mycobacterium tuberculosis complex clades using knowledge-based Bayesian networks.

Aminian M, Couvin D, Shabbeer A, Hadley K, Vandenberg S, Rastogi N, Bennett KP - Biomed Res Int (2014)

Bottom Line: We develop a predictive KBBN model for 69 MTBC clades found in the SITVIT international collection. Results show that incorporating rules into problems can drastically increase classification accuracy if data alone are insufficient. The SITVIT KBBN is publicly available for use on the World Wide Web.

Affiliation: Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY 12180, USA.

ABSTRACT
We develop a novel approach for incorporating expert rules into Bayesian networks for classification of Mycobacterium tuberculosis complex (MTBC) clades. The proposed knowledge-based Bayesian network (KBBN) treats sets of expert rules as prior distributions on the classes. Unlike prior knowledge-based support vector machine approaches which require rules expressed as polyhedral sets, KBBN directly incorporates the rules without any modification. KBBN uses data to refine rule-based classifiers when the rule set is incomplete or ambiguous. We develop a predictive KBBN model for 69 MTBC clades found in the SITVIT international collection. We validate the approach using two testbeds that model knowledge of the MTBC obtained from two different experts and large DNA fingerprint databases to predict MTBC genetic clades and sublineages. These models represent strains of MTBC using high-throughput biomarkers called spacer oligonucleotide types (spoligotypes), since these are routinely gathered from MTBC isolates of tuberculosis (TB) patients. Results show that incorporating rules into problems can drastically increase classification accuracy if data alone are insufficient. The SITVIT KBBN is publicly available for use on the World Wide Web.

Figure 5: The heat map represents the posterior probability of each rule given the sublineage for the SITVIT dataset. A red square indicates a strong association between a rule and a sublineage, while a blue square indicates no relation. Here, H includes the URAL-1 and URAL-2 sublineages, and LAM includes the Turkey and Cameroon sublineages.

Mentions: We provide the posterior probability distribution of each rule given the sublineage for the SITVIT-CV dataset as a heat map in Figure 5. A good rule has red only on the diagonal. A rule fires for multiple classes if its row contains multiple red entries. The rule set is ambiguous for a class if that class's column contains multiple red entries. Note that rules that fire for many classes with high probability (e.g., T) are not very effective at indicating the associated class, whereas Beijing is an effective rule.
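
The reading of the heat map described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the probability values, the four-class subset, the 0.5 cutoff for a "red" cell, and all function names are hypothetical and are not taken from the paper or from Figure 5. Rows are rules, columns are sublineages, and each cell holds the probability that the rule fires given the sublineage.

```python
# Hypothetical toy values; NOT read from Figure 5 or the SITVIT data.
CLASSES = ["Beijing", "T", "H", "LAM"]
RULES = ["Beijing", "T", "H", "LAM"]  # one rule per class, same order

# P_RULE_GIVEN_CLASS[i][j] = P(rule i fires | true sublineage j)
P_RULE_GIVEN_CLASS = [
    [0.98, 0.01, 0.00, 0.01],  # Beijing rule
    [0.02, 0.95, 0.85, 0.80],  # T rule
    [0.00, 0.03, 0.90, 0.02],  # H rule
    [0.00, 0.02, 0.05, 0.92],  # LAM rule
]

RED_THRESHOLD = 0.5  # assumed cutoff for calling a cell "red"


def red_classes_per_rule(matrix, rules, classes, threshold):
    """Map each rule to the classes whose cell in that rule's row is red."""
    return {
        rule: [c for c, p in zip(classes, row) if p >= threshold]
        for rule, row in zip(rules, matrix)
    }


def rules_firing_for_multiple_classes(matrix, rules, classes, threshold):
    """Rules with more than one red entry in their row."""
    per_rule = red_classes_per_rule(matrix, rules, classes, threshold)
    return {r: hits for r, hits in per_rule.items() if len(hits) > 1}


def ambiguous_classes(matrix, rules, classes, threshold):
    """Classes with more than one red entry in their column."""
    result = {}
    for j, cls in enumerate(classes):
        hits = [r for r, row in zip(rules, matrix) if row[j] >= threshold]
        if len(hits) > 1:
            result[cls] = hits
    return result


def good_rules(matrix, rules, classes, threshold):
    """Rules whose only red entry is the diagonal cell (their own class)."""
    per_rule = red_classes_per_rule(matrix, rules, classes, threshold)
    return [r for r, hits in per_rule.items() if hits == [r]]


if __name__ == "__main__":
    args = (P_RULE_GIVEN_CLASS, RULES, CLASSES, RED_THRESHOLD)
    print("Fire for multiple classes:", rules_firing_for_multiple_classes(*args))
    print("Ambiguous classes:", ambiguous_classes(*args))
    print("Good rules:", good_rules(*args))
```

In this toy matrix the T rule is red for several sublineages, so it says little about which class a strain belongs to, while the Beijing rule is red only on the diagonal, which mirrors the contrast drawn in the text.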

