Predicting Mycobacterium tuberculosis complex clades using knowledge-based Bayesian networks.

Aminian M, Couvin D, Shabbeer A, Hadley K, Vandenberg S, Rastogi N, Bennett KP - Biomed Res Int (2014)

Bottom Line: We develop a predictive KBBN model for 69 MTBC clades found in the SITVIT international collection. Results show that incorporating rules into problems can drastically increase classification accuracy if data alone are insufficient. The SITVIT KBBN is publicly available for use on the World Wide Web.


Affiliation: Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY 12180, USA.

ABSTRACT
We develop a novel approach for incorporating expert rules into Bayesian networks for classification of Mycobacterium tuberculosis complex (MTBC) clades. The proposed knowledge-based Bayesian network (KBBN) treats sets of expert rules as prior distributions on the classes. Unlike earlier knowledge-based support vector machine approaches, which require rules expressed as polyhedral sets, KBBN directly incorporates the rules without any modification. KBBN uses data to refine rule-based classifiers when the rule set is incomplete or ambiguous. We develop a predictive KBBN model for 69 MTBC clades found in the SITVIT international collection. We validate the approach using two testbeds that model knowledge of the MTBC obtained from two different experts and large DNA fingerprint databases to predict MTBC genetic clades and sublineages. These models represent strains of MTBC using high-throughput biomarkers called spacer oligonucleotide types (spoligotypes), since these are routinely gathered from MTBC isolates of tuberculosis (TB) patients. Results show that incorporating rules into problems can drastically increase classification accuracy if data alone are insufficient. The SITVIT KBBN is publicly available for use on the World Wide Web.

Figure 2: (a) The spoligotype conformal Bayesian network uses a single rule based on the number of repeats at the MIRU24 locus as the first level of a hierarchical Bayesian network. It uses the 43 spacers as features. CBN predicts the major lineage with high accuracy. (b) The KBBN uses multiple rules based on the presence of characteristic deletions as the first level of a hierarchical Bayesian network. As with the CBN, it uses the 43 spoligotype spacers.
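
The two-level structure in the caption (rules at the first level, the 43 spacers as features) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The clade names, the deletion rule, the smoothing constant `eps`, the spacer probabilities, and the assumption that spacers are conditionally independent given the clade are all hypothetical choices made for illustration.

```python
import math

N_SPACERS = 43  # a spoligotype is a 43-bit presence/absence pattern


def rule_prior(spoligotype, rules, classes, eps=1e-3):
    """Turn expert rules into a prior over classes (illustrative scheme).

    A class whose rule fires gets weight 1, any other class gets eps.
    If no rule fires, every class gets eps and the prior is uniform.
    """
    no_rule = lambda s: False
    weights = {
        c: (1.0 if rules.get(c, no_rule)(spoligotype) else eps) for c in classes
    }
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}


def posterior(spoligotype, prior, spacer_probs):
    """Combine the rule prior with per-spacer Bernoulli likelihoods.

    spacer_probs[c][i] is an assumed P(spacer i present | class c).
    """
    log_post = {}
    for c, p_c in prior.items():
        ll = math.log(p_c)
        for bit, p in zip(spoligotype, spacer_probs[c]):
            ll += math.log(p if bit else 1.0 - p)
        log_post[c] = ll
    top = max(log_post.values())  # subtract the max for numerical stability
    unnorm = {c: math.exp(v - top) for c, v in log_post.items()}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}


# Hypothetical two-clade example: the rule for clade_A fires when
# spacers 33-36 (0-based indices 32..35) are all absent.
classes = ["clade_A", "clade_B"]
rules = {"clade_A": lambda s: not any(s[32:36])}
spacer_probs = {c: [0.5] * N_SPACERS for c in classes}  # uninformative data

strain = [1] * N_SPACERS
for i in range(32, 36):
    strain[i] = 0

prior = rule_prior(strain, rules, classes)
post = posterior(strain, prior, spacer_probs)
```

With uninformative spacer probabilities the posterior simply reproduces the rule-based prior. Once the probabilities are estimated from data, they can sharpen or override an ambiguous rule, which is the refinement role the abstract assigns to data.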

Mentions: The conformal Bayesian network (CBN), originally designed for predicting major MTBC lineages, is another generative model for analysis of both spoligotype and MIRU type data for MTBC strains [9, 12]; the spoligotype CBN is shown in Figure 2(a). CBN captures the domain knowledge about the properties of spoligotypes and MIRU and uses this information to classify MTBC strain genotyping data into major lineages. CBN reflects the known mutation mechanisms of the spoligotypes and MIRU. With rare exceptions, ancestral strains have 2 or more repeats at MIRU24. Thus the top-level variable, M24, indicates whether the number of repeats at MIRU24 is less than two (indicating one of the modern lineages with high probability) or at least two (indicating one of the ancestral lineages with high probability).
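
The M24 variable can be sketched in a few lines of Python. The threshold of two repeats comes from the text. The probability values are hypothetical placeholders, since the text says only "high probability", and the function names are illustrative.

```python
def m24_indicator(miru24_repeats: int) -> str:
    """Binarize the MIRU24 repeat count: fewer than two vs. at least two."""
    return "lt2" if miru24_repeats < 2 else "ge2"


# Hypothetical conditional table P(lineage group | M24); values are
# placeholders and are not taken from the paper.
P_GROUP_GIVEN_M24 = {
    "lt2": {"modern": 0.95, "ancestral": 0.05},
    "ge2": {"modern": 0.05, "ancestral": 0.95},
}


def lineage_group_prior(miru24_repeats: int) -> dict:
    """Return the assumed distribution over lineage groups given MIRU24."""
    return P_GROUP_GIVEN_M24[m24_indicator(miru24_repeats)]
```

Keeping the table probabilistic rather than a hard rule leaves room for the rare exceptions the text mentions, such as an ancestral strain with fewer than two repeats.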