Predicting Mycobacterium tuberculosis complex clades using knowledge-based Bayesian networks.

Aminian M, Couvin D, Shabbeer A, Hadley K, Vandenberg S, Rastogi N, Bennett KP - Biomed Res Int (2014)

Bottom Line: We develop a predictive KBBN model for 69 MTBC clades found in the SITVIT international collection. Results show that incorporating rules into problems can drastically increase classification accuracy if data alone are insufficient. The SITVIT KBBN is publicly available for use on the World Wide Web.


Affiliation: Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY 12180, USA.

ABSTRACT
We develop a novel approach for incorporating expert rules into Bayesian networks for classification of Mycobacterium tuberculosis complex (MTBC) clades. The proposed knowledge-based Bayesian network (KBBN) treats sets of expert rules as prior distributions on the classes. Unlike earlier knowledge-based support vector machine approaches, which require rules to be expressed as polyhedral sets, KBBN directly incorporates the rules without any modification. KBBN uses data to refine rule-based classifiers when the rule set is incomplete or ambiguous. We develop a predictive KBBN model for 69 MTBC clades found in the SITVIT international collection. We validate the approach using two testbeds that model knowledge of the MTBC obtained from two different experts and large DNA fingerprint databases to predict MTBC genetic clades and sublineages. These models represent strains of MTBC using high-throughput biomarkers called spacer oligonucleotide types (spoligotypes), since these are routinely gathered from MTBC isolates of tuberculosis (TB) patients. Results show that incorporating rules into problems can drastically increase classification accuracy if data alone are insufficient. The SITVIT KBBN is publicly available for use on the World Wide Web.
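
The idea of treating expert rules as a prior over classes can be illustrated with a small sketch. This is not the authors' KBBN: the network structure and rule encoding are not given in this excerpt, so the sketch swaps in a naive Bayes model over binary spacer presence/absence vectors. The function names, the `strength` parameter, the toy rule, and the 4-position toy data are all assumptions made for illustration.

```python
import math

def rule_prior(x, rules, classes, strength=0.9):
    """Turn rule matches into a prior over classes (hypothetical scheme).

    Classes whose rule fires on x share `strength` of the probability mass;
    the rest is spread uniformly so data can still override the rules.
    """
    fired = [c for c in classes if c in rules and rules[c](x)]
    uniform = 1.0 / len(classes)
    if not fired:
        return {c: uniform for c in classes}
    share = strength / len(fired)
    return {c: (1.0 - strength) * uniform + (share if c in fired else 0.0)
            for c in classes}

def fit_bernoulli(X, y, classes):
    """Per-class probability that each position is 1, with Laplace smoothing."""
    theta = {}
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        theta[c] = [(sum(r[j] for r in rows) + 1) / (len(rows) + 2)
                    for j in range(len(X[0]))]
    return theta

def posterior(x, theta, prior):
    """Combine the rule-derived prior with the data likelihood (in log space)."""
    log_post = {}
    for c, p in theta.items():
        ll = sum(math.log(p[j] if bit else 1.0 - p[j]) for j, bit in enumerate(x))
        log_post[c] = math.log(prior[c]) + ll
    m = max(log_post.values())
    z = sum(math.exp(v - m) for v in log_post.values())
    return {c: math.exp(v - m) / z for c, v in log_post.items()}

# Toy data: 4-position binary patterns (real spoligotypes have 43 spacers).
classes = ["clade_A", "clade_B"]
rules = {"clade_A": lambda x: x[0] == 0 and x[1] == 0}  # hypothetical rule
X = [[0, 0, 1, 1], [0, 0, 1, 0], [1, 1, 0, 1], [1, 0, 0, 1]]
y = ["clade_A", "clade_A", "clade_B", "clade_B"]

theta = fit_bernoulli(X, y, classes)
query = [0, 0, 0, 1]
post = posterior(query, theta, rule_prior(query, rules, classes))
```

In this sketch a class with no rule simply falls back to the uniform share of the prior, which loosely mirrors the abstract's point that data refine the classifier when the rule set is incomplete.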


Figure 4: Effect of removing rules for each class on the average F-value for (a) SITVIT-CV and (b) CDC-Sublineage.

Mentions: In these experiments, we examined the effect of removing all the rules associated with a given class. For each class in turn, we removed all of its rules and recorded the resulting KBBN average F-value across all classes, again using 10-fold stratified cross-validation. The results are presented in Figure 4. “All (BN)” denotes the case in which no rules are used in KBBN, which is equivalent to BN performance. Clearly, including rules in the KBBN can yield significant improvements over the case in which entire classes of MTBC have no rules. We leave a more comprehensive study of when rules are most helpful for problems in other domains to future work.
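
As a rough illustration of the metric used in this experiment, the sketch below computes a per-class F-value and then averages it over classes. It assumes the "average F-value" is the unweighted mean of per-class F1 scores (a macro average); the excerpt does not state the exact averaging, and the function names and toy labels are hypothetical.

```python
def f_value(y_true, y_pred, cls):
    """F-value (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == cls and p == cls for t, p in pairs)
    fp = sum(t != cls and p == cls for t, p in pairs)
    fn = sum(t == cls and p != cls for t, p in pairs)
    if tp == 0:
        return 0.0  # no correct predictions for this class
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def average_f_value(y_true, y_pred):
    """Unweighted mean of per-class F-values over the classes in y_true."""
    classes = sorted(set(y_true))
    return sum(f_value(y_true, y_pred, c) for c in classes) / len(classes)

# Toy labels standing in for the predictions of one cross-validation fold.
y_true = ["A", "A", "B", "B"]
y_pred = ["A", "B", "B", "B"]
score = average_f_value(y_true, y_pred)
```

In an ablation like the one described, this score would be recomputed after dropping the rules of one class at a time and compared against the score with all rules in place.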

