VIPR: A probabilistic algorithm for analysis of microbial detection microarrays.

Allred AF, Wu G, Wulan T, Fischer KF, Holbrook MR, Tesh RB, Wang D - BMC Bioinformatics (2010)

Bottom Line: VIPR was used to analyze 110 empirical microarray hybridizations generated from 33 distinct virus species. An accuracy of 94% was achieved as measured by leave-one-out cross validation. VIPR outperformed previously described algorithms for this dataset.


Affiliation: Department of Molecular Microbiology, Washington University School of Medicine, St. Louis, Missouri, USA.

ABSTRACT

Background: All infectious-disease-oriented clinical diagnostic assays in use today focus on detecting the presence of a single, well-defined target agent or a set of agents. In recent years, microarray-based diagnostics have been developed that greatly facilitate the highly parallel detection of multiple microbes that may be present in a given clinical specimen. While several algorithms have been described for interpretation of diagnostic microarrays, none of the existing approaches is capable of incorporating training data generated from positive control samples to improve performance.

Results: To specifically address this issue, we have developed a novel interpretive algorithm, VIPR (Viral Identification using a PRobabilistic algorithm), which uses Bayesian inference to capitalize on empirical training data to optimize detection sensitivity. To illustrate this approach, we have focused on the detection of viruses that cause hemorrhagic fever (HF) using a custom HF-virus microarray. VIPR was used to analyze 110 empirical microarray hybridizations generated from 33 distinct virus species. An accuracy of 94% was achieved as measured by leave-one-out cross validation.

Conclusions: VIPR outperformed previously described algorithms for this dataset. The VIPR algorithm has potential to be broadly applicable to clinical diagnostic settings, wherein positive controls are typically readily available for generation of training data.



Figure 2: Examples of On and Off distributions for two probes. A) One representative probe with highly resolved On and Off distributions based on the training set data. B) One representative probe where the On and Off distributions overlap. Empirical distributions (blue = Off, red = On) and estimated distributions (cyan = Off, pink = On) are shown.

Mentions: Empirical distributions and their normal approximations for two representative probes are shown in Figure 2. Figure 2A depicts a highly informative probe since there is effectively no overlap between the On and Off distributions for that probe. In contrast, the distributions in Figure 2B overlap substantially. Gradations between these two extremes constitute probes of intermediate informative value. Posterior probabilities were calculated via Bayes' rule for each probe given the observed intensity from an unclassified array. These posterior probabilities were multiplied to obtain likelihoods for each candidate virus [Additional file 1].
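The per-probe calculation described above can be illustrated with a minimal sketch. It is not the authors' implementation (the details are in Additional file 1, which is not reproduced here). It assumes each probe has a normal On and a normal Off distribution fitted from training data, a hypothetical prior of 0.5 that a probe is On, and that each candidate virus is described by the set of probes expected to be On. All names, probe parameters and intensities below are invented for illustration. Logs of the posteriors are summed rather than multiplying the posteriors directly, which is equivalent but avoids numerical underflow when many probes are used.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior_on(x, on, off, prior_on=0.5):
    """P(probe is On | intensity x) via Bayes' rule.

    `on` and `off` are (mean, sd) pairs; prior_on is an assumed prior.
    """
    num = normal_pdf(x, *on) * prior_on
    den = num + normal_pdf(x, *off) * (1.0 - prior_on)
    if den == 0.0:  # both densities underflowed: fall back to the prior
        return prior_on
    return num / den


def log_score(intensities, probes, expected_on):
    """Sum of log posteriors of the probe states a candidate predicts."""
    total = 0.0
    for name, x in intensities.items():
        p_on = posterior_on(x, probes[name]["on"], probes[name]["off"])
        p = p_on if name in expected_on else 1.0 - p_on
        total += math.log(max(p, 1e-300))  # guard against log(0)
    return total


# Hypothetical training estimates: probe_A is well resolved (as in Figure 2A),
# probe_B has overlapping On and Off distributions (as in Figure 2B).
probes = {
    "probe_A": {"on": (12.0, 1.0), "off": (6.0, 1.0)},
    "probe_B": {"on": (9.0, 2.0), "off": (8.0, 2.0)},
}

# Hypothetical candidates: the probes each virus is expected to turn On.
candidates = {"virus_X": {"probe_A"}, "virus_Y": {"probe_B"}}

# Hypothetical observed intensities from an unclassified array.
observed = {"probe_A": 11.5, "probe_B": 8.5}

scores = {v: log_score(observed, probes, on) for v, on in candidates.items()}
best = max(scores, key=scores.get)
print(best, scores)
```

In this toy input the well-resolved probe_A dominates the ranking, while probe_B, whose observed intensity lies midway between its overlapping On and Off means, contributes the same amount to both candidates and so carries no discriminating information.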

