A novel information retrieval model for high-throughput molecular medicine modalities.

Wehbe FH, Brown SH, Massion PP, Gadd CS, Masys DR, Aliferis CF - Cancer Inform (2009)

Affiliation: Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, USA. firas.wehbe@vanderbilt.edu

ABSTRACT
Significant research has been devoted to predicting diagnosis, prognosis, and response to treatment using high-throughput assays. Rapid translation into clinical results hinges upon efficient access to up-to-date and high-quality molecular medicine modalities. We first explain why this goal is inadequately supported by existing databases and portals and then introduce a novel semantic indexing and information retrieval model for clinical bioinformatics. The formalism provides the means for indexing a variety of relevant objects (e.g. papers, algorithms, signatures, datasets) and includes a model of the research processes that create and validate these objects in order to support their systematic presentation once retrieved. We test the applicability of the model by constructing proof-of-concept encodings and visual presentations of evidence and modalities in molecular profiling and prognosis of: (a) diffuse large B-cell lymphoma (DLBCL) and (b) breast cancer.


Figure 3: This figure shows the objects and relationships that surround the production and external validation of a Bayes-classifier Model as described in the Wright et al. (Wright and others 2003) Paper and explained in the subsection “Proof of Concept: Diffuse Large B-Cell Lymphoma”, paragraph 4. The Model (bottom center) was produced by applying the Bayes-classifier Algorithm to the lymphochip Dataset (left). The Model was internally validated (left side arc) using that Dataset, which was split into independent training and testing sets. It was then externally validated (right side arc) using another independent Dataset that was assayed and described in a previous Paper (right). It is important to represent and identify this type of scenario, in which higher-quality Models are produced, i.e. Models that generalize across different Datasets and, in this case, across different molecular assay platforms (oligonucleotide vs. cDNA).
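
The objects and relationships in this figure can be sketched as a small typed graph. The following is a minimal illustration only, not the encoding used in the paper: the class names, the relation names (`produced_by`, `produced_from`, `internally_validated_on`, `externally_validated_on`, `described_in`), and the platform label attached to each Dataset are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    """One indexed object: a Paper, Dataset, Algorithm, or Model."""
    kind: str
    name: str
    platform: str = ""  # assay platform; only meaningful for Datasets


@dataclass
class EvidenceGraph:
    """A list of (subject, relation, object) triples between entities."""
    triples: list = field(default_factory=list)

    def add(self, subject, relation, obj):
        self.triples.append((subject, relation, obj))

    def objects(self, subject, relation):
        """All objects linked to `subject` by `relation`."""
        return [o for s, r, o in self.triples if s == subject and r == relation]

    def generalizes_across_platforms(self, model):
        """True if some external validation Dataset uses a platform that
        differs from every Dataset the Model was produced from."""
        source = {d.platform for d in self.objects(model, "produced_from")}
        external = {d.platform for d in self.objects(model, "externally_validated_on")}
        return bool(external - source)


# Hypothetical encoding of the scenario in the figure.
wright_paper = Entity("Paper", "Wright et al. 2003")
previous_paper = Entity("Paper", "previous Paper")
algorithm = Entity("Algorithm", "Bayes classifier")
lymphochip = Entity("Dataset", "lymphochip Dataset", platform="cDNA")  # assumed label
independent = Entity("Dataset", "independent Dataset", platform="oligonucleotide")  # assumed label
model = Entity("Model", "Bayes-classifier Model")

graph = EvidenceGraph()
graph.add(model, "described_in", wright_paper)
graph.add(model, "produced_by", algorithm)
graph.add(model, "produced_from", lymphochip)
graph.add(model, "internally_validated_on", lymphochip)
graph.add(model, "externally_validated_on", independent)
graph.add(independent, "described_in", previous_paper)

print(graph.generalizes_across_platforms(model))
```

Storing the relationships as explicit triples keeps the distinction between internal and external validation queryable, which is the property the caption singles out as important.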

Mentions: Wright et al. (Wright and others 2003) wanted to reconcile the results from the last two studies (see Fig. 3). They developed a Bayes classifier (i.e. a decision Model) to predict molecular subtype and clinical outcome. It was trained and validated on the Rosenwald Dataset, which used the lymphochip platform. The classifier was then independently validated on the Dataset produced by the Shipp group, again using sequence annotations to reconcile the cDNA sequences with the oligonucleotide sequences. This seems to support the biological hypothesis that the “two molecular subtypes” in DLBCL correlate with different biological and clinical behavior. The semantics of the relationship between this Model and these two Datasets are reflected through the visual description and organization in this figure.
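
To make the cross-platform reconciliation step concrete, the sketch below maps probe-level measurements from two platforms to a shared gene-level identifier and keeps only the genes present on both. It is an illustration under stated assumptions, not the procedure used by Wright et al.: the probe IDs, gene names, and expression values are invented placeholders, and averaging multiple probes per gene is one simple choice among several.

```python
def to_gene_level(expression, annotation):
    """Average probe-level values per gene, dropping unannotated probes."""
    grouped = {}
    for probe, value in expression.items():
        gene = annotation.get(probe)
        if gene is None:
            continue  # probe has no annotation, so it cannot be reconciled
        grouped.setdefault(gene, []).append(value)
    return {gene: sum(vals) / len(vals) for gene, vals in grouped.items()}


def shared_features(profile_a, profile_b):
    """Genes measured on both platforms, in a stable order."""
    return sorted(set(profile_a) & set(profile_b))


# Hypothetical annotations: probe ID -> gene identifier.
cdna_annotation = {"cDNA_001": "GENE_A", "cDNA_002": "GENE_A", "cDNA_003": "GENE_B"}
oligo_annotation = {"oligo_10_at": "GENE_A", "oligo_11_at": "GENE_C", "oligo_12_at": "GENE_B"}

# Hypothetical expression values for one sample on each platform.
cdna_sample = {"cDNA_001": 1.0, "cDNA_002": 3.0, "cDNA_003": 0.5, "cDNA_004": 2.0}
oligo_sample = {"oligo_10_at": 7.0, "oligo_11_at": 4.0, "oligo_12_at": 6.0}

cdna_genes = to_gene_level(cdna_sample, cdna_annotation)
oligo_genes = to_gene_level(oligo_sample, oligo_annotation)
common = shared_features(cdna_genes, oligo_genes)

print(common)
```

Matching identifiers only aligns the features; measurements from different assay platforms are generally not on the same scale, so a classifier trained on one platform would typically also need some form of normalization before being applied to the other.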

