Passage relevance models for genomics search.

Urbain J, Frieder O, Goharian N - BMC Bioinformatics (2009)

Bottom Line: Component models of topics, concepts, terms, and document are represented as potential functions within a Markov Random Field. By integrating multiple sources of evidence including dependencies between topics, concepts, and terms, we seek to improve genomics literature passage retrieval precision. Using this model, we are able to demonstrate statistically significant improvements in retrieval precision using a large genomics literature corpus.


Affiliation: Electrical Engineering and Computer Science Department, Milwaukee School of Engineering, Milwaukee, WI, USA. urbain@msoe.edu

ABSTRACT
We present a passage relevance model for integrating syntactic and semantic evidence of biomedical concepts and topics using a probabilistic graphical model. Component models of topics, concepts, terms, and document are represented as potential functions within a Markov Random Field. The probability of a passage being relevant to a biologist's information need is represented as the joint distribution across all potential functions. Relevance model feedback of top ranked passages is used to improve distributional estimates of query concepts and topics in context, and a dimensional indexing strategy is used for efficient aggregation of concept and term statistics. By integrating multiple sources of evidence including dependencies between topics, concepts, and terms, we seek to improve genomics literature passage retrieval precision. Using this model, we are able to demonstrate statistically significant improvements in retrieval precision using a large genomics literature corpus.
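As a rough illustration of how a joint distribution over potential functions can drive ranking: in a Markov Random Field the joint is proportional to a product of potentials, so passages can be ordered by a sum of log potentials without computing the normalizing constant. The sketch below assumes one positive potential value per component (topic, concept, term, document) and a per-component weight. The function name `passage_score`, the weights, and all numeric values are hypothetical and are not taken from the article.

```python
import math

def passage_score(potentials, weights):
    """Return a rank-equivalent log score: the weighted sum of log potentials.

    `potentials` maps a component name to a positive potential value.
    `weights` maps the same names to illustrative (assumed) weights.
    """
    return sum(weights[name] * math.log(value)
               for name, value in potentials.items())

# Hypothetical potential values for two candidate passages.
passages = {
    "p1": {"topic": 0.30, "concept": 0.20, "term": 0.10, "document": 0.05},
    "p2": {"topic": 0.10, "concept": 0.25, "term": 0.05, "document": 0.05},
}

# Uniform weights are an assumption made only for this example.
weights = {"topic": 1.0, "concept": 1.0, "term": 1.0, "document": 1.0}

# Order passages from highest to lowest score.
ranked = sorted(passages,
                key=lambda p: passage_score(passages[p], weights),
                reverse=True)
```

With uniform weights the score reduces to the log of the plain product of potentials, so the ordering matches the unnormalized joint.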

Figure 2: Dimensional term index (paragraph).

Mentions: We use a dimensional indexing model to efficiently aggregate term co-occurrence statistics. The grain of the index is an individual term variant. Figure 2 illustrates a cube representing a paragraph within the term index. For simplicity, the document dimension is not shown. Each document consists of a sequence of paragraphs, each paragraph consists of a sequence of sentences, each sentence consists of a sequence of terms, and each term consists of one or more term variants.
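The hierarchy described here (document, paragraph, sentence, term, term variant) can be sketched as a small in-memory index whose grain is a single term-variant occurrence. This is a minimal sketch under stated assumptions, not the authors' implementation: the class `DimensionalIndex`, the `Posting` record, and the method names are hypothetical, and co-occurrence is counted simply as the number of distinct units, at a chosen dimension, that contain both terms.

```python
from collections import defaultdict
from typing import NamedTuple

class Posting(NamedTuple):
    """One term-variant occurrence: the grain of the (hypothetical) index."""
    doc: int
    paragraph: int
    sentence: int
    position: int   # term position within the sentence
    variant: str    # surface form of the term

class DimensionalIndex:
    def __init__(self):
        # Normalized term -> list of postings, one per variant occurrence.
        self.postings = defaultdict(list)

    def add(self, term, variant, doc, paragraph, sentence, position):
        self.postings[term].append(
            Posting(doc, paragraph, sentence, position, variant))

    def variants(self, term):
        """Return the set of surface forms recorded for a term."""
        return {p.variant for p in self.postings.get(term, [])}

    def cooccurrence(self, term_a, term_b, grain="sentence"):
        """Count distinct units at `grain` that contain both terms."""
        depth = {"document": 1, "paragraph": 2, "sentence": 3}[grain]

        def units(term):
            # Roll each posting up to the requested dimension.
            return {(p.doc, p.paragraph, p.sentence)[:depth]
                    for p in self.postings.get(term, [])}

        return len(units(term_a) & units(term_b))

# Usage with made-up postings.
index = DimensionalIndex()
index.add("brca1", "BRCA1", doc=0, paragraph=0, sentence=0, position=0)
index.add("brca1", "BRCA-1", doc=0, paragraph=1, sentence=0, position=3)
index.add("mutation", "mutations", doc=0, paragraph=0, sentence=1, position=2)
index.add("mutation", "mutation", doc=1, paragraph=0, sentence=0, position=1)

same_sentence = index.cooccurrence("brca1", "mutation", grain="sentence")
same_paragraph = index.cooccurrence("brca1", "mutation", grain="paragraph")
same_document = index.cooccurrence("brca1", "mutation", grain="document")
```

Storing postings at the finest grain means sentence, paragraph, and document statistics all come from the same records by truncating the coordinate tuple, which is the kind of roll-up a dimensional model is meant to make cheap.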

