Information discovery on electronic health records using authority flow techniques.

Hristidis V, Varadarajan RR, Biondich P, Weiner M - BMC Med Inform Decis Mak (2010)

Bottom Line: As the use of electronic health records (EHRs) becomes more widespread, so does the need to search and provide effective information discovery within them. Querying by keyword has emerged as one of the most effective paradigms for searching. We compare the effectiveness of two fundamentally different techniques for keyword search of EHRs.


Affiliation: School of Computing and Information Sciences, Florida International University, Miami, Florida, USA. vagelis@cis.fiu.edu

ABSTRACT

Background: As the use of electronic health records (EHRs) becomes more widespread, so does the need to search and provide effective information discovery within them. Querying by keyword has emerged as one of the most effective paradigms for searching. Most work in this area is based on traditional Information Retrieval (IR) techniques, where each document is compared individually against the query. We compare the effectiveness of two fundamentally different techniques for keyword search of EHRs.

Methods: We built two ranking systems. The traditional BM25 system exploits the EHRs' content without regard to the associations among the entities within them. The Clinical ObjectRank (CO) system exploits the entities' associations in EHRs using an authority-flow algorithm to discover the most relevant entities. BM25 and CO were deployed on an EHR dataset of the cardiovascular division of Miami Children's Hospital. Using sequences of keywords as queries, sensitivity and specificity were measured by two physicians for a set of 11 queries related to congenital cardiac disease.

Results: Our pilot evaluation showed that CO outperforms BM25 in sensitivity (65% vs. 38%), a relative improvement of 71% on average, while maintaining specificity (64% vs. 61%). The evaluation was done by two physicians.

Conclusions: Authority-flow techniques can greatly improve the detection of relevant information in EHRs and hence deserve further study.


Figure 2: An example HL7 CDA XML medical document that shows the use of ID/IDREF attributes in XML.

Mentions: The fundamental difference between the two search methods studied here, BM25 and CO, stems from how each models the collection of EHRs. In BM25, every document is modelled as a bag of keywords. CO models the corpus as a graph of interconnected entities: each entity is a node, and edges denote associations among entities, such as a link from patient to hospitalization or from hospitalization to medication. This data model can abstract both XML and relational data. Health standards organizations such as Health Level 7 (HL7) have been designing XML-based formats to represent EHRs. The Clinical Document Architecture (CDA) [19] is such a format. An XML document can be represented as a hierarchical tree of nodes under a unique root element. In this model every XML element is represented as a node, and the parent-child relationships between elements are captured as edges called containment edges. The use of ID/IDREF attributes in XML [14] creates an additional kind of edge, the ID/IDREF edge, between elements that are not directly connected by a parent-child relationship. This introduces cycles and hence transforms the tree into a graph. Figure 2 shows a medical record in HL7 CDA format. In this example, one element in the document carries an ID-type attribute, <content ID="m1">Theophylline</content>, and elsewhere another element refers to it: <medication IDREF="m1"> ...</medication>.
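
The graph model described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name build_graph, the edge labels, and the surrounding sample document are assumptions made for illustration, and only the two quoted elements (<content ID="m1"> and <medication IDREF="m1">) come from the text. The sketch also assumes the attributes are literally named ID and IDREF, as in the quoted example; in general XML, ID and IDREF are attribute types declared in a DTD or schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical CDA-like fragment. Only the <content> and <medication>
# elements are taken from the text; the wrapper elements are invented.
SAMPLE = """
<ClinicalDocument>
  <section>
    <content ID="m1">Theophylline</content>
  </section>
  <section>
    <medication IDREF="m1"/>
  </section>
</ClinicalDocument>
"""


def build_graph(xml_text):
    """Return (labels, edges) for an XML document.

    labels[i] is the tag of node i (document order, root is node 0).
    edges is a list of (source, target, kind) triples, where kind is
    "containment" (parent -> child) or "idref" (referrer -> referenced).
    """
    root = ET.fromstring(xml_text)
    nodes = list(root.iter())  # every element, in document order
    index = {id(el): i for i, el in enumerate(nodes)}
    labels = [el.tag for el in nodes]

    edges = []
    by_id = {}  # value of the ID attribute -> node number
    for el in nodes:
        for child in el:
            edges.append((index[id(el)], index[id(child)], "containment"))
        if "ID" in el.attrib:
            by_id[el.attrib["ID"]] = index[id(el)]

    # Second pass, so that a reference may precede the element it targets.
    for el in nodes:
        ref = el.attrib.get("IDREF")
        if ref in by_id:
            edges.append((index[id(el)], by_id[ref], "idref"))
    return labels, edges


if __name__ == "__main__":
    labels, edges = build_graph(SAMPLE)
    for src, dst, kind in edges:
        print(f"{labels[src]}[{src}] -{kind}-> {labels[dst]}[{dst}]")
```

A tree with n nodes has exactly n - 1 containment edges, so each ID/IDREF edge added on top of them closes a cycle once edge direction is ignored. This is the sense in which the tree becomes a graph.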

