Using a Data Quality Framework to Clean Data Extracted from the Electronic Health Record: A Case Study.

Dziadkowiec O, Callahan T, Ozkaynak M, Reeder B, Welton J - EGEMS (Wash DC) (2016)

Bottom Line: Without adequate preparation of such large data sets for analysis, the results might be erroneous, which might affect clinical decision-making or the results of Comparative Effectiveness Research studies. The data set cleaned using Kahn's framework yielded more accurate results than the data set cleaned without this framework. Future plans involve creating functions in the R language for cleaning data extracted from the EHR, as well as an R package that combines DQ checks with missing data analysis functions.


Affiliation: University of Colorado, College of Nursing, Anschutz Medical Campus.

ABSTRACT

Objectives: We examine the following: (1) the appropriateness of using a data quality (DQ) framework developed for relational databases as a data-cleaning tool for a data set extracted from two EPIC databases, and (2) the differences in statistical parameter estimates between a data set cleaned with the DQ framework and a data set not cleaned with the DQ framework.

Background: The use of data contained within electronic health records (EHRs) has the potential to open doors for a new wave of innovative research. Without adequate preparation of such large data sets for analysis, the results might be erroneous, which might affect clinical decision-making or the results of Comparative Effectiveness Research studies.

Methods: Two emergency department (ED) data sets extracted from EPIC databases (adult ED and children ED) were used as examples for examining the five concepts of DQ based on a DQ assessment framework designed for EHR databases. The first data set contained 70,061 visits, and the second data set contained 2,815,550 visits. SPSS Syntax examples, as well as step-by-step instructions on how to apply the five key DQ concepts to these EHR database extracts, are provided.

Conclusions: SPSS Syntax to address each of the DQ concepts proposed by Kahn et al. (2012) was developed. The data set cleaned using Kahn's framework yielded more accurate results than the data set cleaned without this framework. Future plans involve creating functions in the R language for cleaning data extracted from the EHR, as well as an R package that combines DQ checks with missing data analysis functions.


f4-egems1201: Sample SPSS Code for Checking Historical Data Rules
Mentions: Practical Guide to Examining EHR Data Sets Based on Kahn et al. (2012)
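
The figure itself (the authors' SPSS syntax) is not reproduced in this text. As a rough illustration of what a historical data rule check might look like, the Python sketch below assumes one plausible reading of such a rule: events in a visit's timeline must occur in chronological order, so a discharge time cannot precede an arrival time. The records, field names (`visit_id`, `arrival`, `discharge`), and the function `violates_time_order` are hypothetical and are not taken from the article or from the EPIC extracts.

```python
from datetime import datetime

# Hypothetical visit records; field names and values are illustrative only.
visits = [
    {"visit_id": 1, "arrival": "2014-03-01 08:15", "discharge": "2014-03-01 11:40"},
    {"visit_id": 2, "arrival": "2014-03-02 22:05", "discharge": "2014-03-02 21:50"},
    {"visit_id": 3, "arrival": "2014-03-03 09:00", "discharge": None},
]

FMT = "%Y-%m-%d %H:%M"


def violates_time_order(visit):
    """Return True when both timestamps exist and discharge precedes arrival."""
    if visit["arrival"] is None or visit["discharge"] is None:
        # Missing timestamps are left to a separate missing-data check.
        return False
    arrival = datetime.strptime(visit["arrival"], FMT)
    discharge = datetime.strptime(visit["discharge"], FMT)
    return discharge < arrival


# Collect the identifiers of visits that break the assumed ordering rule.
flagged_ids = [v["visit_id"] for v in visits if violates_time_order(v)]
print(flagged_ids)
```

In this sketch, flagged visits are only listed, not deleted or corrected. How to handle them (exclude, recode as missing, or investigate at the source) is a separate decision that the check does not make.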

