Data Cleaning in the Evaluation of a Multi-Site Intervention Project
CONTEXT: The High Value Healthcare Collaborative (HVHC) sepsis project was a two-year multi-site project where Member health care delivery systems worked on improving sepsis care using a dissemination & implementation framework designed by HVHC. As part of the project evaluation, participating M...
Main authors: | Welch, Gavin; von Recklinghausen, Friedrich; Taenzer, Andreas; Savitz, Lucy; Weiss, Lisa |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Ubiquity Press, 2017 |
Subjects: | Case Study |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5983076/ https://www.ncbi.nlm.nih.gov/pubmed/29881755 http://dx.doi.org/10.5334/egems.196 |
_version_ | 1783328368520855552 |
---|---|
author | Welch, Gavin von Recklinghausen, Friedrich Taenzer, Andreas Savitz, Lucy Weiss, Lisa |
author_facet | Welch, Gavin von Recklinghausen, Friedrich Taenzer, Andreas Savitz, Lucy Weiss, Lisa |
author_sort | Welch, Gavin |
collection | PubMed |
description | CONTEXT: The High Value Healthcare Collaborative (HVHC) sepsis project was a two-year multi-site project where Member health care delivery systems worked on improving sepsis care using a dissemination & implementation framework designed by HVHC. As part of the project evaluation, participating Members provided 5 data submissions over the project period. Members created data files using a uniform specification, but the data sources and methods used to create the data sets differed. Extensive data cleaning was necessary to get a data set usable for the evaluation analysis. CASE DESCRIPTION: HVHC was the coordinating center for the project and received and cleaned all data submissions. Submissions received 3 successively more detailed levels of checking by HVHC. The most detailed level evaluated validity by comparing values within-Member over time and between Members. For a subset of episodes, Member-submitted data were compared to matched Medicare claims data. FINDINGS: Inconsistencies in data submissions, particularly for length-of-stay variables, were common in early submissions and decreased with subsequent submissions. Multiple resubmissions were sometimes required to get clean data. Data checking also uncovered a systematic difference in the way Medicare and some Members defined intensive care unit stay. CONCLUSIONS: Data checking is critical for ensuring valid analytic results for projects using electronic health record data. It is important to budget sufficient resources for data checking. Interim data submissions and checks help find anomalies early. Data resubmissions should be checked, as fixes can introduce new errors. Communicating with those responsible for creating the data set provides critical information. |
format | Online Article Text |
id | pubmed-5983076 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Ubiquity Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-5983076 2018-06-07 Data Cleaning in the Evaluation of a Multi-Site Intervention Project Welch, Gavin von Recklinghausen, Friedrich Taenzer, Andreas Savitz, Lucy Weiss, Lisa EGEMS (Wash DC) Case Study
Ubiquity Press 2017-12-15 /pmc/articles/PMC5983076/ /pubmed/29881755 http://dx.doi.org/10.5334/egems.196 Text en Copyright: © 2017 The Author(s) http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. See http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Case Study Welch, Gavin von Recklinghausen, Friedrich Taenzer, Andreas Savitz, Lucy Weiss, Lisa Data Cleaning in the Evaluation of a Multi-Site Intervention Project |
title | Data Cleaning in the Evaluation of a Multi-Site Intervention Project |
title_full | Data Cleaning in the Evaluation of a Multi-Site Intervention Project |
title_fullStr | Data Cleaning in the Evaluation of a Multi-Site Intervention Project |
title_full_unstemmed | Data Cleaning in the Evaluation of a Multi-Site Intervention Project |
title_short | Data Cleaning in the Evaluation of a Multi-Site Intervention Project |
title_sort | data cleaning in the evaluation of a multi-site intervention project |
topic | Case Study |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5983076/ https://www.ncbi.nlm.nih.gov/pubmed/29881755 http://dx.doi.org/10.5334/egems.196 |
work_keys_str_mv | AT welchgavin datacleaningintheevaluationofamultisiteinterventionproject AT vonrecklinghausenfriedrich datacleaningintheevaluationofamultisiteinterventionproject AT taenzerandreas datacleaningintheevaluationofamultisiteinterventionproject AT savitzlucy datacleaningintheevaluationofamultisiteinterventionproject AT weisslisa datacleaningintheevaluationofamultisiteinterventionproject |