
Prevalence and Sources of Duplicate Information in the Electronic Medical Record

IMPORTANCE: Duplicated text is a well-documented hazard in electronic medical records (EMRs), leading to wasted clinician time, medical error, and burnout. This study hypothesizes that text duplication is prevalent and increases with time and EMR size and that duplicate information is shared across...


Bibliographic Details
Main authors: Steinkamp, Jackson; Kantrowitz, Jacob J.; Airan-Javia, Subha
Format: Online Article Text
Language: English
Published: American Medical Association 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9513649/
https://www.ncbi.nlm.nih.gov/pubmed/36156143
http://dx.doi.org/10.1001/jamanetworkopen.2022.33348
author Steinkamp, Jackson
Kantrowitz, Jacob J.
Airan-Javia, Subha
collection PubMed
description IMPORTANCE: Duplicated text is a well-documented hazard in electronic medical records (EMRs), leading to wasted clinician time, medical error, and burnout. This study hypothesizes that text duplication is prevalent and increases with time and EMR size and that duplicate information is shared across authors.
OBJECTIVE: To examine the prevalence and scope of duplication behavior in clinical notes from a large academic health system and the factors associated with duplication.
DESIGN, SETTING, AND PARTICIPANTS: This retrospective, cross-sectional analysis of note length and content duplication rates used a set of 10 adjacent word tokens (ie, a 10-gram) sliding-window approach to identify spans of text duplicated exactly from earlier notes in a patient’s record for all inpatient and outpatient notes written within the University of Pennsylvania Health System from January 1, 2015, through December 31, 2020. Text duplicated from a different author vs text duplicated from the same author was quantified. Furthermore, novel text and duplicated text per author for various note types and author types, as well as per patient record by number of notes in the record, were quantified. Information scatter, another documentation hazard, was defined as the inverse of novel text per note, and the association between information duplication and information scatter was graphed. Data analysis was performed from January to March 2022.
MAIN OUTCOMES AND MEASURES: Total, novel, and duplicate text by note type and note author were determined, as were the mean intra-author and inter-author duplication per note by type and author.
RESULTS: There were a total of 104 456 653 notes for 1 960 689 unique patients consisting of 32 991 489 889 words; 50.1% of the total text in the record (16 523 851 210 words) was duplicated from prior text written about the same patient. The duplication fraction increased year-over-year, from 33.0% for notes written in 2015 to 54.2% for notes written in 2020. Of the text duplicated, 54.1% came from text written by the same author, whereas 45.9% was duplicated from a different author. Records with more notes had more total duplicate text, approaching 60%. Note types with high information scatter tended to have low information overload, and vice versa, suggesting a trade-off between these 2 hazards under the current documentation paradigm.
CONCLUSIONS AND RELEVANCE: Duplicate text casts doubt on the veracity of all information in the medical record, making it difficult to find and verify information in day-to-day clinical work. The findings of this cross-sectional study suggest that text duplication is a systemic hazard, requiring systemic interventions to fix, and simple solutions such as banning copy-paste may have unintended consequences, such as worsening information scatter. The note paradigm should be further examined as a major cause of duplication and scatter, and alternative paradigms should be evaluated.
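The 10-gram sliding-window idea named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the lowercase whitespace tokenizer, the function names (`tokenize`, `duplicated_mask`, `duplication_stats`), and the assumption that a patient's notes arrive in chronological order are all hypothetical choices made for illustration. A word counts as duplicated if it falls inside any 10-word window that appeared exactly in an earlier note of the same record.

```python
from typing import List, Set, Tuple

N = 10  # window size in word tokens (the "10-gram" from the abstract)


def tokenize(text: str) -> List[str]:
    # Assumption: lowercase whitespace tokenization. The abstract does not
    # describe the tokenizer that the study actually used.
    return text.lower().split()


def duplicated_mask(tokens: List[str], seen: Set[Tuple[str, ...]], n: int = N) -> List[bool]:
    """Mark every token covered by an n-gram already present in `seen`."""
    mask = [False] * len(tokens)
    for i in range(len(tokens) - n + 1):
        if tuple(tokens[i:i + n]) in seen:
            for j in range(i, i + n):
                mask[j] = True
    return mask


def duplication_stats(notes: List[str], n: int = N) -> List[Tuple[int, int]]:
    """Return (total_words, duplicated_words) per note.

    `notes` is assumed to hold one patient's notes, oldest first.
    """
    seen: Set[Tuple[str, ...]] = set()
    stats = []
    for note in notes:
        tokens = tokenize(note)
        mask = duplicated_mask(tokens, seen, n)
        stats.append((len(tokens), sum(mask)))
        # Only after scoring the note do its own n-grams become "prior text".
        for i in range(len(tokens) - n + 1):
            seen.add(tuple(tokens[i:i + n]))
    return stats


# Toy record: the second note copies a 10-word span from the first.
record = [
    "a b c d e f g h i j k l",
    "x y a b c d e f g h i j z",
    "entirely new words that do not repeat anything written before today ok",
]
print(duplication_stats(record))  # [(12, 0), (13, 10), (12, 0)]
```

Summing the second element of each pair and dividing by the sum of the first gives a record-level duplication fraction, the same kind of ratio as the reported 16 523 851 210 / 32 991 489 889 ≈ 50.1%. Attributing each duplicated span to the same or a different author would additionally require storing an author identifier alongside each n-gram, which this sketch omits.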
format Online
Article
Text
id pubmed-9513649
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher American Medical Association
record_format MEDLINE/PubMed
spelling pubmed-9513649 2022-10-14 Prevalence and Sources of Duplicate Information in the Electronic Medical Record Steinkamp, Jackson Kantrowitz, Jacob J. Airan-Javia, Subha JAMA Netw Open Original Investigation American Medical Association 2022-09-26 /pmc/articles/PMC9513649/ /pubmed/36156143 http://dx.doi.org/10.1001/jamanetworkopen.2022.33348 Text en Copyright 2022 Steinkamp J et al. JAMA Network Open. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the CC-BY License.
title Prevalence and Sources of Duplicate Information in the Electronic Medical Record
topic Original Investigation
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9513649/
https://www.ncbi.nlm.nih.gov/pubmed/36156143
http://dx.doi.org/10.1001/jamanetworkopen.2022.33348