
Clinical documentation variations and NLP system portability: a case study in asthma birth cohorts across institutions

OBJECTIVE: To assess clinical documentation variations across health care institutions using different electronic medical record systems and investigate how they affect natural language processing (NLP) system portability. MATERIALS AND METHODS: Birth cohorts from Mayo Clinic and Sanford Children’s...

Bibliographic Details
Main Authors: Sohn, Sunghwan, Wang, Yanshan, Wi, Chung-Il, Krusemark, Elizabeth A, Ryu, Euijung, Ali, Mir H, Juhn, Young J, Liu, Hongfang
Format: Online Article Text
Language: English
Published: Oxford University Press 2017
Subjects: Research and Applications
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7378885/
https://www.ncbi.nlm.nih.gov/pubmed/29202185
http://dx.doi.org/10.1093/jamia/ocx138
author Sohn, Sunghwan
Wang, Yanshan
Wi, Chung-Il
Krusemark, Elizabeth A
Ryu, Euijung
Ali, Mir H
Juhn, Young J
Liu, Hongfang
author_sort Sohn, Sunghwan
collection PubMed
description OBJECTIVE: To assess clinical documentation variations across health care institutions using different electronic medical record systems and investigate how they affect natural language processing (NLP) system portability. MATERIALS AND METHODS: Birth cohorts from Mayo Clinic and Sanford Children’s Hospital (SCH) were used in this study (n = 298 for each). Documentation variations regarding asthma between the 2 cohorts were examined in various aspects: (1) overall corpus at the word level (ie, lexical variation), (2) topics and asthma-related concepts (ie, semantic variation), and (3) clinical note types (ie, process variation). We compared those statistics and explored NLP system portability for asthma ascertainment in 2 stages: prototype and refinement. RESULTS: There exist notable lexical variations (word-level similarity = 0.669) and process variations (differences in major note types containing asthma-related concepts). However, semantic-level corpora were relatively homogeneous (topic similarity = 0.944, asthma-related concept similarity = 0.971). The NLP system for asthma ascertainment had an F-score of 0.937 at Mayo, and produced 0.813 (prototype) and 0.908 (refinement) when applied at SCH. DISCUSSION: The criteria for asthma ascertainment are largely dependent on asthma-related concepts. Therefore, we believe that semantic similarity is important to estimate NLP system portability. As the Mayo Clinic and SCH corpora were relatively homogeneous at a semantic level, the NLP system, developed at Mayo Clinic, was imported to SCH successfully with proper adjustments to deal with the intrinsic corpus heterogeneity.
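The abstract reports corpus similarity scores and F-scores without saying how they were computed. As a rough illustration only, the sketch below assumes that word-level similarity is a cosine similarity over term-frequency vectors and that the F-score is the balanced F1 measure; both are assumptions, not details confirmed by this record. The function names `cosine_similarity` and `f_score` and the toy token lists are hypothetical.

```python
from collections import Counter
import math


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between the term-frequency vectors of two token lists.

    Assumption: the record does not state which similarity measure was used.
    """
    freq_a, freq_b = Counter(tokens_a), Counter(tokens_b)
    # Only words present in both corpora contribute to the dot product.
    shared = freq_a.keys() & freq_b.keys()
    dot = sum(freq_a[w] * freq_b[w] for w in shared)
    norm_a = math.sqrt(sum(c * c for c in freq_a.values()))
    norm_b = math.sqrt(sum(c * c for c in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def f_score(tp, fp, fn):
    """Balanced F1 (harmonic mean of precision and recall) from raw counts.

    Assumption: the record says only "F-score"; F1 is the common reading.
    """
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy token lists standing in for two institutions' note corpora.
    site_a = "patient has asthma and wheezing at night".split()
    site_b = "child with asthma and cough at night".split()
    print(cosine_similarity(site_a, site_b))
    print(f_score(tp=8, fp=2, fn=2))
```

Under these assumptions, a value near 1 means the two corpora use words in similar proportions, which is how one might read the contrast between the lexical score (0.669) and the semantic scores (0.944, 0.971) reported above.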
format Online
Article
Text
id pubmed-7378885
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7378885 2020-07-29 Clinical documentation variations and NLP system portability: a case study in asthma birth cohorts across institutions Sohn, Sunghwan Wang, Yanshan Wi, Chung-Il Krusemark, Elizabeth A Ryu, Euijung Ali, Mir H Juhn, Young J Liu, Hongfang J Am Med Inform Assoc Research and Applications OBJECTIVE: To assess clinical documentation variations across health care institutions using different electronic medical record systems and investigate how they affect natural language processing (NLP) system portability. MATERIALS AND METHODS: Birth cohorts from Mayo Clinic and Sanford Children’s Hospital (SCH) were used in this study (n = 298 for each). Documentation variations regarding asthma between the 2 cohorts were examined in various aspects: (1) overall corpus at the word level (ie, lexical variation), (2) topics and asthma-related concepts (ie, semantic variation), and (3) clinical note types (ie, process variation). We compared those statistics and explored NLP system portability for asthma ascertainment in 2 stages: prototype and refinement. RESULTS: There exist notable lexical variations (word-level similarity = 0.669) and process variations (differences in major note types containing asthma-related concepts). However, semantic-level corpora were relatively homogeneous (topic similarity = 0.944, asthma-related concept similarity = 0.971). The NLP system for asthma ascertainment had an F-score of 0.937 at Mayo, and produced 0.813 (prototype) and 0.908 (refinement) when applied at SCH. DISCUSSION: The criteria for asthma ascertainment are largely dependent on asthma-related concepts. Therefore, we believe that semantic similarity is important to estimate NLP system portability. As the Mayo Clinic and SCH corpora were relatively homogeneous at a semantic level, the NLP system, developed at Mayo Clinic, was imported to SCH successfully with proper adjustments to deal with the intrinsic corpus heterogeneity.
Oxford University Press 2017-11-30 /pmc/articles/PMC7378885/ /pubmed/29202185 http://dx.doi.org/10.1093/jamia/ocx138 Text en © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial reuse, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
spellingShingle Research and Applications
Sohn, Sunghwan
Wang, Yanshan
Wi, Chung-Il
Krusemark, Elizabeth A
Ryu, Euijung
Ali, Mir H
Juhn, Young J
Liu, Hongfang
Clinical documentation variations and NLP system portability: a case study in asthma birth cohorts across institutions
title Clinical documentation variations and NLP system portability: a case study in asthma birth cohorts across institutions
topic Research and Applications
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7378885/
https://www.ncbi.nlm.nih.gov/pubmed/29202185
http://dx.doi.org/10.1093/jamia/ocx138