An open natural language processing (NLP) framework for EHR-based clinical research: a case demonstration using the National COVID Cohort Collaborative (N3C)

Despite recent methodology advancements in clinical natural language processing (NLP), the adoption of clinical NLP models within the translational research community remains hindered by process heterogeneity and human factor variations. Concurrently, these factors also dramatically increase the dif...

Bibliographic Details
Main authors: Liu, Sijia, Wen, Andrew, Wang, Liwei, He, Huan, Fu, Sunyang, Miller, Robert, Williams, Andrew, Harris, Daniel, Kavuluru, Ramakanth, Liu, Mei, Abu-el-Rub, Noor, Schutte, Dalton, Zhang, Rui, Rouhizadeh, Masoud, Osborne, John D, He, Yongqun, Topaloglu, Umit, Hong, Stephanie S, Saltz, Joel H, Schaffter, Thomas, Pfaff, Emily, Chute, Christopher G, Duong, Tim, Haendel, Melissa A, Fuentes, Rafael, Szolovits, Peter, Xu, Hua, Liu, Hongfang
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10654844/
https://www.ncbi.nlm.nih.gov/pubmed/37555837
http://dx.doi.org/10.1093/jamia/ocad134
author Liu, Sijia
Wen, Andrew
Wang, Liwei
He, Huan
Fu, Sunyang
Miller, Robert
Williams, Andrew
Harris, Daniel
Kavuluru, Ramakanth
Liu, Mei
Abu-el-Rub, Noor
Schutte, Dalton
Zhang, Rui
Rouhizadeh, Masoud
Osborne, John D
He, Yongqun
Topaloglu, Umit
Hong, Stephanie S
Saltz, Joel H
Schaffter, Thomas
Pfaff, Emily
Chute, Christopher G
Duong, Tim
Haendel, Melissa A
Fuentes, Rafael
Szolovits, Peter
Xu, Hua
Liu, Hongfang
author_sort Liu, Sijia
collection PubMed
description Despite recent methodological advances in clinical natural language processing (NLP), the adoption of clinical NLP models within the translational research community remains hindered by process heterogeneity and human factor variations. These factors also dramatically increase the difficulty of developing NLP models in multi-site settings, which is necessary for algorithm robustness and generalizability. Here, we report on our experience developing an NLP solution for Coronavirus Disease 2019 (COVID-19) sign and symptom extraction in an open NLP framework from a subset of sites participating in the National COVID Cohort Collaborative (N3C). We then empirically highlight the benefits of multi-site data for both symbolic and statistical methods, as well as the need for federated annotation and evaluation to resolve several pitfalls encountered in the course of these efforts.
format Online
Article
Text
id pubmed-10654844
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10654844 2023-08-09 J Am Med Inform Assoc Brief Communication Oxford University Press 2023-08-09 /pmc/articles/PMC10654844/ /pubmed/37555837 http://dx.doi.org/10.1093/jamia/ocad134 Text en © The Author(s) 2023. Published by Oxford University Press on behalf of the American Medical Informatics Association.
https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title An open natural language processing (NLP) framework for EHR-based clinical research: a case demonstration using the National COVID Cohort Collaborative (N3C)
topic Brief Communication