An open natural language processing (NLP) framework for EHR-based clinical research: a case demonstration using the National COVID Cohort Collaborative (N3C)
Despite recent methodology advancements in clinical natural language processing (NLP), the adoption of clinical NLP models within the translational research community remains hindered by process heterogeneity and human factor variations. Concurrently, these factors also dramatically increase the dif...
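Main authors: | Liu, Sijia; Wen, Andrew; Wang, Liwei; He, Huan; Fu, Sunyang; Miller, Robert; Williams, Andrew; Harris, Daniel; Kavuluru, Ramakanth; Liu, Mei; Abu-el-Rub, Noor; Schutte, Dalton; Zhang, Rui; Rouhizadeh, Masoud; Osborne, John D; He, Yongqun; Topaloglu, Umit; Hong, Stephanie S; Saltz, Joel H; Schaffter, Thomas; Pfaff, Emily; Chute, Christopher G; Duong, Tim; Haendel, Melissa A; Fuentes, Rafael; Szolovits, Peter; Xu, Hua; Liu, Hongfang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2023 |
Subjects: | Brief Communication |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10654844/ https://www.ncbi.nlm.nih.gov/pubmed/37555837 http://dx.doi.org/10.1093/jamia/ocad134 |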
_version_ | 1785136705421967360 |
---|---|
author | Liu, Sijia; Wen, Andrew; Wang, Liwei; He, Huan; Fu, Sunyang; Miller, Robert; Williams, Andrew; Harris, Daniel; Kavuluru, Ramakanth; Liu, Mei; Abu-el-Rub, Noor; Schutte, Dalton; Zhang, Rui; Rouhizadeh, Masoud; Osborne, John D; He, Yongqun; Topaloglu, Umit; Hong, Stephanie S; Saltz, Joel H; Schaffter, Thomas; Pfaff, Emily; Chute, Christopher G; Duong, Tim; Haendel, Melissa A; Fuentes, Rafael; Szolovits, Peter; Xu, Hua; Liu, Hongfang |
author_sort | Liu, Sijia |
collection | PubMed |
description | Despite recent methodological advances in clinical natural language processing (NLP), the adoption of clinical NLP models within the translational research community remains hindered by process heterogeneity and human factor variations. These factors also dramatically increase the difficulty of developing NLP models in multi-site settings, which is necessary for algorithm robustness and generalizability. Here, we report on our experience developing an NLP solution for Coronavirus Disease 2019 (COVID-19) sign and symptom extraction in an open NLP framework from a subset of sites participating in the National COVID Cohort Collaborative (N3C). We then empirically highlight the benefits of multi-site data for both symbolic and statistical methods, and the need for federated annotation and evaluation to resolve several pitfalls encountered in the course of these efforts. |
format | Online Article Text |
id | pubmed-10654844 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-106548442023-08-09 An open natural language processing (NLP) framework for EHR-based clinical research: a case demonstration using the National COVID Cohort Collaborative (N3C) Liu, Sijia Wen, Andrew Wang, Liwei He, Huan Fu, Sunyang Miller, Robert Williams, Andrew Harris, Daniel Kavuluru, Ramakanth Liu, Mei Abu-el-Rub, Noor Schutte, Dalton Zhang, Rui Rouhizadeh, Masoud Osborne, John D He, Yongqun Topaloglu, Umit Hong, Stephanie S Saltz, Joel H Schaffter, Thomas Pfaff, Emily Chute, Christopher G Duong, Tim Haendel, Melissa A Fuentes, Rafael Szolovits, Peter Xu, Hua Liu, Hongfang J Am Med Inform Assoc Brief Communication Despite recent methodology advancements in clinical natural language processing (NLP), the adoption of clinical NLP models within the translational research community remains hindered by process heterogeneity and human factor variations. Concurrently, these factors also dramatically increase the difficulty in developing NLP models in multi-site settings, which is necessary for algorithm robustness and generalizability. Here, we reported on our experience developing an NLP solution for Coronavirus Disease 2019 (COVID-19) signs and symptom extraction in an open NLP framework from a subset of sites participating in the National COVID Cohort (N3C). We then empirically highlight the benefits of multi-site data for both symbolic and statistical methods, as well as highlight the need for federated annotation and evaluation to resolve several pitfalls encountered in the course of these efforts. Oxford University Press 2023-08-09 /pmc/articles/PMC10654844/ /pubmed/37555837 http://dx.doi.org/10.1093/jamia/ocad134 Text en © The Author(s) 2023. Published by Oxford University Press on behalf of the American Medical Informatics Association. 
https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | An open natural language processing (NLP) framework for EHR-based clinical research: a case demonstration using the National COVID Cohort Collaborative (N3C) |
topic | Brief Communication |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10654844/ https://www.ncbi.nlm.nih.gov/pubmed/37555837 http://dx.doi.org/10.1093/jamia/ocad134 |