A natural language processing pipeline to synthesize patient-generated notes toward improving remote care and chronic disease management: a cystic fibrosis case study

OBJECTIVES: Patient-generated health data (PGHD) are important for tracking and monitoring out of clinic health events and supporting shared clinical decisions. Unstructured text as PGHD (eg, medical diary notes and transcriptions) may encapsulate rich information through narratives which can be cri...

Bibliographic Details
Main authors: Hussain, Syed-Amad, Sezgin, Emre, Krivchenia, Katelyn, Luna, John, Rust, Steve, Huang, Yungui
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8480545/
https://www.ncbi.nlm.nih.gov/pubmed/34604710
http://dx.doi.org/10.1093/jamiaopen/ooab084
_version_ 1784576474904264704
author Hussain, Syed-Amad
Sezgin, Emre
Krivchenia, Katelyn
Luna, John
Rust, Steve
Huang, Yungui
author_sort Hussain, Syed-Amad
collection PubMed
description OBJECTIVES: Patient-generated health data (PGHD) are important for tracking and monitoring out-of-clinic health events and supporting shared clinical decisions. Unstructured text as PGHD (eg, medical diary notes and transcriptions) may encapsulate rich information through narratives which can be critical to better understand a patient’s condition. We propose a natural language processing (NLP) supported data synthesis pipeline for unstructured PGHD, focusing on children with special healthcare needs (CSHCN), and demonstrate it with a case study on cystic fibrosis (CF). MATERIALS AND METHODS: The proposed unstructured data synthesis and information extraction pipeline extracts a broad range of health information by combining rule-based approaches with pretrained deep-learning models. In particular, we build upon the scispaCy biomedical model suite, leveraging its named entity recognition capabilities to identify and link clinically relevant entities to established ontologies such as Systematized Nomenclature of Medicine (SNOMED) and RxNorm. We then use scispaCy’s syntax (grammar) parsing tools to retrieve phrases associated with the entities in medication, dose, therapies, symptoms, bowel movements, and nutrition ontological categories. The pipeline is illustrated and tested with simulated CF patient notes. RESULTS: The proposed hybrid deep-learning rule-based approach can operate over a variety of natural language note types and allow customization for a given patient or cohort. Viable information was successfully extracted from simulated CF notes. This hybrid pipeline is robust to misspellings and varied word representations and can be tailored to accommodate the needs of a specific patient, cohort, or clinician. DISCUSSION: The NLP pipeline can extract predefined or ontology-based entities from free-text PGHD, aiming to facilitate remote care and improve chronic disease management. Our implementation makes use of open source models, allowing for this solution to be easily replicated and integrated in different health systems. Outside of the clinic, the use of the NLP pipeline may increase the amount of clinical data recorded by families of CSHCN and ease the process of identifying health events from the notes. Similarly, care coordinators, nurses, and clinicians would be able to track adherence to medications, identify symptoms, and effectively intervene to improve clinical care. Furthermore, visualization tools can be applied to digest the structured data produced by the pipeline in support of the decision-making process for a patient, caregiver, or provider. CONCLUSION: Our study demonstrated that an NLP pipeline can be used to create an automated analysis and reporting mechanism for unstructured PGHD. Further studies are suggested with real-world data to assess pipeline performance and further implications.
format Online
Article
Text
id pubmed-8480545
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8480545 2021-09-30 A natural language processing pipeline to synthesize patient-generated notes toward improving remote care and chronic disease management: a cystic fibrosis case study Hussain, Syed-Amad Sezgin, Emre Krivchenia, Katelyn Luna, John Rust, Steve Huang, Yungui JAMIA Open Research and Applications Oxford University Press 2021-09-29 /pmc/articles/PMC8480545/ /pubmed/34604710 http://dx.doi.org/10.1093/jamiaopen/ooab084 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of the American Medical Informatics Association. https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title A natural language processing pipeline to synthesize patient-generated notes toward improving remote care and chronic disease management: a cystic fibrosis case study
topic Research and Applications
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8480545/
https://www.ncbi.nlm.nih.gov/pubmed/34604710
http://dx.doi.org/10.1093/jamiaopen/ooab084