Natural Language Processing of Clinical Notes on Chronic Diseases: Systematic Review

BACKGROUND: Novel approaches that complement and go beyond evidence-based medicine are required in the domain of chronic diseases, given the growing incidence of such conditions on the worldwide population. A promising avenue is the secondary use of electronic health records (EHRs), where patient da...

Bibliographic Details
Main Authors: Sheikhalishahi, Seyedmostafa; Miotto, Riccardo; Dudley, Joel T; Lavelli, Alberto; Rinaldi, Fabio; Osmani, Venet
Format: Online Article Text
Language: English
Published: JMIR Publications 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6528438/
https://www.ncbi.nlm.nih.gov/pubmed/31066697
http://dx.doi.org/10.2196/12239
author Sheikhalishahi, Seyedmostafa
Miotto, Riccardo
Dudley, Joel T
Lavelli, Alberto
Rinaldi, Fabio
Osmani, Venet
collection PubMed
description BACKGROUND: Novel approaches that complement and go beyond evidence-based medicine are required in the domain of chronic diseases, given the growing incidence of such conditions on the worldwide population. A promising avenue is the secondary use of electronic health records (EHRs), where patient data are analyzed to conduct clinical and translational research. Methods based on machine learning to process EHRs are resulting in improved understanding of patient clinical trajectories and chronic disease risk prediction, creating a unique opportunity to derive previously unknown clinical insights. However, a wealth of clinical histories remains locked behind clinical narratives in free-form text. Consequently, unlocking the full potential of EHR data is contingent on the development of natural language processing (NLP) methods to automatically transform clinical text into structured clinical data that can guide clinical decisions and potentially delay or prevent disease onset.
OBJECTIVE: The goal of the research was to provide a comprehensive overview of the development and uptake of NLP methods applied to free-text clinical notes related to chronic diseases, including the investigation of challenges faced by NLP methodologies in understanding clinical narratives.
METHODS: Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines were followed and searches were conducted in 5 databases using “clinical notes,” “natural language processing,” and “chronic disease” and their variations as keywords to maximize coverage of the articles.
RESULTS: Of the 2652 articles considered, 106 met the inclusion criteria. Review of the included papers resulted in identification of 43 chronic diseases, which were then further classified into 10 disease categories using the International Classification of Diseases, 10th Revision. The majority of studies focused on diseases of the circulatory system (n=38) while endocrine and metabolic diseases were fewest (n=14). This was due to the structure of clinical records related to metabolic diseases, which typically contain much more structured data, compared with medical records for diseases of the circulatory system, which focus more on unstructured data and consequently have seen a stronger focus of NLP. The review has shown that there is a significant increase in the use of machine learning methods compared to rule-based approaches; however, deep learning methods remain emergent (n=3). Consequently, the majority of works focus on classification of disease phenotype with only a handful of papers addressing extraction of comorbidities from the free text or integration of clinical notes with structured data. There is a notable use of relatively simple methods, such as shallow classifiers (or combination with rule-based methods), due to the interpretability of predictions, which still represents a significant issue for more complex methods. Finally, scarcity of publicly available data may also have contributed to insufficient development of more advanced methods, such as extraction of word embeddings from clinical notes.
CONCLUSIONS: Efforts are still required to improve (1) progression of clinical NLP methods from extraction toward understanding; (2) recognition of relations among entities rather than entities in isolation; (3) temporal extraction to understand past, current, and future clinical events; (4) exploitation of alternative sources of clinical knowledge; and (5) availability of large-scale, de-identified clinical corpora.
format Online
Article
Text
id pubmed-6528438
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-6528438 2019-06-07 Natural Language Processing of Clinical Notes on Chronic Diseases: Systematic Review Sheikhalishahi, Seyedmostafa Miotto, Riccardo Dudley, Joel T Lavelli, Alberto Rinaldi, Fabio Osmani, Venet JMIR Med Inform Review
JMIR Publications 2019-04-27 /pmc/articles/PMC6528438/ /pubmed/31066697 http://dx.doi.org/10.2196/12239 Text en © Seyedmostafa Sheikhalishahi, Riccardo Miotto, Joel T Dudley, Alberto Lavelli, Fabio Rinaldi, Venet Osmani. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 27.04.2019. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on http://medinform.jmir.org/, as well as this copyright and license information must be included.
title Natural Language Processing of Clinical Notes on Chronic Diseases: Systematic Review
topic Review