Natural language processing for the assessment of cardiovascular disease comorbidities: The cardio‐Canary comorbidity project

Objective: Accurate ascertainment of comorbidities is paramount in clinical research. While manual adjudication is labor‐intensive and expensive, the adoption of electronic health records enables computational analysis of free‐text documentation using natural language processing (NLP) tools. Hypothe...

Bibliographic Details
Main Authors: Berman, Adam N., Biery, David W., Ginder, Curtis, Hulme, Olivia L., Marcusa, Daniel, Leiva, Orly, Wu, Wanda Y., Cardin, Nicholas, Hainer, Jon, Bhatt, Deepak L., Di Carli, Marcelo F., Turchin, Alexander, Blankstein, Ron
Format: Online Article Text
Language: English
Published: Wiley Periodicals, Inc. 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8428009/
https://www.ncbi.nlm.nih.gov/pubmed/34347314
http://dx.doi.org/10.1002/clc.23687
author Berman, Adam N.
Biery, David W.
Ginder, Curtis
Hulme, Olivia L.
Marcusa, Daniel
Leiva, Orly
Wu, Wanda Y.
Cardin, Nicholas
Hainer, Jon
Bhatt, Deepak L.
Di Carli, Marcelo F.
Turchin, Alexander
Blankstein, Ron
author_sort Berman, Adam N.
collection PubMed
description Objective: Accurate ascertainment of comorbidities is paramount in clinical research. While manual adjudication is labor‐intensive and expensive, the adoption of electronic health records enables computational analysis of free‐text documentation using natural language processing (NLP) tools. Hypothesis: We sought to develop highly accurate NLP modules to assess for the presence of five key cardiovascular comorbidities in a large electronic health record system. Methods: One thousand clinical notes were randomly selected from a cardiovascular registry at Mass General Brigham. Trained physicians manually adjudicated these notes for the following five diagnostic comorbidities: hypertension, dyslipidemia, diabetes, coronary artery disease, and stroke/transient ischemic attack. Using the open‐source Canary NLP system, five separate NLP modules were designed based on 800 “training‐set” notes and validated on 200 “test‐set” notes. Results: Across the five NLP modules, the sentence‐level and note‐level sensitivity, specificity, and positive predictive value were always greater than 85% and were most often greater than 90%. Accuracy tended to be highest for conditions with greater diagnostic clarity (e.g., diabetes and hypertension) and slightly lower for conditions whose greater diagnostic challenges (e.g., myocardial infarction and embolic stroke) may lead to less definitive documentation. Conclusion: We designed five open‐source and highly accurate NLP modules that can be used to assess for the presence of important cardiovascular comorbidities in free‐text health records. These modules have been placed in the public domain and can be used for clinical research, trial recruitment, and population management at any institution, and can serve as the basis for further development of cardiovascular NLP tools.
format Online
Article
Text
id pubmed-8428009
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Wiley Periodicals, Inc.
record_format MEDLINE/PubMed
spelling pubmed-8428009 2021-09-13 Clin Cardiol Clinical Investigations Wiley Periodicals, Inc. 2021-08-04 /pmc/articles/PMC8428009/ /pubmed/34347314 http://dx.doi.org/10.1002/clc.23687 Text en © 2021 The Authors. Clinical Cardiology published by Wiley Periodicals LLC. This is an open access article under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
title Natural language processing for the assessment of cardiovascular disease comorbidities: The cardio‐Canary comorbidity project
topic Clinical Investigations
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8428009/
https://www.ncbi.nlm.nih.gov/pubmed/34347314
http://dx.doi.org/10.1002/clc.23687