Natural Language Processing to Improve Prediction of Incident Atrial Fibrillation Using Electronic Health Records

Bibliographic Details
Main Authors:	Ashburner, Jeffrey M.; Chang, Yuchiao; Wang, Xin; Khurshid, Shaan; Anderson, Christopher D.; Dahal, Kumar; Weisenfeld, Dana; Cai, Tianrun; Liao, Katherine P.; Wagholikar, Kavishwar B.; Murphy, Shawn N.; Atlas, Steven J.; Lubitz, Steven A.; Singer, Daniel E.
Format:	Online Article Text
Language:	English
Published:	John Wiley and Sons Inc. 2022
Subjects:	Original Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9375475/ https://www.ncbi.nlm.nih.gov/pubmed/35904194 http://dx.doi.org/10.1161/JAHA.122.026014

collection	PubMed
description	BACKGROUND: Models predicting atrial fibrillation (AF) risk, such as Cohorts for Heart and Aging Research in Genomic Epidemiology AF (CHARGE‐AF), have not performed as well in electronic health records. Natural language processing (NLP) may improve models by using narrative electronic health record text. METHODS AND RESULTS: From a primary care network, we included patients aged ≥65 years with visits between 2003 and 2013 in development (n=32 960) and internal validation cohorts (n=13 992). An external validation cohort from a separate network from 2015 to 2020 included 39 051 patients. Model features were defined using electronic health record codified data and narrative data with NLP. We developed 2 models to predict 5‐year AF incidence using (1) codified+NLP data and (2) codified data only and evaluated model performance. The analysis included 2839 incident AF cases in the development cohort and 1057 and 2226 cases in internal and external validation cohorts, respectively. The C‐statistic was greater (P<0.001) in codified+NLP model (0.744 [95% CI, 0.735–0.753]) compared with codified‐only (0.730 [95% CI, 0.720–0.739]) in the development cohort. In internal validation, the C‐statistic of codified+NLP was modestly higher (0.735 [95% CI, 0.720–0.749]) compared with codified‐only (0.729 [95% CI, 0.715–0.744]; P=0.06) and CHARGE‐AF (0.717 [95% CI, 0.703–0.731]; P=0.002). Codified+NLP and codified‐only were well calibrated, whereas CHARGE‐AF underestimated AF risk. In external validation, the C‐statistic of codified+NLP (0.750 [95% CI, 0.740–0.760]) remained higher (P<0.001) than codified‐only (0.738 [95% CI, 0.727–0.748]) and CHARGE‐AF (0.735 [95% CI, 0.725–0.746]). CONCLUSIONS: Estimation of 5‐year risk of AF can be modestly improved using NLP to incorporate narrative electronic health record data.
format	Online Article Text
id	pubmed-9375475
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	John Wiley and Sons Inc.
record_format	MEDLINE/PubMed
spelling	pubmed-9375475 2022-08-17 J Am Heart Assoc Original Research John Wiley and Sons Inc. 2022-07-29 /pmc/articles/PMC9375475/ /pubmed/35904194 http://dx.doi.org/10.1161/JAHA.122.026014 Text en © 2022 The Authors. Published on behalf of the American Heart Association, Inc., by Wiley. This is an open access article under the terms of the CC BY-NC-ND 4.0 license (https://creativecommons.org/licenses/by-nc-nd/4.0/), which permits use and distribution in any medium, provided the original work is properly cited, the use is non‐commercial and no modifications or adaptations are made.