Large-scale identification of aortic stenosis and its severity using natural language processing on electronic health records

BACKGROUND: Systematic case identification is critical to improving population health, but widely used diagnosis code–based approaches for conditions like valvular heart disease are inaccurate and lack specificity. OBJECTIVE: To develop and validate natural language processing (NLP) algorithms to id...

Bibliographic Details
Main authors: Solomon, Matthew D., Tabada, Grace, Allen, Amanda, Sung, Sue Hee, Go, Alan S.
Format: Online Article Text
Language: English
Published: Elsevier 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8890044/
https://www.ncbi.nlm.nih.gov/pubmed/35265904
http://dx.doi.org/10.1016/j.cvdhj.2021.03.003
author Solomon, Matthew D.
Tabada, Grace
Allen, Amanda
Sung, Sue Hee
Go, Alan S.
collection PubMed
description BACKGROUND: Systematic case identification is critical to improving population health, but widely used diagnosis code–based approaches for conditions like valvular heart disease are inaccurate and lack specificity. OBJECTIVE: To develop and validate natural language processing (NLP) algorithms to identify aortic stenosis (AS) cases and associated parameters from semi-structured echocardiogram reports and compare their accuracy to administrative diagnosis codes. METHODS: Using 1003 physician-adjudicated echocardiogram reports from Kaiser Permanente Northern California, a large, integrated healthcare system (>4.5 million members), NLP algorithms were developed and validated to achieve positive and negative predictive values > 95% for identifying AS and associated echocardiographic parameters. Final NLP algorithms were applied to all adult echocardiography reports performed between 2008 and 2018 and compared to ICD-9/10 diagnosis code–based definitions for AS found from 14 days before to 6 months after the procedure date. RESULTS: A total of 927,884 eligible echocardiograms were identified during the study period among 519,967 patients. Application of the final NLP algorithm classified 104,090 (11.2%) echocardiograms with any AS (mean age 75.2 years, 52% women), with only 67,297 (64.6%) having a diagnosis code for AS between 14 days before and up to 6 months after the associated echocardiogram. Among those without associated diagnosis codes, 19% of patients had hemodynamically significant AS (ie, greater than mild disease). CONCLUSION: A validated NLP algorithm applied to a systemwide echocardiography database was substantially more accurate than diagnosis codes for identifying AS. Leveraging machine learning–based approaches on unstructured electronic health record data can facilitate more effective individual and population management than using administrative data alone.
format Online
Article
Text
id pubmed-8890044
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-8890044 2022-03-08 Cardiovasc Digit Health J Clinical Elsevier 2021-03-18 /pmc/articles/PMC8890044/ /pubmed/35265904 http://dx.doi.org/10.1016/j.cvdhj.2021.03.003 Text en © 2021 Heart Rhythm Society. https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title Large-scale identification of aortic stenosis and its severity using natural language processing on electronic health records
topic Clinical