
An independently validated, portable algorithm for the rapid identification of COPD patients using electronic health records


Bibliographic Details
Main Authors: Chu, Su H., Wan, Emily S., Cho, Michael H., Goryachev, Sergey, Gainer, Vivian, Linneman, James, Scotty, Erica J., Hebbring, Scott J., Murphy, Shawn, Lasky-Su, Jessica, Weiss, Scott T., Smoller, Jordan W., Karlson, Elizabeth
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8497529/
https://www.ncbi.nlm.nih.gov/pubmed/34620889
http://dx.doi.org/10.1038/s41598-021-98719-w
author Chu, Su H.
Wan, Emily S.
Cho, Michael H.
Goryachev, Sergey
Gainer, Vivian
Linneman, James
Scotty, Erica J.
Hebbring, Scott J.
Murphy, Shawn
Lasky-Su, Jessica
Weiss, Scott T.
Smoller, Jordan W.
Karlson, Elizabeth
collection PubMed
description Electronic health records (EHR) provide an unprecedented opportunity to conduct large, cost-efficient, population-based studies. However, studies of heterogeneous diseases, such as chronic obstructive pulmonary disease (COPD), often require labor-intensive clinical review and testing, which limits widespread use of these important resources. To develop a generalizable and efficient method for accurate identification of large COPD cohorts in EHRs, a COPD datamart was developed from 3420 participants meeting inclusion criteria in the Mass General Brigham Biobank (MGBB). Training and test sets were selected and labeled with gold-standard COPD classifications obtained from chart review by pulmonologists. Multiple classes of algorithms were built using both structured (e.g., ICD codes) and unstructured (e.g., medical notes) data via elastic net regression. Models explicitly including and excluding spirometry features were compared. External validation of the final algorithm was conducted in an independent biobank with a different EHR system. The final COPD classification model demonstrated excellent positive predictive value (PPV; 91.7%), sensitivity (71.7%), and specificity (94.4%). The algorithm performed well not only within the MGBB but also demonstrated similar or improved classification performance in an independent biobank (PPV 93.5%, sensitivity 61.4%, specificity 90%). Ancillary comparisons showed that the classification model built with a binary feature for FEV1/FVC produced substantially higher sensitivity than models excluding it. This study fills a gap in COPD research involving population-based EHRs, providing an important resource for the rapid, automated classification of COPD cases that is cost-efficient and requires minimal information from unstructured medical records.
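The abstract reports performance as PPV, sensitivity, and specificity against chart-review labels. As a reminder of how those three figures relate to a confusion matrix, here is a minimal sketch in Python. It is not the authors' code: the names `ConfusionCounts`, `tally`, and `classification_metrics` are hypothetical, and the label vectors are made-up toy data, not study results.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    """Cell counts of a 2x2 confusion matrix (hypothetical helper type)."""
    tp: int  # gold-standard case, classified as case
    fp: int  # gold-standard non-case, classified as case
    tn: int  # gold-standard non-case, classified as non-case
    fn: int  # gold-standard case, classified as non-case


def tally(gold: Sequence[int], predicted: Sequence[int]) -> ConfusionCounts:
    """Count outcomes, treating 1 as 'COPD' and 0 as 'not COPD'."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    tp = fp = tn = fn = 0
    for g, p in zip(gold, predicted):
        if g == 1 and p == 1:
            tp += 1
        elif g == 0 and p == 1:
            fp += 1
        elif g == 0 and p == 0:
            tn += 1
        else:
            fn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator/denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def classification_metrics(c: ConfusionCounts) -> dict:
    """PPV, sensitivity, and specificity from confusion-matrix counts."""
    return {
        "ppv": _ratio(c.tp, c.tp + c.fp),          # TP / (TP + FP)
        "sensitivity": _ratio(c.tp, c.tp + c.fn),  # TP / (TP + FN)
        "specificity": _ratio(c.tn, c.tn + c.fp),  # TN / (TN + FP)
    }


# Toy labels for illustration only (not data from the study).
gold_labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
model_labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

counts = tally(gold_labels, model_labels)
print(classification_metrics(counts))
```

A high PPV with lower sensitivity, as in the reported figures, means that most patients flagged by the classifier are true cases while some true cases go unflagged.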
format Online
Article
Text
id pubmed-8497529
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-8497529 2021-10-12 Sci Rep Article Nature Publishing Group UK 2021-10-07 /pmc/articles/PMC8497529/ /pubmed/34620889 http://dx.doi.org/10.1038/s41598-021-98719-w Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title An independently validated, portable algorithm for the rapid identification of COPD patients using electronic health records
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8497529/
https://www.ncbi.nlm.nih.gov/pubmed/34620889
http://dx.doi.org/10.1038/s41598-021-98719-w