Detecting Miscoded Diabetes Diagnosis Codes in Electronic Health Records for Quality Improvement: Temporal Deep Learning Approach

BACKGROUND: Diabetes affects more than 30 million patients across the United States. With such a large disease burden, even a small classification error can be significant. Currently, billing codes, assigned at the time of a medical encounter, are the “gold standard” reflecting the actual diseases...

Bibliographic Details
Main Authors: Rashidian, Sina; Abell-Hart, Kayley; Hajagos, Janos; Moffitt, Richard; Lingam, Veena; Garcia, Victor; Tsai, Chao-Wei; Wang, Fusheng; Dong, Xinyu; Sun, Siao; Deng, Jianyuan; Gupta, Rajarsi; Miller, Joshua; Saltz, Joel; Saltz, Mary
Format: Online Article Text
Language: English
Published: JMIR Publications 2020
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7775195/
https://www.ncbi.nlm.nih.gov/pubmed/33331828
http://dx.doi.org/10.2196/22649
_version_ 1783630426012647424
author Rashidian, Sina
Abell-Hart, Kayley
Hajagos, Janos
Moffitt, Richard
Lingam, Veena
Garcia, Victor
Tsai, Chao-Wei
Wang, Fusheng
Dong, Xinyu
Sun, Siao
Deng, Jianyuan
Gupta, Rajarsi
Miller, Joshua
Saltz, Joel
Saltz, Mary
author_sort Rashidian, Sina
collection PubMed
description BACKGROUND: Diabetes affects more than 30 million patients across the United States. With such a large disease burden, even a small classification error can be significant. Currently, billing codes, assigned at the time of a medical encounter, are the “gold standard” reflecting the actual diseases present in an individual, and thus in aggregate reflect disease prevalence in the population. These codes are generated by highly trained coders and by health care providers but are not always accurate.
OBJECTIVE: This work provides a scalable deep learning methodology to more accurately classify individuals with diabetes across multiple health care systems.
METHODS: We leveraged a long short-term memory-dense neural network (LSTM-DNN) model to identify patients with or without diabetes using data from 5 acute care facilities with 187,187 patients and 275,407 encounters, incorporating data elements including laboratory test results, diagnostic/procedure codes, medications, demographic data, and admission information. Furthermore, a blinded physician panel reviewed discordant cases, providing an estimate of the total impact on the population.
RESULTS: When predicting the documented diagnosis of diabetes, our model achieved an 84% F1 score, 96% area under the receiver operating characteristic curve, and 91% average precision on a heterogeneous data set from 5 distinct health facilities. However, in 81% of cases where the model disagreed with the documented phenotype, a blinded physician panel agreed with the model. Taken together, this suggests that 4.3% of our studied population has either a missing or an improper diabetes diagnosis.
CONCLUSIONS: This study demonstrates that deep learning methods can improve clinical phenotyping even when patient data are noisy, sparse, and heterogeneous.
format Online
Article
Text
id pubmed-7775195
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-7775195 2021-01-15 Detecting Miscoded Diabetes Diagnosis Codes in Electronic Health Records for Quality Improvement: Temporal Deep Learning Approach Rashidian, Sina Abell-Hart, Kayley Hajagos, Janos Moffitt, Richard Lingam, Veena Garcia, Victor Tsai, Chao-Wei Wang, Fusheng Dong, Xinyu Sun, Siao Deng, Jianyuan Gupta, Rajarsi Miller, Joshua Saltz, Joel Saltz, Mary JMIR Med Inform Original Paper JMIR Publications 2020-12-17 /pmc/articles/PMC7775195/ /pubmed/33331828 http://dx.doi.org/10.2196/22649 Text en ©Sina Rashidian, Kayley Abell-Hart, Janos Hajagos, Richard Moffitt, Veena Lingam, Victor Garcia, Chao-Wei Tsai, Fusheng Wang, Xinyu Dong, Siao Sun, Jianyuan Deng, Rajarsi Gupta, Joshua Miller, Joel Saltz, Mary Saltz. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 17.12.2020. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on http://medinform.jmir.org/, as well as this copyright and license information must be included.
title Detecting Miscoded Diabetes Diagnosis Codes in Electronic Health Records for Quality Improvement: Temporal Deep Learning Approach
topic Original Paper
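
The abstract in this record reports three evaluation metrics: F1 score, area under the receiver operating characteristic curve, and average precision. As a rough illustration of how such metrics are computed, the following is a minimal sketch in plain Python. It is not the authors' evaluation code; the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for this example, and the average-precision helper assumes there are no tied scores.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Ties between a positive and a negative count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of the precision values at each positive, ranked by descending score.

    Assumes no tied scores (an assumption of this sketch).
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive example")
    return total / hits


# Hypothetical toy data: 1 = diabetes documented, scores = model probabilities.
toy_labels = [1, 0, 1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed 0.5 threshold

if __name__ == "__main__":
    print("F1:", f1_score(toy_labels, toy_preds))
    print("ROC AUC:", roc_auc(toy_labels, toy_scores))
    print("Average precision:", average_precision(toy_labels, toy_scores))
```

F1 depends on the chosen decision threshold, whereas ROC AUC and average precision summarize the ranking of scores across all thresholds, which is presumably why the abstract reports all three.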