Predictive structured–unstructured interactions in EHR models: A case study of suicide prediction

Clinical risk prediction models powered by electronic health records (EHRs) are becoming increasingly widespread in clinical practice. With suicide-related mortality rates rising in recent years, it is becoming increasingly urgent to understand, predict, and prevent suicidal behavior. Here, we compa...

Bibliographic Details
Main Authors:	Bayramli, Ilkin; Castro, Victor; Barak-Corren, Yuval; Madsen, Emily M.; Nock, Matthew K.; Smoller, Jordan W.; Reis, Ben Y.
Format:	Online Article Text
Language:	English
Published:	Nature Publishing Group UK 2022
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8795240/ https://www.ncbi.nlm.nih.gov/pubmed/35087182 http://dx.doi.org/10.1038/s41746-022-00558-0

collection	PubMed
description	Clinical risk prediction models powered by electronic health records (EHRs) are becoming increasingly widespread in clinical practice. With suicide-related mortality rates rising in recent years, it is becoming increasingly urgent to understand, predict, and prevent suicidal behavior. Here, we compare the predictive value of structured and unstructured EHR data for predicting suicide risk. We find that Naive Bayes Classifier (NBC) and Random Forest (RF) models trained on structured EHR data perform better than those based on unstructured EHR data. An NBC model trained on both structured and unstructured data yields similar performance (AUC = 0.743) to an NBC model trained on structured data alone (0.742, p = 0.668), while an RF model trained on both data types yields significantly better results (AUC = 0.903) than an RF model trained on structured data alone (0.887, p < 0.001), likely due to the RF model’s ability to capture interactions between the two data types. To investigate these interactions, we propose and implement a general framework for identifying specific structured-unstructured feature pairs whose interactions differ between case and non-case cohorts, and thus have the potential to improve predictive performance and increase understanding of clinical risk. We find that such feature pairs tend to capture heterogeneous pairs of general concepts, rather than homogeneous pairs of specific concepts. These findings and this framework can be used to improve current and future EHR-based clinical modeling efforts.
id	pubmed-8795240
institution	National Center for Biotechnology Information
record_format	MEDLINE/PubMed
spelling	pubmed-8795240 2022-02-07 NPJ Digit Med Article
Nature Publishing Group UK 2022-01-27 /pmc/articles/PMC8795240/ /pubmed/35087182 http://dx.doi.org/10.1038/s41746-022-00558-0 Text en © The Author(s) 2022. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit https://creativecommons.org/licenses/by/4.0/.