Natural language processing and machine learning of electronic health records for prediction of first-time suicide attempts

OBJECTIVE: Limited research exists in predicting first-time suicide attempts that account for two-thirds of suicide decedents. We aimed to predict first-time suicide attempts using a large data-driven approach that applies natural language processing (NLP) and machine learning (ML) to unstructured (...

Bibliographic Details
Main Authors:	Tsui, Fuchiang R; Shi, Lingyun; Ruiz, Victor; Ryan, Neal D; Biernesser, Candice; Iyengar, Satish; Walsh, Colin G; Brent, David A
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2021
Subjects:	Research and Applications
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7966858/ https://www.ncbi.nlm.nih.gov/pubmed/33758800 http://dx.doi.org/10.1093/jamiaopen/ooab011

author	Tsui, Fuchiang R; Shi, Lingyun; Ruiz, Victor; Ryan, Neal D; Biernesser, Candice; Iyengar, Satish; Walsh, Colin G; Brent, David A
author_sort	Tsui, Fuchiang R
collection	PubMed
description	OBJECTIVE: Limited research exists in predicting first-time suicide attempts that account for two-thirds of suicide decedents. We aimed to predict first-time suicide attempts using a large data-driven approach that applies natural language processing (NLP) and machine learning (ML) to unstructured (narrative) clinical notes and structured electronic health record (EHR) data. METHODS: This case-control study included patients aged 10–75 years who were seen between 2007 and 2016 from emergency departments and inpatient units. Cases were first-time suicide attempts from coded diagnosis; controls were randomly selected without suicide attempts regardless of demographics, following a ratio of nine controls per case. Four data-driven ML models were evaluated using 2-year historical EHR data prior to suicide attempt or control index visits, with prediction windows from 7 to 730 days. Patients without any historical notes were excluded. Model evaluation on accuracy and robustness was performed on a blind dataset (30% cohort). RESULTS: The study cohort included 45 238 patients (5099 cases, 40 139 controls) comprising 54 651 variables from 5.7 million structured records and 798 665 notes. Using both unstructured and structured data resulted in significantly greater accuracy compared to structured data alone (area-under-the-curve [AUC]: 0.932 vs. 0.901 P < .001). The best-predicting model utilized 1726 variables with AUC = 0.932 (95% CI, 0.922–0.941). The model was robust across multiple prediction windows and subgroups by demographics, points of historical most recent clinical contact, and depression diagnosis history. CONCLUSIONS: Our large data-driven approach using both structured and unstructured EHR data demonstrated accurate and robust first-time suicide attempt prediction, and has the potential to be deployed across various populations and clinical settings.
format	Online Article Text
id	pubmed-7966858
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-7966858 2021-03-22 JAMIA Open Research and Applications Oxford University Press 2021-03-17 /pmc/articles/PMC7966858/ /pubmed/33758800 http://dx.doi.org/10.1093/jamiaopen/ooab011 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of the American Medical Informatics Association. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title	Natural language processing and machine learning of electronic health records for prediction of first-time suicide attempts
topic	Research and Applications
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7966858/ https://www.ncbi.nlm.nih.gov/pubmed/33758800 http://dx.doi.org/10.1093/jamiaopen/ooab011
