
Combining structured and unstructured data for predictive models: a deep learning approach

BACKGROUND: The broad adoption of electronic health records (EHRs) provides great opportunities to conduct health care research and solve various clinical problems in medicine. With recent advances and success, methods based on machine learning and deep learning have become increasingly popular in m...


Bibliographic Details
Main Authors:	Zhang, Dongdong; Yin, Changchang; Zeng, Jucheng; Yuan, Xiaohui; Zhang, Ping
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2020
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7596962/ https://www.ncbi.nlm.nih.gov/pubmed/33121479 http://dx.doi.org/10.1186/s12911-020-01297-6

author	Zhang, Dongdong; Yin, Changchang; Zeng, Jucheng; Yuan, Xiaohui; Zhang, Ping
author_sort	Zhang, Dongdong
collection	PubMed
description	BACKGROUND: The broad adoption of electronic health records (EHRs) provides great opportunities to conduct health care research and solve various clinical problems in medicine. With recent advances and success, methods based on machine learning and deep learning have become increasingly popular in medical informatics. However, while many research studies utilize temporal structured data on predictive modeling, they typically neglect potentially valuable information in unstructured clinical notes. Integrating heterogeneous data types across EHRs through deep learning techniques may help improve the performance of prediction models. METHODS: In this research, we proposed 2 general-purpose multi-modal neural network architectures to enhance patient representation learning by combining sequential unstructured notes with structured data. The proposed fusion models leverage document embeddings for the representation of long clinical note documents and either convolutional neural network or long short-term memory networks to model the sequential clinical notes and temporal signals, and one-hot encoding for static information representation. The concatenated representation is the final patient representation which is used to make predictions. RESULTS: We evaluate the performance of proposed models on 3 risk prediction tasks (i.e. in-hospital mortality, 30-day hospital readmission, and long length of stay prediction) using derived data from the publicly available Medical Information Mart for Intensive Care III dataset. Our results show that by combining unstructured clinical notes with structured data, the proposed models outperform other models that utilize either unstructured notes or structured data only. CONCLUSIONS: The proposed fusion models learn better patient representation by combining structured and unstructured data. Integrating heterogeneous data types across EHRs helps improve the performance of prediction models and reduce errors.
format	Online Article Text
id	pubmed-7596962
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-7596962 2020-11-02 BMC Med Inform Decis Mak Research Article BioMed Central 2020-10-29 /pmc/articles/PMC7596962/ /pubmed/33121479 http://dx.doi.org/10.1186/s12911-020-01297-6 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title	Combining structured and unstructured data for predictive models: a deep learning approach
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7596962/ https://www.ncbi.nlm.nih.gov/pubmed/33121479 http://dx.doi.org/10.1186/s12911-020-01297-6
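
The METHODS part of the abstract describes building a patient representation by concatenating a one-hot static vector with encoded temporal signals and encoded note embeddings. The following is a minimal Python sketch of that concatenation step only, not the authors' implementation. It assumes mean pooling as a stand-in for the learned CNN or LSTM encoders named in the abstract, and every function name, feature name, and value in it is hypothetical.

```python
from typing import List, Sequence


def one_hot(value: str, vocabulary: Sequence[str]) -> List[float]:
    """Encode one categorical static feature as a one-hot vector."""
    return [1.0 if value == entry else 0.0 for entry in vocabulary]


def mean_pool(sequence: Sequence[Sequence[float]]) -> List[float]:
    """Average a sequence of equal-length vectors over its time axis.

    Assumption: this is a simple placeholder for a learned sequence
    encoder (the abstract names CNN or LSTM); it has no trainable weights.
    """
    if not sequence:
        raise ValueError("sequence must contain at least one vector")
    length = len(sequence)
    return [sum(column) / length for column in zip(*sequence)]


def fuse(
    static_vec: Sequence[float],
    temporal_seq: Sequence[Sequence[float]],
    note_embeddings: Sequence[Sequence[float]],
) -> List[float]:
    """Concatenate static, temporal, and note representations."""
    return list(static_vec) + mean_pool(temporal_seq) + mean_pool(note_embeddings)


# Hypothetical toy patient: admission type, two time steps of two
# temporal signals, and two note documents with 3-dimensional embeddings.
static = one_hot("emergency", ["elective", "emergency", "urgent"])
temporal = [[80.0, 120.0], [90.0, 110.0]]
notes = [[0.25, 0.75, 0.5], [0.75, 0.25, 0.5]]

patient = fuse(static, temporal, notes)
print(patient)  # [0.0, 1.0, 0.0, 85.0, 115.0, 0.5, 0.5, 0.5]
```

In a trained model the fused vector would feed a prediction layer for tasks such as in-hospital mortality; the sketch stops at the representation because the abstract gives no further architectural detail.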
