Modeling Large Sparse Data for Feature Selection: Hospital Admission Predictions of the Dementia Patients Using Primary Care Electronic Health Records

A growing elderly population suffering from incurable, chronic conditions such as dementia presents a continual strain on medical services due to mental impairment paired with high comorbidity, resulting in increased hospitalization risk. The identification of at-risk individuals allows for preventati...

Bibliographic Details
Format:	Online Article Text
Language:	English
Published:	IEEE 2020
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7737850/ https://www.ncbi.nlm.nih.gov/pubmed/33354439 http://dx.doi.org/10.1109/JTEHM.2020.3040236

_version_	1783623007104663552
collection	PubMed
description	A growing elderly population suffering from incurable, chronic conditions such as dementia presents a continual strain on medical services due to mental impairment paired with high comorbidity, resulting in increased hospitalization risk. The identification of at-risk individuals allows for preventative measures to alleviate said strain. Electronic health records provide an opportunity for big data analysis to address such applications. Such data, however, provides a challenging problem space for traditional statistics and machine learning due to high dimensionality and sparse data elements. This article proposes a novel machine learning methodology: entropy regularization with ensemble deep neural networks (ECNN), which simultaneously provides high predictive performance of hospitalization of patients with dementia whilst enabling an interpretable heuristic analysis of the model architecture, able to identify individual features of importance within a large feature domain space. Experimental results on health records containing 54,647 features were able to identify 10 event indicators within a patient timeline: a collection of diagnostic events, medication prescriptions and procedural events, the highest ranked being essential hypertension. The resulting subset was still able to provide a highly competitive hospitalization prediction (Accuracy: 0.759) as compared to the full feature domain (Accuracy: 0.755) or traditional feature selection techniques (Accuracy: 0.737), a significant reduction in feature size. The discovery and heuristic evidence of correlation support further clinical study of said medical events as potential novel indicators. There also remains great potential for adaption of ECNN within other medical big data domains as a data mining tool for novel risk factor identification.
format	Online Article Text
id	pubmed-7737850
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	IEEE
record_format	MEDLINE/PubMed
spelling	pubmed-7737850 2020-12-21 Modeling Large Sparse Data for Feature Selection: Hospital Admission Predictions of the Dementia Patients Using Primary Care Electronic Health Records IEEE J Transl Eng Health Med Article A growing elderly population suffering from incurable, chronic conditions such as dementia presents a continual strain on medical services due to mental impairment paired with high comorbidity, resulting in increased hospitalization risk. The identification of at-risk individuals allows for preventative measures to alleviate said strain. Electronic health records provide an opportunity for big data analysis to address such applications. Such data, however, provides a challenging problem space for traditional statistics and machine learning due to high dimensionality and sparse data elements. This article proposes a novel machine learning methodology: entropy regularization with ensemble deep neural networks (ECNN), which simultaneously provides high predictive performance of hospitalization of patients with dementia whilst enabling an interpretable heuristic analysis of the model architecture, able to identify individual features of importance within a large feature domain space. Experimental results on health records containing 54,647 features were able to identify 10 event indicators within a patient timeline: a collection of diagnostic events, medication prescriptions and procedural events, the highest ranked being essential hypertension. The resulting subset was still able to provide a highly competitive hospitalization prediction (Accuracy: 0.759) as compared to the full feature domain (Accuracy: 0.755) or traditional feature selection techniques (Accuracy: 0.737), a significant reduction in feature size. The discovery and heuristic evidence of correlation support further clinical study of said medical events as potential novel indicators. There also remains great potential for adaption of ECNN within other medical big data domains as a data mining tool for novel risk factor identification. IEEE 2020-11-24 /pmc/articles/PMC7737850/ /pubmed/33354439 http://dx.doi.org/10.1109/JTEHM.2020.3040236 Text en https://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
spellingShingle	Article Modeling Large Sparse Data for Feature Selection: Hospital Admission Predictions of the Dementia Patients Using Primary Care Electronic Health Records
title	Modeling Large Sparse Data for Feature Selection: Hospital Admission Predictions of the Dementia Patients Using Primary Care Electronic Health Records
title_full	Modeling Large Sparse Data for Feature Selection: Hospital Admission Predictions of the Dementia Patients Using Primary Care Electronic Health Records
title_fullStr	Modeling Large Sparse Data for Feature Selection: Hospital Admission Predictions of the Dementia Patients Using Primary Care Electronic Health Records
title_full_unstemmed	Modeling Large Sparse Data for Feature Selection: Hospital Admission Predictions of the Dementia Patients Using Primary Care Electronic Health Records
title_short	Modeling Large Sparse Data for Feature Selection: Hospital Admission Predictions of the Dementia Patients Using Primary Care Electronic Health Records
title_sort	modeling large sparse data for feature selection: hospital admission predictions of the dementia patients using primary care electronic health records
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7737850/ https://www.ncbi.nlm.nih.gov/pubmed/33354439 http://dx.doi.org/10.1109/JTEHM.2020.3040236
work_keys_str_mv	AT modelinglargesparsedataforfeatureselectionhospitaladmissionpredictionsofthedementiapatientsusingprimarycareelectronichealthrecords AT modelinglargesparsedataforfeatureselectionhospitaladmissionpredictionsofthedementiapatientsusingprimarycareelectronichealthrecords AT modelinglargesparsedataforfeatureselectionhospitaladmissionpredictionsofthedementiapatientsusingprimarycareelectronichealthrecords

Similar Items