A Predictive Model for Medical Events Based on Contextual Embedding of Temporal Sequences

Bibliographic Details
Main Authors: Farhan, Wael, Wang, Zhimu, Huang, Yingxiang, Wang, Shuang, Wang, Fei, Jiang, Xiaoqian
Format: Online Article Text
Language: English
Published: JMIR Publications 2016
Subjects: Original Paper
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5148810/
https://www.ncbi.nlm.nih.gov/pubmed/27888170
http://dx.doi.org/10.2196/medinform.5977
author Farhan, Wael
Wang, Zhimu
Huang, Yingxiang
Wang, Shuang
Wang, Fei
Jiang, Xiaoqian
collection PubMed
description BACKGROUND: Medical concepts are inherently ambiguous and error-prone due to human fallibility, which makes it hard for them to be fully used by classical machine learning methods (eg, for tasks like early stage disease prediction). OBJECTIVE: Our work was to create a new machine-friendly representation that resembles the semantics of medical concepts. We then developed a sequential predictive model for medical events based on this new representation. METHODS: We developed novel contextual embedding techniques to combine different medical events (eg, diagnoses, prescriptions, and laboratory tests). Each medical event is converted into a numerical vector that resembles its “semantics,” via which the similarity between medical events can be easily measured. We developed simple and effective predictive models based on these vectors to predict novel diagnoses. RESULTS: We evaluated our sequential prediction model (and standard learning methods) in estimating the risk of potential diseases based on our contextual embedding representation. Our model achieved an area under the receiver operating characteristic (ROC) curve (AUC) of 0.79 on chronic systolic heart failure and an average AUC of 0.67 (over the 80 most common diagnoses) using the Medical Information Mart for Intensive Care III (MIMIC-III) dataset. CONCLUSIONS: We propose a general early prognosis predictor for 80 different diagnoses. Our method computes a numeric representation for each medical event to uncover the potential meaning of those events. Our results demonstrate the efficiency of the proposed method, which will benefit patients and physicians by offering more accurate diagnosis.
format Online
Article
Text
id pubmed-5148810
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-5148810 2016-12-20 A Predictive Model for Medical Events Based on Contextual Embedding of Temporal Sequences Farhan, Wael Wang, Zhimu Huang, Yingxiang Wang, Shuang Wang, Fei Jiang, Xiaoqian JMIR Med Inform Original Paper
JMIR Publications 2016-11-25 /pmc/articles/PMC5148810/ /pubmed/27888170 http://dx.doi.org/10.2196/medinform.5977 Text en ©Wael Farhan, Zhimu Wang, Yingxiang Huang, Shuang Wang, Fei Wang, Xiaoqian Jiang. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 25.11.2016. This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on http://medinform.jmir.org/, as well as this copyright and license information must be included.
title A Predictive Model for Medical Events Based on Contextual Embedding of Temporal Sequences
topic Original Paper