Improving the Performance of Outcome Prediction for Inpatients With Acute Myocardial Infarction Based on Embedding Representation Learned From Electronic Medical Records: Development and Validation Study

BACKGROUND: The widespread secondary use of electronic medical records (EMRs) promotes health care quality improvement. Representation learning that can automatically extract hidden information from EMR data has gained increasing attention. OBJECTIVE: We aimed to propose a patient representation wit...

Bibliographic Details
Main authors: Huang, Yanqun; Zheng, Zhimin; Ma, Moxuan; Xin, Xin; Liu, Honglei; Fei, Xiaolu; Wei, Lan; Chen, Hui
Format: Online Article Text
Language: English
Published: JMIR Publications 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9386580/
https://www.ncbi.nlm.nih.gov/pubmed/35921141
http://dx.doi.org/10.2196/37486
_version_ 1784769844828176384
author Huang, Yanqun
Zheng, Zhimin
Ma, Moxuan
Xin, Xin
Liu, Honglei
Fei, Xiaolu
Wei, Lan
Chen, Hui
author_sort Huang, Yanqun
collection PubMed
description BACKGROUND: The widespread secondary use of electronic medical records (EMRs) promotes health care quality improvement. Representation learning that can automatically extract hidden information from EMR data has gained increasing attention. OBJECTIVE: We aimed to propose a patient representation with more feature associations and task-specific feature importance to improve the outcome prediction performance for inpatients with acute myocardial infarction (AMI). METHODS: Medical concepts, including patients’ age, gender, disease diagnoses, laboratory tests, structured radiological features, procedures, and medications, were first embedded into real-value vectors using the improved skip-gram algorithm, where concepts in the context windows were selected by feature association strengths measured by association rule confidence. Then, each patient was represented as the sum of the feature embeddings weighted by the task-specific feature importance, which was applied to facilitate predictive model prediction from global and local perspectives. We finally applied the proposed patient representation into mortality risk prediction for 3010 and 1671 AMI inpatients from a public data set and a private data set, respectively, and compared it with several reference representation methods in terms of the area under the receiver operating characteristic curve (AUROC), area under the precision-recall curve (AUPRC), and F1-score. RESULTS: Compared with the reference methods, the proposed embedding-based representation showed consistently superior predictive performance on the 2 data sets, achieving mean AUROCs of 0.878 and 0.973, AUPRCs of 0.220 and 0.505, and F1-scores of 0.376 and 0.674 for the public and private data sets, respectively, while the greatest AUROCs, AUPRCs, and F1-scores among the reference methods were 0.847 and 0.939, 0.196 and 0.283, and 0.344 and 0.361 for the public and private data sets, respectively. 
Feature importance integrated in patient representation reflected features that were also critical in prediction tasks and clinical practice. CONCLUSIONS: The introduction of feature associations and feature importance facilitated an effective patient representation and contributed to prediction performance improvement and model interpretation.
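The two building blocks named in the METHODS portion of the description, association rule confidence between concepts and a patient vector formed as an importance-weighted sum of concept embeddings, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, concept labels, embedding values, and importance weights below are hypothetical, and the skip-gram training and the estimation of feature importance are not shown.

```python
from collections import Counter
from itertools import permutations


def rule_confidence(records):
    """Return confidence(a -> b) = count(a and b) / count(a) for each ordered pair."""
    item_counts = Counter()
    pair_counts = Counter()
    for record in records:
        items = set(record)  # each concept counts once per record
        item_counts.update(items)
        pair_counts.update(permutations(sorted(items), 2))
    return {(a, b): n / item_counts[a] for (a, b), n in pair_counts.items()}


def patient_vector(concepts, embeddings, importance):
    """Sum the embeddings of a patient's concepts, each scaled by its importance."""
    dim = len(next(iter(embeddings.values())))
    vec = [0.0] * dim
    for concept in concepts:
        weight = importance.get(concept, 0.0)  # unknown concepts contribute nothing
        for i, value in enumerate(embeddings[concept]):
            vec[i] += weight * value
    return vec


# Hypothetical toy data: three admissions, each a list of medical concepts.
records = [
    ["age_60s", "dx_AMI", "rx_aspirin"],
    ["dx_AMI", "rx_aspirin"],
    ["dx_AMI", "lab_troponin_high"],
]
confidence = rule_confidence(records)

# Hypothetical 2-dimensional embeddings and task-specific importance weights.
embeddings = {"dx_AMI": [1.0, 0.0], "rx_aspirin": [0.0, 2.0]}
importance = {"dx_AMI": 0.5, "rx_aspirin": 0.25}
vector = patient_vector(["dx_AMI", "rx_aspirin"], embeddings, importance)
```

In this sketch, a context-selection step in the spirit of the description would keep, for each target concept, the co-occurring concepts with the highest confidence values before skip-gram training. The resulting patient vectors would then be passed to whatever classifier is used for mortality prediction.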
format Online
Article
Text
id pubmed-9386580
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-9386580 2022-08-19 J Med Internet Res Original Paper JMIR Publications 2022-08-03 /pmc/articles/PMC9386580/ /pubmed/35921141 http://dx.doi.org/10.2196/37486 Text en ©Yanqun Huang, Zhimin Zheng, Moxuan Ma, Xin Xin, Honglei Liu, Xiaolu Fei, Lan Wei, Hui Chen. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 03.08.2022. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.
title Improving the Performance of Outcome Prediction for Inpatients With Acute Myocardial Infarction Based on Embedding Representation Learned From Electronic Medical Records: Development and Validation Study
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9386580/
https://www.ncbi.nlm.nih.gov/pubmed/35921141
http://dx.doi.org/10.2196/37486