Modeling electronic health record data using an end-to-end knowledge-graph-informed topic model

The rapid growth of electronic health record (EHR) datasets opens up promising opportunities to understand human diseases in a systematic way. However, effective extraction of clinical knowledge from EHR data has been hindered by the sparse and noisy information. We present Graph ATtention-Embedded...

Bibliographic Details
Main Authors: Zou, Yuesong; Pesaranghader, Ahmad; Song, Ziyang; Verma, Aman; Buckeridge, David L.; Li, Yue
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9596500/
https://www.ncbi.nlm.nih.gov/pubmed/36284225
http://dx.doi.org/10.1038/s41598-022-22956-w
author Zou, Yuesong
Pesaranghader, Ahmad
Song, Ziyang
Verma, Aman
Buckeridge, David L.
Li, Yue
collection PubMed
description The rapid growth of electronic health record (EHR) datasets opens up promising opportunities to understand human diseases in a systematic way. However, effective extraction of clinical knowledge from EHR data has been hindered by the sparse and noisy information. We present Graph ATtention-Embedded Topic Model (GAT-ETM), an end-to-end taxonomy-knowledge-graph-based multimodal embedded topic model. GAT-ETM distills latent disease topics from EHR data by learning the embedding from a constructed medical knowledge graph. We applied GAT-ETM to a large-scale EHR dataset consisting of over 1 million patients. We evaluated its performance based on topic quality, drug imputation, and disease diagnosis prediction. GAT-ETM demonstrated superior performance over the alternative methods on all tasks. Moreover, GAT-ETM learned clinically meaningful graph-informed embedding of the EHR codes and discovered interpretable and accurate patient representations for patient stratification and drug recommendations. GAT-ETM code is available at https://github.com/li-lab-mcgill/GAT-ETM.
format Online
Article
Text
id pubmed-9596500
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-9596500 2022-10-27 Modeling electronic health record data using an end-to-end knowledge-graph-informed topic model Zou, Yuesong Pesaranghader, Ahmad Song, Ziyang Verma, Aman Buckeridge, David L. Li, Yue Sci Rep Article The rapid growth of electronic health record (EHR) datasets opens up promising opportunities to understand human diseases in a systematic way. However, effective extraction of clinical knowledge from EHR data has been hindered by the sparse and noisy information. We present Graph ATtention-Embedded Topic Model (GAT-ETM), an end-to-end taxonomy-knowledge-graph-based multimodal embedded topic model. GAT-ETM distills latent disease topics from EHR data by learning the embedding from a constructed medical knowledge graph. We applied GAT-ETM to a large-scale EHR dataset consisting of over 1 million patients. We evaluated its performance based on topic quality, drug imputation, and disease diagnosis prediction. GAT-ETM demonstrated superior performance over the alternative methods on all tasks. Moreover, GAT-ETM learned clinically meaningful graph-informed embedding of the EHR codes and discovered interpretable and accurate patient representations for patient stratification and drug recommendations. GAT-ETM code is available at https://github.com/li-lab-mcgill/GAT-ETM. Nature Publishing Group UK 2022-10-25 /pmc/articles/PMC9596500/ /pubmed/36284225 http://dx.doi.org/10.1038/s41598-022-22956-w Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title Modeling electronic health record data using an end-to-end knowledge-graph-informed topic model
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9596500/
https://www.ncbi.nlm.nih.gov/pubmed/36284225
http://dx.doi.org/10.1038/s41598-022-22956-w