Synthetic electronic health records generated with variational graph autoencoders

Data-driven medical care delivery must always respect patient privacy—a requirement that is not easily met. This issue has impeded improvements to healthcare software and has delayed the long-predicted prevalence of artificial intelligence in healthcare. Until now, it has been very difficult to shar...

Bibliographic Details
Main authors: Nikolentzos, Giannis, Vazirgiannis, Michalis, Xypolopoulos, Christos, Lingman, Markus, Brandt, Erik G.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10148837/
https://www.ncbi.nlm.nih.gov/pubmed/37120594
http://dx.doi.org/10.1038/s41746-023-00822-x
_version_ 1785035056796925952
author Nikolentzos, Giannis
Vazirgiannis, Michalis
Xypolopoulos, Christos
Lingman, Markus
Brandt, Erik G.
author_sort Nikolentzos, Giannis
collection PubMed
description Data-driven medical care delivery must always respect patient privacy—a requirement that is not easily met. This issue has impeded improvements to healthcare software and has delayed the long-predicted prevalence of artificial intelligence in healthcare. Until now, it has been very difficult to share data between healthcare organizations, resulting in poor statistical models due to unrepresentative patient cohorts. Synthetic data, i.e., artificial but realistic electronic health records, could overcome the drought that is troubling the healthcare sector. Deep neural network architectures, in particular, have shown an incredible ability to learn from complex data sets and generate large amounts of unseen data points with the same statistical properties as the training data. Here, we present a generative neural network model that can create synthetic health records with realistic timelines. These clinical trajectories are generated on a per-patient basis and are represented as linear-sequence graphs of clinical events over time. We use a variational graph autoencoder (VGAE) to generate synthetic samples from real-world electronic health records. Our approach generates health records not seen in the training data. We show that these artificial patient trajectories are realistic and preserve patient privacy and can therefore support the safe sharing of data across organizations.
format Online
Article
Text
id pubmed-10148837
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-10148837 2023-05-01 Synthetic electronic health records generated with variational graph autoencoders Nikolentzos, Giannis Vazirgiannis, Michalis Xypolopoulos, Christos Lingman, Markus Brandt, Erik G. NPJ Digit Med Article
Nature Publishing Group UK 2023-04-29 /pmc/articles/PMC10148837/ /pubmed/37120594 http://dx.doi.org/10.1038/s41746-023-00822-x Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Synthetic electronic health records generated with variational graph autoencoders
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10148837/
https://www.ncbi.nlm.nih.gov/pubmed/37120594
http://dx.doi.org/10.1038/s41746-023-00822-x
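
The abstract in this record describes each patient's clinical trajectory as a linear-sequence graph of clinical events over time, which a variational graph autoencoder then learns to generate. As a rough illustration of that data representation only (not the authors' code, and not the VGAE itself), a minimal Python sketch follows. The `ClinicalEvent` class, its `day` and `code` fields, the event labels, and the choice of an undirected adjacency matrix are all assumptions made for this example; the record does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClinicalEvent:
    """Hypothetical event record; the field names are assumptions."""
    day: int   # days since the patient's first recorded contact
    code: str  # illustrative event label, not a real coding system


def trajectory_to_path_graph(events):
    """Order events in time and link each one to its successor.

    Returns the ordered nodes, the edge list, and a symmetric
    adjacency matrix (undirected edges are an assumption here).
    """
    ordered = sorted(events, key=lambda e: e.day)
    n = len(ordered)
    # A linear-sequence (path) graph has exactly n - 1 edges.
    edges = [(i, i + 1) for i in range(n - 1)]
    adjacency = [[0] * n for _ in range(n)]
    for i, j in edges:
        adjacency[i][j] = 1
        adjacency[j][i] = 1
    return ordered, edges, adjacency


# Made-up trajectory for one fictional patient, given out of order.
events = [
    ClinicalEvent(day=30, code="lab_test"),
    ClinicalEvent(day=0, code="admission"),
    ClinicalEvent(day=2, code="diagnosis"),
]

nodes, edges, adjacency = trajectory_to_path_graph(events)
print([e.code for e in nodes])
print(edges)
for row in adjacency:
    print(row)
```

In a graph autoencoder setting, an adjacency matrix like this one, together with per-node features, would typically be the model input; how the authors encode node features is not stated in this record.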