Synthetic data as an enabler for machine learning applications in medicine

Bibliographic Details
Main authors: Rajotte, Jean-Francois; Bergen, Robert; Buckeridge, David L.; El Emam, Khaled; Ng, Raymond; Strome, Elissa
Format: Online Article Text
Language: English
Published: Elsevier 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9619172/
https://www.ncbi.nlm.nih.gov/pubmed/36325058
http://dx.doi.org/10.1016/j.isci.2022.105331
Collection: PubMed
Description: Synthetic data generation (SDG) is the process of using machine learning methods to train a model that captures the patterns in a real dataset; new, synthetic data can then be generated from the trained model. Because the synthetic data does not map one-to-one to the original data or to real patients, it has the potential to preserve privacy. Interest in applying synthetic data across health and life sciences is growing, but fully realizing the benefits requires further education, research, and policy innovation. This article summarizes the opportunities and challenges of SDG for health data and provides directions for how the technology can be leveraged to accelerate data access for secondary purposes.
Record ID: pubmed-9619172
Institution: National Center for Biotechnology Information
Record format: MEDLINE/PubMed
Journal: iScience
Article type: Perspective
Publication date: 2022-10-13
License: © 2022 The Author(s). This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
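
Illustrative sketch (not from the article): the record's description outlines synthetic data generation as two steps, fitting a model to a real dataset and then sampling new records from that model. The Python below is a minimal sketch of that idea under loud assumptions: the "real" data is made up (hypothetical age and systolic blood pressure values), the model is a plain bivariate Gaussian rather than any method the authors describe, and the names `fit_gaussian` and `sample_synthetic` are hypothetical helpers. It makes no privacy claim; it only shows that sampled rows follow the fitted patterns without being copies of input rows.

```python
import math
import random


def fit_gaussian(rows):
    """Estimate the means and the 2x2 sample covariance of two-column data."""
    n = len(rows)
    mx = sum(r[0] for r in rows) / n
    my = sum(r[1] for r in rows) / n
    sxx = sum((r[0] - mx) ** 2 for r in rows) / (n - 1)
    syy = sum((r[1] - my) ** 2 for r in rows) / (n - 1)
    sxy = sum((r[0] - mx) * (r[1] - my) for r in rows) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def sample_synthetic(model, n, rng):
    """Draw n new rows from the fitted Gaussian (not from the input rows)."""
    (mx, my), (sxx, sxy, syy) = model
    # Cholesky factor of the covariance, so samples keep the correlation.
    a = math.sqrt(sxx)
    b = sxy / a
    c = math.sqrt(max(syy - b * b, 0.0))
    rows = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        rows.append((mx + a * z1, my + b * z1 + c * z2))
    return rows


rng = random.Random(0)

# Made-up stand-in for a real dataset: (age, systolic blood pressure).
real = []
for _ in range(500):
    age = rng.gauss(55, 12)
    real.append((age, 90 + 0.6 * age + rng.gauss(0, 8)))

model = fit_gaussian(real)                      # step 1: train on real data
synthetic = sample_synthetic(model, 500, rng)   # step 2: generate new data
```

Refitting the same model to `synthetic` should give means and a covariance close to those of `real`, which is one simple way to check that the generated data preserves the patterns the model captured.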