
Synthetic data as an enabler for machine learning applications in medicine


Bibliographic Details
Main Authors: Rajotte, Jean-Francois, Bergen, Robert, Buckeridge, David L., El Emam, Khaled, Ng, Raymond, Strome, Elissa
Format: Online Article Text
Language: English
Published: Elsevier 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9619172/
https://www.ncbi.nlm.nih.gov/pubmed/36325058
http://dx.doi.org/10.1016/j.isci.2022.105331
Description
Summary: Synthetic data generation (SDG) is the process of using machine learning methods to train a model that captures the patterns in a real dataset. New, synthetic data can then be generated from the trained model. The synthetic data does not have a one-to-one mapping to the original data or to real patients, and therefore has the potential to preserve privacy. There is growing interest in applying synthetic data across the health and life sciences, but fully realizing the benefits requires further education, research, and policy innovation. This article summarizes the opportunities and challenges of SDG for health data and provides directions for how this technology can be leveraged to accelerate data access for secondary purposes.
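
The two-step process described in the summary (fit a model to real data, then sample new records from it) can be illustrated with a minimal Python sketch. This is not the method used in the article: it assumes a deliberately simple stand-in model, a two-dimensional Gaussian, and the function names `fit_gaussian_2d` and `sample_synthetic`, the column meanings, and the toy values are all hypothetical. Sampling from such a model does not by itself guarantee privacy.

```python
import math
import random
import statistics


def fit_gaussian_2d(rows):
    """Estimate the means and the 2x2 sample covariance of two numeric columns."""
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    n = len(rows)
    vxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    vyy = sum((y - my) ** 2 for y in ys) / (n - 1)
    vxy = sum((x - mx) * (y - my) for x, y in rows) / (n - 1)
    return (mx, my), (vxx, vxy, vyy)


def sample_synthetic(model, n, seed=0):
    """Draw n new records from the fitted Gaussian (no row is copied from the input)."""
    (mx, my), (vxx, vxy, vyy) = model
    rng = random.Random(seed)
    # Cholesky factor of the 2x2 covariance, so samples keep the fitted correlation.
    l11 = math.sqrt(vxx)
    l21 = vxy / l11
    l22 = math.sqrt(max(vyy - l21 ** 2, 0.0))
    records = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        records.append((mx + l11 * z1, my + l21 * z1 + l22 * z2))
    return records


# Made-up toy values standing in for a "real" dataset: (age, systolic blood pressure).
real = [(34, 118), (45, 125), (52, 131), (61, 140),
        (29, 112), (70, 148), (48, 127), (57, 135)]

model = fit_gaussian_2d(real)              # step 1: capture patterns in the real data
synthetic = sample_synthetic(model, n=5)   # step 2: generate new records from the model
```

The synthetic records follow the fitted means and correlation but have no one-to-one mapping to any input row. Practical generators for health data typically use richer models than this one.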