An overview of synthetic administrative data for research
Use of administrative data for research and for planning services has increased over recent decades due to the value of the large, rich information available. However, concerns about the release of sensitive or personal data and the associated disclosure risk can lead to lengthy approval processes a...
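Main authors: | Kokosi, Theodora, De Stavola, Bianca, Mitra, Robin, Frayling, Lora, Doherty, Aiden, Dove, Iain, Sonnenberg, Pam, Harron, Katie |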
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Swansea University 2022 |
Subjects: | Population Data Science |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10464868/ https://www.ncbi.nlm.nih.gov/pubmed/37650026 http://dx.doi.org/10.23889/ijpds.v7i1.1727 |
_version_ | 1785098558591991808 |
---|---|
author | Kokosi, Theodora De Stavola, Bianca Mitra, Robin Frayling, Lora Doherty, Aiden Dove, Iain Sonnenberg, Pam Harron, Katie |
author_facet | Kokosi, Theodora De Stavola, Bianca Mitra, Robin Frayling, Lora Doherty, Aiden Dove, Iain Sonnenberg, Pam Harron, Katie |
author_sort | Kokosi, Theodora |
collection | PubMed |
description | Use of administrative data for research and for planning services has increased over recent decades due to the value of the large, rich information available. However, concerns about the release of sensitive or personal data and the associated disclosure risk can lead to lengthy approval processes and restricted data access. This can delay or prevent the production of timely evidence. A promising solution to facilitate more efficient data access is to create synthetic versions of the original datasets which are less likely to hold confidential information and can minimise disclosure risk. Such data may be used as an interim solution, allowing researchers to develop their analysis plans on non-disclosive data, whilst waiting for access to the real data. We aim to provide an overview of the background and uses of synthetic data and describe common methods used to generate synthetic data in the context of UK administrative research. We propose a simplified terminology for categories of synthetic data (univariate, multivariate, and complex modality synthetic data) as well as a more comprehensive description of the terminology used in the existing literature and illustrate challenges and future directions for research. |
format | Online Article Text |
id | pubmed-10464868 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Swansea University |
record_format | MEDLINE/PubMed |
spelling | pubmed-10464868 2023-08-30 An overview of synthetic administrative data for research Kokosi, Theodora De Stavola, Bianca Mitra, Robin Frayling, Lora Doherty, Aiden Dove, Iain Sonnenberg, Pam Harron, Katie Int J Popul Data Sci Population Data Science Use of administrative data for research and for planning services has increased over recent decades due to the value of the large, rich information available. However, concerns about the release of sensitive or personal data and the associated disclosure risk can lead to lengthy approval processes and restricted data access. This can delay or prevent the production of timely evidence. A promising solution to facilitate more efficient data access is to create synthetic versions of the original datasets which are less likely to hold confidential information and can minimise disclosure risk. Such data may be used as an interim solution, allowing researchers to develop their analysis plans on non-disclosive data, whilst waiting for access to the real data. We aim to provide an overview of the background and uses of synthetic data and describe common methods used to generate synthetic data in the context of UK administrative research. We propose a simplified terminology for categories of synthetic data (univariate, multivariate, and complex modality synthetic data) as well as a more comprehensive description of the terminology used in the existing literature and illustrate challenges and future directions for research. Swansea University 2022-05-23 /pmc/articles/PMC10464868/ /pubmed/37650026 http://dx.doi.org/10.23889/ijpds.v7i1.1727 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. |
spellingShingle | Population Data Science Kokosi, Theodora De Stavola, Bianca Mitra, Robin Frayling, Lora Doherty, Aiden Dove, Iain Sonnenberg, Pam Harron, Katie An overview of synthetic administrative data for research |
title | An overview of synthetic administrative data for research |
topic | Population Data Science |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10464868/ https://www.ncbi.nlm.nih.gov/pubmed/37650026 http://dx.doi.org/10.23889/ijpds.v7i1.1727 |