
Fully synthetic neuroimaging data for replication and exploration

Scientific transparency, data exploration, and education are advanced through data sharing. However, risk for disclosure of personal information and institutional data sharing regulations can impede human subject/patient data sharing and thus limit open science initiatives. Sharing fully synthetic d...


Bibliographic Details
Main authors: Vaden, Kenneth I., Gebregziabher, Mulugeta, Eckert, Mark A.
Format: Online Article Text
Language: English
Published: 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7688496/
https://www.ncbi.nlm.nih.gov/pubmed/32828925
http://dx.doi.org/10.1016/j.neuroimage.2020.117284
author Vaden, Kenneth I.
Gebregziabher, Mulugeta
Eckert, Mark A.
collection PubMed
description Scientific transparency, data exploration, and education are advanced through data sharing. However, risk for disclosure of personal information and institutional data sharing regulations can impede human subject/patient data sharing and thus limit open science initiatives. Sharing fully synthetic data is an alternative when it is not possible to share real or observed data. Here we describe a data sharing approach that borrows principles and methods from multiple imputation to replace observed values with synthetic values, thereby creating a fully synthetic neuroimaging dataset that accurately represents the covariance structure of the observed dataset. Predictor tables composed of demographic, site, behavioral and total intracranial volume (ICV) variables from 264 pediatric cases were used to create synthetic predictor tables, which were then used to synthesize gray matter images derived from T1-weighted data. The synthetic predictor tables demonstrated pooled variance and statistical estimates that closely approximated the observed data, as reflected in measures of efficiency and statistical bias. Similarly, the synthetic gray matter data accurately represented the variance and voxel-level associations with predictor variables (age, sex, verbal IQ, and ICV). The magnitude and spatial distribution of gray matter effects in the observed imaging data were replicated in the pooled results from the synthetic datasets. This approach for generating fully synthetic neuroimaging data has widespread potential for data sharing, including replication, new discovery, and education. Fully synthetic neuroimaging datasets can enable data sharing because they accurately represent patterns of variance in the original data, while diminishing the risk of privacy disclosures that can accompany neuroimaging data sharing.
format Online
Article
Text
id pubmed-7688496
institution National Center for Biotechnology Information
language English
publishDate 2020
record_format MEDLINE/PubMed
spelling pubmed-7688496 2020-12-01 Fully synthetic neuroimaging data for replication and exploration Vaden, Kenneth I. Gebregziabher, Mulugeta Eckert, Mark A. Neuroimage Article 2020-08-20 2020-12 /pmc/articles/PMC7688496/ /pubmed/32828925 http://dx.doi.org/10.1016/j.neuroimage.2020.117284 Text en This is an open access article under the CC BY-NC-ND license. (http://creativecommons.org/licenses/by-nc-nd/4.0/)
title Fully synthetic neuroimaging data for replication and exploration
topic Article