Cloud-based biomedical data storage and analysis for genomic research: Landscape analysis of data governance in emerging NIH-supported platforms

Bibliographic Details
Main authors: Dahlquist, Jacklyn M., Nelson, Sarah C., Fullerton, Stephanie M.
Format: Online Article Text
Language: English
Published: Elsevier 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10173774/
https://www.ncbi.nlm.nih.gov/pubmed/37181330
http://dx.doi.org/10.1016/j.xhgg.2023.100196
author Dahlquist, Jacklyn M.
Nelson, Sarah C.
Fullerton, Stephanie M.
collection PubMed
description The storage, sharing, and analysis of genomic data pose technical and logistical challenges that have precipitated the development of cloud-based computing platforms designed to facilitate collaboration and maximize the scientific utility of data. To understand cloud platforms’ policies and procedures and the implications for different stakeholder groups, in summer 2021, we reviewed publicly available documents (N = 94) sourced from platform websites, scientific literature, and lay media for five NIH-funded cloud platforms (the All of Us Research Hub, NHGRI AnVIL, NHLBI BioData Catalyst, NCI Genomic Data Commons, and the Kids First Data Resource Center) and a pre-existing data sharing mechanism, dbGaP. Platform policies were compared across seven categories of data governance: data submission, data ingestion, user authentication and authorization, data security, data access, auditing, and sanctions. Our analysis finds similarities across the platforms, including reliance on a formal data ingestion process, multiple tiers of data access with varying user authentication and/or authorization requirements, platform and user data security measures, and auditing for inappropriate data use. Platforms differ in how data tiers are organized, as well as the specifics of user authentication and authorization across access tiers. Our analysis maps elements of data governance across emerging NIH-funded cloud platforms and as such provides a key resource for stakeholders seeking to understand and utilize data access and analysis options across platforms and to surface aspects of governance that may require harmonization to achieve the desired interoperability.
format Online
Article
Text
id pubmed-10173774
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-10173774 2023-05-12 Cloud-based biomedical data storage and analysis for genomic research: Landscape analysis of data governance in emerging NIH-supported platforms Dahlquist, Jacklyn M. Nelson, Sarah C. Fullerton, Stephanie M. HGG Adv Article
Elsevier 2023-04-12 /pmc/articles/PMC10173774/ /pubmed/37181330 http://dx.doi.org/10.1016/j.xhgg.2023.100196 Text en © 2023 The Author(s) https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
title Cloud-based biomedical data storage and analysis for genomic research: Landscape analysis of data governance in emerging NIH-supported platforms
topic Article