
Global Alliance for Genomics and Health Meets Bioconductor: Toward Reproducible and Agile Cancer Genomics at Cloud Scale

Bibliographic Details
Main Authors: Carey, Vincent J., Ramos, Marcel, Stubbs, Benjamin J., Gopaulakrishnan, Shweta, Oh, Sehyun, Turaga, Nitesh, Waldron, Levi, Morgan, Martin
Format: Online Article Text
Language: English
Published: American Society of Clinical Oncology 2020
Subjects: Original Reports
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7265787/
https://www.ncbi.nlm.nih.gov/pubmed/32453635
http://dx.doi.org/10.1200/CCI.19.00111
_version_ 1783541189573607424
author Carey, Vincent J.
Ramos, Marcel
Stubbs, Benjamin J.
Gopaulakrishnan, Shweta
Oh, Sehyun
Turaga, Nitesh
Waldron, Levi
Morgan, Martin
collection PubMed
description PURPOSE: Institutional efforts toward the democratization of cloud-scale data and analysis methods for cancer genomics are proceeding rapidly. As part of this effort, we bridge two major bioinformatic initiatives: the Global Alliance for Genomics and Health (GA4GH) and Bioconductor. METHODS: We describe in detail a use case in pancancer transcriptomics conducted by blending implementations of the GA4GH Workflow Execution Services and Tool Registry Service concepts with the Bioconductor curatedTCGAData and BiocOncoTK packages. RESULTS: We carried out the analysis with a formally archived workflow and container at dockstore.org and a workspace and notebook at app.terra.bio. The analysis identified relationships between microsatellite instability and biomarkers of immune dysregulation at a finer level of granularity than previously reported. Our use of standard approaches to containerization and workflow programming allows this analysis to be replicated and extended. CONCLUSION: Experimental use of dockstore.org and app.terra.bio in concert with Bioconductor enabled novel statistical analysis of large genomic projects without the need for local supercomputing resources but involved challenges related to container design, script archiving, and unit testing. Best practices and cost/benefit metrics for the management and analysis of globally federated genomic data and annotation are evolving. The creation and execution of use cases like the one reported here will be helpful in the development and comparison of approaches to federated data/analysis systems in cancer genomics.
format Online
Article
Text
id pubmed-7265787
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher American Society of Clinical Oncology
record_format MEDLINE/PubMed
spelling pubmed-7265787 2021-05-26 JCO Clin Cancer Inform Original Reports
American Society of Clinical Oncology 2020-05-26 /pmc/articles/PMC7265787/ /pubmed/32453635 http://dx.doi.org/10.1200/CCI.19.00111 Text en © 2020 by American Society of Clinical Oncology. Licensed under the Creative Commons Attribution 4.0 License: https://creativecommons.org/licenses/by/4.0/
title Global Alliance for Genomics and Health Meets Bioconductor: Toward Reproducible and Agile Cancer Genomics at Cloud Scale
topic Original Reports
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7265787/
https://www.ncbi.nlm.nih.gov/pubmed/32453635
http://dx.doi.org/10.1200/CCI.19.00111