Wrangling Galaxy’s reference data

Summary: The Galaxy platform has developed into a fully featured collaborative workbench, with goals of inherently capturing provenance to enable reproducible data analysis, and of making it straightforward to run one’s own server. However, many Galaxy platform tools rely on the presence of referenc...

Bibliographic Details
Main authors: Blankenberg, Daniel, Johnson, James E., Taylor, James, Nekrutenko, Anton
Format: Online Article Text
Language: English
Published: Oxford University Press 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4071198/
https://www.ncbi.nlm.nih.gov/pubmed/24585771
http://dx.doi.org/10.1093/bioinformatics/btu119
author Blankenberg, Daniel
Johnson, James E.
Taylor, James
Nekrutenko, Anton
collection PubMed
description Summary: The Galaxy platform has developed into a fully featured collaborative workbench, with goals of inherently capturing provenance to enable reproducible data analysis, and of making it straightforward to run one’s own server. However, many Galaxy platform tools rely on the presence of reference data, such as alignment indexes, to function efficiently. Until now, the building of this cache of data for Galaxy has been an error-prone manual process lacking reproducibility and provenance. The Galaxy Data Manager framework is an enhancement that changes the management of Galaxy’s built-in data cache from a manual procedure to an automated graphical user interface (GUI) driven process, which contains the same openness, reproducibility and provenance that is afforded to Galaxy’s analysis tools. Data Manager tools allow the Galaxy administrator to download, create and install additional datasets for any type of reference data in real time. Availability and implementation: The Galaxy Data Manager framework is implemented in Python and has been integrated as part of the core Galaxy platform. Individual Data Manager tools can be defined locally or installed from a ToolShed, allowing the Galaxy community to define additional Data Manager tools as needed, with full versioning and dependency support. Contact: dan@bx.psu.edu or anton@bx.psu.edu Supplementary information: Supplementary data is available at Bioinformatics online.
format Online
Article
Text
id pubmed-4071198
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4071198 2014-06-26 Wrangling Galaxy’s reference data Blankenberg, Daniel Johnson, James E. Taylor, James Nekrutenko, Anton Bioinformatics Applications Notes Oxford University Press 2014-07-01 2014-02-28 /pmc/articles/PMC4071198/ /pubmed/24585771 http://dx.doi.org/10.1093/bioinformatics/btu119 Text en © The Author 2014. Published by Oxford University Press.
http://creativecommons.org/licenses/by/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Wrangling Galaxy’s reference data
topic Applications Notes