Wrangling Galaxy’s reference data
Summary: The Galaxy platform has developed into a fully featured collaborative workbench, with goals of inherently capturing provenance to enable reproducible data analysis, and of making it straightforward to run one’s own server. However, many Galaxy platform tools rely on the presence of referenc...
Main authors: | Blankenberg, Daniel; Johnson, James E.; Taylor, James; Nekrutenko, Anton |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2014 |
Subjects: | Applications Notes |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4071198/ https://www.ncbi.nlm.nih.gov/pubmed/24585771 http://dx.doi.org/10.1093/bioinformatics/btu119 |
_version_ | 1782322786247639040 |
---|---|
author | Blankenberg, Daniel Johnson, James E. Taylor, James Nekrutenko, Anton |
author_facet | Blankenberg, Daniel Johnson, James E. Taylor, James Nekrutenko, Anton |
author_sort | Blankenberg, Daniel |
collection | PubMed |
description | Summary: The Galaxy platform has developed into a fully featured collaborative workbench, with goals of inherently capturing provenance to enable reproducible data analysis, and of making it straightforward to run one’s own server. However, many Galaxy platform tools rely on the presence of reference data, such as alignment indexes, to function efficiently. Until now, the building of this cache of data for Galaxy has been an error-prone manual process lacking reproducibility and provenance. The Galaxy Data Manager framework is an enhancement that changes the management of Galaxy’s built-in data cache from a manual procedure to an automated graphical user interface (GUI) driven process, which contains the same openness, reproducibility and provenance that is afforded to Galaxy’s analysis tools. Data Manager tools allow the Galaxy administrator to download, create and install additional datasets for any type of reference data in real time. Availability and implementation: The Galaxy Data Manager framework is implemented in Python and has been integrated as part of the core Galaxy platform. Individual Data Manager tools can be defined locally or installed from a ToolShed, allowing the Galaxy community to define additional Data Manager tools as needed, with full versioning and dependency support. Contact: dan@bx.psu.edu or anton@bx.psu.edu Supplementary information: Supplementary data is available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-4071198 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-40711982014-06-26 Wrangling Galaxy’s reference data Blankenberg, Daniel Johnson, James E. Taylor, James Nekrutenko, Anton Bioinformatics Applications Notes Summary: The Galaxy platform has developed into a fully featured collaborative workbench, with goals of inherently capturing provenance to enable reproducible data analysis, and of making it straightforward to run one’s own server. However, many Galaxy platform tools rely on the presence of reference data, such as alignment indexes, to function efficiently. Until now, the building of this cache of data for Galaxy has been an error-prone manual process lacking reproducibility and provenance. The Galaxy Data Manager framework is an enhancement that changes the management of Galaxy’s built-in data cache from a manual procedure to an automated graphical user interface (GUI) driven process, which contains the same openness, reproducibility and provenance that is afforded to Galaxy’s analysis tools. Data Manager tools allow the Galaxy administrator to download, create and install additional datasets for any type of reference data in real time. Availability and implementation: The Galaxy Data Manager framework is implemented in Python and has been integrated as part of the core Galaxy platform. Individual Data Manager tools can be defined locally or installed from a ToolShed, allowing the Galaxy community to define additional Data Manager tools as needed, with full versioning and dependency support. Contact: dan@bx.psu.edu or anton@bx.psu.edu Supplementary information: Supplementary data is available at Bioinformatics online. Oxford University Press 2014-07-01 2014-02-28 /pmc/articles/PMC4071198/ /pubmed/24585771 http://dx.doi.org/10.1093/bioinformatics/btu119 Text en © The Author 2014. Published by Oxford University Press.
http://creativecommons.org/licenses/by/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Applications Notes Blankenberg, Daniel Johnson, James E. Taylor, James Nekrutenko, Anton Wrangling Galaxy’s reference data |
title | Wrangling Galaxy’s reference data |
title_full | Wrangling Galaxy’s reference data |
title_fullStr | Wrangling Galaxy’s reference data |
title_full_unstemmed | Wrangling Galaxy’s reference data |
title_short | Wrangling Galaxy’s reference data |
title_sort | wrangling galaxy’s reference data |
topic | Applications Notes |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4071198/ https://www.ncbi.nlm.nih.gov/pubmed/24585771 http://dx.doi.org/10.1093/bioinformatics/btu119 |
work_keys_str_mv | AT blankenbergdaniel wranglinggalaxysreferencedata AT johnsonjamese wranglinggalaxysreferencedata AT wranglinggalaxysreferencedata AT taylorjames wranglinggalaxysreferencedata AT nekrutenkoanton wranglinggalaxysreferencedata |