Provenance in bioinformatics workflows

In this work, we used the PROV-DM model to manage data provenance in workflows of genome projects. This provenance model allows the storage of details of one workflow execution, e.g., raw and produced data and computational tools, their versions and parameters. Using this model, biologists can acces...

Bibliographic Details
Main authors: de Paula, Renato; Holanda, Maristela; Gomes, Luciana SA; Lifschitz, Sergio; Walter, Maria Emilia MT
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3816297/
https://www.ncbi.nlm.nih.gov/pubmed/24564294
http://dx.doi.org/10.1186/1471-2105-14-S11-S6
author de Paula, Renato
Holanda, Maristela
Gomes, Luciana SA
Lifschitz, Sergio
Walter, Maria Emilia MT
collection PubMed
description In this work, we used the PROV-DM model to manage data provenance in workflows of genome projects. This provenance model allows the storage of details of one workflow execution, e.g., raw and produced data and computational tools, their versions and parameters. Using this model, biologists can access details of one particular execution of a workflow, compare results produced by different executions, and plan new experiments more efficiently. In addition, we created a provenance simulator, which facilitates the inclusion of provenance data of one genome project workflow execution. Finally, we discuss one case study, which aims to identify genes involved in specific metabolic pathways of Bacillus cereus, as well as to compare this isolate with other phylogenetically related bacteria from the Bacillus group. B. cereus is an extremophilic bacterium, collected in warm water in the Midwestern Region of Brazil, whose DNA samples were sequenced with an NGS machine.
format Online
Article
Text
id pubmed-3816297
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3816297 2013-11-04 Provenance in bioinformatics workflows. de Paula, Renato; Holanda, Maristela; Gomes, Luciana SA; Lifschitz, Sergio; Walter, Maria Emilia MT. BMC Bioinformatics. Research Article. BioMed Central 2013-11-04 /pmc/articles/PMC3816297/ /pubmed/24564294 http://dx.doi.org/10.1186/1471-2105-14-S11-S6 Text en Copyright © 2013 de Paula et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Provenance in bioinformatics workflows
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3816297/
https://www.ncbi.nlm.nih.gov/pubmed/24564294
http://dx.doi.org/10.1186/1471-2105-14-S11-S6