A collaborative semantic-based provenance management platform for reproducibility

Scientific data management plays a key role in the reproducibility of scientific results. To reproduce results, not only the results but also the data and steps of scientific experiments must be made findable, accessible, interoperable, and reusable. Tracking, managing, describing, and visualizing p...

Bibliographic Details
Main Authors: Samuel, Sheeba, König-Ries, Birgitta
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9044346/
https://www.ncbi.nlm.nih.gov/pubmed/35494870
http://dx.doi.org/10.7717/peerj-cs.921
_version_ 1784695085780172800
author Samuel, Sheeba
König-Ries, Birgitta
collection PubMed
description Scientific data management plays a key role in the reproducibility of scientific results. To reproduce results, not only the results but also the data and steps of scientific experiments must be made findable, accessible, interoperable, and reusable. Tracking, managing, describing, and visualizing provenance helps in the understandability, reproducibility, and reuse of experiments for the scientific community. Current systems lack a link between the data, steps, and results from the computational and non-computational processes of an experiment. Such a link, however, is vital for the reproducibility of results. We present a novel solution for the end-to-end provenance management of scientific experiments. We provide a framework, CAESAR (CollAborative Environment for Scientific Analysis with Reproducibility), which allows scientists to capture, manage, query and visualize the complete path of a scientific experiment consisting of computational and non-computational data and steps in an interoperable way. CAESAR integrates the REPRODUCE-ME provenance model, extended from existing semantic web standards, to represent the whole picture of an experiment describing the path it took from its design to its result. ProvBook, an extension for Jupyter Notebooks, is developed and integrated into CAESAR to support computational reproducibility. We have applied and evaluated our contributions to a set of scientific experiments in microscopy research projects.
format Online
Article
Text
id pubmed-9044346
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-9044346 2022-04-28 A collaborative semantic-based provenance management platform for reproducibility Samuel, Sheeba König-Ries, Birgitta PeerJ Comput Sci Computational Biology Scientific data management plays a key role in the reproducibility of scientific results. To reproduce results, not only the results but also the data and steps of scientific experiments must be made findable, accessible, interoperable, and reusable. Tracking, managing, describing, and visualizing provenance helps in the understandability, reproducibility, and reuse of experiments for the scientific community. Current systems lack a link between the data, steps, and results from the computational and non-computational processes of an experiment. Such a link, however, is vital for the reproducibility of results. We present a novel solution for the end-to-end provenance management of scientific experiments. We provide a framework, CAESAR (CollAborative Environment for Scientific Analysis with Reproducibility), which allows scientists to capture, manage, query and visualize the complete path of a scientific experiment consisting of computational and non-computational data and steps in an interoperable way. CAESAR integrates the REPRODUCE-ME provenance model, extended from existing semantic web standards, to represent the whole picture of an experiment describing the path it took from its design to its result. ProvBook, an extension for Jupyter Notebooks, is developed and integrated into CAESAR to support computational reproducibility. We have applied and evaluated our contributions to a set of scientific experiments in microscopy research projects. PeerJ Inc.
2022-03-10 /pmc/articles/PMC9044346/ /pubmed/35494870 http://dx.doi.org/10.7717/peerj-cs.921 Text en ©2022 Samuel and König-Ries https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
spellingShingle Computational Biology
Samuel, Sheeba
König-Ries, Birgitta
A collaborative semantic-based provenance management platform for reproducibility
title A collaborative semantic-based provenance management platform for reproducibility
topic Computational Biology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9044346/
https://www.ncbi.nlm.nih.gov/pubmed/35494870
http://dx.doi.org/10.7717/peerj-cs.921