
Structure-based knowledge acquisition from electronic lab notebooks for research data provenance documentation

BACKGROUND: Electronic Laboratory Notebooks (ELNs) are used to document experiments and investigations in the wet-lab. Protocols in ELNs contain a detailed description of the conducted steps including the necessary information to understand the procedure and the raised research data as well as to re...


Bibliographic Details
Main authors: Schröder, Max; Staehlke, Susanne; Groth, Paul; Nebe, J. Barbara; Spors, Sascha; Krüger, Frank
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8802522/
https://www.ncbi.nlm.nih.gov/pubmed/35101121
http://dx.doi.org/10.1186/s13326-021-00257-x
_version_ 1784642698923212800
author Schröder, Max
Staehlke, Susanne
Groth, Paul
Nebe, J. Barbara
Spors, Sascha
Krüger, Frank
collection PubMed
description BACKGROUND: Electronic Laboratory Notebooks (ELNs) are used to document experiments and investigations in the wet-lab. Protocols in ELNs contain a detailed description of the conducted steps including the necessary information to understand the procedure and the raised research data as well as to reproduce the research investigation. The purpose of this study is to investigate whether such ELN protocols can be used to create semantic documentation of the provenance of research data by the use of ontologies and linked data methodologies. METHODS: Based on an ELN protocol of a biomedical wet-lab experiment, a retrospective provenance model of the raised research data describing the details of the experiment in a machine-interpretable way is manually engineered. Furthermore, an automated approach for knowledge acquisition from ELN protocols is derived from these results. This structure-based approach exploits the structure in the experiment’s description, such as headings, tables, and links, to translate the ELN protocol into a semantic knowledge representation. To satisfy the Findable, Accessible, Interoperable, and Reusable (FAIR) guiding principles, a ready-to-publish bundle is created that contains the research data together with their semantic documentation. RESULTS: While the manual modelling efforts serve as proof of concept by employing one protocol, the automated structure-based approach demonstrates the potential generalisation with seven ELN protocols. For each of those protocols, a ready-to-publish bundle is created and, by employing the SPARQL query language, it is illustrated that questions about the processes and the obtained research data can be answered. CONCLUSIONS: The semantic documentation of research data obtained from the ELN protocols allows for the representation of the retrospective provenance of research data in a machine-interpretable way. Research Object Crate (RO-Crate) bundles including these models enable researchers to easily share the research data including the corresponding documentation, but also to search and relate the experiments to each other.
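The structure-based approach named in the description (using headings, tables, and links to derive a knowledge representation) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: it assumes the ELN protocol is available as simple HTML, it emits plain (subject, predicate, object) tuples rather than RDF, and the predicate names `hasStep` and `refersTo` as well as the function `extract_triples` are hypothetical and do not come from the article or from any ontology.

```python
from html.parser import HTMLParser


class StructureExtractor(HTMLParser):
    """Collects (subject, predicate, object) tuples from headings,
    two-column table rows, and links. Nested elements (for example a
    link inside a table cell) are not handled in this sketch."""

    HEADINGS = ("h1", "h2", "h3")

    def __init__(self, protocol_id):
        super().__init__()
        self.protocol_id = protocol_id
        self.section = protocol_id  # current heading, used as subject
        self.triples = []
        self._tag = None            # tag whose text is being collected
        self._text = []
        self._row = []
        self._href = None

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS or tag in ("td", "th", "a"):
            self._tag = tag
            self._text = []
            if tag == "a":
                self._href = dict(attrs).get("href")
        elif tag == "tr":
            self._row = []

    def handle_data(self, data):
        if self._tag is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        text = "".join(self._text).strip()
        if tag in self.HEADINGS and self._tag == tag:
            # A heading opens a new step of the protocol (hypothetical predicate).
            self.section = text
            self.triples.append((self.protocol_id, "hasStep", text))
        elif tag in ("td", "th") and self._tag == tag:
            self._row.append(text)
        elif tag == "a" and self._tag == tag and self._href:
            # A link relates the current step to an external resource.
            self.triples.append((self.section, "refersTo", self._href))
        elif tag == "tr" and len(self._row) == 2:
            # A two-column row is read as a key/value pair of the step.
            key, value = self._row
            self.triples.append((self.section, key, value))
        if tag == self._tag:
            self._tag = None


def extract_triples(html, protocol_id):
    """Parse an HTML protocol and return the collected tuples."""
    parser = StructureExtractor(protocol_id)
    parser.feed(html)
    parser.close()
    return parser.triples
```

In a fuller pipeline the tuples would presumably be mapped to ontology terms and serialized as linked data before being packaged; that mapping is outside the scope of this sketch.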
format Online
Article
Text
id pubmed-8802522
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8802522 2022-02-02 Structure-based knowledge acquisition from electronic lab notebooks for research data provenance documentation Schröder, Max Staehlke, Susanne Groth, Paul Nebe, J. Barbara Spors, Sascha Krüger, Frank J Biomed Semantics Research BioMed Central 2022-01-31 /pmc/articles/PMC8802522/ /pubmed/35101121 http://dx.doi.org/10.1186/s13326-021-00257-x Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Structure-based knowledge acquisition from electronic lab notebooks for research data provenance documentation
topic Research