A workflow reproducibility scale for automatic validation of biological interpretation results

BACKGROUND: Reproducibility of data analysis workflow is a key issue in the field of bioinformatics. Recent computing technologies, such as virtualization, have made it possible to reproduce workflow execution with ease. However, the reproducibility of results is not well discussed; that is, there i...

Bibliographic Details
Main Authors:	Suetake, Hirotaka, Fukusato, Tsukasa, Igarashi, Takeo, Ohta, Tazro
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2023
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10164546/ https://www.ncbi.nlm.nih.gov/pubmed/37150537 http://dx.doi.org/10.1093/gigascience/giad031

_version_	1785038092060590080
author	Suetake, Hirotaka Fukusato, Tsukasa Igarashi, Takeo Ohta, Tazro
author_facet	Suetake, Hirotaka Fukusato, Tsukasa Igarashi, Takeo Ohta, Tazro
author_sort	Suetake, Hirotaka
collection	PubMed
description	BACKGROUND: Reproducibility of data analysis workflow is a key issue in the field of bioinformatics. Recent computing technologies, such as virtualization, have made it possible to reproduce workflow execution with ease. However, the reproducibility of results is not well discussed; that is, there is no standard way to verify whether the biological interpretation of reproduced results is the same. Therefore, it still remains a challenge to automatically evaluate the reproducibility of results. RESULTS: We propose a new metric, a reproducibility scale of workflow execution results, to evaluate the reproducibility of results. This metric is based on the idea of evaluating the reproducibility of results using biological feature values (e.g., number of reads, mapping rate, and variant frequency) representing their biological interpretation. We also implemented a prototype system that automatically evaluates the reproducibility of results using the proposed metric. To demonstrate our approach, we conducted an experiment using workflows used by researchers in real research projects and the use cases that are frequently encountered in the field of bioinformatics. CONCLUSIONS: Our approach enables automatic evaluation of the reproducibility of results using a fine-grained scale. By introducing our approach, it is possible to evolve from a binary view of whether the results are superficially identical or not to a more graduated view. We believe that our approach will contribute to more informed discussion on reproducibility in bioinformatics.
format	Online Article Text
id	pubmed-10164546
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-10164546 2023-05-08 A workflow reproducibility scale for automatic validation of biological interpretation results Suetake, Hirotaka Fukusato, Tsukasa Igarashi, Takeo Ohta, Tazro Gigascience Research Oxford University Press 2023-05-08 /pmc/articles/PMC10164546/ /pubmed/37150537 http://dx.doi.org/10.1093/gigascience/giad031 Text en © The Author(s) 2023. Published by Oxford University Press GigaScience.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title	A workflow reproducibility scale for automatic validation of biological interpretation results
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10164546/ https://www.ncbi.nlm.nih.gov/pubmed/37150537 http://dx.doi.org/10.1093/gigascience/giad031