
Experimenting with reproducibility: a case study of robustness in bioinformatics


Bibliographic Details
Main authors: Kim, Yang-Min, Poline, Jean-Baptiste, Dumas, Guillaume
Format: Online Article Text
Language: English
Published: Oxford University Press 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6054242/
https://www.ncbi.nlm.nih.gov/pubmed/29961842
http://dx.doi.org/10.1093/gigascience/giy077
collection PubMed
description Reproducibility has been shown to be limited in many scientific fields. This question is a fundamental tenet of scientific activity, but the related issues of reusability of scientific data are poorly documented. Here, we present a case study of our difficulties in reproducing a published bioinformatics method even though code and data were available. First, we tried to re-run the analysis with the code and data provided by the authors. Second, we reimplemented the whole method in a Python package to avoid dependency on a MATLAB license and ease the execution of the code on a high-performance computing cluster. Third, we assessed reusability of our reimplementation and the quality of our documentation, testing how easy it would be to start from our implementation to reproduce the results. In a second section, we propose solutions from this case study and other observations to improve reproducibility and research efficiency at the individual and collective levels. While finalizing our code, we created case-specific documentation and tutorials for the associated Python package StratiPy. Readers are invited to experiment with our reproducibility case study by generating the two confusion matrices (see more in section “Robustness: from MATLAB to Python, language and organization”). Here, we propose two options: a step-by-step process to follow in a Jupyter/IPython notebook or a Docker container ready to be built and run.
id pubmed-6054242
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
journal Gigascience
date 2018-06-28
rights © The Authors 2018. Published by Oxford University Press.
license http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
topic Review