
Microbench: automated metadata management for systems biology benchmarking and reproducibility in Python

MOTIVATION: Computational systems biology analyses typically make use of multiple software and their dependencies, which are often run across heterogeneous compute environments. This can introduce differences in performance and reproducibility. Capturing metadata (e.g. package versions, GPU model) c...


Bibliographic Details
Main authors: Lubbock, Alexander L R, Lopez, Carlos F
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9563693/
https://www.ncbi.nlm.nih.gov/pubmed/36000837
http://dx.doi.org/10.1093/bioinformatics/btac580
author Lubbock, Alexander L R
Lopez, Carlos F
collection PubMed
description MOTIVATION: Computational systems biology analyses typically make use of multiple software and their dependencies, which are often run across heterogeneous compute environments. This can introduce differences in performance and reproducibility. Capturing metadata (e.g. package versions, GPU model) currently requires repetitious code and is difficult to store centrally for analysis. Even where virtual environments and containers are used, updates over time mean that versioning metadata should still be captured within analysis pipelines to guarantee reproducibility. RESULTS: Microbench is a simple and extensible Python package to automate metadata capture to a file or Redis database. Captured metadata can include execution time, software package versions, environment variables, hardware information, Python version and more, with plugins. We present three case studies demonstrating Microbench usage to benchmark code execution and examine environment metadata for reproducibility purposes. AVAILABILITY AND IMPLEMENTATION: Install from the Python Package Index using pip install microbench. Source code is available from https://github.com/alubbock/microbench. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
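The kind of metadata capture the abstract describes can be illustrated with a minimal, standard-library-only sketch. This is not Microbench's API: the decorator name `capture_metadata`, its parameters and the record field names are hypothetical, chosen only to show the idea of recording execution time, package versions, environment variables and the Python version alongside each function call, one JSON line per call.

```python
import functools
import io
import json
import os
import platform
import time
from datetime import datetime, timezone
from importlib import metadata


def capture_metadata(outfile, packages=(), env_vars=()):
    """Hypothetical decorator factory (not Microbench's API).

    Appends one JSON line of run metadata to `outfile` (any writable
    text file object) each time the decorated function is called.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            record = {
                "function_name": func.__name__,
                "start_time": datetime.now(timezone.utc).isoformat(),
                "python_version": platform.python_version(),
                "hostname": platform.node(),
                "package_versions": {},
                # Missing environment variables are recorded as null.
                "env": {name: os.environ.get(name) for name in env_vars},
            }
            for name in packages:
                try:
                    record["package_versions"][name] = metadata.version(name)
                except metadata.PackageNotFoundError:
                    # Package not installed: record null rather than fail.
                    record["package_versions"][name] = None
            started = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Written even if the function raises, so failed runs are logged too.
                record["run_time_seconds"] = time.perf_counter() - started
                outfile.write(json.dumps(record) + "\n")
        return wrapper
    return decorator


# Usage: an in-memory buffer stands in for a results file.
log = io.StringIO()


@capture_metadata(log, packages=("pip",), env_vars=("HOME",))
def simulate(n):
    return sum(i * i for i in range(n))


result = simulate(1000)
record = json.loads(log.getvalue().splitlines()[-1])
print(record["function_name"], record["python_version"])
```

Writing one JSON object per line keeps results from many runs and machines easy to concatenate and load into a table for later comparison.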
format Online
Article
Text
id pubmed-9563693
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9563693 2022-10-18 Microbench: automated metadata management for systems biology benchmarking and reproducibility in Python Lubbock, Alexander L R Lopez, Carlos F Bioinformatics Applications Notes Oxford University Press 2022-08-24 /pmc/articles/PMC9563693/ /pubmed/36000837 http://dx.doi.org/10.1093/bioinformatics/btac580 Text en © The Author(s) 2022. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Microbench: automated metadata management for systems biology benchmarking and reproducibility in Python
topic Applications Notes