An R package that automatically collects and archives details for reproducible computing

BACKGROUND: It is scientifically and ethically imperative that the results of statistical analysis of biomedical research data be computationally reproducible in the sense that the reported results can be easily recapitulated from the study data. Some statistical analyses are computationally a funct...

Bibliographic Details
Main Authors: Liu, Zhifa, Pounds, Stan
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects: Software
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4026591/
https://www.ncbi.nlm.nih.gov/pubmed/24886202
http://dx.doi.org/10.1186/1471-2105-15-138
collection PubMed
description BACKGROUND: It is scientifically and ethically imperative that the results of statistical analysis of biomedical research data be computationally reproducible in the sense that the reported results can be easily recapitulated from the study data. Some statistical analyses are computationally a function of many data files, program files, and other details that are updated or corrected over time. In many applications, it is infeasible to manually maintain an accurate and complete record of all these details about a particular analysis. RESULTS: Therefore, we developed the rctrack package that automatically collects and archives read-only copies of program files, data files, and other details needed to computationally reproduce an analysis. CONCLUSIONS: The rctrack package uses the trace function to temporarily embed detail collection procedures into functions that read files, write files, or generate random numbers so that no special modifications of the primary R program are necessary. At the conclusion of the analysis, rctrack uses these details to automatically generate a read-only archive of data files, program files, result files, and other details needed to recapitulate the analysis results. Information about this archive may be included as an appendix of a report generated by Sweave or knitr. Here, we describe the usage, implementation, and other features of the rctrack package. The rctrack package is freely available from http://www.stjuderesearch.org/site/depts/biostats/rctrack under the GPL license.
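The tracking idea in the description can be illustrated outside R. The following is a minimal sketch in Python, not rctrack's code or API: it assumes that temporarily replacing `builtins.open` is a fair analogue of how the abstract describes rctrack using R's `trace` function. The names `track_files` and `session` are hypothetical, the archive directory is assumed to be new, and random-number tracking is not covered.

```python
import builtins
import hashlib
import os
import shutil
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def track_files(archive_dir):
    """Hypothetical helper: hook builtins.open, then archive every file it saw."""
    session = {"opens": [], "archive": {}}
    original_open = builtins.open

    def tracking_open(file, mode="r", *args, **kwargs):
        handle = original_open(file, mode, *args, **kwargs)
        if isinstance(file, (str, os.PathLike)):  # skip raw file descriptors
            session["opens"].append((os.path.abspath(os.fspath(file)), mode))
        return handle

    builtins.open = tracking_open  # temporary hook; the script itself is unchanged
    try:
        yield session
    finally:
        builtins.open = original_open  # the hook never outlives the block

    # Reached only when the tracked block finished without an error.
    archive = Path(archive_dir)
    archive.mkdir(parents=True, exist_ok=True)
    paths = dict.fromkeys(path for path, _ in session["opens"])  # de-duplicate, keep order
    for index, path in enumerate(paths):
        if not os.path.isfile(path):
            continue  # the file was removed before the block ended
        copy = archive / f"{index:03d}_{os.path.basename(path)}"
        shutil.copyfile(path, copy)
        os.chmod(copy, 0o444)  # make the archived copy read-only
        digest = hashlib.sha256(copy.read_bytes()).hexdigest()
        session["archive"][path] = {"copy": str(copy), "sha256": digest}
```

A script run inside `with track_files("archive") as session:` needs no changes of its own; afterwards `session["archive"]` maps each tracked file to its read-only copy and SHA-256 digest. Only calls that go through `builtins.open` are seen, so this covers less than the file-reading, file-writing, and random-number functions named in the description.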
id pubmed-4026591
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-4026591 2014-05-21 BMC Bioinformatics Software BioMed Central 2014-05-10 Text en
Copyright © 2014 Liu and Pounds; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
topic Software