OpenStats: A robust and scalable software package for reproducible analysis of high-throughput phenotypic data

Reproducibility in the statistical analyses of data from high-throughput phenotyping screens requires a robust and reliable analysis foundation that allows modelling of different possible statistical scenarios. Regular challenges are scalability and extensibility of the analysis software. In this ma...

Bibliographic Details
Main authors: Haselimashhadi, Hamed; Mason, Jeremy C.; Mallon, Ann-Marie; Smedley, Damian; Meehan, Terrence F.; Parkinson, Helen
Format: Online Article Text
Language: English
Published: Public Library of Science 2020
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7773254/
https://www.ncbi.nlm.nih.gov/pubmed/33378393
http://dx.doi.org/10.1371/journal.pone.0242933
author Haselimashhadi, Hamed
Mason, Jeremy C.
Mallon, Ann-Marie
Smedley, Damian
Meehan, Terrence F.
Parkinson, Helen
author_sort Haselimashhadi, Hamed
collection PubMed
description Reproducibility in the statistical analyses of data from high-throughput phenotyping screens requires a robust and reliable analysis foundation that allows modelling of different possible statistical scenarios. Regular challenges are scalability and extensibility of the analysis software. In this manuscript, we describe OpenStats, a freely available software package that addresses these challenges. We show the performance of the software in a high-throughput phenomic pipeline in the International Mouse Phenotyping Consortium (IMPC) and compare the agreement of the results with the most similar implementation in the literature. OpenStats has significant improvements in speed and scalability compared to existing software packages including a 13-fold improvement in computational time to the current production analysis pipeline in the IMPC. Reduced complexity also promotes FAIR data analysis by providing transparency and benefiting other groups in reproducing and re-usability of the statistical methods and results. OpenStats is freely available under a Creative Commons license at www.bioconductor.org/packages/OpenStats.
format Online
Article
Text
id pubmed-7773254
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-77732542021-01-07 OpenStats: A robust and scalable software package for reproducible analysis of high-throughput phenotypic data Haselimashhadi, Hamed Mason, Jeremy C. Mallon, Ann-Marie Smedley, Damian Meehan, Terrence F. Parkinson, Helen PLoS One Research Article Reproducibility in the statistical analyses of data from high-throughput phenotyping screens requires a robust and reliable analysis foundation that allows modelling of different possible statistical scenarios. Regular challenges are scalability and extensibility of the analysis software. In this manuscript, we describe OpenStats, a freely available software package that addresses these challenges. We show the performance of the software in a high-throughput phenomic pipeline in the International Mouse Phenotyping Consortium (IMPC) and compare the agreement of the results with the most similar implementation in the literature. OpenStats has significant improvements in speed and scalability compared to existing software packages including a 13-fold improvement in computational time to the current production analysis pipeline in the IMPC. Reduced complexity also promotes FAIR data analysis by providing transparency and benefiting other groups in reproducing and re-usability of the statistical methods and results. OpenStats is freely available under a Creative Commons license at www.bioconductor.org/packages/OpenStats. Public Library of Science 2020-12-30 /pmc/articles/PMC7773254/ /pubmed/33378393 http://dx.doi.org/10.1371/journal.pone.0242933 Text en © 2020 Haselimashhadi et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title OpenStats: A robust and scalable software package for reproducible analysis of high-throughput phenotypic data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7773254/
https://www.ncbi.nlm.nih.gov/pubmed/33378393
http://dx.doi.org/10.1371/journal.pone.0242933