A reproducible and generalizable software workflow for analysis of large-scale neuroimaging data collections using BIDS Apps
Neuroimaging research faces a crisis of reproducibility. With massive sample sizes and greater data complexity, this problem becomes more acute. Software that operates on imaging data defined using the Brain Imaging Data Structure (BIDS) – BIDS Apps – has provided a substantial advance. However, ev...
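Main authors: | Zhao, Chenying; Jarecka, Dorota; Covitz, Sydney; Chen, Yibei; Eickhoff, Simon B.; Fair, Damien A.; Franco, Alexandre R.; Halchenko, Yaroslav O.; Hendrickson, Timothy J.; Hoffstaedter, Felix; Houghton, Audrey; Kiar, Gregory; Macdonald, Austin; Mehta, Kahini; Milham, Michael P.; Salo, Taylor; Hanke, Michael; Ghosh, Satrajit S.; Cieslak, Matthew; Satterthwaite, Theodore D. |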
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory, 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10461987/ https://www.ncbi.nlm.nih.gov/pubmed/37645999 http://dx.doi.org/10.1101/2023.08.16.552472 |
_version_ | 1785097970270601216 |
---|---|
author | Zhao, Chenying; Jarecka, Dorota; Covitz, Sydney; Chen, Yibei; Eickhoff, Simon B.; Fair, Damien A.; Franco, Alexandre R.; Halchenko, Yaroslav O.; Hendrickson, Timothy J.; Hoffstaedter, Felix; Houghton, Audrey; Kiar, Gregory; Macdonald, Austin; Mehta, Kahini; Milham, Michael P.; Salo, Taylor; Hanke, Michael; Ghosh, Satrajit S.; Cieslak, Matthew; Satterthwaite, Theodore D. |
author_sort | Zhao, Chenying |
collection | PubMed |
description | Neuroimaging research faces a crisis of reproducibility. With massive sample sizes and greater data complexity, this problem becomes more acute. Software that operates on imaging data defined using the Brain Imaging Data Structure (BIDS) – BIDS Apps – has provided a substantial advance. However, even using BIDS Apps, a full audit trail of data processing is a necessary prerequisite for fully reproducible research. Obtaining a faithful record of the audit trail is challenging – especially for large datasets. Recently, the FAIRly big framework was introduced as a way to facilitate reproducible processing of large-scale data by leveraging DataLad – a version control system for data management. However, the current implementation of this framework was more of a proof of concept, and could not be immediately reused by other investigators for different use cases. Here we introduce the BIDS App Bootstrap (BABS), a user-friendly and generalizable Python package for reproducible image processing at scale. BABS facilitates the reproducible application of BIDS Apps to large-scale datasets. Leveraging DataLad and the FAIRly big framework, BABS tracks the full audit trail of data processing in a scalable way by automatically preparing all scripts necessary for data processing and version tracking on high performance computing (HPC) systems. Currently, BABS supports job submissions and audits on Sun Grid Engine (SGE) and Slurm HPCs with a parsimonious set of programs. To demonstrate its scalability, we applied BABS to data from the Healthy Brain Network (HBN; n=2,565). Taken together, BABS allows reproducible and scalable image processing and is broadly extensible via an open-source development model. |
format | Online Article Text |
id | pubmed-10461987 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Cold Spring Harbor Laboratory |
record_format | MEDLINE/PubMed |
spelling | pubmed-10461987 2023-08-29 A reproducible and generalizable software workflow for analysis of large-scale neuroimaging data collections using BIDS Apps Zhao, Chenying; Jarecka, Dorota; Covitz, Sydney; Chen, Yibei; Eickhoff, Simon B.; Fair, Damien A.; Franco, Alexandre R.; Halchenko, Yaroslav O.; Hendrickson, Timothy J.; Hoffstaedter, Felix; Houghton, Audrey; Kiar, Gregory; Macdonald, Austin; Mehta, Kahini; Milham, Michael P.; Salo, Taylor; Hanke, Michael; Ghosh, Satrajit S.; Cieslak, Matthew; Satterthwaite, Theodore D. bioRxiv Article Neuroimaging research faces a crisis of reproducibility. With massive sample sizes and greater data complexity, this problem becomes more acute. Software that operates on imaging data defined using the Brain Imaging Data Structure (BIDS) – BIDS Apps – has provided a substantial advance. However, even using BIDS Apps, a full audit trail of data processing is a necessary prerequisite for fully reproducible research. Obtaining a faithful record of the audit trail is challenging – especially for large datasets. Recently, the FAIRly big framework was introduced as a way to facilitate reproducible processing of large-scale data by leveraging DataLad – a version control system for data management. However, the current implementation of this framework was more of a proof of concept, and could not be immediately reused by other investigators for different use cases. Here we introduce the BIDS App Bootstrap (BABS), a user-friendly and generalizable Python package for reproducible image processing at scale. BABS facilitates the reproducible application of BIDS Apps to large-scale datasets. Leveraging DataLad and the FAIRly big framework, BABS tracks the full audit trail of data processing in a scalable way by automatically preparing all scripts necessary for data processing and version tracking on high performance computing (HPC) systems. Currently, BABS supports job submissions and audits on Sun Grid Engine (SGE) and Slurm HPCs with a parsimonious set of programs. To demonstrate its scalability, we applied BABS to data from the Healthy Brain Network (HBN; n=2,565). Taken together, BABS allows reproducible and scalable image processing and is broadly extensible via an open-source development model. Cold Spring Harbor Laboratory 2023-08-18 /pmc/articles/PMC10461987/ /pubmed/37645999 http://dx.doi.org/10.1101/2023.08.16.552472 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator. |
title | A reproducible and generalizable software workflow for analysis of large-scale neuroimaging data collections using BIDS Apps |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10461987/ https://www.ncbi.nlm.nih.gov/pubmed/37645999 http://dx.doi.org/10.1101/2023.08.16.552472 |