A reproducible and generalizable software workflow for analysis of large-scale neuroimaging data collections using BIDS Apps

Neuroimaging research faces a crisis of reproducibility. With massive sample sizes and greater data complexity, this problem becomes more acute. Software that operates on imaging data defined using the Brain Imaging Data Structure (BIDS) – BIDS Apps – has provided a substantial advance. However, ev...

Bibliographic Details
Main authors: Zhao, Chenying, Jarecka, Dorota, Covitz, Sydney, Chen, Yibei, Eickhoff, Simon B., Fair, Damien A., Franco, Alexandre R., Halchenko, Yaroslav O., Hendrickson, Timothy J., Hoffstaedter, Felix, Houghton, Audrey, Kiar, Gregory, Macdonald, Austin, Mehta, Kahini, Milham, Michael P., Salo, Taylor, Hanke, Michael, Ghosh, Satrajit S., Cieslak, Matthew, Satterthwaite, Theodore D.
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10461987/
https://www.ncbi.nlm.nih.gov/pubmed/37645999
http://dx.doi.org/10.1101/2023.08.16.552472
_version_ 1785097970270601216
author Zhao, Chenying
Jarecka, Dorota
Covitz, Sydney
Chen, Yibei
Eickhoff, Simon B.
Fair, Damien A.
Franco, Alexandre R.
Halchenko, Yaroslav O.
Hendrickson, Timothy J.
Hoffstaedter, Felix
Houghton, Audrey
Kiar, Gregory
Macdonald, Austin
Mehta, Kahini
Milham, Michael P.
Salo, Taylor
Hanke, Michael
Ghosh, Satrajit S.
Cieslak, Matthew
Satterthwaite, Theodore D.
author_sort Zhao, Chenying
collection PubMed
description Neuroimaging research faces a crisis of reproducibility. With massive sample sizes and greater data complexity, this problem becomes more acute. Software that operates on imaging data defined using the Brain Imaging Data Structure (BIDS) – BIDS Apps – has provided a substantial advance. However, even using BIDS Apps, a full audit trail of data processing is a necessary prerequisite for fully reproducible research. Obtaining a faithful record of the audit trail is challenging – especially for large datasets. Recently, the FAIRly big framework was introduced as a way to facilitate reproducible processing of large-scale data by leveraging DataLad – a version control system for data management. However, the current implementation of this framework was more of a proof of concept, and could not be immediately reused by other investigators for different use cases. Here we introduce the BIDS App Bootstrap (BABS), a user-friendly and generalizable Python package for reproducible image processing at scale. BABS facilitates the reproducible application of BIDS Apps to large-scale datasets. Leveraging DataLad and the FAIRly big framework, BABS tracks the full audit trail of data processing in a scalable way by automatically preparing all scripts necessary for data processing and version tracking on high performance computing (HPC) systems. Currently, BABS supports job submissions and audits on Sun Grid Engine (SGE) and Slurm HPCs with a parsimonious set of programs. To demonstrate its scalability, we applied BABS to data from the Healthy Brain Network (HBN; n=2,565). Taken together, BABS allows reproducible and scalable image processing and is broadly extensible via an open-source development model.
format Online
Article
Text
id pubmed-10461987
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cold Spring Harbor Laboratory
record_format MEDLINE/PubMed
spelling pubmed-10461987 2023-08-29 A reproducible and generalizable software workflow for analysis of large-scale neuroimaging data collections using BIDS Apps Zhao, Chenying Jarecka, Dorota Covitz, Sydney Chen, Yibei Eickhoff, Simon B. Fair, Damien A. Franco, Alexandre R. Halchenko, Yaroslav O. Hendrickson, Timothy J. Hoffstaedter, Felix Houghton, Audrey Kiar, Gregory Macdonald, Austin Mehta, Kahini Milham, Michael P. Salo, Taylor Hanke, Michael Ghosh, Satrajit S. Cieslak, Matthew Satterthwaite, Theodore D. bioRxiv Article Neuroimaging research faces a crisis of reproducibility. With massive sample sizes and greater data complexity, this problem becomes more acute. Software that operates on imaging data defined using the Brain Imaging Data Structure (BIDS) – BIDS Apps – has provided a substantial advance. However, even using BIDS Apps, a full audit trail of data processing is a necessary prerequisite for fully reproducible research. Obtaining a faithful record of the audit trail is challenging – especially for large datasets. Recently, the FAIRly big framework was introduced as a way to facilitate reproducible processing of large-scale data by leveraging DataLad – a version control system for data management. However, the current implementation of this framework was more of a proof of concept, and could not be immediately reused by other investigators for different use cases. Here we introduce the BIDS App Bootstrap (BABS), a user-friendly and generalizable Python package for reproducible image processing at scale. BABS facilitates the reproducible application of BIDS Apps to large-scale datasets. Leveraging DataLad and the FAIRly big framework, BABS tracks the full audit trail of data processing in a scalable way by automatically preparing all scripts necessary for data processing and version tracking on high performance computing (HPC) systems. Currently, BABS supports job submissions and audits on Sun Grid Engine (SGE) and Slurm HPCs with a parsimonious set of programs. To demonstrate its scalability, we applied BABS to data from the Healthy Brain Network (HBN; n=2,565). Taken together, BABS allows reproducible and scalable image processing and is broadly extensible via an open-source development model. Cold Spring Harbor Laboratory 2023-08-18 /pmc/articles/PMC10461987/ /pubmed/37645999 http://dx.doi.org/10.1101/2023.08.16.552472 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator.
title A reproducible and generalizable software workflow for analysis of large-scale neuroimaging data collections using BIDS Apps
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10461987/
https://www.ncbi.nlm.nih.gov/pubmed/37645999
http://dx.doi.org/10.1101/2023.08.16.552472