
An Integrated Toolkit for Extensible and Reproducible Neuroscience


Bibliographic Details
Main Authors: Matelsky, Jordan K; Rodriguez, Luis M; Xenes, Daniel; Gion, Timothy; Hider, Robert; Wester, Brock A; Gray-Roncal, William
Format: Online Article Text
Language: English
Published: 2021
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9044020/
https://www.ncbi.nlm.nih.gov/pubmed/34891768
http://dx.doi.org/10.1109/EMBC46164.2021.9630199
collection PubMed
description As neuroimagery datasets continue to grow in size, the complexity of data analyses can require a detailed understanding and implementation of systems computer science for storage, access, processing, and sharing. Currently, several general data standards (e.g., Zarr, HDF5, precomputed) and purpose-built ecosystems (e.g., BossDB, CloudVolume, DVID, and Knossos) exist. Each of these systems has advantages and limitations, and each is most appropriate for different use cases. Using datasets that don’t fit into RAM in this heterogeneous environment is challenging, and significant barriers exist to leveraging underlying research investments. In this manuscript, we outline our perspective on how to approach this challenge through the use of community-provided, standardized interfaces that unify various computational backends and abstract computer science challenges away from the scientist. We introduce desirable design patterns and share our reference implementation, called intern.
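The idea of a standardized interface over several storage backends can be illustrated with a minimal sketch. Everything named here (`VolumeBackend`, `InMemoryBackend`, `ChunkedBackend`, `Volume`) is hypothetical and chosen only for illustration; it is not the API of intern or of any system named in the abstract, and the toy backends use plain Python lists rather than real file formats.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple

Voxels = List[List[List[int]]]  # indexed as [z][y][x]


class VolumeBackend(ABC):
    """Hypothetical minimal contract that every storage backend satisfies."""

    @abstractmethod
    def shape(self) -> Tuple[int, int, int]:
        """Return the volume extent as (z, y, x)."""

    @abstractmethod
    def read_voxel(self, z: int, y: int, x: int) -> int:
        """Return the value of a single voxel."""


class InMemoryBackend(VolumeBackend):
    """Toy backend: the whole volume lives in one nested list."""

    def __init__(self, data: Voxels) -> None:
        self._data = data

    def shape(self) -> Tuple[int, int, int]:
        return (len(self._data), len(self._data[0]), len(self._data[0][0]))

    def read_voxel(self, z: int, y: int, x: int) -> int:
        return self._data[z][y][x]


class ChunkedBackend(VolumeBackend):
    """Toy backend: one z-slab per key, standing in for a chunked store."""

    def __init__(self, slabs: Dict[int, List[List[int]]]) -> None:
        self._slabs = slabs

    def shape(self) -> Tuple[int, int, int]:
        first = self._slabs[0]
        return (len(self._slabs), len(first), len(first[0]))

    def read_voxel(self, z: int, y: int, x: int) -> int:
        return self._slabs[z][y][x]


class Volume:
    """Uniform, slice-based front end that hides which backend is in use."""

    def __init__(self, backend: VolumeBackend) -> None:
        self._backend = backend

    def __getitem__(self, key: Tuple[slice, slice, slice]) -> Voxels:
        # Resolve each slice against the backend's extent along that axis.
        zs, ys, xs = (
            range(*s.indices(n)) for s, n in zip(key, self._backend.shape())
        )
        return [
            [[self._backend.read_voxel(z, y, x) for x in xs] for y in ys]
            for z in zs
        ]


# The same cutout request works unchanged against either backend.
data = [[[z * 100 + y * 10 + x for x in range(4)] for y in range(3)] for z in range(2)]
in_memory = Volume(InMemoryBackend(data))
chunked = Volume(ChunkedBackend({z: data[z] for z in range(2)}))
assert in_memory[0:2, 1:3, 0:2] == chunked[0:2, 1:3, 0:2]
```

In this sketch, analysis code depends only on `Volume`, so swapping the storage layer means supplying a different `VolumeBackend` subclass and leaving the calling code untouched.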
format Online
Article
Text
id pubmed-9044020
institution National Center for Biotechnology Information
language English
publishDate 2021
record_format MEDLINE/PubMed
spelling pubmed-9044020 2022-04-27 Annu Int Conf IEEE Eng Med Biol Soc Article 2021-11 /pmc/articles/PMC9044020/ /pubmed/34891768 http://dx.doi.org/10.1109/EMBC46164.2021.9630199 Text en This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see https://creativecommons.org/licenses/by/3.0/