
An Integrated Toolkit for Extensible and Reproducible Neuroscience


Bibliographic Details
Main Authors: Matelsky, Jordan K, Rodriguez, Luis M, Xenes, Daniel, Gion, Timothy, Hider, Robert, Wester, Brock A, Gray-Roncal, William
Format: Online Article Text
Language: English
Published: 2021
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9044020/
https://www.ncbi.nlm.nih.gov/pubmed/34891768
http://dx.doi.org/10.1109/EMBC46164.2021.9630199
Description
Summary: As neuroimagery datasets continue to grow in size, the complexity of data analyses can require a detailed understanding and implementation of systems computer science for storage, access, processing, and sharing. Currently, several general data standards (e.g., Zarr, HDF5, precomputed) and purpose-built ecosystems (e.g., BossDB, CloudVolume, DVID, and Knossos) exist. Each of these systems has advantages and limitations and is most appropriate for different use cases. Using datasets that do not fit into RAM is challenging in this heterogeneous environment, and significant barriers exist to leveraging underlying research investments. In this manuscript, we outline our perspective for how to approach this challenge through the use of community-provided, standardized interfaces that unify various computational backends and abstract computer science challenges from the scientist. We introduce desirable design patterns and share our reference implementation, called intern.
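The design pattern the abstract describes can be sketched as a single numpy-style, scientist-facing interface that hides the storage backend behind a common protocol. The sketch below is illustrative only: the class and method names (VolumeBackend, InMemoryBackend, Volume) are assumptions for this example, not intern's actual API, and the in-memory backend stands in for a real Zarr, HDF5, or cloud-hosted store.

```python
# Minimal sketch of a unified volumetric-data interface.
# All names here are hypothetical; they illustrate the pattern,
# not the intern library's real API.
from abc import ABC, abstractmethod

import numpy as np


class VolumeBackend(ABC):
    """Common protocol that every storage backend implements."""

    @abstractmethod
    def read(self, zs, ys, xs):
        """Return the voxel cutout selected by the three slices."""


class InMemoryBackend(VolumeBackend):
    """Toy backend; a real one might wrap Zarr, HDF5, or a cloud store."""

    def __init__(self, data: np.ndarray):
        self._data = data

    def read(self, zs, ys, xs):
        return self._data[zs, ys, xs]


class Volume:
    """Scientist-facing array: numpy slicing syntax, backend-agnostic."""

    def __init__(self, backend: VolumeBackend):
        self._backend = backend

    def __getitem__(self, key):
        zs, ys, xs = key
        return self._backend.read(zs, ys, xs)


# The analysis code below never changes if the backend is swapped out.
vol = Volume(InMemoryBackend(np.arange(64).reshape(4, 4, 4)))
cutout = vol[0:2, 1:3, 0:4]
```

Because the scientist only ever touches Volume, swapping InMemoryBackend for a backend that streams cutouts from a remote service requires no changes to analysis code, which is the point of abstracting the computational backend away from the researcher.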