A scalable neuroinformatics data flow for electrophysiological signals using MapReduce

Data-driven neuroscience research is providing new insights into the progression of neurological disorders and supporting the development of improved treatment approaches. However, the volume, velocity, and variety of neuroscience data generated from sophisticated recording instruments and acquisition met...

Bibliographic Details
Main Authors: Jayapandian, Catherine; Wei, Annan; Ramesh, Priya; Zonjy, Bilal; Lhatoo, Samden D.; Loparo, Kenneth; Zhang, Guo-Qiang; Sahoo, Satya S.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2015
Subjects: Neuroscience
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4360820/
https://www.ncbi.nlm.nih.gov/pubmed/25852536
http://dx.doi.org/10.3389/fninf.2015.00004
author Jayapandian, Catherine
Wei, Annan
Ramesh, Priya
Zonjy, Bilal
Lhatoo, Samden D.
Loparo, Kenneth
Zhang, Guo-Qiang
Sahoo, Satya S.
collection PubMed
description Data-driven neuroscience research is providing new insights into the progression of neurological disorders and supporting the development of improved treatment approaches. However, the volume, velocity, and variety of neuroscience data generated from sophisticated recording instruments and acquisition methods have exacerbated the limited scalability of existing neuroinformatics tools. This makes it difficult for neuroscience researchers to effectively leverage the growing multi-modal neuroscience data to advance research on serious neurological disorders, such as epilepsy. We describe the development of the Cloudwave data flow, which uses new data partitioning techniques to store and analyze electrophysiological signals in a distributed computing infrastructure. The Cloudwave data flow uses the MapReduce parallel programming algorithm to implement an integrated signal data processing pipeline that scales with large volumes of data generated at high velocity. Using an epilepsy domain ontology together with an epilepsy-focused extensible data representation format called the Cloudwave Signal Format (CSF), the data flow addresses the challenge of data heterogeneity and is interoperable with existing neuroinformatics data representation formats, such as HDF5. The scalability of the Cloudwave data flow is evaluated using a 30-node cluster installed with the open-source Hadoop software stack. The results demonstrate that the Cloudwave data flow can process increasing volumes of signal data by leveraging Hadoop Data Nodes to reduce the total data processing time. The Cloudwave data flow is a template for developing highly scalable neuroscience data processing pipelines using MapReduce algorithms to support a variety of user applications.
format Online
Article
Text
id pubmed-4360820
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-4360820 2015-04-07 Front Neuroinform Neuroscience Frontiers Media S.A. 2015-03-16 /pmc/articles/PMC4360820/ /pubmed/25852536 http://dx.doi.org/10.3389/fninf.2015.00004 Text en
Copyright © 2015 Jayapandian, Wei, Ramesh, Zonjy, Lhatoo, Loparo, Zhang and Sahoo. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title A scalable neuroinformatics data flow for electrophysiological signals using MapReduce
topic Neuroscience
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4360820/
https://www.ncbi.nlm.nih.gov/pubmed/25852536
http://dx.doi.org/10.3389/fninf.2015.00004