Evaluating the Cassandra NoSQL Database Approach for Genomic Data Persistency

Rapid advances in high-throughput sequencing techniques have created interesting computational challenges in bioinformatics. One of them refers to management of massive amounts of data generated by automatic sequencers. We need to deal with the persistency of genomic data, particularly storing and a...

Bibliographic Details
Main authors: Aniceto, Rodrigo, Xavier, Rene, Guimarães, Valeria, Hondo, Fernanda, Holanda, Maristela, Walter, Maria Emilia, Lifschitz, Sérgio
Format: Online Article Text
Language: English
Published: Hindawi Publishing Corporation 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4629038/
https://www.ncbi.nlm.nih.gov/pubmed/26558254
http://dx.doi.org/10.1155/2015/502795
author Aniceto, Rodrigo
Xavier, Rene
Guimarães, Valeria
Hondo, Fernanda
Holanda, Maristela
Walter, Maria Emilia
Lifschitz, Sérgio
collection PubMed
description Rapid advances in high-throughput sequencing techniques have created interesting computational challenges in bioinformatics. One of them refers to management of massive amounts of data generated by automatic sequencers. We need to deal with the persistency of genomic data, particularly storing and analyzing these large-scale processed data. To find an alternative to the frequently considered relational database model becomes a compelling task. Other data models may be more effective when dealing with a very large amount of nonconventional data, especially for writing and retrieving operations. In this paper, we discuss the Cassandra NoSQL database approach for storing genomic data. We perform an analysis of persistency and I/O operations with real data, using the Cassandra database system. We also compare the results obtained with a classical relational database system and another NoSQL database approach, MongoDB.
format Online
Article
Text
id pubmed-4629038
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Hindawi Publishing Corporation
record_format MEDLINE/PubMed
spelling pubmed-4629038
2015-11-10
Evaluating the Cassandra NoSQL Database Approach for Genomic Data Persistency
Aniceto, Rodrigo
Xavier, Rene
Guimarães, Valeria
Hondo, Fernanda
Holanda, Maristela
Walter, Maria Emilia
Lifschitz, Sérgio
Int J Genomics
Research Article
Rapid advances in high-throughput sequencing techniques have created interesting computational challenges in bioinformatics. One of them refers to management of massive amounts of data generated by automatic sequencers. We need to deal with the persistency of genomic data, particularly storing and analyzing these large-scale processed data. To find an alternative to the frequently considered relational database model becomes a compelling task. Other data models may be more effective when dealing with a very large amount of nonconventional data, especially for writing and retrieving operations. In this paper, we discuss the Cassandra NoSQL database approach for storing genomic data. We perform an analysis of persistency and I/O operations with real data, using the Cassandra database system. We also compare the results obtained with a classical relational database system and another NoSQL database approach, MongoDB.
Hindawi Publishing Corporation 2015
2015-10-19
/pmc/articles/PMC4629038/
/pubmed/26558254
http://dx.doi.org/10.1155/2015/502795
Text
en
Copyright © 2015 Rodrigo Aniceto et al.
https://creativecommons.org/licenses/by/3.0/
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Evaluating the Cassandra NoSQL Database Approach for Genomic Data Persistency
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4629038/
https://www.ncbi.nlm.nih.gov/pubmed/26558254
http://dx.doi.org/10.1155/2015/502795