Evaluating the Cassandra NoSQL Database Approach for Genomic Data Persistency
Rapid advances in high-throughput sequencing techniques have created interesting computational challenges in bioinformatics. One of them refers to management of massive amounts of data generated by automatic sequencers. We need to deal with the persistency of genomic data, particularly storing and a...
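Main Authors: | Aniceto, Rodrigo; Xavier, Rene; Guimarães, Valeria; Hondo, Fernanda; Holanda, Maristela; Walter, Maria Emilia; Lifschitz, Sérgio |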
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi Publishing Corporation 2015 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4629038/ https://www.ncbi.nlm.nih.gov/pubmed/26558254 http://dx.doi.org/10.1155/2015/502795 |
_version_ | 1782398514297307136 |
---|---|
author | Aniceto, Rodrigo; Xavier, Rene; Guimarães, Valeria; Hondo, Fernanda; Holanda, Maristela; Walter, Maria Emilia; Lifschitz, Sérgio |
author_sort | Aniceto, Rodrigo |
collection | PubMed |
description | Rapid advances in high-throughput sequencing techniques have created interesting computational challenges in bioinformatics. One of them refers to management of massive amounts of data generated by automatic sequencers. We need to deal with the persistency of genomic data, particularly storing and analyzing these large-scale processed data. To find an alternative to the frequently considered relational database model becomes a compelling task. Other data models may be more effective when dealing with a very large amount of nonconventional data, especially for writing and retrieving operations. In this paper, we discuss the Cassandra NoSQL database approach for storing genomic data. We perform an analysis of persistency and I/O operations with real data, using the Cassandra database system. We also compare the results obtained with a classical relational database system and another NoSQL database approach, MongoDB. |
format | Online Article Text |
id | pubmed-4629038 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Hindawi Publishing Corporation |
record_format | MEDLINE/PubMed |
spelling | pubmed-46290382015-11-10 Evaluating the Cassandra NoSQL Database Approach for Genomic Data Persistency Aniceto, Rodrigo Xavier, Rene Guimarães, Valeria Hondo, Fernanda Holanda, Maristela Walter, Maria Emilia Lifschitz, Sérgio Int J Genomics Research Article Rapid advances in high-throughput sequencing techniques have created interesting computational challenges in bioinformatics. One of them refers to management of massive amounts of data generated by automatic sequencers. We need to deal with the persistency of genomic data, particularly storing and analyzing these large-scale processed data. To find an alternative to the frequently considered relational database model becomes a compelling task. Other data models may be more effective when dealing with a very large amount of nonconventional data, especially for writing and retrieving operations. In this paper, we discuss the Cassandra NoSQL database approach for storing genomic data. We perform an analysis of persistency and I/O operations with real data, using the Cassandra database system. We also compare the results obtained with a classical relational database system and another NoSQL database approach, MongoDB. Hindawi Publishing Corporation 2015 2015-10-19 /pmc/articles/PMC4629038/ /pubmed/26558254 http://dx.doi.org/10.1155/2015/502795 Text en Copyright © 2015 Rodrigo Aniceto et al. https://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Evaluating the Cassandra NoSQL Database Approach for Genomic Data Persistency |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4629038/ https://www.ncbi.nlm.nih.gov/pubmed/26558254 http://dx.doi.org/10.1155/2015/502795 |