Analyzing big datasets of genomic sequences: fast and scalable collection of k-mer statistics
BACKGROUND: Distributed approaches based on the MapReduce programming paradigm have started to be proposed in the Bioinformatics domain, due to the large amount of data produced by the next-generation sequencing techniques. However, the use of MapReduce and related Big Data technologies and framewor...
Main Authors: | Ferraro Petrillo, Umberto; Sorella, Mara; Cattaneo, Giuseppe; Giancarlo, Raffaele; Rombo, Simona E. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6471689/ https://www.ncbi.nlm.nih.gov/pubmed/30999863 http://dx.doi.org/10.1186/s12859-019-2694-8 |
Similar Items
- DIAMIN: a software library for the distributed analysis of large-scale molecular interaction networks
  by: Di Rocco, Lorenzo, et al.
  Published: (2022)
- FASTA/Q data compressors for MapReduce-Hadoop genomics: space and time savings made easy
  by: Ferraro Petrillo, Umberto, et al.
  Published: (2021)
- Correction to: FASTA/Q data compressors for MapReduce-Hadoop genomics: space and time savings made easy
  by: Ferraro Petrillo, Umberto, et al.
  Published: (2022)
- Understanding big data scalability
  by: Isaacson, Cory
  Published: (2015)
- Statistical tests and identifiability conditions for pooling and analyzing multisite datasets
  by: Zhou, Hao Henry, et al.
  Published: (2018)