Data structures based on k-mers for querying large collections of sequencing data sets

High-throughput sequencing data sets are usually deposited in public repositories (e.g., the European Nucleotide Archive) to ensure reproducibility. As the amount of data has reached petabyte scale, repositories do not allow one to perform online sequence searches, yet, such a feature would be highl...

Bibliographic Details
Main Authors:	Marchet, Camille, Boucher, Christina, Puglisi, Simon J., Medvedev, Paul, Salson, Mikaël, Chikhi, Rayan
Format:	Online Article Text
Language:	English
Published:	Cold Spring Harbor Laboratory Press 2021
Subjects:	Review
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7849385/ https://www.ncbi.nlm.nih.gov/pubmed/33328168 http://dx.doi.org/10.1101/gr.260604.119

Similar Items

REINDEER: efficient indexing of k-mer presence and abundance in sequencing datasets
by: Marchet, Camille, et al.
Published: (2020)

Disk compression of k-mer sets
by: Rahman, Amatur, et al.
Published: (2021)

DE-kupl: exhaustive capture of biological variation in RNA-seq data through k-mer decomposition
by: Audoux, Jérôme, et al.
Published: (2017)

Querying large read collections in main memory: a versatile data structure
by: Philippe, Nicolas, et al.
Published: (2011)

The K-mer File Format: a standardized and compact disk representation of sets of k-mers
by: Dufresne, Yoann, et al.
Published: (2022)

kmtricks: efficient and flexible construction of Bloom filters for large sequencing data collections
by: Lemane, Téo, et al.
Published: (2022)

kmdiff, large-scale and user-friendly differential k-mer analyses
by: Lemane, Téo, et al.
Published: (2022)

aKmerBroom: Ancient oral DNA decontamination using Bloom filters on k-mer sets
by: Duitama González, Camila, et al.
Published: (2023)

Compacting de Bruijn graphs from sequencing data quickly and in low memory
by: Chikhi, Rayan, et al.
Published: (2016)

iMOKA: k-mer based software to analyze large collections of sequencing data
by: Lorenzi, Claudio, et al.
Published: (2020)

Kohdista: an efficient method to index and query possible Rmap alignments
by: Muggli, Martin D., et al.
Published: (2019)

Fulgor: A fast and compact k-mer index for large-scale matching and color queries
by: Fan, Jason, et al.
Published: (2023)

Querying Large Physics Data Sets Over an Information Grid
by: Baker, Nigel, et al.
Published: (2001)

decOM: similarity-based microbial source tracking of ancient oral samples using k-mer-based methods
by: Duitama González, Camila, et al.
Published: (2023)

Mapping-friendly sequence reductions: Going beyond homopolymer compression
by: Blassel, Luc, et al.
Published: (2022)

The K-mer antibiotic resistance gene variant analyzer (KARGVA)
by: Marini, Simone, et al.
Published: (2023)

Secure and Privacy-Preserving Body Sensor Data Collection and Query Scheme
by: Zhu, Hui, et al.
Published: (2016)

FQSqueezer: k-mer-based compression of sequencing data
by: Deorowicz, Sebastian
Published: (2020)

These Are Not the K-mers You Are Looking For: Efficient Online K-mer Counting Using a Probabilistic Data Structure
by: Zhang, Qingpeng, et al.
Published: (2014)

Data structure set-trie for storing and querying sets: Theoretical and empirical analysis
by: Savnik, Iztok, et al.
Published: (2021)

TPMS: a set of utilities for querying collections of gene trees
by: Bigot, Thomas, et al.
Published: (2013)

μ-PBWT: a lightweight r-indexing of the PBWT for storing and querying UK Biobank data
by: Cozzi, Davide, et al.
Published: (2023)

Collect, combine, and transform data using Power Query in Excel and Power BI
by: Raviv, Gil
Published: (2019)

AMR-meta: a k-mer and metafeature approach to classify antimicrobial resistance from high-throughput short-read metagenomics data
by: Marini, Simone, et al.
Published: (2022)

Top-k dominating queries on incomplete large dataset
by: Wu, Jimmy Ming-Tai, et al.
Published: (2021)

DatView: a graphical user interface for visualizing and querying large data sets in serial femtosecond crystallography
by: Stander, Natasha, et al.
Published: (2019)

Querying a web of linked data: foundations and query execution
by: Hartig, O
Published: (2016)

Mapsembler, targeted and micro assembly of large NGS datasets on a desktop computer
by: Peterlongo, Pierre, et al.
Published: (2012)

Query Lifting: Language-integrated query for heterogeneous nested collections
by: Ricciotti, Wilmer, et al.
Published: (2021)

Efficient mapping of accurate long reads in minimizer space with mapquik
by: Ekim, Bariş, et al.
Published: (2023)

Knowledge and Theme Discovery across Very Large Biological Data Sets Using Distributed Queries: A Prototype Combining Unstructured and Structured Data
by: Mudunuri, Uma S., et al.
Published: (2013)

On-Demand Information Retrieval in Sensor Networks with Localised Query and Energy-Balanced Data Collection
by: Teng, Rui, et al.
Published: (2010)

Misassembly detection using paired-end sequence reads and optical mapping data
by: Muggli, Martin D., et al.
Published: (2015)

HyDA-Vista: towards optimal guided selection of k-mer size for sequence assembly
by: Shariat, Basir, et al.
Published: (2014)

Author Correction: FQSqueezer: k-mer-based compression of sequencing data
by: Deorowicz, Sebastian
Published: (2020)

A PID-Based kNN Query Processing Algorithm for Spatial Data
by: Qiao, Baiyou, et al.
Published: (2022)

Survey on Exact kNN Queries over High-Dimensional Data Space
by: Ukey, Nimish, et al.
Published: (2023)

Fast and accurate correction of optical mapping data via spaced seeds
by: Salmela, Leena, et al.
Published: (2020)

Matchtigs: minimum plain text representation of k-mer sets
by: Schmidt, Sebastian, et al.
Published: (2023)
