A representation of a compressed de Bruijn graph for pan-genome analysis that enables search

Bibliographic Details
Main Authors:	Beller, Timo; Ohlebusch, Enno
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2016
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4950428/ https://www.ncbi.nlm.nih.gov/pubmed/27437028 http://dx.doi.org/10.1186/s13015-016-0083-7

Description
Summary:	BACKGROUND: Recently, Marcus et al. (Bioinformatics 30:3476–83, 2014) proposed using a compressed de Bruijn graph to describe the relationship between the genomes of many individuals or strains of the same or closely related species. They devised an [Formula: see text] time algorithm called splitMEM that constructs this graph directly (i.e., without first building the uncompressed de Bruijn graph) based on a suffix tree, where n is the total length of the genomes and g is the length of the longest genome. Baier et al. (Bioinformatics 32:497–504, 2016) improved their result. RESULTS: In this paper, we propose a new space-efficient representation of the compressed de Bruijn graph that adds the ability to search for a pattern (e.g., an allele, a variant form of a gene) within the pan-genome. The ability to search within the pan-genome graph is of utmost importance and is a design goal of pan-genome data structures.
