
On the use of algebraic topology concepts to check the consistency of genome assembly


Bibliographic Details
Main author: Gibrat, Jean-François
Format: Online Article Text
Language: English
Published: The Biophysical Society of Japan (BSJ) 2019
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6975893/
https://www.ncbi.nlm.nih.gov/pubmed/31984196
http://dx.doi.org/10.2142/biophysico.16.0_444
Description
Summary: This paper presents preliminary work comprising two contributions. The first is the design of a very efficient algorithm, based on an “Overlap-Layout-Consensus” (OLC) graph, to assemble the long reads produced by third-generation sequencing technologies. The second is the analysis of this graph using algebraic topology concepts to determine, in advance, whether the assembly of the genome will be straightforward, i.e., whether it will lead to a pseudo-Hamiltonian path or cycle, or whether the results will need to be scrutinized. In the latter case, it will be necessary to look for “loops” in the OLC assembly graph caused by unresolved repeated genomic regions, and then try to untie the “knots” created by these regions.
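
As a rough illustration of how a topological invariant can flag "loops" in an assembly graph, the sketch below computes the first Betti number b1 = E - V + C of an undirected graph (E edges, V vertices, C connected components), which counts its independent cycles. This is not taken from the paper and may differ from the author's method: the function name `first_betti_number`, the integer read identifiers, and the edge-list input are all hypothetical, and the graph is assumed to be transitively reduced already, since redundant overlap edges would otherwise inflate the count.

```python
def first_betti_number(num_reads, overlaps):
    """Count independent cycles, b1 = E - V + C, of an undirected graph.

    num_reads: number of vertices (reads), labelled 0 .. num_reads - 1.
    overlaps:  iterable of (a, b) pairs, one per retained overlap edge.
    Both names are illustrative assumptions, not the paper's notation.
    """
    parent = list(range(num_reads))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Treat the graph as simple: drop self-loops and duplicate edges.
    edges = {tuple(sorted(e)) for e in overlaps if e[0] != e[1]}

    components = num_reads
    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            components -= 1

    return len(edges) - num_reads + components


# A linear chain of reads: no loops.
chain = [(0, 1), (1, 2), (2, 3)]
# A single closed cycle, as one might hope for with a circular genome.
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
# Two cycles sharing read 2, a toy stand-in for an unresolved repeat.
knot = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 2)]

print(first_betti_number(4, chain))
print(first_betti_number(4, cycle))
print(first_betti_number(5, knot))
```

Under these assumptions, a value of 0 (linear genome) or 1 (circular genome) is consistent with a single path or cycle, while a larger value indicates extra loops that would need to be examined.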