Cargando…

Linearization of genome sequence graphs revisited

The need to include the genetic variation within a population into a reference genome led to the concept of a genome sequence graph. Nodes of such a graph are labeled with DNA sequences occurring in represented genomes. Due to double-stranded nature of DNA, each node may be oriented in one of two po...

Descripción completa

Detalles Bibliográficos
Autores principales: Lisiecka, Anna, Dojer, Norbert
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Elsevier 2021
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8264155/
https://www.ncbi.nlm.nih.gov/pubmed/34278263
http://dx.doi.org/10.1016/j.isci.2021.102755
Descripción
Sumario:The need to include the genetic variation within a population into a reference genome led to the concept of a genome sequence graph. Nodes of such a graph are labeled with DNA sequences occurring in represented genomes. Due to double-stranded nature of DNA, each node may be oriented in one of two possible ways, resulting in marking one end of the labeling sequence as in-side and the other as out-side. Edges join pairs of sides and reflect adjacency between node sequences in genomes constituting the graph. Linearization of a sequence graph aims at orienting and ordering graph nodes in a way that makes it more efficient for visualization and further analysis, e.g. access and traversal. We propose a new linearization algorithm, called ALIBI – Algorithm for Linearization by Incremental graph BuIlding. The evaluation shows that ALIBI is computationally very efficient and generates high-quality results.