Cargando…

Genome alignment with graph data structures: a comparison

BACKGROUND: Recent advances in rapid, low-cost sequencing have opened up the opportunity to study complete genome sequences. The computational approach of multiple genome alignment allows investigation of evolutionarily related genomes in an integrated fashion, providing a basis for downstream analy...

Descripción completa

Detalles Bibliográficos
Autores principales: Kehr, Birte, Trappe, Kathrin, Holtgrewe, Manuel, Reinert, Knut
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2014
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4020321/
https://www.ncbi.nlm.nih.gov/pubmed/24712884
http://dx.doi.org/10.1186/1471-2105-15-99
_version_ 1782316049677418496
author Kehr, Birte
Trappe, Kathrin
Holtgrewe, Manuel
Reinert, Knut
author_facet Kehr, Birte
Trappe, Kathrin
Holtgrewe, Manuel
Reinert, Knut
author_sort Kehr, Birte
collection PubMed
description BACKGROUND: Recent advances in rapid, low-cost sequencing have opened up the opportunity to study complete genome sequences. The computational approach of multiple genome alignment allows investigation of evolutionarily related genomes in an integrated fashion, providing a basis for downstream analyses such as rearrangement studies and phylogenetic inference. Graphs have proven to be a powerful tool for coping with the complexity of genome-scale sequence alignments. The potential of graphs to intuitively represent all aspects of genome alignments led to the development of graph-based approaches for genome alignment. These approaches construct a graph from a set of local alignments, and derive a genome alignment through identification and removal of graph substructures that indicate errors in the alignment. RESULTS: We compare the structures of commonly used graphs in terms of their abilities to represent alignment information. We describe how the graphs can be transformed into each other, and identify and classify graph substructures common to one or more graphs. Based on previous approaches, we compile a list of modifications that remove these substructures. CONCLUSION: We show that crucial pieces of alignment information, associated with inversions and duplications, are not visible in the structure of all graphs. If we neglect vertex or edge labels, the graphs differ in their information content. Still, many ideas are shared among all graph-based approaches. Based on these findings, we outline a conceptual framework for graph-based genome alignment that can assist in the development of future genome alignment tools.
format Online
Article
Text
id pubmed-4020321
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-40203212014-05-28 Genome alignment with graph data structures: a comparison Kehr, Birte Trappe, Kathrin Holtgrewe, Manuel Reinert, Knut BMC Bioinformatics Research Article BACKGROUND: Recent advances in rapid, low-cost sequencing have opened up the opportunity to study complete genome sequences. The computational approach of multiple genome alignment allows investigation of evolutionarily related genomes in an integrated fashion, providing a basis for downstream analyses such as rearrangement studies and phylogenetic inference. Graphs have proven to be a powerful tool for coping with the complexity of genome-scale sequence alignments. The potential of graphs to intuitively represent all aspects of genome alignments led to the development of graph-based approaches for genome alignment. These approaches construct a graph from a set of local alignments, and derive a genome alignment through identification and removal of graph substructures that indicate errors in the alignment. RESULTS: We compare the structures of commonly used graphs in terms of their abilities to represent alignment information. We describe how the graphs can be transformed into each other, and identify and classify graph substructures common to one or more graphs. Based on previous approaches, we compile a list of modifications that remove these substructures. CONCLUSION: We show that crucial pieces of alignment information, associated with inversions and duplications, are not visible in the structure of all graphs. If we neglect vertex or edge labels, the graphs differ in their information content. Still, many ideas are shared among all graph-based approaches. Based on these findings, we outline a conceptual framework for graph-based genome alignment that can assist in the development of future genome alignment tools. BioMed Central 2014-04-09 /pmc/articles/PMC4020321/ /pubmed/24712884 http://dx.doi.org/10.1186/1471-2105-15-99 Text en Copyright © 2014 Kehr et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Kehr, Birte
Trappe, Kathrin
Holtgrewe, Manuel
Reinert, Knut
Genome alignment with graph data structures: a comparison
title Genome alignment with graph data structures: a comparison
title_full Genome alignment with graph data structures: a comparison
title_fullStr Genome alignment with graph data structures: a comparison
title_full_unstemmed Genome alignment with graph data structures: a comparison
title_short Genome alignment with graph data structures: a comparison
title_sort genome alignment with graph data structures: a comparison
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4020321/
https://www.ncbi.nlm.nih.gov/pubmed/24712884
http://dx.doi.org/10.1186/1471-2105-15-99
work_keys_str_mv AT kehrbirte genomealignmentwithgraphdatastructuresacomparison
AT trappekathrin genomealignmentwithgraphdatastructuresacomparison
AT holtgrewemanuel genomealignmentwithgraphdatastructuresacomparison
AT reinertknut genomealignmentwithgraphdatastructuresacomparison