Cargando…
Efficient merging of genome profile alignments
MOTIVATION: Whole-genome alignment (WGA) methods show insufficient scalability toward the generation of large-scale WGAs. Profile alignment-based approaches revolutionized the fields of multiple sequence alignment construction methods by significantly reducing computational complexity and runtime. H...
Autores principales: | , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
Oxford University Press
2019
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6612806/ https://www.ncbi.nlm.nih.gov/pubmed/31510683 http://dx.doi.org/10.1093/bioinformatics/btz377 |
_version_ | 1783432941218562048 |
---|---|
author | Hennig, André Nieselt, Kay |
author_facet | Hennig, André Nieselt, Kay |
author_sort | Hennig, André |
collection | PubMed |
description | MOTIVATION: Whole-genome alignment (WGA) methods show insufficient scalability toward the generation of large-scale WGAs. Profile alignment-based approaches revolutionized the fields of multiple sequence alignment construction methods by significantly reducing computational complexity and runtime. However, WGAs need to consider genomic rearrangements between genomes, which make the profile-based extension of several whole-genomes challenging. Currently, none of the available methods offer the possibility to align or extend WGA profiles. RESULTS: Here, we present genome profile alignment, an approach that aligns the profiles of WGAs and that is capable of producing large-scale WGAs many times faster than conventional methods. Our concept relies on already available whole-genome aligners, which are used to compute several smaller sets of aligned genomes that are combined to a full WGA with a divide and conquer approach. To align or extend WGA profiles, we make use of the SuperGenome data structure, which features a bidirectional mapping between individual sequence and alignment coordinates. This data structure is used to efficiently transfer different coordinate systems into a common one based on the principles of profiles alignments. The approach allows the computation of a WGA where alignments are subsequently merged along a guide tree. The current implementation uses progressiveMauve and offers the possibility for parallel computation of independent genome alignments. Our results based on various bacterial datasets up to several hundred genomes show that we can reduce the runtime from months to hours with a quality that is negligibly worse than the WGA computed with the conventional progressiveMauve tool. AVAILABILITY AND IMPLEMENTATION: GPA is freely available at https://lambda.informatik.uni-tuebingen.de/gitlab/ahennig/GPA. GPA is implemented in Java, uses progressiveMauve and offers a parallel computation of WGAs. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-6612806 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-66128062019-07-12 Efficient merging of genome profile alignments Hennig, André Nieselt, Kay Bioinformatics Ismb/Eccb 2019 Conference Proceedings MOTIVATION: Whole-genome alignment (WGA) methods show insufficient scalability toward the generation of large-scale WGAs. Profile alignment-based approaches revolutionized the fields of multiple sequence alignment construction methods by significantly reducing computational complexity and runtime. However, WGAs need to consider genomic rearrangements between genomes, which make the profile-based extension of several whole-genomes challenging. Currently, none of the available methods offer the possibility to align or extend WGA profiles. RESULTS: Here, we present genome profile alignment, an approach that aligns the profiles of WGAs and that is capable of producing large-scale WGAs many times faster than conventional methods. Our concept relies on already available whole-genome aligners, which are used to compute several smaller sets of aligned genomes that are combined to a full WGA with a divide and conquer approach. To align or extend WGA profiles, we make use of the SuperGenome data structure, which features a bidirectional mapping between individual sequence and alignment coordinates. This data structure is used to efficiently transfer different coordinate systems into a common one based on the principles of profiles alignments. The approach allows the computation of a WGA where alignments are subsequently merged along a guide tree. The current implementation uses progressiveMauve and offers the possibility for parallel computation of independent genome alignments. Our results based on various bacterial datasets up to several hundred genomes show that we can reduce the runtime from months to hours with a quality that is negligibly worse than the WGA computed with the conventional progressiveMauve tool. AVAILABILITY AND IMPLEMENTATION: GPA is freely available at https://lambda.informatik.uni-tuebingen.de/gitlab/ahennig/GPA. GPA is implemented in Java, uses progressiveMauve and offers a parallel computation of WGAs. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2019-07 2019-07-05 /pmc/articles/PMC6612806/ /pubmed/31510683 http://dx.doi.org/10.1093/bioinformatics/btz377 Text en © The Author(s) 2019. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Ismb/Eccb 2019 Conference Proceedings Hennig, André Nieselt, Kay Efficient merging of genome profile alignments |
title | Efficient merging of genome profile alignments |
title_full | Efficient merging of genome profile alignments |
title_fullStr | Efficient merging of genome profile alignments |
title_full_unstemmed | Efficient merging of genome profile alignments |
title_short | Efficient merging of genome profile alignments |
title_sort | efficient merging of genome profile alignments |
topic | Ismb/Eccb 2019 Conference Proceedings |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6612806/ https://www.ncbi.nlm.nih.gov/pubmed/31510683 http://dx.doi.org/10.1093/bioinformatics/btz377 |
work_keys_str_mv | AT hennigandre efficientmergingofgenomeprofilealignments AT nieseltkay efficientmergingofgenomeprofilealignments |