Cargando…

Meta-Alignment with Crumble and Prune: Partitioning very large alignment problems for performance and parallelization

BACKGROUND: Continuing research into the global multiple sequence alignment problem has resulted in more sophisticated and principled alignment methods. Unfortunately these new algorithms often require large amounts of time and memory to run, making it nearly impossible to run these algorithms on la...

Descripción completa

Detalles Bibliográficos
Autores principales: Roskin, Krishna M, Paten, Benedict, Haussler, David
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2011
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3114744/
https://www.ncbi.nlm.nih.gov/pubmed/21569267
http://dx.doi.org/10.1186/1471-2105-12-144
_version_ 1782206103529979904
author Roskin, Krishna M
Paten, Benedict
Haussler, David
author_facet Roskin, Krishna M
Paten, Benedict
Haussler, David
author_sort Roskin, Krishna M
collection PubMed
description BACKGROUND: Continuing research into the global multiple sequence alignment problem has resulted in more sophisticated and principled alignment methods. Unfortunately these new algorithms often require large amounts of time and memory to run, making it nearly impossible to run these algorithms on large datasets. As a solution, we present two general methods, Crumble and Prune, for breaking a phylogenetic alignment problem into smaller, more tractable sub-problems. We call Crumble and Prune meta-alignment methods because they use existing alignment algorithms and can be used with many current alignment programs. Crumble breaks long alignment problems into shorter sub-problems. Prune divides the phylogenetic tree into a collection of smaller trees to reduce the number of sequences in each alignment problem. These methods are orthogonal: they can be applied together to provide better scaling in terms of sequence length and in sequence depth. Both methods partition the problem such that many of the sub-problems can be solved independently. The results are then combined to form a solution to the full alignment problem. RESULTS: Crumble and Prune each provide a significant performance improvement with little loss of accuracy. In some cases, a gain in accuracy was observed. Crumble and Prune were tested on real and simulated data. Furthermore, we have implemented a system called Job-tree that allows hierarchical sub-problems to be solved in parallel on a compute cluster, significantly shortening the run-time. CONCLUSIONS: These methods enabled us to solve gigabase alignment problems. These methods could enable a new generation of biologically realistic alignment algorithms to be applied to real world, large scale alignment problems.
format Online
Article
Text
id pubmed-3114744
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-31147442011-06-15 Meta-Alignment with Crumble and Prune: Partitioning very large alignment problems for performance and parallelization Roskin, Krishna M Paten, Benedict Haussler, David BMC Bioinformatics Software BACKGROUND: Continuing research into the global multiple sequence alignment problem has resulted in more sophisticated and principled alignment methods. Unfortunately these new algorithms often require large amounts of time and memory to run, making it nearly impossible to run these algorithms on large datasets. As a solution, we present two general methods, Crumble and Prune, for breaking a phylogenetic alignment problem into smaller, more tractable sub-problems. We call Crumble and Prune meta-alignment methods because they use existing alignment algorithms and can be used with many current alignment programs. Crumble breaks long alignment problems into shorter sub-problems. Prune divides the phylogenetic tree into a collection of smaller trees to reduce the number of sequences in each alignment problem. These methods are orthogonal: they can be applied together to provide better scaling in terms of sequence length and in sequence depth. Both methods partition the problem such that many of the sub-problems can be solved independently. The results are then combined to form a solution to the full alignment problem. RESULTS: Crumble and Prune each provide a significant performance improvement with little loss of accuracy. In some cases, a gain in accuracy was observed. Crumble and Prune were tested on real and simulated data. Furthermore, we have implemented a system called Job-tree that allows hierarchical sub-problems to be solved in parallel on a compute cluster, significantly shortening the run-time. CONCLUSIONS: These methods enabled us to solve gigabase alignment problems. These methods could enable a new generation of biologically realistic alignment algorithms to be applied to real world, large scale alignment problems. BioMed Central 2011-05-10 /pmc/articles/PMC3114744/ /pubmed/21569267 http://dx.doi.org/10.1186/1471-2105-12-144 Text en Copyright © 2011 Roskin et al; licensee BioMed Central Ltd. https://creativecommons.org/licenses/by/2.0/This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0 (https://creativecommons.org/licenses/by/2.0/) ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Software
Roskin, Krishna M
Paten, Benedict
Haussler, David
Meta-Alignment with Crumble and Prune: Partitioning very large alignment problems for performance and parallelization
title Meta-Alignment with Crumble and Prune: Partitioning very large alignment problems for performance and parallelization
title_full Meta-Alignment with Crumble and Prune: Partitioning very large alignment problems for performance and parallelization
title_fullStr Meta-Alignment with Crumble and Prune: Partitioning very large alignment problems for performance and parallelization
title_full_unstemmed Meta-Alignment with Crumble and Prune: Partitioning very large alignment problems for performance and parallelization
title_short Meta-Alignment with Crumble and Prune: Partitioning very large alignment problems for performance and parallelization
title_sort meta-alignment with crumble and prune: partitioning very large alignment problems for performance and parallelization
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3114744/
https://www.ncbi.nlm.nih.gov/pubmed/21569267
http://dx.doi.org/10.1186/1471-2105-12-144
work_keys_str_mv AT roskinkrishnam metaalignmentwithcrumbleandprunepartitioningverylargealignmentproblemsforperformanceandparallelization
AT patenbenedict metaalignmentwithcrumbleandprunepartitioningverylargealignmentproblemsforperformanceandparallelization
AT hausslerdavid metaalignmentwithcrumbleandprunepartitioningverylargealignmentproblemsforperformanceandparallelization