
AxPcoords & parallel AxParafit: statistical co-phylogenetic analyses on thousands of taxa

BACKGROUND: Current tools for co-phylogenetic analyses are not able to cope with the continuous accumulation of phylogenetic data. The sophisticated statistical test for host-parasite co-phylogenetic analyses implemented in Parafit does not allow it to handle large datasets in reasonable times. The...


Bibliographic Details
Main authors: Stamatakis, Alexandros, Auch, Alexander F, Meier-Kolthoff, Jan, Göker, Markus
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2194794/
https://www.ncbi.nlm.nih.gov/pubmed/17953748
http://dx.doi.org/10.1186/1471-2105-8-405
author Stamatakis, Alexandros
Auch, Alexander F
Meier-Kolthoff, Jan
Göker, Markus
collection PubMed
description BACKGROUND: Current tools for co-phylogenetic analyses are not able to cope with the continuous accumulation of phylogenetic data. The sophisticated statistical test for host-parasite co-phylogenetic analyses implemented in Parafit cannot handle large datasets in reasonable times. The Parafit and DistPCoA programs are by far the most compute-intensive components of the Parafit analysis pipeline. We present AxParafit and AxPcoords (Ax stands for Accelerated), which are highly optimized versions of Parafit and DistPCoA, respectively. RESULTS: Both programs have been entirely re-written in C. Via optimization of the algorithm and the C code, as well as integration of highly tuned BLAS and LAPACK methods, AxParafit runs 5–61 times faster than Parafit with a lower memory footprint (up to 35% reduction), and the performance benefit increases with growing dataset size. The MPI-based parallel implementation of AxParafit shows good scalability on up to 128 processors, even on medium-sized datasets. The parallel analysis with AxParafit on 128 CPUs for a medium-sized dataset with a 512 by 512 association matrix is more than 1,200/128 (roughly 9.4) times faster per processor than the sequential Parafit run. AxPcoords is 8–26 times faster than DistPCoA and numerically stable on large datasets. We outline the substantial benefits of using parallel AxParafit using the example of a large-scale empirical study on smut fungi and their host plants. To the best of our knowledge, this study represents the largest co-phylogenetic analysis to date. CONCLUSION: The highly efficient AxPcoords and AxParafit programs allow for large-scale co-phylogenetic analyses on several thousands of taxa for the first time. In addition, AxParafit and AxPcoords have been integrated into the easy-to-use CopyCat tool.
format Text
id pubmed-2194794
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2194794 2008-01-13 AxPcoords & parallel AxParafit: statistical co-phylogenetic analyses on thousands of taxa Stamatakis, Alexandros Auch, Alexander F Meier-Kolthoff, Jan Göker, Markus BMC Bioinformatics Software BioMed Central 2007-10-22 /pmc/articles/PMC2194794/ /pubmed/17953748 http://dx.doi.org/10.1186/1471-2105-8-405 Text en Copyright © 2007 Stamatakis et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title AxPcoords & parallel AxParafit: statistical co-phylogenetic analyses on thousands of taxa
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2194794/
https://www.ncbi.nlm.nih.gov/pubmed/17953748
http://dx.doi.org/10.1186/1471-2105-8-405