
ParGenes: a tool for massively parallel model selection and phylogenetic tree inference on thousands of genes

MOTIVATION: Coalescent- and reconciliation-based methods are now widely used to infer species phylogenies from genomic data. They typically use per-gene phylogenies as input, which requires conducting multiple individual tree inferences on a large set of multiple sequence alignments (MSAs). At prese...


Bibliographic Details
Main authors: Morel, Benoit; Kozlov, Alexey M; Stamatakis, Alexandros
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6513153/
https://www.ncbi.nlm.nih.gov/pubmed/30321303
http://dx.doi.org/10.1093/bioinformatics/bty839
author Morel, Benoit
Kozlov, Alexey M
Stamatakis, Alexandros
collection PubMed
description MOTIVATION: Coalescent- and reconciliation-based methods are now widely used to infer species phylogenies from genomic data. They typically use per-gene phylogenies as input, which requires conducting multiple individual tree inferences on a large set of multiple sequence alignments (MSAs). At present, no easy-to-use parallel tool for this task exists. Ad hoc scripts for this purpose do not only induce additional implementation overhead, but can also lead to poor resource utilization and long times-to-solution. We present ParGenes, a tool for simultaneously determining the best-fit model and inferring maximum likelihood (ML) phylogenies on thousands of independent MSAs using supercomputers. RESULTS: ParGenes executes common phylogenetic pipeline steps such as model-testing, ML inference(s), bootstrapping and computation of branch support values via a single parallel program invocation. We evaluated ParGenes by inferring > 20 000 phylogenetic gene trees with bootstrap support values from Ensembl Compara and VectorBase alignments in 28 h on a cluster with 1024 nodes. AVAILABILITY AND IMPLEMENTATION: GNU GPL at https://github.com/BenoitMorel/ParGenes. SUPPLEMENTARY INFORMATION: Supplementary material is available at Bioinformatics online.
format Online
Article
Text
id pubmed-6513153
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6513153 2019-05-20 ParGenes: a tool for massively parallel model selection and phylogenetic tree inference on thousands of genes Morel, Benoit Kozlov, Alexey M Stamatakis, Alexandros Bioinformatics Applications Notes MOTIVATION: Coalescent- and reconciliation-based methods are now widely used to infer species phylogenies from genomic data. They typically use per-gene phylogenies as input, which requires conducting multiple individual tree inferences on a large set of multiple sequence alignments (MSAs). At present, no easy-to-use parallel tool for this task exists. Ad hoc scripts for this purpose do not only induce additional implementation overhead, but can also lead to poor resource utilization and long times-to-solution. We present ParGenes, a tool for simultaneously determining the best-fit model and inferring maximum likelihood (ML) phylogenies on thousands of independent MSAs using supercomputers. RESULTS: ParGenes executes common phylogenetic pipeline steps such as model-testing, ML inference(s), bootstrapping and computation of branch support values via a single parallel program invocation. We evaluated ParGenes by inferring > 20 000 phylogenetic gene trees with bootstrap support values from Ensembl Compara and VectorBase alignments in 28 h on a cluster with 1024 nodes. AVAILABILITY AND IMPLEMENTATION: GNU GPL at https://github.com/BenoitMorel/ParGenes. SUPPLEMENTARY INFORMATION: Supplementary material is available at Bioinformatics online. Oxford University Press 2019-05-15 2018-10-15 /pmc/articles/PMC6513153/ /pubmed/30321303 http://dx.doi.org/10.1093/bioinformatics/bty839 Text en © The Author(s) 2018. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title ParGenes: a tool for massively parallel model selection and phylogenetic tree inference on thousands of genes
topic Applications Notes