LMAP: Lightweight Multigene Analyses in PAML

BACKGROUND: Uncovering how phenotypic diversity arises and is maintained in nature has long been a major interest of evolutionary biologists. Recent advances in genome sequencing technologies have remarkably increased the efficiency to pinpoint genes involved in the adaptive evolution of phenotypes....

Bibliographic Details
Main Authors: Maldonado, Emanuel; Almeida, Daniela; Escalona, Tibisay; Khan, Imran; Vasconcelos, Vitor; Antunes, Agostinho
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5011788/
https://www.ncbi.nlm.nih.gov/pubmed/27597435
http://dx.doi.org/10.1186/s12859-016-1204-5
author Maldonado, Emanuel
Almeida, Daniela
Escalona, Tibisay
Khan, Imran
Vasconcelos, Vitor
Antunes, Agostinho
collection PubMed
description BACKGROUND: Uncovering how phenotypic diversity arises and is maintained in nature has long been a major interest of evolutionary biologists. Recent advances in genome sequencing technologies have remarkably increased the efficiency of pinpointing genes involved in the adaptive evolution of phenotypes. The reliability of such findings is most often examined with statistical and computational methods using Maximum Likelihood codon-based models (i.e., site, branch, branch-site and clade models), such as those available in codeml from the Phylogenetic Analysis by Maximum Likelihood (PAML) package. While these models represent a well-defined workflow for documenting adaptive evolution, in practice they can be challenging for researchers with vast amounts of data, as multiple types of relevant codon-based datasets are generated, making the overall process tedious, error-prone and time-consuming. RESULTS: We introduce LMAP (Lightweight Multigene Analyses in PAML), a user-friendly command-line and interactive package designed to handle the codeml workflow, namely: directory organization, execution, and the gathering and organization of results for Likelihood Ratio Test estimations, with minimal manual user intervention. LMAP was developed for the multi-core workstation environment and can process one, several, or all codeml codon-based models for multiple datasets at a time. Our software proved efficient throughout the codeml workflow, including, but not limited to, simultaneously handling more than 20 datasets. CONCLUSIONS: We have developed LMAP, a simple and versatile package with outstanding performance, enabling researchers to analyze multiple different codon-based datasets in a high-throughput fashion. At minimum, two file types are required within a single input directory: one for the multiple sequence alignment and another for the phylogenetic tree. 
To our knowledge, no other software combines all codeml codon substitution models of adaptive evolution. LMAP has been developed as an open-source package, allowing its integration into more complex open-source bioinformatics pipelines. The LMAP package is released under the GPLv3 license and is freely available at http://lmapaml.sourceforge.net/. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1204-5) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5011788
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5011788 2016-09-15 LMAP: Lightweight Multigene Analyses in PAML Maldonado, Emanuel Almeida, Daniela Escalona, Tibisay Khan, Imran Vasconcelos, Vitor Antunes, Agostinho BMC Bioinformatics Software BACKGROUND: Uncovering how phenotypic diversity arises and is maintained in nature has long been a major interest of evolutionary biologists. Recent advances in genome sequencing technologies have remarkably increased the efficiency of pinpointing genes involved in the adaptive evolution of phenotypes. The reliability of such findings is most often examined with statistical and computational methods using Maximum Likelihood codon-based models (i.e., site, branch, branch-site and clade models), such as those available in codeml from the Phylogenetic Analysis by Maximum Likelihood (PAML) package. While these models represent a well-defined workflow for documenting adaptive evolution, in practice they can be challenging for researchers with vast amounts of data, as multiple types of relevant codon-based datasets are generated, making the overall process tedious, error-prone and time-consuming. RESULTS: We introduce LMAP (Lightweight Multigene Analyses in PAML), a user-friendly command-line and interactive package designed to handle the codeml workflow, namely: directory organization, execution, and the gathering and organization of results for Likelihood Ratio Test estimations, with minimal manual user intervention. LMAP was developed for the multi-core workstation environment and can process one, several, or all codeml codon-based models for multiple datasets at a time. Our software proved efficient throughout the codeml workflow, including, but not limited to, simultaneously handling more than 20 datasets. CONCLUSIONS: We have developed LMAP, a simple and versatile package with outstanding performance, enabling researchers to analyze multiple different codon-based datasets in a high-throughput fashion. 
At minimum, two file types are required within a single input directory: one for the multiple sequence alignment and another for the phylogenetic tree. To our knowledge, no other software combines all codeml codon substitution models of adaptive evolution. LMAP has been developed as an open-source package, allowing its integration into more complex open-source bioinformatics pipelines. The LMAP package is released under the GPLv3 license and is freely available at http://lmapaml.sourceforge.net/. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1204-5) contains supplementary material, which is available to authorized users. BioMed Central 2016-09-06 /pmc/articles/PMC5011788/ /pubmed/27597435 http://dx.doi.org/10.1186/s12859-016-1204-5 Text en © The Author(s). 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title LMAP: Lightweight Multigene Analyses in PAML
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5011788/
https://www.ncbi.nlm.nih.gov/pubmed/27597435
http://dx.doi.org/10.1186/s12859-016-1204-5