
QMaker: Fast and Accurate Method to Estimate Empirical Models of Protein Evolution

Amino acid substitution models play a crucial role in phylogenetic analyses. Maximum likelihood (ML) methods have been proposed to estimate amino acid substitution models; however, they are typically complicated and slow. In this article, we propose QMaker, a new ML method to estimate a general time...


Bibliographic Details
Main authors: Minh, Bui Quang; Dang, Cuong Cao; Vinh, Le Sy; Lanfear, Robert
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8357343/
https://www.ncbi.nlm.nih.gov/pubmed/33616668
http://dx.doi.org/10.1093/sysbio/syab010
author Minh, Bui Quang
Dang, Cuong Cao
Vinh, Le Sy
Lanfear, Robert
collection PubMed
description Amino acid substitution models play a crucial role in phylogenetic analyses. Maximum likelihood (ML) methods have been proposed to estimate amino acid substitution models; however, they are typically complicated and slow. In this article, we propose QMaker, a new ML method to estimate a general time-reversible Q matrix from a large protein data set consisting of multiple sequence alignments. QMaker combines an efficient ML tree search algorithm, a model selection for handling the model heterogeneity among alignments, and the consideration of rate mixture models among sites. We provide QMaker as a user-friendly function in the IQ-TREE software package (http://www.iqtree.org) supporting the use of multiple CPU cores so that biologists can easily estimate amino acid substitution models from their own protein alignments. We used QMaker to estimate new empirical general amino acid substitution models from the current Pfam database as well as five clade-specific models for mammals, birds, insects, yeasts, and plants. Our results show that the new models considerably improve the fit between model and data and in some cases influence the inference of phylogenetic tree topologies. [Amino acid replacement matrices; amino acid substitution models; maximum likelihood estimation; phylogenetic inferences.]
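As an illustration of what a general time-reversible rate matrix looks like, the following is a minimal sketch and not the QMaker or IQ-TREE implementation. It assumes the standard parameterization in which each off-diagonal rate is a symmetric exchangeability multiplied by the stationary frequency of the target state. The function name `build_reversible_q` and the 3-state toy values are hypothetical; a real amino acid model would use 20 states with estimated parameters.

```python
from typing import List


def build_reversible_q(exchange: List[List[float]], freqs: List[float]) -> List[List[float]]:
    """Build a normalized time-reversible rate matrix.

    Off-diagonal entries are q[i][j] = exchange[i][j] * freqs[j], where
    exchange is symmetric. Each diagonal entry makes its row sum to zero.
    The matrix is scaled so that -sum_i freqs[i] * q[i][i] equals 1.
    """
    n = len(freqs)
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("frequencies must sum to 1")
    for i in range(n):
        for j in range(n):
            if abs(exchange[i][j] - exchange[j][i]) > 1e-12:
                raise ValueError("exchangeabilities must be symmetric")

    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                q[i][j] = exchange[i][j] * freqs[j]
        # The diagonal is still 0.0 here, so the row sum covers only off-diagonals.
        q[i][i] = -sum(q[i])

    # Normalize to one expected substitution per unit time.
    rate = -sum(freqs[i] * q[i][i] for i in range(n))
    return [[value / rate for value in row] for row in q]


# Toy 3-state example (illustrative numbers only, not amino acid estimates).
toy_exchange = [
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 3.0],
    [2.0, 3.0, 0.0],
]
toy_freqs = [0.2, 0.3, 0.5]
q = build_reversible_q(toy_exchange, toy_freqs)
```

Time reversibility here means detailed balance: `toy_freqs[i] * q[i][j]` equals `toy_freqs[j] * q[j][i]` for every pair of states, which follows directly from the symmetry of the exchangeabilities.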
format Online
Article
Text
id pubmed-8357343
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8357343 2021-08-12 QMaker: Fast and Accurate Method to Estimate Empirical Models of Protein Evolution Minh, Bui Quang; Dang, Cuong Cao; Vinh, Le Sy; Lanfear, Robert Syst Biol Regular Articles Oxford University Press 2021-02-22 /pmc/articles/PMC8357343/ /pubmed/33616668 http://dx.doi.org/10.1093/sysbio/syab010 Text en © The Author(s) 2021. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
title QMaker: Fast and Accurate Method to Estimate Empirical Models of Protein Evolution
topic Regular Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8357343/
https://www.ncbi.nlm.nih.gov/pubmed/33616668
http://dx.doi.org/10.1093/sysbio/syab010