
‘Multi-SpaM’: a maximum-likelihood approach to phylogeny reconstruction using multiple spaced-word matches and quartet trees

Word-based or ‘alignment-free’ methods for phylogeny inference have become popular in recent years. These methods are much faster than traditional, alignment-based approaches, but they are generally less accurate. Most alignment-free methods calculate ‘pairwise’ distances between nucleic-acid or pro...


Bibliographic Details
Main authors: Dencker, Thomas; Leimeister, Chris-André; Gerth, Michael; Bleidorn, Christoph; Snir, Sagi; Morgenstern, Burkhard
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7671388/
https://www.ncbi.nlm.nih.gov/pubmed/33575565
http://dx.doi.org/10.1093/nargab/lqz013
author Dencker, Thomas
Leimeister, Chris-André
Gerth, Michael
Bleidorn, Christoph
Snir, Sagi
Morgenstern, Burkhard
collection PubMed
description Word-based or ‘alignment-free’ methods for phylogeny inference have become popular in recent years. These methods are much faster than traditional, alignment-based approaches, but they are generally less accurate. Most alignment-free methods calculate ‘pairwise’ distances between nucleic-acid or protein sequences; these distance values can then be used as input for tree-reconstruction programs such as neighbor-joining. In this paper, we propose the first word-based phylogeny approach that is based on ‘multiple’ sequence comparison and ‘maximum likelihood’. Our algorithm first samples small, gap-free alignments involving four taxa each. For each of these alignments, it then calculates a quartet tree and, finally, the program ‘Quartet MaxCut’ is used to infer a super tree for the full set of input taxa from the calculated quartet trees. Experimental results show that trees produced with our approach are of high quality.
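The sampling step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the binary pattern, the function names `spaced_word` and `four_taxon_blocks`, and the exhaustive enumeration (the abstract speaks of sampling) are assumptions made for illustration. The sketch collects windows from four different sequences that agree at the match positions ('1') of the pattern, which gives small, gap-free four-taxon alignments. Quartet-tree inference and the supertree step with 'Quartet MaxCut' are not reproduced here.

```python
from collections import defaultdict
from itertools import combinations


def spaced_word(seq, start, pattern):
    """Return the characters of seq at the match positions ('1') of pattern."""
    return "".join(seq[start + i] for i, p in enumerate(pattern) if p == "1")


def four_taxon_blocks(seqs, pattern):
    """Yield gap-free blocks, one window per sequence, for every set of four
    sequences that share a spaced word with respect to pattern.

    Hypothetical helper: only the first occurrence of a spaced word in each
    sequence is recorded, and all four-sequence combinations are enumerated.
    """
    length = len(pattern)
    hits = defaultdict(dict)  # spaced word -> {sequence name: first start position}
    for name, seq in seqs.items():
        for start in range(len(seq) - length + 1):
            word = spaced_word(seq, start, pattern)
            hits[word].setdefault(name, start)
    for by_seq in hits.values():
        for names in combinations(sorted(by_seq), 4):
            # Each block is a four-row alignment without gaps.
            yield {n: seqs[n][by_seq[n]:by_seq[n] + length] for n in names}


if __name__ == "__main__":
    toy = {"a": "ACG", "b": "ATG", "c": "AAG", "d": "AGG"}
    for block in four_taxon_blocks(toy, "101"):
        print(block)
```

Each block yielded this way could then be passed to a tree-inference program to obtain one quartet topology; that step is left out because the record does not specify how it is performed.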
format Online
Article
Text
id pubmed-7671388
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7671388 2021-02-10 NAR Genom Bioinform Methods Article Oxford University Press 2019-10-30 /pmc/articles/PMC7671388/ /pubmed/33575565 http://dx.doi.org/10.1093/nargab/lqz013 Text en © The Author(s) 2019. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title ‘Multi-SpaM’: a maximum-likelihood approach to phylogeny reconstruction using multiple spaced-word matches and quartet trees
topic Methods Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7671388/
https://www.ncbi.nlm.nih.gov/pubmed/33575565
http://dx.doi.org/10.1093/nargab/lqz013