‘Multi-SpaM’: a maximum-likelihood approach to phylogeny reconstruction using multiple spaced-word matches and quartet trees
Word-based or ‘alignment-free’ methods for phylogeny inference have become popular in recent years. These methods are much faster than traditional, alignment-based approaches, but they are generally less accurate. Most alignment-free methods calculate ‘pairwise’ distances between nucleic-acid or pro...
Main authors: | Dencker, Thomas; Leimeister, Chris-André; Gerth, Michael; Bleidorn, Christoph; Snir, Sagi; Morgenstern, Burkhard |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7671388/ https://www.ncbi.nlm.nih.gov/pubmed/33575565 http://dx.doi.org/10.1093/nargab/lqz013 |
_version_ | 1783610919773798400 |
---|---|
author | Dencker, Thomas Leimeister, Chris-André Gerth, Michael Bleidorn, Christoph Snir, Sagi Morgenstern, Burkhard |
author_facet | Dencker, Thomas Leimeister, Chris-André Gerth, Michael Bleidorn, Christoph Snir, Sagi Morgenstern, Burkhard |
author_sort | Dencker, Thomas |
collection | PubMed |
description | Word-based or ‘alignment-free’ methods for phylogeny inference have become popular in recent years. These methods are much faster than traditional, alignment-based approaches, but they are generally less accurate. Most alignment-free methods calculate ‘pairwise’ distances between nucleic-acid or protein sequences; these distance values can then be used as input for tree-reconstruction programs such as neighbor-joining. In this paper, we propose the first word-based phylogeny approach that is based on ‘multiple’ sequence comparison and ‘maximum likelihood’. Our algorithm first samples small, gap-free alignments involving four taxa each. For each of these alignments, it then calculates a quartet tree and, finally, the program ‘Quartet MaxCut’ is used to infer a super tree for the full set of input taxa from the calculated quartet trees. Experimental results show that trees produced with our approach are of high quality. |
format | Online Article Text |
id | pubmed-7671388 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-76713882021-02-10 ‘Multi-SpaM’: a maximum-likelihood approach to phylogeny reconstruction using multiple spaced-word matches and quartet trees Dencker, Thomas Leimeister, Chris-André Gerth, Michael Bleidorn, Christoph Snir, Sagi Morgenstern, Burkhard NAR Genom Bioinform Methods Article Word-based or ‘alignment-free’ methods for phylogeny inference have become popular in recent years. These methods are much faster than traditional, alignment-based approaches, but they are generally less accurate. Most alignment-free methods calculate ‘pairwise’ distances between nucleic-acid or protein sequences; these distance values can then be used as input for tree-reconstruction programs such as neighbor-joining. In this paper, we propose the first word-based phylogeny approach that is based on ‘multiple’ sequence comparison and ‘maximum likelihood’. Our algorithm first samples small, gap-free alignments involving four taxa each. For each of these alignments, it then calculates a quartet tree and, finally, the program ‘Quartet MaxCut’ is used to infer a super tree for the full set of input taxa from the calculated quartet trees. Experimental results show that trees produced with our approach are of high quality. Oxford University Press 2019-10-30 /pmc/articles/PMC7671388/ /pubmed/33575565 http://dx.doi.org/10.1093/nargab/lqz013 Text en © The Author(s) 2019. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methods Article Dencker, Thomas Leimeister, Chris-André Gerth, Michael Bleidorn, Christoph Snir, Sagi Morgenstern, Burkhard ‘Multi-SpaM’: a maximum-likelihood approach to phylogeny reconstruction using multiple spaced-word matches and quartet trees |
title | ‘Multi-SpaM’: a maximum-likelihood approach to phylogeny reconstruction using multiple spaced-word matches and quartet trees |
topic | Methods Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7671388/ https://www.ncbi.nlm.nih.gov/pubmed/33575565 http://dx.doi.org/10.1093/nargab/lqz013 |
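
The description field above outlines a three-step pipeline: sample small gap-free alignments of four taxa, infer a quartet tree for each, then combine the quartet trees with Quartet MaxCut. As a rough illustration of the first two steps only, the Python sketch below checks whether four equal-length segments form a spaced-word match under a binary pattern and then picks one of the three possible quartet topologies. The pattern convention (`1` = match position, `0` = don't-care position), the function names, and the toy sequences are assumptions made for illustration. The record states that the method is based on maximum likelihood; this sketch substitutes a simple count of informative sites, so it should not be read as the authors' implementation.

```python
def is_spaced_word_match(segments, pattern):
    """Return True if all segments agree at every match position ('1').

    Assumed convention: pattern is a string of '1' (match position)
    and '0' (don't-care position); segments are equal-length strings.
    """
    if any(len(s) != len(pattern) for s in segments):
        return False
    return all(
        len({s[i] for s in segments}) == 1
        for i, p in enumerate(pattern)
        if p == "1"
    )


def quartet_topology(block):
    """Pick a quartet topology for a gap-free block of four taxa.

    block maps taxon name -> segment. This is a stand-in for a
    maximum-likelihood quartet: it counts, for each of the three
    possible splits, the columns that support it (two taxa share one
    character, the other two share a different one).
    """
    a, b, c, d = sorted(block)
    support = {
        ((a, b), (c, d)): 0,
        ((a, c), (b, d)): 0,
        ((a, d), (b, c)): 0,
    }
    for i in range(len(block[a])):
        for (x, y), (u, v) in support:
            if (block[x][i] == block[y][i]
                    and block[u][i] == block[v][i]
                    and block[x][i] != block[u][i]):
                support[((x, y), (u, v))] += 1
    best = max(support, key=support.get)  # ties resolve to the first split
    return best, support[best]


# Toy block: the four segments agree at pattern positions 0, 1, 4 and 6.
pattern = "1100101"
block = {
    "A": "ACGGTAG",
    "B": "ACGGTCG",
    "C": "ACTTTAG",
    "D": "ACTTTCG",
}

if is_spaced_word_match(list(block.values()), pattern):
    split, score = quartet_topology(block)
    print(split, score)
```

In this toy block, two of the three don't-care columns group A with B and C with D, so that split is returned with a support of 2. Combining many such quartets into a single tree would be left to a supertree program such as Quartet MaxCut, which is not reproduced here.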