Phylonium: fast estimation of evolutionary distances from large samples of similar genomes
MOTIVATION: Tracking disease outbreaks by whole-genome sequencing leads to the collection of large samples of closely related sequences. Five years ago, we published a method to accurately compute all pairwise distances for such samples by indexing each sequence. Since indexing is slow, we now ask w...
Main authors: | Klötzl, Fabian; Haubold, Bernhard |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7141870/ https://www.ncbi.nlm.nih.gov/pubmed/31790149 http://dx.doi.org/10.1093/bioinformatics/btz903 |
_version_ | 1783519274296410112 |
---|---|
author | Klötzl, Fabian Haubold, Bernhard |
author_facet | Klötzl, Fabian Haubold, Bernhard |
author_sort | Klötzl, Fabian |
collection | PubMed |
description | MOTIVATION: Tracking disease outbreaks by whole-genome sequencing leads to the collection of large samples of closely related sequences. Five years ago, we published a method to accurately compute all pairwise distances for such samples by indexing each sequence. Since indexing is slow, we now ask whether it is possible to achieve similar accuracy when indexing only a single sequence. RESULTS: We have implemented this idea in the program phylonium and show that it is as accurate as its predecessor and roughly 100 times faster when applied to all 2678 Escherichia coli genomes contained in ENSEMBL. One of the best published programs for rapidly computing pairwise distances, mash, analyzes the same dataset four times faster but, with default settings, it is less accurate than phylonium. AVAILABILITY AND IMPLEMENTATION: Phylonium runs under the UNIX command line; its C++ sources and documentation are available from github.com/evolbioinf/phylonium. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-7141870 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-71418702020-04-13 Phylonium: fast estimation of evolutionary distances from large samples of similar genomes Klötzl, Fabian Haubold, Bernhard Bioinformatics Original Papers MOTIVATION: Tracking disease outbreaks by whole-genome sequencing leads to the collection of large samples of closely related sequences. Five years ago, we published a method to accurately compute all pairwise distances for such samples by indexing each sequence. Since indexing is slow, we now ask whether it is possible to achieve similar accuracy when indexing only a single sequence. RESULTS: We have implemented this idea in the program phylonium and show that it is as accurate as its predecessor and roughly 100 times faster when applied to all 2678 Escherichia coli genomes contained in ENSEMBL. One of the best published programs for rapidly computing pairwise distances, mash, analyzes the same dataset four times faster but, with default settings, it is less accurate than phylonium. AVAILABILITY AND IMPLEMENTATION: Phylonium runs under the UNIX command line; its C++ sources and documentation are available from github.com/evolbioinf/phylonium. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2020-04-01 2019-12-02 /pmc/articles/PMC7141870/ /pubmed/31790149 http://dx.doi.org/10.1093/bioinformatics/btz903 Text en © The Author(s) 2019. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Papers Klötzl, Fabian Haubold, Bernhard Phylonium: fast estimation of evolutionary distances from large samples of similar genomes |
title | Phylonium: fast estimation of evolutionary distances from large samples of similar genomes |
title_full | Phylonium: fast estimation of evolutionary distances from large samples of similar genomes |
title_fullStr | Phylonium: fast estimation of evolutionary distances from large samples of similar genomes |
title_full_unstemmed | Phylonium: fast estimation of evolutionary distances from large samples of similar genomes |
title_short | Phylonium: fast estimation of evolutionary distances from large samples of similar genomes |
title_sort | phylonium: fast estimation of evolutionary distances from large samples of similar genomes |
topic | Original Papers |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7141870/ https://www.ncbi.nlm.nih.gov/pubmed/31790149 http://dx.doi.org/10.1093/bioinformatics/btz903 |
work_keys_str_mv | AT klotzlfabian phyloniumfastestimationofevolutionarydistancesfromlargesamplesofsimilargenomes AT hauboldbernhard phyloniumfastestimationofevolutionarydistancesfromlargesamplesofsimilargenomes |