Phylonium: fast estimation of evolutionary distances from large samples of similar genomes

MOTIVATION: Tracking disease outbreaks by whole-genome sequencing leads to the collection of large samples of closely related sequences. Five years ago, we published a method to accurately compute all pairwise distances for such samples by indexing each sequence. Since indexing is slow, we now ask w...

Bibliographic Details
Main authors: Klötzl, Fabian; Haubold, Bernhard
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects: Original Papers
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7141870/
https://www.ncbi.nlm.nih.gov/pubmed/31790149
http://dx.doi.org/10.1093/bioinformatics/btz903
author Klötzl, Fabian
Haubold, Bernhard
collection PubMed
description MOTIVATION: Tracking disease outbreaks by whole-genome sequencing leads to the collection of large samples of closely related sequences. Five years ago, we published a method to accurately compute all pairwise distances for such samples by indexing each sequence. Since indexing is slow, we now ask whether it is possible to achieve similar accuracy when indexing only a single sequence. RESULTS: We have implemented this idea in the program phylonium and show that it is as accurate as its predecessor and roughly 100 times faster when applied to all 2678 Escherichia coli genomes contained in ENSEMBL. One of the best published programs for rapidly computing pairwise distances, mash, analyzes the same dataset four times faster but, with default settings, it is less accurate than phylonium. AVAILABILITY AND IMPLEMENTATION: Phylonium runs under the UNIX command line; its C++ sources and documentation are available from github.com/evolbioinf/phylonium. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-7141870
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7141870 2020-04-13 Bioinformatics Original Papers Oxford University Press 2020-04-01 2019-12-02 /pmc/articles/PMC7141870/ /pubmed/31790149 http://dx.doi.org/10.1093/bioinformatics/btz903 Text en © The Author(s) 2019. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Phylonium: fast estimation of evolutionary distances from large samples of similar genomes
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7141870/
https://www.ncbi.nlm.nih.gov/pubmed/31790149
http://dx.doi.org/10.1093/bioinformatics/btz903