
Phylonium: fast estimation of evolutionary distances from large samples of similar genomes

MOTIVATION: Tracking disease outbreaks by whole-genome sequencing leads to the collection of large samples of closely related sequences. Five years ago, we published a method to accurately compute all pairwise distances for such samples by indexing each sequence. Since indexing is slow, we now ask w...


Bibliographic Details
Main Authors:	Klötzl, Fabian; Haubold, Bernhard
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2020
Subjects:	Original Papers
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7141870/ https://www.ncbi.nlm.nih.gov/pubmed/31790149 http://dx.doi.org/10.1093/bioinformatics/btz903

author	Klötzl, Fabian; Haubold, Bernhard
collection	PubMed
description	MOTIVATION: Tracking disease outbreaks by whole-genome sequencing leads to the collection of large samples of closely related sequences. Five years ago, we published a method to accurately compute all pairwise distances for such samples by indexing each sequence. Since indexing is slow, we now ask whether it is possible to achieve similar accuracy when indexing only a single sequence. RESULTS: We have implemented this idea in the program phylonium and show that it is as accurate as its predecessor and roughly 100 times faster when applied to all 2678 Escherichia coli genomes contained in ENSEMBL. One of the best published programs for rapidly computing pairwise distances, mash, analyzes the same dataset four times faster but, with default settings, it is less accurate than phylonium. AVAILABILITY AND IMPLEMENTATION: Phylonium runs under the UNIX command line; its C++ sources and documentation are available from github.com/evolbioinf/phylonium. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format	Online Article Text
id	pubmed-7141870
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-7141870 2020-04-13 Klötzl, Fabian; Haubold, Bernhard Bioinformatics Original Papers Oxford University Press 2020-04-01 2019-12-02 /pmc/articles/PMC7141870/ /pubmed/31790149 http://dx.doi.org/10.1093/bioinformatics/btz903 Text en © The Author(s) 2019. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Phylonium: fast estimation of evolutionary distances from large samples of similar genomes
topic	Original Papers
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7141870/ https://www.ncbi.nlm.nih.gov/pubmed/31790149 http://dx.doi.org/10.1093/bioinformatics/btz903


Similar Items