Read2Tree: scalable and accurate phylogenetic trees from raw reads

The inference of phylogenetic trees is foundational to biology. However, state-of-the-art phylogenomics requires running complex pipelines, at significant computational and labour costs, with additional constraints in sequencing coverage, assembly and annotation quality. To overcome these challenges...

Bibliographic Details
Main authors: Dylus, David, Altenhoff, Adrian, Majidian, Sina, Sedlazeck, Fritz J, Dessimoz, Christophe
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9774205/
https://www.ncbi.nlm.nih.gov/pubmed/36561179
http://dx.doi.org/10.1101/2022.04.18.488678
_version_ 1784855352260427776
author Dylus, David
Altenhoff, Adrian
Majidian, Sina
Sedlazeck, Fritz J
Dessimoz, Christophe
author_facet Dylus, David
Altenhoff, Adrian
Majidian, Sina
Sedlazeck, Fritz J
Dessimoz, Christophe
author_sort Dylus, David
collection PubMed
description The inference of phylogenetic trees is foundational to biology. However, state-of-the-art phylogenomics requires running complex pipelines, at significant computational and labour costs, with additional constraints in sequencing coverage, assembly and annotation quality. To overcome these challenges, we present Read2Tree, which directly processes raw sequencing reads into groups of corresponding genes. In a benchmark encompassing a broad variety of datasets, our assembly-free approach was 10–100x faster than conventional approaches, and in most cases more accurate, the exception being when sequencing coverage was high and reference species very distant. To illustrate the broad applicability of the tool, we reconstructed a yeast tree of life of 435 species spanning 590 million years of evolution. Applied to Coronaviridae samples, Read2Tree accurately classified highly diverse animal samples and near-identical SARS-CoV-2 sequences on a single tree, thereby exhibiting remarkable breadth and depth. The speed, accuracy, and versatility of Read2Tree enable comparative genomics at scale.
format Online
Article
Text
id pubmed-9774205
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Cold Spring Harbor Laboratory
record_format MEDLINE/PubMed
spelling pubmed-9774205 2022-12-23 Read2Tree: scalable and accurate phylogenetic trees from raw reads Dylus, David Altenhoff, Adrian Majidian, Sina Sedlazeck, Fritz J Dessimoz, Christophe bioRxiv Article The inference of phylogenetic trees is foundational to biology. However, state-of-the-art phylogenomics requires running complex pipelines, at significant computational and labour costs, with additional constraints in sequencing coverage, assembly and annotation quality. To overcome these challenges, we present Read2Tree, which directly processes raw sequencing reads into groups of corresponding genes. In a benchmark encompassing a broad variety of datasets, our assembly-free approach was 10–100x faster than conventional approaches, and in most cases more accurate, the exception being when sequencing coverage was high and reference species very distant. To illustrate the broad applicability of the tool, we reconstructed a yeast tree of life of 435 species spanning 590 million years of evolution. Applied to Coronaviridae samples, Read2Tree accurately classified highly diverse animal samples and near-identical SARS-CoV-2 sequences on a single tree, thereby exhibiting remarkable breadth and depth. The speed, accuracy, and versatility of Read2Tree enable comparative genomics at scale. Cold Spring Harbor Laboratory 2022-12-13 /pmc/articles/PMC9774205/ /pubmed/36561179 http://dx.doi.org/10.1101/2022.04.18.488678 Text en https://creativecommons.org/licenses/by-nd/4.0/ This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, and only so long as attribution is given to the creator. The license allows for commercial use.
spellingShingle Article
Dylus, David
Altenhoff, Adrian
Majidian, Sina
Sedlazeck, Fritz J
Dessimoz, Christophe
Read2Tree: scalable and accurate phylogenetic trees from raw reads
title Read2Tree: scalable and accurate phylogenetic trees from raw reads
title_full Read2Tree: scalable and accurate phylogenetic trees from raw reads
title_fullStr Read2Tree: scalable and accurate phylogenetic trees from raw reads
title_full_unstemmed Read2Tree: scalable and accurate phylogenetic trees from raw reads
title_short Read2Tree: scalable and accurate phylogenetic trees from raw reads
title_sort read2tree: scalable and accurate phylogenetic trees from raw reads
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9774205/
https://www.ncbi.nlm.nih.gov/pubmed/36561179
http://dx.doi.org/10.1101/2022.04.18.488678
work_keys_str_mv AT dylusdavid read2treescalableandaccuratephylogenetictreesfromrawreads
AT altenhoffadrian read2treescalableandaccuratephylogenetictreesfromrawreads
AT majidiansina read2treescalableandaccuratephylogenetictreesfromrawreads
AT sedlazeckfritzj read2treescalableandaccuratephylogenetictreesfromrawreads
AT dessimozchristophe read2treescalableandaccuratephylogenetictreesfromrawreads