Taxonium, a web-based tool for exploring large phylogenetic trees

Bibliographic Details
Main author: Sanderson, Theo
Format: Online Article Text
Language: English
Published: eLife Sciences Publications, Ltd 2022
Subjects: Epidemiology and Global Health
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9704803/
https://www.ncbi.nlm.nih.gov/pubmed/36377483
http://dx.doi.org/10.7554/eLife.82392
author Sanderson, Theo
collection PubMed
description The COVID-19 pandemic has resulted in a step change in the scale of sequencing data, with more genomes of SARS-CoV-2 having been sequenced than any other organism on earth. These sequences reveal key insights when represented as a phylogenetic tree, which captures the evolutionary history of the virus, and allows the identification of transmission events and the emergence of new variants. However, existing web-based tools for exploring phylogenies do not scale to the size of datasets now available for SARS-CoV-2. We have developed Taxonium, a new tool that uses WebGL to allow the exploration of trees with tens of millions of nodes in the browser for the first time. Taxonium links each node to associated metadata and supports mutation-annotated trees, which are able to capture all known genetic variation in a dataset. It can either be run entirely locally in the browser, from a server-based backend, or as a desktop application. We describe insights that analysing a tree of five million sequences can provide into SARS-CoV-2 evolution, and provide a tool at cov2tree.org for exploring a public tree of more than five million SARS-CoV-2 sequences. Taxonium can be applied to any tree, and is available at taxonium.org, with source code at github.com/theosanderson/taxonium.
format Online Article Text
id pubmed-9704803
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher eLife Sciences Publications, Ltd
record_format MEDLINE/PubMed
spelling pubmed-9704803 2022-11-29. Taxonium, a web-based tool for exploring large phylogenetic trees. Sanderson, Theo. eLife, Epidemiology and Global Health. eLife Sciences Publications, Ltd, 2022-11-15. /pmc/articles/PMC9704803/ /pubmed/36377483 http://dx.doi.org/10.7554/eLife.82392 Text, en. © 2022, Sanderson. This article is distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use and redistribution provided that the original author and source are credited.
title Taxonium, a web-based tool for exploring large phylogenetic trees
topic Epidemiology and Global Health