matOptimize: a parallel tree optimization method enables online phylogenetics for SARS-CoV-2

Bibliographic Details
Main Authors: Ye, Cheng; Thornlow, Bryan; Hinrichs, Angie; Kramer, Alexander; Mirchandani, Cade; Torvi, Devika; Lanfear, Robert; Corbett-Detig, Russell; Turakhia, Yatish
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9344837/
https://www.ncbi.nlm.nih.gov/pubmed/35731204
http://dx.doi.org/10.1093/bioinformatics/btac401
author Ye, Cheng
Thornlow, Bryan
Hinrichs, Angie
Kramer, Alexander
Mirchandani, Cade
Torvi, Devika
Lanfear, Robert
Corbett-Detig, Russell
Turakhia, Yatish
collection PubMed
description MOTIVATION: Phylogenetic tree optimization is necessary for precise analysis of evolutionary and transmission dynamics, but existing tools are inadequate for handling the scale and pace of data produced during the coronavirus disease 2019 (COVID-19) pandemic. One transformative approach, online phylogenetics, aims to incrementally add samples to an ever-growing phylogeny, but there are no previously existing approaches that can efficiently optimize this vast phylogeny under the time constraints of the pandemic. RESULTS: Here, we present matOptimize, a fast and memory-efficient phylogenetic tree optimization tool based on parsimony that can be parallelized across multiple CPU threads and nodes, and provides orders of magnitude improvement in runtime and peak memory usage compared to existing state-of-the-art methods. We have developed this method particularly to address the pressing need during the COVID-19 pandemic for daily maintenance and optimization of a comprehensive SARS-CoV-2 phylogeny. matOptimize is currently helping refine on a daily basis possibly the largest-ever phylogenetic tree, containing millions of SARS-CoV-2 sequences. AVAILABILITY AND IMPLEMENTATION: The matOptimize code is freely available as part of the UShER package (https://github.com/yatisht/usher) and can also be installed via bioconda (https://bioconda.github.io/recipes/usher/README.html). All scripts we used to perform the experiments in this manuscript are available at https://github.com/yceh/matOptimize-experiments. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-9344837
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9344837 2022-08-03 Bioinformatics Original Papers
Oxford University Press 2022-06-22 /pmc/articles/PMC9344837/ /pubmed/35731204 http://dx.doi.org/10.1093/bioinformatics/btac401 Text en © The Author(s) 2022. Published by Oxford University Press. https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title matOptimize: a parallel tree optimization method enables online phylogenetics for SARS-CoV-2
topic Original Papers