
ClusTRace, a bioinformatic pipeline for analyzing clusters in virus phylogenies


Bibliographic Details
Main authors: Plyusnin, Ilya; Truong Nguyen, Phuoc Thien; Sironen, Tarja; Vapalahti, Olli; Smura, Teemu; Kant, Ravi
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9143711/
https://www.ncbi.nlm.nih.gov/pubmed/35643449
http://dx.doi.org/10.1186/s12859-022-04709-8
author Plyusnin, Ilya
Truong Nguyen, Phuoc Thien
Sironen, Tarja
Vapalahti, Olli
Smura, Teemu
Kant, Ravi
author_facet Plyusnin, Ilya
Truong Nguyen, Phuoc Thien
Sironen, Tarja
Vapalahti, Olli
Smura, Teemu
Kant, Ravi
author_sort Plyusnin, Ilya
collection PubMed
description BACKGROUND: SARS-CoV-2 is the highly transmissible etiologic agent of coronavirus disease 2019 (COVID-19) and has been a global scientific and public health challenge since December 2019. Several new variants of SARS-CoV-2 have emerged globally, raising concern about prevention and treatment of COVID-19. Early detection and in-depth analysis of emerging variants, which allow pre-emptive alerts and mitigation efforts, are thus of paramount importance. RESULTS: Here we present ClusTRace, a novel bioinformatic pipeline for fast and scalable analysis of sequence clusters or clades in large viral phylogenies. ClusTRace offers several high-level functionalities, including lineage assignment, outlier filtering, alignment, phylogenetic tree reconstruction, cluster extraction, variant calling, visualization and reporting. ClusTRace was developed as an aid for COVID-19 transmission chain tracing in Finland, with the main emphasis on fast screening of phylogenies for markers of super-spreading events and other features of concern, such as high rates of cluster growth and/or accumulation of novel mutations. CONCLUSIONS: ClusTRace provides an effective interface that can significantly cut down learning and operating costs related to complex bioinformatic analysis of large viral sequence sets and phylogenies. All code is freely available from https://bitbucket.org/plyusnin/clustrace/. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-04709-8.
format Online
Article
Text
id pubmed-9143711
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9143711 2022-05-30 ClusTRace, a bioinformatic pipeline for analyzing clusters in virus phylogenies Plyusnin, Ilya; Truong Nguyen, Phuoc Thien; Sironen, Tarja; Vapalahti, Olli; Smura, Teemu; Kant, Ravi BMC Bioinformatics Software BACKGROUND: SARS-CoV-2 is the highly transmissible etiologic agent of coronavirus disease 2019 (COVID-19) and has been a global scientific and public health challenge since December 2019. Several new variants of SARS-CoV-2 have emerged globally, raising concern about prevention and treatment of COVID-19. Early detection and in-depth analysis of emerging variants, which allow pre-emptive alerts and mitigation efforts, are thus of paramount importance. RESULTS: Here we present ClusTRace, a novel bioinformatic pipeline for fast and scalable analysis of sequence clusters or clades in large viral phylogenies. ClusTRace offers several high-level functionalities, including lineage assignment, outlier filtering, alignment, phylogenetic tree reconstruction, cluster extraction, variant calling, visualization and reporting. ClusTRace was developed as an aid for COVID-19 transmission chain tracing in Finland, with the main emphasis on fast screening of phylogenies for markers of super-spreading events and other features of concern, such as high rates of cluster growth and/or accumulation of novel mutations. CONCLUSIONS: ClusTRace provides an effective interface that can significantly cut down learning and operating costs related to complex bioinformatic analysis of large viral sequence sets and phylogenies. All code is freely available from https://bitbucket.org/plyusnin/clustrace/. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-04709-8.
BioMed Central 2022-05-28 /pmc/articles/PMC9143711/ /pubmed/35643449 http://dx.doi.org/10.1186/s12859-022-04709-8 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Software
Plyusnin, Ilya
Truong Nguyen, Phuoc Thien
Sironen, Tarja
Vapalahti, Olli
Smura, Teemu
Kant, Ravi
ClusTRace, a bioinformatic pipeline for analyzing clusters in virus phylogenies
title ClusTRace, a bioinformatic pipeline for analyzing clusters in virus phylogenies
title_full ClusTRace, a bioinformatic pipeline for analyzing clusters in virus phylogenies
title_fullStr ClusTRace, a bioinformatic pipeline for analyzing clusters in virus phylogenies
title_full_unstemmed ClusTRace, a bioinformatic pipeline for analyzing clusters in virus phylogenies
title_short ClusTRace, a bioinformatic pipeline for analyzing clusters in virus phylogenies
title_sort clustrace, a bioinformatic pipeline for analyzing clusters in virus phylogenies
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9143711/
https://www.ncbi.nlm.nih.gov/pubmed/35643449
http://dx.doi.org/10.1186/s12859-022-04709-8