Pitfalls of barcodes in the study of worldwide SARS-CoV-2 variation and phylodynamics

Analysis of SARS-CoV-2 genome variation using a minimal number of selected informative sites forming a genetic barcode presents several drawbacks. We show that purely mathematical procedures for site selection should be supervised by known phylogeny (i) to ensure that solid tree branches are repr...

Bibliographic Details
Main Authors: Pardo-Seco, Jacobo; Gómez-Carballa, Alberto; Bello, Xabier; Martinón-Torres, Federico; Salas, Antonio
Format: Online Article Text
Language: English
Published: Science Press 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7840454/
https://www.ncbi.nlm.nih.gov/pubmed/33410308
http://dx.doi.org/10.24272/j.issn.2095-8137.2020.364
_version_ 1783643577823264768
author Pardo-Seco, Jacobo
Gómez-Carballa, Alberto
Bello, Xabier
Martinón-Torres, Federico
Salas, Antonio
author_facet Pardo-Seco, Jacobo
Gómez-Carballa, Alberto
Bello, Xabier
Martinón-Torres, Federico
Salas, Antonio
author_sort Pardo-Seco, Jacobo
collection PubMed
description Analysis of SARS-CoV-2 genome variation using a minimal number of selected informative sites forming a genetic barcode presents several drawbacks. We show that purely mathematical procedures for site selection should be supervised by known phylogeny (i) to ensure that solid tree branches are represented instead of mutational hotspots with poor phylogeographic properties, and (ii) to avoid phylogenetic redundancy. We propose a procedure that prevents information redundancy in site selection by considering the cumulative informativeness of previously selected sites (as a proxy for phylogenetic-based criteria). This procedure demonstrates that, for short barcodes (e.g., 11 sites), there are thousands of informative site combinations that improve previous proposals. We also show that barcodes based on worldwide databases inevitably prioritize variants located at the basal nodes of the phylogeny, such that most representative genomes in these ancestral nodes are no longer in circulation. Consequently, coronavirus phylodynamics cannot be properly captured by universal genomic barcodes because most SARS-CoV-2 variation is generated in geographically restricted areas by the continuous introduction of domestic variants.
format Online
Article
Text
id pubmed-7840454
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Science Press
record_format MEDLINE/PubMed
spelling pubmed-78404542021-01-29 Pitfalls of barcodes in the study of worldwide SARS-CoV-2 variation and phylodynamics Pardo-Seco, Jacobo Gómez-Carballa, Alberto Bello, Xabier Martinón-Torres, Federico Salas, Antonio Zool Res Letters to the Editor Analysis of SARS-CoV-2 genome variation using a minimal number of selected informative sites forming a genetic barcode presents several drawbacks. We show that purely mathematical procedures for site selection should be supervised by known phylogeny (i) to ensure that solid tree branches are represented instead of mutational hotspots with poor phylogeographic properties, and (ii) to avoid phylogenetic redundancy. We propose a procedure that prevents information redundancy in site selection by considering the cumulative informativeness of previously selected sites (as a proxy for phylogenetic-based criteria). This procedure demonstrates that, for short barcodes (e.g., 11 sites), there are thousands of informative site combinations that improve previous proposals. We also show that barcodes based on worldwide databases inevitably prioritize variants located at the basal nodes of the phylogeny, such that most representative genomes in these ancestral nodes are no longer in circulation. Consequently, coronavirus phylodynamics cannot be properly captured by universal genomic barcodes because most SARS-CoV-2 variation is generated in geographically restricted areas by the continuous introduction of domestic variants.
Science Press 2021-01-18 /pmc/articles/PMC7840454/ /pubmed/33410308 http://dx.doi.org/10.24272/j.issn.2095-8137.2020.364 Text en Editorial Office of Zoological Research, Kunming Institute of Zoology, Chinese Academy of Sciences http://creativecommons.org/licenses/by-nc/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Letters to the Editor
Pardo-Seco, Jacobo
Gómez-Carballa, Alberto
Bello, Xabier
Martinón-Torres, Federico
Salas, Antonio
Pitfalls of barcodes in the study of worldwide SARS-CoV-2 variation and phylodynamics
title Pitfalls of barcodes in the study of worldwide SARS-CoV-2 variation and phylodynamics
title_full Pitfalls of barcodes in the study of worldwide SARS-CoV-2 variation and phylodynamics
title_fullStr Pitfalls of barcodes in the study of worldwide SARS-CoV-2 variation and phylodynamics
title_full_unstemmed Pitfalls of barcodes in the study of worldwide SARS-CoV-2 variation and phylodynamics
title_short Pitfalls of barcodes in the study of worldwide SARS-CoV-2 variation and phylodynamics
title_sort pitfalls of barcodes in the study of worldwide sars-cov-2 variation and phylodynamics
topic Letters to the Editor
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7840454/
https://www.ncbi.nlm.nih.gov/pubmed/33410308
http://dx.doi.org/10.24272/j.issn.2095-8137.2020.364
work_keys_str_mv AT pardosecojacobo pitfallsofbarcodesinthestudyofworldwidesarscov2variationandphylodynamics
AT gomezcarballaalberto pitfallsofbarcodesinthestudyofworldwidesarscov2variationandphylodynamics
AT belloxabier pitfallsofbarcodesinthestudyofworldwidesarscov2variationandphylodynamics
AT martinontorresfederico pitfallsofbarcodesinthestudyofworldwidesarscov2variationandphylodynamics
AT salasantonio pitfallsofbarcodesinthestudyofworldwidesarscov2variationandphylodynamics
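
The description field of this record mentions a site-selection procedure that avoids redundancy by taking into account the cumulative informativeness of the sites already chosen. The record does not say how informativeness is measured, so the following is only a minimal sketch under an assumption: it uses the Shannon entropy of the haplotypes defined by the selected sites as a stand-in measure, and adds sites greedily. The function names (barcode_entropy, greedy_barcode) and the toy sequences are hypothetical and are not taken from the article.

```python
from collections import Counter
from math import log2


def barcode_entropy(genomes, sites):
    """Shannon entropy (bits) of the haplotypes formed by `sites`.

    Assumption: entropy is used here as a proxy for "informativeness";
    the catalog record does not specify the authors' actual measure.
    """
    counts = Counter(tuple(g[s] for s in sites) for g in genomes)
    total = len(genomes)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def greedy_barcode(genomes, n_sites):
    """Select sites one at a time, each maximizing the joint entropy
    of the barcode built so far plus the candidate site."""
    selected = []
    all_sites = range(len(genomes[0]))
    for _ in range(min(n_sites, len(genomes[0]))):
        remaining = [s for s in all_sites if s not in selected]
        # Ties are broken in favor of the lowest site index.
        best = max(remaining,
                   key=lambda s: barcode_entropy(genomes, selected + [s]))
        selected.append(best)
    return selected


# Toy aligned sequences (hypothetical data): sites 0 and 1 carry the
# same split of the sample, site 2 is invariant, site 3 is independent.
toy = ["AAAA", "AAAT", "CCAA", "CCAT"]
print(greedy_barcode(toy, 2))  # [0, 3]
```

In this toy case, ranking sites one by one would rate sites 0, 1 and 3 equally (1 bit each) and could pick the redundant pair 0 and 1. Scoring each candidate jointly with the sites already selected skips site 1, because it adds no information once site 0 is in the barcode.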