
SARS-CoV-2 surveillance in Italy through phylogenomic inferences based on Hamming distances derived from pan-SNPs, -MNPs and -InDels

BACKGROUND: Faced with the ongoing global pandemic of coronavirus disease, the ‘National Reference Centre for Whole Genome Sequencing of microbial pathogens: database and bioinformatic analysis’ (GENPAT) formally established at the ‘Istituto Zooprofilattico Sperimentale dell’Abruzzo e del Molise’ (I...


Bibliographic Details
Main authors: Di Pasquale, Adriano, Radomski, Nicolas, Mangone, Iolanda, Calistri, Paolo, Lorusso, Alessio, Cammà, Cesare
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8556844/
https://www.ncbi.nlm.nih.gov/pubmed/34717546
http://dx.doi.org/10.1186/s12864-021-08112-0
_version_ 1784592254206214144
author Di Pasquale, Adriano
Radomski, Nicolas
Mangone, Iolanda
Calistri, Paolo
Lorusso, Alessio
Cammà, Cesare
collection PubMed
description BACKGROUND: Faced with the ongoing global pandemic of coronavirus disease, the ‘National Reference Centre for Whole Genome Sequencing of microbial pathogens: database and bioinformatic analysis’ (GENPAT), formally established at the ‘Istituto Zooprofilattico Sperimentale dell’Abruzzo e del Molise’ (IZSAM) in Teramo (Italy), is in charge of SARS-CoV-2 surveillance at the genomic scale. Because SARS-CoV-2 surveillance requires correct and fast assessment of epidemiological clusters from a substantial number of samples, the present study proposes an analytical workflow for accurately identifying the PANGO lineages of SARS-CoV-2 samples and building discriminant minimum spanning trees (MST), bypassing the usual time-consuming phylogenomic inferences based on multiple sequence alignment (MSA) and substitution models. RESULTS: GENPAT assembled two collections of SARS-CoV-2 samples. The first collection consisted of SARS-CoV-2 positive swabs collected by IZSAM from the Abruzzo region (Italy), then sequenced by next generation sequencing (NGS) and analyzed in GENPAT (n = 1592), while the second collection included samples from several Italian provinces retrieved from the reference Global Initiative on Sharing All Influenza Data (GISAID) (n = 17,201). The main results of the present work showed that (i) GENPAT and GISAID detected the same PANGO lineages, (ii) the PANGO lineages B.1.177 (i.e. historical in Italy) and B.1.1.7 (i.e. ‘UK variant’) are major concerns today in several Italian provinces, and the new MST-based method (iii) clusters most of the PANGO lineages together, (iv) has a higher discriminatory power than PANGO lineages, and (v) is faster than the usual phylogenomic methods based on MSA and substitution models. CONCLUSIONS: The genome sequencing efforts of Italian provinces, combined with a structured national system of NGS data management, provided support for SARS-CoV-2 surveillance in Italy. We propose to build phylogenomic trees of SARS-CoV-2 variants through an accurate, discriminant and fast MST-based method that avoids the typical time-consuming steps related to MSA and substitution model-based phylogenomic inference. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12864-021-08112-0.
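The abstract names the two ingredients of the proposed tree-building step, Hamming distances over pan-variant profiles and a minimum spanning tree, without giving an implementation. A minimal Python sketch of that idea follows, assuming each sample is encoded as a binary presence/absence vector over the pooled set of SNPs, MNPs and InDels. The encoding, the function names, the sample identifiers and the choice of Prim's algorithm are illustrative assumptions, not the authors' published workflow.

```python
from itertools import combinations


def hamming(a, b):
    """Count positions at which two equal-length variant profiles differ."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    return sum(x != y for x, y in zip(a, b))


def distance_matrix(profiles):
    """Build a symmetric Hamming distance table from {sample: profile}."""
    names = sorted(profiles)
    dist = {n: {m: 0 for m in names} for n in names}
    for a, b in combinations(names, 2):
        d = hamming(profiles[a], profiles[b])
        dist[a][b] = dist[b][a] = d
    return dist


def minimum_spanning_tree(dist):
    """Prim's algorithm on a complete graph; returns (u, v, weight) edges."""
    names = sorted(dist)
    if not names:
        return []
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Pick the cheapest edge leaving the tree; ties break on sample names.
        u, v = min(
            ((u, v) for u in in_tree for v in names if v not in in_tree),
            key=lambda e: (dist[e[0]][e[1]], e),
        )
        edges.append((u, v, dist[u][v]))
        in_tree.add(v)
    return edges


if __name__ == "__main__":
    # Hypothetical profiles: 1 = variant present, 0 = absent (assumed encoding).
    profiles = {
        "sampleA": [1, 0, 0, 1],
        "sampleB": [1, 0, 0, 0],
        "sampleC": [0, 1, 1, 0],
    }
    for u, v, w in minimum_spanning_tree(distance_matrix(profiles)):
        print(u, v, w)
```

Because the distances are plain mismatch counts, no alignment or substitution model enters this step, which is the property the abstract credits for the speed of the MST-based approach. This simple version scans all candidate edges on every iteration, so a collection of the size reported above would call for a vectorised or library-based implementation.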
format Online
Article
Text
id pubmed-8556844
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8556844 2021-11-01 SARS-CoV-2 surveillance in Italy through phylogenomic inferences based on Hamming distances derived from pan-SNPs, -MNPs and -InDels Di Pasquale, Adriano Radomski, Nicolas Mangone, Iolanda Calistri, Paolo Lorusso, Alessio Cammà, Cesare BMC Genomics Research Article BACKGROUND: Faced with the ongoing global pandemic of coronavirus disease, the ‘National Reference Centre for Whole Genome Sequencing of microbial pathogens: database and bioinformatic analysis’ (GENPAT), formally established at the ‘Istituto Zooprofilattico Sperimentale dell’Abruzzo e del Molise’ (IZSAM) in Teramo (Italy), is in charge of SARS-CoV-2 surveillance at the genomic scale. Because SARS-CoV-2 surveillance requires correct and fast assessment of epidemiological clusters from a substantial number of samples, the present study proposes an analytical workflow for accurately identifying the PANGO lineages of SARS-CoV-2 samples and building discriminant minimum spanning trees (MST), bypassing the usual time-consuming phylogenomic inferences based on multiple sequence alignment (MSA) and substitution models. RESULTS: GENPAT assembled two collections of SARS-CoV-2 samples. The first collection consisted of SARS-CoV-2 positive swabs collected by IZSAM from the Abruzzo region (Italy), then sequenced by next generation sequencing (NGS) and analyzed in GENPAT (n = 1592), while the second collection included samples from several Italian provinces retrieved from the reference Global Initiative on Sharing All Influenza Data (GISAID) (n = 17,201). The main results of the present work showed that (i) GENPAT and GISAID detected the same PANGO lineages, (ii) the PANGO lineages B.1.177 (i.e. historical in Italy) and B.1.1.7 (i.e. ‘UK variant’) are major concerns today in several Italian provinces, and the new MST-based method (iii) clusters most of the PANGO lineages together, (iv) has a higher discriminatory power than PANGO lineages, and (v) is faster than the usual phylogenomic methods based on MSA and substitution models. CONCLUSIONS: The genome sequencing efforts of Italian provinces, combined with a structured national system of NGS data management, provided support for SARS-CoV-2 surveillance in Italy. We propose to build phylogenomic trees of SARS-CoV-2 variants through an accurate, discriminant and fast MST-based method that avoids the typical time-consuming steps related to MSA and substitution model-based phylogenomic inference. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12864-021-08112-0. BioMed Central 2021-10-30 /pmc/articles/PMC8556844/ /pubmed/34717546 http://dx.doi.org/10.1186/s12864-021-08112-0 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title SARS-CoV-2 surveillance in Italy through phylogenomic inferences based on Hamming distances derived from pan-SNPs, -MNPs and -InDels
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8556844/
https://www.ncbi.nlm.nih.gov/pubmed/34717546
http://dx.doi.org/10.1186/s12864-021-08112-0