SARS-CoV-2 surveillance in Italy through phylogenomic inferences based on Hamming distances derived from pan-SNPs, -MNPs and -InDels
BACKGROUND: Faced with the ongoing global pandemic of coronavirus disease, the ‘National Reference Centre for Whole Genome Sequencing of microbial pathogens: database and bioinformatic analysis’ (GENPAT) formally established at the ‘Istituto Zooprofilattico Sperimentale dell’Abruzzo e del Molise’ (I...
Main authors: | Di Pasquale, Adriano; Radomski, Nicolas; Mangone, Iolanda; Calistri, Paolo; Lorusso, Alessio; Cammà, Cesare |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
BioMed Central
2021
|
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8556844/ https://www.ncbi.nlm.nih.gov/pubmed/34717546 http://dx.doi.org/10.1186/s12864-021-08112-0 |
_version_ | 1784592254206214144 |
---|---|
author | Di Pasquale, Adriano Radomski, Nicolas Mangone, Iolanda Calistri, Paolo Lorusso, Alessio Cammà, Cesare |
author_facet | Di Pasquale, Adriano Radomski, Nicolas Mangone, Iolanda Calistri, Paolo Lorusso, Alessio Cammà, Cesare |
author_sort | Di Pasquale, Adriano |
collection | PubMed |
description | BACKGROUND: Faced with the ongoing global pandemic of coronavirus disease, the ‘National Reference Centre for Whole Genome Sequencing of microbial pathogens: database and bioinformatic analysis’ (GENPAT) formally established at the ‘Istituto Zooprofilattico Sperimentale dell’Abruzzo e del Molise’ (IZSAM) in Teramo (Italy) is in charge of the SARS-CoV-2 surveillance at the genomic scale. In a context of SARS-CoV-2 surveillance requiring correct and fast assessment of epidemiological clusters from substantial amount of samples, the present study proposes an analytical workflow for identifying accurately the PANGO lineages of SARS-CoV-2 samples and building of discriminant minimum spanning trees (MST) bypassing the usual time consuming phylogenomic inferences based on multiple sequence alignment (MSA) and substitution model. RESULTS: GENPAT constituted two collections of SARS-CoV-2 samples. The first collection consisted of SARS-CoV-2 positive swabs collected by IZSAM from the Abruzzo region (Italy), then sequenced by next generation sequencing (NGS) and analyzed in GENPAT (n = 1592), while the second collection included samples from several Italian provinces and retrieved from the reference Global Initiative on Sharing All Influenza Data (GISAID) (n = 17,201). The main results of the present work showed that (i) GENPAT and GISAID detected the same PANGO lineages, (ii) the PANGO lineages B.1.177 (i.e. historical in Italy) and B.1.1.7 (i.e. ‘UK variant’) are major concerns today in several Italian provinces, and the new MST-based method (iii) clusters most of the PANGO lineages together, (iv) with a higher discriminatory power than PANGO lineages, (v) and faster than the usual phylogenomic methods based on MSA and substitution model. CONCLUSIONS: The genome sequencing efforts of Italian provinces, combined with a structured national system of NGS data management, provided support for SARS-CoV-2 surveillance in Italy. We propose to build phylogenomic trees of SARS-CoV-2 variants through an accurate, discriminant and fast MST-based method avoiding the typical time consuming steps related to MSA and substitution model-based phylogenomic inference. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12864-021-08112-0. |
format | Online Article Text |
id | pubmed-8556844 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-85568442021-11-01 SARS-CoV-2 surveillance in Italy through phylogenomic inferences based on Hamming distances derived from pan-SNPs, -MNPs and -InDels Di Pasquale, Adriano Radomski, Nicolas Mangone, Iolanda Calistri, Paolo Lorusso, Alessio Cammà, Cesare BMC Genomics Research Article BACKGROUND: Faced with the ongoing global pandemic of coronavirus disease, the ‘National Reference Centre for Whole Genome Sequencing of microbial pathogens: database and bioinformatic analysis’ (GENPAT) formally established at the ‘Istituto Zooprofilattico Sperimentale dell’Abruzzo e del Molise’ (IZSAM) in Teramo (Italy) is in charge of the SARS-CoV-2 surveillance at the genomic scale. In a context of SARS-CoV-2 surveillance requiring correct and fast assessment of epidemiological clusters from substantial amount of samples, the present study proposes an analytical workflow for identifying accurately the PANGO lineages of SARS-CoV-2 samples and building of discriminant minimum spanning trees (MST) bypassing the usual time consuming phylogenomic inferences based on multiple sequence alignment (MSA) and substitution model. RESULTS: GENPAT constituted two collections of SARS-CoV-2 samples. The first collection consisted of SARS-CoV-2 positive swabs collected by IZSAM from the Abruzzo region (Italy), then sequenced by next generation sequencing (NGS) and analyzed in GENPAT (n = 1592), while the second collection included samples from several Italian provinces and retrieved from the reference Global Initiative on Sharing All Influenza Data (GISAID) (n = 17,201). The main results of the present work showed that (i) GENPAT and GISAID detected the same PANGO lineages, (ii) the PANGO lineages B.1.177 (i.e. historical in Italy) and B.1.1.7 (i.e. 
‘UK variant’) are major concerns today in several Italian provinces, and the new MST-based method (iii) clusters most of the PANGO lineages together, (iv) with a higher discriminatory power than PANGO lineages, (v) and faster than the usual phylogenomic methods based on MSA and substitution model. CONCLUSIONS: The genome sequencing efforts of Italian provinces, combined with a structured national system of NGS data management, provided support for SARS-CoV-2 surveillance in Italy. We propose to build phylogenomic trees of SARS-CoV-2 variants through an accurate, discriminant and fast MST-based method avoiding the typical time consuming steps related to MSA and substitution model-based phylogenomic inference. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12864-021-08112-0. BioMed Central 2021-10-30 /pmc/articles/PMC8556844/ /pubmed/34717546 http://dx.doi.org/10.1186/s12864-021-08112-0 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Article Di Pasquale, Adriano Radomski, Nicolas Mangone, Iolanda Calistri, Paolo Lorusso, Alessio Cammà, Cesare SARS-CoV-2 surveillance in Italy through phylogenomic inferences based on Hamming distances derived from pan-SNPs, -MNPs and -InDels |
title | SARS-CoV-2 surveillance in Italy through phylogenomic inferences based on Hamming distances derived from pan-SNPs, -MNPs and -InDels |
title_full | SARS-CoV-2 surveillance in Italy through phylogenomic inferences based on Hamming distances derived from pan-SNPs, -MNPs and -InDels |
title_fullStr | SARS-CoV-2 surveillance in Italy through phylogenomic inferences based on Hamming distances derived from pan-SNPs, -MNPs and -InDels |
title_full_unstemmed | SARS-CoV-2 surveillance in Italy through phylogenomic inferences based on Hamming distances derived from pan-SNPs, -MNPs and -InDels |
title_short | SARS-CoV-2 surveillance in Italy through phylogenomic inferences based on Hamming distances derived from pan-SNPs, -MNPs and -InDels |
title_sort | sars-cov-2 surveillance in italy through phylogenomic inferences based on hamming distances derived from pan-snps, -mnps and -indels |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8556844/ https://www.ncbi.nlm.nih.gov/pubmed/34717546 http://dx.doi.org/10.1186/s12864-021-08112-0 |
work_keys_str_mv | AT dipasqualeadriano sarscov2surveillanceinitalythroughphylogenomicinferencesbasedonhammingdistancesderivedfrompansnpsmnpsandindels AT radomskinicolas sarscov2surveillanceinitalythroughphylogenomicinferencesbasedonhammingdistancesderivedfrompansnpsmnpsandindels AT mangoneiolanda sarscov2surveillanceinitalythroughphylogenomicinferencesbasedonhammingdistancesderivedfrompansnpsmnpsandindels AT calistripaolo sarscov2surveillanceinitalythroughphylogenomicinferencesbasedonhammingdistancesderivedfrompansnpsmnpsandindels AT lorussoalessio sarscov2surveillanceinitalythroughphylogenomicinferencesbasedonhammingdistancesderivedfrompansnpsmnpsandindels AT cammacesare sarscov2surveillanceinitalythroughphylogenomicinferencesbasedonhammingdistancesderivedfrompansnpsmnpsandindels |
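
The description above mentions minimum spanning trees built from Hamming distances between variant profiles. As a rough illustration of that general idea only, and not the authors' workflow or software, the following Python sketch computes pairwise Hamming distances between presence/absence vectors and links the samples with Prim's algorithm. The sample names, the profile values and the function names `hamming` and `minimum_spanning_tree` are all hypothetical, chosen for this example; how the real profiles are derived from SNPs, MNPs and InDels is not described in this record.

```python
def hamming(a, b):
    """Count the positions at which two equal-length profiles differ."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    return sum(x != y for x, y in zip(a, b))


def minimum_spanning_tree(profiles):
    """Prim's algorithm on the complete graph of pairwise Hamming distances.

    `profiles` maps a sample name to a tuple of 0/1 values.
    Returns a list of (sample_in_tree, new_sample, distance) edges.
    """
    names = sorted(profiles)
    if not names:
        return []
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Pick the cheapest edge joining the tree to a sample outside it.
        dist, u, v = min(
            (hamming(profiles[u], profiles[v]), u, v)
            for u in in_tree
            for v in names
            if v not in in_tree
        )
        edges.append((u, v, dist))
        in_tree.add(v)
    return edges


# Hypothetical presence/absence profiles (1 = variant present), for illustration only.
profiles = {
    "sample_A": (1, 0, 0, 1, 0),
    "sample_B": (1, 0, 0, 1, 1),
    "sample_C": (0, 1, 1, 0, 0),
    "sample_D": (0, 1, 1, 0, 1),
}

mst = minimum_spanning_tree(profiles)
total_weight = sum(dist for _, _, dist in mst)
for u, v, dist in mst:
    print(f"{u} -- {v}: {dist}")
```

This naive version rescans all candidate edges at every step, which is fine for a handful of samples but would be too slow for collections of the size reported in the description; a priority-queue variant or a library routine would be the usual choice there.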