A scalable analytical approach from bacterial genomes to epidemiology
Recent years have seen a remarkable increase in the practicality of sequencing whole genomes from large numbers of bacterial isolates. The availability of this data has huge potential to deliver new insights into the evolution and epidemiology of bacterial pathogens, but the scalability of the analy...
Main authors: | Didelot, Xavier; Parkhill, Julian |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | The Royal Society, 2022 |
Subjects: | Articles |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9393561/ https://www.ncbi.nlm.nih.gov/pubmed/35989600 http://dx.doi.org/10.1098/rstb.2021.0246 |
_version_ | 1784771296240861184 |
---|---|
author | Didelot, Xavier; Parkhill, Julian |
author_sort | Didelot, Xavier |
collection | PubMed |
description | Recent years have seen a remarkable increase in the practicality of sequencing whole genomes from large numbers of bacterial isolates. The availability of this data has huge potential to deliver new insights into the evolution and epidemiology of bacterial pathogens, but the scalability of the analytical methodology has been lagging behind that of the sequencing technology. Here we present a step-by-step approach for such large-scale genomic epidemiology analyses, from bacterial genomes to epidemiological interpretations. A central component of this approach is the dated phylogeny, which is a phylogenetic tree with branch lengths measured in units of time. The construction of dated phylogenies from bacterial genomic data needs to account for the disruptive effect of recombination on phylogenetic relationships, and we describe how this can be achieved. Dated phylogenies can then be used to perform fine-scale or large-scale epidemiological analyses, depending on the proportion of cases for which genomes are available. A key feature of this approach is computational scalability and in particular the ability to process hundreds or thousands of genomes within a matter of hours. This is a clear advantage of the step-by-step approach described here. We discuss other advantages and disadvantages of the approach, as well as potential improvements and avenues for future research. This article is part of a discussion meeting issue ‘Genomic population structures of microbial pathogens’. |
format | Online Article Text |
id | pubmed-9393561 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | The Royal Society |
record_format | MEDLINE/PubMed |
spelling | pubmed-93935612022-08-30 A scalable analytical approach from bacterial genomes to epidemiology Didelot, Xavier Parkhill, Julian Philos Trans R Soc Lond B Biol Sci Articles Recent years have seen a remarkable increase in the practicality of sequencing whole genomes from large numbers of bacterial isolates. The availability of this data has huge potential to deliver new insights into the evolution and epidemiology of bacterial pathogens, but the scalability of the analytical methodology has been lagging behind that of the sequencing technology. Here we present a step-by-step approach for such large-scale genomic epidemiology analyses, from bacterial genomes to epidemiological interpretations. A central component of this approach is the dated phylogeny, which is a phylogenetic tree with branch lengths measured in units of time. The construction of dated phylogenies from bacterial genomic data needs to account for the disruptive effect of recombination on phylogenetic relationships, and we describe how this can be achieved. Dated phylogenies can then be used to perform fine-scale or large-scale epidemiological analyses, depending on the proportion of cases for which genomes are available. A key feature of this approach is computational scalability and in particular the ability to process hundreds or thousands of genomes within a matter of hours. This is a clear advantage of the step-by-step approach described here. We discuss other advantages and disadvantages of the approach, as well as potential improvements and avenues for future research. This article is part of a discussion meeting issue ‘Genomic population structures of microbial pathogens’. The Royal Society 2022-10-10 2022-08-22 /pmc/articles/PMC9393561/ /pubmed/35989600 http://dx.doi.org/10.1098/rstb.2021.0246 Text en © 2022 The Authors. 
Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited. |
title | A scalable analytical approach from bacterial genomes to epidemiology |
title_sort | scalable analytical approach from bacterial genomes to epidemiology |
topic | Articles |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9393561/ https://www.ncbi.nlm.nih.gov/pubmed/35989600 http://dx.doi.org/10.1098/rstb.2021.0246 |