Mandrake: visualizing microbial population structure by embedding millions of genomes into a low-dimensional representation
In less than a decade, population genomics of microbes has progressed from the effort of sequencing dozens of strains to thousands, or even tens of thousands of strains in a single study. There are now hundreds of thousands of genomes available even for a single bacterial species, and the number of...
Main Authors: | Lees, John A.; Tonkin-Hill, Gerry; Yang, Zhirong; Corander, Jukka |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | The Royal Society 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9393562/ https://www.ncbi.nlm.nih.gov/pubmed/35989601 http://dx.doi.org/10.1098/rstb.2021.0237 |
_version_ | 1784771296463159296 |
---|---|
author | Lees, John A.; Tonkin-Hill, Gerry; Yang, Zhirong; Corander, Jukka |
author_facet | Lees, John A.; Tonkin-Hill, Gerry; Yang, Zhirong; Corander, Jukka |
author_sort | Lees, John A. |
collection | PubMed |
description | In less than a decade, population genomics of microbes has progressed from the effort of sequencing dozens of strains to thousands, or even tens of thousands of strains in a single study. There are now hundreds of thousands of genomes available even for a single bacterial species, and the number of genomes is expected to continue to increase at an accelerated pace given the advances in sequencing technology and widespread genomic surveillance initiatives. This explosion of data calls for innovative methods to enable rapid exploration of the structure of a population based on different data modalities, such as multiple sequence alignments, assemblies and estimates of gene content across different genomes. Here, we present Mandrake, an efficient implementation of a dimensional reduction method tailored for the needs of large-scale population genomics. Mandrake is capable of visualizing population structure from millions of whole genomes, and we illustrate its usefulness with several datasets representing major pathogens. Our method is freely available both as an analysis pipeline (https://github.com/johnlees/mandrake) and as a browser-based interactive application (https://gtonkinhill.github.io/mandrake-web/). This article is part of a discussion meeting issue ‘Genomic population structures of microbial pathogens’. |
format | Online Article Text |
id | pubmed-9393562 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | The Royal Society |
record_format | MEDLINE/PubMed |
spelling | pubmed-9393562 2022-08-30 Mandrake: visualizing microbial population structure by embedding millions of genomes into a low-dimensional representation Lees, John A. Tonkin-Hill, Gerry Yang, Zhirong Corander, Jukka Philos Trans R Soc Lond B Biol Sci Articles In less than a decade, population genomics of microbes has progressed from the effort of sequencing dozens of strains to thousands, or even tens of thousands of strains in a single study. There are now hundreds of thousands of genomes available even for a single bacterial species, and the number of genomes is expected to continue to increase at an accelerated pace given the advances in sequencing technology and widespread genomic surveillance initiatives. This explosion of data calls for innovative methods to enable rapid exploration of the structure of a population based on different data modalities, such as multiple sequence alignments, assemblies and estimates of gene content across different genomes. Here, we present Mandrake, an efficient implementation of a dimensional reduction method tailored for the needs of large-scale population genomics. Mandrake is capable of visualizing population structure from millions of whole genomes, and we illustrate its usefulness with several datasets representing major pathogens. Our method is freely available both as an analysis pipeline (https://github.com/johnlees/mandrake) and as a browser-based interactive application (https://gtonkinhill.github.io/mandrake-web/). This article is part of a discussion meeting issue ‘Genomic population structures of microbial pathogens’. The Royal Society 2022-10-10 2022-08-22 /pmc/articles/PMC9393562/ /pubmed/35989601 http://dx.doi.org/10.1098/rstb.2021.0237 Text en © 2022 The Authors. Published by the Royal Society under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, provided the original author and source are credited. |
spellingShingle | Articles Lees, John A. Tonkin-Hill, Gerry Yang, Zhirong Corander, Jukka Mandrake: visualizing microbial population structure by embedding millions of genomes into a low-dimensional representation |
title | Mandrake: visualizing microbial population structure by embedding millions of genomes into a low-dimensional representation |
title_full | Mandrake: visualizing microbial population structure by embedding millions of genomes into a low-dimensional representation |
title_fullStr | Mandrake: visualizing microbial population structure by embedding millions of genomes into a low-dimensional representation |
title_full_unstemmed | Mandrake: visualizing microbial population structure by embedding millions of genomes into a low-dimensional representation |
title_short | Mandrake: visualizing microbial population structure by embedding millions of genomes into a low-dimensional representation |
title_sort | mandrake: visualizing microbial population structure by embedding millions of genomes into a low-dimensional representation |
topic | Articles |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9393562/ https://www.ncbi.nlm.nih.gov/pubmed/35989601 http://dx.doi.org/10.1098/rstb.2021.0237 |
work_keys_str_mv | AT leesjohna mandrakevisualizingmicrobialpopulationstructurebyembeddingmillionsofgenomesintoalowdimensionalrepresentation AT tonkinhillgerry mandrakevisualizingmicrobialpopulationstructurebyembeddingmillionsofgenomesintoalowdimensionalrepresentation AT yangzhirong mandrakevisualizingmicrobialpopulationstructurebyembeddingmillionsofgenomesintoalowdimensionalrepresentation AT coranderjukka mandrakevisualizingmicrobialpopulationstructurebyembeddingmillionsofgenomesintoalowdimensionalrepresentation |