metaVaR: Introducing metavariant species models for reference-free metagenomic-based population genomics
The availability of large metagenomic data offers great opportunities for the population genomic analysis of uncultured organisms, which represent a large part of the unexplored biosphere and play a key ecological role. However, the majority of these organisms lack a reference genome or transcriptom...
Main authors: | Laso-Jadart, Romuald; Ambroise, Christophe; Peterlongo, Pierre; Madoui, Mohammed-Amin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7773188/ https://www.ncbi.nlm.nih.gov/pubmed/33378381 http://dx.doi.org/10.1371/journal.pone.0244637 |
_version_ | 1783630011125727232 |
---|---|
author | Laso-Jadart, Romuald; Ambroise, Christophe; Peterlongo, Pierre; Madoui, Mohammed-Amin |
author_facet | Laso-Jadart, Romuald; Ambroise, Christophe; Peterlongo, Pierre; Madoui, Mohammed-Amin |
author_sort | Laso-Jadart, Romuald |
collection | PubMed |
description | The availability of large metagenomic data offers great opportunities for the population genomic analysis of uncultured organisms, which represent a large part of the unexplored biosphere and play a key ecological role. However, the majority of these organisms lack a reference genome or transcriptome, which constitutes a technical obstacle for classical population genomic analyses. We introduce the metavariant species (MVS) model, in which a species is represented only by intra-species nucleotide polymorphism. We designed a method combining reference-free variant calling, multiple density-based clustering and maximum-weighted independent set algorithms to cluster intra-species variants into MVSs directly from multisample metagenomic raw reads without a reference genome or read assembly. The frequencies of the MVS variants are then used to compute population genomic statistics such as F(ST), in order to estimate genomic differentiation between populations and to identify loci under natural selection. The MVS construction was tested on simulated and real metagenomic data. MVSs showed the required quality for robust population genomics and allowed an accurate estimation of genomic differentiation (ΔF(ST) < 0.0001 and <0.03 on simulated and real data respectively). Loci predicted under natural selection on real data were all detected by MVSs. MVSs represent a new paradigm that may simplify and enhance holistic approaches for population genomics and the evolution of microorganisms. |
format | Online Article Text |
id | pubmed-7773188 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-7773188 2021-01-08 metaVaR: Introducing metavariant species models for reference-free metagenomic-based population genomics Laso-Jadart, Romuald; Ambroise, Christophe; Peterlongo, Pierre; Madoui, Mohammed-Amin PLoS One Research Article Public Library of Science 2020-12-30 /pmc/articles/PMC7773188/ /pubmed/33378381 http://dx.doi.org/10.1371/journal.pone.0244637 Text en © 2020 Laso-Jadart et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Laso-Jadart, Romuald; Ambroise, Christophe; Peterlongo, Pierre; Madoui, Mohammed-Amin metaVaR: Introducing metavariant species models for reference-free metagenomic-based population genomics |
title | metaVaR: Introducing metavariant species models for reference-free metagenomic-based population genomics |
title_full | metaVaR: Introducing metavariant species models for reference-free metagenomic-based population genomics |
title_fullStr | metaVaR: Introducing metavariant species models for reference-free metagenomic-based population genomics |
title_full_unstemmed | metaVaR: Introducing metavariant species models for reference-free metagenomic-based population genomics |
title_short | metaVaR: Introducing metavariant species models for reference-free metagenomic-based population genomics |
title_sort | metavar: introducing metavariant species models for reference-free metagenomic-based population genomics |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7773188/ https://www.ncbi.nlm.nih.gov/pubmed/33378381 http://dx.doi.org/10.1371/journal.pone.0244637 |
work_keys_str_mv | AT lasojadartromuald metavarintroducingmetavariantspeciesmodelsforreferencefreemetagenomicbasedpopulationgenomics AT ambroisechristophe metavarintroducingmetavariantspeciesmodelsforreferencefreemetagenomicbasedpopulationgenomics AT peterlongopierre metavarintroducingmetavariantspeciesmodelsforreferencefreemetagenomicbasedpopulationgenomics AT madouimohammedamin metavarintroducingmetavariantspeciesmodelsforreferencefreemetagenomicbasedpopulationgenomics |
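
Illustrative note on the description field: the abstract states that variant frequencies are used to compute population genomic statistics such as F(ST). The record does not say which estimator the authors use, so the sketch below assumes Wright's classical definition for a single biallelic locus, F_ST = Var(p) / (p̄(1 − p̄)), where p is the alternate-allele frequency in each population. The function names and the read counts are hypothetical and are not taken from metaVaR.

```python
from statistics import mean, pvariance


def allele_frequency(alt_count: int, total_count: int) -> float:
    """Alternate-allele frequency at one locus in one sample (hypothetical helper)."""
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    return alt_count / total_count


def wright_fst(freqs):
    """Wright's F_ST for one biallelic locus across populations.

    Assumption: F_ST = Var(p) / (p_bar * (1 - p_bar)), using the population
    variance of the per-population frequencies. This is a textbook estimator
    chosen for illustration, not necessarily the one used by the authors.
    """
    p_bar = mean(freqs)
    if p_bar <= 0.0 or p_bar >= 1.0:
        # Locus is monomorphic across all populations: no differentiation.
        return 0.0
    return pvariance(freqs) / (p_bar * (1.0 - p_bar))


# Hypothetical read counts (alternate reads, total reads) for one variant
# observed in three samples.
counts = [(12, 40), (30, 50), (5, 25)]
freqs = [allele_frequency(alt, total) for alt, total in counts]
print(f"per-sample frequencies: {freqs}")
print(f"F_ST at this locus: {wright_fst(freqs):.4f}")
```

Values near 0 indicate similar allele frequencies across samples, and values near 1 indicate strong differentiation. A real analysis would aggregate over many loci and account for sequencing depth.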