To Dereplicate or Not To Dereplicate?

Metagenome-assembled genomes (MAGs) expand our understanding of microbial diversity, evolution, and ecology. Concerns have been raised on how sequencing, assembly, binning, and quality assessment tools may result in MAGs that do not reflect single populations in nature. Here, we reflect on another i...

Bibliographic Details
Main authors: Evans, Jacob T., Denef, Vincent J.
Format: Online Article Text
Language: English
Published: American Society for Microbiology 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7380574/
https://www.ncbi.nlm.nih.gov/pubmed/32434845
http://dx.doi.org/10.1128/mSphere.00971-19
_version_ 1783562873753042944
author Evans, Jacob T.
Denef, Vincent J.
author_facet Evans, Jacob T.
Denef, Vincent J.
author_sort Evans, Jacob T.
collection PubMed
description Metagenome-assembled genomes (MAGs) expand our understanding of microbial diversity, evolution, and ecology. Concerns have been raised on how sequencing, assembly, binning, and quality assessment tools may result in MAGs that do not reflect single populations in nature. Here, we reflect on another issue, i.e., how to handle highly similar MAGs assembled from independent data sets. Obtaining multiple genomic representatives for a species is highly valuable, as it allows for population genomic analyses; however, when retaining genomes of closely related populations, it complicates MAG quality assessment and abundance inferences. We show that (i) published data sets contain a large fraction of MAGs sharing >99% average nucleotide identity, (ii) different software packages and parameters used to resolve this redundancy remove very different numbers of MAGs, and (iii) the removal of closely related genomes leads to losses of population-specific auxiliary genes. Finally, we highlight some approaches that can infer strain-specific dynamics across a sample series without dereplication.
format Online
Article
Text
id pubmed-7380574
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher American Society for Microbiology
record_format MEDLINE/PubMed
spelling pubmed-73805742020-07-31 To Dereplicate or Not To Dereplicate? Evans, Jacob T. Denef, Vincent J. mSphere Opinion/Hypothesis Metagenome-assembled genomes (MAGs) expand our understanding of microbial diversity, evolution, and ecology. Concerns have been raised on how sequencing, assembly, binning, and quality assessment tools may result in MAGs that do not reflect single populations in nature. Here, we reflect on another issue, i.e., how to handle highly similar MAGs assembled from independent data sets. Obtaining multiple genomic representatives for a species is highly valuable, as it allows for population genomic analyses; however, when retaining genomes of closely related populations, it complicates MAG quality assessment and abundance inferences. We show that (i) published data sets contain a large fraction of MAGs sharing >99% average nucleotide identity, (ii) different software packages and parameters used to resolve this redundancy remove very different numbers of MAGs, and (iii) the removal of closely related genomes leads to losses of population-specific auxiliary genes. Finally, we highlight some approaches that can infer strain-specific dynamics across a sample series without dereplication. American Society for Microbiology 2020-05-20 /pmc/articles/PMC7380574/ /pubmed/32434845 http://dx.doi.org/10.1128/mSphere.00971-19 Text en Copyright © 2020 Evans and Denef. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/) .
spellingShingle Opinion/Hypothesis
Evans, Jacob T.
Denef, Vincent J.
To Dereplicate or Not To Dereplicate?
title To Dereplicate or Not To Dereplicate?
topic Opinion/Hypothesis
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7380574/
https://www.ncbi.nlm.nih.gov/pubmed/32434845
http://dx.doi.org/10.1128/mSphere.00971-19
work_keys_str_mv AT evansjacobt todereplicateornottodereplicate
AT denefvincentj todereplicateornottodereplicate