MSPminer: abundance-based reconstitution of microbial pan-genomes from shotgun metagenomic data


Bibliographic Details
Main authors: Plaza Oñate, Florian; Le Chatelier, Emmanuelle; Almeida, Mathieu; Cervino, Alessandra C L; Gauthier, Franck; Magoulès, Frédéric; Ehrlich, S Dusko; Pichaud, Matthieu
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6499236/
https://www.ncbi.nlm.nih.gov/pubmed/30252023
http://dx.doi.org/10.1093/bioinformatics/bty830
author Plaza Oñate, Florian
Le Chatelier, Emmanuelle
Almeida, Mathieu
Cervino, Alessandra C L
Gauthier, Franck
Magoulès, Frédéric
Ehrlich, S Dusko
Pichaud, Matthieu
collection PubMed
description MOTIVATION: Analysis toolkits for shotgun metagenomic data achieve strain-level characterization of complex microbial communities by capturing intra-species gene content variation. Yet, these tools are hampered by the extent of reference genomes that are far from covering all microbial variability, as many species are still not sequenced or have only few strains available. Binning co-abundant genes obtained from de novo assembly is a powerful reference-free technique to discover and reconstitute gene repertoire of microbial species. While current methods accurately identify species core parts, they miss many accessory genes or split them into small gene groups that remain unassociated to core clusters. RESULTS: We introduce MSPminer, a computationally efficient software tool that reconstitutes Metagenomic Species Pan-genomes (MSPs) by binning co-abundant genes across metagenomic samples. MSPminer relies on a new robust measure of proportionality coupled with an empirical classifier to group and distinguish not only species core genes but accessory genes also. Applied to a large scale metagenomic dataset, MSPminer successfully delineates in a few hours the gene repertoires of 1661 microbial species with similar specificity and higher sensitivity than existing tools. The taxonomic annotation of MSPs reveals microorganisms hitherto unknown and brings coherence in the nomenclature of the species of the human gut microbiota. The provided MSPs can be readily used for taxonomic profiling and biomarkers discovery in human gut metagenomic samples. In addition, MSPminer can be applied on gene count tables from other ecosystems to perform similar analyses. AVAILABILITY AND IMPLEMENTATION: The binary is freely available for non-commercial users at www.enterome.com/downloads. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-6499236
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6499236 2019-05-07 MSPminer: abundance-based reconstitution of microbial pan-genomes from shotgun metagenomic data Bioinformatics Original Papers Oxford University Press 2019-05-01 2018-09-25 /pmc/articles/PMC6499236/ /pubmed/30252023 http://dx.doi.org/10.1093/bioinformatics/bty830 Text en © The Author(s) 2018. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title MSPminer: abundance-based reconstitution of microbial pan-genomes from shotgun metagenomic data
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6499236/
https://www.ncbi.nlm.nih.gov/pubmed/30252023
http://dx.doi.org/10.1093/bioinformatics/bty830
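
The description above summarizes the idea of binning co-abundant genes: genes from the same species should have counts that stay roughly proportional across metagenomic samples. The following is a minimal sketch of that general idea only. It is not MSPminer's actual proportionality measure or classifier, which the abstract does not specify. The function names, the median-ratio estimate, the tolerance `tol`, the threshold `min_support`, and the toy count table are all assumptions made for illustration.

```python
from statistics import median


def proportionality(x, y, tol=0.25):
    """Compare two gene count vectors measured on the same samples.

    Returns (ratio, support), where ratio is the median of y/x over the
    samples in which both genes are detected, and support is the fraction
    of those samples whose y/x lies within +/- tol * ratio of the median.
    Returns None if the two genes are never detected together.
    (Hypothetical measure, chosen only because a median resists outliers.)
    """
    ratios = [b / a for a, b in zip(x, y) if a > 0 and b > 0]
    if not ratios:
        return None
    r = median(ratios)
    within = sum(1 for q in ratios if abs(q - r) <= tol * r)
    return r, within / len(ratios)


def bin_with_seed(counts, seed, min_support=0.7):
    """Collect the genes whose counts are proportional to the seed gene."""
    members = []
    for gene, vec in counts.items():
        if gene == seed:
            continue
        result = proportionality(counts[seed], vec, tol=0.25)
        if result is not None and result[1] >= min_support:
            members.append(gene)
    return sorted(members)


# Toy gene count table (rows: genes, columns: five samples); values invented.
counts = {
    "gene_a": [10, 20, 0, 40, 5],
    "gene_b": [20, 40, 0, 80, 10],  # exactly 2x gene_a
    "gene_c": [5, 11, 0, 19, 30],   # about 0.5x gene_a, one outlier sample
    "gene_d": [0, 3, 50, 1, 40],    # no stable ratio with gene_a
}

binned = bin_with_seed(counts, "gene_a")
print(binned)
```

In this toy table, `gene_b` and `gene_c` are grouped with the seed while `gene_d` is not. Using a median ratio rather than a least-squares fit is one simple way to keep a single outlier sample from dominating the estimate; a real tool would also need to handle sampling noise at low counts and genes present in only a subset of strains.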