MaxBin: an automated binning method to recover individual genomes from metagenomes using an expectation-maximization algorithm

BACKGROUND: Recovering individual genomes from metagenomic datasets allows access to uncultivated microbial populations that may have important roles in natural and engineered ecosystems. Understanding the roles of these uncultivated populations has broad application in ecology, evolution, biotechno...

Bibliographic Details
Main authors: Wu, Yu-Wei; Tang, Yung-Hsu; Tringe, Susannah G; Simmons, Blake A; Singer, Steven W
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects: Methodology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4129434/
https://www.ncbi.nlm.nih.gov/pubmed/25136443
http://dx.doi.org/10.1186/2049-2618-2-26
author Wu, Yu-Wei
Tang, Yung-Hsu
Tringe, Susannah G
Simmons, Blake A
Singer, Steven W
collection PubMed
description BACKGROUND: Recovering individual genomes from metagenomic datasets allows access to uncultivated microbial populations that may have important roles in natural and engineered ecosystems. Understanding the roles of these uncultivated populations has broad application in ecology, evolution, biotechnology and medicine. Accurate binning of assembled metagenomic sequences is an essential step in recovering the genomes and understanding microbial functions.
RESULTS: We have developed a binning algorithm, MaxBin, which automates the binning of assembled metagenomic scaffolds using an expectation-maximization algorithm after the assembly of metagenomic sequencing reads. Binning of simulated metagenomic datasets demonstrated that MaxBin had high levels of accuracy in binning microbial genomes. MaxBin was used to recover genomes from metagenomic data obtained through the Human Microbiome Project, which demonstrated its ability to recover genomes from real metagenomic datasets with variable sequencing coverages. Application of MaxBin to metagenomes obtained from microbial consortia adapted to grow on cellulose allowed genomic analysis of new, uncultivated, cellulolytic bacterial populations, including an abundant myxobacterial population distantly related to Sorangium cellulosum that possessed a much smaller genome (5 MB versus 13 to 14 MB) but has a more extensive set of genes for biomass deconstruction. For the cellulolytic consortia, the MaxBin results were compared to binning using emergent self-organizing maps (ESOMs) and differential coverage binning, demonstrating that it performed comparably to these methods but had distinct advantages in automation, resolution of related genomes and sensitivity.
CONCLUSIONS: The automatic binning software that we developed successfully classifies assembled sequences in metagenomic datasets into recovered individual genomes. The isolation of dozens of species in cellulolytic microbial consortia, including a novel species of myxobacteria that has the smallest genome among all sequenced aerobic myxobacteria, was easily achieved using the binning software. This work demonstrates that the processes required for recovering genomes from assembled metagenomic datasets can be readily automated, an important advance in understanding the metabolic potential of microbes in natural environments. MaxBin is available at https://sourceforge.net/projects/maxbin/.
format Online
Article
Text
id pubmed-4129434
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4129434 2014-08-18
Microbiome
Methodology
BioMed Central 2014-08-01 /pmc/articles/PMC4129434/ /pubmed/25136443 http://dx.doi.org/10.1186/2049-2618-2-26 Text en
Copyright © 2014 Wu et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title MaxBin: an automated binning method to recover individual genomes from metagenomes using an expectation-maximization algorithm
topic Methodology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4129434/
https://www.ncbi.nlm.nih.gov/pubmed/25136443
http://dx.doi.org/10.1186/2049-2618-2-26