Evaluating metagenomics tools for genome binning with real metagenomic datasets and CAMI datasets
BACKGROUND: Shotgun metagenomics based on untargeted sequencing can explore the taxonomic profile and the function of unknown microorganisms in samples, and compensate for the shortcomings of amplicon sequencing. Binning assembled sequences into individual groups, which represent microbial genomes, is the k...
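Main Authors: | Yue, Yi; Huang, Hao; Qi, Zhao; Dou, Hui-Min; Liu, Xin-Yi; Han, Tian-Fei; Chen, Yue; Song, Xiang-Jun; Zhang, You-Hua; Tu, Jian |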
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7469296/ https://www.ncbi.nlm.nih.gov/pubmed/32723290 http://dx.doi.org/10.1186/s12859-020-03667-3 |
_version_ | 1783578396520873984 |
---|---|
author | Yue, Yi Huang, Hao Qi, Zhao Dou, Hui-Min Liu, Xin-Yi Han, Tian-Fei Chen, Yue Song, Xiang-Jun Zhang, You-Hua Tu, Jian |
author_facet | Yue, Yi Huang, Hao Qi, Zhao Dou, Hui-Min Liu, Xin-Yi Han, Tian-Fei Chen, Yue Song, Xiang-Jun Zhang, You-Hua Tu, Jian |
author_sort | Yue, Yi |
collection | PubMed |
description | BACKGROUND: Shotgun metagenomics based on untargeted sequencing can explore the taxonomic profile and the function of unknown microorganisms in samples, and compensate for the shortcomings of amplicon sequencing. Binning assembled sequences into individual groups, which represent microbial genomes, is the key step and a major challenge in metagenomic research. Both supervised and unsupervised machine learning methods have been employed in binning. Genome binning, an unsupervised approach, clusters contigs into individual genome bins by machine learning methods without the assistance of any reference databases. Many genome binning tools have emerged so far. Evaluating these genome binning tools is of great significance to microbiological research. In this study, we evaluate 15 genome binning tools, comprising 12 original binning tools and 3 refining binning tools, by comparing their performance on chicken gut metagenomic datasets and the first CAMI challenge datasets. RESULTS: For the chicken gut metagenomic datasets, the original genome binners MetaBat, Groopm2 and Autometa performed better than the other original binners, and MetaWrap, which combined their binning results, generated the most high-quality genome bins. For the CAMI datasets, Groopm2 achieved the highest purity (> 0.9) with good completeness (> 0.8), and reconstructed the most high-quality genome bins among the original genome binners. Compared with Groopm2, MetaBat2 had similar performance, with higher completeness and lower purity. The genome refining binner DASTool predicted the most high-quality genome bins among all genome binners. Most genome binners performed well for unique strains. Nonetheless, reconstructing common strains is still a substantial challenge for all genome binners. 
CONCLUSIONS: In conclusion, we tested a set of currently available, state-of-the-art metagenomics hybrid binning tools and provided a guide for selecting tools for metagenomic binning by comparing the range of purity, completeness, adjusted Rand index, and the number of high-quality reconstructed bins. Furthermore, information useful for future binning strategies was summarized. |
format | Online Article Text |
id | pubmed-7469296 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-74692962020-09-03 Evaluating metagenomics tools for genome binning with real metagenomic datasets and CAMI datasets Yue, Yi Huang, Hao Qi, Zhao Dou, Hui-Min Liu, Xin-Yi Han, Tian-Fei Chen, Yue Song, Xiang-Jun Zhang, You-Hua Tu, Jian BMC Bioinformatics Research Article BACKGROUND: Shotgun metagenomics based on untargeted sequencing can explore the taxonomic profile and the function of unknown microorganisms in samples, and compensate for the shortcomings of amplicon sequencing. Binning assembled sequences into individual groups, which represent microbial genomes, is the key step and a major challenge in metagenomic research. Both supervised and unsupervised machine learning methods have been employed in binning. Genome binning, an unsupervised approach, clusters contigs into individual genome bins by machine learning methods without the assistance of any reference databases. Many genome binning tools have emerged so far. Evaluating these genome binning tools is of great significance to microbiological research. In this study, we evaluate 15 genome binning tools, comprising 12 original binning tools and 3 refining binning tools, by comparing their performance on chicken gut metagenomic datasets and the first CAMI challenge datasets. RESULTS: For the chicken gut metagenomic datasets, the original genome binners MetaBat, Groopm2 and Autometa performed better than the other original binners, and MetaWrap, which combined their binning results, generated the most high-quality genome bins. For the CAMI datasets, Groopm2 achieved the highest purity (> 0.9) with good completeness (> 0.8), and reconstructed the most high-quality genome bins among the original genome binners. Compared with Groopm2, MetaBat2 had similar performance, with higher completeness and lower purity. The genome refining binner DASTool predicted the most high-quality genome bins among all genome binners. Most genome binners performed well for unique strains. 
Nonetheless, reconstructing common strains is still a substantial challenge for all genome binners. CONCLUSIONS: In conclusion, we tested a set of currently available, state-of-the-art metagenomics hybrid binning tools and provided a guide for selecting tools for metagenomic binning by comparing the range of purity, completeness, adjusted Rand index, and the number of high-quality reconstructed bins. Furthermore, information useful for future binning strategies was summarized. BioMed Central 2020-07-28 /pmc/articles/PMC7469296/ /pubmed/32723290 http://dx.doi.org/10.1186/s12859-020-03667-3 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Article Yue, Yi Huang, Hao Qi, Zhao Dou, Hui-Min Liu, Xin-Yi Han, Tian-Fei Chen, Yue Song, Xiang-Jun Zhang, You-Hua Tu, Jian Evaluating metagenomics tools for genome binning with real metagenomic datasets and CAMI datasets |
title | Evaluating metagenomics tools for genome binning with real metagenomic datasets and CAMI datasets |
title_full | Evaluating metagenomics tools for genome binning with real metagenomic datasets and CAMI datasets |
title_fullStr | Evaluating metagenomics tools for genome binning with real metagenomic datasets and CAMI datasets |
title_full_unstemmed | Evaluating metagenomics tools for genome binning with real metagenomic datasets and CAMI datasets |
title_short | Evaluating metagenomics tools for genome binning with real metagenomic datasets and CAMI datasets |
title_sort | evaluating metagenomics tools for genome binning with real metagenomic datasets and cami datasets |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7469296/ https://www.ncbi.nlm.nih.gov/pubmed/32723290 http://dx.doi.org/10.1186/s12859-020-03667-3 |
work_keys_str_mv | AT yueyi evaluatingmetagenomicstoolsforgenomebinningwithrealmetagenomicdatasetsandcamidatasets AT huanghao evaluatingmetagenomicstoolsforgenomebinningwithrealmetagenomicdatasetsandcamidatasets AT qizhao evaluatingmetagenomicstoolsforgenomebinningwithrealmetagenomicdatasetsandcamidatasets AT douhuimin evaluatingmetagenomicstoolsforgenomebinningwithrealmetagenomicdatasetsandcamidatasets AT liuxinyi evaluatingmetagenomicstoolsforgenomebinningwithrealmetagenomicdatasetsandcamidatasets AT hantianfei evaluatingmetagenomicstoolsforgenomebinningwithrealmetagenomicdatasetsandcamidatasets AT chenyue evaluatingmetagenomicstoolsforgenomebinningwithrealmetagenomicdatasetsandcamidatasets AT songxiangjun evaluatingmetagenomicstoolsforgenomebinningwithrealmetagenomicdatasetsandcamidatasets AT zhangyouhua evaluatingmetagenomicstoolsforgenomebinningwithrealmetagenomicdatasetsandcamidatasets AT tujian evaluatingmetagenomicstoolsforgenomebinningwithrealmetagenomicdatasetsandcamidatasets |
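
The description above names purity, completeness, and the adjusted Rand index as comparison metrics. The following is a minimal sketch of how such metrics are commonly computed from a contig-to-genome ground truth and a contig-to-bin assignment. It is not the evaluation code used in the article: the function names and input dictionaries are hypothetical, and contigs are weighted by count rather than by base pairs, which is a simplifying assumption.

```python
from collections import Counter
from math import comb


def bin_purity_completeness(true_genome, predicted_bin):
    """Return {bin_id: (purity, completeness)}, counting contigs (assumption:
    every contig has equal weight; real evaluations often weight by length)."""
    genome_sizes = Counter(true_genome.values())
    members = {}
    for contig, bin_id in predicted_bin.items():
        if contig in true_genome:  # ignore contigs without a known origin
            members.setdefault(bin_id, Counter())[true_genome[contig]] += 1
    scores = {}
    for bin_id, counts in members.items():
        # The bin is mapped to the genome contributing most of its contigs.
        genome, hits = counts.most_common(1)[0]
        purity = hits / sum(counts.values())
        completeness = hits / genome_sizes[genome]
        scores[bin_id] = (purity, completeness)
    return scores


def adjusted_rand_index(true_genome, predicted_bin):
    """Adjusted Rand index over contigs present in both mappings."""
    contigs = [c for c in predicted_bin if c in true_genome]
    total = comb(len(contigs), 2)
    if total == 0:
        return 1.0  # fewer than two contigs: treat as trivially identical
    pairs = Counter((true_genome[c], predicted_bin[c]) for c in contigs)
    rows = Counter(true_genome[c] for c in contigs)
    cols = Counter(predicted_bin[c] for c in contigs)
    sum_pairs = sum(comb(n, 2) for n in pairs.values())
    sum_rows = sum(comb(n, 2) for n in rows.values())
    sum_cols = sum(comb(n, 2) for n in cols.values())
    expected = sum_rows * sum_cols / total
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. a single genome in a single bin
    return (sum_pairs - expected) / (max_index - expected)
```

For example, with a hypothetical truth of `{"c1": "A", "c2": "A", "c3": "A", "c4": "B", "c5": "B", "c6": "B"}` and bins `{"c1": 1, "c2": 1, "c3": 2, "c4": 2, "c5": 2, "c6": 2}`, bin 1 is pure but incomplete and bin 2 is complete but contaminated, which is the purity versus completeness trade-off the description reports between Groopm2 and MetaBat2.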