
LMAS: evaluating metagenomic short de novo assembly methods through defined communities

BACKGROUND: The de novo assembly of raw sequence data is key in metagenomic analysis. It allows recovering draft genomes from a pool of mixed raw reads, yielding longer sequences that offer contextual information and provide a more complete picture of the microbial community. FINDINGS: To better com...


Bibliographic Details
Main authors: Mendes, Catarina Inês; Vila-Cerqueira, Pedro; Motro, Yair; Moran-Gilad, Jacob; Carriço, João André; Ramirez, Mário
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9795473/
https://www.ncbi.nlm.nih.gov/pubmed/36576131
http://dx.doi.org/10.1093/gigascience/giac122
author Mendes, Catarina Inês
Vila-Cerqueira, Pedro
Motro, Yair
Moran-Gilad, Jacob
Carriço, João André
Ramirez, Mário
collection PubMed
description BACKGROUND: The de novo assembly of raw sequence data is key in metagenomic analysis. It allows recovering draft genomes from a pool of mixed raw reads, yielding longer sequences that offer contextual information and provide a more complete picture of the microbial community. FINDINGS: To better compare de novo assemblers for metagenomic analysis, LMAS (Last Metagenomic Assembler Standing) was developed as a flexible platform allowing users to evaluate assembler performance given known standard communities. Overall, in our test datasets, k-mer De Bruijn graph assemblers outperformed the alternative approaches but came with a greater computational cost. Furthermore, assemblers branded as metagenomic specific did not consistently outperform other genomic assemblers in metagenomic samples. Some assemblers still in use, such as ABySS, MetaHipmer2, minia, and VelvetOptimiser, perform relatively poorly and should be used with caution when assembling complex samples. Meaningful strain resolution at the single-nucleotide polymorphism level was not achieved, even by the best assemblers tested. CONCLUSIONS: The choice of a de novo assembler depends on the computational resources available, the replicon of interest, and the major goals of the analysis. No single assembler appeared an ideal choice for short-read metagenomic prokaryote replicon assembly, each showing specific strengths. The choice of metagenomic assembler should be guided by user requirements and characteristics of the sample of interest, and LMAS provides an interactive evaluation platform for this purpose. LMAS is open source, and the workflow and its documentation are available at https://github.com/B-UMMI/LMAS and https://lmas.readthedocs.io/, respectively.
format Online
Article
Text
id pubmed-9795473
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9795473 2022-12-28 LMAS: evaluating metagenomic short de novo assembly methods through defined communities Mendes, Catarina Inês Vila-Cerqueira, Pedro Motro, Yair Moran-Gilad, Jacob Carriço, João André Ramirez, Mário Gigascience Technical Note Oxford University Press 2022-12-28 /pmc/articles/PMC9795473/ /pubmed/36576131 http://dx.doi.org/10.1093/gigascience/giac122 Text en © The Author(s) 2022. Published by Oxford University Press GigaScience. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title LMAS: evaluating metagenomic short de novo assembly methods through defined communities
topic Technical Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9795473/
https://www.ncbi.nlm.nih.gov/pubmed/36576131
http://dx.doi.org/10.1093/gigascience/giac122