Metagenomic binning with assembly graph embeddings

Bibliographic Details
Main authors: Lamurias, Andre; Sereika, Mantas; Albertsen, Mads; Hose, Katja; Nielsen, Thomas Dyhre
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9525014/
https://www.ncbi.nlm.nih.gov/pubmed/35972375
http://dx.doi.org/10.1093/bioinformatics/btac557
collection PubMed
description MOTIVATION: Despite recent advancements in sequencing technologies and assembly methods, obtaining high-quality microbial genomes from metagenomic samples is still not a trivial task. Current metagenomic binners do not take full advantage of assembly graphs and are not optimized for long-read assemblies. Deep graph learning algorithms have been proposed in other fields to deal with complex graph data structures. The graph structure generated during the assembly process could be integrated with contig features to obtain better bins with deep learning. RESULTS: We propose GraphMB, which uses graph neural networks to incorporate the assembly graph into the binning process. We test GraphMB on long-read datasets of different complexities, and compare the performance with other binners in terms of the number of High Quality (HQ) genome bins obtained. With our approach, we were able to obtain unique bins on all real datasets, and obtain more bins on most datasets. In particular, we obtained on average 17.5% more HQ bins when compared with state-of-the-art binners and 13.7% when aggregating the results of our binner with the others. These results indicate that a deep learning model can integrate contig-specific and graph-structure information to improve metagenomic binning. AVAILABILITY AND IMPLEMENTATION: GraphMB is available from https://github.com/MicrobialDarkMatter/GraphMB. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
id pubmed-9525014
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-9525014 2022-10-03 Bioinformatics Original Papers Oxford University Press 2022-08-16 /pmc/articles/PMC9525014/ /pubmed/35972375 http://dx.doi.org/10.1093/bioinformatics/btac557 Text en © The Author(s) 2022. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
topic Original Papers