Binnacle: Using Scaffolds to Improve the Contiguity and Quality of Metagenomic Bins

High-throughput sequencing has revolutionized the field of microbiology; however, reconstructing complete genomes of organisms from whole metagenomic shotgun sequencing data remains a challenge. Recovered genomes are often highly fragmented, due to uneven abundances of organisms, repeats within and...

Bibliographic Details
Main Authors: Muralidharan, Harihara Subrahmaniam; Shah, Nidhi; Meisel, Jacquelyn S.; Pop, Mihai
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7945042/
https://www.ncbi.nlm.nih.gov/pubmed/33717033
http://dx.doi.org/10.3389/fmicb.2021.638561
_version_ 1783662786311618560
author Muralidharan, Harihara Subrahmaniam
Shah, Nidhi
Meisel, Jacquelyn S.
Pop, Mihai
author_facet Muralidharan, Harihara Subrahmaniam
Shah, Nidhi
Meisel, Jacquelyn S.
Pop, Mihai
author_sort Muralidharan, Harihara Subrahmaniam
collection PubMed
description High-throughput sequencing has revolutionized the field of microbiology; however, reconstructing complete genomes of organisms from whole metagenomic shotgun sequencing data remains a challenge. Recovered genomes are often highly fragmented, due to uneven abundances of organisms, repeats within and across genomes, sequencing errors, and strain-level variation. To address the fragmented nature of metagenomic assemblies, scientists rely on a process called binning, which clusters together contigs inferred to originate from the same organism. Existing binning algorithms use oligonucleotide frequencies and contig abundance (coverage) within and across samples to group together contigs from the same organism. However, these algorithms often miss short contigs and contigs from regions with unusual coverage or DNA composition characteristics, such as mobile elements. Here, we propose that information from assembly graphs can assist current strategies for metagenomic binning. We use MetaCarvel, a metagenomic scaffolding tool, to construct assembly graphs where contigs are nodes and edges are inferred based on paired-end reads. We developed a tool, Binnacle, that extracts information from the assembly graphs and clusters scaffolds into comprehensive bins. Binnacle also provides wrapper scripts to integrate with existing binning methods. The Binnacle pipeline can be found on GitHub (https://github.com/marbl/binnacle). We show that binning graph-based scaffolds, rather than contigs, improves the contiguity and quality of the resulting bins, and captures a broader set of the genes of the organisms being reconstructed.
format Online
Article
Text
id pubmed-7945042
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-79450422021-03-11 Binnacle: Using Scaffolds to Improve the Contiguity and Quality of Metagenomic Bins Muralidharan, Harihara Subrahmaniam Shah, Nidhi Meisel, Jacquelyn S. Pop, Mihai Front Microbiol Microbiology High-throughput sequencing has revolutionized the field of microbiology; however, reconstructing complete genomes of organisms from whole metagenomic shotgun sequencing data remains a challenge. Recovered genomes are often highly fragmented, due to uneven abundances of organisms, repeats within and across genomes, sequencing errors, and strain-level variation. To address the fragmented nature of metagenomic assemblies, scientists rely on a process called binning, which clusters together contigs inferred to originate from the same organism. Existing binning algorithms use oligonucleotide frequencies and contig abundance (coverage) within and across samples to group together contigs from the same organism. However, these algorithms often miss short contigs and contigs from regions with unusual coverage or DNA composition characteristics, such as mobile elements. Here, we propose that information from assembly graphs can assist current strategies for metagenomic binning. We use MetaCarvel, a metagenomic scaffolding tool, to construct assembly graphs where contigs are nodes and edges are inferred based on paired-end reads. We developed a tool, Binnacle, that extracts information from the assembly graphs and clusters scaffolds into comprehensive bins. Binnacle also provides wrapper scripts to integrate with existing binning methods. The Binnacle pipeline can be found on GitHub (https://github.com/marbl/binnacle). We show that binning graph-based scaffolds, rather than contigs, improves the contiguity and quality of the resulting bins, and captures a broader set of the genes of the organisms being reconstructed. Frontiers Media S.A.
2021-02-24 /pmc/articles/PMC7945042/ /pubmed/33717033 http://dx.doi.org/10.3389/fmicb.2021.638561 Text en Copyright © 2021 Muralidharan, Shah, Meisel and Pop. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Microbiology
Muralidharan, Harihara Subrahmaniam
Shah, Nidhi
Meisel, Jacquelyn S.
Pop, Mihai
Binnacle: Using Scaffolds to Improve the Contiguity and Quality of Metagenomic Bins
title Binnacle: Using Scaffolds to Improve the Contiguity and Quality of Metagenomic Bins
title_full Binnacle: Using Scaffolds to Improve the Contiguity and Quality of Metagenomic Bins
title_fullStr Binnacle: Using Scaffolds to Improve the Contiguity and Quality of Metagenomic Bins
title_full_unstemmed Binnacle: Using Scaffolds to Improve the Contiguity and Quality of Metagenomic Bins
title_short Binnacle: Using Scaffolds to Improve the Contiguity and Quality of Metagenomic Bins
title_sort binnacle: using scaffolds to improve the contiguity and quality of metagenomic bins
topic Microbiology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7945042/
https://www.ncbi.nlm.nih.gov/pubmed/33717033
http://dx.doi.org/10.3389/fmicb.2021.638561
work_keys_str_mv AT muralidharanhariharasubrahmaniam binnacleusingscaffoldstoimprovethecontiguityandqualityofmetagenomicbins
AT shahnidhi binnacleusingscaffoldstoimprovethecontiguityandqualityofmetagenomicbins
AT meiseljacquelyns binnacleusingscaffoldstoimprovethecontiguityandqualityofmetagenomicbins
AT popmihai binnacleusingscaffoldstoimprovethecontiguityandqualityofmetagenomicbins