
Exploring neighborhoods in large metagenome assembly graphs using spacegraphcats reveals hidden sequence diversity

Genomes computationally inferred from large metagenomic data sets are often incomplete and may be missing functionally important content and strain variation. We introduce an information retrieval system for large metagenomic data sets that exploits the sparsity of DNA assembly graphs to efficiently...


Bibliographic Details
Main authors: Brown, C. Titus, Moritz, Dominik, O’Brien, Michael P., Reidl, Felix, Reiter, Taylor, Sullivan, Blair D.
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7336657/
https://www.ncbi.nlm.nih.gov/pubmed/32631445
http://dx.doi.org/10.1186/s13059-020-02066-4
_version_ 1783554361823068160
author Brown, C. Titus
Moritz, Dominik
O’Brien, Michael P.
Reidl, Felix
Reiter, Taylor
Sullivan, Blair D.
author_facet Brown, C. Titus
Moritz, Dominik
O’Brien, Michael P.
Reidl, Felix
Reiter, Taylor
Sullivan, Blair D.
author_sort Brown, C. Titus
collection PubMed
description Genomes computationally inferred from large metagenomic data sets are often incomplete and may be missing functionally important content and strain variation. We introduce an information retrieval system for large metagenomic data sets that exploits the sparsity of DNA assembly graphs to efficiently extract subgraphs surrounding an inferred genome. We apply this system to recover missing content from genome bins and show that substantial genomic sequence variation is present in a real metagenome. Our software implementation is available at https://github.com/spacegraphcats/spacegraphcats under the 3-Clause BSD License.
format Online
Article
Text
id pubmed-7336657
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-73366572020-07-08 Exploring neighborhoods in large metagenome assembly graphs using spacegraphcats reveals hidden sequence diversity Brown, C. Titus Moritz, Dominik O’Brien, Michael P. Reidl, Felix Reiter, Taylor Sullivan, Blair D. Genome Biol Method Genomes computationally inferred from large metagenomic data sets are often incomplete and may be missing functionally important content and strain variation. We introduce an information retrieval system for large metagenomic data sets that exploits the sparsity of DNA assembly graphs to efficiently extract subgraphs surrounding an inferred genome. We apply this system to recover missing content from genome bins and show that substantial genomic sequence variation is present in a real metagenome. Our software implementation is available at https://github.com/spacegraphcats/spacegraphcats under the 3-Clause BSD License. BioMed Central 2020-07-06 /pmc/articles/PMC7336657/ /pubmed/32631445 http://dx.doi.org/10.1186/s13059-020-02066-4 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Method
Brown, C. Titus
Moritz, Dominik
O’Brien, Michael P.
Reidl, Felix
Reiter, Taylor
Sullivan, Blair D.
Exploring neighborhoods in large metagenome assembly graphs using spacegraphcats reveals hidden sequence diversity
title Exploring neighborhoods in large metagenome assembly graphs using spacegraphcats reveals hidden sequence diversity
title_full Exploring neighborhoods in large metagenome assembly graphs using spacegraphcats reveals hidden sequence diversity
title_fullStr Exploring neighborhoods in large metagenome assembly graphs using spacegraphcats reveals hidden sequence diversity
title_full_unstemmed Exploring neighborhoods in large metagenome assembly graphs using spacegraphcats reveals hidden sequence diversity
title_short Exploring neighborhoods in large metagenome assembly graphs using spacegraphcats reveals hidden sequence diversity
title_sort exploring neighborhoods in large metagenome assembly graphs using spacegraphcats reveals hidden sequence diversity
topic Method
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7336657/
https://www.ncbi.nlm.nih.gov/pubmed/32631445
http://dx.doi.org/10.1186/s13059-020-02066-4
work_keys_str_mv AT brownctitus exploringneighborhoodsinlargemetagenomeassemblygraphsusingspacegraphcatsrevealshiddensequencediversity
AT moritzdominik exploringneighborhoodsinlargemetagenomeassemblygraphsusingspacegraphcatsrevealshiddensequencediversity
AT obrienmichaelp exploringneighborhoodsinlargemetagenomeassemblygraphsusingspacegraphcatsrevealshiddensequencediversity
AT reidlfelix exploringneighborhoodsinlargemetagenomeassemblygraphsusingspacegraphcatsrevealshiddensequencediversity
AT reitertaylor exploringneighborhoodsinlargemetagenomeassemblygraphsusingspacegraphcatsrevealshiddensequencediversity
AT sullivanblaird exploringneighborhoodsinlargemetagenomeassemblygraphsusingspacegraphcatsrevealshiddensequencediversity