Exploring neighborhoods in large metagenome assembly graphs using spacegraphcats reveals hidden sequence diversity
Genomes computationally inferred from large metagenomic data sets are often incomplete and may be missing functionally important content and strain variation. We introduce an information retrieval system for large metagenomic data sets that exploits the sparsity of DNA assembly graphs to efficiently...
Main authors: | Brown, C. Titus, Moritz, Dominik, O’Brien, Michael P., Reidl, Felix, Reiter, Taylor, Sullivan, Blair D. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7336657/ https://www.ncbi.nlm.nih.gov/pubmed/32631445 http://dx.doi.org/10.1186/s13059-020-02066-4 |
_version_ | 1783554361823068160 |
---|---|
author | Brown, C. Titus Moritz, Dominik O’Brien, Michael P. Reidl, Felix Reiter, Taylor Sullivan, Blair D. |
author_facet | Brown, C. Titus Moritz, Dominik O’Brien, Michael P. Reidl, Felix Reiter, Taylor Sullivan, Blair D. |
author_sort | Brown, C. Titus |
collection | PubMed |
description | Genomes computationally inferred from large metagenomic data sets are often incomplete and may be missing functionally important content and strain variation. We introduce an information retrieval system for large metagenomic data sets that exploits the sparsity of DNA assembly graphs to efficiently extract subgraphs surrounding an inferred genome. We apply this system to recover missing content from genome bins and show that substantial genomic sequence variation is present in a real metagenome. Our software implementation is available at https://github.com/spacegraphcats/spacegraphcats under the 3-Clause BSD License. |
format | Online Article Text |
id | pubmed-7336657 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-73366572020-07-08 Exploring neighborhoods in large metagenome assembly graphs using spacegraphcats reveals hidden sequence diversity Brown, C. Titus Moritz, Dominik O’Brien, Michael P. Reidl, Felix Reiter, Taylor Sullivan, Blair D. Genome Biol Method Genomes computationally inferred from large metagenomic data sets are often incomplete and may be missing functionally important content and strain variation. We introduce an information retrieval system for large metagenomic data sets that exploits the sparsity of DNA assembly graphs to efficiently extract subgraphs surrounding an inferred genome. We apply this system to recover missing content from genome bins and show that substantial genomic sequence variation is present in a real metagenome. Our software implementation is available at https://github.com/spacegraphcats/spacegraphcats under the 3-Clause BSD License. BioMed Central 2020-07-06 /pmc/articles/PMC7336657/ /pubmed/32631445 http://dx.doi.org/10.1186/s13059-020-02066-4 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Method Brown, C. Titus Moritz, Dominik O’Brien, Michael P. Reidl, Felix Reiter, Taylor Sullivan, Blair D. Exploring neighborhoods in large metagenome assembly graphs using spacegraphcats reveals hidden sequence diversity |
title | Exploring neighborhoods in large metagenome assembly graphs using spacegraphcats reveals hidden sequence diversity |
title_full | Exploring neighborhoods in large metagenome assembly graphs using spacegraphcats reveals hidden sequence diversity |
title_fullStr | Exploring neighborhoods in large metagenome assembly graphs using spacegraphcats reveals hidden sequence diversity |
title_full_unstemmed | Exploring neighborhoods in large metagenome assembly graphs using spacegraphcats reveals hidden sequence diversity |
title_short | Exploring neighborhoods in large metagenome assembly graphs using spacegraphcats reveals hidden sequence diversity |
title_sort | exploring neighborhoods in large metagenome assembly graphs using spacegraphcats reveals hidden sequence diversity |
topic | Method |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7336657/ https://www.ncbi.nlm.nih.gov/pubmed/32631445 http://dx.doi.org/10.1186/s13059-020-02066-4 |
work_keys_str_mv | AT brownctitus exploringneighborhoodsinlargemetagenomeassemblygraphsusingspacegraphcatsrevealshiddensequencediversity AT moritzdominik exploringneighborhoodsinlargemetagenomeassemblygraphsusingspacegraphcatsrevealshiddensequencediversity AT obrienmichaelp exploringneighborhoodsinlargemetagenomeassemblygraphsusingspacegraphcatsrevealshiddensequencediversity AT reidlfelix exploringneighborhoodsinlargemetagenomeassemblygraphsusingspacegraphcatsrevealshiddensequencediversity AT reitertaylor exploringneighborhoodsinlargemetagenomeassemblygraphsusingspacegraphcatsrevealshiddensequencediversity AT sullivanblaird exploringneighborhoodsinlargemetagenomeassemblygraphsusingspacegraphcatsrevealshiddensequencediversity |