redbiom: a Rapid Sample Discovery and Feature Characterization System

Meta-analyses at the whole-community level have been important in microbiome studies, revealing profound features that structure Earth’s microbial communities, such as the unique differentiation of microbes from the mammalian gut relative to free-living microbial communities, the separation of micro...

Bibliographic Details
Main authors: McDonald, Daniel; Kaehler, Benjamin; Gonzalez, Antonio; DeReus, Jeff; Ackermann, Gail; Marotz, Clarisse; Huttley, Gavin; Knight, Rob
Format: Online Article Text
Language: English
Published: American Society for Microbiology 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6593222/
https://www.ncbi.nlm.nih.gov/pubmed/31239397
http://dx.doi.org/10.1128/mSystems.00215-19
author McDonald, Daniel
Kaehler, Benjamin
Gonzalez, Antonio
DeReus, Jeff
Ackermann, Gail
Marotz, Clarisse
Huttley, Gavin
Knight, Rob
collection PubMed
description Meta-analyses at the whole-community level have been important in microbiome studies, revealing profound features that structure Earth’s microbial communities, such as the unique differentiation of microbes from the mammalian gut relative to free-living microbial communities, the separation of microbiomes in saline and nonsaline environments, and the role of pH in driving soil microbial compositions. However, our ability to identify the specific features of a microbiome that differentiate these community-level patterns has lagged behind, especially as ever-cheaper DNA sequencing has yielded increasingly large data sets. One critical gap is the ability to search for samples that contain specific features (for example, sub-operational taxonomic units [sOTUs] identified by high-resolution statistical methods for removing amplicon sequencing errors). Here we introduce redbiom, a microbiome caching layer, which allows users to rapidly query samples that contain a given feature, retrieve sample data and metadata, and search for samples that match specified metadata values or ranges (e.g., all samples with a pH of >7). redbiom is implemented using Redis, an in-memory NoSQL database. By default, redbiom allows public anonymous sample access for over 100,000 publicly available samples in the Qiita database. At over 100,000 samples, the caching server requires only 35 GB of resident memory. We highlight how redbiom enables a new type of characterization of microbiome samples and provide tutorials for using redbiom with QIIME 2. redbiom is open source under the BSD license, hosted on GitHub, and can be deployed independently of Qiita to enable search of proprietary or clinically restricted microbiome databases.
IMPORTANCE Although analyses that combine many microbiomes at the whole-community level have become routine, searching rapidly for microbiomes that contain a particular sequence has remained difficult. The software we present here, redbiom, dramatically accelerates this process, allowing samples that contain microbiome features to be rapidly identified. This is especially useful when taxonomic annotation is limited, allowing users to identify environments in which unannotated microbes of interest were previously observed. This approach also allows environmental or clinical factors that correlate with specific features, or vice versa, to be identified rapidly, even at a scale of billions of sequences in hundreds of thousands of samples. The software is integrated with existing analysis tools to enable fast, large-scale microbiome searches and discovery of new microbiome relationships.
format Online
Article
Text
id pubmed-6593222
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher American Society for Microbiology
record_format MEDLINE/PubMed
spelling pubmed-6593222 2019-07-03 redbiom: a Rapid Sample Discovery and Feature Characterization System McDonald, Daniel Kaehler, Benjamin Gonzalez, Antonio DeReus, Jeff Ackermann, Gail Marotz, Clarisse Huttley, Gavin Knight, Rob mSystems Observation American Society for Microbiology 2019-06-25 /pmc/articles/PMC6593222/ /pubmed/31239397 http://dx.doi.org/10.1128/mSystems.00215-19 Text en Copyright © 2019 McDonald et al. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/).
title redbiom: a Rapid Sample Discovery and Feature Characterization System
topic Observation
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6593222/
https://www.ncbi.nlm.nih.gov/pubmed/31239397
http://dx.doi.org/10.1128/mSystems.00215-19