Hierarchically Labeled Database Indexing Allows Scalable Characterization of Microbiomes
Increasingly available microbial reference data allow interpreting the composition and function of previously uncharacterized microbial communities in detail, via high-throughput sequencing analysis. However, efficient methods for read classification are required when the best database matches for s...
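Main Authors: | Utro, Filippo; Haiminen, Niina; Siragusa, Enrico; Gardiner, Laura-Jayne; Seabolt, Ed; Krishna, Ritesh; Kaufman, James H.; Parida, Laxmi |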
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7125348/ https://www.ncbi.nlm.nih.gov/pubmed/32248063 http://dx.doi.org/10.1016/j.isci.2020.100988 |
_version_ | 1783515926233088000 |
---|---|
author | Utro, Filippo Haiminen, Niina Siragusa, Enrico Gardiner, Laura-Jayne Seabolt, Ed Krishna, Ritesh Kaufman, James H. Parida, Laxmi |
author_sort | Utro, Filippo |
collection | PubMed |
description | Increasingly available microbial reference data allow interpreting the composition and function of previously uncharacterized microbial communities in detail, via high-throughput sequencing analysis. However, efficient methods for read classification are required when the best database matches for short sequence reads are often shared among multiple reference sequences. Here, we take advantage of the fact that microbial sequences can be annotated relative to established tree structures, and we develop a highly scalable read classifier, PRROMenade, by enhancing the generalized Burrows-Wheeler transform with a labeling step to directly assign reads to the corresponding lowest taxonomic unit in an annotation tree. PRROMenade solves the multi-matching problem while allowing fast variable-size sequence classification for phylogenetic or functional annotation. Our simulations with 5% added differences from reference indicated only 1.5% error rate for PRROMenade functional classification. On metatranscriptomic data PRROMenade highlighted biologically relevant functional pathways related to diet-induced changes in the human gut microbiome. |
format | Online Article Text |
id | pubmed-7125348 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Elsevier |
record_format | MEDLINE/PubMed |
spelling | pubmed-7125348 2020-04-06 Hierarchically Labeled Database Indexing Allows Scalable Characterization of Microbiomes Utro, Filippo Haiminen, Niina Siragusa, Enrico Gardiner, Laura-Jayne Seabolt, Ed Krishna, Ritesh Kaufman, James H. Parida, Laxmi iScience Article Elsevier 2020-03-17 /pmc/articles/PMC7125348/ /pubmed/32248063 http://dx.doi.org/10.1016/j.isci.2020.100988 Text en © 2020 The Author(s) http://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). |
title | Hierarchically Labeled Database Indexing Allows Scalable Characterization of Microbiomes |
topic | Article |
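
The description above says that reads whose best matches are shared among several references are assigned to a single unit in an annotation tree. One common way to resolve such ties is to take the lowest common ancestor of the tied labels; the minimal sketch below illustrates that idea only. It is not PRROMenade's implementation: a brute-force substring search stands in for the Burrows-Wheeler-based index, and the tree, the reference sequences, and the names `PARENT`, `REFERENCES`, `classify`, and `min_len` are all hypothetical.

```python
from functools import reduce

# Hypothetical toy annotation tree: child -> parent (the root maps to None).
PARENT = {
    "root": None,
    "Bacteria": "root",
    "Proteobacteria": "Bacteria",
    "Firmicutes": "Bacteria",
    "Escherichia": "Proteobacteria",
    "Salmonella": "Proteobacteria",
    "Bacillus": "Firmicutes",
}

# Hypothetical reference sequences, each labeled with one tree node.
REFERENCES = [
    ("ACGTACGTGG", "Escherichia"),
    ("ACGTACGTCC", "Salmonella"),
    ("TTGACCAGTA", "Bacillus"),
]


def path_to_root(node):
    """Return the list of nodes from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def lca(a, b):
    """Lowest common ancestor of two nodes in the toy tree."""
    ancestors = set(path_to_root(a))
    for node in path_to_root(b):
        if node in ancestors:
            return node
    raise ValueError("nodes are not in the same tree")


def longest_match_length(read, ref):
    """Length of the longest substring of `read` found in `ref` (brute force)."""
    best = 0
    for i in range(len(read)):
        # Only try substrings longer than the best match found so far.
        for j in range(i + best + 1, len(read) + 1):
            if read[i:j] in ref:
                best = j - i
            else:
                break  # a longer substring starting at i cannot match either
    return best


def classify(read, min_len=4):
    """Label a read with the LCA of all references sharing its best match."""
    lengths = [(longest_match_length(read, seq), label) for seq, label in REFERENCES]
    best = max(length for length, _ in lengths)
    if best < min_len:
        return None  # match too short: leave the read unclassified
    tied = [label for length, label in lengths if length == best]
    return reduce(lca, tied)


if __name__ == "__main__":
    for read in ("ACGTACGT", "ACGTACGTGG", "GACCAG", "AAAA"):
        print(read, "->", classify(read))
```

In this toy setup, a read that matches two references equally well receives the label of their common ancestor, while a read with a unique best match keeps that reference's own label.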