
Protein signature-based estimation of metagenomic abundances including all domains of life and viruses

Motivation: Metagenome analysis requires tools that can estimate the taxonomic abundances in anonymous sequence data over the whole range of biological entities. Because there is usually no prior knowledge about the data composition, not only all domains of life but also viruses have to be included...


Bibliographic Details
Main authors: Klingenberg, Heiner; Aßhauer, Kathrin Petra; Lingner, Thomas; Meinicke, Peter
Format: Online Article Text
Language: English
Published: Oxford University Press 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3624802/
https://www.ncbi.nlm.nih.gov/pubmed/23418187
http://dx.doi.org/10.1093/bioinformatics/btt077
author Klingenberg, Heiner
Aßhauer, Kathrin Petra
Lingner, Thomas
Meinicke, Peter
collection PubMed
description Motivation: Metagenome analysis requires tools that can estimate the taxonomic abundances in anonymous sequence data over the whole range of biological entities. Because there is usually no prior knowledge about the data composition, not only all domains of life but also viruses have to be included in taxonomic profiling. Such a full-range approach, however, is difficult to realize owing to the limited coverage of available reference data. In particular, archaea and viruses are generally not well represented by current genome databases. Results: We introduce a novel approach to taxonomic profiling of metagenomes that is based on mixture model analysis of protein signatures. Our results on simulated and real data reveal the difficulties of the existing methods when measuring archaeal or viral abundances and show the overall good profiling performance of the protein-based mixture model. As an application example, we provide a large-scale analysis of data from the Human Microbiome Project. This demonstrates the utility of our method as a first-instance profiling tool for a fast estimate of the community structure. Availability: http://gobics.de/TaxyPro. Contact: pmeinic@gwdg.de Supplementary information: Supplementary Material is available at Bioinformatics online.
format Online
Article
Text
id pubmed-3624802
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3624802 2013-04-12 Bioinformatics Original Papers Oxford University Press 2013-04-15 2013-02-15 /pmc/articles/PMC3624802/ /pubmed/23418187 http://dx.doi.org/10.1093/bioinformatics/btt077 Text en © The Author 2013. Published by Oxford University Press.
http://creativecommons.org/licenses/by/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Protein signature-based estimation of metagenomic abundances including all domains of life and viruses
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3624802/
https://www.ncbi.nlm.nih.gov/pubmed/23418187
http://dx.doi.org/10.1093/bioinformatics/btt077