
Exploring Neighborhoods in the Metagenome Universe

The variety of metagenomes in current databases provides a rapidly growing source of information for comparative studies. However, the quantity and quality of supplementary metadata are still lagging behind. It is therefore important to be able to identify related metagenomes by means of the availabl...


Bibliographic Details
Main authors: Aßhauer, Kathrin P., Klingenberg, Heiner, Lingner, Thomas, Meinicke, Peter
Format: Online Article Text
Language: English
Published: MDPI 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4139848/
https://www.ncbi.nlm.nih.gov/pubmed/25026170
http://dx.doi.org/10.3390/ijms150712364
_version_ 1782331425214693376
author Aßhauer, Kathrin P.
Klingenberg, Heiner
Lingner, Thomas
Meinicke, Peter
author_facet Aßhauer, Kathrin P.
Klingenberg, Heiner
Lingner, Thomas
Meinicke, Peter
author_sort Aßhauer, Kathrin P.
collection PubMed
description The variety of metagenomes in current databases provides a rapidly growing source of information for comparative studies. However, the quantity and quality of supplementary metadata are still lagging behind. It is therefore important to be able to identify related metagenomes by means of the available sequence data alone. We have studied efficient sequence-based methods for large-scale identification of similar metagenomes within a database retrieval context. In a broad comparison of different profiling methods, we found that vector-based distance measures are well suited for the detection of metagenomic neighbors. Our evaluation on more than 1700 publicly available metagenomes indicates that, for a query metagenome from a particular habitat, on average nine out of ten nearest neighbors represent the same habitat category, independent of the profiling method or distance measure used. While a neighborhood accuracy of 100% can be achieved for well-defined labels, neighbor detection in general is severely affected by a natural overlap of manually annotated categories. In addition, we present results of a novel visualization method that is able to reflect the similarity of metagenomes in a 2D scatter plot. The visualization method shows a similarly high accuracy in the reduced space as in the high-dimensional profile space. Our study suggests that, for inspection of metagenome neighborhoods, the profiling methods and distance measures can be chosen to provide a convenient interpretation of results in terms of the underlying features. Furthermore, supplementary metadata of metagenome samples will need to comply with readily available ontologies for fine-grained and standardized annotation. To make profile-based k-nearest-neighbor search and the 2D visualization of the metagenome universe available to the research community, we included the proposed methods in our CoMet-Universe server for comparative metagenome analysis.
format Online
Article
Text
id pubmed-4139848
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-4139848 2014-08-21 Exploring Neighborhoods in the Metagenome Universe Aßhauer, Kathrin P. Klingenberg, Heiner Lingner, Thomas Meinicke, Peter Int J Mol Sci Article MDPI 2014-07-14 /pmc/articles/PMC4139848/ /pubmed/25026170 http://dx.doi.org/10.3390/ijms150712364 Text en © 2014 by the authors; licensee MDPI, Basel, Switzerland. http://creativecommons.org/licenses/by/3.0/ This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).
spellingShingle Article
Aßhauer, Kathrin P.
Klingenberg, Heiner
Lingner, Thomas
Meinicke, Peter
Exploring Neighborhoods in the Metagenome Universe
title Exploring Neighborhoods in the Metagenome Universe
title_full Exploring Neighborhoods in the Metagenome Universe
title_fullStr Exploring Neighborhoods in the Metagenome Universe
title_full_unstemmed Exploring Neighborhoods in the Metagenome Universe
title_short Exploring Neighborhoods in the Metagenome Universe
title_sort exploring neighborhoods in the metagenome universe
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4139848/
https://www.ncbi.nlm.nih.gov/pubmed/25026170
http://dx.doi.org/10.3390/ijms150712364
work_keys_str_mv AT aßhauerkathrinp exploringneighborhoodsinthemetagenomeuniverse
AT klingenbergheiner exploringneighborhoodsinthemetagenomeuniverse
AT lingnerthomas exploringneighborhoodsinthemetagenomeuniverse
AT meinickepeter exploringneighborhoodsinthemetagenomeuniverse
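
The description above refers to profile-based k-nearest-neighbor search and to a neighborhood accuracy, meaning the share of nearest neighbors that carry the same habitat label as the query. The following is a minimal sketch of that idea, not the CoMet-Universe implementation. The sample names, feature counts, habitat labels, the choice of Euclidean distance, and all function names are hypothetical and chosen only for illustration.

```python
import math


def normalize(counts):
    """Turn raw feature counts into a relative-frequency profile."""
    total = sum(counts)
    if total == 0:
        raise ValueError("profile has no counts")
    return [c / total for c in counts]


def euclidean(p, q):
    """Euclidean distance between two profiles of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def nearest_neighbors(query, profiles, k, distance=euclidean):
    """Return the names of the k profiles closest to the query."""
    ranked = sorted(profiles, key=lambda name: distance(query, profiles[name]))
    return ranked[:k]


def neighborhood_accuracy(profiles, labels, k, distance=euclidean):
    """Fraction of k nearest neighbors sharing the query's label,
    averaged over all samples (each sample is left out of its own search)."""
    hits = 0
    total = 0
    for name, profile in profiles.items():
        others = {n: p for n, p in profiles.items() if n != name}
        for neighbor in nearest_neighbors(profile, others, k, distance):
            hits += labels[neighbor] == labels[name]
            total += 1
    return hits / total


# Hypothetical toy data: three feature counts per sample.
profiles = {
    "soil_a": normalize([8, 1, 1]),
    "soil_b": normalize([7, 2, 1]),
    "marine_a": normalize([1, 1, 8]),
    "marine_b": normalize([1, 2, 7]),
}
labels = {
    "soil_a": "soil",
    "soil_b": "soil",
    "marine_a": "marine",
    "marine_b": "marine",
}

accuracy_k1 = neighborhood_accuracy(profiles, labels, k=1)
accuracy_k2 = neighborhood_accuracy(profiles, labels, k=2)
```

In this toy set each sample has exactly one same-habitat partner, so the accuracy is perfect for k=1 and drops to one half for k=2. Passing a different function as `distance` swaps the distance measure without changing the search.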