Computational workflow for the fine-grained analysis of metagenomic samples

BACKGROUND: The field of metagenomics, defined as the direct genetic analysis of uncultured samples of genomes contained within an environmental sample, is gaining increasing popularity. The aim of studies of metagenomics is to determine the species present in an environmental community and identify...

Bibliographic Details
Main Authors: Pérez-Wohlfeil, Esteban, Arjona-Medina, Jose A., Torreno, Oscar, Ulzurrun, Eugenia, Trelles, Oswaldo
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5088524/
https://www.ncbi.nlm.nih.gov/pubmed/27801291
http://dx.doi.org/10.1186/s12864-016-3063-x
_version_ 1782464112205234176
author Pérez-Wohlfeil, Esteban
Arjona-Medina, Jose A.
Torreno, Oscar
Ulzurrun, Eugenia
Trelles, Oswaldo
author_facet Pérez-Wohlfeil, Esteban
Arjona-Medina, Jose A.
Torreno, Oscar
Ulzurrun, Eugenia
Trelles, Oswaldo
author_sort Pérez-Wohlfeil, Esteban
collection PubMed
description BACKGROUND: The field of metagenomics, defined as the direct genetic analysis of uncultured samples of genomes contained within an environmental sample, is gaining increasing popularity. The aim of studies of metagenomics is to determine the species present in an environmental community and identify changes in the abundance of species under different conditions. Current metagenomic analysis software faces bottlenecks due to the high computational load required to analyze complex samples. RESULTS: A computational open-source workflow has been developed for the detailed analysis of metagenomes. This workflow provides new tools and datafile specifications that facilitate the identification of differences in abundance of reads assigned to taxa (mapping), enables the detection of reads of low-abundance bacteria (producing evidence of their presence), provides new concepts for filtering spurious matches, etc. Innovative visualization ideas for improved display of metagenomic diversity are also proposed to better understand how reads are mapped to taxa. Illustrative examples are provided based on the study of two collections of metagenomes from faecal microbial communities of adult female monozygotic and dizygotic twin pairs concordant for leanness or obesity and their mothers. CONCLUSIONS: The proposed workflow provides an open environment that offers the opportunity to perform the mapping process using different reference databases. Additionally, this workflow shows the specifications of the mapping process and datafile formats to facilitate the development of new plugins for further post-processing. This open and extensible platform has been designed with the aim of enabling in-depth analysis of metagenomic samples and better understanding of the underlying biological processes. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-016-3063-x) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5088524
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-50885242016-11-07 Computational workflow for the fine-grained analysis of metagenomic samples Pérez-Wohlfeil, Esteban Arjona-Medina, Jose A. Torreno, Oscar Ulzurrun, Eugenia Trelles, Oswaldo BMC Genomics Research BACKGROUND: The field of metagenomics, defined as the direct genetic analysis of uncultured samples of genomes contained within an environmental sample, is gaining increasing popularity. The aim of studies of metagenomics is to determine the species present in an environmental community and identify changes in the abundance of species under different conditions. Current metagenomic analysis software faces bottlenecks due to the high computational load required to analyze complex samples. RESULTS: A computational open-source workflow has been developed for the detailed analysis of metagenomes. This workflow provides new tools and datafile specifications that facilitate the identification of differences in abundance of reads assigned to taxa (mapping), enables the detection of reads of low-abundance bacteria (producing evidence of their presence), provides new concepts for filtering spurious matches, etc. Innovative visualization ideas for improved display of metagenomic diversity are also proposed to better understand how reads are mapped to taxa. Illustrative examples are provided based on the study of two collections of metagenomes from faecal microbial communities of adult female monozygotic and dizygotic twin pairs concordant for leanness or obesity and their mothers. CONCLUSIONS: The proposed workflow provides an open environment that offers the opportunity to perform the mapping process using different reference databases. Additionally, this workflow shows the specifications of the mapping process and datafile formats to facilitate the development of new plugins for further post-processing. 
This open and extensible platform has been designed with the aim of enabling in-depth analysis of metagenomic samples and better understanding of the underlying biological processes. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-016-3063-x) contains supplementary material, which is available to authorized users. BioMed Central 2016-10-25 /pmc/articles/PMC5088524/ /pubmed/27801291 http://dx.doi.org/10.1186/s12864-016-3063-x Text en © The Author(s) 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Pérez-Wohlfeil, Esteban
Arjona-Medina, Jose A.
Torreno, Oscar
Ulzurrun, Eugenia
Trelles, Oswaldo
Computational workflow for the fine-grained analysis of metagenomic samples
title Computational workflow for the fine-grained analysis of metagenomic samples
title_full Computational workflow for the fine-grained analysis of metagenomic samples
title_fullStr Computational workflow for the fine-grained analysis of metagenomic samples
title_full_unstemmed Computational workflow for the fine-grained analysis of metagenomic samples
title_short Computational workflow for the fine-grained analysis of metagenomic samples
title_sort computational workflow for the fine-grained analysis of metagenomic samples
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5088524/
https://www.ncbi.nlm.nih.gov/pubmed/27801291
http://dx.doi.org/10.1186/s12864-016-3063-x
work_keys_str_mv AT perezwohlfeilesteban computationalworkflowforthefinegrainedanalysisofmetagenomicsamples
AT arjonamedinajosea computationalworkflowforthefinegrainedanalysisofmetagenomicsamples
AT torrenooscar computationalworkflowforthefinegrainedanalysisofmetagenomicsamples
AT ulzurruneugenia computationalworkflowforthefinegrainedanalysisofmetagenomicsamples
AT trellesoswaldo computationalworkflowforthefinegrainedanalysisofmetagenomicsamples