
Integrative workflows for metagenomic analysis

The rapid evolution of sequencing technologies, collectively described by the term Next Generation Sequencing (NGS), has revolutionized metagenomic analysis. These technologies combine high-throughput analytical protocols with delicate measuring techniques in order to potentially discover, pro...


Bibliographic Details
Main Authors: Ladoukakis, Efthymios, Kolisis, Fragiskos N., Chatziioannou, Aristotelis A.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4237130/
https://www.ncbi.nlm.nih.gov/pubmed/25478562
http://dx.doi.org/10.3389/fcell.2014.00070
author Ladoukakis, Efthymios
Kolisis, Fragiskos N.
Chatziioannou, Aristotelis A.
collection PubMed
description The rapid evolution of sequencing technologies, collectively described by the term Next Generation Sequencing (NGS), has revolutionized metagenomic analysis. These technologies combine high-throughput analytical protocols with delicate measuring techniques in order to potentially discover, properly assemble, and map allelic sequences to the correct genomes, achieving particularly high yields at only a fraction of the cost of traditional processes (i.e., Sanger sequencing). From a bioinformatic perspective, this means that each sequencing experiment generates many gigabytes of data, making data management, and even storage, critical bottlenecks for the overall analysis. The complexity is further aggravated by the variety of processing steps available, represented by the numerous bioinformatic tools that are essential for each analytical task in order to fully unveil the genetic content of a metagenomic dataset. These disparate tasks range from simple yet non-trivial quality control of raw data to exceptionally complex protein annotation procedures, and they demand a high level of expertise both for their proper application and for the neat implementation of the whole workflow. Furthermore, a bioinformatic analysis of this scale requires substantial computational resources, which makes cloud computing infrastructures the only realistic solution. In this review article we discuss the different integrative bioinformatic solutions that address these issues, critically assessing the available automated pipelines for data management, quality control, and annotation of metagenomic data across various major sequencing technologies and applications.
format Online
Article
Text
id pubmed-4237130
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-4237130 2014-12-04 Integrative workflows for metagenomic analysis Ladoukakis, Efthymios Kolisis, Fragiskos N. Chatziioannou, Aristotelis A. Front Cell Dev Biol Physiology Frontiers Media S.A.
2014-11-19 /pmc/articles/PMC4237130/ /pubmed/25478562 http://dx.doi.org/10.3389/fcell.2014.00070 Text en Copyright © 2014 Ladoukakis, Kolisis and Chatziioannou. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Integrative workflows for metagenomic analysis
topic Physiology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4237130/
https://www.ncbi.nlm.nih.gov/pubmed/25478562
http://dx.doi.org/10.3389/fcell.2014.00070