Cargando…

NEAT: a framework for building fully automated NGS pipelines and analyses

BACKGROUND: The analysis of next generation sequencing (NGS) has become a standard task for many laboratories in the life sciences. Though there exists several tools to support users in the manipulation of such datasets on various levels, few are built on the basis of vertical integration. Here, we...

Descripción completa

Detalles Bibliográficos
Autor principal: Schorderet, Patrick
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2016
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4736651/
https://www.ncbi.nlm.nih.gov/pubmed/26830846
http://dx.doi.org/10.1186/s12859-016-0902-3
Descripción
Sumario:BACKGROUND: The analysis of next generation sequencing (NGS) has become a standard task for many laboratories in the life sciences. Though there exists several tools to support users in the manipulation of such datasets on various levels, few are built on the basis of vertical integration. Here, we present the NExt generation Analysis Toolbox (NEAT) that allows non-expert users including wet-lab scientists to comprehensively build, run and analyze NGS data through double-clickable executables without the need of any programming experience. RESULTS: In comparison to many publicly available tools including Galaxy, NEAT provides three main advantages: (1) Through the development of double-clickable executables, NEAT is efficient (completes within <24 hours), easy to implement and intuitive; (2) Storage space, maximum number of job submissions, wall time and cluster-specific parameters can be customized as NEAT is run on the institution’s cluster; (3) NEAT allows users to visualize and summarize NGS data rapidly and efficiently using various built-in exploratory data analysis tools including metagenomic and differentially expressed gene analysis. To simplify the control of the workflow, NEAT projects are built around a unique and centralized file containing sample names, replicates, conditions, antibodies, alignment-, filtering- and peak calling parameters as well as cluster-specific paths and settings. Moreover, the small-sized files produced by NEAT allow users to easily manipulate, consolidate and share datasets from different users and institutions. CONCLUSIONS: NEAT provides biologists and bioinformaticians with a robust, efficient and comprehensive tool for the analysis of massive NGS datasets. Frameworks such as NEAT not only allow novice users to overcome the increasing number of technical hurdles due to the complexity of manipulating large datasets, but provide more advance users with tools that ensure high reproducibility standards in the NGS era. NEAT is publically available at https://github.com/pschorderet/NEAT. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-0902-3) contains supplementary material, which is available to authorized users.