NG-meta-profiler: fast processing of metagenomes using NGLess, a domain-specific language

BACKGROUND: Shotgun metagenomes contain a sample of all the genomic material in an environment, allowing for the characterization of a microbial community. In order to understand these communities, bioinformatics methods are crucial. A common first step in processing metagenomes is to compute abunda...

Bibliographic Details
Main authors: Coelho, Luis Pedro; Alves, Renato; Monteiro, Paulo; Huerta-Cepas, Jaime; Freitas, Ana Teresa; Bork, Peer
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects: Software
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6547473/
https://www.ncbi.nlm.nih.gov/pubmed/31159881
http://dx.doi.org/10.1186/s40168-019-0684-8
author Coelho, Luis Pedro
Alves, Renato
Monteiro, Paulo
Huerta-Cepas, Jaime
Freitas, Ana Teresa
Bork, Peer
collection PubMed
description BACKGROUND: Shotgun metagenomes contain a sample of all the genomic material in an environment, allowing for the characterization of a microbial community. In order to understand these communities, bioinformatics methods are crucial. A common first step in processing metagenomes is to compute abundance estimates of different taxonomic or functional groups from the raw sequencing data. Given the breadth of the field, computational solutions need to be flexible and extensible, enabling the combination of different tools into a larger pipeline. RESULTS: We present NGLess and NG-meta-profiler. NGLess is a domain-specific language for describing next-generation sequence processing pipelines. It was developed with the goal of enabling user-friendly computational reproducibility. It provides built-in support for many common operations on sequencing data and can be extended with external tools through configuration files. Using this framework, we developed NG-meta-profiler, a fast profiler for metagenomes which performs sequence preprocessing, mapping to bundled databases, filtering of the mapping results, and profiling (taxonomic and functional). It is significantly faster than either MOCAT2 or htseq-count and (as it builds on NGLess) its results are perfectly reproducible. CONCLUSIONS: NG-meta-profiler is a high-performance solution for metagenomics processing built on NGLess. It can be used as-is to execute standard analyses or serve as the starting point for customization in a perfectly reproducible fashion. NGLess and NG-meta-profiler are open source software (under the liberal MIT license) and can be downloaded from https://ngless.embl.de or installed through bioconda. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s40168-019-0684-8) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6547473
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6547473 2019-06-06 Microbiome Software BioMed Central 2019-06-03 /pmc/articles/PMC6547473/ /pubmed/31159881 http://dx.doi.org/10.1186/s40168-019-0684-8 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title NG-meta-profiler: fast processing of metagenomes using NGLess, a domain-specific language
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6547473/
https://www.ncbi.nlm.nih.gov/pubmed/31159881
http://dx.doi.org/10.1186/s40168-019-0684-8