MetaLAFFA: a flexible, end-to-end, distributed computing-compatible metagenomic functional annotation pipeline

Bibliographic Details
Main authors: Eng, Alexander; Verster, Adrian J.; Borenstein, Elhanan
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7579964/
https://www.ncbi.nlm.nih.gov/pubmed/33087062
http://dx.doi.org/10.1186/s12859-020-03815-9
author Eng, Alexander
Verster, Adrian J.
Borenstein, Elhanan
collection PubMed
description BACKGROUND: Microbial communities have become an important subject of research across multiple disciplines in recent years. These communities are often examined via shotgun metagenomic sequencing, a technology which can offer unique insights into the genomic content of a microbial community. Functional annotation of shotgun metagenomic data has become an increasingly popular method for identifying the aggregate functional capacities encoded by the community’s constituent microbes. Currently available metagenomic functional annotation pipelines, however, suffer from several shortcomings, including limited pipeline customization options, lack of standard raw sequence data pre-processing, and insufficient capabilities for integration with distributed computing systems. RESULTS: Here we introduce MetaLAFFA, a functional annotation pipeline designed to take unfiltered shotgun metagenomic data as input and generate functional profiles. MetaLAFFA is implemented as a Snakemake pipeline, which enables convenient integration with distributed computing clusters, allowing users to take full advantage of available computing resources. Default pipeline settings allow new users to run MetaLAFFA according to common practices while a Python module-based configuration system provides advanced users with a flexible interface for pipeline customization. MetaLAFFA also generates summary statistics for each step in the pipeline so that users can better understand pre-processing and annotation quality. CONCLUSIONS: MetaLAFFA is a new end-to-end metagenomic functional annotation pipeline with distributed computing compatibility and flexible customization options. MetaLAFFA source code is available at https://github.com/borenstein-lab/MetaLAFFA and can be installed via Conda as described in the accompanying documentation.
format Online
Article
Text
id pubmed-7579964
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7579964 2020-10-22 MetaLAFFA: a flexible, end-to-end, distributed computing-compatible metagenomic functional annotation pipeline. Eng, Alexander; Verster, Adrian J.; Borenstein, Elhanan. BMC Bioinformatics. Software. BioMed Central 2020-10-21. /pmc/articles/PMC7579964/ /pubmed/33087062 http://dx.doi.org/10.1186/s12859-020-03815-9 Text en
© The Author(s) 2020. Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title MetaLAFFA: a flexible, end-to-end, distributed computing-compatible metagenomic functional annotation pipeline
topic Software