CAMAMED: a pipeline for composition-aware mapping-based analysis of metagenomic data
Metagenomics is the study of genomic DNA recovered from a microbial community. Both assembly-based and mapping-based methods have been used to analyze metagenomic data. When appropriate gene catalogs are available, mapping-based methods are preferred over assembly-based approaches, especially for an...
Main authors: | Norouzi-Beirami, Mohammad H; Marashi, Sayed-Amir; Banaei-Moghaddam, Ali M; Kavousi, Kaveh |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2021 |
Subjects: | Application Notes |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7787360/ https://www.ncbi.nlm.nih.gov/pubmed/33575649 http://dx.doi.org/10.1093/nargab/lqaa107 |
_version_ | 1783632808252538880 |
---|---|
author | Norouzi-Beirami, Mohammad H Marashi, Sayed-Amir Banaei-Moghaddam, Ali M Kavousi, Kaveh |
author_sort | Norouzi-Beirami, Mohammad H |
collection | PubMed |
description | Metagenomics is the study of genomic DNA recovered from a microbial community. Both assembly-based and mapping-based methods have been used to analyze metagenomic data. When appropriate gene catalogs are available, mapping-based methods are preferred over assembly-based approaches, especially for analyzing the data at the functional level. In this study, we introduce CAMAMED as a composition-aware mapping-based metagenomic data analysis pipeline. This pipeline can analyze metagenomic samples at both taxonomic and functional profiling levels. Using this pipeline, metagenome sequences can be mapped to non-redundant gene catalogs to obtain the gene frequencies in the samples. Due to the highly compositional nature of metagenomic data, the cumulative sum-scaling method is used at both taxa and gene levels for compositional data analysis in our pipeline. Additionally, by mapping the genes to the KEGG database, annotations related to each gene can be extracted at different functional levels such as KEGG ortholog groups, enzyme commission numbers and reactions. Furthermore, the pipeline enables the user to identify potential biomarkers in case-control metagenomic samples by investigating functional differences. The source code for this software is available from https://github.com/mhnb/camamed. Ready-to-use Docker images are also available at https://hub.docker.com. |
format | Online Article Text |
id | pubmed-7787360 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-7787360 2021-02-10 CAMAMED: a pipeline for composition-aware mapping-based analysis of metagenomic data Norouzi-Beirami, Mohammad H Marashi, Sayed-Amir Banaei-Moghaddam, Ali M Kavousi, Kaveh NAR Genom Bioinform Application Notes Metagenomics is the study of genomic DNA recovered from a microbial community. Both assembly-based and mapping-based methods have been used to analyze metagenomic data. When appropriate gene catalogs are available, mapping-based methods are preferred over assembly-based approaches, especially for analyzing the data at the functional level. In this study, we introduce CAMAMED as a composition-aware mapping-based metagenomic data analysis pipeline. This pipeline can analyze metagenomic samples at both taxonomic and functional profiling levels. Using this pipeline, metagenome sequences can be mapped to non-redundant gene catalogs to obtain the gene frequencies in the samples. Due to the highly compositional nature of metagenomic data, the cumulative sum-scaling method is used at both taxa and gene levels for compositional data analysis in our pipeline. Additionally, by mapping the genes to the KEGG database, annotations related to each gene can be extracted at different functional levels such as KEGG ortholog groups, enzyme commission numbers and reactions. Furthermore, the pipeline enables the user to identify potential biomarkers in case-control metagenomic samples by investigating functional differences. The source code for this software is available from https://github.com/mhnb/camamed. Ready-to-use Docker images are also available at https://hub.docker.com. Oxford University Press 2021-01-06 /pmc/articles/PMC7787360/ /pubmed/33575649 http://dx.doi.org/10.1093/nargab/lqaa107 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. 
http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Application Notes Norouzi-Beirami, Mohammad H Marashi, Sayed-Amir Banaei-Moghaddam, Ali M Kavousi, Kaveh CAMAMED: a pipeline for composition-aware mapping-based analysis of metagenomic data |
title | CAMAMED: a pipeline for composition-aware mapping-based analysis of metagenomic data |
topic | Application Notes |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7787360/ https://www.ncbi.nlm.nih.gov/pubmed/33575649 http://dx.doi.org/10.1093/nargab/lqaa107 |
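
The abstract in this record states that CAMAMED applies cumulative sum scaling (CSS) to taxon and gene counts. As a rough illustration of that general idea, and not CAMAMED's own code, the sketch below divides each sample's counts by the sum of its positive counts that fall at or below a chosen quantile, then rescales. The function names, the fixed quantile of 0.5, and the scale constant of 1000 are assumptions made for this example; published CSS implementations typically choose the quantile from the data.

```python
def _quantile(sorted_vals, q):
    """Linear-interpolation quantile of an already sorted, non-empty list."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def css_normalize_sample(counts, quantile=0.5, scale=1000.0):
    """Illustrative CSS for one sample (hypothetical helper, not CAMAMED's API).

    counts   -- raw gene or taxon counts for a single sample
    quantile -- assumed fixed quantile of the positive counts
    scale    -- assumed constant applied after dividing by the scaling factor
    """
    positive = sorted(c for c in counts if c > 0)
    if not positive:
        # A sample with no reads has nothing to scale.
        return [0.0 for _ in counts]
    threshold = _quantile(positive, quantile)
    # Scaling factor: sum of the positive counts at or below the threshold.
    factor = sum(c for c in positive if c <= threshold)
    return [c / factor * scale for c in counts]


sample_a = css_normalize_sample([10, 20, 70])
sample_b = css_normalize_sample([0, 5, 15])
```

Because the scaling factor ignores counts above the quantile, a few very abundant genes do not dominate the normalization, which is the motivation usually given for CSS over total-sum scaling.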