To assemble or not to resemble—A validated Comparative Metatranscriptomics Workflow (CoMW)

BACKGROUND: Metatranscriptomics has been used widely for investigation and quantification of microbial communities’ activity in response to external stimuli. By assessing the genes expressed, metatranscriptomics provides an understanding of the interactions between different major functional guilds...

Bibliographic Details
Main authors: Anwar, Muhammad Zohaib; Lanzen, Anders; Bang-Andreasen, Toke; Jacobsen, Carsten Suhr
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6667343/
https://www.ncbi.nlm.nih.gov/pubmed/31363751
http://dx.doi.org/10.1093/gigascience/giz096
author Anwar, Muhammad Zohaib
Lanzen, Anders
Bang-Andreasen, Toke
Jacobsen, Carsten Suhr
collection PubMed
description BACKGROUND: Metatranscriptomics has been used widely for investigation and quantification of microbial communities’ activity in response to external stimuli. By assessing the genes expressed, metatranscriptomics provides an understanding of the interactions between different major functional guilds and the environment. Here, we present a de novo assembly-based Comparative Metatranscriptomics Workflow (CoMW) implemented in a modular, reproducible structure. Metatranscriptomics typically uses short sequence reads, which can either be directly aligned to external reference databases (“assembly-free approach”) or first assembled into contigs before alignment (“assembly-based approach”). We also compare CoMW (assembly-based implementation) with an assembly-free alternative workflow, using simulated and real-world metatranscriptomes from Arctic and temperate terrestrial environments. We evaluate their accuracy in precision and recall using generic and specialized hierarchical protein databases. RESULTS: CoMW provided significantly fewer false-positive results, resulting in more precise identification and quantification of functional genes in metatranscriptomes. Using the comprehensive database M5nr, the assembly-based approach identified genes with only 0.6% false-positive results at thresholds ranging from inclusive to stringent compared with the assembly-free approach, which yielded up to 15% false-positive results. Using specialized databases (carbohydrate-active enzyme and nitrogen cycle), the assembly-based approach identified and quantified genes with 3–5 times fewer false-positive results. We also evaluated the impact of both approaches on real-world datasets. CONCLUSIONS: We present an open source de novo assembly-based CoMW. Our benchmarking findings support assembling short reads into contigs before alignment to a reference database because this provides higher precision and minimizes false-positive results.
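The precision and recall evaluation mentioned in the description can be illustrated with a minimal sketch. This is not the CoMW benchmarking code: the function name `precision_recall` and the gene identifiers are hypothetical, and the example assumes the expected genes of a simulated metatranscriptome are available as a set.

```python
def precision_recall(predicted, truth):
    """Return (precision, recall) for a set of predicted gene identifiers.

    predicted: genes reported by a workflow (hypothetical input)
    truth:     genes known to be present in the simulated dataset
    """
    predicted, truth = set(predicted), set(truth)
    true_pos = len(predicted & truth)    # reported and really present
    false_pos = len(predicted - truth)   # reported but not present
    false_neg = len(truth - predicted)   # present but missed
    # Guard against empty inputs to avoid division by zero.
    precision = true_pos / (true_pos + false_pos) if predicted else 0.0
    recall = true_pos / (true_pos + false_neg) if truth else 0.0
    return precision, recall


# Illustrative identifiers only; these are not results from the article.
truth = {"geneA", "geneB", "geneC", "geneD"}
predicted = {"geneA", "geneB", "geneC", "geneX"}
precision, recall = precision_recall(predicted, truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A false-positive result in this sketch is any reported gene absent from the known set, so fewer false positives translate directly into higher precision.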
format Online
Article
Text
id pubmed-6667343
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6667343 2019-08-05 Gigascience Technical Note Oxford University Press 2019-07-30 /pmc/articles/PMC6667343/ /pubmed/31363751 http://dx.doi.org/10.1093/gigascience/giz096 Text en © The Author(s) 2019. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title To assemble or not to resemble—A validated Comparative Metatranscriptomics Workflow (CoMW)
topic Technical Note