DOE JGI Metagenome Workflow
The DOE Joint Genome Institute (JGI) Metagenome Workflow performs metagenome data processing, including assembly; structural, functional, and taxonomic annotation; and binning of metagenomic data sets that are subsequently included into the Integrated Microbial Genomes and Microbiomes (IMG/M) (I.-M....
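Main Authors: | Clum, Alicia; Huntemann, Marcel; Bushnell, Brian; Foster, Brian; Foster, Bryce; Roux, Simon; Hajek, Patrick P.; Varghese, Neha; Mukherjee, Supratim; Reddy, T. B. K.; Daum, Chris; Yoshinaga, Yuko; O’Malley, Ronan; Seshadri, Rekha; Kyrpides, Nikos C.; Eloe-Fadrosh, Emiley A.; Chen, I-Min A.; Copeland, Alex; Ivanova, Natalia N. |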
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Society for Microbiology 2021 |
Subjects: | Methods and Protocols |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8269246/ https://www.ncbi.nlm.nih.gov/pubmed/34006627 http://dx.doi.org/10.1128/mSystems.00804-20 |
_version_ | 1783720535920738304 |
---|---|
author | Clum, Alicia Huntemann, Marcel Bushnell, Brian Foster, Brian Foster, Bryce Roux, Simon Hajek, Patrick P. Varghese, Neha Mukherjee, Supratim Reddy, T. B. K. Daum, Chris Yoshinaga, Yuko O’Malley, Ronan Seshadri, Rekha Kyrpides, Nikos C. Eloe-Fadrosh, Emiley A. Chen, I-Min A. Copeland, Alex Ivanova, Natalia N. |
author_sort | Clum, Alicia |
collection | PubMed |
description | The DOE Joint Genome Institute (JGI) Metagenome Workflow performs metagenome data processing, including assembly; structural, functional, and taxonomic annotation; and binning of metagenomic data sets that are subsequently included into the Integrated Microbial Genomes and Microbiomes (IMG/M) (I.-M. A. Chen, K. Chu, K. Palaniappan, A. Ratner, et al., Nucleic Acids Res, 49:D751–D763, 2021, https://doi.org/10.1093/nar/gkaa939) comparative analysis system and provided for download via the JGI data portal (https://genome.jgi.doe.gov/portal/). This workflow scales to run on thousands of metagenome samples per year, which can vary by the complexity of microbial communities and sequencing depth. Here, we describe the different tools, databases, and parameters used at different steps of the workflow to help with the interpretation of metagenome data available in IMG and to enable researchers to apply this workflow to their own data. We use 20 publicly available sediment metagenomes to illustrate the computing requirements for the different steps and highlight the typical results of data processing. The workflow modules for read filtering and metagenome assembly are available as a workflow description language (WDL) file (https://code.jgi.doe.gov/BFoster/jgi_meta_wdl). The workflow modules for annotation and binning are provided as a service to the user community at https://img.jgi.doe.gov/submit and require filling out the project and associated metadata descriptions in the Genomes OnLine Database (GOLD) (S. Mukherjee, D. Stamatis, J. Bertsch, G. Ovchinnikova, et al., Nucleic Acids Res, 49:D723–D733, 2021, https://doi.org/10.1093/nar/gkaa983). IMPORTANCE The DOE JGI Metagenome Workflow is designed for processing metagenomic data sets starting from Illumina fastq files. It performs data preprocessing, error correction, assembly, structural and functional annotation, and binning. 
The results of processing are provided in several standard formats, such as fasta and gff, and can be used for subsequent integration into the Integrated Microbial Genomes and Microbiomes (IMG/M) system where they can be compared to a comprehensive set of publicly available metagenomes. As of 30 July 2020, 7,155 JGI metagenomes have been processed by the DOE JGI Metagenome Workflow. Here, we present a metagenome workflow developed at the JGI that generates rich data in standard formats and has been optimized for downstream analyses ranging from assessment of the functional and taxonomic composition of microbial communities to genome-resolved metagenomics and the identification and characterization of novel taxa. This workflow is currently being used to analyze thousands of metagenomic data sets in a consistent and standardized manner. |
format | Online Article Text |
id | pubmed-8269246 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | American Society for Microbiology |
record_format | MEDLINE/PubMed |
spelling | pubmed-8269246 2021-08-02 DOE JGI Metagenome Workflow mSystems Methods and Protocols American Society for Microbiology 2021-05-18 /pmc/articles/PMC8269246/ /pubmed/34006627 http://dx.doi.org/10.1128/mSystems.00804-20 Text en https://doi.org/10.1128/AuthorWarrantyLicense.v1 This is a work of the U.S. Government and is not subject to copyright protection in the United States. Foreign copyrights may apply. |
title | DOE JGI Metagenome Workflow |
topic | Methods and Protocols |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8269246/ https://www.ncbi.nlm.nih.gov/pubmed/34006627 http://dx.doi.org/10.1128/mSystems.00804-20 |