The GEN-ERA toolbox: unified and reproducible workflows for research in microbial genomics
BACKGROUND: Microbial culture collections play a key role in taxonomy by studying the diversity of their strains and providing well-characterized biological material to the scientific community for fundamental and applied research. These microbial resource centers thus need to implement new standard...
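Main Authors: | Cornet, Luc, Durieu, Benoit, Baert, Frederik, D'hooge, Elizabet, Colignon, David, Meunier, Loic, Lupo, Valérian, Cleenwerck, Ilse, Daniel, Heide-Marie, Rigouts, Leen, Sirjacobs, Damien, Declerck, Stéphane, Vandamme, Peter, Wilmotte, Annick, Baurain, Denis, Becker, Pierre |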
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10084500/ https://www.ncbi.nlm.nih.gov/pubmed/37036103 http://dx.doi.org/10.1093/gigascience/giad022 |
_version_ | 1785021750632775680 |
---|---|
author | Cornet, Luc Durieu, Benoit Baert, Frederik D'hooge, Elizabet Colignon, David Meunier, Loic Lupo, Valérian Cleenwerck, Ilse Daniel, Heide-Marie Rigouts, Leen Sirjacobs, Damien Declerck, Stéphane Vandamme, Peter Wilmotte, Annick Baurain, Denis Becker, Pierre |
author_facet | Cornet, Luc Durieu, Benoit Baert, Frederik D'hooge, Elizabet Colignon, David Meunier, Loic Lupo, Valérian Cleenwerck, Ilse Daniel, Heide-Marie Rigouts, Leen Sirjacobs, Damien Declerck, Stéphane Vandamme, Peter Wilmotte, Annick Baurain, Denis Becker, Pierre |
author_sort | Cornet, Luc |
collection | PubMed |
description | BACKGROUND: Microbial culture collections play a key role in taxonomy by studying the diversity of their strains and providing well-characterized biological material to the scientific community for fundamental and applied research. These microbial resource centers thus need to implement new standards in species delineation, including whole-genome sequencing and phylogenomics. In this context, the genomic needs of the Belgian Coordinated Collections of Microorganisms were studied, resulting in the GEN-ERA toolbox. The latter is a unified cluster of bioinformatic workflows dedicated to both bacteria and small eukaryotes (e.g., yeasts). FINDINGS: This public toolbox allows researchers without a specific training in bioinformatics to perform robust phylogenomic analyses. Hence, it facilitates all steps from genome downloading and quality assessment, including genomic contamination estimation, to tree reconstruction. It also offers workflows for average nucleotide identity comparisons and metabolic modeling. TECHNICAL DETAILS: Nextflow workflows are launched by a single command and are available on the GEN-ERA GitHub repository (https://github.com/Lcornet/GENERA). All the workflows are based on Singularity containers to increase reproducibility. TESTING: The toolbox was developed for a diversity of microorganisms, including bacteria and fungi. It was further tested on an empirical dataset of 18 (meta)genomes of early branching Cyanobacteria, providing the most up-to-date phylogenomic analysis of the Gloeobacterales order, the first group to diverge in the evolutionary tree of Cyanobacteria. CONCLUSION: The GEN-ERA toolbox can be used to infer completely reproducible comparative genomic and metabolic analyses on prokaryotes and small eukaryotes. Although designed for routine bioinformatics of culture collections, it can also be used by all researchers interested in microbial taxonomy, as exemplified by our case study on Gloeobacterales. |
format | Online Article Text |
id | pubmed-10084500 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-10084500 2023-04-11 The GEN-ERA toolbox: unified and reproducible workflows for research in microbial genomics Cornet, Luc Durieu, Benoit Baert, Frederik D'hooge, Elizabet Colignon, David Meunier, Loic Lupo, Valérian Cleenwerck, Ilse Daniel, Heide-Marie Rigouts, Leen Sirjacobs, Damien Declerck, Stéphane Vandamme, Peter Wilmotte, Annick Baurain, Denis Becker, Pierre Gigascience Technical Note BACKGROUND: Microbial culture collections play a key role in taxonomy by studying the diversity of their strains and providing well-characterized biological material to the scientific community for fundamental and applied research. These microbial resource centers thus need to implement new standards in species delineation, including whole-genome sequencing and phylogenomics. In this context, the genomic needs of the Belgian Coordinated Collections of Microorganisms were studied, resulting in the GEN-ERA toolbox. The latter is a unified cluster of bioinformatic workflows dedicated to both bacteria and small eukaryotes (e.g., yeasts). FINDINGS: This public toolbox allows researchers without a specific training in bioinformatics to perform robust phylogenomic analyses. Hence, it facilitates all steps from genome downloading and quality assessment, including genomic contamination estimation, to tree reconstruction. It also offers workflows for average nucleotide identity comparisons and metabolic modeling. TECHNICAL DETAILS: Nextflow workflows are launched by a single command and are available on the GEN-ERA GitHub repository (https://github.com/Lcornet/GENERA). All the workflows are based on Singularity containers to increase reproducibility. TESTING: The toolbox was developed for a diversity of microorganisms, including bacteria and fungi. It was further tested on an empirical dataset of 18 (meta)genomes of early branching Cyanobacteria, providing the most up-to-date phylogenomic analysis of the Gloeobacterales order, the first group to diverge in the evolutionary tree of Cyanobacteria. CONCLUSION: The GEN-ERA toolbox can be used to infer completely reproducible comparative genomic and metabolic analyses on prokaryotes and small eukaryotes. Although designed for routine bioinformatics of culture collections, it can also be used by all researchers interested in microbial taxonomy, as exemplified by our case study on Gloeobacterales. Oxford University Press 2023-04-10 /pmc/articles/PMC10084500/ /pubmed/37036103 http://dx.doi.org/10.1093/gigascience/giad022 Text en © The Author(s) 2023. Published by Oxford University Press GigaScience. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Technical Note Cornet, Luc Durieu, Benoit Baert, Frederik D'hooge, Elizabet Colignon, David Meunier, Loic Lupo, Valérian Cleenwerck, Ilse Daniel, Heide-Marie Rigouts, Leen Sirjacobs, Damien Declerck, Stéphane Vandamme, Peter Wilmotte, Annick Baurain, Denis Becker, Pierre The GEN-ERA toolbox: unified and reproducible workflows for research in microbial genomics |
title | The GEN-ERA toolbox: unified and reproducible workflows for research in microbial genomics |
title_full | The GEN-ERA toolbox: unified and reproducible workflows for research in microbial genomics |
title_fullStr | The GEN-ERA toolbox: unified and reproducible workflows for research in microbial genomics |
title_full_unstemmed | The GEN-ERA toolbox: unified and reproducible workflows for research in microbial genomics |
title_short | The GEN-ERA toolbox: unified and reproducible workflows for research in microbial genomics |
title_sort | gen-era toolbox: unified and reproducible workflows for research in microbial genomics |
topic | Technical Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10084500/ https://www.ncbi.nlm.nih.gov/pubmed/37036103 http://dx.doi.org/10.1093/gigascience/giad022 |
work_keys_str_mv | AT cornetluc thegeneratoolboxunifiedandreproducibleworkflowsforresearchinmicrobialgenomics AT durieubenoit thegeneratoolboxunifiedandreproducibleworkflowsforresearchinmicrobialgenomics AT baertfrederik thegeneratoolboxunifiedandreproducibleworkflowsforresearchinmicrobialgenomics AT dhoogeelizabet thegeneratoolboxunifiedandreproducibleworkflowsforresearchinmicrobialgenomics AT colignondavid thegeneratoolboxunifiedandreproducibleworkflowsforresearchinmicrobialgenomics AT meunierloic thegeneratoolboxunifiedandreproducibleworkflowsforresearchinmicrobialgenomics AT lupovalerian thegeneratoolboxunifiedandreproducibleworkflowsforresearchinmicrobialgenomics AT cleenwerckilse thegeneratoolboxunifiedandreproducibleworkflowsforresearchinmicrobialgenomics AT danielheidemarie thegeneratoolboxunifiedandreproducibleworkflowsforresearchinmicrobialgenomics AT rigoutsleen thegeneratoolboxunifiedandreproducibleworkflowsforresearchinmicrobialgenomics AT sirjacobsdamien thegeneratoolboxunifiedandreproducibleworkflowsforresearchinmicrobialgenomics AT declerckstephane thegeneratoolboxunifiedandreproducibleworkflowsforresearchinmicrobialgenomics AT vandammepeter thegeneratoolboxunifiedandreproducibleworkflowsforresearchinmicrobialgenomics AT wilmotteannick thegeneratoolboxunifiedandreproducibleworkflowsforresearchinmicrobialgenomics AT bauraindenis thegeneratoolboxunifiedandreproducibleworkflowsforresearchinmicrobialgenomics AT beckerpierre thegeneratoolboxunifiedandreproducibleworkflowsforresearchinmicrobialgenomics AT cornetluc generatoolboxunifiedandreproducibleworkflowsforresearchinmicrobialgenomics AT durieubenoit generatoolboxunifiedandreproducibleworkflowsforresearchinmicrobialgenomics AT baertfrederik generatoolboxunifiedandreproducibleworkflowsforresearchinmicrobialgenomics AT dhoogeelizabet generatoolboxunifiedandreproducibleworkflowsforresearchinmicrobialgenomics AT colignondavid generatoolboxunifiedandreproducibleworkflowsforresearchinmicrobialgenomics AT meunierloic generatoolboxunifiedandreproducibleworkflowsforresearchinmicrobialgenomics AT lupovalerian generatoolboxunifiedandreproducibleworkflowsforresearchinmicrobialgenomics AT cleenwerckilse generatoolboxunifiedandreproducibleworkflowsforresearchinmicrobialgenomics AT danielheidemarie generatoolboxunifiedandreproducibleworkflowsforresearchinmicrobialgenomics AT rigoutsleen generatoolboxunifiedandreproducibleworkflowsforresearchinmicrobialgenomics AT sirjacobsdamien generatoolboxunifiedandreproducibleworkflowsforresearchinmicrobialgenomics AT declerckstephane generatoolboxunifiedandreproducibleworkflowsforresearchinmicrobialgenomics AT vandammepeter generatoolboxunifiedandreproducibleworkflowsforresearchinmicrobialgenomics AT wilmotteannick generatoolboxunifiedandreproducibleworkflowsforresearchinmicrobialgenomics AT bauraindenis generatoolboxunifiedandreproducibleworkflowsforresearchinmicrobialgenomics AT beckerpierre generatoolboxunifiedandreproducibleworkflowsforresearchinmicrobialgenomics |