Geniac: Automatic Configuration GENerator and Installer for nextflow pipelines
With the advent of high-throughput biotechnological platforms and their ever-growing capacity, life science has turned into a digitized, computational and data-intensive discipline. As a consequence, standard analysis with a bioinformatics pipeline in the context of routine production has become a c...
Main Authors: | Allain, Fabrice; Roméjon, Julien; La Rosa, Philippe; Jarlier, Frédéric; Servant, Nicolas; Hupé, Philippe |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | F1000 Research Limited 2022 |
Subjects: | Software Tool Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10445886/ https://www.ncbi.nlm.nih.gov/pubmed/37645091 http://dx.doi.org/10.12688/openreseurope.13861.2 |
_version_ | 1785094279642742784 |
---|---|
author | Allain, Fabrice Roméjon, Julien La Rosa, Philippe Jarlier, Frédéric Servant, Nicolas Hupé, Philippe |
author_facet | Allain, Fabrice Roméjon, Julien La Rosa, Philippe Jarlier, Frédéric Servant, Nicolas Hupé, Philippe |
author_sort | Allain, Fabrice |
collection | PubMed |
description | With the advent of high-throughput biotechnological platforms and their ever-growing capacity, life science has turned into a digitized, computational and data-intensive discipline. As a consequence, standard analysis with a bioinformatics pipeline in the context of routine production has become a challenge such that the data can be processed in real-time and delivered to the end-users as fast as possible. The usage of workflow management systems along with packaging systems and containerization technologies offer an opportunity to tackle this challenge. While very powerful, they can be used and combined in many multiple ways which may differ from one developer to another. Therefore, promoting the homogeneity of the workflow implementation requires guidelines and protocols which detail how the source code of the bioinformatics pipeline should be written and organized to ensure its usability, maintainability, interoperability, sustainability, portability, reproducibility, scalability and efficiency. Capitalizing on Nextflow, Conda, Docker, Singularity and the nf-core initiative, we propose a set of best practices along the development life cycle of the bioinformatics pipeline and deployment for production operations which target different expert communities including i) the bioinformaticians and statisticians ii) the software engineers and iii) the data managers and core facility engineers. We implemented Geniac (Automatic Configuration GENerator and Installer for nextflow pipelines) which consists of a toolbox with three components: i) a technical documentation available at https://geniac.readthedocs.io to detail coding guidelines for the bioinformatics pipeline with Nextflow, ii) a command line interface with a linter to check that the code respects the guidelines, and iii) an add-on to generate configuration files, build the containers and deploy the pipeline. The Geniac toolbox aims at the harmonization of development practices across developers and automation of the generation of configuration files and containers by parsing the source code of the Nextflow pipeline. |
format | Online Article Text |
id | pubmed-10445886 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | F1000 Research Limited |
record_format | MEDLINE/PubMed |
spelling | pubmed-10445886 2023-08-29 Geniac: Automatic Configuration GENerator and Installer for nextflow pipelines Allain, Fabrice Roméjon, Julien La Rosa, Philippe Jarlier, Frédéric Servant, Nicolas Hupé, Philippe Open Res Eur Software Tool Article With the advent of high-throughput biotechnological platforms and their ever-growing capacity, life science has turned into a digitized, computational and data-intensive discipline. As a consequence, standard analysis with a bioinformatics pipeline in the context of routine production has become a challenge such that the data can be processed in real-time and delivered to the end-users as fast as possible. The usage of workflow management systems along with packaging systems and containerization technologies offer an opportunity to tackle this challenge. While very powerful, they can be used and combined in many multiple ways which may differ from one developer to another. Therefore, promoting the homogeneity of the workflow implementation requires guidelines and protocols which detail how the source code of the bioinformatics pipeline should be written and organized to ensure its usability, maintainability, interoperability, sustainability, portability, reproducibility, scalability and efficiency. Capitalizing on Nextflow, Conda, Docker, Singularity and the nf-core initiative, we propose a set of best practices along the development life cycle of the bioinformatics pipeline and deployment for production operations which target different expert communities including i) the bioinformaticians and statisticians ii) the software engineers and iii) the data managers and core facility engineers. We implemented Geniac (Automatic Configuration GENerator and Installer for nextflow pipelines) which consists of a toolbox with three components: i) a technical documentation available at https://geniac.readthedocs.io to detail coding guidelines for the bioinformatics pipeline with Nextflow, ii) a command line interface with a linter to check that the code respects the guidelines, and iii) an add-on to generate configuration files, build the containers and deploy the pipeline. The Geniac toolbox aims at the harmonization of development practices across developers and automation of the generation of configuration files and containers by parsing the source code of the Nextflow pipeline. F1000 Research Limited 2022-02-21 /pmc/articles/PMC10445886/ /pubmed/37645091 http://dx.doi.org/10.12688/openreseurope.13861.2 Text en Copyright: © 2022 Allain F et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Software Tool Article Allain, Fabrice Roméjon, Julien La Rosa, Philippe Jarlier, Frédéric Servant, Nicolas Hupé, Philippe Geniac: Automatic Configuration GENerator and Installer for nextflow pipelines |
title | Geniac: Automatic Configuration GENerator and Installer for nextflow pipelines |
title_full | Geniac: Automatic Configuration GENerator and Installer for nextflow pipelines |
title_fullStr | Geniac: Automatic Configuration GENerator and Installer for nextflow pipelines |
title_full_unstemmed | Geniac: Automatic Configuration GENerator and Installer for nextflow pipelines |
title_short | Geniac: Automatic Configuration GENerator and Installer for nextflow pipelines |
title_sort | geniac: automatic configuration generator and installer for nextflow pipelines |
topic | Software Tool Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10445886/ https://www.ncbi.nlm.nih.gov/pubmed/37645091 http://dx.doi.org/10.12688/openreseurope.13861.2 |
work_keys_str_mv | AT allainfabrice geniacautomaticconfigurationgeneratorandinstallerfornextflowpipelines AT romejonjulien geniacautomaticconfigurationgeneratorandinstallerfornextflowpipelines AT larosaphilippe geniacautomaticconfigurationgeneratorandinstallerfornextflowpipelines AT jarlierfrederic geniacautomaticconfigurationgeneratorandinstallerfornextflowpipelines AT servantnicolas geniacautomaticconfigurationgeneratorandinstallerfornextflowpipelines AT hupephilippe geniacautomaticconfigurationgeneratorandinstallerfornextflowpipelines |