
Geniac: Automatic Configuration GENerator and Installer for nextflow pipelines


Bibliographic Details
Main authors: Allain, Fabrice, Roméjon, Julien, La Rosa, Philippe, Jarlier, Frédéric, Servant, Nicolas, Hupé, Philippe
Format: Online Article Text
Language: English
Published: F1000 Research Limited 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10445886/
https://www.ncbi.nlm.nih.gov/pubmed/37645091
http://dx.doi.org/10.12688/openreseurope.13861.2
author Allain, Fabrice
Roméjon, Julien
La Rosa, Philippe
Jarlier, Frédéric
Servant, Nicolas
Hupé, Philippe
author_facet Allain, Fabrice
Roméjon, Julien
La Rosa, Philippe
Jarlier, Frédéric
Servant, Nicolas
Hupé, Philippe
author_sort Allain, Fabrice
collection PubMed
description With the advent of high-throughput biotechnological platforms and their ever-growing capacity, life science has turned into a digitized, computational and data-intensive discipline. As a consequence, running standard analyses with a bioinformatics pipeline in routine production has become a challenge: the data must be processed in real time and delivered to end-users as fast as possible. The use of workflow management systems along with packaging systems and containerization technologies offers an opportunity to tackle this challenge. While very powerful, these tools can be used and combined in many ways, which may differ from one developer to another. Therefore, promoting homogeneous workflow implementations requires guidelines and protocols that detail how the source code of a bioinformatics pipeline should be written and organized to ensure its usability, maintainability, interoperability, sustainability, portability, reproducibility, scalability and efficiency. Capitalizing on Nextflow, Conda, Docker, Singularity and the nf-core initiative, we propose a set of best practices covering the development life cycle of a bioinformatics pipeline and its deployment for production operations. These practices target different expert communities, including i) bioinformaticians and statisticians, ii) software engineers and iii) data managers and core facility engineers. We implemented Geniac (Automatic Configuration GENerator and Installer for nextflow pipelines), a toolbox with three components: i) technical documentation, available at https://geniac.readthedocs.io, that details coding guidelines for bioinformatics pipelines written with Nextflow, ii) a command line interface with a linter to check that the code respects the guidelines, and iii) an add-on to generate configuration files, build the containers and deploy the pipeline. The Geniac toolbox aims to harmonize development practices across developers and to automate the generation of configuration files and containers by parsing the source code of the Nextflow pipeline.
format Online
Article
Text
id pubmed-10445886
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher F1000 Research Limited
record_format MEDLINE/PubMed
spelling pubmed-10445886 2023-08-29 Geniac: Automatic Configuration GENerator and Installer for nextflow pipelines Allain, Fabrice Roméjon, Julien La Rosa, Philippe Jarlier, Frédéric Servant, Nicolas Hupé, Philippe Open Res Eur Software Tool Article F1000 Research Limited 2022-02-21 /pmc/articles/PMC10445886/ /pubmed/37645091 http://dx.doi.org/10.12688/openreseurope.13861.2 Text en Copyright: © 2022 Allain F et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Software Tool Article
Allain, Fabrice
Roméjon, Julien
La Rosa, Philippe
Jarlier, Frédéric
Servant, Nicolas
Hupé, Philippe
Geniac: Automatic Configuration GENerator and Installer for nextflow pipelines
title Geniac: Automatic Configuration GENerator and Installer for nextflow pipelines
title_full Geniac: Automatic Configuration GENerator and Installer for nextflow pipelines
title_fullStr Geniac: Automatic Configuration GENerator and Installer for nextflow pipelines
title_full_unstemmed Geniac: Automatic Configuration GENerator and Installer for nextflow pipelines
title_short Geniac: Automatic Configuration GENerator and Installer for nextflow pipelines
title_sort geniac: automatic configuration generator and installer for nextflow pipelines
topic Software Tool Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10445886/
https://www.ncbi.nlm.nih.gov/pubmed/37645091
http://dx.doi.org/10.12688/openreseurope.13861.2