GenPipes: an open-source framework for distributed and scalable genomic analyses

BACKGROUND: With the decreasing cost of sequencing and the rapid developments in genomics technologies and protocols, the need for validated bioinformatics software that enables efficient large-scale data processing is growing. FINDINGS: Here we present GenPipes, a flexible Python-based framework th...

Bibliographic Details
Main Authors: Bourgey, Mathieu; Dali, Rola; Eveleigh, Robert; Chen, Kuang Chung; Letourneau, Louis; Fillon, Joel; Michaud, Marc; Caron, Maxime; Sandoval, Johanna; Lefebvre, Francois; Leveque, Gary; Mercier, Eloi; Bujold, David; Marquis, Pascale; Van, Patrick Tran; Anderson de Lima Morais, David; Tremblay, Julien; Shao, Xiaojian; Henrion, Edouard; Gonzalez, Emmanuel; Quirion, Pierre-Olivier; Caron, Bryan; Bourque, Guillaume
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6559338/
https://www.ncbi.nlm.nih.gov/pubmed/31185495
http://dx.doi.org/10.1093/gigascience/giz037
author Bourgey, Mathieu
Dali, Rola
Eveleigh, Robert
Chen, Kuang Chung
Letourneau, Louis
Fillon, Joel
Michaud, Marc
Caron, Maxime
Sandoval, Johanna
Lefebvre, Francois
Leveque, Gary
Mercier, Eloi
Bujold, David
Marquis, Pascale
Van, Patrick Tran
Anderson de Lima Morais, David
Tremblay, Julien
Shao, Xiaojian
Henrion, Edouard
Gonzalez, Emmanuel
Quirion, Pierre-Olivier
Caron, Bryan
Bourque, Guillaume
author_sort Bourgey, Mathieu
collection PubMed
description BACKGROUND: With the decreasing cost of sequencing and the rapid developments in genomics technologies and protocols, the need for validated bioinformatics software that enables efficient large-scale data processing is growing. FINDINGS: Here we present GenPipes, a flexible Python-based framework that facilitates the development and deployment of multi-step workflows optimized for high-performance computing clusters and the cloud. GenPipes already implements 12 validated and scalable pipelines for various genomics applications, including RNA sequencing, chromatin immunoprecipitation sequencing, DNA sequencing, methylation sequencing, Hi-C, capture Hi-C, metagenomics, and Pacific Biosciences long-read assembly. The software is available under a GPLv3 open source license and is continuously updated to follow recent advances in genomics and bioinformatics. The framework has already been configured on several servers, and a Docker image is also available to facilitate additional installations. CONCLUSIONS: GenPipes offers genomics researchers a simple method to analyze different types of data, customizable to their needs and resources, as well as the flexibility to create their own workflows.
format Online
Article
Text
id pubmed-6559338
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6559338 2019-06-17 GenPipes: an open-source framework for distributed and scalable genomic analyses Gigascience Technical Note Oxford University Press 2019-06-11 /pmc/articles/PMC6559338/ /pubmed/31185495 http://dx.doi.org/10.1093/gigascience/giz037 Text en © The Author(s) 2019. Published by Oxford University Press.
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title GenPipes: an open-source framework for distributed and scalable genomic analyses
topic Technical Note