Pipeliner: A Nextflow-Based Framework for the Definition of Sequencing Data Processing Pipelines

The advent of high-throughput sequencing technologies has led to the need for flexible and user-friendly data preprocessing platforms. The Pipeliner framework provides an out-of-the-box solution for processing various types of sequencing data. It combines the Nextflow scripting language and Anaconda...

Bibliographic Details
Main authors: Federico, Anthony; Karagiannis, Tanya; Karri, Kritika; Kishore, Dileep; Koga, Yusuke; Campbell, Joshua D.; Monti, Stefano
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2019
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6609566/
https://www.ncbi.nlm.nih.gov/pubmed/31316552
http://dx.doi.org/10.3389/fgene.2019.00614
_version_ 1783432333262585856
author Federico, Anthony
Karagiannis, Tanya
Karri, Kritika
Kishore, Dileep
Koga, Yusuke
Campbell, Joshua D.
Monti, Stefano
author_facet Federico, Anthony
Karagiannis, Tanya
Karri, Kritika
Kishore, Dileep
Koga, Yusuke
Campbell, Joshua D.
Monti, Stefano
author_sort Federico, Anthony
collection PubMed
description The advent of high-throughput sequencing technologies has led to the need for flexible and user-friendly data preprocessing platforms. The Pipeliner framework provides an out-of-the-box solution for processing various types of sequencing data. It combines the Nextflow scripting language and Anaconda package manager to generate modular computational workflows. We have used Pipeliner to create several pipelines for sequencing data processing including bulk RNA-sequencing (RNA-seq), single-cell RNA-seq, as well as digital gene expression data. This report highlights the design methodology behind Pipeliner that enables the development of highly flexible and reproducible pipelines that are easy to extend and maintain on multiple computing environments. We also provide a quick start user guide demonstrating how to set up and execute available pipelines with toy datasets.
format Online
Article
Text
id pubmed-6609566
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-6609566 2019-07-17 Pipeliner: A Nextflow-Based Framework for the Definition of Sequencing Data Processing Pipelines Federico, Anthony Karagiannis, Tanya Karri, Kritika Kishore, Dileep Koga, Yusuke Campbell, Joshua D. Monti, Stefano Front Genet Genetics The advent of high-throughput sequencing technologies has led to the need for flexible and user-friendly data preprocessing platforms. The Pipeliner framework provides an out-of-the-box solution for processing various types of sequencing data. It combines the Nextflow scripting language and Anaconda package manager to generate modular computational workflows. We have used Pipeliner to create several pipelines for sequencing data processing including bulk RNA-sequencing (RNA-seq), single-cell RNA-seq, as well as digital gene expression data. This report highlights the design methodology behind Pipeliner that enables the development of highly flexible and reproducible pipelines that are easy to extend and maintain on multiple computing environments. We also provide a quick start user guide demonstrating how to set up and execute available pipelines with toy datasets. Frontiers Media S.A. 2019-06-28 /pmc/articles/PMC6609566/ /pubmed/31316552 http://dx.doi.org/10.3389/fgene.2019.00614 Text en Copyright © 2019 Federico, Karagiannis, Karri, Kishore, Koga, Campbell and Monti http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Genetics
Federico, Anthony
Karagiannis, Tanya
Karri, Kritika
Kishore, Dileep
Koga, Yusuke
Campbell, Joshua D.
Monti, Stefano
Pipeliner: A Nextflow-Based Framework for the Definition of Sequencing Data Processing Pipelines
title Pipeliner: A Nextflow-Based Framework for the Definition of Sequencing Data Processing Pipelines
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6609566/
https://www.ncbi.nlm.nih.gov/pubmed/31316552
http://dx.doi.org/10.3389/fgene.2019.00614