Pipeliner: A Nextflow-Based Framework for the Definition of Sequencing Data Processing Pipelines
The advent of high-throughput sequencing technologies has led to the need for flexible and user-friendly data preprocessing platforms. The Pipeliner framework provides an out-of-the-box solution for processing various types of sequencing data. It combines the Nextflow scripting language and Anaconda...
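Main Authors: | Federico, Anthony; Karagiannis, Tanya; Karri, Kritika; Kishore, Dileep; Koga, Yusuke; Campbell, Joshua D.; Monti, Stefano |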
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2019 |
Subjects: | Genetics |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6609566/ https://www.ncbi.nlm.nih.gov/pubmed/31316552 http://dx.doi.org/10.3389/fgene.2019.00614 |
_version_ | 1783432333262585856 |
---|---|
author | Federico, Anthony Karagiannis, Tanya Karri, Kritika Kishore, Dileep Koga, Yusuke Campbell, Joshua D. Monti, Stefano |
author_facet | Federico, Anthony Karagiannis, Tanya Karri, Kritika Kishore, Dileep Koga, Yusuke Campbell, Joshua D. Monti, Stefano |
author_sort | Federico, Anthony |
collection | PubMed |
description | The advent of high-throughput sequencing technologies has led to the need for flexible and user-friendly data preprocessing platforms. The Pipeliner framework provides an out-of-the-box solution for processing various types of sequencing data. It combines the Nextflow scripting language and Anaconda package manager to generate modular computational workflows. We have used Pipeliner to create several pipelines for sequencing data processing including bulk RNA-sequencing (RNA-seq), single-cell RNA-seq, as well as digital gene expression data. This report highlights the design methodology behind Pipeliner that enables the development of highly flexible and reproducible pipelines that are easy to extend and maintain on multiple computing environments. We also provide a quick start user guide demonstrating how to set up and execute available pipelines with toy datasets. |
format | Online Article Text |
id | pubmed-6609566 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-66095662019-07-17 Pipeliner: A Nextflow-Based Framework for the Definition of Sequencing Data Processing Pipelines Federico, Anthony Karagiannis, Tanya Karri, Kritika Kishore, Dileep Koga, Yusuke Campbell, Joshua D. Monti, Stefano Front Genet Genetics The advent of high-throughput sequencing technologies has led to the need for flexible and user-friendly data preprocessing platforms. The Pipeliner framework provides an out-of-the-box solution for processing various types of sequencing data. It combines the Nextflow scripting language and Anaconda package manager to generate modular computational workflows. We have used Pipeliner to create several pipelines for sequencing data processing including bulk RNA-sequencing (RNA-seq), single-cell RNA-seq, as well as digital gene expression data. This report highlights the design methodology behind Pipeliner that enables the development of highly flexible and reproducible pipelines that are easy to extend and maintain on multiple computing environments. We also provide a quick start user guide demonstrating how to setup and execute available pipelines with toy datasets. Frontiers Media S.A. 2019-06-28 /pmc/articles/PMC6609566/ /pubmed/31316552 http://dx.doi.org/10.3389/fgene.2019.00614 Text en Copyright © 2019 Federico, Karagiannis, Karri, Kishore, Koga, Campbell and Monti http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Federico, Anthony Karagiannis, Tanya Karri, Kritika Kishore, Dileep Koga, Yusuke Campbell, Joshua D. Monti, Stefano Pipeliner: A Nextflow-Based Framework for the Definition of Sequencing Data Processing Pipelines |
title | Pipeliner: A Nextflow-Based Framework for the Definition of Sequencing Data Processing Pipelines |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6609566/ https://www.ncbi.nlm.nih.gov/pubmed/31316552 http://dx.doi.org/10.3389/fgene.2019.00614 |