MasterOfPores: A Workflow for the Analysis of Oxford Nanopore Direct RNA Sequencing Datasets

Bibliographic Details
Main authors: Cozzuto, Luca; Liu, Huanle; Pryszcz, Leszek P.; Pulido, Toni Hermoso; Delgado-Tejedor, Anna; Ponomarenko, Julia; Novoa, Eva Maria
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7089958/
https://www.ncbi.nlm.nih.gov/pubmed/32256520
http://dx.doi.org/10.3389/fgene.2020.00211
_version_ 1783509828558127104
author Cozzuto, Luca
Liu, Huanle
Pryszcz, Leszek P.
Pulido, Toni Hermoso
Delgado-Tejedor, Anna
Ponomarenko, Julia
Novoa, Eva Maria
author_sort Cozzuto, Luca
collection PubMed
description The direct RNA sequencing platform offered by Oxford Nanopore Technologies allows for direct measurement of RNA molecules without the need for conversion to complementary DNA, fragmentation or amplification. As such, it is in principle capable of detecting any RNA modification present in the molecule being sequenced, as well as providing polyA tail length estimates at the level of individual RNA molecules. Although this technology has been publicly available since 2017, the complexity of the raw Nanopore data, together with the lack of systematic and reproducible pipelines, has greatly hindered general users' access to it. Here we address this problem by providing a fully benchmarked workflow for the analysis of direct RNA sequencing reads, termed MasterOfPores. The pipeline starts with a pre-processing module, which converts raw current intensities into multiple types of processed data, including FASTQ and BAM, and provides run quality metrics, quality filtering, demultiplexing, base-calling and mapping. In a second step, the pipeline performs downstream analyses of the mapped reads, including prediction of RNA modifications and estimation of polyA tail lengths. Four direct RNA MinION sequencing runs can be fully processed and analyzed in 10 h on 100 CPUs. The pipeline can also be executed on GPU, locally or in the cloud, decreasing the run time fourfold. The software is written using the Nextflow framework for parallelization and portability, and relies on Linux containers such as Docker and Singularity for better reproducibility. The MasterOfPores workflow can be executed on any Unix-compatible OS on a computer, cluster or cloud without the need to install any additional software or dependencies, and is freely available on GitHub (https://github.com/biocorecrg/master_of_pores). This workflow simplifies direct RNA sequencing data analyses, facilitating the study of the (epi)transcriptome at single-molecule resolution.
format Online
Article
Text
id pubmed-7089958
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-7089958 2020-03-31 MasterOfPores: A Workflow for the Analysis of Oxford Nanopore Direct RNA Sequencing Datasets Cozzuto, Luca Liu, Huanle Pryszcz, Leszek P. Pulido, Toni Hermoso Delgado-Tejedor, Anna Ponomarenko, Julia Novoa, Eva Maria Front Genet Genetics Frontiers Media S.A. 2020-03-17 /pmc/articles/PMC7089958/ /pubmed/32256520 http://dx.doi.org/10.3389/fgene.2020.00211 Text en
Copyright © 2020 Cozzuto, Liu, Pryszcz, Pulido, Delgado-Tejedor, Ponomarenko and Novoa. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title MasterOfPores: A Workflow for the Analysis of Oxford Nanopore Direct RNA Sequencing Datasets
topic Genetics