Cargando…

L-RAPiT: A Cloud-Based Computing Pipeline for the Analysis of Long-Read RNA Sequencing Data

Long-read sequencing (LRS) has been adopted to meet a wide variety of research needs, ranging from the construction of novel transcriptome annotations to the rapid identification of emerging virus variants. Amongst other advantages, LRS preserves more information about RNA at the transcript level th...

Descripción completa

Detalles Bibliográficos
Autores principales: Nelson, Theodore M., Ghosh, Sankar, Postler, Thomas S.
Formato: Online Artículo Texto
Lenguaje:English
Publicado: MDPI 2022
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9781625/
https://www.ncbi.nlm.nih.gov/pubmed/36555493
http://dx.doi.org/10.3390/ijms232415851
_version_ 1784857120097697792
author Nelson, Theodore M.
Ghosh, Sankar
Postler, Thomas S.
author_facet Nelson, Theodore M.
Ghosh, Sankar
Postler, Thomas S.
author_sort Nelson, Theodore M.
collection PubMed
description Long-read sequencing (LRS) has been adopted to meet a wide variety of research needs, ranging from the construction of novel transcriptome annotations to the rapid identification of emerging virus variants. Amongst other advantages, LRS preserves more information about RNA at the transcript level than conventional high-throughput sequencing, including far more accurate and quantitative records of splicing patterns. New studies with LRS datasets are being published at an exponential rate, generating a vast reservoir of information that can be leveraged to address a host of different research questions. However, mining such publicly available data in a tailored fashion is currently not easy, as the available software tools typically require familiarity with the command-line interface, which constitutes a significant obstacle to many researchers. Additionally, different research groups utilize different software packages to perform LRS analysis, which often prevents a direct comparison of published results across different studies. To address these challenges, we have developed the Long-Read Analysis Pipeline for Transcriptomics (L-RAPiT), a user-friendly, free pipeline requiring no dedicated computational resources or bioinformatics expertise. L-RAPiT can be implemented directly through Google Colaboratory, a system based on the open-source Jupyter notebook environment, and allows for the direct analysis of transcriptomic reads from Oxford Nanopore and PacBio LRS machines. This new pipeline enables the rapid, convenient, and standardized analysis of publicly available or newly generated LRS datasets.
format Online
Article
Text
id pubmed-9781625
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-97816252022-12-24 L-RAPiT: A Cloud-Based Computing Pipeline for the Analysis of Long-Read RNA Sequencing Data Nelson, Theodore M. Ghosh, Sankar Postler, Thomas S. Int J Mol Sci Article Long-read sequencing (LRS) has been adopted to meet a wide variety of research needs, ranging from the construction of novel transcriptome annotations to the rapid identification of emerging virus variants. Amongst other advantages, LRS preserves more information about RNA at the transcript level than conventional high-throughput sequencing, including far more accurate and quantitative records of splicing patterns. New studies with LRS datasets are being published at an exponential rate, generating a vast reservoir of information that can be leveraged to address a host of different research questions. However, mining such publicly available data in a tailored fashion is currently not easy, as the available software tools typically require familiarity with the command-line interface, which constitutes a significant obstacle to many researchers. Additionally, different research groups utilize different software packages to perform LRS analysis, which often prevents a direct comparison of published results across different studies. To address these challenges, we have developed the Long-Read Analysis Pipeline for Transcriptomics (L-RAPiT), a user-friendly, free pipeline requiring no dedicated computational resources or bioinformatics expertise. L-RAPiT can be implemented directly through Google Colaboratory, a system based on the open-source Jupyter notebook environment, and allows for the direct analysis of transcriptomic reads from Oxford Nanopore and PacBio LRS machines. This new pipeline enables the rapid, convenient, and standardized analysis of publicly available or newly generated LRS datasets. MDPI 2022-12-13 /pmc/articles/PMC9781625/ /pubmed/36555493 http://dx.doi.org/10.3390/ijms232415851 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Nelson, Theodore M.
Ghosh, Sankar
Postler, Thomas S.
L-RAPiT: A Cloud-Based Computing Pipeline for the Analysis of Long-Read RNA Sequencing Data
title L-RAPiT: A Cloud-Based Computing Pipeline for the Analysis of Long-Read RNA Sequencing Data
title_full L-RAPiT: A Cloud-Based Computing Pipeline for the Analysis of Long-Read RNA Sequencing Data
title_fullStr L-RAPiT: A Cloud-Based Computing Pipeline for the Analysis of Long-Read RNA Sequencing Data
title_full_unstemmed L-RAPiT: A Cloud-Based Computing Pipeline for the Analysis of Long-Read RNA Sequencing Data
title_short L-RAPiT: A Cloud-Based Computing Pipeline for the Analysis of Long-Read RNA Sequencing Data
title_sort l-rapit: a cloud-based computing pipeline for the analysis of long-read rna sequencing data
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9781625/
https://www.ncbi.nlm.nih.gov/pubmed/36555493
http://dx.doi.org/10.3390/ijms232415851
work_keys_str_mv AT nelsontheodorem lrapitacloudbasedcomputingpipelinefortheanalysisoflongreadrnasequencingdata
AT ghoshsankar lrapitacloudbasedcomputingpipelinefortheanalysisoflongreadrnasequencingdata
AT postlerthomass lrapitacloudbasedcomputingpipelinefortheanalysisoflongreadrnasequencingdata