
cloudrnaSPAdes: Isoform assembly using bulk barcoded RNA sequencing data

MOTIVATION: Recent advancements in long-read RNA sequencing have enabled the examination of full-length isoforms, previously uncaptured by short-read sequencing methods. An alternative powerful method for studying isoforms is through the use of barcoded short-read RNA reads, for which a barcode indi...


Bibliographic Details
Main authors: Meleshko, Dmitry, Prjbelski, Andrey D., Raiko, Mikhail, Tomescu, Alexandru I., Tilgner, Hagen, Hajirasouliha, Iman
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10402000/
https://www.ncbi.nlm.nih.gov/pubmed/37546844
http://dx.doi.org/10.1101/2023.07.25.550587
_version_ 1785084786112462848
author Meleshko, Dmitry
Prjbelski, Andrey D.
Raiko, Mikhail
Tomescu, Alexandru I.
Tilgner, Hagen
Hajirasouliha, Iman
author_facet Meleshko, Dmitry
Prjbelski, Andrey D.
Raiko, Mikhail
Tomescu, Alexandru I.
Tilgner, Hagen
Hajirasouliha, Iman
author_sort Meleshko, Dmitry
collection PubMed
description MOTIVATION: Recent advancements in long-read RNA sequencing have enabled the examination of full-length isoforms, previously uncaptured by short-read sequencing methods. An alternative powerful method for studying isoforms is through the use of barcoded short-read RNA reads, for which a barcode indicates whether two short reads arise from the same molecule or not. Such techniques include the 10x Genomics linked-read based SParse Isoform Sequencing (SPIso-seq), as well as Loop-Seq and Tell-Seq. Some applications, such as novel-isoform discovery, require very high coverage. Obtaining high coverage using long reads can be difficult, making barcoded RNA-seq data a valuable alternative for this task. However, most annotation pipelines cannot work with a set of short reads in place of a single transcript, nor can they handle coverage gaps within a molecule. To overcome this challenge, we present an RNA-seq assembler that determines the expressed isoform per barcode. RESULTS: In this paper, we present cloudrnaSPAdes, a tool for assembling full-length isoforms from barcoded RNA-seq linked-read data in a reference-free fashion. Evaluating it on simulated and real human data, we found that cloudrnaSPAdes accurately assembles isoforms, even for genes with high isoform diversity. AVAILABILITY: cloudrnaSPAdes is a feature release of the SPAdes assembler and is available at https://cab.spbu.ru/software/cloudrnaspades/.
format Online
Article
Text
id pubmed-10402000
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cold Spring Harbor Laboratory
record_format MEDLINE/PubMed
spelling pubmed-10402000 2023-08-05 cloudrnaSPAdes: Isoform assembly using bulk barcoded RNA sequencing data Meleshko, Dmitry Prjbelski, Andrey D. Raiko, Mikhail Tomescu, Alexandru I. Tilgner, Hagen Hajirasouliha, Iman bioRxiv Article MOTIVATION: Recent advancements in long-read RNA sequencing have enabled the examination of full-length isoforms, previously uncaptured by short-read sequencing methods. An alternative powerful method for studying isoforms is through the use of barcoded short-read RNA reads, for which a barcode indicates whether two short reads arise from the same molecule or not. Such techniques include the 10x Genomics linked-read based SParse Isoform Sequencing (SPIso-seq), as well as Loop-Seq and Tell-Seq. Some applications, such as novel-isoform discovery, require very high coverage. Obtaining high coverage using long reads can be difficult, making barcoded RNA-seq data a valuable alternative for this task. However, most annotation pipelines cannot work with a set of short reads in place of a single transcript, nor can they handle coverage gaps within a molecule. To overcome this challenge, we present an RNA-seq assembler that determines the expressed isoform per barcode. RESULTS: In this paper, we present cloudrnaSPAdes, a tool for assembling full-length isoforms from barcoded RNA-seq linked-read data in a reference-free fashion. Evaluating it on simulated and real human data, we found that cloudrnaSPAdes accurately assembles isoforms, even for genes with high isoform diversity. AVAILABILITY: cloudrnaSPAdes is a feature release of the SPAdes assembler and is available at https://cab.spbu.ru/software/cloudrnaspades/.
Cold Spring Harbor Laboratory 2023-07-27 /pmc/articles/PMC10402000/ /pubmed/37546844 http://dx.doi.org/10.1101/2023.07.25.550587 Text en https://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use.
spellingShingle Article
Meleshko, Dmitry
Prjbelski, Andrey D.
Raiko, Mikhail
Tomescu, Alexandru I.
Tilgner, Hagen
Hajirasouliha, Iman
cloudrnaSPAdes: Isoform assembly using bulk barcoded RNA sequencing data
title cloudrnaSPAdes: Isoform assembly using bulk barcoded RNA sequencing data
title_full cloudrnaSPAdes: Isoform assembly using bulk barcoded RNA sequencing data
title_fullStr cloudrnaSPAdes: Isoform assembly using bulk barcoded RNA sequencing data
title_full_unstemmed cloudrnaSPAdes: Isoform assembly using bulk barcoded RNA sequencing data
title_short cloudrnaSPAdes: Isoform assembly using bulk barcoded RNA sequencing data
title_sort cloudrnaspades: isoform assembly using bulk barcoded rna sequencing data
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10402000/
https://www.ncbi.nlm.nih.gov/pubmed/37546844
http://dx.doi.org/10.1101/2023.07.25.550587
work_keys_str_mv AT meleshkodmitry cloudrnaspadesisoformassemblyusingbulkbarcodedrnasequencingdata
AT prjbelskiandreyd cloudrnaspadesisoformassemblyusingbulkbarcodedrnasequencingdata
AT raikomikhail cloudrnaspadesisoformassemblyusingbulkbarcodedrnasequencingdata
AT tomescualexandrui cloudrnaspadesisoformassemblyusingbulkbarcodedrnasequencingdata
AT tilgnerhagen cloudrnaspadesisoformassemblyusingbulkbarcodedrnasequencingdata
AT hajirasoulihaiman cloudrnaspadesisoformassemblyusingbulkbarcodedrnasequencingdata