Reference-free assembly of long-read transcriptome sequencing data with RNA-Bloom2

Bibliographic Details
Main authors: Nip, Ka Ming; Hafezqorani, Saber; Gagalova, Kristina K.; Chiu, Readman; Yang, Chen; Warren, René L.; Birol, Inanc
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10202958/
https://www.ncbi.nlm.nih.gov/pubmed/37217540
http://dx.doi.org/10.1038/s41467-023-38553-y
Description
Summary: Long-read sequencing technologies have improved significantly since their emergence. Their read lengths, potentially spanning entire transcripts, are advantageous for reconstructing transcriptomes. Existing long-read transcriptome assembly methods are primarily reference-based, and to date there has been little focus on reference-free transcriptome assembly. We introduce RNA-Bloom2 (https://github.com/bcgsc/RNA-Bloom), a reference-free assembly method for long-read transcriptome sequencing data. Using simulated datasets and spike-in control data, we show that the transcriptome assembly quality of RNA-Bloom2 is competitive with that of reference-based methods. Furthermore, we find that RNA-Bloom2 requires 27.0 to 80.6% of the peak memory and 3.6 to 10.8% of the total wall-clock runtime of a competing reference-free method. Finally, we showcase RNA-Bloom2 in assembling a transcriptome sample of Picea sitchensis (Sitka spruce). Since our method does not rely on a reference, it further sets the groundwork for large-scale comparative transcriptomics where high-quality draft genome assemblies are not readily available.