Cargando…

2passtools: two-pass alignment using machine-learning-filtered splice junctions increases the accuracy of intron detection in long-read RNA sequencing

Transcription of eukaryotic genomes involves complex alternative processing of RNAs. Sequencing of full-length RNAs using long reads reveals the true complexity of processing. However, the relatively high error rates of long-read sequencing technologies can reduce the accuracy of intron identificati...

Descripción completa

Detalles Bibliográficos
Autores principales: Parker, Matthew T., Knop, Katarzyna, Barton, Geoffrey J., Simpson, Gordon G.
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2021
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7919322/
https://www.ncbi.nlm.nih.gov/pubmed/33648554
http://dx.doi.org/10.1186/s13059-021-02296-0
Descripción
Sumario:Transcription of eukaryotic genomes involves complex alternative processing of RNAs. Sequencing of full-length RNAs using long reads reveals the true complexity of processing. However, the relatively high error rates of long-read sequencing technologies can reduce the accuracy of intron identification. Here we apply alignment metrics and machine-learning-derived sequence information to filter spurious splice junctions from long-read alignments and use the remaining junctions to guide realignment in a two-pass approach. This method, available in the software package 2passtools (https://github.com/bartongroup/2passtools), improves the accuracy of spliced alignment and transcriptome assembly for species both with and without existing high-quality annotations. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13059-021-02296-0.