
Transformation of alignment files improves performance of variant callers for long-read RNA sequencing data

Bibliographic Details
Main authors: de Souza, Vladimir B. C.; Jordan, Ben T.; Tseng, Elizabeth; Nelson, Elizabeth A.; Hirschi, Karen K.; Sheynkman, Gloria; Robinson, Mark D.
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10123983/
https://www.ncbi.nlm.nih.gov/pubmed/37095564
http://dx.doi.org/10.1186/s13059-023-02923-y
Description
Summary: Long-read RNA sequencing (lrRNA-seq) produces detailed information about full-length transcripts, including novel and sample-specific isoforms. Furthermore, there is an opportunity to call variants directly from lrRNA-seq data. However, most state-of-the-art variant callers have been developed for genomic DNA. Here, there are two objectives: first, we perform a mini-benchmark on GATK, DeepVariant, Clair3, and NanoCaller, primarily on PacBio Iso-Seq data, but also on Nanopore and Illumina RNA-seq data; second, we propose a pipeline to process spliced-alignment files, making them suitable for variant calling with DNA-based callers. With such manipulations, high calling performance can be achieved using DeepVariant on Iso-Seq data. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13059-023-02923-y.