Counting pseudoalignments to novel splicing events

MOTIVATION: Alternative splicing (AS) of introns from pre-mRNA produces diverse sets of transcripts across cell types and tissues, but is also dysregulated in many diseases. Alignment-free computational methods have greatly accelerated the quantification of mRNA transcripts from short RNA-seq reads,...

Bibliographic Details
Main authors: Borozan, Luka, Rojas Ringeling, Francisca, Kao, Shao-Yen, Nikonova, Elena, Monteagudo-Mesas, Pablo, Matijević, Domagoj, Spletter, Maria L, Canzar, Stefan
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10348833/
https://www.ncbi.nlm.nih.gov/pubmed/37432342
http://dx.doi.org/10.1093/bioinformatics/btad419
_version_ 1785073746491473920
author Borozan, Luka
Rojas Ringeling, Francisca
Kao, Shao-Yen
Nikonova, Elena
Monteagudo-Mesas, Pablo
Matijević, Domagoj
Spletter, Maria L
Canzar, Stefan
author_sort Borozan, Luka
collection PubMed
description MOTIVATION: Alternative splicing (AS) of introns from pre-mRNA produces diverse sets of transcripts across cell types and tissues, but is also dysregulated in many diseases. Alignment-free computational methods have greatly accelerated the quantification of mRNA transcripts from short RNA-seq reads, but they inherently rely on a catalog of known transcripts and might miss novel, disease-specific splicing events. By contrast, alignment of reads to the genome can effectively identify novel exonic segments and introns. Event-based methods then count how many reads align to predefined features. However, an alignment is more expensive to compute and constitutes a bottleneck in many AS analysis methods. RESULTS: Here, we propose fortuna, a method that guesses novel combinations of annotated splice sites to create transcript fragments. It then pseudoaligns reads to fragments using kallisto and efficiently derives counts of the most elementary splicing units from kallisto’s equivalence classes. These counts can be directly used for AS analysis or summarized to larger units as used by other widely applied methods. In experiments on synthetic and real data, fortuna was around [Formula: see text] faster than traditional align and count approaches, and was able to analyze almost 300 million reads in just 15 min when using four threads. It mapped reads containing mismatches more accurately across novel junctions and found more reads supporting aberrant splicing events in patients with autism spectrum disorder than existing methods. We further used fortuna to identify novel, tissue-specific splicing events in Drosophila. AVAILABILITY AND IMPLEMENTATION: fortuna source code is available at https://github.com/canzarlab/fortuna.
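The description above says that counts of elementary splicing units are derived from kallisto's equivalence classes. The following is only a minimal sketch of that idea, not fortuna's actual implementation. It assumes a hypothetical mapping from each transcript fragment to the units it contains, and it assumes that a read is credited to the units shared by every fragment in its equivalence class. All names (count_units, fragment_units, eq_classes) are illustrative and do not come from the record.

```python
from collections import Counter


def count_units(eq_classes, fragment_units):
    """Aggregate equivalence-class read counts into per-unit counts.

    eq_classes:     dict mapping a tuple of fragment ids to a read count
                    (hypothetical stand-in for pseudoalignment output).
    fragment_units: dict mapping a fragment id to the set of elementary
                    units (e.g. exonic segments, junctions) it contains.

    Assumption: a read supports exactly the units common to all
    fragments in its equivalence class.
    """
    counts = Counter()
    for fragments, n_reads in eq_classes.items():
        if not fragments:
            continue  # skip empty classes; nothing to intersect
        shared = set.intersection(*(set(fragment_units[f]) for f in fragments))
        for unit in shared:
            counts[unit] += n_reads
    return counts


# Toy input: two fragments that share exonic segment "e1".
fragment_units = {
    "f1": {"e1", "j12", "e2"},
    "f2": {"e1", "j13", "e3"},
}
eq_classes = {
    ("f1",): 5,        # reads compatible only with f1
    ("f2",): 3,        # reads compatible only with f2
    ("f1", "f2"): 10,  # ambiguous reads, credited to shared units only
}
unit_counts = count_units(eq_classes, fragment_units)
```

In this toy input the ambiguous class contributes only to "e1", so reads that cannot distinguish the two fragments never inflate the junction counts. Whether fortuna resolves ambiguous classes this way is not stated in the description.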
format Online
Article
Text
id pubmed-10348833
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10348833 2023-07-15 Counting pseudoalignments to novel splicing events Borozan, Luka Rojas Ringeling, Francisca Kao, Shao-Yen Nikonova, Elena Monteagudo-Mesas, Pablo Matijević, Domagoj Spletter, Maria L Canzar, Stefan Bioinformatics Original Paper MOTIVATION: Alternative splicing (AS) of introns from pre-mRNA produces diverse sets of transcripts across cell types and tissues, but is also dysregulated in many diseases. Alignment-free computational methods have greatly accelerated the quantification of mRNA transcripts from short RNA-seq reads, but they inherently rely on a catalog of known transcripts and might miss novel, disease-specific splicing events. By contrast, alignment of reads to the genome can effectively identify novel exonic segments and introns. Event-based methods then count how many reads align to predefined features. However, an alignment is more expensive to compute and constitutes a bottleneck in many AS analysis methods. RESULTS: Here, we propose fortuna, a method that guesses novel combinations of annotated splice sites to create transcript fragments. It then pseudoaligns reads to fragments using kallisto and efficiently derives counts of the most elementary splicing units from kallisto’s equivalence classes. These counts can be directly used for AS analysis or summarized to larger units as used by other widely applied methods. In experiments on synthetic and real data, fortuna was around [Formula: see text] faster than traditional align and count approaches, and was able to analyze almost 300 million reads in just 15 min when using four threads. It mapped reads containing mismatches more accurately across novel junctions and found more reads supporting aberrant splicing events in patients with autism spectrum disorder than existing methods. We further used fortuna to identify novel, tissue-specific splicing events in Drosophila. AVAILABILITY AND IMPLEMENTATION: fortuna source code is available at https://github.com/canzarlab/fortuna. Oxford University Press 2023-07-11 /pmc/articles/PMC10348833/ /pubmed/37432342 http://dx.doi.org/10.1093/bioinformatics/btad419 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Counting pseudoalignments to novel splicing events
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10348833/
https://www.ncbi.nlm.nih.gov/pubmed/37432342
http://dx.doi.org/10.1093/bioinformatics/btad419