Splam: a deep-learning-based splice site predictor that improves spliced alignments
The process of splicing messenger RNA to remove introns plays a central role in creating genes and gene variants. Here we describe Splam, a novel method for predicting splice junctions in DNA based on deep residual convolutional neural networks. Unlike some previous models, Splam looks at a relative...
Main authors: | Chao, Kuan-Hao; Mao, Alan; Salzberg, Steven L; Pertea, Mihaela |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10402160/ https://www.ncbi.nlm.nih.gov/pubmed/37546880 http://dx.doi.org/10.1101/2023.07.27.550754 |
_version_ | 1785084811495342080 |
---|---|
author | Chao, Kuan-Hao Mao, Alan Salzberg, Steven L Pertea, Mihaela |
author_facet | Chao, Kuan-Hao Mao, Alan Salzberg, Steven L Pertea, Mihaela |
author_sort | Chao, Kuan-Hao |
collection | PubMed |
description | The process of splicing messenger RNA to remove introns plays a central role in creating genes and gene variants. Here we describe Splam, a novel method for predicting splice junctions in DNA based on deep residual convolutional neural networks. Unlike some previous models, Splam looks at a relatively limited window of 400 base pairs flanking each splice site, motivated by the observation that the biological process of splicing relies primarily on signals within this window. Additionally, Splam introduces the idea of training the network on donor and acceptor pairs together, based on the principle that the splicing machinery recognizes both ends of each intron at once. We compare Splam’s accuracy to recent state-of-the-art splice site prediction methods, particularly SpliceAI, another method that uses deep neural networks. Our results show that Splam is consistently more accurate than SpliceAI, with an overall accuracy of 96% at predicting human splice junctions. Splam generalizes even to non-human species, including distant ones like the flowering plant Arabidopsis thaliana. Finally, we demonstrate the use of Splam on a novel application: processing the spliced alignments of RNA-seq data to identify and eliminate errors. We show that when used in this manner, Splam yields substantial improvements in the accuracy of downstream transcriptome analysis of both poly(A) and ribo-depleted RNA-seq libraries. Overall, Splam offers a faster and more accurate approach to detecting splice junctions, while also providing a reliable and efficient solution for cleaning up erroneous spliced alignments. |
format | Online Article Text |
id | pubmed-10402160 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Cold Spring Harbor Laboratory |
record_format | MEDLINE/PubMed |
spelling | pubmed-10402160 2023-08-05 Splam: a deep-learning-based splice site predictor that improves spliced alignments Chao, Kuan-Hao Mao, Alan Salzberg, Steven L Pertea, Mihaela bioRxiv Article Cold Spring Harbor Laboratory 2023-07-29 /pmc/articles/PMC10402160/ /pubmed/37546880 http://dx.doi.org/10.1101/2023.07.27.550754 Text en https://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use. |
spellingShingle | Article Chao, Kuan-Hao Mao, Alan Salzberg, Steven L Pertea, Mihaela Splam: a deep-learning-based splice site predictor that improves spliced alignments |
title | Splam: a deep-learning-based splice site predictor that improves spliced alignments |
title_full | Splam: a deep-learning-based splice site predictor that improves spliced alignments |
title_fullStr | Splam: a deep-learning-based splice site predictor that improves spliced alignments |
title_full_unstemmed | Splam: a deep-learning-based splice site predictor that improves spliced alignments |
title_short | Splam: a deep-learning-based splice site predictor that improves spliced alignments |
title_sort | splam: a deep-learning-based splice site predictor that improves spliced alignments |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10402160/ https://www.ncbi.nlm.nih.gov/pubmed/37546880 http://dx.doi.org/10.1101/2023.07.27.550754 |
work_keys_str_mv | AT chaokuanhao splamadeeplearningbasedsplicesitepredictorthatimprovessplicedalignments AT maoalan splamadeeplearningbasedsplicesitepredictorthatimprovessplicedalignments AT salzbergstevenl splamadeeplearningbasedsplicesitepredictorthatimprovessplicedalignments AT perteamihaela splamadeeplearningbasedsplicesitepredictorthatimprovessplicedalignments |
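
The description field above says that Splam looks at a window of 400 base pairs flanking each splice site and treats donor and acceptor sites as a pair. The following is a minimal sketch of how such a paired input could be assembled. It is not the authors' code: the 200 + 200 split of the window around each site, the padding with `N` at sequence ends, and the names `site_window`, `junction_input`, and `one_hot` are all assumptions made for illustration.

```python
from typing import List

# Assumption: the 400 bp window is split as 200 nt on each side of the site.
FLANK = 200


def site_window(seq: str, pos: int, flank: int = FLANK) -> str:
    """Return the 2*flank window centred on pos (0 <= pos <= len(seq)).

    Positions that fall outside the sequence are padded with 'N'.
    """
    left = max(0, pos - flank)
    right = min(len(seq), pos + flank)
    pad_left = "N" * (flank - (pos - left))
    pad_right = "N" * (flank - (right - pos))
    return pad_left + seq[left:right] + pad_right


def junction_input(seq: str, donor: int, acceptor: int, flank: int = FLANK) -> str:
    """Concatenate the donor window and the acceptor window into one input."""
    return site_window(seq, donor, flank) + site_window(seq, acceptor, flank)


def one_hot(window: str) -> List[List[int]]:
    """One-hot encode A/C/G/T; any other symbol (e.g. 'N') becomes all zeros."""
    table = {
        "A": [1, 0, 0, 0],
        "C": [0, 1, 0, 0],
        "G": [0, 0, 1, 0],
        "T": [0, 0, 0, 1],
    }
    return [table.get(base, [0, 0, 0, 0]) for base in window.upper()]


if __name__ == "__main__":
    toy_sequence = "ACGT" * 250  # 1000 nt toy sequence, not real data
    paired = junction_input(toy_sequence, donor=300, acceptor=700)
    encoded = one_hot(paired)
    print(len(paired), len(encoded), len(encoded[0]))
```

Under these assumptions, each junction becomes one fixed-length sequence covering both ends of the intron, which is the kind of input a convolutional network could score as a pair rather than site by site.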