NanoSplicer: accurate identification of splice junctions using Oxford Nanopore sequencing

MOTIVATION: Long-read sequencing methods have considerable advantages for characterizing RNA isoforms. Oxford Nanopore sequencing records changes in electrical current when nucleic acid traverses through a pore. However, basecalling of this raw signal (known as a squiggle) is error prone, making it...

Bibliographic Details
Main authors: You, Yupei; Clark, Michael B; Shim, Heejung
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9344838/
https://www.ncbi.nlm.nih.gov/pubmed/35639973
http://dx.doi.org/10.1093/bioinformatics/btac359
_version_ 1784761302681387008
author You, Yupei
Clark, Michael B
Shim, Heejung
collection PubMed
description MOTIVATION: Long-read sequencing methods have considerable advantages for characterizing RNA isoforms. Oxford Nanopore sequencing records changes in electrical current when nucleic acid traverses through a pore. However, basecalling of this raw signal (known as a squiggle) is error prone, making it challenging to accurately identify splice junctions. Existing strategies include utilizing matched short-read data and/or annotated splice junctions to correct nanopore reads but add expense or limit junctions to known (incomplete) annotations. Therefore, a method that could accurately identify splice junctions solely from nanopore data would have numerous advantages. RESULTS: We developed ‘NanoSplicer’ to identify splice junctions using raw nanopore signal (squiggles). For each splice junction, the observed squiggle is compared to candidate squiggles representing potential junctions to identify the correct candidate. Measuring squiggle similarity enables us to compute the probability of each candidate junction and find the most likely one. We tested our method using (i) synthetic mRNAs with known splice junctions and (ii) biological mRNAs from a lung-cancer cell-line. The results from both datasets demonstrate NanoSplicer improves splice junction identification, especially when the basecalling error rate near the splice junction is elevated. AVAILABILITY AND IMPLEMENTATION: NanoSplicer is available at https://github.com/shimlab/NanoSplicer and archived at https://doi.org/10.5281/zenodo.6403849. Data is available from ENA: ERS7273757 and ERS7273453. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
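The comparison step described in the abstract (score an observed squiggle against candidate squiggles, then turn the similarities into a probability per candidate) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the NanoSplicer code: plain dynamic time warping stands in for the similarity measure, a softmax over negative distances stands in for the probability model, and the names `dtw_distance`, `candidate_probabilities`, `scale`, `junction_A` and `junction_B` are hypothetical. The signal values are toy numbers, not real nanopore data.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D signals (illustrative only)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def candidate_probabilities(observed, candidates, scale=1.0):
    """Map each candidate to a probability via a softmax over negative distances.

    `scale` is a hypothetical temperature parameter: smaller values make the
    result more peaked on the closest candidate.
    """
    distances = {name: dtw_distance(observed, sig) for name, sig in candidates.items()}
    best = min(distances.values())  # subtract the minimum for numerical stability
    weights = {name: math.exp(-(d - best) / scale) for name, d in distances.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Toy example: one observed signal and two hypothetical candidate signals.
observed = [1.0, 1.1, 3.0, 3.1, 2.0]
candidates = {
    "junction_A": [1.0, 3.0, 2.0],
    "junction_B": [1.0, 2.0, 3.0],
}
probs = candidate_probabilities(observed, candidates)
most_likely = max(probs, key=probs.get)
print(most_likely, probs)
```

The softmax is only one convenient way to normalize scores; the published method may weight or model similarity differently, so the repository linked in the abstract is the place to check the actual implementation.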
format Online
Article
Text
id pubmed-9344838
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9344838 2022-08-03 NanoSplicer: accurate identification of splice junctions using Oxford Nanopore sequencing You, Yupei Clark, Michael B Shim, Heejung Bioinformatics Original Papers
Oxford University Press 2022-05-27 /pmc/articles/PMC9344838/ /pubmed/35639973 http://dx.doi.org/10.1093/bioinformatics/btac359 Text en © The Author(s) 2022. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title NanoSplicer: accurate identification of splice junctions using Oxford Nanopore sequencing
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9344838/
https://www.ncbi.nlm.nih.gov/pubmed/35639973
http://dx.doi.org/10.1093/bioinformatics/btac359