Identification and quantification of small exon-containing isoforms in long-read RNA sequencing data
Small exons are pervasive in transcriptomes across organisms, and their quantification in RNA isoforms is crucial for understanding gene functions. Although long-read RNA-seq based on Oxford Nanopore Technologies (ONT) offers the advantage of covering transcripts in full length, its lower base accur...
Main Authors: | Liu, Zhen; Zhu, Chenchen; Steinmetz, Lars M; Wei, Wu |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2023 |
Subjects: | Methods |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10639058/ https://www.ncbi.nlm.nih.gov/pubmed/37843096 http://dx.doi.org/10.1093/nar/gkad810 |
_version_ | 1785146615720312832 |
---|---|
author | Liu, Zhen; Zhu, Chenchen; Steinmetz, Lars M; Wei, Wu |
author_facet | Liu, Zhen; Zhu, Chenchen; Steinmetz, Lars M; Wei, Wu |
author_sort | Liu, Zhen |
collection | PubMed |
description | Small exons are pervasive in transcriptomes across organisms, and their quantification in RNA isoforms is crucial for understanding gene functions. Although long-read RNA-seq based on Oxford Nanopore Technologies (ONT) offers the advantage of covering transcripts in full length, its lower base accuracy poses challenges for identifying individual exons, particularly microexons (≤ 30 nucleotides). Here, we systematically assess small exons quantification in synthetic and human ONT RNA-seq datasets. We demonstrate that reads containing small exons are often not properly aligned, affecting the quantification of relevant transcripts. Thus, we develop a local-realignment method for misaligned exons (MisER), which remaps reads with misaligned exons to the transcript references. Using synthetic and simulated datasets, we demonstrate the high sensitivity and specificity of MisER for the quantification of transcripts containing small exons. Moreover, MisER enabled us to identify small exons with a higher percent spliced-in index (PSI) in neural, particularly neural-regulated microexons, when comparing 14 neural to 16 non-neural tissues in humans. Our work introduces an improved quantification method for long-read RNA-seq and especially facilitates studies using ONT long-reads to elucidate the regulation of genes involving small exons. |
format | Online Article Text |
id | pubmed-10639058 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-106390582023-11-15 Identification and quantification of small exon-containing isoforms in long-read RNA sequencing data Liu, Zhen Zhu, Chenchen Steinmetz, Lars M Wei, Wu Nucleic Acids Res Methods Small exons are pervasive in transcriptomes across organisms, and their quantification in RNA isoforms is crucial for understanding gene functions. Although long-read RNA-seq based on Oxford Nanopore Technologies (ONT) offers the advantage of covering transcripts in full length, its lower base accuracy poses challenges for identifying individual exons, particularly microexons (≤ 30 nucleotides). Here, we systematically assess small exons quantification in synthetic and human ONT RNA-seq datasets. We demonstrate that reads containing small exons are often not properly aligned, affecting the quantification of relevant transcripts. Thus, we develop a local-realignment method for misaligned exons (MisER), which remaps reads with misaligned exons to the transcript references. Using synthetic and simulated datasets, we demonstrate the high sensitivity and specificity of MisER for the quantification of transcripts containing small exons. Moreover, MisER enabled us to identify small exons with a higher percent spliced-in index (PSI) in neural, particularly neural-regulated microexons, when comparing 14 neural to 16 non-neural tissues in humans. Our work introduces an improved quantification method for long-read RNA-seq and especially facilitates studies using ONT long-reads to elucidate the regulation of genes involving small exons. Oxford University Press 2023-10-16 /pmc/articles/PMC10639058/ /pubmed/37843096 http://dx.doi.org/10.1093/nar/gkad810 Text en © The Author(s) 2023. Published by Oxford University Press on behalf of Nucleic Acids Research. 
https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Methods Liu, Zhen Zhu, Chenchen Steinmetz, Lars M Wei, Wu Identification and quantification of small exon-containing isoforms in long-read RNA sequencing data |
title | Identification and quantification of small exon-containing isoforms in long-read RNA sequencing data |
title_full | Identification and quantification of small exon-containing isoforms in long-read RNA sequencing data |
title_fullStr | Identification and quantification of small exon-containing isoforms in long-read RNA sequencing data |
title_full_unstemmed | Identification and quantification of small exon-containing isoforms in long-read RNA sequencing data |
title_short | Identification and quantification of small exon-containing isoforms in long-read RNA sequencing data |
title_sort | identification and quantification of small exon-containing isoforms in long-read rna sequencing data |
topic | Methods |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10639058/ https://www.ncbi.nlm.nih.gov/pubmed/37843096 http://dx.doi.org/10.1093/nar/gkad810 |
work_keys_str_mv | AT liuzhen identificationandquantificationofsmallexoncontainingisoformsinlongreadrnasequencingdata AT zhuchenchen identificationandquantificationofsmallexoncontainingisoformsinlongreadrnasequencingdata AT steinmetzlarsm identificationandquantificationofsmallexoncontainingisoformsinlongreadrnasequencingdata AT weiwu identificationandquantificationofsmallexoncontainingisoformsinlongreadrnasequencingdata |
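
The description in this record refers to microexons (≤ 30 nucleotides) and to the percent spliced-in index (PSI). As a rough illustration only, the sketch below uses the commonly cited definition of PSI, inclusion reads divided by the sum of inclusion and exclusion reads. It is not taken from MisER and may differ from how the authors compute PSI. The function names, the constant, and the read counts are hypothetical.

```python
from typing import Optional

# Length threshold for a microexon, as stated in the record's description.
MICROEXON_MAX_LEN = 30  # nucleotides


def is_microexon(exon_length: int) -> bool:
    """Return True if an exon of this length (in nt) counts as a microexon."""
    if exon_length <= 0:
        raise ValueError("exon_length must be positive")
    return exon_length <= MICROEXON_MAX_LEN


def percent_spliced_in(inclusion_reads: int, exclusion_reads: int) -> Optional[float]:
    """Common PSI definition (an assumption here, not MisER's code):
    reads supporting exon inclusion / all reads informative for the exon.
    Returns None when no informative reads are available."""
    if inclusion_reads < 0 or exclusion_reads < 0:
        raise ValueError("read counts must be non-negative")
    total = inclusion_reads + exclusion_reads
    if total == 0:
        return None  # PSI is undefined without coverage
    return inclusion_reads / total


if __name__ == "__main__":
    # Hypothetical counts: 30 reads include the exon, 10 reads skip it.
    print(percent_spliced_in(30, 10))
    print(is_microexon(27))
```

Returning `None` for zero coverage, rather than 0.0, keeps exons without informative reads distinguishable from exons that are always skipped.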