ESPRESSO: Robust discovery and quantification of transcript isoforms from error-prone long-read RNA-seq data

Long-read RNA sequencing (RNA-seq) holds great potential for characterizing transcriptome variation and full-length transcript isoforms, but the relatively high error rate of current long-read sequencing platforms poses a major challenge. We present ESPRESSO, a computational tool for robust discover...

Bibliographic Details
Main Authors: Gao, Yuan; Wang, Feng; Wang, Robert; Kutschera, Eric; Xu, Yang; Xie, Stephan; Wang, Yuanyuan; Kadash-Edmondson, Kathryn E.; Lin, Lan; Xing, Yi
Format: Online Article Text
Language: English
Published: American Association for the Advancement of Science 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9858503/
https://www.ncbi.nlm.nih.gov/pubmed/36662851
http://dx.doi.org/10.1126/sciadv.abq5072
_version_ 1784874117293408256
author Gao, Yuan
Wang, Feng
Wang, Robert
Kutschera, Eric
Xu, Yang
Xie, Stephan
Wang, Yuanyuan
Kadash-Edmondson, Kathryn E.
Lin, Lan
Xing, Yi
author_facet Gao, Yuan
Wang, Feng
Wang, Robert
Kutschera, Eric
Xu, Yang
Xie, Stephan
Wang, Yuanyuan
Kadash-Edmondson, Kathryn E.
Lin, Lan
Xing, Yi
author_sort Gao, Yuan
collection PubMed
description Long-read RNA sequencing (RNA-seq) holds great potential for characterizing transcriptome variation and full-length transcript isoforms, but the relatively high error rate of current long-read sequencing platforms poses a major challenge. We present ESPRESSO, a computational tool for robust discovery and quantification of transcript isoforms from error-prone long reads. ESPRESSO jointly considers alignments of all long reads aligned to a gene and uses error profiles of individual reads to improve the identification of splice junctions and the discovery of their corresponding transcript isoforms. On both a synthetic spike-in RNA sample and human RNA samples, ESPRESSO outperforms multiple contemporary tools in not only transcript isoform discovery but also transcript isoform quantification. In total, we generated and analyzed ~1.1 billion nanopore RNA-seq reads covering 30 human tissue samples and three human cell lines. ESPRESSO and its companion dataset provide a useful resource for studying the RNA repertoire of eukaryotic transcriptomes.
format Online
Article
Text
id pubmed-9858503
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher American Association for the Advancement of Science
record_format MEDLINE/PubMed
spelling pubmed-98585032023-01-30 ESPRESSO: Robust discovery and quantification of transcript isoforms from error-prone long-read RNA-seq data Gao, Yuan Wang, Feng Wang, Robert Kutschera, Eric Xu, Yang Xie, Stephan Wang, Yuanyuan Kadash-Edmondson, Kathryn E. Lin, Lan Xing, Yi Sci Adv Biomedicine and Life Sciences Long-read RNA sequencing (RNA-seq) holds great potential for characterizing transcriptome variation and full-length transcript isoforms, but the relatively high error rate of current long-read sequencing platforms poses a major challenge. We present ESPRESSO, a computational tool for robust discovery and quantification of transcript isoforms from error-prone long reads. ESPRESSO jointly considers alignments of all long reads aligned to a gene and uses error profiles of individual reads to improve the identification of splice junctions and the discovery of their corresponding transcript isoforms. On both a synthetic spike-in RNA sample and human RNA samples, ESPRESSO outperforms multiple contemporary tools in not only transcript isoform discovery but also transcript isoform quantification. In total, we generated and analyzed ~1.1 billion nanopore RNA-seq reads covering 30 human tissue samples and three human cell lines. ESPRESSO and its companion dataset provide a useful resource for studying the RNA repertoire of eukaryotic transcriptomes. American Association for the Advancement of Science 2023-01-20 /pmc/articles/PMC9858503/ /pubmed/36662851 http://dx.doi.org/10.1126/sciadv.abq5072 Text en Copyright © 2023 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works. Distributed under a Creative Commons Attribution NonCommercial License 4.0 (CC BY-NC). 
https://creativecommons.org/licenses/by-nc/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license (https://creativecommons.org/licenses/by-nc/4.0/), which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited.
spellingShingle Biomedicine and Life Sciences
Gao, Yuan
Wang, Feng
Wang, Robert
Kutschera, Eric
Xu, Yang
Xie, Stephan
Wang, Yuanyuan
Kadash-Edmondson, Kathryn E.
Lin, Lan
Xing, Yi
ESPRESSO: Robust discovery and quantification of transcript isoforms from error-prone long-read RNA-seq data
title ESPRESSO: Robust discovery and quantification of transcript isoforms from error-prone long-read RNA-seq data
title_full ESPRESSO: Robust discovery and quantification of transcript isoforms from error-prone long-read RNA-seq data
title_fullStr ESPRESSO: Robust discovery and quantification of transcript isoforms from error-prone long-read RNA-seq data
title_full_unstemmed ESPRESSO: Robust discovery and quantification of transcript isoforms from error-prone long-read RNA-seq data
title_short ESPRESSO: Robust discovery and quantification of transcript isoforms from error-prone long-read RNA-seq data
title_sort espresso: robust discovery and quantification of transcript isoforms from error-prone long-read rna-seq data
topic Biomedicine and Life Sciences
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9858503/
https://www.ncbi.nlm.nih.gov/pubmed/36662851
http://dx.doi.org/10.1126/sciadv.abq5072