ESPRESSO: Robust discovery and quantification of transcript isoforms from error-prone long-read RNA-seq data
Long-read RNA sequencing (RNA-seq) holds great potential for characterizing transcriptome variation and full-length transcript isoforms, but the relatively high error rate of current long-read sequencing platforms poses a major challenge. We present ESPRESSO, a computational tool for robust discover...
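Main authors: | Gao, Yuan; Wang, Feng; Wang, Robert; Kutschera, Eric; Xu, Yang; Xie, Stephan; Wang, Yuanyuan; Kadash-Edmondson, Kathryn E.; Lin, Lan; Xing, Yi |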
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Association for the Advancement of Science, 2023 |
Subjects: | Biomedicine and Life Sciences |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9858503/ https://www.ncbi.nlm.nih.gov/pubmed/36662851 http://dx.doi.org/10.1126/sciadv.abq5072 |
_version_ | 1784874117293408256 |
---|---|
author | Gao, Yuan; Wang, Feng; Wang, Robert; Kutschera, Eric; Xu, Yang; Xie, Stephan; Wang, Yuanyuan; Kadash-Edmondson, Kathryn E.; Lin, Lan; Xing, Yi |
author_facet | Gao, Yuan; Wang, Feng; Wang, Robert; Kutschera, Eric; Xu, Yang; Xie, Stephan; Wang, Yuanyuan; Kadash-Edmondson, Kathryn E.; Lin, Lan; Xing, Yi |
author_sort | Gao, Yuan |
collection | PubMed |
description | Long-read RNA sequencing (RNA-seq) holds great potential for characterizing transcriptome variation and full-length transcript isoforms, but the relatively high error rate of current long-read sequencing platforms poses a major challenge. We present ESPRESSO, a computational tool for robust discovery and quantification of transcript isoforms from error-prone long reads. ESPRESSO jointly considers alignments of all long reads aligned to a gene and uses error profiles of individual reads to improve the identification of splice junctions and the discovery of their corresponding transcript isoforms. On both a synthetic spike-in RNA sample and human RNA samples, ESPRESSO outperforms multiple contemporary tools in not only transcript isoform discovery but also transcript isoform quantification. In total, we generated and analyzed ~1.1 billion nanopore RNA-seq reads covering 30 human tissue samples and three human cell lines. ESPRESSO and its companion dataset provide a useful resource for studying the RNA repertoire of eukaryotic transcriptomes. |
format | Online Article Text |
id | pubmed-9858503 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | American Association for the Advancement of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-9858503 2023-01-30 ESPRESSO: Robust discovery and quantification of transcript isoforms from error-prone long-read RNA-seq data Gao, Yuan Wang, Feng Wang, Robert Kutschera, Eric Xu, Yang Xie, Stephan Wang, Yuanyuan Kadash-Edmondson, Kathryn E. Lin, Lan Xing, Yi Sci Adv Biomedicine and Life Sciences Long-read RNA sequencing (RNA-seq) holds great potential for characterizing transcriptome variation and full-length transcript isoforms, but the relatively high error rate of current long-read sequencing platforms poses a major challenge. We present ESPRESSO, a computational tool for robust discovery and quantification of transcript isoforms from error-prone long reads. ESPRESSO jointly considers alignments of all long reads aligned to a gene and uses error profiles of individual reads to improve the identification of splice junctions and the discovery of their corresponding transcript isoforms. On both a synthetic spike-in RNA sample and human RNA samples, ESPRESSO outperforms multiple contemporary tools in not only transcript isoform discovery but also transcript isoform quantification. In total, we generated and analyzed ~1.1 billion nanopore RNA-seq reads covering 30 human tissue samples and three human cell lines. ESPRESSO and its companion dataset provide a useful resource for studying the RNA repertoire of eukaryotic transcriptomes. American Association for the Advancement of Science 2023-01-20 /pmc/articles/PMC9858503/ /pubmed/36662851 http://dx.doi.org/10.1126/sciadv.abq5072 Text en Copyright © 2023 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works. Distributed under a Creative Commons Attribution NonCommercial License 4.0 (CC BY-NC). https://creativecommons.org/licenses/by-nc/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license (https://creativecommons.org/licenses/by-nc/4.0/), which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited. |
spellingShingle | Biomedicine and Life Sciences Gao, Yuan Wang, Feng Wang, Robert Kutschera, Eric Xu, Yang Xie, Stephan Wang, Yuanyuan Kadash-Edmondson, Kathryn E. Lin, Lan Xing, Yi ESPRESSO: Robust discovery and quantification of transcript isoforms from error-prone long-read RNA-seq data |
title | ESPRESSO: Robust discovery and quantification of transcript isoforms from error-prone long-read RNA-seq data |
title_full | ESPRESSO: Robust discovery and quantification of transcript isoforms from error-prone long-read RNA-seq data |
title_fullStr | ESPRESSO: Robust discovery and quantification of transcript isoforms from error-prone long-read RNA-seq data |
title_full_unstemmed | ESPRESSO: Robust discovery and quantification of transcript isoforms from error-prone long-read RNA-seq data |
title_short | ESPRESSO: Robust discovery and quantification of transcript isoforms from error-prone long-read RNA-seq data |
title_sort | espresso: robust discovery and quantification of transcript isoforms from error-prone long-read rna-seq data |
topic | Biomedicine and Life Sciences |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9858503/ https://www.ncbi.nlm.nih.gov/pubmed/36662851 http://dx.doi.org/10.1126/sciadv.abq5072 |