
Terminus enables the discovery of data-driven, robust transcript groups from RNA-seq data

MOTIVATION: Advances in sequencing technology, inference algorithms and differential testing methodology have enabled transcript-level analysis of RNA-seq data. Yet, the inherent inferential uncertainty in transcript-level abundance estimation, even among the most accurate approaches, means that rob...


Bibliographic Details
Main Authors: Sarkar, Hirak; Srivastava, Avi; Bravo, Héctor Corrada; Love, Michael I; Patro, Rob
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects: Comparative and Functional Genomics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7355257/
https://www.ncbi.nlm.nih.gov/pubmed/32657377
http://dx.doi.org/10.1093/bioinformatics/btaa448
author Sarkar, Hirak
Srivastava, Avi
Bravo, Héctor Corrada
Love, Michael I
Patro, Rob
collection PubMed
description MOTIVATION: Advances in sequencing technology, inference algorithms and differential testing methodology have enabled transcript-level analysis of RNA-seq data. Yet, the inherent inferential uncertainty in transcript-level abundance estimation, even among the most accurate approaches, means that robust transcript-level analysis often remains a challenge. Conversely, gene-level analysis remains a common and robust approach for understanding RNA-seq data, but it coarsens the resulting analysis to the level of genes, even if the data strongly support specific transcript-level effects.
RESULTS: We introduce a new data-driven approach for grouping together transcripts in an experiment based on their inferential uncertainty. Transcripts that share large numbers of ambiguously-mapping fragments with other transcripts, in complex patterns, often cannot have their abundances confidently estimated. Yet, the total transcriptional output of that group of transcripts will have greatly reduced inferential uncertainty, thus allowing more robust and confident downstream analysis. Our approach, implemented in the tool terminus, groups together transcripts in a data-driven manner allowing transcript-level analysis where it can be confidently supported, and deriving transcriptional groups where the inferential uncertainty is too high to support a transcript-level result.
AVAILABILITY AND IMPLEMENTATION: Terminus is implemented in Rust, and is freely available and open source. It can be obtained from https://github.com/COMBINE-lab/Terminus.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-7355257
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7355257 2020-07-16 Terminus enables the discovery of data-driven, robust transcript groups from RNA-seq data. Sarkar, Hirak; Srivastava, Avi; Bravo, Héctor Corrada; Love, Michael I; Patro, Rob. Bioinformatics. Comparative and Functional Genomics.
Oxford University Press 2020-07 2020-07-13 /pmc/articles/PMC7355257/ /pubmed/32657377 http://dx.doi.org/10.1093/bioinformatics/btaa448 Text en © The Author(s) 2020. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Terminus enables the discovery of data-driven, robust transcript groups from RNA-seq data
topic Comparative and Functional Genomics