Ryūtō: network-flow based transcriptome reconstruction

BACKGROUND: The rapid increase in High-throughput sequencing of RNA (RNA-seq) has led to tremendous improvements in the detection and reconstruction of both expressed coding and non-coding RNA transcripts. Yet, the complete and accurate annotation of the complex transcriptional output of not only th...

Bibliographic Details
Main Authors: Gatter, Thomas; Stadler, Peter F
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6469118/
https://www.ncbi.nlm.nih.gov/pubmed/30991937
http://dx.doi.org/10.1186/s12859-019-2786-5
_version_ 1783411579792916480
author Gatter, Thomas
Stadler, Peter F
author_facet Gatter, Thomas
Stadler, Peter F
author_sort Gatter, Thomas
collection PubMed
description BACKGROUND: The rapid increase in High-throughput sequencing of RNA (RNA-seq) has led to tremendous improvements in the detection and reconstruction of both expressed coding and non-coding RNA transcripts. Yet, the complete and accurate annotation of the complex transcriptional output of not only the human genome has remained elusive. One of the critical bottlenecks in this endeavor is the computational reconstruction of transcript structures, due to high noise levels, technological limits, and other biases in the raw data. RESULTS: We introduce several new and improved algorithms in a novel workflow for transcript assembly and quantification. We propose an extension of the common splice graph framework that combines aspects of overlap and bin graphs and makes it possible to efficiently use both multi-splice and paired-end information to the fullest extent. Phasing information of reads is used to further resolve loci. The decomposition of read coverage patterns is modeled as a minimum-cost flow problem to account for the unavoidable non-uniformities of RNA-seq data. CONCLUSION: Its performance compares favorably with state of the art methods on both simulated and real-life datasets. Ryūtō calls 1−4% more true transcripts, while calling 5−35% less false predictions compared to the next best competitor. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2786-5) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6469118
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-64691182019-04-23 Ryūtō: network-flow based transcriptome reconstruction Gatter, Thomas Stadler, Peter F BMC Bioinformatics Methodology Article BACKGROUND: The rapid increase in High-throughput sequencing of RNA (RNA-seq) has led to tremendous improvements in the detection and reconstruction of both expressed coding and non-coding RNA transcripts. Yet, the complete and accurate annotation of the complex transcriptional output of not only the human genome has remained elusive. One of the critical bottlenecks in this endeavor is the computational reconstruction of transcript structures, due to high noise levels, technological limits, and other biases in the raw data. RESULTS: We introduce several new and improved algorithms in a novel workflow for transcript assembly and quantification. We propose an extension of the common splice graph framework that combines aspects of overlap and bin graphs and makes it possible to efficiently use both multi-splice and paired-end information to the fullest extent. Phasing information of reads is used to further resolve loci. The decomposition of read coverage patterns is modeled as a minimum-cost flow problem to account for the unavoidable non-uniformities of RNA-seq data. CONCLUSION: Its performance compares favorably with state of the art methods on both simulated and real-life datasets. Ryūtō calls 1−4% more true transcripts, while calling 5−35% less false predictions compared to the next best competitor. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2786-5) contains supplementary material, which is available to authorized users. 
BioMed Central 2019-04-16 /pmc/articles/PMC6469118/ /pubmed/30991937 http://dx.doi.org/10.1186/s12859-019-2786-5 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Methodology Article
Gatter, Thomas
Stadler, Peter F
Ryūtō: network-flow based transcriptome reconstruction
title Ryūtō: network-flow based transcriptome reconstruction
title_full Ryūtō: network-flow based transcriptome reconstruction
title_fullStr Ryūtō: network-flow based transcriptome reconstruction
title_full_unstemmed Ryūtō: network-flow based transcriptome reconstruction
title_short Ryūtō: network-flow based transcriptome reconstruction
title_sort ryūtō: network-flow based transcriptome reconstruction
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6469118/
https://www.ncbi.nlm.nih.gov/pubmed/30991937
http://dx.doi.org/10.1186/s12859-019-2786-5
work_keys_str_mv AT gatterthomas ryutonetworkflowbasedtranscriptomereconstruction
AT stadlerpeterf ryutonetworkflowbasedtranscriptomereconstruction