Ryūtō: network-flow based transcriptome reconstruction
BACKGROUND: The rapid increase in high-throughput sequencing of RNA (RNA-seq) has led to tremendous improvements in the detection and reconstruction of both expressed coding and non-coding RNA transcripts. Yet, the complete and accurate annotation of the complex transcriptional output of not only th...
Main authors: | Gatter, Thomas; Stadler, Peter F |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6469118/ https://www.ncbi.nlm.nih.gov/pubmed/30991937 http://dx.doi.org/10.1186/s12859-019-2786-5 |
_version_ | 1783411579792916480 |
---|---|
author | Gatter, Thomas Stadler, Peter F |
author_facet | Gatter, Thomas Stadler, Peter F |
author_sort | Gatter, Thomas |
collection | PubMed |
description | BACKGROUND: The rapid increase in high-throughput sequencing of RNA (RNA-seq) has led to tremendous improvements in the detection and reconstruction of both expressed coding and non-coding RNA transcripts. Yet, the complete and accurate annotation of the complex transcriptional output of not only the human genome has remained elusive. One of the critical bottlenecks in this endeavor is the computational reconstruction of transcript structures, due to high noise levels, technological limits, and other biases in the raw data. RESULTS: We introduce several new and improved algorithms in a novel workflow for transcript assembly and quantification. We propose an extension of the common splice graph framework that combines aspects of overlap and bin graphs and makes it possible to efficiently use both multi-splice and paired-end information to the fullest extent. Phasing information of reads is used to further resolve loci. The decomposition of read coverage patterns is modeled as a minimum-cost flow problem to account for the unavoidable non-uniformities of RNA-seq data. CONCLUSION: Its performance compares favorably with state-of-the-art methods on both simulated and real-life datasets. Ryūtō calls 1–4% more true transcripts, while calling 5–35% fewer false predictions compared to the next best competitor. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2786-5) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-6469118 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6469118 2019-04-23 Ryūtō: network-flow based transcriptome reconstruction Gatter, Thomas Stadler, Peter F BMC Bioinformatics Methodology Article BACKGROUND: The rapid increase in high-throughput sequencing of RNA (RNA-seq) has led to tremendous improvements in the detection and reconstruction of both expressed coding and non-coding RNA transcripts. Yet, the complete and accurate annotation of the complex transcriptional output of not only the human genome has remained elusive. One of the critical bottlenecks in this endeavor is the computational reconstruction of transcript structures, due to high noise levels, technological limits, and other biases in the raw data. RESULTS: We introduce several new and improved algorithms in a novel workflow for transcript assembly and quantification. We propose an extension of the common splice graph framework that combines aspects of overlap and bin graphs and makes it possible to efficiently use both multi-splice and paired-end information to the fullest extent. Phasing information of reads is used to further resolve loci. The decomposition of read coverage patterns is modeled as a minimum-cost flow problem to account for the unavoidable non-uniformities of RNA-seq data. CONCLUSION: Its performance compares favorably with state-of-the-art methods on both simulated and real-life datasets. Ryūtō calls 1–4% more true transcripts, while calling 5–35% fewer false predictions compared to the next best competitor. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2786-5) contains supplementary material, which is available to authorized users. BioMed Central 2019-04-16 /pmc/articles/PMC6469118/ /pubmed/30991937 http://dx.doi.org/10.1186/s12859-019-2786-5 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Article Gatter, Thomas Stadler, Peter F Ryūtō: network-flow based transcriptome reconstruction |
title | Ryūtō: network-flow based transcriptome reconstruction |
title_full | Ryūtō: network-flow based transcriptome reconstruction |
title_fullStr | Ryūtō: network-flow based transcriptome reconstruction |
title_full_unstemmed | Ryūtō: network-flow based transcriptome reconstruction |
title_short | Ryūtō: network-flow based transcriptome reconstruction |
title_sort | ryūtō: network-flow based transcriptome reconstruction |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6469118/ https://www.ncbi.nlm.nih.gov/pubmed/30991937 http://dx.doi.org/10.1186/s12859-019-2786-5 |
work_keys_str_mv | AT gatterthomas ryutonetworkflowbasedtranscriptomereconstruction AT stadlerpeterf ryutonetworkflowbasedtranscriptomereconstruction |