Cargando…

A novel min-cost flow method for estimating transcript expression with RNA-Seq

BACKGROUND: Through transcription and alternative splicing, a gene can be transcribed into different RNA sequences (isoforms), depending on the individual, on the tissue the cell is in, or in response to some stimuli. Recent RNA-Seq technology allows for new high-throughput ways for isoform identifi...

Descripción completa

Detalles Bibliográficos
Autores principales: Tomescu, Alexandru I, Kuosmanen, Anna, Rizzi, Romeo, Mäkinen, Veli
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2013
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3622638/
https://www.ncbi.nlm.nih.gov/pubmed/23734627
http://dx.doi.org/10.1186/1471-2105-14-S5-S15
_version_ 1782265858963275776
author Tomescu, Alexandru I
Kuosmanen, Anna
Rizzi, Romeo
Mäkinen, Veli
author_facet Tomescu, Alexandru I
Kuosmanen, Anna
Rizzi, Romeo
Mäkinen, Veli
author_sort Tomescu, Alexandru I
collection PubMed
description BACKGROUND: Through transcription and alternative splicing, a gene can be transcribed into different RNA sequences (isoforms), depending on the individual, on the tissue the cell is in, or in response to some stimuli. Recent RNA-Seq technology allows for new high-throughput ways for isoform identification and quantification based on short reads, and various methods have been put forward for this non-trivial problem. RESULTS: In this paper we propose a novel radically different method based on minimum-cost network flows. This has a two-fold advantage: on the one hand, it translates the problem as an established one in the field of network flows, which can be solved in polynomial time, with different existing solvers; on the other hand, it is general enough to encompass many of the previous proposals under the least sum of squares model. Our method works as follows: in order to find the transcripts which best explain, under a given fitness model, a splicing graph resulting from an RNA-Seq experiment, we find a min-cost flow in an offset flow network, under an equivalent cost model. Under very weak assumptions on the fitness model, the optimal flow can be computed in polynomial time. Parsimoniously splitting the flow back into few path transcripts can be done with any of the heuristics and approximations available from the theory of network flows. In the present implementation, we choose the simple strategy of repeatedly removing the heaviest path. CONCLUSIONS: We proposed a new very general method based on network flows for a multiassembly problem arising from isoform identification and quantification with RNA-Seq. Experimental results on prediction accuracy show that our method is very competitive with popular tools such as Cufflinks and IsoLasso. Our tool, called Traph (Transcrips in gRAPHs), is available at: http://www.cs.helsinki.fi/gsa/traph/.
format Online
Article
Text
id pubmed-3622638
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-36226382013-04-15 A novel min-cost flow method for estimating transcript expression with RNA-Seq Tomescu, Alexandru I Kuosmanen, Anna Rizzi, Romeo Mäkinen, Veli BMC Bioinformatics Proceedings BACKGROUND: Through transcription and alternative splicing, a gene can be transcribed into different RNA sequences (isoforms), depending on the individual, on the tissue the cell is in, or in response to some stimuli. Recent RNA-Seq technology allows for new high-throughput ways for isoform identification and quantification based on short reads, and various methods have been put forward for this non-trivial problem. RESULTS: In this paper we propose a novel radically different method based on minimum-cost network flows. This has a two-fold advantage: on the one hand, it translates the problem as an established one in the field of network flows, which can be solved in polynomial time, with different existing solvers; on the other hand, it is general enough to encompass many of the previous proposals under the least sum of squares model. Our method works as follows: in order to find the transcripts which best explain, under a given fitness model, a splicing graph resulting from an RNA-Seq experiment, we find a min-cost flow in an offset flow network, under an equivalent cost model. Under very weak assumptions on the fitness model, the optimal flow can be computed in polynomial time. Parsimoniously splitting the flow back into few path transcripts can be done with any of the heuristics and approximations available from the theory of network flows. In the present implementation, we choose the simple strategy of repeatedly removing the heaviest path. CONCLUSIONS: We proposed a new very general method based on network flows for a multiassembly problem arising from isoform identification and quantification with RNA-Seq. Experimental results on prediction accuracy show that our method is very competitive with popular tools such as Cufflinks and IsoLasso. Our tool, called Traph (Transcrips in gRAPHs), is available at: http://www.cs.helsinki.fi/gsa/traph/. BioMed Central 2013-04-10 /pmc/articles/PMC3622638/ /pubmed/23734627 http://dx.doi.org/10.1186/1471-2105-14-S5-S15 Text en Copyright © 2013 Tomescu et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Proceedings
Tomescu, Alexandru I
Kuosmanen, Anna
Rizzi, Romeo
Mäkinen, Veli
A novel min-cost flow method for estimating transcript expression with RNA-Seq
title A novel min-cost flow method for estimating transcript expression with RNA-Seq
title_full A novel min-cost flow method for estimating transcript expression with RNA-Seq
title_fullStr A novel min-cost flow method for estimating transcript expression with RNA-Seq
title_full_unstemmed A novel min-cost flow method for estimating transcript expression with RNA-Seq
title_short A novel min-cost flow method for estimating transcript expression with RNA-Seq
title_sort novel min-cost flow method for estimating transcript expression with rna-seq
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3622638/
https://www.ncbi.nlm.nih.gov/pubmed/23734627
http://dx.doi.org/10.1186/1471-2105-14-S5-S15
work_keys_str_mv AT tomescualexandrui anovelmincostflowmethodforestimatingtranscriptexpressionwithrnaseq
AT kuosmanenanna anovelmincostflowmethodforestimatingtranscriptexpressionwithrnaseq
AT rizziromeo anovelmincostflowmethodforestimatingtranscriptexpressionwithrnaseq
AT makinenveli anovelmincostflowmethodforestimatingtranscriptexpressionwithrnaseq
AT tomescualexandrui novelmincostflowmethodforestimatingtranscriptexpressionwithrnaseq
AT kuosmanenanna novelmincostflowmethodforestimatingtranscriptexpressionwithrnaseq
AT rizziromeo novelmincostflowmethodforestimatingtranscriptexpressionwithrnaseq
AT makinenveli novelmincostflowmethodforestimatingtranscriptexpressionwithrnaseq