Piecing the puzzle together: a revisit to transcript reconstruction problem in RNA-seq

The advancement of RNA sequencing (RNA-seq) has provided an unprecedented opportunity to assess both the diversity and quantity of transcript isoforms in an mRNA transcriptome. In this paper, we revisit the computational problem of transcript reconstruction and quantification. Unlike existing method...

Bibliographic Details
Main Authors: Huang, Yan; Hu, Yin; Liu, Jinze
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4168703/
https://www.ncbi.nlm.nih.gov/pubmed/25252653
http://dx.doi.org/10.1186/1471-2105-15-S9-S3
author Huang, Yan
Hu, Yin
Liu, Jinze
collection PubMed
description The advancement of RNA sequencing (RNA-seq) has provided an unprecedented opportunity to assess both the diversity and quantity of transcript isoforms in an mRNA transcriptome. In this paper, we revisit the computational problem of transcript reconstruction and quantification. Unlike existing methods which focus on how to explain the exons and splice variants detected by the reads with a set of isoforms, we aim at reconstructing transcripts by piecing the reads into individual effective transcript copies. Simultaneously, the quantity of each isoform is explicitly measured by the number of assembled effective copies, instead of estimated solely based on the collective read count. We have developed a novel method named Astroid that solves the problem of effective copy reconstruction on the basis of a flow network. The RNA-seq reads are represented as vertices in the flow network and are connected by weighted edges that evaluate the likelihood of two reads originating from the same effective copy. A maximum likelihood set of transcript copies is then reconstructed by solving a minimum-cost flow problem on the flow network. Simulation studies on the human transcriptome have demonstrated the superior sensitivity and specificity of Astroid in transcript reconstruction as well as improved accuracy in transcript quantification over several existing approaches. The application of Astroid on two real RNA-seq datasets has further demonstrated its accuracy through high correlation between the estimated isoform abundance and the qRT-PCR validations.
format Online
Article
Text
id pubmed-4168703
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4168703 2014-10-02 Piecing the puzzle together: a revisit to transcript reconstruction problem in RNA-seq Huang, Yan; Hu, Yin; Liu, Jinze BMC Bioinformatics Proceedings BioMed Central 2014-09-10 /pmc/articles/PMC4168703/ /pubmed/25252653 http://dx.doi.org/10.1186/1471-2105-15-S9-S3 Text en Copyright © 2014 Huang and Hu; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Piecing the puzzle together: a revisit to transcript reconstruction problem in RNA-seq
topic Proceedings
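
The description above says that reads become vertices of a flow network, that weighted edges score how likely two reads are to come from the same effective copy, and that a minimum-cost flow yields the reconstructed copies. This record does not give Astroid's actual network construction or edge weights, so the following is only a toy sketch of the general idea, under stated assumptions: reads are numbered in genomic order, `link_prob` is a hypothetical table of pairwise probabilities, `new_copy_cost` is a hypothetical penalty for ending a copy, and the network is a standard minimum-cost path cover solved by successive shortest paths. None of these names come from the paper.

```python
import math
from collections import deque


class MinCostFlow:
    """Successive-shortest-path min-cost flow for unit-capacity toy graphs."""

    def __init__(self, n):
        self.n = n
        self.graph = [[] for _ in range(n)]  # edge = [to, cap, cost, rev_index]

    def add_edge(self, u, v, cap, cost):
        self.graph[u].append([v, cap, cost, len(self.graph[v])])
        self.graph[v].append([u, 0, -cost, len(self.graph[u]) - 1])

    def solve(self, s, t):
        total_flow, total_cost = 0, 0.0
        while True:
            # Bellman-Ford (queue based) tolerates negative residual costs.
            dist = [math.inf] * self.n
            prev = [None] * self.n
            in_queue = [False] * self.n
            dist[s] = 0.0
            queue = deque([s])
            while queue:
                u = queue.popleft()
                in_queue[u] = False
                for idx, (v, cap, cost, _) in enumerate(self.graph[u]):
                    if cap > 0 and dist[u] + cost < dist[v] - 1e-12:
                        dist[v] = dist[u] + cost
                        prev[v] = (u, idx)
                        if not in_queue[v]:
                            in_queue[v] = True
                            queue.append(v)
            if prev[t] is None:  # no augmenting path left
                return total_flow, total_cost
            v = t
            while v != s:  # push one unit (all capacities here are 1)
                u, idx = prev[v]
                edge = self.graph[u][idx]
                edge[1] -= 1
                self.graph[v][edge[3]][1] += 1
                v = u
            total_flow += 1
            total_cost += dist[t]


def chain_reads(n_reads, link_prob, new_copy_cost):
    """Group reads 0..n_reads-1 into chains (toy stand-in for transcript copies).

    link_prob maps (i, j) with i < j to an assumed probability that read j
    follows read i in the same copy; the edge cost is -log(probability).
    """
    s, t = 2 * n_reads, 2 * n_reads + 1
    mcf = MinCostFlow(2 * n_reads + 2)
    for i in range(n_reads):
        mcf.add_edge(s, i, 1, 0.0)            # each read ships one unit
        mcf.add_edge(i, t, 1, new_copy_cost)  # read i ends a copy
        mcf.add_edge(n_reads + i, t, 1, 0.0)  # read i has at most one predecessor
    for (i, j), p in link_prob.items():
        mcf.add_edge(i, n_reads + j, 1, -math.log(p))
    mcf.solve(s, t)

    # A saturated link edge i -> (n_reads + j) means j directly follows i.
    succ = {}
    for i in range(n_reads):
        for v, cap, _, _ in mcf.graph[i]:
            if n_reads <= v < 2 * n_reads and cap == 0:
                succ[i] = v - n_reads
    heads = [i for i in range(n_reads) if i not in set(succ.values())]
    chains = []
    for head in heads:
        chain = [head]
        while chain[-1] in succ:
            chain.append(succ[chain[-1]])
        chains.append(chain)
    return chains


# Toy input: four reads with made-up pairwise probabilities.
toy_links = {(0, 1): 0.9, (1, 2): 0.8, (0, 2): 0.1, (2, 3): 0.05}
print(chain_reads(4, toy_links, new_copy_cost=2.0))  # [[0, 1, 2], [3]]
```

In this toy setting the number of chains plays the role of a copy count, which mirrors the description's point that abundance is measured by the number of assembled copies rather than by read counts alone. Raising `new_copy_cost` makes the solver accept weaker links before it opens a new chain.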