MITIE: Simultaneous RNA-Seq-based transcript identification and quantification in multiple samples

Motivation: High-throughput sequencing of mRNA (RNA-Seq) has led to tremendous improvements in the detection of expressed genes and reconstruction of RNA transcripts. However, the extensive dynamic range of gene expression, technical limitations and biases, as well as the observed complexity of the...

Bibliographic Details
Main Authors: Behr, Jonas, Kahles, André, Zhong, Yi, Sreedharan, Vipin T., Drewe, Philipp, Rätsch, Gunnar
Format: Online Article Text
Language: English
Published: Oxford University Press 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3789545/
https://www.ncbi.nlm.nih.gov/pubmed/23980025
http://dx.doi.org/10.1093/bioinformatics/btt442
author Behr, Jonas
Kahles, André
Zhong, Yi
Sreedharan, Vipin T.
Drewe, Philipp
Rätsch, Gunnar
author_sort Behr, Jonas
collection PubMed
description Motivation: High-throughput sequencing of mRNA (RNA-Seq) has led to tremendous improvements in the detection of expressed genes and reconstruction of RNA transcripts. However, the extensive dynamic range of gene expression, technical limitations and biases, as well as the observed complexity of the transcriptional landscape, pose profound computational challenges for transcriptome reconstruction.
Results: We present the novel framework MITIE (Mixed Integer Transcript IdEntification) for simultaneous transcript reconstruction and quantification. We define a likelihood function based on the negative binomial distribution, use a regularization approach to select a few transcripts collectively explaining the observed read data and show how to find the optimal solution using Mixed Integer Programming. MITIE can (i) take advantage of known transcripts, (ii) reconstruct and quantify transcripts simultaneously in multiple samples, and (iii) resolve the location of multi-mapping reads. It is designed for genome- and assembly-based transcriptome reconstruction. We present an extensive study based on realistic simulated RNA-Seq data. When compared with state-of-the-art approaches, MITIE proves to be significantly more sensitive and overall more accurate. Moreover, MITIE yields substantial performance gains when used with multiple samples. We applied our system to 38 Drosophila melanogaster modENCODE RNA-Seq libraries and estimated the sensitivity of reconstructing omitted transcript annotations and the specificity with respect to annotated transcripts. Our results corroborate that a well-motivated objective paired with appropriate optimization techniques lead to significant improvements over the state-of-the-art in transcriptome reconstruction.
Availability: MITIE is implemented in C++ and is available from http://bioweb.me/mitie under the GPL license.
Contact: Jonas_Behr@web.de and raetsch@cbio.mskcc.org
Supplementary information: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-3789545
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3789545 2013-10-17 MITIE: Simultaneous RNA-Seq-based transcript identification and quantification in multiple samples Behr, Jonas Kahles, André Zhong, Yi Sreedharan, Vipin T. Drewe, Philipp Rätsch, Gunnar Bioinformatics Original Papers Oxford University Press 2013-10-15 2013-08-25 /pmc/articles/PMC3789545/ /pubmed/23980025 http://dx.doi.org/10.1093/bioinformatics/btt442 Text en © The Author 2013. Published by Oxford University Press. http://creativecommons.org/licenses/by/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title MITIE: Simultaneous RNA-Seq-based transcript identification and quantification in multiple samples
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3789545/
https://www.ncbi.nlm.nih.gov/pubmed/23980025
http://dx.doi.org/10.1093/bioinformatics/btt442