
NSMAP: A method for spliced isoforms identification and quantification from RNA-Seq

BACKGROUND: The development of techniques for sequencing messenger RNA (RNA-Seq) makes it possible to study biological mechanisms such as alternative splicing and gene expression regulation more deeply and accurately. Most existing methods employ RNA-Seq to quantify the expression levels of already...


Bibliographic Details
Main authors: Xia, Zheng; Wen, Jianguo; Chang, Chung-Che; Zhou, Xiaobo
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3113944/
https://www.ncbi.nlm.nih.gov/pubmed/21575225
http://dx.doi.org/10.1186/1471-2105-12-162
_version_ 1782206009406652416
author Xia, Zheng
Wen, Jianguo
Chang, Chung-Che
Zhou, Xiaobo
author_sort Xia, Zheng
collection PubMed
description BACKGROUND: The development of techniques for sequencing messenger RNA (RNA-Seq) makes it possible to study biological mechanisms such as alternative splicing and gene expression regulation more deeply and accurately. Most existing methods employ RNA-Seq to quantify the expression levels of isoforms already annotated in the reference genome. However, the current reference genome annotation is very incomplete owing to the complexity of the transcriptome, which hinders comprehensive investigation of the transcriptome using RNA-Seq. A method for isoform inference and estimation purely from RNA-Seq data, without annotation information, is therefore desirable. RESULTS: A Nonnegativity and Sparsity constrained Maximum APosteriori (NSMAP) model has been proposed to estimate the expression levels of isoforms from RNA-Seq data without annotation information. In contrast to previous methods, NSMAP identifies the structures of expressed isoforms and estimates their expression levels simultaneously, which enables better identification of isoforms. In simulations parameterized by two real RNA-Seq data sets, more than 77% of expressed isoforms are correctly identified and quantified. We then apply NSMAP to two RNA-Seq data sets from myelodysplastic syndromes (MDS) samples and one normal sample in order to identify known and novel isoforms that are differentially expressed in MDS. CONCLUSIONS: NSMAP provides a good strategy for identifying and quantifying novel isoforms without knowledge of an annotated reference genome, which can further realize the potential of the RNA-Seq technique in transcriptome analysis. The NSMAP package is freely available at https://sites.google.com/site/nsmapforrnaseq.
format Online
Article
Text
id pubmed-3113944
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3113944 2011-06-14 NSMAP: A method for spliced isoforms identification and quantification from RNA-Seq Xia, Zheng Wen, Jianguo Chang, Chung-Che Zhou, Xiaobo BMC Bioinformatics Methodology Article
BioMed Central 2011-05-16 /pmc/articles/PMC3113944/ /pubmed/21575225 http://dx.doi.org/10.1186/1471-2105-12-162 Text en Copyright ©2011 Xia et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title NSMAP: A method for spliced isoforms identification and quantification from RNA-Seq
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3113944/
https://www.ncbi.nlm.nih.gov/pubmed/21575225
http://dx.doi.org/10.1186/1471-2105-12-162