Modeling Exon-Specific Bias Distribution Improves the Analysis of RNA-Seq Data

RNA-seq technology has become an important tool for quantifying the gene and transcript expression in transcriptome study. The two major difficulties for the gene and transcript expression quantification are the read mapping ambiguity and the overdispersion of the read distribution along reference s...

Bibliographic Details
Main Authors:	Liu, Xuejun; Zhang, Li; Chen, Songcan
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2015
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4598124/ https://www.ncbi.nlm.nih.gov/pubmed/26448625 http://dx.doi.org/10.1371/journal.pone.0140032

_version_	1782394035951894528
author	Liu, Xuejun Zhang, Li Chen, Songcan
author_facet	Liu, Xuejun Zhang, Li Chen, Songcan
author_sort	Liu, Xuejun
collection	PubMed
description	RNA-seq technology has become an important tool for quantifying the gene and transcript expression in transcriptome study. The two major difficulties for the gene and transcript expression quantification are the read mapping ambiguity and the overdispersion of the read distribution along reference sequence. Many approaches have been proposed to deal with these difficulties. A number of existing methods use Poisson distribution to model the read counts and this easily splits the counts into the contributions from multiple transcripts. Meanwhile, various solutions were put forward to account for the overdispersion in the Poisson models. By checking the similarities among the variation patterns of read counts for individual genes, we found that the count variation is exon-specific and has the conserved pattern across the samples for each individual gene. We introduce Gamma-distributed latent variables to model the read sequencing preference for each exon. These variables are embedded to the rate parameter of a Poisson model to account for the overdispersion of read distribution. The model is tractable since the Gamma priors can be integrated out in the maximum likelihood estimation. We evaluate the proposed approach, PGseq, using four real datasets and one simulated dataset, and compare its performance with other popular methods. Results show that PGseq presents competitive performance compared to other alternatives in terms of accuracy in the gene and transcript expression calculation and in the downstream differential expression analysis. Especially, we show the advantage of our method in the analysis of low expression.
format	Online Article Text
id	pubmed-4598124
institution	National Center for Biotechnology Information
language	English
publishDate	2015
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-4598124 2015-10-20 Modeling Exon-Specific Bias Distribution Improves the Analysis of RNA-Seq Data Liu, Xuejun Zhang, Li Chen, Songcan PLoS One Research Article RNA-seq technology has become an important tool for quantifying the gene and transcript expression in transcriptome study. The two major difficulties for the gene and transcript expression quantification are the read mapping ambiguity and the overdispersion of the read distribution along reference sequence. Many approaches have been proposed to deal with these difficulties. A number of existing methods use Poisson distribution to model the read counts and this easily splits the counts into the contributions from multiple transcripts. Meanwhile, various solutions were put forward to account for the overdispersion in the Poisson models. By checking the similarities among the variation patterns of read counts for individual genes, we found that the count variation is exon-specific and has the conserved pattern across the samples for each individual gene. We introduce Gamma-distributed latent variables to model the read sequencing preference for each exon. These variables are embedded to the rate parameter of a Poisson model to account for the overdispersion of read distribution. The model is tractable since the Gamma priors can be integrated out in the maximum likelihood estimation. We evaluate the proposed approach, PGseq, using four real datasets and one simulated dataset, and compare its performance with other popular methods. Results show that PGseq presents competitive performance compared to other alternatives in terms of accuracy in the gene and transcript expression calculation and in the downstream differential expression analysis. Especially, we show the advantage of our method in the analysis of low expression.
Public Library of Science 2015-10-08 /pmc/articles/PMC4598124/ /pubmed/26448625 http://dx.doi.org/10.1371/journal.pone.0140032 Text en © 2015 Liu et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle	Research Article Liu, Xuejun Zhang, Li Chen, Songcan Modeling Exon-Specific Bias Distribution Improves the Analysis of RNA-Seq Data
title	Modeling Exon-Specific Bias Distribution Improves the Analysis of RNA-Seq Data
title_full	Modeling Exon-Specific Bias Distribution Improves the Analysis of RNA-Seq Data
title_fullStr	Modeling Exon-Specific Bias Distribution Improves the Analysis of RNA-Seq Data
title_full_unstemmed	Modeling Exon-Specific Bias Distribution Improves the Analysis of RNA-Seq Data
title_short	Modeling Exon-Specific Bias Distribution Improves the Analysis of RNA-Seq Data
title_sort	modeling exon-specific bias distribution improves the analysis of rna-seq data
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4598124/ https://www.ncbi.nlm.nih.gov/pubmed/26448625 http://dx.doi.org/10.1371/journal.pone.0140032
work_keys_str_mv	AT liuxuejun modelingexonspecificbiasdistributionimprovestheanalysisofrnaseqdata AT zhangli modelingexonspecificbiasdistributionimprovestheanalysisofrnaseqdata AT chensongcan modelingexonspecificbiasdistributionimprovestheanalysisofrnaseqdata
