
Differential expression analysis of RNA sequencing data by incorporating non-exonic mapped reads

BACKGROUND: RNA sequencing (RNA-seq) is a powerful tool for genome-wide expression profiling of biological samples, with the advantages of high throughput and high resolution. Many algorithms exist for quantifying expression levels and detecting differential gene expression, but...


Bibliographic Details
Main authors: Chen, Hung-I Harry, Liu, Yuanhang, Zou, Yi, Lai, Zhao, Sarkar, Devanand, Huang, Yufei, Chen, Yidong
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4474535/
https://www.ncbi.nlm.nih.gov/pubmed/26099631
http://dx.doi.org/10.1186/1471-2164-16-S7-S14
author Chen, Hung-I Harry
Liu, Yuanhang
Zou, Yi
Lai, Zhao
Sarkar, Devanand
Huang, Yufei
Chen, Yidong
collection PubMed
description BACKGROUND: RNA sequencing (RNA-seq) is a powerful tool for genome-wide expression profiling of biological samples, with the advantages of high throughput and high resolution. Many algorithms exist for quantifying expression levels and detecting differential gene expression, but none of them takes into account the misaligned reads that are mapped to non-exonic regions. We developed a novel algorithm, XBSeq, in which a statistical model was established based on the assumption that observed signals are the convolution of true expression signals and sequencing noise. The mapped reads in non-exonic regions are considered sequencing noise, which follows a Poisson distribution. Given measurable observed and noise signals from RNA-seq data, the true expression signals, assumed to be governed by the negative binomial distribution, can be delineated, allowing accurate detection of differentially expressed genes.
RESULTS: We implemented the XBSeq algorithm and evaluated it on a set of expression datasets simulated under different conditions, using a combination of negative binomial and Poisson distributions with parameters derived from real RNA-seq data. We compared the performance of our method with that of other commonly used differential expression analysis algorithms. We also evaluated the changes in true and false positive rates with variations in biological replicates, differential fold changes, and expression levels in non-exonic regions. We also tested the algorithm on a set of real RNA-seq data, for which the common and differing detection results from the different algorithms were reported.
CONCLUSIONS: In this paper, we proposed XBSeq, a novel differential expression analysis algorithm for RNA-seq data that takes non-exonic mapped reads into consideration. When background noise is at baseline level, the performance of XBSeq and DESeq is mostly equivalent. However, our method surpasses DESeq and other algorithms as the number of non-exonic mapped reads increases. Only under very low read-count conditions did XBSeq have a slightly higher false discovery rate, which may be improved by adjusting the background noise effect in this situation. Taken together, by considering non-exonic mapped reads, XBSeq can provide accurate expression measurement and thus detect differentially expressed genes even in noisy conditions.
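The signal-plus-noise assumption in the abstract can be illustrated with a small simulation. This is only a sketch and not the XBSeq implementation. It assumes the true signal and the noise are independent, and that the negative binomial is parameterized by a mean `mu` and a dispersion `phi` (variance `mu + phi * mu**2`). Under those assumptions, the mean and variance of the true signal can be recovered by subtracting the Poisson noise rate from the observed mean and variance. All function names and parameter values below are hypothetical and chosen for illustration.

```python
import math
import random
from statistics import fmean, variance


def sample_poisson(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's method (adequate for modest lam)."""
    if lam <= 0:
        return 0
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_negative_binomial(rng, mu, phi):
    """Draw one negative binomial variate as a gamma-Poisson mixture.

    Assumed parameterization: mean mu, variance mu + phi * mu**2.
    """
    rate = rng.gammavariate(1.0 / phi, phi * mu)  # shape 1/phi, scale phi*mu
    return sample_poisson(rng, rate)


def estimate_true_signal(observed, background):
    """Method-of-moments estimate of the true signal's mean and variance.

    For independent signal S and Poisson noise B with rate lam:
    E[S] = E[X] - lam and Var[S] = Var[X] - lam, where X = S + B.
    """
    lam_hat = fmean(background)
    mu_hat = max(fmean(observed) - lam_hat, 0.0)
    var_hat = max(variance(observed) - lam_hat, 0.0)
    return mu_hat, var_hat


# Illustrative parameters (assumptions, not values from the paper).
rng = random.Random(0)
mu, phi, lam, n = 40.0, 0.1, 8.0, 10000

true_signal = [sample_negative_binomial(rng, mu, phi) for _ in range(n)]
noise_in_exons = [sample_poisson(rng, lam) for _ in range(n)]
observed = [s + b for s, b in zip(true_signal, noise_in_exons)]

# Noise measured separately, standing in for reads mapped to non-exonic regions.
non_exonic = [sample_poisson(rng, lam) for _ in range(n)]

mu_hat, var_hat = estimate_true_signal(observed, non_exonic)
print(f"estimated mean {mu_hat:.2f} (simulated with {mu})")
print(f"estimated variance {var_hat:.2f} (simulated with {mu + phi * mu**2})")
```

With enough samples the two estimates should land near the simulated values. With few replicates or very low counts the subtraction can become unstable, which is why the sketch clamps both estimates at zero.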
format Online
Article
Text
id pubmed-4474535
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4474535 2015-06-25 Differential expression analysis of RNA sequencing data by incorporating non-exonic mapped reads Chen, Hung-I Harry Liu, Yuanhang Zou, Yi Lai, Zhao Sarkar, Devanand Huang, Yufei Chen, Yidong BMC Genomics Research BioMed Central 2015-06-11 /pmc/articles/PMC4474535/ /pubmed/26099631 http://dx.doi.org/10.1186/1471-2164-16-S7-S14 Text en Copyright © 2015 Chen et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Differential expression analysis of RNA sequencing data by incorporating non-exonic mapped reads
topic Research