Statistical inference for time course RNA-Seq data using a negative binomial mixed-effect model

BACKGROUND: Accurate identification of differentially expressed (DE) genes in time course RNA-Seq data is crucial for understanding the dynamics of transcriptional regulatory network. However, most of the available methods treat gene expressions at different time points as replicates and test the si...

Bibliographic Details
Main Authors:	Sun, Xiaoxiao; Dalpiaz, David; Wu, Di; Liu, Jun S.; Zhong, Wenxuan; Ma, Ping
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2016
Subjects:	Methodology Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5002174/ https://www.ncbi.nlm.nih.gov/pubmed/27565575 http://dx.doi.org/10.1186/s12859-016-1180-9

_version_	1782450530989113344
author	Sun, Xiaoxiao; Dalpiaz, David; Wu, Di; Liu, Jun S.; Zhong, Wenxuan; Ma, Ping
author_facet	Sun, Xiaoxiao; Dalpiaz, David; Wu, Di; Liu, Jun S.; Zhong, Wenxuan; Ma, Ping
author_sort	Sun, Xiaoxiao
collection	PubMed
description	BACKGROUND: Accurate identification of differentially expressed (DE) genes in time course RNA-Seq data is crucial for understanding the dynamics of transcriptional regulatory network. However, most of the available methods treat gene expressions at different time points as replicates and test the significance of the mean expression difference between treatments or conditions irrespective of time. They thus fail to identify many DE genes with different profiles across time. In this article, we propose a negative binomial mixed-effect model (NBMM) to identify DE genes in time course RNA-Seq data. In the NBMM, mean gene expression is characterized by a fixed effect, and time dependency is described by random effects. The NBMM is very flexible and can be fitted to both unreplicated and replicated time course RNA-Seq data via a penalized likelihood method. By comparing gene expression profiles over time, we further classify the DE genes into two subtypes to enhance the understanding of expression dynamics. A significance test for detecting DE genes is derived using a Kullback-Leibler distance ratio. Additionally, a significance test for gene sets is developed using a gene set score. RESULTS: Simulation analysis shows that the NBMM outperforms currently available methods for detecting DE genes and gene sets. Moreover, our real data analysis of fruit fly developmental time course RNA-Seq data demonstrates the NBMM identifies biologically relevant genes which are well justified by gene ontology analysis. CONCLUSIONS: The proposed method is powerful and efficient to detect biologically relevant DE genes and gene sets in time course RNA-Seq data. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1180-9) contains supplementary material, which is available to authorized users.
format	Online Article Text
id	pubmed-5002174
institution	National Center for Biotechnology Information
language	English
publishDate	2016
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-50021742016-09-06 Statistical inference for time course RNA-Seq data using a negative binomial mixed-effect model Sun, Xiaoxiao Dalpiaz, David Wu, Di S. Liu, Jun Zhong, Wenxuan Ma, Ping BMC Bioinformatics Methodology Article BACKGROUND: Accurate identification of differentially expressed (DE) genes in time course RNA-Seq data is crucial for understanding the dynamics of transcriptional regulatory network. However, most of the available methods treat gene expressions at different time points as replicates and test the significance of the mean expression difference between treatments or conditions irrespective of time. They thus fail to identify many DE genes with different profiles across time. In this article, we propose a negative binomial mixed-effect model (NBMM) to identify DE genes in time course RNA-Seq data. In the NBMM, mean gene expression is characterized by a fixed effect, and time dependency is described by random effects. The NBMM is very flexible and can be fitted to both unreplicated and replicated time course RNA-Seq data via a penalized likelihood method. By comparing gene expression profiles over time, we further classify the DE genes into two subtypes to enhance the understanding of expression dynamics. A significance test for detecting DE genes is derived using a Kullback-Leibler distance ratio. Additionally, a significance test for gene sets is developed using a gene set score. RESULTS: Simulation analysis shows that the NBMM outperforms currently available methods for detecting DE genes and gene sets. Moreover, our real data analysis of fruit fly developmental time course RNA-Seq data demonstrates the NBMM identifies biologically relevant genes which are well justified by gene ontology analysis. CONCLUSIONS: The proposed method is powerful and efficient to detect biologically relevant DE genes and gene sets in time course RNA-Seq data. 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1180-9) contains supplementary material, which is available to authorized users. BioMed Central 2016-08-26 /pmc/articles/PMC5002174/ /pubmed/27565575 http://dx.doi.org/10.1186/s12859-016-1180-9 Text en © The Author(s) 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle	Methodology Article Sun, Xiaoxiao Dalpiaz, David Wu, Di S. Liu, Jun Zhong, Wenxuan Ma, Ping Statistical inference for time course RNA-Seq data using a negative binomial mixed-effect model
title	Statistical inference for time course RNA-Seq data using a negative binomial mixed-effect model
topic	Methodology Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5002174/ https://www.ncbi.nlm.nih.gov/pubmed/27565575 http://dx.doi.org/10.1186/s12859-016-1180-9
