multiDE: a dimension reduced model based statistical method for differential expression analysis using RNA-sequencing data with multiple treatment conditions

BACKGROUND: The growing complexity of biological experiment design based on high-throughput RNA sequencing (RNA-seq) is calling for more accommodative statistical tools. We focus on differential expression (DE) analysis using RNA-seq data in the presence of multiple treatment conditions. RESULTS: We...

Bibliographic Details
Main authors: Kang, Guangliang, Du, Li, Zhang, Hong
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4917940/
https://www.ncbi.nlm.nih.gov/pubmed/27334001
http://dx.doi.org/10.1186/s12859-016-1111-9
author Kang, Guangliang
Du, Li
Zhang, Hong
collection PubMed
description BACKGROUND: The growing complexity of biological experiment design based on high-throughput RNA sequencing (RNA-seq) is calling for more accommodative statistical tools. We focus on differential expression (DE) analysis using RNA-seq data in the presence of multiple treatment conditions. RESULTS: We propose a novel method, multiDE, for facilitating DE analysis using RNA-seq read count data with multiple treatment conditions. The read count is assumed to follow a log-linear model incorporating two factors (i.e., condition and gene), where an interaction term is used to quantify the association between gene and condition. The number of degrees of freedom is reduced to one through the first-order decomposition of the interaction, leading to a dramatic improvement in power for testing DE genes when the number of conditions is greater than two. In our simulation settings, multiDE outperformed the benchmark methods (i.e., edgeR and DESeq2) even when the underlying model was severely misspecified, and the power gain increased with the number of conditions. In the application to two real datasets, multiDE identified more biologically meaningful DE genes than the benchmark methods. An R package implementing multiDE is publicly available at http://homepage.fudan.edu.cn/zhangh/softwares/multiDE. CONCLUSIONS: When the number of conditions is two, multiDE performs comparably with the benchmark methods. When the number of conditions is greater than two, multiDE outperforms the benchmark methods.
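The description above says the gene-by-condition interaction is reduced to one degree of freedom through a first-order decomposition. The record does not give the formula, so the following is only a minimal sketch under stated assumptions: the log mean count of gene g under condition d is written as alpha_g + beta_d + gamma_gd, and gamma_gd is approximated by a rank-one product u_g * v_d with v shared across genes, so that each gene carries a single interaction parameter u_g. The function names, the toy numbers, and the use of double-centering plus power iteration are illustrative choices, not the multiDE estimation procedure, which works on read counts.

```python
import math

def interaction_matrix(log_mean):
    """Double-center a genes x conditions matrix of log mean counts.

    Removes gene (row) and condition (column) main effects, leaving the
    interaction term gamma[g][d]. Illustrative helper, not from multiDE.
    """
    n_genes, n_conds = len(log_mean), len(log_mean[0])
    row_mean = [sum(row) / n_conds for row in log_mean]
    col_mean = [sum(log_mean[g][d] for g in range(n_genes)) / n_genes
                for d in range(n_conds)]
    grand = sum(row_mean) / n_genes
    return [[log_mean[g][d] - row_mean[g] - col_mean[d] + grand
             for d in range(n_conds)] for g in range(n_genes)]

def rank_one(gamma, n_iter=200):
    """Leading rank-one term gamma[g][d] ~ u[g] * v[d] via power iteration.

    v is a unit vector shared by all genes; u[g] is the single
    interaction parameter left for gene g.
    """
    n_genes, n_conds = len(gamma), len(gamma[0])
    # Start from the row with the largest norm to avoid a zero start vector.
    v = list(max(gamma, key=lambda row: sum(x * x for x in row)))
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return [0.0] * n_genes, [0.0] * n_conds
    v = [x / norm for x in v]
    for _ in range(n_iter):
        u = [sum(gamma[g][d] * v[d] for d in range(n_conds))
             for g in range(n_genes)]
        w = [sum(gamma[g][d] * u[g] for g in range(n_genes))
             for d in range(n_conds)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    u = [sum(gamma[g][d] * v[d] for d in range(n_conds))
         for g in range(n_genes)]
    return u, v

# Toy data (made up): 4 genes, 3 conditions; genes 0 and 1 have no interaction.
alpha = [2.0, 3.0, 1.5, 2.5]   # gene effects
beta = [0.1, 0.0, -0.1]        # condition effects
a = [0.0, 0.0, 1.0, -1.0]      # per-gene interaction strength
b = [-1.0, 0.0, 1.0]           # shared condition pattern
log_mean = [[alpha[g] + beta[d] + a[g] * b[d] for d in range(3)]
            for g in range(4)]

gamma = interaction_matrix(log_mean)
u, v = rank_one(gamma)
for g, u_g in enumerate(u):
    print(f"gene {g}: u = {u_g:.4f}")
```

In this sketch a gene with u_g equal to zero shows no condition-specific effect, so a test on u_g alone involves one parameter per gene regardless of how many conditions there are, whereas an unrestricted interaction would need one parameter fewer than the number of conditions.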
format Online
Article
Text
id pubmed-4917940
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4917940 2016-06-28 multiDE: a dimension reduced model based statistical method for differential expression analysis using RNA-sequencing data with multiple treatment conditions Kang, Guangliang Du, Li Zhang, Hong BMC Bioinformatics Methodology Article
BioMed Central 2016-06-22 /pmc/articles/PMC4917940/ /pubmed/27334001 http://dx.doi.org/10.1186/s12859-016-1111-9 Text en © The Author(s) 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title multiDE: a dimension reduced model based statistical method for differential expression analysis using RNA-sequencing data with multiple treatment conditions
topic Methodology Article