Detecting differentially expressed genes for syndromes by considering change in mean and dispersion simultaneously

Bibliographic Details
Main authors: Ma, Chenchen; Ji, Tieming
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6148965/
https://www.ncbi.nlm.nih.gov/pubmed/30236056
http://dx.doi.org/10.1186/s12859-018-2354-4
Description
Summary: BACKGROUND: Using next-generation sequencing technology to measure gene expression, an empirically intriguing question concerns the identification of differentially expressed genes across treatment groups. Existing methods aim to identify genes whose mean expressions differ among treatment groups by assuming equal dispersion across all groups. For syndromes, however, various combinations of gene expression alterations can result in the same disease, leading to greater heteroscedasticity in the biological replicates in the disease group compared to the normal group. Traditional methods that only consider changes in the mean will fail to fully analyze gene expression in such a scenario. In addition, sequencing technology is relatively expensive; most labs can only afford a few replicates per treatment group, which poses further challenges to reliably estimating the mean and dispersion under each treatment condition. RESULTS: We designed an empirical Bayes method and a pooled permutation test to simultaneously consider the change in mean and dispersion across treatment groups. We further computed confidence intervals based on Bayes estimates to identify differentially expressed genes that are unique to each disease sample as well as those that are common across all disease samples. We illustrated our method by applying it to gene expression data from a large offspring syndrome experiment, which motivated this study. We compared our method to competing approaches through simulation studies that mimicked the real datasets to demonstrate the effectiveness of our proposed method. CONCLUSIONS: We will show that, compared to popular methods that only aim to find the difference in the mean, our method can capture greater variation in the disease group to effectively identify differentially expressed genes for syndromes.