Statistical methods on detecting differentially expressed genes for RNA-seq data
BACKGROUND: For RNA-seq data, the aggregated counts of the short reads from the same gene are used to approximate the gene expression level. The count data can be modelled as samples from Poisson distributions with possibly different parameters. To detect differentially expressed genes under two situ...
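Main authors: | Chen, Zhongxue; Liu, Jianzhong; Ng, Hon Keung Tony; Nadarajah, Saralees; Kaufman, Howard L; Yang, Jack Y; Deng, Youping |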
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2011 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3287564/ https://www.ncbi.nlm.nih.gov/pubmed/22784615 http://dx.doi.org/10.1186/1752-0509-5-S3-S1 |
_version_ | 1782224692175699968 |
---|---|
author | Chen, Zhongxue Liu, Jianzhong Ng, Hon Keung Tony Nadarajah, Saralees Kaufman, Howard L Yang, Jack Y Deng, Youping |
author_facet | Chen, Zhongxue Liu, Jianzhong Ng, Hon Keung Tony Nadarajah, Saralees Kaufman, Howard L Yang, Jack Y Deng, Youping |
author_sort | Chen, Zhongxue |
collection | PubMed |
description | BACKGROUND: For RNA-seq data, the aggregated counts of the short reads from the same gene are used to approximate the gene expression level. The count data can be modelled as samples from Poisson distributions with possibly different parameters. To detect differentially expressed genes under two situations, statistical methods for detecting the difference between two Poisson means are used. When the expression level of a gene is low, i.e., the count is small, it is usually more difficult to detect differences in means, and therefore statistical methods that are more powerful at low expression levels are particularly desirable. In the statistical literature, several methods have been proposed to compare two Poisson means (rates). In this paper, we compare these methods using simulated and real RNA-seq data. RESULTS: Through a simulation study and real data analysis, we find that the Wald test on log-transformed data is more powerful than the other methods, including the likelihood ratio test, which has power similar to that of the variance stabilizing transformation test; both are more powerful than the conditional exact test and the Fisher exact test. CONCLUSIONS: When the count data in RNA-seq can be reasonably modelled by a Poisson distribution, the Wald-Log test is more powerful and should be used to detect differentially expressed genes. |
format | Online Article Text |
id | pubmed-3287564 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3287564 2012-03-01 Statistical methods on detecting differentially expressed genes for RNA-seq data Chen, Zhongxue Liu, Jianzhong Ng, Hon Keung Tony Nadarajah, Saralees Kaufman, Howard L Yang, Jack Y Deng, Youping BMC Syst Biol Research Article BACKGROUND: For RNA-seq data, the aggregated counts of the short reads from the same gene are used to approximate the gene expression level. The count data can be modelled as samples from Poisson distributions with possibly different parameters. To detect differentially expressed genes under two situations, statistical methods for detecting the difference between two Poisson means are used. When the expression level of a gene is low, i.e., the count is small, it is usually more difficult to detect differences in means, and therefore statistical methods that are more powerful at low expression levels are particularly desirable. In the statistical literature, several methods have been proposed to compare two Poisson means (rates). In this paper, we compare these methods using simulated and real RNA-seq data. RESULTS: Through a simulation study and real data analysis, we find that the Wald test on log-transformed data is more powerful than the other methods, including the likelihood ratio test, which has power similar to that of the variance stabilizing transformation test; both are more powerful than the conditional exact test and the Fisher exact test. CONCLUSIONS: When the count data in RNA-seq can be reasonably modelled by a Poisson distribution, the Wald-Log test is more powerful and should be used to detect differentially expressed genes. BioMed Central 2011-12-23 /pmc/articles/PMC3287564/ /pubmed/22784615 http://dx.doi.org/10.1186/1752-0509-5-S3-S1 Text en Copyright ©2011 Chen et al. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Chen, Zhongxue Liu, Jianzhong Ng, Hon Keung Tony Nadarajah, Saralees Kaufman, Howard L Yang, Jack Y Deng, Youping Statistical methods on detecting differentially expressed genes for RNA-seq data |
title | Statistical methods on detecting differentially expressed genes for RNA-seq data |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3287564/ https://www.ncbi.nlm.nih.gov/pubmed/22784615 http://dx.doi.org/10.1186/1752-0509-5-S3-S1 |
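
The abstract in this record names several tests for comparing two Poisson means. As a rough illustration only, the sketch below implements common textbook forms of three of them (a Wald test on the log scale, a likelihood ratio test, and a conditional exact test) for a single gene with counts `x` and `y`. The function names, the exposure parameters `n1` and `n2` (for example, library sizes), and the exact form of each statistic are assumptions made for this example; they are not taken from the article and may differ from the definitions the authors used.

```python
import math


def wald_log_test(x, y, n1=1.0, n2=1.0):
    """Two-sided Wald test on the log scale for H0: rate1 == rate2.

    Assumed form: z = (ln(x/n1) - ln(y/n2)) / sqrt(1/x + 1/y).
    Undefined for zero counts, so those are rejected here.
    """
    if x <= 0 or y <= 0:
        raise ValueError("the log-scale Wald statistic needs positive counts")
    z = (math.log(x / n1) - math.log(y / n2)) / math.sqrt(1.0 / x + 1.0 / y)
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


def likelihood_ratio_test(x, y, n1=1.0, n2=1.0):
    """Likelihood ratio test, referred to a chi-square with 1 degree of freedom."""
    pooled = (x + y) / (n1 + n2)  # rate estimate under H0

    def term(count, exposure):
        # By convention 0 * ln(0) is taken as 0.
        return 0.0 if count == 0 else count * math.log(count / (exposure * pooled))

    stat = max(0.0, 2.0 * (term(x, n1) + term(y, n2)))
    # Survival function of chi-square(1): P(Z^2 > stat) = erfc(sqrt(stat / 2)).
    p = math.erfc(math.sqrt(stat / 2.0))
    return stat, p


def conditional_exact_test(x, y, n1=1.0, n2=1.0):
    """Conditional exact test: given x + y = k, x ~ Binomial(k, n1 / (n1 + n2)) under H0.

    The two-sided p-value sums the probabilities of all outcomes that are
    no more likely than the observed one.
    """
    k = x + y
    q = n1 / (n1 + n2)
    pmf = [math.comb(k, i) * q**i * (1.0 - q) ** (k - i) for i in range(k + 1)]
    cutoff = pmf[x] * (1.0 + 1e-9)  # small tolerance for floating-point ties
    return min(1.0, sum(prob for prob in pmf if prob <= cutoff))


if __name__ == "__main__":
    # Hypothetical counts for one gene under two conditions with equal exposure.
    print(wald_log_test(20, 10))
    print(likelihood_ratio_test(20, 10))
    print(conditional_exact_test(20, 10))
```

The sketch uses only the Python standard library so that it runs without extra dependencies; a real analysis would also need a rule for zero counts in the log-scale statistic and a multiple-testing correction across genes, neither of which is shown here.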