
Exaggerated false positives by popular differential expression methods when analyzing human population samples

Bibliographic Details
Main Authors:	Li, Yumei; Ge, Xinzhou; Peng, Fanglue; Li, Wei; Li, Jingyi Jessica
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2022
Subjects:	Short Report
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8922736/ https://www.ncbi.nlm.nih.gov/pubmed/35292087 http://dx.doi.org/10.1186/s13059-022-02648-4

Description
Summary:	When identifying differentially expressed genes between two conditions in human population RNA-seq samples, we found by permutation analysis that two popular bioinformatics methods, DESeq2 and edgeR, have unexpectedly high false discovery rates (FDRs). Expanding the analysis to limma-voom, NOISeq, dearseq, and the Wilcoxon rank-sum test, we found that FDR control often fails for these methods, with the exception of the Wilcoxon rank-sum test. In particular, the actual FDRs of DESeq2 and edgeR sometimes exceed 20% when the target FDR is 5%. Based on these results, we recommend the Wilcoxon rank-sum test for population-level RNA-seq studies with large sample sizes. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13059-022-02648-4.
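
The recommendation in the summary can be illustrated with a minimal, dependency-free sketch. It is not the authors' code. The function names (`rank_sum_pvalue`, `benjamini_hochberg`, `count_discoveries`) and the synthetic data are hypothetical. The p-value uses the normal approximation with a tie correction, and Benjamini-Hochberg adjustment is assumed as the FDR procedure, since the summary does not name one. Permuting the condition labels mimics the permutation check described above: on permuted labels, a method that controls the FDR should report few or no discoveries.

```python
import math
import random


def rank_sum_pvalue(x, y):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation, tie-corrected)."""
    combined = sorted((v, g) for g, vals in ((0, x), (1, y)) for v in vals)
    n1, n2 = len(x), len(y)
    n = n1 + n2
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and combined[j][0] == combined[i][0]:
            j += 1
        # Positions i..j-1 hold tied values; they share the average of ranks i+1..j.
        avg_rank = (i + 1 + j) / 2.0
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if combined[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return 1.0  # all values identical: no evidence of a difference
    z = (u - mean_u) / math.sqrt(var_u)
    return math.erfc(abs(z) / math.sqrt(2))


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def count_discoveries(expr, labels, alpha=0.05):
    """Number of genes (rows of expr) called significant at target FDR alpha."""
    pvals = []
    for gene in expr:
        a = [v for v, lab in zip(gene, labels) if lab == 0]
        b = [v for v, lab in zip(gene, labels) if lab == 1]
        pvals.append(rank_sum_pvalue(a, b))
    return sum(q <= alpha for q in benjamini_hochberg(pvals))


if __name__ == "__main__":
    # Synthetic data (an assumption for illustration): 200 genes, 40 samples,
    # with the first 20 genes shifted upward in condition 1.
    random.seed(0)
    labels = [0] * 20 + [1] * 20
    expr = [
        [random.gauss(3.0 if (g < 20 and lab == 1) else 0.0, 1.0) for lab in labels]
        for g in range(200)
    ]
    print("discoveries with true labels:", count_discoveries(expr, labels))

    permuted = labels[:]
    random.shuffle(permuted)  # breaks any association between condition and expression
    print("discoveries with permuted labels:", count_discoveries(expr, permuted))
```

For real analyses, an established implementation such as `scipy.stats.mannwhitneyu` would normally replace the hand-written test. The pure-Python version is used here only to keep the sketch self-contained.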
