Benchmarking RNA-seq differential expression analysis methods using spike-in and simulation data
Benchmarking RNA-seq differential expression analysis methods using spike-in and simulated RNA-seq data has often yielded inconsistent results. The spike-in data, which were generated from the same bulk RNA sample, only represent technical variability, making the test results less reliable. We compa...
Main Authors: | Baik, Bukyung, Yoon, Sora, Nam, Dougu |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7192453/ https://www.ncbi.nlm.nih.gov/pubmed/32353015 http://dx.doi.org/10.1371/journal.pone.0232271 |
_version_ | 1783528012108529664 |
---|---|
author | Baik, Bukyung Yoon, Sora Nam, Dougu |
author_facet | Baik, Bukyung Yoon, Sora Nam, Dougu |
author_sort | Baik, Bukyung |
collection | PubMed |
description | Benchmarking RNA-seq differential expression analysis methods using spike-in and simulated RNA-seq data has often yielded inconsistent results. The spike-in data, which were generated from the same bulk RNA sample, only represent technical variability, making the test results less reliable. We compared the performance of 12 differential expression analysis methods for RNA-seq data, including recent variants in widely used software packages, using both RNA spike-in and simulation data for negative binomial (NB) model. Performance of edgeR, DESeq2, and ROTS was particularly different between the two benchmark tests. Then, each method was tested under most extensive simulation conditions especially demonstrating the large impacts of proportion, dispersion, and balance of differentially expressed (DE) genes. DESeq2, a robust version of edgeR (edgeR.rb), voom with TMM normalization (voom.tmm) and sample weights (voom.sw) showed an overall good performance regardless of presence of outliers and proportion of DE genes. The performance of RNA-seq DE gene analysis methods substantially depended on the benchmark used. Based on the simulation results, suitable methods were suggested under various test conditions. |
format | Online Article Text |
id | pubmed-7192453 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-71924532020-05-11 Benchmarking RNA-seq differential expression analysis methods using spike-in and simulation data Baik, Bukyung Yoon, Sora Nam, Dougu PLoS One Research Article Benchmarking RNA-seq differential expression analysis methods using spike-in and simulated RNA-seq data has often yielded inconsistent results. The spike-in data, which were generated from the same bulk RNA sample, only represent technical variability, making the test results less reliable. We compared the performance of 12 differential expression analysis methods for RNA-seq data, including recent variants in widely used software packages, using both RNA spike-in and simulation data for negative binomial (NB) model. Performance of edgeR, DESeq2, and ROTS was particularly different between the two benchmark tests. Then, each method was tested under most extensive simulation conditions especially demonstrating the large impacts of proportion, dispersion, and balance of differentially expressed (DE) genes. DESeq2, a robust version of edgeR (edgeR.rb), voom with TMM normalization (voom.tmm) and sample weights (voom.sw) showed an overall good performance regardless of presence of outliers and proportion of DE genes. The performance of RNA-seq DE gene analysis methods substantially depended on the benchmark used. Based on the simulation results, suitable methods were suggested under various test conditions. Public Library of Science 2020-04-30 /pmc/articles/PMC7192453/ /pubmed/32353015 http://dx.doi.org/10.1371/journal.pone.0232271 Text en © 2020 Baik et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Baik, Bukyung Yoon, Sora Nam, Dougu Benchmarking RNA-seq differential expression analysis methods using spike-in and simulation data |
title | Benchmarking RNA-seq differential expression analysis methods using spike-in and simulation data |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7192453/ https://www.ncbi.nlm.nih.gov/pubmed/32353015 http://dx.doi.org/10.1371/journal.pone.0232271 |