A benchmark study of simulation methods for single-cell RNA sequencing data
Single-cell RNA-seq (scRNA-seq) data simulation is critical for evaluating computational methods for analysing scRNA-seq data especially when ground truth is experimentally unattainable. The reliability of evaluation depends on the ability of simulation methods to capture properties of experimental...
Main authors: | Cao, Yue; Yang, Pengyi; Yang, Jean Yee Hwa |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8617278/ https://www.ncbi.nlm.nih.gov/pubmed/34824223 http://dx.doi.org/10.1038/s41467-021-27130-w |
_version_ | 1784604489474375680 |
---|---|
author | Cao, Yue Yang, Pengyi Yang, Jean Yee Hwa |
author_facet | Cao, Yue Yang, Pengyi Yang, Jean Yee Hwa |
author_sort | Cao, Yue |
collection | PubMed |
description | Single-cell RNA-seq (scRNA-seq) data simulation is critical for evaluating computational methods for analysing scRNA-seq data especially when ground truth is experimentally unattainable. The reliability of evaluation depends on the ability of simulation methods to capture properties of experimental data. However, while many scRNA-seq data simulation methods have been proposed, a systematic evaluation of these methods is lacking. We develop a comprehensive evaluation framework, SimBench, including a kernel density estimation measure to benchmark 12 simulation methods through 35 scRNA-seq experimental datasets. We evaluate the simulation methods on a panel of data properties, ability to maintain biological signals, scalability and applicability. Our benchmark uncovers performance differences among the methods and highlights the varying difficulties in simulating data characteristics. Furthermore, we identify several limitations including maintaining heterogeneity of distribution. These results, together with the framework and datasets made publicly available as R packages, will guide simulation methods selection and their future development. |
format | Online Article Text |
id | pubmed-8617278 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-8617278 2021-12-10 A benchmark study of simulation methods for single-cell RNA sequencing data Cao, Yue Yang, Pengyi Yang, Jean Yee Hwa Nat Commun Article Single-cell RNA-seq (scRNA-seq) data simulation is critical for evaluating computational methods for analysing scRNA-seq data especially when ground truth is experimentally unattainable. The reliability of evaluation depends on the ability of simulation methods to capture properties of experimental data. However, while many scRNA-seq data simulation methods have been proposed, a systematic evaluation of these methods is lacking. We develop a comprehensive evaluation framework, SimBench, including a kernel density estimation measure to benchmark 12 simulation methods through 35 scRNA-seq experimental datasets. We evaluate the simulation methods on a panel of data properties, ability to maintain biological signals, scalability and applicability. Our benchmark uncovers performance differences among the methods and highlights the varying difficulties in simulating data characteristics. Furthermore, we identify several limitations including maintaining heterogeneity of distribution. These results, together with the framework and datasets made publicly available as R packages, will guide simulation methods selection and their future development. Nature Publishing Group UK 2021-11-25 /pmc/articles/PMC8617278/ /pubmed/34824223 http://dx.doi.org/10.1038/s41467-021-27130-w Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Cao, Yue Yang, Pengyi Yang, Jean Yee Hwa A benchmark study of simulation methods for single-cell RNA sequencing data |
title | A benchmark study of simulation methods for single-cell RNA sequencing data |
title_full | A benchmark study of simulation methods for single-cell RNA sequencing data |
title_fullStr | A benchmark study of simulation methods for single-cell RNA sequencing data |
title_full_unstemmed | A benchmark study of simulation methods for single-cell RNA sequencing data |
title_short | A benchmark study of simulation methods for single-cell RNA sequencing data |
title_sort | benchmark study of simulation methods for single-cell rna sequencing data |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8617278/ https://www.ncbi.nlm.nih.gov/pubmed/34824223 http://dx.doi.org/10.1038/s41467-021-27130-w |