Single-Cell Transcriptome Profiling Simulation Reveals the Impact of Sequencing Parameters and Algorithms on Clustering

Despite the scRNA-seq analytic algorithms developed, their performance for cell clustering cannot be quantified due to the unknown “true” clusters. Referencing the transcriptomic heterogeneity of cell clusters, a “true” mRNA number matrix of cell individuals was defined as ground truth. Based on the...

Bibliographic Details
Main authors: Liu, Yunhe; Wu, Aoshen; Peng, Xueqing; Liu, Xiaona; Liu, Gang; Liu, Lei
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8304014/
https://www.ncbi.nlm.nih.gov/pubmed/34357088
http://dx.doi.org/10.3390/life11070716
author Liu, Yunhe
Wu, Aoshen
Peng, Xueqing
Liu, Xiaona
Liu, Gang
Liu, Lei
collection PubMed
description Despite the scRNA-seq analytic algorithms developed, their performance for cell clustering cannot be quantified due to the unknown “true” clusters. Referencing the transcriptomic heterogeneity of cell clusters, a “true” mRNA number matrix of cell individuals was defined as ground truth. Based on the matrix and the actual data generation procedure, a simulation program (SSCRNA) for raw data was developed. Subsequently, the consistency between simulated data and real data was evaluated. Furthermore, the impact of sequencing depth and algorithms for analyses on cluster accuracy was quantified. As a result, the simulation result was highly consistent with that of the actual data. Among the normalization algorithms, the Gaussian normalization method was the more recommended. As for the clustering algorithms, the K-means clustering method was more stable than K-means plus Louvain clustering. In conclusion, the scRNA simulation algorithm developed restores the actual data generation process, discovers the impact of parameters on classification, compares the normalization/clustering algorithms, and provides novel insight into scRNA analyses.
format Online
Article
Text
id pubmed-8304014
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8304014 2021-07-25 Single-Cell Transcriptome Profiling Simulation Reveals the Impact of Sequencing Parameters and Algorithms on Clustering Liu, Yunhe Wu, Aoshen Peng, Xueqing Liu, Xiaona Liu, Gang Liu, Lei Life (Basel) Article Despite the scRNA-seq analytic algorithms developed, their performance for cell clustering cannot be quantified due to the unknown “true” clusters. Referencing the transcriptomic heterogeneity of cell clusters, a “true” mRNA number matrix of cell individuals was defined as ground truth. Based on the matrix and the actual data generation procedure, a simulation program (SSCRNA) for raw data was developed. Subsequently, the consistency between simulated data and real data was evaluated. Furthermore, the impact of sequencing depth and algorithms for analyses on cluster accuracy was quantified. As a result, the simulation result was highly consistent with that of the actual data. Among the normalization algorithms, the Gaussian normalization method was the more recommended. As for the clustering algorithms, the K-means clustering method was more stable than K-means plus Louvain clustering. In conclusion, the scRNA simulation algorithm developed restores the actual data generation process, discovers the impact of parameters on classification, compares the normalization/clustering algorithms, and provides novel insight into scRNA analyses. MDPI 2021-07-19 /pmc/articles/PMC8304014/ /pubmed/34357088 http://dx.doi.org/10.3390/life11070716 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Single-Cell Transcriptome Profiling Simulation Reveals the Impact of Sequencing Parameters and Algorithms on Clustering
topic Article