SIEVE: identifying robust single cell variable genes for single-cell RNA sequencing data
Single-cell RNA-seq data analysis generally requires quality control, normalization, highly variable gene screening, dimensionality reduction and clustering. Among these processes, downstream analyses, including dimensionality reduction and clustering, are sensitive to the selection of highly variabl...
Main authors: | Zhang, Yinan, Xie, Xiaowei, Wu, Peng, Zhu, Ping |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Lippincott Williams & Wilkins 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8974938/ https://www.ncbi.nlm.nih.gov/pubmed/35402832 http://dx.doi.org/10.1097/BS9.0000000000000072 |
_version_ | 1784680305153540096 |
---|---|
author | Zhang, Yinan Xie, Xiaowei Wu, Peng Zhu, Ping |
author_facet | Zhang, Yinan Xie, Xiaowei Wu, Peng Zhu, Ping |
author_sort | Zhang, Yinan |
collection | PubMed |
description | Single-cell RNA-seq data analysis generally requires quality control, normalization, highly variable gene screening, dimensionality reduction and clustering. Among these processes, downstream analyses, including dimensionality reduction and clustering, are sensitive to the selection of highly variable genes. Though an increasing number of tools for selecting highly variable genes have been developed, an evaluation of their performance and a general strategy are lacking. Here, we compare the performance of nine commonly used methods for screening variable genes using single-cell RNA-seq data from hematopoietic stem/progenitor cells and mature blood cells, and find that SCHS outperforms the other methods with regard to reproducibility and accuracy. However, this method favors the selection of highly expressed genes. We further propose a new strategy, SIEVE (SIngle-cEll Variable gEnes), which uses multiple rounds of random sampling, thereby minimizing stochastic noise and identifying a robust set of variable genes. Moreover, SIEVE recovers lowly expressed genes as variable genes and substantially improves the accuracy of single-cell classification, especially for the methods with lower reproducibility. The SIEVE software is freely available at https://github.com/YinanZhang522/SIEVE. |
format | Online Article Text |
id | pubmed-8974938 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Lippincott Williams & Wilkins |
record_format | MEDLINE/PubMed |
spelling | pubmed-8974938 2022-04-07 SIEVE: identifying robust single cell variable genes for single-cell RNA sequencing data Zhang, Yinan Xie, Xiaowei Wu, Peng Zhu, Ping Blood Sci Research Article Single-cell RNA-seq data analysis generally requires quality control, normalization, highly variable gene screening, dimensionality reduction and clustering. Among these processes, downstream analyses, including dimensionality reduction and clustering, are sensitive to the selection of highly variable genes. Though an increasing number of tools for selecting highly variable genes have been developed, an evaluation of their performance and a general strategy are lacking. Here, we compare the performance of nine commonly used methods for screening variable genes using single-cell RNA-seq data from hematopoietic stem/progenitor cells and mature blood cells, and find that SCHS outperforms the other methods with regard to reproducibility and accuracy. However, this method favors the selection of highly expressed genes. We further propose a new strategy, SIEVE (SIngle-cEll Variable gEnes), which uses multiple rounds of random sampling, thereby minimizing stochastic noise and identifying a robust set of variable genes. Moreover, SIEVE recovers lowly expressed genes as variable genes and substantially improves the accuracy of single-cell classification, especially for the methods with lower reproducibility. The SIEVE software is freely available at https://github.com/YinanZhang522/SIEVE. Lippincott Williams & Wilkins 2021-04-28 /pmc/articles/PMC8974938/ /pubmed/35402832 http://dx.doi.org/10.1097/BS9.0000000000000072 Text en Copyright © 2021 The Authors. Published by Wolters Kluwer Health Inc., on behalf of the Chinese Association for Blood Sciences. https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution-Non Commercial-No Derivatives License 4.0 (CC BY-NC-ND), where it is permissible to download and share the work provided it is properly cited. The work cannot be changed in any way or used commercially without permission from the journal. http://creativecommons.org/licenses/by-nc-nd/4.0 (https://creativecommons.org/licenses/by-nc-nd/4.0/) |
spellingShingle | Research Article Zhang, Yinan Xie, Xiaowei Wu, Peng Zhu, Ping SIEVE: identifying robust single cell variable genes for single-cell RNA sequencing data |
title | SIEVE: identifying robust single cell variable genes for single-cell RNA sequencing data |
title_full | SIEVE: identifying robust single cell variable genes for single-cell RNA sequencing data |
title_fullStr | SIEVE: identifying robust single cell variable genes for single-cell RNA sequencing data |
title_full_unstemmed | SIEVE: identifying robust single cell variable genes for single-cell RNA sequencing data |
title_short | SIEVE: identifying robust single cell variable genes for single-cell RNA sequencing data |
title_sort | sieve: identifying robust single cell variable genes for single-cell rna sequencing data |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8974938/ https://www.ncbi.nlm.nih.gov/pubmed/35402832 http://dx.doi.org/10.1097/BS9.0000000000000072 |
work_keys_str_mv | AT zhangyinan sieveidentifyingrobustsinglecellvariablegenesforsinglecellrnasequencingdata AT xiexiaowei sieveidentifyingrobustsinglecellvariablegenesforsinglecellrnasequencingdata AT wupeng sieveidentifyingrobustsinglecellvariablegenesforsinglecellrnasequencingdata AT zhuping sieveidentifyingrobustsinglecellvariablegenesforsinglecellrnasequencingdata |
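
The description above says SIEVE uses multiple rounds of random sampling to obtain a robust set of variable genes. The record gives no further detail, so the following is only a minimal sketch of that general idea and not the authors' implementation. The function names, the use of plain variance as the ranking criterion, the subsampling fraction, and the selection-frequency threshold are all assumptions made for illustration.

```python
import random
from collections import Counter
from statistics import pvariance


def top_variable_genes(expr, cells, k):
    """Rank genes by population variance over the given cells; return the top k names.

    Plain variance is an assumed stand-in for whichever variable-gene
    method would be plugged in here.
    """
    variances = {
        gene: pvariance([values[c] for c in cells])
        for gene, values in expr.items()
    }
    # Sort by descending variance, breaking ties by gene name for determinism.
    ranked = sorted(variances, key=lambda g: (-variances[g], g))
    return ranked[:k]


def robust_variable_genes(expr, n_cells, k, rounds=50, fraction=0.7,
                          min_freq=0.5, seed=0):
    """Keep genes selected in at least min_freq of the random subsampling rounds.

    All default parameter values are hypothetical, not taken from the paper.
    """
    rng = random.Random(seed)
    sample_size = max(2, int(n_cells * fraction))
    counts = Counter()
    for _ in range(rounds):
        # Draw a random subset of cells without replacement.
        cells = rng.sample(range(n_cells), sample_size)
        counts.update(top_variable_genes(expr, cells, k))
    return sorted(g for g, c in counts.items() if c / rounds >= min_freq)


# Toy expression table: gene name -> expression value per cell (10 cells).
expr = {
    "geneA": [0, 10, 0, 10, 0, 10, 0, 10, 0, 10],
    "geneB": [5, 5, 5, 5, 5, 5, 5, 5, 5, 5],
    "geneC": [1, 2, 1, 2, 1, 2, 1, 2, 1, 2],
}
selected = robust_variable_genes(expr, n_cells=10, k=1)
print(selected)
```

Counting how often a gene is selected across subsamples is one simple way to damp the effect of any single noisy draw; the published software at the GitHub URL in the record should be consulted for the actual procedure.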