RFCell: A Gene Selection Approach for scRNA-seq Clustering Based on Permutation and Random Forest

Bibliographic Details
Main authors: Zhao, Yuan; Fang, Zhao-Yu; Lin, Cui-Xiang; Deng, Chao; Xu, Yun-Pei; Li, Hong-Dong
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8354212/
https://www.ncbi.nlm.nih.gov/pubmed/34386033
http://dx.doi.org/10.3389/fgene.2021.665843
author Zhao, Yuan
Fang, Zhao-Yu
Lin, Cui-Xiang
Deng, Chao
Xu, Yun-Pei
Li, Hong-Dong
collection PubMed
description In recent years, single-cell RNA-seq (scRNA-seq) has become increasingly popular in fields such as biology and medical research. Analyzing scRNA-seq data can reveal complex cell populations and infer single-cell trajectories in cell development. Clustering is one of the most important methods for analyzing scRNA-seq data. In this paper, we focus on improving scRNA-seq clustering through gene selection, which also reduces the dimensionality of scRNA-seq data. Studies have shown that gene selection for scRNA-seq data can improve clustering accuracy, so it is important to select genes with cell type specificity. Combined with clustering methods, gene selection can also improve cell type identification. Here, we propose RFCell, a supervised gene selection method based on permutation and random forest classification. We first use RFCell and three existing gene selection methods to select gene sets on 10 scRNA-seq data sets. Then, three classical clustering algorithms are used to cluster the cells using the gene sets obtained by these gene selection methods. We found that the gene selection performance of RFCell was better than that of the other gene selection methods.
format Online
Article
Text
id pubmed-8354212
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8354212 2021-08-11
Front Genet
Genetics
Frontiers Media S.A. 2021-07-27
/pmc/articles/PMC8354212/ /pubmed/34386033 http://dx.doi.org/10.3389/fgene.2021.665843
Text en
Copyright © 2021 Zhao, Fang, Lin, Deng, Xu and Li.
https://creativecommons.org/licenses/by/4.0/
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title RFCell: A Gene Selection Approach for scRNA-seq Clustering Based on Permutation and Random Forest
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8354212/
https://www.ncbi.nlm.nih.gov/pubmed/34386033
http://dx.doi.org/10.3389/fgene.2021.665843