RFCell: A Gene Selection Approach for scRNA-seq Clustering Based on Permutation and Random Forest
In recent years, the application of single cell RNA-seq (scRNA-seq) has become more and more popular in fields such as biology and medical research. Analyzing scRNA-seq data can discover complex cell populations and infer single-cell trajectories in cell development. Clustering is one of the most im...
Main authors: | Zhao, Yuan; Fang, Zhao-Yu; Lin, Cui-Xiang; Deng, Chao; Xu, Yun-Pei; Li, Hong-Dong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2021 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8354212/ https://www.ncbi.nlm.nih.gov/pubmed/34386033 http://dx.doi.org/10.3389/fgene.2021.665843 |
author | Zhao, Yuan; Fang, Zhao-Yu; Lin, Cui-Xiang; Deng, Chao; Xu, Yun-Pei; Li, Hong-Dong |
---|---|
author_sort | Zhao, Yuan |
collection | PubMed |
description | In recent years, single-cell RNA-seq (scRNA-seq) has become increasingly popular in fields such as biology and medical research. Analyzing scRNA-seq data can reveal complex cell populations and infer single-cell trajectories in cell development. Clustering is one of the most important methods for analyzing scRNA-seq data. In this paper, we focus on improving scRNA-seq clustering through gene selection, which also reduces the dimensionality of scRNA-seq data. Studies have shown that gene selection for scRNA-seq data can improve clustering accuracy, so it is important to select genes with cell type specificity. Here, we propose RFCell, a supervised gene selection method based on permutation and random forest classification. We first use RFCell and three existing gene selection methods to select gene sets on 10 scRNA-seq data sets. Then, three classical clustering algorithms are used to cluster the cells using the genes selected by each method. We found that the gene selection performance of RFCell was better than that of the other gene selection methods. |
format | Online Article Text |
id | pubmed-8354212 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8354212 2021-08-11 Front Genet Genetics Frontiers Media S.A. 2021-07-27 /pmc/articles/PMC8354212/ /pubmed/34386033 http://dx.doi.org/10.3389/fgene.2021.665843 Text en Copyright © 2021 Zhao, Fang, Lin, Deng, Xu and Li. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | RFCell: A Gene Selection Approach for scRNA-seq Clustering Based on Permutation and Random Forest |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8354212/ https://www.ncbi.nlm.nih.gov/pubmed/34386033 http://dx.doi.org/10.3389/fgene.2021.665843 |
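The abstract describes a two-stage workflow: rank genes with a method built on permutation and random forest classification, then cluster the cells using only the selected genes. This record does not give the details of RFCell itself, so the sketch below is not the authors' algorithm. It swaps in scikit-learn's generic permutation importance on a random forest, followed by k-means, to illustrate the general idea. The toy data, the gene count, `n_selected`, and the choice of k-means and the adjusted Rand index are all assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.metrics import adjusted_rand_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical toy data (not from the paper): 300 cells x 20 genes, 3 cell types.
n_per_type, n_genes = 100, 20
labels = np.repeat([0, 1, 2], n_per_type)
X = rng.normal(0.0, 1.0, size=(labels.size, n_genes))
X[labels == 0, 0] += 6.0  # gene 0 is a marker for cell type 0
X[labels == 1, 1] += 6.0  # gene 1 is a marker for cell type 1

# Supervised step: fit a random forest on labeled cells.
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.3, stratify=labels, random_state=0
)
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X_train, y_train)

# Permutation step: shuffle one gene at a time on held-out cells and
# record how much the classification accuracy drops.
result = permutation_importance(
    forest, X_test, y_test, n_repeats=10, random_state=0
)

# Keep the genes with the largest mean accuracy drop (n_selected is an assumption).
n_selected = 2
selected_genes = np.argsort(result.importances_mean)[::-1][:n_selected]

# Clustering step: cluster all cells using only the selected genes,
# then compare the clusters with the known cell types.
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(
    X[:, selected_genes]
)
ari = adjusted_rand_score(labels, clusters)
print(sorted(selected_genes.tolist()), round(ari, 3))
```

Importance is measured on held-out cells so that genes the forest merely memorized during training are not rewarded. Each marker gene in the toy data identifies a different cell type, which keeps the two markers from masking each other when one of them is permuted.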