MarkerCount: A stable, count-based cell type identifier for single-cell RNA-seq experiments

Cell type identification is a key step toward downstream analysis of single cell RNA-seq experiments. Although the primary objective is to identify known cell populations, good identifiers should also recognize unknown clusters which may represent a previously unidentified subpopulation of a known c...

Bibliographic Details
Main Authors: Kim, HanByeol; Lee, Joongho; Kang, Keunsoo; Yoon, Seokhyun
Format: Online Article Text
Language: English
Published: Research Network of Computational and Structural Biotechnology 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9233224/
https://www.ncbi.nlm.nih.gov/pubmed/35782735
http://dx.doi.org/10.1016/j.csbj.2022.06.010
_version_ 1784735713663647744
author Kim, HanByeol
Lee, Joongho
Kang, Keunsoo
Yoon, Seokhyun
author_facet Kim, HanByeol
Lee, Joongho
Kang, Keunsoo
Yoon, Seokhyun
author_sort Kim, HanByeol
collection PubMed
description Cell type identification is a key step toward downstream analysis of single cell RNA-seq experiments. Although the primary objective is to identify known cell populations, good identifiers should also recognize unknown clusters which may represent a previously unidentified subpopulation of a known cell type or tumor cells of an unknown phenotype. Herein, we present MarkerCount, which utilizes the number of expressed markers, regardless of their expression level. MarkerCount works in both reference- and marker-based modes, where the latter utilizes existing lists of markers, while the former uses a pre-annotated dataset to find markers to be used for cell type identification. In both modes, MarkerCount first utilizes the “marker count” to identify cell populations and, after rejecting uncertain cells, reassigns cell type and/or makes corrections on a cluster basis. The performance of MarkerCount was evaluated and compared with existing identifiers, both marker- and reference-based, that can be customized using publicly available datasets and marker databases. The results show that MarkerCount performs better in the identification of known populations as well as of unknown ones, when compared to other reference- and marker-based cell type identifiers for most of the datasets analyzed.
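The description above outlines three steps: count expressed markers per cell type, reject uncertain cells, then correct labels on a cluster basis. The following is a minimal sketch of that idea, not the authors' implementation. The function names, the `min_fraction` and `min_share` thresholds, the use of a fraction of expressed markers (so marker lists of different lengths are comparable), the majority-vote correction rule, and the toy marker lists are all assumptions made for illustration.

```python
from collections import Counter


def marker_counts(cell, markers):
    """Per cell type, count how many of its markers are expressed (> 0) in one cell.

    The expression level itself is ignored; only presence is counted.
    """
    return {
        ctype: sum(1 for gene in genes if cell.get(gene, 0) > 0)
        for ctype, genes in markers.items()
    }


def classify_cell(cell, markers, min_fraction=0.5):
    """Pick the type with the largest fraction of expressed markers.

    Cells whose best fraction is below `min_fraction` (a hypothetical
    threshold) are rejected as "unknown".
    """
    counts = marker_counts(cell, markers)
    scores = {ctype: counts[ctype] / len(markers[ctype]) for ctype in markers}
    best = max(scores, key=scores.get)
    return best if scores[best] >= min_fraction else "unknown"


def correct_by_cluster(labels, clusters, min_share=0.5):
    """Reassign all cells in a cluster to that cluster's majority label.

    If no label exceeds `min_share` of the cluster (a hypothetical rule),
    the whole cluster is marked "unknown".
    """
    corrected = list(labels)
    for cluster_id in set(clusters):
        members = [i for i, c in enumerate(clusters) if c == cluster_id]
        label, n = Counter(labels[i] for i in members).most_common(1)[0]
        final = label if n / len(members) > min_share else "unknown"
        for i in members:
            corrected[i] = final
    return corrected


# Toy data: illustrative marker lists and per-cell expression values.
markers = {
    "T cell": ["CD3D", "CD3E", "CD2"],
    "B cell": ["CD79A", "MS4A1", "CD19"],
}
cells = [
    {"CD3D": 5, "CD3E": 1},             # 2 of 3 T cell markers expressed
    {"CD3D": 1, "CD3E": 40, "CD2": 2},  # 3 of 3, expression level is ignored
    {"CD3D": 1},                        # 1 of 3, rejected as uncertain
    {"CD79A": 2, "MS4A1": 9, "CD19": 1},
    {"CD79A": 1, "MS4A1": 3},
]
clusters = [0, 0, 0, 1, 1]  # cluster ids from any upstream clustering step

labels = [classify_cell(cell, markers) for cell in cells]
corrected = correct_by_cluster(labels, clusters)
print(labels)     # ['T cell', 'T cell', 'unknown', 'B cell', 'B cell']
print(corrected)  # ['T cell', 'T cell', 'T cell', 'B cell', 'B cell']
```

In this toy run the third cell is first rejected because only one of three markers is expressed, and is then reassigned because it sits in a cluster dominated by T cells.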
format Online
Article
Text
id pubmed-9233224
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Research Network of Computational and Structural Biotechnology
record_format MEDLINE/PubMed
spelling pubmed-9233224 2022-07-01 MarkerCount: A stable, count-based cell type identifier for single-cell RNA-seq experiments Kim, HanByeol Lee, Joongho Kang, Keunsoo Yoon, Seokhyun Comput Struct Biotechnol J Research Article Cell type identification is a key step toward downstream analysis of single cell RNA-seq experiments. Although the primary objective is to identify known cell populations, good identifiers should also recognize unknown clusters which may represent a previously unidentified subpopulation of a known cell type or tumor cells of an unknown phenotype. Herein, we present MarkerCount, which utilizes the number of expressed markers, regardless of their expression level. MarkerCount works in both reference- and marker-based modes, where the latter utilizes existing lists of markers, while the former uses a pre-annotated dataset to find markers to be used for cell type identification. In both modes, MarkerCount first utilizes the “marker count” to identify cell populations and, after rejecting uncertain cells, reassigns cell type and/or makes corrections on a cluster basis. The performance of MarkerCount was evaluated and compared with existing identifiers, both marker- and reference-based, that can be customized using publicly available datasets and marker databases. The results show that MarkerCount performs better in the identification of known populations as well as of unknown ones, when compared to other reference- and marker-based cell type identifiers for most of the datasets analyzed. Research Network of Computational and Structural Biotechnology 2022-06-14 /pmc/articles/PMC9233224/ /pubmed/35782735 http://dx.doi.org/10.1016/j.csbj.2022.06.010 Text en © 2022 The Author(s) https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
spellingShingle Research Article
Kim, HanByeol
Lee, Joongho
Kang, Keunsoo
Yoon, Seokhyun
MarkerCount: A stable, count-based cell type identifier for single-cell RNA-seq experiments
title MarkerCount: A stable, count-based cell type identifier for single-cell RNA-seq experiments
title_full MarkerCount: A stable, count-based cell type identifier for single-cell RNA-seq experiments
title_fullStr MarkerCount: A stable, count-based cell type identifier for single-cell RNA-seq experiments
title_full_unstemmed MarkerCount: A stable, count-based cell type identifier for single-cell RNA-seq experiments
title_short MarkerCount: A stable, count-based cell type identifier for single-cell RNA-seq experiments
title_sort markercount: a stable, count-based cell type identifier for single-cell rna-seq experiments
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9233224/
https://www.ncbi.nlm.nih.gov/pubmed/35782735
http://dx.doi.org/10.1016/j.csbj.2022.06.010
work_keys_str_mv AT kimhanbyeol markercountastablecountbasedcelltypeidentifierforsinglecellrnaseqexperiments
AT leejoongho markercountastablecountbasedcelltypeidentifierforsinglecellrnaseqexperiments
AT kangkeunsoo markercountastablecountbasedcelltypeidentifierforsinglecellrnaseqexperiments
AT yoonseokhyun markercountastablecountbasedcelltypeidentifierforsinglecellrnaseqexperiments