
Accurate Single-Cell Clustering through Ensemble Similarity Learning



Bibliographic Details
Main authors: Jeong, Hyundoo; Shin, Sungtae; Yeom, Hong-Gi
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8623803/
https://www.ncbi.nlm.nih.gov/pubmed/34828276
http://dx.doi.org/10.3390/genes12111670
author Jeong, Hyundoo
Shin, Sungtae
Yeom, Hong-Gi
collection PubMed
description Single-cell sequencing provides a novel means of interpreting the transcriptomic profiles of individual cells. In-depth analysis of single-cell sequencing data requires effective computational methods that accurately predict single-cell clusters, because single-cell sequencing techniques provide only the transcriptomic profile of each cell. Although an accurate estimate of cell-to-cell similarity is an essential first step toward reliable single-cell clustering, obtaining one is challenging: the estimate depends heavily on which genes are selected for the similarity evaluation, and the optimal gene set is typically unknown. Moreover, owing to technical limitations, single-cell sequencing data include a large number of artificial zeros, and this technical noise makes it difficult to develop effective single-cell clustering algorithms. Here, we describe a novel single-cell clustering algorithm that accurately predicts single-cell clusters in large-scale single-cell sequencing data by effectively reducing the zero-inflated noise and accurately estimating cell-to-cell similarities. First, we construct an ensemble similarity network based on different similarity estimates and reduce the artificial noise using a random walk with restart framework. Finally, starting from a large number of small but highly consistent clusters, we iteratively merge the pair of clusters with the maximum similarity until the predicted number of clusters is reached. Extensive performance evaluation shows that the proposed algorithm yields accurate single-cell clustering results and can help decipher the key messages underlying complex biological mechanisms.
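The three steps named in the description (ensemble similarity, random walk with restart, iterative cluster merging) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the record gives no algorithmic details, so the cosine similarity over random gene subsets, the restart probability of 0.5, the average-linkage merge score, the singleton starting clusters, and every function name below are assumptions chosen only for illustration.

```python
import numpy as np

def ensemble_similarity(X, n_subsets=10, frac=0.5, seed=0):
    """Average cosine similarity over random gene subsets (assumed ensemble)."""
    rng = np.random.default_rng(seed)
    n_cells, n_genes = X.shape
    k = max(1, int(frac * n_genes))
    S = np.zeros((n_cells, n_cells))
    for _ in range(n_subsets):
        genes = rng.choice(n_genes, size=k, replace=False)
        sub = X[:, genes]
        norms = np.linalg.norm(sub, axis=1, keepdims=True)
        unit = sub / np.maximum(norms, 1e-12)  # guard against all-zero rows
        S += unit @ unit.T
    return S / n_subsets

def random_walk_with_restart(S, restart=0.5):
    """Closed-form RWR: R = c * (I - (1 - c) * P)^-1, P column-normalised."""
    W = np.clip(S, 0.0, None)
    np.fill_diagonal(W, 0.0)  # no self-loops in the similarity network
    P = W / np.maximum(W.sum(axis=0, keepdims=True), 1e-12)
    n = W.shape[0]
    R = restart * np.linalg.inv(np.eye(n) - (1.0 - restart) * P)
    return (R + R.T) / 2.0  # symmetrise the diffused similarity

def merge_clusters(R, labels, n_clusters):
    """Repeatedly merge the pair of clusters with the highest mean similarity."""
    labels = np.asarray(labels).copy()
    while len(np.unique(labels)) > n_clusters:
        ids = np.unique(labels)
        best, pair = -np.inf, None
        for i, a in enumerate(ids):
            for b in ids[i + 1:]:
                score = R[np.ix_(labels == a, labels == b)].mean()
                if score > best:
                    best, pair = score, (a, b)
        labels[labels == pair[1]] = pair[0]
    return labels

# Toy data (synthetic, not from the paper): 10 cells x 20 genes, two groups.
rng = np.random.default_rng(1)
X = rng.poisson(1.0, size=(10, 20)).astype(float)
X[:5, :10] += 8.0   # group A marker genes
X[5:, 10:] += 8.0   # group B marker genes

S = ensemble_similarity(X)
R = random_walk_with_restart(S)
labels = merge_clusters(R, np.arange(10), n_clusters=2)  # start from singletons
print(labels)
```

In this sketch the number of clusters is passed in directly; the description says the target is a predicted number of clusters, but how that prediction is made is not stated in the record.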
format Online
Article
Text
id pubmed-8623803
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8623803 2021-11-27 Accurate Single-Cell Clustering through Ensemble Similarity Learning Jeong, Hyundoo Shin, Sungtae Yeom, Hong-Gi Genes (Basel) Article
MDPI 2021-10-22 /pmc/articles/PMC8623803/ /pubmed/34828276 http://dx.doi.org/10.3390/genes12111670 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Accurate Single-Cell Clustering through Ensemble Similarity Learning
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8623803/
https://www.ncbi.nlm.nih.gov/pubmed/34828276
http://dx.doi.org/10.3390/genes12111670