A hybrid deep clustering approach for robust cell type profiling using single-cell RNA-seq data

Single-cell RNA sequencing (scRNA-seq) is a recent technology that enables fine-grained discovery of cellular subtypes and specific cell states. Analysis of scRNA-seq data routinely involves machine learning methods, such as feature learning, clustering, and classification, to assist in uncovering n...

Bibliographic Details
Main authors: Srinivasan, Suhas; Leshchyk, Anastasia; Johnson, Nathan T.; Korkin, Dmitry
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2020
Subjects: Bioinformatic
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7491323/
https://www.ncbi.nlm.nih.gov/pubmed/32532794
http://dx.doi.org/10.1261/rna.074427.119
author Srinivasan, Suhas
Leshchyk, Anastasia
Johnson, Nathan T.
Korkin, Dmitry
collection PubMed
description Single-cell RNA sequencing (scRNA-seq) is a recent technology that enables fine-grained discovery of cellular subtypes and specific cell states. Analysis of scRNA-seq data routinely involves machine learning methods, such as feature learning, clustering, and classification, to assist in uncovering novel information from scRNA-seq data. However, current methods are not well suited to deal with the substantial amount of noise that is created by the experiments or the variation that occurs due to differences in the cells of the same type. To address this, we developed a new hybrid approach, deep unsupervised single-cell clustering (DUSC), which integrates feature generation based on a deep learning architecture by using a new technique to estimate the number of latent features, with a model-based clustering algorithm, to find a compact and informative representation of the single-cell transcriptomic data generating robust clusters. We also include a technique to estimate an efficient number of latent features in the deep learning model. Our method outperforms both classical and state-of-the-art feature learning and clustering methods, approaching the accuracy of supervised learning. We applied DUSC to a single-cell transcriptomics data set obtained from a triple-negative breast cancer tumor to identify potential cancer subclones accentuated by copy-number variation and investigate the role of clonal heterogeneity. Our method is freely available to the community and will hopefully facilitate our understanding of the cellular atlas of living organisms as well as provide the means to improve patient diagnostics and treatment.
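The pairing described in the abstract, denoised latent features followed by model-based clustering, can be illustrated with a minimal NumPy sketch. This is not the authors' DUSC implementation. The function names `train_denoising_autoencoder` and `fit_diag_gmm`, the single tanh hidden layer, the diagonal Gaussian mixture fitted by EM, and all hyperparameters are assumptions chosen for illustration. The paper's technique for estimating the number of latent features is not reproduced here; `n_latent` is simply passed in.

```python
import numpy as np


def train_denoising_autoencoder(X, n_latent, noise_std=0.5, lr=0.01,
                                epochs=300, seed=0):
    """Train a one-hidden-layer denoising autoencoder; return the encoder.

    Hypothetical helper: corrupts the input with Gaussian noise and learns
    to reconstruct the clean input by full-batch gradient descent.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, n_latent))
    b1 = np.zeros(n_latent)
    W2 = rng.normal(0.0, 0.1, (n_latent, d))
    b2 = np.zeros(d)
    for _ in range(epochs):
        X_noisy = X + rng.normal(0.0, noise_std, X.shape)  # corrupt input
        H = np.tanh(X_noisy @ W1 + b1)                     # latent features
        X_hat = H @ W2 + b2                                # reconstruction
        err = (X_hat - X) / n       # gradient of 0.5 * mean squared error
        dH = (err @ W2.T) * (1.0 - H ** 2)                 # backprop via tanh
        W2 -= lr * (H.T @ err)
        b2 -= lr * err.sum(axis=0)
        W1 -= lr * (X_noisy.T @ dH)
        b1 -= lr * dH.sum(axis=0)
    return lambda data: np.tanh(data @ W1 + b1)


def fit_diag_gmm(Z, k, n_iter=50, seed=0):
    """Cluster rows of Z with a diagonal-covariance Gaussian mixture (EM)."""
    rng = np.random.default_rng(seed)
    n, _ = Z.shape
    means = Z[rng.choice(n, k, replace=False)].copy()
    var = np.tile(Z.var(axis=0) + 1e-6, (k, 1))
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibilities computed in log space for stability
        diff2 = (Z[:, None, :] - means) ** 2
        log_p = -0.5 * (diff2 / var + np.log(2 * np.pi * var)).sum(axis=2)
        log_p += np.log(weights)
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update mixture weights, means, and per-feature variances
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / n
        means = (resp.T @ Z) / nk[:, None]
        var = np.maximum((resp.T @ Z ** 2) / nk[:, None] - means ** 2, 1e-6)
    return resp.argmax(axis=1)


# Usage on synthetic "cells x genes" data (illustrative values only)
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 20))
X[:20, :5] += 3.0                          # shift one group of cells
X = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize each gene
encode = train_denoising_autoencoder(X, n_latent=3)
labels = fit_diag_gmm(encode(X), k=2)
```

Clustering in the learned latent space rather than on raw expression values is the point of the hybrid design: the autoencoder is trained to ignore injected noise, so the mixture model sees a compact representation. Cluster quality on real scRNA-seq data would depend on preprocessing and hyperparameters that this sketch does not address.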
format Online
Article
Text
id pubmed-7491323
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-7491323 2021-10-01 A hybrid deep clustering approach for robust cell type profiling using single-cell RNA-seq data Srinivasan, Suhas Leshchyk, Anastasia Johnson, Nathan T. Korkin, Dmitry RNA Bioinformatic
Cold Spring Harbor Laboratory Press 2020-10 /pmc/articles/PMC7491323/ /pubmed/32532794 http://dx.doi.org/10.1261/rna.074427.119 Text en © 2020 Srinivasan et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society http://creativecommons.org/licenses/by-nc/4.0/ This article is distributed exclusively by the RNA Society for the first 12 months after the full-issue publication date (see http://rnajournal.cshlp.org/site/misc/terms.xhtml). After 12 months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
title A hybrid deep clustering approach for robust cell type profiling using single-cell RNA-seq data
topic Bioinformatic
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7491323/
https://www.ncbi.nlm.nih.gov/pubmed/32532794
http://dx.doi.org/10.1261/rna.074427.119