
scDeepInsight: a supervised cell-type identification method for scRNA-seq data with deep learning

Annotation of cell-types is a critical step in the analysis of single-cell RNA sequencing (scRNA-seq) data that allows the study of heterogeneity across multiple cell populations. Currently, this is most commonly done using unsupervised clustering algorithms, which project single-cell expression dat...


Bibliographic Details
Main Authors: Jia, Shangru; Lysenko, Artem; Boroevich, Keith A; Sharma, Alok; Tsunoda, Tatsuhiko
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10516353/
https://www.ncbi.nlm.nih.gov/pubmed/37523217
http://dx.doi.org/10.1093/bib/bbad266
author Jia, Shangru
Lysenko, Artem
Boroevich, Keith A
Sharma, Alok
Tsunoda, Tatsuhiko
collection PubMed
description Annotation of cell-types is a critical step in the analysis of single-cell RNA sequencing (scRNA-seq) data that allows the study of heterogeneity across multiple cell populations. Currently, this is most commonly done using unsupervised clustering algorithms, which project single-cell expression data into a lower dimensional space and then cluster cells based on their distances from each other. However, as these methods do not use reference datasets, they can only achieve a rough classification of cell-types, and it is difficult to improve the recognition accuracy further. To effectively solve this issue, we propose a novel supervised annotation method, scDeepInsight. The scDeepInsight method is capable of performing manifold assignments. It is competent in executing data integration through batch normalization, performing supervised training on the reference dataset, doing outlier detection and annotating cell-types on query datasets. Moreover, it can help identify active genes or marker genes related to cell-types. The training of the scDeepInsight model is performed in a unique way. Tabular scRNA-seq data are first converted to corresponding images through the DeepInsight methodology. DeepInsight can create a trainable image transformer to convert non-image RNA data to images by comprehensively comparing interrelationships among multiple genes. Subsequently, the converted images are fed into convolutional neural networks such as EfficientNet-b3. This enables automatic feature extraction to identify the cell-types of scRNA-seq samples. We benchmarked scDeepInsight with six other mainstream cell annotation methods. The average accuracy rate of scDeepInsight reached 87.5%, which is more than 7% higher compared with the state-of-the-art methods.
format Online
Article
Text
id pubmed-10516353
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10516353 2023-09-23 scDeepInsight: a supervised cell-type identification method for scRNA-seq data with deep learning Jia, Shangru Lysenko, Artem Boroevich, Keith A Sharma, Alok Tsunoda, Tatsuhiko Brief Bioinform Problem Solving Protocol Oxford University Press 2023-07-31 /pmc/articles/PMC10516353/ /pubmed/37523217 http://dx.doi.org/10.1093/bib/bbad266 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title scDeepInsight: a supervised cell-type identification method for scRNA-seq data with deep learning
topic Problem Solving Protocol
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10516353/
https://www.ncbi.nlm.nih.gov/pubmed/37523217
http://dx.doi.org/10.1093/bib/bbad266
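
The description above says that tabular scRNA-seq data are first converted to images, with genes arranged according to their interrelationships, before a convolutional network is trained on them. The following is only a minimal sketch of that general idea, not the authors' implementation. It assumes a tiny cells-by-genes matrix and uses each gene's mean and standard deviation as a crude, hypothetical stand-in for the real layout step. All function names (`gene_coordinates`, `to_pixels`, `cell_to_image`) are illustrative and do not come from the scDeepInsight or DeepInsight software.

```python
from statistics import mean, pstdev


def gene_coordinates(matrix):
    """Return one (x, y) point per gene (column) of a cells-by-genes matrix.

    Assumption: per-gene mean and standard deviation stand in for the
    layout step, so genes with similar statistics land close together.
    """
    columns = list(zip(*matrix))
    return [(mean(col), pstdev(col)) for col in columns]


def to_pixels(coords, size):
    """Rescale 2-D points onto integer positions of a size x size grid."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]

    def scale(value, lo, hi):
        if hi == lo:
            return 0
        # Clamp so the maximum value falls on the last pixel.
        return min(size - 1, int((value - lo) / (hi - lo) * size))

    return [
        (scale(x, min(xs), max(xs)), scale(y, min(ys), max(ys)))
        for x, y in coords
    ]


def cell_to_image(cell, pixels, size):
    """Paint one cell's expression vector onto the grid.

    Genes that share a pixel are averaged; empty pixels stay at 0.0.
    """
    total = [[0.0] * size for _ in range(size)]
    count = [[0] * size for _ in range(size)]
    for value, (px, py) in zip(cell, pixels):
        total[py][px] += value
        count[py][px] += 1
    return [
        [total[r][c] / count[r][c] if count[r][c] else 0.0 for c in range(size)]
        for r in range(size)
    ]


# Toy data (made up for illustration): 3 cells x 4 genes.
matrix = [
    [0.0, 5.0, 1.0, 5.0],
    [0.0, 6.0, 3.0, 6.0],
    [0.0, 7.0, 5.0, 7.0],
]
pixels = to_pixels(gene_coordinates(matrix), size=4)
images = [cell_to_image(cell, pixels, size=4) for cell in matrix]
```

The gene-to-pixel mapping is computed once from the reference matrix and then reused for every cell, so each image places the same gene at the same position. In this sketch, `images` would be the input to an image classifier; the network itself (the description names EfficientNet-b3 as one option) is outside its scope.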