Cell-type annotation with accurate unseen cell-type identification using multiple references

The recent advances in single-cell RNA sequencing (scRNA-seq) techniques have stimulated efforts to identify and characterize the cellular composition of complex tissues. With the advent of various sequencing techniques, automated cell-type annotation using a well-annotated scRNA-seq reference becom...

Bibliographic Details
Main authors: Xiong, Yi-Xuan; Wang, Meng-Guo; Chen, Luonan; Zhang, Xiao-Fei
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Subjects: Research Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10335708/
https://www.ncbi.nlm.nih.gov/pubmed/37379341
http://dx.doi.org/10.1371/journal.pcbi.1011261
author Xiong, Yi-Xuan
Wang, Meng-Guo
Chen, Luonan
Zhang, Xiao-Fei
collection PubMed
description Recent advances in single-cell RNA sequencing (scRNA-seq) techniques have stimulated efforts to identify and characterize the cellular composition of complex tissues. With the advent of various sequencing techniques, automated cell-type annotation using a well-annotated scRNA-seq reference has become popular. However, this approach relies on the diversity of cell types in the reference, which may not capture all the cell types present in the query data of interest. Query data generally contain unseen cell types because most data atlases are obtained for different purposes and with different techniques. Identifying previously unseen cell types is essential for improving annotation accuracy and making novel biological discoveries. To address this challenge, we propose mtANN (multiple-reference-based scRNA-seq data annotation), a new method to automatically annotate query data while accurately identifying unseen cell types with the aid of multiple references. Key innovations of mtANN include the integration of deep learning and ensemble learning to improve prediction accuracy, and the introduction of a new metric that considers three complementary aspects to distinguish between unseen cell types and shared cell types. Additionally, we provide a data-driven method to adaptively select a threshold for identifying previously unseen cell types. We demonstrate the advantages of mtANN over state-of-the-art methods for unseen cell-type identification and cell-type annotation on two benchmark dataset collections, as well as its predictive power on a collection of COVID-19 datasets. The source code and tutorial are available at https://github.com/Zhangxf-ccnu/mtANN.
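The general idea in the description (combining predictions from classifiers trained on several references, then flagging uncertain cells as unseen) can be illustrated with a minimal sketch. This is not the mtANN implementation: the function names, the single entropy-based score, and the fixed threshold are all assumptions made for illustration. The description states that mtANN uses a metric with three complementary aspects and an adaptively selected threshold, neither of which is reproduced here.

```python
import math

def average_predictions(per_reference_probs):
    """Average one cell's label probabilities across reference-specific classifiers.

    Each element is a dict mapping label -> probability. A label missing from
    one reference counts as probability 0 for that classifier.
    """
    labels = sorted({lab for probs in per_reference_probs for lab in probs})
    n = len(per_reference_probs)
    return {lab: sum(p.get(lab, 0.0) for p in per_reference_probs) / n
            for lab in labels}

def normalized_entropy(dist):
    """Shannon entropy of a distribution, scaled to [0, 1] by log(number of labels)."""
    k = len(dist)
    if k < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in dist.values() if p > 0)
    return h / math.log(k)

def annotate(cells, threshold):
    """Return (label, score) per cell; cells scoring above the threshold are 'unassigned'.

    `threshold` is a hypothetical fixed cutoff, standing in for a data-driven one.
    """
    results = []
    for per_reference_probs in cells:
        avg = average_predictions(per_reference_probs)
        score = normalized_entropy(avg)
        label = max(avg, key=avg.get) if score <= threshold else "unassigned"
        results.append((label, score))
    return results

# Toy input: two cells, each scored by two hypothetical reference classifiers.
example_cells = [
    [{"T cell": 0.9, "B cell": 0.1}, {"T cell": 0.8, "NK cell": 0.2}],
    [{"T cell": 0.5, "B cell": 0.5}, {"T cell": 0.4, "NK cell": 0.6}],
]
example_labels = [label for label, _ in annotate(example_cells, threshold=0.7)]
```

In this toy input the first cell receives a confident consensus label, while the second has a nearly flat averaged distribution and is left unassigned. A real pipeline would replace the fixed cutoff with a threshold estimated from the score distribution.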
format Online
Article
Text
id pubmed-10335708
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-10335708 2023-07-12 Cell-type annotation with accurate unseen cell-type identification using multiple references Xiong, Yi-Xuan Wang, Meng-Guo Chen, Luonan Zhang, Xiao-Fei PLoS Comput Biol Research Article
Public Library of Science 2023-06-28 /pmc/articles/PMC10335708/ /pubmed/37379341 http://dx.doi.org/10.1371/journal.pcbi.1011261 Text en © 2023 Xiong et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Cell-type annotation with accurate unseen cell-type identification using multiple references
topic Research Article