Cell-type annotation with accurate unseen cell-type identification using multiple references
The recent advances in single-cell RNA sequencing (scRNA-seq) techniques have stimulated efforts to identify and characterize the cellular composition of complex tissues. With the advent of various sequencing techniques, automated cell-type annotation using a well-annotated scRNA-seq reference becom...
Main Authors: | Xiong, Yi-Xuan; Wang, Meng-Guo; Chen, Luonan; Zhang, Xiao-Fei |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2023 |
Subjects: | Research Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10335708/ https://www.ncbi.nlm.nih.gov/pubmed/37379341 http://dx.doi.org/10.1371/journal.pcbi.1011261 |
_version_ | 1785071056465166336 |
---|---|
author | Xiong, Yi-Xuan Wang, Meng-Guo Chen, Luonan Zhang, Xiao-Fei |
author_sort | Xiong, Yi-Xuan |
collection | PubMed |
description | Recent advances in single-cell RNA sequencing (scRNA-seq) techniques have stimulated efforts to identify and characterize the cellular composition of complex tissues. With the advent of various sequencing techniques, automated cell-type annotation using a well-annotated scRNA-seq reference has become popular. However, this approach relies on the diversity of cell types in the reference, which may not capture all the cell types present in the query data of interest. Query data generally contain unseen cell types because most data atlases are obtained for different purposes and with different techniques. Identifying previously unseen cell types is essential for improving annotation accuracy and uncovering novel biological discoveries. To address this challenge, we propose mtANN (multiple-reference-based scRNA-seq data annotation), a new method to automatically annotate query data while accurately identifying unseen cell types with the aid of multiple references. Key innovations of mtANN include the integration of deep learning and ensemble learning to improve prediction accuracy, and the introduction of a new metric that considers three complementary aspects to distinguish between unseen cell types and shared cell types. Additionally, we provide a data-driven method to adaptively select a threshold for identifying previously unseen cell types. We demonstrate the advantages of mtANN over state-of-the-art methods for unseen cell-type identification and cell-type annotation on two benchmark dataset collections, as well as its predictive power on a collection of COVID-19 datasets. The source code and tutorial are available at https://github.com/Zhangxf-ccnu/mtANN. |
format | Online Article Text |
id | pubmed-10335708 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-10335708 2023-07-12 Cell-type annotation with accurate unseen cell-type identification using multiple references Xiong, Yi-Xuan Wang, Meng-Guo Chen, Luonan Zhang, Xiao-Fei PLoS Comput Biol Research Article Public Library of Science 2023-06-28 /pmc/articles/PMC10335708/ /pubmed/37379341 http://dx.doi.org/10.1371/journal.pcbi.1011261 Text en © 2023 Xiong et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Xiong, Yi-Xuan Wang, Meng-Guo Chen, Luonan Zhang, Xiao-Fei Cell-type annotation with accurate unseen cell-type identification using multiple references |
title | Cell-type annotation with accurate unseen cell-type identification using multiple references |
title_sort | cell-type annotation with accurate unseen cell-type identification using multiple references |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10335708/ https://www.ncbi.nlm.nih.gov/pubmed/37379341 http://dx.doi.org/10.1371/journal.pcbi.1011261 |
work_keys_str_mv | AT xiongyixuan celltypeannotationwithaccurateunseencelltypeidentificationusingmultiplereferences AT wangmengguo celltypeannotationwithaccurateunseencelltypeidentificationusingmultiplereferences AT chenluonan celltypeannotationwithaccurateunseencelltypeidentificationusingmultiplereferences AT zhangxiaofei celltypeannotationwithaccurateunseencelltypeidentificationusingmultiplereferences |
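
The description above outlines three ideas: combining predictions obtained from multiple references, scoring how likely a query cell is to belong to an unseen cell type, and choosing the cutoff from the data rather than fixing it by hand. The following is a minimal Python sketch of that general pattern, not the mtANN implementation. Everything in it is an assumption made for illustration: the function names are hypothetical, normalized entropy stands in for the paper's three-aspect metric (which the record does not specify), and the largest-gap rule stands in for its adaptive threshold selection. For the actual method, see the repository at https://github.com/Zhangxf-ccnu/mtANN.

```python
import math


def average_probs(per_reference_probs):
    """Average class-probability vectors produced by several reference-trained classifiers."""
    n_refs = len(per_reference_probs)
    n_classes = len(per_reference_probs[0])
    return [sum(p[k] for p in per_reference_probs) / n_refs for k in range(n_classes)]


def normalized_entropy(probs):
    """Entropy scaled to [0, 1]; higher means a less confident prediction.

    Illustrative stand-in for an "unseen" score; assumes at least two classes.
    """
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(len(probs))


def largest_gap_threshold(scores):
    """Hypothetical data-driven cutoff: midpoint of the widest gap between sorted scores.

    Assumes at least two scores.
    """
    s = sorted(scores)
    gaps = [(s[i + 1] - s[i], i) for i in range(len(s) - 1)]
    _, i = max(gaps)
    return (s[i] + s[i + 1]) / 2


def annotate(cells, labels):
    """Label each cell by ensemble argmax, or "unassigned" if its score exceeds the cutoff."""
    averaged = [average_probs(c) for c in cells]
    scores = [normalized_entropy(p) for p in averaged]
    threshold = largest_gap_threshold(scores)
    result = []
    for probs, score in zip(averaged, scores):
        if score > threshold:
            result.append("unassigned")
        else:
            best = max(range(len(probs)), key=probs.__getitem__)
            result.append(labels[best])
    return result


# Toy input: three query cells, each scored by two reference-trained classifiers.
labels = ["T cell", "B cell"]
cells = [
    [[0.95, 0.05], [0.90, 0.10]],  # both references agree on the first label
    [[0.10, 0.90], [0.05, 0.95]],  # both references agree on the second label
    [[0.50, 0.50], [0.45, 0.55]],  # neither reference is confident
]
result = annotate(cells, labels)
print(result)
```

In this toy input the third cell receives a much higher uncertainty score than the other two, so the gap-based cutoff separates it from the confidently annotated cells. Real data rarely separate this cleanly, which is presumably why the authors combine several complementary signals.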