
JIND: joint integration and discrimination for automated single-cell annotation

MOTIVATION: An important step in the transcriptomic analysis of individual cells involves manually determining the cellular identities. To ease this labor-intensive annotation of cell-types, there has been a growing interest in automated cell annotation, which can be achieved by training classificat...


Bibliographic Details
Main authors:	Goyal, Mohit; Serrano, Guillermo; Argemi, Josepmaria; Shomorony, Ilan; Hernaez, Mikel; Ochoa, Idoia
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2022
Subjects:	Original Papers
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9278043/ https://www.ncbi.nlm.nih.gov/pubmed/35253844 http://dx.doi.org/10.1093/bioinformatics/btac140

author	Goyal, Mohit; Serrano, Guillermo; Argemi, Josepmaria; Shomorony, Ilan; Hernaez, Mikel; Ochoa, Idoia
collection	PubMed
description	MOTIVATION: An important step in the transcriptomic analysis of individual cells involves manually determining the cellular identities. To ease this labor-intensive annotation of cell types, there has been a growing interest in automated cell annotation, which can be achieved by training classification algorithms on previously annotated datasets. Existing pipelines employ dataset integration methods to remove potential batch effects between source (annotated) and target (unannotated) datasets. However, the integration and classification steps are usually independent of each other and performed by different tools. We propose JIND (joint integration and discrimination for automated single-cell annotation), a neural-network-based framework for automated cell-type identification that performs integration in a space suitably chosen to facilitate cell classification. To account for batch effects, JIND performs a novel asymmetric alignment in which unseen cells are mapped onto the previously learned latent space, avoiding the need to retrain the classification model for new datasets. JIND also learns cell-type-specific confidence thresholds to identify cells that cannot be reliably classified. RESULTS: We show on several batched datasets that JIND's joint approach to integration and classification outperforms existing pipelines in accuracy, and that a smaller fraction of cells is rejected as unlabeled as a result of the cell-type-specific confidence thresholds. Moreover, we investigate cells misclassified by JIND and provide evidence suggesting that the misclassifications could be due to outliers in the annotated datasets or errors in the original approach used for annotation of the target batch. AVAILABILITY AND IMPLEMENTATION: Implementation for JIND is available at https://github.com/mohit1997/JIND and the data underlying this article can be accessed at https://doi.org/10.5281/zenodo.6246322. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format	Online Article Text
id	pubmed-9278043
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-9278043 2022-07-18 JIND: joint integration and discrimination for automated single-cell annotation Goyal, Mohit; Serrano, Guillermo; Argemi, Josepmaria; Shomorony, Ilan; Hernaez, Mikel; Ochoa, Idoia Bioinformatics Original Papers Oxford University Press 2022-03-07 /pmc/articles/PMC9278043/ /pubmed/35253844 http://dx.doi.org/10.1093/bioinformatics/btac140 Text en © The Author(s) 2022. Published by Oxford University Press. https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title	JIND: joint integration and discrimination for automated single-cell annotation
topic	Original Papers
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9278043/ https://www.ncbi.nlm.nih.gov/pubmed/35253844 http://dx.doi.org/10.1093/bioinformatics/btac140
