Probabilistic harmonization and annotation of single‐cell transcriptomics data with deep generative models
As the number of single‐cell transcriptomics datasets grows, the natural next step is to integrate the accumulating data to achieve a common ontology of cell types and states. However, it is not straightforward to compare gene expression levels across datasets and to automatically assign cell type l...
Main authors: | Xu, Chenling; Lopez, Romain; Mehlman, Edouard; Regier, Jeffrey; Jordan, Michael I; Yosef, Nir |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | John Wiley and Sons Inc. 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7829634/ https://www.ncbi.nlm.nih.gov/pubmed/33491336 http://dx.doi.org/10.15252/msb.20209620 |
_version_ | 1783641215562940416 |
---|---|
author | Xu, Chenling Lopez, Romain Mehlman, Edouard Regier, Jeffrey Jordan, Michael I Yosef, Nir |
author_facet | Xu, Chenling Lopez, Romain Mehlman, Edouard Regier, Jeffrey Jordan, Michael I Yosef, Nir |
author_sort | Xu, Chenling |
collection | PubMed |
description | As the number of single‐cell transcriptomics datasets grows, the natural next step is to integrate the accumulating data to achieve a common ontology of cell types and states. However, it is not straightforward to compare gene expression levels across datasets and to automatically assign cell type labels in a new dataset based on existing annotations. In this manuscript, we demonstrate that our previously developed method, scVI, provides an effective and fully probabilistic approach for joint representation and analysis of scRNA‐seq data, while accounting for uncertainty caused by biological and measurement noise. We also introduce single‐cell ANnotation using Variational Inference (scANVI), a semi‐supervised variant of scVI designed to leverage existing cell state annotations. We demonstrate that scVI and scANVI compare favorably to state‐of‐the‐art methods for data integration and cell state annotation in terms of accuracy, scalability, and adaptability to challenging settings. In contrast to existing methods, scVI and scANVI integrate multiple datasets with a single generative model that can be directly used for downstream tasks, such as differential expression. Both methods are easily accessible through scvi‐tools. |
format | Online Article Text |
id | pubmed-7829634 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | John Wiley and Sons Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-78296342021-01-29 Probabilistic harmonization and annotation of single‐cell transcriptomics data with deep generative models Xu, Chenling Lopez, Romain Mehlman, Edouard Regier, Jeffrey Jordan, Michael I Yosef, Nir Mol Syst Biol Articles As the number of single‐cell transcriptomics datasets grows, the natural next step is to integrate the accumulating data to achieve a common ontology of cell types and states. However, it is not straightforward to compare gene expression levels across datasets and to automatically assign cell type labels in a new dataset based on existing annotations. In this manuscript, we demonstrate that our previously developed method, scVI, provides an effective and fully probabilistic approach for joint representation and analysis of scRNA‐seq data, while accounting for uncertainty caused by biological and measurement noise. We also introduce single‐cell ANnotation using Variational Inference (scANVI), a semi‐supervised variant of scVI designed to leverage existing cell state annotations. We demonstrate that scVI and scANVI compare favorably to state‐of‐the‐art methods for data integration and cell state annotation in terms of accuracy, scalability, and adaptability to challenging settings. In contrast to existing methods, scVI and scANVI integrate multiple datasets with a single generative model that can be directly used for downstream tasks, such as differential expression. Both methods are easily accessible through scvi‐tools. John Wiley and Sons Inc. 2021-01-25 /pmc/articles/PMC7829634/ /pubmed/33491336 http://dx.doi.org/10.15252/msb.20209620 Text en © 2021 The Authors. Published under the terms of the CC BY 4.0 license. This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Articles Xu, Chenling Lopez, Romain Mehlman, Edouard Regier, Jeffrey Jordan, Michael I Yosef, Nir Probabilistic harmonization and annotation of single‐cell transcriptomics data with deep generative models |
title | Probabilistic harmonization and annotation of single‐cell transcriptomics data with deep generative models |
title_full | Probabilistic harmonization and annotation of single‐cell transcriptomics data with deep generative models |
title_fullStr | Probabilistic harmonization and annotation of single‐cell transcriptomics data with deep generative models |
title_full_unstemmed | Probabilistic harmonization and annotation of single‐cell transcriptomics data with deep generative models |
title_short | Probabilistic harmonization and annotation of single‐cell transcriptomics data with deep generative models |
title_sort | probabilistic harmonization and annotation of single‐cell transcriptomics data with deep generative models |
topic | Articles |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7829634/ https://www.ncbi.nlm.nih.gov/pubmed/33491336 http://dx.doi.org/10.15252/msb.20209620 |
work_keys_str_mv | AT xuchenling probabilisticharmonizationandannotationofsinglecelltranscriptomicsdatawithdeepgenerativemodels AT lopezromain probabilisticharmonizationandannotationofsinglecelltranscriptomicsdatawithdeepgenerativemodels AT mehlmanedouard probabilisticharmonizationandannotationofsinglecelltranscriptomicsdatawithdeepgenerativemodels AT regierjeffrey probabilisticharmonizationandannotationofsinglecelltranscriptomicsdatawithdeepgenerativemodels AT jordanmichaeli probabilisticharmonizationandannotationofsinglecelltranscriptomicsdatawithdeepgenerativemodels AT yosefnir probabilisticharmonizationandannotationofsinglecelltranscriptomicsdatawithdeepgenerativemodels |