
Population-level integration of single-cell datasets enables multi-scale analysis across samples

The increasing generation of population-level single-cell atlases has the potential to link sample metadata with cellular data. Constructing such references requires integration of heterogeneous cohorts with varying metadata. Here we present single-cell population level integration (scPoli), an open...

Bibliographic Details
Main authors: De Donno, Carlo; Hediyeh-Zadeh, Soroor; Moinfar, Amir Ali; Wagenstetter, Marco; Zappia, Luke; Lotfollahi, Mohammad; Theis, Fabian J.
Format: Online Article Text
Language: English
Published: Nature Publishing Group US 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10630133/
https://www.ncbi.nlm.nih.gov/pubmed/37813989
http://dx.doi.org/10.1038/s41592-023-02035-2
author De Donno, Carlo
Hediyeh-Zadeh, Soroor
Moinfar, Amir Ali
Wagenstetter, Marco
Zappia, Luke
Lotfollahi, Mohammad
Theis, Fabian J.
collection PubMed
description The increasing generation of population-level single-cell atlases has the potential to link sample metadata with cellular data. Constructing such references requires integration of heterogeneous cohorts with varying metadata. Here we present single-cell population level integration (scPoli), an open-world learner that incorporates generative models to learn sample and cell representations for data integration, label transfer and reference mapping. We applied scPoli on population-level atlases of lung and peripheral blood mononuclear cells, the latter consisting of 7.8 million cells across 2,375 samples. We demonstrate that scPoli can explain sample-level biological and technical variations using sample embeddings revealing genes associated with batch effects and biological effects. scPoli is further applicable to single-cell sequencing assay for transposase-accessible chromatin and cross-species datasets, offering insights into chromatin accessibility and comparative genomics. We envision scPoli becoming an important tool for population-level single-cell data integration facilitating atlas use but also interpretation by means of multi-scale analyses.
format Online
Article
Text
id pubmed-10630133
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group US
record_format MEDLINE/PubMed
spelling pubmed-10630133 2023-11-09 Population-level integration of single-cell datasets enables multi-scale analysis across samples De Donno, Carlo; Hediyeh-Zadeh, Soroor; Moinfar, Amir Ali; Wagenstetter, Marco; Zappia, Luke; Lotfollahi, Mohammad; Theis, Fabian J. Nat Methods Article The increasing generation of population-level single-cell atlases has the potential to link sample metadata with cellular data. Constructing such references requires integration of heterogeneous cohorts with varying metadata. Here we present single-cell population level integration (scPoli), an open-world learner that incorporates generative models to learn sample and cell representations for data integration, label transfer and reference mapping. We applied scPoli on population-level atlases of lung and peripheral blood mononuclear cells, the latter consisting of 7.8 million cells across 2,375 samples. We demonstrate that scPoli can explain sample-level biological and technical variations using sample embeddings revealing genes associated with batch effects and biological effects. scPoli is further applicable to single-cell sequencing assay for transposase-accessible chromatin and cross-species datasets, offering insights into chromatin accessibility and comparative genomics. We envision scPoli becoming an important tool for population-level single-cell data integration facilitating atlas use but also interpretation by means of multi-scale analyses.
Nature Publishing Group US 2023-10-09 2023 /pmc/articles/PMC10630133/ /pubmed/37813989 http://dx.doi.org/10.1038/s41592-023-02035-2 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Population-level integration of single-cell datasets enables multi-scale analysis across samples
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10630133/
https://www.ncbi.nlm.nih.gov/pubmed/37813989
http://dx.doi.org/10.1038/s41592-023-02035-2