Population-level integration of single-cell datasets enables multi-scale analysis across samples
The increasing generation of population-level single-cell atlases has the potential to link sample metadata with cellular data. Constructing such references requires integration of heterogeneous cohorts with varying metadata. Here we present single-cell population level integration (scPoli), an open...
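Main authors: | De Donno, Carlo; Hediyeh-Zadeh, Soroor; Moinfar, Amir Ali; Wagenstetter, Marco; Zappia, Luke; Lotfollahi, Mohammad; Theis, Fabian J. |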
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group US 2023 |
Subjects: | Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10630133/ https://www.ncbi.nlm.nih.gov/pubmed/37813989 http://dx.doi.org/10.1038/s41592-023-02035-2 |
_version_ | 1785132091290157056 |
---|---|
author | De Donno, Carlo Hediyeh-Zadeh, Soroor Moinfar, Amir Ali Wagenstetter, Marco Zappia, Luke Lotfollahi, Mohammad Theis, Fabian J. |
author_sort | De Donno, Carlo |
collection | PubMed |
description | The increasing generation of population-level single-cell atlases has the potential to link sample metadata with cellular data. Constructing such references requires integration of heterogeneous cohorts with varying metadata. Here we present single-cell population level integration (scPoli), an open-world learner that incorporates generative models to learn sample and cell representations for data integration, label transfer and reference mapping. We applied scPoli on population-level atlases of lung and peripheral blood mononuclear cells, the latter consisting of 7.8 million cells across 2,375 samples. We demonstrate that scPoli can explain sample-level biological and technical variations using sample embeddings revealing genes associated with batch effects and biological effects. scPoli is further applicable to single-cell sequencing assay for transposase-accessible chromatin and cross-species datasets, offering insights into chromatin accessibility and comparative genomics. We envision scPoli becoming an important tool for population-level single-cell data integration facilitating atlas use but also interpretation by means of multi-scale analyses. |
format | Online Article Text |
id | pubmed-10630133 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Nature Publishing Group US |
record_format | MEDLINE/PubMed |
spelling | pubmed-10630133 2023-11-09 Population-level integration of single-cell datasets enables multi-scale analysis across samples De Donno, Carlo Hediyeh-Zadeh, Soroor Moinfar, Amir Ali Wagenstetter, Marco Zappia, Luke Lotfollahi, Mohammad Theis, Fabian J. Nat Methods Article The increasing generation of population-level single-cell atlases has the potential to link sample metadata with cellular data. Constructing such references requires integration of heterogeneous cohorts with varying metadata. Here we present single-cell population level integration (scPoli), an open-world learner that incorporates generative models to learn sample and cell representations for data integration, label transfer and reference mapping. We applied scPoli on population-level atlases of lung and peripheral blood mononuclear cells, the latter consisting of 7.8 million cells across 2,375 samples. We demonstrate that scPoli can explain sample-level biological and technical variations using sample embeddings revealing genes associated with batch effects and biological effects. scPoli is further applicable to single-cell sequencing assay for transposase-accessible chromatin and cross-species datasets, offering insights into chromatin accessibility and comparative genomics. We envision scPoli becoming an important tool for population-level single-cell data integration facilitating atlas use but also interpretation by means of multi-scale analyses.
Nature Publishing Group US 2023-10-09 2023 /pmc/articles/PMC10630133/ /pubmed/37813989 http://dx.doi.org/10.1038/s41592-023-02035-2 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article De Donno, Carlo Hediyeh-Zadeh, Soroor Moinfar, Amir Ali Wagenstetter, Marco Zappia, Luke Lotfollahi, Mohammad Theis, Fabian J. Population-level integration of single-cell datasets enables multi-scale analysis across samples |
title | Population-level integration of single-cell datasets enables multi-scale analysis across samples |
topic | Article |