Mapping single-cell data to reference atlases by transfer learning
Large single-cell atlases are now routinely generated to serve as references for analysis of smaller-scale studies. Yet learning from reference data is complicated by batch effects between datasets, limited availability of computational resources and sharing restrictions on raw data. Here we introdu...
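Main Authors: | Lotfollahi, Mohammad; Naghipourfar, Mohsen; Luecken, Malte D.; Khajavi, Matin; Büttner, Maren; Wagenstetter, Marco; Avsec, Žiga; Gayoso, Adam; Yosef, Nir; Interlandi, Marta; Rybakov, Sergei; Misharin, Alexander V.; Theis, Fabian J. |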
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group US, 2021 |
Subjects: | Analysis |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8763644/ https://www.ncbi.nlm.nih.gov/pubmed/34462589 http://dx.doi.org/10.1038/s41587-021-01001-7 |
_version_ | 1784633992637579264 |
---|---|
author | Lotfollahi, Mohammad; Naghipourfar, Mohsen; Luecken, Malte D.; Khajavi, Matin; Büttner, Maren; Wagenstetter, Marco; Avsec, Žiga; Gayoso, Adam; Yosef, Nir; Interlandi, Marta; Rybakov, Sergei; Misharin, Alexander V.; Theis, Fabian J. |
author_facet | Lotfollahi, Mohammad; Naghipourfar, Mohsen; Luecken, Malte D.; Khajavi, Matin; Büttner, Maren; Wagenstetter, Marco; Avsec, Žiga; Gayoso, Adam; Yosef, Nir; Interlandi, Marta; Rybakov, Sergei; Misharin, Alexander V.; Theis, Fabian J. |
author_sort | Lotfollahi, Mohammad |
collection | PubMed |
description | Large single-cell atlases are now routinely generated to serve as references for analysis of smaller-scale studies. Yet learning from reference data is complicated by batch effects between datasets, limited availability of computational resources and sharing restrictions on raw data. Here we introduce a deep learning strategy for mapping query datasets on top of a reference called single-cell architectural surgery (scArches). scArches uses transfer learning and parameter optimization to enable efficient, decentralized, iterative reference building and contextualization of new datasets with existing references without sharing raw data. Using examples from mouse brain, pancreas, immune and whole-organism atlases, we show that scArches preserves biological state information while removing batch effects, despite using four orders of magnitude fewer parameters than de novo integration. scArches generalizes to multimodal reference mapping, allowing imputation of missing modalities. Finally, scArches retains coronavirus disease 2019 (COVID-19) disease variation when mapping to a healthy reference, enabling the discovery of disease-specific cell states. scArches will facilitate collaborative projects by enabling iterative construction, updating, sharing and efficient use of reference atlases. |
format | Online Article Text |
id | pubmed-8763644 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Nature Publishing Group US |
record_format | MEDLINE/PubMed |
spelling | pubmed-8763644 2022-01-26 Mapping single-cell data to reference atlases by transfer learning. Nat Biotechnol. Analysis. Nature Publishing Group US 2021-08-30 2022 /pmc/articles/PMC8763644/ /pubmed/34462589 http://dx.doi.org/10.1038/s41587-021-01001-7 Text en © The Author(s) 2021. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit https://creativecommons.org/licenses/by/4.0/. |
title | Mapping single-cell data to reference atlases by transfer learning |
topic | Analysis |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8763644/ https://www.ncbi.nlm.nih.gov/pubmed/34462589 http://dx.doi.org/10.1038/s41587-021-01001-7 |