Universal prediction of cell-cycle position using transfer learning
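Main authors: | Zheng, Shijie C.; Stein-O’Brien, Genevieve; Augustin, Jonathan J.; Slosberg, Jared; Carosso, Giovanni A.; Winer, Briana; Shin, Gloria; Bjornsson, Hans T.; Goff, Loyal A.; Hansen, Kasper D. |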
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2022 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8802487/ https://www.ncbi.nlm.nih.gov/pubmed/35101061 http://dx.doi.org/10.1186/s13059-021-02581-y |
_version_ | 1784642690269315072 |
---|---|
author | Zheng, Shijie C. Stein-O’Brien, Genevieve Augustin, Jonathan J. Slosberg, Jared Carosso, Giovanni A. Winer, Briana Shin, Gloria Bjornsson, Hans T. Goff, Loyal A. Hansen, Kasper D. |
author_sort | Zheng, Shijie C. |
collection | PubMed |
description | BACKGROUND: The cell cycle is a highly conserved, continuous process that controls faithful replication and division of cells. Single-cell technologies have enabled increasingly precise measurements of the cell cycle, both as a biological process of interest and as a possible confounding factor. Despite its importance and conservation, there is no universally applicable approach to infer position in the cell cycle with high resolution from single-cell RNA-seq data. RESULTS: Here, we present tricycle, an R/Bioconductor package, to address this challenge by leveraging key features of the biology of the cell cycle, the mathematical properties of principal component analysis of periodic functions, and the use of transfer learning. We estimate a cell-cycle embedding using a fixed reference dataset and project new data into this reference embedding, an approach that overcomes key limitations of learning a dataset-dependent embedding. Tricycle then predicts a cell-specific position in the cell cycle based on the data projection. The accuracy of tricycle compares favorably to gold-standard experimental assays, which generally require specialized measurements in specifically constructed in vitro systems. Using internal controls that are available for any dataset, we show that tricycle predictions generalize to datasets with multiple cell types, across tissues, species, and even sequencing assays. CONCLUSIONS: Tricycle generalizes across datasets and is highly scalable and applicable to atlas-level single-cell RNA-seq data. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13059-021-02581-y. |
format | Online Article Text |
id | pubmed-8802487 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-8802487 2022-02-02 Genome Biol Research BioMed Central 2022-01-31 /pmc/articles/PMC8802487/ /pubmed/35101061 http://dx.doi.org/10.1186/s13059-021-02581-y Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | Universal prediction of cell-cycle position using transfer learning |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8802487/ https://www.ncbi.nlm.nih.gov/pubmed/35101061 http://dx.doi.org/10.1186/s13059-021-02581-y |
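
The approach summarized in the description field (estimate an embedding once on a fixed reference, project new data into it, then read a cell-cycle position off the projection) can be illustrated with a minimal sketch. This is not the tricycle package's API, which is written in R. The function names, the four-gene synthetic data, and the loadings below are all hypothetical, and reading the position as a polar angle of the two projected coordinates is an assumption made for illustration.

```python
import math

TWO_PI = 2 * math.pi


def project_onto_reference(expression, loadings):
    """Project cells onto two fixed reference axes.

    expression: list of cells, each a list with one value per gene.
    loadings:   list with one (w1, w2) weight pair per gene, assumed to
                have been learned once on a reference dataset.
    """
    n_cells = len(expression)
    n_genes = len(loadings)
    # Center each gene across the new cells before projecting.
    means = [sum(cell[g] for cell in expression) / n_cells
             for g in range(n_genes)]
    projected = []
    for cell in expression:
        centered = [cell[g] - means[g] for g in range(n_genes)]
        x = sum(c * w[0] for c, w in zip(centered, loadings))
        y = sum(c * w[1] for c, w in zip(centered, loadings))
        projected.append((x, y))
    return projected


def cycle_position(x, y):
    """Polar angle of a projected cell, mapped to [0, 2*pi)."""
    angle = math.atan2(y, x) % TWO_PI
    # Guard against floating-point rounding up to exactly 2*pi.
    return 0.0 if angle >= TWO_PI else angle


# Hypothetical reference: four periodic genes peaking a quarter cycle apart.
gene_phases = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]
reference_loadings = [(math.cos(p), math.sin(p)) for p in gene_phases]

# Synthetic "new" cells spread evenly around the cycle.
true_positions = [TWO_PI * k / 8 for k in range(8)]
new_data = [[math.cos(t - p) for p in gene_phases] for t in true_positions]

estimated = [cycle_position(x, y)
             for x, y in project_onto_reference(new_data, reference_loadings)]
```

Because the loadings stay fixed, the same projection can be applied to any new dataset without refitting, which is the property the description contrasts with a dataset-dependent embedding. On this noise-free synthetic input the estimated angles match the simulated positions up to floating-point error; real data would need normalization and gene matching that the sketch omits.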