
psupertime: supervised pseudotime analysis for time-series single-cell RNA-seq data

MOTIVATION: Improvements in single-cell RNA-seq technologies mean that studies measuring multiple experimental conditions, such as time series, have become more common. At present, few computational methods exist to infer time series-specific transcriptome changes, and such studies have therefore ty...


Bibliographic Details
Main Authors: Macnair, Will; Gupta, Revant; Claassen, Manfred
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects: ISCB/Ismb 2022
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9235474/
https://www.ncbi.nlm.nih.gov/pubmed/35758781
http://dx.doi.org/10.1093/bioinformatics/btac227
author Macnair, Will
Gupta, Revant
Claassen, Manfred
collection PubMed
description MOTIVATION: Improvements in single-cell RNA-seq technologies mean that studies measuring multiple experimental conditions, such as time series, have become more common. At present, few computational methods exist to infer time series-specific transcriptome changes, and such studies have therefore typically used unsupervised pseudotime methods. While these methods identify cell subpopulations and the transitions between them, they are not appropriate for identifying the genes that vary coherently along the time series. In addition, the orderings they estimate are based only on the major sources of variation in the data, which may not correspond to the processes related to the time labels.
RESULTS: We introduce psupertime, a supervised pseudotime approach based on a regression model, which explicitly uses time-series labels as input. It identifies genes that vary coherently along a time series, and also provides pseudotime values for individual cells and a classifier that can be used to estimate labels for new data with unknown or differing labels. We show that psupertime outperforms benchmark classifiers in identifying time-varying genes and provides better individual cell orderings than popular unsupervised pseudotime techniques. psupertime is applicable to any single-cell RNA-seq dataset with sequential labels (principally time series, but also drug dosage and disease progression) derived from experimental design, and provides a fast, interpretable tool for targeted identification of genes varying along with specific biological processes.
AVAILABILITY AND IMPLEMENTATION: R package available at github.com/wmacnair/psupertime and code for results reproduction at github.com/wmacnair/psupplementary.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
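The idea in the RESULTS paragraph, regressing sequential labels on gene expression and reading off both the influential genes and a per-cell ordering, can be illustrated with a small sketch. The abstract does not state which regression model psupertime uses, so this example swaps in a plain L1-penalised linear regression on the label index, fitted by proximal gradient descent. The function names, the synthetic three-gene dataset, and all parameter values are hypothetical and are not part of the psupertime R package.

```python
import random


def soft_threshold(value, amount):
    """Shrink value towards zero by amount (proximal step for the L1 penalty)."""
    if value > amount:
        return value - amount
    if value < -amount:
        return value + amount
    return 0.0


def fit_l1_regression(X, y, l1=0.05, lr=0.05, epochs=1000):
    """Minimise (1/2n)*sum((X.w + b - y)^2) + l1*sum(|w|) by proximal gradient descent.

    Stand-in model (assumption): linear regression on the label index,
    not the model used by psupertime itself.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        resid = [sum(wj * xj for wj, xj in zip(w, row)) + b - yi
                 for row, yi in zip(X, y)]
        grad_w = [sum(r * row[j] for r, row in zip(resid, X)) / n
                  for j in range(p)]
        b -= lr * sum(resid) / n
        w = [soft_threshold(wj - lr * gj, lr * l1) for wj, gj in zip(w, grad_w)]
    return w, b


def pseudotime(X, w, b):
    """Project each cell onto the fitted gene weights to get one score per cell."""
    return [sum(wj * xj for wj, xj in zip(w, row)) + b for row in X]


# Hypothetical data: 4 sequential labels, 20 cells each, 3 genes.
# Gene 0 tracks the label; genes 1 and 2 are pure noise.
rng = random.Random(0)
labels = [t for t in range(4) for _ in range(20)]
X = [[t + rng.gauss(0, 0.3), rng.gauss(0, 1), rng.gauss(0, 1)] for t in labels]

w, b = fit_l1_regression(X, labels)
scores = pseudotime(X, w, b)

# Average pseudotime per label; these should rise with the label order.
mean_by_label = [
    sum(s for s, t in zip(scores, labels) if t == k) / labels.count(k)
    for k in range(4)
]
print("gene weights:", [round(v, 3) for v in w])
print("mean pseudotime per label:", [round(m, 3) for m in mean_by_label])
```

In this toy setting the penalty pushes the weights of the noise genes towards zero, so the surviving weight points at the gene that varies with the labels, and the projected scores order cells along the labelled sequence. A model that treats the labels as ordered categories rather than as numbers would be a closer match to sequential labels in general.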
format Online
Article
Text
id pubmed-9235474
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9235474 2022-06-29 psupertime: supervised pseudotime analysis for time-series single-cell RNA-seq data Macnair, Will Gupta, Revant Claassen, Manfred Bioinformatics ISCB/Ismb 2022 Oxford University Press 2022-06-27 /pmc/articles/PMC9235474/ /pubmed/35758781 http://dx.doi.org/10.1093/bioinformatics/btac227 Text en © The Author(s) 2022. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title psupertime: supervised pseudotime analysis for time-series single-cell RNA-seq data
topic ISCB/Ismb 2022
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9235474/
https://www.ncbi.nlm.nih.gov/pubmed/35758781
http://dx.doi.org/10.1093/bioinformatics/btac227