Dictionary learning allows model-free pseudotime estimation of transcriptomic data

BACKGROUND: Pseudotime estimation from dynamic single-cell transcriptomic data enables characterisation and understanding of the underlying processes, for example developmental processes. Various pseudotime estimation methods have been proposed during the last years. Typically, these methods start w...

Bibliographic Details
Main authors: Rams, Mona; Conrad, Tim O.F.
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8760643/
https://www.ncbi.nlm.nih.gov/pubmed/35033004
http://dx.doi.org/10.1186/s12864-021-08276-9
_version_ 1784633365022900224
author Rams, Mona
Conrad, Tim O.F.
author_facet Rams, Mona
Conrad, Tim O.F.
author_sort Rams, Mona
collection PubMed
description BACKGROUND: Pseudotime estimation from dynamic single-cell transcriptomic data enables the characterisation and understanding of the underlying processes, such as developmental processes. Various pseudotime estimation methods have been proposed in recent years. Typically, these methods start with a dimension reduction step because the low-dimensional representation is usually easier to analyse. PCA, ICA and t-SNE are among the most widely used dimension reduction methods in pseudotime estimation. However, these methods usually make assumptions about the derived dimensions, which can result in important dataset properties being missed. In this paper, we suggest a new dictionary-learning-based approach, dynDLT, for dimension reduction and pseudotime estimation of dynamic transcriptomic data. Dictionary learning is a matrix factorisation approach that does not restrict the dependence of the derived dimensions. To evaluate the performance, we conduct a large simulation study and analyse eight real-world datasets.
RESULTS: The simulation studies reveal, first, that dynDLT preserves the simulated patterns in the low-dimensional representation and that the pseudotimes can be derived from this representation. Second, the results show that dynDLT is suitable for detecting genes that exhibit the simulated dynamic patterns, which facilitates the interpretation of the compressed representation and thus of the dynamic processes. For the real-world data analysis, we select datasets with samples taken at different time points throughout an experiment. The pseudotimes found by dynDLT correlate highly with the experimental times. We compare the results with other approaches that are used in pseudotime estimation or that are methodologically close to dictionary learning: ICA, NMF, PCA, t-SNE, and UMAP. DynDLT has the best overall performance for the simulated and real-world datasets.
CONCLUSIONS: We introduce dynDLT, a method that is suitable for pseudotime estimation. Its main advantages are: (1) it is a model-free approach, meaning that it does not restrict the dependence of the derived dimensions; (2) genes that are relevant in the detected dynamic processes can be identified from the dictionary matrix; (3) restricting the dictionary entries to positive values makes the dictionary atoms highly interpretable. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12864-021-08276-9.
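The abstract reports that the estimated pseudotimes correlate highly with the experimental sampling times. The following is a minimal sketch of how such a comparison could be scored. It assumes a Spearman rank correlation, which the record does not specify, and it uses hypothetical toy values rather than data from the article. The function names `average_ranks`, `pearson` and `spearman` are illustrative and are not part of dynDLT.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data: sampling times in hours and one pseudotime per sample.
experimental_time = [0, 0, 12, 12, 24, 24, 48, 48]
pseudotime = [0.05, 0.10, 0.30, 0.28, 0.55, 0.60, 0.90, 0.95]

rho = spearman(pseudotime, experimental_time)
print(f"Spearman correlation: {rho:.3f}")
```

A rank-based measure is used here because a pseudotime only needs to order the samples consistently with the experiment, not match the sampling times on a linear scale. Since the direction of a pseudotime axis is often arbitrary, comparing the absolute value of the correlation may be more appropriate in practice.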
format Online
Article
Text
id pubmed-8760643
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8760643 2022-01-18 Dictionary learning allows model-free pseudotime estimation of transcriptomic data Rams, Mona Conrad, Tim O.F. BMC Genomics Methodology Article BioMed Central 2022-01-15 /pmc/articles/PMC8760643/ /pubmed/35033004 http://dx.doi.org/10.1186/s12864-021-08276-9 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Methodology Article
Rams, Mona
Conrad, Tim O.F.
Dictionary learning allows model-free pseudotime estimation of transcriptomic data
title Dictionary learning allows model-free pseudotime estimation of transcriptomic data
title_full Dictionary learning allows model-free pseudotime estimation of transcriptomic data
title_fullStr Dictionary learning allows model-free pseudotime estimation of transcriptomic data
title_full_unstemmed Dictionary learning allows model-free pseudotime estimation of transcriptomic data
title_short Dictionary learning allows model-free pseudotime estimation of transcriptomic data
title_sort dictionary learning allows model-free pseudotime estimation of transcriptomic data
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8760643/
https://www.ncbi.nlm.nih.gov/pubmed/35033004
http://dx.doi.org/10.1186/s12864-021-08276-9
work_keys_str_mv AT ramsmona dictionarylearningallowsmodelfreepseudotimeestimationoftranscriptomicdata
AT conradtimof dictionarylearningallowsmodelfreepseudotimeestimationoftranscriptomicdata