Dictionary learning allows model-free pseudotime estimation of transcriptomic data

BACKGROUND: Pseudotime estimation from dynamic single-cell transcriptomic data enables characterisation and understanding of the underlying processes, for example developmental processes. Various pseudotime estimation methods have been proposed in recent years. Typically, these methods start w...

Bibliographic Details
Main Authors:	Rams, Mona; Conrad, Tim O.F.
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2022
Subjects:	Methodology Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8760643/ https://www.ncbi.nlm.nih.gov/pubmed/35033004 http://dx.doi.org/10.1186/s12864-021-08276-9

_version_	1784633365022900224
author	Rams, Mona Conrad, Tim O.F.
author_facet	Rams, Mona Conrad, Tim O.F.
author_sort	Rams, Mona
collection	PubMed
description	BACKGROUND: Pseudotime estimation from dynamic single-cell transcriptomic data enables characterisation and understanding of the underlying processes, for example developmental processes. Various pseudotime estimation methods have been proposed in recent years. Typically, these methods start with a dimension reduction step because the low-dimensional representation is usually easier to analyse. Approaches such as PCA, ICA or t-SNE are among the most widely used methods for dimension reduction in pseudotime estimation. However, these methods usually make assumptions about the derived dimensions, which can result in important dataset properties being missed. In this paper, we suggest a new dictionary-learning-based approach, dynDLT, for dimension reduction and pseudotime estimation of dynamic transcriptomic data. Dictionary learning is a matrix factorisation approach that does not restrict the dependence of the derived dimensions. To evaluate the performance, we conduct a large simulation study and analyse 8 real-world datasets. RESULTS: The simulation studies reveal, first, that dynDLT preserves the simulated patterns in the low-dimensional representation and that the pseudotimes can be derived from it. Second, the results show that dynDLT is suitable for the detection of genes exhibiting the simulated dynamic patterns, thereby facilitating the interpretation of the compressed representation and thus of the dynamic processes. For the real-world data analysis, we select datasets with samples that are taken at different time points throughout an experiment. The pseudotimes found by dynDLT have high correlations with the experimental times. We compare the results to other approaches used in pseudotime estimation, or those that are methodologically closely related to dictionary learning: ICA, NMF, PCA, t-SNE, and UMAP. DynDLT has the best overall performance for the simulated and real-world datasets. CONCLUSIONS: We introduce dynDLT, a method that is suitable for pseudotime estimation. Its main advantages are: (1) it is a model-free approach, meaning that it does not restrict the dependence of the derived dimensions; (2) genes that are relevant in the detected dynamic processes can be identified from the dictionary matrix; (3) because the dictionary entries are restricted to positive values, the dictionary atoms are highly interpretable. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12864-021-08276-9.
format	Online Article Text
id	pubmed-8760643
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-8760643 2022-01-18 Dictionary learning allows model-free pseudotime estimation of transcriptomic data Rams, Mona Conrad, Tim O.F. BMC Genomics Methodology Article BioMed Central 2022-01-15 /pmc/articles/PMC8760643/ /pubmed/35033004 http://dx.doi.org/10.1186/s12864-021-08276-9 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle	Methodology Article Rams, Mona Conrad, Tim O.F. Dictionary learning allows model-free pseudotime estimation of transcriptomic data
title	Dictionary learning allows model-free pseudotime estimation of transcriptomic data
title_full	Dictionary learning allows model-free pseudotime estimation of transcriptomic data
title_fullStr	Dictionary learning allows model-free pseudotime estimation of transcriptomic data
title_full_unstemmed	Dictionary learning allows model-free pseudotime estimation of transcriptomic data
title_short	Dictionary learning allows model-free pseudotime estimation of transcriptomic data
title_sort	dictionary learning allows model-free pseudotime estimation of transcriptomic data
topic	Methodology Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8760643/ https://www.ncbi.nlm.nih.gov/pubmed/35033004 http://dx.doi.org/10.1186/s12864-021-08276-9
work_keys_str_mv	AT ramsmona dictionarylearningallowsmodelfreepseudotimeestimationoftranscriptomicdata AT conradtimof dictionarylearningallowsmodelfreepseudotimeestimationoftranscriptomicdata
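
The abstract reports that the pseudotimes found by dynDLT correlate highly with the experimental sampling times. As a rough illustration of how such a comparison could be scored, the sketch below computes an absolute Spearman rank correlation in plain Python. This is not taken from the article: the abstract does not say which correlation coefficient the authors use, and the function names (`average_ranks`, `pearson`, `pseudotime_agreement`) and the sample values are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def pseudotime_agreement(pseudotime, experimental_time):
    """Absolute Spearman correlation between pseudotime and sampling time.

    Assumption: the sign is dropped because a pseudotime axis has no
    inherent direction (it may run from late to early samples).
    """
    return abs(pearson(average_ranks(pseudotime),
                       average_ranks(experimental_time)))


# Made-up values for illustration only: six samples at three time points,
# with a pseudotime that happens to run in the reverse direction.
experimental_time = [0, 0, 12, 12, 24, 24]
pseudotime = [0.9, 0.8, 0.5, 0.6, 0.1, 0.2]
score = pseudotime_agreement(pseudotime, experimental_time)
print(round(score, 3))
```

Average ranks are used because samples taken at the same experimental time point are tied, which is the usual situation in time-course data.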
