
FISHFactor: a probabilistic factor model for spatial transcriptomics data with subcellular resolution

MOTIVATION: Factor analysis is a widely used tool for unsupervised dimensionality reduction of high-throughput datasets in molecular biology, with recently proposed extensions designed specifically for spatial transcriptomics data. However, these methods expect (count) matrices as data input and are...

Bibliographic Details
Main authors: Walter, Florin C; Stegle, Oliver; Velten, Britta
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10176502/
https://www.ncbi.nlm.nih.gov/pubmed/37039825
http://dx.doi.org/10.1093/bioinformatics/btad183
author Walter, Florin C
Stegle, Oliver
Velten, Britta
collection PubMed
description MOTIVATION: Factor analysis is a widely used tool for unsupervised dimensionality reduction of high-throughput datasets in molecular biology, with recently proposed extensions designed specifically for spatial transcriptomics data. However, these methods expect (count) matrices as data input and are therefore not directly applicable to single molecule resolution data, which are in the form of coordinate lists annotated with genes and provide insight into subcellular spatial expression patterns. To address this, we here propose FISHFactor, a probabilistic factor model that combines the benefits of spatial, non-negative factor analysis with a Poisson point process likelihood to explicitly model and account for the nature of single molecule resolution data. In addition, FISHFactor shares information across a potentially large number of cells in a common weight matrix, allowing consistent interpretation of factors across cells and yielding improved latent variable estimates. RESULTS: We compare FISHFactor to existing methods that rely on aggregating information through spatial binning and cannot combine information from multiple cells and show that our method leads to more accurate results on simulated data. We show that our method is scalable and can be readily applied to large datasets. Finally, we demonstrate on a real dataset that FISHFactor is able to identify major subcellular expression patterns and spatial gene clusters in a data-driven manner. AVAILABILITY AND IMPLEMENTATION: The model implementation, data simulation and experiment scripts are available under https://www.github.com/bioFAM/FISHFactor.
format Online
Article
Text
id pubmed-10176502
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10176502 2023-05-13 FISHFactor: a probabilistic factor model for spatial transcriptomics data with subcellular resolution Walter, Florin C Stegle, Oliver Velten, Britta Bioinformatics Original Paper
Oxford University Press 2023-04-11 /pmc/articles/PMC10176502/ /pubmed/37039825 http://dx.doi.org/10.1093/bioinformatics/btad183 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title FISHFactor: a probabilistic factor model for spatial transcriptomics data with subcellular resolution
topic Original Paper