Probabilistic modeling of bifurcations in single-cell gene expression data using a Bayesian mixture of factor analyzers

Modeling bifurcations in single-cell transcriptomics data has become an increasingly popular field of research. Several methods have been proposed to infer bifurcation structure from such data, but all rely on heuristic non-probabilistic inference. Here we propose the first generative, fully probabi...

Bibliographic Details
Main authors: Campbell, Kieran R; Yau, Christopher
Format: Online Article Text
Language: English
Published: F1000Research 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5428745/
https://www.ncbi.nlm.nih.gov/pubmed/28503665
http://dx.doi.org/10.12688/wellcomeopenres.11087.1
_version_ 1783235891747094528
author Campbell, Kieran R
Yau, Christopher
author_facet Campbell, Kieran R
Yau, Christopher
author_sort Campbell, Kieran R
collection PubMed
description Modeling bifurcations in single-cell transcriptomics data has become an increasingly popular field of research. Several methods have been proposed to infer bifurcation structure from such data, but all rely on heuristic non-probabilistic inference. Here we propose the first generative, fully probabilistic model for such inference based on a Bayesian hierarchical mixture of factor analyzers. Our model exhibits competitive performance on large datasets despite implementing full Markov-Chain Monte Carlo sampling, and its unique hierarchical prior structure enables automatic determination of genes driving the bifurcation process. We additionally propose an empirical-Bayes-like extension that deals with the high levels of zero-inflation in single-cell RNA-seq data and quantify when such models are useful. We apply our model to both real and simulated single-cell gene expression data and compare the results to existing pseudotime methods. Finally, we discuss both the merits and weaknesses of such a unified, probabilistic approach in the context of practical bioinformatics analyses.
format Online
Article
Text
id pubmed-5428745
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher F1000Research
record_format MEDLINE/PubMed
spelling pubmed-5428745 2017-05-12 Probabilistic modeling of bifurcations in single-cell gene expression data using a Bayesian mixture of factor analyzers Campbell, Kieran R Yau, Christopher Wellcome Open Res Method Article Modeling bifurcations in single-cell transcriptomics data has become an increasingly popular field of research. Several methods have been proposed to infer bifurcation structure from such data, but all rely on heuristic non-probabilistic inference. Here we propose the first generative, fully probabilistic model for such inference based on a Bayesian hierarchical mixture of factor analyzers. Our model exhibits competitive performance on large datasets despite implementing full Markov-Chain Monte Carlo sampling, and its unique hierarchical prior structure enables automatic determination of genes driving the bifurcation process. We additionally propose an empirical-Bayes-like extension that deals with the high levels of zero-inflation in single-cell RNA-seq data and quantify when such models are useful. We apply our model to both real and simulated single-cell gene expression data and compare the results to existing pseudotime methods. Finally, we discuss both the merits and weaknesses of such a unified, probabilistic approach in the context of practical bioinformatics analyses. F1000Research 2017-03-15 /pmc/articles/PMC5428745/ /pubmed/28503665 http://dx.doi.org/10.12688/wellcomeopenres.11087.1 Text en Copyright: © 2017 Campbell KR and Yau C http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Method Article
Campbell, Kieran R
Yau, Christopher
Probabilistic modeling of bifurcations in single-cell gene expression data using a Bayesian mixture of factor analyzers
title Probabilistic modeling of bifurcations in single-cell gene expression data using a Bayesian mixture of factor analyzers
topic Method Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5428745/
https://www.ncbi.nlm.nih.gov/pubmed/28503665
http://dx.doi.org/10.12688/wellcomeopenres.11087.1
work_keys_str_mv AT campbellkieranr probabilisticmodelingofbifurcationsinsinglecellgeneexpressiondatausingabayesianmixtureoffactoranalyzers
AT yauchristopher probabilisticmodelingofbifurcationsinsinglecellgeneexpressiondatausingabayesianmixtureoffactoranalyzers