Probabilistic modeling of bifurcations in single-cell gene expression data using a Bayesian mixture of factor analyzers
Modeling bifurcations in single-cell transcriptomics data has become an increasingly popular field of research. Several methods have been proposed to infer bifurcation structure from such data, but all rely on heuristic non-probabilistic inference. Here we propose the first generative, fully probabi...
Main authors: | Campbell, Kieran R, Yau, Christopher |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | F1000Research 2017 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5428745/ https://www.ncbi.nlm.nih.gov/pubmed/28503665 http://dx.doi.org/10.12688/wellcomeopenres.11087.1 |
_version_ | 1783235891747094528 |
---|---|
author | Campbell, Kieran R Yau, Christopher |
author_facet | Campbell, Kieran R Yau, Christopher |
author_sort | Campbell, Kieran R |
collection | PubMed |
description | Modeling bifurcations in single-cell transcriptomics data has become an increasingly popular field of research. Several methods have been proposed to infer bifurcation structure from such data, but all rely on heuristic non-probabilistic inference. Here we propose the first generative, fully probabilistic model for such inference based on a Bayesian hierarchical mixture of factor analyzers. Our model exhibits competitive performance on large datasets despite implementing full Markov chain Monte Carlo sampling, and its unique hierarchical prior structure enables automatic determination of genes driving the bifurcation process. We additionally propose an Empirical-Bayes-like extension that deals with the high levels of zero-inflation in single-cell RNA-seq data and quantify when such models are useful. We apply our model to both real and simulated single-cell gene expression data and compare the results to existing pseudotime methods. Finally, we discuss both the merits and weaknesses of such a unified, probabilistic approach in the context of practical bioinformatics analyses. |
format | Online Article Text |
id | pubmed-5428745 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | F1000Research |
record_format | MEDLINE/PubMed |
spelling | pubmed-54287452017-05-12 Probabilistic modeling of bifurcations in single-cell gene expression data using a Bayesian mixture of factor analyzers Campbell, Kieran R Yau, Christopher Wellcome Open Res Method Article Modeling bifurcations in single-cell transcriptomics data has become an increasingly popular field of research. Several methods have been proposed to infer bifurcation structure from such data, but all rely on heuristic non-probabilistic inference. Here we propose the first generative, fully probabilistic model for such inference based on a Bayesian hierarchical mixture of factor analyzers. Our model exhibits competitive performance on large datasets despite implementing full Markov chain Monte Carlo sampling, and its unique hierarchical prior structure enables automatic determination of genes driving the bifurcation process. We additionally propose an Empirical-Bayes-like extension that deals with the high levels of zero-inflation in single-cell RNA-seq data and quantify when such models are useful. We apply our model to both real and simulated single-cell gene expression data and compare the results to existing pseudotime methods. Finally, we discuss both the merits and weaknesses of such a unified, probabilistic approach in the context of practical bioinformatics analyses. F1000Research 2017-03-15 /pmc/articles/PMC5428745/ /pubmed/28503665 http://dx.doi.org/10.12688/wellcomeopenres.11087.1 Text en Copyright: © 2017 Campbell KR and Yau C http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Method Article Campbell, Kieran R Yau, Christopher Probabilistic modeling of bifurcations in single-cell gene expression data using a Bayesian mixture of factor analyzers |
title | Probabilistic modeling of bifurcations in single-cell gene expression data using a Bayesian mixture of factor analyzers |
topic | Method Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5428745/ https://www.ncbi.nlm.nih.gov/pubmed/28503665 http://dx.doi.org/10.12688/wellcomeopenres.11087.1 |