Efficient inference for sparse latent variable models of transcriptional regulation

MOTIVATION: Regulation of gene expression in prokaryotes involves complex co-regulatory mechanisms involving large numbers of transcriptional regulatory proteins and their target genes. Uncovering these genome-scale interactions constitutes a major bottleneck in systems biology. Sparse latent factor...

Bibliographic Details
Main Authors:	Dai, Zhenwen; Iqbal, Mudassar; Lawrence, Neil D; Rattray, Magnus
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2017
Subjects:	Original Papers
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5860323/ https://www.ncbi.nlm.nih.gov/pubmed/28961802 http://dx.doi.org/10.1093/bioinformatics/btx508

_version_	1783307966636621824
author	Dai, Zhenwen Iqbal, Mudassar Lawrence, Neil D Rattray, Magnus
author_facet	Dai, Zhenwen Iqbal, Mudassar Lawrence, Neil D Rattray, Magnus
author_sort	Dai, Zhenwen
collection	PubMed
description	MOTIVATION: Regulation of gene expression in prokaryotes involves complex co-regulatory mechanisms involving large numbers of transcriptional regulatory proteins and their target genes. Uncovering these genome-scale interactions constitutes a major bottleneck in systems biology. Sparse latent factor models, assuming activity of transcription factors (TFs) as unobserved, provide a biologically interpretable modelling framework, integrating gene expression and genome-wide binding data, but at the same time pose a hard computational inference problem. Existing probabilistic inference methods for such models rely on subjective filtering and suffer from scalability issues, thus are not well-suited for realistic genome-scale applications. RESULTS: We present a fast Bayesian sparse factor model, which takes input gene expression and binding sites data, either from ChIP-seq experiments or motif predictions, and outputs active TF-gene links as well as latent TF activities. Our method employs an efficient variational Bayes scheme for model inference enabling its application to large datasets which was not feasible with existing MCMC-based inference methods for such models. We validate our method on synthetic data against a similar model in the literature, employing MCMC for inference, and obtain comparable results with a small fraction of the computational time. We also apply our method to large-scale data from Mycobacterium tuberculosis involving ChIP-seq data on 113 TFs and matched gene expression data for 3863 putative target genes. We evaluate our predictions using an independent transcriptomics experiment involving over-expression of TFs. AVAILABILITY AND IMPLEMENTATION: An easy-to-use Jupyter notebook demo of our method with data is available at https://github.com/zhenwendai/SITAR. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format	Online Article Text
id	pubmed-5860323
institution	National Center for Biotechnology Information
language	English
publishDate	2017
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-5860323 2018-03-21 Efficient inference for sparse latent variable models of transcriptional regulation Dai, Zhenwen Iqbal, Mudassar Lawrence, Neil D Rattray, Magnus Bioinformatics Original Papers Oxford University Press 2017-12-01 2017-08-26 /pmc/articles/PMC5860323/ /pubmed/28961802 http://dx.doi.org/10.1093/bioinformatics/btx508 Text en © The Author 2017. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Original Papers Dai, Zhenwen Iqbal, Mudassar Lawrence, Neil D Rattray, Magnus Efficient inference for sparse latent variable models of transcriptional regulation
title	Efficient inference for sparse latent variable models of transcriptional regulation
topic	Original Papers
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5860323/ https://www.ncbi.nlm.nih.gov/pubmed/28961802 http://dx.doi.org/10.1093/bioinformatics/btx508
