An information theoretic treatment of sequence-to-expression modeling

Studying a gene’s regulatory mechanisms is a tedious process that involves identification of candidate regulators by transcription factor (TF) knockout or over-expression experiments, delineation of enhancers by reporter assays, and demonstration of direct TF influence by site mutagenesis, among oth...

Bibliographic Details
Main authors: Khajouei, Farzaneh; Sinha, Saurabh
Format: Online Article Text
Language: English
Published: Public Library of Science 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6175532/
https://www.ncbi.nlm.nih.gov/pubmed/30256780
http://dx.doi.org/10.1371/journal.pcbi.1006459
author Khajouei, Farzaneh
Sinha, Saurabh
collection PubMed
description Studying a gene’s regulatory mechanisms is a tedious process that involves identification of candidate regulators by transcription factor (TF) knockout or over-expression experiments, delineation of enhancers by reporter assays, and demonstration of direct TF influence by site mutagenesis, among other approaches. Such experiments are often chosen based on the biologist’s intuition, from several testable hypotheses. We pursue the goal of making this process systematic by using ideas from information theory to reason about experiments in gene regulation, in the hope of ultimately enabling rigorous experiment design strategies. For this, we make use of a state-of-the-art mathematical model of gene expression, which provides a way to formalize our current knowledge of cis- as well as trans-regulatory mechanisms of a gene. Ambiguities in such knowledge can be expressed as uncertainties in the model, which we capture formally by building an ensemble of plausible models that fit the existing data and defining a probability distribution over the ensemble. We then characterize the impact of a new experiment on our understanding of the gene’s regulation based on how the ensemble of plausible models and its probability distribution change when challenged with results from that experiment. This allows us to assess the ‘value’ of the experiment retroactively as the reduction in entropy of the distribution (information gain) resulting from the experiment’s results. We fully formalize this novel approach to reasoning about gene regulation experiments and use it to evaluate a variety of perturbation experiments on two developmental genes of D. melanogaster. We also provide objective and ‘biologist-friendly’ descriptions of the information gained from each such experiment. The rigorously defined information theoretic approaches presented here can be used in the future to formulate systematic strategies for experiment design pertaining to studies of gene regulatory mechanisms.
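The abstract's central quantity, the information gain of an experiment measured as the reduction in entropy of a distribution over an ensemble of models, can be illustrated with a small sketch. Everything below is an assumption made for illustration and not the authors' implementation: the four-model ensemble, the uniform prior, the likelihood values, and the use of a plain Bayesian update to obtain the new distribution are all hypothetical, and the function names are invented here.

```python
import math


def entropy(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def bayes_update(prior, likelihoods):
    """Reweight each model by how well it explains the observed result.

    Assumption: a standard Bayesian update stands in for whatever
    reweighting scheme the paper actually defines.
    """
    unnormalized = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("the observed result is impossible under every model")
    return [u / total for u in unnormalized]


def information_gain(prior, likelihoods):
    """Entropy before the experiment minus entropy after it, in bits."""
    return entropy(prior) - entropy(bayes_update(prior, likelihoods))


# Hypothetical ensemble: four models that fit the existing data equally well.
prior = [0.25, 0.25, 0.25, 0.25]
# Hypothetical experiment result that only models 0 and 1 can explain.
likelihoods = [1.0, 1.0, 0.0, 0.0]

gain = information_gain(prior, likelihoods)
print(f"information gain: {gain} bits")
```

In this toy case the experiment rules out half of an equally weighted ensemble, so the entropy drops from 2 bits to 1 bit. An experiment whose result every model predicts equally well would leave the distribution unchanged and yield a gain of zero.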
format Online
Article
Text
id pubmed-6175532
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-6175532 2018-10-19 PLoS Comput Biol Research Article Public Library of Science 2018-09-26 /pmc/articles/PMC6175532/ /pubmed/30256780 http://dx.doi.org/10.1371/journal.pcbi.1006459 Text en © 2018 Khajouei, Sinha http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title An information theoretic treatment of sequence-to-expression modeling
topic Research Article