Cargando…

Estimating genome-wide regulatory activity from multi-omics data sets using mathematical optimization

BACKGROUND: Gene regulation is one of the most important cellular processes, indispensable for the adaptability of organisms and closely interlinked with several classes of pathogenesis and their progression. Elucidation of regulatory mechanisms can be approached by a multitude of experimental metho...

Descripción completa

Detalles Bibliográficos
Autores principales: Trescher, Saskia, Münchmeyer, Jannes, Leser, Ulf
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2017
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5369021/
https://www.ncbi.nlm.nih.gov/pubmed/28347313
http://dx.doi.org/10.1186/s12918-017-0419-z
Descripción
Sumario:BACKGROUND: Gene regulation is one of the most important cellular processes, indispensable for the adaptability of organisms and closely interlinked with several classes of pathogenesis and their progression. Elucidation of regulatory mechanisms can be approached by a multitude of experimental methods, yet integration of the resulting heterogeneous, large, and noisy data sets into comprehensive and tissue or disease-specific cellular models requires rigorous computational methods. Recently, several algorithms have been proposed which model genome-wide gene regulation as sets of (linear) equations over the activity and relationships of transcription factors, genes and other factors. Subsequent optimization finds those parameters that minimize the divergence of predicted and measured expression intensities. In various settings, these methods produced promising results in terms of estimating transcription factor activity and identifying key biomarkers for specific phenotypes. However, despite their common root in mathematical optimization, they vastly differ in the types of experimental data being integrated, the background knowledge necessary for their application, the granularity of their regulatory model, the concrete paradigm used for solving the optimization problem and the data sets used for evaluation. RESULTS: Here, we review five recent methods of this class in detail and compare them with respect to several key properties. Furthermore, we quantitatively compare the results of four of the presented methods based on publicly available data sets. CONCLUSIONS: The results show that all methods seem to find biologically relevant information. However, we also observe that the mutual result overlaps are very low, which contradicts biological intuition. Our aim is to raise further awareness of the power of these methods, yet also to identify common shortcomings and necessary extensions enabling focused research on the critical points. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12918-017-0419-z) contains supplementary material, which is available to authorized users.