Integrative analysis of genomic, functional and protein interaction data predicts long-range enhancer-target gene interactions
Multicellular organismal development is controlled by a complex network of transcription factors, promoters and enhancers. Although reliable computational and experimental methods exist for enhancer detection, prediction of their target genes remains a major challenge. On the basis of available lite...
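Main authors: | Rödelsperger, Christian; Guo, Gao; Kolanczyk, Mateusz; Pletschacher, Angelika; Köhler, Sebastian; Bauer, Sebastian; Schulz, Marcel H.; Robinson, Peter N. |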
---|---|
Format: | Text |
Language: | English |
Published: | Oxford University Press, 2011 |
Subjects: | Computational Biology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3074119/ https://www.ncbi.nlm.nih.gov/pubmed/21109530 http://dx.doi.org/10.1093/nar/gkq1081 |
_version_ | 1782201688268996608 |
---|---|
author | Rödelsperger, Christian; Guo, Gao; Kolanczyk, Mateusz; Pletschacher, Angelika; Köhler, Sebastian; Bauer, Sebastian; Schulz, Marcel H.; Robinson, Peter N. |
author_sort | Rödelsperger, Christian |
collection | PubMed |
description | Multicellular organismal development is controlled by a complex network of transcription factors, promoters and enhancers. Although reliable computational and experimental methods exist for enhancer detection, prediction of their target genes remains a major challenge. On the basis of available literature and ChIP-seq and ChIP-chip data for enhanceosome factor p300 and the transcriptional regulator Gli3, we found that genomic proximity and conserved synteny predict target genes with a relatively low recall of 12–27% within 2 Mb intervals centered at the enhancers. Here, we show that functional similarities between enhancer binding proteins and their transcriptional targets and proximity in the protein–protein interactome improve prediction of target genes. We used all four features to train random forest classifiers that predict target genes with a recall of 58% in 2 Mb intervals that may contain dozens of genes, representing a better than two-fold improvement over the performance of prediction based on single features alone. Genome-wide ChIP data is still relatively poorly understood, and it remains difficult to assign biological significance to binding events. Our study represents a first step in integrating various genomic features in order to elucidate the genomic network of long-range regulatory interactions. |
format | Text |
id | pubmed-3074119 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-30741192011-04-12 Integrative analysis of genomic, functional and protein interaction data predicts long-range enhancer-target gene interactions Rödelsperger, Christian Guo, Gao Kolanczyk, Mateusz Pletschacher, Angelika Köhler, Sebastian Bauer, Sebastian Schulz, Marcel H. Robinson, Peter N. Nucleic Acids Res Computational Biology Multicellular organismal development is controlled by a complex network of transcription factors, promoters and enhancers. Although reliable computational and experimental methods exist for enhancer detection, prediction of their target genes remains a major challenge. On the basis of available literature and ChIP-seq and ChIP-chip data for enhanceosome factor p300 and the transcriptional regulator Gli3, we found that genomic proximity and conserved synteny predict target genes with a relatively low recall of 12–27% within 2 Mb intervals centered at the enhancers. Here, we show that functional similarities between enhancer binding proteins and their transcriptional targets and proximity in the protein–protein interactome improve prediction of target genes. We used all four features to train random forest classifiers that predict target genes with a recall of 58% in 2 Mb intervals that may contain dozens of genes, representing a better than two-fold improvement over the performance of prediction based on single features alone. Genome-wide ChIP data is still relatively poorly understood, and it remains difficult to assign biological significance to binding events. Our study represents a first step in integrating various genomic features in order to elucidate the genomic network of long-range regulatory interactions. Oxford University Press 2011-04 2010-11-24 /pmc/articles/PMC3074119/ /pubmed/21109530 http://dx.doi.org/10.1093/nar/gkq1081 Text en © The Author(s) 2010. Published by Oxford University Press. 
http://creativecommons.org/licenses/by-nc/2.5 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Integrative analysis of genomic, functional and protein interaction data predicts long-range enhancer-target gene interactions |
topic | Computational Biology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3074119/ https://www.ncbi.nlm.nih.gov/pubmed/21109530 http://dx.doi.org/10.1093/nar/gkq1081 |