PRISM offers a comprehensive genomic approach to transcription factor function prediction
The human genome encodes 1500–2000 different transcription factors (TFs). ChIP-seq is revealing the global binding profiles of a fraction of TFs in a fraction of their biological contexts. These data show that the majority of TFs bind directly next to a large number of context-relevant target genes,...
Main authors: | Wenger, Aaron M.; Clarke, Shoa L.; Guturu, Harendra; Chen, Jenny; Schaar, Bruce T.; McLean, Cory Y.; Bejerano, Gill |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory Press, 2013 |
Subjects: | Resource |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3638144/ https://www.ncbi.nlm.nih.gov/pubmed/23382538 http://dx.doi.org/10.1101/gr.139071.112 |
_version_ | 1782475802003111936 |
---|---|
author | Wenger, Aaron M. Clarke, Shoa L. Guturu, Harendra Chen, Jenny Schaar, Bruce T. McLean, Cory Y. Bejerano, Gill |
author_sort | Wenger, Aaron M. |
collection | PubMed |
description | The human genome encodes 1500–2000 different transcription factors (TFs). ChIP-seq is revealing the global binding profiles of a fraction of TFs in a fraction of their biological contexts. These data show that the majority of TFs bind directly next to a large number of context-relevant target genes, that most binding is distal, and that binding is context specific. Because of the effort and cost involved, ChIP-seq is seldom used in search of novel TF function. Such exploration is instead done using expression perturbation and genetic screens. Here we propose a comprehensive computational framework for transcription factor function prediction. We curate 332 high-quality nonredundant TF binding motifs that represent all major DNA binding domains, and improve cross-species conserved binding site prediction to obtain 3.3 million conserved, mostly distal, binding site predictions. We combine these with 2.4 million facts about all human and mouse gene functions, in a novel statistical framework, in search of enrichments of particular motifs next to groups of target genes of particular functions. Rigorous parameter tuning and a harsh null are used to minimize false positives. Our novel PRISM (predicting regulatory information from single motifs) approach obtains 2543 TF function predictions in a large variety of contexts, at a false discovery rate of 16%. The predictions are highly enriched for validated TF roles, and 45 of 67 (67%) tested binding site regions in five different contexts act as enhancers in functionally matched cells. |
format | Online Article Text |
id | pubmed-3638144 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Cold Spring Harbor Laboratory Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-3638144 2013-05-04 PRISM offers a comprehensive genomic approach to transcription factor function prediction Wenger, Aaron M. Clarke, Shoa L. Guturu, Harendra Chen, Jenny Schaar, Bruce T. McLean, Cory Y. Bejerano, Gill Genome Res Resource Cold Spring Harbor Laboratory Press 2013-05 /pmc/articles/PMC3638144/ /pubmed/23382538 http://dx.doi.org/10.1101/gr.139071.112 Text en © 2013, Published by Cold Spring Harbor Laboratory Press http://creativecommons.org/licenses/by-nc/3.0/ This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 3.0 Unported License), as described at http://creativecommons.org/licenses/by-nc/3.0/. |
title | PRISM offers a comprehensive genomic approach to transcription factor function prediction |
topic | Resource |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3638144/ https://www.ncbi.nlm.nih.gov/pubmed/23382538 http://dx.doi.org/10.1101/gr.139071.112 |