
PRISM offers a comprehensive genomic approach to transcription factor function prediction

The human genome encodes 1500–2000 different transcription factors (TFs). ChIP-seq is revealing the global binding profiles of a fraction of TFs in a fraction of their biological contexts. These data show that the majority of TFs bind directly next to a large number of context-relevant target genes,...


Bibliographic Details
Main authors: Wenger, Aaron M., Clarke, Shoa L., Guturu, Harendra, Chen, Jenny, Schaar, Bruce T., McLean, Cory Y., Bejerano, Gill
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3638144/
https://www.ncbi.nlm.nih.gov/pubmed/23382538
http://dx.doi.org/10.1101/gr.139071.112
author Wenger, Aaron M.
Clarke, Shoa L.
Guturu, Harendra
Chen, Jenny
Schaar, Bruce T.
McLean, Cory Y.
Bejerano, Gill
collection PubMed
description The human genome encodes 1500–2000 different transcription factors (TFs). ChIP-seq is revealing the global binding profiles of a fraction of TFs in a fraction of their biological contexts. These data show that the majority of TFs bind directly next to a large number of context-relevant target genes, that most binding is distal, and that binding is context specific. Because of the effort and cost involved, ChIP-seq is seldom used in search of novel TF function. Such exploration is instead done using expression perturbation and genetic screens. Here we propose a comprehensive computational framework for transcription factor function prediction. We curate 332 high-quality nonredundant TF binding motifs that represent all major DNA binding domains, and improve cross-species conserved binding site prediction to obtain 3.3 million conserved, mostly distal, binding site predictions. We combine these with 2.4 million facts about all human and mouse gene functions, in a novel statistical framework, in search of enrichments of particular motifs next to groups of target genes of particular functions. Rigorous parameter tuning and a harsh null are used to minimize false positives. Our novel PRISM (predicting regulatory information from single motifs) approach obtains 2543 TF function predictions in a large variety of contexts, at a false discovery rate of 16%. The predictions are highly enriched for validated TF roles, and 45 of 67 (67%) tested binding site regions in five different contexts act as enhancers in functionally matched cells.
format Online
Article
Text
id pubmed-3638144
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-3638144 2013-05-04 PRISM offers a comprehensive genomic approach to transcription factor function prediction Wenger, Aaron M. Clarke, Shoa L. Guturu, Harendra Chen, Jenny Schaar, Bruce T. McLean, Cory Y. Bejerano, Gill Genome Res Resource
Cold Spring Harbor Laboratory Press 2013-05 /pmc/articles/PMC3638144/ /pubmed/23382538 http://dx.doi.org/10.1101/gr.139071.112 Text en © 2013, Published by Cold Spring Harbor Laboratory Press http://creativecommons.org/licenses/by-nc/3.0/ This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 3.0 Unported License), as described at http://creativecommons.org/licenses/by-nc/3.0/.
title PRISM offers a comprehensive genomic approach to transcription factor function prediction
topic Resource
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3638144/
https://www.ncbi.nlm.nih.gov/pubmed/23382538
http://dx.doi.org/10.1101/gr.139071.112