Integration of ChIP-seq and machine learning reveals enhancers and a predictive regulatory sequence vocabulary in melanocytes

Bibliographic Details
Main Authors: Gorkin, David U., Lee, Dongwon, Reed, Xylena, Fletez-Brant, Christopher, Bessling, Seneca L., Loftus, Stacie K., Beer, Michael A., Pavan, William J., McCallion, Andrew S.
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2012
Subjects: Method
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3483558/
https://www.ncbi.nlm.nih.gov/pubmed/23019145
http://dx.doi.org/10.1101/gr.139360.112
author Gorkin, David U.
Lee, Dongwon
Reed, Xylena
Fletez-Brant, Christopher
Bessling, Seneca L.
Loftus, Stacie K.
Beer, Michael A.
Pavan, William J.
McCallion, Andrew S.
collection PubMed
description We take a comprehensive approach to the study of regulatory control of gene expression in melanocytes that proceeds from large-scale enhancer discovery facilitated by ChIP-seq; to rigorous validation in silico, in vitro, and in vivo; and finally to the use of machine learning to elucidate a regulatory vocabulary with genome-wide predictive power. We identify 2489 putative melanocyte enhancer loci in the mouse genome by ChIP-seq for EP300 and H3K4me1. We demonstrate that these putative enhancers are evolutionarily constrained, enriched for sequence motifs predicted to bind key melanocyte transcription factors, located near genes relevant to melanocyte biology, and capable of driving reporter gene expression in melanocytes in culture (86%; 43/50) and in transgenic zebrafish (70%; 7/10). Next, using the sequences of these putative enhancers as a training set for a supervised machine learning algorithm, we develop a vocabulary of 6-mers predictive of melanocyte enhancer function. Lastly, we demonstrate that this vocabulary has genome-wide predictive power in both the mouse and human genomes. This study provides deep insight into the regulation of gene expression in melanocytes and demonstrates a powerful approach to the investigation of regulatory sequences that can be applied to other cell types.
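The abstract describes a vocabulary of 6-mers learned from enhancer sequences by a supervised machine learning algorithm, but this record does not give the feature encoding or the classifier. As a minimal sketch only, the following shows one common way to turn a DNA sequence into 6-mer frequency features. Collapsing each 6-mer with its reverse complement, normalizing by the total count, and the function names `kmer_counts` and `feature_vector` are all assumptions made for illustration, not details taken from the paper.

```python
from collections import Counter
from itertools import product

K = 6  # k-mer length stated in the abstract

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical(kmer: str) -> str:
    """Assumption: a k-mer and its reverse complement count as one feature."""
    return min(kmer, reverse_complement(kmer))


def kmer_counts(seq: str, k: int = K) -> Counter:
    """Count canonical k-mers, skipping windows with non-ACGT characters."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[canonical(kmer)] += 1
    return counts


def feature_vector(seq: str, k: int = K) -> list:
    """Return normalized canonical k-mer frequencies in a fixed sorted order."""
    vocab = sorted({canonical("".join(p)) for p in product("ACGT", repeat=k)})
    counts = kmer_counts(seq, k)
    total = sum(counts.values()) or 1  # avoid division by zero on short input
    return [counts[m] / total for m in vocab]
```

With this encoding, each sequence becomes a fixed-length vector (2080 entries for k = 6 when reverse complements are collapsed) that could be passed to any supervised classifier trained on enhancer versus background sequences.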
format Online
Article
Text
id pubmed-3483558
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-3483558 2013-05-01 Genome Res Method
Cold Spring Harbor Laboratory Press 2012-11 /pmc/articles/PMC3483558/ /pubmed/23019145 http://dx.doi.org/10.1101/gr.139360.112 Text en © 2012, Published by Cold Spring Harbor Laboratory Press. This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 3.0 Unported License), as described at http://creativecommons.org/licenses/by-nc/3.0/.
title Integration of ChIP-seq and machine learning reveals enhancers and a predictive regulatory sequence vocabulary in melanocytes
topic Method