Predicting cell-type–specific gene expression from regions of open chromatin

Complex patterns of cell-type–specific gene expression are thought to be achieved by combinatorial binding of transcription factors (TFs) to sequence elements in regulatory regions. Predicting cell-type–specific expression in mammals has been hindered by the oftentimes unknown location of distal reg...

Bibliographic Details
Main authors: Natarajan, Anirudh; Yardımcı, Galip Gürkan; Sheffield, Nathan C.; Crawford, Gregory E.; Ohler, Uwe
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3431488/
https://www.ncbi.nlm.nih.gov/pubmed/22955983
http://dx.doi.org/10.1101/gr.135129.111
author Natarajan, Anirudh
Yardımcı, Galip Gürkan
Sheffield, Nathan C.
Crawford, Gregory E.
Ohler, Uwe
collection PubMed
description Complex patterns of cell-type–specific gene expression are thought to be achieved by combinatorial binding of transcription factors (TFs) to sequence elements in regulatory regions. Predicting cell-type–specific expression in mammals has been hindered by the oftentimes unknown location of distal regulatory regions. To alleviate this bottleneck, we used DNase-seq data from 19 diverse human cell types to identify proximal and distal regulatory elements at genome-wide scale. Matched expression data allowed us to separate genes into classes of cell-type–specific up-regulated, down-regulated, and constitutively expressed genes. CG dinucleotide content and DNA accessibility in the promoters of these three classes of genes displayed substantial differences, highlighting the importance of including these aspects in modeling gene expression. We associated DNase I hypersensitive sites (DHSs) with genes, and trained classifiers for different expression patterns. TF sequence motif matches in DHSs provided a strong performance improvement in predicting gene expression over the typical baseline approach of using proximal promoter sequences. In particular, we achieved competitive performance when discriminating up-regulated genes from different cell types or genes up- and down-regulated under the same conditions. We identified previously known and new candidate cell-type–specific regulators. The models generated testable predictions of activating or repressive functions of regulators. DNase I footprints for these regulators were indicative of their direct binding to DNA. In summary, we successfully used information of open chromatin obtained by a single assay, DNase-seq, to address the problem of predicting cell-type–specific gene expression in mammalian organisms directly from regulatory sequence.
format Online
Article
Text
id pubmed-3431488
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-3431488 2012-09-08 Predicting cell-type–specific gene expression from regions of open chromatin Natarajan, Anirudh Yardımcı, Galip Gürkan Sheffield, Nathan C. Crawford, Gregory E. Ohler, Uwe Genome Res Method Cold Spring Harbor Laboratory Press 2012-09 /pmc/articles/PMC3431488/ /pubmed/22955983 http://dx.doi.org/10.1101/gr.135129.111 Text en © 2012, Published by Cold Spring Harbor Laboratory Press This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 3.0 Unported License), as described at http://creativecommons.org/licenses/by-nc/3.0/.
title Predicting cell-type–specific gene expression from regions of open chromatin
topic Method