An Integrated Approach to Identifying Cis-Regulatory Modules in the Human Genome

Bibliographic Details
Main Authors: Won, Kyoung-Jae, Agarwal, Saurabh, Shen, Li, Shoemaker, Robert, Ren, Bing, Wang, Wei
Format: Text
Language: English
Published: Public Library of Science 2009
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2677454/
https://www.ncbi.nlm.nih.gov/pubmed/19434238
http://dx.doi.org/10.1371/journal.pone.0005501
author Won, Kyoung-Jae
Agarwal, Saurabh
Shen, Li
Shoemaker, Robert
Ren, Bing
Wang, Wei
collection PubMed
description In eukaryotic genomes, it is challenging to accurately determine the target sites of transcription factors (TFs) using sequence information alone. Previous efforts tackled this task by exploiting two observations: TF binding sites tend to be more conserved than other functional sites, and the binding sites of several TFs are often clustered. Recently, ChIP-chip and ChIP-sequencing data have accumulated that identify TF binding sites and survey chromatin modification patterns at regulatory elements such as promoters and enhancers. We propose here a hidden Markov model (HMM) that incorporates sequence motif information, TF-DNA interaction data, and chromatin modification patterns to precisely identify cis-regulatory modules (CRMs). We conducted ChIP-chip experiments on four TFs (CREB, E2F1, MAX, and YY1) in 1% of the human genome. We then trained the HMM to identify the labels of the CRMs by incorporating the sequence motifs recognized by these TFs and the ChIP-chip ratio. Chromatin modification data were used to predict the functional sites and to further remove false positives. Cross-validation showed that our integrated HMM outperformed other existing methods at predicting CRMs. Incorporating histone signature information successfully penalized false predictions and improved overall performance. The dataset we used and the software are available at http://nash.ucsd.edu/CIS/.
format Text
id pubmed-2677454
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2677454 2009-05-12 An Integrated Approach to Identifying Cis-Regulatory Modules in the Human Genome Won, Kyoung-Jae; Agarwal, Saurabh; Shen, Li; Shoemaker, Robert; Ren, Bing; Wang, Wei PLoS One Research Article Public Library of Science 2009-05-12 /pmc/articles/PMC2677454/ /pubmed/19434238 http://dx.doi.org/10.1371/journal.pone.0005501 Text en Won et al.
http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title An Integrated Approach to Identifying Cis-Regulatory Modules in the Human Genome
topic Research Article