
Improving ChIP-seq peak-calling for functional co-regulator binding by integrating multiple sources of biological information

BACKGROUND: Chromatin immunoprecipitation coupled with massively parallel sequencing (ChIP-seq) is increasingly being applied to study genome-wide binding sites of transcription factors. There is an increasing interest in understanding the mechanism of action of co-regulator proteins, which do not b...

Bibliographic Details
Main authors: Osmanbeyoglu, Hatice Ulku; Hartmaier, Ryan J; Oesterreich, Steffi; Lu, Xinghua
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3439677/
https://www.ncbi.nlm.nih.gov/pubmed/22369349
http://dx.doi.org/10.1186/1471-2164-13-S1-S1
author Osmanbeyoglu, Hatice Ulku
Hartmaier, Ryan J
Oesterreich, Steffi
Lu, Xinghua
collection PubMed
description BACKGROUND: Chromatin immunoprecipitation coupled with massively parallel sequencing (ChIP-seq) is increasingly being applied to study genome-wide binding sites of transcription factors. There is increasing interest in understanding the mechanism of action of co-regulator proteins, which do not bind DNA directly but exert their effects by binding to transcription factors such as the estrogen receptor (ER). However, because the protein-DNA interaction being detected is indirect, ChIP-seq signals from co-regulators can be relatively weak, and biologically meaningful interactions therefore remain difficult to identify. RESULTS: In this study, we investigated and compared different statistical and machine learning approaches, including unsupervised, supervised, and semi-supervised (self-training) classification, to integrate multiple types of genomic and transcriptomic information derived from our experiments and public databases, with the aim of overcoming the difficulty of identifying functional DNA binding sites of the co-regulator SRC-1 in the context of estrogen response. Our results indicate that supervised learning with the naïve Bayes algorithm significantly enhances peak calling of weak ChIP-seq signals and outperforms the other machine learning algorithms. Our integrative approach revealed many potential ERα/SRC-1 DNA binding sites that would otherwise be missed by conventional peak calling algorithms with default settings. CONCLUSIONS: Our results indicate that a supervised classification approach enables one to utilize limited amounts of prior knowledge together with multiple types of biological data to enhance the sensitivity and specificity of the identification of DNA binding sites of co-regulator proteins.
format Online
Article
Text
id pubmed-3439677
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3439677 2012-09-17 Improving ChIP-seq peak-calling for functional co-regulator binding by integrating multiple sources of biological information Osmanbeyoglu, Hatice Ulku Hartmaier, Ryan J Oesterreich, Steffi Lu, Xinghua BMC Genomics Proceedings
BioMed Central 2012-01-17 /pmc/articles/PMC3439677/ /pubmed/22369349 http://dx.doi.org/10.1186/1471-2164-13-S1-S1 Text en Copyright ©2012 Osmanbeyoglu et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Improving ChIP-seq peak-calling for functional co-regulator binding by integrating multiple sources of biological information
topic Proceedings