HAT: Hypergeometric Analysis of Tiling-arrays with application to promoter-GeneChip data

BACKGROUND: Tiling-arrays are applicable to multiple types of biological research questions. Because the technology is highly sensitive, offers high resolution, and is unbiased, it is often employed in genome-wide investigations. A major challenge in the analysis of tiling-array data is to define regions-of-...

Bibliographic Details
Main Authors: Taskesen, Erdogan; Beekman, Renee; de Ridder, Jeroen; Wouters, Bas J; Peeters, Justine K; Touw, Ivo P; Reinders, Marcel JT; Delwel, Ruud
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2892465/
https://www.ncbi.nlm.nih.gov/pubmed/20492700
http://dx.doi.org/10.1186/1471-2105-11-275
author Taskesen, Erdogan
Beekman, Renee
de Ridder, Jeroen
Wouters, Bas J
Peeters, Justine K
Touw, Ivo P
Reinders, Marcel JT
Delwel, Ruud
collection PubMed
description BACKGROUND: Tiling-arrays are applicable to multiple types of biological research questions. Because the technology is highly sensitive, offers high resolution, and is unbiased, it is often employed in genome-wide investigations. A major challenge in the analysis of tiling-array data is to define regions-of-interest, i.e., contiguous probes with increased signal intensity (as a result of hybridization of labeled DNA) in a region. Currently, no standard criteria are available to define these regions-of-interest: there is no single probe-intensity cut-off level, and different regions-of-interest can contain different numbers of probes and vary in genomic width. Furthermore, the chromosomal distance between neighboring probes can vary across the genome and among different arrays.
RESULTS: We have developed Hypergeometric Analysis of Tiling-arrays (HAT) and first evaluated its performance on tiling-array datasets from a chromatin immunoprecipitation on chip (ChIP-on-chip) study for the identification of genome-wide DNA binding profiles of the transcription factor Cebpa (used for method comparison). Using this assay, we show that HAT refines the detection of regions-of-interest: regions detected by HAT are more highly enriched for expected motifs than those detected by an alternative method (MAT). Subsequently, data from a retroviral insertional mutagenesis screen were used to examine the performance of HAT across different applications of tiling-arrays. In both studies, detected regions-of-interest were validated with (q)PCR.
CONCLUSIONS: We demonstrate that HAT has increased specificity for the analysis of tiling-array data in comparison with the alternative method, and that it accurately detects regions-of-interest in two different applications of tiling-arrays. HAT has several advantages over previous methods: i) as there is no single cut-off level for probe intensity, HAT can detect regions-of-interest at various thresholds; ii) it can detect regions-of-interest of any size; iii) it is independent of probe resolution across the genome and across tiling-array platforms; and iv) it employs a single user-defined parameter: the significance level. Regions-of-interest are detected by computing the hypergeometric probability while controlling the family-wise error. Furthermore, the method does not require experimental replicates, common regions-of-interest are indicated, a sequence-of-interest can be examined for every detected region-of-interest, and flanking genes can be reported.
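The abstract's description of detection (a hypergeometric probability per region, with family-wise error control) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single fixed intensity threshold, a single fixed window size, and a Bonferroni correction as the family-wise error control, whereas the record states that HAT handles various thresholds and region sizes. The function names `hypergeom_sf` and `enriched_windows` and all parameters are hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """Upper tail P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: total probes, K: probes above threshold on the whole array,
    n: probes in the window, k: probes above threshold in the window.
    """
    upper = min(K, n)
    total = comb(N, n)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible outcomes contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def enriched_windows(intensities, threshold, window, alpha=0.05):
    """Return (start, end, p) for windows enriched in above-threshold probes.

    Assumption: Bonferroni correction over all tested windows stands in
    for the family-wise error control mentioned in the abstract.
    """
    above = [x > threshold for x in intensities]
    N, K = len(above), sum(above)
    n_tests = N - window + 1
    hits = []
    for start in range(n_tests):
        k = sum(above[start:start + window])
        p = hypergeom_sf(k, N, K, window)
        if p <= alpha / n_tests:
            hits.append((start, start + window, p))
    return hits


if __name__ == "__main__":
    # Synthetic example data: 100 probes with one block of 5 high-intensity probes.
    signal = [0.0] * 50 + [5.0] * 5 + [0.0] * 45
    for start, end, p in enriched_windows(signal, threshold=1.0, window=5):
        print(start, end, p)
```

The only tuning knob left to the user in this sketch is `alpha`, mirroring the single significance-level parameter named in the abstract; the threshold and window size are fixed here purely to keep the example short.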
format Text
id pubmed-2892465
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2892465 2010-06-26 HAT: Hypergeometric Analysis of Tiling-arrays with application to promoter-GeneChip data Taskesen, Erdogan Beekman, Renee de Ridder, Jeroen Wouters, Bas J Peeters, Justine K Touw, Ivo P Reinders, Marcel JT Delwel, Ruud BMC Bioinformatics Methodology article BioMed Central 2010-05-21 /pmc/articles/PMC2892465/ /pubmed/20492700 http://dx.doi.org/10.1186/1471-2105-11-275 Text en Copyright ©2010 Taskesen et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title HAT: Hypergeometric Analysis of Tiling-arrays with application to promoter-GeneChip data
topic Methodology article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2892465/
https://www.ncbi.nlm.nih.gov/pubmed/20492700
http://dx.doi.org/10.1186/1471-2105-11-275