HAT: Hypergeometric Analysis of Tiling-arrays with application to promoter-GeneChip data
BACKGROUND: Tiling-arrays are applicable to multiple types of biological research questions. Because of its advantages (high sensitivity, high resolution, and lack of bias), the technology is often employed in genome-wide investigations. A major challenge in the analysis of tiling-array data is to define regions-of-...
Main authors: | Taskesen, Erdogan; Beekman, Renee; de Ridder, Jeroen; Wouters, Bas J; Peeters, Justine K; Touw, Ivo P; Reinders, Marcel JT; Delwel, Ruud |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central, 2010 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2892465/ https://www.ncbi.nlm.nih.gov/pubmed/20492700 http://dx.doi.org/10.1186/1471-2105-11-275 |
_version_ | 1782182950555615232 |
---|---|
author | Taskesen, Erdogan Beekman, Renee de Ridder, Jeroen Wouters, Bas J Peeters, Justine K Touw, Ivo P Reinders, Marcel JT Delwel, Ruud |
author_sort | Taskesen, Erdogan |
collection | PubMed |
description | BACKGROUND: Tiling-arrays are applicable to multiple types of biological research questions. Because of its advantages (high sensitivity, high resolution, and lack of bias), the technology is often employed in genome-wide investigations. A major challenge in the analysis of tiling-array data is to define regions-of-interest, i.e., contiguous probes with increased signal intensity (as a result of hybridization of labeled DNA) in a region. Currently, no standard criteria are available to define these regions-of-interest: there is no single probe-intensity cut-off level, and regions-of-interest can contain different numbers of probes and vary in genomic width. Furthermore, the chromosomal distance between neighboring probes can vary across the genome and among different arrays. RESULTS: We have developed Hypergeometric Analysis of Tiling-arrays (HAT), and first evaluated its performance for tiling-array datasets from a Chromatin Immunoprecipitation study on chip (ChIP-on-chip) for the identification of genome-wide DNA binding profiles of transcription factor Cebpa (used for method comparison). Using this assay, we show that regions detected by HAT are more highly enriched for expected motifs than regions detected by an alternative method (MAT). Subsequently, data from a retroviral insertional mutagenesis screen were used to examine the performance of HAT across different applications of tiling-array datasets. In both studies, detected regions-of-interest have been validated with (q)PCR. CONCLUSIONS: We demonstrate that HAT has increased specificity for analysis of tiling-array data in comparison with the alternative method, and that it accurately detects regions-of-interest in two different applications of tiling-arrays.
HAT has several advantages over previous methods: i) as there is no single cut-off level for probe intensity, HAT can detect regions-of-interest at various thresholds, ii) it can detect regions-of-interest of any size, iii) it is independent of probe resolution across the genome and across tiling-array platforms, and iv) it employs a single user-defined parameter: the significance level. Regions-of-interest are detected by computing the hypergeometric probability while controlling the family-wise error. Furthermore, the method does not require experimental replicates, common regions-of-interest are indicated, a sequence-of-interest can be examined for every detected region-of-interest, and flanking genes can be reported. |
format | Text |
id | pubmed-2892465 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2010 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-2892465 2010-06-26 HAT: Hypergeometric Analysis of Tiling-arrays with application to promoter-GeneChip data Taskesen, Erdogan Beekman, Renee de Ridder, Jeroen Wouters, Bas J Peeters, Justine K Touw, Ivo P Reinders, Marcel JT Delwel, Ruud BMC Bioinformatics Methodology article BioMed Central 2010-05-21 /pmc/articles/PMC2892465/ /pubmed/20492700 http://dx.doi.org/10.1186/1471-2105-11-275 Text en Copyright ©2010 Taskesen et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | HAT: Hypergeometric Analysis of Tiling-arrays with application to promoter-GeneChip data |
topic | Methodology article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2892465/ https://www.ncbi.nlm.nih.gov/pubmed/20492700 http://dx.doi.org/10.1186/1471-2105-11-275 |
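
The description in this record says that regions-of-interest are detected by computing a hypergeometric probability while controlling the family-wise error. As a minimal sketch only, and not the authors' implementation, the Python below shows how an upper-tail hypergeometric probability could be computed for one window of probes. The function names, the window framing, and the Bonferroni correction are all assumptions made for illustration; the record does not state which correction HAT uses.

```python
from math import comb


def hypergeom_tail(total_probes, probes_above, window_size, hits_in_window):
    """Upper-tail probability P(X >= hits_in_window).

    X counts probes above the intensity threshold in a window of
    `window_size` probes drawn without replacement from `total_probes`
    probes, of which `probes_above` exceed the threshold.
    (Hypothetical framing; names are not taken from HAT.)
    """
    upper = min(window_size, probes_above)
    denominator = comb(total_probes, window_size)
    numerator = sum(
        comb(probes_above, i)
        * comb(total_probes - probes_above, window_size - i)
        for i in range(hits_in_window, upper + 1)
    )
    return numerator / denominator


def is_significant(p_value, n_tests, alpha=0.05):
    """Bonferroni-style family-wise error control.

    Assumption: Bonferroni is used here only as a simple, well-known
    way to bound the family-wise error across `n_tests` windows.
    """
    adjusted = min(1.0, p_value * n_tests)
    return adjusted <= alpha


if __name__ == "__main__":
    # Toy numbers: 10 probes, 3 above threshold, window of 4 with >= 2 hits.
    p = hypergeom_tail(10, 3, 4, 2)
    print(p, is_significant(p, n_tests=5))
```

With these toy numbers the tail probability is 70/210 = 1/3, so the window would not be called significant at the default level. The only user-facing parameter in this sketch is `alpha`, which mirrors the single significance-level parameter mentioned in the description.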