Exploiting noise in array CGH data to improve detection of DNA copy number change
Main authors: | Hu, Jing; Gao, Jian-Bo; Cao, Yinhe; Bottinger, Erwin; Zhang, Weijia |
---|---|
Format: | Text |
Language: | English |
Published: | Oxford University Press, 2007 |
Subjects: | Methods Online |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1994778/ https://www.ncbi.nlm.nih.gov/pubmed/17272296 http://dx.doi.org/10.1093/nar/gkl730 |
_version_ | 1782135495535362048 |
---|---|
author | Hu, Jing Gao, Jian-Bo Cao, Yinhe Bottinger, Erwin Zhang, Weijia |
author_facet | Hu, Jing Gao, Jian-Bo Cao, Yinhe Bottinger, Erwin Zhang, Weijia |
author_sort | Hu, Jing |
collection | PubMed |
description | Developing effective methods for analyzing array-CGH data to detect chromosomal aberrations is very important for the diagnosis of pathogenesis of cancer and other diseases. Current analysis methods, being largely based on smoothing and/or segmentation, are not quite capable of detecting both the aberration regions and the boundary break points very accurately. Furthermore, when evaluating the accuracy of an algorithm for analyzing array-CGH data, it is commonly assumed that noise in the data follows a normal distribution. A fundamental question is whether noise in array-CGH is indeed Gaussian, and if not, can one exploit the characteristics of noise to develop novel analysis methods that are capable of accurately detecting the aberration regions as well as the boundary break points simultaneously? By analyzing bacterial artificial chromosome (BAC) arrays with an average resolution of 1 Mb, 19k oligo arrays with an average probe spacing of <100 kb and 385k oligo arrays with an average probe spacing of about 6 kb, we show that when there are aberrations, noise in all three types of arrays is highly non-Gaussian and possesses long-range spatial correlations, and that such noise leads to worse performance of existing methods for detecting aberrations in array-CGH than in the Gaussian noise case. We further develop a novel method, which optimally exploits the character of the noise and is capable of identifying both aberration regions and the boundary break points very accurately. Finally, we propose a new concept, posteriori signal-to-noise ratio (p-SNR), to assign a confidence level to each detected aberration region and its boundaries. |
format | Text |
id | pubmed-1994778 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2007 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-1994778 2007-09-27 Exploiting noise in array CGH data to improve detection of DNA copy number change Hu, Jing Gao, Jian-Bo Cao, Yinhe Bottinger, Erwin Zhang, Weijia Nucleic Acids Res Methods Online
Oxford University Press 2007-03 2007-02-01 /pmc/articles/PMC1994778/ /pubmed/17272296 http://dx.doi.org/10.1093/nar/gkl730 Text en © 2007 The Author(s). This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methods Online Hu, Jing Gao, Jian-Bo Cao, Yinhe Bottinger, Erwin Zhang, Weijia Exploiting noise in array CGH data to improve detection of DNA copy number change |
title | Exploiting noise in array CGH data to improve detection of DNA copy number change |
title_sort | exploiting noise in array cgh data to improve detection of dna copy number change |
topic | Methods Online |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1994778/ https://www.ncbi.nlm.nih.gov/pubmed/17272296 http://dx.doi.org/10.1093/nar/gkl730 |