Exploiting noise in array CGH data to improve detection of DNA copy number change

Developing effective methods for analyzing array-CGH data to detect chromosomal aberrations is very important for the diagnosis of pathogenesis of cancer and other diseases. Current analysis methods, being largely based on smoothing and/or segmentation, are not quite capable of detecting both the ab...


Bibliographic Details
Main Authors: Hu, Jing; Gao, Jian-Bo; Cao, Yinhe; Bottinger, Erwin; Zhang, Weijia
Format: Text
Language: English
Published: Oxford University Press 2007
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1994778/
https://www.ncbi.nlm.nih.gov/pubmed/17272296
http://dx.doi.org/10.1093/nar/gkl730
author Hu, Jing
Gao, Jian-Bo
Cao, Yinhe
Bottinger, Erwin
Zhang, Weijia
collection PubMed
description Developing effective methods for analyzing array-CGH data to detect chromosomal aberrations is very important for diagnosing the pathogenesis of cancer and other diseases. Current analysis methods, being largely based on smoothing and/or segmentation, are not quite capable of accurately detecting both the aberration regions and the boundary break points. Furthermore, when evaluating the accuracy of an algorithm for analyzing array-CGH data, it is commonly assumed that noise in the data follows a normal distribution. A fundamental question is whether noise in array-CGH data is indeed Gaussian and, if not, whether one can exploit the characteristics of the noise to develop novel analysis methods that accurately detect the aberration regions and the boundary break points simultaneously. By analyzing bacterial artificial chromosome (BAC) arrays with an average resolution of 1 Mb, 19 k oligo arrays with an average probe spacing of <100 kb and 385 k oligo arrays with an average probe spacing of about 6 kb, we show that when there are aberrations, noise in all three types of arrays is highly non-Gaussian and possesses long-range spatial correlations, and that such noise makes existing methods for detecting aberrations in array-CGH data perform worse than they do under Gaussian noise. We further develop a novel method that optimally exploits the character of the noise and can identify both aberration regions and boundary break points very accurately. Finally, we propose a new concept, the posteriori signal-to-noise ratio (p-SNR), to assign a confidence level to each detected aberration region and its boundaries.
format Text
id pubmed-1994778
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-1994778 2007-09-27 Nucleic Acids Res Methods Online Oxford University Press 2007-03 2007-02-01 /pmc/articles/PMC1994778/ /pubmed/17272296 http://dx.doi.org/10.1093/nar/gkl730 Text en © 2007 The Author(s). This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Exploiting noise in array CGH data to improve detection of DNA copy number change
topic Methods Online