Detection of copy number variation from array intensity and sequencing read depth using a stepwise Bayesian model

BACKGROUND: Copy number variants (CNVs) have been demonstrated to occur at a high frequency and are now widely believed to make a significant contribution to the phenotypic variation in human populations. Array-based comparative genomic hybridization (array-CGH) and newly developed read-depth approa...

Bibliographic Details
Main authors:	Zhang, Zhengdong D; Gerstein, Mark B
Format:	Text
Language:	English
Published:	BioMed Central 2010
Subjects:	Methodology Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2992546/ https://www.ncbi.nlm.nih.gov/pubmed/21034510 http://dx.doi.org/10.1186/1471-2105-11-539

author	Zhang, Zhengdong D; Gerstein, Mark B
collection	PubMed
description	BACKGROUND: Copy number variants (CNVs) have been shown to occur at high frequency and are now widely believed to contribute significantly to phenotypic variation in human populations. Array-based comparative genomic hybridization (array-CGH) and the newly developed read-depth approach, which relies on ultrahigh-throughput genomic sequencing, both provide rapid, robust, and comprehensive methods to identify CNVs on a whole-genome scale. RESULTS: We developed a Bayesian statistical analysis algorithm for the detection of CNVs from both types of genomic data. The algorithm can analyze data obtained from PCR-based bacterial artificial chromosome arrays, high-density oligonucleotide arrays, and more recently developed high-throughput DNA sequencing. Our approach treats the parameters that define the underlying data-generating process (e.g., the number of CNVs, the position of each CNV, and the data noise level) as random variables and derives the posterior distribution of the genomic CNV structure given the observed data. By sampling from the posterior distribution with a Markov chain Monte Carlo method, we obtain not only best estimates for these unknown parameters but also Bayesian credible intervals for the estimates. We illustrate the characteristics of our algorithm by applying it to both synthetic and experimental data sets and comparing it with other segmentation algorithms. CONCLUSIONS: In particular, the synthetic data comparison shows that our method is more sensitive than other approaches at low false positive rates. Furthermore, given its Bayesian origin, our method can also be seen as a technique to refine CNVs identified by fast point-estimate methods and as a framework to integrate array-CGH and sequencing data with other CNV-related biological knowledge, all through informative priors.
format	Text
id	pubmed-2992546
institution	National Center for Biotechnology Information
language	English
publishDate	2010
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-2992546 2010-12-20 Detection of copy number variation from array intensity and sequencing read depth using a stepwise Bayesian model. Zhang, Zhengdong D; Gerstein, Mark B. BMC Bioinformatics. Methodology Article. BioMed Central 2010-10-31 /pmc/articles/PMC2992546/ /pubmed/21034510 http://dx.doi.org/10.1186/1471-2105-11-539 Text en Copyright ©2010 Zhang and Gerstein; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Detection of copy number variation from array intensity and sequencing read depth using a stepwise Bayesian model
topic	Methodology Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2992546/ https://www.ncbi.nlm.nih.gov/pubmed/21034510 http://dx.doi.org/10.1186/1471-2105-11-539

Similar items