An integrated Bayesian analysis of LOH and copy number data

BACKGROUND: Cancer and other disorders are due to genomic lesions. SNP-microarrays are able to measure simultaneously both genotype and copy number (CN) at several Single Nucleotide Polymorphisms (SNPs) along the genome. CN is defined as the number of DNA copies, and the normal is two, since we have...

Bibliographic Details
Main authors: Rancoita, Paola MV, Hutter, Marcus, Bertoni, Francesco, Kwee, Ivo
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2912301/
https://www.ncbi.nlm.nih.gov/pubmed/20550648
http://dx.doi.org/10.1186/1471-2105-11-321
_version_ 1782184575962710016
author Rancoita, Paola MV
Hutter, Marcus
Bertoni, Francesco
Kwee, Ivo
author_facet Rancoita, Paola MV
Hutter, Marcus
Bertoni, Francesco
Kwee, Ivo
author_sort Rancoita, Paola MV
collection PubMed
description BACKGROUND: Cancer and other disorders are due to genomic lesions. SNP-microarrays are able to measure simultaneously both genotype and copy number (CN) at several Single Nucleotide Polymorphisms (SNPs) along the genome. CN is defined as the number of DNA copies, and the normal value is two, since we have two copies of each chromosome. The genotype of a SNP is the status given by the nucleotides (alleles) which are present on the two copies of DNA. It is defined as homozygous or heterozygous if the two alleles are the same or if they differ, respectively. Loss of heterozygosity (LOH) is the loss of the heterozygous status due to genomic events. Combining CN and LOH data, it is possible to better identify different types of genomic aberrations. For example, a long sequence of homozygous SNPs might be caused by either the physical loss of one copy or a uniparental disomy event (UPD), i.e. each SNP has two identical nucleotides both derived from only one parent. In this situation, the knowledge of the CN can help in distinguishing between these two events.
RESULTS: To better identify genomic aberrations, we propose a method (called gBPCR) which infers the type of aberration that occurred, taking into account all the possible influences of an altered CN level on the microarray detection of the homozygosity status of the SNPs. Namely, we model the distributions of the detected genotype, given a specific genomic alteration, and we estimate the parameters involved on public reference datasets. The estimation is performed similarly to the modified Bayesian Piecewise Constant Regression, but with improved estimators for the detection of the breakpoints. Using artificial and real data, we evaluate the quality of the estimation of gBPCR and we also show that it outperforms other well-known methods for LOH estimation.
CONCLUSIONS: We propose a method (gBPCR) for the estimation of both LOH and CN aberrations, improving their estimation by integrating both types of data and accounting for their relationships. Moreover, gBPCR performed very well in comparison with other methods for LOH estimation and the estimated CN lesions on real data have been validated with another technique.
format Text
id pubmed-2912301
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2912301 2010-07-30 An integrated Bayesian analysis of LOH and copy number data Rancoita, Paola MV Hutter, Marcus Bertoni, Francesco Kwee, Ivo BMC Bioinformatics Research Article BioMed Central 2010-06-15 /pmc/articles/PMC2912301/ /pubmed/20550648 http://dx.doi.org/10.1186/1471-2105-11-321 Text en Copyright ©2010 Rancoita et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Rancoita, Paola MV
Hutter, Marcus
Bertoni, Francesco
Kwee, Ivo
An integrated Bayesian analysis of LOH and copy number data
title An integrated Bayesian analysis of LOH and copy number data
title_full An integrated Bayesian analysis of LOH and copy number data
title_fullStr An integrated Bayesian analysis of LOH and copy number data
title_full_unstemmed An integrated Bayesian analysis of LOH and copy number data
title_short An integrated Bayesian analysis of LOH and copy number data
title_sort integrated bayesian analysis of loh and copy number data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2912301/
https://www.ncbi.nlm.nih.gov/pubmed/20550648
http://dx.doi.org/10.1186/1471-2105-11-321
work_keys_str_mv AT rancoitapaolamv anintegratedbayesiananalysisoflohandcopynumberdata
AT huttermarcus anintegratedbayesiananalysisoflohandcopynumberdata
AT bertonifrancesco anintegratedbayesiananalysisoflohandcopynumberdata
AT kweeivo anintegratedbayesiananalysisoflohandcopynumberdata
AT rancoitapaolamv integratedbayesiananalysisoflohandcopynumberdata
AT huttermarcus integratedbayesiananalysisoflohandcopynumberdata
AT bertonifrancesco integratedbayesiananalysisoflohandcopynumberdata
AT kweeivo integratedbayesiananalysisoflohandcopynumberdata