
BCRgt: a Bayesian cluster regression-based genotyping algorithm for the samples with copy number alterations

BACKGROUND: Accurate genotype calling is a pre-requisite of a successful Genome-Wide Association Study (GWAS). Although most genotyping algorithms can achieve an accuracy rate greater than 99% for genotyping DNA samples without copy number alterations (CNAs), almost all of these algorithms are not d...


Bibliographic Details
Main authors: Yang, Shengping, Cui, Xiangqin, Fang, Zhide
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4003822/
https://www.ncbi.nlm.nih.gov/pubmed/24629125
http://dx.doi.org/10.1186/1471-2105-15-74
author Yang, Shengping
Cui, Xiangqin
Fang, Zhide
author_facet Yang, Shengping
Cui, Xiangqin
Fang, Zhide
author_sort Yang, Shengping
collection PubMed
description BACKGROUND: Accurate genotype calling is a prerequisite for a successful Genome-Wide Association Study (GWAS). Although most genotyping algorithms can achieve an accuracy rate greater than 99% when genotyping DNA samples without copy number alterations (CNAs), almost none of these algorithms is designed for genotyping tumor samples, which are known to have large regions of CNAs. RESULTS: This study aims to develop a statistical method that can accurately genotype tumor samples with CNAs. The proposed method adds a Bayesian layer to a cluster regression model and is termed a Bayesian Cluster Regression-based genotyping algorithm (BCRgt). We demonstrate that, when CNAs do not exist, high concordance rates with HapMap calls can be achieved without using reference/training samples. By adding a training step, we obtained higher genotyping concordance rates without requiring large sample sizes. When CNAs exist in the samples, accuracy can be dramatically improved in regions with DNA copy loss and slightly improved in regions with copy number gain, compared with the Bayesian Robust Linear Model with Mahalanobis distance classifier (BRLMM). CONCLUSIONS: We have demonstrated that BCRgt can provide accurate genotyping calls for tumor samples with CNAs.
format Online
Article
Text
id pubmed-4003822
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4003822 2014-05-19 BCRgt: a Bayesian cluster regression-based genotyping algorithm for the samples with copy number alterations Yang, Shengping Cui, Xiangqin Fang, Zhide BMC Bioinformatics Research Article BioMed Central 2014-03-15 /pmc/articles/PMC4003822/ /pubmed/24629125 http://dx.doi.org/10.1186/1471-2105-15-74 Text en Copyright © 2014 Yang et al.; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.
spellingShingle Research Article
Yang, Shengping
Cui, Xiangqin
Fang, Zhide
BCRgt: a Bayesian cluster regression-based genotyping algorithm for the samples with copy number alterations
title BCRgt: a Bayesian cluster regression-based genotyping algorithm for the samples with copy number alterations
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4003822/
https://www.ncbi.nlm.nih.gov/pubmed/24629125
http://dx.doi.org/10.1186/1471-2105-15-74