A new model calling procedure for Illumina BeadArray data

BACKGROUND: Accurate genotype calling for high-throughput Illumina data is an important step in extracting more genetic information for large-scale genome-wide association studies. Many popular calling algorithms use mixture models to infer genotypes of a large number of single nucleotide polymorphis...

Bibliographic Details
Main author: Li, Gengxin
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4921002/
https://www.ncbi.nlm.nih.gov/pubmed/27343118
http://dx.doi.org/10.1186/s12863-016-0398-x
_version_ 1782439461373607936
author Li, Gengxin
author_facet Li, Gengxin
author_sort Li, Gengxin
collection PubMed
description BACKGROUND: Accurate genotype calling for high-throughput Illumina data is an important step in extracting more genetic information for large-scale genome-wide association studies. Many popular calling algorithms use mixture models to infer genotypes of a large number of single nucleotide polymorphisms (SNPs) in a fast and efficient way. In practice, mixture models are mostly restricted to inferring genotypes for common SNPs, whose minor allele frequencies are relatively large. However, it is still challenging to accurately genotype rare variants, especially those whose genotype boundaries are not clearly defined. RESULTS: To further improve call accuracy and genotype quality for rare variants, a new model calling procedure, named M-D, is proposed to infer genotypes for Illumina BeadArray data. In this calling procedure, a Gaussian Mixture Model and a Dirichlet Process Gaussian Mixture Model are integrated to infer genotypes. CONCLUSIONS: Applications to Illumina data illustrate that this new approach can improve calling performance compared to other popular genotyping algorithms.
format Online
Article
Text
id pubmed-4921002
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4921002 2016-06-26 A new model calling procedure for Illumina BeadArray data Li, Gengxin BMC Genet Methodology Article BACKGROUND: Accurate genotype calling for high-throughput Illumina data is an important step in extracting more genetic information for large-scale genome-wide association studies. Many popular calling algorithms use mixture models to infer genotypes of a large number of single nucleotide polymorphisms (SNPs) in a fast and efficient way. In practice, mixture models are mostly restricted to inferring genotypes for common SNPs, whose minor allele frequencies are relatively large. However, it is still challenging to accurately genotype rare variants, especially those whose genotype boundaries are not clearly defined. RESULTS: To further improve call accuracy and genotype quality for rare variants, a new model calling procedure, named M-D, is proposed to infer genotypes for Illumina BeadArray data. In this calling procedure, a Gaussian Mixture Model and a Dirichlet Process Gaussian Mixture Model are integrated to infer genotypes. CONCLUSIONS: Applications to Illumina data illustrate that this new approach can improve calling performance compared to other popular genotyping algorithms. BioMed Central 2016-06-24 /pmc/articles/PMC4921002/ /pubmed/27343118 http://dx.doi.org/10.1186/s12863-016-0398-x Text en © The Author(s) 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Methodology Article
Li, Gengxin
A new model calling procedure for Illumina BeadArray data
title A new model calling procedure for Illumina BeadArray data
title_full A new model calling procedure for Illumina BeadArray data
title_fullStr A new model calling procedure for Illumina BeadArray data
title_full_unstemmed A new model calling procedure for Illumina BeadArray data
title_short A new model calling procedure for Illumina BeadArray data
title_sort new model calling procedure for illumina beadarray data
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4921002/
https://www.ncbi.nlm.nih.gov/pubmed/27343118
http://dx.doi.org/10.1186/s12863-016-0398-x
work_keys_str_mv AT ligengxin anewmodelcallingprocedureforilluminabeadarraydata
AT ligengxin newmodelcallingprocedureforilluminabeadarraydata