
KRLMM: an adaptive genotype calling method for common and low frequency variants

BACKGROUND: SNP genotyping microarrays have revolutionized the study of complex disease. The current range of commercially available genotyping products contains extensive catalogues of low frequency and rare variants. Existing SNP calling algorithms have difficulty dealing with these low frequency v...


Bibliographic Details
Main Authors: Liu, Ruijie, Dai, Zhiyin, Yeager, Meredith, Irizarry, Rafael A, Ritchie, Matthew E
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4064501/
https://www.ncbi.nlm.nih.gov/pubmed/24886250
http://dx.doi.org/10.1186/1471-2105-15-158
_version_ 1782321952921223168
author Liu, Ruijie
Dai, Zhiyin
Yeager, Meredith
Irizarry, Rafael A
Ritchie, Matthew E
author_facet Liu, Ruijie
Dai, Zhiyin
Yeager, Meredith
Irizarry, Rafael A
Ritchie, Matthew E
author_sort Liu, Ruijie
collection PubMed
description BACKGROUND: SNP genotyping microarrays have revolutionized the study of complex disease. The current range of commercially available genotyping products contains extensive catalogues of low frequency and rare variants. Existing SNP calling algorithms have difficulty dealing with these low frequency variants, as the underlying models rely on each genotype having a reasonable number of observations to ensure accurate clustering. RESULTS: Here we develop KRLMM, a new method for converting raw intensities into genotype calls that aims to overcome this issue. Our method is unique in that it applies careful between-sample normalization and allows a variable number of clusters k (1, 2 or 3) for each SNP, where k is predicted using the available data. We compare our method to four genotyping algorithms (GenCall, GenoSNP, Illuminus and OptiCall) on several Illumina data sets that include samples from the HapMap project, where the true genotypes are known in advance. All methods were found to have high overall accuracy (> 98%), with KRLMM consistently amongst the best. At low minor allele frequency, the KRLMM, OptiCall and GenoSNP algorithms were observed to be consistently more accurate than GenCall and Illuminus on our test data. CONCLUSIONS: Methods that tailor their approach to calling low frequency variants by either varying the number of clusters (KRLMM) or using information from other SNPs (OptiCall and GenoSNP) offer improved accuracy over methods that do not (GenCall and Illuminus). The KRLMM algorithm is implemented in the open-source crlmm package distributed via the Bioconductor project (http://www.bioconductor.org).
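The description says that KRLMM allows 1, 2 or 3 clusters per SNP, with k predicted from the data, but it does not say how that prediction is made. The sketch below only illustrates the general idea of a variable cluster count. It is not the KRLMM predictor. It assumes each sample has a single per-SNP value (here called an M value, assumed to be a log-ratio of the two allele intensities) and uses a simple gap heuristic: the sorted values are split wherever neighbouring samples differ by more than a threshold. The function name `choose_k_and_call` and the `min_gap` parameter are hypothetical.

```python
def choose_k_and_call(m_values, min_gap=1.0, max_k=3):
    """Toy gap heuristic (NOT the KRLMM predictor) for picking k in 1..max_k.

    Returns (k, labels), where labels[i] is the 0-based cluster index of
    m_values[i], with clusters ordered from lowest to highest value.
    """
    # Sample indices ordered by their M value.
    order = sorted(range(len(m_values)), key=lambda i: m_values[i])
    # Gaps between consecutive sorted values, tagged with the split position.
    gaps = [
        (m_values[order[j + 1]] - m_values[order[j]], j)
        for j in range(len(order) - 1)
    ]
    # Keep at most max_k - 1 of the widest gaps that exceed the threshold.
    wide = sorted((g for g in gaps if g[0] > min_gap), reverse=True)[: max_k - 1]
    split_after = {j for _, j in wide}

    labels = [0] * len(m_values)
    cluster = 0
    for j, i in enumerate(order):
        labels[i] = cluster
        if j in split_after:
            cluster += 1
    return len(split_after) + 1, labels


# Made-up M values: a common SNP, a rare variant with one carrier,
# and a monomorphic SNP.
common = [-2.1, -1.9, 0.1, -0.1, 2.0, 2.2]
rare = [2.0, 2.1, 1.9, 2.2, 0.0]
monomorphic = [2.0, 2.1, 1.9]

k_common, labels_common = choose_k_and_call(common)            # k = 3
k_rare, labels_rare = choose_k_and_call(rare)                  # k = 2
k_mono, labels_mono = choose_k_and_call(monomorphic)           # k = 1
```

A model that always fits three clusters would have to place a cluster for the rare and monomorphic examples with one or zero observations. Letting k vary avoids that, which is the motivation the description gives for the method.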
format Online
Article
Text
id pubmed-4064501
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-40645012014-06-27 KRLMM: an adaptive genotype calling method for common and low frequency variants Liu, Ruijie Dai, Zhiyin Yeager, Meredith Irizarry, Rafael A Ritchie, Matthew E BMC Bioinformatics Methodology Article
BioMed Central 2014-05-23 /pmc/articles/PMC4064501/ /pubmed/24886250 http://dx.doi.org/10.1186/1471-2105-15-158 Text en Copyright © 2014 Liu et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Methodology Article
Liu, Ruijie
Dai, Zhiyin
Yeager, Meredith
Irizarry, Rafael A
Ritchie, Matthew E
KRLMM: an adaptive genotype calling method for common and low frequency variants
title KRLMM: an adaptive genotype calling method for common and low frequency variants
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4064501/
https://www.ncbi.nlm.nih.gov/pubmed/24886250
http://dx.doi.org/10.1186/1471-2105-15-158