
A Simple and Fast Two-Locus Quality Control Test to Detect False Positives Due to Batch Effects in Genome-Wide Association Studies

The impact of erroneous genotypes having passed standard quality control (QC) can be severe in genome-wide association studies, genotype imputation, and estimation of heritability and prediction of genetic risk based on single nucleotide polymorphisms (SNP). To detect such genotyping errors, a simpl...

Bibliographic Details
Main Authors: Lee, Sang Hong; Nyholt, Dale R; Macgregor, Stuart; Henders, Anjali K; Zondervan, Krina T; Montgomery, Grant W; Visscher, Peter M
Format: Online Article Text
Language: English
Published: Wiley Subscription Services, Inc., A Wiley Company 2010
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3674525/
https://www.ncbi.nlm.nih.gov/pubmed/21104888
http://dx.doi.org/10.1002/gepi.20541
author Lee, Sang Hong
Nyholt, Dale R
Macgregor, Stuart
Henders, Anjali K
Zondervan, Krina T
Montgomery, Grant W
Visscher, Peter M
author_sort Lee, Sang Hong
collection PubMed
description The impact of erroneous genotypes having passed standard quality control (QC) can be severe in genome-wide association studies, genotype imputation, and estimation of heritability and prediction of genetic risk based on single nucleotide polymorphisms (SNP). To detect such genotyping errors, a simple two-locus QC method, based on the difference in test statistic of association between single SNPs and pairs of SNPs, was developed and applied. The proposed approach could detect many problematic SNPs with statistical significance even when standard single SNP QC analyses fail to detect them in real data. Depending on the data set used, the number of erroneous SNPs that were not filtered out by standard single SNP QC but detected by the proposed approach varied from a few hundred to thousands. Using simulated data, it was shown that the proposed method was powerful and performed better than other tested existing methods. The power of the proposed approach to detect erroneous genotypes was ∼80% for a 3% error rate per SNP. This novel QC approach is easy to implement and computationally efficient, and can lead to a better quality of genotypes for subsequent genotype-phenotype investigations. Genet. Epidemiol. 34:854–862, 2010. © 2010 Wiley-Liss, Inc.
format Online
Article
Text
id pubmed-3674525
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Wiley Subscription Services, Inc., A Wiley Company
record_format MEDLINE/PubMed
spelling pubmed-36745252013-06-06 A Simple and Fast Two-Locus Quality Control Test to Detect False Positives Due to Batch Effects in Genome-Wide Association Studies Lee, Sang Hong Nyholt, Dale R Macgregor, Stuart Henders, Anjali K Zondervan, Krina T Montgomery, Grant W Visscher, Peter M Genet Epidemiol Original Articles The impact of erroneous genotypes having passed standard quality control (QC) can be severe in genome-wide association studies, genotype imputation, and estimation of heritability and prediction of genetic risk based on single nucleotide polymorphisms (SNP). To detect such genotyping errors, a simple two-locus QC method, based on the difference in test statistic of association between single SNPs and pairs of SNPs, was developed and applied. The proposed approach could detect many problematic SNPs with statistical significance even when standard single SNP QC analyses fail to detect them in real data. Depending on the data set used, the number of erroneous SNPs that were not filtered out by standard single SNP QC but detected by the proposed approach varied from a few hundred to thousands. Using simulated data, it was shown that the proposed method was powerful and performed better than other tested existing methods. The power of the proposed approach to detect erroneous genotypes was ∼80% for a 3% error rate per SNP. This novel QC approach is easy to implement and computationally efficient, and can lead to a better quality of genotypes for subsequent genotype-phenotype investigations. Genet. Epidemiol. 34:854–862, 2010. © 2010 Wiley-Liss, Inc. Wiley Subscription Services, Inc., A Wiley Company 2010-12 2010-11-18 /pmc/articles/PMC3674525/ /pubmed/21104888 http://dx.doi.org/10.1002/gepi.20541 Text en © 2010 Wiley-Liss, Inc. http://creativecommons.org/licenses/by/2.5/ Re-use of this article is permitted in accordance with the Creative Commons Deed, Attribution 2.5, which does not permit commercial exploitation.
title A Simple and Fast Two-Locus Quality Control Test to Detect False Positives Due to Batch Effects in Genome-Wide Association Studies
topic Original Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3674525/
https://www.ncbi.nlm.nih.gov/pubmed/21104888
http://dx.doi.org/10.1002/gepi.20541