Linkage Disequilibrium-Based Quality Control for Large-Scale Genetic Studies

Quality control (QC) is a critical step in large-scale studies of genetic variation. While, on average, high-throughput single nucleotide polymorphism (SNP) genotyping assays are now very accurate, the errors that remain tend to cluster into a small percentage of “problem” SNPs, which exhibit unusua...

Bibliographic Details
Main authors: Scheet, Paul; Stephens, Matthew
Format: Text
Language: English
Published: Public Library of Science 2008
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2475504/
https://www.ncbi.nlm.nih.gov/pubmed/18670630
http://dx.doi.org/10.1371/journal.pgen.1000147
collection PubMed
description Quality control (QC) is a critical step in large-scale studies of genetic variation. While, on average, high-throughput single nucleotide polymorphism (SNP) genotyping assays are now very accurate, the errors that remain tend to cluster into a small percentage of “problem” SNPs, which exhibit unusually high error rates. Because most large-scale studies of genetic variation are searching for phenomena that are rare (e.g., SNPs associated with a phenotype), even this small percentage of problem SNPs can cause important practical problems. Here we describe and illustrate how patterns of linkage disequilibrium (LD) can be used to improve QC in large-scale, population-based studies. This approach has the advantage over existing filters (e.g., HWE or call rate) that it can actually reduce genotyping error rates by automatically correcting some genotyping errors. Applying this LD-based QC procedure to data from The International HapMap Project, we identify over 1,500 SNPs that likely have high error rates in the CHB and JPT samples and estimate corrected genotypes. Our method is implemented in the software package fastPHASE, available from the Stephens Lab website (http://stephenslab.uchicago.edu/software.html).
format Text
id pubmed-2475504
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2475504 2008-08-01 Linkage Disequilibrium-Based Quality Control for Large-Scale Genetic Studies Scheet, Paul Stephens, Matthew PLoS Genet Research Article Public Library of Science 2008-08-01 /pmc/articles/PMC2475504/ /pubmed/18670630 http://dx.doi.org/10.1371/journal.pgen.1000147 Text en Scheet, Stephens. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
topic Research Article