Assessment of Genotype Imputation Performance Using 1000 Genomes in African American Studies

Genotype imputation, used in genome-wide association studies to expand coverage of single nucleotide polymorphisms (SNPs), has performed poorly in African Americans compared to less admixed populations. Overall, imputation has typically relied on HapMap reference haplotype panels from Africans (YRI)...

Bibliographic Details
Main Authors: Hancock, Dana B., Levy, Joshua L., Gaddis, Nathan C., Bierut, Laura J., Saccone, Nancy L., Page, Grier P., Johnson, Eric O.
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3511547/
https://www.ncbi.nlm.nih.gov/pubmed/23226329
http://dx.doi.org/10.1371/journal.pone.0050610
_version_ 1782251634084020224
author Hancock, Dana B.
Levy, Joshua L.
Gaddis, Nathan C.
Bierut, Laura J.
Saccone, Nancy L.
Page, Grier P.
Johnson, Eric O.
author_sort Hancock, Dana B.
collection PubMed
description Genotype imputation, used in genome-wide association studies to expand coverage of single nucleotide polymorphisms (SNPs), has performed poorly in African Americans compared to less admixed populations. Overall, imputation has typically relied on HapMap reference haplotype panels from Africans (YRI), European Americans (CEU), and Asians (CHB/JPT). The 1000 Genomes project offers a wider range of reference populations, such as African Americans (ASW), but their imputation performance has had limited evaluation. Using 595 African Americans genotyped on Illumina’s HumanHap550v3 BeadChip, we compared imputation results from four software programs (IMPUTE2, BEAGLE, MaCH, and MaCH-Admix) and three reference panels consisting of different combinations of 1000 Genomes populations (February 2012 release): (1) 3 specifically selected populations (YRI, CEU, and ASW); (2) 8 populations of diverse African (AFR) or European (EUR) descent; and (3) all 14 available populations (ALL). Based on chromosome 22, we calculated three performance metrics: (1) concordance (percentage of masked genotyped SNPs with imputed and true genotype agreement); (2) imputation quality score (IQS; concordance adjusted for chance agreement, which is particularly informative for low minor allele frequency [MAF] SNPs); and (3) average r2hat (estimated correlation between the imputed and true genotypes, for all imputed SNPs). Across the reference panels, IMPUTE2 and MaCH had the highest concordance (91%–93%), but IMPUTE2 had the highest IQS (81%–83%) and average r2hat (0.68 using YRI+ASW+CEU, 0.62 using AFR+EUR, and 0.55 using ALL). Imputation quality for most programs was reduced by the addition of more distantly related reference populations, due entirely to the introduction of low frequency SNPs (MAF≤2%) that are monomorphic in the more closely related panels. While imputation was optimized by using IMPUTE2 with reference to the ALL panel (average r2hat = 0.86 for SNPs with MAF>2%), use of the ALL panel for African American studies requires careful interpretation of the population specificity and imputation quality of low frequency SNPs.
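
The abstract describes IQS only as concordance adjusted for chance agreement. The sketch below illustrates why such an adjustment matters for low-MAF SNPs. It is not the authors' code: the kappa-style form (Po - Pc) / (1 - Pc), the 0/1/2 minor-allele-count genotype coding, and the function names `concordance` and `iqs` are all assumptions made for illustration.

```python
from collections import Counter


def concordance(true_gt, imputed_gt):
    """Fraction of masked genotypes whose imputed call equals the true call."""
    if len(true_gt) != len(imputed_gt) or len(true_gt) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(t == i for t, i in zip(true_gt, imputed_gt))
    return matches / len(true_gt)


def iqs(true_gt, imputed_gt):
    """Chance-corrected agreement, assumed here to be (Po - Pc) / (1 - Pc)."""
    n = len(true_gt)
    po = concordance(true_gt, imputed_gt)  # observed agreement
    true_counts = Counter(true_gt)
    imputed_counts = Counter(imputed_gt)
    # Agreement expected by chance, from the marginal genotype counts.
    pc = sum(true_counts[g] * imputed_counts[g] for g in true_counts) / (n * n)
    if pc == 1.0:
        return float("nan")  # undefined when both call sets are monomorphic
    return (po - pc) / (1.0 - pc)


# Hypothetical rare SNP: 2 heterozygotes among 100 samples, and an
# imputation that simply calls every sample homozygous for the major allele.
true_calls = [0] * 98 + [1] * 2
imputed_calls = [0] * 100

rare_concordance = concordance(true_calls, imputed_calls)
rare_iqs = iqs(true_calls, imputed_calls)
print(rare_concordance, rare_iqs)
```

In this made-up example the concordance is high even though no minor allele was recovered, while the chance-corrected score falls to zero, which is the behaviour the abstract points to when it calls IQS particularly informative for low-MAF SNPs.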
format Online
Article
Text
id pubmed-3511547
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3511547 2012-12-05 Assessment of Genotype Imputation Performance Using 1000 Genomes in African American Studies Hancock, Dana B. Levy, Joshua L. Gaddis, Nathan C. Bierut, Laura J. Saccone, Nancy L. Page, Grier P. Johnson, Eric O. PLoS One Research Article Public Library of Science 2012-11-30 /pmc/articles/PMC3511547/ /pubmed/23226329 http://dx.doi.org/10.1371/journal.pone.0050610 Text en © 2012 Hancock et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Assessment of Genotype Imputation Performance Using 1000 Genomes in African American Studies
topic Research Article