
Analyzing the Korean reference genome with meta-imputation increased the imputation accuracy and spectrum of rare variants in the Korean population

Genotype imputation is essential for enhancing the power of association mapping and for discovering rare variants and indels that are missed by most genotyping arrays. Imputation analysis can be more accurate with a population-specific reference panel or a multi-ethnic reference panel with numerous samples. The...

Bibliographic Details
Main authors: Hwang, Mi Yeong; Choi, Nak-Hyeon; Won, Hong Hee; Kim, Bong-Jo; Kim, Young Jin
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9731225/
https://www.ncbi.nlm.nih.gov/pubmed/36506321
http://dx.doi.org/10.3389/fgene.2022.1008646
author Hwang, Mi Yeong
Choi, Nak-Hyeon
Won, Hong Hee
Kim, Bong-Jo
Kim, Young Jin
collection PubMed
description Genotype imputation is essential for enhancing the power of association mapping and for discovering rare variants and indels that are missed by most genotyping arrays. Imputation analysis can be more accurate with a population-specific reference panel or a multi-ethnic reference panel with numerous samples. The National Institute of Health, Republic of Korea, initiated the Korean Reference Genome (KRG) project to identify variants in whole-genome sequences of ∼20,000 Korean participants. In the pilot phase, we analyzed the data from 1,490 participants. The genetic characteristics and imputation performance of the KRG were compared with those of the 1000 Genomes Project Phase 3, GenomeAsia 100K Project, ChinaMAP, NARD, and TOPMed reference panels. For the comparison, genotype panels were artificially generated using whole-genome sequencing data from combinations of four different ancestries (Korean, Japanese, Chinese, and European) and two population-optimized microarrays (Korea Biobank Array [KBA] and UK Biobank Array [UKB]). The KRG reference panel performed best for the Korean population (R² = 0.78–0.84; 91.9% of variants with allele frequency >5% were well imputed), although the other reference panels comprised a larger number of samples with genetically different backgrounds. Across the multiple reference panels and multi-ethnic genotype panels compared, optimal imputation was obtained using reference panels from genetically related populations and a population-optimized microarray. Indeed, the KRG and TOPMed reference panels showed the best performance when applied to the KBA (R² = 0.84) and UKB (R² = 0.87) genotype panels, respectively. Using a meta-imputation approach to merge imputation results from different reference panels increased the imputation accuracy for rare variants (∼7%) and provided additional well-imputed variants (∼20%) with imputation accuracy comparable to that of the KRG. Our results demonstrate the importance of using a population-specific reference panel and meta-imputation to assess a substantial number of accurately imputed rare variants.
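The abstract reports imputation accuracy as R-squared (R²) and as a percentage of well-imputed variants. The following is a minimal sketch of how such figures could be computed, assuming R² means the squared Pearson correlation between true genotypes (0/1/2 allele counts) and imputed dosages, which is a common definition but is not stated in this record. The function names and the 0.8 cutoff for "well-imputed" are hypothetical and are not taken from the article.

```python
def dosage_r2(true_genotypes, imputed_dosages):
    """Squared Pearson correlation between true genotypes and imputed dosages.

    true_genotypes: allele counts (0, 1, or 2) per sample for one variant.
    imputed_dosages: expected allele counts (0.0 to 2.0) for the same samples.
    """
    n = len(true_genotypes)
    if n == 0 or n != len(imputed_dosages):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_t = sum(true_genotypes) / n
    mean_d = sum(imputed_dosages) / n
    cov = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(true_genotypes, imputed_dosages))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_d = sum((d - mean_d) ** 2 for d in imputed_dosages)
    if var_t == 0 or var_d == 0:
        # No variation in one input: correlation is undefined.
        # Returning 0.0 is an assumption made for this sketch.
        return 0.0
    return (cov * cov) / (var_t * var_d)


def fraction_well_imputed(r2_values, threshold=0.8):
    """Share of variants whose R² meets a cutoff (0.8 is a hypothetical value)."""
    if not r2_values:
        raise ValueError("r2_values must be non-empty")
    return sum(1 for r2 in r2_values if r2 >= threshold) / len(r2_values)


# Toy data for illustration only; not results from the study.
toy_variants = {
    "variant_a": ([0, 1, 2, 1, 0], [0.1, 0.9, 1.8, 1.2, 0.0]),
    "variant_b": ([0, 0, 1, 0, 2], [0.6, 0.1, 0.4, 0.9, 1.0]),
}
toy_r2 = {name: dosage_r2(g, d) for name, (g, d) in toy_variants.items()}
toy_share = fraction_well_imputed(list(toy_r2.values()))
```

In practice, per-variant R² values are usually averaged within allele-frequency bins, which is how a range such as the one quoted for the KRG panel would arise.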
format Online
Article
Text
id pubmed-9731225
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9731225 2022-12-09 Analyzing the Korean reference genome with meta-imputation increased the imputation accuracy and spectrum of rare variants in the Korean population. Hwang, Mi Yeong; Choi, Nak-Hyeon; Won, Hong Hee; Kim, Bong-Jo; Kim, Young Jin. Front Genet. Genetics. Frontiers Media S.A. 2022-11-24 /pmc/articles/PMC9731225/ /pubmed/36506321 http://dx.doi.org/10.3389/fgene.2022.1008646 Text en Copyright © 2022 Hwang, Choi, Won, Kim and Kim. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Analyzing the Korean reference genome with meta-imputation increased the imputation accuracy and spectrum of rare variants in the Korean population
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9731225/
https://www.ncbi.nlm.nih.gov/pubmed/36506321
http://dx.doi.org/10.3389/fgene.2022.1008646