Validation of genotype imputation in Southeast Asian populations and the effect of single nucleotide polymorphism annotation on imputation outcome

BACKGROUND: Imputation involves the inference of untyped single nucleotide polymorphisms (SNPs) in genome-wide association studies. The haplotypic reference of choice for imputation in Southeast Asian populations is unclear. Moreover, the influence of SNP annotation on imputation results has not bee...

Bibliographic Details
Main Authors: Lert-itthiporn, Worachart, Suktitipat, Bhoom, Grove, Harald, Sakuntabhai, Anavaj, Malasit, Prida, Tangthawornchaikul, Nattaya, Matsuda, Fumihiko, Suriyaphol, Prapat
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5812212/
https://www.ncbi.nlm.nih.gov/pubmed/29439659
http://dx.doi.org/10.1186/s12881-018-0534-8
_version_ 1783300004503355392
author Lert-itthiporn, Worachart
Suktitipat, Bhoom
Grove, Harald
Sakuntabhai, Anavaj
Malasit, Prida
Tangthawornchaikul, Nattaya
Matsuda, Fumihiko
Suriyaphol, Prapat
author_sort Lert-itthiporn, Worachart
collection PubMed
description BACKGROUND: Imputation involves the inference of untyped single nucleotide polymorphisms (SNPs) in genome-wide association studies. The haplotypic reference of choice for imputation in Southeast Asian populations is unclear. Moreover, the influence of SNP annotation on imputation results has not been examined. METHODS: This study was divided into two parts. In the first part, we applied imputation to genotyped SNPs from Southeast Asian populations in the Pan-Asian SNP database. Five percent of the total SNPs were removed. The remaining SNPs were used as input for imputation with IMPUTE2. The imputed outcomes were verified against the removed SNPs. We compared two imputation references: Chinese and Japanese haplotypes from HapMap phase II (HMII) and the complete set of haplotypes from the 1000 Genomes Project (1000G). The second part assessed imputation accuracy and yield in a Thai patient dataset. Half of the autosomal SNPs were removed to create Set 1. Another dataset, Set 2, was then created by removing the other half of the SNPs instead. Both Set 1 and Set 2 were imputed with HMII to create a complete imputed SNP dataset. This dataset was used to validate association testing, SNP annotation and imputation outcome. RESULTS: Accuracy was highest for all populations when using the HMII reference, but at the cost of a lower yield. Thai genotypes showed higher accuracy than those of the other populations with both the HMII and 1000G panels, although accuracy and yield varied across chromosomes. Imputation was tested in a clinical dataset to compare accuracy in gene-related regions, and coding regions were found to have a higher accuracy and yield. CONCLUSIONS: This work provides the first evidence on imputation reference selection for Southeast Asian studies and highlights the effects of SNP location relative to genes on imputation outcome. Researchers will need to consider the trade-off between accuracy and yield in future imputation studies.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12881-018-0534-8) contains supplementary material, which is available to authorized users.
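The mask-and-verify design described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The abstract does not define accuracy or yield, so the definitions below are assumptions: yield is taken as the fraction of masked genotypes whose best posterior probability reaches a calling threshold, and accuracy as the fraction of those called genotypes that match the removed (true) genotype. The function names, the 0.9 threshold, and the toy data are all hypothetical.

```python
import random


def mask_snps(snp_ids, fraction, seed=0):
    """Split SNP ids into (kept, masked); `masked` holds about `fraction` of them."""
    rng = random.Random(seed)
    shuffled = list(snp_ids)
    rng.shuffle(shuffled)
    n_masked = round(len(shuffled) * fraction)
    masked = set(shuffled[:n_masked])
    kept = [s for s in snp_ids if s not in masked]
    return kept, masked


def call_genotype(probs, threshold=0.9):
    """Return the most probable genotype (0, 1 or 2), or None if below threshold."""
    best = max(range(3), key=lambda g: probs[g])
    return best if probs[best] >= threshold else None


def accuracy_and_yield(truth, imputed_probs, threshold=0.9):
    """Compare imputed genotype probabilities with the masked true genotypes.

    Assumed definitions (not taken from the source):
      yield    = called genotypes / all masked genotypes
      accuracy = correctly called genotypes / called genotypes
    """
    called = correct = 0
    for key, true_gt in truth.items():
        call = call_genotype(imputed_probs[key], threshold)
        if call is None:
            continue  # no confident call: lowers yield, does not affect accuracy
        called += 1
        correct += call == true_gt
    yield_ = called / len(truth) if truth else 0.0
    accuracy = correct / called if called else float("nan")
    return accuracy, yield_


# Part 1 style masking: remove 5% of SNPs before imputation.
snp_ids = [f"rs{i}" for i in range(100)]
kept, masked = mask_snps(snp_ids, fraction=0.05)

# Part 2 style masking: two complementary halves (Set 1 keeps `half_a`,
# Set 2 keeps `half_b`).
half_a, half_b_set = mask_snps(snp_ids, fraction=0.5)

# Toy truth and posterior probabilities keyed by (sample, SNP).
truth = {("s1", "rs1"): 0, ("s1", "rs2"): 1, ("s2", "rs1"): 2, ("s2", "rs2"): 1}
imputed = {
    ("s1", "rs1"): (0.97, 0.02, 0.01),  # confident and correct
    ("s1", "rs2"): (0.05, 0.93, 0.02),  # confident and correct
    ("s2", "rs1"): (0.10, 0.50, 0.40),  # below threshold: not called
    ("s2", "rs2"): (0.92, 0.06, 0.02),  # confident but wrong
}
accuracy, yield_ = accuracy_and_yield(truth, imputed)
```

Raising the threshold in this sketch tends to increase accuracy while lowering yield, which is one way the trade-off mentioned in the conclusions can arise.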
format Online
Article
Text
id pubmed-5812212
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5812212 2018-02-15 Validation of genotype imputation in Southeast Asian populations and the effect of single nucleotide polymorphism annotation on imputation outcome Lert-itthiporn, Worachart Suktitipat, Bhoom Grove, Harald Sakuntabhai, Anavaj Malasit, Prida Tangthawornchaikul, Nattaya Matsuda, Fumihiko Suriyaphol, Prapat BMC Med Genet Research Article
BioMed Central 2018-02-13 /pmc/articles/PMC5812212/ /pubmed/29439659 http://dx.doi.org/10.1186/s12881-018-0534-8 Text en © The Author(s). 2018 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Validation of genotype imputation in Southeast Asian populations and the effect of single nucleotide polymorphism annotation on imputation outcome
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5812212/
https://www.ncbi.nlm.nih.gov/pubmed/29439659
http://dx.doi.org/10.1186/s12881-018-0534-8