Analyses and Comparison of Accuracy of Different Genotype Imputation Methods

The power of genetic association analyses is often compromised by missing genotypic data which contributes to lack of significant findings, e.g., in in silico replication studies. One solution is to impute untyped SNPs from typed flanking markers, based on known linkage disequilibrium (LD) relations...

Bibliographic Details
Main Authors: Pei, Yu-Fang, Li, Jian, Zhang, Lei, Papasian, Christopher J., Deng, Hong-Wen
Format: Text
Language: English
Published: Public Library of Science 2008
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2569208/
https://www.ncbi.nlm.nih.gov/pubmed/18958166
http://dx.doi.org/10.1371/journal.pone.0003551
_version_ 1782160081108860928
author Pei, Yu-Fang
Li, Jian
Zhang, Lei
Papasian, Christopher J.
Deng, Hong-Wen
author_sort Pei, Yu-Fang
collection PubMed
description The power of genetic association analyses is often compromised by missing genotypic data, which contributes to a lack of significant findings, e.g., in in silico replication studies. One solution is to impute untyped SNPs from typed flanking markers, based on known linkage disequilibrium (LD) relationships. Several imputation methods are available and their usefulness in association studies has been demonstrated, but factors affecting their relative performance in accuracy have not been systematically investigated. Therefore, we investigated and compared the performance of five popular genotype imputation methods, MACH, IMPUTE, fastPHASE, PLINK and Beagle, to assess and compare the effects of factors that affect imputation accuracy rates (ARs). Our results showed that a stronger LD and a lower MAF for an untyped marker produced better ARs for all five methods. We also observed that a greater number of haplotypes in the reference sample resulted in higher ARs for MACH, IMPUTE, PLINK and Beagle, but had little influence on the ARs for fastPHASE. In general, MACH and IMPUTE produced similar results, and these two methods consistently outperformed fastPHASE, PLINK and Beagle. Our study is helpful in guiding the application of imputation methods in association analyses when genotype data are missing.
format Text
id pubmed-2569208
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2569208 2008-10-29 Analyses and Comparison of Accuracy of Different Genotype Imputation Methods Pei, Yu-Fang Li, Jian Zhang, Lei Papasian, Christopher J. Deng, Hong-Wen PLoS One Research Article The power of genetic association analyses is often compromised by missing genotypic data which contributes to lack of significant findings, e.g., in in silico replication studies. One solution is to impute untyped SNPs from typed flanking markers, based on known linkage disequilibrium (LD) relationships. Several imputation methods are available and their usefulness in association studies has been demonstrated, but factors affecting their relative performance in accuracy have not been systematically investigated. Therefore, we investigated and compared the performance of five popular genotype imputation methods, MACH, IMPUTE, fastPHASE, PLINK and Beagle, to assess and compare the effects of factors that affect imputation accuracy rates (ARs). Our results showed that a stronger LD and a lower MAF for an untyped marker produced better ARs for all the five methods. We also observed that a greater number of haplotypes in the reference sample resulted in higher ARs for MACH, IMPUTE, PLINK and Beagle, but had little influence on the ARs for fastPHASE. In general, MACH and IMPUTE produced similar results and these two methods consistently outperformed fastPHASE, PLINK and Beagle. Our study is helpful in guiding application of imputation methods in association analyses when genotype data are missing. Public Library of Science 2008-10-29 /pmc/articles/PMC2569208/ /pubmed/18958166 http://dx.doi.org/10.1371/journal.pone.0003551 Text en Pei et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Analyses and Comparison of Accuracy of Different Genotype Imputation Methods
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2569208/
https://www.ncbi.nlm.nih.gov/pubmed/18958166
http://dx.doi.org/10.1371/journal.pone.0003551