Comparison and assessment of family- and population-based genotype imputation methods in large pedigrees

Bibliographic Details
Main Authors: Ullah, Ehsan; Mall, Raghvendra; Abbas, Mostafa M.; Kunji, Khalid; Nato, Alejandro Q.; Bensmail, Halima; Wijsman, Ellen M.; Saad, Mohamad
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6314157/
https://www.ncbi.nlm.nih.gov/pubmed/30514702
http://dx.doi.org/10.1101/gr.236315.118
_version_ 1783384076517900288
author Ullah, Ehsan
Mall, Raghvendra
Abbas, Mostafa M.
Kunji, Khalid
Nato, Alejandro Q.
Bensmail, Halima
Wijsman, Ellen M.
Saad, Mohamad
author_sort Ullah, Ehsan
collection PubMed
description Genotype imputation is widely used in genome-wide association studies to boost variant density, allowing increased power in association testing. Many studies currently include pedigree data due to increasing interest in rare variants coupled with the availability of appropriate analysis tools. The performance of population-based (subjects are unrelated) imputation methods is well established. However, the performance of family- and population-based imputation methods on family data has been subject to much less scrutiny. Here, we extensively compare several family- and population-based imputation methods on family data of large pedigrees with both European and African ancestry. Our comparison includes many widely used family- and population-based tools and another method, Ped_Pop, which combines family- and population-based imputation results. We also compare four subject selection strategies for full sequencing to serve as the reference panel for imputation: GIGI-Pick, ExomePicks, PRIMUS, and random selection. Moreover, we compare two imputation accuracy metrics: the Imputation Quality Score and Pearson's correlation R² for predicting power of association analysis using imputation results. Our results show that (1) GIGI outperforms Merlin; (2) family-based imputation outperforms population-based imputation for rare variants but not for common ones; (3) combining family- and population-based imputation outperforms all imputation approaches for all minor allele frequencies; (4) GIGI-Pick gives the best selection strategy based on the R² criterion; and (5) R² is the best measure of imputation accuracy. Our study is the first to extensively evaluate the imputation performance of many available family- and population-based tools on the same family data and provides guidelines for future studies.
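The two accuracy metrics named in the description can be illustrated with a minimal Python sketch. It rests on assumptions that this record does not confirm: the squared Pearson correlation is taken between true genotypes (coded 0/1/2) and imputed dosages, and the Imputation Quality Score is taken in its commonly cited chance-corrected (kappa-style) form, (observed agreement - chance agreement) / (1 - chance agreement). The function names are hypothetical and are not taken from any of the tools listed.

```python
import math
from collections import Counter


def pearson_r2(true_genotypes, imputed_dosages):
    """Squared Pearson correlation between true genotypes (0/1/2) and imputed dosages."""
    n = len(true_genotypes)
    if n == 0 or n != len(imputed_dosages):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_t = sum(true_genotypes) / n
    mean_d = sum(imputed_dosages) / n
    cov = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(true_genotypes, imputed_dosages))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_d = sum((d - mean_d) ** 2 for d in imputed_dosages)
    if var_t == 0 or var_d == 0:
        return math.nan  # correlation is undefined for a constant vector
    return cov * cov / (var_t * var_d)


def imputation_quality_score(true_genotypes, imputed_genotypes):
    """Chance-corrected concordance between true and best-guess genotypes (assumed form)."""
    n = len(true_genotypes)
    if n == 0 or n != len(imputed_genotypes):
        raise ValueError("inputs must be non-empty and of equal length")
    # Observed proportion of subjects whose imputed genotype matches the truth.
    observed = sum(t == g for t, g in zip(true_genotypes, imputed_genotypes)) / n
    # Agreement expected by chance, from the marginal genotype counts.
    true_counts = Counter(true_genotypes)
    imputed_counts = Counter(imputed_genotypes)
    chance = sum(true_counts[k] * imputed_counts[k] for k in (0, 1, 2)) / (n * n)
    if chance == 1:
        return math.nan  # both vectors are the same constant genotype
    return (observed - chance) / (1 - chance)
```

Under these assumptions, a rare variant imputed as all-reference can reach 75% raw concordance (three of four subjects correct) while the chance-corrected score is 0, which is one reason raw concordance is usually avoided for rare variants.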
format Online
Article
Text
id pubmed-6314157
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-6314157 2019-01-11 Genome Res Resource Cold Spring Harbor Laboratory Press 2019-01 /pmc/articles/PMC6314157/ /pubmed/30514702 http://dx.doi.org/10.1101/gr.236315.118 Text en © 2019 Ullah et al.; Published by Cold Spring Harbor Laboratory Press http://creativecommons.org/licenses/by/4.0/ This article, published in Genome Research, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/.
title Comparison and assessment of family- and population-based genotype imputation methods in large pedigrees
topic Resource