
Evaluation and application of summary statistic imputation to discover new height-associated loci

As most of the heritability of complex traits is attributed to common and low frequency genetic variants, imputing them by combining genotyping chips and large sequenced reference panels is the most cost-effective approach to discover the genetic basis of these traits. Association summary statistics...


Bibliographic Details
Main authors: Rüeger, Sina; McDaid, Aaron; Kutalik, Zoltán
Format: Online Article Text
Language: English
Published: Public Library of Science 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5983877/
https://www.ncbi.nlm.nih.gov/pubmed/29782485
http://dx.doi.org/10.1371/journal.pgen.1007371
author Rüeger, Sina
McDaid, Aaron
Kutalik, Zoltán
collection PubMed
description As most of the heritability of complex traits is attributed to common and low-frequency genetic variants, imputing them by combining genotyping chips and large sequenced reference panels is the most cost-effective approach to discovering the genetic basis of these traits. Association summary statistics from genome-wide meta-analyses are available for hundreds of traits. Updating these to ever-larger reference panels is cumbersome, as it requires re-imputing the genetic data, rerunning the association scan, and meta-analysing the results. A much more efficient method is to impute the summary statistics directly, termed summary statistics imputation, which we improved to accommodate variable sample size across SNVs. Its performance relative to genotype imputation and its practical utility have not yet been fully investigated. To this end, we compared the two approaches on real (genotyped and imputed) data from 120K samples from the UK Biobank and show that genotype imputation has a 3- to 5-fold lower root-mean-square error and better distinguishes true associations from null ones. We observed the largest differences in power for variants with low minor allele frequency and low imputation quality. For fixed false positive rates of 0.001, 0.01 and 0.05, using summary statistics imputation yielded a decrease in statistical power of 9, 43 and 35%, respectively. To test its capacity to discover novel associations, we applied summary statistics imputation to the GIANT height meta-analysis summary statistics covering HapMap variants, and identified 34 novel loci, 19 of which replicated using data in the UK Biobank. Additionally, we successfully replicated 55 out of the 111 variants published in an exome chip study. Our study demonstrates that summary statistics imputation is a very efficient and cost-effective way to identify and fine-map trait-associated loci. Moreover, the ability to impute summary statistics is important for follow-up analyses, such as Mendelian randomisation or LD-score regression.
format Online
Article
Text
id pubmed-5983877
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5983877 2018-06-17 Rüeger, Sina; McDaid, Aaron; Kutalik, Zoltán. PLoS Genet. Research Article. Public Library of Science 2018-05-21 /pmc/articles/PMC5983877/ /pubmed/29782485 http://dx.doi.org/10.1371/journal.pgen.1007371 Text en © 2018 Rüeger et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Evaluation and application of summary statistic imputation to discover new height-associated loci
topic Research Article