Sequencing and imputation in GWAS: Cost‐effective strategies to increase power and genomic coverage across diverse populations

A key aim for current genome‐wide association studies (GWAS) is to interrogate the full spectrum of genetic variation underlying human traits, including rare variants, across populations. Deep whole‐genome sequencing is the gold standard to fully capture genetic variation, but remains prohibitively...

Bibliographic Details
Main authors: Quick, Corbin, Anugu, Pramod, Musani, Solomon, Weiss, Scott T., Burchard, Esteban G., White, Marquitta J., Keys, Kevin L., Cucca, Francesco, Sidore, Carlo, Boehnke, Michael, Fuchsberger, Christian
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7449570/
https://www.ncbi.nlm.nih.gov/pubmed/32519380
http://dx.doi.org/10.1002/gepi.22326
author Quick, Corbin
Anugu, Pramod
Musani, Solomon
Weiss, Scott T.
Burchard, Esteban G.
White, Marquitta J.
Keys, Kevin L.
Cucca, Francesco
Sidore, Carlo
Boehnke, Michael
Fuchsberger, Christian
author_sort Quick, Corbin
collection PubMed
description A key aim for current genome‐wide association studies (GWAS) is to interrogate the full spectrum of genetic variation underlying human traits, including rare variants, across populations. Deep whole‐genome sequencing is the gold standard to fully capture genetic variation, but remains prohibitively expensive for large sample sizes. Array genotyping interrogates a sparser set of variants, which can be used as a scaffold for genotype imputation to capture a wider set of variants. However, imputation quality depends crucially on reference panel size and genetic distance from the target population. Here, we consider sequencing a subset of GWAS participants and imputing the rest using a reference panel that includes both sequenced GWAS participants and an external reference panel. We investigate how imputation quality and GWAS power are affected by the number of participants sequenced for admixed populations (African and Latino Americans) and European population isolates (Sardinians and Finns), and identify powerful, cost‐effective GWAS designs given current sequencing and array costs. For populations that are well‐represented in existing reference panels, we find that array genotyping alone is cost‐effective and well‐powered to detect common‐ and rare‐variant associations. For poorly represented populations, sequencing a subset of participants is often most cost‐effective, and can substantially increase imputation quality and GWAS power.
format Online
Article
Text
id pubmed-7449570
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-7449570 2020-09-25 Genet Epidemiol Research Articles John Wiley and Sons Inc. 2020-06-09 2020-09 /pmc/articles/PMC7449570/ /pubmed/32519380 http://dx.doi.org/10.1002/gepi.22326 Text en © 2020 The Authors. Genetic Epidemiology Published by Wiley Periodicals LLC. This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
title Sequencing and imputation in GWAS: Cost‐effective strategies to increase power and genomic coverage across diverse populations
topic Research Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7449570/
https://www.ncbi.nlm.nih.gov/pubmed/32519380
http://dx.doi.org/10.1002/gepi.22326