
Efficient approaches for large-scale GWAS with genotype uncertainty

Association studies using genetic data from SNP-chip-based imputation or low-depth sequencing data provide a cost-efficient design for large-scale association studies. We explore methods for performing association studies applicable to such genetic data and investigate how using different priors whe...


Bibliographic Details
Main authors:	Jørsboe, Emil, Albrechtsen, Anders
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2021
Subjects:	Software and Data Resources
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8727990/ https://www.ncbi.nlm.nih.gov/pubmed/34865001 http://dx.doi.org/10.1093/g3journal/jkab385

author	Jørsboe, Emil Albrechtsen, Anders
collection	PubMed
description	Association studies using genetic data from SNP-chip-based imputation or low-depth sequencing provide a cost-efficient design for large-scale association studies. We explore methods for performing association studies applicable to such genetic data and investigate how using different priors when estimating genotype probabilities affects the association results. Our proposed method, ANGSD-asso’s latent model, models the unobserved genotype as a latent variable in a generalized linear model framework. The software is implemented in C/C++ and can be run multi-threaded. ANGSD-asso is based on genotype probabilities, which can be estimated using either the sample allele frequency or the individual allele frequencies as a prior. We explore through simulations how genotype probability-based methods compare with using genetic dosages. Our simulations show that, in a structured population, using the individual allele frequency prior gives better power than using the sample allele frequency. In scenarios where sequencing depth is correlated with the phenotype, ANGSD-asso’s latent model has higher statistical power and less bias than using dosages. When additional covariates are added to the linear model, ANGSD-asso’s latent model has higher statistical power and less bias than other methods that accommodate genotype uncertainty, while also being much faster. This is shown with imputed data from UK Biobank and simulations.
format	Online Article Text
id	pubmed-8727990
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	Oxford University Press
record_format	MEDLINE/PubMed
journal	G3 (Bethesda)
date	2021-12-04
rights	© The Author(s) 2021. Published by Oxford University Press on behalf of Genetics Society of America.
license	This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Efficient approaches for large-scale GWAS with genotype uncertainty
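
The description above contrasts genotype-probability-based testing with genetic dosages, and mentions allele-frequency priors. The following is a minimal Python sketch of those ideas for a quantitative trait with normally distributed errors. It is not ANGSD-asso's implementation: the function names, the Hardy-Weinberg form of the prior, and the normal phenotype model are assumptions made for illustration. The `freq` argument could be either a sample allele frequency or a per-individual allele frequency.

```python
import math


def hwe_prior(freq):
    """Prior for 0, 1, 2 allele copies, assuming Hardy-Weinberg proportions."""
    return [(1 - freq) ** 2, 2 * freq * (1 - freq), freq ** 2]


def genotype_posterior(likelihoods, freq):
    """Combine genotype likelihoods with an allele-frequency prior (Bayes' rule)."""
    prior = hwe_prior(freq)
    unnormalized = [lik * p for lik, p in zip(likelihoods, prior)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def dosage(posterior):
    """Expected allele count: 0*P(G=0) + 1*P(G=1) + 2*P(G=2)."""
    return posterior[1] + 2 * posterior[2]


def normal_pdf(y, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def latent_log_likelihood(phenotypes, posteriors, intercept, beta, sd):
    """Treat the genotype as latent: sum the phenotype density over G = 0, 1, 2,
    weighted by each individual's genotype probabilities."""
    total = 0.0
    for y, post in zip(phenotypes, posteriors):
        mix = sum(post[g] * normal_pdf(y, intercept + beta * g, sd) for g in range(3))
        total += math.log(mix)
    return total


def dosage_log_likelihood(phenotypes, posteriors, intercept, beta, sd):
    """Plug the expected genotype (dosage) into an ordinary linear model."""
    total = 0.0
    for y, post in zip(phenotypes, posteriors):
        total += math.log(normal_pdf(y, intercept + beta * dosage(post), sd))
    return total


# Hypothetical toy data: one uninformative and one fully informative individual.
posteriors = [
    genotype_posterior([1.0, 1.0, 1.0], freq=0.5),  # no sequencing information
    genotype_posterior([0.0, 0.0, 1.0], freq=0.5),  # genotype known to be 2
]
phenotypes = [0.3, 1.9]
latent_ll = latent_log_likelihood(phenotypes, posteriors, 0.0, 1.0, 1.0)
dosage_ll = dosage_log_likelihood(phenotypes, posteriors, 0.0, 1.0, 1.0)
```

When every genotype is known with certainty the two log-likelihoods coincide; they differ only for individuals whose genotype probabilities are spread over several genotypes, which is where the choice between the two approaches matters.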
