Efficient permutation-based genome-wide association studies for normal and skewed phenotypic distributions

Bibliographic Details
Main authors: John, Maura; Ankenbrand, Markus J; Artmann, Carolin; Freudenthal, Jan A; Korte, Arthur; Grimm, Dominik G
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects: Genes Track
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9486594/
https://www.ncbi.nlm.nih.gov/pubmed/36124808
http://dx.doi.org/10.1093/bioinformatics/btac455
author John, Maura
Ankenbrand, Markus J
Artmann, Carolin
Freudenthal, Jan A
Korte, Arthur
Grimm, Dominik G
collection PubMed
description MOTIVATION: Genome-wide association studies (GWAS) are an integral tool for studying the architecture of complex genotype and phenotype relationships. Linear mixed models (LMMs) are commonly used to detect associations between genetic markers and a trait of interest while also accounting for population structure and cryptic relatedness. LMMs assume that the residuals are normally distributed and that the genetic markers are independent and identically distributed; both assumptions are often violated in real data. Permutation-based methods can help to overcome some of these limitations and provide more realistic thresholds for the discovery of true associations. Still, in practice, they are rarely implemented because of their high computational complexity. RESULTS: We propose permGWAS, an efficient LMM reformulation based on 4D tensors that can provide permutation-based significance thresholds. We show that our method outperforms current state-of-the-art LMMs with respect to runtime and that permutation-based thresholds yield lower false discovery rates for skewed phenotypes than the commonly used Bonferroni threshold. Furthermore, using permGWAS we re-analyzed more than 500 Arabidopsis thaliana phenotypes with 100 permutations each in less than 8 days on a single GPU. Our re-analyses suggest that applying a permutation-based threshold can improve and refine the interpretation of GWAS results. AVAILABILITY AND IMPLEMENTATION: permGWAS is open-source and publicly available on GitHub for download: https://github.com/grimmlab/permGWAS. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-9486594
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9486594 2022-09-20 Efficient permutation-based genome-wide association studies for normal and skewed phenotypic distributions John, Maura; Ankenbrand, Markus J; Artmann, Carolin; Freudenthal, Jan A; Korte, Arthur; Grimm, Dominik G Bioinformatics Genes Track
Oxford University Press 2022-09-18 /pmc/articles/PMC9486594/ /pubmed/36124808 http://dx.doi.org/10.1093/bioinformatics/btac455 Text en © The Author(s) 2022. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Efficient permutation-based genome-wide association studies for normal and skewed phenotypic distributions
topic Genes Track
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9486594/
https://www.ncbi.nlm.nih.gov/pubmed/36124808
http://dx.doi.org/10.1093/bioinformatics/btac455