Identifying novel associations in GWAS by hierarchical Bayesian latent variable detection of differentially misclassified phenotypes

Bibliographic Details
Main authors: Shafquat, Afrah, Crystal, Ronald G., Mezey, Jason G.
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7204256/
https://www.ncbi.nlm.nih.gov/pubmed/32381021
http://dx.doi.org/10.1186/s12859-020-3387-z
author Shafquat, Afrah
Crystal, Ronald G.
Mezey, Jason G.
collection PubMed
description BACKGROUND: Heterogeneity in the definition and measurement of complex diseases in Genome-Wide Association Studies (GWAS) may lead to misdiagnoses and misclassification errors that can significantly impact discovery of disease loci. While well appreciated, almost all analyses of GWAS data consider reported disease phenotype values as is without accounting for potential misclassification. RESULTS: Here, we introduce Phenotype Latent variable Extraction of disease misdiagnosis (PheLEx), a GWAS analysis framework that learns and corrects misclassified phenotypes using structured genotype associations within a dataset. PheLEx consists of a hierarchical Bayesian latent variable model, where inference of differential misclassification is accomplished using filtered genotypes while implementing a full mixed model to account for population structure and genetic relatedness in study populations. Through simulations, we show that the PheLEx framework dramatically improves recovery of the correct disease state when considering realistic allele effect sizes compared to existing methodologies designed for Bayesian recovery of disease phenotypes. We also demonstrate the potential of PheLEx for extracting new potential loci from existing GWAS data by analyzing bipolar disorder and epilepsy phenotypes available from the UK Biobank. From the PheLEx analysis of these data, we identified new candidate disease loci not previously reported for these datasets that have value for supplemental hypothesis generation. CONCLUSION: PheLEx shows promise in reanalyzing GWAS datasets to provide supplemental candidate loci that are ignored by traditional GWAS analysis methodologies.
format Online
Article
Text
id pubmed-7204256
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7204256 2020-05-12 Identifying novel associations in GWAS by hierarchical Bayesian latent variable detection of differentially misclassified phenotypes Shafquat, Afrah Crystal, Ronald G. Mezey, Jason G. BMC Bioinformatics Research Article
BioMed Central 2020-05-07 /pmc/articles/PMC7204256/ /pubmed/32381021 http://dx.doi.org/10.1186/s12859-020-3387-z Text en © The Author(s). 2020 Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Identifying novel associations in GWAS by hierarchical Bayesian latent variable detection of differentially misclassified phenotypes
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7204256/
https://www.ncbi.nlm.nih.gov/pubmed/32381021
http://dx.doi.org/10.1186/s12859-020-3387-z