Topic modeling identifies novel genetic loci associated with multimorbidities in UK Biobank

Many diseases show patterns of co-occurrence, possibly driven by systemic dysregulation of underlying processes affecting multiple traits. We have developed a method (treeLFA) for identifying such multimorbidities from routine health-care data, which combines topic modeling with an informative prior...

Bibliographic Details
Main authors: Zhang, Yidong; Jiang, Xilin; Mentzer, Alexander J.; McVean, Gil; Lunter, Gerton
Format: Online Article Text
Language: English
Published: Elsevier 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10435382/
https://www.ncbi.nlm.nih.gov/pubmed/37601973
http://dx.doi.org/10.1016/j.xgen.2023.100371
author Zhang, Yidong
Jiang, Xilin
Mentzer, Alexander J.
McVean, Gil
Lunter, Gerton
collection PubMed
description Many diseases show patterns of co-occurrence, possibly driven by systemic dysregulation of underlying processes affecting multiple traits. We have developed a method (treeLFA) for identifying such multimorbidities from routine health-care data, which combines topic modeling with an informative prior derived from medical ontology. We apply treeLFA to UK Biobank data and identify a variety of topics representing multimorbidity clusters, including a healthy topic. We find that loci identified using topic weights as traits in a genome-wide association study (GWAS) analysis, which we validated with a range of approaches, only partially overlap with loci from GWASs on constituent single diseases. We also show that treeLFA improves upon existing methods like latent Dirichlet allocation in various ways. Overall, our findings indicate that topic models can characterize multimorbidity patterns and that genetic analysis of these patterns can provide insight into the etiology of complex traits that cannot be determined from the analysis of constituent traits alone.
format Online
Article
Text
id pubmed-10435382
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-10435382 2023-08-19 Topic modeling identifies novel genetic loci associated with multimorbidities in UK Biobank Zhang, Yidong Jiang, Xilin Mentzer, Alexander J. McVean, Gil Lunter, Gerton Cell Genom Article Many diseases show patterns of co-occurrence, possibly driven by systemic dysregulation of underlying processes affecting multiple traits. We have developed a method (treeLFA) for identifying such multimorbidities from routine health-care data, which combines topic modeling with an informative prior derived from medical ontology. We apply treeLFA to UK Biobank data and identify a variety of topics representing multimorbidity clusters, including a healthy topic. We find that loci identified using topic weights as traits in a genome-wide association study (GWAS) analysis, which we validated with a range of approaches, only partially overlap with loci from GWASs on constituent single diseases. We also show that treeLFA improves upon existing methods like latent Dirichlet allocation in various ways. Overall, our findings indicate that topic models can characterize multimorbidity patterns and that genetic analysis of these patterns can provide insight into the etiology of complex traits that cannot be determined from the analysis of constituent traits alone. Elsevier 2023-08-01 /pmc/articles/PMC10435382/ /pubmed/37601973 http://dx.doi.org/10.1016/j.xgen.2023.100371 Text en © 2023 The Authors https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
title Topic modeling identifies novel genetic loci associated with multimorbidities in UK Biobank
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10435382/
https://www.ncbi.nlm.nih.gov/pubmed/37601973
http://dx.doi.org/10.1016/j.xgen.2023.100371