Deep integrative models for large-scale human genomics
Polygenic risk scores (PRSs) are expected to play a critical role in precision medicine. Currently, PRS predictors are generally based on linear models using summary statistics, and more recently individual-level data. However, these predictors mainly capture additive relationships and are limited i...
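Main authors: | Sigurdsson, Arnór I; Louloudis, Ioannis; Banasik, Karina; Westergaard, David; Winther, Ole; Lund, Ole; Ostrowski, Sisse Rye; Erikstrup, Christian; Pedersen, Ole Birger Vesterager; Nyegaard, Mette; Brunak, Søren; Vilhjálmsson, Bjarni J; Rasmussen, Simon |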
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2023 |
Subjects: | Methods Online |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10325897/ https://www.ncbi.nlm.nih.gov/pubmed/37224538 http://dx.doi.org/10.1093/nar/gkad373 |
_version_ | 1785069314029649920 |
---|---|
author | Sigurdsson, Arnór I Louloudis, Ioannis Banasik, Karina Westergaard, David Winther, Ole Lund, Ole Ostrowski, Sisse Rye Erikstrup, Christian Pedersen, Ole Birger Vesterager Nyegaard, Mette Brunak, Søren Vilhjálmsson, Bjarni J Rasmussen, Simon |
author_sort | Sigurdsson, Arnór I |
collection | PubMed |
description | Polygenic risk scores (PRSs) are expected to play a critical role in precision medicine. Currently, PRS predictors are generally based on linear models using summary statistics, and more recently individual-level data. However, these predictors mainly capture additive relationships and are limited in data modalities they can use. We developed a deep learning framework (EIR) for PRS prediction which includes a model, genome-local-net (GLN), specifically designed for large-scale genomics data. The framework supports multi-task learning, automatic integration of other clinical and biochemical data, and model explainability. When applied to individual-level data from the UK Biobank, the GLN model demonstrated a competitive performance compared to established neural network architectures, particularly for certain traits, showcasing its potential in modeling complex genetic relationships. Furthermore, the GLN model outperformed linear PRS methods for Type 1 Diabetes, likely due to modeling non-additive genetic effects and epistasis. This was supported by our identification of widespread non-additive genetic effects and epistasis in the context of T1D. Finally, we constructed PRS models that integrated genotype, blood, urine, and anthropometric data and found that this improved performance for 93% of the 290 diseases and disorders considered. EIR is available at https://github.com/arnor-sigurdsson/EIR. |
format | Online Article Text |
id | pubmed-10325897 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-103258972023-07-08 Deep integrative models for large-scale human genomics Sigurdsson, Arnór I Louloudis, Ioannis Banasik, Karina Westergaard, David Winther, Ole Lund, Ole Ostrowski, Sisse Rye Erikstrup, Christian Pedersen, Ole Birger Vesterager Nyegaard, Mette Brunak, Søren Vilhjálmsson, Bjarni J Rasmussen, Simon Nucleic Acids Res Methods Online Polygenic risk scores (PRSs) are expected to play a critical role in precision medicine. Currently, PRS predictors are generally based on linear models using summary statistics, and more recently individual-level data. However, these predictors mainly capture additive relationships and are limited in data modalities they can use. We developed a deep learning framework (EIR) for PRS prediction which includes a model, genome-local-net (GLN), specifically designed for large-scale genomics data. The framework supports multi-task learning, automatic integration of other clinical and biochemical data, and model explainability. When applied to individual-level data from the UK Biobank, the GLN model demonstrated a competitive performance compared to established neural network architectures, particularly for certain traits, showcasing its potential in modeling complex genetic relationships. Furthermore, the GLN model outperformed linear PRS methods for Type 1 Diabetes, likely due to modeling non-additive genetic effects and epistasis. This was supported by our identification of widespread non-additive genetic effects and epistasis in the context of T1D. Finally, we constructed PRS models that integrated genotype, blood, urine, and anthropometric data and found that this improved performance for 93% of the 290 diseases and disorders considered. EIR is available at https://github.com/arnor-sigurdsson/EIR. Oxford University Press 2023-05-24 /pmc/articles/PMC10325897/ /pubmed/37224538 http://dx.doi.org/10.1093/nar/gkad373 Text en © The Author(s) 2023. 
Published by Oxford University Press on behalf of Nucleic Acids Research. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Deep integrative models for large-scale human genomics |
topic | Methods Online |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10325897/ https://www.ncbi.nlm.nih.gov/pubmed/37224538 http://dx.doi.org/10.1093/nar/gkad373 |