Deep integrative models for large-scale human genomics

Bibliographic Details
Main authors: Sigurdsson, Arnór I; Louloudis, Ioannis; Banasik, Karina; Westergaard, David; Winther, Ole; Lund, Ole; Ostrowski, Sisse Rye; Erikstrup, Christian; Pedersen, Ole Birger Vesterager; Nyegaard, Mette; Brunak, Søren; Vilhjálmsson, Bjarni J; Rasmussen, Simon
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10325897/
https://www.ncbi.nlm.nih.gov/pubmed/37224538
http://dx.doi.org/10.1093/nar/gkad373
_version_ 1785069314029649920
author Sigurdsson, Arnór I
Louloudis, Ioannis
Banasik, Karina
Westergaard, David
Winther, Ole
Lund, Ole
Ostrowski, Sisse Rye
Erikstrup, Christian
Pedersen, Ole Birger Vesterager
Nyegaard, Mette
Brunak, Søren
Vilhjálmsson, Bjarni J
Rasmussen, Simon
author_facet Sigurdsson, Arnór I
Louloudis, Ioannis
Banasik, Karina
Westergaard, David
Winther, Ole
Lund, Ole
Ostrowski, Sisse Rye
Erikstrup, Christian
Pedersen, Ole Birger Vesterager
Nyegaard, Mette
Brunak, Søren
Vilhjálmsson, Bjarni J
Rasmussen, Simon
author_sort Sigurdsson, Arnór I
collection PubMed
description Polygenic risk scores (PRSs) are expected to play a critical role in precision medicine. Currently, PRS predictors are generally based on linear models using summary statistics, and more recently individual-level data. However, these predictors mainly capture additive relationships and are limited in data modalities they can use. We developed a deep learning framework (EIR) for PRS prediction which includes a model, genome-local-net (GLN), specifically designed for large-scale genomics data. The framework supports multi-task learning, automatic integration of other clinical and biochemical data, and model explainability. When applied to individual-level data from the UK Biobank, the GLN model demonstrated a competitive performance compared to established neural network architectures, particularly for certain traits, showcasing its potential in modeling complex genetic relationships. Furthermore, the GLN model outperformed linear PRS methods for Type 1 Diabetes, likely due to modeling non-additive genetic effects and epistasis. This was supported by our identification of widespread non-additive genetic effects and epistasis in the context of T1D. Finally, we constructed PRS models that integrated genotype, blood, urine, and anthropometric data and found that this improved performance for 93% of the 290 diseases and disorders considered. EIR is available at https://github.com/arnor-sigurdsson/EIR.
format Online
Article
Text
id pubmed-10325897
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-103258972023-07-08 Deep integrative models for large-scale human genomics Sigurdsson, Arnór I Louloudis, Ioannis Banasik, Karina Westergaard, David Winther, Ole Lund, Ole Ostrowski, Sisse Rye Erikstrup, Christian Pedersen, Ole Birger Vesterager Nyegaard, Mette Brunak, Søren Vilhjálmsson, Bjarni J Rasmussen, Simon Nucleic Acids Res Methods Online Polygenic risk scores (PRSs) are expected to play a critical role in precision medicine. Currently, PRS predictors are generally based on linear models using summary statistics, and more recently individual-level data. However, these predictors mainly capture additive relationships and are limited in data modalities they can use. We developed a deep learning framework (EIR) for PRS prediction which includes a model, genome-local-net (GLN), specifically designed for large-scale genomics data. The framework supports multi-task learning, automatic integration of other clinical and biochemical data, and model explainability. When applied to individual-level data from the UK Biobank, the GLN model demonstrated a competitive performance compared to established neural network architectures, particularly for certain traits, showcasing its potential in modeling complex genetic relationships. Furthermore, the GLN model outperformed linear PRS methods for Type 1 Diabetes, likely due to modeling non-additive genetic effects and epistasis. This was supported by our identification of widespread non-additive genetic effects and epistasis in the context of T1D. Finally, we constructed PRS models that integrated genotype, blood, urine, and anthropometric data and found that this improved performance for 93% of the 290 diseases and disorders considered. EIR is available at https://github.com/arnor-sigurdsson/EIR. Oxford University Press 2023-05-24 /pmc/articles/PMC10325897/ /pubmed/37224538 http://dx.doi.org/10.1093/nar/gkad373 Text en © The Author(s) 2023. 
Published by Oxford University Press on behalf of Nucleic Acids Research. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Methods Online
Sigurdsson, Arnór I
Louloudis, Ioannis
Banasik, Karina
Westergaard, David
Winther, Ole
Lund, Ole
Ostrowski, Sisse Rye
Erikstrup, Christian
Pedersen, Ole Birger Vesterager
Nyegaard, Mette
Brunak, Søren
Vilhjálmsson, Bjarni J
Rasmussen, Simon
Deep integrative models for large-scale human genomics
title Deep integrative models for large-scale human genomics
title_full Deep integrative models for large-scale human genomics
title_fullStr Deep integrative models for large-scale human genomics
title_full_unstemmed Deep integrative models for large-scale human genomics
title_short Deep integrative models for large-scale human genomics
title_sort deep integrative models for large-scale human genomics
topic Methods Online
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10325897/
https://www.ncbi.nlm.nih.gov/pubmed/37224538
http://dx.doi.org/10.1093/nar/gkad373
work_keys_str_mv AT sigurdssonarnori deepintegrativemodelsforlargescalehumangenomics
AT louloudisioannis deepintegrativemodelsforlargescalehumangenomics
AT banasikkarina deepintegrativemodelsforlargescalehumangenomics
AT westergaarddavid deepintegrativemodelsforlargescalehumangenomics
AT wintherole deepintegrativemodelsforlargescalehumangenomics
AT lundole deepintegrativemodelsforlargescalehumangenomics
AT ostrowskisisserye deepintegrativemodelsforlargescalehumangenomics
AT erikstrupchristian deepintegrativemodelsforlargescalehumangenomics
AT pedersenolebirgervesterager deepintegrativemodelsforlargescalehumangenomics
AT nyegaardmette deepintegrativemodelsforlargescalehumangenomics
AT brunaksøren deepintegrativemodelsforlargescalehumangenomics
AT vilhjalmssonbjarnij deepintegrativemodelsforlargescalehumangenomics
AT rasmussensimon deepintegrativemodelsforlargescalehumangenomics