
An Ensemble Penalized Regression Method for Multi-ancestry Polygenic Risk Prediction

Great efforts are being made to develop advanced polygenic risk scores (PRS) to improve the prediction of complex traits and diseases. However, most existing PRS are primarily trained on European ancestry populations, limiting their transferability to non-European populations. In this article, we pr...


Bibliographic Details
Main Authors: Zhang, Jingning, Zhan, Jianan, Jin, Jin, Ma, Cheng, Zhao, Ruzhang, O’Connell, Jared, Jiang, Yunxuan, Koelsch, Bertram L., Zhang, Haoyu, Chatterjee, Nilanjan
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10055041/
https://www.ncbi.nlm.nih.gov/pubmed/36993331
http://dx.doi.org/10.1101/2023.03.15.532652
author Zhang, Jingning
Zhan, Jianan
Jin, Jin
Ma, Cheng
Zhao, Ruzhang
O’Connell, Jared
Jiang, Yunxuan
Koelsch, Bertram L.
Zhang, Haoyu
Chatterjee, Nilanjan
collection PubMed
description Great efforts are being made to develop advanced polygenic risk scores (PRS) to improve the prediction of complex traits and diseases. However, most existing PRS are primarily trained on European ancestry populations, limiting their transferability to non-European populations. In this article, we propose a novel method for generating multi-ancestry Polygenic Risk scOres based on enSemble of PEnalized Regression models (PROSPER). PROSPER integrates genome-wide association studies (GWAS) summary statistics from diverse populations to develop ancestry-specific PRS with improved predictive power for minority populations. The method uses a combination of ℒ₁ (lasso) and ℒ₂ (ridge) penalty functions, a parsimonious specification of the penalty parameters across populations, and an ensemble step to combine PRS generated across different penalty parameters. We evaluate the performance of PROSPER and other existing methods on large-scale simulated and real datasets, including those from 23andMe Inc., the Global Lipids Genetics Consortium, and All of Us. Results show that PROSPER can substantially improve multi-ancestry polygenic prediction compared to alternative methods across a wide variety of genetic architectures. In real data analyses, for example, PROSPER increased out-of-sample prediction R² for continuous traits by an average of 70% compared to a state-of-the-art Bayesian method (PRS-CSx) in the African ancestry population. Further, PROSPER is computationally highly scalable for the analysis of large SNP contents and many diverse populations.
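The general idea named in the description, penalized regression with lasso and ridge terms fitted over several penalty settings and then combined in an ensemble step, can be illustrated with a small generic sketch. This is not the PROSPER implementation: PROSPER works from GWAS summary statistics across several populations, and its penalty specification is not given in this record. The sketch below assumes individual-level synthetic data from a single population, a standard elastic-net objective, and a least-squares combination on a held-out tuning set. All names (`elastic_net`, `ensemble_weights`, the penalty grid, the simulated data) are hypothetical.

```python
import random

def soft_threshold(z, t):
    """Soft-thresholding operator used by the lasso (L1) update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def elastic_net(X, y, l1, l2, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + l1*||b||_1 + (l2/2)*||b||_2^2."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X @ beta; beta starts at zero
    for _ in range(n_iter):
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            # Covariance of column j with the residual that excludes predictor j.
            rho = sum(col[i] * (resid[i] + col[i] * beta[j]) for i in range(n)) / n
            denom = sum(c * c for c in col) / n + l2
            new = soft_threshold(rho, l1) / denom
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= col[i] * delta
                beta[j] = new
    return beta

def predict(X, beta):
    return [sum(x * b for x, b in zip(row, beta)) for row in X]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def ensemble_weights(preds, y, eps=1e-6):
    """Least-squares weights for combining candidate scores.

    A tiny ridge term (eps) keeps the normal equations solvable when
    candidate scores are nearly collinear or identically zero.
    """
    k = len(preds)
    A = [[sum(a * b for a, b in zip(preds[i], preds[j])) + (eps if i == j else 0.0)
          for j in range(k)] for i in range(k)]
    rhs = [sum(a * b for a, b in zip(preds[i], y)) for i in range(k)]
    return solve(A, rhs)

def mse(pred, y):
    return sum((a - b) ** 2 for a, b in zip(pred, y)) / len(y)

# Synthetic data (assumption): 10 predictors, two with non-zero effects.
random.seed(0)
p = 10
true_beta = [1.0, -0.5] + [0.0] * (p - 2)

def simulate(n):
    X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [sum(x * b for x, b in zip(row, true_beta)) + random.gauss(0, 0.5) for row in X]
    return X, y

X_train, y_train = simulate(80)
X_tune, y_tune = simulate(40)

# Fit one model per (l1, l2) setting, then combine them on the tuning set.
grid = [(0.01, 0.0), (0.1, 0.1), (0.5, 0.5)]
betas = [elastic_net(X_train, y_train, l1, l2) for l1, l2 in grid]
tune_preds = [predict(X_tune, b) for b in betas]
weights = ensemble_weights(tune_preds, y_tune)
final_beta = [sum(w * b[j] for w, b in zip(weights, betas)) for j in range(p)]
```

Because the combination is linear, the ensemble collapses to a single coefficient vector (`final_beta`), so scoring new samples costs the same as scoring with any one candidate model. The ensemble weights are fitted on the tuning set, so an honest estimate of accuracy would need a third, untouched test set.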
format Online
Article
Text
id pubmed-10055041
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cold Spring Harbor Laboratory
record_format MEDLINE/PubMed
spelling pubmed-10055041 2023-03-30 An Ensemble Penalized Regression Method for Multi-ancestry Polygenic Risk Prediction Zhang, Jingning Zhan, Jianan Jin, Jin Ma, Cheng Zhao, Ruzhang O’Connell, Jared Jiang, Yunxuan Koelsch, Bertram L. Zhang, Haoyu Chatterjee, Nilanjan bioRxiv Article
Cold Spring Harbor Laboratory 2023-09-17 /pmc/articles/PMC10055041/ /pubmed/36993331 http://dx.doi.org/10.1101/2023.03.15.532652 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator.
title An Ensemble Penalized Regression Method for Multi-ancestry Polygenic Risk Prediction
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10055041/
https://www.ncbi.nlm.nih.gov/pubmed/36993331
http://dx.doi.org/10.1101/2023.03.15.532652