An Ensemble Penalized Regression Method for Multi-ancestry Polygenic Risk Prediction
Great efforts are being made to develop advanced polygenic risk scores (PRS) to improve the prediction of complex traits and diseases. However, most existing PRS are primarily trained on European ancestry populations, limiting their transferability to non-European populations. In this article, we pr...
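Main authors: | Zhang, Jingning; Zhan, Jianan; Jin, Jin; Ma, Cheng; Zhao, Ruzhang; O’Connell, Jared; Jiang, Yunxuan; Koelsch, Bertram L.; Zhang, Haoyu; Chatterjee, Nilanjan |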
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10055041/ https://www.ncbi.nlm.nih.gov/pubmed/36993331 http://dx.doi.org/10.1101/2023.03.15.532652 |
_version_ | 1785015810159280128 |
---|---|
author | Zhang, Jingning Zhan, Jianan Jin, Jin Ma, Cheng Zhao, Ruzhang O’Connell, Jared Jiang, Yunxuan Koelsch, Bertram L. Zhang, Haoyu Chatterjee, Nilanjan |
author_facet | Zhang, Jingning Zhan, Jianan Jin, Jin Ma, Cheng Zhao, Ruzhang O’Connell, Jared Jiang, Yunxuan Koelsch, Bertram L. Zhang, Haoyu Chatterjee, Nilanjan |
author_sort | Zhang, Jingning |
collection | PubMed |
description | Great efforts are being made to develop advanced polygenic risk scores (PRS) to improve the prediction of complex traits and diseases. However, most existing PRS are primarily trained on European ancestry populations, limiting their transferability to non-European populations. In this article, we propose a novel method for generating multi-ancestry Polygenic Risk scOres based on enSemble of PEnalized Regression models (PROSPER). PROSPER integrates genome-wide association studies (GWAS) summary statistics from diverse populations to develop ancestry-specific PRS with improved predictive power for minority populations. The method uses a combination of ℒ₁ (lasso) and ℒ₂ (ridge) penalty functions, a parsimonious specification of the penalty parameters across populations, and an ensemble step to combine PRS generated across different penalty parameters. We evaluate the performance of PROSPER and other existing methods on large-scale simulated and real datasets, including those from 23andMe Inc., the Global Lipids Genetics Consortium, and All of Us. Results show that PROSPER can substantially improve multi-ancestry polygenic prediction compared to alternative methods across a wide variety of genetic architectures. In real data analyses, for example, PROSPER increased out-of-sample prediction R² for continuous traits by an average of 70% compared to a state-of-the-art Bayesian method (PRS-CSx) in the African ancestry population. Further, PROSPER is computationally highly scalable for the analysis of large SNP contents and many diverse populations. |
format | Online Article Text |
id | pubmed-10055041 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Cold Spring Harbor Laboratory |
record_format | MEDLINE/PubMed |
spelling | pubmed-10055041 2023-03-30 An Ensemble Penalized Regression Method for Multi-ancestry Polygenic Risk Prediction Zhang, Jingning Zhan, Jianan Jin, Jin Ma, Cheng Zhao, Ruzhang O’Connell, Jared Jiang, Yunxuan Koelsch, Bertram L. Zhang, Haoyu Chatterjee, Nilanjan bioRxiv Article Great efforts are being made to develop advanced polygenic risk scores (PRS) to improve the prediction of complex traits and diseases. However, most existing PRS are primarily trained on European ancestry populations, limiting their transferability to non-European populations. In this article, we propose a novel method for generating multi-ancestry Polygenic Risk scOres based on enSemble of PEnalized Regression models (PROSPER). PROSPER integrates genome-wide association studies (GWAS) summary statistics from diverse populations to develop ancestry-specific PRS with improved predictive power for minority populations. The method uses a combination of ℒ₁ (lasso) and ℒ₂ (ridge) penalty functions, a parsimonious specification of the penalty parameters across populations, and an ensemble step to combine PRS generated across different penalty parameters. We evaluate the performance of PROSPER and other existing methods on large-scale simulated and real datasets, including those from 23andMe Inc., the Global Lipids Genetics Consortium, and All of Us. Results show that PROSPER can substantially improve multi-ancestry polygenic prediction compared to alternative methods across a wide variety of genetic architectures. In real data analyses, for example, PROSPER increased out-of-sample prediction R² for continuous traits by an average of 70% compared to a state-of-the-art Bayesian method (PRS-CSx) in the African ancestry population. Further, PROSPER is computationally highly scalable for the analysis of large SNP contents and many diverse populations.
Cold Spring Harbor Laboratory 2023-09-17 /pmc/articles/PMC10055041/ /pubmed/36993331 http://dx.doi.org/10.1101/2023.03.15.532652 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator. |
spellingShingle | Article Zhang, Jingning Zhan, Jianan Jin, Jin Ma, Cheng Zhao, Ruzhang O’Connell, Jared Jiang, Yunxuan Koelsch, Bertram L. Zhang, Haoyu Chatterjee, Nilanjan An Ensemble Penalized Regression Method for Multi-ancestry Polygenic Risk Prediction |
title | An Ensemble Penalized Regression Method for Multi-ancestry Polygenic Risk Prediction |
title_sort | ensemble penalized regression method for multi-ancestry polygenic risk prediction |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10055041/ https://www.ncbi.nlm.nih.gov/pubmed/36993331 http://dx.doi.org/10.1101/2023.03.15.532652 |