A Prism Vote method for individualized risk prediction of traits in genotype data of Multi-population
Multi-population cohorts offer unprecedented opportunities for profiling disease risk in large samples; however, heterogeneous risk effects underlying complex traits across populations make integrative prediction challenging. In this study, we propose a novel Bayesian probability framework, the Pris...
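Main authors: | Xia, Xiaoxuan; Zhang, Yexian; Sun, Rui; Wei, Yingying; Li, Qi; Chong, Marc Ka Chun; Wu, William Ka Kei; Zee, Benny Chung-Ying; Tang, Hua; Wang, Maggie Haitian |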
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2022 |
Subjects: | Methods |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9642904/ https://www.ncbi.nlm.nih.gov/pubmed/36302058 http://dx.doi.org/10.1371/journal.pgen.1010443 |
_version_ | 1784826411709628416 |
---|---|
author | Xia, Xiaoxuan Zhang, Yexian Sun, Rui Wei, Yingying Li, Qi Chong, Marc Ka Chun Wu, William Ka Kei Zee, Benny Chung-Ying Tang, Hua Wang, Maggie Haitian |
author_facet | Xia, Xiaoxuan Zhang, Yexian Sun, Rui Wei, Yingying Li, Qi Chong, Marc Ka Chun Wu, William Ka Kei Zee, Benny Chung-Ying Tang, Hua Wang, Maggie Haitian |
author_sort | Xia, Xiaoxuan |
collection | PubMed |
description | Multi-population cohorts offer unprecedented opportunities for profiling disease risk in large samples; however, heterogeneous risk effects underlying complex traits across populations make integrative prediction challenging. In this study, we propose a novel Bayesian probability framework, the Prism Vote (PV), to construct risk predictions in heterogeneous genetic data. The PV views the trait of an individual as a composite risk from subpopulations, in which stratum-specific predictors can be formed in data of more homogeneous genetic structure. Since each individual is described by a composition of subpopulation memberships, the framework enables individualized risk characterization. Simulations demonstrated that the PV framework applied with alternative prediction methods significantly improved prediction accuracy in mixed and admixed populations. The advantage of PV grows as genetic heterogeneity and sample size increase. In two real genome-wide association datasets consisting of multiple populations, we showed that the framework considerably enhanced the prediction accuracy of the linear mixed model in five-group cross-validations. The proposed method offers a new perspective for analyzing an individual’s disease risk and improves accuracy for predicting complex traits in genotype data. |
format | Online Article Text |
id | pubmed-9642904 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-9642904 2022-11-15 A Prism Vote method for individualized risk prediction of traits in genotype data of Multi-population Xia, Xiaoxuan Zhang, Yexian Sun, Rui Wei, Yingying Li, Qi Chong, Marc Ka Chun Wu, William Ka Kei Zee, Benny Chung-Ying Tang, Hua Wang, Maggie Haitian PLoS Genet Methods Multi-population cohorts offer unprecedented opportunities for profiling disease risk in large samples; however, heterogeneous risk effects underlying complex traits across populations make integrative prediction challenging. In this study, we propose a novel Bayesian probability framework, the Prism Vote (PV), to construct risk predictions in heterogeneous genetic data. The PV views the trait of an individual as a composite risk from subpopulations, in which stratum-specific predictors can be formed in data of more homogeneous genetic structure. Since each individual is described by a composition of subpopulation memberships, the framework enables individualized risk characterization. Simulations demonstrated that the PV framework applied with alternative prediction methods significantly improved prediction accuracy in mixed and admixed populations. The advantage of PV grows as genetic heterogeneity and sample size increase. In two real genome-wide association datasets consisting of multiple populations, we showed that the framework considerably enhanced the prediction accuracy of the linear mixed model in five-group cross-validations. The proposed method offers a new perspective for analyzing an individual’s disease risk and improves accuracy for predicting complex traits in genotype data.
Public Library of Science 2022-10-27 /pmc/articles/PMC9642904/ /pubmed/36302058 http://dx.doi.org/10.1371/journal.pgen.1010443 Text en © 2022 Xia et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Methods Xia, Xiaoxuan Zhang, Yexian Sun, Rui Wei, Yingying Li, Qi Chong, Marc Ka Chun Wu, William Ka Kei Zee, Benny Chung-Ying Tang, Hua Wang, Maggie Haitian A Prism Vote method for individualized risk prediction of traits in genotype data of Multi-population |
title | A Prism Vote method for individualized risk prediction of traits in genotype data of Multi-population |
title_full | A Prism Vote method for individualized risk prediction of traits in genotype data of Multi-population |
title_fullStr | A Prism Vote method for individualized risk prediction of traits in genotype data of Multi-population |
title_full_unstemmed | A Prism Vote method for individualized risk prediction of traits in genotype data of Multi-population |
title_short | A Prism Vote method for individualized risk prediction of traits in genotype data of Multi-population |
title_sort | prism vote method for individualized risk prediction of traits in genotype data of multi-population |
topic | Methods |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9642904/ https://www.ncbi.nlm.nih.gov/pubmed/36302058 http://dx.doi.org/10.1371/journal.pgen.1010443 |
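The description field above characterizes the PV as viewing an individual's trait as a composite risk from subpopulations, with stratum-specific predictors and per-individual subpopulation memberships. The record does not give the actual formula, so the following is only a minimal sketch of that general idea, assuming the composite is a membership-weighted average of stratum-specific predictions. The function name `prism_vote_sketch`, its arguments, and the toy numbers are hypothetical and are not taken from the paper or from any published software.

```python
import numpy as np


def prism_vote_sketch(memberships, stratum_predictions):
    """Membership-weighted average of stratum-specific predictions.

    ASSUMPTION: a plain weighted average stands in for the paper's
    Bayesian combination rule, which this record does not spell out.

    memberships: shape (n_individuals, n_strata), non-negative weights
        describing each individual's subpopulation composition.
    stratum_predictions: shape (n_individuals, n_strata); column k holds
        the prediction from a (hypothetical) predictor fitted in stratum k.
    """
    w = np.asarray(memberships, dtype=float)
    p = np.asarray(stratum_predictions, dtype=float)
    if w.shape != p.shape:
        raise ValueError("memberships and predictions must share a shape")
    if np.any(w < 0):
        raise ValueError("memberships must be non-negative")
    row_sums = w.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0):
        raise ValueError("each individual needs a non-zero membership")
    w = w / row_sums  # normalize each row so the weights sum to 1
    return (w * p).sum(axis=1)


# Toy data (made up): two strata, three individuals.
memberships = np.array([[1.0, 0.0],   # entirely stratum 0
                        [0.5, 0.5],   # evenly admixed
                        [0.2, 0.8]])  # mostly stratum 1
stratum_predictions = np.array([[0.10, 0.30],
                                [0.10, 0.30],
                                [0.10, 0.30]])
composite = prism_vote_sketch(memberships, stratum_predictions)
print(composite)
```

With identical stratum-level predictions for all three toy individuals, the composite differs only through the membership weights, which is the sense in which the description calls the risk characterization individualized.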