
A Prism Vote method for individualized risk prediction of traits in genotype data of Multi-population

Multi-population cohorts offer unprecedented opportunities for profiling disease risk in large samples; however, heterogeneous risk effects underlying complex traits across populations make integrative prediction challenging. In this study, we propose a novel Bayesian probability framework, the Pris...


Bibliographic Details
Main authors: Xia, Xiaoxuan, Zhang, Yexian, Sun, Rui, Wei, Yingying, Li, Qi, Chong, Marc Ka Chun, Wu, William Ka Kei, Zee, Benny Chung-Ying, Tang, Hua, Wang, Maggie Haitian
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9642904/
https://www.ncbi.nlm.nih.gov/pubmed/36302058
http://dx.doi.org/10.1371/journal.pgen.1010443
author Xia, Xiaoxuan
Zhang, Yexian
Sun, Rui
Wei, Yingying
Li, Qi
Chong, Marc Ka Chun
Wu, William Ka Kei
Zee, Benny Chung-Ying
Tang, Hua
Wang, Maggie Haitian
collection PubMed
description Multi-population cohorts offer unprecedented opportunities for profiling disease risk in large samples; however, heterogeneous risk effects underlying complex traits across populations make integrative prediction challenging. In this study, we propose a novel Bayesian probability framework, the Prism Vote (PV), to construct risk predictions in heterogeneous genetic data. The PV views the trait of an individual as a composite risk from subpopulations, in which stratum-specific predictors can be formed in data of more homogeneous genetic structure. Since each individual is described by a composition of subpopulation memberships, the framework enables individualized risk characterization. Simulations demonstrated that the PV framework, applied with alternative prediction methods, significantly improved prediction accuracy in mixed and admixed populations. The advantage of PV grows as genetic heterogeneity and sample size increase. In two real genome-wide association datasets consisting of multiple populations, we showed that the framework considerably enhanced the prediction accuracy of the linear mixed model in five-group cross validations. The proposed method offers a new perspective for analyzing an individual’s disease risk and improves accuracy for predicting complex traits in genotype data.
format Online
Article
Text
id pubmed-9642904
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-9642904 2022-11-15 PLoS Genet Methods
Public Library of Science 2022-10-27 /pmc/articles/PMC9642904/ /pubmed/36302058 http://dx.doi.org/10.1371/journal.pgen.1010443 Text en © 2022 Xia et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title A Prism Vote method for individualized risk prediction of traits in genotype data of Multi-population
topic Methods
work_keys_str_mv AT xiaxiaoxuan aprismvotemethodforindividualizedriskpredictionoftraitsingenotypedataofmultipopulation
AT zhangyexian aprismvotemethodforindividualizedriskpredictionoftraitsingenotypedataofmultipopulation
AT sunrui aprismvotemethodforindividualizedriskpredictionoftraitsingenotypedataofmultipopulation
AT weiyingying aprismvotemethodforindividualizedriskpredictionoftraitsingenotypedataofmultipopulation
AT liqi aprismvotemethodforindividualizedriskpredictionoftraitsingenotypedataofmultipopulation
AT chongmarckachun aprismvotemethodforindividualizedriskpredictionoftraitsingenotypedataofmultipopulation
AT wuwilliamkakei aprismvotemethodforindividualizedriskpredictionoftraitsingenotypedataofmultipopulation
AT zeebennychungying aprismvotemethodforindividualizedriskpredictionoftraitsingenotypedataofmultipopulation
AT tanghua aprismvotemethodforindividualizedriskpredictionoftraitsingenotypedataofmultipopulation
AT wangmaggiehaitian aprismvotemethodforindividualizedriskpredictionoftraitsingenotypedataofmultipopulation
AT xiaxiaoxuan prismvotemethodforindividualizedriskpredictionoftraitsingenotypedataofmultipopulation
AT zhangyexian prismvotemethodforindividualizedriskpredictionoftraitsingenotypedataofmultipopulation
AT sunrui prismvotemethodforindividualizedriskpredictionoftraitsingenotypedataofmultipopulation
AT weiyingying prismvotemethodforindividualizedriskpredictionoftraitsingenotypedataofmultipopulation
AT liqi prismvotemethodforindividualizedriskpredictionoftraitsingenotypedataofmultipopulation
AT chongmarckachun prismvotemethodforindividualizedriskpredictionoftraitsingenotypedataofmultipopulation
AT wuwilliamkakei prismvotemethodforindividualizedriskpredictionoftraitsingenotypedataofmultipopulation
AT zeebennychungying prismvotemethodforindividualizedriskpredictionoftraitsingenotypedataofmultipopulation
AT tanghua prismvotemethodforindividualizedriskpredictionoftraitsingenotypedataofmultipopulation
AT wangmaggiehaitian prismvotemethodforindividualizedriskpredictionoftraitsingenotypedataofmultipopulation
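The description in this record says that the Prism Vote treats an individual's trait as a composite risk from subpopulations, with each individual described by a composition of subpopulation memberships. The record does not give the formula, so the following is only a minimal sketch of that idea, assuming the composite is a membership-weighted average of stratum-specific predictions (in the spirit of the law of total probability). The function name `composite_risk`, its parameters, and the example numbers are hypothetical and are not taken from the article.

```python
from typing import Sequence


def composite_risk(memberships: Sequence[float], stratum_risks: Sequence[float]) -> float:
    """Combine stratum-specific risk predictions into one individual-level risk.

    memberships[k]   : assumed weight of subpopulation k for this individual
    stratum_risks[k] : risk predicted for this individual by the model fitted in stratum k

    Assumption (not from the record): the composite is a weighted average.
    """
    if len(memberships) != len(stratum_risks):
        raise ValueError("one membership weight is needed per stratum-specific prediction")
    if any(w < 0 for w in memberships):
        raise ValueError("membership weights must be non-negative")
    total = sum(memberships)
    if total <= 0:
        raise ValueError("membership weights must not all be zero")
    # Normalise the weights so they form a composition (sum to 1),
    # then take the membership-weighted average of the stratum predictions.
    return sum((w / total) * r for w, r in zip(memberships, stratum_risks))


# Hypothetical admixed individual: 70% subpopulation A, 30% subpopulation B,
# with made-up stratum-specific risk predictions of 0.20 and 0.60.
risk = composite_risk([0.7, 0.3], [0.20, 0.60])
print(round(risk, 2))  # 0.32
```

Under this assumption, an individual who belongs entirely to one subpopulation simply receives that stratum's prediction, while an admixed individual receives a blend weighted by membership.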