Jackknife Model Averaging Prediction Methods for Complex Phenotypes with Gene Expression Levels by Integrating External Pathway Information

MOTIVATION: In the past few years, many prediction approaches have been proposed and widely applied to high-dimensional genetic data for disease risk evaluation. However, during model fitting those approaches typically ignore the important group structures that naturally exist in genetic data. METHODS:...

Bibliographic Details
Main Authors: Yu, Xinghao, Xiao, Lishun, Zeng, Ping, Huang, Shuiping
Format: Online Article Text
Language: English
Published: Hindawi 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6476151/
https://www.ncbi.nlm.nih.gov/pubmed/31089389
http://dx.doi.org/10.1155/2019/2807470
_version_ 1783412857869697024
author Yu, Xinghao
Xiao, Lishun
Zeng, Ping
Huang, Shuiping
author_facet Yu, Xinghao
Xiao, Lishun
Zeng, Ping
Huang, Shuiping
author_sort Yu, Xinghao
collection PubMed
description MOTIVATION: In the past few years, many prediction approaches have been proposed and widely applied to high-dimensional genetic data for disease risk evaluation. However, during model fitting those approaches typically ignore the important group structures that naturally exist in genetic data. METHODS: In the present study, we applied a novel model-averaging approach, called jackknife model averaging prediction (JMAP), to high-dimensional genetic risk prediction while incorporating pathway information into the model specification. JMAP selects the optimal weights across candidate models by minimizing a jackknife (leave-one-out) cross-validation criterion. Unlike previous approaches, JMAP allows each model weight to vary from 0 to 1 without requiring the weights to sum to one. We evaluated the performance of JMAP in extensive simulation studies and compared it with existing methods. Finally, we applied JMAP to four real cancer datasets that are publicly available from TCGA. RESULTS: The simulations showed that, compared with other existing approaches (e.g., gsslasso), JMAP performed best or was among the best methods across a range of scenarios. For example, in 14 of the 16 simulation settings with PVE = 0.3, the prediction accuracy of JMAP was on average 0.075 higher than that of gsslasso. We further found that, in the simulations, the model weights for the true candidate models were much less likely to be zero than those for the null candidate models and were substantially greater in magnitude. In the real data application, JMAP also performed comparably to or better than the other methods for continuous phenotypes. For example, for the COAD, CRC, and PAAD datasets, the average gains in predictive accuracy of JMAP over gsslasso were 0.019, 0.064, and 0.052, respectively.
CONCLUSION: The proposed method, JMAP, is a novel model-averaging approach for high-dimensional genetic risk prediction that incorporates useful external group structures into the model specification.
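The weight-selection step summarized in the description above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each pathway defines one candidate model, that each candidate is a ridge regression without intercept (an illustrative choice, picked because its leave-one-out predictions have an exact closed form), and that the box-constrained least-squares problem is solved by projected gradient descent. The function names `loo_predictions` and `jackknife_weights`, the ridge penalty, and the simulated data are all hypothetical.

```python
import numpy as np

def loo_predictions(X, y, ridge=1.0):
    """Leave-one-out (jackknife) predictions for one ridge candidate model."""
    p = X.shape[1]
    # Hat matrix H = X (X'X + ridge * I)^{-1} X'
    H = X @ np.linalg.solve(X.T @ X + ridge * np.eye(p), X.T)
    fitted = H @ y
    h = np.diag(H)
    # Exact identity for ridge with a fixed penalty and no intercept:
    # y_i - yhat_(-i) = (y_i - yhat_i) / (1 - h_ii)
    return y - (y - fitted) / (1.0 - h)

def jackknife_weights(loo_matrix, y, n_iter=5000):
    """Minimize ||y - P w||^2 subject to 0 <= w_m <= 1.

    The weights are NOT forced to sum to one. Solved by projected
    gradient descent, starting from equal weights.
    """
    P = loo_matrix
    n_models = P.shape[1]
    w = np.full(n_models, 1.0 / n_models)
    # Step 1/L, where L is the squared spectral norm of P
    # (Lipschitz constant of the gradient of 0.5 * ||y - P w||^2).
    step = 1.0 / np.linalg.norm(P, 2) ** 2
    for _ in range(n_iter):
        grad = P.T @ (P @ w - y)
        w = np.clip(w - step * grad, 0.0, 1.0)  # project onto the box [0, 1]^M
    return w

# Simulated example (assumption): 4 pathways of 5 genes each,
# with only the first two pathways affecting the phenotype.
rng = np.random.default_rng(0)
n, genes_per_pathway, n_pathways = 200, 5, 4
pathways = [rng.standard_normal((n, genes_per_pathway)) for _ in range(n_pathways)]
effect = np.ones(genes_per_pathway)
y = pathways[0] @ effect + pathways[1] @ effect + rng.standard_normal(n)

# One column of jackknife predictions per candidate (pathway) model
loo = np.column_stack([loo_predictions(X, y) for X in pathways])
weights = jackknife_weights(loo, y)
print(weights)
```

Dropping the sum-to-one constraint turns the problem into a plain box-constrained least-squares fit, which is why a simple clip serves as the projection step here; a general-purpose bounded least-squares solver would work equally well.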
format Online
Article
Text
id pubmed-6476151
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-6476151 2019-05-14 Jackknife Model Averaging Prediction Methods for Complex Phenotypes with Gene Expression Levels by Integrating External Pathway Information Yu, Xinghao Xiao, Lishun Zeng, Ping Huang, Shuiping Comput Math Methods Med Research Article MOTIVATION: In the past few years, many prediction approaches have been proposed and widely applied to high-dimensional genetic data for disease risk evaluation. However, during model fitting those approaches typically ignore the important group structures that naturally exist in genetic data. METHODS: In the present study, we applied a novel model-averaging approach, called jackknife model averaging prediction (JMAP), to high-dimensional genetic risk prediction while incorporating pathway information into the model specification. JMAP selects the optimal weights across candidate models by minimizing a jackknife (leave-one-out) cross-validation criterion. Unlike previous approaches, JMAP allows each model weight to vary from 0 to 1 without requiring the weights to sum to one. We evaluated the performance of JMAP in extensive simulation studies and compared it with existing methods. Finally, we applied JMAP to four real cancer datasets that are publicly available from TCGA. RESULTS: The simulations showed that, compared with other existing approaches (e.g., gsslasso), JMAP performed best or was among the best methods across a range of scenarios. For example, in 14 of the 16 simulation settings with PVE = 0.3, the prediction accuracy of JMAP was on average 0.075 higher than that of gsslasso. We further found that, in the simulations, the model weights for the true candidate models were much less likely to be zero than those for the null candidate models and were substantially greater in magnitude. In the real data application, JMAP also performed comparably to or better than the other methods for continuous phenotypes. For example, for the COAD, CRC, and PAAD datasets, the average gains in predictive accuracy of JMAP over gsslasso were 0.019, 0.064, and 0.052, respectively. CONCLUSION: The proposed method, JMAP, is a novel model-averaging approach for high-dimensional genetic risk prediction that incorporates useful external group structures into the model specification. Hindawi 2019-04-08 /pmc/articles/PMC6476151/ /pubmed/31089389 http://dx.doi.org/10.1155/2019/2807470 Text en Copyright © 2019 Xinghao Yu et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Yu, Xinghao
Xiao, Lishun
Zeng, Ping
Huang, Shuiping
Jackknife Model Averaging Prediction Methods for Complex Phenotypes with Gene Expression Levels by Integrating External Pathway Information
title Jackknife Model Averaging Prediction Methods for Complex Phenotypes with Gene Expression Levels by Integrating External Pathway Information
title_full Jackknife Model Averaging Prediction Methods for Complex Phenotypes with Gene Expression Levels by Integrating External Pathway Information
title_fullStr Jackknife Model Averaging Prediction Methods for Complex Phenotypes with Gene Expression Levels by Integrating External Pathway Information
title_full_unstemmed Jackknife Model Averaging Prediction Methods for Complex Phenotypes with Gene Expression Levels by Integrating External Pathway Information
title_short Jackknife Model Averaging Prediction Methods for Complex Phenotypes with Gene Expression Levels by Integrating External Pathway Information
title_sort jackknife model averaging prediction methods for complex phenotypes with gene expression levels by integrating external pathway information
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6476151/
https://www.ncbi.nlm.nih.gov/pubmed/31089389
http://dx.doi.org/10.1155/2019/2807470
work_keys_str_mv AT yuxinghao jackknifemodelaveragingpredictionmethodsforcomplexphenotypeswithgeneexpressionlevelsbyintegratingexternalpathwayinformation
AT xiaolishun jackknifemodelaveragingpredictionmethodsforcomplexphenotypeswithgeneexpressionlevelsbyintegratingexternalpathwayinformation
AT zengping jackknifemodelaveragingpredictionmethodsforcomplexphenotypeswithgeneexpressionlevelsbyintegratingexternalpathwayinformation
AT huangshuiping jackknifemodelaveragingpredictionmethodsforcomplexphenotypeswithgeneexpressionlevelsbyintegratingexternalpathwayinformation