
Risk Prediction Modeling of Sequencing Data Using a Forward Random Field Method

With the advance in high-throughput sequencing technology, it is feasible to investigate the role of common and rare variants in disease risk prediction. While the new technology holds great promise to improve disease prediction, the massive amount of data and low frequency of rare variants pose gre...


Bibliographic Details
Main Authors: Wen, Yalu; He, Zihuai; Li, Ming; Lu, Qing
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4759688/
https://www.ncbi.nlm.nih.gov/pubmed/26892725
http://dx.doi.org/10.1038/srep21120
_version_ 1782416766473863168
author Wen, Yalu
He, Zihuai
Li, Ming
Lu, Qing
collection PubMed
description With the advance in high-throughput sequencing technology, it is feasible to investigate the role of common and rare variants in disease risk prediction. While the new technology holds great promise to improve disease prediction, the massive amount of data and low frequency of rare variants pose great analytical challenges on risk prediction modeling. In this paper, we develop a forward random field method (FRF) for risk prediction modeling using sequencing data. In FRF, subjects’ phenotypes are treated as stochastic realizations of a random field on a genetic space formed by subjects’ genotypes, and an individual’s phenotype can be predicted by adjacent subjects with similar genotypes. The FRF method allows for multiple similarity measures and candidate genes in the model, and adaptively chooses the optimal similarity measure and disease-associated genes to reflect the underlying disease model. It also avoids the specification of the threshold of rare variants and allows for different directions and magnitudes of genetic effects. Through simulations, we demonstrate the FRF method attains higher or comparable accuracy over commonly used support vector machine based methods under various disease models. We further illustrate the FRF method with an application to the sequencing data obtained from the Dallas Heart Study.
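The core idea in the description, that an individual's phenotype can be predicted from subjects with similar genotypes, can be illustrated with a small sketch. This is not the authors' FRF implementation: it omits the forward selection of similarity measures and genes, and it assumes a single identity-by-state (IBS) similarity on genotypes coded as 0/1/2 minor-allele counts. The function names `ibs_similarity` and `predict_phenotype` are hypothetical and chosen only for this example.

```python
from typing import Sequence


def ibs_similarity(g1: Sequence[int], g2: Sequence[int]) -> float:
    """Identity-by-state similarity in [0, 1] for genotypes coded 0/1/2.

    Assumption: IBS is only one possible similarity measure; the record
    says FRF allows several and chooses among them adaptively.
    """
    if len(g1) != len(g2) or len(g1) == 0:
        raise ValueError("genotype vectors must be non-empty and of equal length")
    # Each variant contributes 2 when genotypes match and 0 when they differ by 2.
    return sum(2 - abs(a - b) for a, b in zip(g1, g2)) / (2 * len(g1))


def predict_phenotype(
    new_genotype: Sequence[int],
    train_genotypes: Sequence[Sequence[int]],
    train_phenotypes: Sequence[float],
) -> float:
    """Predict a phenotype as the training mean plus a similarity-weighted
    average of the training subjects' deviations from that mean.

    This is a simplified neighbour-weighting scheme, not the FRF estimator.
    """
    if len(train_genotypes) != len(train_phenotypes) or not train_phenotypes:
        raise ValueError("training genotypes and phenotypes must align")
    mu = sum(train_phenotypes) / len(train_phenotypes)
    weights = [ibs_similarity(new_genotype, g) for g in train_genotypes]
    total = sum(weights)
    if total == 0:
        # No training subject shares any allele state; fall back to the mean.
        return mu
    deviation = sum(w * (y - mu) for w, y in zip(weights, train_phenotypes))
    return mu + deviation / total


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    genotypes = [[0, 0, 0], [2, 2, 2]]
    phenotypes = [0.0, 1.0]
    print(predict_phenotype([1, 1, 1], genotypes, phenotypes))
```

In this sketch a subject whose genotype matches one training subject exactly is pulled toward that subject's phenotype, while a subject equally similar to all training subjects receives the training mean.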
format Online
Article
Text
id pubmed-4759688
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-47596882016-02-29 Risk Prediction Modeling of Sequencing Data Using a Forward Random Field Method Wen, Yalu He, Zihuai Li, Ming Lu, Qing Sci Rep Article With the advance in high-throughput sequencing technology, it is feasible to investigate the role of common and rare variants in disease risk prediction. While the new technology holds great promise to improve disease prediction, the massive amount of data and low frequency of rare variants pose great analytical challenges on risk prediction modeling. In this paper, we develop a forward random field method (FRF) for risk prediction modeling using sequencing data. In FRF, subjects’ phenotypes are treated as stochastic realizations of a random field on a genetic space formed by subjects’ genotypes, and an individual’s phenotype can be predicted by adjacent subjects with similar genotypes. The FRF method allows for multiple similarity measures and candidate genes in the model, and adaptively chooses the optimal similarity measure and disease-associated genes to reflect the underlying disease model. It also avoids the specification of the threshold of rare variants and allows for different directions and magnitudes of genetic effects. Through simulations, we demonstrate the FRF method attains higher or comparable accuracy over commonly used support vector machine based methods under various disease models. We further illustrate the FRF method with an application to the sequencing data obtained from the Dallas Heart Study. Nature Publishing Group 2016-02-19 /pmc/articles/PMC4759688/ /pubmed/26892725 http://dx.doi.org/10.1038/srep21120 Text en Copyright © 2016, Macmillan Publishers Limited http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License. 
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
title Risk Prediction Modeling of Sequencing Data Using a Forward Random Field Method
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4759688/
https://www.ncbi.nlm.nih.gov/pubmed/26892725
http://dx.doi.org/10.1038/srep21120