
Estimating the predictive ability of genetic risk models in simulated data based on published results from genome-wide association studies

Background: There is increasing interest in investigating genetic risk models in empirical studies, but such studies are premature when the expected predictive ability of the risk model is low. We assessed how accurately the predictive ability of genetic risk models can be estimated in simulated dat...

Bibliographic Details
Main authors: Kundu, Suman; Mihaescu, Raluca; Meijer, Catherina M. C.; Bakker, Rachel; Janssens, A. Cecile J. W.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2014
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4056181/
https://www.ncbi.nlm.nih.gov/pubmed/24982668
http://dx.doi.org/10.3389/fgene.2014.00179
author Kundu, Suman
Mihaescu, Raluca
Meijer, Catherina M. C.
Bakker, Rachel
Janssens, A. Cecile J. W.
collection PubMed
description Background: There is increasing interest in investigating genetic risk models in empirical studies, but such studies are premature when the expected predictive ability of the risk model is low. We assessed how accurately the predictive ability of genetic risk models can be estimated in simulated data that are created based on the odds ratios (ORs) and frequencies of single-nucleotide polymorphisms (SNPs) obtained from genome-wide association studies (GWASs). Methods: We aimed to replicate published prediction studies that reported the area under the receiver operating characteristic curve (AUC) as a measure of predictive ability. We searched GWAS articles for all SNPs included in these models and extracted ORs and risk allele frequencies to construct genotypes and disease status for a hypothetical population. Using these hypothetical data, we reconstructed the published genetic risk models and compared their AUC values to those reported in the original articles. Results: The accuracy of the AUC values varied with the method used for the construction of the risk models. When logistic regression analysis was used to construct the genetic risk model, AUC values estimated by the simulation method were similar to the published values with a median absolute difference of 0.02 [range: 0.00, 0.04]. This difference was 0.03 [range: 0.01, 0.06] and 0.05 [range: 0.01, 0.08] for unweighted and weighted risk scores. Conclusions: The predictive ability of genetic risk models can be estimated using simulated data based on results from GWASs. Simulation methods can be useful to estimate the predictive ability in the absence of empirical data and to decide whether empirical investigation of genetic risk models is warranted.
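The simulation idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the SNP frequencies, odds ratios, baseline risk, and population size are hypothetical values, and the sketch assumes Hardy-Weinberg proportions, independent SNPs, and per-allele ORs that combine additively on the log-odds scale. The function names `simulate_population` and `auc` are illustrative only.

```python
import math
import random


def simulate_population(snps, n, baseline_risk, rng):
    """Simulate genotypes and disease status for n hypothetical individuals.

    snps: list of (risk_allele_frequency, per_allele_odds_ratio) pairs.
    Assumes Hardy-Weinberg proportions and independent SNPs (an assumption
    of this sketch, not a claim about the published method).
    """
    base_logit = math.log(baseline_risk / (1.0 - baseline_risk))
    genotypes, status = [], []
    for _ in range(n):
        # Two independent allele draws per SNP give 0, 1, or 2 risk alleles.
        g = [(rng.random() < f) + (rng.random() < f) for f, _ in snps]
        logit = base_logit + sum(
            count * math.log(odds) for count, (_, odds) in zip(g, snps)
        )
        risk = 1.0 / (1.0 + math.exp(-logit))
        genotypes.append(g)
        status.append(1 if rng.random() < risk else 0)
    return genotypes, status


def auc(scores, status):
    """Area under the ROC curve via the Mann-Whitney U statistic.

    Tied scores receive the average of the ranks they span.
    """
    n_pos = sum(status)
    n_neg = len(status) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one case and one non-case")
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    rank_sum = sum(r for r, d in zip(ranks, status) if d == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


# Hypothetical inputs: (risk allele frequency, per-allele OR) for four SNPs.
snps = [(0.30, 1.40), (0.15, 1.25), (0.45, 1.10), (0.20, 1.30)]
rng = random.Random(0)
genotypes, status = simulate_population(snps, 20000, 0.10, rng)

# Unweighted score: count of risk alleles. Weighted score: log(OR)-weighted sum.
unweighted = [sum(g) for g in genotypes]
weighted = [
    sum(count * math.log(odds) for count, (_, odds) in zip(g, snps))
    for g in genotypes
]
print("AUC, unweighted score:", round(auc(unweighted, status), 3))
print("AUC, weighted score:", round(auc(weighted, status), 3))
```

In this sketch the risk scores are computed directly from the assumed ORs rather than refitted by logistic regression, so it covers only the unweighted and weighted score variants mentioned in the abstract.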
format Online
Article
Text
id pubmed-4056181
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-4056181 2014-06-30 Front Genet Genetics Frontiers Media S.A. 2014-06-13 /pmc/articles/PMC4056181/ /pubmed/24982668 http://dx.doi.org/10.3389/fgene.2014.00179 Text en Copyright © 2014 Kundu, Mihaescu, Meijer, Bakker and Janssens. http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Estimating the predictive ability of genetic risk models in simulated data based on published results from genome-wide association studies
topic Genetics