Estimating the predictive ability of genetic risk models in simulated data based on published results from genome-wide association studies
Background: There is increasing interest in investigating genetic risk models in empirical studies, but such studies are premature when the expected predictive ability of the risk model is low. We assessed how accurately the predictive ability of genetic risk models can be estimated in simulated dat...
Main authors: | Kundu, Suman; Mihaescu, Raluca; Meijer, Catherina M. C.; Bakker, Rachel; Janssens, A. Cecile J. W. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2014 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4056181/ https://www.ncbi.nlm.nih.gov/pubmed/24982668 http://dx.doi.org/10.3389/fgene.2014.00179 |
_version_ | 1782320799194021888 |
---|---|
author | Kundu, Suman; Mihaescu, Raluca; Meijer, Catherina M. C.; Bakker, Rachel; Janssens, A. Cecile J. W. |
author_sort | Kundu, Suman |
collection | PubMed |
description | Background: There is increasing interest in investigating genetic risk models in empirical studies, but such studies are premature when the expected predictive ability of the risk model is low. We assessed how accurately the predictive ability of genetic risk models can be estimated in simulated data that are created based on the odds ratios (ORs) and frequencies of single-nucleotide polymorphisms (SNPs) obtained from genome-wide association studies (GWASs). Methods: We aimed to replicate published prediction studies that reported the area under the receiver operating characteristic curve (AUC) as a measure of predictive ability. We searched GWAS articles for all SNPs included in these models and extracted ORs and risk allele frequencies to construct genotypes and disease status for a hypothetical population. Using these hypothetical data, we reconstructed the published genetic risk models and compared their AUC values to those reported in the original articles. Results: The accuracy of the AUC values varied with the method used for the construction of the risk models. When logistic regression analysis was used to construct the genetic risk model, AUC values estimated by the simulation method were similar to the published values, with a median absolute difference of 0.02 [range: 0.00, 0.04]. This difference was 0.03 [range: 0.01, 0.06] for unweighted risk scores and 0.05 [range: 0.01, 0.08] for weighted risk scores. Conclusions: The predictive ability of genetic risk models can be estimated using simulated data based on results from GWASs. Simulation methods can be useful to estimate the predictive ability in the absence of empirical data and to decide whether empirical investigation of genetic risk models is warranted. |
format | Online Article Text |
id | pubmed-4056181 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-4056181 2014-06-30 Front Genet Genetics Frontiers Media S.A. 2014-06-13 /pmc/articles/PMC4056181/ /pubmed/24982668 http://dx.doi.org/10.3389/fgene.2014.00179 Text en Copyright © 2014 Kundu, Mihaescu, Meijer, Bakker and Janssens. http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | Estimating the predictive ability of genetic risk models in simulated data based on published results from genome-wide association studies |
topic | Genetics |
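
The simulation approach summarized in the description (construct genotypes and disease status from published ORs and risk allele frequencies, then score individuals and compute the AUC) can be illustrated with a minimal sketch. This is not the authors' procedure: it assumes independent SNPs in Hardy-Weinberg equilibrium, multiplicative per-allele ORs, and a hypothetical `baseline_risk` for a person carrying no risk alleles. The SNP values in `EXAMPLE_SNPS` and all function names are made up for illustration, and the logistic regression variant is not fitted here; only unweighted and weighted risk scores are shown.

```python
import math
import random


def simulate_population(snps, baseline_risk, n, rng):
    """Simulate genotypes and disease status.

    snps: list of (risk_allele_frequency, per_allele_odds_ratio) pairs.
    baseline_risk: assumed disease risk with zero risk alleles (hypothetical).
    """
    intercept = math.log(baseline_risk / (1 - baseline_risk))
    people = []
    for _ in range(n):
        # Two independent allele draws per SNP (Hardy-Weinberg assumption).
        genotype = [sum(rng.random() < freq for _ in range(2)) for freq, _ in snps]
        # Multiplicative ORs are additive on the log-odds scale.
        log_odds = intercept + sum(
            g * math.log(odds_ratio) for g, (_, odds_ratio) in zip(genotype, snps)
        )
        risk = 1 / (1 + math.exp(-log_odds))
        people.append((genotype, rng.random() < risk))
    return people


def auc(scores, labels):
    """AUC from the rank-sum (Mann-Whitney) statistic, with average ranks for ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, y in zip(ranks, labels) if y)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def risk_score_aucs(snps, people):
    """Return (unweighted AUC, weighted AUC) for the simulated population."""
    labels = [diseased for _, diseased in people]
    # Unweighted score: count of risk alleles.
    unweighted = [sum(genotype) for genotype, _ in people]
    # Weighted score: risk alleles weighted by log(OR).
    weighted = [
        sum(g * math.log(odds_ratio) for g, (_, odds_ratio) in zip(genotype, snps))
        for genotype, _ in people
    ]
    return auc(unweighted, labels), auc(weighted, labels)


# Hypothetical (frequency, OR) pairs, not taken from any GWAS.
EXAMPLE_SNPS = [(0.30, 1.20), (0.10, 1.50), (0.45, 1.10), (0.20, 1.30)]

if __name__ == "__main__":
    rng = random.Random(1)
    population = simulate_population(EXAMPLE_SNPS, baseline_risk=0.10, n=20000, rng=rng)
    auc_unweighted, auc_weighted = risk_score_aucs(EXAMPLE_SNPS, population)
    print(f"unweighted AUC: {auc_unweighted:.3f}")
    print(f"weighted AUC:   {auc_weighted:.3f}")
```

In a comparison like the one the description reports, the simulated AUC for each model would be set against the published AUC and the absolute differences summarized by their median and range.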