Power and Predictive Accuracy of Polygenic Risk Scores

Polygenic scores have recently been used to summarise genetic effects among an ensemble of markers that do not individually achieve significance in a large-scale association study. Markers are selected using an initial training sample and used to construct a score in an independent replication sampl...

Bibliographic Details
Main Author: Dudbridge, Frank
Format: Online Article Text
Language: English
Published: Public Library of Science 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3605113/
https://www.ncbi.nlm.nih.gov/pubmed/23555274
http://dx.doi.org/10.1371/journal.pgen.1003348
_version_ 1782263822752415744
author Dudbridge, Frank
author_facet Dudbridge, Frank
author_sort Dudbridge, Frank
collection PubMed
description Polygenic scores have recently been used to summarise genetic effects among an ensemble of markers that do not individually achieve significance in a large-scale association study. Markers are selected using an initial training sample and used to construct a score in an independent replication sample by forming the weighted sum of associated alleles within each subject. Association between a trait and this composite score implies that a genetic signal is present among the selected markers, and the score can then be used for prediction of individual trait values. This approach has been used to obtain evidence of a genetic effect when no single markers are significant, to establish a common genetic basis for related disorders, and to construct risk prediction models. In some cases, however, the desired association or prediction has not been achieved. Here, the power and predictive accuracy of a polygenic score are derived from a quantitative genetics model as a function of the sizes of the two samples, explained genetic variance, selection thresholds for including a marker in the score, and methods for weighting effect sizes in the score. Expressions are derived for quantitative and discrete traits, the latter allowing for case/control sampling. A novel approach to estimating the variance explained by a marker panel is also proposed. It is shown that published studies with significant association of polygenic scores have been well powered, whereas those with negative results can be explained by low sample size. It is also shown that useful levels of prediction may only be approached when predictors are estimated from very large samples, up to an order of magnitude greater than currently available. Therefore, polygenic scores currently have more utility for association testing than predicting complex traits, but prediction will become more feasible as sample sizes continue to grow.
format Online
Article
Text
id pubmed-3605113
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3605113 2013-04-03 Power and Predictive Accuracy of Polygenic Risk Scores Dudbridge, Frank PLoS Genet Research Article Public Library of Science 2013-03-21 /pmc/articles/PMC3605113/ /pubmed/23555274 http://dx.doi.org/10.1371/journal.pgen.1003348 Text en © 2013 Frank Dudbridge http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Power and Predictive Accuracy of Polygenic Risk Scores
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3605113/
https://www.ncbi.nlm.nih.gov/pubmed/23555274
http://dx.doi.org/10.1371/journal.pgen.1003348
work_keys_str_mv AT dudbridgefrank powerandpredictiveaccuracyofpolygenicriskscores
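
The description in this record outlines how a polygenic score is built: markers are selected in a training sample, then combined in an independent replication sample as a weighted sum of allele counts within each subject. The following is a minimal Python sketch of that general procedure, not the derivations from the article itself. The function names, the simulated data, the use of a simple regression slope as the weight, and the z-statistic threshold are all illustrative assumptions.

```python
import math
import random


def _sums(x, y):
    """Return centred sums of squares and cross-products (sxx, syy, sxy)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxx, syy, sxy


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input has no variance."""
    sxx, syy, sxy = _sums(x, y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def train_weights(genotypes, traits, z_threshold):
    """Select markers in the training sample and return {marker: weight}.

    Assumption: each marker is tested on its own, the weight is the
    least-squares slope, and a marker is kept if |z| >= z_threshold.
    """
    n = len(traits)
    weights = {}
    for j in range(len(genotypes[0])):
        x = [g[j] for g in genotypes]
        sxx, syy, sxy = _sums(x, traits)
        if sxx == 0 or syy == 0:
            continue  # monomorphic marker or constant trait
        r = sxy / math.sqrt(sxx * syy)
        if abs(r) >= 1.0:
            continue  # degenerate perfect fit; statistic undefined
        z = r * math.sqrt((n - 2) / (1.0 - r * r))
        if abs(z) >= z_threshold:
            weights[j] = sxy / sxx
    return weights


def polygenic_score(genotype, weights):
    """Weighted sum of allele counts (0, 1 or 2) over selected markers."""
    return sum(w * genotype[j] for j, w in weights.items())


def simulate(n_subjects, effects, maf, rng):
    """Simulate allele counts and a quantitative trait (hypothetical data)."""
    genotypes, traits = [], []
    for _ in range(n_subjects):
        g = [(rng.random() < maf) + (rng.random() < maf) for _ in effects]
        y = sum(b * x for b, x in zip(effects, g)) + rng.gauss(0.0, 1.0)
        genotypes.append(g)
        traits.append(y)
    return genotypes, traits


def demo(seed=0):
    """Train on one sample, score an independent one, return the correlation."""
    rng = random.Random(seed)
    effects = [0.3] * 10 + [0.0] * 40  # 10 causal markers, 40 null markers
    train_g, train_y = simulate(500, effects, 0.3, rng)
    rep_g, rep_y = simulate(300, effects, 0.3, rng)
    weights = train_weights(train_g, train_y, z_threshold=2.0)
    scores = [polygenic_score(g, weights) for g in rep_g]
    return pearson(scores, rep_y)


if __name__ == "__main__":
    print("score-trait correlation in replication sample:", demo())
```

Changing the training sample size or `z_threshold` in `demo` gives an informal feel for the dependence on sample size and selection threshold that the description mentions; the article derives these relationships analytically rather than by simulation.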