Evaluation of polygenic prediction methodology within a reference-standardized framework

The predictive utility of polygenic scores is increasing, and many polygenic scoring methods are available, but it is unclear which method performs best. This study evaluates the predictive utility of polygenic scoring methods within a reference-standardized framework, which uses a common set of var...

Bibliographic Details
Main authors: Pain, Oliver; Glanville, Kylie P.; Hagenaars, Saskia P.; Selzam, Saskia; Fürtjes, Anna E.; Gaspar, Héléna A.; Coleman, Jonathan R. I.; Rimfeld, Kaili; Breen, Gerome; Plomin, Robert; Folkersen, Lasse; Lewis, Cathryn M.
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8121285/
https://www.ncbi.nlm.nih.gov/pubmed/33945532
http://dx.doi.org/10.1371/journal.pgen.1009021
author Pain, Oliver
Glanville, Kylie P.
Hagenaars, Saskia P.
Selzam, Saskia
Fürtjes, Anna E.
Gaspar, Héléna A.
Coleman, Jonathan R. I.
Rimfeld, Kaili
Breen, Gerome
Plomin, Robert
Folkersen, Lasse
Lewis, Cathryn M.
collection PubMed
description The predictive utility of polygenic scores is increasing, and many polygenic scoring methods are available, but it is unclear which method performs best. This study evaluates the predictive utility of polygenic scoring methods within a reference-standardized framework, which uses a common set of variants and reference-based estimates of linkage disequilibrium and allele frequencies to construct scores. Eight polygenic score methods were tested: p-value thresholding and clumping (pT+clump), SBLUP, lassosum, LDpred1, LDpred2, PRScs, DBSLMM and SBayesR, evaluating their performance to predict outcomes in UK Biobank and the Twins Early Development Study (TEDS). Strategies to identify optimal p-value thresholds and shrinkage parameters were compared, including 10-fold cross validation, pseudovalidation and infinitesimal models (with no validation sample), and multi-polygenic score elastic net models. LDpred2, lassosum and PRScs performed strongly using 10-fold cross-validation to identify the most predictive p-value threshold or shrinkage parameter, giving a relative improvement of 16–18% over pT+clump in the correlation between observed and predicted outcome values. Using pseudovalidation, the best methods were PRScs, DBSLMM and SBayesR. PRScs pseudovalidation was only 3% worse than the best polygenic score identified by 10-fold cross validation. Elastic net models containing polygenic scores based on a range of parameters consistently improved prediction over any single polygenic score. Within a reference-standardized framework, the best polygenic prediction was achieved using LDpred2, lassosum and PRScs, modeling multiple polygenic scores derived using multiple parameters. This study will help researchers performing polygenic score studies to select the most powerful and predictive analysis methods.
format Online
Article
Text
id pubmed-8121285
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8121285 2021-05-24 Evaluation of polygenic prediction methodology within a reference-standardized framework Pain, Oliver Glanville, Kylie P. Hagenaars, Saskia P. Selzam, Saskia Fürtjes, Anna E. Gaspar, Héléna A. Coleman, Jonathan R. I. Rimfeld, Kaili Breen, Gerome Plomin, Robert Folkersen, Lasse Lewis, Cathryn M. PLoS Genet Research Article The predictive utility of polygenic scores is increasing, and many polygenic scoring methods are available, but it is unclear which method performs best. This study evaluates the predictive utility of polygenic scoring methods within a reference-standardized framework, which uses a common set of variants and reference-based estimates of linkage disequilibrium and allele frequencies to construct scores. Eight polygenic score methods were tested: p-value thresholding and clumping (pT+clump), SBLUP, lassosum, LDpred1, LDpred2, PRScs, DBSLMM and SBayesR, evaluating their performance to predict outcomes in UK Biobank and the Twins Early Development Study (TEDS). Strategies to identify optimal p-value thresholds and shrinkage parameters were compared, including 10-fold cross validation, pseudovalidation and infinitesimal models (with no validation sample), and multi-polygenic score elastic net models. LDpred2, lassosum and PRScs performed strongly using 10-fold cross-validation to identify the most predictive p-value threshold or shrinkage parameter, giving a relative improvement of 16–18% over pT+clump in the correlation between observed and predicted outcome values. Using pseudovalidation, the best methods were PRScs, DBSLMM and SBayesR. PRScs pseudovalidation was only 3% worse than the best polygenic score identified by 10-fold cross validation. Elastic net models containing polygenic scores based on a range of parameters consistently improved prediction over any single polygenic score. Within a reference-standardized framework, the best polygenic prediction was achieved using LDpred2, lassosum and PRScs, modeling multiple polygenic scores derived using multiple parameters. This study will help researchers performing polygenic score studies to select the most powerful and predictive analysis methods. Public Library of Science 2021-05-04 /pmc/articles/PMC8121285/ /pubmed/33945532 http://dx.doi.org/10.1371/journal.pgen.1009021 Text en © 2021 Pain et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Evaluation of polygenic prediction methodology within a reference-standardized framework
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8121285/
https://www.ncbi.nlm.nih.gov/pubmed/33945532
http://dx.doi.org/10.1371/journal.pgen.1009021