
A flexible and parallelizable approach to genome‐wide polygenic risk scores

The heritability of most complex traits is driven by variants throughout the genome. Consequently, polygenic risk scores, which combine information on multiple variants genome‐wide, have demonstrated improved accuracy in genetic risk prediction. We present a new two‐step approach to constructing gen...


Bibliographic Details
Main authors: Newcombe, Paul J., Nelson, Christopher P., Samani, Nilesh J., Dudbridge, Frank
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6764842/
https://www.ncbi.nlm.nih.gov/pubmed/31328830
http://dx.doi.org/10.1002/gepi.22245
_version_ 1783454458150125568
author Newcombe, Paul J.
Nelson, Christopher P.
Samani, Nilesh J.
Dudbridge, Frank
collection PubMed
description The heritability of most complex traits is driven by variants throughout the genome. Consequently, polygenic risk scores, which combine information on multiple variants genome‐wide, have demonstrated improved accuracy in genetic risk prediction. We present a new two‐step approach to constructing genome‐wide polygenic risk scores from meta‐GWAS summary statistics. Local linkage disequilibrium (LD) is adjusted for in Step 1, followed by, uniquely, long‐range LD in Step 2. Our algorithm is highly parallelizable, since block‐wise analyses in Step 1 can be distributed across a high‐performance computing cluster, and flexible, since sparsity and heritability are estimated within each block. Inference is obtained through a formal Bayesian variable selection framework, meaning final risk predictions are averaged over competing models. We compared our method to two alternative approaches, LDPred and lassosum, using all seven traits in the Wellcome Trust Case Control Consortium, as well as meta‐GWAS summaries for type 1 diabetes (T1D), coronary artery disease, and schizophrenia. Performance was generally similar across methods, although our framework provided more accurate predictions for T1D, for which there are multiple heterogeneous signals in regions of both short‐ and long‐range LD. With sufficient compute resources, our method also allows the fastest runtimes.
format Online
Article
Text
id pubmed-6764842
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-6764842 2019-10-01 Genet Epidemiol Research Articles John Wiley and Sons Inc. 2019-07-22 2019-10 /pmc/articles/PMC6764842/ /pubmed/31328830 http://dx.doi.org/10.1002/gepi.22245 Text en © 2019 The Authors. Genetic Epidemiology Published by Wiley Periodicals, Inc.
This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
title A flexible and parallelizable approach to genome‐wide polygenic risk scores
topic Research Articles