
Efficient cross-trait penalized regression increases prediction accuracy in large cohorts using secondary phenotypes

We introduce cross-trait penalized regression (CTPR), a powerful and practical approach for multi-trait polygenic risk prediction in large cohorts. Specifically, we propose a novel cross-trait penalty function with the Lasso and the minimax concave penalty (MCP) to incorporate the shared genetic eff...


Bibliographic Details
Main authors: Chung, Wonil, Chen, Jun, Turman, Constance, Lindstrom, Sara, Zhu, Zhaozhong, Loh, Po-Ru, Kraft, Peter, Liang, Liming
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6361917/
https://www.ncbi.nlm.nih.gov/pubmed/30718517
http://dx.doi.org/10.1038/s41467-019-08535-0
author Chung, Wonil
Chen, Jun
Turman, Constance
Lindstrom, Sara
Zhu, Zhaozhong
Loh, Po-Ru
Kraft, Peter
Liang, Liming
collection PubMed
description We introduce cross-trait penalized regression (CTPR), a powerful and practical approach for multi-trait polygenic risk prediction in large cohorts. Specifically, we propose a novel cross-trait penalty function with the Lasso and the minimax concave penalty (MCP) to incorporate the shared genetic effects across multiple traits for large-sample GWAS data. Our approach extracts information from the secondary traits that is beneficial for predicting the primary trait based on individual-level genotypes and/or summary statistics. Our novel implementation of a parallel computing algorithm makes it feasible to apply our method to biobank-scale GWAS data. We illustrate our method using large-scale GWAS data (~1M SNPs) from the UK Biobank (N = 456,837). We show that our multi-trait method outperforms the recently proposed multi-trait analysis of GWAS (MTAG) in predictive performance. With UK Biobank data, the prediction accuracy for height with the aid of BMI improves from R² = 35.8% (MTAG) to 42.5% (MCP + CTPR) or 42.8% (Lasso + CTPR).
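As an illustration of the penalty terms named in the description, the following minimal Python sketch evaluates the standard Lasso and MCP penalties on a vector of SNP effect sizes. The `cross_trait_term` function is a hypothetical quadratic agreement term included only to show how effects could be tied together across traits; this record does not state the exact form of the CTPR cross-trait penalty. The function names, the default `gamma`, and the parameter names are assumptions.

```python
def lasso_penalty(beta, lam):
    """Lasso (L1) penalty: lam * sum of |beta_j|."""
    return lam * sum(abs(b) for b in beta)


def mcp_penalty(beta, lam, gamma=3.0):
    """Minimax concave penalty (MCP), summed over coefficients.

    For |b| <= gamma * lam the penalty is lam*|b| - b^2 / (2*gamma);
    beyond that it is constant at gamma * lam^2 / 2, so large effects
    are not shrunk further. The default gamma is an assumed value.
    """
    total = 0.0
    for b in beta:
        a = abs(b)
        if a <= gamma * lam:
            total += lam * a - a * a / (2.0 * gamma)
        else:
            total += 0.5 * gamma * lam * lam
    return total


def cross_trait_term(effects, lam2):
    """Hypothetical cross-trait term (an assumption, not the paper's formula).

    `effects` is a list with one entry per SNP; each entry lists that
    SNP's effect on every trait. The term adds lam2/2 times the squared
    difference between each pair of traits, so it is zero when a SNP has
    the same effect on all traits and grows as the effects diverge.
    """
    total = 0.0
    for snp in effects:
        for k in range(len(snp)):
            for l in range(k + 1, len(snp)):
                total += (snp[k] - snp[l]) ** 2
    return 0.5 * lam2 * total
```

In a penalized regression, one sparsity penalty (Lasso or MCP) and a cross-trait term of this general kind would be added to the least-squares loss, with `lam` and `lam2` chosen by cross-validation.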
format Online
Article
Text
id pubmed-6361917
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-6361917 2019-02-06 Nat Commun Article Nature Publishing Group UK 2019-02-04 /pmc/articles/PMC6361917/ /pubmed/30718517 http://dx.doi.org/10.1038/s41467-019-08535-0 Text en © The Author(s) 2019 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Efficient cross-trait penalized regression increases prediction accuracy in large cohorts using secondary phenotypes
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6361917/
https://www.ncbi.nlm.nih.gov/pubmed/30718517
http://dx.doi.org/10.1038/s41467-019-08535-0