
Fast estimation of genetic correlation for biobank-scale data

Genetic correlation is an important parameter in efforts to understand the relationships among complex traits. Current methods that analyze individual genotype data for estimating genetic correlation are challenging to scale to large datasets. Methods that analyze summary data, while being computati...


Bibliographic Details
Main authors: Wu, Yue; Burch, Kathryn S.; Ganna, Andrea; Pajukanta, Päivi; Pasaniuc, Bogdan; Sankararaman, Sriram
Format: Online Article Text
Language: English
Published: Elsevier 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8764132/
https://www.ncbi.nlm.nih.gov/pubmed/34861179
http://dx.doi.org/10.1016/j.ajhg.2021.11.015
_version_ 1784634097260298240
author Wu, Yue
Burch, Kathryn S.
Ganna, Andrea
Pajukanta, Päivi
Pasaniuc, Bogdan
Sankararaman, Sriram
author_sort Wu, Yue
collection PubMed
description Genetic correlation is an important parameter in efforts to understand the relationships among complex traits. Current methods that analyze individual genotype data for estimating genetic correlation are challenging to scale to large datasets. Methods that analyze summary data, while being computationally efficient, tend to yield estimates of genetic correlation with reduced precision. We propose SCORE (scalable genetic correlation estimator), a randomized method of moments estimator of genetic correlation that is both scalable and accurate. SCORE obtains more precise estimates of genetic correlations relative to summary-statistic methods that can be applied at scale; it achieves a [Formula: see text] reduction in standard error relative to LD-score regression (LDSC) and a [Formula: see text] reduction relative to high-definition likelihood (HDL) (averaged over all simulations). The efficiency of SCORE enables computation of genetic correlations on the UK Biobank dataset, consisting of [Formula: see text] K individuals and [Formula: see text] K SNPs, in a few hours (orders of magnitude faster than methods that analyze individual data, such as GCTA). Across 780 pairs of traits in [Formula: see text] unrelated white British individuals in the UK Biobank, SCORE identifies significant genetic correlation between 200 additional pairs of traits over LDSC (beyond the 245 pairs identified by both).
format Online
Article
Text
id pubmed-8764132
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-8764132 2022-01-20 Fast estimation of genetic correlation for biobank-scale data Wu, Yue Burch, Kathryn S. Ganna, Andrea Pajukanta, Päivi Pasaniuc, Bogdan Sankararaman, Sriram Am J Hum Genet Article Genetic correlation is an important parameter in efforts to understand the relationships among complex traits. Current methods that analyze individual genotype data for estimating genetic correlation are challenging to scale to large datasets. Methods that analyze summary data, while being computationally efficient, tend to yield estimates of genetic correlation with reduced precision. We propose SCORE (scalable genetic correlation estimator), a randomized method of moments estimator of genetic correlation that is both scalable and accurate. SCORE obtains more precise estimates of genetic correlations relative to summary-statistic methods that can be applied at scale; it achieves a [Formula: see text] reduction in standard error relative to LD-score regression (LDSC) and a [Formula: see text] reduction relative to high-definition likelihood (HDL) (averaged over all simulations). The efficiency of SCORE enables computation of genetic correlations on the UK Biobank dataset, consisting of [Formula: see text] K individuals and [Formula: see text] K SNPs, in a few hours (orders of magnitude faster than methods that analyze individual data, such as GCTA). Across 780 pairs of traits in [Formula: see text] unrelated white British individuals in the UK Biobank, SCORE identifies significant genetic correlation between 200 additional pairs of traits over LDSC (beyond the 245 pairs identified by both). Elsevier 2022-01-06 2021-12-02 /pmc/articles/PMC8764132/ /pubmed/34861179 http://dx.doi.org/10.1016/j.ajhg.2021.11.015 Text en © 2021 The Author(s) https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
title Fast estimation of genetic correlation for biobank-scale data
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8764132/
https://www.ncbi.nlm.nih.gov/pubmed/34861179
http://dx.doi.org/10.1016/j.ajhg.2021.11.015