
Statistical methods for cis‐Mendelian randomization with two‐sample summary‐level data

Mendelian randomization (MR) is the use of genetic variants to assess the existence of a causal relationship between a risk factor and an outcome of interest. Here, we focus on two‐sample summary‐data MR analyses with many correlated variants from a single gene region, particularly on cis‐MR studies...


Bibliographic Details
Main authors: Gkatzionis, Apostolos; Burgess, Stephen; Newcombe, Paul J.
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7614127/
https://www.ncbi.nlm.nih.gov/pubmed/36273411
http://dx.doi.org/10.1002/gepi.22506
author Gkatzionis, Apostolos
Burgess, Stephen
Newcombe, Paul J.
collection PubMed
description Mendelian randomization (MR) is the use of genetic variants to assess the existence of a causal relationship between a risk factor and an outcome of interest. Here, we focus on two‐sample summary‐data MR analyses with many correlated variants from a single gene region, particularly on cis‐MR studies which use protein expression as a risk factor. Such studies must rely on a small, curated set of variants from the studied region; using all variants in the region requires inverting an ill‐conditioned genetic correlation matrix and results in numerically unstable causal effect estimates. We review methods for variable selection and estimation in cis‐MR with summary‐level data, ranging from stepwise pruning and conditional analysis to principal components analysis, factor analysis, and Bayesian variable selection. In a simulation study, we show that the various methods have comparable performance in analyses with large sample sizes and strong genetic instruments. However, when weak instrument bias is suspected, factor analysis and Bayesian variable selection produce more reliable inferences than simple pruning approaches, which are often used in practice. We conclude by examining two case studies, assessing the effects of low‐density lipoprotein‐cholesterol and serum testosterone on coronary heart disease risk using variants in the HMGCR and SHBG gene regions, respectively.
format Online
Article
Text
id pubmed-7614127
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-7614127 2023-02-01 Genet Epidemiol Research Articles John Wiley and Sons Inc. 2022-10-23 2023-02 /pmc/articles/PMC7614127/ /pubmed/36273411 http://dx.doi.org/10.1002/gepi.22506 Text en © 2022 The Authors. Genetic Epidemiology published by Wiley Periodicals LLC.
This is an open access article under the terms of the Creative Commons Attribution 4.0 License (https://creativecommons.org/licenses/by/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
title Statistical methods for cis‐Mendelian randomization with two‐sample summary‐level data
topic Research Articles