Design and quality control of large-scale two-sample Mendelian randomization studies

BACKGROUND: Mendelian randomization (MR) studies are susceptible to metadata errors (e.g. incorrect specification of the effect allele column) and other analytical issues that can introduce substantial bias into analyses. We developed a quality control (QC) pipeline for the Fatty Acids in Cancer Men...

Bibliographic Details
Main authors: Haycock, Philip C, Borges, Maria Carolina, Burrows, Kimberley, Lemaitre, Rozenn N, Harrison, Sean, Burgess, Stephen, Chang, Xuling, Westra, Jason, Khankari, Nikhil K, Tsilidis, Kostas K, Gaunt, Tom, Hemani, Gibran, Zheng, Jie, Truong, Therese, O’Mara, Tracy A, Spurdle, Amanda B, Law, Matthew H, Slager, Susan L, Birmann, Brenda M, Saberi Hosnijeh, Fatemeh, Mariosa, Daniela, Amos, Christopher I, Hung, Rayjean J, Zheng, Wei, Gunter, Marc J, Davey Smith, George, Relton, Caroline, Martin, Richard M
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10555669/
http://dx.doi.org/10.1093/ije/dyad018
author Haycock, Philip C
Borges, Maria Carolina
Burrows, Kimberley
Lemaitre, Rozenn N
Harrison, Sean
Burgess, Stephen
Chang, Xuling
Westra, Jason
Khankari, Nikhil K
Tsilidis, Kostas K
Gaunt, Tom
Hemani, Gibran
Zheng, Jie
Truong, Therese
O’Mara, Tracy A
Spurdle, Amanda B
Law, Matthew H
Slager, Susan L
Birmann, Brenda M
Saberi Hosnijeh, Fatemeh
Mariosa, Daniela
Amos, Christopher I
Hung, Rayjean J
Zheng, Wei
Gunter, Marc J
Davey Smith, George
Relton, Caroline
Martin, Richard M
author_sort Haycock, Philip C
collection PubMed
description BACKGROUND: Mendelian randomization (MR) studies are susceptible to metadata errors (e.g. incorrect specification of the effect allele column) and other analytical issues that can introduce substantial bias into analyses. We developed a quality control (QC) pipeline for the Fatty Acids in Cancer Mendelian Randomization Collaboration (FAMRC) that can be used to identify and correct for such errors. METHODS: We collated summary association statistics from fatty acid and cancer genome-wide association studies (GWAS) and subjected the collated data to a comprehensive QC pipeline. We identified metadata errors through comparison of study-specific statistics to external reference data sets (the National Human Genome Research Institute-European Bioinformatics Institute GWAS catalogue and 1000 genome super populations) and other analytical issues through comparison of reported to expected genetic effect sizes. Comparisons were based on three sets of genetic variants: (i) GWAS hits for fatty acids, (ii) GWAS hits for cancer and (iii) a 1000 genomes reference set. RESULTS: We collated summary data from 6 fatty acid and 54 cancer GWAS. Metadata errors and analytical issues with the potential to introduce substantial bias were identified in seven studies (11.6%). After resolving metadata errors and analytical issues, we created a data set of 219 842 genetic associations with 90 cancer types, generated in analyses of 566 665 cancer cases and 1 622 374 controls. CONCLUSIONS: In this large MR collaboration, 11.6% of included studies were affected by a substantial metadata error or analytical issue. By increasing the integrity of collated summary data prior to their analysis, our protocol can be used to increase the reliability of downstream MR analyses. Our pipeline is available to other researchers via the CheckSumStats package (https://github.com/MRCIEU/CheckSumStats).
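The two kinds of check named in the abstract (comparing study statistics with reference data, and comparing reported with expected effect sizes) can be sketched as follows. This is a minimal illustration in Python, not the CheckSumStats API (CheckSumStats is an R package). The function names, the 0.1 frequency tolerance, and the large-sample standard-error approximation for a case-control log odds ratio are all assumptions made for this sketch and are not taken from the record.

```python
import math


def flag_allele_frequency_conflict(reported_eaf, reference_eaf, tolerance=0.1):
    """Flag a variant whose reported effect allele frequency matches the
    frequency of the OTHER allele in a reference panel.

    Hypothetical helper: the 0.1 tolerance is an assumed value.
    Variants near 50% frequency match both alleles and are not flagged.
    """
    matches_reference = abs(reported_eaf - reference_eaf) <= tolerance
    matches_flipped = abs(reported_eaf - (1.0 - reference_eaf)) <= tolerance
    return matches_flipped and not matches_reference


def fraction_flagged(frequency_pairs, tolerance=0.1):
    """Share of (reported_eaf, reference_eaf) pairs that look flipped.

    A share close to 1 would suggest a mis-specified effect allele column
    rather than a handful of problem variants.
    """
    if not frequency_pairs:
        return 0.0
    flags = [
        flag_allele_frequency_conflict(reported, reference, tolerance)
        for reported, reference in frequency_pairs
    ]
    return sum(flags) / len(flags)


def expected_log_odds_ratio(z_score, eaf, n_cases, n_controls):
    """Approximate the log odds ratio implied by a Z score.

    Assumed approximation: var(beta) ~ 1 / (2 * p * (1 - p) * N * phi * (1 - phi)),
    where p is the effect allele frequency, N the total sample size and
    phi the proportion of cases.
    """
    n_total = n_cases + n_controls
    case_fraction = n_cases / n_total
    variance = 1.0 / (
        2.0 * eaf * (1.0 - eaf) * n_total * case_fraction * (1.0 - case_fraction)
    )
    return z_score * math.sqrt(variance)


# Illustrative, made-up inputs (not data from the study).
pairs = [(0.80, 0.20), (0.71, 0.30), (0.15, 0.86)]
share_flipped = fraction_flagged(pairs)
expected_beta = expected_log_odds_ratio(2.0, 0.5, 1000, 1000)
```

In this sketch a reported effect size would be compared with `expected_beta`; a large, systematic gap across variants would point to an analytical issue such as a mislabelled effect size scale.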
format Online
Article
Text
id pubmed-10555669
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10555669 2023-10-07 Int J Epidemiol Methods Oxford University Press 2023-04-12 /pmc/articles/PMC10555669/ http://dx.doi.org/10.1093/ije/dyad018 Text en © The Author(s) 2023. Published by Oxford University Press on behalf of the International Epidemiological Association. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Design and quality control of large-scale two-sample Mendelian randomization studies
topic Methods
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10555669/
http://dx.doi.org/10.1093/ije/dyad018