Design and quality control of large-scale two-sample Mendelian randomization studies
BACKGROUND: Mendelian randomization (MR) studies are susceptible to metadata errors (e.g. incorrect specification of the effect allele column) and other analytical issues that can introduce substantial bias into analyses. We developed a quality control (QC) pipeline for the Fatty Acids in Cancer Men...
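Main authors: | Haycock, Philip C; Borges, Maria Carolina; Burrows, Kimberley; Lemaitre, Rozenn N; Harrison, Sean; Burgess, Stephen; Chang, Xuling; Westra, Jason; Khankari, Nikhil K; Tsilidis, Kostas K; Gaunt, Tom; Hemani, Gibran; Zheng, Jie; Truong, Therese; O’Mara, Tracy A; Spurdle, Amanda B; Law, Matthew H; Slager, Susan L; Birmann, Brenda M; Saberi Hosnijeh, Fatemeh; Mariosa, Daniela; Amos, Christopher I; Hung, Rayjean J; Zheng, Wei; Gunter, Marc J; Davey Smith, George; Relton, Caroline; Martin, Richard M |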
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2023 |
Subjects: | Methods |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10555669/ http://dx.doi.org/10.1093/ije/dyad018 |
_version_ | 1785116707979788288 |
---|---|
author | Haycock, Philip C Borges, Maria Carolina Burrows, Kimberley Lemaitre, Rozenn N Harrison, Sean Burgess, Stephen Chang, Xuling Westra, Jason Khankari, Nikhil K Tsilidis, Kostas K Gaunt, Tom Hemani, Gibran Zheng, Jie Truong, Therese O’Mara, Tracy A Spurdle, Amanda B Law, Matthew H Slager, Susan L Birmann, Brenda M Saberi Hosnijeh, Fatemeh Mariosa, Daniela Amos, Christopher I Hung, Rayjean J Zheng, Wei Gunter, Marc J Davey Smith, George Relton, Caroline Martin, Richard M |
author_sort | Haycock, Philip C |
collection | PubMed |
description | BACKGROUND: Mendelian randomization (MR) studies are susceptible to metadata errors (e.g. incorrect specification of the effect allele column) and other analytical issues that can introduce substantial bias into analyses. We developed a quality control (QC) pipeline for the Fatty Acids in Cancer Mendelian Randomization Collaboration (FAMRC) that can be used to identify and correct for such errors. METHODS: We collated summary association statistics from fatty acid and cancer genome-wide association studies (GWAS) and subjected the collated data to a comprehensive QC pipeline. We identified metadata errors through comparison of study-specific statistics to external reference data sets (the National Human Genome Research Institute-European Bioinformatics Institute GWAS catalogue and 1000 genome super populations) and other analytical issues through comparison of reported to expected genetic effect sizes. Comparisons were based on three sets of genetic variants: (i) GWAS hits for fatty acids, (ii) GWAS hits for cancer and (iii) a 1000 genomes reference set. RESULTS: We collated summary data from 6 fatty acid and 54 cancer GWAS. Metadata errors and analytical issues with the potential to introduce substantial bias were identified in seven studies (11.6%). After resolving metadata errors and analytical issues, we created a data set of 219 842 genetic associations with 90 cancer types, generated in analyses of 566 665 cancer cases and 1 622 374 controls. CONCLUSIONS: In this large MR collaboration, 11.6% of included studies were affected by a substantial metadata error or analytical issue. By increasing the integrity of collated summary data prior to their analysis, our protocol can be used to increase the reliability of downstream MR analyses. Our pipeline is available to other researchers via the CheckSumStats package (https://github.com/MRCIEU/CheckSumStats). |
format | Online Article Text |
id | pubmed-10555669 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-105556692023-10-07 Design and quality control of large-scale two-sample Mendelian randomization studies Haycock, Philip C Borges, Maria Carolina Burrows, Kimberley Lemaitre, Rozenn N Harrison, Sean Burgess, Stephen Chang, Xuling Westra, Jason Khankari, Nikhil K Tsilidis, Kostas K Gaunt, Tom Hemani, Gibran Zheng, Jie Truong, Therese O’Mara, Tracy A Spurdle, Amanda B Law, Matthew H Slager, Susan L Birmann, Brenda M Saberi Hosnijeh, Fatemeh Mariosa, Daniela Amos, Christopher I Hung, Rayjean J Zheng, Wei Gunter, Marc J Davey Smith, George Relton, Caroline Martin, Richard M Int J Epidemiol Methods BACKGROUND: Mendelian randomization (MR) studies are susceptible to metadata errors (e.g. incorrect specification of the effect allele column) and other analytical issues that can introduce substantial bias into analyses. We developed a quality control (QC) pipeline for the Fatty Acids in Cancer Mendelian Randomization Collaboration (FAMRC) that can be used to identify and correct for such errors. METHODS: We collated summary association statistics from fatty acid and cancer genome-wide association studies (GWAS) and subjected the collated data to a comprehensive QC pipeline. We identified metadata errors through comparison of study-specific statistics to external reference data sets (the National Human Genome Research Institute-European Bioinformatics Institute GWAS catalogue and 1000 genome super populations) and other analytical issues through comparison of reported to expected genetic effect sizes. Comparisons were based on three sets of genetic variants: (i) GWAS hits for fatty acids, (ii) GWAS hits for cancer and (iii) a 1000 genomes reference set. RESULTS: We collated summary data from 6 fatty acid and 54 cancer GWAS. Metadata errors and analytical issues with the potential to introduce substantial bias were identified in seven studies (11.6%). 
After resolving metadata errors and analytical issues, we created a data set of 219 842 genetic associations with 90 cancer types, generated in analyses of 566 665 cancer cases and 1 622 374 controls. CONCLUSIONS: In this large MR collaboration, 11.6% of included studies were affected by a substantial metadata error or analytical issue. By increasing the integrity of collated summary data prior to their analysis, our protocol can be used to increase the reliability of downstream MR analyses. Our pipeline is available to other researchers via the CheckSumStats package (https://github.com/MRCIEU/CheckSumStats). Oxford University Press 2023-04-12 /pmc/articles/PMC10555669/ http://dx.doi.org/10.1093/ije/dyad018 Text en © The Author(s) 2023. Published by Oxford University Press on behalf of the International Epidemiological Association. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methods Haycock, Philip C Borges, Maria Carolina Burrows, Kimberley Lemaitre, Rozenn N Harrison, Sean Burgess, Stephen Chang, Xuling Westra, Jason Khankari, Nikhil K Tsilidis, Kostas K Gaunt, Tom Hemani, Gibran Zheng, Jie Truong, Therese O’Mara, Tracy A Spurdle, Amanda B Law, Matthew H Slager, Susan L Birmann, Brenda M Saberi Hosnijeh, Fatemeh Mariosa, Daniela Amos, Christopher I Hung, Rayjean J Zheng, Wei Gunter, Marc J Davey Smith, George Relton, Caroline Martin, Richard M Design and quality control of large-scale two-sample Mendelian randomization studies |
title | Design and quality control of large-scale two-sample Mendelian randomization studies |
topic | Methods |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10555669/ http://dx.doi.org/10.1093/ije/dyad018 |
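
The abstract describes identifying metadata errors (such as a misspecified effect allele column) by comparing study-specific statistics with external reference data. The following is a minimal Python sketch of one such check, comparing reported effect allele frequencies with a reference panel. It is not the CheckSumStats implementation (that package is written in R); the `Variant` fields, the function names and the `min_gap` threshold are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Variant:
    """One variant from a GWAS summary file joined to a reference panel (hypothetical layout)."""
    rsid: str
    effect_allele: str
    other_allele: str
    eaf: float              # effect allele frequency reported by the study
    ref_minor_allele: str   # minor allele in the reference panel
    ref_maf: float          # minor allele frequency in the reference panel


def allele_frequency_conflicts(variants: List[Variant], min_gap: float = 0.1) -> List[str]:
    """Return rsids whose reported frequency disagrees with the reference.

    A conflict is flagged when the reported and expected effect allele
    frequencies fall on opposite sides of 0.5 and differ by more than
    min_gap (an assumed, illustrative threshold).
    """
    conflicts = []
    for v in variants:
        if v.effect_allele == v.ref_minor_allele:
            expected = v.ref_maf
        elif v.other_allele == v.ref_minor_allele:
            expected = 1.0 - v.ref_maf
        else:
            continue  # alleles cannot be matched to the reference; skip
        opposite_sides = (v.eaf - 0.5) * (expected - 0.5) < 0
        if opposite_sides and abs(v.eaf - expected) > min_gap:
            conflicts.append(v.rsid)
    return conflicts


def conflict_fraction(variants: List[Variant], min_gap: float = 0.1) -> float:
    """Fraction of variants flagged as conflicts (0.0 for an empty list)."""
    if not variants:
        return 0.0
    return len(allele_frequency_conflicts(variants, min_gap)) / len(variants)


# Made-up example values, not data from the study.
example = [
    Variant("rs1", "A", "G", 0.20, "A", 0.22),  # consistent
    Variant("rs2", "C", "T", 0.85, "C", 0.15),  # looks like swapped alleles
    Variant("rs3", "G", "A", 0.70, "A", 0.28),  # consistent (effect allele is major)
]
flagged = allele_frequency_conflicts(example)
```

Under these assumptions, a high conflict fraction across a study's variants would suggest that the effect and non-effect allele columns have been swapped, whereas a handful of isolated conflicts is more likely to reflect population differences or strand ambiguity.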