
Gene-based whole genome sequencing meta-analysis of 250 circulating proteins in three isolated European populations

OBJECTIVE: Deep sequencing offers unparalleled access to rare variants in human populations. Understanding their role in disease is a priority, yet prohibitive sequencing costs mean that many cohorts lack the sample size to discover these effects on their own. Meta-analysis of individual variant sco...

Bibliographic Details
Main Authors: Gilly, Arthur; Klaric, Lucija; Park, Young-Chan; Png, Grace; Barysenka, Andrei; Marsh, Joseph A.; Tsafantakis, Emmanouil; Karaleftheri, Maria; Dedoussis, George; Wilson, James F.; Zeggini, Eleftheria
Format: Online Article Text
Language: English
Published: Elsevier 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9118462/
https://www.ncbi.nlm.nih.gov/pubmed/35504531
http://dx.doi.org/10.1016/j.molmet.2022.101509
author Gilly, Arthur
Klaric, Lucija
Park, Young-Chan
Png, Grace
Barysenka, Andrei
Marsh, Joseph A.
Tsafantakis, Emmanouil
Karaleftheri, Maria
Dedoussis, George
Wilson, James F.
Zeggini, Eleftheria
author_sort Gilly, Arthur
collection PubMed
description OBJECTIVE: Deep sequencing offers unparalleled access to rare variants in human populations. Understanding their role in disease is a priority, yet prohibitive sequencing costs mean that many cohorts lack the sample size to discover these effects on their own. Meta-analysis of individual variant scores allows the combination of rare variants across cohorts and study of their aggregated effect at the gene level, boosting discovery power. However, the methods involved have largely not been field-tested. In this study, we aim to perform the first meta-analysis of gene-based rare variant aggregation optimal tests, applied to the human cardiometabolic proteome.
METHODS: Here, we carry out this analysis across MANOLIS, Pomak and ORCADES, three isolated European cohorts with whole-genome sequencing (total N = 4,422). We examine the genetic architecture of 250 proteomic traits of cardiometabolic relevance. We use a containerised pipeline to harmonise variant lists across cohorts and define four sets of qualifying variants. For every gene, we interrogate protein-damaging variants, exonic variants, exonic and regulatory variants, and regulatory only variants, using the CADD and Eigen scores to weigh variants according to their predicted functional consequence. We perform single-cohort rare variant analysis and meta-analyse variant scores using the SMMAT package.
RESULTS: We describe 5 rare variant pQTLs (RV-pQTL) which pass our stringent significance threshold (7.45 × 10⁻¹¹) and quality control procedure. These were split between four cis signals for MARCO, TEK, MMP2 and MPO, and one trans association for GDF2 in the SERPINA11 gene. We show that the cis-MPO association, which was not detectable using the single-point data alone, is driven by 5 missense and frameshift variants. These include rs140636390 and rs119468010, which are specific to MANOLIS and ORCADES, respectively. We show how this kind of signal could improve the predictive accuracy of genetic factors in common complex disease such as stroke and cardiovascular disease.
CONCLUSIONS: Our proof-of-concept study demonstrates the power of gene-based meta-analyses for discovering disease-relevant associations complementing common-variant signals by incorporating population-specific rare variation.
format Online
Article
Text
id pubmed-9118462
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-9118462 2022-05-20 Mol Metab Original Article Elsevier 2022-04-30 /pmc/articles/PMC9118462/ /pubmed/35504531 http://dx.doi.org/10.1016/j.molmet.2022.101509 Text en © 2022 The Author(s). This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
title Gene-based whole genome sequencing meta-analysis of 250 circulating proteins in three isolated European populations
topic Original Article