
A framework for pathway knowledge driven prioritization in genome‐wide association studies

Many variants with low frequencies or with low to modest effects likely remain unidentified in genome‐wide association studies (GWAS) because of stringent genome‐wide thresholds for detection. To improve the power of detection, variant prioritization based on their functional annotations and epigene...


Bibliographic Details
Main authors:	Biswas, Shrayashi; Pal, Soumen; Majumder, Partha P.; Bhattacharjee, Samsiddhi
Format:	Online Article Text
Language:	English
Published:	John Wiley and Sons Inc. 2020
Subjects:	Research Articles
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7116354/ https://www.ncbi.nlm.nih.gov/pubmed/32779262 http://dx.doi.org/10.1002/gepi.22345

author	Biswas, Shrayashi; Pal, Soumen; Majumder, Partha P.; Bhattacharjee, Samsiddhi
author_sort	Biswas, Shrayashi
collection	PubMed
description	Many variants with low frequencies or with low to modest effects likely remain unidentified in genome‐wide association studies (GWAS) because of stringent genome‐wide thresholds for detection. To improve the power of detection, variant prioritization based on functional annotations and epigenetic landmarks has been used successfully. Here, we propose a novel method of prioritizing a GWAS by exploiting gene‐level knowledge (e.g., annotations to pathways and ontologies) and show that it further improves power. Often, disease‐associated variants are found near genes that are co‐involved in specific biological pathways relevant to the disease process. Using this knowledge to conduct a prioritized scan increases the power to detect loci that map to genes clustered in a few specific pathways. We have developed a computationally scalable framework based on penalized logistic regression (termed GKnowMTest, for Genomic Knowledge‐guided Multiple Testing) to enable a prioritized pathway‐guided GWAS scan with a very large number of gene‐level annotations. We demonstrate that the proposed strategy improves overall power and maintains the Type 1 error globally. Our method works on genome‐wide summary‐level data and a user‐specified list of pathways (e.g., those extracted from large pathway databases without reference to the biology of a specific disease). It automatically reweights the input p values by incorporating the pathway enrichments as “adaptively learned” from the data, using cross‐validation to avoid overfitting. We used whole‐genome simulations and some publicly available GWAS data sets to illustrate the application of our method. The GKnowMTest framework has been implemented as a user‐friendly open‐source R package.
format	Online Article Text
id	pubmed-7116354
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	John Wiley and Sons Inc.
record_format	MEDLINE/PubMed
spelling	pubmed-7116354 2020-11-12 Genet Epidemiol Research Articles
John Wiley and Sons Inc. 2020-08-10 2020-11 /pmc/articles/PMC7116354/ /pubmed/32779262 http://dx.doi.org/10.1002/gepi.22345 Text en © 2020 The Authors. Genetic Epidemiology, published by Wiley Periodicals LLC. This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
title	A framework for pathway knowledge driven prioritization in genome‐wide association studies
title_sort	framework for pathway knowledge driven prioritization in genome‐wide association studies
topic	Research Articles
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7116354/ https://www.ncbi.nlm.nih.gov/pubmed/32779262 http://dx.doi.org/10.1002/gepi.22345
