Integrating Functional Data to Prioritize Causal Variants in Statistical Fine-Mapping Studies

Standard statistical approaches for prioritization of variants for functional testing in fine-mapping studies either use marginal association statistics or estimate posterior probabilities for variants to be causal under simplifying assumptions. Here, we present a probabilistic framework that integr...

Bibliographic Details
Main authors:	Kichaev, Gleb; Yang, Wen-Yun; Lindstrom, Sara; Hormozdiari, Farhad; Eskin, Eleazar; Price, Alkes L.; Kraft, Peter; Pasaniuc, Bogdan
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2014
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4214605/ https://www.ncbi.nlm.nih.gov/pubmed/25357204 http://dx.doi.org/10.1371/journal.pgen.1004722

collection	PubMed
description	Standard statistical approaches for prioritization of variants for functional testing in fine-mapping studies either use marginal association statistics or estimate posterior probabilities for variants to be causal under simplifying assumptions. Here, we present a probabilistic framework that integrates association strength with functional genomic annotation data to improve accuracy in selecting plausible causal variants for functional validation. A key feature of our approach is that it empirically estimates the contribution of each functional annotation to the trait of interest directly from summary association statistics while allowing for multiple causal variants at any risk locus. We devise efficient algorithms that estimate the parameters of our model across all risk loci to further increase performance. Using simulations starting from the 1000 Genomes data, we find that our framework consistently outperforms the current state-of-the-art fine-mapping methods, reducing the number of variants that need to be selected to capture 90% of the causal variants from an average of 13.3 to 10.4 SNPs per locus (as compared to the next-best performing strategy). Furthermore, we introduce a cost-to-benefit optimization framework for determining the number of variants to be followed up in functional assays and assess its performance using real and simulated data. We validate our findings using a large-scale meta-analysis of four blood lipid traits and find that the relative probability for causality is increased for variants in exons and transcription start sites and decreased in repressed genomic regions at the risk loci of these traits. Using these highly predictive, trait-specific functional annotations, we estimate causality probabilities across all traits and variants, reducing the size of the 90% confidence set from an average of 17.5 to 13.5 variants per locus in these data.
format	Online Article Text
id	pubmed-4214605
institution	National Center for Biotechnology Information
language	English
publishDate	2014
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-4214605 2014-11-05 PLoS Genet Research Article Public Library of Science 2014-10-30 /pmc/articles/PMC4214605/ /pubmed/25357204 http://dx.doi.org/10.1371/journal.pgen.1004722 Text en © 2014 Kichaev et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.

Similar Items