Which Genetics Variants in DNase-Seq Footprints Are More Likely to Alter Binding?

Large experimental efforts are characterizing the regulatory genome, yet we are still missing a systematic definition of functional and silent genetic variants in non-coding regions. Here, we integrated DNaseI footprinting data with sequence-based transcription factor (TF) motif models to predict th...

Bibliographic Details
Main authors: Moyerbrailean, Gregory A., Kalita, Cynthia A., Harvey, Chris T., Wen, Xiaoquan, Luca, Francesca, Pique-Regi, Roger
Format: Online Article Text
Language: English
Published: Public Library of Science 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4764260/
https://www.ncbi.nlm.nih.gov/pubmed/26901046
http://dx.doi.org/10.1371/journal.pgen.1005875
author Moyerbrailean, Gregory A.
Kalita, Cynthia A.
Harvey, Chris T.
Wen, Xiaoquan
Luca, Francesca
Pique-Regi, Roger
collection PubMed
description Large experimental efforts are characterizing the regulatory genome, yet we are still missing a systematic definition of functional and silent genetic variants in non-coding regions. Here, we integrated DNaseI footprinting data with sequence-based transcription factor (TF) motif models to predict the impact of a genetic variant on TF binding across 153 tissues and 1,372 TF motifs. Each annotation we derived is specific for a cell-type condition or assay and is locally motif-driven. We found 5.8 million genetic variants in footprints, 66% of which are predicted by our model to affect TF binding. Comprehensive examination using allele-specific hypersensitivity (ASH) reveals that only the latter group consistently shows evidence for ASH (3,217 SNPs at 20% FDR), suggesting that most (97%) genetic variants in footprinted regulatory regions are indeed silent. Combining this information with GWAS data reveals that our annotation helps in computationally fine-mapping 86 SNPs in GWAS hit regions with at least a 2-fold increase in the posterior odds of picking the causal SNP. The rich meta information provided by the tissue-specificity and the identity of the putative TF binding site being affected also helps in identifying the underlying mechanism supporting the association. As an example, the enrichment for LDL level-associated SNPs is 9.1-fold higher among SNPs predicted to affect HNF4 binding sites than in a background model already including tissue-specific annotation.
format Online
Article
Text
id pubmed-4764260
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4764260 2016-03-07 PLoS Genet Research Article
Public Library of Science 2016-02-22 /pmc/articles/PMC4764260/ /pubmed/26901046 http://dx.doi.org/10.1371/journal.pgen.1005875 Text en © 2016 Moyerbrailean et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Which Genetics Variants in DNase-Seq Footprints Are More Likely to Alter Binding?
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4764260/
https://www.ncbi.nlm.nih.gov/pubmed/26901046
http://dx.doi.org/10.1371/journal.pgen.1005875