Which Genetic Variants in DNase-Seq Footprints Are More Likely to Alter Binding?
Large experimental efforts are characterizing the regulatory genome, yet we are still missing a systematic definition of functional and silent genetic variants in non-coding regions. Here, we integrated DNaseI footprinting data with sequence-based transcription factor (TF) motif models to predict th...
Main Authors: | Moyerbrailean, Gregory A.; Kalita, Cynthia A.; Harvey, Chris T.; Wen, Xiaoquan; Luca, Francesca; Pique-Regi, Roger |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2016 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4764260/ https://www.ncbi.nlm.nih.gov/pubmed/26901046 http://dx.doi.org/10.1371/journal.pgen.1005875 |
_version_ | 1782417353915498496 |
---|---|
author | Moyerbrailean, Gregory A. Kalita, Cynthia A. Harvey, Chris T. Wen, Xiaoquan Luca, Francesca Pique-Regi, Roger |
author_sort | Moyerbrailean, Gregory A. |
collection | PubMed |
description | Large experimental efforts are characterizing the regulatory genome, yet we are still missing a systematic definition of functional and silent genetic variants in non-coding regions. Here, we integrated DNaseI footprinting data with sequence-based transcription factor (TF) motif models to predict the impact of a genetic variant on TF binding across 153 tissues and 1,372 TF motifs. Each annotation we derived is specific for a cell-type condition or assay and is locally motif-driven. We found 5.8 million genetic variants in footprints, 66% of which are predicted by our model to affect TF binding. Comprehensive examination using allele-specific hypersensitivity (ASH) reveals that only the latter group consistently shows evidence for ASH (3,217 SNPs at 20% FDR), suggesting that most (97%) genetic variants in footprinted regulatory regions are indeed silent. Combining this information with GWAS data reveals that our annotation helps in computationally fine-mapping 86 SNPs in GWAS hit regions with at least a 2-fold increase in the posterior odds of picking the causal SNP. The rich meta information provided by the tissue-specificity and the identity of the putative TF binding site being affected also helps in identifying the underlying mechanism supporting the association. As an example, the enrichment for LDL level-associated SNPs is 9.1-fold higher among SNPs predicted to affect HNF4 binding sites than in a background model already including tissue-specific annotation. |
format | Online Article Text |
id | pubmed-4764260 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-4764260 2016-03-07 PLoS Genet Research Article Public Library of Science 2016-02-22 /pmc/articles/PMC4764260/ /pubmed/26901046 http://dx.doi.org/10.1371/journal.pgen.1005875 Text en © 2016 Moyerbrailean et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
title | Which Genetic Variants in DNase-Seq Footprints Are More Likely to Alter Binding? |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4764260/ https://www.ncbi.nlm.nih.gov/pubmed/26901046 http://dx.doi.org/10.1371/journal.pgen.1005875 |