Predicting regulatory variants using a dense epigenomic mapped CNN model elucidated the molecular basis of trait-tissue associations

Assessing the causal tissues of human complex diseases is important for the prioritization of trait-associated genetic variants. Yet, the biological underpinnings of trait-associated variants are extremely difficult to infer due to statistical noise in genome-wide association studies (GWAS), and bec...

Bibliographic Details
Main authors: Pei, Guangsheng, Hu, Ruifeng, Dai, Yulin, Manuel, Astrid Marilyn, Zhao, Zhongming, Jia, Peilin
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7797043/
https://www.ncbi.nlm.nih.gov/pubmed/33300042
http://dx.doi.org/10.1093/nar/gkaa1137
author Pei, Guangsheng
Hu, Ruifeng
Dai, Yulin
Manuel, Astrid Marilyn
Zhao, Zhongming
Jia, Peilin
collection PubMed
description Assessing the causal tissues of human complex diseases is important for the prioritization of trait-associated genetic variants. Yet, the biological underpinnings of trait-associated variants are extremely difficult to infer due to statistical noise in genome-wide association studies (GWAS), and because >90% of genetic variants from GWAS are located in non-coding regions. Here, we collected the largest human epigenomic map from ENCODE and Roadmap consortia and implemented a deep-learning-based convolutional neural network (CNN) model to predict the regulatory roles of genetic variants across a comprehensive list of epigenomic modifications. Our model, called DeepFun, was built on DNA accessibility maps, histone modification marks, and transcription factors. DeepFun can systematically assess the impact of non-coding variants in the most functional elements with tissue or cell-type specificity, even for rare variants or de novo mutations. By applying this model, we prioritized trait-associated loci for 51 publicly-available GWAS studies. We demonstrated that CNN-based analyses on dense and high-resolution epigenomic annotations can refine important GWAS associations in order to identify regulatory loci from background signals, which yield novel insights for better understanding the molecular basis of human complex disease. We anticipate our approaches will become routine in GWAS downstream analysis and non-coding variant evaluation.
format Online
Article
Text
id pubmed-7797043
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-77970432021-01-13 Predicting regulatory variants using a dense epigenomic mapped CNN model elucidated the molecular basis of trait-tissue associations Pei, Guangsheng Hu, Ruifeng Dai, Yulin Manuel, Astrid Marilyn Zhao, Zhongming Jia, Peilin Nucleic Acids Res Computational Biology Assessing the causal tissues of human complex diseases is important for the prioritization of trait-associated genetic variants. Yet, the biological underpinnings of trait-associated variants are extremely difficult to infer due to statistical noise in genome-wide association studies (GWAS), and because >90% of genetic variants from GWAS are located in non-coding regions. Here, we collected the largest human epigenomic map from ENCODE and Roadmap consortia and implemented a deep-learning-based convolutional neural network (CNN) model to predict the regulatory roles of genetic variants across a comprehensive list of epigenomic modifications. Our model, called DeepFun, was built on DNA accessibility maps, histone modification marks, and transcription factors. DeepFun can systematically assess the impact of non-coding variants in the most functional elements with tissue or cell-type specificity, even for rare variants or de novo mutations. By applying this model, we prioritized trait-associated loci for 51 publicly-available GWAS studies. We demonstrated that CNN-based analyses on dense and high-resolution epigenomic annotations can refine important GWAS associations in order to identify regulatory loci from background signals, which yield novel insights for better understanding the molecular basis of human complex disease. We anticipate our approaches will become routine in GWAS downstream analysis and non-coding variant evaluation. Oxford University Press 2020-12-09 /pmc/articles/PMC7797043/ /pubmed/33300042 http://dx.doi.org/10.1093/nar/gkaa1137 Text en © The Author(s) 2020. Published by Oxford University Press on behalf of Nucleic Acids Research. 
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Computational Biology
Pei, Guangsheng
Hu, Ruifeng
Dai, Yulin
Manuel, Astrid Marilyn
Zhao, Zhongming
Jia, Peilin
Predicting regulatory variants using a dense epigenomic mapped CNN model elucidated the molecular basis of trait-tissue associations
title Predicting regulatory variants using a dense epigenomic mapped CNN model elucidated the molecular basis of trait-tissue associations
topic Computational Biology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7797043/
https://www.ncbi.nlm.nih.gov/pubmed/33300042
http://dx.doi.org/10.1093/nar/gkaa1137