Deep learning sequence-based ab initio prediction of variant effects on expression and disease risk
A key challenge for human genetics, precision medicine, and evolutionary biology is deciphering the regulatory code of gene expression, including understanding the transcriptional effects of genome variation. Yet this is extremely difficult due to the enormous scale of the noncoding mutation space....
Main authors: | Zhou, Jian; Theesfeld, Chandra L.; Yao, Kevin; Chen, Kathleen M.; Wong, Aaron K.; Troyanskaya, Olga G. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2018 |
Subjects: | Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6094955/ https://www.ncbi.nlm.nih.gov/pubmed/30013180 http://dx.doi.org/10.1038/s41588-018-0160-6 |
_version_ | 1783347892787871744 |
---|---|
author | Zhou, Jian; Theesfeld, Chandra L.; Yao, Kevin; Chen, Kathleen M.; Wong, Aaron K.; Troyanskaya, Olga G. |
author_facet | Zhou, Jian; Theesfeld, Chandra L.; Yao, Kevin; Chen, Kathleen M.; Wong, Aaron K.; Troyanskaya, Olga G. |
author_sort | Zhou, Jian |
collection | PubMed |
description | A key challenge for human genetics, precision medicine, and evolutionary biology is deciphering the regulatory code of gene expression, including understanding the transcriptional effects of genome variation. Yet this is extremely difficult due to the enormous scale of the noncoding mutation space. We developed a deep-learning-based framework, ExPecto, that can accurately predict, ab initio from DNA sequence, the tissue-specific transcriptional effects of mutations, including rare or never observed. We prioritized causal variants within disease/trait-associated loci from all publicly-available GWAS studies, and experimentally validated predictions for four immune-related diseases. Exploiting the scalability of ExPecto, we characterized the regulatory mutation space for all human Pol II-transcribed genes by in silico saturation mutagenesis, profiling >140 million promoter-proximal mutations. This enables probing of evolutionary constraints on gene expression and ab initio prediction of mutation disease effect, making ExPecto an end-to-end computational framework for in silico prediction of expression and disease risk. |
format | Online Article Text |
id | pubmed-6094955 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
record_format | MEDLINE/PubMed |
spelling | pubmed-6094955; 2019-01-16; Deep learning sequence-based ab initio prediction of variant effects on expression and disease risk; Zhou, Jian; Theesfeld, Chandra L.; Yao, Kevin; Chen, Kathleen M.; Wong, Aaron K.; Troyanskaya, Olga G.; Nat Genet; Article; A key challenge for human genetics, precision medicine, and evolutionary biology is deciphering the regulatory code of gene expression, including understanding the transcriptional effects of genome variation. Yet this is extremely difficult due to the enormous scale of the noncoding mutation space. We developed a deep-learning-based framework, ExPecto, that can accurately predict, ab initio from DNA sequence, the tissue-specific transcriptional effects of mutations, including rare or never observed. We prioritized causal variants within disease/trait-associated loci from all publicly-available GWAS studies, and experimentally validated predictions for four immune-related diseases. Exploiting the scalability of ExPecto, we characterized the regulatory mutation space for all human Pol II-transcribed genes by in silico saturation mutagenesis, profiling >140 million promoter-proximal mutations. This enables probing of evolutionary constraints on gene expression and ab initio prediction of mutation disease effect, making ExPecto an end-to-end computational framework for in silico prediction of expression and disease risk.; 2018-07-16; 2018-08; /pmc/articles/PMC6094955/; /pubmed/30013180; http://dx.doi.org/10.1038/s41588-018-0160-6; Text; en; Users may view, print, copy, and download text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: http://www.nature.com/authors/editorial_policies/license.html#terms |
spellingShingle | Article; Zhou, Jian; Theesfeld, Chandra L.; Yao, Kevin; Chen, Kathleen M.; Wong, Aaron K.; Troyanskaya, Olga G.; Deep learning sequence-based ab initio prediction of variant effects on expression and disease risk |
title | Deep learning sequence-based ab initio prediction of variant effects on expression and disease risk |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6094955/ https://www.ncbi.nlm.nih.gov/pubmed/30013180 http://dx.doi.org/10.1038/s41588-018-0160-6 |