Deep learning sequence-based ab initio prediction of variant effects on expression and disease risk

Bibliographic Details
Main Authors: Zhou, Jian; Theesfeld, Chandra L.; Yao, Kevin; Chen, Kathleen M.; Wong, Aaron K.; Troyanskaya, Olga G.
Format: Online Article Text
Language: English
Published: 2018
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6094955/
https://www.ncbi.nlm.nih.gov/pubmed/30013180
http://dx.doi.org/10.1038/s41588-018-0160-6
Description
Summary: A key challenge for human genetics, precision medicine, and evolutionary biology is deciphering the regulatory code of gene expression, including understanding the transcriptional effects of genome variation. Yet this is extremely difficult due to the enormous scale of the noncoding mutation space. We developed a deep-learning-based framework, ExPecto, that can accurately predict, ab initio from DNA sequence, the tissue-specific transcriptional effects of mutations, including those that are rare or have never been observed. We prioritized causal variants within disease/trait-associated loci from all publicly available GWAS studies, and experimentally validated predictions for four immune-related diseases. Exploiting the scalability of ExPecto, we characterized the regulatory mutation space for all human Pol II-transcribed genes by in silico saturation mutagenesis, profiling >140 million promoter-proximal mutations. This enables probing of evolutionary constraints on gene expression and ab initio prediction of mutation disease effect, making ExPecto an end-to-end computational framework for in silico prediction of expression and disease risk.
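
The in silico saturation mutagenesis mentioned in the abstract can be illustrated with a minimal sketch: enumerate every single-base substitution of a sequence and score each mutant against the reference. This is not ExPecto's code or API. The function names are hypothetical, and `toy_score` (GC fraction) is only a placeholder assumption standing in for a trained sequence-to-expression model.

```python
from typing import Callable, Iterator, List, Tuple

BASES = "ACGT"


def saturation_mutagenesis(seq: str) -> Iterator[Tuple[int, str, str, str]]:
    """Yield (position, ref, alt, mutated_sequence) for every single-base substitution."""
    seq = seq.upper()
    for pos, ref in enumerate(seq):
        if ref not in BASES:
            continue  # skip ambiguous bases such as N
        for alt in BASES:
            if alt != ref:
                yield pos, ref, alt, seq[:pos] + alt + seq[pos + 1:]


def toy_score(seq: str) -> float:
    """Hypothetical stand-in for a trained expression model: GC fraction."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def variant_effects(
    seq: str, score: Callable[[str], float] = toy_score
) -> List[Tuple[int, str, str, float]]:
    """Return (position, ref, alt, mutant score minus reference score) per substitution."""
    ref_score = score(seq.upper())
    return [
        (pos, ref, alt, score(mut) - ref_score)
        for pos, ref, alt, mut in saturation_mutagenesis(seq)
    ]


# A sequence of length L with no ambiguous bases yields 3 * L substitutions.
effects = variant_effects("ACGT")
```

Because each position admits three alternative bases, the number of mutants grows linearly with sequence length, which is what makes exhaustive profiling of promoter-proximal regions tractable for a fast predictive model.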