Ranking of non-coding pathogenic variants and putative essential regions of the human genome

Bibliographic Details
Main authors: Wells, Alex; Heckerman, David; Torkamani, Ali; Yin, Li; Sebat, Jonathan; Ren, Bing; Telenti, Amalio; di Iulio, Julia
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2019
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6868241/
https://www.ncbi.nlm.nih.gov/pubmed/31748530
http://dx.doi.org/10.1038/s41467-019-13212-3
Description
Summary: A gene is considered essential if loss of function results in loss of viability, fitness or in disease. This concept is well established for coding genes; however, non-coding regions are thought less likely to be determinants of critical functions. Here we train a machine learning model using functional, mutational and structural features, including new genome essentiality metrics, 3D genome organization and enhancer reporter data, to identify deleterious variants in non-coding regions. We assess the model for functional correlates by using data from tiling-deletion-based and CRISPR interference screens of activity of cis-regulatory elements in over 3 Mb of genome sequence. Finally, we explore two use cases that involve indels and the disruption of enhancers associated with a developmental disease. We rank variants in the non-coding genome according to their predicted deleteriousness. The model prioritizes non-coding regions associated with regulation of important genes and with cell viability, an in vitro surrogate of essentiality.