
Atlas of primary cell-type-specific sequence models of gene expression and variant effects

Human biology is rooted in highly specialized cell types programmed by a common genome, 98% of which is outside of genes. Genetic variation in the enormous noncoding space is linked to the majority of disease risk. To address the problem of linking these variants to expression changes in primary hum...


Bibliographic Details
Main authors: Sokolova, Ksenia; Theesfeld, Chandra L.; Wong, Aaron K.; Zhang, Zijun; Dolinski, Kara; Troyanskaya, Olga G.
Format: Online Article Text
Language: English
Published: Elsevier 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10545936/
https://www.ncbi.nlm.nih.gov/pubmed/37703883
http://dx.doi.org/10.1016/j.crmeth.2023.100580
author Sokolova, Ksenia
Theesfeld, Chandra L.
Wong, Aaron K.
Zhang, Zijun
Dolinski, Kara
Troyanskaya, Olga G.
collection PubMed
description Human biology is rooted in highly specialized cell types programmed by a common genome, 98% of which is outside of genes. Genetic variation in the enormous noncoding space is linked to the majority of disease risk. To address the problem of linking these variants to expression changes in primary human cells, we introduce ExPectoSC, an atlas of modular deep-learning-based models for predicting cell-type-specific gene expression directly from sequence. We provide models for 105 primary human cell types covering 7 organ systems, demonstrate their accuracy, and then apply them to prioritize relevant cell types for complex human diseases. The resulting atlas of sequence-based gene expression and variant effects is publicly available in a user-friendly interface and readily extensible to any primary cell types. We demonstrate the accuracy of our approach through systematic evaluations and apply the models to prioritize ClinVar clinical variants of uncertain significance, verifying our top predictions experimentally.
format Online
Article
Text
id pubmed-10545936
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-10545936 2023-10-04 Atlas of primary cell-type-specific sequence models of gene expression and variant effects Sokolova, Ksenia Theesfeld, Chandra L. Wong, Aaron K. Zhang, Zijun Dolinski, Kara Troyanskaya, Olga G. Cell Rep Methods Article Human biology is rooted in highly specialized cell types programmed by a common genome, 98% of which is outside of genes. Genetic variation in the enormous noncoding space is linked to the majority of disease risk. To address the problem of linking these variants to expression changes in primary human cells, we introduce ExPectoSC, an atlas of modular deep-learning-based models for predicting cell-type-specific gene expression directly from sequence. We provide models for 105 primary human cell types covering 7 organ systems, demonstrate their accuracy, and then apply them to prioritize relevant cell types for complex human diseases. The resulting atlas of sequence-based gene expression and variant effects is publicly available in a user-friendly interface and readily extensible to any primary cell types. We demonstrate the accuracy of our approach through systematic evaluations and apply the models to prioritize ClinVar clinical variants of uncertain significance, verifying our top predictions experimentally. Elsevier 2023-09-12 /pmc/articles/PMC10545936/ /pubmed/37703883 http://dx.doi.org/10.1016/j.crmeth.2023.100580 Text en © 2023 The Authors https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title Atlas of primary cell-type-specific sequence models of gene expression and variant effects
topic Article