
Enhanced performance of gene expression predictive models with protein-mediated spatial chromatin interactions

There have been multiple attempts to predict the expression of the genes based on the sequence, epigenetics, and various other factors. To improve those predictions, we have decided to investigate adding protein-specific 3D interactions that play a significant role in the condensation of the chromat...


Bibliographic Details
Main authors: Chiliński, Mateusz; Lipiński, Jakub; Agarwal, Abhishek; Ruan, Yijun; Plewczynski, Dariusz
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10359366/
https://www.ncbi.nlm.nih.gov/pubmed/37474564
http://dx.doi.org/10.1038/s41598-023-38865-5
author Chiliński, Mateusz
Lipiński, Jakub
Agarwal, Abhishek
Ruan, Yijun
Plewczynski, Dariusz
collection PubMed
description There have been multiple attempts to predict gene expression from sequence, epigenetics, and various other factors. To improve those predictions, we investigated adding protein-specific 3D interactions, which play a significant role in the condensation of the chromatin structure in the cell nucleus. To achieve this, we used the architecture of one of the state-of-the-art algorithms, ExPecto, and investigated the changes in the model metrics upon adding the spatially relevant data. We used ChIA-PET interactions mediated by cohesin (24 cell lines), CTCF (4 cell lines), and RNAPOL2 (4 cell lines). As the output of the study, we developed the Spatial Gene Expression (SpEx) algorithm, which shows statistically significant improvements in most cell lines. We compared SpEx with the baseline ExPecto model, which obtained a Spearman's rank correlation coefficient (SCC) of 0.82, and with the 0.85 reported for the newer Enformer; on average, we obtained a correlation score of 0.83. However, in some cases (e.g. RNAPOL2 on GM12878), our improvement reached 0.04, and in some cases (e.g. RNAPOL2 on H1), we reached an SCC of 0.86.
format Online
Article
Text
id pubmed-10359366
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-10359366 2023-07-22 Enhanced performance of gene expression predictive models with protein-mediated spatial chromatin interactions Chiliński, Mateusz Lipiński, Jakub Agarwal, Abhishek Ruan, Yijun Plewczynski, Dariusz Sci Rep Article There have been multiple attempts to predict gene expression from sequence, epigenetics, and various other factors. To improve those predictions, we investigated adding protein-specific 3D interactions, which play a significant role in the condensation of the chromatin structure in the cell nucleus. To achieve this, we used the architecture of one of the state-of-the-art algorithms, ExPecto, and investigated the changes in the model metrics upon adding the spatially relevant data. We used ChIA-PET interactions mediated by cohesin (24 cell lines), CTCF (4 cell lines), and RNAPOL2 (4 cell lines). As the output of the study, we developed the Spatial Gene Expression (SpEx) algorithm, which shows statistically significant improvements in most cell lines. We compared SpEx with the baseline ExPecto model, which obtained a Spearman's rank correlation coefficient (SCC) of 0.82, and with the 0.85 reported for the newer Enformer; on average, we obtained a correlation score of 0.83. However, in some cases (e.g. RNAPOL2 on GM12878), our improvement reached 0.04, and in some cases (e.g. RNAPOL2 on H1), we reached an SCC of 0.86.
Nature Publishing Group UK 2023-07-20 /pmc/articles/PMC10359366/ /pubmed/37474564 http://dx.doi.org/10.1038/s41598-023-38865-5 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.
title Enhanced performance of gene expression predictive models with protein-mediated spatial chromatin interactions
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10359366/
https://www.ncbi.nlm.nih.gov/pubmed/37474564
http://dx.doi.org/10.1038/s41598-023-38865-5