
PredTAD: A machine learning framework that models 3D chromatin organization alterations leading to oncogene dysregulation in breast cancer cell lines

Topologically associating domains, or TADs, play important roles in genome organization and gene regulation; however, they are often altered in diseases. High-throughput chromatin conformation capturing assays, such as Hi-C, can capture domains of increased interactions, and TADs and boundaries can...


Bibliographic Details
Main Authors: Chyr, Jacqueline; Zhang, Zhigang; Chen, Xi; Zhou, Xiaobo
Format: Online Article Text
Language: English
Published: Research Network of Computational and Structural Biotechnology 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8142020/
https://www.ncbi.nlm.nih.gov/pubmed/34093998
http://dx.doi.org/10.1016/j.csbj.2021.05.013
author Chyr, Jacqueline
Zhang, Zhigang
Chen, Xi
Zhou, Xiaobo
collection PubMed
description Topologically associating domains, or TADs, play important roles in genome organization and gene regulation; however, they are often altered in diseases. High-throughput chromatin conformation capture assays, such as Hi-C, can capture domains of increased interactions, and TADs and boundaries can be identified using well-established analytical tools. However, generating Hi-C data is expensive. In our study, we addressed the relationship between multi-omics data and higher-order chromatin structures using a newly developed machine-learning model called PredTAD. Our tool uses already-available and cost-effective data types such as transcription factor and histone modification ChIP-seq data. Specifically, PredTAD utilizes both epigenetic and genetic features as well as neighboring information to classify the entire human genome as boundary or non-boundary regions. Our tool can predict boundary changes between normal and breast cancer genomes. Among the most important features for predicting boundary alterations were CTCF, subunits of cohesin (RAD21 and SMC3), and chromosome number, suggesting their roles in the formation of conserved and dynamic boundaries. Upon further analysis, we observed that genes near altered TAD boundaries were involved in several important breast cancer signaling pathways, such as the Ras, Jak-STAT, and estrogen signaling pathways. We also discovered a TAD boundary alteration that contributes to RET oncogene overexpression. PredTAD can also successfully predict TAD boundary changes in other conditions and diseases. In conclusion, our newly developed machine learning tool allowed for a more complete understanding of the dynamic 3D chromatin structures involved in signaling pathway activation, altered gene expression, and disease state in breast cancer cells.
format Online
Article
Text
id pubmed-8142020
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Research Network of Computational and Structural Biotechnology
record_format MEDLINE/PubMed
spelling pubmed-8142020 2021-06-03 Comput Struct Biotechnol J Research Article Research Network of Computational and Structural Biotechnology 2021-05-07 /pmc/articles/PMC8142020/ /pubmed/34093998 http://dx.doi.org/10.1016/j.csbj.2021.05.013 Text en Published by Elsevier B.V. on behalf of Research Network of Computational and Structural Biotechnology. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title PredTAD: A machine learning framework that models 3D chromatin organization alterations leading to oncogene dysregulation in breast cancer cell lines
topic Research Article