Integrative modeling reveals key chromatin and sequence signatures predicting super-enhancers

Super-enhancers (SEs) are clusters of transcriptional enhancers which control the expression of cell identity and disease-associated genes. Current studies demonstrated the role of multiple factors in SE formation; however, a systematic analysis to assess the relative predictive importance of chroma...

Bibliographic Details
Main Authors: Khan, Aziz; Zhang, Xuegong
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6393462/
https://www.ncbi.nlm.nih.gov/pubmed/30814546
http://dx.doi.org/10.1038/s41598-019-38979-9
_version_ 1783398695465648128
author Khan, Aziz
Zhang, Xuegong
author_facet Khan, Aziz
Zhang, Xuegong
author_sort Khan, Aziz
collection PubMed
description Super-enhancers (SEs) are clusters of transcriptional enhancers which control the expression of cell identity and disease-associated genes. Current studies demonstrated the role of multiple factors in SE formation; however, a systematic analysis to assess the relative predictive importance of chromatin and sequence features of SEs and their constituents is lacking. In addition, a predictive model that integrates various types of data to predict SEs has not been established. Here, we integrated diverse types of genomic and epigenomic datasets to identify key signatures of SEs and investigated their predictive importance. Through integrative modeling, we found Cdk8, Cdk9, and Smad3 as new features of SEs, which can define known and new SEs in mouse embryonic stem cells and pro-B cells. We compared six state-of-the-art machine learning models to predict SEs and showed that non-parametric ensemble models performed better as compared to parametric. We validated these models using cross-validation and also independent datasets in four human cell-types. Taken together, our systematic analysis and ranking of features can be used as a platform to define and understand the biology of SEs in other cell-types.
format Online
Article
Text
id pubmed-6393462
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-6393462 2019-03-01 Integrative modeling reveals key chromatin and sequence signatures predicting super-enhancers Khan, Aziz Zhang, Xuegong Sci Rep Article Super-enhancers (SEs) are clusters of transcriptional enhancers which control the expression of cell identity and disease-associated genes. Current studies demonstrated the role of multiple factors in SE formation; however, a systematic analysis to assess the relative predictive importance of chromatin and sequence features of SEs and their constituents is lacking. In addition, a predictive model that integrates various types of data to predict SEs has not been established. Here, we integrated diverse types of genomic and epigenomic datasets to identify key signatures of SEs and investigated their predictive importance. Through integrative modeling, we found Cdk8, Cdk9, and Smad3 as new features of SEs, which can define known and new SEs in mouse embryonic stem cells and pro-B cells. We compared six state-of-the-art machine learning models to predict SEs and showed that non-parametric ensemble models performed better as compared to parametric. We validated these models using cross-validation and also independent datasets in four human cell-types. Taken together, our systematic analysis and ranking of features can be used as a platform to define and understand the biology of SEs in other cell-types. Nature Publishing Group UK 2019-02-27 /pmc/articles/PMC6393462/ /pubmed/30814546 http://dx.doi.org/10.1038/s41598-019-38979-9 Text en © The Author(s) 2019 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle Article
Khan, Aziz
Zhang, Xuegong
Integrative modeling reveals key chromatin and sequence signatures predicting super-enhancers
title Integrative modeling reveals key chromatin and sequence signatures predicting super-enhancers
title_full Integrative modeling reveals key chromatin and sequence signatures predicting super-enhancers
title_fullStr Integrative modeling reveals key chromatin and sequence signatures predicting super-enhancers
title_full_unstemmed Integrative modeling reveals key chromatin and sequence signatures predicting super-enhancers
title_short Integrative modeling reveals key chromatin and sequence signatures predicting super-enhancers
title_sort integrative modeling reveals key chromatin and sequence signatures predicting super-enhancers
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6393462/
https://www.ncbi.nlm.nih.gov/pubmed/30814546
http://dx.doi.org/10.1038/s41598-019-38979-9
work_keys_str_mv AT khanaziz integrativemodelingrevealskeychromatinandsequencesignaturespredictingsuperenhancers
AT zhangxuegong integrativemodelingrevealskeychromatinandsequencesignaturespredictingsuperenhancers