Integrative modeling reveals key chromatin and sequence signatures predicting super-enhancers
Super-enhancers (SEs) are clusters of transcriptional enhancers which control the expression of cell identity and disease-associated genes. Current studies demonstrated the role of multiple factors in SE formation; however, a systematic analysis to assess the relative predictive importance of chroma...
Main authors: | Khan, Aziz; Zhang, Xuegong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6393462/ https://www.ncbi.nlm.nih.gov/pubmed/30814546 http://dx.doi.org/10.1038/s41598-019-38979-9 |
_version_ | 1783398695465648128 |
---|---|
author | Khan, Aziz Zhang, Xuegong |
author_facet | Khan, Aziz Zhang, Xuegong |
author_sort | Khan, Aziz |
collection | PubMed |
description | Super-enhancers (SEs) are clusters of transcriptional enhancers which control the expression of cell identity and disease-associated genes. Current studies demonstrated the role of multiple factors in SE formation; however, a systematic analysis to assess the relative predictive importance of chromatin and sequence features of SEs and their constituents is lacking. In addition, a predictive model that integrates various types of data to predict SEs has not been established. Here, we integrated diverse types of genomic and epigenomic datasets to identify key signatures of SEs and investigated their predictive importance. Through integrative modeling, we found Cdk8, Cdk9, and Smad3 as new features of SEs, which can define known and new SEs in mouse embryonic stem cells and pro-B cells. We compared six state-of-the-art machine learning models to predict SEs and showed that non-parametric ensemble models performed better as compared to parametric. We validated these models using cross-validation and also independent datasets in four human cell-types. Taken together, our systematic analysis and ranking of features can be used as a platform to define and understand the biology of SEs in other cell-types. |
format | Online Article Text |
id | pubmed-6393462 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-63934622019-03-01 Integrative modeling reveals key chromatin and sequence signatures predicting super-enhancers Khan, Aziz Zhang, Xuegong Sci Rep Article Super-enhancers (SEs) are clusters of transcriptional enhancers which control the expression of cell identity and disease-associated genes. Current studies demonstrated the role of multiple factors in SE formation; however, a systematic analysis to assess the relative predictive importance of chromatin and sequence features of SEs and their constituents is lacking. In addition, a predictive model that integrates various types of data to predict SEs has not been established. Here, we integrated diverse types of genomic and epigenomic datasets to identify key signatures of SEs and investigated their predictive importance. Through integrative modeling, we found Cdk8, Cdk9, and Smad3 as new features of SEs, which can define known and new SEs in mouse embryonic stem cells and pro-B cells. We compared six state-of-the-art machine learning models to predict SEs and showed that non-parametric ensemble models performed better as compared to parametric. We validated these models using cross-validation and also independent datasets in four human cell-types. Taken together, our systematic analysis and ranking of features can be used as a platform to define and understand the biology of SEs in other cell-types. Nature Publishing Group UK 2019-02-27 /pmc/articles/PMC6393462/ /pubmed/30814546 http://dx.doi.org/10.1038/s41598-019-38979-9 Text en © The Author(s) 2019 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. 
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Khan, Aziz Zhang, Xuegong Integrative modeling reveals key chromatin and sequence signatures predicting super-enhancers |
title | Integrative modeling reveals key chromatin and sequence signatures predicting super-enhancers |
title_full | Integrative modeling reveals key chromatin and sequence signatures predicting super-enhancers |
title_fullStr | Integrative modeling reveals key chromatin and sequence signatures predicting super-enhancers |
title_full_unstemmed | Integrative modeling reveals key chromatin and sequence signatures predicting super-enhancers |
title_short | Integrative modeling reveals key chromatin and sequence signatures predicting super-enhancers |
title_sort | integrative modeling reveals key chromatin and sequence signatures predicting super-enhancers |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6393462/ https://www.ncbi.nlm.nih.gov/pubmed/30814546 http://dx.doi.org/10.1038/s41598-019-38979-9 |
work_keys_str_mv | AT khanaziz integrativemodelingrevealskeychromatinandsequencesignaturespredictingsuperenhancers AT zhangxuegong integrativemodelingrevealskeychromatinandsequencesignaturespredictingsuperenhancers |