Machine learning predicts nucleosome binding modes of transcription factors

Bibliographic Details
Main authors: Kishan, K. C.; Subramanya, Sridevi K.; Li, Rui; Cui, Feng
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8008688/
https://www.ncbi.nlm.nih.gov/pubmed/33784978
http://dx.doi.org/10.1186/s12859-021-04093-9
Description
Summary:
BACKGROUND: Most transcription factors (TFs) compete with nucleosomes to gain access to their cognate binding sites. Recent studies have identified several TF-nucleosome interaction modes, including end binding (EB), oriented binding, periodic binding, dyad binding, groove binding, and gyre spanning. However, measuring nucleosome binding modes experimentally for thousands of TFs in different species poses substantial challenges.
RESULTS: We present a computational method that predicts the binding modes from TF protein sequences. Under a nested cross-validation procedure, our model outperforms several fine-tuned off-the-shelf machine learning (ML) methods in the multi-label classification task. Our binary classifier for the EB mode also outperforms these ML methods, reaching an area under the precision-recall curve of 75%. The end preference of most TFs is consistent with low nucleosome occupancy around their binding sites in GM12878 cells. The nucleosome occupancy data are used as an alternative dataset to confirm the superiority of our EB classifier.
CONCLUSIONS: We develop the first ML-based approach for efficient and comprehensive analysis of nucleosome binding modes of TFs.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-021-04093-9.
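
The summary reports the EB classifier's performance as an area under the precision-recall curve. As a rough illustration of that metric only, the sketch below computes average precision, a common estimator of this area, for a binary classifier. The function name `average_precision` and the toy labels and scores are hypothetical and are not taken from the paper, and the authors' actual evaluation code may differ.

```python
def average_precision(labels, scores):
    """Estimate the area under the precision-recall curve as average precision.

    labels: iterable of 0/1 ground-truth values (1 = positive class).
    scores: iterable of classifier scores, higher meaning more likely positive.
    Tied scores are not treated specially in this sketch.
    """
    labels = list(labels)
    scores = list(scores)
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")

    # Rank examples from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            # Precision at the rank where recall increases.
            total += hits / rank
    return total / n_pos


# Toy data (hypothetical): two positives ranked 1st and 3rd out of four.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.1]
toy_ap = average_precision(toy_labels, toy_scores)
```

On the toy data, the precision values at the two positive hits are 1/1 and 2/3, so the average is 5/6. A classifier that ranks every positive above every negative scores 1.0.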