Assessing the model transferability for prediction of transcription factor binding sites based on chromatin accessibility
BACKGROUND: Computational prediction of transcription factor (TF) binding sites in different cell types is challenging. Recent technology development allows us to determine the genome-wide chromatin accessibility in various cellular and developmental contexts. The chromatin accessibility profiles pr...
Main authors: | Liu, Sheng; Zibetti, Cristina; Wan, Jun; Wang, Guohua; Blackshaw, Seth; Qian, Jiang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2017 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5530957/ https://www.ncbi.nlm.nih.gov/pubmed/28750606 http://dx.doi.org/10.1186/s12859-017-1769-7 |
_version_ | 1783253325699874816 |
---|---|
author | Liu, Sheng; Zibetti, Cristina; Wan, Jun; Wang, Guohua; Blackshaw, Seth; Qian, Jiang |
author_facet | Liu, Sheng; Zibetti, Cristina; Wan, Jun; Wang, Guohua; Blackshaw, Seth; Qian, Jiang |
author_sort | Liu, Sheng |
collection | PubMed |
description | BACKGROUND: Computational prediction of transcription factor (TF) binding sites in different cell types is challenging. Recent technology development allows us to determine the genome-wide chromatin accessibility in various cellular and developmental contexts. The chromatin accessibility profiles provide useful information in prediction of TF binding events in various physiological conditions. Furthermore, ChIP-Seq analysis was used to determine genome-wide binding sites for a range of different TFs in multiple cell types. Integration of these two types of genomic information can improve the prediction of TF binding events. RESULTS: We assessed to what extent a model built upon other TFs and/or other cell types could be used to predict the binding sites of TFs of interest. A random forest model was built using a set of cell type-independent features such as specific sequences recognized by the TFs and evolutionary conservation, as well as cell type-specific features derived from chromatin accessibility data. Our analysis suggested that the models learned from other TFs and/or cell lines performed almost as well as the model learned from the target TF in the cell type of interest. Interestingly, models based on multiple TFs performed better than single-TF models. Finally, we proposed a universal model, BPAC, which was generated using ChIP-Seq data from multiple TFs in various cell types. CONCLUSION: Integrating chromatin accessibility information with sequence information improves prediction of TF binding. The prediction of TF binding is transferable across TFs and/or cell lines, suggesting there is a set of universal “rules”. A computational tool was developed to predict TF binding sites based on the universal “rules”. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-017-1769-7) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5530957 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5530957 2017-08-02 Assessing the model transferability for prediction of transcription factor binding sites based on chromatin accessibility Liu, Sheng Zibetti, Cristina Wan, Jun Wang, Guohua Blackshaw, Seth Qian, Jiang BMC Bioinformatics Research Article BACKGROUND: Computational prediction of transcription factor (TF) binding sites in different cell types is challenging. Recent technology development allows us to determine the genome-wide chromatin accessibility in various cellular and developmental contexts. The chromatin accessibility profiles provide useful information in prediction of TF binding events in various physiological conditions. Furthermore, ChIP-Seq analysis was used to determine genome-wide binding sites for a range of different TFs in multiple cell types. Integration of these two types of genomic information can improve the prediction of TF binding events. RESULTS: We assessed to what extent a model built upon other TFs and/or other cell types could be used to predict the binding sites of TFs of interest. A random forest model was built using a set of cell type-independent features such as specific sequences recognized by the TFs and evolutionary conservation, as well as cell type-specific features derived from chromatin accessibility data. Our analysis suggested that the models learned from other TFs and/or cell lines performed almost as well as the model learned from the target TF in the cell type of interest. Interestingly, models based on multiple TFs performed better than single-TF models. Finally, we proposed a universal model, BPAC, which was generated using ChIP-Seq data from multiple TFs in various cell types. CONCLUSION: Integrating chromatin accessibility information with sequence information improves prediction of TF binding. The prediction of TF binding is transferable across TFs and/or cell lines, suggesting there is a set of universal “rules”. A computational tool was developed to predict TF binding sites based on the universal “rules”. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-017-1769-7) contains supplementary material, which is available to authorized users. BioMed Central 2017-07-27 /pmc/articles/PMC5530957/ /pubmed/28750606 http://dx.doi.org/10.1186/s12859-017-1769-7 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Article Liu, Sheng Zibetti, Cristina Wan, Jun Wang, Guohua Blackshaw, Seth Qian, Jiang Assessing the model transferability for prediction of transcription factor binding sites based on chromatin accessibility |
title | Assessing the model transferability for prediction of transcription factor binding sites based on chromatin accessibility |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5530957/ https://www.ncbi.nlm.nih.gov/pubmed/28750606 http://dx.doi.org/10.1186/s12859-017-1769-7 |