iPNHOT: a knowledge-based approach for identifying protein-nucleic acid interaction hot spots
BACKGROUND: The interaction between proteins and nucleic acids plays pivotal roles in various biological processes such as transcription, translation, and gene regulation. Hot spots are a small set of residues that contribute most to the binding affinity of a protein-nucleic acid interaction. Compar...
Main Authors: | Zhu, Xiaolei; Liu, Ling; He, Jingjing; Fang, Ting; Xiong, Yi; Mitchell, Julie C. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7336410/ https://www.ncbi.nlm.nih.gov/pubmed/32631222 http://dx.doi.org/10.1186/s12859-020-03636-w |
_version_ | 1783554313043312640 |
---|---|
author | Zhu, Xiaolei Liu, Ling He, Jingjing Fang, Ting Xiong, Yi Mitchell, Julie C. |
author_facet | Zhu, Xiaolei Liu, Ling He, Jingjing Fang, Ting Xiong, Yi Mitchell, Julie C. |
author_sort | Zhu, Xiaolei |
collection | PubMed |
description | BACKGROUND: The interaction between proteins and nucleic acids plays pivotal roles in various biological processes such as transcription, translation, and gene regulation. Hot spots are a small set of residues that contribute most to the binding affinity of a protein-nucleic acid interaction. Compared to the extensive studies of the hot spots on protein-protein interfaces, the hot spot residues within protein-nucleic acids interfaces remain less well-studied, in part because mutagenesis data for protein-nucleic acids interaction are not as abundant as that for protein-protein interactions. RESULTS: In this study, we built a new computational model, iPNHOT, to effectively predict hot spot residues on protein-nucleic acids interfaces. One training data set and an independent test set were collected from dbAMEPNI and some recent literature, respectively. To build our model, we generated 97 different sequential and structural features and used a two-step strategy to select the relevant features. The final model was built based only on 7 features using a support vector machine (SVM). The features include two unique features such as ∆SASsa(1/2) and esp3, which are newly proposed in this study. Based on the cross validation results, our model gave F1 score and AUROC as 0.725 and 0.807 on the subset collected from ProNIT, respectively, compared to 0.407 and 0.670 of mCSM-NA, a state-of-the-art model to predict the thermodynamic effects of protein-nucleic acid interaction. The iPNHOT model was further tested on the independent test set, which showed that our model outperformed other methods. CONCLUSION: In this study, by collecting data from a recently published database dbAMEPNI, we proposed a new model, iPNHOT, to predict hotspots on both protein-DNA and protein-RNA interfaces. The results show that our model outperforms the existing state-of-the-art models. Our model is available for users through a webserver: http://zhulab.ahu.edu.cn/iPNHOT/. |
format | Online Article Text |
id | pubmed-7336410 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-73364102020-07-07 iPNHOT: a knowledge-based approach for identifying protein-nucleic acid interaction hot spots Zhu, Xiaolei Liu, Ling He, Jingjing Fang, Ting Xiong, Yi Mitchell, Julie C. BMC Bioinformatics Research Article BACKGROUND: The interaction between proteins and nucleic acids plays pivotal roles in various biological processes such as transcription, translation, and gene regulation. Hot spots are a small set of residues that contribute most to the binding affinity of a protein-nucleic acid interaction. Compared to the extensive studies of the hot spots on protein-protein interfaces, the hot spot residues within protein-nucleic acids interfaces remain less well-studied, in part because mutagenesis data for protein-nucleic acids interaction are not as abundant as that for protein-protein interactions. RESULTS: In this study, we built a new computational model, iPNHOT, to effectively predict hot spot residues on protein-nucleic acids interfaces. One training data set and an independent test set were collected from dbAMEPNI and some recent literature, respectively. To build our model, we generated 97 different sequential and structural features and used a two-step strategy to select the relevant features. The final model was built based only on 7 features using a support vector machine (SVM). The features include two unique features such as ∆SASsa(1/2) and esp3, which are newly proposed in this study. Based on the cross validation results, our model gave F1 score and AUROC as 0.725 and 0.807 on the subset collected from ProNIT, respectively, compared to 0.407 and 0.670 of mCSM-NA, a state-of-the-art model to predict the thermodynamic effects of protein-nucleic acid interaction. The iPNHOT model was further tested on the independent test set, which showed that our model outperformed other methods. CONCLUSION: In this study, by collecting data from a recently published database dbAMEPNI, we proposed a new model, iPNHOT, to predict hotspots on both protein-DNA and protein-RNA interfaces. The results show that our model outperforms the existing state-of-the-art models. Our model is available for users through a webserver: http://zhulab.ahu.edu.cn/iPNHOT/. BioMed Central 2020-07-06 /pmc/articles/PMC7336410/ /pubmed/32631222 http://dx.doi.org/10.1186/s12859-020-03636-w Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Article Zhu, Xiaolei Liu, Ling He, Jingjing Fang, Ting Xiong, Yi Mitchell, Julie C. iPNHOT: a knowledge-based approach for identifying protein-nucleic acid interaction hot spots |
title | iPNHOT: a knowledge-based approach for identifying protein-nucleic acid interaction hot spots |
title_full | iPNHOT: a knowledge-based approach for identifying protein-nucleic acid interaction hot spots |
title_fullStr | iPNHOT: a knowledge-based approach for identifying protein-nucleic acid interaction hot spots |
title_full_unstemmed | iPNHOT: a knowledge-based approach for identifying protein-nucleic acid interaction hot spots |
title_short | iPNHOT: a knowledge-based approach for identifying protein-nucleic acid interaction hot spots |
title_sort | ipnhot: a knowledge-based approach for identifying protein-nucleic acid interaction hot spots |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7336410/ https://www.ncbi.nlm.nih.gov/pubmed/32631222 http://dx.doi.org/10.1186/s12859-020-03636-w |
work_keys_str_mv | AT zhuxiaolei ipnhotaknowledgebasedapproachforidentifyingproteinnucleicacidinteractionhotspots AT liuling ipnhotaknowledgebasedapproachforidentifyingproteinnucleicacidinteractionhotspots AT hejingjing ipnhotaknowledgebasedapproachforidentifyingproteinnucleicacidinteractionhotspots AT fangting ipnhotaknowledgebasedapproachforidentifyingproteinnucleicacidinteractionhotspots AT xiongyi ipnhotaknowledgebasedapproachforidentifyingproteinnucleicacidinteractionhotspots AT mitchelljuliec ipnhotaknowledgebasedapproachforidentifyingproteinnucleicacidinteractionhotspots |
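
The abstract in this record compares models by F1 score and AUROC. As a rough illustration of what those two metrics measure, here is a minimal pure-Python sketch. It is not the authors' evaluation code: the function names `f1_score` and `auroc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made up for this example (label 1 stands for a hot spot residue, 0 for a non-hot spot).

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 if that denominator is zero."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (not from the paper).
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.7, 0.3, 0.6, 0.1]

# Hypothetical decision threshold of 0.5 to turn scores into class labels.
preds = [1 if s >= 0.5 else 0 for s in scores]

print(f"F1    = {f1_score(labels, preds):.3f}")
print(f"AUROC = {auroc(labels, scores):.3f}")
```

F1 depends on the chosen threshold, while AUROC only depends on how the scores rank positives against negatives, which is why the two numbers are usually reported together.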