
iPNHOT: a knowledge-based approach for identifying protein-nucleic acid interaction hot spots

BACKGROUND: The interaction between proteins and nucleic acids plays pivotal roles in various biological processes such as transcription, translation, and gene regulation. Hot spots are a small set of residues that contribute most to the binding affinity of a protein-nucleic acid interaction. Compar...


Bibliographic Details
Main authors: Zhu, Xiaolei; Liu, Ling; He, Jingjing; Fang, Ting; Xiong, Yi; Mitchell, Julie C.
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7336410/
https://www.ncbi.nlm.nih.gov/pubmed/32631222
http://dx.doi.org/10.1186/s12859-020-03636-w
author Zhu, Xiaolei
Liu, Ling
He, Jingjing
Fang, Ting
Xiong, Yi
Mitchell, Julie C.
author_sort Zhu, Xiaolei
collection PubMed
description BACKGROUND: The interaction between proteins and nucleic acids plays pivotal roles in various biological processes such as transcription, translation, and gene regulation. Hot spots are a small set of residues that contribute most to the binding affinity of a protein-nucleic acid interaction. Compared with the extensive studies of hot spots on protein-protein interfaces, hot spot residues within protein-nucleic acid interfaces remain less well studied, in part because mutagenesis data for protein-nucleic acid interactions are not as abundant as those for protein-protein interactions. RESULTS: In this study, we built a new computational model, iPNHOT, to predict hot spot residues on protein-nucleic acid interfaces. A training data set was collected from dbAMEPNI, and an independent test set was collected from recent literature. To build our model, we generated 97 different sequential and structural features and used a two-step strategy to select the relevant features. The final model was built on only 7 features using a support vector machine (SVM). These include two features, ∆SASsa(1/2) and esp3, which are newly proposed in this study. In cross validation, our model achieved an F1 score of 0.725 and an AUROC of 0.807 on the subset collected from ProNIT, compared with 0.407 and 0.670 for mCSM-NA, a state-of-the-art model for predicting the thermodynamic effects of protein-nucleic acid interactions. The iPNHOT model was further tested on the independent test set, where it outperformed other methods. CONCLUSION: In this study, by collecting data from the recently published database dbAMEPNI, we proposed a new model, iPNHOT, to predict hot spots on both protein-DNA and protein-RNA interfaces. The results show that our model outperforms the existing state-of-the-art models. Our model is available to users through a webserver: http://zhulab.ahu.edu.cn/iPNHOT/.
format Online
Article
Text
id pubmed-7336410
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7336410 2020-07-07 iPNHOT: a knowledge-based approach for identifying protein-nucleic acid interaction hot spots Zhu, Xiaolei Liu, Ling He, Jingjing Fang, Ting Xiong, Yi Mitchell, Julie C. BMC Bioinformatics Research Article BioMed Central 2020-07-06 /pmc/articles/PMC7336410/ /pubmed/32631222 http://dx.doi.org/10.1186/s12859-020-03636-w Text en
© The Author(s) 2020 Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title iPNHOT: a knowledge-based approach for identifying protein-nucleic acid interaction hot spots
topic Research Article