pLMSNOSite: an ensemble-based approach for predicting protein S-nitrosylation sites by integrating supervised word embedding and embedding from pre-trained protein language model

BACKGROUND: Protein S-nitrosylation (SNO) plays a key role in transferring nitric oxide-mediated signals in both animals and plants and has emerged as an important mechanism for regulating protein functions and cell signaling of all main classes of protein. It is involved in several biological proce...

Bibliographic Details
Main authors: Pratyush, Pawel; Pokharel, Suresh; Saigo, Hiroto; KC, Dukka B.
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Subjects: Research
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9909867/
https://www.ncbi.nlm.nih.gov/pubmed/36755242
http://dx.doi.org/10.1186/s12859-023-05164-9
author Pratyush, Pawel
Pokharel, Suresh
Saigo, Hiroto
KC, Dukka B.
collection PubMed
description BACKGROUND: Protein S-nitrosylation (SNO) plays a key role in transferring nitric oxide-mediated signals in both animals and plants and has emerged as an important mechanism for regulating protein functions and cell signaling of all main classes of protein. It is involved in several biological processes, including immune response, protein stability, transcription regulation, post-translational regulation, DNA damage repair, and redox regulation, and is an emerging paradigm of redox signaling for protection against oxidative stress. The development of robust computational tools to predict protein SNO sites would contribute to further interpretation of the pathological and physiological mechanisms of SNO. RESULTS: Using an intermediate fusion-based stacked generalization approach, we integrated embeddings from a supervised embedding layer and a contextualized protein language model (ProtT5) and developed a tool called pLMSNOSite (protein language model-based SNO site predictor). On an independent test set of experimentally identified SNO sites, pLMSNOSite achieved values of 0.340, 0.735 and 0.773 for MCC, sensitivity and specificity, respectively. These results show that pLMSNOSite performs better than the compared approaches for the prediction of S-nitrosylation sites. CONCLUSION: Together, the experimental results suggest that pLMSNOSite achieves a significant improvement in prediction performance for S-nitrosylation sites and represents a robust computational approach for predicting them. pLMSNOSite could be a useful resource for further elucidation of SNO and is publicly available at https://github.com/KCLabMTU/pLMSNOSite. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-023-05164-9.
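The abstract reports MCC, sensitivity and specificity. As a reading aid, the sketch below shows how these three metrics are conventionally computed from a binary confusion matrix. The function name `confusion_metrics` and the counts in the usage lines are hypothetical and are not taken from the article; the record does not state the test-set counts behind the reported values.

```python
import math


def confusion_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Return sensitivity, specificity and MCC for a binary confusion matrix.

    tp/fn count true SNO sites predicted positive/negative;
    tn/fp count non-SNO sites predicted negative/positive.
    """
    # Sensitivity (recall): fraction of true sites that were recovered.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of non-sites correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Matthews correlation coefficient; defined as 0 when any marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity, "mcc": mcc}


# Hypothetical, imbalanced counts chosen only for illustration.
example = confusion_metrics(tp=70, tn=800, fp=200, fn=30)
print(example)
```

Because MCC uses all four cells of the confusion matrix, it can stay modest on an imbalanced test set even when sensitivity and specificity are both fairly high, as the hypothetical counts above illustrate.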
format Online
Article
Text
id pubmed-9909867
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9909867 2023-02-10 BMC Bioinformatics Research BioMed Central 2023-02-08 /pmc/articles/PMC9909867/ /pubmed/36755242 http://dx.doi.org/10.1186/s12859-023-05164-9 Text en
© The Author(s) 2023. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title pLMSNOSite: an ensemble-based approach for predicting protein S-nitrosylation sites by integrating supervised word embedding and embedding from pre-trained protein language model
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9909867/
https://www.ncbi.nlm.nih.gov/pubmed/36755242
http://dx.doi.org/10.1186/s12859-023-05164-9